Statistical Methods


There are four main types of interpolation method available within spatial statistics, and each has numerous variants. For our purposes we compared the four general types in order to decide which would be most effective for our particular data. The four types are kriging, inverse distance weighted (IDW) interpolation, triangulation, and splining.

The first step in finding the most accurate interpolation method was to review previous studies in which different interpolation methods were applied to soils. The findings from these studies were inconclusive. In one study kriging gave the most accurate results, with inverse distance weighted second and splining last (Schloeder, Zimmerman, Jacobs). In another, inverse distance weighted was found to be best and kriging came last (Kravchenko, Bullock, 1999).

The next step was to determine which methods would not be appropriate. Kriging was ruled out first because it takes into account a general trend component. Our data, with a forested area merging into a bare area and then into another forested area, would require three trends instead of just one. The remaining three methods were acceptable with respect to their properties.

Inverse Distance Weighted (IDW) interpolation uses a chosen number of the closest known points to estimate the value at an unknown point. The known points are weighted according to their distance from the unknown point, with closer points receiving more weight, and the weighted values are combined in a function that interpolates the unknown point. After reviewing the studies cited above as well as some spatial statistics texts, we feel this is the method most likely to be accurate. IDW is not available in the software requested by Todd Redding, and Todd does not wish to use this method, so it will not be used for any of the analysis. We will, however, produce some IDW examples to show the possible differences between the methods. These examples cannot prove conclusively which method is more appropriate, because we have no ground truthing for the points we are interpolating.
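The weighting described above can be illustrated with a minimal sketch. It assumes a power of 2 for the distance weighting and three neighbours, neither of which is specified in this report. The function name idw_estimate and the sample values are hypothetical and are not taken from the study data.

```python
import math


def idw_estimate(known_points, x, y, neighbours=3, power=2.0):
    """Estimate a value at (x, y) from (x, y, value) tuples by inverse distance weighting."""
    by_distance = []
    for px, py, value in known_points:
        distance = math.hypot(px - x, py - y)
        if distance == 0.0:
            # The query coincides with a sample point, so return its value directly.
            return value
        by_distance.append((distance, value))

    # Keep only the requested number of closest known points.
    by_distance.sort(key=lambda pair: pair[0])
    nearest = by_distance[:neighbours]

    # Closer points get larger weights: weight = 1 / distance**power.
    weights = [1.0 / distance ** power for distance, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Hypothetical sample values (x, y, measurement), not taken from the study data.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0), (10.0, 10.0, 100.0)]
estimate = idw_estimate(samples, 1.0, 1.0)
print(estimate)
```

Because the estimate is a weighted average, it always falls between the lowest and highest of the neighbouring values used, which is one practical difference from splining.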

Triangulation uses the three nearest known points that surround the unknown point in a triangle to estimate the unknown point. This is similar to IDW, except that IDW uses the three or more closest points regardless of their relationship to each other, whereas in triangulation the unknown point must lie within a triangle formed by the known points. Although our points are randomly spaced, they are close enough to a regular grid to be considered regularly spaced. Triangulation works best on randomly spaced points, making this method undesirable for our purposes.
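A minimal sketch of the idea follows, assuming the estimate inside the triangle is a linear (barycentric) blend of the three corner values. This is one common form of triangulation-based interpolation and may differ from what a particular software package does. The function name and the sample values are hypothetical.

```python
def triangle_estimate(triangle, x, y):
    """Linearly interpolate a value at (x, y) inside a triangle of (x, y, value) corners.

    Returns None when the point lies outside the triangle.
    """
    (x1, y1, v1), (x2, y2, v2), (x3, y3, v3) = triangle

    # Twice the signed area of the triangle; zero means the corners are collinear.
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("the three known points do not form a triangle")

    # Barycentric weights: each corner's share of the estimate.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2

    # A negative weight means the point is not enclosed by the triangle.
    if min(w1, w2, w3) < 0:
        return None
    return w1 * v1 + w2 * v2 + w3 * v3


# Hypothetical corner values, not taken from the study data.
corners = [(0.0, 0.0, 10.0), (4.0, 0.0, 20.0), (0.0, 4.0, 30.0)]
inside = triangle_estimate(corners, 1.0, 1.0)
outside = triangle_estimate(corners, 5.0, 5.0)
print(inside, outside)
```

The check on the weights is what distinguishes this from IDW: a point that is not enclosed by the three known points receives no estimate at all.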

Splining operates by passing a polynomial function through the known data points in order to interpolate the unknown points. This is the method our advisor was most interested in using, so we tried to make the results as accurate as possible, but several problems are likely to arise with it. There are two main reasons: first, the data are much too regularly spaced for the polynomial function to be fitted accurately through them; second, many of the data layers have drastic changes in the Y value between the forested and deforested areas. In areas with such drastic changes, splines tend to create values much greater than the highest known point and much lower than the lowest known point (ESRI).

Within the method of splining there are two main classes: splining with tension and splining without tension. Our first tests used no tension. Splining without tension allows the polynomial to be as bell-shaped as required to pass through all of the points. Because much of our data has drastic changes, the peaks and troughs of the polynomial returned interpolated values much higher and much lower than the highest and lowest points in the data set (ER Mapper 6.0). This was clearly unacceptable.

With increased tension, the polynomial's peaks and troughs move closer to the data points, and the function becomes less bell-shaped and much more angular. As a result, there is much less chance of interpolated values falling above the maximum or below the minimum known point.
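The overshoot and the effect of tension can be illustrated in one dimension. The sketch below is not the algorithm used by the software in this project. It assumes a natural cubic spline as a stand-in for splining without tension and straight-line segments as a stand-in for the most angular, fully tensioned case. The transect values are hypothetical and only mimic an abrupt change between two cover types.

```python
import bisect


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    second = [0.0] * (n + 1)  # second derivatives at the knots, zero at both ends
    if n >= 2:
        # Tridiagonal system for the interior second derivatives.
        lower = [h[i - 1] for i in range(1, n)]
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
        upper = [h[i] for i in range(1, n)]
        rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
               for i in range(1, n)]
        # Forward elimination followed by back substitution (Thomas algorithm).
        for j in range(1, n - 1):
            factor = lower[j] / diag[j - 1]
            diag[j] -= factor * upper[j - 1]
            rhs[j] -= factor * rhs[j - 1]
        solution = [0.0] * (n - 1)
        solution[-1] = rhs[-1] / diag[-1]
        for j in range(n - 3, -1, -1):
            solution[j] = (rhs[j] - upper[j] * solution[j + 1]) / diag[j]
        second[1:n] = solution

    def evaluate(x):
        i = max(0, min(n - 1, bisect.bisect_right(xs, x) - 1))
        left, right = xs[i + 1] - x, x - xs[i]
        return ((second[i] * left ** 3 + second[i + 1] * right ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - second[i] * h[i] / 6.0) * left
                + (ys[i + 1] / h[i] - second[i + 1] * h[i] / 6.0) * right)

    return evaluate


def piecewise_linear(xs, ys):
    """Return a function joining the points with straight lines (the angular extreme)."""
    def evaluate(x):
        i = max(0, min(len(xs) - 2, bisect.bisect_right(xs, x) - 1))
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return evaluate


# Hypothetical, regularly spaced transect with an abrupt jump in value.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [10.0, 10.0, 10.0, 10.0, 50.0, 50.0, 50.0, 50.0]

grid = [i / 20.0 for i in range(141)]  # 0.0 to 7.0 in steps of 0.05
spline_values = [natural_cubic_spline(xs, ys)(x) for x in grid]
linear_values = [piecewise_linear(xs, ys)(x) for x in grid]

print("known range:", min(ys), max(ys))
print("spline range:", min(spline_values), max(spline_values))
print("linear range:", min(linear_values), max(linear_values))
```

On data like these, the smooth curve dips below the lowest known value just before the jump and rises above the highest known value just after it, while the straight-line version stays within the known range.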

Since splining was the method desired by our advisor, we felt that using 100% tension was the best way to get accurate results from the interpolation. Because our data are so regularly spaced, the software has trouble fitting a polynomial function through them. The best way to reduce the error in this calculation is to make the function as angular as possible, that is, to use 100% tension (Coburn, 2001). Although the interpolation using splining is likely to be quite inaccurate, this is the best way to reduce the errors while still being constrained to this method.