Consider first the descriptive statistics in Table 3. Based on these statistics, the grid coverages can be divided into two groups. The first group consists of the grids with cell sizes ranging from 10m to 500m, and the remaining three grids form the second group. The grids are grouped this way because the statistics within the first group are quite homogeneous: the minimum and maximum are the same, and the mean, standard deviation, skewness and kurtosis fluctuate very little. However, the distributions of values for the grids in the first group are slightly positively skewed and have a longer tail than the normal distribution, as the skewness and kurtosis measures show. This is caused by the extreme value visible in Figure 3, which shows the distribution of values for the grid with a cell size of 10m, with the red curve showing the normal distribution (the distribution graphs for all grids in the first group have approximately the same shape). Apart from the extreme value, the values appear to be approximately normally distributed.
Figure 3. Distribution of cell values for the grid with a cell size of 10m
The second group, on the other hand, has a smaller range, a smaller
standard deviation and a smaller n. Its distributions are not as skewed
as those of the first group, because the extreme value is removed once
the grid cell size becomes larger. However, since the descriptive
statistics of the second group differ from those of the first group,
which means they bear less resemblance to the original data, the grids
in the second group can be excluded from the analysis (this is supported
by the discussion that follows).
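The effect of a single extreme value on skewness and kurtosis can be illustrated with a minimal sketch. The sample values below are hypothetical, not taken from Table 3, and the moments are computed in population form (dividing by n), which may differ slightly from the estimator used by the software behind the table.

```python
import math

def describe(values):
    """Return n, min, max, mean, std, skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Central moments in population form (dividing by n): an assumption.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "n": n,
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        # Excess kurtosis: 0 for a normal distribution.
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }

# Hypothetical cell values: a symmetric sample plus one extreme value.
sample = [4, 5, 5, 6, 6, 6, 7, 7, 8, 30]
with_extreme = describe(sample)
without_extreme = describe(sample[:-1])
```

In this sketch, dropping the extreme value brings the skewness back to zero and lowers the kurtosis, which mirrors the behaviour described for the coarser grids.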
The descriptive statistics alone do not provide a very clear picture
of how rasterization error affects the degree of spatial
autocorrelation. As expected, the Moran and Geary statistics demonstrate
that as grid cell size increases, spatial autocorrelation decreases
towards negative autocorrelation (refer to Table 1 and Table 2).
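For reference, the two statistics can be sketched for a raster grid as follows. This is a minimal illustration that assumes binary rook contiguity weights (cells sharing an edge are neighbours); the weighting scheme actually used for Tables 1 and 2 is not stated here, and the function names are hypothetical.

```python
def rook_weights(nrows, ncols):
    """Binary rook-contiguity weights as {(i, j): 1.0} over row-major cell indices."""
    w = {}
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrows and 0 <= cc < ncols:
                    w[(i, rr * ncols + cc)] = 1.0
    return w

def morans_i(x, w):
    """Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    s0 = sum(w.values())
    num = sum(wij * z[i] * z[j] for (i, j), wij in w.items())
    den = sum(v * v for v in z)
    return (n / s0) * num / den

def gearys_c(x, w):
    """Geary's c: ((n - 1) / (2 S0)) * sum_ij w_ij (x_i - x_j)^2 / sum_i z_i^2."""
    n = len(x)
    mean = sum(x) / n
    s0 = sum(w.values())
    num = sum(wij * (x[i] - x[j]) ** 2 for (i, j), wij in w.items())
    den = sum((v - mean) ** 2 for v in x)
    return ((n - 1) / (2 * s0)) * num / den

# Two hypothetical 4 x 4 grids: a smooth gradient and a checkerboard.
w = rook_weights(4, 4)
gradient = [float(c) for r in range(4) for c in range(4)]
checker = [float((r + c) % 2) for r in range(4) for c in range(4)]
i_gradient, c_gradient = morans_i(gradient, w), gearys_c(gradient, w)
i_checker, c_checker = morans_i(checker, w), gearys_c(checker, w)
```

The smooth gradient gives a positive Moran's I and a Geary's c below 1, while the checkerboard gives a negative Moran's I and a Geary's c above 1. This is why the two curves move in opposite directions as autocorrelation weakens.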
The graph of Moran’s I coefficient shows a downward sloping curve, while
the curve of Geary’s c moves in the opposite direction. From this point
onward the analysis focuses mainly on Moran’s I; the description and the
inference from Geary’s c are basically the same. Examining the test
statistics for Moran’s I in Table 1, all are significant up to and
including the grid with a cell size of 500m. The p-values are very low
at first, then increase until the value jumps to 0.8131. From the first
grid, with a cell size of 10m, to the grid with a cell size of 500m, the
null hypothesis of no autocorrelation is rejected because the values of
the normal statistic exceed the cut-off value at the 95% confidence
level. The last three cases, on the other hand, have insignificant test
statistics, and therefore the null hypothesis of no autocorrelation is
not rejected. This again supports the argument made from the descriptive
statistics that the last three grids form a group separate from the
others and can be excluded from the analysis.
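The significance test described above can be sketched as follows. The sketch assumes the variance of Moran's I under the normality assumption (the standard Cliff and Ord form) and a two-sided test, so the cut-off at the 95% level is |z| > 1.96. The weights and observed values of I are hypothetical, and the software behind Table 1 may use a different variant (for example the randomization assumption).

```python
import math

def moran_z_test(i_obs, w, n):
    """Return (E[I], Var[I], z, two-sided p) under the normality assumption."""
    s0 = sum(w.values())
    # S1 = 1/2 * sum_ij (w_ij + w_ji)^2, over every pair with a nonzero weight.
    pairs = set(w) | {(j, i) for (i, j) in w}
    s1 = 0.5 * sum((w.get((i, j), 0.0) + w.get((j, i), 0.0)) ** 2
                   for (i, j) in pairs)
    # S2 = sum_i (row sum_i + column sum_i)^2
    row = [0.0] * n
    col = [0.0] * n
    for (i, j), wij in w.items():
        row[i] += wij
        col[j] += wij
    s2 = sum((row[k] + col[k]) ** 2 for k in range(n))
    expected = -1.0 / (n - 1)
    variance = ((n * n * s1 - n * s2 + 3 * s0 * s0)
                / ((n * n - 1) * s0 * s0)) - expected ** 2
    z = (i_obs - expected) / math.sqrt(variance)
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return expected, variance, z, p

# Hypothetical example: 10 cells in a row, each linked to its neighbours.
n = 10
w = {}
for k in range(n - 1):
    w[(k, k + 1)] = 1.0
    w[(k + 1, k)] = 1.0

strong = moran_z_test(0.8, w, n)    # hypothetical observed I
weak = moran_z_test(-0.05, w, n)    # hypothetical observed I
reject_strong = abs(strong[2]) > 1.96
reject_weak = abs(weak[2]) > 1.96
```

Under these assumptions the null hypothesis of no autocorrelation is rejected for the first hypothetical value and not rejected for the second, which is the same decision rule applied to the grids above.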
Looking at the graphs of Moran’s I and Geary’s c, another interesting
fact emerges. Not only do the last three grids fail to provide a
significant measure of spatial autocorrelation, but the curves in both
graphs also fluctuate beyond the grid cell size of 277.78, as opposed to
the steady decreasing slope (or increasing slope for Geary’s c) found
before that point. This suggests that grid cell sizes beyond this
threshold do not give reliable results for the degree of spatial
autocorrelation, even though the test statistics indicate that they are
significant. This threshold is important for vector to raster
conversion: anything beyond it is not reliable, or rather not stable,
for making statistical inferences about the spatial autocorrelation of
real world phenomena.