The Problem

Converting spatial data from the vector data model to the raster data model introduces rasterization error. This is especially true when converting polygon and line features. Rasterization error of polygon features has been discussed in the literature, for example by Burrough et al (1998) and Bregt et al (1991). Burrough’s book identifies two sources of rasterization error: the mixed pixel problem and the problem of topological mismatch. The mixed pixel problem occurs when the grid cell is larger than the feature being converted: because each grid cell can hold only one attribute, a cell that covers more than one feature must still be assigned a single value. Topological mismatch arises when the grid cell is smaller than the feature, so that the smooth boundaries of the polygon are approximated by the edges of grid cells. Whichever cell size is chosen, therefore, one of these two errors will be present. Bregt et al show in their paper that rasterization error is influenced by the rasterization method, the size of the raster cell and map complexity. Both sources describe methods for determining and calculating rasterization error; however, that is not the main focus of this project and will not be pursued in further detail. The main interest here is the rasterization error introduced by grid cell size.
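
As a rough illustration of how cell size alone changes rasterization error, the sketch below rasterizes a disc onto square grids of different cell sizes and compares the rasterized area with the true area. It is not one of the methods described by Burrough et al or Bregt et al; the cell-centre assignment rule, the function names and the example radius and extent are all assumptions made for this illustration.

```python
import math


def rasterized_area(radius, cell_size, extent):
    """Area of a disc after rasterization onto a square grid.

    Assumption: a cell is assigned to the disc when its centre falls
    inside the disc, so every cell carries exactly one attribute
    (inside or outside). The disc is centred in a grid of side `extent`.
    """
    n = int(round(extent / cell_size))  # number of cells per side
    half = extent / 2.0
    inside = 0
    for row in range(n):
        for col in range(n):
            # Coordinates of the cell centre relative to the disc centre.
            x = (col + 0.5) * cell_size - half
            y = (row + 0.5) * cell_size - half
            if x * x + y * y <= radius * radius:
                inside += 1
    return inside * cell_size * cell_size


def relative_area_error(radius, cell_size, extent):
    """Absolute area difference as a fraction of the true disc area."""
    true_area = math.pi * radius ** 2
    return abs(rasterized_area(radius, cell_size, extent) - true_area) / true_area


if __name__ == "__main__":
    # Hypothetical example: disc of radius 10 in a 64 x 64 extent.
    for size in (0.5, 2.0, 8.0, 32.0, 64.0):
        print(size, relative_area_error(10.0, size, 64.0))
```

With cells much smaller than the disc, the error comes only from the stepped boundary. With cells larger than the disc, the feature is either lost entirely or inflated to a whole cell, depending on where the cell centres happen to fall.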

The grid cell size chosen for the raster model is important because it affects the degree and type of error introduced by rasterization. Different grid cell sizes can produce different pictures of real world phenomena, and inferences about the nature of those phenomena are affected as well. One statistical inference made on spatial data is spatial autocorrelation. Spatial autocorrelation is a statistical measure based on the first law of Geography introduced by Tobler (1970): “everything is related to everything else, but near things are more related than distant things”. It is therefore not surprising that geographic features tend to cluster together over space. As described by Odland (1988, 7), spatial autocorrelation “exists whenever a variable exhibits a regular pattern over space in which its values at a set of locations depend on values of the same variable at other locations.” This is an adequate definition, but what happens when the underlying pattern over space is represented by a different data model, and that representation changes the variable being measured? In other words, the measure of spatial autocorrelation is affected by the method used to represent the variable over space. With a raster grid, the problem shows up as the rasterization error produced by different grid cell sizes. As grid cell size increases, each cell covers more area, which increases the chance that several values of the variable fall within a single cell. As a result, more polygons are collapsed into one cell, and the structure and nature of the data change across the map. Conversely, the original nature of the data is best approximated by the smallest cell size that can be used. Consequently, different grid cell sizes yield different spatial autocorrelation measures.
The question therefore becomes how rasterization error affects the degree of spatial autocorrelation of spatial data as raster grid cell size varies. The main purpose is to illustrate how much rasterization error, and which grid cell sizes, can be tolerated while statistical inference on the spatial data remains acceptable.
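
The question can be explored on a toy grid with a minimal sketch like the one below. It assumes Moran's I with rook (edge-sharing) neighbours as the autocorrelation measure and a majority rule for collapsing fine cells into coarser ones; the text above does not commit to either choice, and the function names and the striped test grid are hypothetical.

```python
from collections import Counter

# Rook contiguity: cells are neighbours when they share an edge.
ROOK = ((0, 1), (1, 0), (0, -1), (-1, 0))


def morans_i(grid):
    """Moran's I for a 2-D list of numbers with binary rook weights."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    dev = [[v - mean for v in row] for row in grid]
    denom = sum(d * d for row in dev for d in row)
    if denom == 0:
        return float("nan")  # constant surface: the measure is undefined
    num, w_sum = 0.0, 0
    for i in range(rows):
        for j in range(cols):
            for di, dj in ROOK:
                a, b = i + di, j + dj
                if 0 <= a < rows and 0 <= b < cols:
                    num += dev[i][j] * dev[a][b]
                    w_sum += 1
    return (n / w_sum) * (num / denom)


def aggregate(grid, factor):
    """Coarsen a grid so each new cell covers factor x factor old cells.

    Assumption: the new cell takes the most frequent value in its block.
    Ties go to the value met first when scanning the block row by row.
    """
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    out = []
    for bi in range(rows):
        out_row = []
        for bj in range(cols):
            block = [grid[bi * factor + i][bj * factor + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(Counter(block).most_common(1)[0][0])
        out.append(out_row)
    return out


if __name__ == "__main__":
    # Hypothetical fine grid: vertical stripes two cells wide.
    fine = [[(j // 2) % 2 for j in range(8)] for _ in range(8)]
    for factor in (1, 2, 4):
        print(factor, morans_i(aggregate(fine, factor)))
```

In this toy grid the stripes shrink to one cell wide at a factor of 2 and disappear completely at a factor of 4, where every coarse cell receives the same value and the measure can no longer be computed. The same underlying pattern therefore gives a different autocorrelation value at each cell size.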
