Problems and Errors:

As with any GIS project, one of the major hurdles you have to consider and attempt to overcome is the topic of problems and errors. Problems and errors can occur at any stage of the process, and you will encounter them from the minute you begin your project. A large portion of the problems that you may encounter come from data decisions, data collection, and errors during the operational stages. These are explained below.


Data Decision and Collection:

A large portion of the problems occurred at the beginning of my project because it was difficult to decide, first, what topic to analyze; second, what type of data to use; and third, what to include and exclude when collecting and manipulating the datasets. Data decisions are a problem because what you choose to collect, map, and show will be reflected in your final output, and a poor choice can bias the results. For example, a lot of the census information for this project was collected from the Canadian Census Analyzer, and which of those data to include in the analysis was based on my own decisions. This is very subjective, and other people can debate or disagree with the datasets that I have chosen to use. A lot of the data collected from the Census Analyzer, such as population density, age groups, and ethnic groups, were not used in this project because in the end I chose not to include them.

Data collection for this project was also a large problem because the majority of the data are from secondary sources. Secondary data are data that are not created or gathered first hand by you, but by other people. The problem with using secondary data is that you will generally not know where the original data came from, and there may be errors and problems with the data that you cannot see or know about. The data could have passed through second, third, or any number of other providers. For example, the Surrey Communities District map was not created or digitized by me; it was obtained from the GIS Section of the Engineering Department of the City of Surrey. The map that I received from them could not be projected correctly because its coordinate system was undefined, and I had to define the coordinate system myself. There are also security issues with the map shapefile provided by the Surrey GIS Section. I only needed to email them from my SFU email account and contact them over the phone in order to retrieve the data. I did need to sign an agreement form, but the form was scanned and emailed back to them, so they never received the original from me. There were few legal formalities to deal with, which makes you question how reliable the datasets provided really are. Most of the other secondary data collected for this project came from the SIS Network at SFU, where we are limited to whatever datasets happen to be available.


Operational:

Data problems and errors are not the only errors to consider; there are also operational errors throughout this project. Deciding which new value to assign to each old value when reclassing the raster files was based on my own judgment. One example is the landuse layer: when preparing it for the fuzzy step, it was reclassed to show which areas have more importance than others, so the value each landuse type receives depends solely on what I view as important. The pairwise comparison step was also based entirely on my personal beliefs about which factors are more important than others. For example, I gave heavier weights to factors such as high total crimes, high unemployment rate, and existing police stations. Other people can disagree with these weights because they may feel that the unemployment rate, although often linked with crime and policing, is not necessarily one of the most important factors, since unemployed people do not necessarily commit most of the crime. Also, locating close to an existing police station may not seem like a bad idea to some people, because if a new station is located near an existing one, both stations can coordinate with one another when needed.
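To illustrate how much the pairwise comparison step depends on personal judgment, here is a minimal sketch of deriving factor weights from a pairwise comparison matrix. It uses the row geometric mean approximation, which may differ from the method the GIS software actually applies, and the two judgment matrices are hypothetical values made up for illustration, not the ones used in this project.

```python
from math import prod

def pairwise_weights(matrix):
    """Approximate factor weights from a reciprocal pairwise comparison
    matrix using the row geometric mean method (weights sum to 1)."""
    n = len(matrix)
    geo_means = [prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Factor order used in both matrices below.
factors = ["total crimes", "unemployment rate", "existing police station"]

# Hypothetical judgments of analyst A: matrix[i][j] says how much more
# important factor i is than factor j (1 = equal, 3 = moderately more, ...).
judgement_a = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]

# Hypothetical judgments of analyst B, who rates unemployment much lower.
judgement_b = [
    [1.0, 7.0, 3.0],
    [1 / 7, 1.0, 1 / 3],
    [1 / 3, 3.0, 1.0],
]

weights_a = pairwise_weights(judgement_a)
weights_b = pairwise_weights(judgement_b)

for name, wa, wb in zip(factors, weights_a, weights_b):
    print(f"{name}: analyst A = {wa:.3f}, analyst B = {wb:.3f}")
```

The same three factors end up ranked differently depending on who fills in the matrix, which is exactly why other people can disagree with the weights.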

Another operational problem was that the total crimes collected from the Surrey RCMP website were reported only for each Surrey Community District and not for each dissemination area (DA) or census tract (CT), as I would have liked. Therefore, the total crimes for each district had to be applied to the whole district rather than to each individual DA. This will cause errors in the analysis because viewers will think that the total crimes are the same everywhere in that district, when in reality the figure is the sum for the whole district, and the crimes in some areas within that district can be higher or lower.
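The effect of applying a district total to every DA can be sketched as follows. The district names, DA identifiers, and crime counts are hypothetical placeholders, not figures from the Surrey RCMP report.

```python
# Hypothetical district totals and DA-to-district lookup, for illustration only.
district_crimes = {"District A": 1200, "District B": 800}
da_to_district = {
    "DA-001": "District A",
    "DA-002": "District A",
    "DA-003": "District A",
    "DA-004": "District B",
    "DA-005": "District B",
}

def assign_district_total(da_to_district, district_crimes):
    """Give every DA the total of the district that contains it."""
    return {da: district_crimes[d] for da, d in da_to_district.items()}

def distinct_values_per_district(da_values, da_to_district):
    """Count how many different values appear among the DAs of each district."""
    seen = {}
    for da, district in da_to_district.items():
        seen.setdefault(district, set()).add(da_values[da])
    return {district: len(values) for district, values in seen.items()}

da_crimes = assign_district_total(da_to_district, district_crimes)
variation = distinct_values_per_district(da_crimes, da_to_district)

print(da_crimes)
print(variation)  # one distinct value per district: no variation inside it
```

Every DA inside a district carries the identical value, so the map shows no variation within a district, and adding the DA values together would count the district total several times over.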

There are other problems during the operational stages, such as loss of data due to rasterization, especially when rasterizing layers that carry census data. For this project, dissemination areas (DAs) were used, and because some of the DAs are very small, they are dropped during rasterization. Another operational problem is timeframe error, because a lot of the datasets used for this project, such as the landuse layer and the data from the Canadian Census Analyzer, are from the year 2001. According to most of the data being used, I am therefore basically doing a suitability analysis for the year 2001, and a lot of data and factors could have changed between then and now. An example of this is the 3rd Quarter Statistical Report 2006, from which I retrieved the total crimes per district of Surrey. The statistical report is from 2006 but the census data for the maps are from 2001, so the two do not necessarily correspond with one another on a chronological basis.
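The loss of small DAs during rasterization can be shown with a simplified sketch. It assumes a cell-centre rule (a cell takes the ID of the polygon that contains its centre) and uses axis-aligned rectangles as stand-ins for DA polygons. The rule, the coordinates, and the IDs are all assumptions for illustration; the software used in this project may assign cells differently.

```python
def rasterize(rects, width, height, cell):
    """Rasterize rectangles (id -> (xmin, ymin, xmax, ymax)) onto a grid.
    Each cell receives the id of the rectangle containing its centre."""
    cols = int(width // cell)
    rows = int(height // cell)
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            cx = (c + 0.5) * cell  # x coordinate of the cell centre
            cy = (r + 0.5) * cell  # y coordinate of the cell centre
            value = None
            for rid, (xmin, ymin, xmax, ymax) in rects.items():
                if xmin <= cx < xmax and ymin <= cy < ymax:
                    value = rid
                    break
            row.append(value)
        grid.append(row)
    return grid

def surviving_ids(grid):
    """Return the set of ids that own at least one cell in the raster."""
    return {value for row in grid for value in row if value is not None}

# Hypothetical DAs on a 300 x 200 extent; DA-small is a thin 100 x 40 strip.
das = {
    "DA-large-1": (0, 0, 200, 200),
    "DA-large-2": (200, 0, 300, 160),
    "DA-small": (200, 160, 300, 200),
}

coarse = surviving_ids(rasterize(das, 300, 200, cell=100))
fine = surviving_ids(rasterize(das, 300, 200, cell=20))

print("100-unit cells keep:", sorted(coarse))
print("20-unit cells keep:", sorted(fine))
```

With the coarse cell size no cell centre falls inside the small DA, so it vanishes from the raster along with its census values, while a finer cell size keeps it.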





Peter Chan . Copyright © 2006.