Data Manipulation

       This section is divided into two parts. First, you will find a textual summary of the data manipulation processes that were followed, and below that you will see the same processes displayed as a cartographic model.

             Text version
             Cartographic Model
             Back to Contents



    Text Version:


       Much time was spent trying to get the 2m_contour file to work in Idrisi. I was looking for a quick way to assign elevation values to these arcs without having to go through them individually; there were nearly one thousand of them, and in many areas they were almost overlapping. Finally I decided not to pursue this any longer, as time was short and no quick way to do it was apparent. (It might have been possible had there not been multiple peaks within the city of Vancouver; because there are, an elevation value is not unique to a single arc, which complicates matters considerably.)
        The gvrd_dem file was opened. The municipality raster image (derived from ArcView) was reclassed so that Vancouver and the UBC Endowment Lands (listed as an electoral area) held values of 1, and everything else 0. This was overlaid on the DEM to produce a DEM of Vancouver. After attempts to re-size the image using other modules (including unsuccessful but nonetheless instructive attempts with RESAMPLE), WINDOW was used to "clip" the now-blank outer reaches of the map. At this point, an ortho display was created as a test of the procedure (not shown on the cartographic model).
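        For readers interested in the mechanics, the reclass-overlay-window sequence amounts to multiplying the DEM by a boolean mask and then clipping to the mask's bounding box. A minimal numpy sketch of the equivalent operations follows; the array sizes and the municipality codes (5 and 29) are illustrative stand-ins, not the actual Idrisi files.

    import numpy as np

    # Stand-ins for the Idrisi rasters: gvrd_dem is the regional elevation
    # model, muni is the municipality raster derived from ArcView.
    gvrd_dem = np.random.rand(400, 600) * 150      # placeholder elevations (m)
    muni = np.random.randint(0, 30, (400, 600))    # placeholder municipal codes

    # RECLASS: Vancouver and the UBC Endowment Lands become 1, all else 0.
    mask = np.isin(muni, [5, 29]).astype(np.uint8)

    # OVERLAY (multiply): zeroes every cell outside Vancouver, leaving a
    # DEM of Vancouver alone.
    vandem = gvrd_dem * mask

    # WINDOW: clip the now-blank outer reaches by taking the bounding box
    # of the remaining nonzero cells.
    rows, cols = np.nonzero(mask)
    vandem = vandem[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
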

       At this point, the anomaly on the west side of Vancouver was first noticed. The image to the left shows a vertical stripe of low values running near Crown Street, on the west side of Vancouver. Running a low-pass smoothing filter (a 3x3 kernel full of 1/9s) over the image would have been self-defeating at this point; it would lessen the presence of the "groove", but would significantly reduce the utility of the rest of the elevation model. It was therefore not attempted. Various defect removal procedures, such as DESTRIPE and PIT REMOVER, were applied. PIT REMOVER was the most successful, but even it seemed to do very little to reduce the impact of the stripe, and may have had unseen negative impacts on the DEM's accuracy.
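
        For reference, the rejected filter is the ordinary 3x3 mean filter. A short scipy sketch of what it would have done (dem is a stand-in array, not the actual vandem file):

    import numpy as np
    from scipy.ndimage import uniform_filter

    dem = np.random.rand(300, 300) * 100    # stand-in for vandem

    # A 3x3 kernel full of 1/9s is exactly a 3x3 mean (low-pass) filter;
    # uniform_filter applies it in a single call.
    smoothed = uniform_filter(dem, size=3)

    # The stripe's low values get averaged with their neighbours, so the
    # "groove" shallows slightly -- but every legitimate ridge and channel
    # is flattened by the same operation, which is why the filter was
    # judged self-defeating here.
    print(float(np.abs(dem - smoothed).mean()))   # average elevation change
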
        Eventually it was decided that the real analysis would come down to the presence of lost streams, and, since only a small portion of the headwaters of one stream crossed the stripe at any point, the stripe was accepted as a fault of the dataset. Despite this fault, the raster file vandem soon became an important part of the analysis.

        After this, various other datasets were added, including the bikeways, the parks, the geological data, and the streams themselves. After having problems with georeferencing (even though none were immediately apparent; all data was nominally set to UTM 10N), I discovered that the earlier dataset was referenced to an older ellipsoid: the Clarke 1866, used in the NAD 1927 (NAD27) datum. I backtracked, reprojected the necessary data using an ArcView extension supplied by the lab administrator, and continued.
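
        The fix the ArcView extension performed is an ordinary datum transformation. As a hedged illustration, the same shift can be expressed today with pyproj and the standard EPSG codes for UTM zone 10N on each datum (the coordinate below is an arbitrary point, not one from the project data):

    from pyproj import Transformer

    # NAD27 / UTM zone 10N (EPSG:26710) -> NAD83 / UTM zone 10N (EPSG:26910).
    transformer = Transformer.from_crs("EPSG:26710", "EPSG:26910",
                                       always_xy=True)

    # An arbitrary easting/northing (metres) somewhere near Vancouver.
    e27, n27 = 490000.0, 5455000.0
    e83, n83 = transformer.transform(e27, n27)

    # The offsets show why layers on mismatched datums visibly misalign
    # even though both claim to be "UTM 10N".
    print(e83 - e27, n83 - n27)
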
        After waiting for the Department of Fisheries and Oceans to send data, I finally decided to simply digitize the streams myself from a paper map of theirs. This presented a problem, as the scanner would introduce a new "generation" of abstraction into the dataset. This was minimized by the way I digitized in ArcView: I used the (now properly georeferenced) 2m_contour.dxf file as a background, along with the scanned image (which had itself been touched up in Photoshop and georeferenced). This allowed me to take advantage of the convergence of information principle, giving me relevant data (i.e. contours) to fall back on when the scanned image was dubious. A good measure of the accuracy of my digitizing is that, in the vast majority of cases, the streams I digitized off the scanned image lined up well with the contour.dxf file I was using as a guide.
        One problem area was the Renfrew Ravine, where it was ambiguous exactly how the water flowed from there into Still Creek. This is one of the few areas where I was forced to rely on the contour map and my own local knowledge to digitize properly.
        The streams were then imported into Idrisi using SHAPEIDR, examined (the georeferencing was checked once more), and rasterized via LINERAS, producing the raster file streams, which soon became a base for many other operations. A long, complex process (outlined on the cartographic model) was undertaken to produce a wide series of ORTHO displays of the former drainage network of Vancouver, the current network, and various other drape images. These steps will not be outlined here, but the finished images can be seen in the results section.
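
        Rasterizing a vector arc, as LINERAS does, reduces to burning each straight segment into the grid. A self-contained sketch of the idea using Bresenham's line algorithm (the grid size and cell coordinates are invented for illustration):

    import numpy as np

    def rasterize_segment(grid, r0, c0, r1, c1, value=1):
        """Burn one straight segment into the grid with Bresenham's line
        algorithm -- a bare-bones stand-in for what LINERAS does for every
        vertex pair of every arc in the stream coverage."""
        dr, dc = abs(r1 - r0), -abs(c1 - c0)
        sr = 1 if r0 < r1 else -1
        sc = 1 if c0 < c1 else -1
        err = dr + dc
        while True:
            grid[r0, c0] = value
            if r0 == r1 and c0 == c1:
                break
            e2 = 2 * err
            if e2 >= dc:          # step along rows
                err += dc
                r0 += sr
            if e2 <= dr:          # step along columns
                err += dr
                c0 += sc
        return grid

    streams = np.zeros((100, 100), dtype=np.uint8)  # blank raster
    rasterize_segment(streams, 10, 5, 80, 70)       # one hypothetical reach
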
        Previously, ORTHO images had been very "wiry" as a result of the low number of rows and columns present. As an experiment, the EXPAND module was used to give vandem more cells. Note that this expanded DEM was not used in analysis: its values would only have been derived from the existing ones in vandem, and were therefore of no greater value. Vandem was first expanded by a factor of 3 (which proved too big for ORTHO display), after which the factor was reduced to 2 to enable the display.
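
        EXPAND's behaviour amounts to simple cell replication, which in numpy is a pair of repeats; this also makes it obvious why the expanded DEM carries no new information (every new cell is a copy of an old one). The shapes below are illustrative:

    import numpy as np

    vandem = np.random.rand(120, 160) * 150     # stand-in for the real file

    # Each cell is duplicated factor x factor times; nothing is interpolated.
    expanded3 = np.repeat(np.repeat(vandem, 3, axis=0), 3, axis=1)  # too big for ORTHO
    expanded2 = np.repeat(np.repeat(vandem, 2, axis=0), 2, axis=1)  # usable for display

    print(vandem.shape, expanded2.shape, expanded3.shape)
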
        Streams were reclassed in various ways to facilitate different types of analysis. Boolean images were created showing lost streams, shorelines, and current streams. Various distance operators were used here, including DISTANCE and BUFFER, to produce a range of boolean and real-valued images for use in later analysis.
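
        The DISTANCE and BUFFER results can be imitated with a Euclidean distance transform followed by a threshold. A small scipy sketch, with an invented stream layout and cell size:

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    streams = np.zeros((200, 200), dtype=np.uint8)
    streams[100, 20:180] = 1                  # one invented stream course

    # DISTANCE analogue: for every cell, Euclidean distance (in cells) to
    # the nearest stream cell; multiply by the cell size for metres.
    cell_size = 10.0                          # illustrative resolution (m)
    dist = distance_transform_edt(streams == 0) * cell_size

    # BUFFER analogue: thresholding the real-valued distance image yields
    # a boolean buffer, e.g. all cells within 50 m of a stream.
    buffer_50m = (dist <= 50.0).astype(np.uint8)
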

        Soils data was located on the S: network drive; however (as mentioned in the previous section), the dataset exhibits a fundamental problem often encountered in GIS: the need (or perceived need) to discretize data. The real world exhibits continuous, effectively infinite variation. Soils are a prime example of this; soil "classes" (if it is proper to speak of such things) are entities with fuzzy boundaries. How are we to judge where sandy clay ends and clayey sand begins, for example?
    The vandirt dataset was derived from the "soils" raster file, which covered the entire GVRD. After clipping this file using a municipal boundaries file, I converted it to a vector file and "played" with it in vector form for a while, getting a feel for where different soil types were most common. This also served to confirm much of my prior knowledge of hydrology and soils.
    The table for the soils database was enormous. The compilers of this data had attempted to discretize the soil classes by applying names to them which were barely differentiable: classes existed for "sandy clay" and "clayey sand", as well as "sandy clayey silt to silt" and similar descriptions. Through SQL queries, I assigned values to those classes which would be unsuitable for stream restoration, then transferred the result to raster form, where a simple reclass operation was enough to reduce it to a boolean image, vandirtb.
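
    The effect of those queries plus the final reclass is a lookup from class name to a suitability flag. A hedged Python sketch; the class names echo the ones quoted above, but the codes and the suitability assignments are invented for illustration:

    import numpy as np

    # Invented lookup: 1 = unsuitable for stream restoration, 0 = acceptable.
    unsuitable = {
        "sandy clay": 1,
        "clayey sand": 1,
        "sandy clayey silt to silt": 1,
        "gravelly sand": 0,
        "organic": 0,
    }

    # The raster holds integer class codes; class_names links code to name.
    class_names = {1: "sandy clay", 2: "clayey sand",
                   3: "sandy clayey silt to silt",
                   4: "gravelly sand", 5: "organic"}
    soils = np.random.randint(1, 6, (150, 150))  # stand-in for the clipped raster

    # RECLASS: one pass through the lookup gives the boolean image vandirtb.
    code_to_flag = {c: unsuitable[n] for c, n in class_names.items()}
    vandirtb = np.vectorize(code_to_flag.get)(soils).astype(np.uint8)
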

    Zoning data was available, but, like the block outlines, it continually crashed Idrisi for no apparent reason. Land use was deemed an appropriate surrogate, and may in fact have been more effective. The land use data derives from a GVRD dataset, and was clipped, processed, and examined. A boolean image was created for "suitable land uses".

    In summary, there was a great deal of data available for this project; however, much of it existed in formats that caused Idrisi to crash. This is unfortunate, as much of the remaining analysis had to rely on data from the network drive.



Cartographic Model:

       Because of its size, the Cartographic Model is displayed on its own page. Click here to view it.

   back to top

    back to table of contents

   forward to next section