Data Collection
 

Data Acquisition

All data used in this project are acquired through the internet.  Data acquisition is the most time-consuming and tedious part of the project, and most of the preparation time is spent obtaining the data.  Canadian census data are not distributed freely by the government, so they have to be bought from Statistics Canada if no other source is available.  For research purposes, however, Canadian data can be obtained from research institutions such as universities.  The Research Data Library of SFU provides the 1996 census data over the internet for use by students and researchers, at http://www.sfu.ca/rdl.  Because the SFU website does not provide the 1991 census data, these are obtained from the University of Toronto website: http://datacentre.chass.utoronto.ca/census/.  The details of what was obtained are listed below.

Census tract files from the SFU website:
ctpr1.ivt—Age and Sex and Families
ctpr2.ivt—Immigration & Citizenship
ctpr6.ivt—Labour market activities
ctpr7.ivt—Education, mobility and migration
ctpr8.ivt—Sources of income
ctpr9.ivt—Families
The corresponding attributes of the 1991 census data are obtained from the University of Toronto website.

The files from the SFU website are in ivt format and must be opened and read with the Beyond 20/20 program.  Data downloaded from the University of Toronto website are in comma-delimited format and must be imported into Microsoft Excel before they can be used.
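The project itself imports the comma-delimited files into Excel.  Purely as an illustration of what that import involves, the following minimal sketch reads a comma-delimited file with Python's standard csv module.  The column names (tract, total_population, immigrants) and all values are hypothetical and do not come from the actual census files.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded comma-delimited file;
# the column names and values are made up for illustration.
SAMPLE = """tract,total_population,immigrants
0001.00,5200,1300
0002.00,4100,900
"""


def read_census_csv(handle):
    """Return one dict per census tract, with numeric fields converted to int."""
    rows = []
    for record in csv.DictReader(handle):
        # Keep the tract identifier as text so leading zeros are preserved.
        row = {"tract": record["tract"]}
        for key, value in record.items():
            if key != "tract":
                row[key] = int(value)
        rows.append(row)
    return rows


rows = read_census_csv(io.StringIO(SAMPLE))
```

With a real file, the io.StringIO object would be replaced by an opened file handle.  Keeping the tract identifier as text is a design choice of this sketch, so that identifiers with leading zeros are not altered.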

The data are downloaded from the internet rather than taken from other sources, such as Statistics Canada publications in the library, because, although searching and converting take longer, the data can be imported directly into Excel, which minimizes data-entry errors.
 

Data Manipulation

Data downloaded from the internet are converted into Excel spreadsheets.  These are raw data, however, containing much information that this project does not need: they are census data for all of Canada, with many attributes about the population.  The first step is therefore to extract the useful census data: the 11 census tracts are selected and only the useful attributes are kept.  (This file can be downloaded following this link: RawData.xls).  These data are put into a new spreadsheet for further manipulation.
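The extraction step was done in Excel.  As a rough sketch of the same idea, the code below keeps only selected tracts and selected attributes from a list of records.  The tract identifiers, attribute names, and values are hypothetical; the 11 census tracts actually used are not listed in this section.

```python
# Hypothetical tract identifiers and attribute names; the real project
# selected 11 census tracts, which are not listed in this section.
SELECTED_TRACTS = {"0001.00", "0003.00"}
USEFUL_ATTRIBUTES = ["total_population", "immigrants"]


def extract_useful(rows, tracts, attributes):
    """Keep only the chosen tracts, and only the chosen attributes of each."""
    extracted = []
    for row in rows:
        if row["tract"] in tracts:
            kept = {"tract": row["tract"]}
            for name in attributes:
                kept[name] = row[name]
            extracted.append(kept)
    return extracted


# Made-up raw records with one attribute ("dwellings") the analysis ignores.
raw = [
    {"tract": "0001.00", "total_population": 5200, "immigrants": 1300, "dwellings": 2000},
    {"tract": "0002.00", "total_population": 4100, "immigrants": 900, "dwellings": 1500},
    {"tract": "0003.00", "total_population": 3000, "immigrants": 600, "dwellings": 1100},
]

subset = extract_useful(raw, SELECTED_TRACTS, USEFUL_ATTRIBUTES)
```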

Both the 1991 and 1996 data go through the same procedure.  Once all the useful data are in the spreadsheet, the 1996 data are compared with the 1991 data.  To show the trend in the data, the percentage change from 1991 to 1996 is calculated for each attribute of each census tract.  (The spreadsheet can be downloaded following this link: ProjectData.xls, or viewed in html format [this page is best viewed with Microsoft Internet Explorer].)  Percentage change is calculated as: (1996 data - 1991 data) / 1991 data * 100.  In addition, the percentage change of the aggregate total of the 11 census tracts is calculated for each attribute: the difference between the 1996 total and the 1991 total of the eleven census tracts, divided by the 1991 total and multiplied by 100.  These aggregate values are surrounded by thick borders in the spreadsheet.  They are calculated to give a clearer overall picture of the data for analysis.
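The two calculations above can be sketched as follows.  The function names and the sample figures are hypothetical, and returning None when the 1991 value is zero is an assumption of this sketch; the text does not say how such cases are handled in the spreadsheet.

```python
def percentage_change(value_1991, value_1996):
    """(1996 - 1991) / 1991 * 100, or None when the 1991 value is zero.

    The zero case is an assumption of this sketch, not part of the project.
    """
    if value_1991 == 0:
        return None
    return (value_1996 - value_1991) / value_1991 * 100


def aggregate_percentage_change(values_1991, values_1996):
    """Percentage change between the 1991 total and the 1996 total."""
    return percentage_change(sum(values_1991), sum(values_1996))


# Made-up populations for three tracts (the project uses eleven).
pop_1991 = [4000, 5000, 1000]
pop_1996 = [5000, 4500, 1500]

per_tract = [percentage_change(a, b) for a, b in zip(pop_1991, pop_1996)]
aggregate = aggregate_percentage_change(pop_1991, pop_1996)
```

Because the aggregate change is computed from the totals, it generally differs from the average of the per-tract percentages: tracts with larger populations carry more weight.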

Some attributes in the raw data are divided into too many categories, so they are grouped together where the analysis requires it.  The following attributes have been grouped or regrouped before calculating percentage change (shown italicized in the spreadsheet):

    - Male population is grouped into 3 age groups: 15-29, 30-64, and 65+ years
    - Female population is grouped into 3 age groups: 15-29, 30-64, and 65+ years
    - Each age group of each sex is added together (e.g. [male 15-29] plus [female 15-29] to produce [male+female 15-29]), then
       population change from 1991 to 1996 of each age group is calculated.
    - Aggregate population change of the 11 census tracts is also calculated.

    - Type of dwelling: semi-detached and row houses are grouped together
    - All dwelling types other than these two and single-detached houses are grouped as ‘apartments and other types of dwelling’
    - Again, percentage change and aggregate percentage change are calculated for each combined group.

    - Immigrants are grouped into three groups by place of birth: Europe, Asia, and Others.
    - Population percentage change and aggregate percentage change are calculated for each group.

    - Number of recent immigrants by place of birth is available only in the 1996 statistics, so no percentage change can be calculated
    - Recent immigrants are grouped into 4 groups: ‘Hong Kong, China, Taiwan’, ‘other Asia’, ‘Europe’, and ‘all others’.

    - Total income of population 15 years and over is grouped into 4 groups: ‘under $1000’, ‘$1000-$9999’, ‘$10000-$39999’, and
       ‘$40000 and over’.
    - Population percentage change and aggregate percentage change are calculated for each combined group.

    - Census family income is grouped into 4 groups: ‘under $10000’, ‘$10000-$39999’, ‘$40000-$69999’, and ‘$70000 and over’.
    - Population percentage change and aggregate percentage change are calculated for each combined group.

    - Household income grouping and calculation are the same as 'census family income'.
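The regrouping described in the list above amounts to summing fine-grained categories into broader ones.  The sketch below illustrates this for the age groups, assuming hypothetical source categories and made-up counts; the actual category breakdown in the census files may differ.

```python
# Hypothetical mapping from broad age groups to the finer source categories.
AGE_GROUPING = {
    "15-29": ["15-19", "20-24", "25-29"],
    "30-64": ["30-44", "45-64"],
    "65+": ["65-74", "75+"],
}


def regroup(counts, grouping):
    """Sum fine-grained counts into the broader groups given by `grouping`."""
    return {
        group: sum(counts[category] for category in members)
        for group, members in grouping.items()
    }


def combine_sexes(male, female):
    """Add male and female counts group by group (e.g. male+female 15-29)."""
    return {group: male[group] + female[group] for group in male}


# Made-up counts for a single census tract.
male_fine = {"15-19": 100, "20-24": 120, "25-29": 130,
             "30-44": 400, "45-64": 300, "65-74": 90, "75+": 40}
female_fine = {"15-19": 110, "20-24": 115, "25-29": 125,
               "30-44": 410, "45-64": 320, "65-74": 100, "75+": 60}

male_grouped = regroup(male_fine, AGE_GROUPING)
female_grouped = regroup(female_fine, AGE_GROUPING)
both_sexes = combine_sexes(male_grouped, female_grouped)
```

The same regroup function would apply to the dwelling, immigrant, and income groupings by supplying a different mapping; percentage change is then calculated on the grouped totals for each year.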

The rest of the data are used with the original groupings.  Percentage change and aggregate percentage change are calculated for these attributes as well.
 
 
 
 
 
 
