Neighborhood Stress | Methodology

Methodology

The first problem I had to resolve was whether to work with census tract data or dissemination area data. I settled on dissemination areas for two reasons: I wanted to limit the effects of the Modifiable Areal Unit Problem as much as possible, and dissemination areas, being smaller than census tracts, are more likely to approximate neighborhood boundaries.

However, this choice came at a cost. There are fewer variables in the dissemination area summary tables than in the census tract tables; for example, there are no summaries for “low income cut-off” at the dissemination area level.

The selection of numerators and denominators is very important; a poorly chosen denominator gives you bad data from the start. For most of the variables, the proportion of the population with a given characteristic to the total population in the same category was used. In other instances, the neighborhood average of a characteristic was used.
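As an illustration of the numerator and denominator choice, the sketch below computes a proportion for each dissemination area. It is not the code used for this project: the field names (`unemployed`, `labour_force`), the identifiers, and the counts are hypothetical, and returning `None` for an empty denominator is an assumption about how suppressed or empty areas might be handled.

```python
def proportion(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero or missing."""
    if not denominator:
        return None
    return numerator / denominator


# Hypothetical dissemination-area records; names and counts are made up.
areas = [
    {"DAUID": "A001", "unemployed": 30, "labour_force": 400},
    {"DAUID": "A002", "unemployed": 12, "labour_force": 300},
    {"DAUID": "A003", "unemployed": 0, "labour_force": 0},  # empty or suppressed area
]

# The denominator is the population "in the same category" (here the labour
# force), not the total population of the area.
unemployment_rate = {
    a["DAUID"]: proportion(a["unemployed"], a["labour_force"]) for a in areas
}
```

Dividing the unemployed count by the total population instead of the labour force would understate the rate in areas with many children or retirees, which is the kind of bad data the denominator choice is meant to avoid.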

The next step was to standardize the selected variables so that they are comparable to each other. To do so, the Z-Scores were calculated according to the following formula:
The Standard Score is $z = (\chi - \mu) / \sigma$, where

χ is a raw score to be standardized
σ is the standard deviation
μ is the mean
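A minimal sketch of this standardization, using only the Python standard library, is given here. The sample values are made up, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not say whether the population or sample form was used.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values as (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(x - mu) / sigma for x in values]


# Hypothetical raw scores for one variable across eight areas.
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = z_scores(raw)
```

After standardization each variable has mean 0 and standard deviation 1, so variables measured as proportions and variables measured as averages can be compared on the same scale.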

Once the variables were standardized, factor analysis was performed on the Z-Scores. Principal Component Analysis with varimax rotation, a common method (Galster et al., 2005), was used.

In Principal Component Analysis a new set of variables is created as linear combinations of the original set. The linear combination that explains the maximum amount of variation is called the first principal component. A second principal component (another linear combination) is then found, uncorrelated with the first, so that it explains as much as possible of the remaining variability.
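The idea can be sketched with NumPy by taking the eigen-decomposition of the correlation matrix of the standardized variables. This is an illustration on randomly generated data, not the project's data or software, and it shows unrotated components only; the varimax rotation is not applied here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 areas x 4 variables; the first two share a common signal.
base = rng.normal(size=(200, 1))
X = np.hstack([
    base + 0.3 * rng.normal(size=(200, 1)),
    base + 0.3 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 2)),
])

# Standardize each column to Z-scores.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Eigen-decomposition of the correlation matrix, sorted by decreasing eigenvalue.
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Each column of eigvecs holds the weights of one linear combination.
scores = Z @ eigvecs                 # component scores, one column per component
explained = eigvals / eigvals.sum()  # share of total variance per component
```

The variance of each column of `scores` equals the corresponding eigenvalue, and the columns are uncorrelated with one another, which is the property described above.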

From the factor analysis, five principal components with eigenvalues greater than one were extracted. The first principal component, which correlated with Low Income Incidence, was used as the Index for Neighborhood Stress.
Simply put, this index captures the information from several variables into one composite measure.
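The retention rule and the index can be sketched as follows. This is a hedged illustration, not the procedure actually run: the function name `stress_index`, the `anchor_col` argument, and the random data are hypothetical, and the components are unrotated (no varimax). Because the sign of an eigenvector is arbitrary, the sketch flips the first component so that it correlates positively with a chosen anchor variable, which is an assumption about how one might orient the index.

```python
import numpy as np


def stress_index(Z, anchor_col):
    """Return (index, retained eigenvalues, loadings) from standardized data Z.

    Z has one row per area and one column per variable. Components with
    eigenvalue > 1 are retained; the first component's scores are the index.
    """
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Orient the first component toward the anchor variable (assumption).
    if eigvecs[anchor_col, 0] < 0:
        eigvecs[:, 0] = -eigvecs[:, 0]

    keep = eigvals > 1.0
    # Loadings are the correlations between each variable and each component.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    index = Z @ eigvecs[:, 0]
    return index, eigvals[keep], loadings


# Hypothetical data: 300 areas x 5 variables; the first three share a common signal.
rng = np.random.default_rng(1)
common = rng.normal(size=(300, 1))
X = np.hstack([common + 0.5 * rng.normal(size=(300, 3)), rng.normal(size=(300, 2))])
Z = (X - X.mean(axis=0)) / X.std(axis=0)

index, retained, loadings = stress_index(Z, anchor_col=0)
```

The first column of `loadings` shows how strongly each original variable correlates with the index, which is how one would check that the first component tracks a variable such as Low Income Incidence.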

The next step was to join the table of the first principal component to the shapefile of the Dissemination Area geography, using the Dissemination Area unique identifier (DAUID) as the foreign key.
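The join itself was done against a shapefile, but its logic can be sketched in plain Python as a left join on DAUID. The function name `join_on_dauid`, the `stress_index` field, and the identifiers are hypothetical. Comparing the keys as strings is an assumption intended to avoid mismatches when one table stores the identifier as a number and the other as text.

```python
def join_on_dauid(features, scores):
    """Left-join index scores onto area features by DAUID.

    Returns the joined records and the list of DAUIDs with no matching score.
    """
    by_id = {str(row["DAUID"]): row["stress_index"] for row in scores}
    joined, unmatched = [], []
    for feat in features:
        key = str(feat["DAUID"])
        merged = dict(feat)  # copy so the input records are not modified
        merged["stress_index"] = by_id.get(key)
        if key not in by_id:
            unmatched.append(key)
        joined.append(merged)
    return joined, unmatched


# Hypothetical attribute rows standing in for the shapefile's table.
features = [
    {"DAUID": "10010001", "name": "area one"},
    {"DAUID": "10010002", "name": "area two"},
    {"DAUID": "10010003", "name": "area three"},
]
# Hypothetical index table; note the numeric identifier in the second row.
scores = [
    {"DAUID": "10010001", "stress_index": 1.25},
    {"DAUID": 10010002, "stress_index": -0.4},
]

joined, unmatched = join_on_dauid(features, scores)
```

Checking `unmatched` after the join is a quick way to catch areas that were dropped from the analysis or identifiers that do not line up between the two tables.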
