|
As with any GIS project, there are many places where error
may have been introduced. Most of my measurements and input data were
hard numbers from reliable sources, so the initial inputs are
most likely not a cause for significant concern in a project at this level.
However, those numbers often related to Census Tracts (CTs), and not to the
secondary school catchment boundaries that I needed them for. To overcome this, as I
mentioned in my methodology, I used the “Split” tool to split the CTs along the
secondary school catchment boundaries. Using the population totals, I then
recalculated the population in each part of a CT by converting the total to a
population density and applying it to the area of each part. The problems with
this are clear. The method assumes people are spread evenly across the CT, but
apartment buildings may be present on one side of a CT and not the other. By
performing this operation I very likely introduced an ecological fallacy
somewhere, and I had already started with a MAUP (modifiable areal unit problem)
issue by using CTs rather than the smaller Dissemination Areas (I did not use
Dissemination Areas because the split would have cut through many more of them,
requiring many more recalculations). Two further problems arise where the split is concerned. As
labeled in the legend, I used the population of 10-19 year-olds in these
calculations; however, most 10-12 year-olds and 19 year-olds clearly do not go
to high school. I had hoped to use data for only those youth aged 13-18, but
Statistics Canada makes its data available only in 5-year groupings
of 10-14 and 15-19. Thus, the totals certainly contain youth who are not in
high school. Since my project was about the changes in these populations
rather than the hard numbers themselves, and since the bulk of those counted
would be attending high school, I did not rate this as a major
concern, although it should be noted in a section regarding error. Lastly,
regarding the split, the borders of the secondary school boundaries and the CTs
did not line up perfectly where they should have. This created
sliver polygons, which I checked in the attribute table; where a sliver was
large enough (over 0.5 km², which only one was),
I included it in the population recalculation. However, approximately
0.05 km² was missing from the catchment areas along the ocean-front boundaries
(mostly in the northernmost part of the map), which would have a marginal effect
on the recalculated populations, given that they were based on area.
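
The area-based recalculation described above can be sketched in a few lines. This is only an illustration of the general idea (areal weighting under a uniform-density assumption), not the exact steps run in the GIS software; the function name, catchment names, and all numbers are hypothetical.

```python
def areal_weighted_population(tract_population, tract_area_km2, piece_areas_km2):
    """Share a tract's population among its split pieces in proportion to area.

    Assumes people are spread evenly across the tract, which is exactly the
    assumption that breaks down when housing is clustered on one side.
    """
    if tract_area_km2 <= 0:
        raise ValueError("tract area must be positive")
    density = tract_population / tract_area_km2  # people per km^2
    return {zone: density * area for zone, area in piece_areas_km2.items()}


# Hypothetical tract: 1,200 residents aged 10-19 over 4.0 km^2,
# split between two catchment zones.
pieces = {"catchment_A": 3.0, "catchment_B": 1.0}
estimates = areal_weighted_population(1200, 4.0, pieces)
print(estimates)  # {'catchment_A': 900.0, 'catchment_B': 300.0}
```

Because each estimate is density times area, any area that drops out of the pieces (a discarded sliver, or a strip missing along the shoreline) removes a proportional share of the tract's population from the totals.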
Regarding the analysis of the data, I had originally
intended to analyze the correlations spatially using Ordinary Least Squares (OLS),
and initially did use this tool to compare the changes in school populations
with the Average Income and FI Scores values. However, I quickly realized
this was not an accurate representation of my data. First of all, with only 18 data
points, it is very hard to conclude anything with certainty, given that
correlations can appear in larger spatial units that are not actually present
when one breaks them down and looks at smaller spatial
units. As such, both my spatial autocorrelation and OLS results
were subject to this error. In addition, the OLS, while
showing a correlation between average income and changes in school
population, could not account for other factors that my analysis left out.
For example, many
parents are concerned about sending their children to schools with a
large number of ESL students because they believe learning will not progress as
quickly in the classroom, or that their English-speaking child will not receive
the attention they need. A stronger correlation might have been present if I
had looked at immigration data within the catchment boundaries. As such, I decided
in the end to simply model the spatial correlation between east-
and west-side schools, and to leave Average Income and FI Scores as a
secondary part of my project, intended to suggest possible
reasons for the pattern. The aim was not to state outright “this is why there is
spatial correlation”, but to prompt whoever reads this project to
question why the totals differ between the west and east
sides. One place where OLS could have come into play is between the change in
the 10-19 population and the change in school populations. However, given that
one showed very high spatial autocorrelation and the other did not, this would
have been redundant, as the Moran’s Index results already indicated that the two
datasets were not correlated.
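
For readers unfamiliar with the Moran’s Index mentioned above, the following is a minimal sketch of the global statistic. It assumes a simple binary adjacency weights matrix (1 where two zones share a border, 0 otherwise); the weighting scheme actually used by the GIS tool in this project may differ, and the four-zone example data are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]  # deviations from the mean
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations for every pair of zones, weighted by adjacency
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den


# Four hypothetical zones in a row; each zone neighbours the next one.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1, 2, 3, 4], adjacency)    # similar values sit side by side
alternating = morans_i([1, 4, 1, 4], adjacency)  # neighbours are dissimilar
print(round(clustered, 3), round(alternating, 3))  # 0.333 -1.0
```

Positive values indicate that similar values cluster together in space, and negative values indicate that neighbours tend to differ. The index is computed separately for each variable, so the comparison described above amounts to running it once per dataset.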
Clearly, a great deal of room is left to further
investigate this topic. As I suggested above, numerous other correlations could
be tested, and actual counts of youth in the catchment areas do
exist (the Ministry of Education holds them) and would be very helpful in any
further analysis of this topic.
|