General methodology

 

The bulk of this study was carried out within Esri’s GIS software, ArcMap 10. Microsoft Excel was also used to sort data tables and carry out non-spatial statistics. Within ArcMap, all data were given the same projection: the Universal Transverse Mercator (UTM) projected coordinate system (NAD 1983, zone 10).

Generation of independent variables

 

The literature review conducted before this study indicated that there are three main factors that would have a pronounced negative effect on the sale price of a home. These factors were also chosen because the data were not only obtainable, but could be analyzed within the scope of the project.

The locations of medical care centers (hospitals, clinics and hospices) were retrieved from a data drive in the Department of Geography at UBC; they were sourced from the City of Vancouver. The locations of funeral homes and crematoriums were compiled from addresses and coordinates found through an internet search. In order to determine where the T intersections were for the streets of Vancouver, a network analysis was carried out on a street shapefile. This produced a junction point layer, which was exported as a shapefile. A spatial join of the street and junction files yielded a field with the number of street segments meeting at each junction. Where three segments were present, a T intersection point was exported. The ‘select by attributes’ tool was used to identify those homes which contained a 4 in the address.
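The junction-counting step can be sketched outside ArcMap as follows. This is a minimal illustration, not the workflow used in the study: the input format (one pair of endpoints per street segment, already split at every junction), the function name and the toy network are all assumptions made for the example.

```python
from collections import Counter

def find_t_junctions(segments):
    """Return junction coordinates where exactly three street segments meet.

    segments: iterable of ((x1, y1), (x2, y2)) endpoint pairs, one per
    street segment (hypothetical input format, assumed for this sketch).
    """
    degree = Counter()
    for start, end in segments:
        degree[start] += 1  # each segment touches the junction at its start...
        degree[end] += 1    # ...and the junction at its end
    return sorted(pt for pt, n in degree.items() if n == 3)

# Toy network (made up): three segments meet at (0, 0), four at (10, 0).
segments = [
    ((-5, 0), (0, 0)),
    ((0, 0), (10, 0)),
    ((0, 0), (0, 5)),
    ((10, 0), (15, 0)),
    ((10, 0), (10, 5)),
    ((10, 0), (10, -5)),
]
t_junctions = find_t_junctions(segments)
print(t_junctions)
```

Dead ends have a count of one and four-way crossings a count of four, so only the three-segment junction is kept.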

The input data set on Vancouver real estate sales contained many fields that could plausibly be a factor in sale price. Ordinary Least Squares (OLS) analysis was run to determine which independent variable best predicted the dependent variable, ‘sale price’. Of the factors considered (home size, lot size, bedrooms, land value etc.), home size, defined in square feet, was the best predictor.
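One way to compare candidate predictors is to fit each one alone against sale price and rank the fits by R². The sketch below assumes that criterion, which the text does not state, and uses made-up numbers purely for illustration; the variable names are hypothetical.

```python
def r_squared(x, y):
    """R^2 of a one-predictor OLS fit y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # For a single predictor, R^2 equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)

# Made-up values for illustration only; these are not the study's data.
sale_price = [500, 620, 710, 850, 990]            # thousands of dollars
candidates = {
    "home_size": [1000, 1250, 1400, 1700, 2000],  # square feet
    "bedrooms": [2, 3, 3, 3, 4],
}

scores = {name: r_squared(values, sale_price)
          for name, values in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```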

Regression analysis

 

Regression analysis is a method that describes how the value of a dependent variable changes when an independent variable is varied (while the remaining independent variables remain fixed). One assumption of the basic regression model is that the observations are independent of one another. This assumption does not hold for spatial data: according to Tobler’s Law, nearby observations tend to be more alike than distant ones, so variables exhibit spatial dependence. The implication for the regression model is that the parameter estimates can be biased (Charlton & Fotheringham 2009).
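In symbols, a basic regression model of this kind is commonly written as below. The notation is a standard convention assumed here for illustration and is not taken from the study: y_i is the dependent variable for observation i, x_ik the k-th independent variable, and ε_i the error term.

```latex
y_i = \beta_0 + \sum_{k=1}^{p} \beta_k x_{ik} + \varepsilon_i
```

Each coefficient β_k is the change in y for a one-unit change in x_k with the other independent variables held fixed, and the independence assumption applies to the error terms ε_i.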

Geographically weighted regression (GWR) is a local spatial statistical technique for analyzing spatially non-stationary relationships, that is, relationships that differ from location to location (Mennis 2006). The parameters may be estimated anywhere in the study area, given that a dependent variable and a set of independent variables are measured at known locations. Taking Tobler’s Law into consideration, when parameters are estimated for some location, observations which are nearer to this location are given a higher weight than observations which are further away. The weights are computed from a weighting scheme known as a kernel (Charlton & Fotheringham 2009).
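The weighting idea can be illustrated with a single-predictor local fit. The sketch below assumes a Gaussian kernel with a fixed bandwidth; the text does not say which kernel or bandwidth was used, and the function names and toy data are hypothetical.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel (assumed): weight is 1 at distance 0 and decays with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(target, points, x, y, bandwidth):
    """Weighted least squares fit y = a + b*x, weighted around `target`."""
    w = [gaussian_weight(math.hypot(px - target[0], py - target[1]), bandwidth)
         for px, py in points]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Toy data (made up): the x-y relationship is y = x in the west, y = 3x in the east.
points = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [1, 2, 3, 3, 6, 9]

_, slope_west = local_fit((1, 0), points, x, y, bandwidth=5)
_, slope_east = local_fit((101, 0), points, x, y, bandwidth=5)
print(slope_west, slope_east)
```

A single global fit would average the two relationships, whereas the local fits recover a different slope at each location.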

Several rounds of GWR analysis were carried out, each time keeping ‘sale price’ as the dependent variable and varying the independent variable. The first runs used solely ‘home size’ and then solely ‘lot size’, to determine the modeling accuracy without taking the Feng Shui elements into consideration. Before a further GWR could be carried out to include the Feng Shui attributes, they first needed to be given a value that reflected their relation to the surrounding real estate. Moreover, in order to express the increasing effect that more than one attribute may have on a given home, the ‘point density’ tool was used. It calculates a magnitude per unit area from the point features (Feng Shui element locations) that fall within a neighborhood around each cell. In order to determine which homes were located at a T intersection, the ‘near’ tool was used. It determines the distance from each input feature to the nearest feature in another layer. Any homes which fell within 10 m of a T intersection were deemed to be on that intersection. Although it would have been ideal to include a variable indicating which homes had a 4 in their address, binary data (presence or absence) are not compatible with the ‘GWR’ tool in ArcMap 10. The GWR results which include these variables were then displayed over a choropleth map indicating the percentage of people of Chinese origin.
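The two distance-based steps can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of the ArcMap tools: coordinates are taken to be in metres (as in a UTM projection), the density neighborhood is assumed to be a circle, and all names and values are made up.

```python
import math

def near_distance(home, features):
    """Distance from a home to its nearest feature (same units as the inputs)."""
    return min(math.hypot(home[0] - fx, home[1] - fy) for fx, fy in features)

def point_density(cell_centre, features, radius):
    """Number of features within `radius` of the cell centre, per unit area.

    A circular neighborhood is an assumption of this sketch.
    """
    count = sum(1 for fx, fy in features
                if math.hypot(cell_centre[0] - fx, cell_centre[1] - fy) <= radius)
    return count / (math.pi * radius ** 2)

# Toy coordinates in metres (made up).
t_junctions = [(0, 0), (100, 0)]
homes = {"A": (3, 4), "B": (50, 0)}

# A home within 10 m of a T intersection is treated as being on it.
on_t = {name: near_distance(xy, t_junctions) <= 10 for name, xy in homes.items()}

feng_shui_points = [(0, 0), (3, 4), (100, 0)]
density = point_density((0, 0), feng_shui_points, radius=10)
print(on_t, density)
```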

In order to determine the influence of the unlucky 4 in an address, the real estate sales attribute table and its GWR result parameters were split into two categories: homes with a 4 in the address and homes without. The GWR local R² values and the standard deviation of the residuals for the two categories were compared using the t-test within Excel. Under the assumption that the presence of a 4 should affect the sale price of a home, the results of the GWR (which did not include the 4 as an independent variable) should be different for the homes with a 4 than for those without.
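The split-and-compare step might look like the sketch below. It assumes an unequal-variance (Welch) t-test, since the text does not say which variant was run in Excel, and it treats any 4 anywhere in the address string as a match. The addresses and local R² values are invented for illustration.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical records: (address, GWR local R^2). Not the study's data.
records = [
    ("1234 Oak St", 0.60), ("2040 Elm St", 0.62), ("418 Fir St", 0.64),
    ("1250 Oak St", 0.70), ("2010 Elm St", 0.72), ("318 Fir St", 0.74),
]

with_four = [r2 for address, r2 in records if "4" in address]
without_four = [r2 for address, r2 in records if "4" not in address]

t_stat, dof = welch_t(with_four, without_four)
print(t_stat, dof)
```

Turning the statistic into a p-value requires the t distribution, which Excel's t-test function handles internally; this sketch stops at the statistic and degrees of freedom.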

Kernel density

 

Lastly, a kernel surface was created to visually reveal which areas of Vancouver had the highest density of negative Feng Shui locations. The ‘kernel density’ tool was used on each of the three location types, and the resulting layers were scaled from 0 to 1 with the ‘raster calculator’ tool. This tool was also used to create the final kernel surface by adding the kernel layers together.
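The rescale-and-add step amounts to simple cell-by-cell arithmetic. The sketch below assumes min-max rescaling, which is one way to reach a 0 to 1 range; the exact raster calculator expression is not given in the text. The small grids and layer names are made up and stand in for the three kernel density rasters.

```python
def rescale(grid):
    """Min-max rescale a grid to the range 0..1 (assumes the grid is not constant)."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    return [[(v - lo) / (hi - lo) for v in row] for row in grid]

def add_grids(*grids):
    """Cell-by-cell sum of equally sized grids."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*grids)]

# Toy 2x2 density grids (made up), one per location type.
medical = [[0, 2], [4, 8]]
funeral = [[10, 10], [20, 30]]
t_intersections = [[1, 3], [1, 2]]

combined = add_grids(rescale(medical), rescale(funeral), rescale(t_intersections))
print(combined)
```

Rescaling first gives each location type the same maximum contribution, so a layer with many points does not dominate the combined surface.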