Results
A regression model of all 13 human impact and stream environment variables on hybridization levels in westslope cutthroat trout yielded these statistics:
| Human Impact Variables (HiV) | Coefficient | Probability |
| Rainbow Trout Stocking (1950-1960) | -0.000028 | 0.176195 |
| Rainbow Trout Stocking (1990-2000) | -0.000021 | 0.281110 |
| Pure Rainbow Trout Sites | -0.000200 | 0.000000* |
| Powerlines | -0.000053 | 0.002622* |
| Pipelines | -0.000013 | 0.576095 |
| Railroads | -0.000085 | 0.000853* |
| Access Roads | -0.000176 | 0.010494* |
| Stream Environment Variables (SeV) | Coefficient | Probability |
| Water Temperature | 0.020016 | 0.652141 |
| Mean Depth | 0.620871 | 0.487905 |
| Maximum Depth | 0.796318 | 0.041898* |
| Elevation | -0.003979 | 0.013060* |
| Physical Barriers | -0.054872 | 0.830848 |
| Stream Order | 0.245955 | 0.338436 |
How well does this model predict the pattern of hybridization?
The results of the OLS regression show that this model explains 63.2%
of the observed pattern of hybridization of westslope cutthroat trout
in my study areas.
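As a rough illustration of what "explains 63.2%" means, the sketch below computes the coefficient of determination (R^2) from observed and model-predicted values. The function name `r_squared` and the sample numbers are hypothetical; they are not the study's data.

```python
def r_squared(observed, predicted):
    """Share of the variance in `observed` accounted for by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: variation the model leaves unexplained.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variation of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical hybridization levels at five sites (not the study's data).
observed = [0.10, 0.40, 0.35, 0.80, 0.60]
predicted = [0.20, 0.35, 0.40, 0.70, 0.60]
print(round(r_squared(observed, predicted), 3))
```

An R^2 of 1 would mean the predictions match the observations exactly; an R^2 of 0 would mean the model does no better than always predicting the mean.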
Coefficients
A positive coefficient represents a positive relationship between hybridization level and the explanatory variable. For example, hybridization levels increase with higher maximum depths. This can be contrasted with the negative relationship between hybridization level and distance to an access road: hybridization levels decrease as distance to the nearest access road increases. This sounds a bit confusing but makes sense intuitively: the closer you are to an access road (smaller value), the higher the hybridization level (higher value).
The magnitude of a coefficient is a measure of the strength of the relationship between an explanatory variable and hybridization level. The higher the absolute value, the stronger the relationship; lower absolute values indicate weaker relationships.
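To make the sign and size of a coefficient concrete, the sketch below multiplies two coefficients from the all-variables table by a change in their variables. The distance unit is not stated in this section, so metres is an assumption, as is the helper name `predicted_change`. Because each coefficient is expressed per unit of its own variable, raw magnitudes are most safely compared between variables measured in the same units.

```python
# Coefficients taken from the all-variables table.
ACCESS_ROADS = -0.000176   # per unit of distance to the nearest access road
MAXIMUM_DEPTH = 0.796318   # per unit of maximum depth


def predicted_change(coefficient, delta):
    """Change in predicted hybridization level when one variable changes by
    `delta` and all other variables are held constant."""
    return coefficient * delta


# Assumption: distances are in metres (the unit is not stated in the source).
print(predicted_change(ACCESS_ROADS, 1000))  # negative: farther away, lower level
print(predicted_change(MAXIMUM_DEPTH, 1))    # positive: deeper, higher level
```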
Statistical significance of the coefficients
The probabilities listed in the table are p-values. They indicate whether the relationship depicted by a coefficient could plausibly be the result of random chance (p-value greater than 0.05) or is statistically significant (p-value less than or equal to 0.05) and reliable.
Six of the explanatory variables have reliable coefficients that depict their relationships with hybridization levels. These six variables were used to create the next model.
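The selection step can be sketched as a simple filter on the p-values from the all-variables table, using the 0.05 threshold given in the text. The variable names `ALPHA`, `p_values` and `significant` are illustrative only.

```python
ALPHA = 0.05  # significance threshold used in the text

# Variable name -> p-value, as listed in the all-variables table.
p_values = {
    "Rainbow Trout Stocking (1950-1960)": 0.176195,
    "Rainbow Trout Stocking (1990-2000)": 0.281110,
    "Pure Rainbow Trout Sites": 0.000000,
    "Powerlines": 0.002622,
    "Pipelines": 0.576095,
    "Railroads": 0.000853,
    "Access Roads": 0.010494,
    "Water Temperature": 0.652141,
    "Mean Depth": 0.487905,
    "Maximum Depth": 0.041898,
    "Elevation": 0.013060,
    "Physical Barriers": 0.830848,
    "Stream Order": 0.338436,
}

# Keep only the variables whose coefficient is significant at ALPHA.
significant = [name for name, p in p_values.items() if p <= ALPHA]
print(significant)
```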
The 'Best Fit', Six Variable Model; Adjusted R^2 value = 0.648156
| Human Impact | Coefficient | Probability |
| Pure Rainbow Trout Sites | -0.000193 | 0.000000* |
| Powerlines | -0.000053 | 0.000594* |
| Railroads | -0.000103 | 0.000000* |
| Access Roads | -0.000190 | 0.015553* |
| Stream Environment | Coefficient | Probability |
| Maximum Depth | 0.004978 | 0.000502* |
| Elevation | -0.569474 | 0.015553* |
* denotes statistical significance of the coefficient
Compared to the first model with all of the variables included, this model does a slightly better job of predicting observed hybridization levels in westslope cutthroat trout, explaining 64.5% of the observed pattern versus 63.2%.
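For readers unfamiliar with the adjusted R^2 reported for this model, the sketch below shows the standard adjustment, which penalizes R^2 for the number of explanatory variables. The sample size is not given in this section, so `n_sites = 100` and the raw R^2 of 0.67 are purely hypothetical placeholders.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical inputs: the sample size and raw R^2 are not stated in the text.
n_sites = 100
print(adjusted_r_squared(0.67, n_sites, n_predictors=6))
print(adjusted_r_squared(0.67, n_sites, n_predictors=13))  # larger penalty
```

With the same raw R^2, a 13-variable model is penalized more than a six-variable one, which is one reason a smaller model can score slightly higher on this measure.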
How good is this best fit model? Can we trust it?
This model yielded the highest R^2 value. To fully assess the validity of the model, I performed a spatial autocorrelation analysis on the residuals from this regression. The residuals were not spatially autocorrelated (Moran's I = 0.111445, p value = 0.113055). This means that the residuals from the OLS analysis are randomly distributed and my model is not missing a key variable.
Looking at the coefficients, each variable has the positive or negative relationship expected from deduction. For all human impact variables specified in this model, the closer the site is to each variable, the higher the hybridization level. For stream environment variables, the relationships may not be as obvious and are examined further in the discussion.
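The statistic behind this check can be sketched in a few lines. This is a minimal pure-Python version of global Moran's I, assuming a simple binary neighbour weight matrix; whatever software produced the figures above may use a different weighting scheme, and the residuals and weights below are made up. The sketch does not compute the p-value.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weight matrix.

    weights[i][j] > 0 means sites i and j are neighbours; the diagonal is 0.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total of all weights
    # Weighted cross-products of deviations between neighbouring sites.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical residuals at four sites along a stream; each site neighbours
# the next one (binary contiguity weights).
residuals = [1.0, 2.0, 3.0, 4.0]
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i(residuals, weights))  # positive: similar values sit together
```

Values close to the null expectation of -1/(n-1), which approaches zero for large samples, suggest no spatial pattern; strongly positive values indicate that similar residuals cluster together.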
Next, I checked for redundancy amongst my variables by examining the variance inflation factor (VIF) from the OLS statistics. None of my variables showed values greater than 7.5, so there is no over-count bias.
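As a sketch of what the VIF measures: for each explanatory variable, it is computed from the R^2 obtained when that variable is regressed on all of the other explanatory variables. The function name below is hypothetical, the R^2 inputs are invented, and the 7.5 cutoff is the one quoted in the text.

```python
VIF_THRESHOLD = 7.5  # cutoff quoted in the text


def vif_from_r2(r2_against_others):
    """VIF = 1 / (1 - R_j^2), where R_j^2 comes from regressing variable j
    on all the other explanatory variables."""
    if not 0.0 <= r2_against_others < 1.0:
        raise ValueError("R^2 must be in [0, 1)")
    return 1.0 / (1.0 - r2_against_others)


# Hypothetical R_j^2 values, not taken from the study.
for r2 in (0.0, 0.5, 0.9):
    vif = vif_from_r2(r2)
    print(r2, round(vif, 2), "redundant" if vif > VIF_THRESHOLD else "ok")
```

A VIF of 7.5 corresponds to roughly 87% of a variable's variance being explained by the other variables.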
All variables in this model show statistically significant coefficients (probability values), which means that the relationships given by the coefficients are reliable at the 95% confidence level and unlikely to be due to chance.
The Jarque-Bera test also shows that the residuals from the regression analysis are normally distributed, which means I am not missing any key variables.
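A minimal sketch of the Jarque-Bera statistic, which combines the skewness and kurtosis of the residuals, is given below. The residual values are invented for illustration, and the p-value uses the chi-squared distribution with two degrees of freedom, an approximation that is only trustworthy for large samples.

```python
import math


def jarque_bera(residuals):
    """Return (JB statistic, asymptotic p-value) for a list of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments (population form, dividing by n).
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Chi-squared survival function with 2 degrees of freedom is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Hypothetical residuals (not the study's data).
jb, p = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
print(jb, p)  # a large p-value gives no evidence against normality
```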
Map of All Variables Model
Map of Residuals From Ordinary Least Squares Regression in All Variables Model



