Results
Table of Contents
i. Cluster Analyses of Coastline
ii. GWR of Coastline
a. Part i. With Ports
b. Part ii. Without Ports
c. Part iii. Logistic Model
d. Part iv. Inter-species Model
iii. Linear Regression Models
iv. Southern California Analyses
v. British Columbia Analyses
Cluster and HotSpot Analyses of the California Coastline Data




The results show clear clustering of the invasive species in four regions of California: the Northern Coast, San Francisco, Los Angeles and San Diego. Bays, marinas and population density cluster in the same areas, though over a somewhat wider range. Note that areas of particularly low population have negative Getis-Ord Z-scores.
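
For readers unfamiliar with the Getis-Ord statistic, the following is a minimal sketch of how a Gi* Z-score can be computed for each location. It is not the tool used to produce the maps above: the function name `getis_ord_gi_star`, the binary fixed-distance weights and the `threshold` parameter are assumptions made for illustration. A negative score marks a location whose neighbourhood holds values below the overall mean, which is how low-population areas end up with negative Z-scores.

```python
import math


def getis_ord_gi_star(values, coords, threshold):
    """Return a Gi* z-score per location (illustrative sketch only).

    Assumption: binary weights, w_ij = 1 when locations i and j lie within
    `threshold` of each other (a location counts as its own neighbour).
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation; clamp to zero to absorb rounding error.
    s = math.sqrt(max(sum(v * v for v in values) / n - mean * mean, 0.0))
    scores = []
    for i in range(n):
        weights = [
            1.0 if math.dist(coords[i], coords[j]) <= threshold else 0.0
            for j in range(n)
        ]
        w_sum = sum(weights)
        w_sq_sum = sum(w * w for w in weights)
        numerator = sum(w * v for w, v in zip(weights, values)) - mean * w_sum
        denominator = s * math.sqrt(max(n * w_sq_sum - w_sum ** 2, 0.0) / (n - 1))
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


# Hypothetical transects along a line: high counts in the west, none in the east.
demo_coords = [(float(i), 0.0) for i in range(6)]
demo_values = [10, 9, 8, 0, 0, 0]
demo_scores = getis_ord_gi_star(demo_values, demo_coords, threshold=1.0)
```

In this toy data the western end receives a positive score and the eastern end a negative one.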

The Multi-Distance Spatial Cluster Analysis for invasive species shows that observed values exceed expected values at some distances, up to a certain level. This indicates at least some spatial clustering, but not at all distances.
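
Multi-distance cluster analysis is commonly based on Ripley's K function. The comparison of observed and expected values can be illustrated with a minimal sketch of the L(d) transformation of K. It assumes a simple estimator without edge correction and a known study `area`; the function name `l_function` and its parameters are hypothetical and not taken from the software used for this analysis. Under complete spatial randomness the expected value of L(d) is d itself, so an observed value above d suggests clustering at that distance.

```python
import math


def l_function(points, distance, area):
    """Observed L(d) from a naive Ripley's K estimate (no edge correction)."""
    n = len(points)
    # Count ordered pairs (i, j), i != j, that lie within `distance`.
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= distance
    )
    k = area * pairs / (n * (n - 1))
    return math.sqrt(k / math.pi)


# Hypothetical sightings: two tight groups inside a 10 x 10 study area.
clustered = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1), (9.0, 9.0), (9.1, 9.0)]
observed = l_function(clustered, distance=0.5, area=100.0)
expected = 0.5  # L(d) = d under complete spatial randomness
```

Repeating the calculation over a range of distances gives the observed-versus-expected curve; clustering is indicated only at the distances where the observed value is the larger of the two.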

The Moran's I statistic has a very high Z-score, which indicates a very high level of clustering and makes it very unlikely that this clustering is the result of random chance.
Hence, we conclude that clustering exists in California for invasive tunicates, and that the clustering is statistically significant.
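
As a rough illustration of what the Moran's I test measures, the sketch below computes the global statistic, its expected value under no spatial autocorrelation, and a Z-score. The Z-score here uses the variance under the normality assumption, which may differ from the variant the original software reports. The function name `morans_i`, the binary distance-threshold weights and the sample data are all assumptions.

```python
import math


def morans_i(values, coords, threshold):
    """Return (I, expected I, z-score) using binary distance weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # w[i][j] = 1 when i and j are distinct and within `threshold`.
    w = [
        [
            1.0 if i != j and math.dist(coords[i], coords[j]) <= threshold else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    i_stat = (n / s0) * cross / sum(d * d for d in dev)
    expected = -1.0 / (n - 1)
    # Variance of I under the normality assumption.
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in range(n) for j in range(n))
    s2 = sum(
        (sum(w[i]) + sum(w[j][i] for j in range(n))) ** 2 for i in range(n)
    )
    variance = (n * n * s1 - n * s2 + 3 * s0 * s0) / ((n * n - 1) * s0 * s0)
    variance -= expected ** 2
    z_score = (i_stat - expected) / math.sqrt(variance)
    return i_stat, expected, z_score


# Hypothetical counts along a line of transects: high in the west, zero in the east.
line_coords = [(float(i), 0.0) for i in range(8)]
line_values = [5, 5, 5, 5, 0, 0, 0, 0]
i_stat, expected_i, z_score = morans_i(line_values, line_coords, threshold=1.0)
```

A value of I well above its expected value, with a large positive Z-score, is the pattern described above for the tunicate data.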
Geographically Weighted Regression of the California Coastline Data
Part I: With Ports
Results of this GWR analysis can be seen here.
The important facts:
- 624 observations read
- Global R-square is 0.172
- Local GWR R-square is 0.369
- Global T-statistic is positive for all factors, with port traffic volume having the highest correlation and population with the lowest
- ANOVA F-test is 4.2744
- Significance test is only valid for population
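
The gap between the global and local R-square values reflects what GWR does: instead of one set of coefficients for the whole coast, it fits a separate distance-weighted regression at every location. The sketch below shows the idea for a single predictor only; the model reported here used several factors. The Gaussian kernel, the fixed `bandwidth`, the function name `gwr_local_fits` and the sample data are assumptions for illustration, not the settings of the original run.

```python
import math


def gwr_local_fits(coords, x, y, bandwidth):
    """Fit y = a + b * x at each location with Gaussian distance weights.

    Returns a list of (intercept, slope), one pair per location.
    """
    fits = []
    for centre in coords:
        # Nearby observations get weights near 1, distant ones near 0.
        w = [
            math.exp(-0.5 * (math.dist(centre, other) / bandwidth) ** 2)
            for other in coords
        ]
        w_total = sum(w)
        x_bar = sum(wi * xi for wi, xi in zip(w, x)) / w_total
        y_bar = sum(wi * yi for wi, yi in zip(w, y)) / w_total
        s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
        s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
        slope = s_xy / s_xx if s_xx > 0 else 0.0
        fits.append((y_bar - slope * x_bar, slope))
    return fits


# Hypothetical transects: the predictor helps in the west and hurts in the east.
demo_coords = [(float(i), 0.0) for i in range(10)]
demo_x = [float(i % 3) for i in range(10)]
demo_y = [xi if i < 5 else -xi for i, xi in enumerate(demo_x)]
local_fits = gwr_local_fits(demo_coords, demo_x, demo_y, bandwidth=1.0)
```

Mapping the local slope for one factor at a time gives the kind of per-factor map discussed in this section, where the same variable can have a positive effect in one region and a negligible or negative effect in another.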

This map shows the effect of bays on the presence of invasive tunicates. Bays strongly affect tunicates in the San Francisco Bay area while having only very moderate effects along the rest of the coastline, and an almost negative effect on the presence of tunicates on the Northern Coast.

Filtering for coastline length, tunicates in San Francisco Bay and Los Angeles are heavily affected by the amount of coastline in each transect. This makes sense, as the coastlines are much more winding in those regions than along the more isolated regions in the middle of this map, where there is a negligible to negative effect on invasives.

Filtering for marinas, we see the surprising result that tunicates on the Northern Coast, in San Diego and in parts of the middle regions of the map are highly affected by the presence of marinas, while those in San Francisco Bay and Los Angeles seem unaffected, or may even experience a negative correlation with marinas.

Filtering for population, we see a similar result, although the differences are not as pronounced. Tunicates on the Northern Coast, in San Diego and in southern Los Angeles are highly affected by population, while those in San Francisco Bay seem unaffected, or may even experience a negative correlation with population.

Filtering for ports, we see a fairly even correlation between tunicate sightings and major ports, which is somewhat strange as very few transects have the port traffic attribute. However, the spatial result is as expected: the Northern Coast, San Francisco Bay, Los Angeles and San Diego areas, all of which have port traffic, show a high correlation in the GWR model.
Part II: Without Ports
Ports exist in only a few transects but have highly varying levels of trade volume, so they may be skewing the results in the few regions where they exist. Running a GWR model without ports, we get these results:
The results of this regression can be seen here.
The important facts:
- 624 observations read
- Global R-square is 0.06772
- Local GWR R-square is 0.27314
- Global T-statistic is positive for all factors, with marinas, coast length, and bays having the highest correlation and population with the lowest relatively
- ANOVA F-test is 4.2435
- Significance test is only valid for population, but at a high 0.1% level.
- Overall significance is improved for all variables compared with the GWR that included ports as the P values have gotten dramatically lower overall.
Mapping these results:

Absolutely no visible change here. Ports have no skewing effect on how bays are affecting tunicates in the GWR model.

Very minor difference here: more effect in Los Angeles and slightly less in San Francisco. Port traffic was probably having more of an effect in Los Angeles, since removing it gave coastline length a greater effect there (in essence, more of the effect was reallocated to coastline length). Conversely, ports had less of an effect in San Francisco, since removing them gave coastline length a smaller effect there.

Again a minor difference: marinas now have a more positive correlation with tunicates in the San Francisco Bay area. This makes more sense intuitively, as the SF Bay has many tunicates as well as marinas. It also suggests that port traffic was probably having more of a positive effect in the San Francisco Bay area, since removing it gave marinas a greater positive effect on the region's tunicates (in essence, more of the regression effect was reallocated to marinas).

Barely any noticeable difference: population now has an even more positive correlation with tunicates in the Los Angeles area, meaning the presence of ports was taking part of that effect away. Curiously, the San Francisco Bay area still shows a low regression value.
With these results we can conclude that the removal of ports had only a very minor effect on the regression results, and that the port volume layer is not strongly or seriously skewing the results of the Geographically Weighted Regression.
Part III: A Logistic Model
GWR Regression Results can be seen here.
Apart from the results above, this model failed to run completely: its output attribute table contained no values. Thus, with the data set available to this project, it was not possible to run GWR with the Poisson or Logistic model; only the Gaussian model could be run.
Part IV: An Inter-Species Model
In this model we run GWR for individual species to see which ones have particular effects on each other. We also produce a population map for each tunicate, given that population is commonly the most significant variable.
Styela Regression model: here
Important facts:
- 624 observations read
- Global R-square is 0.540
- Local GWR R-square is 0.651
- Global T-statistic is positive for all factors except population
- Ciona and Didemnum sightings are highly correlated, especially Ciona
- ANOVA F-test is 4.2185
- Significance test is only valid for population, but at a high 1% level
Maps:

This map shows where Ciona is influencing the presence of Styela. No real surprise to find both of them clustered in the San Francisco Bay and Los Angeles as that is where they are both present (see Methodology)

Again, the results are mostly as expected: particularly high values in San Francisco Bay where both are observed, moderately high regression on the Northern Coast, where there are plenty of Didemnum and a small number of isolated Styela sightings, and light areas where neither is found.

It is surprising to see, once again, San Francisco Bay having a low or negative correlation between its population and Styela. Population in Los Angeles, however, seems to have a strong effect on Styela.
Ciona Regression model: here
Important facts:
- 624 observations read
- Global R-square is 0.538
- Local GWR R-square is 0.561
- Global T-statistic is positive for all factors except marinas
- Styela and Didemnum sightings are highly correlated, especially Styela
- ANOVA F-test is 9.8162
- Significance test is only valid for coast line length
Maps:

These maps are odd: the GWR map for Ciona filtered with Styela differs greatly from the map for Styela filtered with Ciona. Although the regression results indicate similar global values, running the models in opposite directions produces entirely different local spatial results. In both cases, Los Angeles is a place with high regression.
Didemnum Regression model: here
Important facts:
- 624 observations read
- Global R-square is 0.128
- Local GWR R-square is 0.519
- Global T-statistic is negative for bays, population and marinas, and positive for coastal length, Ciona and Styela
- Ciona and Styela correlations are particularly high
- ANOVA F-test is 9.1964
- Significance test is only valid for population, but at a high 1% level
Maps:

These maps are similar, although they filter for completely different things (Ciona and human population). They show that Didemnum has a high regression overall on the northern coast of California. This is because most of the northern coast tunicate observations are of Didemnum, while Didemnum has a relatively lower presence in San Francisco Bay and is absent from Los Angeles and San Diego, leading to low regional regression statistics there.
Linear Regression Models
The entire Excel linear regression analyses can be found here.
Some of the interesting statistics are as follows.
For ease of reading, the statistics are summarized. This is an example of what each linear regression result looks like:
| Invasives with Population | |
| Multiple R | 0.133554 |
| R Square | 0.017837 |
| Adjusted R Square | 0.016255 |
| Standard Error | 2.782296 |
| Observations | 623 |

ANOVA
| | df | SS | MS | F | Significance F |
| Regression | 1 | 87.30298 | 87.30298 | 11.27775 | 0.000832 |
| Residual | 621 | 4807.268 | 7.741173 | | |
| Total | 622 | 4894.571 | | | |
- Low R Square but high F-test
Please view the Excel file for the rest of these detailed statistics if desired.
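
The summary statistics in the example output follow directly from its ANOVA rows, which can be checked with a few lines of arithmetic. The sketch below uses only numbers from the Invasives with Population example; the variable names are illustrative. Each result agrees with the corresponding figure in the output to within rounding.

```python
import math

# Values copied from the ANOVA rows of the example regression output.
ss_regression = 87.30298
ss_total = 4894.571
ms_residual = 7.741173
df_regression = 1
df_residual = 621

# R Square: share of the total variation explained by the regression.
r_square = ss_regression / ss_total

# F: mean square of the regression over mean square of the residuals.
f_stat = (ss_regression / df_regression) / ms_residual

# Adjusted R Square penalises for the number of predictors.
adjusted_r_square = 1 - (1 - r_square) * (df_regression + df_residual) / df_residual

# Standard Error: square root of the residual mean square.
standard_error = math.sqrt(ms_residual)

print(round(r_square, 6), round(f_stat, 4), round(adjusted_r_square, 6))
```

This is why a low R Square can sit alongside a high F-test: with 621 residual degrees of freedom, even a small explained share of the variation is large relative to the residual mean square.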
| Invasives with: | R-Square | F-Test |
| Population | 0.017837 | 11.27775 |
| Coastlength | 0.046695 | 30.41778 |
| Bays/Marinas | 0.052227 | 34.21989 |
| Port Volume of Trade | 0.186089 | 141.9827 |

| Bays/Marinas with: | R-Square | F-Test |
| Population | 0.022768 | 14.4685 |
| Coastlength | 0.084485 | 57.30644 |

| Species Regression | R-Square | F-Test |
| Styela & Ciona | 0.495579 | 610.1139 |
| Styela & Didemnum | 0.018058 | 11.42023 |
| Didemnum & Ciona | 0.074194 | 49.76684 |
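
For a regression with a single predictor, the F-Test value is determined by R-Square and the residual degrees of freedom, so the two columns above can be cross-checked against each other. The sketch below assumes every model in these tables has the same 621 residual degrees of freedom as the Invasives with Population example; that is an assumption, and the helper name `f_from_r_square` is hypothetical.

```python
def f_from_r_square(r_square, df_residual, df_regression=1):
    """F statistic implied by R-Square for a given pair of degrees of freedom."""
    return (r_square / df_regression) / ((1 - r_square) / df_residual)


# (R-Square, reported F-Test) pairs copied from the summary tables.
reported = {
    "Invasives with Population": (0.017837, 11.27775),
    "Invasives with Port Volume of Trade": (0.186089, 141.9827),
    "Styela & Ciona": (0.495579, 610.1139),
    "Didemnum & Ciona": (0.074194, 49.76684),
}

# Difference between the implied and the reported F for each model.
differences = {
    name: abs(f_from_r_square(r2, df_residual=621) - f_reported)
    for name, (r2, f_reported) in reported.items()
}
```

The implied and reported values differ only by rounding in the published R-Square figures, which is consistent with the assumed degrees of freedom.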
Southern California Models
HotSpot analyses of the Southern California data set:


These results are not especially meaningful, as we already know where the observations were taken, but they do indicate areas where the author sampled in relative clusters versus in isolation.
Linear Regression Models
| Regression | R-Square | F-Test |
| Species Richness with Population | 0.091202488 | 1.706037132 |
| Species Richness with Coast Length | 0.2749951 | 6.44811739 |
| Species Richness with Bays and Marinas | 0.448968115 | 13.85120928 |
| Species Abundance with Population | 0.06583302 | 1.198031357 |
| Species Abundance with Coastal Length | 0.162724617 | 3.303952964 |
| Species Abundance with Bays and Marinas | 0.449130878 | 13.86032473 |
| Species Abundance with Species Richness | 0.775493281 | 58.72156436 |
| Bays and Marinas with Population | 0.019558389 | 0.339125361 |
| Bays and Marinas with Coast Length | 0.472641614 | 15.23614234 |
| Population with Coast Length | 0.019387932 | 0.336111345 |
Analyses of the British Columbia coastline dataset

Single Kernel Density Analysis of the British Columbia boat and tunicate densities. Murray, 2008. This shows where there is clustering of the boats and tunicates.

Nearest Neighbor Hierarchical cluster analyses on boat density. Murray, 2008
This Nearest Neighbor Hierarchical cluster analysis was run with a minimum of 10 points per cluster. Zooming in on the Vancouver Island region, its results show significant clustering of boat densities (notice the yellow rings), much more so than in northern parts of British Columbia.

Dual kernel density analysis on boats and tunicates in British Columbia. Murray, 2008
This final map shows how the two factors are related in a hotspot/cluster analysis. The points of highest correlation are clearly concentrated in southwestern British Columbia, along the coast near Vancouver Island. This is where both boats and tunicates are most prevalent in the province.