library(tidyverse)
library(aqm)
library(sf)
library(mapview)
library(mgcv)
library(raster)
library(tmap)
library(giscourse)
Last time we saw that students had measured pines at Arne in two distinct habitats. The main analysis that was carried out in all years focussed on the heatland site (Coombes heath) where the issue from a management perspective concerned unwanted regeneration.
Last year students also spent some time measuring pines in the “restoration” site. Here pines were regenerating after felling of a plantation of pines.
The data have all been added to the aqm package and can be loaded on the server with a line of R.
data(arne_pine_hts)
str(pine_hts)
## Classes 'sf' and 'data.frame': 256 obs. of 6 variables:
## $ Site : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
## $ WP : num 1 1 1 2 2 2 3 3 3 4 ...
## $ Age : num 8 4 7 5 7 7 4 5 4 5 ...
## $ DiMM : num 26.77 6.39 14.41 35 25.65 ...
## $ HeCM : num 100 43 72 125 95 77 95 99 75 141 ...
## $ geometry:sfc_POINT of length 256; first list element: 'XY' num -2.04 50.69
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA
## ..- attr(*, "names")= chr "Site" "WP" "Age" "DiMM" ...
This is a spatial object. The coordinates are already added to the data frome using the code we saw last week. This data frame differs from last weeks data as it concerns individual pine trees rather than densities within a quadrat.
Let’s see the data as a map.
mapview(pine_hts,zcol="Site")
That’s useful to see where the data came from. However we are now going to analyse the data as a non spatial object.
The data can be changed back to a typical data frame easily.
d<-data.frame(pine_hts)
str(d)
## 'data.frame': 256 obs. of 6 variables:
## $ Site : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
## $ WP : num 1 1 1 2 2 2 3 3 3 4 ...
## $ Age : num 8 4 7 5 7 7 4 5 4 5 ...
## $ DiMM : num 26.77 6.39 14.41 35 25.65 ...
## $ HeCM : num 100 43 72 125 95 77 95 99 75 141 ...
## $ geometry:sfc_POINT of length 256; first list element: 'XY' num -2.04 50.69
The variables in the data frame are
Age - number of counted whorls)
Site - A factor wiht two levels
DiMM - Diameter at 10cm above ground level measured in millimeters
HeCM - Height measured in cm
g0<- ggplot(d,aes(x=DiMM, y=HeCM))
g1 <- g0 + geom_point()
g1 + ylab("Height of pines (cm)") + xlab("Diameter of ines (mm)")
Fit a regression line and a regression model to these data. The code is not shown in this handout. You must try to use code from previous examples.
## `geom_smooth()` using formula 'y ~ x'
##
## Call:
## lm(formula = HeCM ~ DiMM, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -89.094 -13.230 -2.461 11.893 105.200
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 29.485 2.345 12.57 <2e-16 ***
## DiMM 3.165 0.108 29.30 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 21.75 on 254 degrees of freedom
## Multiple R-squared: 0.7716, Adjusted R-squared: 0.7707
## F-statistic: 858.3 on 1 and 254 DF, p-value: < 2.2e-16
What do you conclude?
When a regression does not produce a convicing fit to the data there are various more advanced analyses that can be used. This code fits a “gam” smoother.
g1 + geom_smooth(method = "gam", formula = y ~ s(x))
The gam fits the data rather better but does not result in a simple formula.
mod<-gam(data=d,HeCM~s(DiMM))
summary(mod)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## HeCM ~ s(DiMM)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 85.453 1.265 67.56 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(DiMM) 4.106 5.096 202.2 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.802 Deviance explained = 80.5%
## GCV = 417.93 Scale est. = 409.6 n = 256
However this can be useful if we want to extract residuals.
d$residuals<- residuals(mod)
What do the residuals mean in this context?
Code not shown
Find out whether the conditions for pine growth differs between the two sites.
Find out whether there is a significant difference between the heights of the pines aftercorrecting for diameters between the two sites using the appropriate statistical technique. (hint, one way ANOVA is equivalent to T test when there are only two levels).
What does this imply? Write this up into your mamnegement report for the assignment as a finding.
The terrain analysis that we will build over the next couple of weeks is based on two raster layers derived from Lidar data.
We can load these layers and look at them using the following code. These layers are held on the server. If you were working independently you would probably first process the layers using a desktop GIS such as QGIS before uploading them to R in the form of GEOTIFF files that can be read in with a line of code. In this case this intial work has already been done for you so that you can concentrate on the meaning of the data.
The principles of terrain analysis are outlined here
http://r.bournemouth.ac.uk:82/books/quant_spat/_book/terrain-analysis.html
data("arne_pines")
arne_pines %>% filter(site=="Heath" ) -> arne_pines
data(arne_lidar)
arne_pines<-st_as_sf(arne_pines,coords = c("lon","lat"),crs=4326)
mapview(arne_pines)
contours<-rasterToContour(dtm, levels=seq(0,60,5))
qtm(dtm) + qtm(contours)
mapview(study_area,alpha.regions = 0,lwd=4)
pol<-as(study_area,"Spatial") ## Convert to the older spatial classes
dsm<-raster::crop(dsm,pol)
dtm<-raster::crop(dtm,pol)
contours<-rasterToContour(dtm, nlevels=10)
map2<- mapview(dsm,col.regions=terrain.colors(1000),legend=TRUE) + mapview(dtm,col.regions=terrain.colors(1000)) + mapview(contours) + mapview(arne_pines)
map2 %>% extras()
## Loading required package: leaflet.extras
## Loading required package: leaflet