More on finding distances

First load up the data from Arne.

data("arne_pines")
d<-arne_pines
## Remember how to produce a spatial object from this data frame
# Make it spatial
d<-st_as_sf(arne_pines,coords = c("lon","lat"),crs=4326)
# Reproject to national grid
d<-st_transform(d,27700)

Now load up the lidar data. We’ll crop and mask the dtm and the dsm to the study area.

data(arne_lidar)
# Just to check. Will reset old format CRS
d<-st_intersection(d,study_area)
study_area<- st_transform(study_area, 27700) 
dtm<-crop(dtm, study_area)
dtm<-mask(dtm, study_area)
dsm<-crop(dsm, study_area)
dsm<-mask(dsm, study_area)

Now to produce a canopy height model we simply subtract the dtm from the dsm

chm<-dsm-dtm
plot(chm)

Notice how this shows up the positions of the taller trees.

Vectorising the trees

In order to produce a vector layer for the outlines of the trees I used a few tricks in R. Its worth noting that there was no bespoke operation in either QGIS nor in Arc to achieve this result easily.

chm[chm<2]<-NA ## Set below 2m as blanks (NA)
chm[chm>=2]<-1 ## Set above 2m to a single value
trees<-rasterToPolygons(chm, dissolve=TRUE) # Turn to vector
trees<-st_as_sf(trees)
trees<-st_cast(trees$geometry,"POLYGON") # Make each polygon distinct
trees<-st_transform(trees,27700)

Notice that the procedure does pick out most of the individual trees on the heathland, including some that have been removed since the lidar was obtained. This can be seen by comparing the results to the satelite image.

library(leaflet.extras)
mp<-mapview(trees)
mp@map %>% addFullscreenControl() 

Finding the distances to the trees

If we ask R for the distances between the trees and the quadrats (the object called d) we will get a large matrix of results containing the distance between every quadrat and every identified tree.

distances<-st_distance(d,trees)
dim(distances)
## [1]  184 1532

So if we want the distance to the nearest tree we need to run down the roes and find the minimum.

d$min_dist<-apply(distances,1,min)

Spatial sampling

It is clear that the students did not place the quadrats randomly over the whole potential study area. There are many reasons for this, some involved logistics and others the decisions made by the students.

What would a representative sample of the area look like?

One possibility is to place the same number of quadrats completely at random within the area. This can be simulated easily in R.

random<-st_sf(name="random", geom=st_sample(study_area, 184))
distances<-st_distance(random,trees)
random$min_dist<-apply(distances,1,min)

One rather surprising element of a truly random spatial sample is that to the eye it appears to be clustered. We can use R to obtain a set of points placed regularly within the area for comparison.

regular<-st_sf(name="regular", geom=st_sample(study_area, 184, type="regular"))
distances<-st_distance(regular,trees)
regular$min_dist<-apply(distances,1,min)

Mapping the results

mapview(d,col.region="orange") +
mapview(regular,col.region="darkblue") + 
  mapview(random,col.region="darkred") +
  mapview(trees, col.region="darkgreen") ->map
map %>% extras()

Exercise

Analyse the distances from the trees from the three sources.

We can turn the results in each object into a single data frame for this exrcise.

d1<-data.frame(source="quadrats",min_dist=d$min_dist)
d2<-data.frame(source="random",min_dist=random$min_dist)
d3<-data.frame(source="regular",min_dist=regular$min_dist)
df<-rbind(d1,d2, d3)
# Hint
g0<-ggplot(df,aes(x=source,y=min_dist))
f1<-g0+geom_boxplot() +labs(title="Distances")
f2<-ci(g0) +labs(title="Mean distance")
multiplot(f1,f2,cols=2)