Chapter 18 Regression examples

To explore quickly how regressions behave with differing sample sizes and “noise to signal” ratios, you can use this Shiny app.

http://r.bournemouth.ac.uk:8790/stats/oak_leaves/scatterplot/

18.1 1. Fit a calibration regression to the mussels data

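# Load the packages used throughout this chapter
library(tidyverse)
library(performance)
library(aqm)
library(mgcv)
library(ggforce)
data(mussels)
d <- mussels
# Scatterplot with a fitted straight line and its confidence band
g0 <- ggplot(d, aes(x = Lshell, y = BTVolume)) + geom_point()
g0 + geom_smooth(method = "lm")
# Fit the regression, then inspect the coefficients and the diagnostic plots
mod <- lm(BTVolume ~ Lshell, data = d)
summary(mod)
check_model(mod)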
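A calibration regression is usually fitted in order to predict one measurement from another. The sketch below shows one way to obtain predictions with 95% prediction intervals using predict(). It runs on simulated stand-in data rather than the real mussels data: the values, the object names sim, mod_sim and new_shells, and the chosen shell lengths are all assumptions made for illustration, and it assumes that Lshell is a shell length and BTVolume a body tissue volume.

```r
# Simulated stand-in for the mussels data (hypothetical values, NOT the real measurements)
set.seed(1)
sim <- data.frame(Lshell = runif(50, 60, 130))
sim$BTVolume <- -35 + 0.6 * sim$Lshell + rnorm(50, sd = 5)

# Same model structure as in the text
mod_sim <- lm(BTVolume ~ Lshell, data = sim)

# Predict the response for three new shell lengths, with 95% prediction intervals
new_shells <- data.frame(Lshell = c(80, 100, 120))
pred <- predict(mod_sim, newdata = new_shells, interval = "prediction", level = 0.95)

# Columns: fit (predicted value), lwr and upr (interval limits)
cbind(new_shells, pred)
```

With the real data, the same call should work by replacing mod_sim with mod. A prediction interval is wider than the confidence band drawn by geom_smooth, because it describes individual observations rather than the mean.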

18.2 2. Fit a regression model to allometric data from oak trees.

The data include the diameter at breast height, measured in cm, and the height of the trees, measured in meters. Is a straight line an appropriate description of the relationship? If not, does the relationship approximate a straight line over some parts of its range?

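data(oaks)
d <- oaks
# Scatterplot of height against diameter with a fitted straight line
g0 <- ggplot(d, aes(x = diam, y = ht)) + geom_point()
g0 + geom_smooth(method = "lm")
mod <- lm(ht ~ diam, data = d)
summary(mod)
check_model(mod)
# Overlay a GAM smoother (default colour) on the straight line (red) to show any curvature
g0 + geom_smooth(method = "gam", formula = y ~ s(x)) +
  geom_smooth(method = "lm", se = FALSE, colour = "red")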
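The visual comparison between the straight line and the smoother can be backed up numerically. The sketch below fits both models and compares them by AIC. It uses simulated data with a saturating height curve as a stand-in for the oaks data: the values and the names sim, mod_lm and mod_gam are assumptions for illustration, not results from the real trees.

```r
library(mgcv)

# Simulated stand-in for the oaks data (hypothetical values, NOT the real trees):
# height rises steeply at small diameters and then levels off
set.seed(42)
sim <- data.frame(diam = runif(120, 5, 100))
sim$ht <- 25 * (1 - exp(-0.04 * sim$diam)) + rnorm(120, sd = 1.5)

# Straight line versus a smooth curve fitted with mgcv::gam
mod_lm  <- lm(ht ~ diam, data = sim)
mod_gam <- gam(ht ~ s(diam), data = sim)

# A lower AIC indicates the better supported model
AIC(mod_lm, mod_gam)

# Effective degrees of freedom of the smooth; values near 1 suggest a straight line
summary(mod_gam)$edf
```

If the smooth term has effective degrees of freedom well above 1 and the GAM has the lower AIC, a single straight line is probably not an adequate description over the whole range.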

18.3 Climate data

The CET global temperature data set can be reduced to yearly mean temperatures from 1880 onwards using the code below.

data(cet)
library(lubridate)
cet$year <- year(cet$date)
# Keep 1880 onwards and average the temperatures within each year
cet <- cet %>%
  filter(year >= 1880) %>%
  group_by(year) %>%
  summarise(mtemp = mean(temp))

How well is this trend modelled by linear regression?

## Example to get started
ggplot(cet,aes(x=year,y=mtemp)) + geom_point() +
  geom_smooth(method="gam",formula=y~s(x))

An alternative approach breaks the data into groups. It is provided in answer to a question asked in class about how to produce this type of figure in R. It is not necessarily good practice, but it can sometimes be useful.

# cut() uses right-closed intervals here: (1879,1920], (1920,1970], (1970,2021]
cet$breaks <- cut(cet$year, breaks = c(1879, 1920, 1970, 2021),
                  labels = c("1880-1920", "1921-1970", "1971-2021"))
ggplot(cet,aes(x=year,y=mtemp, col=breaks)) + 
  geom_mark_ellipse(expand=0.02,aes(fill=breaks)) +
  geom_point() +
  geom_smooth(method="lm")
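The separate lines drawn by geom_smooth(method="lm") for each colour group correspond to one linear regression per period. The sketch below extracts those per-period slopes directly. It runs on a simulated yearly series, not the real CET record: the values and the names sim and slopes are assumptions for illustration, and the period labels are chosen to match the intervals that cut() actually produces.

```r
# Simulated yearly series (hypothetical values, NOT the real temperature record)
set.seed(7)
sim <- data.frame(year = 1880:2021)
sim$mtemp <- 9 + 0.005 * (sim$year - 1880) + rnorm(nrow(sim), sd = 0.5)

# Same break points as in the text; labels match the right-closed intervals
sim$breaks <- cut(sim$year, breaks = c(1879, 1920, 1970, 2021),
                  labels = c("1880-1920", "1921-1970", "1971-2021"))

# Fit one straight line per period and keep the slope (change in mtemp per year)
slopes <- sapply(split(sim, sim$breaks),
                 function(g) coef(lm(mtemp ~ year, data = g))[["year"]])
slopes
```

Because each period is fitted independently, the lines do not have to join at the break points, which is one reason this approach is not necessarily good practice.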