Chapter 18 Regression examples
To look quickly at how regressions behave with differing sample sizes and noise-to-signal ratios, you can use this Shiny app:
http://r.bournemouth.ac.uk:8790/stats/oak_leaves/scatterplot/
18.1 1. Fit a calibration regression to the mussels data
library(tidyverse)
library(performance)
library(aqm)
library(mgcv)
library(ggforce)
data(mussels)
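d<-mussels
# Scatterplot with a fitted regression line
g0<-ggplot(d,aes(x=Lshell, y=BTVolume)) + geom_point()
g0+geom_smooth(method="lm")
# Fit the linear model, then inspect coefficients and diagnostics
mod<-lm(data=d,BTVolume~Lshell)
summary(mod)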
check_model(mod)
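A calibration regression is typically fitted so that one quantity can be predicted from another that is easier to measure. The sketch below shows that prediction step with predict() and a 95% prediction interval. It uses simulated data rather than the mussels data, and the names shell_length, body_volume, sim and mod_sim are hypothetical, chosen only for illustration.

```r
# Simulated stand-in for a calibration data set (NOT the mussels data)
set.seed(1)
n <- 100
shell_length <- runif(n, 60, 130)                              # hypothetical predictor
body_volume  <- -30 + 0.6 * shell_length + rnorm(n, sd = 5)    # hypothetical response
sim <- data.frame(shell_length, body_volume)

# Fit the calibration line
mod_sim <- lm(body_volume ~ shell_length, data = sim)

# Predict the response for new values of the predictor.
# interval = "prediction" gives limits for individual new observations,
# which are wider than the confidence limits for the fitted line.
new_shells <- data.frame(shell_length = c(80, 100, 120))
pred <- predict(mod_sim, newdata = new_shells,
                interval = "prediction", level = 0.95)
cbind(new_shells, pred)
```

The result has one row per new value, with columns fit, lwr and upr. The same call should work on the model fitted to the real data if the column in newdata is given the same name as the predictor used in the model formula.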
18.2 2. Fit a regression model to allometric data from oak trees.
The data include the diameter at breast height, measured in cm, and the height of the trees in meters. Is a straight line an appropriate description of the relationship? If not, does the relationship approximate a straight line over some parts of its range?
data(oaks)
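d<-oaks
# Scatterplot with a fitted regression line
g0<-ggplot(d,aes(x=diam, y=ht)) + geom_point()
g0+geom_smooth(method="lm")
# Fit the linear model, then inspect coefficients and diagnostics
mod<-lm(data=d,ht~diam)
summary(mod)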
check_model(mod)
# Overlay a GAM smoother and, in red, the straight-line fit for comparison
g0 + geom_smooth(method="gam", formula=y~s(x)) +
geom_smooth(method="lm", se=FALSE, colour="red")
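One way to back up the visual comparison of the straight line and the smoother is to fit both models directly and compare them. The sketch below does this with AIC and with the effective degrees of freedom (edf) of the smooth term, where an edf close to 1 suggests that a straight line is adequate. It uses simulated data with a saturating shape, not the oaks data, and the object names sim, lin and sm are hypothetical.

```r
library(mgcv)  # a recommended package shipped with standard R installations

# Simulated stand-in for an allometric data set (NOT the oaks data):
# height rises quickly with diameter and then levels off
set.seed(42)
n <- 200
diam <- runif(n, 5, 100)
ht   <- 30 * (1 - exp(-0.04 * diam)) + rnorm(n, sd = 1.5)
sim  <- data.frame(diam, ht)

lin <- lm(ht ~ diam, data = sim)       # straight line
sm  <- gam(ht ~ s(diam), data = sim)   # smooth curve

# Lower AIC indicates the better-supported model
AIC(lin, sm)

# Effective degrees of freedom of the smooth term
summary(sm)$edf
```

This is only one of several reasonable comparisons. The diagnostic plots from check_model() on the linear model address the same question from the residuals.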
18.3 Climate data
The CET global temperature data set can be simplified to yearly temperatures since 1880 by the code below.
data(cet)
library(lubridate)
cet$year<-year(cet$date)
cet %>% filter(year>1880) %>% group_by(year) %>% summarise(mtemp=mean(temp)) -> cet
How well is this trend modelled by linear regression?
## Example to get started
ggplot(cet,aes(x=year,y=mtemp)) + geom_point() +
geom_smooth(method="gam",formula=y~s(x))
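As one possible way into the question, the sketch below fits a straight line to a yearly series and then looks at the mean residual in each third of the record. If the line is adequate these means should all be close to zero. A pattern such as positive, negative, positive suggests curvature that the line misses. The series is simulated, not the CET data, and the names sim, lin, third and res_means are hypothetical.

```r
# Simulated yearly series (NOT the CET data): nearly flat at first,
# rising faster later, plus year-to-year noise
set.seed(7)
year  <- 1881:2020
mtemp <- 9 + 0.00015 * (year - 1881)^2 + rnorm(length(year), sd = 0.2)
sim   <- data.frame(year, mtemp)

lin <- lm(mtemp ~ year, data = sim)

coef(lin)["year"] * 10     # fitted linear trend per decade
summary(lin)$r.squared     # proportion of variance explained by the line

# Split the record into three equal-width periods and average the residuals
third <- cut(sim$year, breaks = 3, labels = c("early", "middle", "late"))
res_means <- tapply(residuals(lin), third, mean)
res_means
```

A high R-squared on its own does not show that the linear model is appropriate, which is why the residual pattern is worth checking as well.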
An alternative approach breaks the data into groups. This is provided as an answer to a question asked in class about how to produce this type of figure in R. It is not necessarily good practice, but it can sometimes be useful.
# cut() uses right-closed intervals: (1879,1920], (1920,1970], (1970,2021]
cet$breaks<-as.factor(cut(cet$year,c(1879,1920,1970,2021)))
levels(cet$breaks)<-c("1880-1920","1921-1970","1971-2021")
ggplot(cet,aes(x=year,y=mtemp, col=breaks)) +
geom_mark_ellipse(expand=0.02,aes(fill=breaks)) +
geom_point() +
geom_smooth(method="lm")
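The grouping step relies on how cut() assigns values that fall exactly on a break. A small sketch on a handful of boundary years makes this easier to see. By default the intervals are closed on the right, so 1920 falls in the first group and 1921 in the second. The vector years and the labels used here are illustrative only, with the labels chosen so that the periods do not overlap.

```r
# A few years sitting on or next to the break points
years <- c(1880, 1920, 1921, 1970, 1971, 2021)

# Supplying labels directly avoids a separate levels()<- step
period <- cut(years,
              breaks = c(1879, 1920, 1970, 2021),
              labels = c("1880-1920", "1921-1970", "1971-2021"))

data.frame(years, period)
table(period)
```

The result of cut() is already a factor, so wrapping it in as.factor() is harmless but not required.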