Linear Models in Statistical Rethinking
Summary
Chapter 4 of Statistical Rethinking builds Bayesian linear regression from scratch. It covers why normal distributions arise from addition (the Central Limit Theorem), how to write models in mathematical notation and translate them into R code, and how to generate posterior predictions with uncertainty intervals.
Why Normal Distributions Are Normal
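The Gaussian distribution arises naturally when many small effects are added together (Central Limit Theorem). McElreath demonstrates this with a soccer field simulation: each person takes a series of random steps left or right, and the distribution of final positions converges to a bell curve, largely regardless of the step size distribution (provided the steps have finite variance).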
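The random-walk argument can be checked with a short simulation. The following is a minimal Python sketch, not the book's R code; the function name `simulate_positions`, the 1000 people, the 16 steps, and the uniform step sizes on [-1, 1] are all assumptions chosen for illustration.

```python
import random
import statistics

def simulate_positions(n_people=1000, n_steps=16, seed=4):
    """Final position of each person after n_steps random steps.

    Each step is drawn uniformly from [-1, 1]
    (negative = left, positive = right). Assumed setup.
    """
    rng = random.Random(seed)
    return [
        sum(rng.uniform(-1.0, 1.0) for _ in range(n_steps))
        for _ in range(n_people)
    ]

positions = simulate_positions()
mean_pos = statistics.fmean(positions)
sd_pos = statistics.pstdev(positions)

# For a Gaussian, roughly 68% of values lie within one
# standard deviation of the mean.
share_within_1sd = sum(
    abs(p - mean_pos) <= sd_pos for p in positions
) / len(positions)

print(f"mean={mean_pos:.2f} sd={sd_pos:.2f} within 1 sd={share_within_1sd:.2f}")
```

Swapping `rng.uniform` for another finite-variance step distribution should leave the share within one standard deviation close to the Gaussian value, which is the point of the demonstration.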
Two justifications for using Gaussian likelihoods:
- Ontological: many natural measurements are approximately Gaussian because they arise from additive processes
- Epistemological: the Gaussian is the maximum entropy distribution for a given mean and variance — it assumes the least about the data
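The maximum entropy claim can be illustrated numerically by comparing closed-form differential entropies of a few distributions that share the same variance. This is a small Python sketch; the choice of comparison distributions (uniform and Laplace) is an assumption made for the example.

```python
import math

def gaussian_entropy(sd):
    # Differential entropy of a Gaussian with standard deviation sd.
    return 0.5 * math.log(2 * math.pi * math.e * sd ** 2)

def uniform_entropy(sd):
    # A uniform distribution of width w has variance w**2 / 12.
    width = sd * math.sqrt(12)
    return math.log(width)

def laplace_entropy(sd):
    # A Laplace distribution with scale b has variance 2 * b**2.
    scale = sd / math.sqrt(2)
    return 1 + math.log(2 * scale)

sd = 1.0
entropies = {
    "gaussian": gaussian_entropy(sd),
    "laplace": laplace_entropy(sd),
    "uniform": uniform_entropy(sd),
}
for name, value in entropies.items():
    print(f"{name:>8}: {value:.4f} nats")
```

With the variance held fixed, the Gaussian has the largest entropy of the three, consistent with it being the least informative choice under a mean and variance constraint.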
The Model Language
A complete Bayesian model specifies likelihood and priors:
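One illustrative instance, assuming a regression of height $h_i$ on a predictor $x_i$, with prior values chosen only for the example:

```latex
\begin{aligned}
h_i    &\sim \mathrm{Normal}(\mu_i, \sigma) \\
\mu_i  &= \alpha + \beta x_i \\
\alpha &\sim \mathrm{Normal}(178, 20) \\
\beta  &\sim \mathrm{Normal}(0, 10) \\
\sigma &\sim \mathrm{Uniform}(0, 50)
\end{aligned}
```

The first line is the likelihood, the second is the linear model for the mean, and the remaining lines are the priors.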
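The `map` function from the rethinking R package (renamed `quap` in the book's second edition) fits this by finding the maximum a posteriori (MAP) estimate and approximating the posterior as a multivariate Gaussian, a procedure known as quadratic approximation.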
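To make the idea concrete, here is a one-parameter Python sketch of quadratic approximation. It is not how the rethinking package is implemented: the data are hypothetical, $\sigma$ is assumed known so that only $\mu$ is estimated, and the helper `quadratic_approximation` is a name invented for this example. It finds the posterior mode with Newton steps and reads the standard deviation off the curvature at the mode.

```python
import math

heights = [150.0, 142.0, 158.0, 165.0, 147.0, 171.0, 155.0, 160.0]  # hypothetical data (cm)
SIGMA = 8.0                          # assumed known, to keep the problem one-dimensional
PRIOR_MEAN, PRIOR_SD = 178.0, 20.0   # assumed prior: mu ~ Normal(178, 20)

def log_normal_pdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)

def log_posterior(mu):
    # Unnormalized log posterior: log prior plus log likelihood.
    log_prior = log_normal_pdf(mu, PRIOR_MEAN, PRIOR_SD)
    log_lik = sum(log_normal_pdf(h, mu, SIGMA) for h in heights)
    return log_prior + log_lik

def quadratic_approximation(log_post, start, step=1e-2, n_iter=20):
    """Return (mode, sd) of a Gaussian approximation to a 1-D posterior."""
    def curvature(x):
        # Central finite-difference estimate of the second derivative.
        return (log_post(x + step) - 2 * log_post(x) + log_post(x - step)) / step ** 2

    x = start
    for _ in range(n_iter):
        grad = (log_post(x + step) - log_post(x - step)) / (2 * step)
        x -= grad / curvature(x)  # Newton update toward the mode
    # The Gaussian's variance is the negative inverse curvature at the mode.
    return x, math.sqrt(-1.0 / curvature(x))

map_mu, approx_sd = quadratic_approximation(log_posterior, start=PRIOR_MEAN)
print(f"MAP estimate of mu: {map_mu:.2f}, approximate posterior sd: {approx_sd:.2f}")
```

Because this toy posterior is exactly Gaussian, the approximation is exact here; with several parameters the same idea uses the Hessian matrix at the mode to obtain a multivariate Gaussian.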
Prior Predictive Simulation
Always Simulate from Priors First
Before fitting, simulate predictions from the prior to check that your priors produce sensible outcomes. This is a key step in Bayesian workflow.
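A prior predictive check can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the intercept prior Normal(178, 20), the two candidate slope priors, the grid of centered weights, and the plausibility bounds of 0 and 272 cm. The sketch draws regression lines from the prior and counts how many predict an impossible average height somewhere on the grid.

```python
import random

def prior_predictive_lines(n_lines, slope_prior, seed=1):
    """Draw (alpha, beta) pairs from the prior; no data are involved."""
    rng = random.Random(seed)
    return [(rng.gauss(178.0, 20.0), slope_prior(rng)) for _ in range(n_lines)]

def share_implausible(lines, x_values, low=0.0, high=272.0):
    """Share of prior lines predicting a mean height outside [low, high] cm
    for at least one predictor value."""
    bad = sum(
        any(not (low <= alpha + beta * x <= high) for x in x_values)
        for alpha, beta in lines
    )
    return bad / len(lines)

def flat_slope(rng):
    # Vague slope prior: Normal(0, 10), allows strongly negative slopes.
    return rng.gauss(0.0, 10.0)

def positive_slope(rng):
    # Slope prior restricted to positive values: Log-Normal(0, 1).
    return rng.lognormvariate(0.0, 1.0)

centered_weights = [-15, -10, -5, 0, 5, 10, 15]  # kg from the mean; hypothetical grid

share_flat = share_implausible(prior_predictive_lines(1000, flat_slope), centered_weights)
share_pos = share_implausible(prior_predictive_lines(1000, positive_slope), centered_weights)
print(f"implausible lines: flat prior {share_flat:.2f}, positive prior {share_pos:.2f}")
```

A large share of implausible lines is a signal that the prior encodes relationships no one actually believes, before any data have been seen.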
Generating Predictions
Three-step recipe for any fitted model:
- Use `link` to generate posterior distributions of $\mu$ at each predictor value
- Use `mean`/`HPDI`/`PI` to summarize those distributions
- Use `sim` to generate full posterior predictions (incorporating $\sigma$)
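The recipe above can be mimicked in plain Python to show what each step computes. This is a sketch under stated assumptions, not the rethinking API: `link_mu`, `sim_outcomes`, and `percentile_interval` are hypothetical stand-ins for `link`, `sim`, and `PI`, and the posterior draws are synthetic values generated independently for each parameter rather than taken from a fitted model.

```python
import random
import statistics

rng = random.Random(7)

# Synthetic posterior draws of (alpha, beta, sigma); made-up values that
# ignore any correlation between parameters.
posterior = [
    (rng.gauss(155.0, 0.3), rng.gauss(0.9, 0.04), rng.gauss(5.0, 0.2))
    for _ in range(4000)
]

def link_mu(samples, x):
    """Step 1: posterior distribution of the mean mu at predictor value x."""
    return [alpha + beta * x for alpha, beta, _ in samples]

def percentile_interval(values, prob=0.89):
    """Step 2: central interval holding `prob` of the values."""
    ordered = sorted(values)
    tail = (1.0 - prob) / 2.0
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return lower, upper

def sim_outcomes(samples, x, rng):
    """Step 3: simulated observations, adding Gaussian noise with sd sigma."""
    return [rng.gauss(alpha + beta * x, sigma) for alpha, beta, sigma in samples]

x_new = 10.0  # hypothetical centered predictor value
mu_draws = link_mu(posterior, x_new)
mu_mean = statistics.fmean(mu_draws)
mu_interval = percentile_interval(mu_draws)
outcome_interval = percentile_interval(sim_outcomes(posterior, x_new, rng))

print(f"mean of mu: {mu_mean:.1f}")
print(f"89% interval for mu: {mu_interval[0]:.1f} to {mu_interval[1]:.1f}")
print(f"89% interval for outcomes: {outcome_interval[0]:.1f} to {outcome_interval[1]:.1f}")
```

Repeating the last four assignments over a grid of predictor values gives the bands usually drawn around a regression line.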
The two kinds of uncertainty:
- Narrow interval (around $\mu$): uncertainty about the average outcome at each predictor value
- Wide interval (from `sim`): uncertainty about individual observations, including residual variation
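The gap between the two intervals can be made precise with the law of total variance. A sketch, assuming a Gaussian likelihood and writing $\tilde{h}$ for a new observation at predictor value $x$ (notation introduced here for illustration), with the variance and expectation on the right taken over the posterior:

```latex
\operatorname{Var}(\tilde{h} \mid x)
  = \underbrace{\operatorname{Var}\big(\mu(x)\big)}_{\text{uncertainty about the mean}}
  + \underbrace{\operatorname{E}\big[\sigma^2\big]}_{\text{residual variation}}
```

The narrow interval reflects only the first term, while the wide interval reflects both, which is why it never shrinks below the residual spread no matter how much data is collected.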
Polynomial Regression
Polynomial models can capture curvature, but they come with caveats:
- Coefficients are hard to interpret
- A mechanistic model is preferable when one is available
- Always standardize predictors first for numerical stability
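One way to see the benefit of standardizing is to compare a predictor and its square before and after the transformation. The Python sketch below uses a hypothetical, evenly spaced set of weights; the helper names are invented for the example. On the raw scale the squared term is large and almost perfectly correlated with the linear term, while on the standardized scale both problems are much reduced.

```python
import math

def standardize(values):
    """Center to mean 0 and scale to (sample) standard deviation 1."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]

def correlation(xs, ys):
    """Pearson correlation coefficient."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

weights = [float(w) for w in range(30, 61)]  # hypothetical weights in kg
z = standardize(weights)

raw_corr = correlation(weights, [w ** 2 for w in weights])
std_corr = correlation(z, [v ** 2 for v in z])
max_raw_sq = max(w ** 2 for w in weights)
max_std_sq = max(v ** 2 for v in z)

print(f"raw: corr(x, x^2)={raw_corr:.3f}, largest x^2={max_raw_sq:.0f}")
print(f"standardized: corr(z, z^2)={std_corr:.3f}, largest z^2={max_std_sq:.2f}")
```

The near-zero correlation on the standardized scale depends on the symmetric example data; with skewed predictors the correlation is reduced rather than removed.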
See Also
- Bayesian Linear Regression — BDA3’s treatment (Ch 14), more mathematical
- Regression and the CEF — the frequentist perspective on regression
- Spurious Association and Confounds — Ch 5, extending to multiple predictors
- Overfitting and Information Criteria — Ch 6, when polynomial models go wrong
- Statistical Rethinking - Overview
- Hierarchical Models — the multilevel extension of the Gaussian model introduced here; McElreath’s “parameters all the way down” (Ch. 12–13)