Monsters and Mixtures
Summary
Chapters 9–11 of Statistical Rethinking cover generalized linear models (GLMs) through the lens of maximum entropy, then extend to “monster” models: zero-inflated Poisson, beta-binomial, gamma-Poisson (negative binomial), and ordered categorical outcomes.
Maximum Entropy and GLMs (Ch 9)
Why use exponential family distributions? Because they are the maximum entropy distributions for given constraints:
| Constraint | MaxEnt Distribution | Link |
|---|---|---|
| Known mean and variance | Gaussian | Identity |
| Two unordered outcomes, constant expected value | Binomial | Logit |
| Count of events, constant rate | Poisson | Log |
Nature Loves Entropy
Exponential family distributions arise naturally because, under the same constraints, more arrangements of events produce them than produce any other distribution. Using them is not an assumption about mechanism; it is the least informative choice consistent with those constraints.
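A small numeric check can make the "least informative" claim concrete. The sketch below compares four hypothetical distributions over the outcomes of two draws, all constrained to the same expected value, and computes the Shannon entropy of each. The candidate probabilities and the names `entropy`, `successes`, and `candidates` are illustrative assumptions, not part of the original notes.

```python
import math

def entropy(probs):
    """Shannon entropy in nats; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Outcomes are the four sequences (ww, bw, wb, bb) of two draws,
# carrying 0, 1, 1 and 2 "successes" respectively.
successes = [0, 1, 1, 2]

# Hypothetical candidates, all constrained to an expected value of 1.
candidates = {
    "A": [1/4, 1/4, 1/4, 1/4],  # binomial with n = 2, p = 0.5
    "B": [2/6, 1/6, 1/6, 2/6],
    "C": [1/6, 2/6, 2/6, 1/6],
    "D": [1/8, 4/8, 2/8, 1/8],
}

for name, probs in candidates.items():
    mean = sum(s * p for s, p in zip(successes, probs))
    print(f"{name}: mean = {mean:.3f}, entropy = {entropy(probs):.3f}")

# The candidate with the largest entropy under this constraint.
flattest = max(candidates, key=lambda name: entropy(candidates[name]))
print("highest entropy:", flattest)
```

Among these candidates, the binomial one (A) has the largest entropy, which is the pattern the maximum entropy argument predicts for two outcomes with a fixed expected value.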
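The GLM framework:

$$
y_i \sim \text{Distribution}(\theta_i), \qquad f(\theta_i) = \alpha + \beta x_i
$$

where $f$ is the link function, which connects the linear model to the parameter $\theta_i$ of the outcome distribution.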
Counting and Classification (Ch 10)
Binomial Regression (Logistic)
- Model binary or proportion outcomes
- Logit link: $\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i} = \alpha + \beta x_i$
- Interpret on log-odds scale; exponentiate for odds ratios
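The interpretation rule in the last bullet can be checked numerically. This is a minimal sketch with made-up coefficients (`alpha` and `beta` are hypothetical, not estimates from any model in the book): it shows that a one-unit change in the predictor multiplies the odds by `exp(beta)`, while the change in probability depends on the baseline.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Inverse of the logit link: maps any real number to (0, 1)."""
    return 1 / (1 + math.exp(-x))

# Hypothetical coefficients on the log-odds scale.
alpha, beta = -1.0, 0.5

p0 = inv_logit(alpha)          # probability when x = 0
p1 = inv_logit(alpha + beta)   # probability when x = 1

odds0 = p0 / (1 - p0)
odds1 = p1 / (1 - p1)
odds_ratio = odds1 / odds0

print(f"p(x=0) = {p0:.3f}, p(x=1) = {p1:.3f}")
print(f"odds ratio = {odds_ratio:.3f}, exp(beta) = {math.exp(beta):.3f}")
```

The odds ratio is constant across baselines, but the difference `p1 - p0` is not, which is why effects are usually reported on the odds scale or as predictions at specific predictor values.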
Poisson Regression
- Model counts when there’s no known maximum
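- Log link: $\log(\lambda_i) = \alpha + \beta x_i$
- Offset term for varying exposure: $\log(\mu_i) = \log(\tau_i) + \alpha + \beta x_i$, where $\tau_i$ is the exposure and $\mu_i = \lambda_i \tau_i$ is the expected count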
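The offset works because adding the log of the exposure to the linear model multiplies the expected count by the exposure. A minimal sketch, assuming a hypothetical rate of 1.5 events per day and a helper named `expected_count` introduced here for illustration:

```python
import math

def expected_count(alpha, beta, x, exposure):
    """Expected Poisson count under a log link with a log-exposure offset."""
    return math.exp(math.log(exposure) + alpha + beta * x)

# Hypothetical coefficients: a rate of 1.5 events per unit of exposure,
# with no predictor effect.
alpha, beta = math.log(1.5), 0.0

daily = expected_count(alpha, beta, x=0, exposure=1)    # one day observed
weekly = expected_count(alpha, beta, x=0, exposure=7)   # seven days observed

print(f"expected per day:  {daily:.2f}")
print(f"expected per week: {weekly:.2f}")
```

Both observations share the same underlying rate; only the length of the observation window differs, so counts recorded over different windows can enter the same model.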
Monsters and Mixtures (Ch 11)
Ordered Categorical (Ordinal)
- Cumulative logit model: each threshold gets its own intercept
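To see how the threshold intercepts turn into category probabilities, the sketch below assumes the common parameterization in which the cumulative log-odds of category $k$ equal a cutpoint minus a linear predictor `phi`. The cutpoint values and the function name `ordered_logit_probs` are hypothetical.

```python
import math

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def ordered_logit_probs(cutpoints, phi):
    """Category probabilities from increasing cutpoints and linear predictor phi.

    Assumes Pr(y <= k) = inv_logit(cutpoint_k - phi); the last category
    takes whatever cumulative probability remains.
    """
    cumulative = [inv_logit(c - phi) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # difference of adjacent cumulative probabilities
        previous = c
    return probs

# Three hypothetical cutpoints define four ordered categories.
cutpoints = [-1.0, 0.0, 1.5]

baseline = ordered_logit_probs(cutpoints, phi=0.0)
shifted = ordered_logit_probs(cutpoints, phi=1.0)

print("phi = 0:", [round(p, 3) for p in baseline])
print("phi = 1:", [round(p, 3) for p in shifted])
```

Under this parameterization, increasing `phi` moves probability mass toward the higher categories, which is why the predictor is subtracted from the cutpoints rather than added.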
Zero-Inflated Poisson
A mixture: with probability $p$ the outcome is always 0 (the process never even attempts); with probability $1 - p$ it follows a Poisson process with rate $\lambda$, which can itself produce zeros.
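The mixture translates directly into a probability mass function: a zero can come from either component, while any positive count can only come from the Poisson component. A minimal sketch, where `p` is the probability of the always-zero component, `lam` is the Poisson rate, and the numeric values are hypothetical:

```python
import math

def poisson_pmf(y, lam):
    return math.exp(-lam) * lam**y / math.factorial(y)

def zip_pmf(y, p, lam):
    """Zero-inflated Poisson probability of observing count y."""
    if y == 0:
        # Structural zero, or a Poisson draw that happened to be zero.
        return p + (1 - p) * math.exp(-lam)
    return (1 - p) * poisson_pmf(y, lam)

p, lam = 0.2, 1.0  # hypothetical mixture weight and rate

print(f"Pr(0) under plain Poisson: {poisson_pmf(0, lam):.3f}")
print(f"Pr(0) under ZIP:           {zip_pmf(0, p, lam):.3f}")

# The mean shrinks to (1 - p) * lam because a share p of cases is always zero.
zip_mean = sum(y * zip_pmf(y, p, lam) for y in range(60))
print(f"ZIP mean: {zip_mean:.3f}")
```

Fitting a plain Poisson to data generated this way would underestimate the rate of the count process, since the extra zeros pull the average down.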
Over-Dispersed Models
When variance exceeds what the simple model predicts:
- Beta-binomial: continuous mixture of binomial probabilities
- Gamma-Poisson (negative binomial): continuous mixture of Poisson rates
These are the observation-level equivalents of multilevel models: they model unexplained heterogeneity without explicitly modeling groups.
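The extra variance can be verified numerically for the gamma-Poisson case. The sketch below assumes a parameterization by mean `mu` and gamma shape `k`, under which the variance is $\mu + \mu^2 / k$; other texts and libraries use different parameterizations, and the values here are hypothetical.

```python
import math

def gamma_poisson_pmf(y, mu, k):
    """Negative binomial pmf with mean mu and gamma shape k (assumed parameterization)."""
    log_pmf = (
        math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
        + k * math.log(k / (k + mu))
        + y * math.log(mu / (k + mu))
    )
    return math.exp(log_pmf)

mu, k = 3.0, 2.0  # hypothetical mean and shape

# Sum far enough into the tail that the truncated mass is negligible.
support = range(400)
mean = sum(y * gamma_poisson_pmf(y, mu, k) for y in support)
variance = sum((y - mean) ** 2 * gamma_poisson_pmf(y, mu, k) for y in support)

print(f"mean = {mean:.3f}")
print(f"variance = {variance:.3f} (a Poisson with this mean would have {mean:.3f})")
```

As `k` grows, the gamma mixing distribution concentrates around `mu` and the variance approaches the Poisson value, so `k` acts as a measure of how much unexplained rate heterogeneity the data contain.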
See Also
- Generalized Linear Models — BDA3’s treatment (Ch 16)
- Discrete Choice Models — econometric discrete choice, a GLM application
- Overfitting and Information Criteria — model comparison for these models
- Hierarchical Models — multilevel models as an alternative to overdispersion mixtures
- Statistical Rethinking - Overview