Monsters and Mixtures

Summary

Chapters 9–11 of Statistical Rethinking cover generalized linear models (GLMs) through the lens of maximum entropy, then extend to “monster” models: zero-inflated Poisson, beta-binomial, gamma-Poisson (negative binomial), and ordered categorical outcomes.

Maximum Entropy and GLMs (Ch 9)

Why use exponential family distributions? Because they are the maximum entropy distributions for given constraints:

Constraint                          | MaxEnt Distribution | Link
Known mean and variance             | Gaussian            | Identity
Two outcomes, constant probability  | Binomial            | Logit
Count of events, constant rate      | Poisson             | Log

Nature Loves Entropy

Exponential family distributions arise naturally because there are more ways to produce them than any other distribution with the same constraints. Using them is not an assumption about mechanism — it’s the least informative choice.
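The counting argument can be made concrete with a small Python sketch. It compares the entropy of four candidate distributions over the outcomes of two marble draws (white-white, blue-white, white-blue, blue-blue), all constrained to the same expected number of blue marbles. The candidate probabilities and the names `entropy`, `candidates`, and `expected_blue` are illustrative assumptions, not code from the book.

```python
import math

def entropy(probs):
    """Shannon entropy in nats; zero-probability outcomes contribute nothing."""
    return -sum(q * math.log(q) for q in probs if q > 0)

# Number of blue marbles in each outcome: ww, bw, wb, bb
blues = [0, 1, 1, 2]

# Four hypothetical distributions over the same four outcomes.
# "A" is the binomial with n = 2 and p = 0.5.
candidates = {
    "A": [1/4, 1/4, 1/4, 1/4],
    "B": [2/6, 1/6, 1/6, 2/6],
    "C": [1/6, 2/6, 2/6, 1/6],
    "D": [1/8, 4/8, 2/8, 1/8],
}

def expected_blue(probs):
    """Expected number of blue marbles under a candidate distribution."""
    return sum(b * q for b, q in zip(blues, probs))

entropies = {name: entropy(p) for name, p in candidates.items()}

for name, p in candidates.items():
    print(name, round(expected_blue(p), 3), round(entropies[name], 3))
```

Every candidate satisfies the constraint (one blue marble expected), yet the binomial candidate has the largest entropy of the four.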

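The GLM framework:

$$y_i \sim \text{Distribution}(\theta_i)$$
$$f(\theta_i) = \alpha + \beta x_i$$

where $f$ is the link function connecting the parameter $\theta_i$ to the linear model; its inverse maps the linear predictor back onto the parameter's natural scale.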
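As a rough illustration of how a link function sits between the linear model and the parameter, the sketch below pairs each link with its inverse. The names `LINKS`, `linear_predictor`, and `mean_parameter`, and the single-predictor form `alpha + beta * x`, are assumptions made for this example.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of logit: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Each entry is (link, inverse link).
LINKS = {
    "identity": (lambda theta: theta, lambda eta: eta),  # Gaussian mean
    "logit": (logit, inv_logit),                         # binomial probability
    "log": (math.log, math.exp),                         # Poisson rate
}

def linear_predictor(alpha, beta, x):
    """The additive model on the unconstrained scale."""
    return alpha + beta * x

def mean_parameter(link_name, alpha, beta, x):
    """Apply the inverse link to move from the linear model to the parameter."""
    _, inverse = LINKS[link_name]
    return inverse(linear_predictor(alpha, beta, x))

print(mean_parameter("logit", 0.0, 1.0, 0.0))
print(mean_parameter("log", 0.0, 1.0, 0.0))
```

The inverse link is what keeps the parameter legal: a probability stays inside (0, 1) and a rate stays positive, whatever value the linear predictor takes.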

Counting and Classification (Ch 10)

Binomial Regression (Logistic)

  • Model binary or proportion outcomes
  • Logit link: $\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i} = \alpha + \beta x_i$
  • Interpret on log-odds scale; exponentiate for odds ratios
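A short sketch of the odds-ratio interpretation: a one-unit increase in the predictor multiplies the odds by `exp(beta)`, whatever the baseline, while the change in probability depends on where the baseline sits. The coefficient values here are made up for illustration.

```python
import math

def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients on the log-odds scale.
alpha, beta = -1.0, 0.7

p0 = inv_logit(alpha)          # probability when x = 0
p1 = inv_logit(alpha + beta)   # probability when x = 1

odds_ratio = odds(p1) / odds(p0)

print(round(p0, 3), round(p1, 3))
print(round(odds_ratio, 3), round(math.exp(beta), 3))
```

The two values on the last line agree, which is why exponentiated coefficients are reported as odds ratios.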

Poisson Regression

  • Model counts when there’s no known maximum
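  • Log link: $\log(\lambda_i) = \alpha + \beta x_i$
  • Offset term for varying exposure: $\log(\lambda_i) = \log(\tau_i) + \alpha + \beta x_i$, where $\tau_i$ is the exposure (for example, the length of the observation window)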
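The effect of an exposure offset can be sketched as follows. Adding the log of the exposure to the linear model scales the expected count in proportion to the exposure, leaving the underlying rate untouched. The function name `expected_count` and the coefficient values are assumptions for this example.

```python
import math

def expected_count(alpha, beta, x, exposure=1.0):
    """Expected Poisson count under a log link with a log-exposure offset."""
    log_lambda = math.log(exposure) + alpha + beta * x
    return math.exp(log_lambda)

# Hypothetical coefficients describing a rate per day.
alpha, beta, x = 0.4, 0.2, 1.0

daily = expected_count(alpha, beta, x, exposure=1.0)
weekly = expected_count(alpha, beta, x, exposure=7.0)

print(round(daily, 3), round(weekly, 3))
```

Because the offset enters with a fixed coefficient of 1, counts recorded over days and over weeks can share the same `alpha` and `beta`.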

Monsters and Mixtures (Ch 11)

Ordered Categorical (Ordinal)

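  • Cumulative logit model: $\log\frac{\Pr(y_i \le k)}{1 - \Pr(y_i \le k)} = \alpha_k - \phi_i$; each threshold $k$ gets its own intercept $\alpha_k$, and the linear model $\phi_i$ shifts all thresholds together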
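A minimal sketch of how threshold intercepts turn into category probabilities: each intercept gives a cumulative probability on the logit scale, and differencing adjacent cumulative probabilities yields the probability of each category. The convention of subtracting the linear predictor `phi` from each intercept, the name `ordered_logit_probs`, and the cutpoint values are assumptions for this example.

```python
import math

def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def ordered_logit_probs(cutpoints, phi=0.0):
    """Category probabilities for len(cutpoints) + 1 ordered categories.

    cutpoints must be strictly increasing; phi is the linear predictor.
    """
    # Cumulative probabilities Pr(y <= k); the last category closes at 1.
    cumulative = [inv_logit(a - phi) for a in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # Pr(y = k) = Pr(y <= k) - Pr(y <= k - 1)
        previous = c
    return probs

# Hypothetical intercepts for a four-category outcome.
cutpoints = [-1.0, 0.0, 1.5]

baseline = ordered_logit_probs(cutpoints, phi=0.0)
shifted = ordered_logit_probs(cutpoints, phi=1.0)

print([round(p, 3) for p in baseline])
print([round(p, 3) for p in shifted])
```

Under this sign convention, a larger `phi` moves probability mass toward the higher categories.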

Zero-Inflated Poisson

A mixture: with probability $p$ the outcome is always 0 (the process never even attempts); with probability $1 - p$ it follows a Poisson process with rate $\lambda$.
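The mixture described above implies two routes to a zero: the structural zero, and a Poisson draw that happens to be zero. A minimal sketch of the resulting probability mass function follows; the names `zip_pmf` and `p_zero` and the parameter values are assumptions for this example.

```python
import math

def poisson_pmf(y, lam):
    """Probability of observing count y under a Poisson with rate lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def zip_pmf(y, p_zero, lam):
    """Zero-inflated Poisson: structural zero with probability p_zero."""
    if y == 0:
        # Either the process never attempts, or it attempts and yields 0.
        return p_zero + (1.0 - p_zero) * poisson_pmf(0, lam)
    return (1.0 - p_zero) * poisson_pmf(y, lam)

# Hypothetical parameter values.
p_zero, lam = 0.2, 1.0

total = sum(zip_pmf(y, p_zero, lam) for y in range(60))
mean = sum(y * zip_pmf(y, p_zero, lam) for y in range(60))

print(round(zip_pmf(0, p_zero, lam), 4), round(poisson_pmf(0, lam), 4))
print(round(total, 6), round(mean, 6))
```

Zeros are more common than a plain Poisson with the same rate would predict, and the mean shrinks to `(1 - p_zero) * lam`.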

Over-Dispersed Models

When variance exceeds what the simple model predicts:

  • Beta-binomial: continuous mixture of binomial probabilities
  • Gamma-Poisson (negative binomial): continuous mixture of Poisson rates

These are the observation-level equivalents of multilevel models: they model unexplained heterogeneity by letting each observation have its own probability or rate, without explicitly modeling groups.
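To see the extra variance numerically, the sketch below computes the mean and variance of a beta-binomial and a gamma-Poisson directly from their probability mass functions. The parameterizations are assumptions chosen for this example (beta shape parameters `a` and `b`; gamma-Poisson mean `mu` with shape `phi`) and may differ from those used by any particular package.

```python
import math

def log_beta(a, b):
    """Log of the beta function, computed from log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(y, n, a, b):
    """Binomial with its probability integrated over a Beta(a, b)."""
    return math.comb(n, y) * math.exp(log_beta(y + a, n - y + b) - log_beta(a, b))

def gamma_poisson_pmf(y, mu, phi):
    """Poisson with its rate integrated over a gamma with mean mu, shape phi."""
    log_pmf = (math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
               + phi * math.log(phi / (phi + mu))
               + y * math.log(mu / (phi + mu)))
    return math.exp(log_pmf)

def mean_and_variance(pmf, support):
    """Mean and variance of a discrete distribution over the given support."""
    mean = sum(y * pmf(y) for y in support)
    variance = sum((y - mean) ** 2 * pmf(y) for y in support)
    return mean, variance

# Hypothetical settings: 10 trials with average probability 0.5,
# and a count process with average rate 3.
bb_mean, bb_var = mean_and_variance(
    lambda y: beta_binomial_pmf(y, 10, 2.0, 2.0), range(11))
gp_mean, gp_var = mean_and_variance(
    lambda y: gamma_poisson_pmf(y, 3.0, 2.0), range(400))

print(round(bb_mean, 3), round(bb_var, 3))  # plain binomial variance: 2.5
print(round(gp_mean, 3), round(gp_var, 3))  # plain Poisson variance: 3.0
```

In both cases the mean matches the simple model while the variance is larger, which is the over-dispersion these mixtures are designed to absorb.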

See Also