Garden of Forking Data

Summary

Chapter 2 of Statistical Rethinking introduces Bayesian inference through the metaphor of a “garden of forking data”: counting the ways the data could have been produced under each possible parameter value. This chapter covers the globe-tossing example, Bayesian updating, and three computational approaches: grid approximation, quadratic approximation, and Markov chain Monte Carlo (MCMC).

Small Worlds and Large Worlds

  • Small world: the self-contained logical world of the model, where all possibilities are known
  • Large world: the real world, where the model is always an approximation

Within the small world, Bayesian inference gives optimal answers, given the model's assumptions. Whether those answers are useful in the large world depends on how well the model captures reality.

The Garden of Forking Data

For each possible parameter value, count the number of paths through the data that are consistent with that value. More consistent paths → higher plausibility.

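This is Bayes’ theorem in counting form:

posterior plausibility of a value ∝ (ways that value can produce the data) × (prior plausibility of that value)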

The globe-tossing example: estimating the proportion of water on Earth by tossing a globe and recording “water” or “land.”
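
The counting logic can be sketched for a deliberately simplified case. The sketch below assumes a hypothetical globe divided into four equal regions, so the only candidate proportions of water are 0/4 through 4/4, and a made-up observation sequence W, L, W. Neither the four-region globe nor the sequence comes from the summary above, and `REGIONS` and `count_paths` are illustrative names.

```python
REGIONS = 4                      # assumption: globe idealised as 4 equal regions
observations = ["W", "L", "W"]   # assumption: illustrative sequence of tosses


def count_paths(water_regions, observations):
    """Count the paths consistent with the observations when
    `water_regions` of the REGIONS regions are water."""
    paths = 1
    for obs in observations:
        # Each toss can land on any region of the matching type.
        paths *= water_regions if obs == "W" else REGIONS - water_regions
    return paths


# Ways each conjecture (0..4 water regions) could have produced the data.
ways = [count_paths(k, observations) for k in range(REGIONS + 1)]

# Dividing by the total turns counts into relative plausibilities.
total = sum(ways)
plausibility = [w / total for w in ways]

for k, (w, pl) in enumerate(zip(ways, plausibility)):
    print(f"{k}/{REGIONS} water: {w} paths, plausibility {pl:.2f}")
```

Conjectures that cannot produce an observed outcome (all land or all water here) get zero paths and are ruled out entirely; the rest are ranked by how many paths they allow.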

Components of the Model

Every Bayesian model has three components:

  1. Likelihood: the probability of the observed data given a parameter value; for globe tossing, the binomial distribution
  2. Prior: initial plausibility of each value before seeing data
  3. Posterior: updated plausibility after conditioning on data
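
As a rough illustration of the likelihood component, the binomial probability can be computed with the Python standard library alone. The counts used here (6 water observations in 9 tosses), the candidate proportions, and the function name `binomial_likelihood` are assumptions chosen for illustration.

```python
from math import comb


def binomial_likelihood(w, n, p):
    """Probability of w 'water' results in n independent tosses
    when each toss shows water with probability p."""
    return comb(n, w) * p**w * (1 - p) ** (n - w)


W, N = 6, 9  # assumption: illustrative data, 6 water in 9 tosses

# The same data are more or less probable under different values of p.
for p in (0.25, 0.5, 0.75):
    print(f"p = {p}: likelihood = {binomial_likelihood(W, N, p):.4f}")
```

Read as a function of p with the data held fixed, these values say how strongly the data support each candidate proportion; multiplying by the prior and normalizing gives the posterior.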

Bayesian Updating is Sequential

The posterior from one batch of data becomes the prior for the next. The final result is the same regardless of whether you update one observation at a time or all at once.
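
The equivalence of one-at-a-time and all-at-once updating can be checked numerically. This is a minimal sketch on a grid of candidate proportions, assuming independent tosses, a flat prior, and a hypothetical toss sequence; `normalize` and `update` are illustrative helper names.

```python
def normalize(weights):
    """Rescale weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def update(prior, grid, observations):
    """Multiply the prior by the likelihood of each observation
    (p for water, 1 - p for land), then renormalize."""
    posterior = list(prior)
    for obs in observations:
        posterior = [
            weight * (p if obs == "W" else 1 - p)
            for weight, p in zip(posterior, grid)
        ]
    return normalize(posterior)


grid = [i / 100 for i in range(101)]       # candidate proportions of water
flat_prior = normalize([1.0] * len(grid))  # assumption: flat prior
tosses = list("WLWWWLWLW")                 # assumption: illustrative data

# Sequential: each posterior becomes the prior for the next toss.
sequential = flat_prior
for toss in tosses:
    sequential = update(sequential, grid, [toss])

# Batch: condition on all tosses at once.
batch = update(flat_prior, grid, tosses)

max_diff = max(abs(a - b) for a, b in zip(sequential, batch))
print(f"largest difference between the two posteriors: {max_diff:.2e}")
```

The two posteriors agree up to floating-point rounding, because each update only multiplies by a likelihood term and renormalizes, and multiplication does not depend on grouping.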

Three Computational Engines

| Method | How it works | When to use |
| --- | --- | --- |
| Grid approximation | Evaluate posterior at discrete grid points | Small number of parameters |
| Quadratic (Laplace) approximation | Approximate posterior as Gaussian at the mode | Medium problems; see Approximation Methods |
| MCMC | Draw samples proportional to posterior | Complex models; see MCMC Basics |
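
The three engines can be compared on one small problem. The sketch below is not a reference implementation: it assumes hypothetical data (6 water in 9 tosses) and a flat prior, uses the closed-form mode and curvature that this particular model allows for the quadratic step (in general both are found numerically), and uses a bare-bones Metropolis sampler as one simple member of the MCMC family. The proposal width, chain length, and burn-in are arbitrary choices.

```python
import math
import random

W, N = 6, 9  # assumption: illustrative data, 6 water in 9 tosses


def unnormalized_posterior(p):
    """Binomial likelihood times a flat prior, constant factors dropped."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return p**W * (1 - p) ** (N - W)


# 1. Grid approximation: evaluate on a grid, then normalize.
grid = [i / 1000 for i in range(1001)]
weights = [unnormalized_posterior(p) for p in grid]
total = sum(weights)
grid_posterior = [w / total for w in weights]
grid_mean = sum(p * w for p, w in zip(grid, grid_posterior))

# 2. Quadratic approximation: Gaussian centred at the posterior mode,
#    with variance equal to the inverse curvature of the negative log
#    posterior there. For log post = W*log(p) + (N-W)*log(1-p):
mode = W / N
curvature = W / mode**2 + (N - W) / (1 - mode) ** 2
quad_sd = math.sqrt(1 / curvature)

# 3. MCMC (Metropolis): random-walk proposals, accepted with
#    probability min(1, posterior ratio).
rng = random.Random(1)  # fixed seed so the run is repeatable
current = 0.5
samples = []
for _ in range(20000):
    proposal = current + rng.gauss(0, 0.1)
    ratio = unnormalized_posterior(proposal) / unnormalized_posterior(current)
    if rng.random() < ratio:
        current = proposal
    samples.append(current)
kept = samples[2000:]  # discard early draws as burn-in
mcmc_mean = sum(kept) / len(kept)

print(f"grid:      posterior mean ~ {grid_mean:.3f}")
print(f"quadratic: mode = {mode:.3f}, sd ~ {quad_sd:.3f}")
print(f"MCMC:      posterior mean ~ {mcmc_mean:.3f}")
```

With one parameter all three are cheap; the grid's cost grows exponentially with the number of parameters, which is why the table reserves it for small problems.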

See Also