HMC and Stan in Practice

Summary

Chapter 8 of Statistical Rethinking introduces Markov chain Monte Carlo through the King Markov parable, then covers Hamiltonian Monte Carlo (HMC) as implemented in Stan. The map2stan function translates map-style model definitions directly to Stan code, bridging the book’s earlier quadratic approximation with full MCMC.

Good King Markov

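A parable introducing the Metropolis algorithm: a king must visit islands arranged in a ring so that the time he spends on each is proportional to its population. Each week he flips a coin to pick one of the two neighboring islands as a proposal, then moves there with probability equal to the ratio of the proposed island's population to the current island's (he always moves when the proposed island is larger). Over time, the fraction of weeks spent on each island converges to the population proportions. Because the rule needs only ratios of populations, it samples from the target distribution without knowing the normalizing constant.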
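The king's rule can be simulated in a few lines of base R. This is a minimal sketch, assuming ten islands whose populations are proportional to their index (1 to 10); the function name simulate_king and its arguments are illustrative and not part of the rethinking package.

```r
# Metropolis sketch of the King Markov parable (illustrative names throughout).
# Assumption: island i has relative population i, so the target is i / sum(1:n).
simulate_king <- function(n_weeks, n_islands = 10) {
  positions <- numeric(n_weeks)
  current <- n_islands                      # start on the largest island
  for (week in seq_len(n_weeks)) {
    positions[week] <- current
    # Coin flip: propose the neighbor on one side or the other
    proposal <- current + sample(c(-1, 1), size = 1)
    # The islands form a ring, so wrap around at the ends
    if (proposal < 1) proposal <- n_islands
    if (proposal > n_islands) proposal <- 1
    # Move with probability min(1, population ratio)
    prob_move <- proposal / current
    if (runif(1) < prob_move) current <- proposal
  }
  positions
}

set.seed(8)
positions <- simulate_king(1e5)
visit_fraction <- tabulate(positions, nbins = 10) / length(positions)
target <- (1:10) / sum(1:10)
round(rbind(visit_fraction, target), 3)
```

The two rows of the printed table should roughly agree. Note that the code never uses sum(1:10) to decide a move; the total appears only when checking the result.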

From Metropolis to Hamiltonian Monte Carlo

| Algorithm | Metaphor | Efficiency |
|---|---|---|
| Metropolis | Random walk among islands | Low: random proposals waste many steps |
| Gibbs (adaptive proposals) | Conjugate tricks | Medium: requires conjugate structure |
| HMC | Frictionless physics simulation | High: exploits gradient information to make long, directed jumps |

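HMC treats the negative log-posterior as a landscape and simulates a frictionless particle sliding across it. Each iteration gives the particle a random momentum, follows its path using the gradient of the log-posterior, and takes the end point as a proposal. The momentum carries the particle to distant, high-probability regions, so successive samples are far less correlated than in a random walk.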

NUTS (No-U-Turn Sampler) automates the tuning of HMC’s trajectory length — it’s the default algorithm in Stan.
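The physics metaphor can be made concrete with a hand-rolled sampler. The following is a minimal sketch of plain HMC with a fixed number of leapfrog steps, assuming a one-dimensional standard normal target; it is not NUTS and not Stan's implementation, and the names U, grad_U, and hmc_step are illustrative.

```r
# HMC sketch for a one-dimensional standard normal target (illustrative only).
# U is the negative log density up to a constant; grad_U is its gradient.
U      <- function(q) 0.5 * q^2
grad_U <- function(q) q

hmc_step <- function(q, step_size = 0.1, n_leapfrog = 15) {
  p <- rnorm(1)                                     # random momentum "flick"
  q_new <- q
  p_new <- p - 0.5 * step_size * grad_U(q_new)      # half step for momentum
  for (i in seq_len(n_leapfrog)) {
    q_new <- q_new + step_size * p_new              # full step for position
    if (i < n_leapfrog) {
      p_new <- p_new - step_size * grad_U(q_new)    # full step for momentum
    }
  }
  p_new <- p_new - 0.5 * step_size * grad_U(q_new)  # final half step
  # Accept or reject based on the change in total energy
  h_old <- U(q) + 0.5 * p^2
  h_new <- U(q_new) + 0.5 * p_new^2
  if (runif(1) < exp(h_old - h_new)) q_new else q
}

set.seed(8)
n_samples <- 5000
draws <- numeric(n_samples)
q <- 3                                              # deliberately poor start
for (s in seq_len(n_samples)) {
  q <- hmc_step(q)
  draws[s] <- q
}
c(mean = mean(draws), sd = sd(draws))
```

With these settings the trajectory covers about 1.5 time units, roughly a quarter of the oscillation period for this particular target, so each draw is nearly independent of the previous one. Doubling the number of leapfrog steps would carry the particle almost to the mirror image of its starting point, which is the kind of wasted effort that automatic trajectory-length tuning is meant to avoid.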

Using map2stan

The rethinking package’s map2stan function takes the same model formula syntax as map but fits via Stan:

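library(rethinking)

# d is assumed to be a data frame with numeric columns x and y
m8.1 <- map2stan(
    alist(
        y ~ dnorm(mu, sigma),      # likelihood
        mu <- a + b*x,             # linear model
        a ~ dnorm(0, 10),          # prior for the intercept
        b ~ dnorm(0, 1),           # prior for the slope
        sigma ~ dcauchy(0, 2)      # Cauchy prior on the residual scale
    ),
    data=d
)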
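The call above takes a data frame d that this section does not define. As a hedged illustration, the base-R sketch below simulates one with the two columns the model formula uses; the sample size and the "true" parameter values are assumptions chosen only for the example.

```r
# Simulated data for the linear model above (all values are illustrative).
set.seed(8)
n <- 100
true_a <- 1
true_b <- 0.5
true_sigma <- 1
x <- rnorm(n)
d <- data.frame(
  x = x,
  y = rnorm(n, mean = true_a + true_b * x, sd = true_sigma)
)

# Quick least-squares check that the simulated relationship is as intended
ols_coef <- coef(lm(y ~ x, data = d))
round(ols_coef, 2)
```

Fitting the model to simulated data with known parameters is a cheap way to confirm that the posterior lands near the values used to generate it before moving on to real data.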

Diagnosing Your Markov Chain

Key diagnostics:

  • Trace plots: chains should look like “hairy caterpillars” — well-mixed, stationary
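  • Rhat (R-hat): compares between-chain and within-chain variance; should be very close to 1, and values clearly above 1 mean the chains have not converged to the same distribution
  • n_eff (effective sample size): the number of samples after accounting for autocorrelation; want it large enough for reliable intervals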
  • Divergent transitions: indicate the sampler hit a region of extreme curvature — reparameterize the model
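The two numeric diagnostics in the list can be approximated by hand to see what they measure. The sketch below uses the classic (non-split) R-hat formula and a crude autocorrelation-based effective sample size; both are simplified stand-ins rather than the exact estimators Stan reports, and the function names rhat_basic and n_eff_basic are illustrative.

```r
# Classic (non-split) R-hat for a matrix with one column per chain.
rhat_basic <- function(chains) {
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))          # within-chain variance
  B <- n * var(colMeans(chains))            # between-chain variance
  var_plus <- (n - 1) / n * W + B / n       # pooled variance estimate
  sqrt(var_plus / W)
}

# Crude effective sample size for a single chain: n / (1 + 2 * sum of
# autocorrelations), truncating the sum at the first negative lag.
n_eff_basic <- function(x, max_lag = 200) {
  rho <- acf(x, lag.max = max_lag, plot = FALSE)$acf[-1]  # drop lag 0
  first_neg <- which(rho < 0)[1]
  if (!is.na(first_neg)) rho <- rho[seq_len(first_neg - 1)]
  length(x) / (1 + 2 * sum(rho))
}

set.seed(8)
n <- 2000
good <- matrix(rnorm(4 * n), ncol = 4)        # four well-mixed chains
bad  <- sweep(good, 2, c(0, 1, 2, 3), "+")    # chains stuck in different places
# A strongly autocorrelated chain (AR(1) with coefficient 0.9)
sticky <- as.numeric(stats::filter(rnorm(n), 0.9, method = "recursive"))

c(rhat_good = rhat_basic(good), rhat_bad = rhat_basic(bad))
c(n_eff_iid = n_eff_basic(good[, 1]), n_eff_sticky = n_eff_basic(sticky))
```

The well-mixed chains should give an R-hat near 1 and an effective sample size close to the number of draws, while the shifted chains inflate R-hat and the autocorrelated chain yields far fewer effective samples than iterations.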

Divergent Transitions

Never ignore divergent transitions. They indicate the posterior geometry is pathological. Common fix: use a non-centered parameterization for hierarchical models (see Efficient MCMC).
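The idea behind a non-centered parameterization is that a group-level effect drawn from Normal(mu, tau) can be rewritten as mu + tau * z with z drawn from Normal(0, 1). The sampler then works with z, whose prior scale no longer depends on tau. The base-R sketch below only checks that the two forms describe the same distribution; the values of mu and tau are assumptions for illustration, and no model is being fitted.

```r
# Centered vs non-centered draws of a group-level effect (illustrative values).
set.seed(8)
n_draws <- 1e5
mu  <- 2      # population mean (assumed)
tau <- 3      # population standard deviation (assumed)

# Centered: theta is drawn directly, so its scale depends on tau
theta_centered <- rnorm(n_draws, mean = mu, sd = tau)

# Non-centered: draw a standardized z, then shift and scale it
z <- rnorm(n_draws, mean = 0, sd = 1)
theta_noncentered <- mu + tau * z

rbind(
  centered     = c(mean = mean(theta_centered), sd = sd(theta_centered)),
  non_centered = c(mean = mean(theta_noncentered), sd = sd(theta_noncentered))
)
```

Both rows should show a mean near 2 and a standard deviation near 3. The model is unchanged; only the quantity the sampler explores differs.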

See Also