The SBC Algorithm

Summary

The SBC procedure: for each of $N$ replications, draw a ground truth from the prior, simulate data from it, fit the algorithm to get $L$ posterior draws, then compute the rank of the prior draw within those $L$ draws for each one-dimensional quantity of interest. Histogram the ranks across replications; under correct computation they follow the discrete uniform distribution on $\{0, \dots, L\}$. Algorithm 2 adds a thinning step so the procedure works with correlated MCMC samples.

Overview

SBC operationalizes the uniformity theorem (Rank Statistics and Uniformity). Its only requirement is a generative model. It is expensive, because you fit $N$ simulated datasets before ever touching your real data, but the fits are embarrassingly parallel and spread well across a cluster or cloud machines (the paper’s examples ran in at most a few hours). Even a handful of simulations catches gross problems.

Main Content

Algorithm 1 — Simulation-Based Calibration (ideal / independent samples)

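Initialize a histogram with bins centered around $0, \dots, L$.

for $n$ in $1, \dots, N$ do

  1. Draw a prior sample $\tilde{\theta} \sim \pi(\theta)$.
  2. Draw a simulated dataset $\tilde{y} \sim \pi(y \mid \tilde{\theta})$.
  3. Draw posterior samples $\{\theta_1, \dots, \theta_L\} \sim \pi(\theta \mid \tilde{y})$ (via the algorithm under test).
  4. for each one-dimensional random variable $f$ do: compute the rank statistic $r(\{f(\theta_1), \dots, f(\theta_L)\}, f(\tilde{\theta}))$ and increment that quantity’s histogram.

Analyze each histogram for uniformity against the discrete uniform distribution on $\{0, \dots, L\}$.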
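The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: it assumes a toy conjugate model (prior $\theta \sim N(0, 1)$, observations $y_i \sim N(\theta, 1)$) so that the "algorithm under test" is exact posterior sampling, and it takes the rank to be the number of posterior draws below the prior draw. The function names, the model, and the values 2000 and 63 are all hypothetical choices made for the example.

```python
import random

def sbc_ranks(n_reps, n_draws, n_obs=10, seed=0):
    """Rank statistics for a toy conjugate normal model (illustrative only)."""
    rng = random.Random(seed)
    # Exact posterior for prior N(0, 1) and n_obs observations from N(theta, 1):
    # theta | y ~ N(sum(y) / (n_obs + 1), 1 / (n_obs + 1)).
    post_var = 1.0 / (n_obs + 1.0)
    post_sd = post_var ** 0.5
    ranks = []
    for _ in range(n_reps):
        theta_true = rng.gauss(0.0, 1.0)                         # step 1: prior draw
        y = [rng.gauss(theta_true, 1.0) for _ in range(n_obs)]   # step 2: simulated data
        post_mean = sum(y) * post_var
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]  # step 3
        # step 4: rank = number of posterior draws below the prior draw, in 0..n_draws
        ranks.append(sum(d < theta_true for d in draws))
    return ranks

def rank_histogram(ranks, n_draws):
    """Count how often each of the n_draws + 1 possible ranks occurred."""
    counts = [0] * (n_draws + 1)
    for r in ranks:
        counts[r] += 1
    return counts

ranks = sbc_ranks(n_reps=2000, n_draws=63)
counts = rank_histogram(ranks, n_draws=63)
```

Because the posterior draws here are exact and independent, the counts should scatter around `n_reps / (n_draws + 1)` with no systematic shape. Swapping the exact sampler for an approximate one is where SBC becomes informative.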

Parameter choices.

  • $N$ = number of replications (independent datasets). Limited by compute; controls the sensitivity of the histogram. The paper uses one setting for the regression and 8-schools experiments and another for the expensive INLA spatial model.
  • $L$ = number of posterior draws per dataset → $L + 1$ possible ranks → bins span $[0, L]$. The experiments fix a single $L$, so their ranks follow the discrete uniform distribution on $\{0, \dots, L\}$. Reducing $L$ speeds the procedure at the cost of sensitivity.
  • Choosing $L + 1$ as a power of 2 makes re-binning easy, because neighboring bins can be merged by repeated halving; pick an $L$ of this form when compute-limited.
  • Confidence band: each histogram is overlaid with a gray band covering 99% of the variation expected under uniformity. Formally the band runs from the 0.005 to the 0.995 quantile of $\mathrm{Binomial}(N, (L+1)^{-1})$, the distribution of a single bin’s count under uniformity, so on average only about 1 bin in 100 should fall outside it under correct computation.
  • Re-binning for noise reduction: pair neighboring ranks into $(L + 1)/2$ (or fewer) coarser bins; the paper gives an experience-based setting that balances expressiveness against variance reduction.
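The band and the re-binning step can be computed directly. The sketch below is an illustration under stated assumptions: the helper names (`binom_quantile`, `band`, `rebin`) are hypothetical, the binomial quantile is found by summing the probability mass function rather than by calling a statistics library, and the values 2000 replications and 64 bins are example inputs, not the paper's settings.

```python
import math

def binom_quantile(q, n, p):
    """Smallest k with P(X <= k) >= q for X ~ Binomial(n, p)."""
    cdf = 0.0
    for k in range(n + 1):
        # Log of the binomial pmf, computed with lgamma to avoid overflow.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log1p(-p))
        cdf += math.exp(log_pmf)
        if cdf >= q:
            return k
    return n

def band(n_reps, n_bins):
    """99% band for one bin's count when ranks are uniform over n_bins bins."""
    p = 1.0 / n_bins
    return binom_quantile(0.005, n_reps, p), binom_quantile(0.995, n_reps, p)

def rebin(counts, factor):
    """Merge each run of `factor` neighboring rank bins into one coarser bin."""
    if len(counts) % factor != 0:
        raise ValueError("number of bins must be divisible by factor")
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]

lo, hi = band(n_reps=2000, n_bins=64)        # band for the original 64 bins
coarse = rebin(list(range(64)), factor=2)    # 64 bins merged pairwise into 32
```

After re-binning, the band has to be recomputed with the new bin count (for example `band(2000, 32)`), since each coarser bin now collects a larger share of the ranks.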

Algorithm 2 — SBC for correlated MCMC (with thinning)

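Initialize a histogram with bins centered around $0, \dots, L$.

for $n$ in $1, \dots, N$ do

  1. Draw $\tilde{\theta} \sim \pi(\theta)$; draw $\tilde{y} \sim \pi(y \mid \tilde{\theta})$.
  2. Run the Markov chain for $L'$ iterations to generate correlated posterior samples $\{\theta_1, \dots, \theta_{L'}\}$.
  3. Compute the effective sample size $N_{\mathrm{eff}}[f]$ of $\{\theta_1, \dots, \theta_{L'}\}$ for the function $f$.
  4. if $N_{\mathrm{eff}}[f] < L$ then rerun the Markov chain for $L' \cdot L / N_{\mathrm{eff}}[f]$ iterations.
  5. Uniformly thin the correlated sample to $L$ states and truncate any leftover draws at $L$.
  6. Compute the rank statistic $r(\{f(\theta_1), \dots, f(\theta_L)\}, f(\tilde{\theta}))$ (Eq. 4.1) and increment the histogram.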

Analyze the histogram for uniformity.
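The rerun-and-thin logic of Algorithm 2 can be sketched as follows. Everything here is an assumption made for illustration: a stationary AR(1) process stands in for the Markov chain, the effective sample size uses a crude estimator that sums sample autocorrelations up to the first non-positive lag (production MCMC software uses more careful estimators), and the rerun length is obtained by scaling the original run by the ratio of the target number of draws to the estimated effective sample size. All function names are hypothetical.

```python
import math
import random

def ar1_chain(n_iter, rho, rng):
    """Stand-in for an MCMC run: stationary AR(1) with N(0, 1) marginal."""
    x = rng.gauss(0.0, 1.0)
    out = []
    for _ in range(n_iter):
        x = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def ess(x):
    """Crude effective sample size: n / (1 + 2 * sum of positive-lag autocorrelations)."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    var = sum(v * v for v in c) / n
    if var == 0.0:
        return float(n)
    tau = 1.0
    for lag in range(1, n):
        rho = sum(c[i] * c[i + lag] for i in range(n - lag)) / (n * var)
        if rho <= 0.0:   # stop at the first non-positive autocorrelation
            break
        tau += 2.0 * rho
    return n / tau

def thin(chain, n_keep):
    """Keep every k-th state, then truncate leftover draws at n_keep."""
    step = len(chain) // n_keep
    if step < 1:
        raise ValueError("chain shorter than n_keep")
    return chain[::step][:n_keep]

def thinned_draws(n_keep, n_iter, rho, rng):
    """Run the chain, rerun longer if the ESS is too small, then thin to n_keep."""
    chain = ar1_chain(n_iter, rho, rng)
    n_eff = ess(chain)
    if n_eff < n_keep:
        # Scale the run length so the expected ESS reaches n_keep.
        chain = ar1_chain(math.ceil(n_iter * n_keep / n_eff), rho, rng)
    return thin(chain, n_keep)

draws = thinned_draws(n_keep=63, n_iter=200, rho=0.9, rng=random.Random(1))
```

The thinned `draws` then take the place of the independent posterior draws when computing the rank statistic. In a real run the chain would target the posterior for the simulated dataset rather than a fixed AR(1) process.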

The thinning in Algorithm 2 restores the independence condition that Theorem 1 requires (see Interpreting SBC Histograms for the autocorrelation rationale). When running SBC over multiple quantities, thin the chain once, using the largest thinning value determined over all quantities; the paper recommends a minimum based on empirical quantiles (e.g. 19 equispaced quantiles).

Examples

Standard experiment configuration

Setup: $L$ posterior draws per fit, so ranks follow the discrete uniform distribution on $\{0, \dots, L\}$; $N$ replications, with one value for the regression and 8-schools studies and another for INLA.

Result: Each parameter or quantity gets its own rank histogram with a 99% gray band.

Interpretation: Uniform histogram ⇒ calibrated computation; structured deviations ⇒ specific failure modes (see Interpreting SBC Histograms, SBC Case Studies).

Connections

See Also