SBC Case Studies

Summary

The paper’s experiments (Sec. 6) show SBC catching real, distinct failure modes across four algorithm/model combinations: a misspecified prior (correct code, wrong model → ∪-shape), biased HMC on a centered hierarchical model (8-schools funnel → sloped histogram, plus autocorrelation spikes when un-thinned), a grossly biased ADVI variational approximation, and a subtle INLA bias in spatial disease mapping detectable only via the ECDF difference. In every case the statistic is the rank of the prior draw among the posterior draws.

Overview

Each case runs SBC and reads the resulting rank histogram against the 99% band (see Interpreting SBC Histograms). Sections 6.1-6.3 share one replication count; the computationally expensive INLA study (6.4) used a different one. The linear regression data-generating process and inference model are Listings 1-2 (Stan); the centered and non-centered 8-schools models are Listings 3-4.
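The procedure shared by all four case studies can be sketched in a few lines. This is a minimal Python illustration, not the paper's Stan code: it assumes a toy conjugate model (prior N(0, 1), one observation with unit noise) so that the posterior is exact, and the replication count, number of posterior draws, and all function names are hypothetical choices made for the example. It computes one rank per replication and a 99% binomial band for the bin counts under uniformity.

```python
import math
import random


def sbc_ranks(n_reps, n_draws, rng):
    """Run SBC on a toy conjugate model; return one rank per replication."""
    ranks = []
    for _ in range(n_reps):
        theta = rng.gauss(0.0, 1.0)   # 1. draw the parameter from the prior N(0, 1)
        y = rng.gauss(theta, 1.0)     # 2. simulate one observation y ~ N(theta, 1)
        # 3. exact posterior for this conjugate pair: N(y / 2, 1 / 2)
        post_mean, post_sd = y / 2.0, math.sqrt(0.5)
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]
        # 4. rank = number of posterior draws below the prior draw (0..n_draws)
        ranks.append(sum(d < theta for d in draws))
    return ranks


def rank_histogram(ranks, n_draws):
    """Count how often each rank 0..n_draws occurs (one bin per rank)."""
    counts = [0] * (n_draws + 1)
    for r in ranks:
        counts[r] += 1
    return counts


def binomial_quantile(n, p, q):
    """Smallest k with P(X <= k) >= q for X ~ Binomial(n, p)."""
    cum = 0.0
    for k in range(n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log1p(-p))
        cum += math.exp(log_pmf)
        if cum >= q:
            return k
    return n


def uniform_band(n_reps, n_bins, level=0.99):
    """Central interval for a single bin count when ranks are uniform."""
    tail = (1.0 - level) / 2.0
    p = 1.0 / n_bins
    return binomial_quantile(n_reps, p, tail), binomial_quantile(n_reps, p, 1.0 - tail)


if __name__ == "__main__":
    rng = random.Random(1)
    n_reps, n_draws = 2000, 19        # hypothetical sizes, chosen for speed
    counts = rank_histogram(sbc_ranks(n_reps, n_draws, rng), n_draws)
    print("bin counts:", counts)
    print("99% band:", uniform_band(n_reps, n_draws + 1))
```

Because the posterior here is exact, nearly all bin counts should fall inside the band. The band is pointwise, so an occasional bin just outside it is not by itself evidence of a problem.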

Main Content

6.1 Misspecified prior — under-dispersion (∪-shape)

Setup: Linear regression (Listing 2). Prior samples are drawn from one prior while inference uses a different, narrower one. This is a common probabilistic-programming mistake: the prior used to generate the data differs from the prior used to fit.

Result: Even with exact computation, the fitted posterior is under-dispersed relative to what the (wider) generating prior implies, and the SBC histogram shows the characteristic ∪-shape with boundary spikes (Fig. 9).

Interpretation: SBC detects model mis-implementation, not just buggy algorithms. The ∪ matches the over-confident, too-narrow signature of Fig. 6.
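The effect of a generate/fit prior mismatch can be reproduced on a toy problem. The sketch below is an illustration only: it replaces the paper's regression with a one-parameter conjugate normal model so the posterior under the fitting prior is exact, and the two prior scales (10 for generating, 1 for fitting) are assumed values picked to make the effect visible, not values taken from the paper. The helper names are hypothetical.

```python
import math
import random


def sbc_ranks(gen_prior_sd, fit_prior_sd, n_reps, n_draws, rng):
    """SBC ranks when data come from one prior but the fit assumes another."""
    ranks = []
    for _ in range(n_reps):
        theta = rng.gauss(0.0, gen_prior_sd)   # parameter from the GENERATING prior
        y = rng.gauss(theta, 1.0)              # one observation, y ~ N(theta, 1)
        # Exact conjugate posterior under the FITTING prior N(0, fit_prior_sd^2)
        post_var = 1.0 / (1.0 + 1.0 / fit_prior_sd ** 2)
        post_mean = post_var * y
        post_sd = math.sqrt(post_var)
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]
        ranks.append(sum(d < theta for d in draws))
    return ranks


def edge_fraction(ranks, n_draws):
    """Share of ranks landing in the two boundary bins (0 and n_draws)."""
    return sum(r == 0 or r == n_draws for r in ranks) / len(ranks)


if __name__ == "__main__":
    rng = random.Random(2)
    n_reps, n_draws = 2000, 19                 # hypothetical sizes
    matched = sbc_ranks(1.0, 1.0, n_reps, n_draws, rng)
    mismatched = sbc_ranks(10.0, 1.0, n_reps, n_draws, rng)
    print("uniform expectation:", 2 / (n_draws + 1))
    print("matched priors:     ", edge_fraction(matched, n_draws))
    print("mismatched priors:  ", edge_fraction(mismatched, n_draws))
```

With matched priors the two boundary bins hold roughly their uniform share of the ranks; with the narrower fitting prior most ranks pile up at the boundaries, which is the ∪-shape described above.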

6.2 Biased MCMC — centered 8-schools (sloped + autocorrelation)

Setup: Hierarchical 8-schools model (Rubin 1981) in the centered parameterization (Listing 3). This parameterization induces a funnel geometry that contracts into a region of strong curvature when the population scale is small, which is hard for any MCMC sampler to explore. The model is fit with Stan’s dynamic HMC, and SBC is run with Algorithm 1, deliberately un-thinned: the bias dominates, and thinning to effectively independent draws is impractical given the low effective sample rate.

Result: The rank histograms show HMC samples biased toward larger values of the population scale than were used to generate the data (Fig. 10), a sloped, asymmetric histogram consistent with the known funnel pathology. The non-centered parameterization (Listing 4) behaves correctly: thinned (Algorithm 2) its histogram is uniform (Fig. 11a); un-thinned it shows large autocorrelation spikes (Fig. 11b).

Interpretation: SBC flags biased MCMC even when general-purpose divergence diagnostics are unavailable, which is especially valuable for hierarchical models. The centered/non-centered contrast also cleanly separates a true bias (Fig. 10) from an autocorrelation artifact (Fig. 11b); only the latter can be removed by thinning.
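The difference between an autocorrelation artifact and a genuine bias can be illustrated without HMC. The sketch below is a stand-in, not the 8-schools experiment: it assumes a toy conjugate model with a known posterior and replaces the sampler with an AR(1) chain whose stationary distribution is exactly that posterior, so any non-uniformity comes from autocorrelation alone. The correlation 0.95 and the fixed thinning interval of 60 are assumed values; the paper's Algorithm 2 chooses the thinning from the chain itself, which this sketch does not attempt.

```python
import math
import random


def ar1_posterior_draws(post_mean, post_sd, rho, n_keep, thin, rng):
    """AR(1) chain with stationary law N(post_mean, post_sd^2); keep every thin-th state."""
    x = rng.gauss(post_mean, post_sd)                 # start in stationarity
    innov_sd = post_sd * math.sqrt(1.0 - rho ** 2)    # keeps the variance at post_sd^2
    kept = []
    while len(kept) < n_keep:
        for _ in range(thin):
            x = post_mean + rho * (x - post_mean) + rng.gauss(0.0, innov_sd)
        kept.append(x)
    return kept


def sbc_ranks(rho, thin, n_reps, n_draws, rng):
    """SBC ranks for prior N(0, 1), y ~ N(theta, 1), using correlated posterior draws."""
    ranks = []
    for _ in range(n_reps):
        theta = rng.gauss(0.0, 1.0)
        y = rng.gauss(theta, 1.0)
        draws = ar1_posterior_draws(y / 2.0, math.sqrt(0.5), rho, n_draws, thin, rng)
        ranks.append(sum(d < theta for d in draws))
    return ranks


def edge_fraction(ranks, n_draws):
    """Share of ranks landing in the two boundary bins (0 and n_draws)."""
    return sum(r == 0 or r == n_draws for r in ranks) / len(ranks)


if __name__ == "__main__":
    rng = random.Random(3)
    n_reps, n_draws, rho = 600, 19, 0.95              # hypothetical settings
    unthinned = sbc_ranks(rho, 1, n_reps, n_draws, rng)
    thinned = sbc_ranks(rho, 60, n_reps, n_draws, rng)
    print("uniform expectation:", 2 / (n_draws + 1))
    print("un-thinned:         ", edge_fraction(unthinned, n_draws))
    print("thinned (every 60): ", edge_fraction(thinned, n_draws))
```

The chain targets the correct posterior in both runs, so the boundary excess in the un-thinned run is purely an artifact of correlated draws and disappears after thinning. A sampler that is actually biased would stay non-uniform no matter how heavily it is thinned.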

6.3 ADVI fails on a simple model — gross bias

Setup: Automatic Differentiation Variational Inference (ADVI) in Stan 2.17.1, applied to the simple linear regression (Listing 2). SBC is run with Algorithm 1, since ADVI produces independent draws with no autocorrelation.

Result: ADVI drastically underestimates the posterior for the slope; the rank histogram is strongly biased toward larger values (Fig. 12), in sharp contrast with HMC’s uniform histogram on the same model (Fig. 2).

Interpretation: Even on an easy model a variational approximation can be badly miscalibrated; SBC exposes it immediately.
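How a biased approximation shows up in the ranks can be seen with a hand-made stand-in. The sketch below does not run ADVI: it assumes a toy conjugate model and an "approximate posterior" that is simply the exact posterior shifted upward by a hypothetical amount (0.5). Independent draws are used, as in Algorithm 1. When the approximation sits at larger values than the true posterior, the prior draw tends to fall below most approximate draws, so the ranks skew low and the histogram is sloped rather than ∪-shaped.

```python
import math
import random


def sbc_ranks_shifted(shift, n_reps, n_draws, rng):
    """SBC ranks when the 'posterior' draws are offset from the exact posterior."""
    ranks = []
    for _ in range(n_reps):
        theta = rng.gauss(0.0, 1.0)                    # prior N(0, 1)
        y = rng.gauss(theta, 1.0)                      # y ~ N(theta, 1)
        post_mean, post_sd = y / 2.0, math.sqrt(0.5)   # exact posterior
        # Hypothetical biased approximation: same spread, mean moved by `shift`
        draws = [rng.gauss(post_mean + shift, post_sd) for _ in range(n_draws)]
        ranks.append(sum(d < theta for d in draws))
    return ranks


def mean_rank(ranks):
    """Average rank; close to n_draws / 2 when the ranks are uniform."""
    return sum(ranks) / len(ranks)


if __name__ == "__main__":
    rng = random.Random(4)
    n_reps, n_draws = 2000, 19                         # hypothetical sizes
    print("uniform expectation:", n_draws / 2)
    print("exact posterior:    ", mean_rank(sbc_ranks_shifted(0.0, n_reps, n_draws, rng)))
    print("shifted upward:     ", mean_rank(sbc_ranks_shifted(0.5, n_reps, n_draws, rng)))
```

The mean rank is only a crude one-number summary of the slope; the histogram itself remains the primary diagnostic.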

6.4 INLA — subtle bias in spatial disease mapping

Setup: Spatial HIV-prevalence model for the 2003 Kenya DHS (Corsi et al. 2012; setup from Wakefield, Simpson & Godwin 2016). The model combines binomial counts with a latent Gaussian process approximated via SPDE (Lindgren, Rue & Lindström 2011) with a Matérn-type covariance; the priors include penalized-complexity priors. The quantity of interest is the average prevalence. The model is fit with R-INLA, treated as an approximate posterior sampler.

Result: The raw SBC histogram (Fig. 13a) shows no obvious deviation, but the gray band is wide, so the histogram is too noisy to be conclusive. The ECDF and especially the ECDF-difference plots (Fig. 13b,c) reveal that low ranks occur slightly more often than they would under uniformity, indicating a small but genuine bias.

Interpretation: INLA is a fine approximation where prevalence is moderate (Kenya, ~5.4%) but inaccurate where binomial counts carry little information (near-zero prevalence, e.g. Australia ~0.1%). Detecting this required the ECDF-difference view from Interpreting SBC Histograms (Sec. 5.2) rather than the histogram alone.
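The ECDF-difference view used here is straightforward to compute from a list of ranks. The sketch below is a minimal version under stated assumptions: the function names are hypothetical, and the band is a pointwise normal approximation to the binomial at roughly the 99% level, which may differ from how the paper draws its gray band.

```python
import math


def ecdf_difference(ranks, n_draws):
    """Empirical CDF of the ranks minus the uniform CDF, at each rank 0..n_draws."""
    n = len(ranks)
    counts = [0] * (n_draws + 1)
    for r in ranks:
        counts[r] += 1
    diffs = []
    cum = 0
    for r, c in enumerate(counts):
        cum += c
        # Uniform ranks on 0..n_draws have CDF (r + 1) / (n_draws + 1)
        diffs.append(cum / n - (r + 1) / (n_draws + 1))
    return diffs


def pointwise_band(n_reps, n_draws, z=2.576):
    """Half-width of an approximate 99% pointwise band around zero (normal approximation)."""
    band = []
    for r in range(n_draws + 1):
        p = (r + 1) / (n_draws + 1)
        band.append(z * math.sqrt(p * (1.0 - p) / n_reps))
    return band


if __name__ == "__main__":
    ranks = [0, 0, 1, 3]                    # tiny made-up example with n_draws = 3
    print(ecdf_difference(ranks, 3))
    print(pointwise_band(len(ranks), 3))
```

Positive differences at low ranks mean low ranks are over-represented, the pattern reported for the INLA fit. Because the ECDF accumulates counts across bins, a small consistent excess that is invisible bin by bin can still push the difference curve outside its band.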

Examples

(The four experiments above are the worked examples; the regression data-generating process / inference model are Stan Listings 1-2, the 8-schools centered/non-centered models Listings 3-4 in Appendix A.)

Connections

See Also