Bayesian Workflow

Routing Summary

This folder covers the iterative Bayesian modeling cycle described in the Bayesian Workflow paper (Gelman et al. 2020), plus a detailed treatment of Simulation-Based Calibration (SBC; Talts et al. 2018). It contains 13 notes.

Concept Map

Concept | Note | Type | Depends On | Key Result
Full workflow vs mere Bayesian inference, Figure 1 | Bayesian Workflow - Overview | overview | Probability and Bayesian Inference, MCMC Basics, Hierarchical Models | Workflow = iterative cycle, not just fitting
Model selection, modular construction, prior predictive | Choosing and Building Models | concept | Bayesian Workflow - Overview, Hierarchical Models, Probability and Bayesian Inference | Build models modularly, check priors first
Warmup, convergence, fake-data simulation, SBC | Fitting and Validating Computation | concept | Choosing and Building Models, MCMC Basics, Efficient MCMC, Bayesian Workflow - Overview | SBC validates the full inference pipeline
Folk theorem, reparameterization, multimodality | Computational Troubleshooting | concept | Fitting and Validating Computation, MCMC Basics, Efficient MCMC, Hierarchical Models | Computational problems often signal model problems
Posterior predictive checks, cross-validation, prior influence | Evaluating Fitted Models | concept | Fitting and Validating Computation, Computational Troubleshooting, Hierarchical Models, Bayesian Workflow - Overview | Evaluate models against data and domain knowledge
Model modification, topology of models, stacking | Iterative Model Improvement | concept | Evaluating Fitted Models, Choosing and Building Models, Hierarchical Models, Bayesian Workflow - Overview | Expand or modify models based on evaluation
Version control, testing, reproducibility | Modeling as Software Development | concept | Bayesian Workflow - Overview, Fitting and Validating Computation, Choosing and Building Models, Evaluating Fitted Models | Treat model code like software
SBC method, validates correct posterior sampling, generalizes Cook-Gelman-Rubin (2006) | Simulation-Based Calibration - Overview | overview | Data-Averaged Posterior Self-Consistency, Bayesian Workflow - Overview | SBC checks computation, complements PPCs
Prior = average of exact posteriors over joint-distribution data (Eq. 1) | Data-Averaged Posterior Self-Consistency | theorem | (none) | Data-averaged posterior equals the prior
Rank of prior draw in posterior sample ~ discrete Uniform[0,L] (Theorem 1) | Rank Statistics and Uniformity | theorem | Data-Averaged Posterior Self-Consistency | Ranks uniform if sampling is exact & independent (necessary, not sufficient, for a correct sampler)
SBC procedure: sample θ ~ prior, y ~ likelihood, fit, rank; Algorithm 1 & 2 (thinning) | The SBC Algorithm | concept | Rank Statistics and Uniformity, Data-Averaged Posterior Self-Consistency | N parallel fits, L draws → rank histogram
Histogram shapes: ∪=under-dispersed, ∩=over-dispersed, sloped=biased; autocorrelation/thinning; ECDF | Interpreting SBC Histograms | concept | The SBC Algorithm, Rank Statistics and Uniformity | Deviation shape diagnoses the failure mode
Worked SBC experiments: misspecified prior, centered 8-schools HMC bias, ADVI, INLA spatial | SBC Case Studies | example | The SBC Algorithm, Interpreting SBC Histograms | SBC catches distinct real failure modes
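
For quick reference, the two theorem entries in the concept map can be written out as follows. This is a sketch in the notation I recall from Talts et al. (2018); the symbols (tildes for simulated quantities, f for any one-dimensional function of the parameters) are assumptions of this summary, since the index itself does not reproduce the equations. It assumes θ̃ is drawn from the prior, ỹ from the likelihood given θ̃, and θ_1, ..., θ_L are independent exact draws from the posterior given ỹ.

```latex
% Eq. 1 (self-consistency): averaging the exact posterior over data
% simulated from the joint distribution recovers the prior.
\pi(\theta) = \int \mathrm{d}\tilde{y}\, \mathrm{d}\tilde{\theta}\;
  \pi(\theta \mid \tilde{y})\, \pi(\tilde{y} \mid \tilde{\theta})\, \pi(\tilde{\theta})

% Theorem 1 (rank uniformity): the rank of the prior draw among L exact,
% independent posterior draws is uniform over the integers 0, ..., L.
r = \sum_{l=1}^{L} \mathbb{I}\left[ f(\theta_l) < f(\tilde{\theta}) \right]
  \;\sim\; \mathrm{Uniform}\{0, 1, \dots, L\}
```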

Notes

  • Bayesian Workflow - Overview — CONTAINS: Full workflow diagram (Figure 1), workflow vs inference distinction, iterative cycle overview
  • Choosing and Building Models — CONTAINS: Initial model selection, modular construction, prior predictive checking, domain expertise integration
  • Fitting and Validating Computation — CONTAINS: MCMC warmup, convergence checks, fake-data simulation, simulation-based calibration (SBC)
  • Computational Troubleshooting — CONTAINS: Folk theorem of statistical computing, reparameterization strategies, multimodality diagnosis
  • Evaluating Fitted Models — CONTAINS: Posterior predictive checks, cross-validation, sensitivity to priors, residual analysis
  • Iterative Model Improvement — CONTAINS: Model expansion, topology of model space, Bayesian stacking, when to stop iterating
  • Modeling as Software Development — CONTAINS: Version control for models, unit testing, reproducibility practices, documentation
  • Simulation-Based Calibration - Overview — CONTAINS: What SBC validates (correct posterior sampling), naive single-dataset check counterexample, relation to Geweke (2004) and Cook-Gelman-Rubin (2006), complement to posterior predictive checks, place in the workflow
  • Data-Averaged Posterior Self-Consistency — CONTAINS: Eq. 1 self-consistency identity (prior = data-averaged posterior), full statement + notation + proof sketch, data-averaged posterior definition
  • Rank Statistics and Uniformity — CONTAINS: Rank statistic construction (Eq. 4.1), Theorem 1 uniformity statement + conditions (independence, exact sampling), Appendix B proof sketch, why ranks beat CDF values
  • The SBC Algorithm — CONTAINS: Algorithm 1 (ideal) and Algorithm 2 (thinned MCMC) step by step, choice of N and L, 99% Binomial confidence band, re-binning, effective-sample-size thinning
  • Interpreting SBC Histograms — CONTAINS: Uniform/∪/∩/sloped shape catalogue and meanings, autocorrelation boundary spikes + thinning correction (N_eff, CLT), ECDF and ECDF-difference for small deviations
  • SBC Case Studies — CONTAINS: Misspecified-prior ∪-shape (6.1), centered 8-schools HMC bias + non-centered autocorrelation (6.2), ADVI gross slope bias (6.3), INLA subtle spatial bias via ECDF (6.4), Stan Listings 1-4
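
The procedure summarized under The SBC Algorithm (draw θ from the prior, simulate data, fit, rank) can be sketched in a few lines. The example below is a minimal illustration, not the paper's code: the toy conjugate normal model, the function names (`sbc_ranks`, `bin_counts`, `chi_square`), and the `post_sd_scale` knob are all assumptions introduced here. Because the posterior is available in closed form, exact draws stand in for a fitted sampler, and shrinking the posterior standard deviation mimics an under-dispersed fit.

```python
import random
from collections import Counter


def sbc_ranks(n_sims=1000, n_draws=99, n_obs=10, seed=0, post_sd_scale=1.0):
    """Return one SBC rank per simulated dataset for a toy conjugate model.

    Illustrative model (an assumption, not taken from the notes):
        theta ~ Normal(0, 1),  y_i | theta ~ Normal(theta, 1),  i = 1..n_obs
    Exact posterior: Normal(sum(y) / (n_obs + 1), 1 / (n_obs + 1)).
    post_sd_scale != 1 deliberately breaks the "sampler" to show a failure.
    """
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        theta_true = rng.gauss(0.0, 1.0)                        # draw from the prior
        y = [rng.gauss(theta_true, 1.0) for _ in range(n_obs)]  # simulate data
        post_mean = sum(y) / (n_obs + 1)
        post_sd = post_sd_scale * (1.0 / (n_obs + 1)) ** 0.5
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]
        # Rank of the prior draw among the posterior draws, an integer in 0..n_draws.
        ranks.append(sum(d < theta_true for d in draws))
    return ranks


def bin_counts(ranks, n_draws=99, n_bins=10):
    """Histogram the ranks into equal-width bins (n_draws + 1 divisible by n_bins)."""
    width = (n_draws + 1) // n_bins
    counts = Counter(r // width for r in ranks)
    return [counts.get(b, 0) for b in range(n_bins)]


def chi_square(counts):
    """Pearson chi-square statistic against a uniform histogram."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


good = bin_counts(sbc_ranks())                   # exact posterior draws
bad = bin_counts(sbc_ranks(post_sd_scale=0.5))   # under-dispersed posterior draws

print("exact sampler:   ", good, round(chi_square(good), 1))
print("under-dispersed: ", bad, round(chi_square(bad), 1))
```

With exact draws the bin counts should look flat up to sampling noise, while the under-dispersed variant piles ranks into the first and last bins (the ∪ shape). Choosing L = 99 gives 100 possible ranks, which divide evenly into 10 bins; a real run would replace the closed-form draws with thinned output from the sampler under test.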

Sources

  • BayesWorkflow.pdf — Gelman et al. (2020), arXiv:2011.01808
  • 1804.06788-Talts-SBC.pdf — Talts, Betancourt, Simpson, Vehtari & Gelman (2018), “Validating Bayesian Inference Algorithms with Simulation-Based Calibration”, arXiv:1804.06788 (shares authors with the Bayesian Workflow paper and BDA3)