Bayesian Experimental Design - Overview
Summary
Bayesian experimental design (BED, also BOED) chooses experiment designs that maximize the expected information gain (EIG) about latent variables: the expected reduction in posterior entropy, equivalently the mutual information $I(\theta; y \mid d)$ between the latent variables $\theta$ and the outcome $y$ of an experiment with design $d$. The central obstacle is that the EIG is a doubly intractable nested expectation. This overview maps how three papers progressively address it: fast variational EIG estimators (Foster et al. 2019), a unified stochastic-gradient scheme that jointly optimizes estimator and design (Foster et al. 2020), and a review of the resulting “computational revolution” up to policy-based adaptive design (Rainforth et al. 2023).
Overview
When experimentation is costly, slow, or dangerous, we want to choose the design that teaches us the most. BED formalizes “the most” information-theoretically: build a Bayesian model, with a prior $p(\theta)$ over the latent variables and a likelihood/simulator $p(y \mid \theta, d)$ for the outcome $y$ under design $d$, and pick the design that maximizes the EIG, $d^{*} = \arg\max_{d} \mathrm{EIG}(d)$. This is a principled, model-based alternative to classical (frequentist) design criteria based on the Fisher information matrix.
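Written out, the objective takes the following form. The notation is an assumption of this overview, following the usual convention in the BOED literature: $\theta$ for the latent variables, $y$ for the outcome, $d$ for the design, and $\mathcal{D}$ for the design space.

```latex
\begin{aligned}
\mathrm{EIG}(d)
  &= \mathbb{E}_{p(y \mid d)}\big[ H[p(\theta)] - H[p(\theta \mid y, d)] \big] \\
  &= \mathbb{E}_{p(\theta)\,p(y \mid \theta, d)}\big[ \log p(y \mid \theta, d) - \log p(y \mid d) \big] \\
  &= I(\theta; y \mid d), \\
d^{*} &= \arg\max_{d \in \mathcal{D}} \mathrm{EIG}(d).
\end{aligned}
```

The second line shows where the difficulty comes from: the marginal likelihood $p(y \mid d) = \mathbb{E}_{p(\theta)}[p(y \mid \theta, d)]$ is itself an expectation, and it sits inside the logarithm of an outer expectation.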
The framework dates to Lindley (1956) and Chaloner & Verdinelli (1995). Its modern resurgence is driven by machine-learning tools (amortized variational inference, stochastic gradients, and neural networks) that make the EIG cheap enough to optimize in high dimensions and in real time. The three modern papers ingested here (all from the Oxford / Rainforth group) form the methodological core of that resurgence.
Main Content
The four ingested sources
| Source | Role | Key contribution |
|---|---|---|
| Lindley (1956) — On a Measure of the Information… (Ann. Math. Stat.) | foundation | Defines the average information of an experiment (= EIG); non-negativity, additivity, the design rule; determinant criterion → Lindley’s Information Measure |
| Foster et al. 2019 — Variational BOED (NeurIPS) | estimation | Four fast variational EIG estimators with $\mathcal{O}(T^{-1/2})$ convergence in total computational cost $T$, vs $\mathcal{O}(T^{-1/3})$ for nested Monte Carlo |
| Foster et al. 2020 — Unified SGD BOED (AISTATS) | estimation + optimization | Replaces the two-stage (estimate-then-optimize) procedure with a single stochastic-gradient ascent on a variational lower bound; introduces the ACE and PCE bounds |
| Rainforth et al. 2023 — Modern BED (Statistical Science) | review | Synthesizes nested estimation, debiasing (MLMC), variational bounds, gradient optimization, and policy-based adaptive design (DAD) |
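To give a feel for the PCE (prior contrastive estimation) bound listed in the table, here is a minimal NumPy sketch. The toy model is an assumption made purely for illustration and is not taken from the papers: $\theta \sim \mathcal{N}(0, 1)$ and $y \mid \theta, d \sim \mathcal{N}(d\theta, \sigma^2)$, for which the exact EIG is $\tfrac{1}{2}\log(1 + d^2/\sigma^2)$. The function names `log_lik`, `pce_bound`, and `exact_eig` are hypothetical. The sketch only evaluates the bound at a fixed design; the joint stochastic-gradient optimization over the design is not shown.

```python
import numpy as np


def log_lik(y, theta, d, sigma=1.0):
    """Log-density of N(y; d * theta, sigma^2), broadcast over the inputs."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - d * theta) / sigma) ** 2


def pce_bound(d, n_outer=4000, n_contrast=50, sigma=1.0, seed=0):
    """Monte Carlo estimate of the PCE lower bound on EIG(d) for the toy model."""
    rng = np.random.default_rng(seed)
    theta0 = rng.standard_normal(n_outer)                   # theta_0 ~ p(theta)
    y = d * theta0 + sigma * rng.standard_normal(n_outer)   # y ~ p(y | theta_0, d)
    contrast = rng.standard_normal((n_outer, n_contrast))   # theta_1..L ~ p(theta)
    # Column 0 holds the sample that generated y, columns 1..L the contrastive ones.
    all_theta = np.concatenate([theta0[:, None], contrast], axis=1)
    ll = log_lik(y[:, None], all_theta, d, sigma)           # shape (n_outer, L + 1)
    # Numerically stable log of the average likelihood over all L + 1 samples.
    m = ll.max(axis=1, keepdims=True)
    log_denom = m[:, 0] + np.log(np.mean(np.exp(ll - m), axis=1))
    return float(np.mean(ll[:, 0] - log_denom))


def exact_eig(d, sigma=1.0):
    """Closed-form EIG for the toy linear-Gaussian model."""
    return 0.5 * np.log1p(d**2 / sigma**2)


if __name__ == "__main__":
    for d in (0.5, 1.0, 2.0):
        print(f"d={d}: PCE={pce_bound(d):.3f}, exact={exact_eig(d):.3f}")
```

Because the generating sample also appears in the denominator, every term is at most $\log(L + 1)$, where $L$ is the number of contrastive samples. The bound therefore saturates when the true EIG is large and $L$ is small.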
The central problem the field solves
The EIG (Expected Information Gain) cannot be evaluated directly because both the marginal likelihood and the posterior are intractable — a double intractability requiring nested estimation. The progression across the three papers is:
- Make estimation fast — replace per-outcome nested Monte Carlo with amortized variational approximations that share information across outcomes (Variational BOED - Overview).
- Fuse estimation and optimization — make the variational bound differentiable in both the variational and design parameters, so one SGD loop does everything (Unified SGD BOED - Overview).
- Scale to adaptive, real-time, implicit-model settings — debiasing schemes, implicit-likelihood estimators, and amortized design policies (Modern Bayesian Experimental Design - Overview).
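The nested Monte Carlo baseline that the first step improves on can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not code from any of the papers: the toy model is $\theta \sim \mathcal{N}(0, 1)$, $y \mid \theta, d \sim \mathcal{N}(d\theta, \sigma^2)$, chosen because its EIG has the closed form $\tfrac{1}{2}\log(1 + d^2/\sigma^2)$, and the names `log_likelihood`, `nmc_eig`, and `exact_eig` are hypothetical.

```python
import numpy as np


def log_likelihood(y, theta, d, sigma):
    """Log-density of N(y; d * theta, sigma^2), broadcast over the inputs."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - d * theta) / sigma) ** 2


def nmc_eig(d, n_outer=2000, n_inner=200, sigma=1.0, seed=0):
    """Nested Monte Carlo estimate of EIG(d) for the toy linear-Gaussian model."""
    rng = np.random.default_rng(seed)
    # Outer samples: theta_n ~ p(theta), y_n ~ p(y | theta_n, d).
    theta = rng.standard_normal(n_outer)
    y = d * theta + sigma * rng.standard_normal(n_outer)
    # Inner samples: fresh prior draws for every outer sample, used to
    # estimate the intractable marginal likelihood p(y_n | d).
    theta_inner = rng.standard_normal((n_outer, n_inner))
    log_lik = log_likelihood(y, theta, d, sigma)
    inner = log_likelihood(y[:, None], theta_inner, d, sigma)  # (n_outer, n_inner)
    # Numerically stable log-mean-exp over the inner samples.
    m = inner.max(axis=1, keepdims=True)
    log_marginal = m[:, 0] + np.log(np.mean(np.exp(inner - m), axis=1))
    return float(np.mean(log_lik - log_marginal))


def exact_eig(d, sigma=1.0):
    """Closed-form EIG, available only because the toy model is conjugate."""
    return 0.5 * np.log1p(d**2 / sigma**2)


if __name__ == "__main__":
    for d in (0.5, 1.0, 2.0):
        print(f"d={d}: NMC={nmc_eig(d):.3f}, exact={exact_eig(d):.3f}")
```

The cost is `n_outer * n_inner` likelihood evaluations per design, and the inner sample size has to grow along with the outer one for the estimate to converge. That per-outcome inner loop is what the amortized variational estimators replace.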
Folder map
- Foundations — the shared conceptual core: Lindley’s Information Measure, Expected Information Gain, Nested Estimation and Nested Monte Carlo, Sequential and Adaptive BED.
- Variational EIG Estimators — Foster 2019: the posterior, marginal, VNMC, and implicit-likelihood estimators.
- Gradient-Based Unified BOED — Foster 2020: BA, ACE, PCE, likelihood-free ACE, and high-dimensional applications.
- Modern BED Review — Rainforth 2023: objectives, the computational revolution, optimization, policies, and open challenges.
Connections
- Generalizes / formalizes classical experimental design (Fisher information, alphabetic A/D/E-optimality) within a coherent Bayesian decision-theoretic framework — see Information-Theoretic Design Objectives.
- Special cases include Bayesian active learning (BALD), Bayesian optimization, and adaptive design optimization in cognitive science.
- Builds on mutual-information estimation from representation learning (InfoNCE, MINE, Barber–Agakov) — the variational EIG bounds are MI bounds repurposed for design.
- Contrasts with frequentist experimental design (power analysis, Type S/M errors), which fixes a design before any data are collected and reasons about long-run error rates rather than information.
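The mutual-information connection in the list above can be made concrete with the Barber–Agakov bound. The notation is assumed for illustration: $q_{\phi}(\theta \mid y, d)$ is any variational approximation to the posterior with parameters $\phi$, and $H[p(\theta)]$ is the prior entropy.

```latex
\begin{aligned}
\mathrm{EIG}(d) = I(\theta; y \mid d)
  &\ge \mathbb{E}_{p(\theta)\,p(y \mid \theta, d)}\big[ \log q_{\phi}(\theta \mid y, d) \big] + H[p(\theta)], \\
\text{gap}
  &= \mathbb{E}_{p(y \mid d)}\Big[ \mathrm{KL}\big( p(\theta \mid y, d) \,\|\, q_{\phi}(\theta \mid y, d) \big) \Big] \ge 0.
\end{aligned}
```

The bound is tight exactly when $q_{\phi}$ matches the true posterior, so maximizing the right-hand side over $\phi$ tightens the estimate, while maximizing it over $d$ pushes the design toward higher information gain.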
See Also
- Expected Information Gain — the objective every method optimizes
- Decision Analysis — EIG as the expected utility of an experiment under a KL/log-score utility
- Approximation Methods — variational inference, the engine behind fast EIG estimation
- Experimental Design (frequentist) — the classical counterpart