Gradient-Based Unified BOED

Routing Summary

Foster et al. (2020), "A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments" (AISTATS). A single stochastic gradient ascent (SGA) loop jointly tightens a variational lower bound on the expected information gain (EIG) and optimizes the design, so no separate outer optimizer is needed and the method scales to hundreds of design dimensions. Contains 4 notes plus an overview.

Concept Map

| Concept | Note | Type | Depends On | Key Result |
|---|---|---|---|---|
| Two-stage→one-stage; lower bounds; BA/ACE/PCE; recommend ACE | Unified SGD BOED - Overview | overview | Variational BOED - Overview | Joint SGA on the variational parameters and the design; ~2× EIG vs Bayesian optimization (BO) in high-D |
| ACE bound; Theorem 1; adaptive contrastive tightening | Adaptive Contrastive Estimation (ACE) | theorem | Variational Posterior Estimator (Barber-Agakov) | Tight if q = posterior or L → ∞; monotone in L (number of contrastive samples) |
| PCE; prior contrasts; InfoNCE; unnormalized prior | Prior Contrastive Estimation (PCE) | theorem | Adaptive Contrastive Estimation (ACE) | ACE with q = prior; tight as L → ∞; = InfoNCE |
| Theorem 2; score/reparam/RB gradients | Likelihood-Free ACE and Gradient Estimation | theorem | Adaptive Contrastive Estimation (ACE) | Unnormalized likelihood keeps a valid lower bound; reparam ≪ score variance |
| Death process; 400-D regression; docking; CES | High-Dimensional Design Applications | example | Unified SGD BOED - Overview | Gradient methods ~2× EIG vs BO; ACE beats experts (docking) |
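
For quick reference, the ACE and PCE bounds named in the concept map can be sketched as follows. The notation is an assumption of this summary rather than a quotation: θ denotes the model parameters, y the outcome, ξ the design, q_φ(θ | y) the variational posterior, and L the number of contrastive samples. The forms should be checked against Eqs. 11 and 12 of the paper before being relied on.

```latex
% ACE: contrastive samples theta_1..theta_L are drawn from the variational posterior q_phi
\mathcal{I}_{\mathrm{ACE}}(\xi, \phi, L)
  = \mathbb{E}_{p(\theta_0)\, p(y \mid \theta_0, \xi)\, q_\phi(\theta_{1:L} \mid y)}
    \left[ \log
      \frac{p(y \mid \theta_0, \xi)}
           {\frac{1}{L+1} \sum_{\ell=0}^{L}
              \frac{p(\theta_\ell)\, p(y \mid \theta_\ell, \xi)}{q_\phi(\theta_\ell \mid y)}}
    \right]

% PCE: the special case q_phi(theta | y) = p(theta), so the prior ratios cancel
\mathcal{I}_{\mathrm{PCE}}(\xi, L)
  = \mathbb{E}_{p(\theta_0)\, p(y \mid \theta_0, \xi)\, p(\theta_{1:L})}
    \left[ \log
      \frac{p(y \mid \theta_0, \xi)}
           {\frac{1}{L+1} \sum_{\ell=0}^{L} p(y \mid \theta_\ell, \xi)}
    \right]
```

Setting L = 0 in the ACE form leaves log q_φ(θ_0 | y) − log p(θ_0) inside the expectation, which is the Barber-Agakov bound. Every PCE integrand is at most log(L + 1), because the denominator always contains the numerator term.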

Notes

  • Unified SGD BOED - Overview. CONTAINS: the two-stage problem; the unified lower-bound idea (why a lower bound rather than an upper bound); BA/ACE/PCE table; summary of Theorems 1–2; gradient-estimator summary; two-stage vs one-stage headline; five-experiment summary.
  • Adaptive Contrastive Estimation (ACE). CONTAINS: the ACE bound (Eq. 11); Theorem 1 (lower bound with a KL error term, monotone in the number of contrastive samples L, exact as L → ∞ or with a perfect q); BA = the L = 0 case; InfoNCE connection; death-process result.
  • Prior Contrastive Estimation (PCE). CONTAINS: the PCE bound (Eq. 12); InfoNCE bound (Eq. 13); unnormalized-prior trick (Eq. 15) for iterated design; PCE-vs-ACE selection.
  • Likelihood-Free ACE and Gradient Estimation. CONTAINS: Theorem 2 (unnormalized likelihood → valid lower bound, Eq. 14); score-function (Eqs. 16–17), reparameterization (Eq. 18), and Rao–Blackwell (Eq. 19) gradients; Kleinegesse & Gutmann parallel.
  • High-Dimensional Design Applications. CONTAINS: death process (Figs. 1–2), 400-D regression (Table 1), advertising ablation (Fig. 3), 100-D biomolecular docking vs experts (Table 2), CES iterated design (Fig. 4); ACE+VNMC bound-trapping; design-error metric.
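
As a rough illustration of the PCE note above, the following is a minimal Monte Carlo sketch of a prior-contrastive EIG estimate. It is not taken from the paper: the toy model (θ ~ N(0, 1), y = ξθ + noise with standard deviation `sigma`), the function names, and the default sample sizes are all assumptions chosen because this model has a closed-form EIG to compare against. It uses only the Python standard library and does no design optimization.

```python
import math
import random


def log_lik(y, theta, xi, sigma):
    """Log density of y under N(xi * theta, sigma^2) (hypothetical toy model)."""
    z = (y - xi * theta) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def pce_estimate(xi, sigma=1.0, num_contrast=50, num_outer=2000, seed=0):
    """Monte Carlo estimate of a PCE-style lower bound for the toy model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_outer):
        # theta_0 generates the outcome; theta_1..theta_L are prior contrasts.
        theta0 = rng.gauss(0.0, 1.0)
        y = xi * theta0 + rng.gauss(0.0, sigma)
        thetas = [theta0] + [rng.gauss(0.0, 1.0) for _ in range(num_contrast)]
        logs = [log_lik(y, t, xi, sigma) for t in thetas]
        # log of the (L + 1)-sample average likelihood, including theta_0.
        log_denominator = logsumexp(logs) - math.log(num_contrast + 1)
        total += logs[0] - log_denominator
    return total / num_outer


def true_eig(xi, sigma=1.0):
    """Closed-form EIG of the linear-Gaussian toy model with a N(0, 1) prior."""
    return 0.5 * math.log(1.0 + (xi / sigma) ** 2)


if __name__ == "__main__":
    for xi in (0.5, 1.0, 2.0):
        print(xi, pce_estimate(xi), true_eig(xi))
```

Because θ_0 always appears in the denominator, each sampled term is at most log(L + 1), so the estimate can never exceed that cap regardless of the design. When the true EIG is well below the cap, as for the small designs in the usage loop, the estimate should sit close to the closed-form value up to Monte Carlo noise.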

Sources

See Also