Variational NMC Estimator
Summary
The variational nested Monte Carlo (VNMC) estimator combines a learned posterior proposal $q_v(\theta \mid y, d)$ with importance-sampled NMC. It gives an upper bound on the EIG $I(d)$ that is tight when $q_v(\theta \mid y, d)$ is the true posterior $p(\theta \mid y, d)$ or as the number of inner samples $L \to \infty$. This is the key property: VNMC is the only one of the four estimators that remains asymptotically consistent even when the variational family does not contain the target. It pairs the speed of a variational proposal with NMC’s slow but reliable consistency.
Overview
The posterior and marginal estimators are fast but converge to a biased answer if the variational family cannot represent the target. NMC has the opposite profile: asymptotically unbiased but slow. VNMC interpolates between the two: it uses a learned proposal to make NMC efficient while keeping NMC’s asymptotic consistency. Think of it as NMC (see Nested Estimation and Nested Monte Carlo) where the inner importance-sampling proposal is learned rather than fixed to the prior.
Main Content
Definition: VNMC bound and estimator (Foster 2019, Eqs. 10–11)
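The upper bound uses one sample $(y, \theta_0)$ from the model and $L$ samples $\theta_{1:L}$ from the proposal:
$$\mathcal{U}_{\text{VNMC}}(d, L) = \mathbb{E}\left[\log p(y \mid \theta_0, d) - \log \frac{1}{L}\sum_{\ell=1}^{L} \frac{p(y, \theta_\ell \mid d)}{q_v(\theta_\ell \mid y, d)}\right],$$
with the expectation over $y, \theta_{0:L} \sim p(y, \theta_0 \mid d)\prod_{\ell=1}^{L} q_v(\theta_\ell \mid y, d)$. The final EIG estimator uses $M$ inner samples after training $q_v$:
$$\hat{\mu}_{\text{VNMC}}(d) = \frac{1}{N}\sum_{n=1}^{N}\left[\log p(y_n \mid \theta_{n,0}, d) - \log \frac{1}{M}\sum_{m=1}^{M} \frac{p(y_n, \theta_{n,m} \mid d)}{q_v(\theta_{n,m} \mid y_n, d)}\right],$$
where $(y_n, \theta_{n,0}) \sim p(y, \theta \mid d)$ and $\theta_{n,m} \sim q_v(\theta \mid y_n, d)$.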
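To make the estimator concrete, here is a minimal sketch on a toy model that is not from the source: a one-dimensional linear-Gaussian model with prior $\theta \sim N(0, 1)$ and likelihood $y \mid \theta, d \sim N(d\theta, \sigma^2)$, chosen because its EIG and posterior are available in closed form. The proposal family $N(a y, s^2)$ and all function names are illustrative assumptions.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of N(mean, std^2) at x."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def log_inner_average(y, thetas, d, sigma, a, s):
    """log (1/M) sum_m p(y, theta_m | d) / q_v(theta_m | y, d) for the toy model."""
    log_weights = [
        log_normal_pdf(t, 0.0, 1.0)        # prior p(theta), assumed N(0, 1)
        + log_normal_pdf(y, d * t, sigma)  # likelihood p(y | theta, d)
        - log_normal_pdf(t, a * y, s)      # proposal q_v(theta | y, d), assumed N(a*y, s^2)
        for t in thetas
    ]
    return log_mean_exp(log_weights)


def vnmc_estimate(d, sigma, a, s, N, M, rng):
    """VNMC estimate with N outer samples and M inner proposal samples."""
    total = 0.0
    for _ in range(N):
        theta0 = rng.gauss(0.0, 1.0)         # theta_{n,0} from the prior
        y = rng.gauss(d * theta0, sigma)     # y_n from the likelihood
        thetas = [rng.gauss(a * y, s) for _ in range(M)]  # theta_{n,m} from q_v
        total += log_normal_pdf(y, d * theta0, sigma) - log_inner_average(y, thetas, d, sigma, a, s)
    return total / N


d, sigma = 1.0, 1.0
true_eig = 0.5 * math.log(1.0 + d**2 / sigma**2)      # closed form for this toy model
post_a = d / (d**2 + sigma**2)                        # exact posterior mean is post_a * y
post_s = math.sqrt(sigma**2 / (d**2 + sigma**2))      # exact posterior std

rng = random.Random(0)
# Exact-posterior proposal: every importance weight equals p(y | d), so M = 1 suffices.
est_exact = vnmc_estimate(d, sigma, post_a, post_s, N=2000, M=1, rng=rng)
# a = 0, s = 1 makes the proposal equal to the N(0, 1) prior, i.e. plain NMC.
est_prior = vnmc_estimate(d, sigma, 0.0, 1.0, N=2000, M=50, rng=rng)
print(true_eig, est_exact, est_prior)
```

With the exact posterior as proposal the inner average no longer depends on the sampled $\theta$ values, which is the exactness case of the bound; with the prior as proposal the same code reduces to standard NMC.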
Lemma 1 — Properties of the VNMC bound (Foster 2019)
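For any model $p(\theta)\,p(y \mid \theta, d)$ and valid $q_v(\theta \mid y, d)$:
- Monotone tightening: $I(d) = \lim_{L \to \infty} \mathcal{U}_{\text{VNMC}}(d, L)$ and $\mathcal{U}_{\text{VNMC}}(d, L_2) \le \mathcal{U}_{\text{VNMC}}(d, L_1)$ for $L_2 \ge L_1 \ge 1$.
- Exactness: $\mathcal{U}_{\text{VNMC}}(d, L) = I(d)$ for all $L \ge 1$ iff $q_v(\theta \mid y, d) = p(\theta \mid y, d)$ for all $y, \theta$.
- Gap as expected KL: $\mathcal{U}_{\text{VNMC}}(d, L) - I(d) = \mathbb{E}_{p(y \mid d)}\left[\mathrm{KL}\left(\prod_{\ell=1}^{L} q_v(\theta_\ell \mid y, d) \,\Big\|\, \frac{1}{L}\sum_{\ell=1}^{L} p(\theta_\ell \mid y, d)\prod_{k \ne \ell} q_v(\theta_k \mid y, d)\right)\right]$.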
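The gap property is easiest to see in the single-sample case. The following worked derivation is a sketch for $L = 1$ only, using the factorization $p(y, \theta_1 \mid d) = p(y \mid d)\, p(\theta_1 \mid y, d)$; the notation $q_v$ for the proposal and $\mathcal{U}_{\text{VNMC}}$ for the bound is assumed here.

```latex
\begin{align*}
\mathcal{U}_{\text{VNMC}}(d, 1)
  &= \mathbb{E}_{p(y, \theta_0 \mid d)\, q_v(\theta_1 \mid y, d)}
     \left[ \log p(y \mid \theta_0, d)
          - \log \frac{p(y, \theta_1 \mid d)}{q_v(\theta_1 \mid y, d)} \right] \\
  &= \mathbb{E}_{p(y, \theta_0 \mid d)}
     \left[ \log \frac{p(y \mid \theta_0, d)}{p(y \mid d)} \right]
   + \mathbb{E}_{p(y \mid d)}\, \mathbb{E}_{q_v(\theta_1 \mid y, d)}
     \left[ \log \frac{q_v(\theta_1 \mid y, d)}{p(\theta_1 \mid y, d)} \right] \\
  &= I(d) + \mathbb{E}_{p(y \mid d)}
     \left[ \mathrm{KL}\left( q_v(\theta \mid y, d) \,\|\, p(\theta \mid y, d) \right) \right].
\end{align*}
```

Since the KL term is non-negative and vanishes only when the proposal matches the posterior, this one case already shows both the upper-bound and the exactness properties.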
The defining advantage: consistency without a perfect family
Property 1 means we can obtain asymptotically unbiased EIG estimates even for an imperfect $q_v$ simply by increasing $L$. Training proceeds in two stages: first run $K$ steps of stochastic gradient descent on $\mathcal{U}_{\text{VNMC}}(d, L)$ with fixed $L$ (fast, cost $O(KL)$); then form the final NMC estimator $\hat{\mu}_{\text{VNMC}}$ with $M \gg L$ inner samples (slow refinement, cost $O(NM)$), removing residual bias. Standard NMC is the special case where the proposal is naively the prior ($q_v(\theta \mid y, d) = p(\theta)$): it skips the cheap first stage and so needs a far larger budget for the same accuracy.
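The two-stage procedure can be sketched on a toy problem. Everything specific here is an assumption for illustration rather than the paper's setup: a linear-Gaussian model ($\theta \sim N(0,1)$, $y \mid \theta, d \sim N(d\theta, \sigma^2)$), a proposal $N(a y, s^2)$ with only the slope `a` learned and a deliberately mismatched fixed width `s`, a hand-derived reparameterized gradient, and a single inner sample during training.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of N(mean, std^2) at x."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def train_proposal_slope(d, sigma, s, steps, lr, rng):
    """Stage 1: SGD on the single-inner-sample bound for the slope a of N(a*y, s^2)."""
    a = 0.0
    for _ in range(steps):
        theta0 = rng.gauss(0.0, 1.0)
        y = rng.gauss(d * theta0, sigma)
        theta1 = a * y + s * rng.gauss(0.0, 1.0)  # reparameterized proposal draw
        # Gradient w.r.t. a of -log p(y, theta1 | d); the log q_v term is
        # constant in a once theta1 is written as a*y + s*eps.
        grad = (theta1 - d * (y - d * theta1) / sigma**2) * y
        a -= lr * grad
    return a


def vnmc_estimate(d, sigma, a, s, N, M, rng):
    """Stage 2: nested estimator with M inner samples from the trained proposal."""
    total = 0.0
    for _ in range(N):
        theta0 = rng.gauss(0.0, 1.0)
        y = rng.gauss(d * theta0, sigma)
        log_w = []
        for _ in range(M):
            t = rng.gauss(a * y, s)
            log_w.append(
                log_normal_pdf(t, 0.0, 1.0)
                + log_normal_pdf(y, d * t, sigma)
                - log_normal_pdf(t, a * y, s)
            )
        top = max(log_w)
        log_avg = top + math.log(sum(math.exp(v - top) for v in log_w) / M)
        total += log_normal_pdf(y, d * theta0, sigma) - log_avg
    return total / N


d, sigma, s = 1.0, 1.0, 0.9   # s differs from the exact posterior std (about 0.707)
rng = random.Random(1)
a_hat = train_proposal_slope(d, sigma, s, steps=5000, lr=0.002, rng=rng)  # cheap stage
eig_hat = vnmc_estimate(d, sigma, a_hat, s, N=2000, M=30, rng=rng)        # refinement stage
true_eig = 0.5 * math.log(1.0 + d**2 / sigma**2)
print(a_hat, eig_hat, true_eig)
```

Because the width `s` is fixed at the wrong value, the trained proposal can never equal the posterior, so any remaining bias in this sketch has to be removed by the inner samples in the second stage.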
Cost and rate
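Total cost is $T = O(KL + NM)$. With the NMC allocation ($M \propto \sqrt{N}$), $\hat{\mu}_{\text{VNMC}}$ converges at $O(T^{-1/3})$ in its second stage. Unlike the purely variational estimators, it keeps improving past the variational plateau because it removes asymptotic bias (Foster 2019, Fig. 2).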
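As a small arithmetic illustration of the budget split, the sketch below allocates a second-stage budget using the standard NMC rule of inner samples proportional to the square root of outer samples. The proportionality constant of 1, the rounding, and the function names are assumptions made for the example.

```python
import math


def nmc_allocation(budget):
    """Split a second-stage budget T ~ N * M with M = sqrt(N), so N ~ T^(2/3)."""
    n_outer = max(1, round(budget ** (2.0 / 3.0)))
    n_inner = max(1, round(math.sqrt(n_outer)))
    return n_outer, n_inner


def total_cost(K, L, N, M):
    """Likelihood-evaluation count: K training steps with L samples, then N * M."""
    return K * L + N * M


n_outer, n_inner = nmc_allocation(1_000_000)
cost = total_cost(K=1000, L=10, N=n_outer, M=n_inner)
print(n_outer, n_inner, cost)
```

In this example the training stage accounts for about 1% of the total cost, which matches the qualitative picture of a cheap first stage followed by an expensive refinement.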
Examples
VNMC pre-training (Foster 2019 §6.2, Fig. 2)
On the A/B-test design point, EIG estimates are plotted for different amounts of pre-training, where “0 steps” corresponds to plain NMC. Spending some budget on training the proposal (125–2500 steps) gives noticeably better estimates, and increasing the number of inner samples continues to improve them: VNMC does not plateau like the pure variational estimators.
Connections
- Bridges NMC (consistent, slow) and the variational estimators (fast, biased). Foster 2019 notes the variational/MC interplay is not analogous to standard inference because the NMC EIG estimator is itself inherently biased.
- In Foster 2020, the VNMC upper bound is paired with the ACE lower bound to trap the true EIG when verifying high-dimensional designs.
- Property 3 (gap = expected KL of a product proposal) parallels the importance-weighted autoencoder (IWAE) bound structure.
See Also
- Adaptive Contrastive Estimation (ACE) — the lower-bound contrastive counterpart, also tight as $L \to \infty$
- Variational Marginal Estimator — the other upper bound (biased if family wrong)
- Convergence Rates and Estimator Selection — full rate analysis (Theorem 1) and selection guidance