Variational NMC Estimator

Summary

The variational nested Monte Carlo (VNMC) estimator combines a learned posterior proposal with importance-sampled NMC. It gives an upper bound on the EIG that is tight when $q_p(\theta \mid y, d)$ is the true posterior or as the number of inner samples $L \to \infty$. This is the key property: VNMC is the only one of the four estimators that remains asymptotically consistent even when the variational family does not contain the target. It trades NMC’s slow consistency against variational speed.

Overview

The posterior and marginal estimators are fast but converge to a biased answer if the variational family cannot represent the target. NMC has the opposite profile: unbiased in the limit but slow. VNMC interpolates: use a learned proposal to make NMC efficient, keeping NMC’s asymptotic consistency. Think of it as NMC (Nested Estimation and Nested Monte Carlo) where the inner importance-sampling proposal is learned rather than fixed to the prior.

Main Content

Definition: VNMC bound and estimator (Foster 2019, Eqs. 10–11)

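The upper bound uses one sample $\theta_0$ from the model and $L$ samples $\theta_{1:L}$ from the proposal $q_p(\theta \mid y, d)$:

$$\mathcal{U}_{\mathrm{VNMC}}(d, L) \triangleq \mathbb{E}\left[\log p(y \mid \theta_0, d) - \log \frac{1}{L} \sum_{\ell=1}^{L} \frac{p(y, \theta_\ell \mid d)}{q_p(\theta_\ell \mid y, d)}\right],$$

with the expectation over $y, \theta_{0:L} \sim p(y, \theta_0 \mid d) \prod_{\ell=1}^{L} q_p(\theta_\ell \mid y, d)$. The final EIG estimator uses $M$ inner samples after training the variational parameters $\phi$ of $q_p$:

$$\hat{\mu}_{\mathrm{VNMC}}(d) \triangleq \frac{1}{N} \sum_{n=1}^{N} \left(\log p(y_n \mid \theta_{n,0}, d) - \log \frac{1}{M} \sum_{m=1}^{M} \frac{p(y_n, \theta_{n,m} \mid d)}{q_p(\theta_{n,m} \mid y_n, d)}\right),$$

where $\theta_{n,0} \sim p(\theta)$, $y_n \sim p(y \mid \theta_{n,0}, d)$ and $\theta_{n,m} \sim q_p(\theta \mid y_n, d)$.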
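The estimator can be sketched on a toy problem where the EIG is known in closed form. The following is a minimal NumPy illustration, not code from Foster 2019: the linear-Gaussian model ($\theta \sim \mathcal{N}(0, 1)$, $y \mid \theta, d \sim \mathcal{N}(d\theta, \sigma^2)$), the Gaussian proposal family with parameters `gain` and `q_std`, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def log_normal(x, mean, std):
    """Log density of N(mean, std^2) at x; arguments broadcast."""
    return -0.5 * np.log(2.0 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2


def vnmc_estimate(d, sigma, gain, q_std, N, M, rng):
    """VNMC estimate of EIG(d) for the toy model
    theta ~ N(0, 1),  y | theta, d ~ N(d * theta, sigma^2),
    with the assumed proposal q(theta | y, d) = N(gain * y, q_std^2)."""
    theta0 = rng.standard_normal(N)                   # theta_{n,0} ~ p(theta)
    y = d * theta0 + sigma * rng.standard_normal(N)   # y_n ~ p(y | theta_{n,0}, d)
    q_mean = gain * y[:, None]
    theta = q_mean + q_std * rng.standard_normal((N, M))  # theta_{n,m} ~ q
    # Log importance weights: log p(y_n, theta_{n,m} | d) - log q(theta_{n,m} | y_n, d)
    log_w = (log_normal(theta, 0.0, 1.0)
             + log_normal(y[:, None], d * theta, sigma)
             - log_normal(theta, q_mean, q_std))
    # Numerically stable log of the inner average (1/M) * sum_m w_{n,m}
    peak = log_w.max(axis=1, keepdims=True)
    log_inner = peak[:, 0] + np.log(np.exp(log_w - peak).mean(axis=1))
    log_lik = log_normal(y, d * theta0, sigma)        # log p(y_n | theta_{n,0}, d)
    return float(np.mean(log_lik - log_inner))


def exact_eig(d, sigma):
    """Closed-form EIG of the toy linear-Gaussian model."""
    return 0.5 * np.log1p(d ** 2 / sigma ** 2)


def exact_posterior_proposal(d, sigma):
    """(gain, q_std) for which q equals the true posterior p(theta | y, d)."""
    return d / (d ** 2 + sigma ** 2), sigma / np.sqrt(d ** 2 + sigma ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, sigma = 1.0, 1.0
    gain, q_std = exact_posterior_proposal(d, sigma)
    print("exact EIG          :", exact_eig(d, sigma))
    print("VNMC, q = posterior:", vnmc_estimate(d, sigma, gain, q_std, 5000, 1, rng))
    print("VNMC, q = prior    :", vnmc_estimate(d, sigma, 0.0, 1.0, 5000, 200, rng))
```

Setting `gain=0.0, q_std=1.0` makes the proposal equal to the prior, so the importance weights reduce to the likelihood and the function computes a plain NMC estimate. With the exact posterior as proposal, every importance weight equals $p(y_n \mid d)$, so a single inner sample already suffices in this toy setting.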

Lemma 1 — Properties of the VNMC bound (Foster 2019)

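For any model $p(\theta)\,p(y \mid \theta, d)$ and valid $q_p(\theta \mid y, d)$:

  1. Monotone tightening: $\mathrm{EIG}(d) = \lim_{L \to \infty} \mathcal{U}_{\mathrm{VNMC}}(d, L)$ and $\mathcal{U}_{\mathrm{VNMC}}(d, L_2) \le \mathcal{U}_{\mathrm{VNMC}}(d, L_1)$ for $L_2 \ge L_1 \ge 1$.
  2. Exactness: $\mathcal{U}_{\mathrm{VNMC}}(d, L) = \mathrm{EIG}(d)$ iff $q_p(\theta \mid y, d) = p(\theta \mid y, d)$ for all $y, \theta$.
  3. Gap as expected KL: $\mathcal{U}_{\mathrm{VNMC}}(d, L) - \mathrm{EIG}(d) = \mathbb{E}_{p(y \mid d)}\left[\mathrm{KL}\left(\prod_{\ell=1}^{L} q_p(\theta_\ell \mid y, d) \,\Big\|\, \frac{1}{L} \sum_{\ell=1}^{L} p(\theta_\ell \mid y, d) \prod_{k \ne \ell} q_p(\theta_k \mid y, d)\right)\right]$.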
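As a sanity check on Property 3, the $L = 1$ case can be worked out directly from the definitions. The derivation below is a sketch written for this note, assuming the standard form $\mathrm{EIG}(d) = \mathbb{E}_{p(y, \theta_0 \mid d)}[\log p(y \mid \theta_0, d) - \log p(y \mid d)]$ and writing $q_p(\theta \mid y, d)$ for the proposal. The $\log p(y \mid \theta_0, d)$ terms cancel, and the factorisation $p(y, \theta_1 \mid d) = p(y \mid d)\, p(\theta_1 \mid y, d)$ gives the second line:

$$
\begin{aligned}
\mathcal{U}_{\mathrm{VNMC}}(d, 1) - \mathrm{EIG}(d)
&= \mathbb{E}_{p(y \mid d)\, q_p(\theta_1 \mid y, d)}\left[\log p(y \mid d) - \log \frac{p(y, \theta_1 \mid d)}{q_p(\theta_1 \mid y, d)}\right] \\
&= \mathbb{E}_{p(y \mid d)\, q_p(\theta_1 \mid y, d)}\left[\log \frac{q_p(\theta_1 \mid y, d)}{p(\theta_1 \mid y, d)}\right] \\
&= \mathbb{E}_{p(y \mid d)}\left[\mathrm{KL}\left(q_p(\theta \mid y, d) \,\|\, p(\theta \mid y, d)\right)\right] \ge 0.
\end{aligned}
$$

So with a single inner sample the gap is the expected KL divergence from the proposal to the true posterior, which is zero exactly when the two agree and matches the product form of Property 3 at $L = 1$.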

The defining advantage: consistency without a perfect family

Property 1 means we can obtain asymptotically unbiased EIG estimates even for an imperfect $q_p$ simply by increasing the number of inner samples $L$. Training proceeds in two stages: first run $K$ steps of stochastic gradient on $\mathcal{U}_{\mathrm{VNMC}}(d, L)$ with fixed $L$ (fast, cost $\mathcal{O}(KL)$); then form the final NMC estimator with $M$ inner and $N$ outer samples (slow refinement, cost $\mathcal{O}(MN)$), removing residual bias. Standard NMC is the special case where the proposal is naively the prior ($q_p(\theta \mid y, d) = p(\theta)$); it skips the cheap first stage and so needs a far larger budget for the same accuracy.

Cost and rate

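Total cost is $T = \mathcal{O}(KL + MN)$. With the NMC allocation $M \propto \sqrt{N}$, $\hat{\mu}_{\mathrm{VNMC}}$ converges at $\mathcal{O}(T^{-1/3})$ in its second stage, and unlike the purely variational estimators such as $\hat{\mu}_{\mathrm{post}}$, it keeps improving past the variational plateau because it removes asymptotic bias (Foster 2019, Fig. 2).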
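The budget arithmetic can be made concrete with a small helper. This is a hedged sketch: the function names are hypothetical, cost is counted in units of one density evaluation with all constants dropped, and the allocation rule simply applies $M \approx \sqrt{N}$ to a second-stage budget of $MN$ evaluations, which gives $N \approx (MN)^{2/3}$. Under that rule both the variance term ($\propto 1/N$) and the squared-bias term ($\propto 1/M^2$) scale as $T^{-2/3}$, so the error scales as $T^{-1/3}$.

```python
import math


def vnmc_cost(K, L, N, M):
    """Total cost T = K*L + M*N, in density evaluations (constants dropped)."""
    return K * L + M * N


def nmc_allocation(stage_two_budget):
    """Split a second-stage budget of about M*N evaluations using M ~ sqrt(N).

    M = sqrt(N) and M*N = budget imply N = budget**(2/3).
    """
    N = max(1, round(stage_two_budget ** (2.0 / 3.0)))
    M = max(1, round(math.sqrt(N)))
    return N, M


if __name__ == "__main__":
    N, M = nmc_allocation(1_000_000)
    print("N =", N, "M =", M)
    print("total cost with K=1000, L=10:", vnmc_cost(1000, 10, N, M))
```

In this illustration the first stage (K = 1000 gradient steps with L = 10) accounts for only about 1% of the total cost, which is why spending some budget on pre-training the proposal is cheap relative to the refinement stage.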

Examples

VNMC pre-training (Foster 2019 §6.2, Fig. 2)

In the plot of EIG estimates for the A/B-test design point, “0 steps” of pre-training corresponds to plain NMC. Spending some budget on training (125–2500 steps) gives noticeably better estimates, and increasing the number of samples in the final estimator continues to improve them: VNMC does not plateau like the pure variational estimators.

Connections

  • Bridges NMC (consistent, slow) and the variational estimators (fast, biased). Foster 2019 notes the variational/MC interplay is not analogous to standard inference because the NMC EIG estimator is itself inherently biased.
  • In Foster 2020, the VNMC upper bound is paired with the ACE lower bound to trap the true EIG when verifying high-dimensional designs.
  • Property 3 (gap = expected KL of a product proposal) parallels the importance-weighted autoencoder (IWAE) bound structure.

See Also