The Computational Revolution in EIG Estimation

Summary

The review's §3 organizes the recent breakthroughs in EIG estimation into three threads, all aimed at escaping nested Monte Carlo's biased, $O(T^{-1/3})$ trap, where $T$ is the total number of likelihood evaluations. (1) Debiasing via Multi-Level Monte Carlo (MLMC): Goda et al. (2022) produce a fully unbiased, finite-variance EIG (and gradient) estimator, recovering the $O(T^{-1/2})$ rate. (2) Functional / variational approximation: learn an amortized approximation $q$ to the intractable density ($p(y \mid \xi)$ or $p(\theta \mid y, \xi)$); a normalized one automatically gives a variational bound. (3) Implicit-likelihood estimation: bounds that need only samples of $y \mid \theta, \xi$, enabling simulator-based models.

Overview

Whichever form of the EIG we use, we hit a doubly-intractable nested expectation (Nested Estimation and Nested Monte Carlo). The traditional fixes both have serious drawbacks: nested Laplace approximations are biased, and nested Monte Carlo with $N$ outer and $M$ inner samples is consistent but slow (biased at finite $M$, cost $O(NM)$). The review frames modern progress as two largely complementary families (debiasing vs functional approximation), plus the special case of implicit models.
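To make the nested Monte Carlo baseline concrete, here is a minimal sketch on a toy linear-Gaussian model. The model ($\theta \sim N(0,1)$, $y \mid \theta, \xi \sim N(\xi\theta, \sigma^2)$), the function names, and the sample sizes are illustrative assumptions, not taken from the review; the model is chosen only because its EIG has the closed form $\tfrac12\log(1 + \xi^2/\sigma^2)$ to compare against.

```python
import numpy as np

def log_lik(y, theta, xi, sigma=1.0):
    """Log density of y | theta, xi ~ N(xi * theta, sigma^2) (assumed toy model)."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - xi * theta) / sigma) ** 2

def exact_eig(xi, sigma=1.0):
    """Closed-form EIG for the toy model with prior theta ~ N(0, 1)."""
    return 0.5 * np.log1p(xi**2 / sigma**2)

def nmc_eig(xi, N, M, rng, sigma=1.0):
    """Nested Monte Carlo EIG estimate with N outer and M inner samples."""
    theta0 = rng.standard_normal(N)                      # outer prior draws
    y = xi * theta0 + sigma * rng.standard_normal(N)     # simulated outcomes
    theta_in = rng.standard_normal((N, M))               # fresh inner prior draws
    ll_outer = log_lik(y, theta0, xi, sigma)
    ll_inner = log_lik(y[:, None], theta_in, xi, sigma)  # shape (N, M)
    # log of the inner average (1/M) * sum_m p(y_n | theta_nm, xi), computed stably
    log_marginal = np.logaddexp.reduce(ll_inner, axis=1) - np.log(M)
    return float(np.mean(ll_outer - log_marginal))

rng = np.random.default_rng(0)
for M in (1, 10, 100):
    print(M, nmc_eig(xi=1.0, N=5000, M=M, rng=rng), exact_eig(1.0))
```

Because the logarithm of the inner average is concave in that average, the estimator's expectation sits above the true EIG at any finite `M`; in this sketch the small-`M` estimates should be visibly too large, and the gap closes only as `M` grows, which is what makes the cost multiplicative in `N` and `M`.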

Main Content

Thread 1 — Debiasing schemes (Multi-Level Monte Carlo)

Goda et al. (2022) unbiased MLMC EIG (Rainforth 2023, Eqs. 9–11)

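Express the EIG as the expectation of the $N = 1$, $M \to \infty$ NMC estimator, then write that as a telescoping sum:

$$\mathrm{EIG}(\xi) = \mathbb{E}[\psi_{\xi,\infty}] = \mathbb{E}[\psi_{\xi,0}] + \sum_{\ell=1}^{\infty} \mathbb{E}[\psi_{\xi,\ell} - \psi_{\xi,\ell-1}] = \sum_{\ell=0}^{\infty} \mathbb{E}[\Delta_{\xi,\ell}],$$

$$\Delta_{\xi,0} = \psi_{\xi,0}, \qquad \Delta_{\xi,\ell} = \psi_{\xi,\ell} - \tfrac{1}{2}\left(\psi^{(a)}_{\xi,\ell-1} + \psi^{(b)}_{\xi,\ell-1}\right) \quad (\ell \ge 1),$$

where $\psi_{\xi,\ell}$ is the single-outer-sample NMC estimate using $M_\ell = M_0 2^{\ell}$ inner samples, and the level-$\ell$ inner samples are split into two antithetically coupled halves $(a)$ and $(b)$, each of which yields a level-$(\ell-1)$ estimate. An importance sampler over levels, with $w_\ell \propto 2^{-\tau\ell}$ for $1 < \tau < 2$, produces an unbiased estimate of the infinite sum from a single sampled term:

$$\widehat{\mathrm{EIG}}(\xi) = \frac{1}{N}\sum_{n=1}^{N} \frac{\Delta_{\xi,\ell_n}}{w_{\ell_n}}, \qquad \ell_n \sim w.$$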

The antithetic coupling gives the estimator (and its $\xi$-gradient) finite expected variance and cost, recovering the standard (unnested) Monte Carlo rate $O(T^{-1/2})$. Cost per sample can still be significant ($O(2^{\ell})$ likelihood evaluations for a term sampled at level $\ell$), but the estimator needs no variational family, and so removes any family-misspecification bias.
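The randomized-level construction can be sketched as follows. This is an illustrative implementation under stated assumptions, not Goda et al.'s code: the toy model ($\theta \sim N(0,1)$, $y \mid \theta, \xi \sim N(\xi\theta, \sigma^2)$), the geometric level distribution with ratio $2^{-\tau}$, the base inner sample size `M0 = 2`, and all function names are choices made here. The design value is kept small so that the likelihood ratios in this toy model are light-tailed.

```python
import numpy as np

def log_lik(y, theta, xi, sigma=1.0):
    """Log density of y | theta, xi ~ N(xi * theta, sigma^2) (assumed toy model)."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - xi * theta) / sigma) ** 2

def log_mean_exp(a):
    """Row-wise log of the mean of exp(a), computed stably."""
    return np.logaddexp.reduce(a, axis=1) - np.log(a.shape[1])

def level_difference(level, n, xi, rng, M0=2, sigma=1.0):
    """n independent draws of the antithetic difference at the given level."""
    M = M0 * 2 ** int(level)
    theta0 = rng.standard_normal(n)
    y = xi * theta0 + sigma * rng.standard_normal(n)
    ll_outer = log_lik(y, theta0, xi, sigma)
    ll_inner = log_lik(y[:, None], rng.standard_normal((n, M)), xi, sigma)
    psi_fine = ll_outer - log_mean_exp(ll_inner)          # uses all M inner samples
    if level == 0:
        return psi_fine
    half = M // 2
    psi_a = ll_outer - log_mean_exp(ll_inner[:, :half])   # first half of the same samples
    psi_b = ll_outer - log_mean_exp(ll_inner[:, half:])   # second half
    return psi_fine - 0.5 * (psi_a + psi_b)

def unbiased_mlmc_eig(xi, N, rng, tau=1.5, M0=2, sigma=1.0):
    """Single-term randomized MLMC estimate: average of Delta_level / w_level."""
    r = 2.0 ** (-tau)
    levels = rng.geometric(1.0 - r, size=N) - 1           # P(level = k) = (1 - r) * r**k
    total = 0.0
    for lvl in np.unique(levels):
        n_lvl = int(np.sum(levels == lvl))
        w_lvl = (1.0 - r) * r ** int(lvl)
        total += np.sum(level_difference(lvl, n_lvl, xi, rng, M0, sigma)) / w_lvl
    return float(total / N)

rng = np.random.default_rng(0)
print(unbiased_mlmc_eig(xi=0.3, N=40000, rng=rng), 0.5 * np.log1p(0.3**2))
```

Each draw is expensive only when a high level is sampled, which happens with geometrically small probability, so the expected cost per draw stays finite while no level is ever truncated.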

Thread 2 — Functional and variational approximation

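Rather than re-estimate the nested term from scratch for each $y$, exploit its smoothness and learn a functional approximation $q$ (of $p(y \mid \xi)$ or $p(\theta \mid y, \xi)$), then plug it into the EIG via standard Monte Carlo. The cost of fitting $q$ and the cost of the final Monte Carlo estimate then add rather than multiply, so an $O(T^{-1/2})$ rate is achievable.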

Variational bounds from normalized approximations (Rainforth 2023, Eqs. 12–14)

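If $q(y \mid \xi)$ is a valid normalized density, it produces a variational upper bound:

$$\mathrm{EIG}(\xi) \le \mathbb{E}_{p(\theta)p(y \mid \theta,\xi)}\left[\log p(y \mid \theta,\xi) - \log q(y \mid \xi)\right],$$

with equality iff $q(y \mid \xi) = p(y \mid \xi)$. An amortized inference network $q(\theta \mid y, \xi)$ instead gives a lower bound:

$$\mathrm{EIG}(\xi) \ge \mathbb{E}_{p(\theta)p(y \mid \theta,\xi)}\left[\log q(\theta \mid y,\xi) - \log p(\theta)\right],$$

with equality iff $q(\theta \mid y, \xi)$ is exact (this is exactly the classical Barber–Agakov MI bound). The expectation of the importance-sampled NMC estimator is itself a variational upper bound,

$$\mathrm{EIG}(\xi) \le \mathbb{E}\left[\log p(y \mid \theta_0,\xi) - \log \frac{1}{M}\sum_{m=1}^{M} \frac{p(\theta_m)\,p(y \mid \theta_m,\xi)}{q(\theta_m \mid y,\xi)}\right], \qquad \theta_{1:M} \sim q(\theta \mid y,\xi),$$

tightenable by increasing $M$, and the learned $q$ can also serve as the NMC proposal.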

These are precisely the Foster 2019 estimators (variational marginal $q(y \mid \xi)$ ↔ upper, variational posterior $q(\theta \mid y, \xi)$ ↔ lower) and the ACE / PCE family, since the EIG is a mutual information and any MI bound applies.
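The upper and lower bounds from a variational marginal and a variational posterior can be checked numerically on a toy model. The sketch below assumes a linear-Gaussian model ($\theta \sim N(0,1)$, $y \mid \theta, \xi \sim N(\xi\theta, \sigma^2)$) and Gaussian approximations whose parameters are set by hand rather than trained; the parameter names `marg_var`, `post_gain`, and `post_var` are hypothetical and exist only for this illustration.

```python
import numpy as np

def log_normal(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * (x - mean) ** 2 / var

def variational_bounds(xi, N, rng, marg_var, post_gain, post_var, sigma=1.0):
    """Monte Carlo estimates of the (lower, upper) variational EIG bounds.

    q(y | xi)        = N(0, marg_var)             -> upper bound
    q(theta | y, xi) = N(post_gain * y, post_var) -> lower bound
    """
    theta = rng.standard_normal(N)                    # prior draws
    y = xi * theta + sigma * rng.standard_normal(N)   # simulated outcomes
    upper = np.mean(log_normal(y, xi * theta, sigma**2) - log_normal(y, 0.0, marg_var))
    lower = np.mean(log_normal(theta, post_gain * y, post_var) - log_normal(theta, 0.0, 1.0))
    return float(lower), float(upper)

xi, sigma = 1.0, 1.0
v = xi**2 + sigma**2                                  # exact marginal variance of y
rng = np.random.default_rng(0)

# Exact q: both bounds estimate the true EIG = 0.5 * log(1 + xi^2 / sigma^2).
print(variational_bounds(xi, 100_000, rng, marg_var=v, post_gain=xi / v, post_var=sigma**2 / v))

# Deliberately misspecified q: the two estimates should bracket the true EIG.
print(variational_bounds(xi, 100_000, rng, marg_var=4.0, post_gain=0.3, post_var=0.5))
print(0.5 * np.log1p(xi**2 / sigma**2))
```

With the exact densities the two integrands coincide sample by sample (by Bayes' rule), so the bounds meet; any misspecification opens a gap equal to a KL divergence, which is the bias a training procedure for $q$ tries to shrink.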

Thread 3 — Estimation for implicit models (§3.3.2)

When $y \mid \theta, \xi$ can be sampled but $p(y \mid \theta, \xi)$ cannot be evaluated, an extra intractable term appears. Approaches: approximate that density in isolation; estimate the ratio $p(y \mid \theta, \xi) / p(y \mid \xi)$ by logistic regression (LFIRE-style); or use variational bounds that allow implicit likelihoods ([[Implicit Likelihood Estimator]], likelihood-free ACE). Implicit priors are easier than implicit likelihoods: formulations based on the likelihood form of the EIG (Eqs. 3, 7, 12) avoid the prior density and need only prior samples.
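The ratio-estimation idea can be illustrated with a small classifier-based sketch. It is a simplification rather than LFIRE itself: a single logistic regression is fitted over $(\theta, y)$ pairs instead of one per $\theta$, the toy simulator ($y = \xi\theta + \sigma\varepsilon$), the quadratic feature map, and the Newton solver are all assumptions made for this example, and the likelihood density is never evaluated.

```python
import numpy as np

def features(theta, y):
    """Quadratic features; rich enough for the Gaussian toy simulator (assumed)."""
    return np.stack([np.ones_like(y), y**2, y * theta, theta**2], axis=1)

def fit_logistic(X, t, iters=25, ridge=1e-6):
    """Logistic regression fitted by Newton's method; returns the weight vector."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 0.5 * (1.0 + np.tanh(0.5 * (X @ w)))          # numerically stable sigmoid
        grad = X.T @ (p - t) + ridge * w
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + ridge * np.eye(X.shape[1])
        w = w - np.linalg.solve(hess, grad)
    return w

def ratio_eig(xi, N, rng, sigma=1.0):
    """EIG estimate from a classifier separating joint pairs from shuffled pairs."""
    theta = rng.standard_normal(N)                         # prior draws
    y = xi * theta + sigma * rng.standard_normal(N)        # simulator output only
    y_shuffled = rng.permutation(y)                        # pairs from p(theta) p(y | xi)
    X = np.concatenate([features(theta, y), features(theta, y_shuffled)])
    t = np.concatenate([np.ones(N), np.zeros(N)])          # 1 = joint, 0 = shuffled
    w = fit_logistic(X, t)
    # With balanced classes the fitted logit approximates
    # log p(y | theta, xi) - log p(y | xi); average it over the joint samples.
    return float(np.mean(features(theta, y) @ w))

rng = np.random.default_rng(0)
print(ratio_eig(xi=1.0, N=20000, rng=rng), 0.5 * np.log(2.0))
```

The quality of the estimate depends on the classifier family: here the true log-ratio happens to be quadratic in $(\theta, y)$, so the features can represent it exactly, which would not hold for a general simulator.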

Debiasing vs variational — the trade-off

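| | MLMC debiasing | Variational/functional |
|---|---|---|
| Bias | none (unbiased) | family-misspecification bias (unless $q = p$) |
| Rate | $O(T^{-1/2})$ | $O(T^{-1/2})$ (when family contains target) |
| Per-sample cost | higher ($O(2^{\ell})$ evals at level $\ell$) | lower |
| Needs variational family? | no | yes |
| Gives a usable gradient? | yes ($\nabla_\xi$ directly) | yes (differentiate the bound) |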

Connections

See Also