Synthetic Likelihood Construction

Summary

The core estimator. Given summary statistics $s$ of the data $y$, assume $s \sim N(\mu_\theta, \Sigma_\theta)$. For any $\theta$, simulate $N_r$ replicate data sets, convert each to a statistics vector, and estimate $\hat\mu_\theta$ and $\hat\Sigma_\theta$. The log synthetic likelihood $l_s(\theta)$ is then the MVN log-density of the observed $s$ under those estimates. $l_s$ is much smoother in $\theta$ than the true density $f_\theta(y)$, invariant to reparameterization, robust to uninformative statistics, and behaves like a genuine likelihood as $N_r \to \infty$, so it is explored by Metropolis–Hastings MCMC, with the MLE recovered by quadratic regression and model comparison via AIC/GLRT.

Overview

This note states the synthetic-likelihood algorithm and its statistical properties — the machinery that turns the phase-insensitive statistics into well-founded inference. It is the computational heart of Wood (2010).

Main Content

Multivariate-normal approximation (Wood 2010, Eq. 2)

The chosen summary statistics $s$ are taken to be approximately multivariate normal:

$$s \sim N(\mu_\theta, \Sigma_\theta).$$

The mean $\mu_\theta$ and covariance $\Sigma_\theta$ are generally intractable functions of the model parameters $\theta$, but for any $\theta$ they can be estimated by simulation. Using regression coefficients as statistics promotes the normality that supports this approximation.

Evaluating the synthetic likelihood (Wood 2010, Fig. 2 & Eq.)

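For a given parameter vector $\theta$:

  1. Use the model to simulate $N_r$ replicate data sets $y_1^*, \ldots, y_{N_r}^*$ and convert each to a statistics vector $s_1^*, \ldots, s_{N_r}^*$, exactly as $y$ was converted to $s$.
  2. Estimate the mean: $\hat\mu_\theta = \sum_i s_i^* / N_r$.
  3. Form $S = (s_1^* - \hat\mu_\theta, \ldots, s_{N_r}^* - \hat\mu_\theta)$ and estimate the covariance: $\hat\Sigma_\theta = S S^T / (N_r - 1)$ (a robust covariance estimator can be advantageous here).
  4. Drop irrelevant constants; the log synthetic likelihood is

$$l_s(\theta) = -\tfrac{1}{2}(s - \hat\mu_\theta)^T \hat\Sigma_\theta^{-1}(s - \hat\mu_\theta) - \tfrac{1}{2}\log|\hat\Sigma_\theta|.$$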
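The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function name `log_synthetic_likelihood`, the callables `simulate` and `summarize`, and the toy normal model are all assumptions introduced here, and the plain sample covariance is used rather than a robust estimator.

```python
import numpy as np

def log_synthetic_likelihood(theta, s_obs, simulate, summarize, n_rep, rng):
    """Estimate the log synthetic likelihood at theta (additive constants dropped)."""
    # Step 1: simulate n_rep replicate data sets and reduce each to statistics.
    stats = np.array([summarize(simulate(theta, rng)) for _ in range(n_rep)])
    # Step 2: sample mean of the simulated statistics.
    mu_hat = stats.mean(axis=0)
    # Step 3: sample covariance with the (n_rep - 1) divisor.
    sigma_hat = np.atleast_2d(np.cov(stats, rowvar=False))
    # Step 4: Gaussian log-density of the observed statistics.
    resid = np.asarray(s_obs, dtype=float) - mu_hat
    sign, logdet = np.linalg.slogdet(sigma_hat)
    if sign <= 0:
        return -np.inf  # degenerate covariance estimate
    quad = resid @ np.linalg.solve(sigma_hat, resid)
    return -0.5 * quad - 0.5 * logdet

# Hypothetical toy model: 50 i.i.d. normal draws, theta = (mean, log sd).
def simulate(theta, rng):
    return rng.normal(theta[0], np.exp(theta[1]), size=50)

# Hypothetical statistics: sample mean and log sample variance.
def summarize(y):
    return np.array([y.mean(), np.log(y.var(ddof=1))])

rng = np.random.default_rng(0)
theta_true = np.array([1.0, 0.0])
s_obs = summarize(simulate(theta_true, rng))
ll_true = log_synthetic_likelihood(theta_true, s_obs, simulate, summarize, 200, rng)
ll_far = log_synthetic_likelihood(np.array([3.0, 0.0]), s_obs, simulate, summarize, 200, rng)
```

In this toy setting the value at the generating parameters should exceed the value at a distant parameter vector. Because each call re-simulates, repeated evaluations at the same parameters differ slightly; that Monte Carlo noise is the source of the small-scale roughness discussed later.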

Properties

  • Measures fit, but smoothly. Like any likelihood, $l_s$ measures the consistency of $\theta$ with the data, but it is a much smoother function of $\theta$ than the true density $f_\theta(y)$, which makes it far easier to explore and sample.
  • Generality. Handles hidden state variables, complicated observation processes, missing data, and multiple data series.
  • Invariance & robustness. $l_s$ is invariant to reparameterization and robust to the inclusion of uninformative statistics, so very careful statistic selection is unnecessary; statistics may be freely transformed to improve the normality approximation (Eq. 2).
  • Asymptotic in $N_r$. $l_s$ behaves like a conventional likelihood in the $N_r \to \infty$ limit, giving access to likelihood-based inference machinery.

Exploring $l_s$ by MCMC

Metropolis–Hastings exploration (Wood 2010, Methods summary)

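$l_s$ usually displays residual small-scale roughness, so smooth-function optimizers fail; instead use Metropolis–Hastings MCMC. From a parameter guess $\theta^{[0]}$, iterate for $k = 1, 2, 3, \ldots$:

  1. Propose $\theta^* = \theta^{[k-1]} + \delta^{[k]}$, with $\delta^{[k]}$ from a convenient symmetric distribution.
  2. Set $\theta^{[k]} = \theta^*$ with probability $\min\{1, \exp[l_s(\theta^*) - l_s(\theta^{[k-1]})]\}$; otherwise $\theta^{[k]} = \theta^{[k-1]}$.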

The chain both locates and quantifies the range of parameter values consistent with the data. (A flat prior makes the acceptance ratio depend only on $l_s$; informative priors enter multiplicatively as usual.)
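The Metropolis–Hastings iteration described above might look like the following sketch under a flat prior. The function name `metropolis_hastings`, the Gaussian random-walk proposal, and the smooth stand-in `toy_log_lik` are assumptions made for illustration; in practice `log_lik` would be a simulation-based synthetic-likelihood evaluator. The sketch also stores the log-likelihood of the current state rather than re-estimating it at each iteration, which is a design choice here and not something the text specifies.

```python
import numpy as np

def metropolis_hastings(log_lik, theta0, n_iter, step_sd, rng):
    """Random-walk Metropolis-Hastings with a flat prior (illustrative sketch)."""
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    ll = log_lik(theta)
    chain = np.empty((n_iter + 1, theta.size))
    ll_chain = np.empty(n_iter + 1)
    chain[0], ll_chain[0] = theta, ll
    n_accept = 0
    for k in range(n_iter):
        # Symmetric proposal: Gaussian perturbation of the current state.
        proposal = theta + rng.normal(0.0, step_sd, size=theta.size)
        ll_prop = log_lik(proposal)
        # Accept with probability min(1, exp(ll_prop - ll)).
        if np.log(rng.uniform()) < ll_prop - ll:
            theta, ll = proposal, ll_prop
            n_accept += 1
        chain[k + 1], ll_chain[k + 1] = theta, ll
    return chain, ll_chain, n_accept / n_iter

# Hypothetical stand-in for the log synthetic likelihood: peaked at theta = 2.
def toy_log_lik(theta):
    return -0.5 * float(np.sum((theta - 2.0) ** 2))

rng = np.random.default_rng(42)
chain, ll_chain, acc_rate = metropolis_hastings(toy_log_lik, [0.0], 5000, 1.0, rng)
post_mean = chain[1000:].mean()  # discard the first 1000 draws as burn-in
```

Keeping `ll_chain` alongside `chain` is deliberate: the sampled log-likelihood values are what a later quadratic regression on the chain would use.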

Point estimation, model comparison, and checking

  • MLE via quadratic regression. Near the maximum-likelihood estimate $\hat\theta$, the $N_r \to \infty$ limit of $l_s$ is estimated by quadratic regression of the sampled $l_s$ values on the $\theta^{[k]}$ from the converged chain, recovering $\hat\theta$ and the standard likelihood theory for inference.
  • Model comparison. Alternative models can be compared by AIC or generalized likelihood-ratio testing.
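  • Model-checking diagnostic. If the model fits, $(s - \hat\mu_{\hat\theta})^T \hat\Sigma_{\hat\theta}^{-1} (s - \hat\mu_{\hat\theta})$ is approximately $\chi^2_{\dim(s)}$ distributed.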
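The quadratic-regression step in the list above can be illustrated for a single parameter. This is a sketch under stated assumptions: the helper name `mle_from_chain` is hypothetical, the chain draws and noisy log-likelihood values are synthetic stand-ins, and reading the curvature as an approximate variance relies on the usual large-sample likelihood theory. A multi-parameter version would need a full quadratic with cross terms.

```python
import numpy as np

def mle_from_chain(theta_samples, ls_samples):
    """Fit ls ~ a + b*theta + c*theta^2 and return (theta_hat, var_hat)."""
    # np.polyfit returns coefficients from highest degree to lowest.
    c, b, a = np.polyfit(theta_samples, ls_samples, deg=2)
    if c >= 0:
        raise ValueError("fitted quadratic has no maximum")
    theta_hat = -b / (2.0 * c)   # maximizer of the fitted quadratic
    var_hat = -1.0 / (2.0 * c)   # inverse of the negative second derivative
    return theta_hat, var_hat

# Synthetic stand-in for a converged chain: draws near theta = 2, with
# noisy log-likelihood values whose true curvature implies variance 0.25.
rng = np.random.default_rng(1)
theta_samples = rng.normal(2.0, 0.5, size=500)
noise = rng.normal(0.0, 0.3, size=500)
ls_samples = -0.5 * (theta_samples - 2.0) ** 2 / 0.25 + noise
theta_hat, var_hat = mle_from_chain(theta_samples, ls_samples)
```

The regression averages over the simulation noise in the individual log-likelihood evaluations, which is why it can recover a smooth maximum where a direct optimizer would stall on the roughness.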

Connections

  • Built on the phase-insensitive statistics that make $s$ approximately normal.
  • The simulate-statistics-then-score loop parallels MSM / Indirect Inference / the SME (match simulated vs. observed summaries) but yields an explicit parametric likelihood rather than a quadratic moment criterion.
  • Closely related to ABC: both are likelihood-free and summary-statistic-based, but synthetic likelihood replaces ABC’s acceptance threshold with an MVN density, enabling standard MCMC and likelihood theory.

See Also