History Matching for ABMs

Summary

History Matching (HM) is an iterative, wave-based procedure that eliminates implausible parameter regions for an agent-based model (ABM). In each wave, parameter samples are scored by an implausibility metric that combines model error with all quantified uncertainties. Implausible regions are discarded; the retained non-implausible space is sampled more densely in the next wave. HM stops when the non-implausible space stops shrinking.

Overview

HM originated in climate and physical modeling (Craig et al. 1997) and has been adapted for ABMs. Unlike Bayesian calibration, HM makes no probabilistic statements about parameters: it only labels a region as implausible (“could not plausibly produce the observed data”) or non-implausible (“could”). The resulting non-implausible region is then used as an informed prior for Approximate Bayesian Computation (ABC).

Implausibility Score

Definition: Implausibility Score

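For a parameter set $x$ and observation $z$, the implausibility is:

$$I(x) = \sqrt{\frac{\left(f(x) - z\right)^2}{V_s + V_o + V_m}}$$

where:

  • $\left(f(x) - z\right)^2$ = squared error between simulation output $f(x)$ and expected output $z$
  • $V_s$ = ensemble variance (stochastic variability across runs with the same parameters)
  • $V_o$ = observation uncertainty
  • $V_m$ = model discrepancy variance

A parameter set is implausible if $I(x) > c$ for a chosen threshold $c$. By Pukelsheim’s three-sigma rule, $c = 3$ ensures the correct parameter set has $I(x) \le 3$ with probability of at least 0.95.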
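The score can be sketched in a few lines of Python. This is a minimal illustration for a single scalar output, assuming the three variances have already been estimated; the function name `implausibility`, its argument names, and the numbers in the usage lines are hypothetical and not taken from the source. The cut-off of 3 is the usual three-sigma choice associated with Pukelsheim's rule.

```python
import math

THRESHOLD = 3.0  # three-sigma cut-off (assumed default)


def implausibility(sim_output, observation, v_ensemble, v_observation, v_model):
    """Return sqrt(squared error / total variance) for one scalar output."""
    total_variance = v_ensemble + v_observation + v_model
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    squared_error = (sim_output - observation) ** 2
    return math.sqrt(squared_error / total_variance)


# Made-up numbers: error of 2 with a total variance of 1 gives a score of 2.
score = implausibility(sim_output=12.0, observation=10.0,
                       v_ensemble=0.5, v_observation=0.25, v_model=0.25)
is_implausible = score > THRESHOLD
print(score, is_implausible)  # 2.0 False
```

Because the variances sit in the denominator, larger acknowledged uncertainty lowers the score and makes it harder to rule a parameter set out.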

Wave Structure

Each HM wave:

  1. Sample parameter sets from the current non-implausible space using Latin Hypercube Sampling (LHS)
  2. Run the model $n$ times for each sample (ensemble) to estimate the ensemble variance $V_s$
  3. Calculate implausibility for each sample
  4. Discard implausible samples ($I(x) > 3$); retain non-implausible samples
  5. The retained non-implausible region becomes the sampling space for the next wave

Stopping criteria: HM stops when all sampled parameter sets are implausible, or when the non-implausible region no longer shrinks between waves.
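The wave loop and its stopping criteria can be sketched as below. This is an illustrative toy under several assumptions that are not from the source: `toy_model` is a hypothetical stochastic model, the non-implausible region is approximated by the bounding box of the retained samples, the score uses the ensemble mean output, "no longer shrinks" is taken to mean an area reduction of less than `min_shrink`, and all numeric settings are arbitrary. The Latin Hypercube sampler is a basic hand-written version rather than a library routine.

```python
import math
import random
import statistics


def latin_hypercube(bounds, n_samples, rng):
    """Basic LHS: one point in each of n_samples equal-width strata per axis."""
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        column = [low + (k + rng.random()) * width for k in range(n_samples)]
        rng.shuffle(column)  # decouple the strata across dimensions
        columns.append(column)
    return list(zip(*columns))


def toy_model(x, rng):
    """Hypothetical stochastic model; its output is smallest near x = (4, 6)."""
    return (x[0] - 4.0) ** 2 + (x[1] - 6.0) ** 2 + rng.gauss(0.0, 0.5)


def box_area(bounds):
    return math.prod(high - low for low, high in bounds)


def run_wave(bounds, observation, rng, n_samples=200, n_runs=10,
             v_obs=0.1, v_model=0.1, threshold=3.0):
    """Steps 1-4: sample, run ensembles, score, keep non-implausible samples."""
    retained = []
    for x in latin_hypercube(bounds, n_samples, rng):
        outputs = [toy_model(x, rng) for _ in range(n_runs)]
        v_ens = statistics.variance(outputs)  # ensemble variance estimate
        error = statistics.fmean(outputs) - observation
        score = abs(error) / math.sqrt(v_ens + v_obs + v_model)
        if score <= threshold:
            retained.append(x)
    return retained


def history_match(bounds, observation, rng, max_waves=10, min_shrink=0.05):
    for _ in range(max_waves):
        retained = run_wave(bounds, observation, rng)
        if not retained:
            break  # every sample was implausible
        # Step 5 (simplified): bounding box of the retained samples
        new_bounds = [(min(c), max(c)) for c in zip(*retained)]
        shrunk = box_area(new_bounds) < (1.0 - min_shrink) * box_area(bounds)
        bounds = new_bounds
        if not shrunk:
            break  # the region has stopped shrinking
    return bounds


final_bounds = history_match([(0.0, 10.0), (0.0, 10.0)], observation=0.0,
                             rng=random.Random(42))
print(final_bounds)
```

A bounding box is the crudest possible description of the retained region; it over-covers curved or disconnected regions, so a real application would likely need a richer representation.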

Model Discrepancy

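Definition: Model Discrepancy Variance ($V_m$)

$$V_m = \frac{1}{N - 1} \sum_{i=1}^{N} \left(e_i - \bar{e}\right)^2$$

where $e_i$ is the model error for the $i$-th of the $N$ parameter sets tested and $\bar{e}$ is the average model error across all parameter sets tested. This estimates how much variation in model output arises from imperfect model specification, that is, the gap between the best model and reality.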

Key implication: Model discrepancy cannot be reduced by better calibration — it reflects fundamental model imperfection and must be explicitly acknowledged.
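As a small illustration, the model discrepancy variance can be computed as the spread of model errors around their average. The sketch below assumes one scalar error per tested parameter set and the usual sample-variance normalisation (dividing by the number of errors minus one), which is an assumption here; the function name and the error values are made up.

```python
import statistics


def model_discrepancy_variance(model_errors):
    """Sample variance of model errors, one error per tested parameter set."""
    if len(model_errors) < 2:
        raise ValueError("need errors from at least two parameter sets")
    return statistics.variance(model_errors)  # divides by len - 1


errors = [1.0, 2.0, 3.0, 4.0]  # hypothetical errors from four parameter sets
v_m = model_discrepancy_variance(errors)
print(v_m)
```

The value is computed once from exploratory runs and then held fixed, which matches the point that calibration itself cannot shrink it.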

Ensemble Variance

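Definition: Ensemble Variance ($V_s$)

$$V_s = \frac{1}{n - 1} \sum_{j=1}^{n} \left(f_j(x) - \bar{f}(x)\right)^2$$

where $n$ is the ensemble size, $f_j(x)$ is the output of run $j$, and $\bar{f}(x) = \frac{1}{n} \sum_{j=1}^{n} f_j(x)$.

Choose $n$ by running models across a range of ensemble sizes and selecting the smallest $n$ at which variance stabilises. In the SugarScape example, ; in the birds model, .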
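One way to automate the choice of ensemble size is sketched below. The stabilisation rule used here (the variance at one size differs from the variance at the next larger size by at most `rel_tol`) is an assumption, since the source does not define "stabilises" precisely; `noisy_model`, the candidate sizes, and the tolerance are likewise hypothetical.

```python
import random
import statistics


def ensemble_variance(model, params, n_runs, rng):
    """Sample variance of the model output over n_runs repeated runs."""
    outputs = [model(params, rng) for _ in range(n_runs)]
    return statistics.variance(outputs)


def smallest_stable_size(model, params, sizes, rng, rel_tol=0.1):
    """Return the first size whose variance is within rel_tol of the next size's."""
    variances = [ensemble_variance(model, params, n, rng) for n in sizes]
    for n, v_now, v_next in zip(sizes, variances, variances[1:]):
        if abs(v_next - v_now) <= rel_tol * max(v_next, 1e-12):
            return n
    return sizes[-1]  # never stabilised within the tested range


def noisy_model(params, rng):
    """Hypothetical stochastic model: a fixed signal plus unit Gaussian noise."""
    return params[0] + rng.gauss(0.0, 1.0)


candidate_sizes = [5, 10, 20, 40, 80]
chosen = smallest_stable_size(noisy_model, (1.0,), candidate_sizes,
                              random.Random(0))
print(chosen)
```

In practice the variance curve is usually also inspected by eye, because a single pairwise comparison can pass by chance at a small ensemble size.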

Multiple Outputs

When the model produces multiple observed outputs ($z_1, \dots, z_k$; e.g., small/medium/large farm counts in RISC), a separate implausibility measure $I_j(x)$ is computed for each output and the maximum is used:

$$I(x) = \max_{j = 1, \dots, k} I_j(x)$$
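Taking the maximum over outputs can be sketched as follows. The function name, the assumption that each output already has a single total variance (ensemble plus observation plus model discrepancy), and the three made-up outputs are illustrative only.

```python
import math


def max_implausibility(sim_outputs, observations, total_variances):
    """Score each output separately and return the largest score."""
    if not (len(sim_outputs) == len(observations) == len(total_variances)):
        raise ValueError("all inputs must have one entry per output")
    scores = [abs(f - z) / math.sqrt(v)
              for f, z, v in zip(sim_outputs, observations, total_variances)]
    return max(scores)


# Three hypothetical outputs with per-output scores 1.0, 3.0 and 0.0.
worst = max_implausibility(sim_outputs=[12.0, 30.0, 5.0],
                           observations=[10.0, 33.0, 5.0],
                           total_variances=[4.0, 1.0, 1.0])
print(worst)  # 3.0
```

Using the maximum means a parameter set is retained only if it is non-implausible for every output, so one badly matched output is enough to discard it.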

Key Differences from Other Methods

| Aspect | HM | ABC | GA / Simulated Annealing |
|---|---|---|---|
| Output | Non-implausible region | Posterior distribution | Point estimate |
| Probabilistic statements | No | Yes | No |
| Handles uncertainty explicitly | Yes | Implicitly via tolerance $\epsilon$ | No |
| Computational cost (runs) | 80–320 (birds) | 11,000+ (birds, no HM) | 256–290 (birds) |

SugarScape Example

In the SugarScape toy model (2 parameters: metabolism and vision):

  • Wave 1: Full grid tested; substantial implausible region identified (dark grey in figure)
  • Wave 10: Non-implausible region narrowed to upper-right corner (high metabolism, high vision)
  • HM correctly identifies that the true parameters {metabolism=4, vision=6} lie in this region
