Variance-Based Sensitivity and Sobol Indices
Summary
Sobol’s method decomposes the output variance into contributions from individual factors and their interactions (the ANOVA / HDMR / Sobol-Hoeffding decomposition), assuming inputs are independent and uncorrelated. The first-order index $S_i$ measures the variance attributable to $X_i$ alone; the total-effect index $S_{Ti}$ sums all terms involving $X_i$, capturing its main effect plus every interaction. The gap $S_{Ti} - S_i$ quantifies how much of a factor’s influence operates through interactions, and $\sum_i S_i = 1$ exactly for a purely additive model.
Overview
Variance-based methods rest on the assumption (Saltelli et al.) “that variance is sufficient to describe the output uncertainty.” The rationale: examine the variance of the conditional expected output given a factor. Sobol’s method “relies on decomposition of the model output variance under the assumption that inputs are independent and uncorrelated.” See Global Sensitivity Analysis - Overview.
Main Content
ANOVA / Sobol variance decomposition ^anova-decomposition
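For $Y = f(X_1, \dots, X_k)$ with independent inputs, the total variance decomposes into orthogonal terms of increasing order:
$$V(Y) = \sum_i V_i + \sum_{i<j} V_{ij} + \dots + V_{12\dots k}$$
where the partial (first-order) variance of factor $X_i$ is
$$V_i = V_{X_i}\big(E_{X_{\sim i}}(Y \mid X_i)\big)$$
and the second-order interaction variance of $X_i, X_j$ is
$$V_{ij} = V_{X_i X_j}\big(E_{X_{\sim ij}}(Y \mid X_i, X_j)\big) - V_i - V_j$$
$V_i$ is the expected reduction in output variance if $X_i$ were fixed; $V_{ij}$ is the additional joint effect beyond the two main effects.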
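The decomposition can be checked exactly on a small discrete grid, where conditional expectations reduce to averages over array axes. The sketch below is illustrative only: the toy model `f`, the three-level grid, and the helper names are assumptions introduced here, not part of the source.

```python
import itertools
import numpy as np

# Hypothetical toy model, chosen only for illustration.
def f(x1, x2, x3):
    return x1 + 2.0 * x2 + x1 * x3

# Assumption: each input is independent and takes -1, 0, 1 with equal probability.
levels = np.array([-1.0, 0.0, 1.0])
X1, X2, X3 = np.meshgrid(levels, levels, levels, indexing="ij")
Y = f(X1, X2, X3)          # shape (3, 3, 3): one entry per input combination
V = Y.var()                # total variance V(Y) (population variance)

def main_effect_variance(Y, i):
    """V_i = Var over X_i of E[Y | X_i]: average out every other axis first."""
    others = tuple(a for a in range(Y.ndim) if a != i)
    return Y.mean(axis=others).var()

def pair_variance(Y, i, j):
    """V_ij = Var(E[Y | X_i, X_j]) - V_i - V_j."""
    others = tuple(a for a in range(Y.ndim) if a not in (i, j))
    joint = Y.mean(axis=others).var()
    return joint - main_effect_variance(Y, i) - main_effect_variance(Y, j)

V_main = [main_effect_variance(Y, i) for i in range(3)]
V_pair = {(i, j): pair_variance(Y, i, j)
          for i, j in itertools.combinations(range(3), 2)}
# Whatever is left after first- and second-order terms is the third-order term.
V_123 = V - sum(V_main) - sum(V_pair.values())

print("V(Y)  =", V)
print("V_i   =", V_main)
print("V_ij  =", V_pair)
print("V_123 =", V_123)
```

For this toy model the only non-zero interaction term is the one between the first and third inputs, which come from the product `x1 * x3`; the third input has no main effect at all because its conditional mean is constant.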
First-order (main effect) Sobol index ^first-order-index
$S_i = V_i / V(Y)$ is the fraction of output variance caused by factor $X_i$ acting alone (its main effect, averaged over all other factors). Used for factor prioritization: a large $S_i$ means determining $X_i$ more precisely most reduces output uncertainty. The analogous second-order index is $S_{ij} = V_{ij} / V(Y)$, the share due to the $X_i$–$X_j$ interaction.
Total-effect (total-order) Sobol index ^total-effect-index
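$S_{Ti} = S_i + \sum_{j \ne i} S_{ij} + \dots$ is the sum over all index combinations that contain $i$ (its main effect plus every interaction it participates in). Equivalently $S_{Ti} = E_{X_{\sim i}}\big(V_{X_i}(Y \mid X_{\sim i})\big) / V(Y) = 1 - V_{X_{\sim i}}\big(E_{X_i}(Y \mid X_{\sim i})\big) / V(Y)$, the share of variance left when all factors except $X_i$ are fixed. Used for factor fixing: if $S_{Ti} \approx 0$, $X_i$ can be frozen anywhere in its range with negligible effect on $V(Y)$.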
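As a rough check that the two forms of the total-effect index agree, both can be evaluated exactly on a small grid and compared with the first-order index. This is a sketch under stated assumptions: the toy model and the three-level independent inputs are hypothetical and serve only to make the averages exact.

```python
import numpy as np

# Assumption: independent inputs, each uniform on the grid {-1, 0, 1}.
levels = np.array([-1.0, 0.0, 1.0])
X1, X2, X3 = np.meshgrid(levels, levels, levels, indexing="ij")
Y = X1 + 2.0 * X2 + X1 * X3      # hypothetical toy model
V = Y.var()                       # total variance V(Y)
k = Y.ndim

S = np.empty(k)       # first-order indices
ST_a = np.empty(k)    # total effect as E[V(Y | X_~i)] / V(Y)
ST_b = np.empty(k)    # total effect as 1 - V(E[Y | X_~i]) / V(Y)
for i in range(k):
    others = tuple(a for a in range(k) if a != i)
    S[i] = Y.mean(axis=others).var() / V       # variance of E[Y | X_i]
    ST_a[i] = Y.var(axis=i).mean() / V         # mean of the variance over X_i
    ST_b[i] = 1.0 - Y.mean(axis=i).var() / V   # complement of the "all but i" effect

print("S_i  =", S)
print("S_Ti =", ST_a)
print("gap  =", ST_a - S)
```

In this toy model the third input has a zero first-order index but a non-zero total-effect index, so it would be a poor candidate for fixing even though its main effect vanishes.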
Interaction detection and budget identities ^interaction-identities
Three diagnostic relations follow directly:
- $\sum_i S_i \le 1$, with equality iff the model is purely additive (no interactions). A deficit $1 - \sum_i S_i$ measures total interaction strength.
- $S_{Ti} \ge S_i$ always; the gap $S_{Ti} - S_i$ is the portion of $X_i$’s effect mediated by interactions with other factors. $S_{Ti} = S_i$ means $X_i$ acts additively.
- $\sum_i S_{Ti} \ge 1$, with equality iff additive (interactions are counted once per participating factor, so they are double/multiply counted in the total-effect sum).
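These relations can also be inspected on Monte Carlo estimates. The sketch below assumes a hypothetical toy model with inputs uniform on $[-1, 1]$ and uses two commonly cited pick-freeze estimators (the first-order form from Saltelli et al. 2010 and the Jansen form for the total effect); it is not taken from the source and is not a tuned implementation. Because the estimates are noisy, the inequalities hold only up to sampling error.

```python
import numpy as np

def model(X):
    # Hypothetical toy model (illustrative only); X has shape (N, 3).
    return X[:, 0] + 2.0 * X[:, 1] + X[:, 0] * X[:, 2]

def sobol_pick_freeze(model, k, N, rng):
    """Estimate first-order and total-effect indices from two sample matrices."""
    A = rng.uniform(-1.0, 1.0, size=(N, k))   # assumption: inputs ~ U(-1, 1)
    B = rng.uniform(-1.0, 1.0, size=(N, k))
    fA, fB = model(A), model(B)
    V = np.var(np.concatenate([fA, fB]))      # estimate of V(Y)
    S = np.empty(k)
    ST = np.empty(k)
    for i in range(k):
        AB = A.copy()
        AB[:, i] = B[:, i]                    # A with column i taken from B
        fAB = model(AB)
        S[i] = np.mean(fB * (fAB - fA)) / V           # estimates V_i / V(Y)
        ST[i] = 0.5 * np.mean((fA - fAB) ** 2) / V    # estimates E[V(Y | X_~i)] / V(Y)
    return S, ST

rng = np.random.default_rng(0)
S, ST = sobol_pick_freeze(model, k=3, N=2**16, rng=rng)

print("S_i        =", S)
print("S_Ti       =", ST)
print("sum S_i    =", S.sum())     # below 1 when interactions are present
print("sum S_Ti   =", ST.sum())    # above 1 when interactions are present
print("S_Ti - S_i =", ST - S)      # per-factor interaction share
```

For this model the analytical values are $S = (3/16, 12/16, 0)$ and $S_T = (4/16, 12/16, 1/16)$, so the estimated sums should land near $15/16$ and $17/16$. For the purely additive second input the estimated gap can come out slightly negative, which reflects estimator noise rather than a violation of $S_{Ti} \ge S_i$.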
The paper notes Sobol “assesses the impact of each input parameter, both in isolation and in conjunction with other parameters,” yielding first-, second-, total-, and higher-order indices. In the MNIST case study the Sobol index was among the most reliable importance measures.
Examples
Suppose a 3-parameter ABM gives and .
- $1 - \sum_i S_i = 0.35$ → 35% of output variance comes from interactions (non-additive model).
- : → mostly acts alone; it is the dominant main driver.
- : but → almost all of ‘s influence is through interactions. A local OAT scan holding others fixed would wrongly conclude is unimportant (see Local vs Global Sensitivity Analysis).
- has the largest total effect () despite a modest main effect — a key interacting factor that must not be fixed.
Connections
- Global Sensitivity Analysis - Overview — places variance-based methods among the four GSA families.
- Sampling and Estimation for Sobol Indices: how $S_i$, $S_{Ti}$ are estimated (Saltelli Monte-Carlo, FAST spectral).
- Morris Elementary Effects Screening: cheap screening; Morris $\mu^*$ correlates with $S_{Ti}$ for ranking.
- Local vs Global Sensitivity Analysis — total-effect indices are exactly what OAT cannot capture.
- Uncertainty Quantification for ABM Calibration — variance decomposition complements output UQ.
See Also
- History Matching for ABMs; Approximate Bayesian Computation for ABMs — guides which parameters to keep active.
- Sobol (2001), “Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates.”
- Saltelli et al. (2010), “Variance based sensitivity analysis of model output: design and estimator for the total sensitivity index.”