Sampling and Estimation for Sobol Indices
Summary
Sobol indices are integrals that must be estimated from model runs. The Saltelli scheme builds two independent sample matrices $A$ and $B$ plus hybrid matrices $A_B^{(i)}$, giving first- and total-order indices at a cost of $N(p+2)$ model evaluations (or $N(2p+2)$ if second-order indices are also wanted), where $N$ is the base sample size and $p$ the number of factors. FAST (Fourier Amplitude Sensitivity Test) instead encodes each factor on a distinct integer frequency and recovers variance shares from the Fourier spectrum of the output, often converging faster than Monte-Carlo Sobol. RBD uses a single frequency with random permutations, and the hybrid FAST_RBD uses one frequency per group of factors, to cut cost in high dimensions. All are implemented in the Python SALib library used by the paper.
Overview
The variance terms behind $S_i$ and $S_{Ti}$ (see Variance-Based Sensitivity and Sobol Indices) are conditional-variance integrals with no closed form for a general simulator, so they are estimated numerically. The paper’s case study used the Sensitivity Analysis Library in Python (SALib) for all methods. Two estimation routes dominate: Monte-Carlo with the Saltelli design (Sobol) and spectral analysis (FAST/RBD).
Main Content
Saltelli sampling scheme & cost ^saltelli-scheme
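Draw two independent quasi-random (Sobol-sequence) matrices $A$ and $B$, each of size $N \times p$ (one row per sample point, one column per factor). For each factor $i$, form $A_B^{(i)}$ = matrix $A$ with only column $i$ replaced by column $i$ of $B$. Running the model $f$ on $A$, $B$, and all $p$ matrices $A_B^{(i)}$ yields estimators; a widely used pair (the exact variant depends on the implementation) is

$$\hat V_i = \frac{1}{N}\sum_{j=1}^{N} f(B)_j \bigl( f(A_B^{(i)})_j - f(A)_j \bigr), \qquad \hat E_{\sim i} = \frac{1}{2N}\sum_{j=1}^{N} \bigl( f(A)_j - f(A_B^{(i)})_j \bigr)^2$$

giving $S_i \approx \hat V_i / \hat V(Y)$ and $S_{Ti} \approx \hat E_{\sim i} / \hat V(Y)$, where $\hat V(Y)$ is the sample variance of the output. Total model evaluations = $N(p+2)$ for first- + total-order; $N(2p+2)$ if second-order indices are also estimated. $N$ is the base sample size (often a power of two, $2^k$, e.g. 1024).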
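The design and estimators described above can be sketched in plain NumPy. This is a minimal illustration, not SALib's implementation: it assumes the common first-order estimator built from $f(B)\,(f(A_B^{(i)}) - f(A))$ and the Jansen-style total-order estimator, it uses pseudo-random draws in place of a Sobol sequence to avoid extra dependencies, and the names `saltelli_indices` and `ishigami` are hypothetical. The Ishigami function is used only because its analytic indices are known.

```python
import numpy as np


def ishigami(x, a=7.0, b=0.1):
    """Ishigami benchmark; each row of x lies in [-pi, pi]^3."""
    return np.sin(x[:, 0]) + a * np.sin(x[:, 1]) ** 2 + b * x[:, 2] ** 4 * np.sin(x[:, 0])


def saltelli_indices(model, p, n, low, high, seed=0):
    """Estimate first- and total-order indices with n * (p + 2) model runs."""
    rng = np.random.default_rng(seed)
    # Two independent N x p base matrices (pseudo-random here, an assumption).
    A = rng.uniform(low, high, size=(n, p))
    B = rng.uniform(low, high, size=(n, p))
    f_A, f_B = model(A), model(B)

    # Centre the outputs to reduce estimator variance; indices are unaffected.
    f0 = np.mean(np.concatenate([f_A, f_B]))
    f_A, f_B = f_A - f0, f_B - f0
    var_y = np.mean(np.concatenate([f_A, f_B]) ** 2)

    S1, ST = np.empty(p), np.empty(p)
    for i in range(p):
        AB_i = A.copy()
        AB_i[:, i] = B[:, i]          # A with column i taken from B
        f_AB = model(AB_i) - f0
        S1[i] = np.mean(f_B * (f_AB - f_A)) / var_y
        ST[i] = 0.5 * np.mean((f_A - f_AB) ** 2) / var_y
    return S1, ST, n * (p + 2)


S1, ST, n_runs = saltelli_indices(ishigami, p=3, n=2**15, low=-np.pi, high=np.pi)
# Analytic Ishigami values (a=7, b=0.1): S1 ~ (0.314, 0.442, 0), ST ~ (0.558, 0.442, 0.244)
print(n_runs, S1.round(3), ST.round(3))
```

The gap between `ST[0]` and `S1[0]` reflects the interaction between the first and third inputs, which is the information the extra hybrid-matrix runs buy.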
FAST — Fourier Amplitude Sensitivity Test ^fast
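FAST “is based on periodic search sampling using a period search function and applies a decomposition of variance based on Fourier Transform.” Each factor $x_i$ is driven along a search curve at a distinct integer frequency $\omega_i$; a common choice of search function is

$$x_i(s) = \frac{1}{2} + \frac{1}{\pi}\arcsin\bigl(\sin(\omega_i s)\bigr), \qquad s \in (-\pi, \pi).$$

By Parseval’s theorem the output variance is recovered from the Fourier coefficients $A_j = \frac{1}{2\pi}\int_{-\pi}^{\pi} Y(s)\cos(js)\,ds$ and $B_j = \frac{1}{2\pi}\int_{-\pi}^{\pi} Y(s)\sin(js)\,ds$ of the output along the curve, $Y(s) = f(x_1(s), \dots, x_p(s))$:

$$V(Y) \approx 2\sum_{j=1}^{\infty}\bigl(A_j^2 + B_j^2\bigr),$$

and the first-order index reads off the spectral power at $x_i$’s frequency and its harmonics:

$$S_i \approx \frac{\sum_{k=1}^{M}\bigl(A_{k\omega_i}^2 + B_{k\omega_i}^2\bigr)}{\sum_{j=1}^{\infty}\bigl(A_j^2 + B_j^2\bigr)},$$

where $M$ is the number of harmonics retained.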
FAST “achieves a better estimate in terms of robustness and speed of convergence than Sobol” and handles nonlinear, non-monotonic models.
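A minimal NumPy sketch of the classic FAST recipe follows: sample a search curve, take the FFT of the output, and sum the power at each factor's harmonics. Several things here are assumptions made for illustration: the arcsine search function, the hand-picked frequencies `(11, 35)` (chosen only so that their first four harmonics do not overlap, not taken from a published frequency table), the hypothetical function name `fast_first_order`, and the additive toy model $Y = x_1 + 2x_2$, whose exact first-order indices are 0.2 and 0.8. It is not SALib's FAST implementation.

```python
import numpy as np


def fast_first_order(model, freqs, n_harmonics=4):
    """Classic FAST first-order indices for factors uniform on [0, 1]."""
    freqs = np.asarray(freqs)
    # Odd sample size large enough to resolve the highest retained harmonic.
    n = 2 * n_harmonics * int(freqs.max()) + 1
    # Equally spaced points along the search variable s in (-pi, pi).
    s = -np.pi + 2 * np.pi * (np.arange(n) + 0.5) / n
    # Search curve: column i oscillates at integer frequency freqs[i].
    X = 0.5 + np.arcsin(np.sin(np.outer(s, freqs))) / np.pi
    y = model(X)

    # power[k] is proportional to A_k^2 + B_k^2 at integer frequency k.
    power = np.abs(np.fft.rfft(y - y.mean()) / n) ** 2
    total = power[1:].sum()
    harmonics = np.arange(1, n_harmonics + 1)
    S1 = np.array([power[w * harmonics].sum() for w in freqs]) / total
    return S1, n


S1_fast, n_fast = fast_first_order(lambda X: X[:, 0] + 2 * X[:, 1], freqs=(11, 35))
print(n_fast, S1_fast.round(3))
```

Because only the first few harmonics are summed, the estimates sit slightly below the exact values; raising `n_harmonics` narrows that gap at the price of a longer curve.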
RBD and FAST_RBD (HFR) ^rbd
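As the number of factors $p$ grows, classic FAST suffers error and cost from resolving all higher-order harmonics. RBD uses a single frequency $\omega$ for all parameters (set to 1 for simplicity) with random permutation of the sample-point coordinates to restore stochasticity:

$$x_i(s_{ij}) = \frac{1}{2} + \frac{1}{\pi}\arcsin\bigl(\sin(\omega\, s_{ij})\bigr), \qquad j = 1, \dots, N,$$

where $(s_{i1}, \dots, s_{iN})$ is a random permutation, drawn independently for each factor $i$, of $N$ equally spaced points in $(-\pi, \pi)$. To estimate $S_i$, the outputs are re-ordered so that $s_{ij}$ is increasing, and the spectral power at the first $M$ harmonics of $\omega$ is read off as in FAST.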
Hybrid FAST_RBD (HFR) groups the parameters into equal partitions, assigning one frequency per partition — “a balance between the accuracy of FAST and the computational efficiency of RBD.”
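The RBD idea (one shared frequency, independent random permutations, re-ordering of the outputs before the Fourier step) can be sketched as follows. This is an illustrative sketch under assumptions, not the paper's or SALib's code: the function name `rbd_first_order`, the arcsine search function, the choice of six harmonics, and the toy model $Y = x_1 + 2x_2$ with an inert third factor (exact indices 0.2, 0.8, 0) are all chosen here for demonstration.

```python
import numpy as np


def rbd_first_order(model, p, n=10001, n_harmonics=6, seed=0):
    """RBD first-order indices for p factors uniform on [0, 1], using n runs."""
    rng = np.random.default_rng(seed)
    # One grid over (-pi, pi), permuted independently for every factor.
    s0 = -np.pi + 2 * np.pi * (np.arange(n) + 0.5) / n
    S = np.column_stack([rng.permutation(s0) for _ in range(p)])
    X = 0.5 + np.arcsin(np.sin(S)) / np.pi   # single frequency, omega = 1
    y = model(X)                             # n model runs in total

    S1 = np.empty(p)
    for i in range(p):
        # Re-order outputs so factor i traces its periodic curve in order;
        # the other factors then look like noise spread over all frequencies.
        order = np.argsort(S[:, i])
        power = np.abs(np.fft.rfft(y[order] - y.mean())) ** 2
        S1[i] = power[1:n_harmonics + 1].sum() / power[1:].sum()
    return S1


S1_rbd = rbd_first_order(lambda X: X[:, 0] + 2 * X[:, 1], p=3)
print(S1_rbd.round(3))
```

All `p` indices come from the same `n` runs, which is where the saving over Saltelli's $N(p+2)$ design comes from. The price is that only first-order indices are obtained, with a small positive bias that shrinks as `n` grows relative to `n_harmonics`.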
Choosing a budget (case-study sampling sizes) ^budget-table
Table 1 of the paper lists the sample sizes used (MNIST, 784 factors):
| Method | Samples |
| --- | --- |
| Morris | 50 (in 4 levels) |
| Sobol | 300 |
| FAST | 100 |
| RBD | 400 |
| Delta | 1000 |
| DGSM | 1000 |

These were grid-searched “to strike a balance between optimizing performance and minimizing the number of samples required.” General rule: Morris screens cheapest; FAST/RBD are economical for first-order; full Sobol total-effect ($S_{Ti}$) is the most expensive but most informative.
SALib ^salib
The Sensitivity Analysis Library (SALib) in Python (https://salib.readthedocs.io) implements Morris, Sobol (Saltelli sampling), FAST, RBD-FAST, Delta (DMIM), and DGSM. Typical pattern: define a `problem` dict (`num_vars`, `names`, `bounds`), call the method’s `sample()` to generate the design, run the model on every row, then `analyze()` to obtain $S_i$, $S_{Ti}$ (Sobol) or $\mu^*$, $\sigma$ (Morris).
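As a small illustration of the first step in that pattern, a problem definition for five factors might look like the sketch below. The parameter names and ranges are hypothetical placeholders, and `check_problem` is a helper written for this example rather than part of SALib; only the keys `num_vars`, `names`, and `bounds` follow SALib's convention.

```python
# Hypothetical parameter names and ranges, for illustration only.
names = ["contact_rate", "recovery_rate", "mobility", "initial_infected", "vaccine_uptake"]
bounds = [[0.0, 1.0], [0.01, 0.5], [0.0, 10.0], [1.0, 100.0], [0.0, 1.0]]

problem = {
    "num_vars": len(names),
    "names": names,
    "bounds": bounds,  # one [lower, upper] pair per factor, in the order of `names`
}


def check_problem(problem):
    """Basic consistency checks worth running before generating any samples."""
    p = problem["num_vars"]
    if len(problem["names"]) != p or len(problem["bounds"]) != p:
        raise ValueError("names and bounds must each have num_vars entries")
    for name, (lower, upper) in zip(problem["names"], problem["bounds"]):
        if not lower < upper:
            raise ValueError(f"empty range for {name}: [{lower}, {upper}]")
    return True


check_problem(problem)
```

Catching a mismatched or inverted bound at this stage is cheap compared with discovering it after thousands of simulator runs.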
Examples
Quantifying 5 surviving ABM parameters (after Morris screening, see Morris Elementary Effects Screening) with Sobol at $N = 1024$:
- Cost = $N(p+2) = 1024 \times 7 = 7168$ model runs for first- + total-order.
- Adding second-order indices would cost $N(2p+2) = 1024 \times 12 = 12288$ runs.
- If each ABM run takes 30 s, the 7168 runs take ~60 h serial, which motivates either a coarser $N$, a FAST/RBD first-order screen, or a cheap emulator.

SALib code:
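```python
from SALib.sample import saltelli
from SALib.analyze import sobol

# `problem` is the SALib problem dict (num_vars, names, bounds) for the 5 parameters;
# `run_abm` is the user's own wrapper that returns one output per design row.
# calc_second_order defaults to True, which yields N(2p+2) rows; set it to False
# in both calls to get the N(p+2) design for first- and total-order indices only.
param_values = saltelli.sample(problem, 1024, calc_second_order=False)  # -> N(p+2) rows
Y = run_abm(param_values)                                # one output per row
Si = sobol.analyze(problem, Y, calc_second_order=False)  # Si['S1'], Si['ST']
```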
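The run counts and wall-clock figure in this example are simple arithmetic, which a short helper makes easy to repeat for other budgets. The function names `saltelli_cost` and `serial_hours` are made up for this sketch.

```python
def saltelli_cost(n, p, second_order=False):
    """Model runs for a Saltelli design with base sample n and p factors."""
    return n * (2 * p + 2) if second_order else n * (p + 2)


def serial_hours(runs, seconds_per_run):
    """Wall-clock hours if the runs are executed one after another."""
    return runs * seconds_per_run / 3600.0


runs_first_total = saltelli_cost(1024, 5)
runs_with_second = saltelli_cost(1024, 5, second_order=True)
print(runs_first_total, runs_with_second, round(serial_hours(runs_first_total, 30.0), 1))
```

Halving $N$ halves every figure, which is often the quickest way to see whether a coarser design or an emulator is needed.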
Connections
- Variance-Based Sensitivity and Sobol Indices — the indices these schemes estimate.
- Global Sensitivity Analysis - Overview — cost positions Sobol/FAST in the screen-then-quantify pipeline.
- Morris Elementary Effects Screening: cheap pre-screen that shrinks $p$ and thus Saltelli cost.
- Uncertainty Quantification for ABM Calibration — emulators/surrogates make large Saltelli budgets feasible for ABMs.
- Approximate Bayesian Computation for ABMs — shared reliance on many simulator runs / quasi-random designs.
See Also
- History Matching for ABMs — emulator-based designs amortize the run budget GSA needs.
- Saltelli (2002), “Making best use of model evaluations to compute sensitivity indices.”
- Tarantola, Gatelli & Mara (2006), “Random balance designs for the estimation of first order global sensitivity indices.”