Sensitivity Analysis in Observational Studies
Summary
Unconfoundedness is untestable in observational studies. Sensitivity analysis assesses how robust causal conclusions are to unmeasured confounding. There are two main classes of parametrization: (1) distributions involving unmeasured confounders $U$, and (2) distributions of potential outcomes directly. The E-value (VanderWeele & Ding 2017) provides a model-free, easy-to-compute measure of robustness; copula-based methods provide a flexible Bayesian approach.
Overview
Unconfoundedness (no unmeasured confounding, ^def-ignorability) holds by design in randomized experiments but is fundamentally untestable in observational studies. Sensitivity analysis asks: how would the conclusions change if there were an unmeasured confounder?
Two broad classes of sensitivity analysis methods differ in how they parametrize the confounding. Both can be used in a Bayesian framework.
Parametrization via Unmeasured Confounders ($U$)
Setup (Cornfield et al. 1959; Rosenbaum & Rubin 1983)
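Let $U$ be an unmeasured binary confounder such that, conditional on $U$ and the observed covariates $X$, unconfoundedness holds for the treatment $Z$ and the potential outcomes $Y(0), Y(1)$: $Z \perp \{Y(0), Y(1)\} \mid X, U$.
The joint distribution of all variables factorizes as:
$$\Pr(Y(0), Y(1), Z, U \mid X) = \Pr(U \mid X)\,\Pr(Z \mid U, X)\,\Pr(Y(0), Y(1) \mid U, X)$$
Sensitivity parameters: the association between $U$ and treatment ($Z$), and between $U$ and outcome ($Y$). The observed-data distribution is identified; the sensitivity parameters are not.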
Cornfield Inequality
Theorem: Cornfield Inequality
If a hidden confounder $U$ could explain away the observed association between treatment $Z$ and outcome $Y$, then the association between $U$ and $Z$, and between $U$ and $Y$, must both be at least as large as the observed association between $Z$ and $Y$.
The inequality was originally motivated by the debate over whether smoking causes lung cancer: a hidden genetic factor would need an implausibly large association with both smoking and cancer to explain away the observed effect.
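As a rough numerical illustration of the inequality, the sketch below computes the spurious risk ratio that arises when the treatment has no effect on the outcome within levels of a binary hidden confounder, then compares it with the two confounder associations. The notation ($Z$ for treatment, $Y$ for outcome, $U$ for the confounder), the function name, and the probability values are assumptions made for this example, and observed covariates are omitted for simplicity.

```python
def confounded_risk_ratios(p_u, p_z_given_u, p_y_given_u):
    """Risk ratios implied by a binary hidden confounder U when Z has no effect on Y.

    p_u          -- P(U = 1)
    p_z_given_u  -- dict {u: P(Z = 1 | U = u)}
    p_y_given_u  -- dict {u: P(Y = 1 | U = u)}; Y depends on U only (null effect of Z)
    """
    prior = {0: 1.0 - p_u, 1: p_u}

    def p_u1_given_z(z):
        # Bayes' rule: P(U = 1 | Z = z)
        joint = {}
        for u in (0, 1):
            p_z = p_z_given_u[u] if z == 1 else 1.0 - p_z_given_u[u]
            joint[u] = prior[u] * p_z
        return joint[1] / (joint[0] + joint[1])

    def p_y_given_z(z):
        # P(Y = 1 | Z = z) = sum_u P(Y = 1 | U = u) P(U = u | Z = z)
        w = p_u1_given_z(z)
        return (1.0 - w) * p_y_given_u[0] + w * p_y_given_u[1]

    rr_obs = p_y_given_z(1) / p_y_given_z(0)      # spurious Z-Y association
    rr_zu = p_u1_given_z(1) / p_u1_given_z(0)     # Z-U association
    rr_uy = p_y_given_u[1] / p_y_given_u[0]       # U-Y association
    return rr_obs, rr_zu, rr_uy


# Hypothetical values: U is more common among the treated and raises the risk of Y.
rr_obs, rr_zu, rr_uy = confounded_risk_ratios(
    p_u=0.3,
    p_z_given_u={0: 0.2, 1: 0.6},
    p_y_given_u={0: 0.1, 1: 0.3},
)
print(rr_obs, rr_zu, rr_uy)  # roughly 1.57, 3.19, 3.0
```

In this toy setting the spurious association is smaller than either confounder association, which is the pattern the inequality describes.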
Rosenbaum & Rubin (1983) Logistic Model
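For a binary treatment $Z$ and binary outcome $Y$ with a binary unmeasured confounder $U$ (and the observed covariate $X$ as a stratification variable):
- Logistic model for $Z$ given $U$, logistic model for $Y$ given $U$, Bernoulli for $U$
- Sensitivity parameters: the log-odds ratios for the $U$-$Z$ and $U$-$Y$ associations
- Treat the sensitivity parameters as fixed (frequentist) or assign them priors (Bayesian); compute the causal effect estimate over a plausible range
Bayesian analogue (Dorie et al. 2016): this is straightforward. Place priors on the sensitivity parameters, use data augmentation to impute $U$, and obtain the posterior of the causal effect.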
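The frequentist version of this recipe can be sketched for a deliberately simplified case. The code below is not the Rosenbaum & Rubin (1983) procedure itself: it assumes no covariates, a single $U$-outcome log-odds ratio shared by both arms, and plug-in observed proportions. All function and argument names (`adjusted_risk_difference`, `pi_u`, `alpha`, `delta`) are hypothetical. For fixed sensitivity parameters it calibrates the intercepts of the two logistic models to the observed proportions and then averages the outcome model over the marginal distribution of $U$.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def solve_intercept(target, slope, p_u1):
    """Bisection for b with (1 - p_u1) * expit(b) + p_u1 * expit(b + slope) == target."""
    lo, hi = -30.0, 30.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        value = (1.0 - p_u1) * expit(mid) + p_u1 * expit(mid + slope)
        if value < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def adjusted_risk_difference(p_z, p_y1, p_y0, pi_u, alpha, delta):
    """Risk difference adjusted for a hypothetical binary U.

    Identified inputs: p_z = P(Z=1), p_y1 = P(Y=1 | Z=1), p_y0 = P(Y=1 | Z=0).
    Sensitivity parameters: pi_u = P(U=1), alpha = U-Z log-odds ratio,
    delta = U-Y log-odds ratio (assumed equal in both arms).
    """
    # Treatment model P(Z=1 | U=u) = expit(g0 + alpha * u), calibrated to P(Z=1).
    g0 = solve_intercept(p_z, alpha, pi_u)
    p_z_u1 = expit(g0 + alpha)
    # Bayes' rule: prevalence of U among treated and among controls.
    u1_given_z1 = pi_u * p_z_u1 / p_z
    u1_given_z0 = pi_u * (1.0 - p_z_u1) / (1.0 - p_z)
    # Outcome model P(Y(z)=1 | U=u) = expit(b_z + delta * u), calibrated per arm.
    b1 = solve_intercept(p_y1, delta, u1_given_z1)
    b0 = solve_intercept(p_y0, delta, u1_given_z0)
    # Standardize both arms to the marginal distribution of U.
    mean_y1 = (1.0 - pi_u) * expit(b1) + pi_u * expit(b1 + delta)
    mean_y0 = (1.0 - pi_u) * expit(b0) + pi_u * expit(b0 + delta)
    return mean_y1 - mean_y0


# Sweep a small grid of sensitivity parameters (observed proportions are made up).
for alpha in (0.0, 1.0, 2.0):
    for delta in (0.0, 1.0, 2.0):
        rd = adjusted_risk_difference(0.4, 0.3, 0.2, 0.5, alpha, delta)
        print(f"alpha={alpha:.1f} delta={delta:.1f} adjusted RD={rd:.4f}")
```

When either `alpha` or `delta` is zero the adjusted value coincides with the naive difference, since $U$ then does not confound the comparison. A Bayesian version would replace the grid with priors on the same parameters.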
E-Value (VanderWeele & Ding 2017)
Definition: E-Value
The E-value is the minimum strength of association (on the risk ratio scale) that an unmeasured confounder would need to have with both the treatment and outcome — above and beyond measured covariates — to fully explain away the observed treatment-outcome association.
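Mathematically, define the sensitivity parameters as the treatment-confounder association $\mathrm{RR}_{ZU}$ and the outcome-confounder association $\mathrm{RR}_{UY}$, both on the risk ratio scale. Based on the bounding factor of Ding & VanderWeele (2016), an observed risk ratio $\mathrm{RR} > 1$ can be fully explained away only if $\max(\mathrm{RR}_{ZU}, \mathrm{RR}_{UY}) \ge \mathrm{RR} + \sqrt{\mathrm{RR}(\mathrm{RR} - 1)}$; this threshold is the E-value.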
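The calculation is short enough to sketch directly. The code below uses the standard closed form for a risk ratio, E-value $= \mathrm{RR} + \sqrt{\mathrm{RR}(\mathrm{RR} - 1)}$ for $\mathrm{RR} > 1$ (with the reciprocal taken first when $\mathrm{RR} < 1$), together with the Ding & VanderWeele bounding factor. The function names are chosen for this example and do not come from any particular package.

```python
import math


def bias_factor(rr_zu, rr_uy):
    """Ding & VanderWeele bounding factor for given confounder associations (both >= 1)."""
    return rr_zu * rr_uy / (rr_zu + rr_uy - 1.0)


def e_value(rr):
    """E-value for a point estimate on the risk ratio scale."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    rr = max(rr, 1.0 / rr)  # protective estimates are handled via the reciprocal
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_for_ci(rr, lower, upper):
    """E-value for the confidence limit closest to the null (returns 1 if the CI covers 1)."""
    if lower <= 1.0 <= upper:
        return 1.0
    return e_value(lower if rr > 1.0 else upper)


ev = e_value(3.9)
print(round(ev, 2))                   # 7.26
print(round(bias_factor(ev, ev), 2))  # 3.9: equal associations at the E-value just reach the estimate
```

The second printed value shows why the threshold is what it is: a confounder associated with both treatment and outcome at exactly the E-value yields a bounding factor equal to the observed risk ratio.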
Advantages:
- Model-free — avoids specifying a model for the unmeasured confounder
- Simple to calculate from summary statistics
- Intuitive: a larger E-value means more robust conclusions
- Avoids the “repeating the analysis” requirement of other sensitivity methods
Limitation: The analysis must make additional (arguably stronger) assumptions than the original analysis to assess unmeasured confounding.
Parametrization via Distributions of Potential Outcomes
Motivation
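An alternative parametrization is motivated by an equivalent mathematical expression of unconfoundedness, written here for treatment $Z$, covariates $X$, and potential outcomes $Y(z)$:
$$\Pr(Y(z) \mid Z = 1, X) = \Pr(Y(z) \mid Z = 0, X), \quad z = 0, 1$$
This says the distributions of potential outcomes in the two treatment arms are comparable (for the same $X$). Sensitivity analysis in this class models the difference between $\Pr(Y(z) \mid Z = 1, X)$ and $\Pr(Y(z) \mid Z = 0, X)$, rather than modeling an unobserved $U$.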
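One very simple member of this class, shown only as an illustration and not as the copula approach described next, shifts the mean of each unobserved potential outcome relative to its observed counterpart. The sketch assumes no covariates, and the two shift parameters and all names are hypothetical: `shift1` is $E[Y(1) \mid Z = 0] - E[Y(1) \mid Z = 1]$ and `shift0` is $E[Y(0) \mid Z = 1] - E[Y(0) \mid Z = 0]$. Both are zero under unconfoundedness.

```python
def ate_under_mean_shift(p_treated, mean_y1_treated, mean_y0_control, shift1, shift0):
    """Average treatment effect implied by mean-shift sensitivity parameters.

    p_treated        -- P(Z = 1), identified
    mean_y1_treated  -- E[Y | Z = 1] = E[Y(1) | Z = 1], identified
    mean_y0_control  -- E[Y | Z = 0] = E[Y(0) | Z = 0], identified
    shift1, shift0   -- non-identified sensitivity parameters (see text above)
    """
    # E[Y(1)] mixes the observed treated mean with the shifted mean for controls.
    mean_y1 = mean_y1_treated + (1.0 - p_treated) * shift1
    # E[Y(0)] mixes the observed control mean with the shifted mean for treated units.
    mean_y0 = mean_y0_control + p_treated * shift0
    return mean_y1 - mean_y0


# Made-up summaries: 40% treated, observed arm means 5.0 and 3.0.
for shift in (0.0, 0.5, 1.0):
    ate = ate_under_mean_shift(0.4, 5.0, 3.0, shift1=-shift, shift0=shift)
    print(f"shift={shift:.1f} ATE={ate:.2f}")
```

With both shifts at zero the result is the naive difference in means; the loop shows how quickly the estimate shrinks as the treated group is assumed to have systematically higher potential outcomes.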
Copula-Based Sensitivity (Franks, D’Amour, Feller 2020)
Definition: Copula-Based Sensitivity Analysis
Franks et al. (2020) used a copula to connect the two identifiable marginal distributions of outcomes:
- $\Pr(Y(1) \mid Z = 1, X)$: identifiable from treated units
- $\Pr(Y(0) \mid Z = 0, X)$: identifiable from control units
The copula parameters are the sensitivity parameters — they parametrize the non-identifiable joint distribution. Bayesian inference places priors on the copula parameters.
- Separates identifiable from non-identifiable parameters clearly — transparent parametrization
- Bayesian framework naturally handles the non-identified copula parameters as sensitivity priors
- Connects to Copula Estimation vault note
Rosenbaum’s Sensitivity Parameter
Rosenbaum’s original framework uses $\Gamma \ge 1$ as the sensitivity parameter: a bound on the ratio of the odds of treatment for two units with the same observed covariates but potentially different unmeasured factors.
- For a sharp null hypothesis of no treatment effect, repeat the Fisher randomization test with a matched sample over increasing values of $\Gamma$ to find the threshold at which the p-value crosses significance
- A larger threshold $\Gamma$ implies more robust conclusions
- Grounded in Fisherian randomization inference; no natural Bayesian analogue
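The threshold search can be sketched for one concrete randomization test. The example below assumes matched pairs with a binary outcome and uses the sign (McNemar-type) test on discordant pairs, where for a given $\Gamma$ the chance that the treated unit is the one with the event is at most $\Gamma / (1 + \Gamma)$. The function names, the grid step, and the pair counts are made up for illustration.

```python
from math import comb


def worst_case_p_value(n_discordant, n_treated_events, gamma):
    """Upper bound on the one-sided sign-test p-value under hidden bias of size gamma."""
    p_plus = gamma / (1.0 + gamma)  # largest per-pair probability allowed by gamma
    return sum(
        comb(n_discordant, a) * p_plus**a * (1.0 - p_plus) ** (n_discordant - a)
        for a in range(n_treated_events, n_discordant + 1)
    )


def critical_gamma(n_discordant, n_treated_events, alpha=0.05, step=0.01, max_gamma=50.0):
    """Smallest gamma on the grid at which the worst-case p-value exceeds alpha."""
    k = 0
    while 1.0 + k * step <= max_gamma:
        gamma = 1.0 + k * step
        if worst_case_p_value(n_discordant, n_treated_events, gamma) > alpha:
            return round(gamma, 10)
        k += 1
    return None  # conclusion survives every gamma on the grid


# Hypothetical data: 10 discordant pairs, the treated unit had the event in all 10.
print(worst_case_p_value(10, 10, 1.0))  # gamma = 1 is the usual randomization p-value
print(critical_gamma(10, 10))
```

At $\Gamma = 1$ the bound reduces to the ordinary randomization p-value; the reported threshold is the amount of hidden bias in treatment assignment needed before the test could no longer reject.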
Comparison of Methods
| Method | Parametrization | Bayesian-friendly? | Key feature |
|---|---|---|---|
| Cornfield/Rosenbaum & Rubin | Hidden binary $U$ | Yes (prior on parameters) | Directly models confounder |
| E-value | Treatment/outcome associations | Yes (Bayesian E-value) | Model-free; easy to report |
| Copula (Franks et al. 2020) | Potential outcome distributions | Yes (prior on copula params) | Transparent parametrization |
| Rosenbaum’s $\Gamma$ | Odds ratio for treatment | Difficult | Fisher randomization basis |
Identifiability and Transparent Parametrization
A key insight from §6: sensitivity analysis is a form of transparent parametrization — explicitly separating identified parameters (which the data inform) from non-identified parameters (sensitivity parameters, which require prior information or a range of values).
This connects to the broader theme of identifiability in Bayesian causal inference (see ^warn-prior-dogmatism): even non-identified parameters have posteriors in the Bayesian framework, making it especially important to be explicit about what the data can and cannot tell us.
Connections
- General Structure of Bayesian CI — identifiability and transparent parametrization in Bayesian CI
- Potential Outcomes Framework — the (untestable) unconfoundedness assumption whose violation sensitivity analysis probes
- Copula Estimation — vault note on copula methods used in the Franks et al. approach
See Also
- Instrumental Variables and Principal Stratification — alternative approach when unconfoundedness is untenable
- Frequentist Causal Estimation — doubly-robust estimators whose bias sensitivity analysis quantifies