Frequentist Causal Estimation
Summary
Three main Frequentist estimator classes exist under the potential outcomes framework: outcome modeling (regression), inverse probability weighting (IPW), and doubly-robust (DR) estimators. Each exploits the ignorability assumption differently. DR estimators are consistent if either the propensity score model or the outcome model is correctly specified — but not necessarily both.
Overview
Under the ignorability assumption, the CATE and PATE are identified from observed data. Frequentist causal estimation operationalizes this identification through three estimator families, reviewed here as context for the Bayesian approach in General Structure of Bayesian CI.
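The key identification result: under ignorability (writing $Z$ for treatment, $X$ for covariates, $Y(z)$ for the potential outcomes, and $Y$ for the observed outcome),
$$E[Y(z) \mid X = x] = E[Y \mid Z = z, X = x], \qquad z \in \{0, 1\},$$
so causal means equal observable conditional means. See ^eq-identification.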
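A short derivation may help make the two steps behind this result explicit. The notation is an assumption where the note does not fix it: $Z$ is the binary treatment, $X$ the covariates, $Y(z)$ the potential outcomes, $Y = Y(Z)$ the observed outcome, and $\mu_z(x)$ the observable conditional mean.

$$
\begin{aligned}
E[Y(z) \mid X = x] &= E[Y(z) \mid Z = z, X = x] && \text{(ignorability: } Y(z) \perp Z \mid X \text{)} \\
&= E[Y \mid Z = z, X = x] =: \mu_z(x) && \text{(observed outcome: } Y = Y(z) \text{ when } Z = z \text{)}
\end{aligned}
$$

Averaging over the covariate distribution then gives the PATE as $\tau = E[\mu_1(X) - \mu_0(X)]$. The conditioning event $Z = z, X = x$ must have positive probability for both arms, which is the overlap condition $0 < \Pr(Z = 1 \mid X = x) < 1$.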
Outcome Modeling (Regression Estimator)
The simplest approach: specify an outcome model $\mu_z(x) = E[Y \mid Z = z, X = x]$, estimate it from data, then impute the missing potential outcomes.
The outcome-model PATE estimator:
$$\hat{\tau}^{\text{reg}} = \frac{1}{N} \sum_{i=1}^{N} \left\{ \hat{\mu}_1(X_i) - \hat{\mu}_0(X_i) \right\}$$
- Consistent for the PATE $\tau$ if the outcome model is correctly specified.
- In regions of poor overlap, estimates rely on extrapolation and are sensitive to model misspecification.
- A misspecified linear outcome model still gives a consistent estimate in randomized experiments, but generally not in observational studies.
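The outcome-modeling recipe above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the data-generating process, the linear per-arm outcome models, and the helper names `simulate`, `fit_ols`, and `tau_reg` are all hypothetical. The simulation is built so that the true PATE is 2 and the linear models are correctly specified.

```python
import numpy as np

def simulate(n, rng):
    """Hypothetical confounded data with true PATE = 2 (because E[X] = 0)."""
    x = rng.normal(size=n)
    e = 1.0 / (1.0 + np.exp(-0.5 * x))       # true propensity score
    z = rng.binomial(1, e)                    # treatment indicator
    y0 = 1.0 + x + rng.normal(size=n)         # potential outcome under control
    y1 = y0 + 2.0 + x                         # potential outcome under treatment
    return x, z, np.where(z == 1, y1, y0)     # only one outcome is observed

def fit_ols(x, y):
    """Least-squares fit of y on (1, x); returns (intercept, slope)."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def tau_reg(x, z, y):
    """Average of mu1_hat(X_i) - mu0_hat(X_i) over all units."""
    a1, b1 = fit_ols(x[z == 1], y[z == 1])    # outcome model for treated units
    a0, b0 = fit_ols(x[z == 0], y[z == 0])    # outcome model for control units
    return np.mean((a1 + b1 * x) - (a0 + b0 * x))

rng = np.random.default_rng(0)
x, z, y = simulate(20_000, rng)
naive = y[z == 1].mean() - y[z == 0].mean()   # ignores confounding by x
reg_estimate = tau_reg(x, z, y)
print(f"naive difference: {naive:.2f}, regression estimate: {reg_estimate:.2f}")
```

In this setup the naive difference in means is biased upward because `x` raises both the treatment probability and the outcome, while the regression estimate should land close to 2.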
Inverse Probability Weighting (IPW)
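Uses the propensity score $e(x) = \Pr(Z = 1 \mid X = x)$ to reweight units.
Definition: IPW Estimator
$$\hat{\tau}^{\text{ipw}} = \frac{1}{N} \sum_{i=1}^{N} \left\{ \frac{Z_i Y_i}{\hat{e}(X_i)} - \frac{(1 - Z_i) Y_i}{1 - \hat{e}(X_i)} \right\}$$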
- Consistent if the propensity score model is correctly specified.
- The propensity score is a balancing score: conditioning on $e(X)$ balances the multivariate distribution of the covariates $X$ between treatment groups.
- When $e(x)$ is unknown (observational data), it must be estimated, e.g. via logistic regression.
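Hájek (normalized) IPW:
$$\hat{\tau}^{\text{hajek}} = \frac{\sum_{i=1}^{N} Z_i Y_i / \hat{e}(X_i)}{\sum_{i=1}^{N} Z_i / \hat{e}(X_i)} - \frac{\sum_{i=1}^{N} (1 - Z_i) Y_i / \{1 - \hat{e}(X_i)\}}{\sum_{i=1}^{N} (1 - Z_i) / \{1 - \hat{e}(X_i)\}}$$
The Hájek estimator normalizes the weights within each treatment group to sum to 1, which typically reduces variance.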
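The two weighting estimators can be compared on simulated data. The sketch below is illustrative only: the data-generating process (true PATE of 2), the Newton-Raphson logistic fit, and the names `fit_logistic`, `tau_ipw`, and `tau_hajek` are assumptions made for this example, and the propensity model is correctly specified by construction.

```python
import numpy as np

def fit_logistic(design, z, n_iter=25):
    """Logistic regression by Newton-Raphson; returns the coefficient vector."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        gradient = design.T @ (z - p)
        hessian = (design * (p * (1.0 - p))[:, None]).T @ design
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

def tau_ipw(z, y, e_hat):
    """Unnormalized IPW: both weighted sums are divided by N."""
    return np.mean(z * y / e_hat - (1 - z) * y / (1 - e_hat))

def tau_hajek(z, y, e_hat):
    """Normalized IPW: weights are rescaled to sum to 1 within each arm."""
    w1 = z / e_hat
    w0 = (1 - z) / (1 - e_hat)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

# Hypothetical confounded data with true PATE = 2.
rng = np.random.default_rng(1)
n = 20_000
x = rng.normal(size=n)
z = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * x)))
y = np.where(z == 1, 3.0 + 2.0 * x, 1.0 + x) + rng.normal(size=n)

# Estimate the propensity score with a logistic model in (1, x).
design = np.column_stack([np.ones(n), x])
e_hat = 1.0 / (1.0 + np.exp(-design @ fit_logistic(design, z)))

ipw_estimate = tau_ipw(z, y, e_hat)
hajek_estimate = tau_hajek(z, y, e_hat)
print(f"IPW: {ipw_estimate:.2f}, Hajek: {hajek_estimate:.2f}")
```

Both estimates should be close to 2 here. The difference between them tends to matter more when some estimated propensity scores are near 0 or 1, where a few units carry very large weights.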
Doubly-Robust (DR) Estimator
Combines outcome modeling and IPW for robustness.
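Definition: Doubly-Robust (DR) Estimator
$$\hat{\tau}^{\text{dr}} = \frac{1}{N} \sum_{i=1}^{N} \left\{ \hat{\mu}_1(X_i) - \hat{\mu}_0(X_i) + \frac{Z_i \hat{R}_i}{\hat{e}(X_i)} - \frac{(1 - Z_i) \hat{R}_i}{1 - \hat{e}(X_i)} \right\}$$
where $\hat{\mu}_z$ is the fitted outcome model, $\hat{e}$ is the fitted propensity score, and $\hat{R}_i = Y_i - \hat{\mu}_{Z_i}(X_i)$ is the residual from the outcome model.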
Theorem: Double Robustness
The DR estimator $\hat{\tau}^{\text{dr}}$ is consistent for the PATE $\tau$ if either:
- the propensity score model is correctly specified, or
- the outcome model is correctly specified (but not necessarily both).
The DR estimator is “doubly robust” because its asymptotic bias is a product of two errors, the error in the propensity score model and the error in the outcome model. If either error is zero (correct specification), the bias vanishes.
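Double robustness can be checked numerically by deliberately breaking one nuisance model at a time. The sketch below is a hedged illustration: the simulated data (true PATE of 2), the "misspecified" models that simply ignore the covariate, and the helper names `fit_logistic`, `ols_predict`, and `tau_dr` are all assumptions for this example.

```python
import numpy as np

def fit_logistic(design, z, n_iter=25):
    """Logistic regression by Newton-Raphson; returns the coefficient vector."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        hessian = (design * (p * (1.0 - p))[:, None]).T @ design
        beta = beta + np.linalg.solve(hessian, design.T @ (z - p))
    return beta

def ols_predict(x_fit, y_fit, x_new):
    """Fit y on (1, x) by least squares and predict at x_new."""
    design = np.column_stack([np.ones_like(x_fit), x_fit])
    coef, *_ = np.linalg.lstsq(design, y_fit, rcond=None)
    return coef[0] + coef[1] * x_new

def tau_dr(z, y, mu1_hat, mu0_hat, e_hat):
    """Outcome-model estimate plus an inverse-probability-weighted residual correction."""
    resid = y - np.where(z == 1, mu1_hat, mu0_hat)
    correction = z * resid / e_hat - (1 - z) * resid / (1 - e_hat)
    return np.mean(mu1_hat - mu0_hat + correction)

# Hypothetical confounded data with true PATE = 2.
rng = np.random.default_rng(2)
n = 20_000
x = rng.normal(size=n)
z = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * x)))
y = np.where(z == 1, 3.0 + 2.0 * x, 1.0 + x) + rng.normal(size=n)
t, c = z == 1, z == 0

# Correctly specified nuisance models.
design = np.column_stack([np.ones(n), x])
e_good = 1.0 / (1.0 + np.exp(-design @ fit_logistic(design, z)))
mu1_good, mu0_good = ols_predict(x[t], y[t], x), ols_predict(x[c], y[c], x)

# Deliberately misspecified models that ignore x entirely.
e_bad = np.full(n, z.mean())
mu1_bad, mu0_bad = np.full(n, y[t].mean()), np.full(n, y[c].mean())

dr_both_right = tau_dr(z, y, mu1_good, mu0_good, e_good)
dr_outcome_wrong = tau_dr(z, y, mu1_bad, mu0_bad, e_good)
dr_ps_wrong = tau_dr(z, y, mu1_good, mu0_good, e_bad)
dr_both_wrong = tau_dr(z, y, mu1_bad, mu0_bad, e_bad)
print(dr_both_right, dr_outcome_wrong, dr_ps_wrong, dr_both_wrong)
```

In this simulation the first three estimates should sit near 2, while the last one, with both models wrong, collapses to the confounded difference in means and is visibly biased.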
Matching and Weighting Methods
Matching methods find pairs of treated and control units with similar covariates (e.g., based on propensity score or Mahalanobis distance) and estimate the causal effect $\tau$ by the difference in average outcomes between matched groups.
Weighting methods assign a weight $w_i$ to each unit so the weighted covariate distribution is balanced, then compute a weighted difference in outcomes. IPW is the canonical weighting method; the Hájek estimator is its normalized version.
These can be viewed as non-parametric versions of the outcome-model, IPW, and DR estimators $\hat{\tau}^{\text{reg}}$, $\hat{\tau}^{\text{ipw}}$, and $\hat{\tau}^{\text{dr}}$, based on nearest-neighbor regressions.
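The nearest-neighbor view can be made concrete with a small matching sketch. It assumes a single scalar covariate, 1-nearest-neighbor matching with replacement, and a hypothetical data-generating process with true PATE of 2; the names `nn_impute` and `tau_match` are illustrative, and no bias correction is applied.

```python
import numpy as np

def nn_impute(x_query, x_donor, y_donor):
    """For each query unit, return the outcome of the closest donor unit in x."""
    distances = np.abs(x_query[:, None] - x_donor[None, :])
    return y_donor[distances.argmin(axis=1)]

def tau_match(x, z, y):
    """Impute each unit's missing potential outcome from its nearest
    neighbor in the opposite arm, then average Y(1) - Y(0)."""
    t, c = z == 1, z == 0
    y1, y0 = y.copy(), y.copy()
    y1[c] = nn_impute(x[c], x[t], y[t])   # controls borrow a treated outcome
    y0[t] = nn_impute(x[t], x[c], y[c])   # treated borrow a control outcome
    return np.mean(y1 - y0)

# Hypothetical confounded data with true PATE = 2.
rng = np.random.default_rng(3)
n = 4_000
x = rng.normal(size=n)
z = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * x)))
y = np.where(z == 1, 3.0 + 2.0 * x, 1.0 + x) + rng.normal(size=n)

match_estimate = tau_match(x, z, y)
print(f"matching estimate: {match_estimate:.2f}")
```

Each imputed value is a 1-nearest-neighbor regression prediction of the opposite arm's outcome, which is one way to read the claim that matching is a non-parametric counterpart of the outcome-model estimator.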
Connections to Bayesian Causal Inference
The Bayesian approach (see General Structure of Bayesian CI) treats causal inference as a missing data problem: impute the missing potential outcomes from the posterior predictive distribution, then compute any estimand. This automatically yields uncertainty quantification for any causal functional.
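The imputation idea can be sketched with a deliberately simple model. Everything specific here is an assumption for illustration: independent linear outcome models per arm, a flat prior on the coefficients, a known noise scale `sigma`, simulated data with an average effect of about 2, and the helper name `posterior_draws`. Under those assumptions the coefficient posterior is Gaussian, so no sampler is needed.

```python
import numpy as np

# Hypothetical confounded data; the average effect is about 2.
rng = np.random.default_rng(4)
n, sigma = 2_000, 1.0
x = rng.normal(size=n)
z = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * x)))
y = np.where(z == 1, 3.0 + 2.0 * x, 1.0 + x) + sigma * rng.normal(size=n)
design = np.column_stack([np.ones(n), x])

def posterior_draws(arm, n_draws):
    """Coefficient draws from N(beta_hat, sigma^2 (X'X)^-1) for one arm
    (flat prior, known sigma)."""
    d, y_arm = design[z == arm], y[z == arm]
    cov = sigma**2 * np.linalg.inv(d.T @ d)
    beta_hat = cov @ d.T @ y_arm / sigma**2
    return rng.multivariate_normal(beta_hat, cov, size=n_draws)

n_draws = 500
beta1, beta0 = posterior_draws(1, n_draws), posterior_draws(0, n_draws)
effect_draws = np.empty(n_draws)
for s in range(n_draws):
    # Posterior predictive draw of both potential outcomes for every unit.
    y1 = design @ beta1[s] + sigma * rng.normal(size=n)
    y0 = design @ beta0[s] + sigma * rng.normal(size=n)
    # Keep the observed outcome and impute only the missing one.
    y1 = np.where(z == 1, y, y1)
    y0 = np.where(z == 0, y, y0)
    effect_draws[s] = np.mean(y1 - y0)    # sample average treatment effect

post_mean = effect_draws.mean()
ci_low, ci_high = np.quantile(effect_draws, [0.025, 0.975])
print(f"posterior mean {post_mean:.2f}, 95% interval ({ci_low:.2f}, {ci_high:.2f})")
```

Because each draw yields a full set of completed potential outcomes, replacing `np.mean(y1 - y0)` with another function of `y1` and `y0` gives posterior draws for a different estimand without changing the rest of the code.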
The propensity score — central to Frequentist approaches — has a nuanced role in Bayesian inference:
- Under ignorability, the propensity score drops out of the likelihood for causal estimands (§3 of the paper)
- Yet it is essential for ensuring overlap and balance in the design stage
- See Propensity Score in Bayesian CI for the three strategies to incorporate it
See Also
- Propensity Score in Bayesian CI — Bayesian strategies using the propensity score
- Bayesian Propensity Scores and IPW — existing vault note on Bayesian IPW (Heiss blog)
- Sensitivity Analysis in Observational Studies — what happens when unconfoundedness fails
- Propensity Score Matching - Overview — the matching counterpart to the IPW/DR estimators here