Propensity Score Matching

Classical Rosenbaum-Rubin propensity-score matching framework and its diagnostics, ingested from Stuart (2010), “Matching Methods for Causal Inference: A Review and a Look Forward,” Statistical Science 25(1).

Routing Summary

Concept Map

Concept | Note | Type | Depends On | Key Result
Workflow & four uses of PS | Propensity Score Matching - Overview | overview | Potential Outcomes, Balancing Property | Separate design (no outcomes) from analysis; PS used for matching, subclassification, weighting, adjustment
PS definition + Rosenbaum-Rubin theorem | Propensity Score and the Balancing Property | theorem | Potential Outcomes, Conditional Independence | e(X) is a balancing score; under strong ignorability, conditioning on the scalar e(X) removes observed-covariate bias
Distances & matching structures | Matching Methods and Distance Measures | concept | Balancing Property | Exact/Mahalanobis/caliper distances; k:1, greedy vs. optimal, with/without replacement, full matching, IPTW
Balance diagnostics | Covariate Balance Diagnostics | concept | Balancing Property, Distance Measures | Use standardized mean differences and variance ratios (SMD < 0.25, variance ratio 0.5-2) and QQ plots; never use balance p-values
Overlap / common support | Common Support and Overlap | concept | Balancing Property | Estimate effects only on the region of common support; trim outside it; overlap constrains ATE vs. ATT
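
The thresholds in the balance-diagnostics row can be checked with a few lines of code. The following is a minimal sketch, not the method of any particular package: the function names are hypothetical, the standardized difference is assumed to use the treated-group standard deviation as its denominator, and the cutoffs are passed in as defaults taken from the row above.

```python
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Difference in group means divided by the treated-group SD (assumed denominator)."""
    return (mean(treated) - mean(control)) / variance(treated) ** 0.5


def variance_ratio(treated, control):
    """Ratio of sample variances, treated over control."""
    return variance(treated) / variance(control)


def is_balanced(treated, control, smd_max=0.25, ratio_range=(0.5, 2.0)):
    """Apply both rule-of-thumb thresholds to a single covariate."""
    smd = standardized_mean_difference(treated, control)
    ratio = variance_ratio(treated, control)
    return abs(smd) < smd_max and ratio_range[0] <= ratio <= ratio_range[1]
```

These are descriptive, in-sample summaries for one covariate (or for the propensity score itself); they are computed before and after matching and compared, rather than tested for significance.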

Notes

  • Propensity Score Matching - Overview — CONTAINS: definition of matching, two settings, outcome-free design vs. analysis, the four implementation steps, ATT/ATE estimands, the four uses of the propensity score, Chapin curse-of-dimensionality example, matching-vs-regression complementarity.
  • Propensity Score and the Balancing Property — CONTAINS: definition e(X) = P(T = 1 | X), strong ignorability (unconfoundedness + positivity), balancing-score theorem T ⊥ X | e(X), ignorability given the PS, liberal variable-selection rule (judge by balance, not the c-statistic), linear/logit PS, caliper bias-reduction figures (0.2 SD → 98%).
  • Matching Methods and Distance Measures — CONTAINS: four affinely-invariant distances with formulas, Mahalanobis-within-caliper formula, k:1 nearest neighbor & ratio matching, greedy vs. optimal, with/without replacement & frequency weights, subclassification (5-10 subclasses remove roughly 90% of bias), full matching (Hansen SAT example), IPTW & weighting-by-odds formulas, weight trimming, doubly-robust.
  • Covariate Balance Diagnostics — CONTAINS: standardized difference in means formula, Rubin (2001) three balance measures, thresholds (SMD<0.25, var ratio 0.5-2), the balance-test caution (in-sample property; tests conflate balance with power), QQ plots and before/after standardized-difference plots.
  • Common Support and Overlap — CONTAINS: region of common support / positivity, propensity-score trimming and convex-hull (King-Zeng), how calipers vs. weighting handle overlap, estimand implications (ATE vs. ATT), why matching surfaces non-overlap that regression hides.
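
As a rough illustration of how several of the listed pieces fit together (linear/logit propensity score, a 0.2 SD caliper, greedy 1:1 nearest-neighbor matching without replacement), here is a hedged sketch. It assumes the propensity scores have already been estimated, measures the caliper in pooled-sample standard deviations of the logit (one of several conventions), and uses a hypothetical function name; it is not taken from the source notes.

```python
import math
from statistics import stdev


def logit(p):
    """Linear propensity score: log-odds of the estimated propensity."""
    return math.log(p / (1.0 - p))


def greedy_caliper_match(ps_treated, ps_control, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns (treated_index, control_index) pairs. A treated unit whose
    nearest unused control lies outside the caliper stays unmatched.
    """
    lin_t = [logit(p) for p in ps_treated]
    lin_c = [logit(p) for p in ps_control]
    # Assumption: caliper width is a fraction of the pooled-sample SD of the logit.
    caliper = caliper_sd * stdev(lin_t + lin_c)
    available = set(range(len(lin_c)))
    pairs = []
    for i, lt in enumerate(lin_t):
        if not available:
            break
        # Nearest unused control; sorted() makes tie-breaking deterministic.
        j = min(sorted(available), key=lambda k: abs(lin_c[k] - lt))
        if abs(lin_c[j] - lt) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs
```

Because the procedure is greedy, the result depends on the order in which treated units are processed; optimal matching avoids that dependence. Treated units left unmatched by the caliper are dropped, which narrows the population the resulting ATT estimate describes.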

External Connections

Potential Outcomes Framework · Conditional Independence Assumption · Frequentist Causal Estimation · The Selection Problem · Omitted Variables Bias · Synthetic Control · Bayesian Inverse Probability Weighting · Bayesian Propensity Score Weighting · Activity Bias in Advertising

Sources