Identifying Assumptions for Staggered DiD

Summary

Point identification of $ATT(g,t)$ in the Callaway–Sant’Anna framework rests on Assumptions 3 to 6, beyond random sampling: limited treatment anticipation (Assumption 3, with horizon $\delta$), one of two conditional parallel trends assumptions, based either on a never-treated group (Assumption 4) or on not-yet-treated groups (Assumption 5), and an overlap/common-support condition (Assumption 6). Both parallel-trends versions allow covariate-specific trends, making them weaker than the randomization-based or unconditional parallel-trends assumptions used elsewhere.

Overview

These conditions extend the canonical parallel-trends assumption to multiple groups and periods and, crucially, allow it to hold only after conditioning on covariates. This matters when groups differ in observables that drive untreated outcome dynamics (e.g. job-training programs where age, education, and employment history differ across participants; Heckman et al. 1997). They also accommodate (bounded) anticipation of treatment. The choice between Assumptions 4 and 5 governs which comparison group is valid; the anticipation horizon governs the reference period.

Main Content

Assumption 3 — Limited Treatment Anticipation

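There is a known $\delta \ge 0$ such that, for all $g \in \mathcal{G}$ and $t \in \{1, \dots, \mathcal{T}\}$ with $t < g - \delta$,

$$E[Y_t(g) \mid X, G_g = 1] = E[Y_t(0) \mid X, G_g = 1] \quad \text{a.s.}$$

Here $Y_t(g)$ is the potential outcome at time $t$ for a unit first treated in period $g$, $Y_t(0)$ is the untreated potential outcome, and $G_g = 1$ flags units first treated in period $g$.

When $\delta = 0$ this is the standard no-anticipation condition (units do not respond before treatment starts). Setting $\delta > 0$ permits anticipation up to $\delta$ periods ahead (e.g. $\delta = 1$ if units react one period early). Under Assumption 3, $ATT(g,t) = 0$ for all pre-treatment periods $t < g - \delta$. The parallel-trends assumptions become stronger as $\delta$ grows, because the condition $t \ge g - \delta$ then covers more periods (Remark 1), a previously unnoticed trade-off.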
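To make the role of the anticipation horizon concrete, the sketch below lists the periods in which Assumption 3 rules out any effect and the base period that the horizon implies. The function names are hypothetical, and the base period is taken to be $g - \delta - 1$, the last period before any allowed anticipation.

```python
def no_anticipation_periods(g, delta, periods):
    """Periods t with t < g - delta, where Assumption 3 implies ATT(g, t) = 0."""
    return [t for t in periods if t < g - delta]


def reference_period(g, delta):
    """Last period before any allowed anticipation (assumed base period)."""
    return g - delta - 1


periods = [1, 2, 3, 4, 5, 6]
print(no_anticipation_periods(g=4, delta=0, periods=periods))  # [1, 2, 3]
print(no_anticipation_periods(g=4, delta=1, periods=periods))  # [1, 2]
print(reference_period(g=4, delta=1))                          # 2
```

Raising the horizon from 0 to 1 removes period 3 from the clean pre-treatment window and pushes the base period back by one.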

Assumption 4 — Conditional Parallel Trends (Never-Treated comparison)

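Let $\delta$ be as in Assumption 3. For each $g \in \mathcal{G}$ and $t \in \{2, \dots, \mathcal{T}\}$ with $t \ge g - \delta$,

$$E[Y_t(0) - Y_{t-1}(0) \mid X, G_g = 1] = E[Y_t(0) - Y_{t-1}(0) \mid X, C = 1] \quad \text{a.s.}$$

Conditional on covariates, group $g$ and the never-treated group ($C = 1$) would have followed parallel untreated outcome paths. Favored when a sizable never-treated group exists that is similar to the eventually-treated units. Under $\delta = 0$ it places no restriction on observed pre-treatment trends.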
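As an illustration of how the never-treated comparison is used, here is a minimal sketch of the unconditional (no covariates) difference-in-differences for one group-time effect. The toy panel, the function name, and the base period $g - \delta - 1$ are assumptions made for this example; it is not the covariate-adjusted estimator of the paper.

```python
from statistics import mean

# Hypothetical toy panel: unit -> (first treatment period, or None if never
# treated; outcomes indexed by period 1..4).
panel = {
    "a": (3, {1: 1.0, 2: 2.0, 3: 5.0, 4: 6.0}),
    "b": (3, {1: 0.0, 2: 1.0, 3: 4.0, 4: 5.0}),
    "c": (None, {1: 2.0, 2: 3.0, 3: 4.0, 4: 5.0}),
    "d": (None, {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}),
}


def att_never_treated(panel, g, t, delta=0):
    """Unconditional DiD for ATT(g, t) with never-treated units as comparison."""
    base = g - delta - 1  # assumed reference period
    treated = [y[t] - y[base] for first, y in panel.values() if first == g]
    never = [y[t] - y[base] for first, y in panel.values() if first is None]
    if not treated or not never:
        raise ValueError("need at least one unit in each group")
    return mean(treated) - mean(never)


print(att_never_treated(panel, g=3, t=3))  # 2.0
print(att_never_treated(panel, g=3, t=4))  # 2.0
```

The never-treated units supply the counterfactual change from the base period to $t$; subtracting it from the treated group's change isolates the effect under parallel trends.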

Assumption 5 — Conditional Parallel Trends (Not-Yet-Treated comparison)

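Let $\delta$ be as in Assumption 3. For each $g \in \mathcal{G}$ and each $(s,t) \in \{2, \dots, \mathcal{T}\} \times \{2, \dots, \mathcal{T}\}$ with $t \ge g - \delta$ and $t + \delta \le s < \bar{g}$, where $\bar{g}$ is the largest value of $G$ in the data,

$$E[Y_t(0) - Y_{t-1}(0) \mid X, G_g = 1] = E[Y_t(0) - Y_{t-1}(0) \mid X, D_s = 0, G_g = 0] \quad \text{a.s.}$$

Uses groups not yet treated by time $t + \delta$ as the comparison. Favored when there is no never-treated group or it is too small, since it exploits more comparison units and so yields more informative inference. Drawback: unlike Assumption 4, it does restrict pre-treatment trends, which can fail when early periods experienced different shocks than later ones (Marcus & Sant’Anna 2020). Practitioners uncomfortable using never-treated units (who may behave differently) can drop them and proceed under Assumption 5 (Remark 2).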
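The comparison set under the not-yet-treated assumption changes with the period being evaluated. The sketch below selects it for a given group and period, with an optional switch to drop never-treated units as Remark 2 allows. The cohort data, function name, and `use_never_treated` flag are hypothetical.

```python
def not_yet_treated(first_treated, g, t, delta=0, use_never_treated=True):
    """Units usable as comparison for group g at time t.

    first_treated maps unit -> first treatment period (None = never treated).
    A unit qualifies if it is outside group g and still untreated at t + delta.
    """
    chosen = []
    for unit, first in first_treated.items():
        if first == g:
            continue  # members of group g are never their own comparison
        if first is None:
            if use_never_treated:
                chosen.append(unit)
        elif first > t + delta:
            chosen.append(unit)
    return sorted(chosen)


# Hypothetical cohorts: two units first treated in 3, one in 4, one in 5, one never.
first_treated = {"a": 3, "b": 3, "c": 4, "d": 5, "e": None}

print(not_yet_treated(first_treated, g=3, t=3))           # ['c', 'd', 'e']
print(not_yet_treated(first_treated, g=3, t=4))           # ['d', 'e']
print(not_yet_treated(first_treated, g=3, t=3, delta=1))  # ['d', 'e']
print(not_yet_treated(first_treated, g=3, t=3, use_never_treated=False))  # ['c', 'd']
```

The set shrinks as $t$ or $\delta$ grows, because later cohorts enter treatment (or its anticipation window) and can no longer serve as comparisons.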

Assumption 6 — Overlap (common support)

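For each $t \in \{2, \dots, \mathcal{T}\}$, $g \in \mathcal{G}$, there exists $\varepsilon > 0$ with

$$P(G_g = 1) > \varepsilon \quad \text{and} \quad p_{g,t}(X) < 1 - \varepsilon \quad \text{a.s.}$$

A positive fraction of the population starts treatment in period $g$, and the generalized propensity score $p_{g,t}(X)$ is uniformly bounded away from one. Rules out “irregular identification” (Khan & Tamer 2010). Extends the overlap conditions of Heckman et al., Abadie (2005), and Sant’Anna & Zhao (2020).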
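In applied work the overlap condition is usually inspected on estimated quantities. The following is a rough diagnostic sketch for a single group-period pair, assuming the generalized propensity scores have already been estimated elsewhere. The function name and the tolerance `eps` are hypothetical, and a finite-sample check of this kind can flag problems but cannot verify the assumption.

```python
def check_overlap(in_group, pscores, eps=0.01):
    """Return (share_ok, pscore_ok) for one (g, t) pair.

    in_group: 0/1 indicators for membership in group g
    pscores:  estimated generalized propensity scores, one per unit
    eps:      hypothetical tolerance standing in for the epsilon of Assumption 6
    """
    if not in_group or len(in_group) != len(pscores):
        raise ValueError("inputs must be non-empty and of equal length")
    share = sum(in_group) / len(in_group)
    return share > eps, max(pscores) < 1 - eps


# Hypothetical values: the last unit has a score very close to one.
in_group = [1, 1, 0, 0, 0]
pscores = [0.6, 0.7, 0.2, 0.3, 0.995]
print(check_overlap(in_group, pscores))  # (True, False)
```

A failing second flag means some comparison units look almost certain to belong to group $g$ given their covariates, which is the situation the overlap condition excludes.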

Conditional vs. unconditional. The unconditional versions of Assumptions 4–5 are still weaker than the parallel-trends conditions in de Chaisemartin & D’Haultfœuille (2020) and Sun & Abraham (2020), and weaker than the randomization-of-adoption-date assumption in Athey & Imbens (2018). Allowing conditioning on $X$ permits covariate-specific trends; ignoring such trends when they are present biases unconditional DiD. Only pre-treatment covariates may be used, because post-treatment covariates can be affected by treatment (Wooldridge 2005b).
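The bias from ignoring covariate-specific trends can be seen in a small made-up example. The sketch below assumes a binary covariate, a true effect of zero, and untreated outcome changes that depend only on the covariate. The within-cell comparison is a simple stand-in for conditioning, not one of the paper's estimators.

```python
from statistics import mean

# Hypothetical data: (treated?, covariate X, outcome change from base period to t).
# Untreated changes are 1.0 when X == 0 and 3.0 when X == 1; the true effect is zero.
rows = [
    (1, 1, 3.0), (1, 1, 3.0), (1, 1, 3.0), (1, 0, 1.0),
    (0, 1, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
]


def unconditional_did(rows):
    """Difference in mean outcome changes, ignoring the covariate."""
    treated = [dy for d, _, dy in rows if d == 1]
    comparison = [dy for d, _, dy in rows if d == 0]
    return mean(treated) - mean(comparison)


def conditional_did(rows):
    """Compare within covariate cells, then average over the treated units."""
    gaps = []
    for d, x, dy in rows:
        if d == 1:
            cell = [dy0 for d0, x0, dy0 in rows if d0 == 0 and x0 == x]
            gaps.append(dy - mean(cell))
    return mean(gaps)


print(unconditional_did(rows))  # 1.0
print(conditional_did(rows))    # 0.0
```

The treated group has more units with the faster-trending covariate value, so the unconditional comparison reports a spurious effect of 1.0, while the within-cell comparison recovers zero.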

Do not pre-test to pick the assumption

It is tempting to use statistical pre-tests to choose between parallel-trends versions, but Roth (2020) shows this distorts inference. The authors recommend choosing based on the application’s context, not data-driven tests (Sec. 2.3, fn. 8).

Examples

Minimum wage: covariates that make parallel trends plausible

County characteristics used to justify conditional parallel trends: census region, county population, median income, fraction white, fraction with HS education, poverty rate. Treated counties differ markedly (much less likely Southern; population ~94k vs ~53k; 89% vs 83% white) — so unconditional parallel trends is suspect and conditioning is warranted. Sant’Anna & Song (2019) propensity-score specification tests fail to reject correct specification.

Connections

See Also

  • The Experimental Ideal: randomization as the strongest benchmark (stronger than any parallel-trends assumption)
  • Synthetic Control: alternative when parallel trends is implausible
  • de Chaisemartin & D’Haultfœuille (2020); Sun & Abraham (2020): stronger parallel-trends variants