Aggregating Group-Time Effects
Summary
Once the family of group-time effects $ATT(g,t)$ is identified, the Callaway–Sant’Anna framework aggregates it into interpretable summary parameters of the form $\theta = \sum_{g \in \mathcal{G}} \sum_{t=2}^{T} w(g,t)\, ATT(g,t)$ with researcher-chosen, non-negative weights $w(g,t)$. The main schemes are event-study/dynamic (effects by length of exposure $e$), group-specific, calendar-time, and overall ATT. Unlike TWFE coefficients, these use transparent non-negative weights and so cannot be negative when all underlying effects are positive; the “best” scheme is application-specific.
Overview
With many groups and periods, the raw $ATT(g,t)$'s are hard to digest. The general aggregation is
$$\theta = \sum_{g \in \mathcal{G}} \sum_{t=2}^{T} w(g,t)\, ATT(g,t),$$
where $w(g,t)$ are known or estimable weights chosen to answer a specific question. The section (which, for clarity, assumes $\delta = 0$, i.e. no anticipation, and an available never-treated group) contrasts these with the static and dynamic TWFE specifications, whose weights can be negative (Goodman-Bacon 2019; even the leads-and-lags event-study TWFE is contaminated, per Sun & Abraham 2020). Let event time $e = t - g$ denote periods since adoption.
Main Content
Event-study / dynamic aggregation (by length of exposure)
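Average effect $e$ periods after adoption, across cohorts observed at that exposure:
$$\theta_{es}(e) = \sum_{g \in \mathcal{G}} \mathbf{1}\{g + e \le T\}\, P(G = g \mid G + e \le T)\, ATT(g, g+e).$$
$\theta_{es}(0)$ is the “on-impact” effect. This is the proper target for event-study plots, without the pitfalls of the dynamic TWFE specification. Caveat: comparing $\theta_{es}(e)$ across $e$ mixes the true dynamic effect with two undesirable terms from compositional change, because different cohorts enter at different $e$ (decomposition 3.5). The fix is the balanced version
$$\theta_{es}^{bal}(e; e') = \sum_{g \in \mathcal{G}} \mathbf{1}\{g + e' \le T\}\, P(G = g \mid G + e' \le T)\, ATT(g, g+e), \qquad 0 \le e \le e',$$
which fixes the set of cohorts observed for at least $e'$ periods, eliminating composition effects at the cost of fewer groups (a robustness/efficiency trade-off). This resembles the common practice of reporting event-study coefficients only for periods without compositional change (Remark 8).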
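As a minimal sketch of the two event-study aggregations (not the authors' implementation), the Python below applies renormalized cohort-share weights to a handful of made-up group-time effects. The dictionary `att`, the cohort shares `p`, the horizon `T`, and the function name `theta_es` are all hypothetical and chosen only for illustration.

```python
# Toy inputs (all values are invented for illustration, not estimates).
T = 4                                   # last observed period
p = {2: 0.5, 3: 0.25, 4: 0.25}          # P(G = g | ever treated), by adoption period g
att = {                                 # ATT(g, t) for post-treatment periods t >= g
    (2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0,
    (3, 3): 2.0, (3, 4): 4.0,
    (4, 4): 4.0,
}


def theta_es(att, p, T, e, e_bal=None):
    """Average ATT(g, g + e) over cohorts, weighted by renormalized cohort shares.

    With e_bal=None, every cohort observed e periods after adoption is used.
    With e_bal set, only cohorts observed e_bal periods after adoption are
    used, so the cohort mix is identical for every e <= e_bal.
    """
    horizon = e if e_bal is None else e_bal
    if e > horizon:
        raise ValueError("e must not exceed e_bal")
    cohorts = [g for g in p if g + horizon <= T]
    total = sum(p[g] for g in cohorts)
    return sum(p[g] / total * att[(g, g + e)] for g in cohorts)


unbalanced = [theta_es(att, p, T, e) for e in range(3)]
balanced = [theta_es(att, p, T, e, e_bal=2) for e in range(3)]
print(unbalanced)
print(balanced)
```

In this toy example the unbalanced path starts at 2.0 on impact, while the balanced path (cohort 2 only) starts at 1.0; both end at 3.0 at the longest exposure. The difference at short exposures comes entirely from the changing cohort mix, not from dynamics within any cohort.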
Group-specific aggregation
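Average post-treatment effect for cohort $\tilde g$ across all its post periods:
$$\theta_{sel}(\tilde g) = \frac{1}{T - \tilde g + 1} \sum_{t=\tilde g}^{T} ATT(\tilde g, t).$$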
Answers: do earlier-treated groups have larger/smaller effects than later-treated ones? Building block for the recommended overall parameter.
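The group-specific average is a plain mean over each cohort's post-treatment periods. A small sketch, assuming a hypothetical dictionary `att` of invented group-time effects and a made-up horizon `T`:

```python
# Toy inputs (invented for illustration only).
T = 4                                   # last observed period
att = {                                 # ATT(g, t) for t >= g
    (2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0,
    (3, 3): 2.0, (3, 4): 4.0,
    (4, 4): 4.0,
}


def theta_sel(att, T, g):
    """Mean of ATT(g, t) over the cohort's post-treatment periods t = g, ..., T."""
    return sum(att[(g, t)] for t in range(g, T + 1)) / (T - g + 1)


by_cohort = {g: theta_sel(att, T, g) for g in (2, 3, 4)}
print(by_cohort)
```

Reading `by_cohort` from the earliest to the latest adopter is the direct way to ask whether earlier-treated groups show larger or smaller average effects.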
Calendar-time aggregation
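Average effect in calendar period $\tilde t$ across cohorts already treated by $\tilde t$:
$$\theta_{c}(\tilde t) = \sum_{g \in \mathcal{G}} \mathbf{1}\{\tilde t \ge g\}\, P(G = g \mid G \le \tilde t)\, ATT(g, \tilde t).$$
The cumulative version (3.9), $\theta_{c}^{cumu}(\tilde t) = \sum_{t=2}^{\tilde t} \theta_{c}(t)$, gives the cumulative effect among units treated by $\tilde t$. It is useful for studying business-cycle heterogeneity or, e.g., cumulative COVID cases averted by shelter-in-place.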
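A sketch of the calendar-time aggregation and its running sum, again on invented numbers. The names `theta_c` and `theta_c_cumu`, the inputs `att` and `p`, and the `first_period` default are assumptions made for this illustration; the cumulative helper also assumes at least one cohort is already treated in every period it sums over.

```python
# Toy inputs (invented for illustration only).
p = {2: 0.5, 3: 0.25, 4: 0.25}          # P(G = g | ever treated)
att = {                                 # ATT(g, t) for t >= g
    (2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0,
    (3, 3): 2.0, (3, 4): 4.0,
    (4, 4): 4.0,
}


def theta_c(att, p, t):
    """Average ATT(g, t) over cohorts treated by period t, weighted by P(G = g | G <= t)."""
    treated = [g for g in p if g <= t]
    total = sum(p[g] for g in treated)
    return sum(p[g] / total * att[(g, t)] for g in treated)


def theta_c_cumu(att, p, t, first_period=2):
    """Running sum of the calendar-time effects from first_period through t."""
    return sum(theta_c(att, p, s) for s in range(first_period, t + 1))


per_period = {t: theta_c(att, p, t) for t in (2, 3, 4)}
cumulative = theta_c_cumu(att, p, 4)
print(per_period, cumulative)
```

Because each period's weights sum to one, the cumulative figure is a sum of averages rather than an average itself.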
Overall treatment-effect parameters
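Simple weighted average (by group size):
$$\theta_{W}^{O} = \frac{1}{\kappa} \sum_{g \in \mathcal{G}} \sum_{t=2}^{T} \mathbf{1}\{t \ge g\}\, ATT(g,t)\, P(G = g \mid G \le T),$$
where $\kappa = \sum_{g \in \mathcal{G}} \sum_{t=2}^{T} \mathbf{1}\{t \ge g\}\, P(G = g \mid G \le T)$ normalizes the weights to sum to one. Positive weights imply that $\theta_{W}^{O}$ cannot be negative when all effects are positive (unlike TWFE $\beta$). Drawback: it over-weights long-treated groups. The recommended overall ATT averages effects within each group first, then across groups:
$$\theta_{sel}^{O} = \sum_{g \in \mathcal{G}} \theta_{sel}(g)\, P(G = g \mid G \le T).$$
$\theta_{sel}^{O}$ is the average effect experienced by all units that ever participated, the natural multi-period analogue of the canonical 2x2 ATT. Overall event-study/calendar versions: $\theta_{es}^{O} = \frac{1}{T-1} \sum_{e=0}^{T-2} \theta_{es}(e)$ and $\theta_{c}^{O} = \frac{1}{T-1} \sum_{t=2}^{T} \theta_{c}(t)$ (3.12); balanced local version $\theta_{es}^{O,bal}(e') = \frac{1}{e'+1} \sum_{e=0}^{e'} \theta_{es}^{bal}(e; e')$ (3.13).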
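To make the over-weighting of long-treated groups concrete, the sketch below computes both overall parameters on the same invented inputs. The function names `theta_w_overall` and `theta_sel_overall` and the data in `att` and `p` are hypothetical.

```python
# Toy inputs (invented for illustration only).
T = 4                                   # last observed period
p = {2: 0.5, 3: 0.25, 4: 0.25}          # P(G = g | G <= T)
att = {                                 # ATT(g, t) for t >= g
    (2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0,
    (3, 3): 2.0, (3, 4): 4.0,
    (4, 4): 4.0,
}


def theta_w_overall(att, p, T):
    """Simple average: every post-treatment (g, t) cell is weighted by its cohort share."""
    kappa = sum(p[g] * (T - g + 1) for g in p)       # normalizer: sum of cell weights
    total = sum(p[g] * att[(g, t)] for g in p for t in range(g, T + 1))
    return total / kappa


def theta_sel_overall(att, p, T):
    """Group-first average: mean effect within each cohort, then cohort-share weights."""
    return sum(
        p[g] * sum(att[(g, t)] for t in range(g, T + 1)) / (T - g + 1)
        for g in p
    )


simple = theta_w_overall(att, p, T)
group_first = theta_sel_overall(att, p, T)
print(simple, group_first)
```

In this toy setup cohort 2 contributes three post-treatment cells, so it carries two thirds of the total weight under the simple scheme but only its population share of one half under the group-first scheme. The two overall numbers differ for that reason alone.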
Weight table and equality condition
Table 1 (p. 15) gives the explicit $w(g,t)$ for each aggregation parameter; all are non-negative. For $\theta_{c}^{cumu}(\tilde t)$ they sum to $\tilde t - 1$ (it is cumulative), otherwise to one. In general none of the overall parameters equal each other except in the special case where $ATT(g,t)$ is constant across all groups and times, in which case all of them (including TWFE $\beta$) coincide.
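A short check of the equality condition for the aggregation parameters, using only the fact that each non-cumulative scheme's weights are non-negative and sum to one. Suppose $ATT(g,t) = \tau$ for every post-treatment pair $(g,t)$, where $\tau$ is a symbol introduced here for the common effect. Then

$$\theta = \sum_{g \in \mathcal{G}} \sum_{t=2}^{T} w(g,t)\, ATT(g,t) = \tau \sum_{g \in \mathcal{G}} \sum_{t=2}^{T} w(g,t) = \tau,$$

so every such aggregation returns $\tau$ whatever weights it uses. The cumulative calendar-time parameter instead scales $\tau$ by the sum of its weights. This argument covers only the weighted-average parameters; the statement about TWFE $\beta$ rests on the source's separate discussion.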
Examples
Minimum wage: aggregations side-by-side (Table 3)
Under conditional parallel trends (DR), all clustered at the county level:
- Overall (vs. TWFE post-dummy -0.8%, statistically insignificant).
- Group-specific $\theta_{sel}(\tilde g)$: g=2004 ≈ -4.4%, g=2006 ≈ -2.9%, g=2007 ≈ -2.9%.
- Event study $\theta_{es}(e)$: e=0 ≈ -2.4%, e=1 ≈ -4.1%, e=2 ≈ -5.0%, e=3 ≈ -7.1%; effects grow with exposure (a dynamic effect that TWFE misses).
- Calendar-time $\theta_{c}(\tilde t)$: t=2004 ≈ -3.0%, t=2005 ≈ -2.5%, t=2006 ≈ -3.0%, t=2007 ≈ -4.9%.
- Balanced event study $\theta_{es}^{bal}(e; 1)$ (groups 2004 & 2006 only): e=0 ≈ -1.6%, e=1 ≈ -4.1%.
Connections
- Aggregates the building blocks from Group-Time Average Treatment Effects, identified via Doubly-Robust Estimands for ATT(g,t).
- Inference on these $\theta$'s (plug-in estimators plus multiplier bootstrap) is covered in Simultaneous Inference via Multiplier Bootstrap.
- Directly addresses the negative-weighting pitfall flagged in Difference-in-Differences with Multiple Time Periods - Overview.
See Also
- Goodman-Bacon (2019) — TWFE decomposition motivating positive-weight aggregation
- Sun & Abraham (2020) — cohort-specific (interaction-weighted) event study
- Estimands in Longitudinal Research — choosing the target estimand