Xu (2016) — Generalized Synthetic Control Method

Summary

Yiqing Xu proposes the Generalized Synthetic Control (GSC) method, which unifies difference-in-differences and the synthetic control method under a single interactive fixed effects (IFE) framework. GSC handles multiple treated units and variable treatment periods, produces frequentist uncertainty estimates (standard errors and confidence intervals) via parametric bootstrap, and embeds cross-validation for automatic model selection — addressing the key limitations of both DID (parallel trends) and SC (single treated unit, no inference).

Citation

Xu, Yiqing. 2017. “Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models.” Political Analysis 25 (1): 57–76. https://doi.org/10.1017/pan.2016.2

Overview

DID requires parallel trends: treated and control units would have followed the same trajectory in the absence of treatment. Canonical synthetic control is designed for a single treated unit and offers only permutation-based inference rather than standard errors and confidence intervals. The GSC method addresses both limitations by modeling unobserved time-varying confounders explicitly via an interactive fixed effects (IFE) model.
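For reference, the IFE model behind this description can be written out as follows. This is a reconstruction in commonly used notation, not a quotation from the paper; the symbol names ($D_{it}$, $\delta_{it}$, $f_t$, $\lambda_i$, $\mathcal{T}$, $N_{tr}$, $T_0$) are assumptions of this note.

```latex
% Outcome model with r unobserved factors
Y_{it} = \delta_{it} D_{it} + x_{it}'\beta + \lambda_i' f_t + \varepsilon_{it},
\qquad f_t \in \mathbb{R}^r,\ \lambda_i \in \mathbb{R}^r

% Additive two-way fixed effects as a special case (r = 2)
f_t = (1,\ \xi_t)', \quad \lambda_i = (\alpha_i,\ 1)'
\;\Longrightarrow\; \lambda_i' f_t = \alpha_i + \xi_t

% Estimand: average treatment effect on the treated in post-treatment period t
ATT_t = \frac{1}{N_{tr}} \sum_{i \in \mathcal{T}} \bigl[ Y_{it}(1) - Y_{it}(0) \bigr],
\qquad t > T_0
```

Here $D_{it}$ is the treatment indicator, $\mathcal{T}$ is the set of treated units, and $T_0$ is the number of pre-treatment periods.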

The key insight: the IFE model is estimated on the control group only, and each treated unit’s pre-treatment outcomes are then projected onto the estimated factor space to obtain its factor loadings. Combining the estimated factors and loadings imputes what the treated units’ outcomes would have been without treatment. DID is the special case where factor loadings are constant, so the interactive term reduces to additive fixed effects; synthetic control is the special case with one treated unit and no parametric model.
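The estimate-project-impute logic can be illustrated with a minimal sketch. It is deliberately simplified and is not the paper's full estimator: it assumes no covariates, no additive fixed effects, a balanced panel, and a common treatment date, and it extracts factors by plain SVD. The function name `gsc_impute` and its arguments are hypothetical.

```python
import numpy as np

def gsc_impute(Y_co, Y_tr, T0, r):
    """Simplified GSC-style imputation (illustrative only).

    Y_co : (T, N_co) outcomes of control units
    Y_tr : (T, N_tr) outcomes of treated units
    T0   : number of pre-treatment periods (treatment starts at row T0)
    r    : number of factors, assumed known here (requires r <= T0)
    """
    T = Y_co.shape[0]
    # Step 1: estimate factors from the control group only.
    U, _, _ = np.linalg.svd(Y_co, full_matrices=False)
    F = np.sqrt(T) * U[:, :r]            # (T, r), scaled so F'F / T = I_r
    # Step 2: estimate each treated unit's loadings by least squares,
    # using pre-treatment periods only.
    lam_tr, *_ = np.linalg.lstsq(F[:T0], Y_tr[:T0], rcond=None)  # (r, N_tr)
    # Step 3: impute untreated potential outcomes for treated units.
    Y0_hat = F @ lam_tr                  # (T, N_tr)
    # Period-by-period ATT over the post-treatment window.
    att_t = (Y_tr - Y0_hat)[T0:].mean(axis=1)
    return Y0_hat, att_t
```

With noise-free data generated from a low-rank factor structure and a constant treatment effect, this sketch recovers the effect up to numerical precision; with noisy data, its accuracy depends on the number of control units and pre-treatment periods.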

Paper Structure

| Section | Title | Key content |
|---|---|---|
| 1 | Introduction | DID parallel trends problem; IFE as solution; GSC as unification |
| 2 | Framework | Assumption 1 (IFE functional form); ATT estimand; Assumptions 2–5 |
| 2.1 | Identification assumptions | Strict exogeneity (Assumption 2); weak serial dependence |
| 3 | Estimation Strategy | 3-step GSC estimator; Algorithm 1 (cross-validation); Algorithm 2 (bootstrap) |
| 3.1 | Model selection | LOO cross-validation for number of factors |
| 3.2 | Inference | Parametric bootstrap procedure for the ATT estimates |
| 4 | Monte Carlo Evidence | Simulation: GSC vs DID, IFE, SC; Table 1 (bias, SD, RMSE, coverage) |
| 5 | Empirical Example | Election Day Registration (EDR) and voter turnout, 47 US states 1920–2012 |
| 6 | Conclusion | Caveats: few pre-treatment periods or few control units; diagnostics recommended |

Key Contributions

  1. Unification of DID and SC: The GSC framework encompasses DID (constant factor loadings) and SC (single treated unit) as special cases.

  2. Multiple treated units, variable treatment timing: The IFE model is estimated once on the control group; factor loadings for each treated unit are estimated separately in Step 2. No need to run separate SC optimizations for each treated unit.

  3. Frequentist uncertainty estimates: The parametric bootstrap (Algorithm 2) produces standard errors and confidence intervals, filling the gap left by the permutation-only inference of canonical SC.

  4. Automatic model selection: Cross-validation (Algorithm 1) selects the number of factors without researcher discretion, guarding against specification search.

  5. Performance vs. alternatives: Monte Carlo shows GSC has less bias than DID when unobserved time-varying confounders are present, and less bias than IFE when the treatment effect is heterogeneous across units; it is also more efficient than the original SC.
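The cross-validation idea in contribution 4 can be sketched as a leave-one-period-out loop over the treated units' pre-treatment data. This is a simplified illustration rather than the paper's Algorithm 1 verbatim: it assumes no covariates, a common treatment date, and SVD-based factor estimates from the control panel. The name `choose_num_factors` and its arguments are hypothetical.

```python
import numpy as np

def choose_num_factors(Y_co, Y_tr, T0, r_max):
    """Pick the number of factors by leave-one-period-out MSPE (illustrative).

    Y_co : (T, N_co) control outcomes; Y_tr : (T, N_tr) treated outcomes.
    T0   : number of pre-treatment periods; requires r_max <= T0 - 1.
    Returns (best_r, {r: mspe}).
    """
    T = Y_co.shape[0]
    # Factors are estimated once from the control group.
    U, _, _ = np.linalg.svd(Y_co, full_matrices=False)
    Y_pre = Y_tr[:T0]
    mspe = {}
    for r in range(1, r_max + 1):
        F_pre = np.sqrt(T) * U[:T0, :r]
        errs = []
        for s in range(T0):
            keep = np.arange(T0) != s        # hold out pre-treatment period s
            lam, *_ = np.linalg.lstsq(F_pre[keep], Y_pre[keep], rcond=None)
            errs.append(Y_pre[s] - F_pre[s] @ lam)   # out-of-sample error
        mspe[r] = float(np.mean(np.square(errs)))
    best_r = min(mspe, key=mspe.get)
    return best_r, mspe
```

Because the held-out period is never used to fit the loadings, adding factors that only capture noise tends to raise the prediction error, which is what lets the MSPE discriminate between candidate values of `r`.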

Relationship to Existing Notes

| Concept | Vault Note | Relationship |
|---|---|---|
| Basic SC estimator | Synthetic Control | GSC generalizes: multiple units, parametric bootstrap |
| Linear factor model | Synthetic Control Bias Theory | Same model; GSC estimates it directly on controls |
| Bias bound | Synthetic Control Bias Theory | GSC bias → 0 as the numbers of control units and pre-treatment periods grow (Remark 2) |
| DID/parallel trends | Differences-in-Differences | DID is a special case of the model in Assumption 1 |
| Multiple treated units | Synthetic Control Extensions | GSC is the primary multi-unit extension (vs. penalized SC) |
| Bayesian IFE/TSCS | Bayesian Difference in Differences | Bayesian analog; Pang (2014) cited as a complement |

Caveats (from Conclusion)

  1. Small samples: Bias increases when the number of pre-treatment periods or the number of control units is small, due to imprecise estimation of factor loadings (“incidental parameters” problem)
  2. Common support requirement: If treated and control units do not share common support in factor loadings, GSC may extrapolate — diagnostics essential (plot factor loadings as in Figure 3b)
  3. Limitations vs. complex DGPs: Cannot accommodate dynamic treatment–outcome relationships, structural breaks, multiple treatment intensities, or random coefficients for observed covariates
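For the common support caveat, a crude numeric companion to plotting the loadings is to flag treated units whose estimated loadings fall outside the range spanned by the controls. This per-factor range check is a heuristic chosen for this note, not a diagnostic taken from the paper, and the function name is hypothetical.

```python
import numpy as np

def loadings_outside_control_range(lam_co, lam_tr):
    """Flag treated-unit loadings outside the controls' per-factor range.

    lam_co : (N_co, r) estimated loadings of control units
    lam_tr : (N_tr, r) estimated loadings of treated units
    Returns a boolean (N_tr, r) array; True marks possible extrapolation.
    """
    lo = lam_co.min(axis=0)
    hi = lam_co.max(axis=0)
    return (lam_tr < lo) | (lam_tr > hi)
```

A flagged entry only signals that the imputed counterfactual for that unit relies on extrapolation along that factor; it says nothing on its own about how large the resulting bias is.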

Notes Generated from This Paper

See Also