Synthetic Control Requirements

Summary

Synthetic controls are appropriate tools for causal inference only when specific contextual and data requirements are met. Abadie (2021) identifies five contextual requirements and three data requirements that researchers should verify before applying the method. When these conditions fail, the article describes how to adapt the design or why the method should be avoided entirely.

Overview

The interpretability of synthetic controls is their greatest strength — the counterfactual is explicit, sparse, and subject to domain scrutiny. But this transparency also reveals failures that are hidden in regression-based methods. A poorly fitted synthetic control, visible in the pre-treatment period, signals that the counterfactual is not credible. The requirements in this note describe the conditions under which the synthetic control can produce credible estimates.

Why Use Synthetic Controls? (Advantages over Regression)

Before the requirements, it is useful to understand what synthetic controls offer:

Property	Synthetic Control	Linear Regression
Extrapolation	Precluded (convex combination)	Allowed (unconstrained weights)
Transparency	Explicit: named donor units + weights	Opaque: dense, often negative weights
Sparsity	Yes: at most $k$ nonzero weights, where $k$ is the number of predictors	No
Pre-analysis plan	Weights registerable pre-outcomes	Cannot preregister
Specification search safeguard	Yes — weights fixed before outcomes	No
Inference	Permutation (exact, small $J$)	Asymptotic

The safeguard against specification searches is important: because synthetic control weights are computed from pre-intervention data only, all design decisions (donor pool, predictors) can be made and locked in before post-treatment outcomes are observed. This mimics the pre-analysis plan of a randomized trial.

Contextual Requirements

1. Size of Effect and Volatility

Requirement 1: Effect Must Be Large Relative to Outcome Volatility

Synthetic control inference detects effects that are extreme relative to the distribution of placebo effects. If the true treatment effect is small or the outcome variable is highly volatile, the effect will be indistinguishable from the placebo distribution.

The relevant quantity is the signal-to-noise ratio $|τ_{1t}| / \mathrm{SD}(ε_{jt})$. High volatility from unit-specific transitory shocks $ε_{jt}$ cannot be eliminated by synthetic control matching (only the common-factor component $λ_{t} μ_{j}$ is controlled).
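As a rough illustration of this ratio, the sketch below assumes that the scale of the transitory shocks can be proxied by the standard deviation of the pre-treatment gaps between the treated unit and its synthetic control. The function name `signal_to_noise` and the numbers are hypothetical and are not taken from Abadie (2021).

```python
from statistics import mean, stdev


def signal_to_noise(pre_gaps, post_gaps):
    """Mean absolute post-treatment gap divided by the SD of pre-treatment gaps.

    Assumption: pre-treatment gaps (treated minus synthetic) stand in for the
    transitory shocks, so their sample SD proxies SD(eps_jt).
    """
    noise = stdev(pre_gaps)
    if noise == 0:
        raise ValueError("pre-treatment gaps have zero variance")
    return mean(abs(g) for g in post_gaps) / noise


# Hypothetical gap series: small, mean-zero noise before treatment,
# a persistent negative gap afterwards.
pre_gaps = [0.5, -0.5, 0.5, -0.5]
post_gaps = [-2.0, -3.0, -4.0]
ratio = signal_to_noise(pre_gaps, post_gaps)
```

A ratio near or below one suggests that the estimated effect would be hard to separate from placebo gaps of similar volatility.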

Practical implication: When substantial volatility is present in the outcome, Abadie (2021) suggests removing it via filtering (e.g., seasonal adjustment, HP filter) from both the treated unit and donor pool before estimation.
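One simple way to apply such a filter is sketched below, assuming a fixed additive seasonal pattern. The seasonal component is estimated from pre-treatment observations only (a design choice made here so that post-treatment outcomes do not leak into the filter) and the same procedure is applied to the treated unit and every donor. The helper `deseasonalize` and the quarterly panel are hypothetical.

```python
from statistics import mean


def deseasonalize(series, period, t0):
    """Subtract a fixed seasonal pattern estimated from the first t0 observations."""
    pre = series[:t0]
    overall = mean(pre)
    # Average deviation from the pre-treatment mean at each seasonal position.
    seasonal = [mean(pre[s::period]) - overall for s in range(period)]
    return [y - seasonal[t % period] for t, y in enumerate(series)]


# Hypothetical quarterly panel with eight pre-treatment periods.
panel = {
    "treated": [10.0, 14.0, 10.0, 6.0, 10.0, 14.0, 10.0, 6.0, 8.0, 12.0],
    "donor_a": [20.0, 22.0, 20.0, 18.0, 20.0, 22.0, 20.0, 18.0, 20.0, 22.0],
}
t0 = 8
adjusted = {name: deseasonalize(y, period=4, t0=t0) for name, y in panel.items()}
```

After adjustment the treated series is flat at 10 before treatment and drops to 8 afterwards, while the donor series is flat throughout, so the seasonal swings no longer mask the change.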

2. Availability of a Comparison Group

Requirement 2: Suitable Donor Pool Must Exist

The donor pool must contain units that:

Were not affected by the intervention (not exposed to treatment, and not subject to spillovers from the treated unit)

Have similar characteristics to the treated unit on the predictors $Z_{j}$ and $μ_{j}$

Did not adopt similar interventions during the study period

Units that experienced large idiosyncratic shocks to the outcome during the study period, shocks that would not have affected the treated unit in the absence of the intervention, should be excluded. Units from a different structural regime than the treated unit should also be excluded.

Contaminating the donor pool with units affected by spillovers from the treatment biases the synthetic control. For example, if the intervention benefits neighboring regions, those regions should be excluded from the donor pool (otherwise the synthetic control would overestimate the counterfactual, understating the treatment effect).

3. No Anticipation

Definition: No Anticipation Assumption

The potential outcomes $Y_{j t}^{N}$ and $Y_{j t}^{I}$ in the setting of subsection 3.1 are defined only in terms of the treatment status for unit $j$ at time $t$. This is the stable unit treatment value assumption (SUTVA) applied to time: the outcome is invariant to the history of treatment status and there are no anticipation effects.

Violation: If economic agents anticipate the intervention and adjust their behavior before $T_{0}$, the pre-treatment data already contain part of the treatment effect. The synthetic control is then fitted to contaminated outcomes, and the portion of the effect that occurs before $T_{0}$ is missed.

Remedy: If anticipation is present, backdate the intervention to a period before anticipation effects can plausibly occur (see the definition of backdating). Backdating does not mechanically bias the estimator: the weights are fitted only on data before the backdated date, so the estimated gap is free to emerge at whatever point the effect actually begins, whether before or after the formal intervention date.
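The backdating logic can be sketched as follows, assuming the weights have already been estimated using only periods before the backdated date. The helpers `gap_path` and `split_windows`, the weights, and the data are hypothetical.

```python
def gap_path(treated, donors, weights):
    """Treated outcome minus the weighted donor average, period by period."""
    return [
        y1 - sum(w * d[t] for w, d in zip(weights, donors))
        for t, y1 in enumerate(treated)
    ]


def split_windows(gaps, backdated_t0, formal_t0):
    """Split gaps into fitting, holdout (backdated to formal date), and post windows."""
    return gaps[:backdated_t0], gaps[backdated_t0:formal_t0], gaps[formal_t0:]


# Hypothetical data: the formal intervention starts at period 4,
# but the design is backdated to period 2.
treated = [5.0, 6.0, 7.0, 7.5, 8.0, 8.0]
donors = [
    [4.0, 5.0, 6.0, 7.0, 8.0, 9.0],
    [6.0, 7.0, 8.0, 9.0, 10.0, 11.0],
]
weights = [0.5, 0.5]  # assumed to be fitted on periods 0 and 1 only

fit, holdout, post = split_windows(gap_path(treated, donors, weights), 2, 4)
```

In this toy example the gap is zero in the fitting window, turns negative in the last holdout period (consistent with anticipation), and grows after the formal date.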

4. No Interference (SUTVA)

Definition: No Interference (SUTVA)

Unit $j$'s outcome $Y_{j t}$ depends only on unit $j$'s own treatment status, not on the treatment status of other units:
$Y_{j t} = (1 - D_{j t}) Y_{j t}^{N} + D_{j t} Y_{j t}^{I}$
where $D_{j t} \in \{0, 1\}$ is unit $j$'s treatment indicator. Equivalently, there are no spillover effects from treated to untreated units.

Violation and remedy: If spillover effects are plausible (e.g., neighboring regions benefit from or are harmed by the intervention), exclude potentially affected units from the donor pool. This creates a tension with Requirement 2 (needing a large, representative donor pool).

When spillovers are expected but cannot be excluded, the synthetic control estimate provides a lower bound on the treatment effect magnitude (if the spillover benefits the donor units, the synthetic counterfactual is inflated, understating the true effect).
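The direction of this bias can be seen in a short derivation. Suppose, purely for illustration, that each donor's observed post-treatment outcome is $Y_{jt} = Y_{jt}^{N} + s_{jt}$, where $s_{jt}$ is a hypothetical spillover term, and that the weights reproduce the treated unit's untreated outcome exactly, $Y_{1t}^{N} = \sum_{j} w_{j}^{*} Y_{jt}^{N}$. Then, in post-treatment periods:

$$
\hat{\tau}_{1t}
= Y_{1t} - \sum_{j} w_{j}^{*} Y_{jt}
= \left( Y_{1t}^{N} + \tau_{1t} \right) - \sum_{j} w_{j}^{*} \left( Y_{jt}^{N} + s_{jt} \right)
= \tau_{1t} - \sum_{j} w_{j}^{*} s_{jt}
$$

Because the weights are nonnegative, $s_{jt} \geq 0$ implies $\hat{\tau}_{1t} \leq \tau_{1t}$, so a positive effect is understated. If the spillovers instead harm the donors ($s_{jt} \leq 0$), the inequality reverses and the estimate overstates the effect.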

5. Convex Hull Condition

Requirement 5: Treated Unit Must Be Inside (or Near) the Convex Hull

The sparsity theorem shows that synthetic controls are projections of $X_{1}$ onto the convex hull of the columns of $X_{0}$. When $X_{1}$ is far outside the convex hull, the synthetic control must average over donors whose characteristics are substantially different from the treated unit.

Consequence: Bias from a large $X_{1} - X_{0} W^{*}$ discrepancy can dominate the estimate. Abadie, Diamond, and Hainmueller (2010, 2015) advise against using synthetic controls when $X_{1} \not\approx X_{0} W^{*}$.

Practical check: Compute $X_{1} - X_{0} W^{*}$ (the discrepancy between the treated unit’s predictors and the synthetic control’s predictors). Table 1 in Abadie (2021) illustrates this for the German reunification example: West Germany’s predictors are closely matched by the synthetic control across all six predictors, validating the approach.
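A minimal sketch of this check follows, assuming the predictors have already been rescaled to comparable units and that all predictors are weighted equally (the $V$ weighting is ignored). The solver is a plain Frank-Wolfe iteration chosen for brevity, not the routine used in the cited papers. The names `simplex_weights` and `discrepancy` and the toy data are hypothetical.

```python
def simplex_weights(x1, donors, iters=2000):
    """Frank-Wolfe for min_w ||x1 - sum_j w_j * donors[j]||^2 with w on the simplex."""
    n, k = len(donors), len(x1)
    w = [1.0 / n] * n
    for _ in range(iters):
        fit = [sum(w[j] * donors[j][h] for j in range(n)) for h in range(k)]
        resid = [a - b for a, b in zip(x1, fit)]
        # The gradient for donor j is -2 * <donor_j, resid>, so the best vertex
        # is the donor with the largest inner product with the residual.
        scores = [sum(d[h] * resid[h] for h in range(k)) for d in donors]
        best = max(range(n), key=scores.__getitem__)
        direction = [donors[best][h] - fit[h] for h in range(k)]
        denom = sum(d * d for d in direction)
        if denom == 0:
            break
        # Exact line search for the quadratic objective, clipped to [0, 1].
        step = sum(r * d for r, d in zip(resid, direction)) / denom
        step = max(0.0, min(1.0, step))
        if step == 0.0:
            break
        w = [(1 - step) * wj for wj in w]
        w[best] += step
    return w


def discrepancy(x1, donors, w):
    """X_1 - X_0 W, one entry per predictor."""
    return [
        a - sum(wj * d[h] for wj, d in zip(w, donors))
        for h, a in enumerate(x1)
    ]


donors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # three donors, two predictors

inside = [0.25, 0.25]  # lies inside the convex hull of the donors
w_in = simplex_weights(inside, donors)
gap_in = discrepancy(inside, donors, w_in)

outside = [2.0, 2.0]  # lies far outside the convex hull
w_out = simplex_weights(outside, donors)
gap_out = discrepancy(outside, donors, w_out)
```

For the inside point the discrepancy is essentially zero. For the outside point the closest convex combination still misses each predictor by about 1.5, which is the kind of gap that argues against using the method.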

Data Requirements

1. Aggregate Data on Predictors and Outcomes

The synthetic control method requires data on both the outcome variable and predictors $Z_{j}$ for both the treated unit and all donor units. These are often aggregate statistics (state-level, country-level) reported by government agencies, multilateral organizations, or private entities.

When micro-data exist, they can be aggregated. For example, Card (1990) uses CPS micro-data to construct aggregate labor market outcomes for Miami and comparison cities in the Mariel Boatlift study.

2. Sufficient Pre-Intervention Information

Data Requirement: Long Pre-Treatment Window

The bias bound is inversely proportional to the number of pre-intervention periods, $T_{0}$. The synthetic control must therefore be able to track the trajectory of the outcome variable for the treated unit over an extended period before the intervention.

Rule of thumb: The more volatile the outcome (large $ε_{j t}$ ), the longer the pre-treatment window needed to achieve a close match. Short pre-treatment windows combined with volatile outcomes are a recipe for spurious results.

Structural stability caveat: A long $T_{0}$ may span structural breaks that change the data-generating process. When this is a concern, up-weighting the most recent pre-treatment periods via $v_{h}$ (the V matrix) can alleviate instability concerns.
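To illustrate how the $v_{h}$ weights change the fit criterion, the sketch below treats each pre-treatment outcome as one predictor and computes the weighted squared discrepancy $\sum_{h} v_{h} (X_{h1} - \sum_{j} w_{j} X_{hj})^{2}$. The geometric `recency_weights` scheme is a hypothetical choice made for this example and is not prescribed by Abadie (2021).

```python
def weighted_discrepancy(x1, synthetic, v):
    """Sum over predictors h of v_h * (x1_h - synthetic_h)^2."""
    return sum(vh * (a - b) ** 2 for vh, a, b in zip(v, x1, synthetic))


def recency_weights(n_periods, decay=0.5):
    """Hypothetical geometric scheme: the latest period gets the largest weight."""
    raw = [decay ** (n_periods - 1 - t) for t in range(n_periods)]
    total = sum(raw)
    return [r / total for r in raw]


v = recency_weights(3)       # approximately [1/7, 2/7, 4/7]
treated_pre = [1.0, 2.0, 3.0]

# Two candidate synthetic controls with the same unweighted error:
early_miss = weighted_discrepancy(treated_pre, [2.0, 2.0, 3.0], v)
late_miss = weighted_discrepancy(treated_pre, [1.0, 2.0, 4.0], v)
```

Under these weights a miss in the most recent period is penalized four times as heavily as the same miss in the earliest period, so the optimizer would favor donors that track the treated unit after any earlier structural break.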

3. Sufficient Post-Intervention Information

A sufficient post-treatment window is needed to:

Detect effects that accumulate gradually over time (e.g., human capital effects, institutional changes)
Avoid false negatives from effects that take time to materialize
Permit the placebo distribution to be estimated with reasonable precision

When only short post-treatment windows are available, surrogate outcomes or leading indicators of the outcome of interest may be used.

When NOT to Use Synthetic Controls

Abadie (2021, Section 9) emphasizes that mechanical application without regard for these requirements produces misleading results. Do not use synthetic controls when:

The treated unit cannot be approximated by a convex combination of donor units ($X_{1} \not\approx X_{0} W^{*}$)
The donor pool has too few suitable units ($J < 5$) for permutation inference to be meaningful
The intervention is not aggregate-level (for individual-level data, see Differences-in-Differences or Regression Discontinuity Designs)
The effect is expected to be small relative to outcome volatility and the post-treatment window is short
Anticipation effects contaminate the pre-treatment period and backdating is not plausible
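The donor-pool size condition can be made concrete with a sketch of the permutation test based on the ratio of post-treatment to pre-treatment RMSPE. It assumes that placebo gaps have already been computed by applying the method to each donor in turn. The function names and the gap series are hypothetical.

```python
from math import sqrt


def rmspe(gaps):
    """Root mean squared prediction error of a gap series."""
    return sqrt(sum(g * g for g in gaps) / len(gaps))


def permutation_p_value(gaps_by_unit, t0, treated="treated"):
    """Share of units whose post/pre RMSPE ratio is at least the treated unit's.

    Assumes every unit has a nonzero pre-treatment RMSPE.
    """
    ratios = {u: rmspe(g[t0:]) / rmspe(g[:t0]) for u, g in gaps_by_unit.items()}
    extreme = sum(1 for r in ratios.values() if r >= ratios[treated])
    return extreme / len(ratios)


# Hypothetical gaps (unit minus its own synthetic control), two pre-treatment
# periods followed by two post-treatment periods, with J = 4 donors.
gaps_by_unit = {
    "treated": [1.0, -1.0, 4.0, 4.0],
    "donor_a": [1.0, -1.0, 1.0, -1.0],
    "donor_b": [2.0, -2.0, 2.0, 2.0],
    "donor_c": [1.0, 1.0, 2.0, -2.0],
    "donor_d": [1.0, -1.0, 0.0, 0.0],
}
p = permutation_p_value(gaps_by_unit, t0=2)
```

Here the treated unit has the most extreme ratio, yet the p-value is 0.2. With $J$ donors the smallest attainable p-value is $1/(J+1)$, so a pool of four donors cannot reject at conventional levels no matter how large the effect is.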

Connections

Differences-in-Differences: DiD requires parallel trends (the special case of the linear factor model in which the common factors $λ_{t}$ are constant over time); SC requires the convex hull condition instead. Both require no anticipation and no interference.
The Selection Problem: SC addresses selection on unobservables whose effects vary over time (the unit-specific loadings $μ_{j}$ interacted with the common factors $λ_{t}$), going beyond what regression and even DiD can handle
Instrumental Variables: An alternative when the donor pool is inadequate; IV exploits exogenous variation rather than constructing a counterfactual unit
Synthetic Control Bias Theory: The formal theory underlying Requirements 3, 4, and 5
