Synthetic Control Bias Theory

Summary

The formal justification for synthetic controls rests on the linear factor model, a generalization of difference-in-differences in which the effects of unobserved confounders may vary over time. Under this model, Abadie, Diamond, and Hainmueller (2010) derive a bound on the bias of the synthetic control estimator that shrinks in inverse proportion to the number of pre-treatment periods $T_{0}$, provided the synthetic control closely tracks the treated unit’s pre-treatment trajectory. Variable selection (via the V matrix and cross-validation) is the mechanism that enforces this fit.

Overview

The basic synthetic control estimator (see Synthetic Control) is intuitive: find a weighted combination of donor units that matches the treated unit before the intervention, then attribute post-treatment divergence to the treatment. But why is this a valid identification strategy? The answer is the linear factor model.

The key insight: a synthetic control that reproduces the treated unit’s pre-treatment outcomes implicitly matches on the unobserved factor loadings $μ_{j}$, which are exactly the confounders that would bias a regression estimator. The more pre-treatment periods available, the more constraints the matching imposes, and the better the identification.

The Linear Factor Model

Definition: Linear Factor Model (Abadie et al. 2010, Eq. 10)

The potential outcome without treatment for unit $j$ at time $t$ follows:
$Y_{j t}^{N} = δ_{t} + θ_{t} Z_{j} + λ_{t} μ_{j} + ε_{j t}$
where:

$δ_{t}$ = unobserved common factor with constant loadings across units (a common time trend)

$Z_{j}$ = observed covariates (time-invariant or unaffected by treatment), with time-varying coefficients $θ_{t}$

$μ_{j}$ = vector of unobserved unit-specific factor loadings

$λ_{t}$ = vector of unobserved common factors, which may vary over time

$ε_{j t}$ = zero-mean idiosyncratic shocks

This model generalizes the standard panel data fixed-effects model. Restricting $λ_{t}$ to be constant over time ($λ_{t} = λ$) yields the difference-in-differences / fixed-effects model, in which $λ μ_{j}$ acts as a unit fixed effect. The linear factor model lets $λ_{t}$ change over time, so the DiD model with its “parallel trends” assumption is a special case.

Connection to DiD

DiD assumes $λ_{t} = λ$ (common factors constant over time), so the effect of the unobserved confounders on each unit’s outcome is constant over time. The linear factor model relaxes this: each unit $j$ has its own loading $μ_{j}$ on common factors $λ_{t}$ that may drift over time. Synthetic control handles this by matching on the trajectory, not just the level.
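To make the contrast with DiD concrete, here is a minimal simulation sketch of the model above. It assumes a single unobserved factor, a scalar covariate, and no noise; the helper name `simulate_untreated` and all numerical values are illustrative assumptions, not taken from Abadie et al. (2010).

```python
import numpy as np

def simulate_untreated(delta, theta, Z, lam, mu, eps):
    """Return Y[t, j] = delta_t + theta_t * Z_j + lambda_t * mu_j + eps_jt.

    Sketch for one factor and one covariate; all inputs are
    illustrative assumptions.
    """
    return delta[:, None] + np.outer(theta, Z) + np.outer(lam, mu) + eps

T, n_units = 10, 3                    # hypothetical panel size
delta = np.linspace(0.0, 1.0, T)      # common time trend
theta = np.full(T, 0.5)               # covariate coefficient, held constant here
Z = np.array([1.0, 0.0, 2.0])         # observed covariate per unit
mu = np.array([1.0, 0.2, -0.5])       # unobserved unit-specific loadings
eps = np.zeros((T, n_units))          # noise switched off for clarity

lam_did = np.ones(T)                  # DiD special case: lambda_t constant
lam_drift = np.linspace(1.0, 2.0, T)  # factor model: lambda_t drifts

Y_did = simulate_untreated(delta, theta, Z, lam_did, mu, eps)
Y_drift = simulate_untreated(delta, theta, Z, lam_drift, mu, eps)

# Gap between unit 0 and unit 1 in each period.
gap_did = Y_did[:, 0] - Y_did[:, 1]        # constant: parallel trends hold
gap_drift = Y_drift[:, 0] - Y_drift[:, 1]  # moves with lambda_t
range_did = gap_did.max() - gap_did.min()
range_drift = gap_drift.max() - gap_drift.min()
print(range_did, range_drift)
```

With constant $λ_{t}$ the gap between the two units stays flat, so a DiD comparison would be valid; once $λ_{t}$ drifts, the gap changes with it and parallel trends fail.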

The Bias Bound

Theorem: Bias Bound for Synthetic Controls (Abadie, Diamond, and Hainmueller 2010)

Under the linear factor model, suppose a synthetic control with weights $W^{*} = (w_{2}^{*}, \dots, w_{J + 1}^{*})^{'}$ reproduces the characteristics of the treated unit:
$X_{1} \approx X_{0} W^{*}$
where $X_{1}$ and $X_{0}$ include pre-intervention outcomes and predictors. Then for $t > T_{0}$, the bias of $\hat{τ}_{1 t}$ is bounded by a function that is:

Inversely proportional to $T_{0}$ (the number of pre-treatment periods)

Increasing in $J$ (the size of the donor pool)

Controlled by the quality of fit: $X_{1} - X_{0} W^{*}$

Implication: A large $T_{0}$ alone does not guarantee low bias — the synthetic control must also achieve a close pre-treatment fit. Conversely, a close fit with small $T_{0}$ may still produce substantial bias if the idiosyncratic transitory shocks $ε_{j t}$ are large.

Key practical implications of the bias bound:

Long pre-treatment windows are valuable. More pre-treatment periods impose more matching constraints on $μ_{j}$ .
Imperfect fit is a warning sign. If the fit is poor, that is, $X_{1} \not\approx X_{0} W^{*}$, Abadie, Diamond, and Hainmueller (2010) advise against using synthetic controls.
Large donor pools can increase bias. A large $J$ gives more flexibility to fit the pre-treatment data but may introduce interpolation biases between the treated unit and distant donor units.
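The role of the fit term can be seen by writing out the estimation error directly from the factor model. The following is a sketch of the standard algebra, assuming the weights sum to one and the donors are untreated, so that $Y_{j t} = Y_{j t}^{N}$ for $j \geq 2$:

$Y_{1 t}^{N} - \sum_{j = 2}^{J + 1} w_{j} Y_{j t} = θ_{t} \left( Z_{1} - \sum_{j = 2}^{J + 1} w_{j} Z_{j} \right) + λ_{t} \left( μ_{1} - \sum_{j = 2}^{J + 1} w_{j} μ_{j} \right) + \sum_{j = 2}^{J + 1} w_{j} (ε_{1 t} - ε_{j t})$

The common trend $δ_{t}$ cancels because the weights sum to one. The first term vanishes when the observed covariates are matched. The second vanishes only if the unobserved $μ_{j}$ are matched, which cannot be checked directly and is what a close fit over a long pre-treatment window is meant to deliver. The third term is a weighted average of zero-mean shocks.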

Sparsity: The Geometry of Synthetic Controls

One of synthetic control’s most important properties is sparsity — typically, only a small number of donor units receive nonzero weight.

Theorem: Sparsity of Synthetic Controls (Abadie 2021, Section 3.2)

When $X_{1}$ falls outside the convex hull of the columns of $X_{0}$ , and the columns of $X_{0}$ are in general quadratic position, the synthetic control is unique and sparse — with the number of nonzero weights bounded by $k$ (the number of predictors in $X_{1}$ ).

Geometric interpretation: The synthetic control is the projection of $X_{1}$ onto the convex hull of $X_{0}$ . Since $X_{1}$ is typically outside this hull (curse of dimensionality), the projection touches the hull at a face with at most $k$ vertices.

Sparsity is a feature, not a bug:

Interpretability: The synthetic counterfactual is a named weighted average of specific donor units (e.g., 42% Austria + 22% United States + …)
Transparency: The contribution of each donor unit is explicit, allowing subject-matter evaluation of the counterfactual’s plausibility
Contrast with regression: Regression weights are dense (all units contribute), unrestricted (can be negative), and allow extrapolation — obscuring potential biases
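The sparsity property can be checked numerically. The sketch below is an illustration under stated assumptions, not the procedure from Abadie (2021): it uses `scipy.optimize.nnls` and imposes the adding-up constraint only softly, by appending a heavily weighted row of ones and then renormalizing. The helper name `simplex_weights_nnls`, the penalty value, and the simulated donors are all hypothetical.

```python
import numpy as np
from scipy.optimize import nnls

def simplex_weights_nnls(X0, X1, penalty=1e3):
    """Approximate argmin ||X1 - X0 w|| subject to w >= 0, sum(w) = 1.

    Nonnegativity is exact (NNLS); the sum-to-one constraint is enforced
    softly through a heavily weighted extra row, then by renormalizing.
    """
    J = X0.shape[1]
    A = np.vstack([X0, penalty * np.ones((1, J))])
    b = np.append(X1, penalty)
    w, _ = nnls(A, b)
    return w / w.sum()

rng = np.random.default_rng(0)
k, J = 3, 30
X0 = rng.uniform(0.0, 1.0, size=(k, J))  # donors lie inside the unit cube
X1 = np.full(k, 2.0)                     # treated unit lies outside their hull

w = simplex_weights_nnls(X0, X1)
n_nonzero = int(np.count_nonzero(w))
print("donors with nonzero weight:", n_nonzero, "of", J)
```

Because the treated unit sits outside the convex hull of the donors, only a handful of the 30 donors should receive positive weight, in line with the bound in terms of $k$.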

The V Matrix and Variable Selection

The synthetic control optimization (Eq. 7 in Abadie 2021) requires choosing a $k \times k$ positive definite matrix $V = \mathrm{diag} (v_{1}, \dots, v_{k})$ that weights the relative importance of each predictor:

$\min_{W} ∥ X_{1} - X_{0} W ∥ = \left( \sum_{h = 1}^{k} v_{h} \left( X_{h 1} - \sum_{j = 2}^{J + 1} w_{j} X_{h j} \right)^{2} \right)^{1/2}$

Definition: Cross-Validation for V Matrix Selection

Split the pre-intervention periods $t = 1, \dots, T_{0}$ into a training period $t = 1, \dots, t_{0}$ and a validation period $t = t_{0} + 1, \dots, T_{0}$ (with $t_{0} = T_{0} /2$ as a default). Then:

For each candidate $V$ , compute weights $\tilde{W} (V)$ using training period data only

Evaluate the MSPE of the resulting synthetic control on the validation period:

$\sum_{t = t_{0} + 1}^{T_{0}} (Y_{1 t} - \tilde{w}_{2} (V) Y_{2 t} - \dots - \tilde{w}_{J + 1} (V) Y_{J + 1, t})^{2}$

Select $V^{*}$ that minimizes the validation-period MSPE

Use $W^{*} = W (V^{*})$ for estimation
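The steps above can be sketched in a few lines. This is a minimal illustration, assuming a small hand-picked list of candidate diagonal $V$ vectors and a simulated pre-intervention panel; the helper names `sc_weights` and `validation_mspe`, the candidate list, and the data are hypothetical, and SLSQP is just one convenient solver for the constrained problem.

```python
import numpy as np
from scipy.optimize import minimize

def sc_weights(X1, X0, v):
    """Weights minimizing sum_h v_h (X_h1 - sum_j w_j X_hj)^2 subject to
    w_j >= 0 and sum_j w_j = 1 (same minimizer as the V-weighted norm)."""
    J = X0.shape[1]

    def loss(w):
        r = X1 - X0 @ w
        return float(r @ (v * r))

    res = minimize(
        loss,
        np.full(J, 1.0 / J),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * J,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    )
    return res.x

def validation_mspe(w, Y, t0):
    """Mean squared prediction error over the validation rows Y[t0:]."""
    gap = Y[t0:, 0] - Y[t0:, 1:] @ w
    return float(np.mean(gap ** 2))

# Hypothetical pre-intervention panel: rows are periods, column 0 is treated.
rng = np.random.default_rng(1)
T0, J = 20, 8
t0 = T0 // 2
mu = rng.normal(size=J + 1)
lam = 1.0 + 0.1 * np.arange(T0)
Y = np.outer(lam, mu) + rng.normal(scale=0.1, size=(T0, J + 1))

# Predictors use training periods only: two outcome averages plus one
# pure-noise covariate that a sensible V should down-weight.
X = np.vstack([
    Y[: t0 // 2].mean(axis=0),
    Y[t0 // 2 : t0].mean(axis=0),
    rng.normal(size=J + 1),
])
X1, X0 = X[:, 0], X[:, 1:]

candidates = [
    np.array([1 / 3, 1 / 3, 1 / 3]),
    np.array([0.45, 0.45, 0.10]),
    np.array([0.10, 0.10, 0.80]),
]
scores = [validation_mspe(sc_weights(X1, X0, v), Y, t0) for v in candidates]
v_star = candidates[int(np.argmin(scores))]
w_tilde = sc_weights(X1, X0, v_star)  # training-period weights at the chosen V
print("selected v:", v_star, "validation MSPE:", min(scores))
```

In practice the search over $V$ is usually a continuous optimization rather than a short list; the list here only keeps the example small.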

Simple alternatives to cross-validation:

$v_{h} = 1/ \mathrm{Var} (X_{h 1}, \dots, X_{h, J + 1})$ : rescale all predictors to unit variance (equivalent to standardizing)
Equal weights $v_{h} = 1/ k$ : appropriate when predictors are on similar scales
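The inverse-variance rule is short enough to write out. A minimal sketch, assuming the predictors are stored as a $k \times (J + 1)$ array with the treated unit in the first column; the helper name `inverse_variance_v`, the use of the sample variance, and the normalization to sum to one are illustrative choices (rescaling all $v_{h}$ by a common constant does not change the minimizing weights).

```python
import numpy as np

def inverse_variance_v(X):
    """Return v_h = 1 / Var(X_h1, ..., X_h,J+1), normalized to sum to one.

    X is a k x (J+1) predictor array; the variance is taken across units.
    """
    v = 1.0 / X.var(axis=1, ddof=1)
    return v / v.sum()

# Two hypothetical predictors on very different scales.
X = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [100.0, 200.0, 300.0, 400.0],
])
v = inverse_variance_v(X)
weighted_var = v * X.var(axis=1, ddof=1)  # equal across predictors after weighting
print(v, weighted_var)
```

After weighting, both predictors contribute on the same scale, so the large-valued predictor no longer dominates the fit.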

Pre-Intervention Outcomes as Predictors

Including pre-intervention values of the outcome $Y_{j t}$ in $X_{1}$ and $X_{0}$ is often the single most important variable selection decision. Under the factor model, pre-treatment outcomes depend on the unobserved $μ_{j}$, so matching on them implicitly matches on $μ_{j}$, which makes them powerful predictors of post-treatment outcomes. However, including many individual time-period outcomes (rather than summary measures) can lead to overfitting during the training period. As a default, use aggregate summaries (means, subperiod averages).

Contrast: Synthetic Control vs. Regression

Property	Synthetic Control	Regression
Weight constraints	$w_{j} \geq 0$ , $\sum w_{j} = 1$	Unconstrained
Extrapolation	Precluded by convex combination	Allowed (negative weights)
Sparsity	Yes (bounded by $k$ )	No (dense weights)
Fit transparency	Explicit: $X_{1} - X_{0} W^{*}$	Hidden: regression forces $X_{0} W^{reg} = X_{1}$ exactly
Bias when units dissimilar	Visible in large $X_{1} - X_{0} W^{*}$	Hidden by extrapolation
Pre-analysis plan	Weights registerable before outcomes observed	Cannot preregister

The regression estimator forces a perfect fit of the covariates ( $\bar{X}_{0} W^{reg} = \bar{X}_{1}$ ) even when the untreated units are completely dissimilar to the treated unit, which is only possible through extrapolation. In the German reunification example, the regression weights in Table 3 of Abadie (2021) include negative values for four OECD countries, whereas the synthetic control weights in Table 2 are all nonnegative.
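The extrapolation behavior of regression can be reproduced on simulated data. The sketch below assumes the closed form $W^{reg} = \bar{X}_{0}^{'} (\bar{X}_{0} \bar{X}_{0}^{'})^{-1} \bar{X}_{1}$, where the bar denotes the predictor matrix with a row of ones added for the intercept, and assumes $\bar{X}_{0}$ has full row rank. The helper name `regression_weights` and the data are hypothetical and unrelated to the German reunification application.

```python
import numpy as np

def regression_weights(X1, X0):
    """W_reg = X0_bar' (X0_bar X0_bar')^{-1} X1_bar, where the bar adds a
    row of ones (intercept). Assumes X0_bar has full row rank."""
    J = X0.shape[1]
    X0_bar = np.vstack([np.ones((1, J)), X0])
    X1_bar = np.concatenate([[1.0], X1])
    return X0_bar.T @ np.linalg.solve(X0_bar @ X0_bar.T, X1_bar)

rng = np.random.default_rng(0)
k, J = 3, 30
X0 = rng.uniform(0.0, 1.0, size=(k, J))  # donors inside the unit cube
X1 = np.full(k, 2.0)                     # treated unit outside their convex hull

w_reg = regression_weights(X1, X0)
print("sum of weights:", w_reg.sum())
print("exact covariate fit:", np.allclose(X0 @ w_reg, X1))
print("number of negative weights:", int((w_reg < 0).sum()))
```

The weights sum to one and reproduce the treated unit's covariates exactly, but because the treated unit lies outside the donors' convex hull, that exact fit requires at least one negative weight.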
