Simultaneous Inference via Multiplier Bootstrap

Summary

The DR estimators of $ATT(g, t)$ are $\sqrt{n}$-asymptotically linear and jointly normal (Theorem 2), with a doubly-robust influence function. Rather than relying on plug-in standard errors, the paper uses a fast multiplier bootstrap (Theorem 3, Algorithm 1) that perturbs the influence function with random weights. The procedure needs no propensity re-estimation per draw, always retains observations from every group, and yields simultaneous (uniform) confidence bands that cover the entire path of $ATT(g, t)$'s with asymptotic probability $1 - \alpha$ (Corollary 1), avoiding multiple-testing distortions. The minimum-wage application shows this matters: the heterogeneity-robust approach finds a clear negative employment effect where TWFE finds none.

Overview

Inference is in the large-$n$, fixed-$T$ paradigm. Because researchers typically plot many $ATT(g, t)$'s (or $\theta(e)$, $\theta_{sel}(\tilde{g})$, etc.), pointwise bands would understate joint uncertainty and ignore multiple testing. The multiplier bootstrap produces bands that hold uniformly and account for the dependence across $(g, t)$ estimates, which makes them better suited to visualizing overall estimation uncertainty than pointwise intervals.
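A back-of-the-envelope calculation shows how quickly pointwise intervals lose joint coverage. The sketch below assumes, purely for illustration, that the $K$ estimates are independent; the actual $ATT(g, t)$ estimates are dependent, which is exactly what the bootstrap accounts for. The function name is hypothetical.

```python
def joint_coverage_independent(pointwise_level: float, k: int) -> float:
    """Probability that k independent intervals, each with the given
    pointwise coverage, cover all of their targets at the same time.
    Independence is an illustrative assumption, not a property of ATT(g,t)."""
    return pointwise_level ** k


# Joint coverage of nominal 95% pointwise intervals for a few values of K.
for k in (1, 7, 20):
    print(k, round(joint_coverage_independent(0.95, k), 3))
```

With seven estimates, as in the minimum-wage application, nominal 95% pointwise intervals would jointly cover only about 70% of the time under this independence assumption.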

Main Content

Theorem 2 — Asymptotic linearity & joint normality of DR estimators

Under Assumptions 1–4, 6–8, for each $g \in G_{\delta}$, $t \in \{2, \dots, T - \delta\}$ with $t \geq g - \delta$, and provided the DR consistency claim (4.5) holds (either the propensity working model OR the never-treated outcome-regression working model is correctly specified):
$\sqrt{n} \left( \widehat{ATT}_{dr}^{nev}(g, t; \delta) - ATT(g, t) \right) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \psi_{g, t, \delta}^{dr, nev}(W_{i}; \kappa_{g, t}^{*, nev}) + o_{p}(1) .$
Stacking over $(g, t)$, $\sqrt{n} \left( \widehat{ATT}_{t \geq (g - \delta)}^{dr, nev} - ATT_{t \geq (g - \delta)} \right) \xrightarrow{d} N(0, \Sigma)$ with $\Sigma = E[\Psi(W) \Psi(W)^{'}]$. The influence function $\psi^{dr, nev}$ has three pieces: a treated-weight term, a comparison-weight term, and an estimation-effect term $\psi^{est}$ that corrects for first-step nuisance estimation, so the limiting variance accounts for estimating $\hat{p}_{g}$ and $\hat{m}_{g, t, \delta}$. Assumptions 7–8 require the nuisances to be smooth parametric models with $\sqrt{n}$-asymptotically-linear estimators (logit/probit/(N)LS all qualify) plus weak integrability.
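As a point of reference for what $\Sigma = E[\Psi(W) \Psi(W)^{'}]$ means in practice, the sketch below forms its sample analogue from a matrix of estimated influence-function values. This is only an illustration: the array `psi` is synthetic, the function names are hypothetical, and the paper itself relies on the multiplier bootstrap rather than this plug-in calculation.

```python
import numpy as np


def plug_in_covariance(psi: np.ndarray) -> np.ndarray:
    """Sample analogue of Sigma = E[Psi Psi'].
    psi has shape (n, K): one row per unit, one column per (g, t) pair."""
    n = psi.shape[0]
    return psi.T @ psi / n


def pointwise_se(psi: np.ndarray) -> np.ndarray:
    """Pointwise standard errors implied by sqrt(n)-asymptotic normality:
    se_k = sqrt(Sigma_kk / n)."""
    n = psi.shape[0]
    return np.sqrt(np.diag(plug_in_covariance(psi)) / n)


# Synthetic stand-in for estimated influence-function values (assumption).
rng = np.random.default_rng(0)
psi = rng.normal(size=(500, 3))
psi -= psi.mean(axis=0)  # influence functions have mean zero

sigma_hat = plug_in_covariance(psi)
se = pointwise_se(psi)
print(sigma_hat.shape, se)
```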

Theorem 3 — Validity of the multiplier bootstrap

Under the assumptions of Theorem 2, define a bootstrap draw by perturbing the empirical influence function with iid mean-zero, unit-variance, finite-third-moment weights $\{V_{i}\}$ (e.g. Mammen (1993) two-point: $P(V = 1 - \kappa) = \kappa / \sqrt{5}$, $P(V = \kappa) = 1 - \kappa / \sqrt{5}$, $\kappa = (\sqrt{5} + 1) / 2$):
$\widehat{ATT}_{t \geq (g - \delta)}^{*, dr, nev} = \widehat{ATT}_{t \geq (g - \delta)}^{dr, nev} + E_{n} \left[ V \cdot \hat{\Psi}_{t \geq (g - \delta)}^{dr, nev}(W) \right] . \quad (4.6)$
Then $\sqrt{n} \left( \widehat{ATT}^{*, dr, nev} - \widehat{ATT}^{dr, nev} \right) \xrightarrow{d^{*}} N(0, \Sigma)$ conditional on the sample, and the same holds after applying any continuous functional $\Gamma(\cdot)$. Advantages: (1) trivial and fast: just reweight, no per-draw propensity re-estimation; (2) every group is always represented (the empirical bootstrap can drop a group); (3) simultaneous bands are easy; (4) extends to clustering by drawing cluster-level $V$'s (Remark 10).
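The weight distribution and a single draw of (4.6) can be sketched as follows. This is a minimal illustration, not the paper's code: `psi` stands for an $(n, K)$ array of estimated influence-function values, the function names and the `cluster_ids` argument are hypothetical, and clustering is handled here by giving every unit in a cluster the same weight.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)
KAPPA = (SQRT5 + 1.0) / 2.0


def mammen_weights(size, rng):
    """Two-point weights: V = 1 - kappa with prob. kappa / sqrt(5),
    V = kappa otherwise. They have mean zero and unit variance."""
    return np.where(rng.random(size) < KAPPA / SQRT5, 1.0 - KAPPA, KAPPA)


def bootstrap_draw(att_hat, psi, rng, cluster_ids=None):
    """One draw of ATT* = ATT_hat + E_n[V * Psi], as in (4.6)."""
    n = psi.shape[0]
    if cluster_ids is None:
        v = mammen_weights(n, rng)
    else:
        # One weight per cluster, shared by all of its members.
        _, inverse = np.unique(np.asarray(cluster_ids), return_inverse=True)
        v = mammen_weights(inverse.max() + 1, rng)[inverse]
    return att_hat + (v[:, None] * psi).mean(axis=0)


# Exact first two moments of the two-point distribution.
probs = np.array([KAPPA / SQRT5, 1.0 - KAPPA / SQRT5])
values = np.array([1.0 - KAPPA, KAPPA])
mean_v = probs @ values
var_v = probs @ values**2 - mean_v**2

# Synthetic example (all inputs are made up for illustration).
rng = np.random.default_rng(0)
psi = rng.normal(size=(200, 4))
att_hat = np.zeros(4)
draw = bootstrap_draw(att_hat, psi, rng)
clustered = bootstrap_draw(att_hat, psi, rng, cluster_ids=np.arange(200) // 10)
print(mean_v, var_v, draw.shape, clustered.shape)
```

Note that nothing is re-estimated inside `bootstrap_draw`: each draw is a reweighted average of fixed influence-function values, which is why the procedure is fast.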

Algorithm 1 — Studentized simultaneous confidence band

1. Draw $\{V_{i}\}$.
2. Compute $\widehat{ATT}^{*}$ via (4.6) and form $\hat{R}^{*}(g, t) = \sqrt{n} \left( \widehat{ATT}^{*}(g, t) - \widehat{ATT}(g, t) \right)$.
3. Repeat steps 1–2 $B$ times.
4. Estimate $\hat{\Sigma}^{1/2}(g, t) = (q_{0.75}(g, t) - q_{0.25}(g, t)) / (z_{0.75} - z_{0.25})$, i.e. the interquartile range of the $B$ draws normalized by the standard-normal IQR (a robust scale estimate).
5. For each draw, form $t\text{-}test = \max_{(g, t)} | \hat{R}^{*}(g, t) | \, \hat{\Sigma}(g, t)^{-1/2}$; let $\hat{c}_{1 - \alpha}$ be the empirical $(1 - \alpha)$-quantile of these $B$ values.
6. Band: $\hat{C}(g, t) = \left[ \widehat{ATT}_{dr}^{nev}(g, t; \delta) \pm \hat{c}_{1 - \alpha} \, \hat{\Sigma}(g, t)^{1/2} / \sqrt{n} \right]$.
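The steps of Algorithm 1 can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the paper's code: the inputs `att_hat` (length $K$) and `psi` (an $(n, K)$ array of estimated influence-function values) are taken as given, the function name is hypothetical, and the demo data are synthetic.

```python
import numpy as np
from statistics import NormalDist


def simultaneous_band(att_hat, psi, alpha=0.05, n_boot=999, seed=0):
    """Studentized sup-t confidence band from a multiplier bootstrap."""
    rng = np.random.default_rng(seed)
    n = psi.shape[0]
    sqrt5 = np.sqrt(5.0)
    kappa = (sqrt5 + 1.0) / 2.0

    # Steps 1-3: B sets of Mammen weights, one row per bootstrap draw.
    v = np.where(rng.random((n_boot, n)) < kappa / sqrt5, 1.0 - kappa, kappa)
    # R*(g,t) = sqrt(n) * (ATT* - ATT_hat) = sqrt(n) * E_n[V * Psi].
    r_star = (v @ psi) / np.sqrt(n)  # shape (B, K)

    # Step 4: robust scale from the bootstrap IQR, normalized by the normal IQR.
    q75, q25 = np.quantile(r_star, [0.75, 0.25], axis=0)
    std_normal = NormalDist()
    scale = (q75 - q25) / (std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25))

    # Step 5: max studentized deviation per draw, then its (1 - alpha) quantile.
    t_max = np.max(np.abs(r_star) / scale, axis=1)
    c_hat = np.quantile(t_max, 1.0 - alpha)

    # Step 6: band with a common critical value across all (g, t).
    half_width = c_hat * scale / np.sqrt(n)
    return att_hat - half_width, att_hat + half_width, c_hat


# Synthetic demo: 5 mildly correlated estimates (all values are made up).
rng = np.random.default_rng(1)
n, k = 400, 5
psi = rng.normal(size=(n, k)) + 0.5 * rng.normal(size=(n, 1))
psi -= psi.mean(axis=0)
att_hat = np.zeros(k)
lower, upper, c_hat = simultaneous_band(att_hat, psi)
print(c_hat)
```

The critical value `c_hat` replaces the pointwise normal quantile (about 1.96 at the 5% level) and is larger whenever more than one estimate is covered, which is what widens the band enough to hold for all $(g, t)$ at once.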

Corollary 1 — Uniform coverage

Under the assumptions of Theorem 2, for any $0 < \alpha < 1$,
$P \left( ATT(g, t) \in \hat{C}(g, t) \ \forall t \in \{2, \dots, T\}, g \in G_{\delta} : t \geq g - \delta \right) \to 1 - \alpha .$
The band covers all $ATT(g, t)$ simultaneously, with no multiple-testing inflation. (Remark 11: setting $\hat{\Sigma} \equiv 1$ gives a valid but wider constant-width band.)

Inference for summary parameters & pre-testing

Corollary 2 (Sec. 4.2): plug-in estimators $\hat{\theta} = \sum_{g} \sum_{t} \hat{w}(g, t) \, \widehat{ATT}_{dr}^{nev}(g, t; 0)$ of any aggregation $\theta$ are $\sqrt{n}$-asymptotically linear and normal, so the same bootstrap delivers (multiple-testing-robust) bands across event-times, groups, or calendar-times. Remark 12: pre-treatment "placebo" $ATT(g, t)$ for $t < g - \delta$ (which equal 0 under the assumptions) can be estimated by swapping the long difference $Y_{t} - Y_{g - \delta - 1}$ for the short difference $Y_{t} - Y_{t - 1}$; plotting them lets one assess the parallel-trends assumption.
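The plug-in aggregation in Corollary 2 can be sketched as a weighted sum of the group-time estimates and of their influence functions. The sketch below makes a simplifying assumption that should be read loudly: the weights are treated as known constants, whereas with estimated weights $\hat{w}(g, t)$ the influence function of $\hat{\theta}$ gains an extra term for the weight estimation, which is omitted here. All names and numbers are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def aggregate(att_hat, psi, weights):
    """theta_hat = sum_k w_k * ATT_k and its influence function psi @ w.
    ASSUMPTION: weights are fixed (no estimation effect from the weights)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return att_hat @ w, psi @ w


def bootstrap_se(psi_theta, n_boot=999, seed=0):
    """Multiplier-bootstrap standard error of a scalar aggregate, using
    Mammen weights and the IQR-based scale from Algorithm 1."""
    rng = np.random.default_rng(seed)
    n = psi_theta.size
    sqrt5 = np.sqrt(5.0)
    kappa = (sqrt5 + 1.0) / 2.0
    v = np.where(rng.random((n_boot, n)) < kappa / sqrt5, 1.0 - kappa, kappa)
    r_star = (v @ psi_theta) / np.sqrt(n)  # sqrt(n) * (theta* - theta_hat)
    q75, q25 = np.quantile(r_star, [0.75, 0.25])
    std_normal = NormalDist()
    scale = (q75 - q25) / (std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25))
    return scale / np.sqrt(n)


# Synthetic demo: three group-time effects weighted by made-up group sizes.
rng = np.random.default_rng(2)
psi = rng.normal(size=(600, 3))
psi -= psi.mean(axis=0)
att_hat = np.array([-0.02, -0.05, -0.07])
theta_hat, psi_theta = aggregate(att_hat, psi, weights=[100, 60, 40])
se_theta = bootstrap_se(psi_theta)
print(theta_hat, se_theta)
```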

Examples

Minimum wage on teen employment — full findings (Sec. 5)

Data/design. 2,284 counties, 29 states, 2001–2007 (federal MW flat at $5.15). Groups $g \in \{2004, 2006, 2007\}$ = year the state first raised its MW; never-raisers are the comparison group. Outcome: county teen employment (QWI). Covariates: region, population, % white, % HS grads, poverty rate, median income (2000 County Data Book). DR estimation = logit generalized propensity score (quadratics in population, median income) + OLS outcome regression; 1000 multiplier-bootstrap iterations clustered at the county level; runs in ~3 s.

Result: group-time effects (Fig. 1). Under unconditional parallel trends, 5 of 7 $ATT(g, t)$ are significantly negative (range -2.3% to -13.6%); the simple group-size-weighted average is -5.2%; overall $\theta_{sel}^{O}$ ≈ -3.9%. Under conditional parallel trends (DR, Panel b), 3 of 7 are significant, with a range of -0.9% (insignificant) to -7.1%; overall $\theta_{sel}^{O}$ ≈ -3.1%.

Result: the TWFE contrast. A TWFE post-treatment dummy with unit + region-year FE gives only -0.008 (insignificant) under the conditional design (and -0.037 unconditional), i.e. TWFE says "no/weak effect."

Interpretation. The heterogeneity-robust CS estimates find a clear, dynamically growing negative effect of the minimum wage on teen employment that TWFE conceals.

Caveats: some pre-treatment placebo $ATT(g, t)$ differ from zero (mild evidence against parallel trends), and the size of MW increases varies across states.

Key takeaway: even in a textbook-style application, the choice of estimation method changes the qualitative conclusion. ^min-wage-full

Connections

Provides inference for the estimators in Doubly-Robust Estimands for ATT(g,t) and the summaries in Aggregating Group-Time Effects.
Validates the empirical conclusions previewed in Difference-in-Differences with Multiple Time Periods - Overview.
Pre-treatment placebo plots test Identifying Assumptions for Staggered DiD.
