MCMC Inference for CausalImpact

Summary

Posterior inference in the BSTS model uses a Gibbs sampler that alternates between a data-augmentation step (sampling states $α$ given parameters $θ$, using the Kalman filter and fast mean smoother) and a parameter-simulation step (sampling $θ$ given states, with the spike-and-slab Gibbs draw for variable selection). The algorithm is linear in the number of time points and runs in under 30 seconds for a typical dataset (e.g. $m = 500$ time points, $J = 10$ covariates, 10,000 iterations).

Overview

The posterior $p (θ, α ∣ y)$ is not available in closed form due to the spike-and-slab prior. The Gibbs sampler alternates two steps:

Gibbs Sampler Steps

Step 1 — Data Augmentation (State Simulation)

Sample the full state sequence $α = (α_{1}, \dots, α_{m})$ given parameters $θ$ and data $y_{1 : n}$, that is, draw from

$$p (α ∣ y_{1 : n}, θ)$$

Algorithm: Uses the simulation smoother of Durbin & Koopman (2002), which improves on the earlier forward-filtering, backward-sampling algorithms (Carter & Kohn 1994, Frühwirth-Schnatter 1994).

Key property: Because $p (y_{1 : n}, α ∣ θ)$ is jointly multivariate Gaussian, the variance of $p (α ∣ y_{1 : n}, θ)$ does not depend on $y_{1 : n}$. The sampler:

Generates $(\tilde{y}_{1 : n}^{*}, α^{*}) \sim p (y_{1 : n}, α ∣ θ)$
Subtracts $E (α^{*} ∣ \tilde{y}_{1 : n}^{*}, θ)$ to get zero-mean noise with the correct variance
Adds $E (α ∣ y_{1 : n}, θ)$ (computed with the Kalman filter and fast mean smoother) to restore the correct mean
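The three steps above can be sketched for the simplest possible case. This is a minimal illustration, not the BSTS implementation: it assumes a scalar local level model ($y_{t} = μ_{t} + ε_{t}$, $μ_{t + 1} = μ_{t} + η_{t}$) with every observation present, and the function names, the initial-state prior `a1`, `p1`, and the use of a simple backward smoothing pass for the conditional mean are all choices made for this example.

```python
import numpy as np

def smoothed_mean(y, sigma_eps2, sigma_eta2, a1=0.0, p1=100.0):
    """E(mu | y) for a local level model: Kalman filter, then a backward pass."""
    n = len(y)
    filt_mean, filt_var, pred_var = np.empty(n), np.empty(n), np.empty(n)
    a, p = a1, p1  # one-step-ahead mean and variance of the level
    for t in range(n):
        pred_var[t] = p
        gain = p / (p + sigma_eps2)
        filt_mean[t] = a + gain * (y[t] - a)
        filt_var[t] = p * (1.0 - gain)
        a, p = filt_mean[t], filt_var[t] + sigma_eta2
    smooth = filt_mean.copy()
    for t in range(n - 2, -1, -1):
        j = filt_var[t] / pred_var[t + 1]
        smooth[t] = filt_mean[t] + j * (smooth[t + 1] - filt_mean[t])
    return smooth

def simulate_model(n, sigma_eps2, sigma_eta2, a1, p1, rng):
    """Draw (y*, mu*) jointly from the model, ignoring the observed data."""
    mu = np.empty(n)
    mu[0] = a1 + np.sqrt(p1) * rng.standard_normal()
    for t in range(1, n):
        mu[t] = mu[t - 1] + np.sqrt(sigma_eta2) * rng.standard_normal()
    y = mu + np.sqrt(sigma_eps2) * rng.standard_normal(n)
    return y, mu

def draw_states(y, sigma_eps2, sigma_eta2, rng, a1=0.0, p1=100.0):
    """One draw from p(mu | y, theta) by mean correction."""
    y_star, mu_star = simulate_model(len(y), sigma_eps2, sigma_eta2, a1, p1, rng)
    # Step 2: zero-mean noise whose variance is the conditional variance.
    noise = mu_star - smoothed_mean(y_star, sigma_eps2, sigma_eta2, a1, p1)
    # Step 3: add back the conditional mean given the real data.
    return smoothed_mean(y, sigma_eps2, sigma_eta2, a1, p1) + noise
```

Averaging many calls to `draw_states` should recover `smoothed_mean(y, ...)`, because the simulated noise has mean zero regardless of the data.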

Computational complexity: Linear in $m$ (total time points, pre + post), quadratic in $d$ (state dimension). For $m = 500$, $J = 10$ covariates, and 10,000 iterations, a run takes under 30 seconds.

Step 2 — Parameter Simulation

Sample $θ = (σ_{μ}^{2}, σ_{δ}^{2}, \dots, ϱ, β, σ_{ε}^{2})$ given states $α$ and data.

For variance parameters ($σ_{μ}^{2}$, $σ_{δ}^{2}$, etc.): Because the error terms $η_{t} = α_{t + 1} - T_{t} α_{t}$ are available given $α$, each precision (e.g. $1 / σ_{μ}^{2}$) has a Gamma posterior by conjugacy, so each variance is inverse-Gamma (from the inverse-Gamma prior in Eq. 2.7).
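As a small illustration of the conjugate update, the sketch below draws the level variance of a random-walk state. It assumes a prior of the form $1 / σ^{2} \sim \text{Gamma}(ν / 2, s / 2)$ (shape, rate); the names `prior_df` and `prior_ss` for $ν$ and $s$ are chosen for this example and are not taken from the source.

```python
import numpy as np

def draw_level_variance(levels, prior_df, prior_ss, rng):
    """Draw sigma^2 for a random-walk level, given one draw of the level path."""
    eta = np.diff(levels)  # state errors are observed once the states are known
    shape = 0.5 * (prior_df + eta.size)
    rate = 0.5 * (prior_ss + np.sum(eta ** 2))
    precision = rng.gamma(shape, 1.0 / rate)  # NumPy parameterises by scale = 1 / rate
    return 1.0 / precision
```

With a long state path the draws concentrate around the empirical variance of the increments, and the prior matters mainly when the path is short.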

For static regression coefficients ($ϱ$, $β_{ϱ}$, $σ_{ε}^{2}$): Gibbs sampling from the spike-and-slab posterior (see Spike-and-Slab Prior for Covariate Selection). Each $ϱ_{j}$ is drawn in turn from its full conditional given $ϱ_{- j}$, then $β_{ϱ} ∣ ϱ, σ_{ε}^{2}$ and $1/ σ_{ε}^{2} ∣ ϱ, β_{ϱ}$ are drawn using conjugate formulae.
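The two conjugate draws at the end of this step can be sketched as follows, with the inclusion indicators held fixed (the draw of each $ϱ_{j}$ is not shown). The sketch makes several assumptions that are not in the source: a zero-mean slab $β_{ϱ} ∣ σ_{ε}^{2} \sim N(0, σ_{ε}^{2} Ω^{-1})$ with $Ω$ equal to `prior_prec` times the identity, a $\text{Gamma}(ν / 2, s / 2)$ prior on $1 / σ_{ε}^{2}$, and a response `y` from which the state contribution has already been subtracted.

```python
import numpy as np

def draw_beta_and_sigma2(X, y, included, sigma2, prior_prec, nu, s, rng):
    """One pass of beta | rho, sigma2 and then sigma2 | rho, beta."""
    Xg = X[:, included]  # keep only the covariates with rho_j = 1
    k = Xg.shape[1]
    omega = prior_prec * np.eye(k)
    post_prec = Xg.T @ Xg + omega
    beta_mean = np.linalg.solve(post_prec, Xg.T @ y)
    # beta ~ N(beta_mean, sigma2 * post_prec^{-1}), sampled via a Cholesky factor
    chol = np.linalg.cholesky(post_prec)
    beta = beta_mean + np.sqrt(sigma2) * np.linalg.solve(chol.T, rng.standard_normal(k))
    # 1 / sigma2 ~ Gamma(shape, rate) given the residuals and the slab term
    resid = y - Xg @ beta
    shape = 0.5 * (nu + len(y) + k)
    rate = 0.5 * (s + resid @ resid + beta @ omega @ beta)
    sigma2_new = 1.0 / rng.gamma(shape, 1.0 / rate)
    return beta, sigma2_new
```

Here `included` is a boolean mask over the columns of `X`. In a full sampler it would change from one iteration to the next, so the length of `beta` would change with it.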

Posterior Predictive Simulation

After fitting the model on pre-intervention data $y_{1 : n}$, the key quantity is the posterior predictive distribution over counterfactuals:

$$p (\tilde{y}_{n + 1 : m} ∣ y_{1 : n}, x_{1 : m}) \tag{2.14}$$

This is the distribution of what would have happened had no intervention occurred. It:

Is conditioned only on pre-intervention outcomes and all control series (not on parameter estimates)
Integrates out $ϱ$, $β$ and $σ_{ε}^{2}$, so there is no commitment to any particular set of covariates
Is a joint distribution over all post-intervention time points (not a collection of marginals) — preserves serial correlation

Sampling: Each Gibbs iteration draws a complete counterfactual trajectory $\tilde{y}_{n + 1 : m}^{(τ)}$ using the Kalman filter run forward through the post-intervention period.
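To make the "joint distribution, not a collection of marginals" point concrete, here is a hedged sketch that simulates trajectories forward and then summarises them. It assumes a local level model with no regression component, and it takes one `(last_level, sigma_eta2, sigma_eps2)` triple per Gibbs iteration as given. All names are illustrative.

```python
import numpy as np

def simulate_counterfactual(last_level, sigma_eta2, sigma_eps2, horizon, rng):
    """One joint draw of y_{n+1:m} from a single posterior draw of the parameters."""
    steps = np.sqrt(sigma_eta2) * rng.standard_normal(horizon)
    levels = last_level + np.cumsum(steps)  # the level keeps evolving after time n
    return levels + np.sqrt(sigma_eps2) * rng.standard_normal(horizon)

def predictive_intervals(trajectories, prob=0.95):
    """Pointwise and cumulative intervals from an (iterations, horizon) array."""
    tail = 100.0 * (1.0 - prob) / 2.0
    bounds = [tail, 100.0 - tail]
    pointwise = np.percentile(trajectories, bounds, axis=0)
    # Summing within each trajectory keeps the serial correlation of that draw.
    cumulative = np.percentile(np.cumsum(trajectories, axis=1), bounds, axis=0)
    return pointwise, cumulative
```

Because the cumulative sum is taken inside each simulated trajectory, the cumulative interval reflects the correlation between time points. Adding up pointwise variances would understate it.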

Why Integrating Out Parameters Matters

Integrating out $ϱ$ and $β$ means no arbitrary covariate selection: the posterior predictive averages over all candidate subsets weighted by posterior probability
Integrating out $σ_{ε}^{2}$ means no commitment to point estimates of noise — full propagation of uncertainty
The result is wider but properly calibrated uncertainty intervals
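Why the intervals widen can be seen from a short derivation. This is a sketch in the notation above, where $θ$ stands for everything that is integrated out (including $ϱ$, for which the integral is read as a sum). The posterior predictive is a mixture over the posterior,

$$p (\tilde{y}_{n + 1 : m} ∣ y_{1 : n}, x_{1 : m}) = \int p (\tilde{y}_{n + 1 : m} ∣ α, θ, x_{1 : m}) \, p (α, θ ∣ y_{1 : n}, x_{1 : m}) \, dα \, dθ$$

and by the law of total variance, for each post-intervention point $t$,

$$\mathrm{Var} (\tilde{y}_{t} ∣ y_{1 : n}) = E [\mathrm{Var} (\tilde{y}_{t} ∣ α, θ)] + \mathrm{Var} (E [\tilde{y}_{t} ∣ α, θ])$$

The second term is the contribution of parameter and state uncertainty. A plug-in forecast that fixes $θ$ at a point estimate drops that term, which is why its intervals are typically narrower.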

Connections

Uses the Kalman filter/smoother from Local Linear Trend and Seasonality
Uses the Gibbs update for $ϱ$ from Spike-and-Slab Prior for Covariate Selection
Output feeds Counterfactual Impact Estimation
Same MCMC paradigm as MCMC Basics (Gibbs sampling)
