Bayesian Structural Time-Series Model

Summary

The Bayesian Structural Time-Series (BSTS) model is a state-space model that decomposes a time series into a local trend, seasonal components, and a regression on contemporaneous covariates (control series). Inference via MCMC produces the posterior predictive distribution over the counterfactual, from which causal impact is derived.

Overview

The model is defined by two equations: an observation equation linking observed data to a latent state, and a state equation governing state evolution.

Main Content

Definition: BSTS State-Space Form

The model is defined by:

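Observation equation:

  $y_t = Z_t^\top \alpha_t + \varepsilon_t, \quad \varepsilon_t \sim \mathcal{N}(0, \sigma_t^2)$

State equation:

  $\alpha_{t+1} = T_t \alpha_t + R_t \eta_t, \quad \eta_t \sim \mathcal{N}(0, Q_t)$

Dimensions:

  • $y_t$: scalar observed outcome at time $t$
  • $\alpha_t$: $d$-dimensional latent state vector
  • $Z_t$: $d$-dimensional output vector
  • $T_t$: $d \times d$ transition matrix
  • $R_t$: $d \times q$ control matrix
  • $\varepsilon_t$: scalar observation noise with variance $\sigma_t^2$
  • $\eta_t$: $q$-dimensional system error with $q \times q$ diffusion matrix $Q_t$, where $q \le d$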
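The two equations can be simulated directly, which is a useful sanity check on the dimensions listed above. The following is a minimal sketch, assuming NumPy and time-invariant $T$, $R$, $Q$ and $\sigma$ for brevity; the function name `simulate_ssm` and the local-level example (a single random-walk state) are illustrative choices, not part of the original model description.

```python
import numpy as np

def simulate_ssm(Z, T, R, Q, sigma, alpha0, rng):
    """Draw one path from
        y_t         = Z_t' alpha_t + eps_t,   eps_t ~ N(0, sigma^2)
        alpha_{t+1} = T alpha_t + R eta_t,    eta_t ~ N(0, Q)
    Z has shape (n, d); T is (d, d); R is (d, q); Q is (q, q).
    """
    Z = np.asarray(Z, dtype=float)
    T, R, Q = (np.asarray(m, dtype=float) for m in (T, R, Q))
    n, d = Z.shape
    q = Q.shape[0]
    alpha = np.empty((n, d))
    y = np.empty(n)
    a = np.asarray(alpha0, dtype=float)
    for t in range(n):
        alpha[t] = a                                   # store current state
        y[t] = Z[t] @ a + rng.normal(0.0, sigma)       # observation equation
        eta = rng.multivariate_normal(np.zeros(q), Q)  # system error
        a = T @ a + R @ eta                            # state equation
    return y, alpha

# Illustrative example: a local level (random walk) with d = q = 1.
rng = np.random.default_rng(0)
n = 100
y, alpha = simulate_ssm(
    Z=np.ones((n, 1)), T=[[1.0]], R=[[1.0]], Q=[[0.01]],
    sigma=0.1, alpha0=[5.0], rng=rng,
)
```

Setting `sigma=0.0` makes the observations equal to $Z_t^\top \alpha_t$ exactly, which separates the latent signal from the observation noise.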

Key property: The state vector $\alpha_t$ is assembled from independent components (trend, seasonality, regression), making the model modular. Adding a new component means appending its entries to $Z_t$ and adding its blocks to the block-diagonal matrices $T_t$, $R_t$, and $Q_t$.
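The modular assembly can be sketched in a few lines. This example assumes NumPy, a hand-written `block_diag` helper, and two illustrative components: a local linear trend in its standard form (level and slope, which this section does not itself define) and the static regression block described later in this section. All numeric values are hypothetical.

```python
import numpy as np

def block_diag(*blocks):
    """Place 2-D blocks along the diagonal of a zero matrix."""
    blocks = [np.atleast_2d(np.asarray(b, dtype=float)) for b in blocks]
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

# Component 1: local linear trend, state = [level, slope] (standard form, assumed here).
Z_trend = np.array([1.0, 0.0])
T_trend = [[1.0, 1.0], [0.0, 1.0]]
R_trend = np.eye(2)
Q_trend = np.diag([0.1, 0.01])        # hypothetical diffusion variances

# Component 2: static regression at one time point, Z_t = beta' x_t, alpha_t = 1.
beta = np.array([0.5, -2.0])          # hypothetical coefficients
x_t = np.array([3.0, 1.0])            # hypothetical control values at time t
Z_reg = np.array([beta @ x_t])
T_reg, R_reg, Q_reg = [[1.0]], [[1.0]], [[0.0]]   # constant state, zero variance

# Assemble the full model: concatenate Z, stack the rest block-diagonally.
Z_t = np.concatenate([Z_trend, Z_reg])
T = block_diag(T_trend, T_reg)
R = block_diag(R_trend, R_reg)
Q = block_diag(Q_trend, Q_reg)

alpha_t = np.array([10.0, 0.2, 1.0])  # [level, slope, constant 1]
signal = Z_t @ alpha_t                # level + beta' x_t
```

Because the blocks do not interact, each component can be specified and checked on its own before it is added to the model.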

State Components

The full state vector is a concatenation of:

Component            Notes
Local trend          See Local Linear Trend and Seasonality
Seasonality          See Local Linear Trend and Seasonality
Static regression    Fixed regression coefficients on covariates
Dynamic regression   Time-varying coefficients on covariates

Static Regression Component

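A static linear regression on control series $x_t$ contributes the term

  $\beta^\top x_t = \sum_{j} \beta_j x_{j,t}$

to the observation equation. This writes the regression contribution in state-space form with $Z_t = \beta^\top x_t$ and $\alpha_t = 1$, a constant state with zero variance.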

Key advantage of Bayesian treatment: The spike-and-slab prior (see Spike-and-Slab Prior for Covariate Selection) allows automatic selection of which controls to include from potentially tens or hundreds of candidates.

Dynamic Regression Component

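Time-varying coefficients evolve as independent random walks:

  $x_t^\top \beta_t = \sum_{j=1}^{J} x_{j,t} \beta_{j,t}$
  $\beta_{j,t+1} = \beta_{j,t} + \eta_{\beta,j,t}$

where $\eta_{\beta,j,t} \sim \mathcal{N}(0, \sigma_{\beta_j}^2)$. Written in state-space form: $Z_t = x_t$ and $\alpha_t = \beta_t$.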

When to use: When the relationship between treatment and control series is believed to change over time.
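A short simulation makes the random-walk coefficients concrete. This is a sketch under stated assumptions: NumPy, $J = 2$ synthetic control series, and hypothetical starting coefficients and random-walk standard deviations.

```python
import numpy as np

rng = np.random.default_rng(0)
n, J = 200, 2
sigma_beta = np.array([0.05, 0.02])   # hypothetical random-walk std devs, one per coefficient
X = rng.normal(size=(n, J))           # hypothetical control series x_t, one row per time point

beta = np.empty((n, J))
beta[0] = [1.0, -0.5]                 # hypothetical starting coefficients
for t in range(n - 1):
    # beta_{j,t+1} = beta_{j,t} + eta_{beta,j,t}, independently for each j
    beta[t + 1] = beta[t] + rng.normal(0.0, sigma_beta)

# State-space reading: Z_t = x_t and alpha_t = beta_t, so the
# regression contribution at time t is x_t' beta_t.
contribution = np.einsum("tj,tj->t", X, beta)
```

Setting every entry of `sigma_beta` to zero freezes the coefficients and recovers the static regression as a special case.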

Prior Distribution on State Variance

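Most state components depend on a small set of diffusion variance parameters $\sigma^2$. The default prior:

  $\frac{1}{\sigma^2} \sim \mathcal{G}\left(\frac{\nu}{2}, \frac{s}{2}\right)$

where $\mathcal{G}(a, b)$ is Gamma with expectation $a/b$. Thus $s/\nu$ is a prior estimate of $\sigma^2$, and $\nu$ is the prior weight in units of sample size.

Default choice: $\nu$ and $s$ are set so that the prior diffusion variance $s/\nu$ is about 10% of the sample variance of the outcome.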
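The parameterization is easy to get wrong in code because many libraries use shape and scale rather than shape and rate. The sketch below assumes NumPy, whose `Generator.gamma` takes a scale argument, so a rate of $s/2$ becomes a scale of $2/s$. The function name and the values of `nu` and `s` are hypothetical and chosen only so the Monte Carlo check is stable.

```python
import numpy as np

def sample_diffusion_variance(nu, s, size, rng):
    """Draw sigma^2 where 1/sigma^2 ~ Gamma(shape=nu/2, rate=s/2)."""
    # NumPy parameterizes the Gamma by scale = 1 / rate.
    precision = rng.gamma(shape=nu / 2.0, scale=2.0 / s, size=size)
    return 1.0 / precision

rng = np.random.default_rng(0)
nu, s = 10.0, 2.0                      # hypothetical prior weight and scale parameter
sigma2 = sample_diffusion_variance(nu, s, size=200_000, rng=rng)

# The expectation of the precision 1/sigma^2 should be close to nu / s.
mean_precision = np.mean(1.0 / sigma2)
```

With a very small $\nu$ the prior is diffuse and individual draws of the precision can be extremely close to zero, so numerical checks like this one are easier to run with a moderate prior weight.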

Graphical Model (Fig. 2 in paper)

The graphical model shows:

  • Pre-intervention period: $y_1, \dots, y_n$ observed along with controls $x_t$
  • Post-intervention period: $\tilde{y}_{n+1}, \dots, \tilde{y}_m$ are the unobserved counterfactuals
  • The model is fit on pre-intervention data; the posterior predictive distribution then gives the counterfactuals
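Once posterior predictive draws of the counterfactual are available, the causal impact follows by subtraction. The sketch below assumes NumPy and that the draws are already stored in an array of shape (number of draws, number of post-intervention points); it does not perform the MCMC fit itself. The function name `pointwise_impact` and the synthetic inputs are hypothetical.

```python
import numpy as np

def pointwise_impact(y_post, counterfactual_draws, level=0.95):
    """Summarize y_t minus counterfactual draws for each post-intervention t."""
    draws = np.asarray(counterfactual_draws, dtype=float)   # (n_draws, n_post)
    impact = np.asarray(y_post, dtype=float) - draws        # broadcast over draws
    lo, hi = np.quantile(impact, [(1 - level) / 2, (1 + level) / 2], axis=0)
    return impact.mean(axis=0), lo, hi

# Synthetic stand-ins for observed post-period data and counterfactual draws.
rng = np.random.default_rng(0)
y_post = np.array([12.0, 13.0, 12.5])
draws = 10.0 + rng.normal(0.0, 0.5, size=(1000, 3))
mean_impact, lower, upper = pointwise_impact(y_post, draws)
```

Each draw yields one plausible impact trajectory, so the interval reflects the posterior uncertainty in the counterfactual rather than a single point forecast.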
