Missing Data Models

Summary

Chapter 18 of BDA3 presents the Bayesian framework for handling missing data. Multiple imputation — drawing multiple plausible completions of the data from the posterior predictive distribution — propagates missing-data uncertainty into final inferences.

Missing Data Mechanisms

MCAR (Missing Completely At Random): missingness independent of all data
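MAR (Missing At Random): missingness depends only on observed values; the mechanism is ignorable for Bayesian inference provided its parameters are also distinct from (a priori independent of) the parameters of the data model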
MNAR (Missing Not At Random): missingness depends on the missing values — requires explicit modeling of the mechanism
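
The practical difference between the three mechanisms can be seen in a small simulation. The sketch below is illustrative only: the data-generating model, the logistic missingness probabilities, and the names `p_missing` and `summarize` are assumptions made for this example, not taken from BDA3. It compares the complete-case mean of `y` with a mean adjusted by a regression on the always-observed covariate `x`, where the true mean of `y` is 0.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data: x is always observed, y may be missing; true E[y] = 0.
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(scale=0.6, size=n)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Probability that y is missing under each (assumed) mechanism.
p_missing = {
    "MCAR": np.full(n, 0.4),   # constant: unrelated to x or y
    "MAR": sigmoid(2.0 * x),   # depends only on the observed x
    "MNAR": sigmoid(2.0 * y),  # depends on the value of y itself
}


def summarize(p):
    """Return (complete-case mean, regression-adjusted mean) of y."""
    observed = rng.random(n) > p
    complete_case = y[observed].mean()
    # Fit y ~ x on the observed cases, then predict y for every unit.
    slope, intercept = np.polyfit(x[observed], y[observed], deg=1)
    adjusted = (intercept + slope * x).mean()
    return complete_case, adjusted


results = {name: summarize(p) for name, p in p_missing.items()}
for name, (cc, adj) in results.items():
    print(f"{name}: complete-case = {cc:+.3f}, regression-adjusted = {adj:+.3f}")
```

Under these assumptions, the complete-case mean should be close to 0 only for MCAR. Conditioning on `x` should remove the bias under MAR, while under MNAR both estimates stay biased because the missingness depends on `y` itself and is not modeled.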

Multiple Imputation

Draw $M$ completed datasets from $p(y_{mis} \mid y_{obs})$
Analyze each completed dataset separately
Combine results using Rubin’s rules:
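- Point estimate: $\bar{Q} = \frac{1}{M} \sum_{m=1}^{M} \hat{Q}_{m}$
- Variance: $T = \bar{U} + (1 + 1/M) B$, where $\bar{U} = \frac{1}{M} \sum_{m=1}^{M} U_{m}$ is the average within-imputation variance ($U_{m}$ being the variance estimate from completed dataset $m$) and $B = \frac{1}{M-1} \sum_{m=1}^{M} (\hat{Q}_{m} - \bar{Q})^{2}$ is the between-imputation variance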
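
The combining step can be sketched in a few lines. This is a minimal illustration for a scalar estimand; the function name `pool_rubin` and its arguments are hypothetical, and it assumes each completed-data analysis has already produced a point estimate and a variance estimate.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Combine M completed-data analyses of a scalar estimand.

    estimates: the M point estimates, one per completed dataset
    variances: the M within-imputation variance estimates
    Returns (pooled point estimate, total variance T).
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    if m < 2 or u.size != m:
        raise ValueError("need M >= 2 estimates with matching variances")

    q_bar = q.mean()        # pooled point estimate
    u_bar = u.mean()        # average within-imputation variance
    b = q.var(ddof=1)       # between-imputation variance (divisor M - 1)
    t = u_bar + (1.0 + 1.0 / m) * b
    return q_bar, t


# Made-up inputs purely for illustration.
q_bar, t = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The factor $(1 + 1/M)$ inflates the between-imputation term to account for using a finite number of imputations, so the total variance shrinks toward $\bar{U} + B$ as $M$ grows.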

Tip

In a fully Bayesian analysis, missing data are simply additional unknown parameters — they are sampled alongside model parameters in each MCMC iteration. Multiple imputation approximates this for non-Bayesian analyses.
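
As a rough sketch of what sampling the missing values alongside the parameters looks like, the example below runs a two-step Gibbs sampler (data augmentation) for a deliberately simple model. The model is an assumption chosen for brevity, not an example from the chapter: $y_i \sim N(\mu, \sigma^2)$ with the noninformative prior $p(\mu, \sigma^2) \propto 1/\sigma^2$ and an ignorable missingness mechanism. The function name `gibbs_normal_with_missing` is hypothetical, and missing values are encoded as `NaN`.

```python
import numpy as np


def gibbs_normal_with_missing(y, n_iter=4000, burn_in=1000, seed=0):
    """Data augmentation for y_i ~ N(mu, sigma2), prior p(mu, sigma2) ∝ 1/sigma2.

    NaN entries in y are treated as ignorably missing and are redrawn
    in every iteration together with (mu, sigma2).
    """
    rng = np.random.default_rng(seed)
    completed = np.asarray(y, dtype=float).copy()
    missing = np.isnan(completed)
    n = completed.size
    n_mis = int(missing.sum())

    # Crude starting values for the missing entries.
    completed[missing] = np.nanmean(completed)

    mu_draws, sigma2_draws = [], []
    for it in range(n_iter):
        # Step 1: draw (mu, sigma2) given the currently completed data.
        y_bar = completed.mean()
        s2 = completed.var(ddof=1)
        sigma2 = (n - 1) * s2 / rng.chisquare(n - 1)  # scaled inverse chi-square
        mu = rng.normal(y_bar, np.sqrt(sigma2 / n))

        # Step 2: draw the missing values given the parameters.
        completed[missing] = rng.normal(mu, np.sqrt(sigma2), size=n_mis)

        if it >= burn_in:
            mu_draws.append(mu)
            sigma2_draws.append(sigma2)
    return np.array(mu_draws), np.array(sigma2_draws)


# Made-up data: 40 values, the first 10 set to missing.
data = np.random.default_rng(1).normal(5.0, 2.0, size=40)
data[:10] = np.nan
mu_draws, sigma2_draws = gibbs_normal_with_missing(data)
```

In this toy model the missing values carry no extra information about $\mu$, so the draws of `mu` should center near the mean of the observed values. The same alternating structure carries over to models where the imputed values depend on covariates or other outcomes.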

Key Applications

Polls with missing demographic data: imputing covariates for poststratification
Counted data: handling partially observed counts (e.g., election data with missing precincts)
