Bayesian Non-parametric Causal Inference

Summary

Non-parametric Bayesian methods (especially BART) offer flexible causal inference without strong functional form assumptions. They combine propensity score models with outcome models to estimate average treatment effects (ATE) and average treatment effects on the treated (ATT).

The Problem with Parametric Causal Models

Standard causal inference approaches (e.g., regression adjustment) assume a particular functional form for the relationship between confounders, treatment, and outcome. Misspecification of this form leads to biased treatment effect estimates. Non-parametric approaches avoid this by letting the data determine the functional form.
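The effect of misspecification can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not part of the original note: the data-generating process, the helper names `ols` and `binned_effect`, and the bin count are all hypothetical. The outcome depends on the confounder through x squared, so a linear adjustment leaves confounding in the treatment coefficient, while a crude piecewise-constant adjustment (a stand-in for a flexible learner, not BART itself) recovers something close to the true effect.

```python
import random
from statistics import mean

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def binned_effect(xs, ts, ys, n_bins=20, lo=-2.0, hi=2.0):
    """Piecewise-constant adjustment: compare arms within narrow bins of x."""
    width = (hi - lo) / n_bins
    bins = {}
    for x, t, y in zip(xs, ts, ys):
        b = min(int((x - lo) / width), n_bins - 1)
        bins.setdefault(b, {0: [], 1: []})[t].append(y)
    total, n = 0.0, 0
    for arms in bins.values():
        if arms[0] and arms[1]:  # skip bins without both arms
            size = len(arms[0]) + len(arms[1])
            total += size * (mean(arms[1]) - mean(arms[0]))
            n += size
    return total / n

# Hypothetical data: treatment is more likely when |x| is large,
# and the outcome depends on x only through x**2.
rng = random.Random(0)
TRUE_EFFECT = 1.0
xs, ts, ys = [], [], []
for _ in range(5000):
    x = rng.uniform(-2, 2)
    p = 0.1 + 0.2 * x * x  # propensity between 0.1 and 0.9
    t = 1 if rng.random() < p else 0
    xs.append(x)
    ts.append(t)
    ys.append(x * x + TRUE_EFFECT * t + rng.gauss(0, 0.5))

# Linear adjustment: regress y on an intercept, treatment, and x.
effect_linear = ols([[1.0, t, x] for t, x in zip(ts, xs)], ys)[1]
effect_binned = binned_effect(xs, ts, ys)
print(f"linear adjustment: {effect_linear:.2f}, binned adjustment: {effect_binned:.2f}")
```

The binned estimator is deliberately simple; it only shows that relaxing the functional form removes most of the bias in this particular setup.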

Propensity Scores

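The propensity score $e(X) = P(T = 1 \mid X)$ is the probability of receiving treatment $T$ given observed covariates $X$. Key results:

  • Balancing property: conditioning on $e(X)$ balances the covariate distributions between treatment and control groups, i.e. $X \perp T \mid e(X)$
  • Rosenbaum-Rubin theorem: if $(Y(0), Y(1)) \perp T \mid X$, then $(Y(0), Y(1)) \perp T \mid e(X)$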

Practically, the propensity score reduces a high-dimensional covariate adjustment problem to a single dimension.
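As a rough illustration of how a single scalar score can stand in for the full covariate set, the sketch below simulates two binary covariates, estimates the propensity score as the treated share in each covariate cell, and compares a naive difference in means with an inverse-propensity-weighted (IPW) estimate. IPW is one common way to use the score and is chosen here for brevity; the data-generating process and all variable names are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean

rng = random.Random(1)
TRUE_EFFECT = 2.0

# Hypothetical data: covariates a and b raise both the chance of
# treatment and the outcome, so they confound the naive comparison.
units = []
for _ in range(20000):
    a, b = rng.randint(0, 1), rng.randint(0, 1)
    e_true = {0: 0.2, 1: 0.5, 2: 0.8}[a + b]
    t = 1 if rng.random() < e_true else 0
    y = 3 * a + b + TRUE_EFFECT * t + rng.gauss(0, 1)
    units.append((a, b, t, y))

# Estimate the propensity score as the treated share within each cell.
cell_treatment = defaultdict(list)
for a, b, t, _ in units:
    cell_treatment[(a, b)].append(t)
e_hat = {cell: mean(ts) for cell, ts in cell_treatment.items()}

# Naive contrast ignores the covariates entirely.
naive = (mean(y for _, _, t, y in units if t == 1)
         - mean(y for _, _, t, y in units if t == 0))

# IPW: reweight each unit by the inverse probability of the arm it is in.
ipw = mean(
    t * y / e_hat[(a, b)] - (1 - t) * y / (1 - e_hat[(a, b)])
    for a, b, t, y in units
)
print(f"naive: {naive:.2f}, IPW: {ipw:.2f}, truth: {TRUE_EFFECT:.2f}")
```

With continuous or high-dimensional covariates the cell-share estimate is not available, and the score would instead come from a fitted model.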

Strong ignorability assumption

Non-parametric methods still require the no-unmeasured-confounders (strong ignorability) assumption: all variables affecting both treatment and outcome must be measured and included in $X$.

BART for Causal Inference

Bayesian Additive Regression Trees (BART) fit the outcome model as a sum of shallow decision trees with Bayesian regularization priors. In PyMC, this is implemented via pymc-bart.
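For reference, the sum-of-trees model is usually written as below. The notation is the conventional one from the BART literature rather than something defined in this note: $T_j$ is the structure of tree $j$, $M_j$ its leaf values, $g$ returns the leaf value that $x_i$ falls into, and $m$ is the number of trees.

```latex
Y_i = \sum_{j=1}^{m} g(x_i;\, T_j, M_j) + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
```

The regularization priors keep each tree shallow and its leaf values small, so every tree contributes only a little to the total fit.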

Two-Model Approach

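import pymc as pm
import pymc_bart as pmb

# Assumed inputs: X is a pandas DataFrame of covariates, t a binary (0/1)
# treatment indicator, y the observed outcome.
X_with_treatment = X.assign(T=t)

# Step 1: Model the outcome under treatment/control
with pm.Model() as outcome_model:
    X_data = pm.Data("X_data", X_with_treatment)  # data container, swapped in Step 2
    mu = pmb.BART("mu", X=X_data, Y=y, m=50)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=y, shape=mu.shape)
    idata = pm.sample()

# Step 2: Predict counterfactuals
# Predict Y(1) for all units, then Y(0) for all units
X_treat = X.assign(T=1)
X_ctrl = X.assign(T=0)
with outcome_model:
    pm.set_data({"X_data": X_treat})
    pp_treat = pm.sample_posterior_predictive(idata)  # draws of Y(1)
    pm.set_data({"X_data": X_ctrl})
    pp_ctrl = pm.sample_posterior_predictive(idata)  # draws of Y(0)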

Treatment Effect Estimation

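Averaging the predicted difference between the two counterfactual outcomes gives the effect estimates: over all units for the ATE, $\mathbb{E}[Y(1) - Y(0)]$, and over treated units only for the ATT, $\mathbb{E}[Y(1) - Y(0) \mid T = 1]$. Because BART is Bayesian, these estimates come with full posterior distributions.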
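A minimal sketch of turning posterior draws into ATE and ATT summaries follows. The arrays `mu1` and `mu0` are simulated stand-ins for posterior draws of each unit's expected outcome under treatment and control; in practice they would come from the fitted outcome model. The helper names, the simulated effect sizes, and the 94% interval width are assumptions made for illustration.

```python
import random
from statistics import mean, quantiles

rng = random.Random(2)
n_draws, n_units = 500, 40
treated = [i % 2 for i in range(n_units)]  # hypothetical treatment indicator

# Simulated stand-ins for posterior draws, indexed as draws[d][i].
# Treated units get a larger effect (2.5) than control units (2.0).
mu1 = [[3.0 + 0.5 * treated[i] + rng.gauss(0, 0.3) for i in range(n_units)]
       for _ in range(n_draws)]
mu0 = [[1.0 + rng.gauss(0, 0.3) for i in range(n_units)]
       for _ in range(n_draws)]

def effect_draws(mu1, mu0, keep):
    """One effect value per posterior draw, averaged over the kept units."""
    return [mean(m1[i] - m0[i] for i in keep) for m1, m0 in zip(mu1, mu0)]

def summarize(draws):
    """Posterior mean and a central 94% interval from the draws."""
    cuts = quantiles(draws, n=100)  # 99 percentile cut points
    return mean(draws), (cuts[2], cuts[96])  # 3rd and 97th percentiles

all_units = range(n_units)
treated_units = [i for i in all_units if treated[i] == 1]

ate_mean, ate_interval = summarize(effect_draws(mu1, mu0, all_units))
att_mean, att_interval = summarize(effect_draws(mu1, mu0, treated_units))
print(f"ATE {ate_mean:.2f} {ate_interval}, ATT {att_mean:.2f} {att_interval}")
```

Averaging over units inside each draw, and only then summarizing across draws, is what carries the posterior uncertainty through to the effect estimate.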

Comparison with Parametric Alternatives

Method                        Functional Form    Uncertainty       Confounder Handling
---------------------------   ----------------   ---------------   --------------------------
OLS regression adjustment     Linear             Frequentist CIs   Full covariate control
Propensity score matching     None for outcome   Limited           Balances covariates
BART (non-parametric Bayes)   Flexible trees     Full posterior    Full covariate control
DiD                           Linear trends      Posterior         Parallel trends assumption
