Bayesian Causal Inference

Routing Summary

This folder covers Bayesian and ML-based causal inference. It contains 39 notes across 7 sub-topics.

Sub-topics

Sub-topic | Notes | Domain
Foundations | 3 | Potential outcomes, SUTVA, ignorability, causal estimands, frequentist methods
Bayesian Inference | 3 | Bayesian CI structure, outcome models (BART/GP/BCF), propensity score strategies
Sensitivity and Complex Mechanisms | 3 | E-value, copula sensitivity, IV/principal stratification, g-formula, time-varying treatments
Knowledge Elicitation | 9 | Interactive (Yamashita 2020) and LLM-based (Shaposhnyk 2025) causal structure elicitation
Treatment Effect Estimation | 6 | S/T/X-learner metalearners for CATE; minimax rates; voter turnout & transphobia applications
Time Series Causal Inference | 7 | BSTS model; spike-and-slab; Gibbs sampler; CausalImpact; advertising application
Dynamic Treatment Regimes | 5 | Optimal multi-stage treatment rules; potential-outcomes framework; backward induction (Q/value functions); Q-learning; A-learning, g-estimation & double robustness (Schulte, Tsiatis, Laber & Davidian 2014)
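
As a quick orientation for the E-value entry in the Sensitivity row above, here is a minimal sketch of the standard risk-ratio E-value formula (VanderWeele and Ding), RR + sqrt(RR * (RR - 1)) for RR >= 1. The function name `e_value` and the handling of protective effects by inverting the ratio are illustrative choices, not taken from the notes in this folder.

```python
import math

def e_value(rr):
    """E-value for a point-estimate risk ratio (hypothetical helper name).

    Risk ratios below 1 are inverted first, so protective and harmful
    effects of the same strength give the same E-value.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))

print(round(e_value(2.0), 2))  # about 3.41, i.e. 2 + sqrt(2)
```

Read loosely, an unmeasured confounder would need risk-ratio associations of roughly this size with both treatment and outcome to fully explain away the observed ratio.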

Paper Overview

  • Li et al 2022 - Overview — “Bayesian causal inference: a critical review”, Li, Ding & Mealli (2022), Phil. Trans. R. Soc. A 381
  • Yamashita 2020 - Overview — “Interactive method to elicit local causal knowledge”, Yamashita et al. (2020), HCII 2020
  • Shaposhnyk 2025 - Overview — “Can LLMs assist expert elicitation for probabilistic causal modeling?”, Shaposhnyk et al. (2025), arXiv
  • Künzel 2019 - Overview — “Metalearners for estimating heterogeneous treatment effects”, Künzel et al. (2019), PNAS 116(10)
  • Brodersen 2015 - Overview — “Inferring causal impact using Bayesian structural time-series models”, Brodersen et al. (2015), Ann. Appl. Stat. 9(1)

Key Concept Dependency Chain

Potential Outcomes Framework
  └─► Causal Estimands (ITE, SATE, CATE, PATE, MATE)
        └─► Frequentist Causal Estimation (IPW, DR, matching)
        └─► Metalearners for CATE [Künzel 2019]
              ├── S-Learner (single model)
              ├── T-Learner (separate models; minimax rate)
              └── X-Learner (cross-imputation; optimal for unbalanced groups)
  └─► General Structure of Bayesian CI (factorization, Assumption 3.2)
        └─► Bayesian Outcome Models (BART, BCF, GP, regularization-induced confounding)
        └─► Propensity Score in Bayesian CI (3 strategies)
  └─► Sensitivity Analysis in Observational Studies (E-value, copula)
  └─► Instrumental Variables and Principal Stratification (CACE, compliance strata)
  └─► Time-Varying Treatments and G-computation (g-formula, sequential ignorability)
  └─► Counterfactual Inference [Brodersen 2015]
        └─► Bayesian Structural Time-Series Model (BSTS)
              ├── Local Linear Trend + Seasonality
              ├── Spike-and-Slab Prior (variable selection)
              ├── MCMC Inference (Gibbs + Kalman smoother)
              └── Counterfactual Impact Estimation (pointwise, cumulative, running avg)
Causal Structure Learning
  └─► Knowledge Elicitation [Yamashita 2020]
        ├── Cause-Precondition-Effect Model
        ├── Interactive GUI Workshop Method
        └── NLP Causal Extraction (Method A + B, Word2Vec)
  └─► LLM Expert Elicitation [Shaposhnyk 2025]
        ├── Dual-LLM Architecture (GPT-4o + Claude)
        ├── BN Construction Comparison (LLM vs BIC vs Human)
        └── Entropy-Based BN Evaluation
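
The T-Learner node in the chain above ("separate models") can be sketched as follows. This is a toy illustration under stated assumptions, not the implementation from Künzel 2019: the base learner is a closed-form simple linear regression on one covariate, treatment is randomized, and the data are synthetic with a known effect tau(x) = 1 + 0.5x. The names `fit_line` and `t_learner_cate` are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def t_learner_cate(xs, ws, ys):
    """T-learner: fit one outcome model per arm, then difference the predictions."""
    treated = [(x, y) for x, w, y in zip(xs, ws, ys) if w == 1]
    control = [(x, y) for x, w, y in zip(xs, ws, ys) if w == 0]
    a1, b1 = fit_line(*zip(*treated))  # model for the treated outcome
    a0, b0 = fit_line(*zip(*control))  # model for the control outcome
    return lambda x: (a1 + b1 * x) - (a0 + b0 * x)

# Synthetic data (assumption): randomized treatment, true CATE = 1 + 0.5 * x.
random.seed(0)
n = 2000
xs = [random.gauss(0, 1) for _ in range(n)]
ws = [random.randint(0, 1) for _ in range(n)]
ys = [2.0 * x + w * (1.0 + 0.5 * x) + random.gauss(0, 0.1)
      for x, w in zip(xs, ws)]

cate = t_learner_cate(xs, ws, ys)
print(cate(0.0), cate(2.0))  # should be close to the true values 1.0 and 2.0
```

An S-learner would instead fit a single model with the treatment indicator as an extra feature, and the X-learner adds a cross-imputation stage on top of the two arm-specific fits.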

Cross-Cutting Themes

Sources