Causal Discovery - Overview
Summary
Causal discovery (a.k.a. causal structure search or learning) infers causal relations among variables from the statistical properties of purely observational data. It is used when controlled interventions or randomized experiments are too costly, slow, unethical, or impossible. The output is a directed graphical causal model (DGCM), usually not a single DAG but an equivalence class (a CPDAG over DAGs, or a PAG when latent confounders are allowed). Glymour, Zhang & Spirtes (2019) review three families: constraint-based methods (PC, FCI), score-based methods (GES), and methods built on functional causal models (LiNGAM, ANM, PNL).
Overview
Almost all of science is about identifying causal relations. Two procedures have existed since the 17th century: (1) manipulation — intervene and see what changes; and (2) observation — watch natural variation without intervening. Randomized experiments are the gold standard for the first, but are frequently infeasible. Causal discovery pursues the second route computationally, recovering causal structure from observational (or mixed observational/experimental) data.
The objects recovered are directed graphical causal models (DGCMs), equivalent to causal Bayesian networks / structural equation models (SEMs) / functional causal models (FCMs). A DGCM has three components:
- a set of random variables (nodes);
- a set of directed edges, where $X \to Y$ asserts that fixing all other variables and exogenously varying $X$ would change $Y$, i.e., $X$ is a direct cause of $Y$;
- a joint distribution over the variables that satisfies the Markov condition relative to the graph.
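For a DAG, the Markov condition in the last item is commonly written as a factorization of the joint distribution. The notation below ($X_1, \dots, X_n$ for the variables, $\mathrm{Pa}(X_i)$ for the parents of $X_i$ in the graph) is assumed here for illustration and is not taken from the source:

$$
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
$$

For the chain $X \to Y \to Z$ this gives $P(X, Y, Z) = P(X)\,P(Y \mid X)\,P(Z \mid Y)$, which implies $X \perp Z \mid Y$.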
A key conceptual point: causal discovery is “nothing but statistical estimation of parameters describing a graphical causal structure,” but with the twist that the structure itself (which edges exist and their orientation) is what is being estimated. Because several distinct DAGs can imply exactly the same conditional independencies, observational data alone generally cannot pin down a unique DAG — only a Markov equivalence class.
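A small sketch can make the equivalence-class point concrete. It uses the standard graphical criterion (two DAGs are Markov equivalent exactly when they share the same skeleton and the same unshielded colliders) and applies it to three-variable graphs. The function names and the edge-set representation are illustrative choices, not part of the source.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected adjacencies of a DAG given as a set of (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}

def v_structures(edges):
    """Unshielded colliders a -> c <- b, where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pa in parents.items():
        for a, b in combinations(sorted(pa), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(g1, g2):
    """Same skeleton and same v-structures."""
    return skeleton(g1) == skeleton(g2) and v_structures(g1) == v_structures(g2)

chain = {("X", "Y"), ("Y", "Z")}          # X -> Y -> Z
reverse_chain = {("Z", "Y"), ("Y", "X")}  # X <- Y <- Z
fork = {("Y", "X"), ("Y", "Z")}           # X <- Y -> Z
collider = {("X", "Y"), ("Z", "Y")}       # X -> Y <- Z

print(markov_equivalent(chain, reverse_chain))  # True
print(markov_equivalent(chain, fork))           # True
print(markov_equivalent(chain, collider))       # False
```

The chain, its reversal, and the fork all imply only that X and Z are independent given Y, so observational data cannot separate them. The collider implies a different independence pattern and is therefore distinguishable.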
Main Content
Goal of causal structure learning
Given i.i.d. samples from an unknown joint distribution generated by an unknown DGCM $G$, recover as much of $G$ as the data permit: typically a CPDAG (when there are no latent confounders, via PC/GES), a PAG (when latent confounders are possible, via FCI), or even a fully oriented DAG plus functional model (under FCM identifiability conditions, via LiNGAM/ANM/PNL).
Three families of methods
- Constraint-based: exploit conditional independence (CI) constraints in the data to recover the equivalence class. Examples: PC (assumes no latent confounders) and FCI (tolerates latent confounders). Asymptotically correct under the Markov and faithfulness assumptions, given a reliable CI test.
- Score-based: search the space of equivalence classes to optimize a model-fit score (e.g., BIC). Example: Greedy Equivalence Search (GES).
- Functional causal model (FCM) based: assume $Y = f(X, E)$ with the noise $E$ independent of the causes $X$, and exploit noise asymmetries to identify causal direction beyond the equivalence class. Examples: LiNGAM, ANM, post-nonlinear (PNL).
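The CI constraints that constraint-based methods rely on can be illustrated with a toy test. The sketch below assumes linear-Gaussian data and uses a Fisher z test on (partial) correlations, one common choice of CI test; it is not the PC algorithm itself. The simulated chain, the coefficients, and all function names are assumptions made for illustration.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def partial_corr(x, z, y):
    """Correlation of x and z after linearly removing a single variable y."""
    r_xz, r_xy, r_yz = pearson(x, z), pearson(x, y), pearson(y, z)
    return (r_xz - r_xy * r_yz) / math.sqrt((1 - r_xy ** 2) * (1 - r_yz ** 2))

def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: correlation is zero; k = size of the conditioning set."""
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - k - 3) * abs(z)
    return math.erfc(stat / math.sqrt(2))

# Simulated chain X -> Y -> Z (coefficients chosen arbitrarily).
random.seed(0)
n = 2000
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.8 * xi + random.gauss(0, 1) for xi in x]
z = [0.8 * yi + random.gauss(0, 1) for yi in y]

r_marginal = pearson(x, z)
r_partial = partial_corr(x, z, y)
p_marginal = fisher_z_pvalue(r_marginal, n, 0)
p_partial = fisher_z_pvalue(r_partial, n, 1)

print(f"X vs Z:     r = {r_marginal:.3f}, p = {p_marginal:.3g}")
print(f"X vs Z | Y: r = {r_partial:.3f}, p = {p_partial:.3g}")
```

X and Z are strongly correlated marginally, while the partial correlation given Y is close to zero. A constraint-based search would read this as evidence against a direct X-Z edge.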
Comparison of the fundamental methods (Table 1)
| Property | PC | FCI | GES | LiNGAM/PNL/ANM |
|---|---|---|---|---|
| Faithfulness required? | Yes | Yes | Weaker condition | No |
| Special assumptions on data distribution? | No | No | Yes (often linear-Gaussian or multinomial) | Yes (e.g., non-Gaussian / nonlinear) |
| Handles confounders? | No | Yes | No | No |
| Output | Markov equivalence class (CPDAG) | Partial ancestral graph (PAG) | Markov equivalence class (CPDAG) | DAG + causal model (under identifiability) |

In the large-sample limit, PC and GES converge to the same Markov equivalence class.
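The LiNGAM/PNL/ANM column rests on an asymmetry between cause and effect, which a toy example can make concrete. The sketch below assumes a linear model with uniform (non-Gaussian) noise: regressing the effect on the cause leaves a residual independent of the regressor, while the reverse regression does not. The dependence score (absolute correlation between squared regressor and squared residual) is a crude stand-in chosen for brevity. It is not the procedure used by LiNGAM or ANM, which typically rely on ICA or kernel independence tests. All names are illustrative.

```python
import math
import random

def center(values):
    """Subtract the sample mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    ca, cb = center(a), center(b)
    cov = sum(x * y for x, y in zip(ca, cb))
    return cov / math.sqrt(sum(x * x for x in ca) * sum(y * y for y in cb))

def residuals(cause, effect):
    """Residuals of the least-squares fit effect ~ slope * cause (both centered)."""
    c, e = center(cause), center(effect)
    slope = sum(ci * ei for ci, ei in zip(c, e)) / sum(ci * ci for ci in c)
    return [ei - slope * ci for ci, ei in zip(c, e)]

def dependence(regressor, resid):
    """Crude proxy: |corr| between squared centered regressor and squared residual."""
    c = center(regressor)
    return abs(pearson([ci * ci for ci in c], [r * r for r in resid]))

# Ground truth (assumed for the demo): X -> Y with uniform noise.
random.seed(1)
n = 5000
x = [random.uniform(-1, 1) for _ in range(n)]
y = [xi + random.uniform(-1, 1) for xi in x]

forward = dependence(x, residuals(x, y))   # model X -> Y
backward = dependence(y, residuals(y, x))  # model Y -> X
direction = "X -> Y" if forward < backward else "Y -> X"

print(f"forward score  = {forward:.3f}")
print(f"backward score = {backward:.3f}")
print("preferred direction:", direction)
```

With Gaussian noise both scores would be near zero and the direction would not be identifiable, which is why these methods need distributional assumptions that PC and FCI do not.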
Practical challenges
Reliable causal discovery must also confront: causality in time series (subsampling/aggregation, Granger causality limits), measurement error, nonstationary/heterogeneous data, selection bias, missing data, and the deterministic case. The paper notes there is no consensus on parameter (penalty) choice and that bootstrapping edge frequencies is a useful diagnostic of stability — though stable output is not necessarily correct.
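The bootstrap diagnostic can be sketched as follows: resample the rows with replacement, rerun the structure learner on each resample, and report how often each edge appears. In this sketch `learn_edges` is a hypothetical placeholder that thresholds pairwise correlations. It is not a causal discovery algorithm and would be swapped for PC, GES, or similar in practice. The data, threshold, and number of resamples are arbitrary assumptions.

```python
import math
import random
from itertools import combinations

def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def learn_edges(data, threshold=0.2):
    """Hypothetical stand-in learner: adjacency wherever |corr| exceeds the threshold."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(data), 2)
        if abs(pearson(data[a], data[b])) > threshold
    }

def bootstrap_edge_frequencies(data, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each edge is returned by the learner."""
    rng = random.Random(seed)
    n = len(next(iter(data.values())))
    counts = {}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # rows drawn with replacement
        sample = {name: [col[i] for i in idx] for name, col in data.items()}
        for edge in learn_edges(sample):
            counts[edge] = counts.get(edge, 0) + 1
    return {tuple(sorted(edge)): c / n_boot for edge, c in counts.items()}

# Toy data: B depends on A, C is independent of both.
random.seed(0)
n = 500
a = [random.gauss(0, 1) for _ in range(n)]
b = [0.8 * ai + random.gauss(0, 1) for ai in a]
c = [random.gauss(0, 1) for _ in range(n)]

freqs = bootstrap_edge_frequencies({"A": a, "B": b, "C": c})
for edge, freq in sorted(freqs.items()):
    print(edge, f"{freq:.2f}")
```

An edge that appears in nearly every resample is stable with respect to sampling variation. As the text cautions, that says nothing about whether the learner's assumptions hold, so a stable edge can still be wrong.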
Examples
- The authors illustrate biological applications: recovering the protein-signaling network of Sachs et al. (2005) with FASK, and ranking Arabidopsis flowering-time genes with a PC + IDA “Causal Stability” pipeline (the top 25 genes contained 5 known causes plus 4 novel causes that were subsequently confirmed).
- Contrast with NOTEARS - Overview: NOTEARS recasts DAG learning as a smooth continuous optimization problem with an algebraic acyclicity constraint, whereas the methods here are combinatorial (CI-test search or greedy score search). Both target the same underlying structure-learning problem (vault gap #9).
Connections
- Builds on Directed Acyclic Graphs and Summary Causal DAGs (the DAG / d-separation machinery).
- Foundational assumptions detailed in Markov and Faithfulness Assumptions.
- Method deep-dives: PC Algorithm and Constraint-Based Discovery, GES and Score-Based Discovery, Functional Causal Models (LiNGAM, ANM).
- Continuous-optimization alternative: NOTEARS - Overview.
- Relation to expert/knowledge-driven structure building: BN Construction Methods Comparison.