NOTEARS - Overview
Summary
NOTEARS (Zheng, Aragam, Ravikumar & Xing, 2018) reformulates the NP-hard problem of learning the structure of a directed acyclic graph (DAG / Bayesian network) from a combinatorial search over graphs into a purely continuous optimization problem over real matrices. The key device is a smooth, exact algebraic characterization of acyclicity — — which replaces the discrete acyclicity constraint with one equality constraint. The resulting program is solved with off-the-shelf numerical solvers (augmented Lagrangian + L-BFGS) in ~50 lines of Python, and matches or beats specialized state-of-the-art structure-learning algorithms.
Overview
Learning DAGs from data is NP-hard ([Chickering, 1996; Chickering et al., 2004]), mainly because of the acyclicity constraint: the space of DAGs on nodes is combinatorial and grows superexponentially in . Essentially all prior score-based methods cope with this by local heuristics — order search, greedy edge addition, coordinate descent — that add/remove one edge at a time and check acyclicity incrementally, often requiring structural assumptions (bounded in-degree or treewidth) that are impossible to verify in real data.
NOTEARS makes a fundamentally different move. It converts the combinatorial program (left) into a continuous one (right):
Here is the weighted adjacency matrix of a linear structural equation model (SEM), is a (regularized least-squares) score, and is the smooth acyclicity function. Because the two programs are equivalent, NOTEARS eliminates the need for any graph-specialized search machinery — it just runs a numerical solver over .
Main Content
The four contributions
- Smooth acyclicity function with computable derivatives over , replacing the combinatorial constraint with a smooth equality constraint. See Smooth Characterization of Acyclicity.
- An equality-constrained program (ECP) that jointly estimates the structure and parameters of a sparse DAG from (possibly high-dimensional) data, solvable by standard numerical solvers to stationarity. See NOTEARS Algorithm.
- Empirical effectiveness against state-of-the-art baselines (FGS, PC, GES, LiNGAM). See NOTEARS Experiments.
- Comparison to the exact global minimizer (GOBNILP): NOTEARS attains scores close to global optimality in practice despite only being guaranteed to find stationary points.
Why the name
NOTEARS = Non-combinatorial Optimization via Trace Exponential
and Augmented lagRangian for Structure learning. The method is deliberately simple — implementable in ~50 lines of Python — and requires no background in graphical models. Code: https://github.com/xunzheng/notears.
Intellectual lineage / analogy
The authors frame NOTEARS by analogy to undirected graphical models: there, recasting structure learning as a continuous log-det convex program ([Banerjee et al., 2008]) sparked rapid progress. DAG learning had never benefited similarly because of the acyclicity constraint. NOTEARS supplies the missing “closed-form, continuous program” for the directed case — though unlike the undirected case the resulting program is nonconvex.
Connections
- Generalizes the modeling target: operates on the continuous space rather than the discrete DAG space — see DAG Structure Learning Problem.
- Contrast with local search: prior methods (FGS, GES, hill-climbing, MMHC) do local edge-at-a-time updates; NOTEARS does global updates of the entire matrix each step.
- Relates to linear SEM: the weighted adjacency matrix is the coefficient matrix of a linear SEM — see Confirmatory Factor Analysis and SEM for the Bayesian-SEM treatment of structural equation models, and Spurious Association and Confounds for DAG semantics in causal inference.
See Also
- DAG Structure Learning Problem — the score-based / SEM formulation NOTEARS builds on
- Smooth Characterization of Acyclicity — the central theorem ()
- NOTEARS Algorithm — how the continuous program is actually solved
- NOTEARS Experiments — empirical results and benchmarks
- Causal Discovery - Overview — the broader constraint-/score-based discovery landscape
- Causal Discovery Index