NOTEARS - Overview

Summary

NOTEARS (Zheng, Aragam, Ravikumar & Xing, 2018) reformulates the NP-hard problem of learning the structure of a directed acyclic graph (DAG / Bayesian network) from a combinatorial search over graphs into a purely continuous optimization problem over real matrices. The key device is a smooth, exact algebraic characterization of acyclicity — $h (W) = tr (e^{W \circ W}) - d = 0$ — which replaces the discrete acyclicity constraint with one equality constraint. The resulting program is solved with off-the-shelf numerical solvers (augmented Lagrangian + L-BFGS) in ~50 lines of Python, and matches or beats specialized state-of-the-art structure-learning algorithms.

Overview

Learning DAGs from data is NP-hard ([Chickering, 1996; Chickering et al., 2004]), mainly because of the acyclicity constraint: the space $D$ of DAGs on $d$ nodes is combinatorial and grows superexponentially in $d$ . Essentially all prior score-based methods cope with this by local heuristics — order search, greedy edge addition, coordinate descent — that add/remove one edge at a time and check acyclicity incrementally, often requiring structural assumptions (bounded in-degree or treewidth) that are impossible to verify in real data.

NOTEARS makes a fundamentally different move. It converts the combinatorial program (left) into a continuous one (right):

W \in R^{d \times d} min F (W) subject to G (W) \in D ⟺ W \in R^{d \times d} min F (W) subject to h (W) = 0

Here $W$ is the weighted adjacency matrix of a linear structural equation model (SEM), $F$ is a (regularized least-squares) score, and $h$ is the smooth acyclicity function. Because the two programs are equivalent, NOTEARS eliminates the need for any graph-specialized search machinery — it just runs a numerical solver over $R^{d \times d}$ .

Main Content

The four contributions

Smooth acyclicity function with computable derivatives over $R^{d \times d}$ , replacing the combinatorial constraint $G \in D$ with a smooth equality constraint. See Smooth Characterization of Acyclicity.
An equality-constrained program (ECP) that jointly estimates the structure and parameters of a sparse DAG from (possibly high-dimensional) data, solvable by standard numerical solvers to stationarity. See NOTEARS Algorithm.
Empirical effectiveness against state-of-the-art baselines (FGS, PC, GES, LiNGAM). See NOTEARS Experiments.
Comparison to the exact global minimizer (GOBNILP): NOTEARS attains scores close to global optimality in practice despite only being guaranteed to find stationary points.

Why the name

NOTEARS = Non-combinatorial Optimization via Trace Exponential

and Augmented lagRangian for Structure learning. The method is deliberately simple — implementable in ~50 lines of Python — and requires no background in graphical models. Code: https://github.com/xunzheng/notears.

Intellectual lineage / analogy

The authors frame NOTEARS by analogy to undirected graphical models: there, recasting structure learning as a continuous log-det convex program ([Banerjee et al., 2008]) sparked rapid progress. DAG learning had never benefited similarly because of the acyclicity constraint. NOTEARS supplies the missing “closed-form, continuous program” for the directed case — though unlike the undirected case the resulting program is nonconvex.

Connections

Generalizes the modeling target: operates on the continuous space $R^{d \times d}$ rather than the discrete DAG space $D$ — see DAG Structure Learning Problem.
Contrast with local search: prior methods (FGS, GES, hill-climbing, MMHC) do local edge-at-a-time updates; NOTEARS does global updates of the entire matrix $W$ each step.
Relates to linear SEM: the weighted adjacency matrix is the coefficient matrix of a linear SEM — see Confirmatory Factor Analysis and SEM for the Bayesian-SEM treatment of structural equation models, and Spurious Association and Confounds for DAG semantics in causal inference.

Second Brain

Explorer

NOTEARS - Overview

NOTEARS - Overview

Overview

Main Content

The four contributions

Why the name

Intellectual lineage / analogy

Connections

See Also

Graph View

Table of Contents

Backlinks