Entropy-Based BN Evaluation

Summary

Entropy of each node’s posterior marginal distribution is used as an information-theoretic measure of structural quality across Bayesian networks. Lower entropy indicates more structured, less uncertain predictions. LLM-generated BNs achieve lower mean and minimum entropy than both BIC-based and human expert BNs.

Overview

When comparing BN structures generated by different methods, one needs an objective metric. The authors use Shannon entropy of the marginal posterior distribution at each node, then summarize across nodes.

Main Content

Definition: Node Entropy in a BN

For a discrete random variable $X_{i}$ with marginal probability distribution $P (x_{i})$ , the entropy is:
$H(X_{i}) = -\sum_{x_{i}} P(x_{i}) \log P(x_{i})$
This is computed from the posterior marginal distribution at each node, given the observed data.
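As a minimal sketch of this definition (not the paper's code), the entropy of a single node can be computed directly from its marginal. The function name `node_entropy` and the example distributions are hypothetical, and the natural logarithm is an assumption, since the note does not state the log base.

```python
import math

def node_entropy(marginal, base=math.e):
    """Shannon entropy of one node's marginal distribution.

    `marginal` is a sequence of state probabilities that sums to 1.
    States with zero probability contribute nothing (0 * log 0 is taken as 0).
    The log base is an assumption; the source does not specify it.
    """
    if abs(sum(marginal) - 1.0) > 1e-9:
        raise ValueError("marginal must sum to 1")
    return -sum(p * math.log(p, base) for p in marginal if p > 0)

# Illustrative distributions, not taken from the paper.
peaked = node_entropy([0.97, 0.01, 0.01, 0.01])  # sharply peaked node
flat = node_entropy([0.25, 0.25, 0.25, 0.25])    # uniform over 4 states
```

The uniform case gives the maximum possible value for a four-state node, $\log 4$, while the peaked distribution gives a much smaller value.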

Interpretation:

Low entropy → the node’s distribution is sharply peaked → the BN effectively learns structure, identifying clear patterns and dependencies
High entropy → the node’s distribution is flat/uncertain → the BN captures less signal; more randomness in the model

Note: Entropy is computed from the posterior marginal, not the prior, so it reflects how the BN propagates evidence.

Summary Statistics Compared

Six descriptive statistics are computed per BN (across all nodes):

Statistic	Meaning
Mean	Average entropy across all nodes
Min	Node with lowest entropy (most certain)
25th percentile	Lower quartile of node entropies
Median (50%)	Middle value of node entropies
75th percentile	Upper quartile of node entropies
Max	Node with highest entropy (most uncertain)
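The per-BN summary can be sketched with the standard library, assuming the per-node entropies are already available as a list. The function name `summarize_entropies`, the example values, and the choice of linear interpolation for the quartiles are assumptions; the note does not say how the paper computed percentiles.

```python
import statistics

def summarize_entropies(node_entropies):
    """Return the six summary statistics for one BN's node entropies."""
    h = sorted(float(x) for x in node_entropies)
    # 'inclusive' treats the data as the full population and interpolates
    # linearly between order statistics (an assumed convention).
    q25, q50, q75 = statistics.quantiles(h, n=4, method="inclusive")
    return {
        "Mean": statistics.fmean(h),
        "Min": h[0],
        "25%": q25,
        "50%": q50,
        "75%": q75,
        "Max": h[-1],
    }

# Illustrative entropies for a five-node BN, not taken from the paper.
example = summarize_entropies([0.9, 1.1, 1.2, 1.5, 3.0])
```

Running the same function on each BN's node entropies would give one column of a comparison table per construction method.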

Results (Table 5 in paper)

	LLM	BIC	Expert
Mean	1.4237	1.4770	1.4775
Min	0.8897	0.9119	0.9282
25%	1.1654	1.1919	1.1473
50%	1.2884	1.3226	1.2075
75%	1.4882	1.5410	1.5555
Max	2.9855	3.0144	3.2357

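Key finding: LLM-BN has the lowest mean and minimum entropy, as well as the lowest 75th percentile and maximum, so it produces the least uncertain model overall. The Expert BN, however, has the lowest 25th percentile and median (1.1473 and 1.2075, versus 1.1654 and 1.2884 for the LLM), suggesting that the human expert is more precise for roughly the lower half of nodes. The LLM is therefore not uniformly better across the distribution, and BIC is not the lowest on any statistic.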
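A quick way to read the results table is to find, for each statistic, which method has the lowest entropy. The sketch below transcribes the Table 5 values from this note; the variable names are hypothetical.

```python
# Values transcribed from the results table above (Table 5 in the paper).
table5 = {
    "Mean": {"LLM": 1.4237, "BIC": 1.4770, "Expert": 1.4775},
    "Min":  {"LLM": 0.8897, "BIC": 0.9119, "Expert": 0.9282},
    "25%":  {"LLM": 1.1654, "BIC": 1.1919, "Expert": 1.1473},
    "50%":  {"LLM": 1.2884, "BIC": 1.3226, "Expert": 1.2075},
    "75%":  {"LLM": 1.4882, "BIC": 1.5410, "Expert": 1.5555},
    "Max":  {"LLM": 2.9855, "BIC": 3.0144, "Expert": 3.2357},
}

# For each statistic, pick the method with the smallest entropy.
lowest = {stat: min(row, key=row.get) for stat, row in table5.items()}

for stat, method in lowest.items():
    print(f"{stat:>4}: lowest entropy is {method} ({table5[stat][method]:.4f})")
```

On these numbers the LLM is lowest for the mean, minimum, 75th percentile, and maximum, while the Expert BN is lowest for the 25th percentile and median.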

Interpretation

The LLM-based approach leads to models with lower uncertainty on average (lowest mean entropy), consistent with it being a “strong performer” against traditional methods
BIC graphs show more logical inconsistencies (revealed by expert review), which likely inflate entropy by introducing backward or confounded edges
The LLM’s ability to reason about causal direction (not just statistical association) produces a more coherent structure

Limitations of This Metric

Entropy measured on a single small dataset (400 rows); results may not generalize
A BN with overly confident (low-entropy) CPTs could be overfitting, not genuinely more informative
Does not directly measure causal accuracy — a correctly structured BN could still have high entropy if the domain has genuine uncertainty
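The small-sample concern can be illustrated with a toy experiment, which is not the paper's procedure. For a variable that is truly uniform over `k` states, the true entropy is the maximum, $\log k$, so an entropy estimated from observed frequencies can never exceed it and usually falls below it when few rows are available. The function name `plugin_entropy`, the state count, and the sample sizes are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

def plugin_entropy(samples):
    """Entropy (in nats) of the empirical frequency distribution of `samples`."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())

rng = random.Random(0)      # fixed seed so the sketch is repeatable
k = 8                       # hypothetical number of states
true_entropy = math.log(k)  # entropy of the true (uniform) distribution

# Estimates from a small and a large sample of the same uniform variable.
small = plugin_entropy([rng.randrange(k) for _ in range(20)])
large = plugin_entropy([rng.randrange(k) for _ in range(20000)])

print(f"true={true_entropy:.4f} small-sample={small:.4f} large-sample={large:.4f}")
```

A low estimated entropy on a small dataset therefore does not by itself show that the model has learned real structure.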

Connections

Builds on BN Construction Methods Comparison — three-way structural comparison
Used to evaluate BN III from LLM Expert Elicitation for Bayesian Networks
