Nuisance Parameter Bias Simulation

Summary

A simulation study by A. Jordan Nafa (2022) demonstrates empirically that nuisance parameters in multivariable regression are almost never recoverable as causal effects, and that this bias does not shrink with sample size. Using 3,000 simulated datasets fit with Stan/HMC, the study shows 90% credible intervals for confounder coefficients fail to contain the true value 70–99% of the time.

Overview

This simulation provides a concrete, quantitative demonstration of the Table 2 Fallacy. The key insight: if one would not present and interpret the coefficient for without a valid identification strategy for , one should not interpret any other coefficient without an equally valid strategy for that variable. Since most applied regression models lack such strategies for their confounders, essentially all nuisance parameter interpretations in the literature are invalid.

Simulation Design

Causal Graph

The data-generating process (DGP) is a DAG with:

  • Observed: treatment , outcome , measured confounders
  • Unobserved: confounders (fixed at 1.0) and (fixed at 0.5)
  • Biasing paths: and

The valid adjustment set for identifying is . However:

  • is confounded by (potentially)
  • is confounded by (always)
  • is confounded by (always, since )
  • is confounded by both and (always)

Experimental Manipulation

The path is toggled by a binary condition cond:

  • Z Unconfounded (cond = FALSE): does not flow into , so and are independent → is identifiable
  • Z Confounded (cond = TRUE): is unidentifiable

Data Generation Equations

Treatment propensity:

Outcome DGP:

where , , true coefficients: , , , , .

Scale: 3,000 datasets (500 repetitions × 3 sample sizes [2,500, 5,000, 10,000] × 2 conditions [Z confounded, Z unconfounded]).

Code: Data Simulation (R)

sim_dag_data <- function(N, a, b, cond, conf) {
  g <- rnorm(10, 0.5, 0.5)          # path coefficients
  V <- conf[1] + rnorm(N, 0, 0.1)
  U <- conf[2] + rnorm(N, 0, 0.1)
  W <- g[1] * U + rnorm(N, 0, 0.1)
  J <- g[2] * V + rnorm(N, 0, 0.1)
  L <- g[3] * U + g[4] * V + rnorm(N, 0, 0.1)
  Z <- g[5] * W + (U * cond) + g[6] * L + rnorm(N, 0, 0.1)
  logit_theta <- g[7]*Z + g[8]*W + g[9]*J + g[10]*L + rnorm(N, 0, 0.01)
  theta <- exp(logit_theta)/(1 + exp(logit_theta))
  X <- rbinom(N, size = 1, prob = theta)
  mu <- b[1]*X + b[2]*Z + b[3]*L + b[4]*J + b[5]*W
  Y <- a + mu + U + V + rnorm(N, 0, 0.2)
  data.table(X, Z, L, J, W, Y)
}

Code: Data Simulation (Python)

def sim_dag_data(N, a, b, cond, conf):
    g = rnorm(0.5, 0.5, 10)
    V = conf["V"] + rnorm(0, 0.1, N)
    U = conf["U"] + rnorm(0, 0.1, N)
    W = g[0] * U + rnorm(0, 0.1, N)
    J = g[1] * V + rnorm(0, 0.1, N)
    L = g[2] * U + g[3] * V + rnorm(0, 0.1, N)
    Z = g[4] * W + (U * cond) + g[5] * L + rnorm(0, 0.1, N)
    logit_theta = g[6]*Z + g[7]*W + g[8]*J + g[9]*L + rnorm(0, 0.01, N)
    theta = inv_logit(logit_theta)
    X = rbinom(n=1, p=theta, size=N)
    mu = b["X"]*X + b["Z"]*Z + b["L"]*L + b["J"]*J + b["W"]*W
    Y = a + mu + U + V + rnorm(0, 0.2, N)
    return DataFrame({"X": X, "Z": Z, "L": L, "J": J, "W": W, "Y": Y})

Model Specification

A Bayesian linear regression with weakly informative priors scaled to the data (Gelman, Hill & Vehtari 2021):

Code: Stan Model

data {
  int<lower=1> N;
  int<lower=1> K;
  vector[N] Y;
  matrix[N, K] P;
  vector[K] truth;
}
transformed data {
  matrix[N, K] X;
  vector[K] X_means;
  vector[K] X_sd;
  for (i in 1:K) {
    X[, i] = P[, i] - mean(P[, i]);
    X_means[i] = mean(P[, i]);
    X_sd[i] = sd(P[, i]);
  }
  real mu_alpha = mean(Y);
  real sigma_alpha = 2 * sd(Y);
  vector[K] beta_sd = 2 * (sd(Y)/X_sd);
  real sigma_prior = 1/sd(Y);
}
parameters {
  real alpha;
  vector[K] beta;
  real<lower = 0> sigma;
}
model {
  target += normal_id_glm_lpdf(Y | X, alpha, beta, sigma);
  target += exponential_lpdf(sigma | sigma_prior);
  target += normal_lpdf(alpha | mu_alpha, sigma_alpha);
  target += normal_lpdf(beta | 0, beta_sd);
}
generated quantities {
  real Intercept = alpha - dot_product(beta, X_means);
  vector[K] bias = truth - beta;
}

Fitting: 4 HMC chains × 2,000 iterations (1,000 warmup), adapt_delta=0.9, max_treedepth=12. Parallelized via {furrr} (3 workers); wall time ≈ 35 minutes on 12-core Ryzen 9 5900X.

Results

Coverage Probabilities for 90% Credible Intervals

ParameterTrue ValueZ Confounded (n=2,500)Z Confounded (n=5,000)Z Confounded (n=10,000)Z Unconfounded (n=2,500)Z Unconfounded (n=5,000)Z Unconfounded (n=10,000)
−0.50.890.890.910.890.900.89
0.00.010.010.000.930.910.90
0.50.160.110.070.060.050.03
0.00.130.070.060.110.080.06
0.50.290.210.150.140.130.09

Key Takeaways from Coverage Table

  1. recovers well in both conditions: 90% CIs capture the true value at nominal rates (~89–91%).
  2. is correctly recoverable only when unconfounded by (coverage 90–93%). When is active, coverage collapses to essentially zero (0–1%).
  3. , , are almost never recovered — in either condition. Coverage rates of 3–29% mean the CI misses the true value 71–97% of the time.
  4. Coverage for nuisance parameters decreases as grows — bigger data makes the wrong inference more confidently wrong.

Error of Magnitude

Among models where the 90% CI fails to capture the true value, the average Root Mean Squared Error (RMSE) of the bias is substantial:

  • For (confounded condition): high RMSE, comparable to the true value’s scale
  • For , , : average bias RMSE is often worse in the unconfounded condition for than in the confounded condition — the residual confounding (via and ) is severe regardless

Error rates by parameter:

  • : wrong 88.5% of the time (average across conditions and sample sizes)
  • : wrong 91.6% of the time
  • : wrong 78.4% of the time
  • (confounded): wrong 99.6% of the time

Practical Implication

The "Big Data" Fallacy

Since bias in nuisance parameters does not decrease with , increasing sample size does not mitigate the Table 2 Fallacy. Large observational datasets produce narrower credible intervals around the wrong value. “Big data” is not a substitute for causal reasoning and experimental design.

Connections

See Also