Researcher Degrees of Freedom

Summary

The concept of “researcher degrees of freedom” describes how the many decision points in data analysis — each seemingly innocuous — create an enormous space of possible analyses. Even without intent to deceive, this flexibility inflates false positive rates.

Sources of Analytic Flexibility

Drawing on the examples in “Garden of Forking Paths”, researcher degrees of freedom include:

Variable and Comparison Choices

  • Which main effects vs. interactions to examine
  • How to define subgroups (e.g., “single” vs. “married” definitions)
  • Which covariates to include or exclude
  • Whether to combine or separate samples

Data Processing Decisions

  • Inclusion/exclusion criteria (e.g., which days count as “peak fertility”)
  • How to handle outliers or missing data
  • How to code categorical variables
  • Whether to transform variables

Statistical Modeling Choices

  • Parametric vs. nonparametric tests
  • Whether to pool across studies or analyze separately
  • Fixed vs. random effects
  • One-tailed vs. two-tailed tests

The Combinatorial Explosion

Each decision multiplies the number of possible analyses. With just 5 binary choices, there are 2^5 = 32 possible analysis paths. In real studies, the number is far larger. Even when no true effect exists, the probability that at least one path yields p < 0.05 is much higher than 5%.
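
The arithmetic can be sketched in a few lines of Python. The five choice names below are hypothetical, picked only to echo the categories listed earlier. The probability calculation assumes each path is an independent test of a true null at the 5% level. That assumption is a simplification: real paths reuse the same data and are correlated, so the true rate is typically lower than the figure printed here, though still above 5%.

```python
from itertools import product

# Hypothetical binary analysis choices, named for illustration only.
choices = {
    "exclude_outliers": (False, True),
    "log_transform": (False, True),
    "include_covariate": (False, True),
    "pool_samples": (False, True),
    "one_tailed": (False, True),
}

# Every combination of choices is one possible analysis path.
paths = list(product(*choices.values()))
n_paths = len(paths)

alpha = 0.05
# ASSUMPTION: paths are independent tests of a true null hypothesis.
# Under that assumption, the chance that at least one reaches p < alpha is:
p_any = 1 - (1 - alpha) ** n_paths

print(n_paths)          # 32
print(round(p_any, 3))  # 0.806
```

Under the independence assumption, 32 paths give roughly an 81% chance of at least one nominally significant result. The exact number matters less than the direction: the more paths are available, the further the effective false positive rate drifts above the nominal 5%.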

Connection to the Replication Crisis

This mechanism bears on several features of the replication crisis:

  • Published findings often fail to replicate
  • Effect sizes shrink dramatically in replication attempts
  • The problem is worst with small samples, noisy measurements, and small effects
  • Pre-registration helps but cannot eliminate all flexibility
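
The shrinkage of effect sizes in replication can be illustrated with a small simulation. It is a sketch under stated assumptions, not a model of any particular study. All parameter values are hypothetical: a small true effect, small groups, and eight alternative outcome measures standing in for analysis paths. For simplicity each path gets its own independent data, noise has a known standard deviation of 1, and a result counts as significant when its z statistic exceeds the two-sided 5% cutoff in the positive direction.

```python
import random
from statistics import NormalDist, fmean

random.seed(0)

# All values below are hypothetical, chosen only for illustration.
TRUE_EFFECT = 0.1   # small true difference in means, in SD units
N_PER_GROUP = 20    # small sample per group
N_PATHS = 8         # alternative outcome measures (analysis paths)
N_STUDIES = 2000
Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided 5% cutoff

# Standard error of a difference in means with known SD = 1.
SE = (2 / N_PER_GROUP) ** 0.5

def estimate(n, effect):
    """Difference in group means for one outcome with unit-variance noise."""
    treated = [random.gauss(effect, 1.0) for _ in range(n)]
    control = [random.gauss(0.0, 1.0) for _ in range(n)]
    return fmean(treated) - fmean(control)

original, replication = [], []
for _ in range(N_STUDIES):
    # Original study: try every path and keep the largest estimate.
    best = max(estimate(N_PER_GROUP, TRUE_EFFECT) for _ in range(N_PATHS))
    if best / SE > Z_CRIT:  # reported only if it clears the cutoff
        original.append(best)
        # Replication: a single pre-specified path on fresh data.
        replication.append(estimate(N_PER_GROUP, TRUE_EFFECT))

mean_original = fmean(original)
mean_replication = fmean(replication)
print(f"reported studies: {len(original)} of {N_STUDIES}")
print(f"mean original estimate:    {mean_original:.2f}")
print(f"mean replication estimate: {mean_replication:.2f}")
```

Because only estimates that clear the significance cutoff are kept, every reported original estimate is necessarily large relative to the true effect, while the replications scatter around the true value. Increasing N_PER_GROUP or TRUE_EFFECT in this sketch narrows the gap, consistent with the problem being worst for small samples and small effects.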

See Also