Omitted Variables Bias

Summary

The omitted variables bias (OVB) formula describes the mechanical relationship between regression coefficients in models with different sets of control variables. It shows how omitting a relevant variable biases the coefficients on the included variables.

The Formula

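If the “long” (correct) regression is:

$$Y_i = \alpha + \beta X_i + \gamma W_i + e_i$$

and the “short” (omitted variable) regression is:

$$Y_i = \alpha^{s} + \beta^{s} X_i + u_i$$

Then:

$$\beta^{s} = \beta + \gamma\delta$$

where $\delta$ is the coefficient from regressing the omitted variable $W_i$ on $X_i$.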
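
A short derivation may help show where the formula comes from. The notation is an assumption made for this sketch: the long regression is written as $Y_i = \alpha + \beta X_i + \gamma W_i + e_i$, $\beta^{s}$ is the slope on $X_i$ in the short regression, and $\delta = \operatorname{Cov}(W_i, X_i)/\operatorname{Var}(X_i)$ is the slope from regressing the omitted variable $W_i$ on $X_i$. The last step uses the fact that the long-regression residual $e_i$ is uncorrelated with $X_i$ by construction.

```latex
\begin{aligned}
\beta^{s} &= \frac{\operatorname{Cov}(Y_i, X_i)}{\operatorname{Var}(X_i)} \\
          &= \frac{\operatorname{Cov}(\alpha + \beta X_i + \gamma W_i + e_i,\; X_i)}{\operatorname{Var}(X_i)} \\
          &= \beta + \gamma\,\frac{\operatorname{Cov}(W_i, X_i)}{\operatorname{Var}(X_i)}
             + \frac{\operatorname{Cov}(e_i, X_i)}{\operatorname{Var}(X_i)} \\
          &= \beta + \gamma\delta
\end{aligned}
```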

Interpreting the Bias

The bias has two components:

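  1. $\gamma$: the effect of the omitted variable on the outcome (its coefficient in the long regression)
  2. $\delta$: the association between the omitted variable and the included regressor (the slope from regressing the omitted variable on the included regressor)

The sign of the bias is the sign of the product $\gamma\delta$:

Effect on outcome ($\gamma$)   Relation to regressor ($\delta$)   Bias direction
Positive                       Positive                           Upward (overestimate)
Positive                       Negative                           Downward (underestimate)
Negative                       Positive                           Downward (underestimate)
Negative                       Negative                           Upward (overestimate)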
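
The sign rule can be expressed as a small helper. This is only an illustrative sketch: the function name `ovb_direction` and its arguments `gamma` (effect of the omitted variable on the outcome) and `delta` (slope of the omitted variable on the included regressor) are hypothetical names chosen here, not part of any library.

```python
def ovb_direction(gamma: float, delta: float) -> str:
    """Classify the sign of the omitted variables bias, gamma * delta."""
    bias = gamma * delta
    if bias > 0:
        return "upward"
    if bias < 0:
        return "downward"
    return "none"  # either component is zero, so omitting the variable causes no bias


# Enumerate the four sign combinations.
for gamma in (1.0, -1.0):
    for delta in (1.0, -1.0):
        print(f"gamma={gamma:+.0f}, delta={delta:+.0f}: {ovb_direction(gamma, delta)}")
```

The zero case is included because the bias vanishes when the omitted variable either does not affect the outcome or is unrelated to the included regressor.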

Schooling Example

For returns to schooling where “ability” ($W_i$) is omitted:

  • Ability likely has a positive effect on wages ($\gamma > 0$)
  • Ability is likely positively correlated with schooling ($\delta > 0$)
  • Therefore the bias $\gamma\delta$ is positive, and OLS without ability controls likely overestimates the return to schooling
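
The schooling argument can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: the data-generating process, the parameter values (a true return of 0.08 and an ability effect of 0.10), and the helper `ols` are hypothetical and do not come from any real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process; all numbers are illustrative assumptions.
ability = rng.normal(size=n)
schooling = 12 + 2.0 * ability + rng.normal(size=n)  # ability raises schooling
log_wage = 1.0 + 0.08 * schooling + 0.10 * ability + rng.normal(scale=0.5, size=n)


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) computed by least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


long_coef = ols(log_wage, schooling, ability)  # ability controlled
short_coef = ols(log_wage, schooling)          # ability omitted

print("return to schooling, ability controlled:", long_coef[1])
print("return to schooling, ability omitted:   ", short_coef[1])
```

Under these assumed parameters the population slope of ability on schooling is 2 / (4 + 1) = 0.4, so the population bias is 0.10 × 0.4 = 0.04 and the short-regression coefficient should land near 0.12 rather than 0.08.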

The OVB Formula is Mechanical

The formula describes the relationship between the short and long regression coefficients whether or not either regression has a causal interpretation. It applies to any pair of nested regression specifications.
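
A quick numerical check can make the mechanical nature concrete. In this sketch the outcome is pure noise, so neither regression has any causal meaning, yet the estimated short coefficient still equals the long coefficient plus the product of the other two estimates. The variable names and the helper `ols` are hypothetical choices for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Arbitrary data: y is generated independently of x and w, so nothing here is causal.
x = rng.normal(size=n)
w = 0.5 * x + rng.normal(size=n)
y = rng.normal(size=n)


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) computed by least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


_, beta_long, gamma = ols(y, x, w)  # long regression: y on x and w
_, beta_short = ols(y, x)           # short regression: y on x only
_, delta = ols(w, x)                # auxiliary regression: w on x

# The two printed numbers agree up to floating-point error.
print(beta_short, beta_long + gamma * delta)
```

The equality holds in every sample, not just on average, because the long-regression residuals are orthogonal to `x` by construction.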

See Also