Robbins Formula and Poisson Empirical Bayes

Summary

Robbins’ formula is the foundational nonparametric empirical Bayes result: in the Poisson model, the Bayes posterior mean of an unobserved rate can be expressed entirely through the marginal density $f$ of the observed counts, with no need to specify the prior $g$. For an observed count $x$, the EB estimate of the true rate is $(x + 1) f(x + 1) / f(x)$, where $f$ is replaced by the empirical frequency of counts across the many parallel cases. This is the most general expression of Efron’s theme that the prior can be learned from the experience of others.

Overview

Herbert Robbins was the principal developer of the program explicitly named “empirical Bayes.” His branch aimed to show how a frequentist, by exploiting the marginal distribution of pooled data across many parallel cases, could achieve full Bayesian efficiency — without ever specifying the prior. Where Stein’s branch (the James-Stein Estimator) is parametric (estimate one hyperparameter $A$ ), Robbins’ Poisson construction is nonparametric: the entire posterior-mean function is read off from observed marginal frequencies.

The setup: there are many independent units; unit $i$ has an unknown Poisson rate $θ_{i}$ drawn from an unknown prior $g$, and we observe a count $x_{i} ∣ θ_{i} \sim \mathrm{Poisson}(θ_{i})$. We want the Bayes estimate $E [θ ∣ x]$ for a unit observed to have count $x$.

Main Content

Poisson marginal density ^poisson-marginal

With prior $g(θ)$ and likelihood $f_{θ}(x) = e^{- θ} θ^{x} / x!$, the marginal probability of observing count $x$ is
$f(x) = \int_{0}^{\infty} e^{- θ} \frac{θ^{x}}{x!} g(θ) \, dθ .$
Crucially, $f(x)$ is directly estimable from the data as the observed fraction of units having count $x$.

Robbins' formula (Poisson empirical Bayes) ^robbins-formula

The Bayes posterior mean of $θ$ given an observed count $x$ depends on the prior $g$ only through the marginal $f$ :
$E [θ ∣ x] = (x + 1) \frac{f ( x + 1 )}{f ( x )} .$
Proof idea. By Bayes’ rule, $E [θ ∣ x] = \int θ f_{θ}(x) g(θ) \, dθ \, / \int f_{θ}(x) g(θ) \, dθ$. Using $θ \cdot e^{- θ} θ^{x} / x! = (x + 1) e^{- θ} θ^{x + 1} / (x + 1)!$, the numerator equals $(x + 1) f(x + 1)$, while the denominator is $f(x)$. The prior $g$ therefore enters only through the marginal $f$ and never has to be written down.
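The identity can be checked numerically in a case where everything is available in closed form. The sketch below assumes a Gamma prior with hypothetical shape 2 and rate 3 (values chosen only for illustration, not taken from the source). Under that assumption the marginal is negative binomial and the exact posterior mean is $(x + \text{shape}) / (\text{rate} + 1)$, so the two columns printed should agree up to floating-point error.

```python
import math

def gamma_poisson_marginal(x, shape, rate):
    """Marginal pmf f(x) when theta ~ Gamma(shape, rate) and x | theta ~ Poisson(theta)."""
    log_f = (
        math.lgamma(x + shape) - math.lgamma(shape) - math.lgamma(x + 1)
        + shape * math.log(rate / (1.0 + rate))
        - x * math.log(1.0 + rate)
    )
    return math.exp(log_f)

def robbins_posterior_mean(x, marginal):
    """Robbins' formula: (x + 1) f(x + 1) / f(x), using only the marginal."""
    return (x + 1) * marginal(x + 1) / marginal(x)

# Hypothetical prior parameters, chosen only for this check.
shape, rate = 2.0, 3.0

def f(x):
    return gamma_poisson_marginal(x, shape, rate)

for x in range(5):
    exact = (x + shape) / (rate + 1.0)    # conjugate Gamma-Poisson posterior mean
    via_robbins = robbins_posterior_mean(x, f)
    print(x, round(exact, 6), round(via_robbins, 6))
```

The function `robbins_posterior_mean` never sees the prior parameters directly; it only evaluates the marginal, which is the point of the formula.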

Empirical Bayes estimator ^robbins-eb-estimator

Replace the unknown marginal $f$ by the empirical frequencies $\hat{f}(x) = \#\{i : x_{i} = x\} / N$ across the $N$ parallel units:
$\hat{E} [θ ∣ x] = (x + 1) \frac{\hat{f}(x + 1)}{\hat{f}(x)} = (x + 1) \frac{\#\{i : x_{i} = x + 1\}}{\#\{i : x_{i} = x\}} .$
No parametric form for the prior is ever assumed — the “prior may exist only as a motivational device.” This is the purest realization of the empirical Bayes principle (see Empirical Bayes - Overview).
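As a minimal sketch of the plug-in estimator, the following tabulates how many units show each count and applies the ratio directly. The function name `robbins_estimates` and the toy `sample` are made up for illustration; they are not from the source.

```python
from collections import Counter

def robbins_estimates(counts):
    """Map each observed count x to (x + 1) * #{x_i = x + 1} / #{x_i = x}."""
    n_at = Counter(counts)  # n_at[x] = number of units with count x (0 if unseen)
    return {x: (x + 1) * n_at[x + 1] / n_at[x] for x in sorted(n_at)}

# Toy data, invented for illustration: six 0s, three 1s, one 2.
sample = [0, 0, 0, 0, 1, 1, 2, 0, 1, 0]
for x, est in robbins_estimates(sample).items():
    print(x, round(est, 3))
# 0 0.5
# 1 0.667
# 2 0.0
```

The last line shows a practical caveat of the raw plug-in: at the largest observed count there are no units at $x + 1$, so the estimate collapses to zero, and more generally the ratio is noisy wherever the counts are sparse.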

The same marginal-density logic powers the parametric (Gaussian) branch: there the marginal $z \sim N(0, (A + 1) I)$ lets one estimate $1 / (A + 1)$ via $(N - 2) / S$, where $S = \|z\|^{2}$, giving the James-Stein Estimator. Robbins’ version is more general because it estimates the whole posterior-mean curve rather than a single hyperparameter.
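For comparison, the Gaussian branch can be sketched by simulation. The code below assumes the two-level model $μ_{i} \sim N(0, A)$, $z_{i} ∣ μ_{i} \sim N(μ_{i}, 1)$ with a hypothetical $A = 3$ and $5000$ units (both values are illustrative choices, not from the source). It estimates $1 / (A + 1)$ by $(N - 2) / S$ and compares the total squared error of the shrunken estimates with that of the raw $z_{i}$.

```python
import random

def james_stein_demo(n_units=5000, prior_var=3.0, seed=0):
    """Simulate mu_i ~ N(0, A), z_i | mu_i ~ N(mu_i, 1) and shrink with (N - 2) / S."""
    rng = random.Random(seed)
    mu = [rng.gauss(0.0, prior_var ** 0.5) for _ in range(n_units)]
    z = [rng.gauss(m, 1.0) for m in mu]

    s = sum(v * v for v in z)             # S = ||z||^2
    shrink_hat = (n_units - 2) / s        # estimate of 1 / (A + 1) from the marginal
    js = [(1.0 - shrink_hat) * v for v in z]

    err_raw = sum((v - m) ** 2 for v, m in zip(z, mu))
    err_js = sum((e - m) ** 2 for e, m in zip(js, mu))
    return shrink_hat, err_raw, err_js

shrink_hat, err_raw, err_js = james_stein_demo()
print("estimated 1/(A+1):", round(shrink_hat, 3), "target:", 0.25)
print("squared error raw vs shrunken:", round(err_raw), round(err_js))
```

Only one number, `shrink_hat`, is learned from the marginal here, whereas the Poisson construction learns a separate value of the posterior mean for every count $x$.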

Examples

Insurance claims (classic Robbins application)

Suppose a large portfolio of auto-insurance policyholders each have an unknown accident rate $θ_{i}$, and last year we observed $x_{i}$ claims for policyholder $i$. To predict next year’s expected claims for someone who filed $x$ claims, Robbins’ formula gives $(x + 1) \hat{f}(x + 1) / \hat{f}(x)$. Because the common factor $1 / N$ cancels in the ratio, raw counts of policyholders can be used in place of frequencies. If $7840$ policyholders filed $0$ claims, $1317$ filed $1$, and $239$ filed $2$, then a customer with $0$ claims has predicted rate $(0 + 1) \cdot 1317 / 7840 \approx 0.168$, and one with $1$ claim has $(1 + 1) \cdot 239 / 1317 \approx 0.363$. The zero-claim customer is pulled up from the raw count of $0$ and the one-claim customer is pulled down from $1$; both adjustments are learned purely from the marginal counts.
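The arithmetic above can be reproduced in a few lines. This sketch uses only the three counts quoted in the example; the names `claim_counts` and `robbins_from_table` are hypothetical.

```python
# Number of policyholders at each claim count, as quoted in the example.
claim_counts = {0: 7840, 1: 1317, 2: 239}

def robbins_from_table(table, x):
    """Robbins estimate (x + 1) * N_{x+1} / N_x from a count table."""
    return (x + 1) * table[x + 1] / table[x]

for x in (0, 1):
    print(x, round(robbins_from_table(claim_counts, x), 3))
# 0 0.168
# 1 0.363
```

An estimate for $x = 2$ would need the number of policyholders with $3$ claims, which the example does not quote, so the loop stops at $x = 1$.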

Relation to Efron's batting data

While Efron’s worked Chapter 1 example (18 baseball players, James-Stein Estimator) uses the Gaussian/parametric branch, it shares Robbins’ driving idea: each player’s prediction is improved by exploiting the marginal distribution of all players’ early-season averages rather than that player’s data alone.

Connections

Empirical Bayes - Overview — the broader EB program and the marginal-density view this specializes.
James-Stein Estimator — the parametric Gaussian counterpart; estimates one hyperparameter from the marginal.
Empirical Bayes Interpretation of Shrinkage — both branches are instances of estimating the prior from data.
Multiple Comparisons - Bayesian Perspective — Robbins’ testing branch (effects piling up at $0$ ) connects to large-scale multiplicity.
