Fisher Randomization Test and the Sharp Null

Summary

The Fisher Randomization Test (FRT), formulated by Fisher (1935) and recast in potential-outcomes terms by Rubin (1980), tests a sharp null hypothesis: one that, together with the observed data, determines all missing potential outcomes. Under such a null, the randomization distribution of any test statistic is known: cycle through all assignments $W$ permitted by the design, recompute the statistic for each, and take as the exact $p$-value the fraction of assignments yielding a statistic at least as extreme as the observed one. The FRT is therefore finite-sample exact regardless of the statistic, the sample size, or the data-generating process for the potential outcomes.

Overview

A null hypothesis is sharp (Rubin 2005) if it determines every entry of the “Science Table” $\{Y_{i}(j) : i = 1, \dots, N;\ j = 1, \dots, J\}$ given the observed data. The canonical example is Fisher’s sharp null of no effect:

$H_{0F} : Y_{i}(1) = Y_{i}(2) = \dots = Y_{i}(J) \text{ for all } i = 1, \dots, N.$

For $J = 2$, in the usual treated/control labeling, this reads $Y_{i}(1) = Y_{i}(0)$ for every unit: the treatment changes no unit’s outcome. Under this null, the unobserved potential outcomes are filled in directly from the observed ones ($Y_{i}^{*}(j) = Y_{i}^{obs}$ for all $j$), so the entire Science Table is known and the statistic’s null distribution can be computed by enumeration.

The deep point (Rubin 1980; Kempthorne & Doerfler 1969): the validity comes from the assignment mechanism, not from any exchangeability or i.i.d. assumption on the outcomes. Randomization alone justifies the test.

Main Content

Sharp null hypothesis ^def-sharp-null

A null hypothesis is sharp if, combined with the observed data $\{W_{i}, Y_{i}^{obs}\}$, it determines all missing potential outcomes in the Science Table. Fisher’s sharp null $H_{0F} : Y_{i}(1) = \dots = Y_{i}(J)$ for all $i$ is the leading case. Under a sharp null, every test statistic $T$ has a known distribution over the randomization of $W$.

Finite-sample exactness of the FRT under the sharp null ^thm-frt-exact

Under a sharp null hypothesis, for any test statistic $T$ and any data-generating process for the potential outcomes, the FRT $p$-value
$p = (N!)^{-1} \sum_{π \in Π_{N}} 1(T_{π} \geq T)$
is finite-sample exact: $P(p \leq α) \leq α$ for every level $α$ (with equality up to the discreteness of the randomization distribution). The $p$-value is a right-tail probability: larger $T$ means greater deviation from the null.
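
The exactness claim can be checked by brute force on a toy design. The sketch below is illustrative only: the outcome values, the group sizes, and the choice of the treated-group sum as statistic are all hypothetical. It fixes the outcomes under Fisher's sharp null, treats each possible assignment in turn as the realized one, computes the corresponding $p$-value by full enumeration, and compares the resulting rejection rate with $α$.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical fixed outcomes; under Fisher's sharp null they do not
# depend on the assignment, so only the treated labels vary.
y = [3, 1, 4, 1, 5, 9, 2, 6]
n, n_treated = len(y), 4

# All equally likely assignments of a completely randomized design.
assignments = list(combinations(range(n), n_treated))


def stat(treated_idx):
    """Illustrative statistic: sum of the outcomes labeled treated."""
    return sum(y[i] for i in treated_idx)


stats = [stat(a) for a in assignments]


def p_value(t_obs):
    """Right-tail randomization p-value by full enumeration."""
    return Fraction(sum(t >= t_obs for t in stats), len(stats))


# One p-value for each assignment that could have been realized.
p_values = [p_value(t) for t in stats]


def rejection_rate(alpha):
    """P(p <= alpha) over the randomization distribution."""
    return Fraction(sum(p <= alpha for p in p_values), len(p_values))


for alpha in (Fraction(1, 100), Fraction(1, 20), Fraction(1, 10), Fraction(1, 4)):
    rate = rejection_rate(alpha)
    print(alpha, rate, rate <= alpha)  # the last entry should be True
```

Exact rational arithmetic is used so that ties in the statistic are counted correctly. The rejection rate can fall strictly below $α$ because the randomization distribution is discrete.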

The FRT procedure (FRT-1 to FRT-4) ^def-frt-steps

Given a sharp null used to impute outcomes and a statistic $T$:

FRT-1. Compute $T$ from the observed $\{W_{i}, Y_{i}^{obs}\}$.

FRT-2. Impute the full potential-outcome vector $Y_{i}^{*}$ for each unit using the (compatible) sharp null. Under Fisher’s $H_{0F}$ this is just $Y_{i}^{*}(j) = Y_{i}^{obs}$ for all $j$.

FRT-3. For each permutation $π \in Π_{N}$, form the re-assigned observed data $Y_{π, i}^{obs} = \sum_{j} W_{π(i)}(j) Y_{i}^{*}(j)$ and recompute $T_{π}$.

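FRT-4. Report $p = (N!)^{-1} \sum_{π \in Π_{N}} 1(T_{π} \geq T)$. When $N!$ is too large to enumerate, draw i.i.d. permutations $π \sim Unif(Π_{N})$ and use the fraction of draws with $T_{π} \geq T$, which approximates $p$ up to Monte Carlo error. The test remains valid, provided the observed assignment is counted among the draws (add one to both the numerator and the denominator).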
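
The four steps can be sketched in Python for the simplest case: a binary treatment coded 0/1, a completely randomized design, Fisher's sharp null, and the difference in means as statistic. The function names, the number of draws, and the seed are illustrative assumptions, not part of the procedure as stated. The `+ 1` in the numerator and denominator counts the observed assignment as one of the draws, a common convention that keeps the Monte Carlo $p$-value valid.

```python
import numpy as np


def diff_in_means(w, y):
    """Difference between treated (w == 1) and control (w == 0) means."""
    return y[w == 1].mean() - y[w == 0].mean()


def frt_pvalue(w, y_obs, stat=diff_in_means, n_draws=10_000, seed=0):
    """Monte Carlo FRT p-value under Fisher's sharp null (binary treatment)."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w)
    y_obs = np.asarray(y_obs, dtype=float)

    # FRT-1: statistic on the observed data.
    t_obs = stat(w, y_obs)

    # FRT-2: under the sharp null of no effect the imputed outcomes equal
    # the observed ones, so y_obs is reused unchanged for every draw.

    # FRT-3: re-assign treatment by permuting the labels, recompute T.
    count = 0
    for _ in range(n_draws):
        w_perm = rng.permutation(w)
        # Small tolerance guards against floating-point ties.
        if stat(w_perm, y_obs) >= t_obs - 1e-12:
            count += 1

    # FRT-4: right-tail fraction, counting the observed assignment itself.
    return (1 + count) / (1 + n_draws)


w = [1, 1, 1, 0, 0, 0]
y = [9, 7, 8, 4, 6, 5]
p_hat = frt_pvalue(w, y)
print(p_hat)  # close to the exact value 1/20 = 0.05, up to Monte Carlo error
```

Permuting the label vector keeps the number of treated units fixed, so every draw is a legitimate assignment of the completely randomized design. A design with a different assignment mechanism would need a different sampler in FRT-3.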

FRT reduces to the classical permutation test under $H_{0 F}$

Under Fisher’s sharp null of no effect, all imputed potential outcomes equal $Y_{i}^{obs}$, so FRT-3 simply permutes the treatment labels $W$ while the outcomes stay fixed ($Y_{π, i}^{obs} = Y_{i}^{obs}$). In this case the FRT and the classical permutation test are numerically identical. In general, the FRT admits a broader class of nulls and designs than the permutation test.

A subtlety exploited later: the FRT can be aimed at a weak null by choosing an artificial but compatible sharp null for the imputation step (treatment-unit additivity, i.e. constant effects), then using a statistic that detects departures from the weak null. The imputed Science Table satisfies $Y_{i}^{*}(W_{i}) = Y_{i}^{obs}$, so it is consistent with the data. See Sharp vs Weak Null Hypotheses and Studentized Randomization Tests.
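
As a minimal sketch of the imputation step under a constant-effect sharp null, assume a binary treatment coded 0/1 and a hypothesized additive effect `tau0`. The coding, the function names, and the numbers are illustrative assumptions. The code fills in both columns of the Science Table and confirms that the column selected by the realized assignment reproduces the observed outcomes.

```python
import numpy as np


def impute_constant_effect(w, y_obs, tau0):
    """Impute (Y*(0), Y*(1)) under the sharp null Y_i(1) - Y_i(0) = tau0."""
    w = np.asarray(w)
    y_obs = np.asarray(y_obs, dtype=float)
    y0 = y_obs - tau0 * w        # treated units: subtract the effect
    y1 = y_obs + tau0 * (1 - w)  # control units: add the effect
    return y0, y1


def reassigned_outcomes(w_new, y0, y1):
    """Outcomes that would be observed under a new assignment w_new."""
    w_new = np.asarray(w_new)
    return np.where(w_new == 1, y1, y0)


w = np.array([1, 1, 1, 0, 0, 0])
y_obs = np.array([9.0, 7.0, 8.0, 4.0, 6.0, 5.0])
y0, y1 = impute_constant_effect(w, y_obs, tau0=3.0)

# Compatibility: the realized assignment recovers the observed data.
print(np.array_equal(reassigned_outcomes(w, y0, y1), y_obs))  # True
```

With `tau0 = 0` this reduces to Fisher's sharp null of no effect, where both imputed columns equal the observed outcomes.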

Examples

Exact permutation $p$-value, by hand. Take $N = 6$, with $N_{1} = 3$ treated and $N_{2} = 3$ control. Observed outcomes: treated $= (9, 7, 8)$, control $= (4, 6, 5)$. Observed statistic: $T = \hat{τ} = \bar{Y}_{treated} - \bar{Y}_{control} = 8 - 5 = 3$.

Under $H_{0F}$ the six numbers $\{9, 7, 8, 4, 6, 5\}$ are fixed; only which three are labeled treated varies. There are $\binom{6}{3} = 20$ equally likely label assignments. For each, compute $\hat{τ}$. The observed split $(9, 7, 8)$ vs $(4, 6, 5)$ gives the largest possible treated mean, so $\hat{τ} = 3$ is the maximum over all 20 assignments. Exactly one of the 20 (the observed one) attains $T_{π} \geq 3$, so the one-sided exact $p$-value is $1/20 = 0.05$. (A two-sided test on $|\hat{τ}|$ would count both the observed split and its mirror image, giving $2/20 = 0.10$.) No distributional assumption was used, only the 20 equally likely randomizations.
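
The hand calculation can be reproduced by enumerating all 20 assignments. The short script below is a sketch using only the standard library, with variable names chosen for illustration. It uses exact fractions so that the comparison $T_{π} \geq T$ is not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations

outcomes = [9, 7, 8, 4, 6, 5]  # fixed under the sharp null of no effect
observed_treated = (0, 1, 2)   # positions of the units actually treated


def diff_in_means(treated_idx):
    """Treated mean minus control mean for a given set of treated units."""
    treated = [outcomes[i] for i in treated_idx]
    control = [outcomes[i] for i in range(len(outcomes)) if i not in treated_idx]
    return Fraction(sum(treated), len(treated)) - Fraction(sum(control), len(control))


t_obs = diff_in_means(observed_treated)

# Null distribution: the statistic under each of the C(6, 3) assignments.
null_dist = [diff_in_means(idx) for idx in combinations(range(6), 3)]

p_one_sided = Fraction(sum(t >= t_obs for t in null_dist), len(null_dist))
p_two_sided = Fraction(sum(abs(t) >= abs(t_obs) for t in null_dist), len(null_dist))

print(t_obs, len(null_dist), p_one_sided, p_two_sided)  # prints: 3 20 1/20 1/10
```

Enumerating the 20 subsets gives the same $p$-value as enumerating all $6! = 720$ permutations, since each subset corresponds to the same number ($3! \cdot 3! = 36$) of permutations.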

Connections

Randomization Inference - Overview — places the FRT within design-based inference.
Sharp vs Weak Null Hypotheses — contrasts Fisher’s sharp null with Neyman’s weak null and explains why the FRT must be modified for the latter.
Studentized Randomization Tests — uses the FRT machinery with an artificial sharp null plus a studentized statistic to test weak nulls.
Permutation Tests and Exact Inference — the FRT specializes to the permutation test under the sharp null of no effect.
