Fisher Randomization Test and the Sharp Null
Summary
The Fisher Randomization Test (FRT), formulated by Fisher (1935) and recast in potential-outcomes terms by Rubin (1980), tests a sharp null hypothesis: one that, together with the observed data, recovers all missing potential outcomes. Under such a null any test statistic has a known randomization distribution: cycle through all possible assignments $z$, recompute the statistic for each, and the exact $p$-value is the fraction of re-assignments yielding a statistic at least as extreme as the observed one. This makes the FRT finite-sample exact regardless of the statistic, sample size, or data-generating process.
Overview
A null hypothesis is sharp (Rubin 2005) if it determines every entry of the “Science Table” given the observed data. The canonical example is Fisher’s sharp null of no effect:
$$H_{0F}: Y_i(1) = Y_i(0) \quad \text{for all } i = 1, \dots, N.$$
In terms of unit-level effects this is $\tau_i = Y_i(1) - Y_i(0) = 0$ for every unit: the treatment changes no unit’s outcome. Once we assume this, the unobserved potential outcomes are filled in directly from the observed ones ($Y_i(1) = Y_i(0) = Y_i^{\text{obs}}$ for all $i$), so the entire Science Table is known and the statistic’s null distribution is computable by enumeration.
The deep point (Rubin 1980; Kempthorne & Doerfler 1969): the validity comes from the assignment mechanism, not from any exchangeability or i.i.d. assumption on the outcomes. Randomization alone justifies the test.
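The imputation step described above can be sketched in a few lines of Python. This is only an illustration: the function name `impute_science_table_fisher` and the four-unit data set are hypothetical and do not come from the references.

```python
import numpy as np

def impute_science_table_fisher(z, y_obs):
    """Fill in the Science Table under Fisher's sharp null Y_i(1) = Y_i(0).

    Returns an (N, 2) array whose columns are the imputed Y(0) and Y(1).
    """
    z = np.asarray(z)
    y_obs = np.asarray(y_obs, dtype=float)
    if z.shape != y_obs.shape:
        raise ValueError("z and y_obs must have the same shape")
    # Under no effect, both potential outcomes equal the observed outcome,
    # whichever arm the unit was actually assigned to.
    return np.column_stack([y_obs, y_obs])

# Hypothetical data, for illustration only.
z = np.array([1, 0, 1, 0])
y_obs = np.array([3.0, 1.0, 4.0, 2.0])
table = impute_science_table_fisher(z, y_obs)
print(table)
```

Note that the assignment vector plays no role in the imputed values here. That is specific to the null of no effect; other sharp nulls need to know which column was observed.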
Main Content
Sharp null hypothesis ^def-sharp-null
A null hypothesis $H_0$ is sharp if, combined with the observed data $(Z, Y^{\text{obs}})$, it determines all missing potential outcomes in the Science Table. Fisher’s sharp null $H_{0F}: Y_i(1) = Y_i(0)$ for all $i$ is the leading case. Under a sharp null, every test statistic $T$ has a known distribution over the randomization of $Z$.
Finite-sample exactness of the FRT under the sharp null ^thm-frt-exact
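Under a sharp null hypothesis, for any test statistic $T$ and any data-generating process for the potential outcomes, the FRT $p$-value
$$p_{\text{FRT}} = \frac{1}{|\Pi|} \sum_{\pi \in \Pi} \mathbf{1}\{ T(Z_\pi, Y_\pi) \ge T(Z, Y^{\text{obs}}) \},$$
where $\Pi$ is the set of re-assignments allowed by the design and $(Z_\pi, Y_\pi)$ is the re-assigned data read off the imputed Science Table, is finite-sample exact: $\Pr(p_{\text{FRT}} \le \alpha) \le \alpha$ for every level $\alpha \in (0, 1)$ (with equality up to the discreteness of the randomization distribution). The $p$-value is a right-tail probability: larger $T$ means greater deviation from the null.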
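The exactness claim can be checked by brute force on a small design. The sketch below assumes a completely randomized design with 3 of 6 units treated, a difference-in-means statistic, and made-up outcomes that satisfy Fisher's null by construction. All names and numbers are illustrative. It treats each of the 20 assignments in turn as the realized one, computes the FRT $p$-value, and reports how often that $p$-value falls at or below $\alpha$.

```python
from itertools import combinations
import numpy as np

def diff_in_means(z, y):
    return y[z == 1].mean() - y[z == 0].mean()

def all_assignments(n, n_treated):
    """Yield every 0/1 assignment vector with exactly n_treated ones."""
    for idx in combinations(range(n), n_treated):
        z = np.zeros(n, dtype=int)
        z[list(idx)] = 1
        yield z

def frt_pvalue(z_obs, y, stat):
    """Exact right-tail FRT p-value under Fisher's null (outcomes stay fixed)."""
    t_obs = stat(z_obs, y)
    draws = [stat(z, y) for z in all_assignments(len(y), int(z_obs.sum()))]
    # The small tolerance treats floating-point near-ties as ties.
    return np.mean([t >= t_obs - 1e-12 for t in draws])

# Hypothetical fixed outcomes; the null of no effect holds by construction.
y = np.array([0.3, 1.7, 2.2, 0.9, 3.1, 1.2])
pvals = np.array([frt_pvalue(z, y, diff_in_means) for z in all_assignments(6, 3)])

for alpha in (0.05, 0.10, 0.25, 0.50):
    rejection_rate = np.mean(pvals <= alpha + 1e-12)
    print(f"alpha={alpha:.2f}  rejection rate={rejection_rate:.2f}")
```

Because every assignment is equally likely under this design, the printed rejection rate is the exact null probability of rejecting at each level, not a simulation estimate.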
The FRT procedure (FRT-1 to FRT-4) ^def-frt-steps
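Given a sharp null $H_0$ used to impute outcomes and a statistic $T$:
- FRT-1. Compute $T^{\text{obs}} = T(Z, Y^{\text{obs}})$ from the observed $(Z, Y^{\text{obs}})$.
- FRT-2. Impute the full potential-outcome vector $(Y_i(0), Y_i(1))$ for each unit using the (compatible) sharp null. Under Fisher’s sharp null $H_{0F}$ this is just $Y_i(0) = Y_i(1) = Y_i^{\text{obs}}$ for all $i$.
- FRT-3. For each permutation $\pi$ of the units, form the re-assigned observed data $(Z_\pi, Y_\pi)$, where $Z_\pi$ is the permuted assignment vector and $Y_\pi$ collects the imputed outcomes $Y_i(Z_{\pi,i})$, and recompute $T_\pi = T(Z_\pi, Y_\pi)$.
- FRT-4. Report $p_{\text{FRT}}$, the fraction of permutations with $T_\pi \ge T^{\text{obs}}$. When the number of permutations is too large to enumerate, draw $R$ i.i.d. permutations to approximate $p_{\text{FRT}}$ up to Monte Carlo error; the test remains valid.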
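The four steps can be sketched as a Monte Carlo routine. This is a minimal illustration, assuming a completely randomized design and Fisher's null for the imputation. The function names, the data, and the choice to report `(1 + count) / (1 + n_draws)` are assumptions of this sketch; the last is a common convention that counts the observed assignment as one of the draws so that the reported $p$-value is never zero.

```python
import numpy as np

def diff_in_means(z, y):
    return y[z == 1].mean() - y[z == 0].mean()

def frt_monte_carlo(z_obs, y_obs, stat, n_draws=10_000, seed=0):
    """Monte Carlo FRT under Fisher's sharp null, completely randomized design."""
    rng = np.random.default_rng(seed)
    z_obs = np.asarray(z_obs)
    y_obs = np.asarray(y_obs, dtype=float)

    # FRT-1: observed statistic.
    t_obs = stat(z_obs, y_obs)

    # FRT-2: impute both potential outcomes (no effect: both equal y_obs).
    y0, y1 = y_obs.copy(), y_obs.copy()

    # FRT-3: re-randomize by permuting the assignment vector, then read
    # the re-assigned outcomes off the imputed table.
    count = 0
    for _ in range(n_draws):
        z_new = rng.permutation(z_obs)
        y_new = np.where(z_new == 1, y1, y0)
        if stat(z_new, y_new) >= t_obs:
            count += 1

    # FRT-4: right-tail proportion, counting the observed assignment too.
    return (1 + count) / (1 + n_draws)

# Hypothetical data, for illustration only.
z = np.array([1, 1, 1, 0, 0, 0])
y_obs = np.array([7.0, 8.0, 9.0, 1.0, 2.0, 3.0])
p = frt_monte_carlo(z, y_obs, diff_in_means, n_draws=5000, seed=1)
print(p)
```

Keeping `y0` and `y1` as separate arrays is redundant under the null of no effect, but it leaves room to swap in a different sharp null at step FRT-2 without touching the rest of the loop.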
FRT reduces to the classical permutation test under Fisher’s sharp null
Under Fisher’s sharp null of no effect, all imputed potential outcomes equal $Y_i^{\text{obs}}$, so FRT-3 simply permutes the treatment labels while the outcomes stay fixed ($Y_\pi = Y^{\text{obs}}$ for every $\pi$). In this case the FRT and the classical permutation test are numerically identical. In general, the FRT admits a broader class of nulls and designs than the permutation test.
A subtlety exploited later: the FRT can be aimed at a weak null by choosing an artificial but compatible sharp null for the imputation step (treatment-unit additivity, i.e. constant effects), then using a statistic that detects departures from the weak null. The imputed Science Table satisfies $Y_i(Z_i) = Y_i^{\text{obs}}$ for every unit, so it is consistent with the data. See Sharp vs Weak Null Hypotheses and Studentized Randomization Tests.
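As a rough illustration of imputing under an artificial constant-effect null, the sketch below fills in the Science Table assuming $Y_i(1) - Y_i(0) = \tau$ for all units and then checks that the result is compatible with the observed data. The function name `impute_constant_effect`, the value of `tau`, and the data are hypothetical; how $\tau$ should be chosen in practice is not addressed here.

```python
import numpy as np

def impute_constant_effect(z, y_obs, tau):
    """Impute (Y(0), Y(1)) under the sharp null Y_i(1) - Y_i(0) = tau for all i."""
    z = np.asarray(z)
    y_obs = np.asarray(y_obs, dtype=float)
    # Treated units: Y(1) is observed, Y(0) is observed minus tau.
    # Control units: Y(0) is observed, Y(1) is observed plus tau.
    y0 = np.where(z == 1, y_obs - tau, y_obs)
    y1 = np.where(z == 1, y_obs, y_obs + tau)
    return y0, y1

# Hypothetical data and effect size, for illustration only.
z = np.array([1, 1, 0, 0, 1, 0])
y_obs = np.array([4.0, 6.5, 2.0, 3.5, 5.0, 1.0])
y0, y1 = impute_constant_effect(z, y_obs, tau=1.5)

# Compatibility: the imputed table reproduces each unit's observed outcome
# at its actual assignment.
compatible = np.allclose(np.where(z == 1, y1, y0), y_obs)
# Additivity: every unit has the same imputed effect.
additive = np.allclose(y1 - y0, 1.5)
print(compatible, additive)
```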
Examples
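Exact permutation $p$-value, by hand. Take $N = 6$ units, $N_1 = 3$ treated and $N_0 = 3$ control, with every treated outcome strictly larger than every control outcome. The observed statistic is the difference in means, $T^{\text{obs}} = \bar{Y}_1^{\text{obs}} - \bar{Y}_0^{\text{obs}}$.
Under Fisher’s sharp null $H_{0F}$ the six numbers are fixed; only which three are labeled treated varies. There are $\binom{6}{3} = 20$ equally likely label assignments. For each, compute $T$. The observed split puts the three largest outcomes in the treated group and so gives the largest possible treated mean, which makes $T^{\text{obs}}$ the maximum over all 20 assignments. Exactly one of the 20 (the observed one) attains $T \ge T^{\text{obs}}$, so the one-sided exact $p$-value is $1/20 = 0.05$. (A two-sided test on $|T|$ would count both the observed split and its mirror image, giving $2/20 = 0.10$.) No distributional assumption was used, only the 20 equally likely randomizations.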
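The hand calculation can be reproduced by enumeration. The outcome values below are hypothetical stand-ins chosen so that every treated outcome exceeds every control outcome; they are not the numbers from the example itself, but any such configuration gives the same two $p$-values.

```python
from itertools import combinations

# Hypothetical outcomes (illustrative only): treated values exceed all controls.
treated = [7.0, 8.0, 9.0]
control = [1.0, 2.0, 3.0]
y = treated + control
n, n1 = len(y), len(treated)

def diff_in_means(treated_idx):
    t = [y[i] for i in treated_idx]
    c = [y[i] for i in range(n) if i not in treated_idx]
    return sum(t) / len(t) - sum(c) / len(c)

# Observed assignment: the first three units are treated.
t_obs = diff_in_means((0, 1, 2))

# Enumerate all C(6, 3) = 20 ways to label three units as treated.
stats = [diff_in_means(idx) for idx in combinations(range(n), n1)]

tol = 1e-12  # guards against floating-point near-ties
p_one_sided = sum(t >= t_obs - tol for t in stats) / len(stats)
p_two_sided = sum(abs(t) >= abs(t_obs) - tol for t in stats) / len(stats)
print(len(stats), p_one_sided, p_two_sided)  # 20 0.05 0.1
```

With only 20 assignments, 0.05 is the smallest one-sided $p$-value this design can produce, however large the treated outcomes are.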
Connections
- Randomization Inference - Overview — places the FRT within design-based inference.
- Sharp vs Weak Null Hypotheses — contrasts Fisher’s sharp null with Neyman’s weak null and explains why the FRT must be modified for the latter.
- Studentized Randomization Tests — uses the FRT machinery with an artificial sharp null plus a studentized statistic to test weak nulls.
- Permutation Tests and Exact Inference — the FRT specializes to the permutation test under the sharp null of no effect.
See Also
- Potential Outcomes Framework — the Science Table and the imputation logic.
- The Experimental Ideal — Fisher’s randomization principle.
- Power Analysis and Sample Size — power of exact tests depends on the number of distinct randomizations.