Horseshoe and Regularized Horseshoe Priors
Summary
Piironen & Vehtari (2017) fix two long-standing problems with the horseshoe prior for sparse Bayesian regression: (1) there was no principled way to set the global shrinkage scale $\tau$, and (2) the horseshoe leaves large coefficients completely unregularized, which is harmful under weak likelihoods (e.g. separable logistic regression). Their solutions are the effective number of nonzeros $m_{\text{eff}}$, which turns prior beliefs about sparsity ($p_0$) into a concrete prior for $\tau$, and the regularized (Finnish) horseshoe, which adds a slab of finite scale $c$ (a Student-$t$ slab once $c^2$ is given an inverse-gamma prior) to softly cap the largest coefficients.
Overview
This is the hub note for an Obsidian cluster on global-local shrinkage priors. The paper is an extension of Piironen & Vehtari (2017a) and targets regression/classification with many predictors of which only a few are expected to be nonzero.
The four companion notes break the contribution into pieces:
- Global-Local Shrinkage Priors — the scale-mixture-of-Gaussians framework, the shrinkage factor $\kappa_j$, and where ridge/lasso/horseshoe sit in $\kappa$-space.
- The Horseshoe Prior — the Carvalho–Polson–Scott horseshoe with half-Cauchy local scales and the characteristic “horseshoe” density on $\kappa_j$.
- Choosing the Global Scale and Effective Nonzeros — $m_{\text{eff}}$ and the prior-guess formula.
- Regularized Horseshoe (Finnish Horseshoe) — the slab scale $c$, the regularized local scale $\tilde{\lambda}_j$, and why it behaves like a continuous spike-and-slab.
The two main theoretical advances are summarized below; details live in the companion notes.
Main Content
The model is the standard linear Gaussian regression with a horseshoe prior on the coefficients.
Horseshoe prior for linear regression
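For $y_i = \beta^\top x_i + \varepsilon_i$, $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$, $i = 1, \dots, n$, the horseshoe prior is the global-local scale mixture

$$
\beta_j \mid \lambda_j, \tau \sim \mathcal{N}(0, \tau^2 \lambda_j^2), \qquad \lambda_j \sim \mathrm{C}^{+}(0, 1), \qquad j = 1, \dots, D,
$$

where $\tau$ is the global scale (pulls all coefficients toward 0) and the half-Cauchy local scales $\lambda_j$ have heavy tails that let some $\beta_j$ escape the shrinkage. An intercept $\beta_0$ gets a relatively flat prior (no reason to shrink it).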
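To make the scale mixture concrete, here is a minimal sketch that draws coefficients from the horseshoe prior with a fixed global scale. The function name `sample_horseshoe_prior` and the values `D=1000`, `tau=0.1` are illustrative assumptions, not taken from the paper; only the Python standard library is used.

```python
import math
import random

def sample_horseshoe_prior(D, tau, rng):
    """Draw (beta_1, ..., beta_D) from the horseshoe prior with fixed tau."""
    betas = []
    for _ in range(D):
        # lambda_j ~ half-Cauchy(0, 1) via the inverse CDF tan(pi * u / 2)
        lam = math.tan(math.pi * rng.random() / 2.0)
        # beta_j | lambda_j, tau ~ N(0, tau^2 * lambda_j^2)
        betas.append(rng.gauss(0.0, tau * lam))
    return betas

rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = sample_horseshoe_prior(D=1000, tau=0.1, rng=rng)  # hypothetical sizes
abs_sorted = sorted(abs(b) for b in draws)
median_abs = abs_sorted[len(abs_sorted) // 2]
print("median |beta|:", median_abs)
print("max |beta|:   ", abs_sorted[-1])
```

Most draws sit near zero while a few are very large, which is the heavy-tailed behaviour that lets individual coefficients escape the global shrinkage.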
Shrinkage factor
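Assuming uncorrelated predictors with zero mean and variances $\operatorname{Var}(x_j) = s_j^2$ (so $X^\top X \approx n\,\operatorname{diag}(s_1^2, \dots, s_D^2)$), the posterior mean given the hyperparameters satisfies $\bar{\beta}_j = (1 - \kappa_j)\,\hat{\beta}_j$, where $\hat{\beta}$ is the MLE and

$$
\kappa_j = \frac{1}{1 + n \sigma^{-2} \tau^2 s_j^2 \lambda_j^2}
$$

is the shrinkage factor: $\kappa_j = 1$ is complete shrinkage to zero, $\kappa_j = 0$ is no shrinkage. As $\tau \to 0$, $\kappa_j \to 1$; as $\tau \to \infty$, $\kappa_j \to 0$.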
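The two limits of the shrinkage factor can be checked numerically. This is a small sketch assuming standardized predictors ($s_j = 1$) and unit noise ($\sigma = 1$) by default; the helper name `shrinkage_factor` and the chosen values of `tau` and `n` are hypothetical.

```python
def shrinkage_factor(lam, tau, n, sigma=1.0, s=1.0):
    """kappa_j = 1 / (1 + n * sigma^-2 * tau^2 * s_j^2 * lambda_j^2)."""
    return 1.0 / (1.0 + n * tau**2 * s**2 * lam**2 / sigma**2)

n = 100  # hypothetical sample size
for tau in (1e-4, 0.1, 1e4):
    kappa = shrinkage_factor(lam=1.0, tau=tau, n=n)
    # The conditional posterior mean is (1 - kappa) times the MLE.
    print(f"tau={tau:g}  kappa={kappa:.6f}  weight on MLE={1.0 - kappa:.6f}")
```

A tiny `tau` pushes `kappa` toward 1 (everything shrunk to zero), while a huge `tau` pushes it toward 0 (the MLE is left untouched).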
Regularized (Finnish) horseshoe
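Replace the local scale by a slab-truncated version:

$$
\beta_j \mid \lambda_j, \tau, c \sim \mathcal{N}(0, \tau^2 \tilde{\lambda}_j^2), \qquad \tilde{\lambda}_j^2 = \frac{c^2 \lambda_j^2}{c^2 + \tau^2 \lambda_j^2}.
$$

When $\tau^2 \lambda_j^2 \ll c^2$ (small coefficient), $\tilde{\lambda}_j^2 \to \lambda_j^2$ and we recover the original horseshoe; when $\tau^2 \lambda_j^2 \gg c^2$ (large coefficient), $\tilde{\lambda}_j^2 \to c^2 / \tau^2$, so the prior approaches $\mathcal{N}(0, c^2)$, a Gaussian slab of width $c$ that “soft-truncates” the heavy Cauchy tails. Letting $c \to \infty$ recovers the unregularized horseshoe.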
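The two regimes of the regularized local scale are easy to verify with a few numbers. The sketch below assumes the standard form $\tilde{\lambda}_j^2 = c^2 \lambda_j^2 / (c^2 + \tau^2 \lambda_j^2)$; the helper name `regularized_lambda` and the values of `tau` and `c` are illustrative only.

```python
import math

def regularized_lambda(lam, tau, c):
    """lambda_tilde_j = sqrt(c^2 lambda_j^2 / (c^2 + tau^2 lambda_j^2))."""
    return math.sqrt(c**2 * lam**2 / (c**2 + tau**2 * lam**2))

tau, c = 0.1, 2.0  # hypothetical global scale and slab width
for lam in (0.5, 1e6):
    lam_t = regularized_lambda(lam, tau, c)
    # Prior standard deviation of beta_j without and with the slab.
    print(f"lambda={lam:g}  horseshoe sd={tau * lam:g}  regularized sd={tau * lam_t:g}")
```

For the small local scale the two standard deviations nearly coincide; for the huge one the regularized standard deviation saturates at approximately `c` instead of growing without bound.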
Prior guess for the global scale
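If $p_0$ is the prior guess for the number of relevant predictors out of $D$, set the global scale so that the prior mean of $m_{\text{eff}} = \sum_{j=1}^{D} (1 - \kappa_j)$ equals $p_0$:

$$
\tau_0 = \frac{p_0}{D - p_0} \, \frac{\sigma}{\sqrt{n}}.
$$

$\tau$ must scale as $\sigma / \sqrt{n}$ to keep prior beliefs about $m_{\text{eff}}$ consistent, which is exactly why the default $\tau \sim \mathrm{C}^{+}(0, 1)$ is a dubious choice (it ignores $\sigma$ and $n$ and puts far too much mass on large $\tau$).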
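The prior-guess formula follows in two steps. As a sketch, assume standardized predictors ($s_j = 1$) and write $a = \tau \sigma^{-1} \sqrt{n}$ (a shorthand introduced here for convenience). With $\kappa_j = 1 / (1 + a^2 \lambda_j^2)$ and $\lambda_j \sim \mathrm{C}^{+}(0, 1)$, integrating over $\lambda_j$ gives $\mathbb{E}[\kappa_j \mid \tau, \sigma] = 1 / (1 + a)$, so

$$
\begin{aligned}
\mathbb{E}[m_{\text{eff}} \mid \tau, \sigma] &= \sum_{j=1}^{D} \bigl(1 - \mathbb{E}[\kappa_j \mid \tau, \sigma]\bigr) = \frac{a}{1 + a}\, D, \\
\frac{a}{1 + a}\, D = p_0 \;&\Longrightarrow\; a = \frac{p_0}{D - p_0} \;\Longrightarrow\; \tau_0 = \frac{p_0}{D - p_0}\,\frac{\sigma}{\sqrt{n}}.
\end{aligned}
$$

The factor $\sigma / \sqrt{n}$ enters only through $a$, which is why a fixed-scale prior on $\tau$ implies different beliefs about $m_{\text{eff}}$ for every data set size.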
The paper also shows the regularized horseshoe is the continuous counterpart of the spike-and-slab prior with a finite slab width, whereas the original horseshoe corresponds to spike-and-slab with an infinitely wide slab. See Spike-and-Slab Prior for Covariate Selection.
Examples
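- Setting $\tau_0$: With $D$ predictors, $n$ observations, noise scale $\sigma$, and a prior guess of $p_0$ relevant variables: $\tau_0 = \frac{p_0}{D - p_0} \frac{\sigma}{\sqrt{n}}$. When $p_0 \ll D$ this is far below the scale 1 used by the naive default.
- Logistic regression / separation: When data are separable the likelihood is flat, the MLE diverges, and the Cauchy-tailed horseshoe lets the largest $\beta_j$ drift arbitrarily far from zero, so their posterior means can fail to exist. The slab scale $c$ (e.g. via $c^2 \sim \text{Inv-Gamma}(\nu/2, \nu s^2/2)$, giving a Student-$t_\nu(0, s^2)$ slab) caps this. For binary classification a workable plug-in for $\sigma$ in the global-scale formula is the pseudo-variance $\tilde{\sigma}^2 = \mu^{-1} (1 - \mu)^{-1}$, e.g. $\tilde{\sigma} = 2$ at $\mu = 0.5$.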
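Both examples reduce to a few lines of arithmetic. The sketch below uses hypothetical problem sizes (`D = 1000`, `n = 100`, `p0 = 5`) chosen purely for illustration; they are not figures from the paper, and the helper names `tau0` and `logistic_pseudo_sigma` are likewise assumptions.

```python
import math

def tau0(p0, D, n, sigma=1.0):
    """Prior guess tau_0 = p0 / (D - p0) * sigma / sqrt(n)."""
    if not 0 < p0 < D:
        raise ValueError("need 0 < p0 < D")
    return p0 / (D - p0) * sigma / math.sqrt(n)

def logistic_pseudo_sigma(mu=0.5):
    """Pseudo noise scale sqrt(1 / (mu * (1 - mu))) for logistic regression."""
    return 1.0 / math.sqrt(mu * (1.0 - mu))

# Hypothetical sizes, for illustration only.
D, n, p0 = 1000, 100, 5
tau0_gauss = tau0(p0, D, n, sigma=1.0)
tau0_logit = tau0(p0, D, n, sigma=logistic_pseudo_sigma(0.5))
print("tau0 (Gaussian, sigma=1):    ", tau0_gauss)
print("tau0 (logistic, pseudo sigma):", tau0_logit)
```

With these sizes both values come out several orders of magnitude below 1, which illustrates how much mass a unit-scale default places on implausibly dense models.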
Connections
- Generalizes the horseshoe (The Horseshoe Prior) within the global-local family (Global-Local Shrinkage Priors).
- The $\tau$ problem and its solution are in Choosing the Global Scale and Effective Nonzeros.
- The slab regularization is in Regularized Horseshoe (Finnish Horseshoe).
- Builds on Bayesian Linear Regression; competes with Spike-and-Slab Prior for Covariate Selection.
- Connected to model-size control and the bias–variance view in Overfitting and Information Criteria.
See Also
- Global-Local Shrinkage Priors — the framework these priors belong to
- The Horseshoe Prior — the base prior being fixed
- Choosing the Global Scale and Effective Nonzeros — how to set $\tau$
- Regularized Horseshoe (Finnish Horseshoe) — the slab fix
- Spike-and-Slab Prior for Covariate Selection — the discrete-mixture counterpart
- Bayesian Linear Regression — the underlying regression model