Bayesian Moderation Analysis
Summary
Moderation analysis tests whether the relationship between a predictor $x$ and an outcome $y$ changes as a function of a third variable $m$ (the moderator). It is implemented as multiple regression with an interaction term. The Bayesian approach yields a posterior over the moderation coefficient $\beta_2$, enabling probabilistic statements about the moderation effect.
What is Moderation?
A moderator $m$ changes the slope of $y$ on $x$, not the mean level of $y$ (that would be a main effect). Schematically:
m ──→ [x → y] (m moderates the x-y relationship)
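The difference between a main effect and moderation can be checked numerically. The sketch below uses two hypothetical mean functions with made-up coefficients (they are not estimates from any dataset): in one, $m$ only shifts the level of $y$; in the other, $m$ also enters through an $x \cdot m$ product.

```python
import numpy as np

def slope_of_y_on_x(mean_fn, m, x_lo=0.0, x_hi=1.0):
    """Change in the mean of y per unit of x, holding the moderator fixed at m."""
    return (mean_fn(x_hi, m) - mean_fn(x_lo, m)) / (x_hi - x_lo)

# Hypothetical mean functions; every coefficient here is illustrative only.
def main_effect_only(x, m):
    return 1.0 + 2.0 * x + 0.5 * m  # m shifts the level of y, slope stays put

def with_moderation(x, m):
    return 1.0 + 2.0 * x + 0.5 * m - 0.4 * x * m  # m also changes the slope

m_values = np.array([0.0, 1.0, 2.0])
slopes_main = np.array([slope_of_y_on_x(main_effect_only, m) for m in m_values])
slopes_mod = np.array([slope_of_y_on_x(with_moderation, m) for m in m_values])

print("main effect only:", slopes_main)  # identical slope at every m
print("with moderation: ", slopes_mod)   # slope shrinks as m grows
```

With a main effect alone the lines for different $m$ are parallel; only the product term makes them fan out.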
Moderation vs. Mediation
- Moderation: $m$ changes the strength of the $x \to y$ relationship. No causal path implied.
- Mediation: $x$ affects $y$ (partly) through $m$. Requires causal DAG reasoning.
See the PyMC mediation analysis example for contrast.
The Model
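The outcome is modelled as normally distributed around a linear predictor that includes the product $x \cdot m$:

$$y \sim \mathrm{Normal}(\beta_0 + \beta_1 x + \beta_2 x m + \beta_3 m,\ \sigma)$$

Interpretation of parameters: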
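| Parameter | Interpretation |
|---|---|
| $\beta_0$ | Intercept (value of $y$ when $x = 0$, $m = 0$) |
| $\beta_1$ | Effect of $x$ on $y$ when $m = 0$ |
| $\beta_2$ | Moderation coefficient: how much the slope of $y$ on $x$ changes per unit of $m$ |
| $\beta_3$ | Effect of $m$ on $y$ when $x = 0$ (the main-effect term for $m$) |
| $\sigma$ | Residual SD |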
The total effect of $x$ on $y$ at a given level of the moderator $m$ is:

$$\beta_1 + \beta_2 m$$
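One way to see where this expression comes from is to regroup the linear predictor used in the PyMC implementation (`mu = β0 + β1*x + β2*x*m + β3*m`) by powers of $x$. The symbol $\mu$ for the linear predictor follows the variable name in that code.

$$
\begin{aligned}
\mu &= \beta_0 + \beta_1 x + \beta_2 x m + \beta_3 m \\
    &= (\beta_0 + \beta_3 m) + (\beta_1 + \beta_2 m)\, x
\end{aligned}
$$

At a fixed $m$ the model is a straight line in $x$ with intercept $\beta_0 + \beta_3 m$ and slope $\beta_1 + \beta_2 m$. Regrouping by $m$ instead gives $\beta_3 + \beta_2 x$ as the effect of $m$ at a given $x$, so the interaction is symmetric: which variable is called the moderator is a modelling choice, not something the equation decides.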
PyMC Implementation
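```python
import pymc as pm

# training_hours, age and muscle_pct are the observed 1-D arrays (equal length)
with pm.Model() as model:
    # pm.ConstantData is deprecated in recent PyMC releases in favour of pm.Data
    x = pm.Data("x", training_hours)
    m = pm.Data("m", age)
    # Priors on the regression coefficients and residual scale
    β0 = pm.Normal("β0", mu=0, sigma=10)
    β1 = pm.Normal("β1", mu=0, sigma=10)
    β2 = pm.Normal("β2", mu=0, sigma=10)  # moderation coefficient
    β3 = pm.Normal("β3", mu=0, sigma=10)
    σ = pm.HalfCauchy("σ", 1)
    # Linear predictor with the x*m interaction term
    mu = β0 + β1 * x + β2 * x * m + β3 * m
    pm.Normal("y", mu=mu, sigma=σ, observed=muscle_pct)
    idata = pm.sample()  # draw posterior samples with default settings
```
Visualisation: Spotlight Graph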
The spotlight graph plots $y$ as a function of $x$ at selected percentiles of the moderator $m$. This directly visualises how the slope changes:
- If $\beta_2 < 0$: the slope decreases as $m$ increases (training less effective with age)
- If $\beta_2 = 0$: no moderation; the $x$-$y$ relationship is constant across $m$
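The lines of a spotlight graph can be computed directly from the linear predictor. The following is a minimal sketch using point estimates rather than the full posterior; the function name `spotlight_lines`, the choice of percentiles, and all input values are assumptions made for illustration, not part of the original example.

```python
import numpy as np

def spotlight_lines(x, m, coefs, percentiles=(2.5, 25, 50, 75, 97.5)):
    """Return an x grid and {percentile: (moderator value, predicted mean of y on the grid)}."""
    b0, b1, b2, b3 = coefs
    x_grid = np.linspace(np.min(x), np.max(x), 50)
    lines = {}
    for p in percentiles:
        m_p = np.percentile(m, p)  # moderator value at this percentile
        lines[p] = (m_p, b0 + b1 * x_grid + b2 * x_grid * m_p + b3 * m_p)
    return x_grid, lines

# Placeholder inputs, NOT real data or fitted estimates
x_obs = np.linspace(0, 10, 11)   # stand-in for the predictor
m_obs = np.linspace(20, 60, 41)  # stand-in for the moderator
coefs = (5.0, 1.0, -0.01, 0.1)   # (β0, β1, β2, β3), with β2 < 0

x_grid, lines = spotlight_lines(x_obs, m_obs, coefs)
for p, (m_p, y_line) in lines.items():
    slope = (y_line[-1] - y_line[0]) / (x_grid[-1] - x_grid[0])
    print(f"percentile {p}: m = {m_p:.1f}, slope = {slope:.3f}")
```

Each entry in `lines` can be passed to a plotting call (for example one line per percentile on shared axes). With a negative $\beta_2$ the lines flatten as the moderator percentile rises. A fully Bayesian version would repeat the calculation per posterior draw and plot the resulting bands.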
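```python
import numpy as np
import xarray as xr
# Posterior estimate of the moderation effect; posterior is assumed to be az.extract(idata)
xi = xr.DataArray(np.linspace(np.min(age), np.max(age), 20), dims=["m_plot"])
# xi carries its own dimension, so it broadcasts against the posterior draws
rate = posterior.β1 + posterior.β2 * xi  # slope of y on x at each moderator value
```
Multicollinearity and the Interaction Term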
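Including $x \cdot m$ alongside $x$ and $m$ introduces multicollinearity: the interaction term is correlated with its component variables. Points to note:
- Mean-centering $x$ and $m$ before computing the product reduces this correlation; $\beta_1$ and $\beta_3$ then describe effects at the mean of the other variable rather than at zero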
- Despite common concern, multicollinearity in the interaction term does not bias estimates of the moderation effect — it only increases uncertainty (widens credible intervals)
- McClelland et al. (2017): multicollinearity is a “red herring in the search for moderator variables”
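The effect of mean-centering can be illustrated with a small simulation. This sketch uses ordinary least squares as a quick stand-in for the Bayesian fit, and the data are synthetic draws with arbitrary coefficients; neither comes from the original example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
# Synthetic predictors with non-zero means (illustrative only)
x = rng.normal(5.0, 1.0, n)
m = rng.normal(40.0, 10.0, n)
y = 1.0 + 2.0 * x - 0.03 * x * m + 0.1 * m + rng.normal(0.0, 1.0, n)

def fit(x, m, y):
    """Least-squares coefficients for the design [1, x, x*m, m]."""
    X = np.column_stack([np.ones_like(x), x, x * m, m])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

xc, mc = x - x.mean(), m - m.mean()  # mean-centred copies

r_raw = np.corrcoef(x, x * m)[0, 1]           # product vs. component, raw
r_centered = np.corrcoef(xc, xc * mc)[0, 1]   # product vs. component, centred
b_raw = fit(x, m, y)
b_centered = fit(xc, mc, y)

print(f"corr(x, x*m): raw = {r_raw:.2f}, centred = {r_centered:.2f}")
print(f"interaction coefficient: raw = {b_raw[2]:.4f}, centred = {b_centered[2]:.4f}")
```

Centering lowers the correlation between the product and its components, yet the least-squares interaction coefficient is the same in both fits, because the centred design is only a reparameterisation of the raw one. What changes is the coefficient on $x$, which becomes the slope at the mean of $m$ instead of at $m = 0$.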
Applied Example: Training × Age on Muscle Mass
- $x$ = weekly training hours
- $m$ = age (moderator)
- $y$ = muscle percentage
Finding: $\beta_2 < 0$ (credibly), meaning training becomes less effective at building muscle mass in older individuals.
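A statement that the moderation coefficient is credibly negative can be quantified from posterior draws. The sketch below is one simple way to do so; the helper name `prob_negative` is hypothetical, and the draws are random placeholders standing in for real MCMC output, so the printed numbers say nothing about the actual analysis.

```python
import numpy as np

def prob_negative(draws):
    """Fraction of draws below zero, a Monte Carlo estimate of P(β2 < 0 | data)."""
    draws = np.asarray(draws).ravel()  # flatten (chain, draw) into one axis
    return float(np.mean(draws < 0))

# Placeholder draws, NOT results from the muscle-mass model
rng = np.random.default_rng(1)
fake_beta2_draws = rng.normal(loc=-0.3, scale=0.1, size=4000)

p_neg = prob_negative(fake_beta2_draws)
lo, hi = np.percentile(fake_beta2_draws, [3, 97])  # equal-tailed 94% interval

print(f"P(β2 < 0) ≈ {p_neg:.3f}")
print(f"94% interval: [{lo:.3f}, {hi:.3f}]")
```

With a fitted model, the placeholder array would be replaced by the sampled values, for instance `idata.posterior["β2"].values` if `idata` is the `InferenceData` object returned by `pm.sample()`. The interval here is equal-tailed; a highest-density interval may differ slightly for skewed posteriors.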
Connections
- Spurious Association and Confounds — interaction effects and multivariate regression
- Bayesian Linear Regression — priors as regularization for correlated predictors
- Generalized Linear Models — moderation extends naturally to logistic/Poisson regression
- Bayesian Non-parametric Causal Inference — non-parametric alternative when the interaction form is unknown
Source
- Bayesian moderation analysis — PyMC example by Benjamin T. Vincent (2021–2023)
- Hayes (2017): Introduction to Mediation, Moderation, and Conditional Process Analysis