Bayesian Media Mix Modeling - Overview

Summary

Media mix models (MMMs) are regression models that advertisers use to measure media effectiveness and guide budget allocation. Jin, Wang, Sun, Chan & Koehler (Google, 2017) propose an MMM with flexible functional forms for two phenomena that a plain linear regression cannot capture: carryover (advertising’s lagged effect) and shape (saturation / diminishing returns). The model is estimated in a Bayesian framework so that prior knowledge can compensate for the low information content of a single MMM dataset. Key finding: the model recovers parameters well on large datasets, but for typical small samples (a couple of years of weekly data) the priors dominate and estimates can be biased; the optimal media mix derived from the model has high variance and should be treated with caution.

Overview

An MMM relates aggregated sales (weekly or monthly, national or geo-level) to media spend across channels plus control variables (price, distribution, seasonality, macro factors). It descends from the marketing “4Ps” tradition (Borden 1964; McCarthy 1978) and is fundamentally a regression that infers causation from observational correlation. Randomized experiments across media are expensive and rarely feasible, so observational regression remains the workhorse despite its causal fragility (see Activity Bias in Advertising for one such confound).

Two well-documented features of advertising response break the classic linear decision model (Guadagni & Little 1983):

  1. Carryover / lag effect — a portion of an ad’s impact occurs in periods after the exposure (delayed consumer response, inventory effects, word-of-mouth). Modeled via the adstock transformation. See Carryover (Adstock) Functional Forms.
  2. Shape / saturation effect — response is not linear in spend; high spend yields diminishing returns (the “shape effect”, Tellis 2006). Modeled via a curvature function. See Shape (Saturation) Effects.
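
Written out, the two transformations can take the following form. This is a sketch under the specification this overview names later (geometric adstock followed by a Hill curve); the maximum lag $L$, the decay rate $\alpha_m$ and the normalization by the sum of weights are notational assumptions here, and the linked pages cover the alternatives.

$$x^*_{t,m} = \frac{\sum_{l=0}^{L-1} \alpha_m^{\,l}\, x_{t-l,m}}{\sum_{l=0}^{L-1} \alpha_m^{\,l}}, \qquad 0 < \alpha_m < 1$$

$$\mathrm{Hill}(x; K_m, S_m) = \frac{1}{1 + (x / K_m)^{-S_m}}, \qquad x > 0,\; K_m > 0,\; S_m > 0$$

Here $x_{t,m}$ is the spend of channel $m$ in week $t$. At $x = K_m$ the Hill curve equals $1/2$, which is why $K_m$ is called the half-saturation point; it tends to 0 as $x \to 0$ and to 1 as $x \to \infty$, so a coefficient multiplying it acts as the maximum attainable effect.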

Because these transformations make the model nonlinear in the parameters, ordinary least squares / MLE is awkward, and the paper turns to Bayesian estimation via MCMC (see Bayesian Estimation and Priors for MMM). The Bayesian framing is motivated less by philosophy than by data scarcity: as Chan & Perry (2017) note, the information content within a single MMM dataset is low relative to the number of parameters, so priors drawn from industry experience or prior/related media-mix models are essential.

Main Content

The MMM regression equation (combined model)

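For weekly national data over weeks $t = 1, \dots, T$, with $M$ media channels and $C$ control variables, the response $y_t$ (sales, or log-sales) is modeled as

$$y_t = \tau + \sum_{m=1}^{M} \beta_m \, \mathrm{Hill}(x^*_{t,m}; K_m, S_m) + \sum_{c=1}^{C} \gamma_c z_{t,c} + \epsilon_t$$

where

  • $x^*_{t,m}$ is the carryover-transformed spend of channel $m$ (Eq. 1, see Carryover (Adstock) Functional Forms);
  • $\mathrm{Hill}(\cdot\,; K_m, S_m)$ is the shape/saturation transform with shape (slope) $S_m$ and half-saturation $K_m$ (see Shape (Saturation) Effects);
  • $\beta_m$ is the regression coefficient (maximum effect) of channel $m$;
  • $\tau$ is baseline sales (intercept), $\gamma_c$ the effect of control variable $z_{t,c}$;
  • $\epsilon_t$ is white noise, uncorrelated, constant variance.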

Media effects are assumed additive (no synergy/interaction between channels — a simplification, cf. Zhang & Vaver 2017). Carryover is applied before shape (adstock then Hill), which is appropriate when per-period spend is small relative to cumulative spend.
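
The combined model's mean response can be sketched in a few lines of NumPy. This is a minimal illustration, assuming geometric adstock with a finite maximum lag `L` and zero spend before the sample starts; the function names (`geometric_adstock`, `hill`, `mean_response`) and all parameter values are hypothetical and not taken from the paper.

```python
import numpy as np

def geometric_adstock(x, alpha, L):
    """Normalized geometric carryover with weights alpha**l for lags l = 0..L-1."""
    w = alpha ** np.arange(L)
    # Assumption: spend before the first observed week is zero.
    padded = np.concatenate([np.zeros(L - 1), x])
    # np.convolve flips the kernel, so w[0] is applied to the current week
    # and w[l] to the spend l weeks earlier.
    return np.convolve(padded, w, mode="valid") / w.sum()

def hill(x, K, S):
    """Hill saturation curve 1 / (1 + (x / K) ** -S), set to 0 at x = 0."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    pos = x > 0
    out[pos] = 1.0 / (1.0 + (x[pos] / K) ** (-S))
    return out

def mean_response(X, Z, tau, beta, gamma, alpha, K, S, L):
    """Baseline + additive media effects (adstock, then Hill) + controls."""
    T, M = X.shape
    mu = np.full(T, tau, dtype=float)
    for m in range(M):
        x_star = geometric_adstock(X[:, m], alpha[m], L)  # carryover first
        mu += beta[m] * hill(x_star, K[m], S[m])          # then saturation
    return mu + Z @ gamma

# Toy usage: one channel with a single burst of spend in week 1.
X = np.array([[100.0], [0.0], [0.0], [0.0]])
Z = np.zeros((4, 1))                       # one control variable, held at zero
mu = mean_response(X, Z, tau=10.0, beta=[2.0], gamma=np.array([0.0]),
                   alpha=[0.5], K=[50.0], S=[1.0], L=3)
print(np.round(mu, 3))
```

In this toy run the effect of the week-1 burst fades over the following `L - 1` weeks and the mean returns to the baseline `tau` once the burst leaves the adstock window. Adding noise to `mu` gives simulated sales, which is a convenient way to check whether an estimation procedure can recover known parameters.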

Why Bayesian

Bayesian inference treats parameters as random variables with a posterior $p(\Phi \mid y, X) \propto L(y \mid X, \Phi)\,\pi(\Phi)$, where $\Phi$ denotes the model parameters, $L$ the likelihood and $\pi$ the prior. The prior injects external knowledge to offset the weak signal in a single dataset, and the full posterior (not just a point estimate) supplies credible intervals and propagates parameter uncertainty into downstream attribution metrics (ROAS, mROAS, optimal mix).
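
To make "posterior is proportional to likelihood times prior" concrete, the sketch below evaluates an unnormalized log-posterior for a deliberately simplified model: one channel, no carryover, Gaussian noise. The priors are illustrative placeholders chosen for this example, not the paper's choices, and the names (`mean_sales`, `log_prior`, `log_posterior`) are hypothetical. An MCMC sampler would repeatedly evaluate a function like `log_posterior`.

```python
import numpy as np

def mean_sales(x, tau, beta, K, S):
    # Toy simplification: single channel, no carryover, strictly positive spend.
    return tau + beta / (1.0 + (x / K) ** (-S))

def log_prior(tau, beta, K, S, sigma):
    # Illustrative weakly informative priors (assumptions of this sketch).
    if min(beta, K, S, sigma) <= 0:
        return -np.inf                     # support restricted to positive values
    return (-0.5 * (tau / 100.0) ** 2      # tau   ~ Normal(0, 100^2)
            - 0.5 * (beta / 5.0) ** 2      # beta  ~ HalfNormal(5)
            - 0.5 * (K / 200.0) ** 2       # K     ~ HalfNormal(200)
            - 0.5 * (S / 3.0) ** 2         # S     ~ HalfNormal(3)
            - 0.5 * (sigma / 5.0) ** 2)    # sigma ~ HalfNormal(5)

def log_likelihood(y, x, tau, beta, K, S, sigma):
    # Gaussian log-likelihood, dropping the constant term.
    resid = y - mean_sales(x, tau, beta, K, S)
    return -y.size * np.log(sigma) - 0.5 * np.sum((resid / sigma) ** 2)

def log_posterior(theta, y, x):
    # Unnormalized: log prior + log likelihood.
    lp = log_prior(**theta)
    if not np.isfinite(lp):
        return -np.inf
    return lp + log_likelihood(y, x, **theta)

# Simulate two years of weekly data from known parameters.
rng = np.random.default_rng(0)
x = rng.uniform(10.0, 200.0, size=104)
truth = dict(tau=10.0, beta=5.0, K=80.0, S=1.5, sigma=0.5)
y = mean_sales(x, truth["tau"], truth["beta"], truth["K"], truth["S"])
y = y + rng.normal(0.0, truth["sigma"], size=x.size)

wrong = dict(truth, beta=0.5)              # badly understated media effect
lp_truth = log_posterior(truth, y, x)
lp_wrong = log_posterior(wrong, y, x)
print(lp_truth > lp_wrong)
```

With only about a hundred observations and several parameters per channel, the prior terms in `log_prior` contribute a non-negligible share of the posterior, which is the mechanism behind the prior sensitivity described in the Summary.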

Examples

Shampoo advertiser case study (preview)

The model is applied to 2.5 years of weekly volume-sales data for a shampoo advertiser (TV, magazines, display, YouTube, search), with price/distribution/promotion as controls. Four functional-form specifications are compared by BIC; the most parsimonious (geometric adstock + reach transformation) wins. The optimal TV/magazine budget split has a bimodal, high-variance posterior — the data cannot reliably guide allocation. Full treatment in MMM Model Selection and Application.

Connections

See Also