Modeling Philosophy
A good model is not one that produces the results you want. A good model is one whose predictions you can trust—including when those predictions are uncomfortable.
This guide walks you through a complete, pre-specified modeling workflow using the MMM Framework. Each step includes both the conceptual reasoning and the code to implement it. The goal is not just to produce a model, but to produce a model you can defend.
The Core Principle
Every modeling decision should be made before seeing results, or if made after, it should be clearly documented as post hoc and its effect on inference acknowledged. This is not a limitation—it is what separates measurement from storytelling.
Workflow Overview
The modeling process has four phases. Each phase has specific outputs and checkpoints.
Phase 1: Plan
Define the Business Question
Before touching data or code, clearly articulate what you are trying to learn. Different questions require different models and different levels of rigor.
Attribution Questions
“How much of our sales are driven by each media channel?” This requires the standard BayesianMMM and careful handling of confounders.
Optimization Questions
“How should we reallocate our $10M budget across channels?” This requires attribution plus saturation curve estimation and honest uncertainty propagation.
Mechanism Questions
“Does TV drive sales through awareness or directly?” This requires the NestedMMM with mediation pathways.
Portfolio Questions
“Does promoting our multipack cannibalize single-pack sales?” This requires the MultivariateMMM with cross-effects.
Write It Down
Document your business question in a brief that specifies: (1) what outcome you are measuring, (2) what decisions the model will inform, (3) what level of uncertainty is acceptable for those decisions, and (4) what validation data is available.
Identify Variables Using Causal Reasoning
Variable selection in MMM is a causal reasoning task, not a statistical one. The question is not “which variables improve model fit” but “which variables must be included for the media effect estimates to be unbiased.”
List your treatment variables (media channels)
These are the variables whose effects you want to estimate. They are never optional—every channel you want to measure must be in the model.
Identify confounders
Variables that affect both media spending and your outcome. Common examples: seasonality, promotions, economic indicators. Omitting confounders biases media estimates. These are never candidates for variable selection.
Identify precision controls
Variables that affect only your outcome, not media spending. Common examples: weather, competitor actions (if they don’t affect your spending), and supply disruptions. Including these reduces noise, but omitting them does not bias media estimates. These can be candidates for Bayesian variable selection.
Identify potential mediators
Variables on the causal pathway between media and your outcome (awareness, consideration, search volume). Never control for mediators in a standard model unless you are explicitly modeling mediation with a NestedMMM.
Identify potential colliders
Variables caused by both media and your outcome (e.g., brand tracking scores that reflect both ad exposure and purchase behavior). Never include colliders—they introduce bias.
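The difference between a confounder and a precision control can be checked with a small simulation. The sketch below uses ordinary least squares on synthetic data rather than the MMM Framework, and every name and coefficient in it (season, weather, a true media effect of 2.0) is an assumption chosen for illustration.

```python
import numpy as np

def ols_coef(y, *columns):
    """Return OLS coefficients for y ~ 1 + columns (intercept first)."""
    X = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 5000
season = rng.normal(size=n)                # confounder: drives spend AND sales
weather = rng.normal(size=n)               # precision control: drives sales only
spend = 1.0 * season + rng.normal(size=n)  # media spend follows seasonality
true_effect = 2.0
sales = true_effect * spend + 3.0 * season + 1.5 * weather + rng.normal(size=n)

# Index 1 is the coefficient on spend in every fit below
full = ols_coef(sales, spend, season, weather)[1]
no_weather = ols_coef(sales, spend, season)[1]  # precision control omitted
no_season = ols_coef(sales, spend, weather)[1]  # confounder omitted

print(f"True media effect:       {true_effect:.2f}")
print(f"All controls included:   {full:.2f}")
print(f"Weather omitted:         {no_weather:.2f}")
print(f"Seasonality omitted:     {no_season:.2f}")
```

In this setup, dropping weather leaves the media coefficient close to the true value (only noisier), while dropping seasonality pushes it well above 2.0 because spend absorbs the seasonal effect.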
The Critical Distinction
Variable selection should only be applied to precision controls. Confounders must always be included. Mediators and colliders must never be included (in a standard model). Getting this wrong is a form of specification shopping.
See the Variable Selection and Causal Inference guides for detailed treatment.
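One way to keep these roles from drifting during modeling is to record them explicitly and derive the inclusion rule from the role. The sketch below is plain Python and not part of the MMM Framework; the `Role` enum, the `partition_variables` helper, and the variable names are hypothetical.

```python
from enum import Enum

class Role(Enum):
    TREATMENT = "treatment"
    CONFOUNDER = "confounder"
    PRECISION_CONTROL = "precision_control"
    MEDIATOR = "mediator"
    COLLIDER = "collider"

# Inclusion rules for a standard (non-nested) model, as stated above
ALWAYS_INCLUDE = {Role.TREATMENT, Role.CONFOUNDER}
SELECTABLE = {Role.PRECISION_CONTROL}

def partition_variables(roles: dict[str, Role]) -> dict[str, list[str]]:
    """Split variables into always-included, selectable, and excluded."""
    out = {"always_include": [], "selectable": [], "exclude": []}
    for name, role in roles.items():
        if role in ALWAYS_INCLUDE:
            out["always_include"].append(name)
        elif role in SELECTABLE:
            out["selectable"].append(name)
        else:  # mediators and colliders
            out["exclude"].append(name)
    return out

# Hypothetical variable list for illustration
roles = {
    "TV": Role.TREATMENT,
    "Paid_Search": Role.TREATMENT,
    "Price_Index": Role.CONFOUNDER,
    "Holiday": Role.CONFOUNDER,
    "Weather": Role.PRECISION_CONTROL,
    "Awareness": Role.MEDIATOR,
    "Brand_Tracking_Score": Role.COLLIDER,
}
plan = partition_variables(roles)
for group, names in plan.items():
    print(f"{group}: {names}")
```

Only the `selectable` group would be handed to Bayesian variable selection; the other two groups are fixed by the causal reasoning, not by model fit.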
Pre-Register Your Specification
Before fitting the model, document the following decisions. This is your analysis plan.
# analysis_plan.py — Document this BEFORE fitting
"""
Analysis Plan: Q4 2025 Media Effectiveness
==========================================
Business Question: What is the incremental ROI of each media channel?
Decision: Inform 2026 budget allocation
Outcome Variable: Weekly national sales ($)
Media Channels:
- TV (national, geometric adstock, L_max=8)
- Digital Display (national, geometric adstock, L_max=4)
- Paid Search (national, geometric adstock, L_max=2)
Confounders (always included):
- Price index
- Distribution (% ACV)
- Holiday indicator
- Seasonality (Fourier, order 2)
Precision Controls (Bayesian variable selection):
- Weather (heating degree days)
- Competitor A spend
Trend: Linear
Sampler: NumPyro, 4 chains, 2000 draws, 1000 tune
Validation: Compare TV ROI posterior to Q3 geo lift test
(TV ROI from geo test: 1.1-1.5, 90% CI)
Decision Criteria:
- If TV 80% CI overlaps the geo-test interval -> validated
- If TV 80% CI does not overlap the geo-test interval -> investigate discrepancy
"""
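Writing the decision criteria as code before fitting removes any room to reinterpret them later. The sketch below is one possible reading of the criteria in the plan, treating the geo-test result as an interval and checking for overlap with an equal-tailed 80% interval from posterior ROI draws. The function name and the synthetic draws are assumptions, not framework API.

```python
import numpy as np

def validate_against_experiment(roi_draws, experiment_interval, ci_level=0.80):
    """Compare a model ROI interval with an experimental interval.

    Returns the verdict and the equal-tailed model interval.
    """
    tail = (1.0 - ci_level) / 2.0 * 100.0
    model_low, model_high = np.percentile(roi_draws, [tail, 100.0 - tail])
    exp_low, exp_high = experiment_interval
    # Two intervals overlap when each one starts before the other ends
    overlaps = (model_low <= exp_high) and (exp_low <= model_high)
    verdict = "validated" if overlaps else "investigate discrepancy"
    return verdict, (float(model_low), float(model_high))

# Synthetic posterior draws standing in for a fitted model's TV ROI
rng = np.random.default_rng(0)
agreeing_draws = rng.normal(1.3, 0.2, size=4000)
conflicting_draws = rng.normal(2.5, 0.1, size=4000)
geo_test_interval = (1.1, 1.5)  # from the analysis plan above

verdict_a, interval_a = validate_against_experiment(agreeing_draws, geo_test_interval)
verdict_b, interval_b = validate_against_experiment(conflicting_draws, geo_test_interval)
print(verdict_a, interval_a)
print(verdict_b, interval_b)
```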
Phase 2: Build
Prepare Your Data
Good data preparation is the foundation of a trustworthy model. The MMM Framework uses the Master Flat File (MFF) format for consistency and flexibility.
import pandas as pd
from mmm_framework import MFFConfigBuilder, load_mff
# Configure data structure per analysis plan
mff_config = (
    MFFConfigBuilder()
    .with_kpi_name("Sales")
    .add_national_media("TV", adstock_lmax=8)
    .add_national_media("Digital_Display", adstock_lmax=4)
    .add_national_media("Paid_Search", adstock_lmax=2)
    .add_control("Price_Index")
    .add_control("Distribution")
    .add_control("Holiday")
    .add_price_control()
    .build()
)
# Load and validate
panel = load_mff(pd.read_csv("data/weekly_mff.csv"), mff_config)
# Data quality checks
print(f"Date range: {panel.dates.min()} to {panel.dates.max()}")
print(f"Observations: {panel.n_obs}")
print(f"Missing values: {panel.missing_summary()}")
print("\nMedia channel summary:")
print(panel.media_summary())  # Zero-spend weeks, distribution, etc.
Data Quality Checklist
- At least 2 years of weekly data (104+ observations) for stable estimation
- No media channels with >50% zero-spend weeks (insufficient variation)
- Control variables present for the entire time series
- No obvious data errors (negative spend, impossible values)
- Time series has a consistent frequency (weekly or daily)
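The checklist can also be run directly against the raw table before it is loaded into the framework. The following is a minimal pandas sketch, assuming one row per period and one column per variable; `check_data_quality`, the column names, and the synthetic data are hypothetical, and the "impossible values" check is limited here to negative spend.

```python
import numpy as np
import pandas as pd

def check_data_quality(df, date_col, media_cols, control_cols,
                       min_obs=104, max_zero_share=0.5):
    """Return a pass/fail flag for each checklist item."""
    steps = df[date_col].sort_values().diff().dropna()
    return {
        "enough_observations": len(df) >= min_obs,
        "enough_media_variation": all(
            (df[col] == 0).mean() <= max_zero_share for col in media_cols
        ),
        "controls_complete": not df[control_cols].isna().any().any(),
        "no_negative_spend": not (df[media_cols] < 0).any().any(),
        "consistent_frequency": steps.nunique() == 1,
    }

# Synthetic weekly data for illustration only
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "date": pd.date_range("2023-01-02", periods=104, freq="W-MON"),
    "TV": rng.gamma(2.0, 50.0, size=104),
    "Paid_Search": rng.gamma(2.0, 10.0, size=104),
    "Price_Index": rng.normal(100.0, 2.0, size=104),
})
report = check_data_quality(df, "date", ["TV", "Paid_Search"], ["Price_Index"])
for check, passed in report.items():
    print(f"{check}: {'OK' if passed else 'FAIL'}")
```

A failed check is a reason to fix the data or revisit the analysis plan, not to proceed and hope the model copes.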
Configure the Model
Model configuration follows directly from your analysis plan. Every choice here should trace back to a pre-specified decision.
from mmm_framework import (
    ModelConfigBuilder,
    TrendConfig,
    TrendType,
    BayesianMMM,
)
# Model inference configuration
model_config = (
    ModelConfigBuilder()
    .bayesian_numpyro()        # JAX-based sampler for speed
    .with_chains(4)            # 4 chains for convergence diagnostics
    .with_draws(2000)          # 2000 posterior draws per chain
    .with_tune(1000)           # 1000 warmup iterations
    .with_target_accept(0.9)   # Target acceptance rate
    .build()
)
# Trend configuration (from analysis plan: linear)
trend_config = TrendConfig(
    type=TrendType.LINEAR,
    growth_prior_sigma=0.1,
)
# Build the model
mmm = BayesianMMM(panel, model_config, trend_config)
# Verify model structure matches analysis plan
print("Model parameters:")
for var in mmm.model.free_RVs:
    print(f"  {var.name}")
Set Priors Thoughtfully
Priors are the mechanism for encoding domain knowledge transparently. Unlike post hoc adjustments, priors are explicit, documented, and their effect on results can be measured.
Practices to avoid:
- Setting priors after seeing posteriors
- Using priors to force results in a desired direction
- Extremely informative priors without empirical justification
- No documentation of prior choices
Practices to follow:
- Setting priors before fitting based on domain knowledge
- Using weakly informative priors that regularize without dominating
- Documenting the source and reasoning for each prior
- Running sensitivity analysis across reasonable prior ranges
from mmm_framework import PriorConfigBuilder, AdstockConfigBuilder
# Adstock priors: encode belief about carryover duration
# TV: expect longer carryover (mean ~0.7, but allow data to inform)
tv_adstock = (
    AdstockConfigBuilder()
    .geometric()
    .with_l_max(8)
    .with_alpha_prior(
        PriorConfigBuilder().beta(alpha=3, beta=1.5).build()
        # Beta(3, 1.5) -> mean ~0.67, mode 0.8, allows 0.3-0.95
    )
    .build()
)
# Digital: expect shorter carryover
digital_adstock = (
    AdstockConfigBuilder()
    .geometric()
    .with_l_max(4)
    .with_alpha_prior(
        PriorConfigBuilder().beta(alpha=2, beta=3).build()
        # Beta(2, 3) -> mean 0.4, mode ~0.33, allows 0.1-0.7
    )
    .build()
)
# Document: "TV adstock prior based on industry meta-analysis
# showing 4-8 week half-lives for TV. Digital based on
# platform-reported attribution windows of 1-3 weeks."
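Before committing to an adstock prior, it helps to translate its parameters into quantities a planner can sanity-check. The sketch below computes the mean and mode of a Beta prior, the half-life implied by a decay rate, and the share of total carryover weight that falls inside the L_max window. It assumes geometric adstock weights of the form alpha^k for lag k; the framework's exact normalization may differ, and the helper names are hypothetical.

```python
import math

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

def beta_mode(a, b):
    """Mode of a Beta(a, b) distribution (defined for a > 1 and b > 1)."""
    if a <= 1 or b <= 1:
        raise ValueError("mode formula requires a > 1 and b > 1")
    return (a - 1) / (a + b - 2)

def half_life(alpha):
    """Periods until a geometric carryover with decay alpha falls to 50%."""
    return math.log(0.5) / math.log(alpha)

def weight_captured(alpha, l_max):
    """Share of the infinite geometric weight sum covered by the first l_max lags."""
    return 1.0 - alpha ** l_max

# (a, b, l_max) taken from the prior definitions above
priors = {"TV": (3.0, 1.5, 8), "Digital": (2.0, 3.0, 4)}
for channel, (a, b, l_max) in priors.items():
    mean = beta_mean(a, b)
    print(
        f"{channel}: prior mean={mean:.2f}, mode={beta_mode(a, b):.2f}, "
        f"half-life at mean={half_life(mean):.1f} periods, "
        f"weight within L_max={weight_captured(mean, l_max):.0%}"
    )
```

Under this weight assumption, a decay of 0.67 corresponds to a half-life of roughly 1.7 weeks, while a 4-week half-life corresponds to a decay near 0.84. It is worth confirming that the prior and the evidence cited for it are expressed on the same scale.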
Prior Predictive Check
Before fitting the model to data, verify that your priors produce plausible predictions. This is the most underrated step in Bayesian modeling.
# Sample from priors only (no data yet)
prior_pred = mmm.sample_prior_predictive(samples=500)
y_prior = prior_pred.prior_predictive["y_obs"].values.flatten()
print(f"Prior predictive y range: [{y_prior.min():.0f}, {y_prior.max():.0f}]")
print(f"Actual y range: [{panel.y.min():.0f}, {panel.y.max():.0f}]")
print(f"Prior predictive y mean: {y_prior.mean():.0f}")
print(f"Actual y mean: {panel.y.mean():.0f}")
# Check: do priors produce data in the right ballpark?
# They should cover the observed range with room to spare,
# but not predict absurdities (negative sales, 10x actual)
What to Look For
Prior predictions should be plausible but vague. If the prior predictive range is [-1000, 1000] for sales that always fall between 800 and 1,200, your priors are too diffuse and poorly centered: they allow negative sales yet miss the top of the observed range. If it is [950, 1050], your priors may be too informative. The sweet spot is a range like [200, 2000], which covers the data comfortably without allowing absurdities.
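These rules of thumb can be turned into a repeatable check. The sketch below classifies a set of prior predictive draws against the observed outcome; the function name, the 5% tolerance, and the 10x "absurd" multiplier are assumptions that should be fixed in the analysis plan rather than tuned after the fact.

```python
import numpy as np

def assess_prior_predictive(y_prior, y_obs, absurd_factor=10.0, max_bad_share=0.05):
    """Classify prior predictive draws as plausible, too diffuse, or too narrow."""
    y_prior = np.asarray(y_prior, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    low, high = np.percentile(y_prior, [2.5, 97.5])
    frac_negative = float(np.mean(y_prior < 0))
    frac_absurd = float(np.mean(y_prior > absurd_factor * y_obs.max()))
    covers_observed = bool(low <= y_obs.min() and high >= y_obs.max())
    if frac_negative > max_bad_share or frac_absurd > max_bad_share:
        verdict = "too diffuse"
    elif not covers_observed:
        verdict = "too narrow"
    else:
        verdict = "plausible"
    return {
        "verdict": verdict,
        "central_95": (float(low), float(high)),
        "frac_negative": frac_negative,
        "frac_absurd": frac_absurd,
    }

# Synthetic example: observed sales between 800 and 1200
rng = np.random.default_rng(0)
y_obs = rng.uniform(800, 1200, size=104)
candidates = {
    "vague but sensible": rng.normal(1000, 300, size=5000),
    "very tight": rng.normal(1000, 20, size=5000),
    "very wide": rng.normal(0, 5000, size=5000),
}
verdicts = {}
for label, draws in candidates.items():
    verdicts[label] = assess_prior_predictive(draws, y_obs)["verdict"]
    print(f"{label}: {verdicts[label]}")
```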
Phase 3: Validate
Fit the Model
# Fit with a fixed random seed for reproducibility
results = mmm.fit(random_seed=42)
print(f"Sampling completed.")
print(f" Chains: {results.n_chains}")
print(f" Draws per chain: {results.n_draws}")
print(f" Total posterior samples: {results.n_chains * results.n_draws}")
Check Diagnostics (Non-Negotiable)
MCMC diagnostics are not optional. They tell you whether the sampler explored the posterior adequately. Interpreting results from a poorly converged model is worse than having no model at all.
# Convergence diagnostics
print("=== MCMC Diagnostics ===")
diag = results.diagnostics
# 1. Divergences: should be 0
print(f"Divergences: {diag['divergences']}")
if diag['divergences'] > 0:
    print("  ACTION: Reparameterize or increase target_accept")
# 2. R-hat: should be < 1.01 for all parameters
print(f"R-hat max: {diag['rhat_max']:.4f}")
if diag['rhat_max'] > 1.01:
    print("  ACTION: Run longer chains or investigate multimodality")
# 3. ESS (bulk and tail): should be > 400 for all parameters
print(f"ESS bulk min: {diag['ess_bulk_min']:.0f}")
print(f"ESS tail min: {diag['ess_tail_min']:.0f}")
if diag['ess_bulk_min'] < 400 or diag['ess_tail_min'] < 400:
    print("  ACTION: Run more draws")
# 4. Summary of key parameters
summary = results.summary()
print("\n=== Parameter Summary ===")
print(summary[["mean", "sd", "hdi_3%", "hdi_97%", "r_hat"]].to_string())
| Diagnostic | Acceptable Range | If Out of Range |
|---|---|---|
| Divergences | 0 | Increase target_accept to 0.95+, or reparameterize |
| R-hat | < 1.01 | Run longer chains, check for multimodality |
| ESS Bulk | > 400 | Increase number of draws |
| ESS Tail | > 400 | Increase number of draws (tail ESS is harder to achieve) |
| Tree Depth | Rarely hits max | Increase max_treedepth |
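To make the R-hat threshold less of a black box, the sketch below computes the classic Gelman-Rubin statistic for a single parameter from raw chains. It compares between-chain and within-chain variance; values near 1 mean the chains agree. This is the basic textbook version, shown for intuition only. Libraries such as ArviZ report a rank-normalized split variant, so their values can differ slightly.

```python
import numpy as np

def gelman_rubin_rhat(chains):
    """Classic (non-split, non-rank-normalized) R-hat.

    chains: array of shape (n_chains, n_draws) for one parameter.
    """
    chains = np.asarray(chains, dtype=float)
    _, n_draws = chains.shape
    within = chains.var(axis=1, ddof=1).mean()          # W: mean within-chain variance
    between = n_draws * chains.mean(axis=1).var(ddof=1)  # B: scaled variance of chain means
    var_hat = (n_draws - 1) / n_draws * within + between / n_draws
    return float(np.sqrt(var_hat / within))

rng = np.random.default_rng(0)
# Four chains sampling the same distribution
mixed = rng.normal(0.0, 1.0, size=(4, 2000))
# Same chains, but the last one is shifted as if stuck in another mode
stuck = mixed + np.array([[0.0], [0.0], [0.0], [3.0]])

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
print(f"Well-mixed chains: R-hat = {rhat_mixed:.3f}")
print(f"One chain shifted: R-hat = {rhat_stuck:.3f}")
```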
Do Not Proceed with Bad Diagnostics
If diagnostics are poor, do not interpret the results. Fix the computational issues first. Common fixes: increase target acceptance rate, use a non-centered parameterization for hierarchical models, increase warmup iterations, or simplify the model.
Posterior Predictive Check
After fitting, compare the model’s predictions to observed data. This tests whether the model can reproduce the patterns in your data.
# Posterior predictive check
pp = results.posterior_predictive
import numpy as np
# Compare predicted vs actual
y_pred_mean = pp["y_obs"].mean(dim=["chain", "draw"]).values
y_actual = panel.y
# Calibration: what fraction of observations fall within 90% CI?
y_pred_low = np.percentile(pp["y_obs"].values, 5, axis=(0, 1))
y_pred_high = np.percentile(pp["y_obs"].values, 95, axis=(0, 1))
coverage = np.mean((y_actual >= y_pred_low) & (y_actual <= y_pred_high))
print(f"90% CI coverage: {coverage:.1%}")
print(f" (Target: ~90%. If much lower, model is overconfident)")
print(f" (If much higher, model is underconfident)")
# MAPE
mape = np.mean(np.abs(y_pred_mean - y_actual) / y_actual)
print(f"MAPE: {mape:.1%}")
Sensitivity Analysis
Test how much your conclusions change when you vary modeling assumptions. Robust findings persist across reasonable choices. Fragile findings suggest more data or experiments are needed.
# Sensitivity analysis: vary key assumptions
sensitivity_results = {}
# 1. Vary TV adstock L_max
for lmax in [4, 6, 8, 12]:
    config = make_config(tv_lmax=lmax)  # helper function
    model = BayesianMMM(panel, config, trend_config)
    res = model.fit(random_seed=42)
    sensitivity_results[f"tv_lmax_{lmax}"] = {
        "tv_roi_mean": res.roi("TV").mean(),
        "tv_roi_hdi": res.roi("TV").hdi(0.9),
    }
# 2. Vary prior strength
for sigma in [0.5, 1.0, 2.0]:
    config = make_config(beta_prior_sigma=sigma)
    model = BayesianMMM(panel, config, trend_config)
    res = model.fit(random_seed=42)
    sensitivity_results[f"prior_sigma_{sigma}"] = {
        "tv_roi_mean": res.roi("TV").mean(),
        "tv_roi_hdi": res.roi("TV").hdi(0.9),
    }
# Report: how much do ROI estimates change?
print("\n=== Sensitivity Analysis ===")
for name, res in sensitivity_results.items():
    roi_mean = res["tv_roi_mean"]
    roi_low, roi_high = res["tv_roi_hdi"]
    print(f"  {name}: TV ROI = {roi_mean:.2f} ({roi_low:.2f}-{roi_high:.2f})")
Interpreting Sensitivity
If the TV ROI estimate ranges from 1.1 to 1.6 across all reasonable specifications, and the credible intervals tell the same story, you can confidently report that it is positive and above 1.0. If it ranges from 0.5 to 2.5, the estimate is sensitive to modeling choices and you should recommend validation experiments before acting on it.
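This interpretation can be reduced to a few summary numbers. The sketch below takes a dictionary shaped like `sensitivity_results` and reports the spread of point estimates and whether every interval stays above a break-even threshold. The helper name and the ROI values are hypothetical and are not results from any model.

```python
def summarize_sensitivity(results, threshold=1.0):
    """Summarize ROI stability across model specifications."""
    means = [spec["tv_roi_mean"] for spec in results.values()]
    lower_bounds = [spec["tv_roi_hdi"][0] for spec in results.values()]
    return {
        "mean_min": min(means),
        "mean_max": max(means),
        "all_means_above_threshold": min(means) > threshold,
        "all_intervals_above_threshold": min(lower_bounds) > threshold,
    }

# Illustrative values only
stable = {
    "tv_lmax_4": {"tv_roi_mean": 1.25, "tv_roi_hdi": (1.05, 1.50)},
    "tv_lmax_8": {"tv_roi_mean": 1.40, "tv_roi_hdi": (1.10, 1.80)},
    "tv_lmax_12": {"tv_roi_mean": 1.55, "tv_roi_hdi": (1.15, 2.00)},
}
fragile = {
    "tv_lmax_4": {"tv_roi_mean": 0.60, "tv_roi_hdi": (0.20, 1.10)},
    "tv_lmax_8": {"tv_roi_mean": 1.40, "tv_roi_hdi": (0.90, 2.00)},
    "tv_lmax_12": {"tv_roi_mean": 2.40, "tv_roi_hdi": (1.50, 3.40)},
}
stable_summary = summarize_sensitivity(stable)
fragile_summary = summarize_sensitivity(fragile)
print("stable: ", stable_summary)
print("fragile:", fragile_summary)
```

Checking the interval lower bounds, not only the means, guards against calling a result robust when individual specifications still allow an ROI below break-even.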
Phase 4: Report
Extract Insights
# Channel contributions and ROI
print("=== Channel Results ===")
for channel in panel.channel_names:
    roi = results.roi(channel)
    roi_low, roi_high = roi.hdi(0.9)
    contrib = results.contribution(channel)
    print(f"\n{channel}:")
    print(f"  ROI: {roi.mean():.2f} (90% CI: {roi_low:.2f}-{roi_high:.2f})")
    print(f"  Share of effect: {contrib.share:.1%}")
    print(f"  Saturation level: {results.saturation_level(channel):.0%}")
# Generate the full report
from mmm_framework.reporting import MMMReportGenerator, ReportConfig
report = MMMReportGenerator(
    model=mmm,
    panel=panel,
    results=results,
    config=ReportConfig(
        title="Q4 2025 Media Effectiveness Analysis",
        client="Brand Name",
        analysis_period="Jan 2024 - Dec 2025",
    ),
)
report.to_html("q4_2025_mmm_report.html")
Communicate Results Honestly
How you communicate results matters as much as the analysis itself. Stakeholders need to understand both what the model says and how confident the model is.
Statements to avoid:
- “TV ROI is 1.42”
- “Digital drives 28% of sales”
- “We should shift $2M from TV to digital”
- No mention of uncertainty
- No mention of assumptions
Statements to prefer:
- “TV ROI is estimated at 1.4 (90% CI: 1.1-1.8)”
- “Digital contributes 22-34% of sales (90% CI)”
- “A $2M shift would likely improve returns, but the magnitude is uncertain”
- Sensitivity to key assumptions documented
- Comparison to experimental results where available
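A small formatting helper makes the interval-first phrasing the default rather than something added by hand. This sketch builds the sentence from posterior draws using an equal-tailed percentile interval (the framework's `hdi` method may give slightly different bounds). The function name and the synthetic draws are assumptions.

```python
import numpy as np

def format_roi(channel, roi_draws, ci_level=0.90):
    """Format an ROI estimate with its equal-tailed credible interval."""
    tail = (1.0 - ci_level) / 2.0 * 100.0
    low, high = np.percentile(roi_draws, [tail, 100.0 - tail])
    return (
        f"{channel} ROI is estimated at {np.mean(roi_draws):.2f} "
        f"({ci_level:.0%} CI: {low:.2f}-{high:.2f})"
    )

# Synthetic draws standing in for a posterior ROI sample
rng = np.random.default_rng(0)
tv_roi_draws = rng.normal(1.4, 0.2, size=4000)
print(format_roi("TV", tv_roi_draws))
```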
For Detailed Guidance on Presenting Results
See the Interpreting Results for Media Planners and CMOs guide for specific recommendations on presenting uncertainty, creating executive summaries, and translating model outputs into actionable planning guidance.
When (and How) to Iterate
Iteration is a normal part of modeling. The key distinction is between legitimate iteration and specification shopping.
| Legitimate Iteration | Specification Shopping |
|---|---|
| Fixing computational issues (divergences, non-convergence) | Adjusting until results “look right” |
| Adding a confounder you forgot to include | Removing a variable because it reduced media effects |
| Expanding priors that are clearly too narrow (based on prior predictive check) | Tightening priors to force results toward desired values |
| Documenting changes and reporting both versions | Only reporting the version with preferred results |
| Iteration driven by failed diagnostic checks | Iteration driven by stakeholder feedback on ROI values |
The Rule of Thumb
If you would make the same change regardless of which direction the results moved, it is legitimate. If you would only make the change because the results went in the “wrong” direction, it is specification shopping.
Next Steps
Ready to implement? Start with the Getting Started guide for installation and a complete code walkthrough. For understanding the business context, see For Business Stakeholders. For presenting results, see Interpreting Results. To wire experiments into the model so each cycle compounds, see the Closed-Loop Measurement & Calibration guide.