Getting Started
Build your first Bayesian Marketing Mix Model in minutes. This guide walks you through installation, core concepts, and a complete working example using the mmm-framework.
What You'll Build
By the end of this guide, you'll have a working MMM that estimates media effects with honest uncertainty quantification, the foundation for decisions you can trust.
Prerequisites
Before you begin, ensure you have the following installed:
Why Redis?
The framework uses Redis + ARQ for asynchronous model fitting. This allows the Streamlit frontend to remain responsive while MCMC sampling runs in the background. For development, a local Redis instance is sufficient.
Installation
Clone the Repository
$ git clone https://github.com/redam94/mmm-framework.git
$ cd mmm-framework
# Install with uv (recommended)
$ uv sync
# Or with pip
$ pip install -e .
Install App Dependencies
If you want to use the Streamlit frontend and API backend:
$ uv sync --group app
# For development (includes testing tools)
$ uv sync --group dev --group app
Verify Installation
import mmm_framework
print(f"mmm-framework version: {mmm_framework.__version__}")
# Check available components
from mmm_framework import (
    MFFConfigBuilder,
    ModelConfigBuilder,
    BayesianMMM,
    load_mff,
)
print("✓ All core components imported successfully")
Your First Model
Let's build a complete Marketing Mix Model from scratch. We'll walk through each step of the workflow, from data preparation to interpreting results.
Step 1: Prepare Your Data
The framework uses the Master Flat File (MFF) format, a long-format structure that handles variable-dimension data elegantly. Each row represents a single observation of a single variable.
import pandas as pd
import numpy as np
# Example: Create synthetic MFF data
np.random.seed(42)
n_weeks = 104 # 2 years of weekly data
# Generate dates
dates = pd.date_range("2023-01-01", periods=n_weeks, freq="W")
# Build MFF records
records = []
# KPI: Sales (weekly)
for i, date in enumerate(dates):
    base_sales = 1000 + 50 * np.sin(2 * np.pi * i / 52)  # Seasonality
    noise = np.random.normal(0, 50)
    records.append({
        "Period": date,
        "VariableName": "Sales",
        "VariableValue": base_sales + noise
    })
# Media: TV spend (weekly, with some weeks at zero)
for i, date in enumerate(dates):
    spend = np.random.exponential(500) if np.random.random() > 0.2 else 0
    records.append({
        "Period": date,
        "VariableName": "TV",
        "VariableValue": spend
    })
# Media: Digital spend (weekly)
for i, date in enumerate(dates):
    spend = np.random.exponential(300)
    records.append({
        "Period": date,
        "VariableName": "Digital",
        "VariableValue": spend
    })
# Control: Price Index
for i, date in enumerate(dates):
    price = 100 + np.random.normal(0, 5)
    records.append({
        "Period": date,
        "VariableName": "Price",
        "VariableValue": price
    })
# Create DataFrame
mff_data = pd.DataFrame(records)
print(f"MFF shape: {mff_data.shape}")
print(mff_data.head(10))
MFF Format Benefits
The Master Flat File format handles complex scenarios like different variables at different granularities (e.g., national media + geo-level sales), hierarchical structures, and missing data, all in a single, consistent structure.
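To make the long format concrete, here is a minimal sketch, using only the standard library, of how MFF-style rows pivot into one wide record per period. The helper name `pivot_mff` and the sample values are hypothetical and not part of mmm-framework; `load_mff` presumably does considerably more, including validation.

```python
from collections import defaultdict


def pivot_mff(records):
    """Pivot long-format rows into {period: {variable: value}}.

    Illustrative helper only; not part of mmm-framework.
    """
    wide = defaultdict(dict)
    for row in records:
        wide[row["Period"]][row["VariableName"]] = row["VariableValue"]
    return dict(wide)


long_rows = [
    {"Period": "2023-01-01", "VariableName": "Sales", "VariableValue": 1012.0},
    {"Period": "2023-01-01", "VariableName": "TV", "VariableValue": 480.0},
    {"Period": "2023-01-08", "VariableName": "Sales", "VariableValue": 995.0},
    # No TV row for 2023-01-08: a missing value is simply an absent row.
]

wide_rows = pivot_mff(long_rows)
for period, values in sorted(wide_rows.items()):
    print(period, values)
```

Because each variable occupies its own rows, adding a new variable or a new granularity never changes the column layout, which is the property the long format trades on.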
Step 2: Configure the Model
The framework uses a fluent builder pattern for configuration. This provides a readable, chainable API while ensuring type safety and validation.
from mmm_framework import (
    MFFConfigBuilder,
    ModelConfigBuilder,
    TrendConfig,
    TrendType,
    BayesianMMM,
    load_mff,
)
# Step 2a: Configure the data structure
mff_config = (
    MFFConfigBuilder()
    .with_kpi_name("Sales")                         # Target variable
    .add_national_media("TV", adstock_lmax=8)       # TV with 8-week carryover
    .add_national_media("Digital", adstock_lmax=4)  # Digital with 4-week carryover
    .add_price_control()                            # Price as control variable
    .build()
)
print(f"Media channels: {mff_config.media_names}")
print(f"Control variables: {mff_config.control_names}")
# Step 2b: Load and validate data
panel = load_mff(mff_data, mff_config)
print("\nPanel dataset:")
print(f"  Observations: {panel.n_obs}")
print(f"  Channels: {panel.n_channels}")
print(f"  Controls: {panel.n_controls}")
# Step 2c: Configure model inference
model_config = (
    ModelConfigBuilder()
    .bayesian_numpyro()       # Use JAX-based sampler (4-10x faster)
    .with_chains(4)           # 4 parallel chains for convergence diagnostics
    .with_draws(2000)         # 2000 posterior draws
    .with_tune(1000)          # 1000 warmup iterations
    .with_target_accept(0.9)  # Target acceptance rate
    .build()
)
# Step 2d: Configure trend component
trend_config = TrendConfig(
    type=TrendType.LINEAR,
    growth_prior_sigma=0.1
)
Adstock (Carryover)
Media effects persist over time. The adstock_lmax parameter sets the maximum lag window. TV typically has longer carryover (6-12 weeks) than digital (2-6 weeks).
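As an illustration of carryover, the sketch below applies a geometric adstock to a single spend pulse. The function name, the decay value, and the choice to normalize the lag weights so they sum to 1 are assumptions made for this example; the framework's own transform may differ in these details.

```python
def geometric_adstock(spend, alpha, l_max):
    """Spread each period's spend over `l_max` periods with decay `alpha`.

    Lag weights alpha**0 .. alpha**(l_max - 1) are normalized to sum to 1
    (an assumption of this sketch, not necessarily the framework's choice).
    """
    weights = [alpha ** lag for lag in range(l_max)]
    total = sum(weights)
    weights = [w / total for w in weights]
    out = []
    for t in range(len(spend)):
        acc = 0.0
        for lag, w in enumerate(weights):
            if t - lag >= 0:
                acc += w * spend[t - lag]
        out.append(acc)
    return out


pulse = [100.0, 0.0, 0.0, 0.0, 0.0]
carried = geometric_adstock(pulse, alpha=0.5, l_max=3)
print([round(x, 2) for x in carried])
```

With l_max=3 the pulse influences only the first three periods; a larger lag window lets the same spend keep contributing for longer.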
Saturation
Returns diminish at higher spend levels. The framework uses Hill or logistic saturation functions by default, with priors that adapt to your data scale.
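Diminishing returns can be seen numerically with a Hill curve. This is a minimal sketch of the standard Hill form; the parameter names `half_sat` and `slope` and their values are illustrative assumptions, not the framework's parameterization or priors.

```python
def hill_saturation(x, half_sat, slope):
    """Hill curve: x**slope / (half_sat**slope + x**slope), for x >= 0."""
    if x <= 0:
        return 0.0
    return x ** slope / (half_sat ** slope + x ** slope)


for spend in (0, 250, 500, 1000, 2000):
    response = hill_saturation(spend, half_sat=500.0, slope=2.0)
    print(spend, round(response, 3))
```

The response reaches one half exactly at `half_sat`, and each further doubling of spend buys a smaller increase than the previous one.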
Step 3: Fit and Analyze
Now we build the model, check our priors, fit it to data, and analyze the results with proper uncertainty quantification.
# Build the model
mmm = BayesianMMM(panel, model_config, trend_config)
# Inspect model structure
print("Model parameters:")
for var in mmm.model.free_RVs:
    print(f"  {var.name}")
# Prior predictive check (ALWAYS do this before fitting)
print("\n=== Prior Predictive Check ===")
prior = mmm.sample_prior_predictive(samples=200)
y_prior = prior.prior_predictive["y_obs"].values.flatten()
print(f"Prior predictive y range: [{y_prior.min():.1f}, {y_prior.max():.1f}]")
print(f"Actual y range: [{panel.y.min():.1f}, {panel.y.max():.1f}]")
# Fit the model
print("\n=== Fitting Model ===")
results = mmm.fit(random_seed=42)
# Convergence diagnostics (CRITICAL - check these!)
print("\n=== Diagnostics ===")
print(f"Divergences: {results.diagnostics['divergences']}")
print(f"R-hat max: {results.diagnostics['rhat_max']:.4f}")
print(f"ESS bulk min: {results.diagnostics['ess_bulk_min']:.0f}")
# Check for issues
if results.diagnostics['divergences'] > 0:
    print("⚠️ Divergences detected - consider reparameterization")
if results.diagnostics['rhat_max'] > 1.01:
    print("⚠️ R-hat > 1.01 - chains may not have converged")
if results.diagnostics['ess_bulk_min'] < 400:
    print("⚠️ Low ESS - consider more draws")
# Posterior summary with uncertainty
print("\n=== Posterior Summary ===")
summary = results.summary(["beta_TV", "beta_Digital", "sigma"])
print(summary[["mean", "sd", "hdi_3%", "hdi_97%", "r_hat"]])
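The idea behind the prior predictive check in the code above can be shown without the framework. The sketch below simulates outcomes from a toy linear model whose parameters are drawn from priors; the model form and every prior here are arbitrary assumptions chosen for illustration, not the framework's defaults.

```python
import random

rng = random.Random(0)


def draw_prior_predictive(x, n_draws):
    """Simulate y = intercept + beta * x + noise, drawing parameters from priors.

    The priors below are toy choices for this sketch only.
    """
    sims = []
    for _ in range(n_draws):
        intercept = rng.gauss(1000.0, 200.0)
        beta = abs(rng.gauss(0.0, 0.5))    # half-normal: non-negative media effect
        sigma = abs(rng.gauss(0.0, 50.0))  # half-normal noise scale
        sims.append([intercept + beta * xi + rng.gauss(0.0, sigma) for xi in x])
    return sims


spend = [0.0, 200.0, 400.0, 800.0]
sims = draw_prior_predictive(spend, n_draws=200)
flat = [y for sim in sims for y in sim]
print(f"Prior predictive range: [{min(flat):.0f}, {max(flat):.0f}]")
```

If the simulated range is wildly wider or narrower than the observed KPI range, the priors deserve a second look before any sampling time is spent.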
Always Check Diagnostics
MCMC diagnostics are not optional. Divergences, high R-hat, or low ESS indicate that your posterior samples may be unreliable. The framework provides these automatically; always review them before interpreting results.
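To give a feel for what R-hat measures, here is a sketch of the classic Gelman-Rubin statistic, which compares between-chain and within-chain variance. This is a simplification: ArviZ reports a rank-normalized split R-hat, so values from this function will not match the framework's diagnostics exactly. The simulated chains are stand-ins for real sampler output.

```python
import random
from statistics import mean, variance


def basic_rhat(chains):
    """Classic (non-split, non-rank-normalized) Gelman-Rubin R-hat."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean(variance(c) for c in chains)  # average within-chain variance
    between = n * variance(chain_means)         # scaled variance of chain means
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5


rng = random.Random(1)
# Four chains exploring the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
# Two chains stuck in a different region from the other two.
stuck = [[rng.gauss(shift, 1.0) for _ in range(500)] for shift in (0.0, 0.0, 3.0, 3.0)]

print(f"well-mixed chains: {basic_rhat(mixed):.3f}")
print(f"stuck chains:      {basic_rhat(stuck):.3f}")
```

When chains disagree about where the posterior mass lies, the between-chain term inflates the ratio well above 1, which is the signal the 1.01 threshold is designed to catch.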
Understanding Your Results
The MMMResults object provides everything you need for analysis:
# Channel contributions with uncertainty
print("\n=== Channel Contributions ===")
if results.channel_contributions is not None:
    contrib = results.channel_contributions.sum()
    print(f"Total TV contribution: {contrib['TV']:.0f}")
    print(f"Total Digital contribution: {contrib['Digital']:.0f}")
# Access the full InferenceData object for ArviZ plots
import arviz as az
# Posterior distributions
az.plot_posterior(results.idata, var_names=["beta_TV", "beta_Digital"])
# Trace plots for convergence
az.plot_trace(results.idata, var_names=["beta_TV", "beta_Digital"])
# Forest plot comparing channels
az.plot_forest(results.idata, var_names=["beta_TV", "beta_Digital"])
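The hdi_3% and hdi_97% columns in the summary bound a 94% highest-density interval, ArviZ's default. The sketch below shows one simple way to compute such an interval from raw draws; the helper and the simulated `beta_tv_draws` are hypothetical stand-ins, and ArviZ's own implementation differs in detail.

```python
import math
import random


def hdi(draws, prob=0.94):
    """Narrowest interval that contains at least `prob` of the draws."""
    ordered = sorted(draws)
    n = len(ordered)
    k = math.ceil(prob * n)  # number of draws the interval must cover
    widths = [ordered[i + k - 1] - ordered[i] for i in range(n - k + 1)]
    start = widths.index(min(widths))
    return ordered[start], ordered[start + k - 1]


rng = random.Random(7)
# Stand-in for posterior draws of a channel coefficient.
beta_tv_draws = [rng.gauss(2.0, 0.5) for _ in range(4000)]

low, high = hdi(beta_tv_draws)
print(f"94% HDI: [{low:.2f}, {high:.2f}]")
```

Reporting the interval rather than only the mean keeps the uncertainty visible when channel effects are compared.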
Generating Reports
The framework includes a powerful reporting module that generates portable, single-file HTML reports with embedded Plotly charts and honest uncertainty quantification throughout.
from mmm_framework.reporting import MMMReportGenerator, ReportConfig, ReportBuilder
# Option 1: Quick report from fitted model
report = MMMReportGenerator(
    model=mmm,
    panel=panel,
    results=results,
    config=ReportConfig(
        title="Marketing Mix Model Analysis",
        client="Acme Consumer Products",
        analysis_period="Jan 2023 - Dec 2024",  # Matches the 104 weeks of example data
    ),
)
# Save to HTML
report.to_html("mmm_report_q4_2025.html")
# Option 2: Fluent builder pattern for customization
report = (
    ReportBuilder()
    .with_model(mmm, panel=panel, results=results)
    .with_title("Q4 Marketing Analysis")
    .with_client("Acme Corp")
    .with_credible_interval(0.9)     # 90% credible intervals
    .enable_all_sections()
    .disable_section("diagnostics")  # Hide technical details
    .build()
)
report.to_html("executive_summary.html")
Report Contents
Executive summary, model fit visualization, channel ROI forest plots with uncertainty, revenue decomposition (waterfall + time series), saturation curves, and methodology documentation.
Customizable
Multiple color palettes (Sage, Corporate, Warm), configurable sections, adjustable credible intervals, and support for extended models (nested, multivariate, geographic).
See an Example Report
View a complete example report generated by the framework to see all available visualizations and sections:
Open Example Report →
Core Concept: MFF Data Format
The Master Flat File format is a long-format data structure designed for marketing measurement. It elegantly handles the dimensionality challenges common in MMM:
# MFF structure example
"""
Period | Geography | VariableName | VariableValue
2024-01-01 | National | Sales | 15000
2024-01-01 | National | TV | 50000
2024-01-01 | East | Sales | 8000
2024-01-01 | West | Sales | 7000
2024-01-08 | National | Sales | 16500
...
"""
# The framework handles dimension alignment automatically
mff_config = (
    MFFConfigBuilder()
    .with_kpi_name("Sales")
    # National media disaggregated to geo by population share
    .add_national_media("TV", adstock_lmax=8)
    # Geo-level media stays at its native granularity
    .add_geo_media("Local_Radio", adstock_lmax=6)
    .build()
)
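The comment about disaggregating national media by population share can be made concrete with a small sketch. The function name and the population figures below are hypothetical; how the framework actually sources its shares is not shown in this guide.

```python
def allocate_by_share(national_value, populations):
    """Split a national total across geographies in proportion to population."""
    total = sum(populations.values())
    return {geo: national_value * pop / total for geo, pop in populations.items()}


# Hypothetical populations for the two geographies in the MFF example.
populations = {"East": 6_000_000, "West": 4_000_000}
geo_tv = allocate_by_share(50_000.0, populations)
print(geo_tv)
```

The allocated values always add back up to the national total, so no spend is created or lost in the alignment.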
Core Concept: Builder Pattern
The fluent builder pattern provides a readable, type-safe API for configuration:
from mmm_framework import (
    PriorConfigBuilder,
    AdstockConfigBuilder,
    SaturationConfigBuilder,
    MediaChannelConfigBuilder,
    HierarchicalConfigBuilder,
    SeasonalityConfigBuilder,
)
# Build custom priors
decay_prior = (
    PriorConfigBuilder()
    .beta(alpha=2, beta=2)  # Centered at 0.5
    .build()
)
# Build adstock configuration
adstock = (
    AdstockConfigBuilder()
    .geometric()
    .with_l_max(8)
    .with_alpha_prior(decay_prior)
    .build()
)
# Build complete media channel config
tv_channel = (
    MediaChannelConfigBuilder()
    .with_name("TV")
    .national()
    .with_adstock(adstock)
    .with_hill_saturation()
    .positive_effect()  # Constrain to positive
    .build()
)
# Build hierarchical structure for geo models
hierarchical = (
    HierarchicalConfigBuilder()
    .enabled()
    .pool_across_geo()
    .use_non_centered()  # Better for sparse geos
    .with_non_centered_threshold(20)
    .build()
)
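For readers new to the pattern, the toy classes below sketch how a fluent builder works in general: each method returns the builder itself so calls can be chained, and build() produces an immutable config. `ToyAdstockConfig` and `ToyAdstockBuilder` are invented for this illustration and are not the framework's classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyAdstockConfig:
    """Immutable result object produced by the builder."""
    kind: str
    l_max: int


class ToyAdstockBuilder:
    """Hypothetical builder illustrating the chaining mechanics."""

    def __init__(self):
        self._kind = "geometric"
        self._l_max = 8

    def geometric(self):
        self._kind = "geometric"
        return self  # returning self is what enables chaining

    def with_l_max(self, l_max):
        if l_max < 1:
            raise ValueError("l_max must be at least 1")
        self._l_max = l_max
        return self

    def build(self):
        return ToyAdstockConfig(kind=self._kind, l_max=self._l_max)


config = ToyAdstockBuilder().geometric().with_l_max(6).build()
print(config)
```

Validating inside each setter means a bad value fails at the line that introduced it, rather than later during model construction.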
Core Concept: Bayesian Workflow
The framework implements the complete Bayesian workflow as described by Gelman et al. (2020):
1. Prior Predictive Check
Sample from priors to ensure they produce plausible data. Use mmm.sample_prior_predictive() before fitting.
2. Fit & Diagnose
Run MCMC and check convergence: confirm there are no divergences, R-hat is below 1.01, and ESS exceeds 400 for every parameter.
3. Posterior Predictive Check
Compare model predictions to observed data. Use results.posterior_predictive for calibration.
4. Sensitivity Analysis
Test how results change with different priors. Robust findings persist across reasonable prior choices.
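Prior sensitivity is easiest to see in a model with a closed-form posterior. The sketch below uses a toy conjugate normal model (known noise scale, normal prior on the mean) rather than an MMM; the data values and prior widths are made up for illustration.

```python
def normal_posterior(data, noise_sd, prior_mean, prior_sd):
    """Posterior mean and sd for a normal mean with known noise sd."""
    n = len(data)
    precision = 1.0 / prior_sd ** 2 + n / noise_sd ** 2
    post_mean = (prior_mean / prior_sd ** 2 + sum(data) / noise_sd ** 2) / precision
    return post_mean, precision ** -0.5


# Hypothetical noisy measurements of an effect whose sample mean is 1.0.
effects = [0.9, 1.2, 1.1, 0.8, 1.0, 1.3, 0.7, 1.0]

posterior_means = []
for prior_sd in (0.1, 0.5, 2.0):
    m, s = normal_posterior(effects, noise_sd=0.5, prior_mean=0.0, prior_sd=prior_sd)
    posterior_means.append(m)
    print(f"prior_sd={prior_sd}: posterior mean={m:.3f}, sd={s:.3f}")
```

A very tight prior centered at zero pulls the estimate far from the data, while the two wider priors give similar answers. A conclusion that holds only under the tightest prior would not count as robust.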
Project Structure
Understanding the repository layout helps you navigate the codebase:
src/mmm_framework/ - Core Python package
__init__.py - Module exports and version
config.py - Pydantic config classes and enums
data_loader.py - MFF parsing and validation
analysis.py - Analysis utilities
serialization.py - Save/load functionality
jobs.py - Async job management with ARQ
builders/ - Modular builder classes
base.py - Shared mixins and protocols
prior.py - Prior, adstock, saturation builders
variable.py - Media, control, KPI builders
model.py - Model config builders
mff.py - MFF config builders
model/ - Core model module
base.py - BayesianMMM class
results.py - Result containers
components/ - Model components
trend.py - Trend configurations
transforms/ - Transformation functions
adstock.py - Geometric adstock transforms
saturation.py - Logistic/Hill saturation
seasonality.py - Fourier features
trend.py - B-spline, piecewise trends
utils/ - Utility functions
standardization.py - Data standardization
statistics.py - Statistical helpers
reporting/ - HTML report generation
config.py - ReportConfig, ColorScheme
generator.py - MMMReportGenerator
sections.py - Report section implementations
design_tokens.py - Unified design tokens
charts/ - Modular chart functions
decomposition.py, diagnostic.py, fit.py, geo.py, roi.py
extractors/ - Data extraction from models
helpers/ - ROI, decomposition helpers
mmm_extensions/ - Extended model capabilities
config.py - Mediator, Outcome, CrossEffect configs
builders.py - Extension builders + factory functions
results.py - Extended model results
models/ - Model class implementations
nested.py - NestedMMM (mediation)
multivariate.py - MultivariateMMM
combined.py - CombinedMMM
components/ - PyMC/PyTensor blocks
cross_effects.py, variable_selection.py, transforms.py
api/ - FastAPI backend
main.py - Application factory and health endpoints
routes/ - API route handlers
schemas.py - Pydantic request/response models
redis_service.py - Redis connection management
worker.py - ARQ worker settings
app/ - Streamlit frontend
Home.py - Main entry point and dashboard
api_client.py - HTTP client for backend API
pages/ - Multipage app modules
1_Data_Management.py - Upload and manage datasets
2_Configuration.py - Build model configurations
3_Model_Fitting.py - Submit and monitor jobs
4_Results.py - View diagnostics and contributions
5_Scenarios.py - What-if analysis and optimization
components/ - Reusable UI components
__init__.py - Component exports
common.py - Formatters, session state, CSS
charts.py - Plotly visualization functions
examples/ - Usage examples
ex_builder.py - Builder pattern demonstrations
ex_config.py - Configuration examples
ex_models.py - Model fitting workflows
ex_extensions.py - Extended model examples
ex_reporter.py - Report generation examples
tests/ - Test suite
conftest.py - Pytest fixtures
mmm_extensions/ - Extension module tests
docs/ - GitHub Pages documentation
index.html - Documentation homepage
getting-started.html - This page
technical-guide.html - Model specifications
shared/ - Shared CSS and components
pyproject.toml - Project configuration (uv/pip)
README.md - Project documentation
Running the API Backend
For production use or the Streamlit app, start the full stack:
1. Start Redis
The job queue requires Redis running locally.
2. Start API Server
FastAPI backend with OpenAPI docs. Run from the api/ directory, where main.py lives.
$ uvicorn main:app --reload
3. Start Worker
ARQ worker processes model fitting jobs. Run from the api/ directory, where worker.py lives.
$ arq worker.WorkerSettings
4. Launch Streamlit
Interactive UI for configuration and analysis. Run from the app/ directory, where Home.py lives.
$ streamlit run Home.py
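The reason for running a separate worker can be sketched with the standard library alone. In the toy example below, an asyncio.Queue stands in for Redis and a coroutine stands in for the ARQ worker; none of this is ARQ's actual API, and `fit_model` is a placeholder for a long MCMC run.

```python
import asyncio


async def fit_model(job_id):
    """Placeholder for a long-running model fit."""
    await asyncio.sleep(0.05)
    return f"{job_id}: done"


async def worker(queue, results):
    """Pull jobs off the queue and store their results."""
    while True:
        job_id = await queue.get()
        results[job_id] = await fit_model(job_id)
        queue.task_done()


async def main():
    queue, results = asyncio.Queue(), {}
    task = asyncio.create_task(worker(queue, results))
    queue.put_nowait("job-1")  # enqueueing returns immediately
    status_before = results.get("job-1", "pending")
    await queue.join()         # wait until the worker has finished the job
    task.cancel()
    return status_before, results["job-1"]


status_before, status_after = asyncio.run(main())
print(status_before, "->", status_after)
```

The caller gets control back as soon as the job is queued and can poll for status, which is what keeps a frontend responsive while sampling runs elsewhere.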
Streamlit Application
The Streamlit app provides an interactive interface for the complete MMM workflow:
Data Upload
Upload MFF-format CSV files with automatic validation and preview.
Config Builder
Visual interface for building model configurations without code.
Model Fitting
Submit jobs, monitor progress, and view real-time diagnostics.
Results Analysis
Interactive visualizations for posteriors, contributions, and response curves.
Next Steps
You've built your first model! Here's where to go next:
Interactive Workflow Demo
Walk through the complete scientific modeling workflow step by step, from question formulation through prior predictive checks, MCMC fitting, diagnostics, sensitivity analysis, and report generation.
Launch Interactive Demo →
View Example Report