Getting Started

Build your first Bayesian Marketing Mix Model in minutes. This guide walks you through installation, core concepts, and a complete working example using the mmm-framework.

What You'll Build

By the end of this guide, you'll have a working MMM that estimates media effects with honest uncertainty quantification, the foundation for decisions you can trust.

📊 Prepare Data → ⚙️ Configure Model → 🔬 Fit & Diagnose → 📈 Analyze Results

Prerequisites

Before you begin, ensure you have the following installed:

✓ Python 3.12+
✓ Redis Server
✓ uv (recommended) or pip
✓ Git
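A quick way to confirm these prerequisites from Python is sketched below. It uses only the standard library. The helper name check_prerequisites is illustrative and not part of the framework, and the executable names (redis-server, uv, git) are assumptions about how the tools are installed on your system.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 12), tools=("redis-server", "uv", "git")):
    """Return a dict mapping each prerequisite name to True (found) or False."""
    # Compare only (major, minor) against the required minimum version.
    status = {"python": sys.version_info[:2] >= tuple(min_python)}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        status[tool] = shutil.which(tool) is not None
    return status


for name, ok in check_prerequisites().items():
    print(f"{'ok' if ok else 'MISSING':8s}{name}")
```

A missing uv is not a problem if you plan to install with pip instead.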

Why Redis?

The framework uses Redis + ARQ for asynchronous model fitting. This allows the Streamlit frontend to remain responsive while MCMC sampling runs in the background. For development, a local Redis instance is sufficient.
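The submit-and-poll idea behind this design can be illustrated with the standard library alone. The sketch below swaps in a thread pool for Redis + ARQ purely for illustration: slow_fit is a hypothetical stand-in for MCMC sampling, and the polling loop plays the role of a frontend that keeps redrawing while the job runs.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_fit(n_steps, delay=0.01):
    """Hypothetical stand-in for a long-running sampling job."""
    total = 0
    for step in range(n_steps):
        time.sleep(delay)  # pretend each step is expensive
        total += step
    return total


def run_with_polling(n_steps, poll_interval=0.005):
    """Submit the job to a background worker and poll until it finishes."""
    polls = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_fit, n_steps)
        while not future.done():
            polls += 1  # a UI would refresh its progress display here
            time.sleep(poll_interval)
        return future.result(), polls


result, polls = run_with_polling(10)
print(f"job result: {result}, status checks while waiting: {polls}")
```

A queue backed by Redis follows the same pattern, with the difference that the worker lives in a separate process and job state survives a frontend restart.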

Installation

Clone the Repository

# Clone the repository
$ git clone https://github.com/redam94/mmm-framework.git
$ cd mmm-framework

# Install with uv (recommended)
$ uv sync

# Or with pip
$ pip install -e .

Install App Dependencies

If you want to use the Streamlit frontend and API backend:

# Install app dependencies for Streamlit frontend
$ uv sync --group app

# For development (includes testing tools)
$ uv sync --group dev --group app

Verify Installation

import mmm_framework
print(f"mmm-framework version: {mmm_framework.__version__}")

# Check available components
from mmm_framework import (
    MFFConfigBuilder,
    ModelConfigBuilder,
    BayesianMMM,
    load_mff,
)
print("✓ All core components imported successfully")

Your First Model

Let's build a complete Marketing Mix Model from scratch. We'll walk through each step of the workflow, from data preparation to interpreting results.

Step 1: Prepare Your Data

The framework uses Master Flat File (MFF) format, a long-format structure that accommodates variables with different dimensions. Each row represents a single observation of a single variable.

import pandas as pd
import numpy as np

# Example: Create synthetic MFF data
np.random.seed(42)
n_weeks = 104  # 2 years of weekly data

# Generate dates
dates = pd.date_range("2023-01-01", periods=n_weeks, freq="W")

# Build MFF records
records = []

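# KPI: Sales (weekly). Note: generated independently of media spend,
# so the true media effects in this toy dataset are zero.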
for i, date in enumerate(dates):
    base_sales = 1000 + 50 * np.sin(2 * np.pi * i / 52)  # Seasonality
    noise = np.random.normal(0, 50)
    records.append({
        "Period": date,
        "VariableName": "Sales",
        "VariableValue": base_sales + noise
    })

# Media: TV spend (weekly, with some weeks at zero)
for i, date in enumerate(dates):
    spend = np.random.exponential(500) if np.random.random() > 0.2 else 0
    records.append({
        "Period": date,
        "VariableName": "TV",
        "VariableValue": spend
    })

# Media: Digital spend (weekly)
for i, date in enumerate(dates):
    spend = np.random.exponential(300)
    records.append({
        "Period": date,
        "VariableName": "Digital",
        "VariableValue": spend
    })

# Control: Price Index
for i, date in enumerate(dates):
    price = 100 + np.random.normal(0, 5)
    records.append({
        "Period": date,
        "VariableName": "Price",
        "VariableValue": price
    })

# Create DataFrame
mff_data = pd.DataFrame(records)
print(f"MFF shape: {mff_data.shape}")
print(mff_data.head(10))

MFF Format Benefits

The Master Flat File format handles complex scenarios like different variables at different granularities (e.g., national media + geo-level sales), hierarchical structures, and missing data, all in a single, consistent structure.
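To make the long format concrete, the sketch below pivots a tiny MFF table into the wide, one-column-per-variable layout that a model ultimately consumes, and then melts it back. It uses only pandas and the three column names from the example above. It illustrates the shape of the data and is not a description of how load_mff works internally.

```python
import pandas as pd

# Two weeks of a three-variable MFF table (values are made up).
mff = pd.DataFrame({
    "Period": pd.to_datetime(
        ["2024-01-07"] * 3 + ["2024-01-14"] * 3
    ),
    "VariableName": ["Sales", "TV", "Price"] * 2,
    "VariableValue": [1020.0, 450.0, 99.5, 1075.0, 0.0, 101.2],
})

# Long -> wide: one row per period, one column per variable.
wide = mff.pivot_table(
    index="Period",
    columns="VariableName",
    values="VariableValue",
    aggfunc="sum",
).sort_index()

# Wide -> long: recover the MFF layout.
long_again = wide.reset_index().melt(
    id_vars="Period", var_name="VariableName", value_name="VariableValue"
)

print(wide)
print(long_again)
```

Adding a new variable in MFF means appending rows, not changing the schema, which is why the format copes well with variables that arrive at different granularities.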

Step 2: Configure the Model

The framework uses a fluent builder pattern for configuration. This provides a readable, chainable API while ensuring type safety and validation.

from mmm_framework import (
    MFFConfigBuilder,
    ModelConfigBuilder,
    TrendConfig,
    TrendType,
    BayesianMMM,
    load_mff,
)

# Step 2a: Configure the data structure
mff_config = (
    MFFConfigBuilder()
    .with_kpi_name("Sales")                    # Target variable
    .add_national_media("TV", adstock_lmax=8)  # TV with 8-week carryover
    .add_national_media("Digital", adstock_lmax=4)  # Digital with 4-week carryover
    .add_price_control()                       # Price as control variable
    .build()
)

print(f"Media channels: {mff_config.media_names}")
print(f"Control variables: {mff_config.control_names}")

# Step 2b: Load and validate data
panel = load_mff(mff_data, mff_config)

print(f"\nPanel dataset:")
print(f"  Observations: {panel.n_obs}")
print(f"  Channels: {panel.n_channels}")
print(f"  Controls: {panel.n_controls}")

# Step 2c: Configure model inference
model_config = (
    ModelConfigBuilder()
    .bayesian_numpyro()       # Use JAX-based sampler (4-10x faster)
    .with_chains(4)           # 4 parallel chains for convergence diagnostics
    .with_draws(2000)         # 2000 posterior draws
    .with_tune(1000)          # 1000 warmup iterations
    .with_target_accept(0.9)  # Target acceptance rate
    .build()
)

# Step 2d: Configure trend component
trend_config = TrendConfig(
    type=TrendType.LINEAR,
    growth_prior_sigma=0.1
)

Adstock (Carryover)

Media effects persist over time. The adstock_lmax parameter sets the maximum lag window. TV typically has longer carryover (6-12 weeks) than digital (2-6 weeks).
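A minimal NumPy sketch of geometric adstock follows. It assumes that the lag window covers lags 0 through l_max - 1 and that the decay rate is a single scalar alpha. The framework's own transform may use a different convention or normalization, so treat the function below as an illustration of the idea only.

```python
import numpy as np


def geometric_adstock(spend, alpha, l_max, normalize=False):
    """Carry over spend with weights alpha**k for lags k = 0 .. l_max - 1."""
    spend = np.asarray(spend, dtype=float)
    weights = alpha ** np.arange(l_max)
    if normalize:
        # Optional: make the weights sum to 1 so total spend is preserved
        # once the full window has elapsed.
        weights = weights / weights.sum()
    # Full convolution, truncated to the original length, gives a causal filter:
    # output[t] = sum_k weights[k] * spend[t - k].
    return np.convolve(spend, weights)[: spend.size]


burst = np.array([100.0, 0.0, 0.0, 0.0, 0.0])
print(geometric_adstock(burst, alpha=0.5, l_max=3))
```

With alpha=0.5 and l_max=3, a single 100-unit burst contributes 100, then 50, then 25, and then drops out of the window entirely.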

Saturation

Returns diminish at higher spend levels. The framework uses Hill or logistic saturation functions by default, with priors that adapt to your data scale.
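The two curve families can be sketched in a few lines of NumPy. The parameterizations below are common textbook forms and are assumptions here: the parameter names half_sat, slope, and lam are illustrative, and the framework's own functions may be scaled differently.

```python
import numpy as np


def hill_saturation(x, half_sat, slope):
    """Hill curve: reaches 0.5 at x == half_sat and approaches 1 as x grows."""
    x = np.asarray(x, dtype=float)
    return x**slope / (half_sat**slope + x**slope)


def logistic_saturation(x, lam):
    """Logistic-style curve: 0 at x == 0, approaches 1, steeper for larger lam."""
    x = np.asarray(x, dtype=float)
    return (1.0 - np.exp(-lam * x)) / (1.0 + np.exp(-lam * x))


spend = np.array([0.0, 250.0, 500.0, 1000.0, 2000.0])
print(hill_saturation(spend, half_sat=500.0, slope=2.0))
print(logistic_saturation(spend, lam=0.002))
```

Both curves flatten as spend grows, so each additional unit of spend buys less response than the previous one.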

Step 3: Fit and Analyze

Now we build the model, check our priors, fit it to data, and analyze the results with proper uncertainty quantification.

# Build the model
mmm = BayesianMMM(panel, model_config, trend_config)

# Inspect model structure
print("Model parameters:")
for var in mmm.model.free_RVs:
    print(f"  {var.name}")

# Prior predictive check (ALWAYS do this before fitting)
print("\n=== Prior Predictive Check ===")
prior = mmm.sample_prior_predictive(samples=200)
y_prior = prior.prior_predictive["y_obs"].values.flatten()

print(f"Prior predictive y range: [{y_prior.min():.1f}, {y_prior.max():.1f}]")
print(f"Actual y range: [{panel.y.min():.1f}, {panel.y.max():.1f}]")

# Fit the model
print("\n=== Fitting Model ===")
results = mmm.fit(random_seed=42)

# Convergence diagnostics (CRITICAL - check these!)
print("\n=== Diagnostics ===")
print(f"Divergences: {results.diagnostics['divergences']}")
print(f"R-hat max: {results.diagnostics['rhat_max']:.4f}")
print(f"ESS bulk min: {results.diagnostics['ess_bulk_min']:.0f}")

# Check for issues
if results.diagnostics['divergences'] > 0:
    print("⚠️  Divergences detected - consider reparameterization")
if results.diagnostics['rhat_max'] > 1.01:
    print("⚠️  R-hat > 1.01 - chains may not have converged")
if results.diagnostics['ess_bulk_min'] < 400:
    print("⚠️  Low ESS - consider more draws")

# Posterior summary with uncertainty
print("\n=== Posterior Summary ===")
summary = results.summary(["beta_TV", "beta_Digital", "sigma"])
print(summary[["mean", "sd", "hdi_3%", "hdi_97%", "r_hat"]])

Always Check Diagnostics

MCMC diagnostics are not optional. Divergences, high R-hat, or low ESS indicate that your posterior samples may be unreliable. The framework provides these automatically; always review them before interpreting results.
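To see what R-hat actually measures, the sketch below computes the classic split R-hat for one parameter from raw chains. This is a simplified illustration: ArviZ reports a rank-normalized variant by default, so its values can differ slightly from this version. The simulated chains are made up for the demonstration.

```python
import numpy as np


def split_rhat(draws):
    """Classic split R-hat for draws of shape (n_chains, n_draws)."""
    draws = np.asarray(draws, dtype=float)
    half = draws.shape[1] // 2
    # Split every chain in two so within-chain drift also shows up.
    split = np.concatenate([draws[:, :half], draws[:, half:2 * half]], axis=0)
    n = split.shape[1]
    within = split.var(axis=1, ddof=1).mean()       # W: mean within-chain variance
    between = n * split.mean(axis=1).var(ddof=1)    # B: variance of chain means
    var_hat = (n - 1) / n * within + between / n    # pooled variance estimate
    return float(np.sqrt(var_hat / within))


rng = np.random.default_rng(42)
mixed = rng.normal(0.0, 1.0, size=(4, 1000))           # chains agree
stuck = mixed + np.array([0.0, 0.0, 3.0, 3.0])[:, None]  # two chains are offset

print(f"well-mixed chains: R-hat = {split_rhat(mixed):.3f}")
print(f"offset chains:     R-hat = {split_rhat(stuck):.3f}")
```

When chains explore the same distribution, between-chain and within-chain variance agree and R-hat sits near 1. Chains stuck in different regions inflate the between-chain term and push R-hat well above the 1.01 threshold.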

Understanding Your Results

The MMMResults object provides everything you need for analysis:

# Channel contributions with uncertainty
print("\n=== Channel Contributions ===")
if results.channel_contributions is not None:
    contrib = results.channel_contributions.sum()
    print(f"Total TV contribution: {contrib['TV']:.0f}")
    print(f"Total Digital contribution: {contrib['Digital']:.0f}")

# Access the full InferenceData object for ArviZ plots
import arviz as az

# Posterior distributions
az.plot_posterior(results.idata, var_names=["beta_TV", "beta_Digital"])

# Trace plots for convergence
az.plot_trace(results.idata, var_names=["beta_TV", "beta_Digital"])

# Forest plot comparing channels
az.plot_forest(results.idata, var_names=["beta_TV", "beta_Digital"])
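The hdi_3% and hdi_97% columns in the summary bound a 94% highest-density interval, the narrowest interval containing 94% of the posterior draws. The sketch below shows one simple way to compute such an interval from a 1-D array of draws. It assumes a unimodal posterior, and the simulated draws are placeholders for real posterior samples.

```python
import numpy as np


def hdi(draws, prob=0.94):
    """Narrowest interval containing roughly `prob` of the draws."""
    x = np.sort(np.asarray(draws, dtype=float).ravel())
    n = x.size
    k = int(np.floor(prob * n))  # index offset spanning the interval
    if k < 1 or k >= n:
        raise ValueError("need more draws for the requested probability")
    # Width of every candidate interval [x[i], x[i + k]]; pick the narrowest.
    widths = x[k:] - x[: n - k]
    i = int(np.argmin(widths))
    return float(x[i]), float(x[i + k])


rng = np.random.default_rng(0)
skewed = rng.exponential(1.0, size=10_000)  # placeholder for posterior draws
low, high = hdi(skewed)
print(f"94% HDI: [{low:.3f}, {high:.3f}]")
```

For a skewed posterior like this one, the HDI hugs the mode near zero, whereas an equal-tailed interval would cut off the most probable values.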

Generating Reports

The framework includes a powerful reporting module that generates portable, single-file HTML reports with embedded Plotly charts and honest uncertainty quantification throughout.

from mmm_framework.reporting import MMMReportGenerator, ReportConfig, ReportBuilder

# Option 1: Quick report from fitted model
report = MMMReportGenerator(
    model=mmm,
    panel=panel,
    results=results,
    config=ReportConfig(
        title="Marketing Mix Model Analysis",
        client="Acme Consumer Products",
        analysis_period="Jan 2023 - Dec 2024",
    ),
)

# Save to HTML
report.to_html("mmm_report_q4_2025.html")

# Option 2: Fluent builder pattern for customization
report = (
    ReportBuilder()
    .with_model(mmm, panel=panel, results=results)
    .with_title("Q4 Marketing Analysis")
    .with_client("Acme Corp")
    .with_credible_interval(0.9)  # 90% credible intervals
    .enable_all_sections()
    .disable_section("diagnostics")  # Hide technical details
    .build()
)

report.to_html("executive_summary.html")

📊 Report Contents

Executive summary, model fit visualization, channel ROI forest plots with uncertainty, revenue decomposition (waterfall + time series), saturation curves, and methodology documentation.

🎨 Customizable

Multiple color palettes (Sage, Corporate, Warm), configurable sections, adjustable credible intervals, and support for extended models (nested, multivariate, geographic).

📄 See an Example Report

View a complete example report generated by the framework to see all available visualizations and sections:

Open Example Report →

Core Concept: MFF Data Format

The Master Flat File format is a long-format data structure designed for marketing measurement. It elegantly handles the dimensionality challenges common in MMM:

# MFF structure example
"""
Period      | Geography | VariableName | VariableValue
2024-01-01  | National  | Sales        | 15000
2024-01-01  | National  | TV           | 50000
2024-01-01  | East      | Sales        | 8000
2024-01-01  | West      | Sales        | 7000
2024-01-08  | National  | Sales        | 16500
...
"""

# The framework handles dimension alignment automatically
mff_config = (
    MFFConfigBuilder()
    .with_kpi_name("Sales")
    # National media disaggregated to geo by population share
    .add_national_media("TV", adstock_lmax=8)
    # Geo-level media stays at its native granularity  
    .add_geo_media("Local_Radio", adstock_lmax=6)
    .build()
)
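The comment about disaggregating national media by population share can be illustrated with a small pandas sketch. The population figures and geography names below are hypothetical, and the code shows the arithmetic only. It is not the framework's implementation.

```python
import pandas as pd

# National TV spend per period (made-up values).
national_tv = pd.Series(
    [50000.0, 42000.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-08"]),
    name="TV",
)

# Hypothetical geo populations used as allocation weights.
population = pd.Series({"East": 6_000_000, "West": 4_000_000})
share = population / population.sum()

# Outer product: rows are periods, columns are geographies.
geo_tv = pd.DataFrame(
    national_tv.to_numpy()[:, None] * share.to_numpy()[None, :],
    index=national_tv.index,
    columns=share.index,
)

print(geo_tv)
```

Each row of the result sums back to the national figure, so the allocation changes granularity without creating or losing spend.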

Core Concept: Builder Pattern

The fluent builder pattern provides a readable, type-safe API for configuration:

from mmm_framework import (
    PriorConfigBuilder,
    AdstockConfigBuilder,
    SaturationConfigBuilder,
    MediaChannelConfigBuilder,
    HierarchicalConfigBuilder,
    SeasonalityConfigBuilder,
)

# Build custom priors
decay_prior = (
    PriorConfigBuilder()
    .beta(alpha=2, beta=2)  # Centered at 0.5
    .build()
)

# Build adstock configuration
adstock = (
    AdstockConfigBuilder()
    .geometric()
    .with_l_max(8)
    .with_alpha_prior(decay_prior)
    .build()
)

# Build complete media channel config
tv_channel = (
    MediaChannelConfigBuilder()
    .with_name("TV")
    .national()
    .with_adstock(adstock)
    .with_hill_saturation()
    .positive_effect()  # Constrain to positive
    .build()
)

# Build hierarchical structure for geo models
hierarchical = (
    HierarchicalConfigBuilder()
    .enabled()
    .pool_across_geo()
    .use_non_centered()  # Better for sparse geos
    .with_non_centered_threshold(20)
    .build()
)
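For readers new to the pattern, here is a minimal, self-contained fluent builder. SamplerConfig and SamplerConfigBuilder are hypothetical names invented for this sketch and are unrelated to the framework's classes. The point is the mechanics: each with_* method stores a value and returns self so calls can be chained, and build() validates once before producing an immutable config.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerConfig:
    """Immutable result object (hypothetical, for illustration only)."""
    chains: int
    draws: int
    tune: int
    target_accept: float


class SamplerConfigBuilder:
    def __init__(self):
        # Defaults chosen arbitrarily for the sketch.
        self._chains = 4
        self._draws = 1000
        self._tune = 1000
        self._target_accept = 0.8

    def with_chains(self, chains):
        self._chains = chains
        return self  # returning self is what makes the API chainable

    def with_draws(self, draws):
        self._draws = draws
        return self

    def with_target_accept(self, target_accept):
        self._target_accept = target_accept
        return self

    def build(self):
        # Validate in one place, right before the config is created.
        if self._chains < 1 or self._draws < 1:
            raise ValueError("chains and draws must be positive")
        if not 0.0 < self._target_accept < 1.0:
            raise ValueError("target_accept must be in (0, 1)")
        return SamplerConfig(
            self._chains, self._draws, self._tune, self._target_accept
        )


config = SamplerConfigBuilder().with_chains(4).with_draws(2000).with_target_accept(0.9).build()
print(config)
```

Deferring validation to build() means an invalid combination fails at configuration time instead of partway through a long sampling run.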

Core Concept: Bayesian Workflow

The framework implements the complete Bayesian workflow as described by Gelman et al. (2020):

1. Prior Predictive Check

Sample from priors to ensure they produce plausible data. Use mmm.sample_prior_predictive() before fitting.

2. Fit & Diagnose

Run MCMC and check convergence: there should be no divergences, R-hat should be below 1.01, and bulk ESS should exceed 400 for every parameter.

3. Posterior Predictive Check

Compare model predictions to observed data. Use results.posterior_predictive for calibration.

4. Sensitivity Analysis

Test how results change with different priors. Robust findings persist across reasonable prior choices.
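One simple calibration check for step 3 is interval coverage: the fraction of observed values that fall inside the posterior predictive interval should be close to the interval's nominal probability. The sketch below assumes replicated data arrives as an array of shape (n_draws, n_obs). The simulated arrays stand in for real posterior predictive draws, and interval_coverage is an illustrative helper, not a framework function.

```python
import numpy as np


def interval_coverage(y_obs, y_rep, prob=0.9):
    """Share of observations inside the central `prob` predictive interval."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_rep = np.asarray(y_rep, dtype=float)  # shape (n_draws, n_obs)
    tail = (1.0 - prob) / 2.0
    lower = np.quantile(y_rep, tail, axis=0)
    upper = np.quantile(y_rep, 1.0 - tail, axis=0)
    return float(np.mean((y_obs >= lower) & (y_obs <= upper)))


rng = np.random.default_rng(0)
y_obs = rng.normal(1000.0, 50.0, size=104)

# A model whose predictive spread matches the data, and one that is overconfident.
y_rep_calibrated = rng.normal(1000.0, 50.0, size=(2000, 104))
y_rep_overconfident = rng.normal(1000.0, 5.0, size=(2000, 104))

print(f"calibrated:    {interval_coverage(y_obs, y_rep_calibrated):.2f}")
print(f"overconfident: {interval_coverage(y_obs, y_rep_overconfident):.2f}")
```

Coverage far below the nominal level suggests the model understates uncertainty. Coverage far above it suggests the predictive intervals are wider than the data warrant.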

Project Structure

Understanding the repository layout helps you navigate the codebase:

mmm-framework/
src/mmm_framework/ - Core Python package
__init__.py - Module exports and version
config.py - Pydantic config classes and enums
data_loader.py - MFF parsing and validation
analysis.py - Analysis utilities
serialization.py - Save/load functionality
jobs.py - Async job management with ARQ
builders/ - Modular builder classes
base.py - Shared mixins and protocols
prior.py - Prior, adstock, saturation builders
variable.py - Media, control, KPI builders
model.py - Model config builders
mff.py - MFF config builders
model/ - Core model module
base.py - BayesianMMM class
results.py - Result containers
components/ - Model components
trend.py - Trend configurations
transforms/ - Transformation functions
adstock.py - Geometric adstock transforms
saturation.py - Logistic/Hill saturation
seasonality.py - Fourier features
trend.py - B-spline, piecewise trends
utils/ - Utility functions
standardization.py - Data standardization
statistics.py - Statistical helpers
reporting/ - HTML report generation
config.py - ReportConfig, ColorScheme
generator.py - MMMReportGenerator
sections.py - Report section implementations
design_tokens.py - Unified design tokens
charts/ - Modular chart functions
decomposition.py, diagnostic.py, fit.py, geo.py, roi.py
extractors/ - Data extraction from models
helpers/ - ROI, decomposition helpers
mmm_extensions/ - Extended model capabilities
config.py - Mediator, Outcome, CrossEffect configs
builders.py - Extension builders + factory functions
results.py - Extended model results
models/ - Model class implementations
nested.py - NestedMMM (mediation)
multivariate.py - MultivariateMMM
combined.py - CombinedMMM
components/ - PyMC/PyTensor blocks
cross_effects.py, variable_selection.py, transforms.py
api/ - FastAPI backend
main.py - Application factory and health endpoints
routes/ - API route handlers
schemas.py - Pydantic request/response models
redis_service.py - Redis connection management
worker.py - ARQ worker settings
app/ - Streamlit frontend
Home.py - Main entry point and dashboard
api_client.py - HTTP client for backend API
pages/ - Multipage app modules
1_Data_Management.py - Upload and manage datasets
2_Configuration.py - Build model configurations
3_Model_Fitting.py - Submit and monitor jobs
4_Results.py - View diagnostics and contributions
5_Scenarios.py - What-if analysis and optimization
components/ - Reusable UI components
__init__.py - Component exports
common.py - Formatters, session state, CSS
charts.py - Plotly visualization functions
examples/ - Usage examples
ex_builder.py - Builder pattern demonstrations
ex_config.py - Configuration examples
ex_models.py - Model fitting workflows
ex_extensions.py - Extended model examples
ex_reporter.py - Report generation examples
tests/ - Test suite
conftest.py - Pytest fixtures
mmm_extensions/ - Extension module tests
docs/ - GitHub Pages documentation
index.html - Documentation homepage
getting-started.html - This page
technical-guide.html - Model specifications
shared/ - Shared CSS and components
pyproject.toml - Project configuration (uv/pip)
README.md - Project documentation

Running the API Backend

For production use or the Streamlit app, start the full stack:

1. Start Redis

The job queue requires Redis running locally.

$ redis-server

2. Start API Server

FastAPI backend with OpenAPI docs.

$ cd api
$ uvicorn main:app --reload

3. Start Worker

ARQ worker processes model fitting jobs.

$ cd api
$ arq worker.WorkerSettings

4. Launch Streamlit

Interactive UI for configuration and analysis.

$ cd app
$ streamlit run Home.py
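Once the four processes are running, a quick port check can confirm that each one is listening. The sketch below uses only the standard library and assumes every service runs on localhost with its usual default port (6379 for Redis, 8000 for uvicorn, 8501 for Streamlit). Adjust the mapping if your setup overrides any of them. An open port shows that something is listening, not that the service is healthy.

```python
import socket

# Assumed defaults; change these if you start a service on another port.
DEFAULT_PORTS = {"redis": 6379, "api": 8000, "streamlit": 8501}


def port_is_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def stack_status(host="127.0.0.1", ports=None):
    """Map each service name to whether its port accepts connections."""
    ports = DEFAULT_PORTS if ports is None else ports
    return {name: port_is_open(host, port) for name, port in ports.items()}


for name, up in stack_status().items():
    print(f"{name:10s} {'listening' if up else 'not reachable'}")
```

The ARQ worker does not listen on a port of its own, so check its terminal output to confirm it has connected to Redis.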

Streamlit Application

The Streamlit app provides an interactive interface for the complete MMM workflow:

📤 Data Upload

Upload MFF-format CSV files with automatic validation and preview.

⚙️ Config Builder

Visual interface for building model configurations without code.

🔬 Model Fitting

Submit jobs, monitor progress, and view real-time diagnostics.

📊 Results Analysis

Interactive visualizations for posteriors, contributions, and response curves.

Next Steps

You've built your first model! Here's where to go next:

🔬 Interactive Workflow Demo

Walk through the complete scientific modeling workflow step by step, from question formulation through prior predictive checks, MCMC fitting, diagnostics, sensitivity analysis, and report generation.

Launch Interactive Demo → View Example Report

Questions or Issues?

Open an issue on GitHub or check the FAQ for common questions.