9 steps from question to actionable insights
What do you actually want to learn?
Every good analysis starts with a clear question. This might seem obvious, but skipping this step is the most common mistake in marketing measurement. Without a clear question, you end up with lots of numbers but no insights.
You wouldn't open Google Maps before knowing where you want to go. The destination determines the route. Similarly, your business question determines what analysis to run, what data you need, and how to interpret the results.
"Which channels work and by how much?"
"How should we split our budget?"
"What happens if we change spend?"
For this walkthrough: "What is the ROI of each marketing channel, and how should we allocate our budget?"
This is a causal question—we want to know if changing our marketing spend will cause sales to change. That's harder than measuring correlation, but it's what we need for actual decisions.
When you decide your question after seeing the data, you can unconsciously twist the question to match whatever the data happened to show. That's like drawing a target around wherever your arrow landed—it looks impressive but doesn't mean you're a good archer.
This practice of deciding your analysis plan beforehand is called pre-registration, and it's one of the most important safeguards against fooling yourself.
What process generated these observations? Think causally.
Before touching any model code, write down the generative story—a narrative of how you believe your data came to be. This isn't just about marketing effects; it's about everything that affects sales, including what else might create spurious associations between media and outcomes.
Every data point in your sales figures has an origin story. Your job is to imagine the invisible forces that shaped each observation—like a novelist explaining why their character ended up where they did. Miss a key plot element (a confounder), and your story doesn't make sense. Include irrelevant details (a collider), and you'll mislead the reader.
Different factors combine to create the sales pattern you observe. Together they form the generative process your model will learn:
Baseline: organic demand without marketing
Trend: long-term growth or decline
Seasonality: predictable annual rhythms
Marketing: channel contributions with carryover
External factors: distribution, competition, weather
Noise: unexplained variation
Sales start from a baseline of ~$50K/week from loyal customers. The brand is growing 2% annually as it gains market share. Seasonal patterns drive a holiday surge in Q4. Marketing channels add incremental sales with diminishing returns and carryover effects. External factors like distribution and weather affect sales independently. Finally, random variation accounts for unpredictable events.
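The story above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the model itself: the function name `simulate_sales`, the spend range, the seasonal amplitude, the carryover decay of 0.6, the saturation constants, and the noise levels are all hypothetical. Only the ~$50K baseline and the 2% annual growth come from the text, and growth is approximated as linear.

```python
import math
import random

def simulate_sales(n_weeks=104, seed=0):
    """Simulate weekly sales as a sum of the six story components."""
    rng = random.Random(seed)
    baseline = 50_000.0      # ~$50K/week from loyal customers (from the text)
    annual_growth = 0.02     # 2% per year (from the text), applied linearly
    carried = 0.0            # adstocked spend carried over from earlier weeks
    rows = []
    for t in range(n_weeks):
        trend = baseline * annual_growth * t / 52.0
        # One smooth annual wave peaking at week 48 (Q4); amplitude is assumed.
        season = 5_000.0 * math.cos(2.0 * math.pi * (t - 48) / 52.0)
        spend = rng.uniform(0.0, 20_000.0)         # hypothetical media spend
        carried = spend + 0.6 * carried            # carryover, assumed decay 0.6
        media = 15_000.0 * carried / (carried + 30_000.0)  # diminishing returns
        external = rng.gauss(0.0, 1_000.0)         # e.g. weather, independent of spend
        noise = rng.gauss(0.0, 2_000.0)            # unexplained variation
        sales = baseline + trend + season + media + external + noise
        rows.append({
            "week": t, "trend": trend, "season": season, "media": media,
            "external": external, "noise": noise, "sales": sales,
        })
    return rows

rows = simulate_sales()
print(round(rows[0]["sales"]), round(rows[-1]["sales"]))
```

Because every component is stored separately, you can zero one out and re-sum the rest to see how much of the observed pattern it is responsible for.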
Imagine you turned off all advertising tomorrow. Sales wouldn't drop to zero—people who already love your product would still buy it. This is your baseline demand: the foundation everything else builds upon.
Repeat customers who buy habitually
Shoppers who discover you on shelf
Referrals from satisfied customers
People actively seeking your product
If you overestimate your baseline, marketing looks less effective than it really is (you're crediting baseline with sales that marketing created). If you underestimate it, marketing looks too effective. Getting the baseline right is crucial for accurate ROI.
Zoom out from weekly fluctuations and ask: Is your business growing, stable, or declining? This long-term trajectory exists independent of marketing tactics—it's the tide that lifts or lowers all boats.
Growth Brand: New market entrant gaining share. Risk: Marketing gets credited for organic growth if trend is ignored.
Trend is like an escalator. A growth brand is on an up escalator—even standing still, you move forward. Marketing is like walking on the escalator; it adds to your movement but isn't the whole picture. If you ignore the escalator, you'll think walking (marketing) is more powerful than it really is.
Sales don't flow evenly through the year. Ice cream peaks in summer, toys in December, fitness products in January. These seasonal patterns are predictable rhythms driven by weather, holidays, and cultural cycles—not by marketing.
Companies often run big campaigns during Q4 holidays—exactly when sales would spike anyway due to seasonal demand. Without proper seasonality modeling, you'll credit your holiday campaign with sales that would have happened regardless. This is why seasonality is a confounder: it affects both media planning AND sales.
Each marketing channel can lift sales above what baseline, trend, and seasonality would predict. But marketing effects have two crucial properties that make them tricky to measure:
The first dollar of TV spend is more valuable than the millionth. Saturation sets in.
Today's ad doesn't just affect today's sales—its impact lingers for weeks.
If you ignore carryover, you'll think last week's campaign had no effect because you only look at same-week sales. But a TV ad seen on Monday might drive a purchase on Friday—or three Fridays later. Carryover modeling captures this delayed response.
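One common way to express carryover is geometric adstock, where a fixed fraction of last week's accumulated effect survives into this week. The sketch below assumes that form; the function name `geometric_adstock` and the decay value of 0.5 are illustrative choices, not values from this walkthrough.

```python
def geometric_adstock(spend, decay):
    """Carry a fraction `decay` of last week's adstock into this week."""
    if not 0.0 <= decay < 1.0:
        raise ValueError("decay must be in [0, 1)")
    out = []
    carried = 0.0
    for x in spend:
        carried = x + decay * carried   # this week's spend plus what lingers
        out.append(carried)
    return out

# A single week of advertising followed by four dark weeks.
pulse = [100.0, 0.0, 0.0, 0.0, 0.0]
print(geometric_adstock(pulse, 0.5))  # [100.0, 50.0, 25.0, 12.5, 6.25]
```

A same-week analysis would only see the first value; the remaining four are the delayed response that carryover modeling is meant to capture.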
Here's where things get subtle. Besides baseline, trend, seasonality, and marketing, other factors affect your sales. Some of these factors are harmless. Others will completely bias your results if you handle them wrong.
You're a detective trying to solve "Who drove these sales?" Media is your prime suspect, but there are accomplices, witnesses, and red herrings. Confounders are accomplices who also left fingerprints—ignore them and you'll blame the wrong suspect. Colliders are red herrings that will lead you astray if you follow them.
The brand expanded to 500 new stores. They increased media spend to support the launch. Sales rose 20%. Was it the media or the new stores? Both—distribution is a confounder.
A competitor launched an aggressive campaign. The brand increased media to maintain share of voice. Sales dipped slightly. Ignoring competitor activity makes media look ineffective.
An unusual heatwave boosted store traffic. The brand didn't change media spend in response. Sales rose. Weather is a precision control—include it to reduce noise.
Not all control variables are created equal. Their causal relationship to media and sales determines whether they can be dropped, must be included, or should be avoided entirely. Getting this wrong is one of the most common sources of bias in MMM.
Confounder: affects both media and sales. Must always include.
Precision control: affects sales only, not media. Safe for variable selection.
Mediator: lies on the causal path. Controlling for it blocks the effect.
Collider: caused by both media and sales. Controlling for it creates bias.
| Variable | Type | Why? | What to Do |
|---|---|---|---|
| Distribution/ACV | Confounder | Brands with more distribution get more media and have higher sales | ✓ Always include; never drop |
| Price/Promotions | Confounder | Promotional periods often coincide with media flights | ✓ Always include; never drop |
| Competitor Media | Confounder | Competitors react to your media; both affect your sales | ✓ Always include; never drop |
| Seasonality | Confounder | Media planning follows seasonal demand patterns | ✓ Always include (Fourier terms) |
| Weather | Precision | Affects store visits but doesn't drive media budgets | ✓ Include or apply selection |
| Gas Prices | Precision | Affects discretionary spending; not tied to media | ✓ Include or apply selection |
| Minor Holidays | Precision | Columbus Day affects traffic but not media planning | ✓ Include or apply selection |
| Brand Awareness | Mediator | Media → Awareness → Sales (it's the mechanism!) | ⚠️ Usually exclude for total effect |
| Purchase Intent | Mediator | On the causal path from media to sales | ⚠️ Exclude unless decomposing paths |
| Survey Response | Collider | People who saw ads AND bought are more likely to respond | 🚫 Never include |
You might wonder: "If I include weather (which is seasonal) and gas prices (which have trends), do I still need explicit trend and seasonality components?" Yes, absolutely.
External factors like weather or gas prices capture specific mechanisms—how temperature affects store visits, how fuel costs affect discretionary spending. Your model's trend and seasonality components capture something different: the overall rhythms of demand and long-term trajectory that aren't fully explained by any single external factor.
Weather captures: the specific effect of temperature/precipitation on behavior (e.g., hot days → more ice cream)
Seasonality captures: the overall annual rhythm of demand (e.g., Q4 holiday surge) that persists even controlling for weather
Think of it this way: Even if you perfectly controlled for every external factor, would demand still have an annual pattern? Would the brand still be growing or declining? If yes, you need trend and seasonality components. These are confounders (they affect both media planning and sales), so they must always be included—regardless of what other seasonal variables you have.
A confounder creates a spurious correlation between media and sales that has nothing to do with causation. If you don't control for it, your media effect estimate will absorb this spurious correlation and be biased.
Ice cream sales and drowning deaths are correlated. But ice cream doesn't cause drowning—summer is the confounder that increases both. If you tried to estimate "the effect of ice cream on drowning" without controlling for season, you'd get a spuriously positive number.
In MMM, distribution works the same way: brands with more distribution spend more on media AND sell more. If you don't control for distribution, media looks more effective than it really is.
Expanding to 1,000 new stores triggers both media spend increase AND sales increase.
Q4 holiday season drives both ad spending spikes AND natural demand surges.
Competitor launches → you increase media → their launch hurts your sales.
When you include a variable to control for confounding, its coefficient may not be interpretable as a causal effect. Distribution might show a negative coefficient because other omitted variables (market saturation, competitive intensity) are correlated with it. Reading control coefficients as causal effects is sometimes called the Table 2 fallacy.
The goal: Get unbiased media effects, not to understand distribution's causal impact. Include distribution to isolate media effects—don't agonize over its sign.
Precision controls only affect the outcome—they're uncorrelated with your treatment (media spending). Including them reduces noise and gives you tighter credible intervals on media effects, but omitting them doesn't bias your estimates.
Imagine weighing packages on a scale that wobbles. The wobble (weather, gas prices) adds noise to your measurements but doesn't systematically make packages heavier or lighter. Stabilizing the scale (including precision controls) gives more precise readings but doesn't change the average weight.
Because precision controls don't affect media spending, it's safe to apply Bayesian variable selection (horseshoe, spike-and-slab) to them. If they turn out to be irrelevant, we shrink them toward zero without biasing our media estimates.
Ask: "Does this variable influence how we allocate media budget?" If no, it's likely a precision control. If yes, it's probably a confounder and must be protected from variable selection.
A mediator lies on the causal path between media and sales. Brand awareness is the classic example: TV advertising increases awareness, and awareness drives purchase. The chain is: Media → Awareness → Sales.
If you control for awareness in your model, you're asking: "What's the effect of media on sales, holding awareness constant?" But media works through awareness! You've just blocked the causal pathway and will dramatically underestimate media's true effect.
Rule: For the total effect of media on sales, exclude mediators. Only include mediators if you specifically want to decompose paths (direct vs. indirect effects).
Exclude mediators when asking: "What's TV's total impact on sales?"
Include mediators when asking: "How much of TV's effect is through awareness vs. direct?"
A collider is caused by both media and sales. Unlike confounders (which we must control) and mediators (which we usually exclude), colliders are dangerous specifically when we control for them.
Among Hollywood actors, there's a negative correlation between attractiveness and acting talent. Does being attractive make you worse at acting? No—it's selection bias. To become a Hollywood actor (the collider), you need either looks OR talent OR both. By conditioning on "is a Hollywood actor," you've created a spurious negative relationship.
In MMM, "survey respondent" can be a collider: people who both saw your ads AND bought your product are more likely to respond to a brand survey. Controlling for survey response creates a spurious association between ad exposure and purchase.
Common colliders in marketing data include: survey response, website visit + purchase combinations, and any "engagement metric" that's caused by both advertising and underlying purchase intent. When in doubt, draw the DAG and check the arrow directions.
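The Hollywood example can be reproduced with a few lines of simulation. This is a toy sketch: looks and talent are drawn independently, and the selection rule (`looks + talent > 1.5`) is a hypothetical stand-in for "became an actor". The names `correlation`, `looks`, `talent`, and `actors` are illustrative.

```python
import random

def correlation(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)
looks = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
talent = [rng.gauss(0.0, 1.0) for _ in range(20_000)]   # independent of looks

# Collider: selection depends on BOTH causes (threshold is an assumption).
actors = [(l, t) for l, t in zip(looks, talent) if l + t > 1.5]

corr_everyone = correlation(looks, talent)
corr_actors = correlation([a[0] for a in actors], [a[1] for a in actors])
print(f"everyone: {corr_everyone:+.2f}, actors only: {corr_actors:+.2f}")
```

In the full population the two traits are unrelated; restricting to the selected group manufactures a clearly negative correlation. Adding a collider as a control variable in a regression conditions on it in the same way.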
Omitting a confounder (distribution) biases your media effect estimate.
With distribution included, we isolate the true causal effect of media (0.25). The correlation between media and sales that was actually due to distribution is properly attributed—not falsely credited to media.
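A small simulation makes the bias concrete. In this sketch the true media effect is set to 0.25 to match the figure above; everything else (the 0.8 link from distribution to media, the unit effect of distribution on sales, the noise levels, and the helper names `cov`, `slope_naive`, `slope_controlled`) is an assumption for illustration. The controlled slope uses the closed-form two-regressor least-squares solution.

```python
import random

def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def slope_naive(sales, media):
    """Media coefficient from regressing sales on media only."""
    return cov(media, sales) / cov(media, media)

def slope_controlled(sales, media, dist):
    """Media coefficient from regressing sales on media AND distribution."""
    smm, sdd, smd = cov(media, media), cov(dist, dist), cov(media, dist)
    sms, sds = cov(media, sales), cov(dist, sales)
    return (sms * sdd - smd * sds) / (smm * sdd - smd ** 2)

rng = random.Random(42)
TRUE_EFFECT = 0.25
dist = [rng.gauss(0.0, 1.0) for _ in range(5_000)]
media = [0.8 * d + rng.gauss(0.0, 0.6) for d in dist]      # distribution drives media
sales = [TRUE_EFFECT * m + 1.0 * d + rng.gauss(0.0, 0.5)   # ...and drives sales
         for m, d in zip(media, dist)]

naive = slope_naive(sales, media)
controlled = slope_controlled(sales, media, dist)
print(f"without distribution: {naive:.2f}, with distribution: {controlled:.2f}")
```

Under these assumptions the uncontrolled estimate lands near 1.05, roughly four times the true effect, while the controlled estimate recovers a value close to 0.25.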
Even after accounting for everything—baseline, trend, seasonality, marketing, and control factors—there's always unexplained variation. A celebrity mentions your product, a competitor has a recall, an unexpected weather event. We can't predict these, but we acknowledge they exist as the error term.
Some analysts try to explain every wiggle in the data. This is a mistake. Overfitting to noise makes your model look good on historical data but perform poorly on future predictions. A good model acknowledges that some variation is fundamentally unpredictable.
The error term isn't a failure—it's honesty. It says: "Given everything we know, this is how uncertain we are about any given week's sales."
Meteorologists can predict that next Tuesday will be warmer than Monday based on seasonal patterns and weather systems. But they can't predict whether a random cloud will pass overhead at 2pm. The random cloud is the error term—real, unpredictable, and important to acknowledge.
"Sales start from a baseline of existing demand, following an underlying trend (growth or decline). This foundation rises and falls with seasonal patterns. Marketing channels add to this, but with diminishing returns as spend increases, and effects carry over for several weeks. Meanwhile, other factors—distribution, pricing, competition—affect both media decisions and sales, creating spurious correlations we must control for. Weather and other external events which affect sales but do not impact media spend can help isolate the media effect, but if excluded won't bias results. We avoid controlling for variables that lie on the causal path from media to sales (like brand awareness) to get the total effect of marketing, unless we specifically want to decompose paths. And we never control for colliders like survey response, which would introduce bias. Finally, even after accounting for all these factors, some variation in sales remains unexplained."
This story—these assumptions—will become our mathematical model. If we're wrong about any part, especially the causal structure, our results will be wrong. That's why we classify variables before seeing results, not after.
Translate the story into mathematical components
Now we translate each part of our story into a specific mathematical component. The choices here should flow directly from the business context we just described, not from "what makes the results look good."
We make these choices before seeing results. If we adjusted them afterward to improve fit or achieve desired coefficients, we'd be doing specification shopping, and our uncertainty estimates would be meaningless.
Based on our story, we need to capture the underlying trajectory. Here are the options and when each makes sense:
| Option | The Business Story | Math | Pros/Cons |
|---|---|---|---|
| None | "Business is stable, with no real growth or decline pattern." Mature category, flat market, steady demand. | τ(t) = 0 | ✓ Simplest ✗ Misses real trends |
| Linear | "We've been growing 3% year-over-year consistently." Organic growth, category expansion, steady trajectory. | τ(t) = δ · t | ✓ Interpretable slope ✓ Well-identified ✗ Assumes constant rate |
| Piecewise | "COVID hit in March 2020 and everything changed." Product launches, market disruptions, known structural breaks. | Linear segments joined at changepoints | ✓ Captures known breaks ✓ Interpretable changes ✗ Need to specify when |
| B-Spline | "Things have been shifting gradually; can't point to one moment." Gradual evolution, multiple small changes, smooth trajectory. | Smooth spline basis functions | ✓ Flexible shape ✓ Smoother than piecewise ✗ Less interpretable |
| Gaussian Process | "I genuinely don't know what's driving the underlying pattern." Uncertain trajectory, want uncertainty in trend itself. | τ(t) ~ GP(0, k) | ✓ Maximum flexibility ✓ Uncertainty quantified ✗ Computationally expensive ✗ Can absorb real signal |
"We've been growing steadily—about 3% year-over-year."
Linear trend
Seasonality captures predictable calendar patterns. The key choice is how complex the pattern is—more Fourier terms capture finer patterns but risk overfitting.
| Order | The Business Story | What It Captures | When to Use |
|---|---|---|---|
| 0 (None) | "Our B2B clients buy consistently year-round." | No seasonal pattern | B2B, subscription, industrial |
| 1 | "Q4 is big, summer is slow—one smooth wave." | Simple sine wave | Basic retail pattern |
| 2 | "Holiday spike is sharp—rises fast, falls fast." | Peaked pattern | Typical CPG, retail |
| 3-5 | "We have distinct peaks at Valentine's, Easter, Halloween, Christmas." | Multiple peaks | Confectionery, flowers, gifts |
"Big Q4 spike, sharp drop in January, summer lull."
Fourier seasonality, order 2
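As a rough sketch of what "order 2" means in practice: each Fourier order contributes one sine and one cosine column, so order 2 gives four seasonal features per week. The function name `fourier_features` and the 52-week period are assumptions for weekly data.

```python
import math

def fourier_features(week, order, period=52.0):
    """Return [sin_1, cos_1, ..., sin_K, cos_K] for the given week index."""
    feats = []
    for k in range(1, order + 1):
        angle = 2.0 * math.pi * k * week / period
        feats.append(math.sin(angle))
        feats.append(math.cos(angle))
    return feats

print(fourier_features(0, 2))        # [0.0, 1.0, 0.0, 1.0]
print(len(fourier_features(13, 5)))  # 10
```

The model then learns one coefficient per column; higher orders add faster-oscillating columns, which is what allows sharper or multiple peaks and also what creates the overfitting risk.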
"More spending helps, but each dollar is less effective than the last"
Saturation function
S-curve. Needs critical mass, then saturates.
Immediate returns, gradual diminishing.
Steep S-curve with tipping point.
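The diminishing-returns idea can be sketched with a Hill function, the saturation form named in the base specification later in this walkthrough. The parameter names `half_saturation` and `shape` are illustrative: with `shape` above 1 the curve is S-shaped, and with `shape` equal to 1 it reduces to a curve with immediate, gradually diminishing returns.

```python
def hill(x, half_saturation, shape):
    """Hill saturation: maps spend x >= 0 to a response in [0, 1)."""
    if x < 0 or half_saturation <= 0 or shape <= 0:
        raise ValueError("x must be >= 0; half_saturation and shape must be > 0")
    if x == 0:
        return 0.0
    xs = x ** shape
    return xs / (xs + half_saturation ** shape)

# Doubling spend from 50 to 100 adds 0.3; doubling again to 200 adds the same 0.3
# for twice the extra spend, so each additional dollar buys less.
print([round(hill(x, 100.0, 2.0), 3) for x in (50.0, 100.0, 200.0)])  # [0.2, 0.5, 0.8]
```

The response equals 0.5 exactly when spend equals `half_saturation`, which makes that parameter easy to reason about when setting priors.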
"People remember ads and act on them days or weeks later"
Adstock transformation
Pizza, snacks. Buy within days.
Shoes, electronics. Weeks to decide.
Cars, B2B. Months of research.
A coffee ad makes you want coffee now—fast decay. A car ad plants a seed that grows over months as you research—delayed peak.
"Advertising Product A also affects sales of Product B"
Cross-effect terms
Flagship ads boost whole brand.
New model steals from old.
Console ads boost game sales.
Cross-effects require more data to estimate. Only include if you can explain the mechanism: "Why would Product A ads affect Product B sales?"
"TV builds awareness → Awareness drives sales"
Nested/Mediated model
For established brands.
For new brands, B2B.
Coca-Cola ad = bumping into an old friend (buy immediately). New brand ad = first date (need to build familiarity before purchase).
Priors encode what we believe before seeing data. They're not arbitrary; they represent real knowledge: "Media effects are probably positive but not huge."
"We expect positive effects, but with uncertainty"
Prior distribution
Putting all the pieces together:
sales[t] = α + τ(t) + s(t) + Σ β[c] × saturation(adstock(spend[c,t])) + ε[t]
where: τ(t) = linear trend, s(t) = Fourier(order=2)
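The formula above, minus the error term ε[t], can be written as a plain function that returns expected sales per week. This is a sketch under assumptions: it uses geometric adstock and Hill saturation (the base specification in the sensitivity table), and the function and parameter names (`expected_sales`, `gamma`, `half_sat`, and so on) are hypothetical.

```python
import math

def adstock(spend, decay):
    """Geometric carryover of past spend."""
    out, carried = [], 0.0
    for x in spend:
        carried = x + decay * carried
        out.append(carried)
    return out

def hill(x, half_sat, shape):
    """Hill saturation of adstocked spend."""
    if x <= 0:
        return 0.0
    return x ** shape / (x ** shape + half_sat ** shape)

def expected_sales(spend, alpha, delta, gamma, beta, decay, half_sat, shape,
                   period=52.0):
    """alpha + tau(t) + s(t) + sum_c beta[c] * saturation(adstock(spend[c, t])).

    spend: dict channel -> list of weekly spend
    gamma: list of (sin_coef, cos_coef) pairs, one per Fourier order
    beta, decay, half_sat, shape: dicts keyed by channel
    """
    n_weeks = len(next(iter(spend.values())))
    stocked = {c: adstock(s, decay[c]) for c, s in spend.items()}
    mu = []
    for t in range(n_weeks):
        trend = delta * t                                   # linear tau(t)
        season = 0.0
        for k, (g_sin, g_cos) in enumerate(gamma, start=1):  # Fourier s(t)
            angle = 2.0 * math.pi * k * t / period
            season += g_sin * math.sin(angle) + g_cos * math.cos(angle)
        media = sum(beta[c] * hill(stocked[c][t], half_sat[c], shape[c])
                    for c in stocked)
        mu.append(alpha + trend + season + media)
    return mu
```

In a Bayesian fit, every argument except `spend` would be a parameter with a prior, and observed sales would be modeled as this mean plus noise.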
Do our assumptions generate plausible predictions?
Before showing the model any real data, we test whether our assumptions could plausibly generate realistic-looking outcomes. This catches problems early—before we waste time fitting a broken model.
Before opening night, actors do a full rehearsal to catch problems. We run our model "forward" without real data: given our assumptions about how marketing works, what kind of sales numbers would we expect to see?
If the prior predictive shows impossible outcomes (negative sales, values in the trillions), it means our assumptions are too vague. Tighten the priors—add more knowledge. This is good—we caught a problem before wasting computation on fitting.
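A prior predictive check can be sketched without any fitting library: draw parameters from the priors, push them through a simplified model, and count how often the simulated outcome is implausible. Everything here is an assumption for illustration: sales are expressed in units of a typical week, the model is reduced to one channel, and the plausibility bounds (0 to 5 times a typical week) are arbitrary. HalfNormal(0.5) matches the base prior in the sensitivity table; HalfNormal(50) is a deliberately vague alternative.

```python
import random

def prior_predictive(beta_sd, n_draws=2_000, seed=0):
    """Simulate scaled weekly sales from the priors alone (no data involved)."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_draws):
        alpha = rng.gauss(1.0, 0.25)            # baseline, in typical-week units
        beta = abs(rng.gauss(0.0, beta_sd))     # HalfNormal(beta_sd) media effect
        saturated = rng.uniform(0.0, 1.0)       # saturated adstocked spend in [0, 1]
        noise = rng.gauss(0.0, 0.1)
        sims.append(alpha + beta * saturated + noise)
    return sims

def share_implausible(sims, low=0.0, high=5.0):
    """Fraction of simulated weeks that are negative or absurdly large."""
    return sum(1 for s in sims if s < low or s > high) / len(sims)

tight = share_implausible(prior_predictive(beta_sd=0.5))
vague = share_implausible(prior_predictive(beta_sd=50.0))
print(f"HalfNormal(0.5): {tight:.1%} implausible, HalfNormal(50): {vague:.1%} implausible")
```

With the tighter prior almost no simulated weeks fall outside the plausible band, while the vague prior puts most of its mass on outcomes nobody would believe.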
Let the data update our beliefs
Now we show our model the actual sales data and let it figure out what the media effects probably are. This is the core of Bayesian inference: starting with prior beliefs, updating them with evidence.
Imagine you think a coin is fair. Then you flip it 100 times and get 75 heads. You'd update your belief—maybe it's biased. Our model does the same: it starts with prior beliefs about media effects and updates them based on the sales data.
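The coin example has a closed-form update when the prior is a Beta distribution. As a sketch, suppose "you think the coin is fair" is encoded as Beta(50, 50); that specific prior is an assumption, chosen so the prior carries as much weight as 100 earlier flips.

```python
def beta_binomial_update(prior_a, prior_b, heads, tails):
    """Conjugate update: Beta(a, b) prior + coin flips -> Beta posterior."""
    return prior_a + heads, prior_b + tails

prior_a, prior_b = 50, 50                  # hypothetical prior centered on 0.5
post_a, post_b = beta_binomial_update(prior_a, prior_b, heads=75, tails=25)

prior_mean = prior_a / (prior_a + prior_b)
posterior_mean = post_a / (post_a + post_b)
print(prior_mean, posterior_mean)          # 0.5 0.625
```

The posterior mean sits between the prior belief (0.5) and the observed frequency (0.75). A weaker prior would move further toward the data, which is the same trade-off the media-effect priors make.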
Did the algorithm actually work?
Before trusting results, we verify that the MCMC sampling algorithm worked correctly. These are computational checks—they tell us if the algorithm converged, not whether the model is correct.
Before trusting a calculation, you might verify the calculator is working. These diagnostics check if our statistical "calculator" (MCMC) ran correctly.
Trace plots show the sampling history. Good traces look like "fuzzy caterpillars": random but stationary. Bad traces show trends, stuck periods, or disagreement between chains.
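Disagreement between chains can also be summarized numerically. The sketch below implements the classic Gelman-Rubin R-hat, which compares between-chain and within-chain variance; values near 1 suggest the chains agree. This is the basic formula, stated here as an illustration: current diagnostic libraries typically report a more robust rank-normalized, split-chain variant, and the simulated chains are toy data.

```python
import random
import statistics

def r_hat(chains):
    """Classic Gelman-Rubin statistic for a list of equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])   # W
    between = n * statistics.variance(chain_means)                        # B
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5

rng = random.Random(0)
# Four chains exploring the same distribution ("fuzzy caterpillars").
good = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
# Two chains stuck around a different value: chains disagree.
bad = [[rng.gauss(mu, 1.0) for _ in range(500)] for mu in (0.0, 0.0, 3.0, 3.0)]

print(f"good chains: {r_hat(good):.3f}, bad chains: {r_hat(bad):.3f}")
```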
The algorithm converged properly. We can trust these samples represent the posterior distribution.
Does the model reproduce key data patterns?
The algorithm worked (Step 6), but is the model any good? We check by asking: can this model reproduce data that looks like what we actually observed? If not, the model is missing something important.
If you truly understand how something works, you should be able to recreate it. If our model truly captures how sales are generated, it should produce realistic fake data that matches the patterns in real data.
These curves show how sales respond to spend for each channel, incorporating all the effects we modeled (saturation, carryover).
The model reproduces the key features of the data. Some individual weeks are off, which is expected (random events we can't predict). The overall patterns are captured.
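One simple way to formalize "does the fake data look like the real data" is to pick a summary statistic and ask how often replicated datasets reach the observed value. The sketch below uses toy numbers, not results from this walkthrough; `ppc_p_value` is a hypothetical helper name, and the statistic (here the weekly maximum) is a choice the analyst makes.

```python
def ppc_p_value(observed, replicated, stat):
    """Share of replicated datasets whose statistic is >= the observed one."""
    obs = stat(observed)
    hits = sum(1 for rep in replicated if stat(rep) >= obs)
    return hits / len(replicated)

observed = [48.0, 52.0, 50.0, 75.0]      # toy weekly sales with a holiday spike
replicated = [                           # toy draws from a fitted model
    [49.0, 51.0, 50.0, 52.0],
    [47.0, 53.0, 51.0, 50.0],
    [50.0, 50.0, 49.0, 78.0],
    [48.0, 52.0, 50.0, 51.0],
]

print(ppc_p_value(observed, replicated, max))  # 0.25
```

Values very close to 0 or 1 indicate the model rarely reproduces that feature of the data, for example a model without seasonality almost never generating a spike as large as the observed one.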
Are conclusions robust to different assumptions?
We made many choices building this model: saturation shape, adstock decay, priors. What if we'd made different (reasonable) choices? Sensitivity analysis checks whether our conclusions would change.
Engineers don't just test a bridge under normal conditions—they test under extreme loads, wind, earthquakes. We stress-test our conclusions under different model specifications.
| Spec | Saturation | Adstock | Priors |
|---|---|---|---|
| Base | Hill (k=2) | Geometric (λ=0.7) | HalfNormal(0.5) |
| Alt 1 | Hill (k=1.5) | Geometric (λ=0.7) | HalfNormal(0.5) |
| Alt 2 | Hill (k=2) | Geometric (λ=0.5) | HalfNormal(0.5) |
| Alt 3 | Hill (k=2) | Geometric (λ=0.7) | HalfNormal(1.0) |
| Alt 4 | Michaelis | Geometric (λ=0.7) | HalfNormal(0.5) |
| Alt 5 | Hill (k=2) | Delayed (θ=2) | HalfNormal(0.5) |
| Channel | Mean ROI | Range | Robust? |
|---|---|---|---|
| TV | 2.1 | [1.8 – 2.5] | ✓ Always profitable |
| Paid Search | 2.4 | [2.1 – 2.8] | ✓ Always profitable |
| Paid Social | 1.3 | [0.8 – 1.9] | ⚠ Sometimes below break-even |
| Display | 0.7 | [0.3 – 1.1] | ✗ Usually unprofitable |
| Radio | 1.6 | [1.3 – 2.0] | ✓ Always profitable |
If conclusions change dramatically with small assumption changes, we can't trust them. TV, Search, and Radio are robust—conclusions hold regardless of specification. Social and Display are sensitive—be cautious making decisions based on these estimates.
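The robustness labels in the table follow a simple rule that can be written down explicitly. The ROI figures below are copied from the table; the break-even value of 1.0 and the exact decision rule in `classify` are assumptions inferred from the labels.

```python
BREAK_EVEN = 1.0  # assumption: ROI of 1.0 means a dollar returns a dollar

# channel -> (mean ROI, lowest ROI, highest ROI) across the six specifications
ROI_RANGES = {
    "TV": (2.1, 1.8, 2.5),
    "Paid Search": (2.4, 2.1, 2.8),
    "Paid Social": (1.3, 0.8, 1.9),
    "Display": (0.7, 0.3, 1.1),
    "Radio": (1.6, 1.3, 2.0),
}

def classify(mean, low, high, break_even=BREAK_EVEN):
    """Label a channel by where its ROI range sits relative to break-even."""
    if low > break_even:
        return "always profitable"
    if high < break_even:
        return "always unprofitable"
    if mean >= break_even:
        return "sometimes below break-even"
    return "usually unprofitable"

for channel, (mean, low, high) in ROI_RANGES.items():
    print(f"{channel}: {classify(mean, low, high)}")
```

Writing the rule down before running the alternative specifications keeps the "robust" label from being adjusted after the fact.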
What did we learn, and how confident are we?
The final step: translate technical results into actionable insights. Good reporting separates confident findings from uncertain ones, and acknowledges limitations honestly.
TV, Paid Search, and Radio are robustly profitable across all reasonable model specifications. Safe bets for continued or increased investment.
Paid Social shows positive effects but with wide uncertainty and sensitivity to model assumptions. Consider a controlled experiment (geo-test) before major budget changes.
Display is probably not paying for itself. Its ROI ranges from 0.3 to 1.1 across specifications, mostly below break-even, so reallocating its budget may be warranted.
"TV ROI is probably 1.8-2.5" is more useful than "TV ROI is 2.1" because it's honest about what we actually know. You can confidently invest in TV knowing it's profitable across the entire range—without false precision that could mislead planning.
Similarly, acknowledging uncertainty about Social prevents over-investment in a channel that might not be working—while also preventing premature abandonment of something that might be.
Run geo-experiment to validate MMM estimates.
Reallocate to TV/Search over 3 months.
Refresh model quarterly with new data.