Building a Marketing Mix Model Platform: Architecture and Lessons Learned
Exploring the architectural considerations for building an MMM platform—async GPU training, budget optimization, and translating Bayesian posteriors into actionable marketing recommendations.
Marketing Mix Modeling (MMM) has been around since the 1960s, but the tooling has historically been challenging. You either shell out for expensive consultants who deliver a PDF three months later, or you cobble together something in R that only one person on your team understands.
Open-source Bayesian MMM frameworks have changed that calculus. They provide proper statistical foundations that actually work. But getting MMM into production—with real data pipelines, training infrastructure, and a UI that non-technical stakeholders can use—is where things get interesting.
This post shares architectural patterns and lessons from building such a platform.
The Problem
MMM sounds simple on paper: figure out which marketing channels are actually driving results so you can allocate budget intelligently. In practice, you're dealing with:
- Data scattered across a dozen platforms (ad networks, CRM, analytics tools)
- Carryover effects where today's ad spend influences next month's conversions
- Response curves that flatten at high spend levels
- Stakeholders who want answers now, not after a week of model training
The goal is to build something that handles all of this while remaining accessible to people who don't want to write Python.
Architecture Overview
The stack is straightforward:
┌─────────────────────────────────────────────────┐
│            Frontend (React/Next.js)             │
│      (TypeScript, visualization libraries)      │
└─────────────────────┬───────────────────────────┘
                      │ REST API
┌─────────────────────▼───────────────────────────┐
│             Backend (FastAPI/Flask)             │
│          (Python, async job handling)           │
└─────────────────────┬───────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────┐
│             Bayesian MMM Framework              │
│        (MCMC sampling, GPU acceleration)        │
└─────────────────────┬───────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────┐
│              Cloud Infrastructure               │
│    (Data warehouse, object storage, compute)    │
└─────────────────────────────────────────────────┘
Nothing exotic here—the complexity lives in how these pieces talk to each other and handle the async nature of model training.
Data Pipelines: The Foundation
MMM is garbage-in-garbage-out. Most of the code isn't the model itself—it's wrangling data into a format the model can consume.
Multi-Source Ingestion
Data can come from direct CSV uploads or data warehouse queries. The warehouse path is more interesting for production use:
- Analysts can write their own data preparation queries
- Data lineage is maintained
- Scheduled refreshes keep the model current
Key validation steps at ingestion (a code sketch follows this list):
- Check for required columns
- Validate data types and ranges
- Filter out channels with insufficient data (sparse channels produce unstable, poorly identified estimates)
- Verify date coverage and completeness
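A rough sketch of what that validation pass can look like, assuming a pandas-based pipeline with weekly data; the column names and thresholds here are illustrative, not a fixed schema:

```python
import pandas as pd

# Illustrative schema and thresholds; adjust to your own data contract
REQUIRED_COLUMNS = {"date", "channel", "spend", "conversions"}
MIN_ACTIVE_WEEKS = 10  # channels with fewer weeks of nonzero spend get dropped

def validate_ingested_data(df: pd.DataFrame) -> pd.DataFrame:
    # 1. Required columns
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"Missing required columns: {sorted(missing)}")

    # 2. Types and ranges
    df = df.copy()
    df["date"] = pd.to_datetime(df["date"])
    if (df["spend"] < 0).any():
        raise ValueError("Negative spend values found")

    # 3. Drop channels with too little signal for stable estimates
    active_weeks = df[df["spend"] > 0].groupby("channel")["date"].nunique()
    keep = active_weeks[active_weeks >= MIN_ACTIVE_WEEKS].index
    df = df[df["channel"].isin(keep)]

    # 4. Date coverage (assumes weekly granularity)
    n_expected = (df["date"].max() - df["date"].min()).days // 7 + 1
    n_observed = df["date"].nunique()
    if n_observed < n_expected:
        raise ValueError(f"Expected ~{n_expected} weekly periods, found {n_observed}")

    return df
```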
Data Lineage
Every transformation should be persisted with clear organization:
storage/
├── raw_data/         # Untouched source data
├── transformed_data/ # Post-validation
├── model_input/      # Ready for training
├── models/           # Trained model artifacts
├── scenarios/        # Versioned results
└── optimizations/    # Budget allocation outputs
This isn't just for debugging—it's an audit trail. When someone asks "why did the model say Facebook has a 3x ROI?" you need to trace back to the exact input data that produced that result.
Model Training Pipeline
Bayesian MMM frameworks use MCMC sampling for inference. This is computationally expensive—training can take 30-60 minutes for a typical model with multiple channels and years of weekly data.
Async Job Handling
Training can't block the API. The solution is a job queue:
- Submit training jobs to a background worker
- Return job ID immediately
- Frontend polls for status updates
- Stream logs via SSE or WebSocket for real-time feedback
Single-worker execution is often intentional—GPU memory is limited, and parallel training jobs can cause out-of-memory errors. Queue jobs and process them sequentially.
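A minimal sketch of this pattern with FastAPI and a single in-process worker; the endpoint paths, the in-memory job registry, and the run_training stub are illustrative stand-ins rather than a specific framework's API:

```python
import asyncio
import uuid
from fastapi import FastAPI

app = FastAPI()
jobs: dict[str, dict] = {}           # job_id -> {"status": ..., "result"/"error": ...}
queue: asyncio.Queue | None = None   # single consumer => strictly sequential training

def run_training(config: dict) -> dict:
    """Placeholder for the actual MCMC fit (e.g. a Meridian or PyMC-Marketing call)."""
    return {"summary": "trained", "config": config}

async def worker():
    while True:
        job_id, config = await queue.get()
        jobs[job_id]["status"] = "running"
        try:
            # Off-load the blocking fit so the event loop keeps serving requests
            result = await asyncio.to_thread(run_training, config)
            jobs[job_id].update(status="done", result=result)
        except Exception as exc:
            jobs[job_id].update(status="failed", error=str(exc))
        finally:
            queue.task_done()

@app.on_event("startup")  # newer FastAPI versions prefer a lifespan handler
async def start_worker():
    global queue
    queue = asyncio.Queue()
    asyncio.create_task(worker())

@app.post("/train")
async def submit_training(config: dict):
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "queued"}
    await queue.put((job_id, config))
    return {"job_id": job_id}        # return immediately; the frontend polls below

@app.get("/train/{job_id}")
async def training_status(job_id: str):
    return jobs.get(job_id, {"status": "unknown"})
```

The in-memory registry is the weak point here: it only survives as long as the process does, which is exactly the persistence caveat in the lessons at the end.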
Model Configuration
Bayesian MMM's power comes from priors. You can encode domain knowledge:
ROI priors: Expected return on investment for each channel. A well-chosen prior (say, log-normal with a median around 1.2x) prevents obviously wrong estimates while still letting the data speak.
Carryover effects: How long marketing effects persist. Brand campaigns might have effects lasting 8+ weeks. Performance ads might be 2 weeks. The model learns actual decay rates, but priors keep estimates reasonable.
Response curve shape: Diminishing returns parameters. Most channels saturate at high spend—priors encode this expectation.
Different channels need different configurations. The UI should make this accessible to marketing analysts, not just statisticians.
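One way to represent this per-channel configuration is a small dataclass that the UI edits and the training code later maps onto the framework's actual prior objects; the field names and numbers below are illustrative:

```python
from dataclasses import dataclass

@dataclass
class ChannelPriors:
    roi_median: float = 1.2           # median of a log-normal prior on ROI
    roi_sigma: float = 0.5            # spread of that prior (log scale)
    carryover_weeks: float = 2.0      # expected effect half-life, in weeks
    saturation_multiple: float = 1.5  # spend (as a multiple of mean weekly spend)
                                      # beyond which diminishing returns dominate

# Illustrative defaults per channel type; analysts adjust these in the UI
CHANNEL_CONFIG = {
    "brand_tv":    ChannelPriors(roi_median=0.9, roi_sigma=0.7, carryover_weeks=8.0),
    "paid_search": ChannelPriors(roi_median=1.5, roi_sigma=0.4, carryover_weeks=1.0),
    "paid_social": ChannelPriors(),   # falls back to the defaults above
}
```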
GPU Memory Management
When running MCMC on GPU:
- Configure memory growth to avoid pre-allocation
- Clean up sessions between training runs
- Run garbage collection explicitly
- Consider memory limits when sizing instances
Without explicit cleanup, memory fragments over multiple training runs, eventually causing failures.
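Assuming a TensorFlow-backed framework (Google's Meridian builds on TensorFlow Probability, for example), that housekeeping looks roughly like this:

```python
import gc
import tensorflow as tf

def configure_gpu_memory() -> None:
    # Allocate GPU memory on demand instead of reserving it all up front
    for gpu in tf.config.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)

def cleanup_after_training() -> None:
    # Drop references to the finished graph/session and force a collection pass
    tf.keras.backend.clear_session()
    gc.collect()
```

configure_gpu_memory has to run before the first model touches the GPU; cleanup_after_training runs between queued jobs.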
Budget Optimization
A trained model is interesting. A trained model that tells you where to move money is useful.
Optimization Modes
Three scenarios come up repeatedly:
Fixed budget: "We have $1M. How should we split it across channels?"
Target ROI: "What's the minimum spend to hit 2x ROI?"
Unconstrained: "Ignoring budget constraints, where are the opportunities?"
All modes benefit from constraints preventing unrealistic recommendations. Nobody's going to 10x their LinkedIn spend overnight, even if the model says they should. Bounds like "50% to 200% of current spend" keep recommendations actionable.
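A sketch of the fixed-budget mode using scipy, with a toy saturating response function standing in for the model's posterior response curves (every number below is made up for illustration):

```python
import numpy as np
from scipy.optimize import minimize

def expected_revenue(spend: np.ndarray) -> float:
    """Toy stand-in: in production this evaluates the trained model's response curves."""
    roi = np.array([1.4, 1.1, 0.9])                # marginal return at low spend
    half_sat = np.array([40_000, 60_000, 80_000])  # spend at half saturation
    return float(np.sum(roi * spend * half_sat / (spend + half_sat)))

current = np.array([50_000.0, 70_000.0, 30_000.0])
total_budget = current.sum()

bounds = [(0.5 * s, 2.0 * s) for s in current]   # 50% to 200% of current spend
constraints = [{"type": "eq", "fun": lambda x: x.sum() - total_budget}]  # fixed budget

result = minimize(lambda x: -expected_revenue(x), x0=current,
                  bounds=bounds, constraints=constraints, method="SLSQP")
print(result.x)   # recommended per-channel allocation
```

The target-ROI and unconstrained modes reuse the same machinery with a different objective or constraint set.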
Response Curves
The response curves are the most important output. They show:
- Current position on the curve
- Where diminishing returns kick in
- Saturation points by channel
When a channel's curve is nearly flat at current spend levels, that's a clear signal to reallocate budget elsewhere.
Simulation and Forecasting
Models are only useful if people can explore them. Interactive simulation is essential:
Spend Scenario Testing
Users should be able to:
- Adjust channel spend with sliders or inputs
- See predicted outcomes update in real-time
- Compare scenarios side-by-side
- Save scenarios for later reference
The credible intervals (from the Bayesian posterior) communicate uncertainty—a model isn't very useful if people treat point estimates as gospel.
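One way to keep that uncertainty attached to every scenario is to push posterior draws, not point estimates, through the prediction. A sketch with illustrative array shapes; in practice the draws come from evaluating the model's response curves at the proposed spend:

```python
import numpy as np

def predict_scenario(posterior_draws: np.ndarray,
                     spend: np.ndarray,
                     ci: float = 0.90) -> dict:
    """
    posterior_draws: (n_draws, n_channels) incremental outcome per dollar at this
                     scenario's spend level (illustrative; real draws are nonlinear).
    spend:           (n_channels,) proposed spend per channel.
    """
    draws = posterior_draws @ spend                      # one total outcome per draw
    lo, hi = np.quantile(draws, [(1 - ci) / 2, 1 - (1 - ci) / 2])
    return {"mean": float(draws.mean()),
            "ci_low": float(lo), "ci_high": float(hi)}
```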
What-If Analysis
Beyond simple scenarios, enable questions like:
- "What if we cut TV by 20% and move it to digital?"
- "What's the optimal allocation for Q4 given seasonal patterns?"
- "How would results differ with a 10% larger budget?"
Versioning and Reproducibility
A scenario represents a trained model instance with all its artifacts:
- Model configuration used
- Training data snapshot
- Trained model weights
- Generated visualizations
- Optimization results
Each training run creates a new version. The UI shows version history and lets users compare outputs across retrains—useful for debugging when results shift unexpectedly.
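A lightweight way to tie those artifacts together is a manifest written alongside every version; the fields and paths below are illustrative and mirror the storage layout shown earlier:

```python
# Written as JSON next to each scenario version; fields and paths are illustrative
manifest = {
    "scenario_id": "q3-baseline",
    "version": 4,
    "created_at": "2025-07-01T12:00:00Z",
    "model_config": "model_input/q3-baseline/v4/config.json",
    "training_data_snapshot": "model_input/q3-baseline/v4/data.parquet",
    "model_artifact": "models/q3-baseline/v4/model.pkl",
    "visualizations": "scenarios/q3-baseline/v4/charts/",
    "optimization_results": "optimizations/q3-baseline/v4/allocation.json",
}
```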
Visualization Considerations
Bayesian models produce rich outputs that need appropriate visualization:
Posterior distributions: Show uncertainty, not just point estimates. Box plots, violin plots, or credible intervals work well.
Response curves: Interactive charts letting users explore the relationship between spend and outcome.
Attribution over time: How much did each channel contribute week by week?
Comparison views: How does the current allocation compare to optimal?
Consider extracting chart data from model outputs for custom rendering rather than using static images. Interactive charts with tooltips, zooming, and linked brushing improve user understanding.
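For example, a channel's response curve can be shipped to the frontend as plain JSON points with its credible band rather than as a rendered image; a sketch (array shapes are assumptions):

```python
import json
import numpy as np

def response_curve_payload(spend_grid: np.ndarray,
                           posterior_curves: np.ndarray) -> str:
    """
    spend_grid:       (n_points,) spend levels evaluated for one channel.
    posterior_curves: (n_draws, n_points) predicted outcome per posterior draw.
    Returns JSON the frontend can bind directly to an interactive chart.
    """
    mean = posterior_curves.mean(axis=0)
    lo, hi = np.quantile(posterior_curves, [0.05, 0.95], axis=0)
    return json.dumps([
        {"spend": float(s), "mean": float(m), "lo": float(l), "hi": float(h)}
        for s, m, l, h in zip(spend_grid, mean, lo, hi)
    ])
```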
Lessons Learned
A few things became obvious in hindsight:
Job queue needs persistence. In-memory queues work fine for single-instance deployment, but horizontal scaling requires shared state. Plan for Redis or a cloud task queue from the start.
Cache aggressively. Response curve calculations are deterministic given a trained model. Caching these cuts frontend latency significantly.
Prior elicitation is hard. Setting Bayesian priors is powerful but requires statistical intuition. A wizard that translates "I think Facebook usually returns $1.50 per dollar spent" into proper prior parameters makes this accessible to more users.
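A sketch of the kind of translation such a wizard performs, assuming a log-normal ROI prior and treating the user's "rarely above" answer as roughly a 95th percentile (both assumptions, not a standard recipe):

```python
import math

def elicit_lognormal_roi_prior(typical_roi: float, rarely_above: float) -> dict:
    """Map 'usually returns X per dollar, rarely more than Y' to log-normal params."""
    mu = math.log(typical_roi)                     # exp(mu) is the prior median
    sigma = (math.log(rarely_above) - mu) / 1.645  # 95th pct of N(0, 1) is about 1.645
    return {"mu": mu, "sigma": sigma}

# "Facebook usually returns $1.50 per dollar, rarely more than $3.00"
print(elicit_lognormal_roi_prior(1.5, 3.0))   # {'mu': 0.405..., 'sigma': 0.421...}
```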
Manage training-time expectations. Users expecting instant results need clear communication about why MCMC takes time. Progress indicators with estimated completion times help manage expectations.
Key Takeaways
The core insight: the model is the easy part. Bayesian MMM frameworks handle the hard statistics. The work is in everything around it:
- Data pipelines that don't break
- Async job handling that doesn't leak memory
- UIs that translate posterior distributions into actionable recommendations
- Versioning that enables comparison and debugging
If you're considering building something similar:
- Start with the data pipeline. Get that bulletproof before touching the model.
- Design for async training from day one.
- Budget twice as much time for the frontend as you think you'll need—stakeholders have opinions about charts.
- Make priors accessible through good UX rather than requiring statistical expertise.
This post discusses architectural patterns for MMM platforms. For MMM methodology, see resources on Bayesian Marketing Mix Modeling. Open-source frameworks like Google's Meridian, Meta's Robyn, and PyMC-Marketing provide implementations of these concepts.
Note: The patterns discussed here are intentionally generalized, drawn from industry experience but presented as transferable concepts rather than specific proprietary implementations.