What is Incrementality in Marketing?
TL;DR: Incrementality measures what your marketing actually caused — not what it coincided with. It answers the question attribution never asks: would that customer have converted anyway? By comparing outcomes between a group exposed to your marketing and a control group that wasn't, incrementality testing isolates the true causal impact of a campaign, channel, or budget decision.
What You'll Learn
- The fundamental difference between correlation and causation in marketing measurement
- What a holdout group is and why it's the foundation of honest measurement
- How geo-based testing works — and when to use it instead of user-level holdouts
- What statistical significance means in plain language, and why it matters for your budget decisions
- Why incrementality tests consistently reveal that branded search is less effective than it appears — and brand investment more effective than attribution suggests
- How to run your first incrementality test without a dedicated measurement team
Quick Overview: Why Incrementality Is the Most Important Concept in Modern Marketing Measurement
Every marketing team is making allocation decisions — where to put next quarter's budget, which channels to scale, which to cut. Most of those decisions are based on attribution data. And attribution, as we've established, answers the wrong question.
Incrementality answers the right one: If I hadn't spent this money on this channel, what would have happened differently?
Here's what this article covers:
- Why correlation isn't enough for marketing decisions
- The core logic of incrementality measurement
- Holdout groups — what they are and how they work
- Geo-based testing — the alternative when user-level holdouts aren't possible
- Statistical significance in plain language
- The difference between incrementality and attribution
- What incrementality tests consistently reveal in practice
1. Why Correlation Isn't Enough
Marketing measurement has a fundamental causality problem — and most practitioners don't think about it carefully enough.
When you run a retargeting campaign and your conversion rate goes up, something caused that improvement. But what? It could be:
- Your ads performing well
- Your retargeting audience already being far along in the buying journey, ready to convert regardless
- A new product launch creating organic demand
- A salesperson following up more aggressively in the same period
- A competitor having a bad quarter
The correlation between your ad spend and the outcome tells you almost nothing about which of these actually drove the result.
This is called the selection problem in econometrics. The people who see your retargeting ads are not randomly selected from the general population — they visited your site, engaged with your content, or matched a behavioral profile. They were already more likely to convert than the average person. When they convert at a higher rate, you cannot credit the ads without first asking: would they have converted anyway?
Last-touch attribution never asks this question. Neither does any model that assigns credit based on touchpoint presence alone.
2. The Core Logic of Incrementality Measurement
Incrementality testing borrows directly from the scientific method. The question is causal: did my marketing cause this outcome?
To answer a causal question, you need a counterfactual — a picture of what would have happened without your marketing. The cleanest way to establish a counterfactual is a randomized controlled experiment:
- Split your potential audience into two groups
- Expose one group to your marketing (the test group)
- Hold the other group back — no exposure to the campaign (the control group)
- After a defined period, measure the difference in outcomes between the two groups
- That difference — controlling for everything else — is the incremental effect of your campaign
Key term: Incremental lift = the percentage increase in conversions (or pipeline, or branded search) in the test group compared to the control group. This is the number that represents what your marketing actually caused.
The elegance of this approach is that randomization controls for everything you didn't change. If test and control groups were assigned randomly, they should be similar in every other way — meaning any difference in outcomes is attributable to the marketing exposure, not to some other variable.
3. Holdout Groups: The Foundation of User-Level Testing
A holdout group is a randomly assigned subset of your target audience that is deliberately excluded from seeing a campaign. Typically 10–20% of the target audience is held out.
How it works in practice:
- Before launching a campaign, randomly assign 15% of your retargeting audience to a holdout bucket
- Run your campaign to the remaining 85% as normal
- After 4–6 weeks, compare conversion rates between the two groups
- The lift in the exposed group, relative to the holdout, is your incremental conversion rate
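The arithmetic behind those steps is simple enough to sketch in a few lines. All figures below are hypothetical, purely for illustration:

```python
# Hypothetical holdout test: 85% of the audience exposed, 15% held out.
exposed_users = 85_000
exposed_conversions = 2_150
holdout_users = 15_000
holdout_conversions = 330

exposed_rate = exposed_conversions / exposed_users   # conversion rate, exposed
holdout_rate = holdout_conversions / holdout_users   # conversion rate, holdout

# Incremental lift: relative increase over the holdout baseline
lift = (exposed_rate - holdout_rate) / holdout_rate

# Incremental conversions: conversions the campaign actually caused
incremental = (exposed_rate - holdout_rate) * exposed_users

print(f"Exposed rate:  {exposed_rate:.2%}")
print(f"Holdout rate:  {holdout_rate:.2%}")
print(f"Incremental lift: {lift:.1%}")
print(f"Incremental conversions: {incremental:.0f}")
```

Note that of the 2,150 conversions attribution would credit to this campaign, only about 280 are incremental in this example — the rest would have happened anyway.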
What holdout testing can measure:
- Whether a retargeting campaign is creating new conversions — or just intercepting people who were going to convert anyway
- Whether a paid social campaign is generating incremental pipeline — or claiming credit for organic demand
- The true cost per incremental conversion (which is almost always higher than attributed CPA)
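That last point is easy to see with numbers. A hypothetical example of the gap between attributed and incremental cost per conversion:

```python
# Hypothetical numbers: a campaign whose attributed performance looks great,
# but whose incremental performance is much weaker.
spend = 50_000.0
attributed_conversions = 500     # what last-touch attribution credits
incremental_conversions = 140    # what a holdout test actually shows

attributed_cpa = spend / attributed_conversions
incremental_cpa = spend / incremental_conversions

print(f"Attributed CPA:  ${attributed_cpa:,.0f}")   # $100
print(f"Incremental CPA: ${incremental_cpa:,.0f}")  # ~$357
```

Same spend, same campaign — but the cost of a conversion the campaign actually caused is more than 3x the number your dashboard shows.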
Important limitations:
- Holdout testing requires a large enough audience to achieve statistical significance — generally several thousand users per group, and more when conversion events are rare
- User-level holdouts are increasingly affected by cross-device journeys and privacy changes
- They measure one campaign at a time, not how channels interact
Most major ad platforms (Meta, Google) offer built-in holdout functionality within campaign settings. This is the easiest place to start.
4. Geo-Based Testing: When You Can't Do User-Level Holdouts
Geo-based testing is the alternative when user-level holdouts aren't feasible — particularly for brand campaigns, out-of-home advertising, TV, or any marketing that can't easily be withheld at the individual level.
How it works:
- Identify 6–10 comparable geographic markets (cities, DMAs, or regions)
- Randomly assign half to the test condition and half to the control
- Run your campaign in test markets only
- Measure outcomes in both groups during and after the campaign flight
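The assignment step is worth doing programmatically, with a recorded seed, so the split is genuinely random and auditable after the fact. A minimal sketch — the market names are placeholders, not a recommendation:

```python
# Randomly split comparable markets into test and control groups.
# Market names are hypothetical placeholders; use your own DMAs/regions.
import random

markets = ["Austin", "Denver", "Raleigh", "Columbus",
           "Portland", "Nashville", "Charlotte", "Salt Lake City"]

rng = random.Random(42)   # fixed seed: the split is reproducible and auditable
shuffled = markets[:]
rng.shuffle(shuffled)

half = len(markets) // 2
test_markets = sorted(shuffled[:half])
control_markets = sorted(shuffled[half:])

print("Test:   ", test_markets)
print("Control:", control_markets)
```

Recording the seed matters: if anyone later asks whether the "good" markets were hand-picked into the test group, you can show the assignment was mechanical.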
Why geo testing is often more reliable than user-level holdouts:
- People in the control markets genuinely can't see the campaign — no spillover from social feeds or search results
- It works for any marketing channel, including those with no user-level tracking
- It captures community-level effects (word of mouth, local PR) that user-level models miss
What to measure in geo tests:
- Branded search volume (Google Search Console, by geography)
- Direct traffic (GA4, filtered by geography)
- Demo or trial requests by source geography
- Pipeline creation rate during and 4–8 weeks after the campaign
Example: You're running a LinkedIn brand awareness campaign targeting CMOs at SaaS companies. You select 8 comparable markets. Run the campaign in 4 (test) and hold back 4 (control). After 6 weeks, branded search volume is 18% higher in test markets and the demo request rate is 11% higher. Incremental lift = 11% on demo requests. Before this test, your attribution model gave LinkedIn zero credit for these demos — because LinkedIn rarely appears as the last touch before a demo booking.
5. Statistical Significance in Plain Language
Every incrementality test requires a statistical significance threshold. This is where many marketing teams get lost — but it's critical to understand.
What statistical significance answers: How confident are we that the result we observed isn't just random chance?
When you run an experiment, the difference between test and control groups will never be exactly zero — even if your campaign had zero effect. There will always be some random variation. Statistical significance tells you how likely it is that the gap you observed would occur by random chance alone if there were truly no effect.
The convention: 95% confidence level — meaning you accept a 5% probability of concluding the campaign worked when it actually didn't (a false positive). For large, irreversible budget decisions, 99% confidence is appropriate.
What this means practically:
| Situation | Recommended Confidence Level |
|---|---|
| Quick directional test (small decision) | 90% |
| Standard campaign evaluation | 95% |
| Major budget reallocation decision | 99% |
The most common mistake: Peeking at results mid-test and stopping when the number looks favorable. This inflates your false positive rate dramatically. Decide your test duration before you start, and don't call the test early.
A null result is still valuable. If your test doesn't reach statistical significance, it doesn't mean the campaign did nothing — it means you couldn't prove it worked with the required confidence. This is useful information: either the effect is smaller than expected, or you need a larger sample size.
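If you want to sanity-check a result yourself, a two-proportion z-test is one standard way to do it. The sketch below uses only the Python standard library and hypothetical numbers; most ad platforms and free online calculators perform the same computation for you:

```python
# Two-proportion z-test: is the test-vs-control gap larger than random
# chance alone would explain? All numbers below are hypothetical.
from math import sqrt, erf

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)          # rate under "no effect"
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value via the normal CDF (erf-based approximation)
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Test group: 260 conversions from 10,000 users; control: 200 from 10,000
z, p = two_proportion_z(conv_a=260, n_a=10_000, conv_b=200, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
# A p-value below 0.05 clears the conventional 95% confidence bar
```

In this made-up example the gap is significant at 95% confidence — the observed difference would be very unlikely if the campaign truly had no effect.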
6. Incrementality vs. Attribution: The Core Difference
| | Attribution | Incrementality |
|---|---|---|
| The question it answers | Which channels were present when this conversion happened? | How many conversions happened because of this channel? |
| Direction | Backward-looking | Forward-looking |
| Logic | Correlation | Causation |
| What it's good for | Understanding journey patterns | Making budget allocation decisions |
| What it misses | Whether touchpoints caused outcomes | Channel interaction effects (requires MMM for full picture) |
A channel can score extremely well in attribution and have near-zero incrementality.
The canonical example is branded search. Brand search captures demand — it intercepts people who are already planning to buy and are searching your company name to find the website. If you paused your branded search campaigns for a month (in a controlled test), most of that traffic would arrive via organic results or direct navigation instead. Pipeline barely moves.
In attribution, branded search looks like it's printing money. In an incrementality test, it often shows 70–80% of its attributed conversions would have happened anyway.
The inverse is also true for brand campaigns. They often show poor attribution metrics but high incrementality — because the lift they create shows up downstream in branded search, organic traffic, and shorter sales cycles, not in their own last-touch attribution.
7. What Incrementality Tests Consistently Reveal
Across the companies we've worked with, incrementality testing produces three consistent surprises:
Branded search is rarely as incremental as it looks. Holdout tests on branded paid search consistently show that the majority of attributed conversions would have arrived anyway through organic results. The true incremental CAC for branded search is typically 3–5x higher than attributed CAC. This frees up significant budget for genuinely incremental channels.
Brand investment is almost always more incremental than attribution gives it credit for. The lift from brand campaigns shows up in places attribution can't see: lower CPCs on non-brand terms, higher organic conversion rates, faster sales cycles. Measured incrementally over a 12-week window, brand contribution is consistently larger than last-touch models suggest.
Performance channels hit diminishing returns faster than expected. The marginal return on additional Google Search spend is very different once you're past a certain spend threshold. Incrementality tests that vary spend intensity can identify the inflection point — which is where budget should start flowing to other channels.
How to Run Your First Incrementality Test (Step by Step)
1. Choose one campaign to test. Start with your highest-spend retargeting campaign or your next brand awareness flight. Don't try to test everything at once.
2. Define your success metric before you start. Is it demo requests? Pipeline created? Branded search volume? Write it down before the test begins.
3. Set your test duration. Most B2B campaigns need 4–6 weeks to accumulate enough conversion events for significance. Don't cut it short.
4. Create the holdout. In most platforms, this is a setting within the campaign. Aim for 15% holdout. Make sure the holdout assignment is random, not based on geography or audience segment.
5. Let the test run without interference. Don't adjust bids, creative, or targeting during the test period.
6. Compare outcomes between exposed and holdout groups. Calculate the incremental lift. Run a simple significance test (most platforms provide this, or use a free online calculator).
7. Document what you found. Write down the result even if it's a null result. Over time, these tests build into an evidence base that guides allocation decisions.
Best Practices for Incrementality Testing
- Start with one test, not a full program. A single holdout test on your top retargeting campaign will almost always change how you see your data.
- Pre-commit to what result would change your behavior. Before running a test, write down: "If the incremental lift is below X, we will reduce this channel's budget by Y%." This prevents rationalization after the fact.
- Test in multiple seasons before drawing permanent conclusions. A geo test in Q4 will look very different from Q2 for most B2B companies.
- Treat results as directional, not definitive. One test gives you a point estimate with uncertainty around it. Run more tests, refine your methodology, and build a progressively more accurate picture.
- Don't conflate null results with "doesn't work." A null result means you couldn't detect the effect — not that there wasn't one. Increase your sample size or test duration before concluding a channel is non-incremental.
- Build toward a Marketing Mix Model. Individual tests answer one question at a time. MMM estimates channel interactions and long-term effects simultaneously — and it's the natural next step once you have several quarters of experimental data.
FAQ: Common Questions About Incrementality Testing
Q: Is incrementality testing only for large companies with big budgets? No. A basic holdout test requires nothing more than a campaign, a platform that supports holdout audiences (Meta, Google, and LinkedIn all do), and a defined measurement period. You can run your first test with no additional tools or budget. The methodology scales from a $10,000 campaign to a $10 million one.
Q: How is a lift test different from A/B testing? A/B testing typically compares two versions of a creative or landing page — which ad copy performs better, which headline converts higher. Incrementality (lift) testing asks whether running the campaign at all generates more outcomes than not running it. They measure different things and serve different decisions.
Q: What if my holdout group is too small to reach statistical significance? This is the most common practical challenge. Solutions: increase the holdout percentage (from 10% to 20%), extend the test period, or switch to a geo-based test design where the geographic unit of randomization gives you more statistical power.
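To estimate how large each group needs to be before you launch, the standard two-proportion sample-size formula gives a rough answer. A sketch with hypothetical inputs, assuming 95% confidence and roughly 80% power:

```python
# Rough sample size per group needed to detect a given relative lift,
# using the standard two-proportion formula (normal approximation).
# Inputs below are hypothetical.
from math import ceil, sqrt

def sample_size_per_group(base_rate, lift, alpha_z=1.96, power_z=0.84):
    """n per group at 95% confidence and ~80% power (z-values hardcoded)."""
    p1 = base_rate                    # control conversion rate
    p2 = base_rate * (1 + lift)       # expected test conversion rate
    p_bar = (p1 + p2) / 2
    numerator = (alpha_z * sqrt(2 * p_bar * (1 - p_bar))
                 + power_z * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Detecting a 20% relative lift on a 2% baseline conversion rate:
n = sample_size_per_group(base_rate=0.02, lift=0.20)
print(n)
```

Note how quickly this grows: low baseline rates and small expected lifts both push the required sample into the tens of thousands per group — which is exactly why geo designs or longer test windows are often the practical answer.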
Q: Can I test brand campaigns with holdouts? Yes, though geo-based tests tend to work better for brand campaigns because brand exposure often travels across social networks in ways that contaminate user-level holdouts (the holdout user sees the campaign shared by a friend, for example). Geo tests — where control markets have no campaign exposure at all — are cleaner for brand measurement.
Additional Resources
From the Zaitz Marketing Knowledge Library:
- Why Your Attribution Model Is Lying to You — Why the alternative to incrementality (attribution) fails
- Why Your Customer Acquisition Cost Is Probably Wrong — How attribution errors flow through to broken CAC calculations
External Reading:
- Google's Guide to Conversion Lift Studies — How Google's built-in lift testing works
- Meta's Conversion Lift Tool Documentation — Meta's holdout methodology explained
- Introduction to Causal Inference in Marketing (LinkedIn Engineering) — Technical background on why randomized experiments are the gold standard
Zaitz Marketing designs and interprets incrementality tests for B2B companies that want to move beyond attribution and make budget decisions based on causal evidence. If you want to know what your marketing is actually driving, start with a Growth Architecture Review.
→ Book a Conversation