
What is Incrementality in Marketing?

TL;DR: Incrementality measures what your marketing actually caused — not what it coincided with. It answers the question attribution never asks: would that customer have converted anyway? By comparing outcomes between a group exposed to your marketing and a control group that wasn't, incrementality testing isolates the true causal impact of a campaign, channel, or budget decision.




Quick Overview: Why Incrementality Is the Most Important Concept in Modern Marketing Measurement

Every marketing team is making allocation decisions — where to put next quarter's budget, which channels to scale, which to cut. Most of those decisions are based on attribution data. And attribution, as we've established, answers the wrong question.

Incrementality answers the right one: If I hadn't spent this money on this channel, what would have happened differently?

Here's what this article covers:

  1. Why correlation isn't enough for marketing decisions
  2. The core logic of incrementality measurement
  3. Holdout groups — what they are and how they work
  4. Geo-based testing — the alternative when user-level holdouts aren't possible
  5. Statistical significance in plain language
  6. The difference between incrementality and attribution
  7. What incrementality tests consistently reveal in practice

1. Why Correlation Isn't Enough

Marketing measurement has a fundamental causality problem — and most practitioners don't think about it carefully enough.

When you run a retargeting campaign and your conversion rate goes up, something caused that improvement. But what? It could be:

  - The ads themselves changing minds and prompting purchases
  - The fact that retargeted visitors were already more likely to buy
  - Seasonality, a product launch, or another campaign running at the same time

The correlation between your ad spend and the outcome tells you almost nothing about which of these actually drove the result.

This is called the selection problem in econometrics. The people who see your retargeting ads are not randomly selected from the general population — they visited your site, engaged with your content, or matched a behavioral profile. They were already more likely to convert than the average person. When they convert at a higher rate, you cannot credit the ads without first asking: would they have converted anyway?

Last-touch attribution never asks this question. Neither does any model that assigns credit based on touchpoint presence alone.


2. The Core Logic of Incrementality Measurement

Incrementality testing borrows directly from the scientific method. The question is causal: did my marketing cause this outcome?

To answer a causal question, you need a counterfactual — a picture of what would have happened without your marketing. The cleanest way to establish a counterfactual is a randomized controlled experiment:

  1. Split your potential audience into two groups
  2. Expose one group to your marketing (the test group)
  3. Hold the other group back — no exposure to the campaign (the control group)
  4. After a defined period, measure the difference in outcomes between the two groups
  5. That difference — controlling for everything else — is the incremental effect of your campaign

Key term: Incremental lift = the percentage increase in conversions (or pipeline, or branded search) in the test group compared to the control group. This is the number that represents what your marketing actually caused.
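If it helps to see the arithmetic, here is a minimal sketch in Python. The campaign numbers are illustrative, not benchmarks:

```python
# Minimal sketch: incremental lift from a test/control split.
# All numbers below are illustrative.

def incremental_lift(test_conv, test_size, control_conv, control_size):
    """Relative increase in conversion rate of the test group over control."""
    cr_test = test_conv / test_size
    cr_control = control_conv / control_size
    return (cr_test - cr_control) / cr_control

# 85,000 exposed users with 1,020 conversions (1.2%),
# 15,000 held-out users with 150 conversions (1.0%)
lift = incremental_lift(1020, 85_000, 150, 15_000)
print(f"Incremental lift: {lift:.0%}")  # 20%
```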

The elegance of this approach is that randomization controls for everything you didn't change. If test and control groups were assigned randomly, they should be similar in every other way — meaning any difference in outcomes is attributable to the marketing exposure, not to some other variable.


3. Holdout Groups: The Foundation of User-Level Testing

A holdout group is a randomly assigned subset of your target audience that is deliberately excluded from seeing a campaign. Typically 10–20% of the target audience is held out.

How it works in practice:

  1. Before launching a campaign, randomly assign 15% of your retargeting audience to a holdout bucket
  2. Run your campaign to the remaining 85% as normal
  3. After 4–6 weeks, compare conversion rates between the two groups
  4. The lift in the exposed group, relative to the holdout, is your incremental conversion rate
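Platforms usually handle the random assignment in step 1 for you. If you ever need to do it yourself, one common pattern is deterministic assignment by hashing a stable user ID. A sketch, where the salt and audience are invented for illustration:

```python
# Sketch: deterministic holdout assignment by hashing a stable user ID.
# The same user always lands in the same bucket, with no assignment
# table to store. Salt and audience are invented for illustration.
import hashlib

HOLDOUT_PCT = 15  # hold out 15% of the audience

def is_holdout(user_id: str, salt: str = "retargeting-test-1") -> bool:
    """True if this user belongs to the unexposed (holdout) bucket."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < HOLDOUT_PCT  # uniform bucket 0-99

audience = [f"user_{i}" for i in range(10_000)]
holdout = [u for u in audience if is_holdout(u)]
print(f"{len(holdout) / len(audience):.1%} held out")  # ~15%
```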

What holdout testing can measure:

  - The incremental conversion rate of a specific campaign or channel
  - The true cost per incremental conversion, once non-incremental volume is stripped out
  - Whether a retargeting program is adding conversions or merely claiming credit for them

Important limitations:

  - Holdouts can be contaminated: users in the control group may still encounter the campaign through shared devices or content shared by others
  - A small holdout may never accumulate enough conversions to reach statistical significance
  - It doesn't work for marketing that can't be withheld at the individual level, such as TV, out-of-home, or broad brand campaigns

Most major ad platforms (Meta, Google) offer built-in holdout functionality within campaign settings. This is the easiest place to start.


4. Geo-Based Testing: When You Can't Do User-Level Holdouts

Geo-based testing is the alternative when user-level holdouts aren't feasible — particularly for brand campaigns, out-of-home advertising, TV, or any marketing that can't easily be withheld at the individual level.

How it works:

  1. Identify 6–10 comparable geographic markets (cities, DMAs, or regions)
  2. Randomly assign half to the test condition and half to the control
  3. Run your campaign in test markets only
  4. Measure outcomes in both groups during and after the campaign flight

Why geo testing is often more reliable than user-level holdouts:

  - Control markets get zero exposure, so there's no contamination from cross-device behavior or content shared between users
  - It works for channels that can't be withheld per person: TV, out-of-home, audio, broad brand campaigns
  - It doesn't depend on user-level tracking or identity resolution

What to measure in geo tests:

  - Branded search volume in test vs. control markets
  - Direct and organic traffic
  - Down-funnel outcomes: demo requests, pipeline created, revenue

Example: You're running a LinkedIn brand awareness campaign targeting CMOs at SaaS companies. You select 8 comparable markets, run the campaign in 4 (test), and hold back 4 (control). After 6 weeks, branded search volume is 18% higher in test markets and the demo request rate is 11% higher. That 11% relative lift on demo requests is your incremental effect. Before this test, your attribution model gave LinkedIn zero credit for these demos — because LinkedIn rarely appears as the last touch before a demo booking.
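Here is a sketch of how you might evaluate that geo test. The per-market demo request rates are invented to produce roughly the 11% lift in the example, and averaging market-level rates is a simplification (a real analysis would weight markets and test significance):

```python
# Sketch: comparing demo request rates across test and control markets.
# Market names and rates are invented for illustration.
from statistics import mean

test_markets    = {"austin": 0.041, "denver": 0.039, "raleigh": 0.042, "portland": 0.038}
control_markets = {"nashville": 0.037, "columbus": 0.035, "tucson": 0.037, "omaha": 0.035}

cr_test = mean(test_markets.values())        # 0.040
cr_control = mean(control_markets.values())  # 0.036

print(f"Test markets:    {cr_test:.3f}")
print(f"Control markets: {cr_control:.3f}")
print(f"Relative lift:   {(cr_test - cr_control) / cr_control:.0%}")  # 11%
```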


5. Statistical Significance in Plain Language

Every incrementality test requires a statistical significance threshold. This is where many marketing teams get lost — but it's critical to understand.

What statistical significance answers: How confident are we that the result we observed isn't just random chance?

When you run an experiment, the difference between test and control groups will never be exactly zero — even if your campaign had zero effect. There will always be some random variation. Statistical significance tells you how likely it is that the gap you observed would occur by random chance alone if there were truly no effect.

The convention: 95% confidence level — meaning you accept a 5% probability of concluding the campaign worked when it actually didn't (a false positive). For large, irreversible budget decisions, 99% confidence is appropriate.

What this means practically:

| Situation | Recommended confidence level |
| --- | --- |
| Quick directional test (small decision) | 90% |
| Standard campaign evaluation | 95% |
| Major budget reallocation decision | 99% |

The most common mistake: Peeking at results mid-test and stopping when the number looks favorable. This inflates your false positive rate dramatically. Decide your test duration before you start, and don't call the test early.

A null result is still valuable. If your test doesn't reach statistical significance, it doesn't mean the campaign did nothing — it means you couldn't prove it worked with the required confidence. This is useful information: either the effect is smaller than expected, or you need a larger sample size.
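Most platforms run the significance check for you, but the underlying test is usually a standard two-proportion z-test, which you can sanity-check in plain Python. A sketch, reusing the same illustrative holdout numbers as earlier:

```python
# Sketch: pooled two-proportion z-test (one-sided) in plain Python.
# Answers: if the campaign truly had no effect, how likely is a gap
# at least this large by chance alone?
from math import erf, sqrt

def z_test_two_proportions(conv_a, n_a, conv_b, n_b):
    """One-sided p-value for H0: both groups convert at the same rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return 1 - 0.5 * (1 + erf(z / sqrt(2)))  # upper-tail probability

# 1,020/85,000 exposed vs 150/15,000 held out
p = z_test_two_proportions(1020, 85_000, 150, 15_000)
print(f"p-value: {p:.4f}")  # ~0.018, significant at the 95% level
```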


6. Incrementality vs. Attribution: The Core Difference

|  | Attribution | Incrementality |
| --- | --- | --- |
| The question it answers | Which channels were present when this conversion happened? | How many conversions happened because of this channel? |
| Direction | Backward-looking | Forward-looking |
| Logic | Correlation | Causation |
| What it's good for | Understanding journey patterns | Making budget allocation decisions |
| What it misses | Whether touchpoints caused outcomes | Channel interaction effects (requires MMM for full picture) |

A channel can score extremely well in attribution and have near-zero incrementality.

The canonical example is branded search. Branded search captures existing demand — it intercepts people who are already planning to buy and are searching your company name to find the website. If you paused your branded search campaigns for a month (in a controlled test), most of that traffic would arrive via organic results or direct navigation instead. Pipeline barely moves.

In attribution, branded search looks like it's printing money. In an incrementality test, it often shows 70–80% of its attributed conversions would have happened anyway.

The inverse is also true for brand campaigns. They often show poor attribution metrics but high incrementality — because the lift they create shows up downstream in branded search, organic traffic, and shorter sales cycles, not in their own last-touch attribution.


7. What Incrementality Tests Consistently Reveal

Across the companies we've worked with, incrementality testing produces three consistent surprises:

Branded search is rarely as incremental as it looks. Holdout tests on branded paid search consistently show that the majority of attributed conversions would have arrived anyway through organic results. The true incremental CAC for branded search is typically 3–5x higher than attributed CAC. This frees up significant budget for genuinely incremental channels.
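The arithmetic behind that multiplier is simple. A sketch with invented numbers, assuming a holdout test showed only 25% of attributed conversions were truly caused by the ads:

```python
# Sketch: attributed CAC vs. incremental CAC. Numbers are invented.

spend = 50_000                # monthly branded search spend ($)
attributed_conversions = 500  # conversions the attribution model credits

attributed_cac = spend / attributed_conversions  # $100

incrementality = 0.25  # holdout test: only 25% were actually caused by the ads
incremental_cac = spend / (attributed_conversions * incrementality)  # $400

print(f"Attributed CAC:  ${attributed_cac:,.0f}")
print(f"Incremental CAC: ${incremental_cac:,.0f}")  # 4x the attributed figure
```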

Brand investment is almost always more incremental than attribution gives it credit for. The lift from brand campaigns shows up in places attribution can't see: lower CPCs on non-brand terms, higher organic conversion rates, faster sales cycles. Measured incrementally over a 12-week window, brand contribution is consistently larger than last-touch models suggest.

Performance channels hit diminishing returns faster than expected. The marginal return on additional Google Search spend drops off sharply once you're past a saturation threshold. Incrementality tests that vary spend intensity can identify the inflection point — which is where budget should start flowing to other channels.


How to Run Your First Incrementality Test (Step by Step)

  1. Choose one campaign to test. Start with your highest-spend retargeting campaign or your next brand awareness flight. Don't try to test everything at once.
  2. Define your success metric before you start. Is it demo requests? Pipeline created? Branded search volume? Write it down before the test begins.
  3. Set your test duration. Most B2B campaigns need 4–6 weeks to accumulate enough conversion events for significance. Don't cut it short.
  4. Create the holdout. In most platforms, this is a setting within the campaign. Aim for 15% holdout. Make sure the holdout assignment is random, not based on geography or audience segment.
  5. Let the test run without interference. Don't adjust bids, creative, or targeting during the test period.
  6. Compare outcomes between exposed and holdout groups. Calculate the incremental lift. Run a simple significance test (most platforms provide this, or use a free online calculator).
  7. Document what you found. Write down the result even if it's a null result. Over time, these tests build into an evidence base that guides allocation decisions.

Best Practices for Incrementality Testing

  - Define the success metric and the test duration before launch, and never call a test early
  - Randomize assignment properly; never split by geography or segment in a user-level test
  - Leave bids, creative, and targeting untouched for the full test period
  - Match the confidence level to the stakes: 90% for quick directional reads, 95% as standard, 99% for major reallocations
  - Document every result, including nulls; the tests compound into an evidence base for allocation decisions

FAQ: Common Questions About Incrementality Testing

Q: Is incrementality testing only for large companies with big budgets? No. A basic holdout test requires nothing more than a campaign, a platform that supports holdout audiences (Meta, Google, and LinkedIn all do), and a defined measurement period. You can run your first test with no additional tools or budget. The methodology scales from a $10,000 campaign to a $10 million one.

Q: How is a lift test different from A/B testing? A/B testing typically compares two versions of a creative or landing page — which ad copy performs better, which headline converts higher. Incrementality (lift) testing asks whether running the campaign at all generates more outcomes than not running it. They measure different things and serve different decisions.

Q: What if my holdout group is too small to reach statistical significance? This is the most common practical challenge. Solutions: increase the holdout percentage (from 10% to 20%), extend the test period, or switch to a geo-based test design where the geographic unit of randomization gives you more statistical power.
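To size a test before you run it, a standard power calculation gives a rough answer. This sketch assumes equal group sizes, which is a simplification (a 15% holdout gives unequal groups and needs somewhat more total volume):

```python
# Sketch: approximate sample size per group to detect a given lift
# with a one-sided two-proportion test. Defaults are illustrative.
from statistics import NormalDist

def n_per_group(base_rate, lift, alpha=0.05, power=0.80):
    """Approximate users needed in EACH group."""
    p1, p2 = base_rate, base_rate * (1 + lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2

# Detecting a 20% lift on a 1% base conversion rate
print(f"{n_per_group(0.01, 0.20):,.0f} users per group")  # ~34,000
```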

Q: Can I test brand campaigns with holdouts? Yes, though geo-based tests tend to work better for brand campaigns because brand exposure often travels across social networks in ways that contaminate user-level holdouts (the holdout user sees the campaign shared by a friend, for example). Geo tests — where control markets have no campaign exposure at all — are cleaner for brand measurement.




Zaitz Marketing designs and interprets incrementality tests for B2B companies that want to move beyond attribution and make budget decisions based on causal evidence. If you want to know what your marketing is actually driving, start with a Growth Architecture Review.

→ Book a Conversation
