
TL;DR: Incrementality measures what your marketing actually caused — not what it coincided with. It answers the question attribution never asks: would that customer have converted anyway? By comparing outcomes between a group exposed to your marketing and a control group that wasn’t, incrementality testing isolates the true causal impact of a campaign, channel, or budget decision.


Quick Overview: Why Incrementality Is the Most Important Concept in Modern Marketing Measurement

Every marketing team is making allocation decisions — where to put next quarter’s budget, which channels to scale, which to cut. Most of those decisions are based on attribution data. And attribution, as we’ve established, answers the wrong question.

Incrementality answers the right one: If I hadn’t spent this money on this channel, what would have happened differently?

Here’s what this article covers:

  1. Why correlation isn’t enough for marketing decisions
  2. The core logic of incrementality measurement
  3. Holdout groups — what they are and how they work
  4. Geo-based testing — the alternative when user-level holdouts aren’t possible
  5. Statistical significance in plain language
  6. The difference between incrementality and attribution
  7. What incrementality tests consistently reveal in practice

1. Why Correlation Isn’t Enough

Marketing measurement has a fundamental causality problem — and most practitioners don’t think about it carefully enough.

When you run a retargeting campaign and your conversion rate goes up, something caused that improvement. But what? The ads are only one candidate explanation, and the correlation between your ad spend and the outcome tells you almost nothing about what actually drove the result.

This is called the selection problem in econometrics. The people who see your retargeting ads are not randomly selected from the general population — they visited your site, engaged with your content, or matched a behavioural profile. They were already more likely to convert than the average person. When they convert at a higher rate, you cannot credit the ads without first asking: would they have converted anyway?

Last-touch attribution never asks this question. Neither does any model that assigns credit based on touchpoint presence alone.


2. The Core Logic of Incrementality Measurement

Incrementality testing borrows directly from the scientific method. The question is causal: did my marketing cause this outcome?

To answer a causal question, you need a counterfactual — a picture of what would have happened without your marketing. The cleanest way to establish a counterfactual is a randomised controlled experiment:

  1. Split your potential audience into two groups
  2. Expose one group to your marketing (the test group)
  3. Hold the other group back — no exposure to the campaign (the control group)
  4. After a defined period, measure the difference in outcomes between the two groups
  5. That difference — controlling for everything else — is the incremental effect of your campaign

Key term: Incremental lift = the percentage increase in conversions (or pipeline, or branded search) in the test group compared to the control group. This is the number that represents what your marketing actually caused.

The elegance of this approach is that randomisation controls for everything you didn’t change. If test and control groups were assigned randomly, they should be similar on average in every other way, so any difference in outcomes beyond random variation is attributable to the marketing exposure, not to some other variable.
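The lift calculation defined above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the function name `incremental_lift` and the conversion counts are hypothetical, and it assumes you already have conversion counts and group sizes for a randomly assigned test and control group.

```python
def incremental_lift(test_conversions, test_size, control_conversions, control_size):
    """Return (absolute_lift, relative_lift) between test and control conversion rates."""
    if test_size <= 0 or control_size <= 0:
        raise ValueError("group sizes must be positive")
    test_rate = test_conversions / test_size
    control_rate = control_conversions / control_size
    if control_rate == 0:
        raise ValueError("relative lift is undefined when the control rate is zero")
    # Absolute lift: difference in conversion rate (percentage points when multiplied by 100).
    absolute_lift = test_rate - control_rate
    # Relative lift: the percentage increase over the control group.
    relative_lift = absolute_lift / control_rate
    return absolute_lift, relative_lift


# Hypothetical counts: 240 of 8,000 exposed users converted vs 200 of 8,000 held out.
abs_lift, rel_lift = incremental_lift(240, 8000, 200, 8000)
print(f"absolute lift: {abs_lift * 100:.2f} percentage points")
print(f"relative lift: {rel_lift:.1%}")
```

Keeping the absolute and relative figures separate matters when reporting: a 20% relative lift on a 2.5% baseline is only half a percentage point in absolute terms.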


3. Holdout Groups: The Foundation of User-Level Testing

A holdout group is a randomly assigned subset of your target audience that is deliberately excluded from seeing a campaign. Typically 10–20% of the target audience is held out.

How it works in practice:

  1. Before launching a campaign, randomly assign 15% of your retargeting audience to a holdout bucket
  2. Run your campaign to the remaining 85% as normal
  3. After 4–6 weeks, compare conversion rates between the two groups
  4. The difference in conversion rate between the exposed group and the holdout is your incremental conversion rate; divided by the holdout rate, it gives the incremental lift
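If you manage the audience yourself rather than relying on a platform's built-in holdout feature, step 1 can be sketched as below. This is one possible approach under stated assumptions: each user has a stable identifier, and a salted hash is used as a stand-in for random assignment so that the same user always lands in the same bucket. The function name `assign_group`, the salt value, and the `user-N` identifiers are all hypothetical.

```python
import hashlib


def assign_group(user_id: str, salt: str = "retargeting-test-1", holdout_share: float = 0.15) -> str:
    """Deterministically assign a user to 'holdout' or 'exposed'."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "holdout" if bucket < holdout_share else "exposed"


# Hypothetical audience of 10,000 users.
users = [f"user-{i}" for i in range(10_000)]
groups = [assign_group(u) for u in users]
observed_share = groups.count("holdout") / len(groups)
print(f"observed holdout share: {observed_share:.3f}")
```

Changing the salt for each new test reshuffles the buckets, which avoids holding out the same people from every campaign.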

What holdout testing can measure:

Important limitations:

Most major ad platforms (Meta, Google) offer built-in holdout functionality within campaign settings. This is the easiest place to start.


4. Geo-Based Testing: When You Can’t Do User-Level Holdouts

Geo-based testing is the alternative when user-level holdouts aren’t feasible — particularly for brand campaigns, out-of-home advertising, TV, or any marketing that can’t easily be withheld at the individual level.

How it works:

  1. Identify 6–10 comparable geographic markets (cities, DMAs, or regions)
  2. Randomly assign half to the test condition and half to the control
  3. Run your campaign in test markets only
  4. Measure outcomes in both groups during and after the campaign flight
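The random assignment in step 2 can be sketched as follows. The market names, the function name `assign_markets`, and the seed are hypothetical; the sketch assumes the markets have already been vetted as comparable, which is the harder part of the design and is not covered here.

```python
import random


def assign_markets(markets, seed=2026):
    """Randomly split an even-sized list of markets into test and control halves."""
    if len(markets) % 2 != 0:
        raise ValueError("expected an even number of markets")
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced and audited
    shuffled = list(markets)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"test": sorted(shuffled[:half]), "control": sorted(shuffled[half:])}


# Hypothetical set of 8 comparable markets.
markets = [f"Market {letter}" for letter in "ABCDEFGH"]
assignment = assign_markets(markets)
print("test:   ", assignment["test"])
print("control:", assignment["control"])
```

Recording the seed alongside the test plan makes it possible to show later that the split was not hand-picked.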

Why geo testing is often more reliable than user-level holdouts:

Example: You’re running a LinkedIn brand awareness campaign targeting CMOs at SaaS companies. You select 8 comparable markets, run the campaign in 4 (test), and hold back 4 (control). After 6 weeks, branded search volume is 18% higher in test markets and demo request rate is 11% higher. Incremental lift on demo requests = 11% (a relative increase over the control markets, not 11 percentage points). Before this test, your attribution model gave LinkedIn zero credit for these demos, because LinkedIn rarely appears as the last touch before a demo booking.


5. Statistical Significance in Plain Language

Every incrementality test requires a statistical significance threshold. This is where many marketing teams get lost — but it’s critical to understand.

What statistical significance answers: How confident are we that the result we observed isn’t just random chance?

When you run an experiment, the difference between test and control groups will never be exactly zero — even if your campaign had zero effect. There will always be some random variation. Statistical significance tells you how likely it is that the gap you observed would occur by random chance alone if there were truly no effect.

The convention: 95% confidence level — meaning you accept a 5% probability of concluding the campaign worked when it actually didn’t (a false positive). For large, irreversible budget decisions, 99% confidence is appropriate.

| Situation | Recommended Confidence Level |
| --- | --- |
| Quick directional test (small decision) | 90% |
| Standard campaign evaluation | 95% |
| Major budget reallocation decision | 99% |
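To make the confidence levels concrete, here is a minimal sketch of one common significance check for conversion rates, a two-sided two-proportion z-test. The source does not name a specific test, so treat this choice as an assumption; the function name and the counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_test, n_test, conv_control, n_control):
    """Return (z, two_sided_p_value) for the difference between two conversion rates."""
    p_test = conv_test / n_test
    p_control = conv_control / n_control
    # Pooled rate under the null hypothesis that the campaign had no effect.
    pooled = (conv_test + conv_control) / (n_test + n_control)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_control))
    if se == 0:
        raise ValueError("standard error is zero; no variation in outcomes")
    z = (p_test - p_control) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 240 of 8,000 exposed users converted vs 200 of 8,000 held out.
z, p = two_proportion_z_test(240, 8000, 200, 8000)
for confidence in (0.90, 0.95, 0.99):
    verdict = "significant" if p < 1 - confidence else "not significant"
    print(f"{confidence:.0%} confidence: {verdict} (p = {p:.4f})")
```

With these hypothetical counts the same result clears the 90% bar but not the 95% or 99% bar, which is why the threshold has to be chosen before the test rather than after.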

The most common mistake: Peeking at results mid-test and stopping when the number looks favourable. This inflates your false positive rate dramatically. Decide your test duration before you start, and don’t call the test early.
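A small simulation can illustrate why peeking is a problem. The sketch below runs repeated A/A tests, where both groups have the same true conversion rate and any "significant" result is a false positive by construction. All parameters (number of simulations, number of looks, users per look, the 5% conversion rate, the seed) are hypothetical, and the z-test is an assumed choice of significance check.

```python
import math
import random


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_a / n_a - conv_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_sims=300, looks=10, users_per_look=200, rate=0.05, alpha=0.05, seed=7):
    """Return (false positive rate with peeking, false positive rate at the planned end)."""
    rng = random.Random(seed)
    flagged_when_peeking = 0
    flagged_at_end = 0
    for _ in range(n_sims):
        conv_a = conv_b = n = 0
        ever_significant = False
        p = 1.0
        for _ in range(looks):
            # Both groups share the same true rate, so there is no real effect.
            conv_a += sum(rng.random() < rate for _ in range(users_per_look))
            conv_b += sum(rng.random() < rate for _ in range(users_per_look))
            n += users_per_look
            p = p_value(conv_a, n, conv_b, n)
            if p < alpha:
                ever_significant = True  # a peeker would have stopped and declared a win here
        flagged_when_peeking += ever_significant
        flagged_at_end += p < alpha
    return flagged_when_peeking / n_sims, flagged_at_end / n_sims


peeking_fpr, fixed_fpr = simulate()
print(f"false positive rate when stopping at the first significant look: {peeking_fpr:.1%}")
print(f"false positive rate when evaluating only at the planned end:     {fixed_fpr:.1%}")
```

Because the final look is one of the looks a peeker sees, the peeking rate can never be lower than the fixed-duration rate, and in practice it is usually well above the nominal 5%.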

A null result is still valuable. If your test doesn’t reach statistical significance, it doesn’t mean the campaign did nothing — it means you couldn’t prove it worked with the required confidence. This is useful information: either the effect is smaller than expected, or you need a larger sample size.
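To get a feel for how large a sample a test needs, the sketch below uses the standard normal-approximation formula for comparing two proportions. It rests on assumptions the article does not state: equal-sized groups, a two-sided test, and 80% statistical power (the probability of detecting the effect if it is real). The function name and the baseline rate are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, relative_lift, confidence=0.95, power=0.80):
    """Approximate users needed in EACH group to detect the given relative lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided threshold
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical baseline: 2.5% conversion rate in the control group.
for lift in (0.10, 0.20, 0.40):
    n = sample_size_per_group(0.025, lift)
    print(f"to detect a {lift:.0%} relative lift: about {n:,} users per group")
```

The required sample grows roughly with the inverse square of the effect size, so halving the lift you want to detect roughly quadruples the audience you need. With an unequal split such as 15/85, the smaller group becomes the limiting factor.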


6. Incrementality vs. Attribution: The Core Difference

| | Attribution | Incrementality |
| --- | --- | --- |
| The question it answers | Which channels were present when this conversion happened? | How many conversions happened because of this channel? |
| Direction | Backward-looking | Forward-looking |
| Logic | Correlation | Causation |
| What it’s good for | Understanding journey patterns | Making budget allocation decisions |
| What it misses | Whether touchpoints caused outcomes | Channel interaction effects (requires MMM for full picture) |

A channel can score extremely well in attribution and have near-zero incrementality.

The canonical example is branded search. Brand search captures demand — it intercepts people who are already planning to buy and are searching your company name to find the website. If you paused your branded search campaigns for a month (in a controlled test), most of that traffic would arrive via organic results or direct navigation instead. Pipeline barely moves.

In attribution, branded search looks like it’s printing money. In an incrementality test, it often shows 70–80% of its attributed conversions would have happened anyway.
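The cost implication of that 70–80% figure is simple arithmetic, sketched below. The spend and conversion counts are hypothetical, as is the function name `incremental_cac`; only the 70–80% range comes from the text above.

```python
def incremental_cac(spend, attributed_conversions, share_would_convert_anyway):
    """Return (attributed_cac, incremental_cac) for a channel."""
    if not 0 <= share_would_convert_anyway < 1:
        raise ValueError("share must be in [0, 1)")
    # Only the conversions that would NOT have happened anyway count as incremental.
    incremental_conversions = attributed_conversions * (1 - share_would_convert_anyway)
    attributed = spend / attributed_conversions
    incremental = spend / incremental_conversions
    return attributed, incremental


# Hypothetical branded search campaign: 10,000 in spend, 100 attributed conversions.
for share in (0.70, 0.80):
    attributed, incremental = incremental_cac(10_000, 100, share)
    print(
        f"{share:.0%} would have converted anyway: attributed CAC {attributed:.0f}, "
        f"incremental CAC {incremental:.0f} ({incremental / attributed:.1f}x)"
    )
```

The multiplier depends only on the non-incremental share: it is 1 divided by the incremental fraction, regardless of spend.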


7. What Incrementality Tests Consistently Reveal

Across the companies we’ve worked with, incrementality testing produces three consistent surprises:

Branded search is rarely as incremental as it looks. Holdout tests on branded paid search consistently show that the majority of attributed conversions would have arrived anyway through organic results. The true incremental CAC for branded search is typically 3–5x the attributed CAC, which is what you would expect if only 20–30% of attributed conversions are incremental.

Brand investment is almost always more incremental than attribution gives it credit for. The lift from brand campaigns shows up in places attribution can’t see: lower CPCs on non-brand terms, higher organic conversion rates, faster sales cycles. Measured incrementally over a 12-week window, brand contribution is consistently larger than last-touch models suggest.

Performance channels hit diminishing returns faster than expected. The marginal return on additional Google Search spend is very different once you’re past a certain spend threshold. Incrementality tests that vary spend intensity can identify the inflection point — which is where budget should start flowing to other channels.
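One way to look for that inflection point, once you have incremental results at several spend levels, is to compute the marginal cost of each additional block of conversions. The sketch below assumes you have already measured incremental conversions at each spend tier; the tier values and the function name are hypothetical and chosen only to show the shape of a diminishing-returns curve.

```python
def marginal_cost_per_conversion(tiers):
    """tiers: list of (spend, incremental_conversions), sorted by spend ascending.

    Returns a list of (spend_from, spend_to, marginal_cost) for each step up in spend.
    """
    results = []
    for (spend_lo, conv_lo), (spend_hi, conv_hi) in zip(tiers, tiers[1:]):
        extra_spend = spend_hi - spend_lo
        extra_conversions = conv_hi - conv_lo
        # If the extra spend bought no extra conversions, the marginal cost is unbounded.
        marginal = extra_spend / extra_conversions if extra_conversions > 0 else float("inf")
        results.append((spend_lo, spend_hi, marginal))
    return results


# Hypothetical test results: incremental conversions measured at four spend levels.
tiers = [(10_000, 50), (20_000, 90), (30_000, 110), (40_000, 118)]
steps = marginal_cost_per_conversion(tiers)
for spend_from, spend_to, marginal in steps:
    print(f"{spend_from:,} -> {spend_to:,}: marginal cost per conversion {marginal:,.0f}")
```

Average CAC across the whole budget hides this pattern; it is the marginal figure for the last block of spend that tells you whether the next unit of budget belongs in this channel or another.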


How to Run Your First Incrementality Test

  1. Choose one campaign to test. Start with your highest-spend retargeting campaign or your next brand awareness flight.
  2. Define your success metric before you start. Is it demo requests? Pipeline created? Branded search volume? Write it down before the test begins.
  3. Set your test duration. Most B2B campaigns need 4–6 weeks to accumulate enough conversion events for significance.
  4. Create the holdout. Aim for 15% holdout. Make sure assignment is random.
  5. Let the test run without interference. Don’t adjust bids, creative, or targeting during the test period.
  6. Compare outcomes between exposed and holdout groups. Calculate the incremental lift and run a significance test.
  7. Document what you found. Write down the result even if it’s a null result.

FAQ

Q: Is incrementality testing only for large companies with big budgets?
No. A basic holdout test requires nothing more than a campaign, a platform that supports holdout audiences (Meta, Google, and LinkedIn all do), and a defined measurement period.

Q: How is a lift test different from A/B testing?
A/B testing compares two versions of a creative or landing page. Incrementality (lift) testing asks whether running the campaign at all generates more outcomes than not running it.

Q: What if my holdout group is too small to reach statistical significance?
Increase the holdout percentage, extend the test period, or switch to a geo-based test design.

Q: Can I test brand campaigns with holdouts?
Yes, though geo-based tests tend to work better for brand campaigns because brand exposure often travels across social networks in ways that contaminate user-level holdouts.

