Performing simulation studies with synthetic data is one of our favorite techniques for benchmarking models, an approach strongly influenced by the Bayesian modeling field. Because we’re simulating the data ourselves, we know the precise underlying characteristics of the marketing attribution process and can compare our model results to this benchmark. As long as our simulations reproduce enough relevant context, our results should be indicative of real-world performance.
Our structure is very simple: We simulate a collection of customer journeys, each spanning multiple sales cycles. (This mimics a recent problem of interest.) Each customer’s journey is measured over 64 “time steps,” where a step can represent a day, week, or other relevant time scale depending on the particular business context. Finally, the customer’s purchasing behavior is controlled by several parameters:
- How often customers interact with the marketing campaign of interest.
- How often customers interact with other simultaneous campaigns we are running.
- How these marketing campaigns affect the likelihood to purchase.
- The customer’s baseline likelihood of purchasing in the absence of known marketing activities.
This simple model has some richness to it depending on how we set the various parameters in relation to each other. Below, we’ve sketched an example of how the simulations work. Interactions with our marketing campaigns (blue lines) raise the customer’s likelihood of purchasing (gray line) until the customer purchases (red points), at which point the sales cycle resets. This continues through the end of the simulation.
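To make this concrete, here is a minimal sketch of how one such journey might be simulated. The parameter names and the exact purchase mechanics are our own illustrative choices, not the precise implementation behind our figures:

```python
import numpy as np

def simulate_journey(touch_rate, other_rate, lift, other_lift,
                     baseline, n_steps=64, rng=None):
    """Simulate one customer journey of n_steps (illustrative mechanics).

    touch_rate  -- per-step probability our campaign reaches the customer
    other_rate  -- per-step probability some other campaign reaches them
    lift        -- how much one of our touches raises purchase likelihood
    other_lift  -- the same, for the other campaigns
    baseline    -- purchase likelihood with no marketing at all
    """
    rng = np.random.default_rng() if rng is None else rng
    ours = rng.random(n_steps) < touch_rate
    other = rng.random(n_steps) < other_rate
    sales = np.zeros(n_steps, dtype=bool)

    likelihood = baseline
    for t in range(n_steps):
        likelihood += lift * ours[t] + other_lift * other[t]
        if rng.random() < min(likelihood, 1.0):   # purchase this step?
            sales[t] = True
            likelihood = baseline                 # the sales cycle resets
    return ours, other, sales
```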

Next, we’ll study how the last-touch heuristic performs by comparing its attributed sales to the incremental sales measured against a counterfactual simulation (a direct measure of what happens without our marketing interventions). By varying the simulation parameters, we can measure the effectiveness of single-touch attribution models across different situations.
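Continuing the sketch above (the tie-breaking and lookback rules here are again our own choices): last-touch credits a sale to our campaign whenever our touch is the most recent one preceding it, while the causal estimate differences simulations run with and without our campaign.

```python
def last_touch_sales(ours, other, sales):
    """Count sales whose most recent preceding touch was ours."""
    credited, last = 0, None
    for t in range(len(sales)):
        if other[t]:
            last = "other"
        if ours[t]:
            last = "ours"    # same-step ties go to our campaign (our choice)
        if sales[t] and last == "ours":
            credited += 1
    return credited

def last_touch_effect(params, n_customers=10_000, seed=0):
    """Per-customer sales credited to our campaign by the heuristic."""
    rng = np.random.default_rng(seed)
    return np.array([last_touch_sales(*simulate_journey(**params, rng=rng))
                     for _ in range(n_customers)])

def causal_effect(params, n_customers=10_000, seed=0):
    """Per-customer incremental sales vs. the no-campaign counterfactual."""
    rng = np.random.default_rng(seed)
    off = {**params, "touch_rate": 0.0}   # our campaign never runs
    return np.array([simulate_journey(**params, rng=rng)[2].sum()
                     - simulate_journey(**off, rng=rng)[2].sum()
                     for _ in range(n_customers)])
```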
Last-touch attribution works well in simple cases.
We start with the simplest case, in which last-touch attribution performs very well: We only carry out one marketing campaign, and customers will not purchase without our marketing. But when they do encounter our marketing, they purchase immediately.
Because our simulations are configured such that our marketing reaches the customer about eight times across 64 time steps, we expect to see about eight sales per customer. And that’s exactly what we measure, from both the attribution model and our counterfactual/causal estimate:
- Last-touch effect: 7.94 ± 0.07
- Causal effect: 7.94 ± 0.07
This is a good sanity check for our simulations as well: In this configuration there are no sales without marketing, so any sale should be attributable to our marketing. Here, last-touch attribution matches a causal analysis.
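Using the sketches above, the sanity check might look like the following; the parameter values are illustrative (one campaign, immediate purchases, no background sales):

```python
params = dict(touch_rate=1 / 8, other_rate=0.0, baseline=0.0,
              lift=1.0,        # a touch prompts an immediate purchase
              other_lift=0.0)

for name, x in [("last-touch", last_touch_effect(params)),
                ("causal", causal_effect(params))]:
    sem = x.std(ddof=1) / np.sqrt(x.size)    # standard error of the mean
    print(f"{name} effect: {x.mean():.2f} ± {2 * sem:.2f}")   # ~8 each
```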
The results also extend a little further, into configurations where our marketing campaign only increases customers’ likelihood to buy rather than prompting an immediate purchase. Lowering the effectiveness of the marketing introduces time delays between a marketing touch and the eventual sale, and we trace this effect below (throughout this post, uncertainty bands span two standard errors, roughly a 95% confidence interval).

As our campaign’s efficacy increases, the amount of time between the marketing interaction and the sale decreases, and the total incremental sales increase. Interestingly, last-touch attribution seems to ever-so-slightly overstate the sales effect relative to the counterfactual comparison as the effects become increasingly delayed, but it remains a simple and useful measure of value.
Last-touch attribution struggles with complexity.
We now begin to introduce complexity in two ways: by allowing sales to happen spontaneously (without any tracked marketing campaign) or by introducing our marketing campaign into an environment where other campaigns are ongoing. Each change alters the counterfactual (what would happen without our campaign), which causes problems for the last-touch approach. Because last-touch attribution is more or less a count of how many times our marketing happened just before a sale, it has no way to distinguish the effect of our campaign from these other background effects.
Consider an example where customers, without any measured campaign, might purchase our product four times in their 64-step journey. Last-touch attribution still claims our marketing efforts are worth 7.95 ± 0.07 incremental sales, just as before, because sales still occur this often following a marketing intervention. Examining the true counterfactual, though, paints a slightly different picture:
- Counterfactual sales: 3.98 ± 0.035
- Sales with marketing: 11.47 ± 0.08
- Incremental sales: 7.49 ± 0.09
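(As a check on the arithmetic: 11.47 - 3.98 = 7.49, and the ±0.09 band is consistent with combining the two uncertainties in quadrature, √(0.08² + 0.035²) ≈ 0.09.)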
Because last-touch attribution cannot reason against the counterfactual and adjust for the baseline sales rate, its estimate of our campaign’s value comes out slightly too high. A similar effect can also occur when our campaign runs concurrently with one or more other campaigns: the heuristic cannot separate the campaigns’ effects, so some of the value of the simultaneous campaigns is conflated with ours.
This can be seen in the figures below, where the true incrementality (red) decreases as other effects are introduced, but the last-touch attribution model (blue) cannot fully account for the reduction. On the left, the attribution model cannot account at all for sales that are not attached to a marketing campaign in its data (background sales or other unobserved marketing), and its overestimate grows with the size of this effect. On the right, we see that the attribution model is able to account for some of the sales linked to another tracked campaign but is unable to do so completely.

Last-touch attribution misses both ways in heterogeneous environments.
In real-world scenarios, where sales cannot always be linked to marketing campaigns and where multiple marketing efforts proceed simultaneously, last-touch attribution’s performance further degrades. To demonstrate, we simulate a series of customer journeys in which three parameters interact: the background sales rate, the frequency of competing campaigns, and the frequency of our target campaign. We also set up the problem such that our campaign is immediately effective when it occurs, but the competing campaigns are ineffective.
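Under those assumptions, the sweep itself is short, reusing the earlier sketches; the grid values here are placeholders rather than the exact ones behind the figures below:

```python
import itertools

# placeholder grids, not the exact values behind the figures below
baselines   = [0.02, 0.06]        # background purchase rate per step
other_rates = [0.0, 0.15, 0.30]   # how often competing campaigns touch
touch_rates = [1 / 16, 1 / 8]     # how often our campaign touches

for b, o, t in itertools.product(baselines, other_rates, touch_rates):
    params = dict(touch_rate=t, other_rate=o, baseline=b,
                  lift=1.0,         # our campaign prompts immediate purchase
                  other_lift=0.0)   # competing campaigns are ineffective
    print(f"baseline={b:.2f} others={o:.2f} ours={t:.3f}  "
          f"last-touch={last_touch_effect(params).mean():.2f}  "
          f"causal={causal_effect(params).mean():.2f}")
```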
In this case, the interplay of the background sales rate and the frequency of the competing marketing events thoroughly confuses the attribution model. When the background rate is high (right panel), the model again overstates the true effect by a wide margin. When the background rate is lower (left and middle panels), the model can underestimate value as competing marketing efforts become more frequent.

As a result, in an environment with many different things happening simultaneously, we cannot even count on last-touch attribution being biased in a consistent direction (overstating value).
Last-touch attribution sees things that don’t exist.
Finally, since the goal of an attribution model is to attach value, how well can last-touch attribution diagnose an ineffective campaign? So far, we have assumed our campaign has a real, and often immediate, effect. Now we adjust the simulation so that neither our marketing campaign nor any other competing campaigns are effective.
The figures below emphatically demonstrate that last-touch attribution cannot make this critical distinction, regardless of whether simultaneous marketing efforts are present. As the number of background sales increases (left to right), the attribution model begins to claim them for itself despite having no real connection to them.

This happens precisely because these attribution frameworks are only correlational: they simply relate our marketing events to sales. As long as our marketing interactions continue to occur, and as long as sales continue, it will appear as though our campaign is excellent, even when it is completely devoid of value.
An analysis of a slightly simpler simulation shows that, when these campaigns are ineffective, the attribution is the product of (1) the rate at which the customer engages with our marketing and (2) one minus the rate at which the customer engages with other campaigns we are running. A lack of effectiveness doesn’t stop last-touch attribution from believing our marketing leads to sales.
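In symbols, with hypothetical per-step engagement rates r_ours for our campaign and r_other for all others, that product is r_ours × (1 − r_other) per sale. With r_ours = 0.125 and r_other = 0.25, for instance, an entirely ineffective campaign would still be credited with roughly 9% of background sales.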
Last-touch attribution is appealing because it is so simple to understand and implement, but this simplicity brings significant downsides. If we can convince ourselves that our situation is simple enough for last-touch attribution to apply, then our simulations suggest it is a useful heuristic. But as the complexity grows, or even as the potential for complexity grows, last-touch attribution seems unable to cope with the task it has been given, especially since, unlike in our simulations, we cannot know the true state of the world.
Other methods, including multi-touch attribution, might prove more accurate, but the primary difficulty of incorporating a counterfactual remains. An approach built on causal inference, with experimental validation, is the best way to provide reliable attribution estimates.