Most advertisers would love to know in advance whether the new ads in which they invest so much will actually work in-market. So they copy test, often testing several different concepts, selecting a winner, and then fine-tuning on the basis of the diagnostic feedback they gain.
But how well does copy testing work? Two of the larger copy testing systems have been developed and refined – reverse engineered, really – on the basis of those companies’ in-market measurement systems. For reasons discussed below, there is a fairly wide margin of error in the predictive accuracy of these systems, but advertisers who use them do at least improve their odds of success.
Then there is the rest of copy testing – the non-accountable systems.
Copy tests that claim to predict how well an ad will work in-market (whether via traditional survey designs or biometric measurement) but have no in-market feedback loop cannot provide any degree of confidence in their predictions. If you keep testing ads and picking winners because they score better on your metrics, but have never validated and refined your algorithms against actual in-market results, how can a client be sure your system works?
The ‘diagnostics’ provided by these non-accountable copy testing systems are equally suspect. The supplier can tell you, with a high degree of confidence, how to make the ad ‘better’. But if ‘better’ simply means it will score higher in their system, and scoring higher in their system has no demonstrated correlation with in-market success, you are optimizing to the copy test, not to the real world.
Driven in part by frustration with traditional copy testing, another category of diagnostic systems has emerged. Because we know that advertising often works most powerfully on an emotional level, advertisers have been experimenting with brain-wave analysis, facial-expression tracking and other means of observing how ads make people feel. Some have found this feedback valuable as they seek to improve their ads’ ability to connect and persuade. These emotion-based measurement systems are, however, a long way from predicting in-market performance, and most do not even attempt to build quantitative links to actual in-market results.
But even the copy testing systems that do have the feedback/refinement loop struggle mightily to provide reliable predictions. You’d think it wouldn’t be so hard – Will this ad engage? Will the brand be remembered? Will it persuade?
Copy testing fails in large part because it does not acknowledge that the world has changed.
First, people don’t watch TV the way they used to: fewer families gather around the set watching together, there is more multi-screen behavior, and more time-shifting. All of this means that fewer consumers are passively gazing at the screen waiting to be engaged by your commercial. Any copy testing approach that assumes eyes on screen, and that rewards commercials for generating engagement in that no-longer-typical environment, is going to fail to pick the real winners and losers.
Second, TV commercials are no longer the totality of a brand’s marketing communications. The best campaigns surround the consumer with a multiplicity of touchpoints, carefully orchestrated to build a holistic brand story. The TV spot that appears to work best in isolation is not necessarily the one that shines as part of a larger communications program.
Third, in today’s cluttered advertising and brand environment, the consumer takeaway from an advertising message needs to be firmly fixed in long-term memory. Copy testing invariably measures short-term memory, which has been shown to operate quite differently from long-term memory.
As advertisers continue to rely on copy testing built on outdated concepts of how advertising works, mediocre (and worse) advertising continues to make it on-air. With the false confidence the copy test provides, advertisers invest millions in airtime for ads that simply don’t work. To compound the problem, actual in-market success is often monitored only with insensitive, indirect or rear-view-mirror research methods that rely on exposure opportunities, behavioral data or other aggregated data.
Consumers continue to complain that most ads are bad. And they’re right. The ads are bad because the copy testing is at best out of touch with reality, and the in-market feedback that could at least provide solid foundational learning is also broken.