Lessons learned from failing to predict the future

In October 2018 I did a talk at SearchLove called: Creativity, Crystal Balls & Eating Ground Glass. It was a bit different to the talks that I usually give, in that it was largely based on a thought experiment. At that point I’d been responsible for creating content which generates coverage and links from journalists for several years. Towards the end of 2017 someone asked me:

How good are you at predicting the success (or otherwise) of a piece?

I realised I didn’t have a great answer (I thought I was good, but I had no evidence to support that thinking), and so, in 2018, I resolved to make a prediction about how each piece would perform ahead of launch. Within the talk I shared my predictions (which, to be fair, I think were only somewhat interesting); but I also shared what I learned along the way (which I thought was way more interesting).

I’ve been meaning to write up this deck for a long time, partially because there’s only so much stuff you can squeeze into a 30 minute talk; but also because some of the stuff I learned subsequently, I’ve never shared.

This is likely to be a long old post, so you might want to grab a coffee (or similar) – I hope you’ll find it useful and/or interesting.

How I made those predictions:

Predicting the success or otherwise of a creative piece is not strictly binary, and so, I created the following scale:

Scoring Band	LinkScore Points	Equivalent number of links
A	10,000+	More than 100
B	5,000 – 9,999	50 – 99
C	2,000 – 4,999	20 – 49
D	1,000 – 1,999	10 – 19
E	Less than 1,000	Less than 10

LinkScore is a proprietary metric created and used by Verve. My original predictions were based solely on LinkScore; the equivalent number of links column is just to give you a sense of what that means (i.e. an average link is not worth 100 points).
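For anyone who finds a scale easier to read as code, here is a minimal Python sketch of the table above. The names `BAND_FLOORS` and `score_band` are made up for illustration, and because LinkScore is proprietary the sketch simply takes a points total as its input rather than trying to calculate one.

```python
# Lower bound (in LinkScore points) for each band, taken from the table above.
# These names are hypothetical; LinkScore itself is proprietary, so it is
# treated here as a ready-made number.
BAND_FLOORS = [
    ("A", 10_000),
    ("B", 5_000),
    ("C", 2_000),
    ("D", 1_000),
    ("E", 0),
]


def score_band(link_score: int) -> str:
    """Return the scoring band (A to E) for a LinkScore points total."""
    if link_score < 0:
        raise ValueError("link_score must not be negative")
    # Bands are listed from highest floor to lowest, so the first match wins.
    for band, floor in BAND_FLOORS:
        if link_score >= floor:
            return band
    raise AssertionError("unreachable: band E has a floor of 0")
```

So a campaign with 4,200 LinkScore points would fall into band "C", the band the table equates to 20 – 49 links.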

Ahead of launch I predicted the scoring band of each campaign.

In the event that a campaign launched without me making a prediction (this happened a couple of times), I did not go back and make one retrospectively. I did this because I felt that it wouldn’t be a true reflection of my thoughts (i.e. my predictions would be influenced by how the campaign was performing).

It’s probably worth noting here that I made all of these predictions without telling my team what I was up to. I just saved the predictions in a Google document and didn’t think too much about them. Moreover, I didn’t analyse how I was faring until I came to write the talk.

That’s cute Hannah, but HOW did you decide what band to put each campaign into?

This was the question fired at me from someone on my team when I presented a draft version of the SearchLove deck internally at Verve.

It’s an excellent question.

But it’s one that’s difficult to answer.

Deconstructing my own thought processes is something I struggle with. Here are the things I think I considered when scoring each campaign:

Resonance

resonance = the power to evoke emotion

Initially I might consider resonance at a topic level:

In this example you can see that 35 times the number of articles are written about AI vs RPA, and these articles get 8 times more engagement. This would lead me to conclude that AI is a more resonant topic than RPA.

But I also think about this stuff at a human level: how many people are likely to care about this? Or, how many people are likely to be touched in some way by this?

Breadth of Appeal

Here I’m thinking about things like:

How many publications are likely to cover this?
Can we sell it in to different verticals?
Different countries?
Can we use the piece to tell a variety of stories?

Past Experience

This is really hard to deconstruct – essentially here I’m thinking about how I’ve seen similar pieces perform before. I’ll talk more about this a bit later on in the post.

How accurate were my predictions?

The short answer is: not very accurate at all.

I correctly predicted the LinkScore band for just 47% of campaigns

This means I am wrong more often than I am right.

This was pretty shocking – I really thought that I’d be better at making predictions than that. I was interested to see how wrong I was – remember I was scoring campaigns based on a scale, and so I wanted to see how far off my predictions were.

I decided to bucket the predictions which were wrong as follows:

A bit wrong = plus or minus one scoring band e.g. I predicted band “A” (100+ links); but the campaign actually achieved band “B” (50 – 99 links).

Very wrong = plus or minus two or more scoring bands e.g. I predicted band “A” (100+ links); but the campaign actually achieved band “C” (20 – 49 links).
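The two buckets boil down to counting how many bands sit between the prediction and the result. Here is a minimal Python sketch of that rule; the names `BANDS` and `classify_prediction` are hypothetical, chosen purely for illustration.

```python
# Bands ordered from best to worst, as in the scoring scale.
# (Hypothetical names, for illustration only.)
BANDS = ["A", "B", "C", "D", "E"]


def classify_prediction(predicted: str, actual: str) -> str:
    """Bucket a prediction as 'correct', 'a bit wrong' or 'very wrong'."""
    # Distance in bands, ignoring direction (over- or under-prediction).
    # list.index raises ValueError for anything that is not a valid band.
    distance = abs(BANDS.index(predicted) - BANDS.index(actual))
    if distance == 0:
        return "correct"
    if distance == 1:
        return "a bit wrong"
    return "very wrong"
```

Using the examples above, predicting "A" for a campaign that lands in "B" comes out as "a bit wrong", while predicting "A" for one that lands in "C" comes out as "very wrong". The direction of the miss is deliberately ignored, because the buckets are defined as plus or minus.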

Here’s how I fared across all campaigns:

In addition to making accurate predictions about just 47% of our campaigns, I was very wrong about 1 in 5.

I did not feel good about this at all.

Right now I suspect some of you might be wondering why on earth you’re wasting your time reading a post from a human who clearly has little to no idea what she’s doing.

Shaken as I was to discover just how poor I was at making these predictions, I learned some important lessons as a result of doing this analysis.

What I learned…

Given my predictions really weren’t accurate I was keen to understand where my thinking had gone awry.

Can I predict a ‘winner’?

First up, I looked at how accurate my predictions were for our best performing campaigns – i.e. the campaigns which achieved either an “A” band (100+ links) or “B” band (50 – 99 links).

I did a little bit better here, accurately predicting the band for 75% of our best performing campaigns; however, when I was wrong, I was very wrong (out by two or more scoring bands):

In order to better understand what was causing my predictions to be so far off, I looked at one of the campaigns I’d predicted outrageously wrongly:

This is On Location, a piece we created for GoCompare. Using 20 years of IMDb data, we calculated the most filmed locations on the planet.

I predicted this piece would achieve band “E” (less than 10 links); however, in fact, this was a band “A” campaign which generated over 390 links.

So I was about as wrong as a person can be.

What was my problem? Remember I said that I think I considered three things when trying to make these predictions? I’ll deal with them each in turn: