Predicting people is hard

Jan 24, 2016

I had some fun with machine learning last year. Essentially I had a lot of engagement data for a group of customers and I wanted to see if I could use it to predict which way they'd fall on a YES/NO decision.

It turns out I was terrible at working out who would be YES, but much better at working out who would be NO. By knowing this I could adjust the aggregate results and get more realistic predictions. I got some very right! I am a data scientist magician. I got some a little off. Oops?
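Roughly, the shape of the trick was something like this sketch (scikit-learn, synthetic stand-in data - not the real code or the real features): train a classifier, trust only its confident NO calls, and use them to bound the aggregate YES count.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the engagement data (the real features were
# customer engagement history; these are made up).
X, y = make_classification(n_samples=2000, n_features=10, weights=[0.6], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
p_yes = model.predict_proba(X_test)[:, 1]

# The confident NO calls were the only ones worth trusting, so check how
# often they're actually right...
confident_no = p_yes < 0.2
no_precision = (y_test[confident_no] == 0).mean()

# ...then adjust the aggregate: anyone not ruled out is a "maybe YES",
# which gives you a (wide) range rather than a point prediction.
ruled_out = int(confident_no.sum() * no_precision)
print(f"confident NOs: {confident_no.sum()}, precision on those: {no_precision:.2f}")
print(f"plausible YES count: somewhere below {len(y_test) - ruled_out}")
```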

In the end I settled on a rough range I could be reasonably confident in - but it was too wide to base any real decisions on. The vague result I was coming up with (while neat to pull from raw data) was not really better than everyone else's gut.

I still think it was cool I built a gut, but from a business point of view it's fairly pointless. We already had a lot of those.

There's a school of thought that says what I was doing was absolutely the right track: I just needed to gather more information. My machine gut would scale better than everyone else's gut if I just found more firehoses to point at it.

I think this is wishful thinking. We really want machine learning to do the cool things for people-based analysis that we know it can do for more concrete subjects. So the fact that the data is almost universally non-existent, inaccessible or misleading is put to the side.

(Incidentally, I love this story about pigeons being used to process brain scans. If you ever get stuck in the past you can jump-start all kinds of research with a sufficiently large pigeon coop.)

There's a famous story about Target predicting someone was pregnant from their purchase history:

“My daughter got this in the mail!” he said. “She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?”

The manager didn’t have any idea what the man was talking about. He looked at the mailer. Sure enough, it was addressed to the man’s daughter and contained advertisements for maternity clothing, nursery furniture and pictures of smiling infants. The manager apologized and then called a few days later to apologize again.

On the phone, though, the father was somewhat abashed. “I had a talk with my daughter,” he said. “It turns out there’s been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.”

Now if you think about it for a moment, this story is fairly suspect. Taboo sex, a father defending honour, a father shamed (by the MACHINE). When stories have features that make them especially shareable (sex, disgust, shame) we should generally be suspicious of them. If the odds say it didn't reach you on its merits, chances are it has few merits.

And as it turns out, this story is very suspect.

Stories like this help construct the idea of data analysis as something that is closer to magic. We put the information in the box, turn it on and we know you better than you know yourself. In reality, predictive stuff is quite bad, quite a lot of the time. Here is Amazon trying to market to me:

[screenshot of Amazon]

Now I really like books. Amazon has all my card details on file. It has years of my book purchasing history. I have a device that can only display books bought from Amazon. I can click a single button and it will take my money and instantly give me a book on that device.

And Amazon is marketing the same book to me three times - a book I don't want.

Ever bought an oven (or a similarly infrequent purchase)? Were you followed round the internet for weeks by an optimistic algorithm hoping this was just the start of your oven-buying spree?

Think about the huge number of hours invested in making that incredibly complicated process of identification, auctioning and ad display happen. Think about how stupid the outcome is.

This article on the Facebook newsfeed team (whose job is to guess which of the things your friends have to say you'll find interesting) has a number of telling details, but I liked this one:

Over the past several months, the social network has been running a test in which it shows some users the top post in their news feed alongside one other, lower-ranked post, asking them to pick the one they’d prefer to read. The result? The algorithm’s rankings correspond to the user’s preferences “sometimes,” Facebook acknowledges, declining to get more specific. When they don’t match up, the company says, that points to “an area for improvement.”

Personalisation like this is a hard trap to see when you're falling into it. You start off with excellent data about people's past engagement, but the instant you start using that information to shortcut the process you damage your own information-collecting system. People can no longer engage with the things you don't show them. It's cutting off your own legs on the grounds that if you weigh less you'll run faster.

To get around this you'd have to do all sorts of clever tricks to avoid self-reinforcing data - which is what justifies the entire team of clever people.
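One such trick (a standard one, not necessarily anything Facebook actually does) is to reserve a small exploration slot: occasionally show something the ranker wouldn't have chosen, so the logs keep containing engagement data the model didn't pre-filter. A sketch:

```python
import random

def pick_post(ranked_posts, epsilon=0.05):
    """Mostly show the model's top pick, but occasionally show a random
    lower-ranked post so the logs still contain engagement the model
    didn't pre-select. Illustrative epsilon-greedy exploration only."""
    if random.random() < epsilon:
        # Explore: a post the ranker would normally have buried.
        return random.choice(ranked_posts[1:]), "exploration"
    # Exploit: the ranker's top choice, as usual.
    return ranked_posts[0], "exploitation"
```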

But what's the result? "Sometimes".

There are two assumptions that justify the creation of these systems:

  • If you have enough information, you can know things about people without asking.
  • You have enough information.

The first is the temptation to godhood that cheap processing power offers. The second should be a rebuff to that, but is usually forgotten.
