On how too much data can make us overconfident in our predictions, rather than boost their accuracy

The problem of more data was investigated by Paul Slovic, Professor of Psychology at the University of Oregon. He ran an experiment with professional horseracing handicap setters in which they were given a list of 88 variables that were useful in predicting a horse’s performance. The participants then had to predict the outcome of the race and state how confident they were in that prediction. They repeated these tasks with access to different levels of data: either 5, 10, 20, 30 or 40 of the variables.

The results were illuminating. Accuracy was the same regardless of the number of variables used. However, overconfidence grew as more data was harnessed. Experts overestimated the importance of factors that had a limited value. It was only when five data points were used that accuracy and confidence were well calibrated.

Marketers face a similar set of problems. They have access to more data than ever before and many believe that because the information exists they should use it. The Slovic experiment suggests otherwise. We shouldn’t harness data just because we can. Instead, as much time should be spent choosing which data sets to ignore as which to use.

Excerpt from: The Choice Factory: 25 behavioural biases that influence what we buy by Richard Shotton
