The problem of more data was investigated by Paul Slovic, Professor of Psychology at the University of Oregon. He ran an experiment with professional horseracing handicap setters in which they were given a list of 88 variables that were useful in predicting a horse’s performance. The participants then had to predict the outcome of a race and state how confident they were in that prediction. They repeated these tasks with access to different amounts of data: either 5, 10, 20, 30 or 40 of the variables.
The results were illuminating. Accuracy was the same regardless of the number of variables used. However, overconfidence grew as more data was harnessed. Experts overestimated the importance of factors that had limited value. It was only when five data points were used that accuracy and confidence were well calibrated.
Marketers face a similar set of problems. They have access to more data than ever before, and many believe that, because the information exists, they should use it. The Slovic experiment suggests otherwise. We shouldn’t harness data just because we can. Instead, as much time should be spent choosing which data sets to ignore as which to use.