๐Ÿ’Ž On the danger of a theory-free analysis of mere correlations (winter detector)

The ‘winter detector’ problem is common in big data analysis. A literal example, via computer scientist Sameer Singh, is the pattern-recognising algorithm that was shown many photos of wolves in the wild, and many photos of pet husky dogs. The algorithm seemed to be really good at distinguishing the two rather similar canines; it turned out that it was simply labelling any picture with snow as containing a wolf. An example with more serious implications was described by Janelle Shane in her book You Look Like a Thing and I Love You: an algorithm that was shown pictures of healthy skin and of skin cancer. The algorithm figured out the pattern: if there was a ruler in the photograph, it was cancer. If we don’t know why the algorithm is doing what it’s doing, we’re trusting our lives to a ruler detector.

Excerpt from: How to Make the World Add Up: Ten Rules for Thinking Differently About Numbers by Tim Harford