On P-hacking leading to the most ridiculous conclusions (the emotions of dead salmon)

A classic demonstration was an experiment carried out by reputable researchers in 2009 which involved showing a subject a series of photographs of humans expressing different emotions, and carrying out brain imaging (fMRI) to see which regions of the subject’s brain showed a significant response, taking P < 0.001.

The twist was that the ‘subject’ was a 4lb Atlantic salmon, which ‘was not alive at the time of scanning’. Out of a total of 8,064 sites in the brain of this large dead fish, 16 showed a statistically significant response to the photographs. Rather than concluding the dead salmon had miraculous skills, the team correctly identified the problem of multiple testing – over 8,000 significance tests are bound to lead to false-positive results. Even using a stringent criterion of P < 0.001, we would expect 8 significant results by chance alone.
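
The arithmetic behind "8 significant results by chance alone" can be checked with a short sketch. It is only an illustration and not part of the excerpt: the function names are hypothetical, and the second calculation assumes the 8,064 tests are independent, which is unlikely to hold exactly for neighbouring sites in a brain scan.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of 'significant' results if every null hypothesis is true.

    Each test has probability alpha of a false positive, so the expected
    count is n_tests * alpha. This holds whether or not the tests are
    independent.
    """
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """Chance of one or more false positives across all tests.

    ASSUMPTION: the tests are independent. That is a simplification made
    for this illustration, not a claim from the excerpt.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    n_sites = 8064   # sites tested in the salmon's brain (from the excerpt)
    alpha = 0.001    # significance threshold, P < 0.001 (from the excerpt)

    print(f"Expected false positives: {expected_false_positives(n_sites, alpha):.3f}")
    print(
        "P(at least one false positive): "
        f"{prob_at_least_one_false_positive(n_sites, alpha):.4f}"
    )
```

Under these assumptions the expected count is 8,064 × 0.001, or about 8, and at least one false positive is almost certain. The stringent per-test threshold does little to protect against a spurious finding somewhere in the scan.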

Excerpt from: The Art of Statistics: Learning from Data by David Spiegelhalter

