💎 On data not having to be big to be useful (the sample is most important)

George Gallup, who essentially invented the idea of the opinion poll in the 1930s, came up with a fine analogy for the value of random sampling. He said that if you have a large pan of soup, you do not need to eat it all to find out if it needs more seasoning. You can just taste a spoonful, provided you have given it a good stir.

Excerpt from: The Art of Statistics: Learning from Data by David Spiegelhalter

💎 The power of framing (crime) statistics to change their impact

A classic example of how alternative framing can change the emotional impact of a number is an advertisement that appeared on the London Underground in 2011, proclaiming that ‘99% of young Londoners do not commit serious youth violence’. These ads were presumably intended to reassure passengers about their city, but we could reverse their emotional impact with two simple changes. First, the statement means that 1% of young Londoners do commit serious violence. Second, since the population of London is around 9 million, there are around 1 million people aged between 15 and 25, and if we consider these as ‘young’, this means there are 1% of 1 million, or a total of 10,000, seriously violent young people in the city. This does not sound at all reassuring. Note the two tricks used to manipulate the impact of this statistic: convert from a positive to a negative frame, and then turn a percentage into actual numbers of people.

Excerpt from: The Art of Statistics: Learning from Data by David Spiegelhalter
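The reframing arithmetic in the excerpt can be checked in a couple of lines. A minimal sketch, assuming the excerpt's rough figures (about 1 million Londoners aged 15-25, of whom 99% are non-violent):

```python
# Assumed figures from the excerpt: ~1,000,000 young Londoners,
# 99% of whom do not commit serious youth violence.
young_londoners = 1_000_000
non_violent_share = 0.99

# The negative frame: how many young people DO commit serious violence?
violent_count = round(young_londoners * (1 - non_violent_share))
print(violent_count)  # 10000
```

The number itself is unchanged; only switching the frame (99% vs 1%) and the unit (percentage vs head count) alters the impression.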

💎 On P-hacking leading to the most ridiculous conclusions (the emotions of dead salmon)

A classic demonstration was an experiment carried out by reputable researchers in 2009 which involved showing a subject a series of photographs of humans expressing different emotions, and carrying out brain imaging (fMRI) to see which regions of the subject’s brain showed a significant response, taking P < 0.001.

The twist was that the ‘subject’ was a 4lb Atlantic salmon, which ‘was not alive at the time of scanning’. Out of a total of 8,064 sites in the brain of this large dead fish, 16 showed a statistically significant response to the photographs. Rather than concluding the dead salmon had miraculous skills, the team correctly identified the problem of multiple testing – over 8,000 significance tests are bound to lead to false-positive results. Even using a stringent criterion of P < 0.001, we would expect 8 significant results by chance alone.

Excerpt from: The Art of Statistics: Learning from Data by David Spiegelhalter
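The expected-by-chance figure follows from the fact that, under the null hypothesis, p-values are uniformly distributed. A minimal simulation sketch (pure noise, not the original fMRI data) of 8,064 tests at the same threshold:

```python
import numpy as np

# Illustrative simulation, not the salmon study itself: run 8,064
# significance tests on pure noise and count how many clear P < 0.001.
rng = np.random.default_rng(42)
n_sites, alpha = 8_064, 0.001

# Under the null hypothesis, p-values are uniform on [0, 1].
p_values = rng.uniform(0, 1, n_sites)
false_positives = int((p_values < alpha).sum())

print(round(n_sites * alpha))  # 8 expected by chance
print(false_positives)         # observed count in this run, typically near 8
```

Any particular run will scatter around 8 (a Binomial(8064, 0.001) draw), which is why a handful of ‘significant’ brain sites in a dead fish is exactly what chance predicts.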

💎 On the danger of priming in surveys (beware inflated responses)

The responses to questions can also be influenced by what has been asked beforehand, a process known as priming. Official surveys of wellbeing estimate that around 10% of young people in the UK consider themselves lonely, but an online questionnaire by the BBC found the far higher proportion of 42% among those choosing to answer. This figure may have been inflated by two factors: the self-selected nature of the voluntary ‘survey’, and the fact that the question about loneliness had been preceded by a long series of enquiries as to whether the respondent in general felt a lack of companionship, isolated, left out, and so on, all of which might have primed them to give a positive response to the crucial question of feeling lonely.

Excerpt from: The Art of Statistics: Learning from Data by David Spiegelhalter

💎 Before believing a study take a look at its methodology (Ryanair satisfaction)

For example, in 2017 budget airline Ryanair announced that 92% of their passengers were satisfied with their flight experience. It turned out that their satisfaction survey only permitted the answers, ‘Excellent, very good, good, fair, OK’.

Excerpt from: The Art of Statistics: Learning from Data by David Spiegelhalter

💎 In our keenness to attribute success or failure to an intervention, we often forget that the change may well be a case of reversion to the mean (league tables)

It is not only sports teams that are ranked in league tables. Take the example of the PISA Global Education Tables, which compare different countries’ school systems in mathematics. A change in league table position between 2003 and 2013 was strongly negatively correlated with initial position, meaning that countries at the top tended to go down, and those at the bottom tended to go up. The correlation was -0.60, and some theory shows that if the rankings were complete chance and all that was operating were regression-to-the-mean, the correlation would be expected to be -0.71, not very different from what was observed. This suggests the differences between countries were far less than claimed, and that change in league position had little to do with changes in teaching philosophy.

Excerpt from: The Art of Statistics: Learning from Data by David Spiegelhalter
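The -0.71 figure comes from a simple fact: if the two years' scores are independent noise, then corr(initial, change) = corr(Y₁, Y₂ − Y₁) = −1/√2 ≈ −0.71. A minimal simulation sketch under that pure-chance assumption (the group count and number of runs are arbitrary illustrative choices, not PISA's):

```python
import numpy as np

# Pure-chance league tables: each country's score in each year is
# independent noise, so all movement is regression to the mean.
rng = np.random.default_rng(1)
n_countries, n_tables = 100, 5_000  # illustrative sizes

theory = -1 / np.sqrt(2)  # theoretical corr(initial score, change)
print(round(theory, 2))   # -0.71

corrs = []
for _ in range(n_tables):
    year1 = rng.standard_normal(n_countries)
    year2 = rng.standard_normal(n_countries)
    corrs.append(np.corrcoef(year1, year2 - year1)[0, 1])

print(round(float(np.mean(corrs)), 3))  # close to -0.71
```

Seeing an observed correlation of -0.60 against a pure-chance benchmark of -0.71 is what licenses the excerpt's conclusion that most of the movement in the table was regression to the mean rather than real change.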