💎 On the core problem of Big Data

“I am not saying here that there is no information in Big Data,” essayist and statistician Nassim Taleb has written. “There is plenty of information. The problem—the central issue—is that the needle comes in an increasingly larger haystack.”

Excerpt from: Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz

💎 People lie on surveys (and in their Netflix queues) both consciously and unconsciously

Netflix learned a similar lesson early on in its life cycle: don’t trust what people tell you; trust what they do. Originally, the company allowed users to create a queue of movies they wanted to watch in the future but didn’t have time for at the moment. This way, when they had more time, Netflix could remind them of those movies. However, Netflix noticed something odd in the data. Users were filling their queues with plenty of movies. But days later, when they were reminded of the movies on the queue, they rarely clicked. What was the problem? Ask users what movies they plan to watch in a few days, and they will fill the queue with aspirational, highbrow films, such as black-and-white World War II documentaries or serious foreign films. A few days later, however, they will want to watch the same movies they usually want to watch: lowbrow comedies or romance films. People were consistently lying to themselves.

Excerpt from: Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz