Imagine you polled a thousand people on the street and asked, “Where do you get your groceries?” When the results come in, the vast majority, over 90%, say “Acme.” Wow! So Acme is the preferred grocery store of 90% of Americans?
Wait just a minute, you say. Is that a representative sample?
You check the data and discover, shockingly, it is. Somehow the pollsters have collected a thousand responses that exactly mirror the demographic proportions of the country at large. Race, religion, gender, socio-economic status: it all perfectly reflects the broader social fabric of the country. So surely the polling is accurate!
And then you discover that the pollsters set up their station right outside of an Acme.
Back when a lot of polling was done by telephone, plenty of data was skewed or misinterpreted because people didn’t understand how “patterns of interacting with the telephone” would affect data that seemingly had nothing to do with the phone itself. A classic example from the era before cell phones: if you conduct all your phone polling during the day, the people home to pick up are disproportionately retirees, so the results skew enormously towards the views of the elderly.
Nowadays, polls are often conducted online. And oh boy, does that skew things, even if it’s still the best collection method overall.
But it gets even worse when you try to measure things that directly relate to being online.
“90% of Americans spend 10+ hours a day on their phone!” …say the results of a poll that you could only access if you were online enough to see it. People on nature hikes don’t answer the poll in the negative – they don’t answer at all.
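If you want to see the effect in numbers, here is a minimal simulation sketch. Everything in it is made up for illustration: the population, the uniform spread of phone hours from 0 to 14, and the assumption that your chance of seeing the poll is proportional to how much time you spend on your phone. The function name `simulate_online_poll` is hypothetical too.

```python
import random


def simulate_online_poll(population_size=100_000, seed=0):
    """Compare the true share of heavy phone users with the share an online poll sees."""
    rng = random.Random(seed)

    # Hypothetical population: daily phone hours spread uniformly from 0 to 14.
    hours = [rng.uniform(0, 14) for _ in range(population_size)]

    # Assumed response model: the more hours you spend on your phone,
    # the more likely you are to see (and answer) the poll at all.
    respondents = [h for h in hours if rng.random() < h / 14]

    # Share of people at 10+ hours a day, in the whole population
    # and among only those who answered.
    true_share = sum(h >= 10 for h in hours) / len(hours)
    polled_share = sum(h >= 10 for h in respondents) / len(respondents)
    return true_share, polled_share


true_share, polled_share = simulate_online_poll()
print(f"Whole population at 10+ hours: {true_share:.0%}")
print(f"Poll respondents at 10+ hours: {polled_share:.0%}")
```

Under these made-up assumptions, about 29% of the population is at 10+ hours, but the poll reports something close to 49%. Nobody lied and nobody miscounted. The light users just never showed up to be counted.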
When you use your little magic box to look out of your tiny bubble, you think you’re looking out at the world. But you’re actually just looking at a very slightly larger bubble. Keep it in mind.