
18.7 Always be a Data Skeptic — But not a Data Cynic


The more you work with data and statistics, the better you should become at separating fact from fiction. Knowledge gained over time gives you the perspective and experience that sharpen your ‘spidey’ senses regarding statistics and statistical models. That intuition is what helps you spot signs of obvious trouble, such as a regression model with an R-squared of 1 or a classification model with 100% accuracy.
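One common route to a suspiciously perfect fit is a predictor that is really the outcome in disguise, often called target leakage. The sketch below is only an illustration under invented assumptions: the data are simulated, and the names `spend`, `revenue`, `revenue_cents`, and the helper `r_squared` are hypothetical.

```python
import random


def r_squared(x, y):
    """R-squared of a simple least-squares line fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


random.seed(42)

# Simulated (invented) data: ad spend and a noisy revenue outcome.
spend = [random.uniform(0, 100) for _ in range(200)]
revenue = [50 + 3 * s + random.gauss(0, 40) for s in spend]

# A leaked predictor: the outcome itself, merely re-expressed in cents.
revenue_cents = [r * 100 for r in revenue]

honest_r2 = r_squared(spend, revenue)         # real signal plus noise
leaky_r2 = r_squared(revenue_cents, revenue)  # essentially a perfect fit

print(f"R-squared using spend:         {honest_r2:.3f}")
print(f"R-squared using revenue_cents: {leaky_r2:.3f}")
```

The honest predictor leaves some variance unexplained because the outcome is noisy. The leaked predictor explains essentially all of it, which is the cue to go back and check where each column came from.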

Familiarity with statistical sampling, and with survey techniques in particular, can put you on alert when headlines seem alarmist or patently misleading. By all means, be a skeptic: dig a bit deeper to find out how those insights were derived. When someone makes a bold claim such as “1 in 10 college students in Boston is homeless”1 it is often worth pulling the thread far enough to see what methodology was used, and whether the claim comes from an advocacy group or a partisan source.
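To see why the methodology behind a survey-based headline matters, consider a simulation of self-selection, where people affected by an issue are more likely to answer a voluntary survey about it. This is a sketch with invented numbers: the rates below are assumptions chosen for illustration and say nothing about any real study, including the one cited above.

```python
import random

random.seed(7)

POPULATION = 100_000
TRUE_RATE = 0.05             # assumed true share with the trait (invented)
RESPONSE_IF_AFFECTED = 0.40  # assumed response rate among affected people
RESPONSE_IF_NOT = 0.10       # assumed response rate among everyone else

# True means the person has the trait being measured.
has_trait = [random.random() < TRUE_RATE for _ in range(POPULATION)]

# Voluntary survey: affected people are more likely to respond.
respondents = [
    t for t in has_trait
    if random.random() < (RESPONSE_IF_AFFECTED if t else RESPONSE_IF_NOT)
]
survey_estimate = sum(respondents) / len(respondents)

# Simple random sample of the same size, for comparison.
random_sample = random.sample(has_trait, len(respondents))
random_estimate = sum(random_sample) / len(random_sample)

print(f"Assumed true rate:         {TRUE_RATE:.1%}")
print(f"Self-selected survey rate: {survey_estimate:.1%}")
print(f"Random-sample rate:        {random_estimate:.1%}")
```

Under these assumed response rates, the self-selected estimate comes out at more than three times the true rate, while the random sample of the same size stays close to it. The point is not that any particular headline is wrong, only that knowing who responded, and why, changes how much weight a number deserves.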

If you detect an egregious mistake or two, please do not let that experience disillusion you into thinking that everyone is out to deceive. Do not automatically view things from a negative perspective, because distrust forecloses positive outcomes from the start. Question everything, always – including our statements here – but do so without becoming reflexively cynical.2

1 https://www.wbur.org/cognoscenti/2018/05/17/hunger-and-homelessness-college-students-pam-eddinger-sara-goldrick-rab
2 Harford, Tim. The Data Detective. New York: Riverhead Books, 2021. Harford draws the distinction between skepticism and cynicism especially well, and he offers many other important insights about evaluating statistical claims.