Lies, Damned Lies, and Statistics

January 13, 2014

Every day we read statistics about everything under the sun: “Eating a bar of chocolate every day reduces your chance of dying of acute hydrophobia by 23%”, or “People who are more than 20 pounds overweight are 90% more likely to die during cataract surgery.” We see them, and if we like them we share them with our Facebook friends, and if we don’t we harrumph and click on something else. We can’t avoid them, because statistics is the killer tool for virtually all research involving human beings. Sometimes we wonder if they are true; and we are right to do so. Today I want to tell a cautionary tale.

I was a math/economics major in college, and I took quite a bit of statistics (my advisor was a statistician by training). One of the classes involved something called “regression analysis,” where you gather data on a bunch of variables and then fit a line through them. Suppose, for example, that you think you’ve found a relationship between the amount of honey produced in San Bernardino County in a given year and the number of cellular telephone towers. You can gather year-by-year numbers for each, and then fit a line to the data; and maybe the line says that for each new cell tower, honey production drops by five quarts. So far, so good. Regression analysis is a useful tool, and will absolutely give you the line that best fits the data.
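(For the curious, here’s a minimal sketch of that kind of fit in code. The honey and cell-tower numbers are invented for illustration, and the choice of Python and the statsmodels library is mine, not anything from the class.)

```python
# A minimal sketch of a simple linear regression, using invented numbers
# for the hypothetical honey / cell-tower example above.
import numpy as np
import statsmodels.api as sm

# Made-up yearly data: cell towers in the county, and quarts of honey produced.
towers = np.array([2, 4, 5, 7, 9, 12, 14, 15, 18, 20], dtype=float)
honey = 500 - 5 * towers + np.random.default_rng(0).normal(0, 10, towers.size)

# Fit honey = b0 + b1 * towers by ordinary least squares.
X = sm.add_constant(towers)          # add the intercept column
fit = sm.OLS(honey, X).fit()

print(fit.params)  # fit.params[1] is the estimated quarts lost per new tower
```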

But is it true? Does it mean anything? There are a number of statistical tests that you can do, tests that depend on the degree of error in your measurements. Everyone who uses regression analysis is taught to look at the “t-scores,” which measure whether each estimated coefficient stands out from the noise in the data; and if the t-scores are too low, you’ve got nothing.
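(Continuing the sketch above, with the same invented data: the t-scores come straight out of the fitted model. A rough rule of thumb is that a coefficient with |t| comfortably above 2 stands out from the noise, while a small |t| means the data can’t rule out a coefficient of zero.)

```python
# Reading the t-scores off the fitted regression (same invented data as above).
import numpy as np
import statsmodels.api as sm

towers = np.array([2, 4, 5, 7, 9, 12, 14, 15, 18, 20], dtype=float)
honey = 500 - 5 * towers + np.random.default_rng(0).normal(0, 10, towers.size)

fit = sm.OLS(honey, sm.add_constant(towers)).fit()

print(fit.tvalues)  # one t-score per coefficient (intercept, towers)
print(fit.pvalues)  # the corresponding p-values
# Rule of thumb: |t| well above ~2 suggests the coefficient is distinguishable
# from zero; a small |t| means you've got nothing.
```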

The final project in this class was to find a published journal article that used a regression model, and to try to replicate it by running the data ourselves. I found such a paper in an economics journal; and I contacted the author, whom I will call Dr. X. The good doctor was gracious enough to send me his data set. (This was before the web; he had to send me a big old tape through the mail.) I set up the regression problem using a software package called SHAZAM, ran it, and lo and behold! I got exactly the same results, right down to the t-scores, which were good and strong.

But here’s the problem. The validity of those t-scores depends on a number of statistical assumptions. If the assumptions are met, you can trust the t-scores. If they aren’t, the t-scores are meaningless. Our professor, Sue Feigenbaum, hammered the assumptions into our heads, and insisted that we check them. The first check was for a condition delightfully named “heteroskedasticity,” which I can explain if anyone wants me to. It’s an obvious test to do, and I did it; and lo and behold! The model exhibited rampant heteroskedasticity, and all of the pretty results were suspect. So I corrected for it, which is often possible; and lo and behold! All of the pretty results were gone. Nothing. Nada. Zilch. In short, the paper was a travesty, proving nothing, and adding nothing to the sum total of human knowledge.
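(Briefly: heteroskedasticity means the errors in the model aren’t equally noisy across observations, and when that happens the usual t-score calculations can’t be trusted. The paper’s data and Dr. X’s SHAZAM setup aren’t reproduced here; this is just a sketch, on invented data, of one standard check, the Breusch-Pagan test, and one common response, heteroskedasticity-robust standard errors. Other corrections, such as weighted least squares, are also used.)

```python
# A sketch of testing for heteroskedasticity and re-estimating with robust
# standard errors. Data are invented; the original analysis used SHAZAM.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
towers = np.linspace(1, 20, 40)
# Error variance grows with the number of towers -> heteroskedastic by construction.
honey = 500 - 5 * towers + rng.normal(0, 2 * towers)

X = sm.add_constant(towers)
fit = sm.OLS(honey, X).fit()

# Breusch-Pagan test: a small p-value is evidence of heteroskedasticity.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(fit.resid, X)
print("Breusch-Pagan p-value:", lm_pvalue)

# One common correction: heteroskedasticity-robust (HC1) standard errors,
# which can change the t-scores substantially when the usual assumptions fail.
robust = sm.OLS(honey, X).fit(cov_type="HC1")
print(robust.tvalues)
```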

I don’t know whether Dr. X checked for heteroskedasticity or not; his paper didn’t say. But either he didn’t check, or he checked and then ignored the problem. I’d prefer to think the former; but I’ll note that this article was based on Dr. X’s doctoral dissertation, which he surely had to defend before a panel of his predecessors; and his professors didn’t catch the error either.

I’d promised to send him a copy of my write-up; and I spent an hour or so trying and failing to write a cover letter that didn’t accuse him of being either a fool or a liar. Eventually I gave up, shoved a copy of my paper in an envelope, and sent it to him with no cover letter. (I never heard back from him, unsurprisingly.)

A couple of years later, while getting my master’s at Stanford University, I had to take a class on how to use a particular statistics package that was available for student use. The class taught statistical methods as a cookbook: if you’ve got this kind of problem, you put the data in here, turn the crank, and get your results out. There was no discussion of correctness, so far as I recall; the presumption was that everything was just going to work. The text, and the instructor, recommended to us several techniques for “getting a good fit” that Sue Feigenbaum had specifically warned us about because they biased the results.

I’d like to think that Dr. X had learned his statistical techniques in a class like this one rather than a class like Sue Feigenbaum’s. And I think of all of the sociologists, psychologists, and research physicians who publish studies every year, and I ask myself, how many of them learned statistics as a cookbook? They aren’t statisticians; most of them probably learned just enough to get by. And of those who do know better, how many are above overlooking problematic assumptions if it means getting published?

Using statistics is hard. Even when you know precisely what you are doing, it’s easy to fool yourself, as Michael Flynn has noted on his blog. If you’re using a cookbook, you don’t even need to fool yourself. And if you’re dishonest, all bets are off.

And just from reading a study in a journal, or (much worse) in a popular publication, you have no way of knowing whether the statistics were done right or not. My guess: they probably weren’t.

