How to Think Critically VI: Bayes' Rule

You’ve just been to the doctor, and she has some bad news. There’s a deadly new disease sweeping the population – one which strikes 1 out of 100 people and invariably kills everyone who catches it. Medical science has developed a test that is 95% accurate during the incubation period: that is, when given to someone who has the disease, it correctly returns a positive result 95% of the time, and when given to someone who does not have the disease, it correctly returns a negative result 95% of the time. Your test results have just come back positive. Are you doomed?

Before I say anything more, a simple exercise: Based on the facts I just gave, estimate the probability that you actually have the disease. A detailed calculation isn’t necessary – just write down what you think the general neighborhood of the number is.

So, a 95% accurate test has indicated that you have this 1-in-100, invariably fatal disease. Should you feel terrified? Should you feel despair? Should you call your lawyer and start making out your will?

Nope. In fact, you should be optimistic. The probability that you actually have the disease is only about 16%.

Did you write down 95%, or something in that range? If so, you’re probably not alone. That’s the common answer. But without an education in statistics, human beings are not very good at calculating conditional probabilities off the cuff. To see where your unexpected reprieve came from, we have to consider a famous statistical principle called Bayes’ rule.

Named for its discoverer, the 18th-century mathematician Thomas Bayes (also a Christian minister, ironically), Bayes’ rule is a theorem for calculating conditional probabilities. In other words, if you have two possible events, A and B, each with its own prior probability, and you know the probability of B given A, Bayes’ rule is the formula for determining the probability of A given B.

This may sound abstract, so let’s frame it in terms of concrete events rather than variables. Let’s say that A is the event that you have the deadly disease I mentioned earlier. Its probability, P(A), is 1 out of 100, or in other words, 1%. We can write that as 0.01 for convenience.

Let’s also say that B is the event that your test results came back positive. Its probability is a little more complicated to calculate, but we’ll come back to it. We know the probability of B given A – given that you actually have the disease, the probability of a positive test is 95%, or 0.95. What you really want to know is the probability of A given B – the probability that you actually have the disease, given a positive test result.

Here’s what Bayes’ rule says (read “|” as “given”):

P(A|B) = ( P(B|A) * P(A) ) / P(B)

Let’s fill this out with some numbers. P(A), the probability of having the disease, is 0.01. P(B|A), the probability of a positive test result given that you have the disease, is 0.95. That gives us:

P(A|B) = ( 0.95 * 0.01 ) / P(B)

What we need to know is P(B), the overall probability of a positive test result. To figure this out, let’s break it down into cases.

There are two ways to get a positive result: either you have the disease and the test correctly says so, or you don’t have the disease and the test wrongly says you do. P(B) is the sum of the probabilities of these two cases:

1% of people actually have the disease, and 95% of those will test positive. That gives us:

0.01 * 0.95 = 0.0095

99% of people do not have the disease, and 5% of those will test positive. That gives us:

0.99 * 0.05 = 0.0495

Adding up these terms, we get an overall P(B) of:

0.0095 + 0.0495 = 0.059, or 5.9%

Now, put that term into our equation:

P(A|B) = ( 0.95 * 0.01 ) / P(B)

P(A|B) = ( 0.95 * 0.01 ) / 0.059

P(A|B) = ( 0.0095 ) / 0.059

P(A|B) = 0.161, or about 16%
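
If you want to check that arithmetic yourself, here is a minimal Python sketch of the same calculation (the variable names are mine, chosen only for readability):

p_disease = 0.01             # P(A): prior probability of having the disease
p_pos_given_disease = 0.95   # P(B|A): chance of a positive test if you are sick
p_pos_given_healthy = 0.05   # chance of a (false) positive if you are healthy

# P(B): overall probability of a positive test, adding the two cases above.
p_positive = (p_disease * p_pos_given_disease
              + (1 - p_disease) * p_pos_given_healthy)

# P(A|B): Bayes' rule.
print(p_pos_given_disease * p_disease / p_positive)  # 0.1610..., about 16%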

It seems like mathematical sleight of hand, but it’s not. A more intuitive way to explain this result is this: the test is highly accurate, but the disease is rare. Therefore, the vast majority of people who are tested won’t actually have it – and the number of false positives from that group, though small compared to the size of that group, is larger than the relatively small number of people who actually have the disease and correctly test positive.
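
If the algebra still feels like a trick, simulation may be more convincing. This short Python sketch (my own illustration; the population size is arbitrary) gives the disease to 1% of a million simulated people, tests everyone with 95% accuracy, and reports what fraction of the positive results are genuine:

import random

random.seed(42)  # fix the seed so the run is reproducible
true_positives = false_positives = 0
for _ in range(1_000_000):
    sick = random.random() < 0.01  # 1 in 100 people have the disease
    positive = (random.random() < 0.95) if sick else (random.random() < 0.05)
    if positive:
        if sick:
            true_positives += 1
        else:
            false_positives += 1

print(true_positives / (true_positives + false_positives))  # hovers around 0.16

With a million trials, the simulated fraction should land within a fraction of a percentage point of the exact 16.1% answer.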

Bayesian reasoning turns up in critical thinking contexts as well. One of the best examples is in criminal trials, where both sides often make claims about the odds of a particular outcome occurring given the defendant’s guilt or innocence. For example, let’s assume that DNA is found at a crime scene, the police cross-check it against a database, and a match is found. The odds that the crime scene DNA would match a randomly selected person’s DNA are 1 in 10 million. The police therefore arrest the person whose DNA matches and haul him into court. But the odds of the suspect’s innocence are not 1 in 10 million, even though those are the odds of a match. In fact, in a country the size of America, with about 300 million people, we would expect about 30 people in the populace to match. Thus, on the DNA evidence alone, the chances that the specific person being accused is actually the guilty party are only 1 in 30, or about 3%! (This mistake is sometimes called the prosecutor’s fallacy.)
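
To make the arithmetic behind that 1 in 30 explicit, here is a minimal sketch (assuming, as above, a population of 300 million, a 1-in-10-million match rate, and no other evidence pointing to the suspect):

population = 300_000_000
match_rate = 1 / 10_000_000  # chance a randomly chosen person's DNA matches

# With no other evidence, every matching person is an equally likely suspect.
expected_matches = population * match_rate
print(expected_matches)      # 30.0
print(1 / expected_matches)  # ~0.033, i.e. about 1 chance in 30 of guilt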

The lesson to be learned here is that we should never let single cases or anecdotes guide our decisions. Given a sufficiently large pool of chance events, even very unlikely things are bound to happen on occasion. Bayes’ rule gives us the tools to see those occurrences in context. On the other hand, people who pay attention only to unique and striking events, while disregarding the background they come from, are almost certain to reach incorrect conclusions.


  • TJ

    Nice explanation. Very few people even begin to understand Bayesian reasoning–especially the prosecutor’s fallacy. Even though I understand the math completely, it’s almost impossible to intuitively grasp that a 1 in 10M chance match maps to a 1 in 30 chance of having the right person (out of all US citizens) until you’ve rolled it around in your head for a good while. Of course, 1 in 30 is pretty good when you toss in more geographic limitations, motive, opportunity, etc. But 1 in 10M is not a slam dunk. (1 in 10B would be a clincher, I suppose, since there aren’t yet 10B of us on the planet.)

  • http://stargazers-observatory.blogspot.com/ Stargazer1323

    Hooray for math!

    I have always felt that statistics should be given equal, or possibly greater, weight in all basic math instruction because it is so important when one wants to verify information like the situation described above. I took several optional statistics classes while I was in school, and since then I have always been very cautious about accepting any numbers given at face value. Any time a new statistic on diseases, death rates, political polls, etc., comes up in the news, I always ask myself where they got the numbers, how representative their sample is, and what percentages they are using to prove their point, and the frequent discrepancies have made me realize that the world is a much less dangerous place than the media generally makes it out to be. Numbers can be very easily manipulated to scare people, but if more people recognized that things such as Bayes’ Rule exist and could take those types of calculations into consideration, the power of others to spread fear through statistics would be greatly diminished.

  • hrd2imagin

    I need to brush up on my statistics; I used to know all of that stuff back in college. It’s a shame that our human minds can only remember so much.

  • Christopher

    I was never all that great in math (I only scored a “C” in college statistics), but I’ve always known that numbers can be played with to yield just about any result that one desires them to yield…

  • Valhar2000

    Christopher, you are correct, but only if you do not exclude fallacious manipulations of numbers. If you are required to be rigorous and correct in your manipulation of statistics, fudging them is a lot harder. The problem is, indeed, to be able to tell who fudged them and how. That can be tricky.

  • spaceman spif

    I bounced this one around my li’l noggin for a bit this morning. My first thought was “The accuracy of the test, versus the number of people who have the disease, are two different, independent values. If a test that is 95% accurate says you have the disease, it doesn’t matter how many other people have the disease. The test itself does not care about that. The only number that matters is how accurate that test is. If a bunch of my friends have the disease, that doesn’t change the fact that the test is right 95% of the time, and it said I have it, therefore it’s 95% likely I have the disease.”

    Then I decided to play with the numbers in my head a bit. If you stretch the number of people who have the disease all the way out to 100 out of 100, then the 95% accuracy rate of the test is meaningless…and I realized the numbers aren’t so independent after all! Interesting!

  • http://nesoo.wordpress.com/ Nes

    I had heard of Bayesian reasoning before, but I had either forgotten to look it up or was confused when I did (I don’t remember which). This, however, was a very helpful and understandable explanation. Thanks!

  • Samuel SKinner

    The Cartoon Guide to Statistics. Like Ebonmuse’s post, but harder to understand (math-wise, at least). However, it covers everything.

  • rob

    OK, I get the prosecutor’s fallacy. Somehow that makes perfect sense. But I can’t get around the idea that spaceman spif brings up. Somehow it makes perfect sense that if more than half of all people have the disease, that could increase the likelihood that you have the disease even after taking a test that was, say, 10% accurate. If your chances of having the disease were 1/2 when you went in to the doctor’s office, then the only way the test could make it more likely than that that you have the disease would be if the test was more than 50% accurate.

    Put another way, let’s say 1/100 people have a virus, but amongst people with blue eyes the rate is 90/100. The rate of the disease among brown-eyed people is irrelevant to you, because you have blue eyes. Likewise, amongst people who have taken the test 95/100 of the group who tested positive have the disease. You are now part of this group, so the frequency of the disease amongst people who have not taken the test is irrelevant to you.

    That is how it feels instinctively to me.

  • CCSea

    1 in 30 chance? Pshaw! The problem is your presumption that it is equally likely for every American to be tried for any given crime. But this is clearly not the case; on average, people being tried for crimes turn out to be the real perpetrators more often than, say, the people sitting in the jury box or people who were far away at the time of the crime.

    So let’s assume conservative a priori odds that the person chosen by prosecutors for trial has a 1% chance of being guilty. In other words, prosecutors get it wrong 99 times out of a hundred. But we admit that prosecutors sometimes do their jobs and identify criminals that should be tried.

    If you use a test that is only right 99,999 times out of 100,000 (clearly, an astronomically weaker test than DNA matching), you get Bayesian odds that a positive test indicates guilt more than 99% of the time.

    Clearly, DNA tests are extremely useful in criminal cases. It is absurd to argue that they are wrong 29 times out of 30.

    Aargh!
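
For what it’s worth, CCSea’s arithmetic does check out under those stated assumptions. A quick sketch; the 1% prior and the 1-in-100,000 error rate are CCSea’s hypothetical figures, not real data:

p_guilty = 0.01                       # assumed prior: 1% of defendants are guilty
p_match_given_innocent = 1 / 100_000  # assumed rate of false matches
p_match_given_guilty = 1.0            # assume a guilty person always matches

p_match = (p_guilty * p_match_given_guilty
           + (1 - p_guilty) * p_match_given_innocent)
print(p_guilty * p_match_given_guilty / p_match)  # ~0.999: guilt is very likely

Whether a 1% prior for guilt is reasonable is a separate question; the Bayesian machinery itself works the same either way.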

  • spaceman spif

    I crunched the numbers, and by cracky it does work, rob!

    I took a sample population of 10,000 people. 1% of them have the disease, so 100 people have the disease, while 9900 do not.

    Of the 100 who have the disease, 95 will test (true) positive, while 5 will test (false) negative.

    Of the 9900 who do not have the disease, 9405 will test (true) negative, while 495 will test (false) positive.

    So we have 95 true positives + 495 false positives = 590 total positive tests.

    The odds you are a true positive is 95/590 = 0.161….16%!

  • spaceman spif

    (edit to add: I know I basically restated what Ebon did in his calculations, but sometimes spelling it all out with whole numbers and descriptive phrases helps “bring it home”)

  • rob

    I am clearly interpreting “95% accurate” to mean something it does not, which is that of all positive cases, 95% are actually sick. That would mean that if 1/100 people were actually sick, the maximum number of positives in a population of 10000 would be about 101, and only one of them would be wrong.

    Now, unfortunately, we get into real world specifics. What is meant by “95% accurate” would depend on the nature of the test. For example, if 95% of the people who have gum disease have heart disease, you could say that having gum disease is a 95% accurate test for heart disease. However, not having gum disease would be completely meaningless – you could still have heart disease, with chances equal to whatever the rate of heart disease is amongst people who do not have gum disease.

    That is, if I get it now, which I am not sure I do.

  • http://www.altoonaatheist.blogspot.com Altoona Atheist

    Stats boggles my brain, my eyes went crossed. But great explanation, I’ll keep this in mind!

  • http://elliptica.blogspot.com Lynet

    The lesson to be learned here is that we should never let single cases or anecdotes guide our decisions. Given a sufficiently large pool of chance events, even very unlikely things are bound to happen on occasion. Bayes’ rule gives us the tools to see those occurrences in context.

    Yes, yes — but go further with that point! Suppose X happened. There are two possibilities. Either X was due to a miracle (M) or X happened “by chance” due to natural causes (N). However, if the probability that X happened due to natural causes is 1%, that does not mean there is a 99% chance that it was a miracle! This is the same fallacy as above, yes? The fallacy is that of assuming that P(M|X)=1-P(X|N). In fact, it would be true that P(M|X)=1-P(N|X). Presumably, our minds get confused between the two.

    If we really want to find P(M|X), we should use Bayes’ rule.

    P(M|X)=P(X|M)*P(M)/P(X)

    P(M) is small according to nearly everyone. P(X|M) — the probability that, if a miracle occurs, it will cause X to happen — is rather difficult to determine, but multiplying by it can only make the number the same size or smaller, anyway. The only way to make this number large is if you think P(X) is of comparable magnitude to P(M). In other words, if X is as unlikely as a miracle, you might have a case.

    I realise this won’t solve things entirely; we would need to make other arguments given that people do disagree on the exact prior probability of a miracle. It does put things into perspective, though, doesn’t it?
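
To put rough numbers on Lynet’s argument, here is a sketch with purely hypothetical probabilities: a one-in-a-million prior for a miracle, a coin-flip chance that a miracle would produce X, and a 1% chance of X by natural causes:

p_miracle = 1e-6          # P(M): prior probability of a miracle (made up)
p_x_given_miracle = 0.5   # P(X|M) (made up)
p_x_given_natural = 0.01  # P(X|N) (made up)

# P(X) by total probability; P(N) = 1 - P(M) is essentially 1 here.
p_x = p_x_given_miracle * p_miracle + p_x_given_natural * (1 - p_miracle)
print(p_x_given_miracle * p_miracle / p_x)  # ~5e-5: nowhere near 99%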

  • Entomologista

    I’ve had a little bit of Bayes because I was lucky enough to take a class called “Systematics and the Comparative Method” during my MS. Even though I’ve seen this equation and applied Bayesian analysis to data sets, it still isn’t easy. So much of statistics just makes no intuitive sense, which I suppose is why it’s important. We didn’t cover Bayes at all in Stat 801, so I’m hoping it will be covered at some point in 802 this semester.

  • http://wilybadger.wordpress.com Chris Swanson

    Ugh… math… *eyes cross, Chris falls over, injures self, files lawsuit against site for damages, gets laughed out of court by a prosecutor with a small phallusy*

  • http://www.daylightatheism.org/ Ebonmuse

    CCSea:

    Clearly, DNA tests are extremely useful in criminal cases. It is absurd to argue that they are wrong 29 times out of 30.

    It’s not that the DNA test is “wrong”. It’s just that the number it returns doesn’t necessarily indicate what most people intuitively think it does. Granted, if you match, it does place you in a relatively small and select group of possibly guilty individuals. But it would be incorrect to argue that any one member of that group is more likely than any other to be the guilty party.

    Of course, if the prosecutors have other evidence in addition to DNA, then these calculations become largely irrelevant. But that’s not the point I was trying to make here. The point is that it’s fallacious to judge someone guilty on statistical evidence alone. There was a case in Britain, I believe, a few years back where a woman was convicted of murder after three of her children suffered SIDS, though there was no specific evidence of homicide. The prosecutor argued that it was far too unlikely that three infant deaths in a row in the same family could be anything other than malice. But neither he nor the jury ever considered the question of how many families we’d expect that to happen to just by chance.

  • spaceman spif

    So if there are statistical problems in DNA testing and matching and it’s not always conclusive….then what’s Maury Povich gonna do?????

  • http://thegreenbelt.blogspot.com The Ridger

    @rob: you’re ignoring the 5% false positive rate, which means that 5% of all the people who DON’T have the disease test positive. That means of your 10,000 people, 100 of whom are sick, you have 9,900 people who aren’t sick and 5% of them test positive. So 95 people who are sick test positive, but 495 people who AREN’T sick also test positive. Genuinely sick people are therefore 95 out of 590…

  • Alex, FCD

    So if there are statistical problems in DNA testing and matching and it’s not always conclusive….then what’s Maury Povich gonna do?????

    Maury hasn’t got anything to worry about, actually. Different statistics apply to paternity tests. Everybody gets exactly half of their genome each from their mother and their father, so if you have a sample of a person’s DNA, then half of the genetic markers will be identical to something in the mother’s genome, and half will be identical to dad’s (ignoring mutations). Since the identity of the mother is generally not in question, for obvious reasons, you can just cancel out the mother’s genetic markers and see whether dad has something to match the rest. It’s possible that two people fit the father markers, but if you have a small sample (say, all the men that the mother had sex with nine-ish months before the kid was born), it’s not likely to come up.

  • Alex Weaver

    My head feels too much like it’s full of marshmallow with crossbow bolts lodged in my inner ears to analyze this, but it seems to lend credence to my occasional quip that “Numbers don’t lie. …on their own. Which is why we pay statisticians.”

  • Steve Bowen

    There was a case in Britain, I believe, a few years back where a woman was convicted of murder after three of her children suffered SIDS, though there was no specific evidence of homicide. The prosecutor argued that it was far too unlikely that three infant deaths in a row in the same family could be anything other than malice. But neither he nor the jury ever considered the question of how many families we’d expect that to happen to just by chance.

    Actually the fallacy here was the assumption that the events were statistically independent. Say the probability of SIDS is 1/1000; then three deaths would be 1/1000 x 1/1000 x 1/1000, or 1/1,000,000,000. In fact, if you allow that there could be genetic abnormalities that contribute to SIDS, it is actually quite likely that more than one event could happen in a family, i.e. the events are not statistically independent.

  • http://www.kellygorski.com Kelly

    Whew. I’m glad I’m not alone. I thought along the same lines as spaceman spif did.

    Now I feel better. :)

    Talk about relativity…

  • J Myers

    Actually the fallacy here was the assumption that the events were statistically independent.

    Actually, that’s just one of the errors here; there may be other risk factors as well. Ebonmuse pointed out the most egregious error, though: long odds simply do not constitute evidence. Any event having a non-zero probability is almost certain to occur given enough opportunities. In the absence of positive evidence of parental misconduct, the a priori probability of three infants in the same family dying of SIDS is simply irrelevant if that’s what actually happened.

  • Steve Bowen

    There was a case in Britain, I believe, a few years back where a woman was convicted of murder after three of her children suffered SIDS, though there was no specific evidence of homicide.

    Just for completeness, the person in question was Sally Clark, a lawyer who was convicted of killing two of her children, who apparently died from SIDS. The expert witness was subsequently pilloried for misrepresenting the stats. Sadly, although Sally Clark was acquitted in 2003, she has since died.

  • bkspecial

    It’s refreshing to see people discuss critical thinking. So often, especially in America, it seems we embrace blind faith and question those who dare to challenge it.

  • Chet

    Just by estimation I guessed 1 in 5 (20%). A little high, but if you figure that, in a sample of 100 people, 1 has the disease but 5 people have positive results, the odds that any one of those five actually has the disease is less than 1 in 5 (because of the small possibility that the infected individual got a false negative).

    Of course, having a one-in-five chance of dying sometime in the next year might be, to some people, enough reason to draft a will anyway.

    So, follow-up question. How reliable would the test have to be before a positive result would give us 95% confidence that the testee was infected with the 1-in-100 disease?

  • J Myers

    How reliable would the test have to be before a positive result would give us 95% confidence that the testee was infected with the 1-in-100 disease?

    99.947%

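For anyone wondering where that 99.947% figure comes from, here is a sketch of the algebra (assuming, as in the post, that the test’s true positive and true negative rates are the same number p):

# Solve P(disease | positive) = 0.95 for the accuracy p, with a 1% base rate:
#   0.01*p / (0.01*p + 0.99*(1 - p)) = 0.95
# Expanding and collecting terms gives 0.9410*p = 0.9405.
p = 0.9405 / 0.9410
print(p)  # 0.99946..., i.e. about 99.947%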

  • http://deconbible.blogspot.com bbk

    To me, this is one of several examples in statistics where the hard-wired strategic mechanisms in our brains just don’t work very well. It’s not really statistics that’s the problem, but us. Move away from the statistical proof of Bayes’ rule and just start applying it to some everyday examples to see what I mean.

    Take the big question: is Christianity overall good for us or overall bad for us? A Christian will look around and say that 95% of the Christians are good, therefore it’s overall good. An atheist will look around and see that 95% of the bad people are Christians, so it’s overall bad. Obviously, both claims can be true at the same time, so how do we make sense of it? Well, it doesn’t really matter so much if 95% of Christians are good when only 1% of all people are bad. If Christians hardly ever encounter bad people in their daily lives, they think it’s because Christianity is working and so it must be a good test of goodness. And they’re just plain wrong. It gets really bad when one of these 1%-ers comes along, like Bush or Pat Robertson, and they all give these bad ones unconditional support because they passed the Christianity litmus test. Guys like Bush, even guys like Hitler, have all been helped immensely because people suck at applying Bayes’ rule to their daily lives. Instead, people actually need to be taught it and even then it’s hard to understand the implications.

  • Stuart

    But 1 in 10M is not a slam dunk. (1 in 10B would be a clincher, I suppose, since there aren’t yet 10B of us on the planet.)

    Well, that is better, but it isn’t a clincher, depending on the question being asked – consider the birthday question: how many people do you need to put in a room before the odds are better than even that two of them share a birthday? Most people (with no prior knowledge of the problem) will answer 183 or something like that, when it is actually around 23 (see the sketch just after this comment).

    You also have to factor in human effects – if the same lab tested the DNA sample from the crime scene and the suspect then the (hopefully very small) chance of a mistake in labeling samples or results, contamination, etc. could well outweigh the theoretical accuracy of the test – consider some problems with fingerprint evidence in practice (Shirley McKie for example).

    Generally things like DNA evidence should be the final nail in a case – find a suspect based on other evidence alone, then the final confirmation would be a DNA match. Using DNA or fingerprints as a wide-ranging net to try and start an investigation is prone to errors (especially as databases get larger and multiple forces/institutions merge them together).
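
As for the birthday question Stuart mentions, the figure of 23 is easy to verify with a few lines of Python (a sketch that multiplies out the probability that every birthday in the room is distinct):

p_all_distinct = 1.0
n = 0
while 1 - p_all_distinct <= 0.5:  # stop once a shared birthday is more likely than not
    n += 1
    p_all_distinct *= (365 - (n - 1)) / 365
print(n)  # 23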

  • http://www.someareboojums.org/blog jre

    This is a superb, concise description by example of Bayes’ theorem. Bravo!