Sensitivity and specificity

From The Jolly Contrarian

In which JC goes yet further down the wormhole in the spacetime continuum into the profoundly counterintuitive world of statistics and probabilities.

You will hear, in the context of medical tests, the expressions base rate, accuracy, specificity and sensitivity thrown about. They are related, quite similar, but looseness with them can lead to wildly mad conclusions, so it is as well to know them.

Right. Sensitivity is the true positive rate in a sample: the proportion of actual positive cases that are correctly identified. For example, if 100 sick people test for an illness and 95 of them test positive (therefore 5 of them falsely test negative), the sensitivity of that test is 95%.

The specificity is the true negative rate in the sample: the proportion of actual negative cases that are correctly identified as negative. For example, if 100 healthy people test for an illness and 98 of them test negative for it (and therefore 2 of them falsely test positive) the specificity of that test is 98%.

Putting it together: If 200 people, half of them healthy and half of them sick, test for an illness and 95 of the 100 sick people test positive, while 2 of the 100 healthy people test positive, the test has a sensitivity of 95% and a specificity of 98%.
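The arithmetic for the worked example above can be sketched like so (the variable names are illustrative, not from any particular library):

```python
# Confusion-matrix arithmetic for the 200-person example:
# 100 sick people and 100 healthy people take the test.
true_positives = 95   # sick people who correctly test positive
false_negatives = 5   # sick people who wrongly test negative
true_negatives = 98   # healthy people who correctly test negative
false_positives = 2   # healthy people who wrongly test positive

# Sensitivity: of the actually sick, what share tests positive?
sensitivity = true_positives / (true_positives + false_negatives)

# Specificity: of the actually healthy, what share tests negative?
specificity = true_negatives / (true_negatives + false_positives)

print(f"Sensitivity: {sensitivity:.0%}")  # 95%
print(f"Specificity: {specificity:.0%}")  # 98%
```

Note that each number is a proportion of the *actual* condition of the people tested — sensitivity only ever looks at the sick, specificity only ever looks at the healthy.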

Positive and negative predictive values

This is all well and good but can be mightily upset by base rates as we have seen elsewhere: a sensitivity and specificity of 99.9% is all well and good, but if the base rate of the illness in the population is one in a million (that is, 0.0001%) then we can still expect almost every positive result our test yields to be a false positive — roughly 999 times out of 1,000.

In a population of a million people with a base rate of 1 in a million, we should expect only one person to have the illness; the other 999,999 in the sample will not have it. Now apply a test with 99.9% sensitivity and specificity — that is, a 1-in-a-thousand risk of generating a false negative or a false positive:

True positives: 0.999 × 1 ≈ 1 person — the one sick person, so false negatives are unlikely.

False positives: 0.001 × 999,999 ≈ 1,000 people.

In other words we would expect 1,001 positive tests, but only one of them would be a true positive.

So, of all positive test results, the chance it’s a true positive = 1/1,001 ≈ 0.1%; the chance it’s a false positive = 1,000/1,001 ≈ 99.9%.
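The same calculation — in effect the positive predictive value — can be run end to end as a quick sanity check (again, just a sketch of the arithmetic in the text, not any standard routine):

```python
# Positive predictive value with a 1-in-a-million base rate
# and a test with 99.9% sensitivity and 99.9% specificity.
population = 1_000_000
base_rate = 1 / 1_000_000
sensitivity = 0.999
specificity = 0.999

sick = population * base_rate           # ≈ 1 person actually has it
healthy = population - sick             # ≈ 999,999 do not

true_positives = sensitivity * sick             # ≈ 1 person
false_positives = (1 - specificity) * healthy   # ≈ 1,000 people

# Of all positive results, what share is genuinely positive?
ppv = true_positives / (true_positives + false_positives)

print(f"Positive tests: {true_positives + false_positives:.0f}")
print(f"Chance a positive result is true: {ppv:.2%}")
```

Run it and the positive predictive value comes out at about 0.1% — which is the counterintuitive punchline: a 99.9% accurate test whose positive results are wrong 99.9% of the time.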

Counterintuitive, huh?