A frequentist asks "given my hypothesis, how likely is my data" - they calculate \(P(D \mid H)\).
Much of the criticism of frequentism arises from the fact that most of the time this is the
wrong question to ask. What we really want to know is \(P(H \mid D)\) - how likely the hypothesis is,
given the data! Unfortunately, many scientists do not appreciate this subtlety. The result is that
confusion arises, and scientists publishing in Nature are surprised to find that 20% of published
results based on P-values smaller than 0.05 are probably wrong.
Remember - a P-value is the chance of getting data at least as extreme as mine, assuming my
hypothesis is true. When testing a drug, we may start with the null hypothesis that the drug does
not work. A small P-value means the data were unlikely to arise given that hypothesis. It is
tempting to conclude that the hypothesis was wrong, and the drug does work.
We will see that this is a serious mistake, which arises from a failure to compare the small
Likelihood of my data with the probability that my hypothesis is true in the first place. I
illustrate the problem with a common thought experiment.
Bayes Theorem and Medical Tests
Suppose I am worried about my health and fear I have Rabies. I go for a test. My null hypothesis
is that I do not have Rabies. The test is positive. The Likelihood that the test is positive,
given that I do not have Rabies is 1%. What is the chance I have Rabies after all?
If you answered 99% you have fallen foul of the 'base rate fallacy', also known as the
'prosecutor's fallacy'. To answer the question correctly, let's imagine testing the whole
population of Sheffield, 60 000 people. Let's say we know that, on average, 0.1% of people have
Rabies at any time. Prior to taking the test, this is my best guess for the chance that I have
the disease. We say the prior probability of having Rabies is 0.1%. In Sheffield then, at any
one time 60 people have Rabies, and 59 940 people do not. When we test the whole population, 1%
of these 59 940 people will receive a positive test. That's roughly 599 people who falsely test
positive. If all the people who do have Rabies also test positive, that's 659 positive tests in total.
Let us ask this question - given I have had a positive test, what are the chances I have Rabies?
Well, out of all 659 positive tests, 60 of them are genuinely sick people. Therefore the
conditional probability that I have Rabies, given a positive test is \(60 / 659 \approx 9\)%. I
am much more likely to be well than not - even after a positive test!
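The counting argument above can be sketched as a short calculation (a minimal sketch; the population size, prevalence, and false-positive rate are the figures from the Sheffield example):

```python
# Base-rate fallacy by counting: test a hypothetical population of 60,000.
population = 60_000
prevalence = 0.001           # prior probability of having Rabies
false_positive_rate = 0.01   # P(positive test | no Rabies)

sick = population * prevalence                    # 60 people
healthy = population - sick                       # 59,940 people
false_positives = healthy * false_positive_rate   # roughly 599 people
true_positives = sick  # we assume the test always detects Rabies

# Of everyone who tests positive, what fraction is genuinely sick?
p_sick_given_positive = true_positives / (true_positives + false_positives)
print(f"P(Rabies | positive test) = {p_sick_given_positive:.1%}")
```

Running this prints a probability of about 9%, matching the count of 60 genuinely sick people out of roughly 659 positive tests.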
The confusion arises because although the Likelihood of a positive test was small, we were
testing a hypothesis which was itself very unlikely. This is why it is so important to compare
the Likelihood with the prior probability of your null hypothesis. It is why tests using the
Likelihood can be so misleading. It is for this reason that particle physicists required such a
high threshold (5\(\sigma\)) for claiming the detection of the Higgs boson. It is also the
reason why so many scientific studies with apparently significant p-values are wrong - the
claims they were making were very unlikely, and so needed extraordinary levels of evidence. As
well as its impact in science, ignorance of the base rate fallacy has widespread ramifications
in society. People have spent years in jail due to a failure to understand this point. We shall
now look at how to fix this problem using Bayes Theorem.
Recall Bayes Theorem states that
\[P(B \mid A) = \frac{P(A \mid B) \, P(B)}{P(A)}.\]
Let's label the event that I have Rabies as \(R\), and the event of a positive test \(T^+\).
Bayes Theorem tells us that the probability I have Rabies, given my positive test is
\[P(R \mid T^+) = \frac{P( T^+ \mid R) \, P(R)}{P(T^+)}.\]
We assume the test always works when someone does have Rabies, i.e. \(P( T^+ \mid R) = 1\). We
know the prior probability that I have Rabies is 0.1%, i.e. \(P(R) = 0.001\). What is \(P(T^+)\)?
Well, I either have Rabies (event \(R\)) or I do not (event \(\tilde{R}\)). I can get a positive
test in either case, so the probability I receive a positive test is
\[P(T^+) = P( T^+ \mid R) \, P(R) + P( T^+ \mid \tilde{R}) \, P(\tilde{R}) = 1 \times 0.001 +
0.01 \times 0.999 = 0.01099.\]
Putting this all together gives the correct chance that I have Rabies given a positive test,
\(P(R \mid T^+) \approx 0.09\). Phew.
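The Bayes Theorem calculation above translates directly into code (using the same probabilities defined in the text):

```python
# Bayes Theorem for the Rabies test, using the probabilities from the text.
p_r = 0.001             # prior: P(R), chance of having Rabies
p_t_given_r = 1.0       # P(T+ | R): test always detects Rabies
p_t_given_not_r = 0.01  # P(T+ | ~R): false-positive rate

# Law of total probability: P(T+) = P(T+|R)P(R) + P(T+|~R)P(~R)
p_t = p_t_given_r * p_r + p_t_given_not_r * (1 - p_r)  # 0.01099

# Bayes Theorem: P(R | T+) = P(T+ | R) P(R) / P(T+)
p_r_given_t = p_t_given_r * p_r / p_t
print(f"P(R | T+) = {p_r_given_t:.3f}")
```

This prints approximately 0.091, agreeing with the counting argument earlier (60 sick people out of roughly 659 positive tests).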