Bayes' theorem, named after Thomas Bayes, is a fundamental principle in probability theory that provides a mathematical framework for updating beliefs based on new evidence.
(The italicised sections contain math formulas and calculations. If that's not your thing, feel free to skip them—they won't disrupt the flow or cause you to miss any key points.) Bayes' theorem is expressed as:
P(A|B) = (P(B|A) * P(A)) / P(B)
Where:
P(A|B) is the probability of A given B (posterior probability)
P(B|A) is the probability of B given A (likelihood)
P(A) is the probability of A (base rate)
P(B) is the probability of B (marginal likelihood)
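The formula maps directly onto a few lines of code. Here is a minimal sketch in Python (the function name and argument layout are my own choices, not standard terminology): it takes the prior P(A), the likelihood P(B|A), and the false positive rate P(B|not A), computes the marginal P(B) via the law of total probability, and returns the posterior P(A|B).

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A|B) given P(A), P(B|A), and P(B|not A)."""
    # Law of total probability: P(B) = P(B|A)*P(A) + P(B|not A)*P(not A)
    marginal = likelihood * prior + false_positive_rate * (1 - prior)
    # Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
    return likelihood * prior / marginal
```

Both worked examples later in this piece are just calls to a function like this with different numbers plugged in.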
Base rate is important
At its core, Bayes' theorem emphasises the importance of prior knowledge, or the "base rate," in making accurate predictions. This concept challenges our intuitive thinking, which often overlooks pre-existing probabilities when faced with new information. As noted in "Thinking, Fast and Slow," we tend to exaggerate the significance of new evidence while underestimating the relevance of base rates.
Rare disease example. Imagine a rare disease that affects 1 in 10,000 people (base rate). A test for this disease is 99% accurate (both for positive and negative results). If someone tests positive, what's the probability they actually have the disease? Many people might intuitively say 99%, but this ignores the base rate. Using Bayes' theorem, the actual probability is much lower, around 1%, because the base rate of the disease is so low.
Let A = Having the disease; B = Positive test result
Then we have:
P(A) = 0.0001 (base rate)
P(B|A) = 0.99 (test accuracy for true positives)
P(B|not A) = 0.01 (false positive rate)
Thus P(B) = P(B|A) * P(A) + P(B|not A) * P(not A) = 0.99 * 0.0001 + 0.01 * 0.9999 = 0.000099 + 0.009999 = 0.010098
Now, applying Bayes' theorem:
P(A|B) = (P(B|A) * P(A)) / P(B) = (0.99 * 0.0001) / 0.010098 ≈ 0.0098, or about 1%
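The disease calculation above can be checked in a few lines of Python (variable names are mine):

```python
# Rare disease example: base rate 1 in 10,000, test 99% accurate
p_disease = 0.0001            # P(A): base rate
p_pos_given_disease = 0.99    # P(B|A): true positive rate
p_pos_given_healthy = 0.01    # P(B|not A): false positive rate

# P(B): total probability of testing positive
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# P(A|B): probability of disease given a positive test
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 4))  # ≈ 0.0098, about 1%
```

Despite the "99% accurate" test, the tiny base rate drags the posterior down to roughly 1%.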
“Invert, always invert”
Bayes' theorem also addresses the challenge of inverse probability. While humans naturally excel at predicting effects from causes, the reverse – inferring causes from effects – is considerably more difficult. This cognitive asymmetry, as explored in "The Book of Why," is what Bayes sought to overcome. By providing a mathematical framework for inverse probability, Bayes' theorem empowers us to reason backwards from observations to their potential causes.
Email Spam Example. Consider the following forward probability: given that an email is spam, there's a 70% chance it contains the word "free". In reverse: given that an email contains the word "free", what's the probability it's spam? This is much harder to know intuitively, but using Bayes' theorem we can get a fairly accurate estimate.
Given:
P(contains "free" | spam) = 0.7 (70% of spam emails contain "free")
Let's assume:
20% of all emails are spam: P(spam) = 0.2
5% of non-spam emails contain "free": P(contains "free" | not spam) = 0.05
We want to find P(spam | contains "free")
A = Email is spam; B = Email contains "free"
P(A) = 0.2
P(B|A) = 0.7
P(B|not A) = 0.05
First, calculate P(B): P(B) = P(B|A) * P(A) + P(B|not A) * P(not A) = 0.7 * 0.2 + 0.05 * 0.8 = 0.14 + 0.04 = 0.18
Now, applying Bayes' theorem:
P(A|B) = (P(B|A) * P(A)) / P(B) = (0.7 * 0.2) / 0.18 ≈ 0.7778 or about 77.78%
Therefore, if an email contains the word "free", there's approximately a 77.78% chance it's spam, given our assumptions.
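The same arithmetic, sketched in Python under the assumptions stated above (variable names are mine):

```python
# Spam example: 20% of email is spam; "free" appears in 70% of spam
# and 5% of non-spam
p_spam = 0.2               # P(A)
p_free_given_spam = 0.7    # P(B|A)
p_free_given_ham = 0.05    # P(B|not A)

# P(B): total probability an email contains "free"
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# P(A|B): probability the email is spam given it contains "free"
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(round(p_spam_given_free, 4))  # ≈ 0.7778
```

Here the prior is far less extreme than in the disease example, so the new evidence (the word "free") shifts the posterior much more dramatically.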
In conclusion, Bayes' theorem offers a powerful tool for probabilistic reasoning, challenging us to consider both prior knowledge and new evidence in a balanced manner.