Bayes’ Theorem
Probability • Suitable for IGCSE • GCSE • IB • AP • American Curriculum • High School
What is Bayes’ Theorem?
Bayes’ Theorem is a fundamental rule in probability that calculates the probability of an event based on prior knowledge of related conditions. It allows you to update a probability estimate when new evidence becomes available, by relating conditional probabilities in reverse. Formally, it connects P(A|B) to P(B|A) using the prior probabilities of A and B.
📌 Featured Definition: Bayes’ Theorem states that the probability of event A given that event B has occurred equals the probability of B given A, multiplied by the probability of A, divided by the probability of B. It is a cornerstone of conditional probability and modern statistics.
Key Concepts of Bayes’ Theorem
Before diving into the formula, make sure you understand these building blocks. Each concept plays a specific role in how Bayes’ Theorem works.
Prior Probability — P(A)
The probability of event A before any new evidence is considered. It represents your initial belief or background knowledge.
Conditional Probability — P(A|B)
The probability of A occurring given that B has already occurred. Written as P(A|B), the vertical bar means "given."
Posterior Probability
The updated probability of A after taking new evidence B into account. This is what Bayes’ Theorem calculates: P(A|B).
Likelihood — P(B|A)
How probable is the observed evidence B, assuming A is true? This is the reverse conditional — P(B given A).
Sample Space
The complete set of all possible outcomes. P(B) in the denominator represents the total probability of B across the entire sample space.
Independent vs Dependent Events
Independent events don’t affect each other — P(A|B) = P(A). Bayes’ Theorem applies to dependent events, where knowing B changes the probability of A.
💡 Why Bayes’ Theorem matters: In the real world, probabilities are rarely static. As new information arrives — a positive test result, a new sensor reading, a user’s click — Bayes’ Theorem gives you a mathematically rigorous way to update your beliefs. It is the engine behind modern AI, medical screening, and forensic statistics.
Bayes’ Theorem Formula
P(A|B) = [P(B|A) × P(A)] / P(B)
Valid provided P(B) ≠ 0
What Each Symbol Means
| Symbol | Name | Plain English |
|---|---|---|
| P(A|B) | Posterior probability | Probability of A after seeing evidence B — this is what you want to find |
| P(B|A) | Likelihood | How likely is evidence B if A is actually true? |
| P(A) | Prior probability | Your initial probability of A, before seeing any evidence |
| P(B) | Total / marginal probability | The overall probability that B occurs across all possible scenarios |
💡 Expanded denominator tip: When you don’t know P(B) directly, you can expand it using the law of total probability:
P(B) = P(B|A) × P(A) + P(B|A′) × P(A′)
where A′ means "not A." This is the most common version you’ll use in exam questions.
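The expanded formula can be wrapped in a few lines of code. The sketch below is a minimal illustration (the function name and argument names are my own, not from any library):

```python
# Minimal sketch of Bayes' Theorem with the expanded denominator.
# All names here are illustrative.
def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|A')P(A')]."""
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_b_given_a * prior_a / p_b

# Quick check with easy numbers: prior 0.5, likelihoods 0.8 and 0.2
print(round(bayes_posterior(0.5, 0.8, 0.2), 2))  # 0.8
```

Notice that the denominator is built from the same prior and likelihoods you already supplied, which is exactly why the expanded form is so convenient in exam questions.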
How Bayes’ Theorem Works
The core idea is belief updating. You start with a belief about how likely something is (the prior). Then evidence arrives. Bayes’ Theorem tells you exactly how much to revise your belief in light of that evidence (the posterior).
Think of it like a detective. Before finding any clues, the detective has a prior suspicion about who committed a crime. Each new clue — a footprint, a receipt, a witness — updates that suspicion mathematically. Bayes’ Theorem is the equation that drives this updating process.
The 4-Step Logic
- Identify your prior — P(A): What is the baseline probability of the event of interest, independent of any new information?
- Identify the likelihood — P(B|A): If A is true, how probable is it that you would observe the evidence B you actually saw?
- Calculate the total probability — P(B): How likely is this evidence B to appear across all scenarios, both when A is true and when it is not?
- Apply the formula to get the posterior — P(A|B): Divide the numerator [P(B|A) × P(A)] by P(B). The result is your updated probability.
🤔 Intuition check: Notice that if P(B|A) is very high (the evidence strongly suggests A) and P(A) is already reasonably large, the posterior P(A|B) will jump up sharply. But if P(A) is tiny to begin with — like a rare disease — even strong evidence may leave P(A|B) surprisingly low. This counterintuitive result is why Bayes’ Theorem is so important in medical and forensic contexts.
Bayes’ Theorem Examples
Work through each example carefully. Each one shows the full method so you can apply the same approach in your exam.
📋 Problem: A rare disease affects 1% of the population. A diagnostic test correctly identifies the disease 95% of the time (true positive rate) and incorrectly gives a positive result 4% of the time for healthy people (false positive rate). A randomly selected patient tests positive. What is the probability they actually have the disease?
Given Information:
- P(Disease) = 0.01 ← prior probability
- P(No Disease) = 0.99
- P(Positive | Disease) = 0.95 ← true positive rate
- P(Positive | No Disease) = 0.04 ← false positive rate
- Find: P(Disease | Positive) ← posterior probability
Formula Setup:
P(Disease|Positive) = [P(Positive|Disease) × P(Disease)] / P(Positive)
Step-by-Step Solution:
Step 1 — Calculate P(Positive) using total probability:
P(Positive) = P(Positive|Disease) × P(Disease) + P(Positive|No Disease) × P(No Disease)
P(Positive) = (0.95 × 0.01) + (0.04 × 0.99)
P(Positive) = 0.0095 + 0.0396 = 0.0491
Step 2 — Apply Bayes’ Theorem:
P(Disease|Positive) = (0.95 × 0.01) / 0.0491
P(Disease|Positive) = 0.0095 / 0.0491 ≈ 0.1935
✅ Final Answer: ≈ 19.3%
Interpretation: Even with a 95% accurate test, a patient who tests positive has only about a 19.3% chance of actually having the disease. This is because the disease is rare — the false positives from healthy people outnumber the true positives.
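This counterintuitive result is easy to verify by simulation. The Python sketch below (variable names are my own) draws one million simulated patients with the numbers from this problem and measures what fraction of positive tests belong to genuinely ill patients:

```python
import random

# Monte Carlo check of the rare-disease example:
# 1% prevalence, 95% true positive rate, 4% false positive rate.
random.seed(42)
trials = 1_000_000
positives = 0
true_positives = 0
for _ in range(trials):
    has_disease = random.random() < 0.01
    if has_disease:
        tested_positive = random.random() < 0.95
    else:
        tested_positive = random.random() < 0.04
    if tested_positive:
        positives += 1
        true_positives += has_disease

# Fraction of positive tests that are real cases; should land near 0.1935
print(true_positives / positives)
```

Running this lands close to the analytic answer, confirming that most positive results really do come from the large healthy group.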
📋 Problem: In a class, 60% of students studied hard and 40% did not. Of those who studied hard, 80% passed the exam. Of those who did not study, 30% still passed. A student is chosen at random and found to have passed. What is the probability that they studied hard?
Given Information:
- P(Studied) = 0.60
- P(Did not study) = 0.40
- P(Pass | Studied) = 0.80
- P(Pass | Did not study) = 0.30
- Find: P(Studied | Pass)
Step-by-Step Solution:
Step 1 — Find P(Pass):
P(Pass) = P(Pass|Studied) × P(Studied) + P(Pass|Not Studied) × P(Not Studied)
P(Pass) = (0.80 × 0.60) + (0.30 × 0.40)
P(Pass) = 0.48 + 0.12 = 0.60
Step 2 — Apply Bayes’ Theorem:
P(Studied|Pass) = (0.80 × 0.60) / 0.60
P(Studied|Pass) = 0.48 / 0.60 = 0.80
✅ Final Answer: 0.80 or 80%
Interpretation: If a student passed the exam, there is an 80% probability they had studied hard — which makes intuitive sense given that the majority of the class studied and studying significantly boosted pass rates.
📋 Problem: There are two bags. Bag A contains 3 red balls and 7 blue balls. Bag B contains 6 red balls and 4 blue balls. A bag is chosen at random (50/50), and one ball is drawn. The ball is red. What is the probability that it came from Bag A?
Given Information:
- P(Bag A) = 0.5, P(Bag B) = 0.5
- P(Red | Bag A) = 3/10 = 0.30
- P(Red | Bag B) = 6/10 = 0.60
- Find: P(Bag A | Red)
Step-by-Step Solution:
Step 1 — Find P(Red):
P(Red) = P(Red|A) × P(A) + P(Red|B) × P(B)
P(Red) = (0.30 × 0.5) + (0.60 × 0.5)
P(Red) = 0.15 + 0.30 = 0.45
Step 2 — Apply Bayes’ Theorem:
P(A|Red) = (0.30 × 0.5) / 0.45
P(A|Red) = 0.15 / 0.45 = 1/3 ≈ 0.333
✅ Final Answer: 1/3 ≈ 33.3%
Interpretation: Although the bag was chosen with equal probability, drawing a red ball makes it less likely the ball came from Bag A (which has fewer red balls) — only a 1-in-3 chance compared to a 2-in-3 chance for Bag B.
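Because all the probabilities in this problem are simple fractions, the whole calculation can be carried out exactly with Python’s `fractions` module (a sketch; the variable names are my own):

```python
from fractions import Fraction

p_bag_a = Fraction(1, 2)         # bags chosen 50/50
p_red_given_a = Fraction(3, 10)  # 3 red out of 10 in Bag A
p_red_given_b = Fraction(6, 10)  # 6 red out of 10 in Bag B

# Law of total probability: P(Red) = 3/20 + 6/20 = 9/20
p_red = p_red_given_a * p_bag_a + p_red_given_b * (1 - p_bag_a)

# Bayes' Theorem: P(Bag A | Red)
posterior = p_red_given_a * p_bag_a / p_red
print(posterior)  # 1/3
```

Working in exact fractions avoids rounding and makes the 1/3 answer appear directly.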
Real-World Applications
Bayes’ Theorem is far more than a classroom exercise. Here is where it shapes the real world:
🏥
Medical Diagnosis
Doctors update the probability of a diagnosis as new test results and symptoms arrive. Bayesian reasoning is essential for interpreting screening tests for rare diseases.
📨
Spam Filtering
Email spam filters use Bayesian classifiers. Given that an email contains the word "free," what is the probability it is spam? The filter continuously updates with new data.
🤖
Machine Learning & AI
Naive Bayes classifiers power recommendation systems, text categorization, and language models. Bayesian neural networks quantify uncertainty in predictions.
📈
Risk Assessment
Insurance actuaries and financial analysts use Bayesian methods to update risk models as new loss data, market conditions, or environmental changes occur.
⚖️
Decision-Making
From self-driving cars interpreting sensor data to courtroom forensics evaluating DNA evidence — Bayes’ Theorem underpins rational decisions under uncertainty.
Common Mistakes
These are the errors that most frequently cost students marks in Bayes’ Theorem problems.
❌ Mistake 1: Confusing P(A|B) with P(B|A)
These are not the same. P(Disease|Positive) is very different from P(Positive|Disease). The first is what you usually want to find; the second is often what the question gives you. Always identify which direction the conditioning goes.
❌ Mistake 2: Forgetting to Calculate P(B)
Many students plug numbers into the numerator but forget to calculate the correct denominator. If P(B) is not given directly, always expand it using the law of total probability: P(B) = P(B|A)P(A) + P(B|A′)P(A′).
❌ Mistake 3: Mixing Up Prior and Posterior
The prior P(A) is your starting probability. The posterior P(A|B) is the updated probability. Using the posterior where the prior should be — or vice versa — will give a completely wrong answer.
❌ Mistake 4: Using the Wrong Denominator
A common error is putting only P(B|A) × P(A) in the denominator instead of the full P(B). The denominator must account for all ways that B can happen, not just via A.
❌ Mistake 5: Ignoring the Question Wording
Exam questions use careful language. "Given that the ball drawn is red..." means you must condition on red. Rushing through the problem setup without identifying which event is the condition leads to setting up the formula backwards.
Practice Questions
Try each question yourself first, then compare with the worked answer to check your working.
Question 1
A factory has two machines. Machine X produces 70% of output and Machine Y produces 30%. Of Machine X’s output, 5% is defective. Of Machine Y’s output, 8% is defective. A randomly selected item is found defective. What is the probability it came from Machine X?
✅ Worked Answer:
P(Defective) = (0.05 × 0.70) + (0.08 × 0.30) = 0.035 + 0.024 = 0.059
P(X|Defective) = (0.05 × 0.70) / 0.059 = 0.035 / 0.059
P(X|Defective) ≈ 0.593 or 59.3%
Question 2
A box contains 4 fair coins and 1 two-headed coin. A coin is picked at random and flipped. It shows heads. What is the probability that the two-headed coin was selected?
✅ Worked Answer:
P(Two-headed) = 1/5, P(Fair) = 4/5
P(Heads|Two-headed) = 1, P(Heads|Fair) = 0.5
P(Heads) = (1 × 0.2) + (0.5 × 0.8) = 0.2 + 0.4 = 0.6
P(Two-headed|Heads) = (1 × 0.2) / 0.6 = 0.2 / 0.6
P(Two-headed|Heads) = 1/3 ≈ 33.3%
Question 3
In a survey, 55% of respondents are male and 45% are female. 20% of males and 35% of females own a bicycle. A bicycle owner is selected at random. What is the probability they are female?
✅ Worked Answer:
P(Bike) = (0.20 × 0.55) + (0.35 × 0.45) = 0.11 + 0.1575 = 0.2675
P(Female|Bike) = (0.35 × 0.45) / 0.2675 = 0.1575 / 0.2675
P(Female|Bike) ≈ 0.589 or 58.9%
Question 4
A security system correctly identifies an intruder 98% of the time and raises a false alarm 2% of the time when there is no intruder. On any given night, the probability of an actual intruder is 0.5%. An alarm goes off. What is the probability there is a real intruder?
✅ Worked Answer:
P(Intruder) = 0.005, P(No Intruder) = 0.995
P(Alarm) = (0.98 × 0.005) + (0.02 × 0.995) = 0.0049 + 0.0199 = 0.0248
P(Intruder|Alarm) = 0.0049 / 0.0248
P(Intruder|Alarm) ≈ 0.198 or 19.8%
Question 5 — Challenge
In an IB class, 40% of students take Higher Level Maths and 60% take Standard Level. In the mock exam, 75% of HL students scored above 80%, and 30% of SL students scored above 80%. A randomly selected student scored above 80%. What is the probability they are in HL?
✅ Worked Answer:
P(Above 80) = (0.75 × 0.40) + (0.30 × 0.60) = 0.30 + 0.18 = 0.48
P(HL|Above 80) = (0.75 × 0.40) / 0.48 = 0.30 / 0.48
P(HL|Above 80) = 0.625 or 62.5%
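All five worked answers above follow the same pattern, so they can be checked in one pass with a small helper (a sketch; the function name and tuple layout are my own):

```python
def posterior(prior, p_b_given_a, p_b_given_not_a):
    """P(A|B) via Bayes' Theorem with the expanded denominator."""
    p_b = p_b_given_a * prior + p_b_given_not_a * (1 - prior)
    return p_b_given_a * prior / p_b

# (label, prior P(A), P(B|A), P(B|A'), expected answer)
checks = [
    ("Q1 Machine X | Defective", 0.70,  0.05, 0.08, 0.593),
    ("Q2 Two-headed | Heads",    0.20,  1.00, 0.50, 1 / 3),
    ("Q3 Female | Bike",         0.45,  0.35, 0.20, 0.589),
    ("Q4 Intruder | Alarm",      0.005, 0.98, 0.02, 0.198),
    ("Q5 HL | Above 80",         0.40,  0.75, 0.30, 0.625),
]
for label, prior, like, like_not, expected in checks:
    result = posterior(prior, like, like_not)
    assert abs(result - expected) < 5e-4, (label, result)
    print(f"{label}: {result:.4f}")
```

Every question reduces to the same three inputs: the prior, the likelihood given A, and the likelihood given not-A.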
📝 Summary
- Bayes’ Theorem calculates the posterior probability P(A|B) by combining the prior P(A), the likelihood P(B|A), and the total probability P(B).
- The formula is: P(A|B) = [P(B|A) × P(A)] / P(B)
- When P(B) is not given, use the law of total probability: P(B) = P(B|A)P(A) + P(B|A′)P(A′).
- Always identify which event is the condition (after the vertical bar) before setting up the formula.
- P(A|B) and P(B|A) are different quantities — never swap them.
- A small prior probability P(A) can make P(A|B) surprisingly low even when P(B|A) is large — this is the base-rate effect.
- Bayes’ Theorem is the mathematical foundation for updating beliefs with evidence — essential in medicine, AI, statistics, and everyday decision-making.
Frequently Asked Questions
What is Bayes’ Theorem in simple terms?
Bayes’ Theorem is a mathematical formula that updates the probability of something being true when new evidence arrives. It answers the question: “Given that I just observed B, how probable is A now?” It combines what you already believed (the prior) with what the evidence tells you (the likelihood) to give a revised probability (the posterior).
What is the Bayes’ Theorem formula?
The standard formula is P(A|B) = [P(B|A) × P(A)] / P(B). When P(B) is not directly given, it can be expanded as P(B) = P(B|A)×P(A) + P(B|A′)×P(A′), where A′ is the complement of A.
Can you give a simple example of Bayes’ Theorem?
Suppose a disease affects 2% of people. A test has a 90% true positive rate and a 5% false positive rate. If someone tests positive, the probability they actually have the disease is found by: P(Disease|Positive) = (0.90 × 0.02) / [(0.90 × 0.02) + (0.05 × 0.98)] = 0.018 / 0.067 ≈ 26.9%. Despite the positive test, there is still only about a 27% chance they have the disease.
What is the difference between conditional probability and Bayes’ Theorem?
Conditional probability P(A|B) simply describes the probability of A given B — it is a concept. Bayes’ Theorem is the formula that lets you calculate P(A|B) by reversing the conditioning direction. In other words, Bayes’ Theorem is the tool you use when you know P(B|A) but need P(A|B).
Where is Bayes’ Theorem used in real life?
Bayes’ Theorem underpins spam email filters (does this email contain the word “win” — is it spam?), medical diagnosis (positive test result — does the patient have the disease?), AI and machine learning classifiers, insurance risk models, search-and-rescue probability calculations, and courtroom DNA evidence analysis. It is arguably the most practically important theorem in applied statistics.
