Introduction
I first encountered Bayes’ theorem when I started learning about clustering techniques in machine learning. I instantly saw how useful it is for updating a belief in light of a piece of evidence.
To be honest, it was not easy for me to fully understand it. I believe that part of this difficulty is due to the jargon that surrounds the theorem (e.g. conditional probability, hypothesis, prior and posterior). In addition to the jargon, I find conditional probability very abstract (or maybe it was not my favourite topic in school!).
Therefore, I wanted to write a post that visualizes my understanding of the theorem and puts it in a context the layman can relate to (rather than red and blue balls drawn from a bag).
Sherlock Holmes (the data scientist)
In the following visuals, I describe a conversation between Sherlock Holmes and Watson, in which Holmes follows Bayesian reasoning to teach Watson how to update his belief after encountering a piece of evidence.
To close the gap between Sherlock Holmes’s lingo and statistical jargon, I sometimes use the word ‘Theory’ instead of ‘Hypothesis’. Even though the two terms do not mean the same thing, I thought ‘Theory’ would be more relatable to the layman than ‘Hypothesis’.
I limited the jargon and mathematical expressions to some of the visuals (on the right-hand side, where the diagram is), so the reader can still connect them to external resources when they want to read further about the topic.
![](https://mhdhabboub.com/wp-content/uploads/2022/08/copy-of-spatial-data-science-1-1.png?w=1024)
![](https://mhdhabboub.com/wp-content/uploads/2022/08/copy-of-spatial-data-science-1-2.png?w=1024)
![](https://mhdhabboub.com/wp-content/uploads/2022/08/copy-of-spatial-data-science-3-1.png?w=1024)
![](https://mhdhabboub.com/wp-content/uploads/2022/08/copy-of-spatial-data-science-4-1.png?w=1024)
![](https://mhdhabboub.com/wp-content/uploads/2022/08/copy-of-spatial-data-science-5-1.png?w=1024)
![](https://mhdhabboub.com/wp-content/uploads/2022/08/copy-of-spatial-data-science-6-1.png?w=1024)
Python code
To make this post as practical as possible, I have included a Python script that the reader can use to plug in some numbers and experiment with them.
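For readers who like to see the arithmetic first, here is the calculation the script performs, written out as a worked equation. It is Bayes’ theorem with P(E) expanded over the two cases (the hypothesis is true or false), evaluated with the example values used in the script: P(H) = 0.1, P(E|H) = 0.6 and P(E|H is false) = 0.2. The symbol ¬H for "H is false" is my own shorthand and does not appear in the script.

```latex
\begin{aligned}
P(H \mid E)
  &= \frac{P(E \mid H)\,P(H)}
          {P(E \mid H)\,P(H) + P(E \mid \neg H)\,\bigl(1 - P(H)\bigr)} \\
  &= \frac{0.6 \times 0.1}{0.6 \times 0.1 + 0.2 \times 0.9}
   = \frac{0.06}{0.24}
   = 0.25
\end{aligned}
```

In words: hearing the noise raises the belief that someone is behind the curtains from 10% to 25%.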
# What if Sherlock Holmes was a data scientist?

# Calculate P(H|E) as a function of:
# P(H): the probability that the hypothesis is true (the prior).
# P(E|H): the probability of the evidence occurring given that the hypothesis is true (the likelihood).
# P(E|H is false): the probability of the evidence occurring given that the hypothesis is false.
def calculate_bayes_theorem(p_H, p_E_given_H, p_E_given_H_is_false):
    p_H_is_false = 1 - p_H
    # P(E): total probability of the evidence over both cases
    p_E = (p_E_given_H_is_false * p_H_is_false) + (p_E_given_H * p_H)
    try:
        p_H_given_E = (p_E_given_H * p_H) / p_E
    except ZeroDivisionError:
        print('ERROR: P(E) is equal to 0')
        return None
    # printing results (formatted as percentages to avoid floating-point noise)
    print(f'P(H) = {p_H:.1%}')
    print(f'P(E|H) = {p_E_given_H:.1%}')
    print(f'P(E|H is false) = {p_E_given_H_is_false:.1%}')
    print(f'P(H|E) = {p_H_given_E:.1%}')
    return p_H_given_E

# P(H)
# There is someone behind the curtains: 'the prior' or 'the probability that the hypothesis is true'.
p_H = 0.1

# P(E|H)
# Hearing a noise from behind the curtains given that there is someone there: 'the likelihood'.
p_E_given_H = 0.6

# P(E|H is false)
# Hearing a noise from behind the curtains given that there is no one there.
p_E_given_H_is_false = 0.2

# calculate P(H|E)
result = calculate_bayes_theorem(p_H, p_E_given_H, p_E_given_H_is_false)
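One experiment worth trying is to keep the evidence fixed and vary the prior, to see how much the starting belief matters. The snippet below is a minimal, self-contained sketch of that idea. The helper name `posterior` and the list of priors are my own assumptions for illustration; the two likelihoods are the ones from the curtain example.

```python
def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) from Bayes' theorem, or None when P(E) is 0."""
    # P(E): total probability of the evidence over both cases.
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    if p_e == 0:
        return None
    return p_e_given_h * p_h / p_e


# Hypothetical priors to try (assumed values, chosen only for illustration).
priors = [0.01, 0.1, 0.3, 0.5]

# Likelihoods from the curtain example: P(E|H) = 0.6, P(E|H is false) = 0.2.
results = {p: posterior(p, 0.6, 0.2) for p in priors}

for prior, post in results.items():
    print(f"P(H) = {prior:.0%} -> P(H|E) = {post:.1%}")
```

With the same noise as evidence, a prior of 10% leads to a posterior of 25%, while a prior of 50% leads to 75%: the same piece of evidence moves Watson to very different conclusions depending on where he started.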
Resources
If you made it this far, it means that you are interested in the topic. Here are some of the resources I found useful about the topic.
- The Signal and the Noise: The Art and Science of Prediction — by Nate Silver
- Bayes theorem, the geometry of changing beliefs — by 3Blue1Brown
- Bayes’ Theorem, Clearly Explained!!!! — by StatQuest with Josh Starmer
- A Gentle Introduction to Bayes Theorem for Machine Learning — by Jason Brownlee
#HappyDataSciencing! You can always find me on Twitter.