# Bayes Theorem

In many applications, it can be difficult to directly calculate a conditional probability. The problem is that it is not always easy to determine a complete stochastic truth table. For example, suppose that you receive an email with the phrase "free money" in the subject, and you are wondering whether the email is spam. Let $H$ mean "the email is spam" and $E$ mean "the phrase 'free money' is in the subject of the email". You want to determine whether the evidence $E$ evidentially supports the hypothesis $H$. To do this, you need to determine $Pr(H\mid E)$. It is not obvious what probabilities you should assign to each row of a stochastic truth table for the atomic propositions $H$ and $E$. There is an indirect way to determine this conditional probability.

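Since $Pr(H\mid E)=\frac{Pr(H\wedge E)}{Pr(E)}$, we have that

$$Pr(H\wedge E)=Pr(H\mid E)Pr(E)$$

Since $Pr(E\mid H)=\frac{Pr(E\wedge H)}{Pr(H)}$, we have that

$$Pr(E\wedge H)=Pr(E\mid H)Pr(H)$$

Now, since $H\wedge E$ and $E\wedge H$ are tautologically equivalent, we have that

$$Pr(H\mid E)Pr(E)=Pr(E\mid H)Pr(H)$$

Dividing both sides by $Pr(E)$, we have that:

$$Pr(H\mid E)=\frac{Pr(E\mid H)Pr(H)}{Pr(E)}$$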
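
The identity $Pr(H\mid E)=\frac{Pr(E\mid H)Pr(H)}{Pr(E)}$ can be checked numerically on any stochastic truth table. The sketch below uses a hypothetical table whose four row probabilities were chosen only for illustration (they are not estimates for the spam example), and the helper `pr` is an assumed name. It computes $Pr(H\mid E)$ once from the definition and once by the indirect route.

```python
# Hypothetical stochastic truth table over H and E (illustrative values only).
# Each key is a truth assignment (H, E); the four probabilities sum to 1.
table = {
    (True, True): 0.2,
    (True, False): 0.3,
    (False, True): 0.1,
    (False, False): 0.4,
}

def pr(event):
    """Probability of an event, given as a predicate on the rows (h, e)."""
    return sum(p for (h, e), p in table.items() if event(h, e))

pr_h = pr(lambda h, e: h)
pr_e = pr(lambda h, e: e)
pr_h_and_e = pr(lambda h, e: h and e)

direct = pr_h_and_e / pr_e              # Pr(H | E) from the definition
pr_e_given_h = pr_h_and_e / pr_h        # Pr(E | H) from the definition
indirect = pr_e_given_h * pr_h / pr_e   # Pr(E | H) Pr(H) / Pr(E)

print(direct, indirect)  # the two values agree up to floating-point rounding
```

When the full table is known, the direct route is simpler. The indirect route matters when, as in the spam example, the table is not available but the three probabilities on the right-hand side can be estimated.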

So, we can determine $Pr(H\mid E)$ if we know three other probabilities: (1) $Pr(E\mid H)$, (2) $Pr(H)$, and (3) $Pr(E)$. These are all probabilities that we can estimate with some investigation.

- $Pr(E\mid H)$: This is the probability that the subject line contains the phrase "free money" assuming that the email is spam. The phrase "free money" appears on common lists of spam trigger words, but it is only one of many phrases that occur in spam email. A reasonable estimate for this conditional probability is $Pr(E\mid H)=0.1$.
- $Pr(H)$: This is the prior probability of receiving spam email. Roughly, 55% of email received each day is classified as spam. That is, $Pr(H)=0.55$.
- $Pr(E)$: This is the prior probability of receiving an email with the phrase "free money" in the subject line. While it is not obvious how to estimate this probability directly, we can use the law of total probability: $Pr(E)=Pr(H)Pr(E\mid H) + Pr(\neg H)Pr(E\mid \neg H)$. We have already determined that $Pr(E\mid H)=0.1$ and $Pr(H)=0.55$. Thus, using the complement law, we know that $Pr(\neg H)=0.45$. The only thing that remains is to estimate $Pr(E\mid \neg H)$. That is, assuming an email is not spam, what is the probability that the phrase "free money" occurs in the subject line? It is very unlikely that I would receive a non-spam email with the phrase "free money" in the subject line (your estimate of this conditional probability may be different). My estimate of this conditional probability is $Pr(E\mid \neg H)=0.001$. Then, using the law of total probability, we have $Pr(E)=Pr(H)Pr(E\mid H) + Pr(\neg H)Pr(E\mid \neg H) = 0.55 \times 0.1 + 0.45 \times 0.001 = 0.05545$.
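
The arithmetic for $Pr(E)$ can be reproduced in a few lines. This is a minimal sketch that plugs in the estimates from the list above; the variable names are illustrative choices.

```python
# Estimates taken from the list above.
pr_h = 0.55                # Pr(H): prior probability that an email is spam
pr_e_given_h = 0.1         # Pr(E | H): "free money" in the subject, given spam
pr_e_given_not_h = 0.001   # Pr(E | not H): "free money" in the subject, given not spam

pr_not_h = 1 - pr_h        # complement law: Pr(not H) = 1 - Pr(H)

# Law of total probability: Pr(E) = Pr(H)Pr(E | H) + Pr(not H)Pr(E | not H)
pr_e = pr_h * pr_e_given_h + pr_not_h * pr_e_given_not_h

print(round(pr_e, 5))  # 0.05545
```

Changing `pr_e_given_not_h` to your own estimate shows how sensitive $Pr(E)$ is to that choice.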

Putting everything together, we have that:

$$Pr(H\mid E)=\frac{Pr(E\mid H)Pr(H)}{Pr(E)}=\frac{0.1\times 0.55}{0.05545}\approx 0.992$$

The above equation is an instance of **Bayes Theorem**:

For all formulas $X$ and $Y$ with $Pr(Y)>0$,

$$Pr(X\mid Y)=\frac{Pr(Y\mid X)Pr(X)}{Pr(Y)}$$

As noted above, we often use the law of total probability when applying Bayes Theorem:

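For all formulas $X$ and $Y$ with $Pr(Y)>0$,

$$Pr(X\mid Y)=\frac{Pr(Y\mid X)Pr(X)}{Pr(Y\mid X)Pr(X)+Pr(Y\mid \neg X)Pr(\neg X)}$$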
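
Bayes Theorem combined with the law of total probability needs only three inputs: $Pr(X)$, $Pr(Y\mid X)$ and $Pr(Y\mid \neg X)$. The sketch below wraps that computation in a small function and applies it to the spam example. The function name `posterior` and its parameter names are illustrative, not part of any library.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return Pr(X | Y) from Pr(X), Pr(Y | X) and Pr(Y | not X)."""
    # Law of total probability gives the denominator Pr(Y).
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    if evidence == 0:
        raise ValueError("Pr(Y) is 0, so Pr(X | Y) is undefined")
    return likelihood * prior / evidence

# Spam example: Pr(H) = 0.55, Pr(E | H) = 0.1, Pr(E | not H) = 0.001
spam_posterior = posterior(prior=0.55, likelihood=0.1, likelihood_given_not=0.001)
print(round(spam_posterior, 3))  # 0.992
```

With these estimates, seeing "free money" in the subject raises the probability of spam from the prior of 0.55 to roughly 0.99.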

- Lecture
- Slides

Applying Bayes Theorem can be tricky. Use Bayes Theorem to solve the following puzzles:

**Three Prisoners Problem**: Three prisoners, $A$, $B$ and $C$, have been tried for murder, and their verdicts will be told to them tomorrow morning. They know only that one of them will be declared guilty and will be executed while the others will be set free. The identity of the condemned prisoner is revealed to the very reliable prison guard, but not to the prisoners themselves.

Prisoner $A$ asks the guard, "Please give this letter to one of my friends, to the one who is to be released. We both know that at least one of them will be released."

An hour later, $A$ asks the guard, "Can you tell me which of my friends you gave the letter to? It should give me no clue regarding my own status because, regardless of my fate, each of my friends had an equal chance of receiving my letter."

The guard told him that $B$ received his letter.

Prisoner $A$ then concluded that the probability that he will be released is 1/2 (since the only ones without a verdict are $A$ and $C$).

But $A$ thinks to himself: "Before I talked to the guard, my chance of being executed was 1 in 3. Now that he has told me $B$ will be released, only $C$ and I remain, so my chances of being executed have gone from 33.33% to 50%. What happened? I made certain not to ask for any information relevant to my own fate..." Explain what is wrong with $A$'s reasoning.
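
Once you have an explanation, one way to check it is to simulate the story many times and look only at the runs in which the guard names $B$. The sketch below assumes that the condemned prisoner is chosen uniformly at random and that, when $A$ is the condemned one, the guard hands the letter to $B$ or $C$ with equal probability (as $A$ himself supposes). The function name and its parameters are illustrative.

```python
import random

def simulate_three_prisoners(trials=100_000, seed=0):
    """Estimate Pr(A is executed | the guard says B received the letter)."""
    rng = random.Random(seed)
    said_b = 0
    a_executed_and_said_b = 0
    for _ in range(trials):
        condemned = rng.choice("ABC")        # assumed uniform over the three
        if condemned == "A":
            letter_to = rng.choice("BC")     # both friends go free: guard picks at random
        elif condemned == "B":
            letter_to = "C"                  # only C is released
        else:
            letter_to = "B"                  # only B is released
        if letter_to == "B":
            said_b += 1
            if condemned == "A":
                a_executed_and_said_b += 1
    return a_executed_and_said_b / said_b

print(simulate_three_prisoners())
```

Compare the printed estimate with both 1/3 and 1/2, and then use Bayes Theorem to compute the same conditional probability exactly.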

**Monty Hall Dilemma**: Suppose you are on a game show, and you are given the choice of three doors. Behind one door is a car; behind the others, goats. You pick a door, say number 1, and the host, who knows what's behind the doors, opens another door, say number 3, which has a goat. He says to you, "Do you want to pick door number 2?" Is it to your advantage to switch your choice of doors?
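
After working out an answer, you can test it empirically with a simulation. The sketch below assumes that the car is placed uniformly at random, that the host always opens a goat door other than your pick, and that he chooses at random when two such doors are available. The puzzle statement leaves the last two points implicit, so treat them as assumptions. The function name and its parameters are illustrative.

```python
import random

def simulate_monty_hall(trials=100_000, seed=0):
    """Return the fraction of games won by staying and by switching."""
    rng = random.Random(seed)
    stay_wins = 0
    switch_wins = 0
    for _ in range(trials):
        car = rng.randrange(3)    # door hiding the car
        pick = rng.randrange(3)   # contestant's first choice
        # Host opens a goat door that is not the contestant's pick.
        opened = rng.choice([d for d in range(3) if d != pick and d != car])
        # The one remaining closed door is the door offered for switching.
        switched = next(d for d in range(3) if d != pick and d != opened)
        stay_wins += (pick == car)
        switch_wins += (switched == car)
    return stay_wins / trials, switch_wins / trials

stay_rate, switch_rate = simulate_monty_hall()
print(stay_rate, switch_rate)
```

If the two printed rates differ from what you expected, revisit which probabilities you used for the host's behaviour when applying Bayes Theorem.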

Try answering the above questions before watching the following video.

- Lecture
- Slides

## Practice Questions

- Suppose that $Pr(B)=0.25$, $Pr(A\mid B)=0.75$, $Pr(A\mid\neg B)=0.3$, find $Pr(B\mid A)$. Explain how you arrived at your answer.

- Suppose that $Pr(P)=0.85$, $Pr(Q\mid P)=0.25$, $Pr(Q\mid\neg P)=0.5$, find $Pr(P\mid Q)$. Explain how you arrived at your answer.

- Suppose that $Pr(H)=0.85$, $Pr(E\mid H)=0.8$, $Pr(E\mid\neg H)=0.5$, find $Pr(H\mid E)$. Explain how you arrived at your answer.

- Suppose that you know the following: it rains in the afternoon on 10% of days; it is cloudy in the morning on 20% of days; and on days when it rains in the afternoon, there were clouds in the morning 50% of the time. Suppose that you see clouds in the morning. What is the probability that it will rain in the afternoon?

- Suppose we have the following information about a gene defect: 1% of people have a certain genetic defect; the test detects the defect in 90% of the people who have it (true positives); and the test comes back positive for 9.6% of the people who do not have it (false positives). If a person gets a positive test result, what is the probability that they actually have the genetic defect?