Philosophy 200
Bayes' Theorem

Common fallacies of probability:

The Gambler's Fallacy is assuming that the odds of a single truly random event are affected in any way by previous iterations of the same or any other truly random event.

Ignoring the Law of Large Numbers is assuming there must be other explanations for very improbable events.

Two Kinds of Probability

A priori probability: The sort of probability achieved by dividing the number of desired outcomes by the total number of possible outcomes. Applies to random events.

Statistical probability: The frequency at which a given event is observed to occur. Applies to events that are not truly random.
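The Gambler's Fallacy can be checked with a quick simulation. The sketch below is only an illustration: the function name, the number of flips, the streak length, and the seed are all assumptions chosen for the example. It flips a simulated fair coin many times and measures how often heads comes up immediately after three heads in a row. If previous flips mattered, that frequency would drift away from the a priori value of 1/2.

```python
import random


def heads_rate_after_streak(num_flips=200_000, streak=3, seed=0):
    """Estimate how often a fair coin lands heads right after `streak` heads in a row."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    flips = [rng.random() < 0.5 for _ in range(num_flips)]  # True means heads

    # Collect every flip that comes directly after a run of `streak` heads.
    followups = [
        flips[i]
        for i in range(streak, num_flips)
        if all(flips[i - streak:i])
    ]
    return sum(followups) / len(followups)


rate = heads_rate_after_streak()
print(f"Heads right after 3 heads in a row: {rate:.3f}")
```

The observed frequency should land close to .5, which is what the a priori probability predicts regardless of the streak.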

An Example

What is the a priori probability (expressed as a percent) of a batter in baseball getting a hit in one at-bat? 50% (one desired outcome out of two possible outcomes: hit or no hit)

What is the statistical probability (expressed as a percent) of a major league batter getting a hit in one at-bat? 25.4%

A Case Study in Probability:

Wendy has tested positive for colon cancer.

Colon cancer occurs in .3% of the population (.003 statistical probability).
If a person has colon cancer, there is a 90% chance that they will test positive (.9 statistical probability of a true positive).
If a person does not have colon cancer, there is a 3% chance that they will test positive (.03 statistical probability of a false positive).

Given that Wendy has tested positive, what is the statistical probability that she has colon cancer?

Answer: The correct answer is 8.3%.

Most people (including many doctors) assume that the chance that Wendy has colon cancer is much higher than it really is. The reason is that people tend to forget that a test must be absurdly specific to give a high probability of having a rare condition.

Formal Statement of Bayes' Theorem:

$$\Pr(h \mid e) = \frac{\Pr(h) \cdot \Pr(e \mid h)}{[\Pr(h) \cdot \Pr(e \mid h)] + [\Pr(\sim h) \cdot \Pr(e \mid \sim h)]}$$

h = the hypothesis
e = the evidence for h
Pr(h) = the statistical probability of h
Pr(~h) = the statistical probability that h is false, that is, 1 - Pr(h)
Pr(e|h) = the true positive rate of e as evidence for h
Pr(e|~h) = the false positive rate of e as evidence for h
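The formula can be applied directly to Wendy's case. The following is a minimal sketch, not part of the original slides; the function and parameter names are made up for illustration. It plugs in Pr(h) = .003, Pr(e|h) = .9, and Pr(e|~h) = .03 from the case study.

```python
def bayes_posterior(prior, true_positive_rate, false_positive_rate):
    """Return Pr(h|e) from Pr(h), Pr(e|h), and Pr(e|~h) using Bayes' Theorem."""
    numerator = prior * true_positive_rate                       # Pr(h) * Pr(e|h)
    denominator = numerator + (1 - prior) * false_positive_rate  # + Pr(~h) * Pr(e|~h)
    return numerator / denominator


# Wendy's case: base rate .003, true positive rate .9, false positive rate .03
wendy = bayes_posterior(prior=0.003, true_positive_rate=0.9, false_positive_rate=0.03)
print(f"{wendy:.1%}")  # prints 8.3%
```

Working it by hand gives the same result: (.003 * .9) / [(.003 * .9) + (.997 * .03)] = .0027 / .03261, which is about .083.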

The Table Method:

          h                  ~h                 Total
e         True Positives     False Positives    Pr(e)*Pop.
~e        False Negatives    True Negatives     Pr(~e)*Pop.
Total     Pr(h)*Pop.         Pr(~h)*Pop.        Pop.

Pop. = 10^n
n = the sum of the decimal places in the two most specific probabilities.

How each cell is filled in:

          h                              ~h                             Total
e         Pr(e|h) * [Pr(h)*Pop.]         Pr(e|~h) * [Pr(~h)*Pop.]       Total of this row
~e        (cell below) - (cell above)    (cell below) - (cell above)    Total of this row
Total     Pr(h)*Pop.                     Pr(~h)*Pop.                    Pop.

The Table Method for Wendy:

Pop. = 100,000 (.003 has three decimal places and .03 has two, so n = 5).
Column totals: .003 * 100,000 = 300 have CC; .997 * 100,000 = 99,700 do not have CC.
True positives: true positive rate (.9) * 300 = 270. False negatives: 300 - 270 = 30.
False positives: false positive rate (.03) * 99,700 = 2,991. True negatives: 99,700 - 2,991 = 96,709.
Row totals: 270 + 2,991 = 3,261 test positive; 30 + 96,709 = 96,739 do not test positive.

                   has CC                 ~ have CC                 Total
tests positive     270 (true positive)    2,991 (false positive)    3,261
~ test positive    30 (false negative)    96,709 (true negative)    96,739
Total              300                    99,700                    100,000

What are Wendy's chances?

                   has CC                 ~ have CC                 Total
tests positive     270 (true positive)    2,991 (false positive)    3,261

Wendy's chances, given that she tests positive, are the true positives divided by the total number of positive tests. That is, 270/3,261, which is .083 (8.3%).

Those who misestimate that probability forget that colon cancer is rarer than a false positive on a test.

How about a second test?

Note that testing positive (given the test accuracy specified) raises one's chances of having the condition from .003 (the base rate) to .083. If we use .083 as the new base rate, those who test positive again then have a 73.1% chance of having the condition. A third positive test (with .731 as the new base rate) raises the chance of having the condition to 98.8%.
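The table method and the repeated-test update can be sketched in a few lines. This is an illustration under stated assumptions rather than part of the original slides: the function names and dictionary keys are hypothetical, and the population is fixed at 100,000 for every round. Like the slides, the sketch rounds each result to three decimal places before using it as the next base rate. Reusing the same true and false positive rates treats each test result as independent of the earlier ones.

```python
def table_method(base_rate, true_positive_rate, false_positive_rate, population=100_000):
    """Fill in the four inner cells of the table as whole-number counts."""
    # round() only removes floating-point noise; Pop. is chosen so counts are whole.
    has = round(base_rate * population)               # column total for h
    has_not = population - has                        # column total for ~h
    true_pos = round(true_positive_rate * has)        # Pr(e|h) * [Pr(h)*Pop.]
    false_neg = has - true_pos                        # (cell below) - (cell above)
    false_pos = round(false_positive_rate * has_not)  # Pr(e|~h) * [Pr(~h)*Pop.]
    true_neg = has_not - false_pos                    # (cell below) - (cell above)
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "true_neg": true_neg,
    }


def chance_given_positive(table):
    """True positives divided by the total number of positive tests."""
    return table["true_pos"] / (table["true_pos"] + table["false_pos"])


def repeated_positive_tests(base_rate, true_positive_rate, false_positive_rate, tests=3):
    """Use each rounded result as the base rate for the next positive test."""
    chances = []
    for _ in range(tests):
        table = table_method(base_rate, true_positive_rate, false_positive_rate)
        base_rate = round(chance_given_positive(table), 3)  # rounded as on the slides
        chances.append(base_rate)
    return chances


print(table_method(0.003, 0.9, 0.03))
print(repeated_positive_tests(0.003, 0.9, 0.03))  # prints [0.083, 0.731, 0.988]
```

Carrying the unrounded value forward instead would change the second figure slightly, to about 73.0%; the third still rounds to 98.8%.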