Probability (4.1 - 4.4)

In this lecture we:
Develop an understanding of probability as the relative frequency of occurrence of an event over a very large number of observations or repetitions of the phenomenon.
Understand the basic event relations and probability laws.
Understand what is meant by independent events.

Relative Frequency and Probability

Relative frequency concept of probability: If an experiment (observation/measurement) is repeated a large number of times and event E occurs 30% of the time, then .30 should be a very good approximation to the probability of event E.

Event: A collection of possible outcomes of an experiment.
Outcome: Each possible distinct result of a simple experiment.

Experiment                        Outcome            Event
Coin toss                         Head or Tail       Head
5 coin tosses                     HHHHH or HTTHT     At least three heads
Select and weigh an individual    A weight           Weight < 50 kg

Probability = Relative Frequency

If there are 35 brown cows in a herd of 100 cows, the probability that one cow selected at random from the herd is brown is 0.35. This could be determined empirically as follows.

1. Number the cows from 1 to 100. Set the total to 0.
2. Use a random number table to select a number (hence a cow) at random between 1 and 100.
3. Determine the cow's color. Add 1 to the total if the cow is brown, 0 otherwise.
4. Repeat steps 2 and 3 at least 10,000 times, each time selecting from all 100 cows (simple random selection with replacement).
5. Divide the total by the number of iterations (10,000). That fraction is the estimated probability of selecting a brown cow at random.

Estimating Probabilities

Basic Experiment: Measure an individual's height and weight.
Weight and height are random variables: their values vary from individual to individual. We use the symbols W and H to represent these random variables.

Estimate the following probabilities from the sample of 20 individuals:
P(W > 60 kg) = 12/20 = .60
P(50 < W < 60) = 5/20 = .25

P(H < 1.8 m) = 5/20 = .25
P(W > 60 kg and H < 1.8 m) = 2/20 = .10

Can probability ever be greater than 1.0? Can probability ever be less than 0.0?

Weight (kg)   Height (m)
43.5          1.76
45.2          1.90
48.4          1.86
51.8          1.83
53.0          1.61
55.2          1.53
57.2          1.81
59.3          1.90
61.0          1.90
61.4          1.85
63.4          1.98
65.2          1.53
65.6          1.96
67.8          1.86
68.0          1.75
68.3          1.85
68.5          1.81
76.2          1.82
76.3          1.87
84.7          1.88

Probability Limits

Since probability is computed from relative frequencies, it follows that the probability of an event must be between zero and one:

$0 \le P(A) \le 1$

If all cows are white, the probability of selecting a white cow at random from the herd is 1; that is, you will get a white cow 100% of the time. Similarly, the probability of selecting a brown cow is 0; that is, you will get a brown cow 0% of the time.

Mutually Exclusive Events

Two events are mutually exclusive if the occurrence of one of the events excludes the possibility of the occurrence of the other event.

Basic Experiment: 5 tosses of a fair coin.
Event A: At least three heads in 5 tosses of an unbiased coin.
Event B: At least three tails in 5 tosses of an unbiased coin.
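In the spirit of the relative-frequency idea, a short simulation can estimate how often each of these two events occurs, and how often both occur in the same set of 5 tosses. This is only a sketch: the function name `simulate_events`, the 10,000 repetitions and the fixed seed are illustrative choices, not part of the lecture.

```python
import random


def simulate_events(trials=10_000, seed=0):
    """Estimate P(A), P(B) and P(A and B) by relative frequency.

    A: at least three heads in 5 tosses; B: at least three tails.
    The trial count and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    count_a = count_b = count_both = 0
    for _ in range(trials):
        # One experiment: 5 tosses of a fair coin.
        heads = sum(rng.random() < 0.5 for _ in range(5))
        tails = 5 - heads
        in_a = heads >= 3
        in_b = tails >= 3
        count_a += in_a
        count_b += in_b
        count_both += in_a and in_b
    return count_a / trials, count_b / trials, count_both / trials


p_a, p_b, p_both = simulate_events()
print(p_a, p_b, p_both)
```

Both estimates should come out near 0.5, and the relative frequency of runs in which A and B occur together is exactly zero.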

Event A implies 3, 4 or 5 heads, which implies 2, 1 or 0 tails. Event B implies 3, 4 or 5 tails, which implies 2, 1 or 0 heads. Hence, if Event A occurs (e.g. we get 3 heads), then Event B cannot occur (we cannot get 3 tails as well). Events A and B are mutually exclusive.

Mutually Exclusive Events (cont.)

Basic Experiment: Measure an individual's weight and height.

Event A: An observed weight greater than 60 kg (W > 60).
Event B: An observed weight greater than 50 kg (W > 50).
A and B are not mutually exclusive. We could observe a weight of 61 kg, which would satisfy both events.

Event C: An observed weight less than 50 kg (W < 50).
Event D: An observed weight greater than 60 kg (W > 60).
C and D are mutually exclusive. If we observe a weight of less than 50 kg, we cannot simultaneously observe a weight of, say, 61 kg.

If two events C and D are mutually exclusive, then the probability that either event occurs is P(C or D) = P(C) + P(D).

Mutually Exclusive Probabilities

P(W < 50) = 3/20 = .15
P(W > 60) = 12/20 = .60
Since P(C or D) = P(C) + P(D),
P(W < 50 or W > 60) = .15 + .60 = .75 = (3 + 12)/20
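The addition rule for mutually exclusive events can be checked by counting directly in the 20 observed weights. The sketch below uses a hypothetical helper `prob` (not from the lecture) that returns a relative frequency as an exact fraction. It also shows what goes wrong when probabilities are simply added for the non-exclusive pair W > 60 and W > 50.

```python
from fractions import Fraction

# The 20 observed weights (kg) from the lecture's sample.
weights = [43.5, 45.2, 48.4, 51.8, 53.0, 55.2, 57.2, 59.3, 61.0, 61.4,
           63.4, 65.2, 65.6, 67.8, 68.0, 68.3, 68.5, 76.2, 76.3, 84.7]


def prob(event, values):
    """Relative frequency of the values for which `event` is true."""
    return Fraction(sum(1 for v in values if event(v)), len(values))


# Mutually exclusive events C (W < 50) and D (W > 60).
p_c = prob(lambda w: w < 50, weights)
p_d = prob(lambda w: w > 60, weights)
p_c_or_d = prob(lambda w: w < 50 or w > 60, weights)
print(p_c_or_d == p_c + p_d)  # True: simple addition works

# Events A (W > 60) and B (W > 50) overlap, so simple addition overcounts.
p_a = prob(lambda w: w > 60, weights)
p_b = prob(lambda w: w > 50, weights)
p_a_or_b = prob(lambda w: w > 60 or w > 50, weights)
print(p_a + p_b)   # 29/20, larger than 1 and so not a probability
print(p_a_or_b)    # 17/20
```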

Complementary Events

Basic Experiment: Measure someone's weight.
Event A: An observed weight of less than 60 kg (W < 60).
Event B: An observed weight of at least 60 kg ($W \ge 60$).

Events A and B are mutually exclusive. But, more than that, the two events are complementary: the outcome of the experiment must fall in one or the other of the two events. There are no other options.

P(W < 60) = .40, $P(W \ge 60) = .60$, $P(W < 60 \text{ or } W \ge 60) = .40 + .60 = 1.00$

Computing Probabilities

Basic Experiment: Toss a coin 5 times. There are 32 possible outcomes of the 5-coin-toss experiment:

TTTTT TTTTH TTTHT TTTHH TTHTT TTHTH TTHHT TTHHH
THTTT THTTH THTHT THTHH THHTT THHTH THHHT THHHH
HTTTT HTTTH HTTHT HTTHH HTHTT HTHTH HTHHT HTHHH
HHTTT HHTTH HHTHT HHTHH HHHTT HHHTH HHHHT HHHHH

Event A: At least 3 heads. P(3 to 5 H in 5 tosses) = 16/32
Event B: At least 2 tails. P(2 to 5 T in 5 tosses) = 26/32
A and B are not mutually exclusive.

Event C: 3 or 4 heads (equivalently, 1 or 2 tails). P(C) = (5 + 10)/32 = 15/32
Event D: 3 or 4 tails (equivalently, 1 or 2 heads). P(D) = (5 + 10)/32 = 15/32
C and D are mutually exclusive, so P(C or D) = 30/32. Count them to make sure.

Some Probability Properties

If two events A and B are mutually exclusive, then P(A) and P(B) must satisfy the following properties:

$0 \le P(A) \le 1$ and $0 \le P(B) \le 1$
$P(\text{either } A \text{ or } B) = P(A) + P(B)$
$P(A) + P(\bar{A}) = 1$ and $P(B) + P(\bar{B}) = 1$

The first and last lines also hold for any events A and B, not necessarily mutually exclusive, where $\bar{A}$ is the complement of A and $\bar{B}$ is the complement of B.

Union and Intersection of Events

The INTERSECTION of two events A and B is the set of all outcomes that are included in both A and B, and is denoted $A \cap B$ (read as "A and B").
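One way to "count them" for the coin-toss events is to represent each event as a Python set of outcomes, so that the intersection is simply the set operator `&`. This is a minimal sketch; the helper names `event` and `prob` are hypothetical, and the events A to D are the ones defined for the 5-toss experiment.

```python
from fractions import Fraction
from itertools import product

# All 32 outcomes of 5 coin tosses, e.g. "HTTHT".
outcomes = {"".join(t) for t in product("HT", repeat=5)}


def event(predicate):
    """The set of outcomes for which `predicate` is true."""
    return {o for o in outcomes if predicate(o)}


def prob(ev):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(ev), len(outcomes))


A = event(lambda o: o.count("H") >= 3)       # at least 3 heads
B = event(lambda o: o.count("T") >= 2)       # at least 2 tails
C = event(lambda o: o.count("H") in (3, 4))  # 3 or 4 heads
D = event(lambda o: o.count("T") in (3, 4))  # 3 or 4 tails

print(prob(A), prob(B))  # 1/2 13/16 (16/32 and 26/32 in lowest terms)
print(len(A & B))        # 10: A and B share outcomes
print(len(C & D))        # 0: C and D are mutually exclusive
print(prob(C), prob(D))  # 15/32 15/32
```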

The UNION of two events A and B is the set of all outcomes that are included in either A or B (or both), and is denoted $A \cup B$ (read as "A or B").

General Rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Important to remember: add the probabilities, then subtract the overlap so that we don't double count.

Union and Intersection Example

Basic Experiment: Measure an individual's height and weight.

What is the probability of 60 < W < 70 or H > 1.8?
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$P(60 < W < 70 \cup H > 1.8) = P(60 < W < 70) + P(H > 1.8) - P(60 < W < 70 \cap H > 1.8)$
= 9/20 + 15/20 - 7/20 = (9 + 15 - 7)/20 = 17/20 = .85

What is the probability of W > 60 and W < 70?
$P(A \cap B) = P(A) + P(B) - P(A \cup B)$
$P(W > 60 \cap W < 70) = P(W > 60) + P(W < 70) - P(W > 60 \cup W < 70)$
= 12/20 + 17/20 - 20/20 = (12 + 17 - 20)/20 = 9/20 = .45
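The general rule can be checked against the 20 paired weight and height measurements by counting the union directly and comparing it with the right-hand side. The sketch assumes the weights and heights are paired row by row, and `prob` is a hypothetical helper, not something defined in the lecture.

```python
from fractions import Fraction

# The lecture's sample: weight (kg) and height (m), paired by individual.
weights = [43.5, 45.2, 48.4, 51.8, 53.0, 55.2, 57.2, 59.3, 61.0, 61.4,
           63.4, 65.2, 65.6, 67.8, 68.0, 68.3, 68.5, 76.2, 76.3, 84.7]
heights = [1.76, 1.90, 1.86, 1.83, 1.61, 1.53, 1.81, 1.90, 1.90, 1.85,
           1.98, 1.53, 1.96, 1.86, 1.75, 1.85, 1.81, 1.82, 1.87, 1.88]
people = list(zip(weights, heights))


def prob(event):
    """Relative frequency of individuals (w, h) for which `event` is true."""
    return Fraction(sum(1 for w, h in people if event(w, h)), len(people))


p_a = prob(lambda w, h: 60 < w < 70)                   # 9/20
p_b = prob(lambda w, h: h > 1.8)                       # 15/20
p_a_and_b = prob(lambda w, h: 60 < w < 70 and h > 1.8)  # 7/20
p_a_or_b = prob(lambda w, h: 60 < w < 70 or h > 1.8)    # counted directly

print(p_a_or_b)                            # 17/20
print(p_a_or_b == p_a + p_b - p_a_and_b)   # True
```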

Probability Algebra

Probabilities are like any algebraic symbols: we can add, subtract, multiply and divide them.

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$P(A \cup B) + P(A \cap B) = P(A) + P(B)$
$1 = [P(A) + P(B) - P(A \cap B)] / P(A \cup B)$ {assuming $P(A \cup B) \ne 0$}

If events A and B are complementary, they have no overlap, hence $P(A \cap B) = 0$ and $P(A \cup B) = P(A) + P(B)$. Also, since $A \cup B$ includes all possible outcomes, $P(A \cup B) = 1$. Thus $1 = P(A) + P(B)$ for complementary events, or $P(A) = 1 - P(B)$.

What if A and B are mutually exclusive?

Conditional Probability

Consider two events A and B with nonzero probabilities, P(A) and P(B).

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
The conditional probability of event A given event B. Note: Event B must have nonzero probability.

$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$
The conditional probability of event B given event A. Note: Event A must have nonzero probability.

When we compute a conditional probability we are essentially asking for the probability of an event under the constraint/knowledge that some second event has occurred.

Example: P(W > 50 given that H > 1.8). That is, what is the probability of finding someone with weight greater than 50 kg among individuals who are taller than 1.8 m?

Conditional Probability Example

Basic Experiment: Measure the height and weight of a sample.
Event A: Weight (X) is greater than 50 kg.
Event B: Height (Y) is greater than 1.8 m.

P(X > 50 and Y > 1.8) = 13/20
P(Y > 1.8) = 15/20

What is the probability of observing a weight greater than 50 kg among individuals taller than 1.8 m?

$P(X > 50 \mid Y > 1.8) = \frac{P(X > 50 \cap Y > 1.8)}{P(Y > 1.8)} = \frac{13/20}{15/20} = 13/15$
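The same conditional probability can be obtained in two ways from the sample: as the ratio of the joint to the marginal relative frequency, or by restricting attention to the individuals taller than 1.8 m and counting among them. The sketch below does both, assuming weights and heights are paired row by row; `prob` is a hypothetical helper.

```python
from fractions import Fraction

# The lecture's sample: weight X (kg) and height Y (m), paired by individual.
weights = [43.5, 45.2, 48.4, 51.8, 53.0, 55.2, 57.2, 59.3, 61.0, 61.4,
           63.4, 65.2, 65.6, 67.8, 68.0, 68.3, 68.5, 76.2, 76.3, 84.7]
heights = [1.76, 1.90, 1.86, 1.83, 1.61, 1.53, 1.81, 1.90, 1.90, 1.85,
           1.98, 1.53, 1.96, 1.86, 1.75, 1.85, 1.81, 1.82, 1.87, 1.88]
people = list(zip(weights, heights))


def prob(event):
    """Relative frequency of individuals (x, y) for which `event` is true."""
    return Fraction(sum(1 for x, y in people if event(x, y)), len(people))


# Route 1: the definition P(A | B) = P(A and B) / P(B).
p_joint = prob(lambda x, y: x > 50 and y > 1.8)   # 13/20
p_b = prob(lambda x, y: y > 1.8)                  # 15/20
p_a_given_b = p_joint / p_b

# Route 2: keep only individuals with Y > 1.8, then count X > 50 among them.
tall = [(x, y) for x, y in people if y > 1.8]
p_direct = Fraction(sum(1 for x, y in tall if x > 50), len(tall))

print(p_a_given_b, p_direct)  # 13/15 13/15
```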

Intersection Probability Using the Conditional Probability

The probability that two events occur together:

$P(A \cap B) = P(A) P(B \mid A) = P(B) P(A \mid B)$

P(X > 50 | Y > 1.8) = 13/15
P(Y > 1.8) = 15/20
P(X > 50) = 17/20
P(X > 50 and Y > 1.8) = P(Y > 1.8) P(X > 50 | Y > 1.8) = (15/20)(13/15) = 13/20

Sometimes we know the conditional probability, e.g.
P(Lung Cancer and Smoking) = P(Smoking) P(Lung Cancer | Smoking)
where P(Smoking) comes from general population surveys and P(Lung Cancer | Smoking) comes from a retrospective study of smokers.

Nesting and Recruitment of Birds Example

Basic Experiment: Observe 1000 wading bird nests in the Everglades. Record the number of nests in which eggs were laid, when the eggs were laid, and whether hatched chicks survived.

Event A = Nest is successful (chick survives).
Event B = Eggs laid in March.

P(A) = fraction of nests that are successful (300/1000 = .3)
P(B) = fraction of nests with eggs laid in March (100/1000 = .1)
$P(A \cap B)$ = fraction of nests that are successful and whose eggs were laid in March (60/1000 = .06)

P(A|B) = probability of a successful nest given that the eggs in the nest were laid in March
$P(A \mid B) = P(A \cap B)/P(B) = 0.06/0.10 = 0.60$ (= 60/100)

P(B|A) = probability that the eggs in the nest were laid in March given that the nest is successful
$P(B \mid A) = P(A \cap B)/P(A) = 0.06/0.30 = 0.20$ (= 60/300)

Independent Events

Events A and B are said to be independent if:

$P(A \mid B) = P(A)$, or
$P(B \mid A) = P(B)$, or
$P(A \cap B) = P(A) P(B)$

(The probability of one event does not depend on what happens with the other event.)

Event A: Weight (W) is greater than 50 kg.
Event B: Height (H) is greater than 1.8 m.
Are A and B independent events?

P(A) = P(W > 50) = 17/20 = .85
P(A|B) = P(W > 50 | H > 1.8) = 13/15 = .8667
Close, but NO; in this sample P(A|B) is not equal to P(A), so the events are dependent!
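The independence conditions can be evaluated on the sample with exact fractions. This is a sketch under the assumption that weights and heights are paired row by row; `prob` is a hypothetical helper. It compares sample relative frequencies only and says nothing about whether the difference would be statistically significant.

```python
from fractions import Fraction

# The lecture's sample: weight W (kg) and height H (m), paired by individual.
weights = [43.5, 45.2, 48.4, 51.8, 53.0, 55.2, 57.2, 59.3, 61.0, 61.4,
           63.4, 65.2, 65.6, 67.8, 68.0, 68.3, 68.5, 76.2, 76.3, 84.7]
heights = [1.76, 1.90, 1.86, 1.83, 1.61, 1.53, 1.81, 1.90, 1.90, 1.85,
           1.98, 1.53, 1.96, 1.86, 1.75, 1.85, 1.81, 1.82, 1.87, 1.88]
people = list(zip(weights, heights))


def prob(event):
    """Relative frequency of individuals (w, h) for which `event` is true."""
    return Fraction(sum(1 for w, h in people if event(w, h)), len(people))


p_a = prob(lambda w, h: w > 50)                      # 17/20
p_b = prob(lambda w, h: h > 1.8)                     # 15/20
p_a_and_b = prob(lambda w, h: w > 50 and h > 1.8)    # 13/20
p_a_given_b = p_a_and_b / p_b                        # 13/15

# Independence would require both comparisons below to be equalities.
print(p_a_given_b == p_a)        # False: 13/15 versus 17/20
print(p_a * p_b)                 # 51/80, which is not 13/20
print(p_a_and_b == p_a * p_b)    # False
```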

This will become more important later when we develop tests for independence.

Other Simple Experiments

Toss a coin 5 times, count the number of heads. Outcome = number of heads.
Throw six dice. Outcome = sum of pips facing up.
Throw two dice. Outcome = sum of pips facing up.
Toss a die until a 5 or 6 occurs. Outcome = number of tosses needed minus one.
Toss a frame on spatial point patterns. Outcome = number of points in the frame.
Draw a chip at random with replacement from a bag. Outcome = value written on chip.

Replicate each experiment many times. Make a frequency chart of results. Discuss events and probabilities from the distribution.

Probability Distribution

A probability distribution (function) is a list of the probabilities of the values (simple outcomes) of a random variable. It can be envisioned as a table.

Table: Number of heads in two tosses of a coin

y (outcome)           0      1      2
P(y) (probability)    1/4    2/4    1/4

For some experiments, the probability of a simple outcome may be easily calculated mathematically using a specific probability function. If y is a simple outcome and p(y) is its probability, then

$0 \le p(y) \le 1$ and $\sum_{\text{all } y} p(y) = 1$
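The table for two coin tosses can be rebuilt by listing every equally likely outcome and counting heads. The function name `heads_distribution` is a hypothetical choice for this sketch, which also checks the two properties a probability distribution must satisfy.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses):
    """Probability of each possible number of heads in `n_tosses` fair tosses."""
    outcomes = list(product("HT", repeat=n_tosses))
    counts = Counter(o.count("H") for o in outcomes)
    return {y: Fraction(c, len(outcomes)) for y, c in sorted(counts.items())}


dist = heads_distribution(2)
for y, p in dist.items():
    print(y, p)  # 0 1/4, then 1 1/2 (2/4 in lowest terms), then 2 1/4

# Each probability lies between 0 and 1, and together they sum to 1.
print(all(0 <= p <= 1 for p in dist.values()))  # True
print(sum(dist.values()) == 1)                  # True
```

Calling `heads_distribution(5)` gives the distribution for the first of the simple experiments listed earlier (number of heads in 5 tosses).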

The Probability Density Function for Weight

Each of the weight values is unique, hence the probability we would assign to each observation, in the absence of any other information, would simply be 1/n (here 1/20 = 0.05). Clearly this is not very informative. A histogram would provide a better estimate of the true underlying probabilities for the entire population, by binning the values. A more sophisticated approach would involve kernel density-based smoothing, in order to estimate the underlying continuous probability density function curve. We will discuss this concept in the next lecture.

Weight (kg)   P(Y)
43.5          0.05
45.2          0.05
48.4          0.05
51.8          0.05
53.0          0.05
55.2          0.05
57.2          0.05
59.3          0.05
61.0          0.05
61.4          0.05
63.4          0.05
65.2          0.05
65.6          0.05
67.8          0.05
68.0          0.05
68.3          0.05
68.5          0.05
76.2          0.05
76.3          0.05
84.7          0.05
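As a rough illustration of the histogram idea, the weights can be grouped into bins and each bin assigned its relative frequency. The 10 kg bin width and the function name `binned_probabilities` are assumptions made for this sketch; the lecture does not specify a binning.

```python
from collections import Counter
from fractions import Fraction

# The 20 observed weights (kg) from the lecture's sample.
weights = [43.5, 45.2, 48.4, 51.8, 53.0, 55.2, 57.2, 59.3, 61.0, 61.4,
           63.4, 65.2, 65.6, 67.8, 68.0, 68.3, 68.5, 76.2, 76.3, 84.7]


def binned_probabilities(values, width=10):
    """Relative frequency of values in bins [lo, lo + width).

    The default width of 10 is an illustrative assumption.
    """
    counts = Counter(int(v // width) * width for v in values)
    n = len(values)
    return {(lo, lo + width): Fraction(c, n) for lo, c in sorted(counts.items())}


hist = binned_probabilities(weights)
for (lo, hi), p in hist.items():
    print(f"{lo}-{hi} kg: {p}")
# 40-50 kg: 3/20
# 50-60 kg: 1/4
# 60-70 kg: 9/20
# 70-80 kg: 1/10
# 80-90 kg: 1/20
```

Unlike the flat 1/n assignment, the binned estimate shows that most of the sample falls between 60 and 70 kg.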