# Chapter 3, part C - Hanover College

## III. Uses of means and standard deviations

We don't calculate measures of location and dispersion just because we can; they have very important uses.

### A. Z-scores

A z-score measures the relative location of an item in the data set: it is the number of standard deviations an observation lies from the mean.

$$z_i = \frac{x_i - \bar{x}}{s}$$
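The z-score formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function names `z_score` and `z_scores` and the `prices` sample are assumptions made up for the example, and the sample standard deviation (dividing by n - 1) is assumed for s.

```python
from statistics import mean, stdev


def z_score(x, x_bar, s):
    """Return how many standard deviations x lies from the mean x_bar."""
    if s <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - x_bar) / s


def z_scores(data):
    """Return the z-score of every observation in data."""
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    return [z_score(x, x_bar, s) for x in data]


# Hypothetical sample, invented for illustration only.
prices = [150, 175, 200, 225, 250, 300]
for price, z in zip(prices, z_scores(prices)):
    print(f"{price}: z = {z:+.2f}")
```

A negative z-score marks an observation below the mean and a positive one marks an observation above it; the z-scores of a whole sample always sum to zero.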

For example, the airline price of \$175 has a z-score of $(175 - 219)/45.47 = -0.97$. This means that a price of \$175 falls almost one standard deviation below the mean.

### B. Chebyshev's Theorem

Chebyshev's theorem: At least $(1 - 1/k^2)$ of the items in a data set must be within $k$ standard deviations of the mean, where $k$ is any value greater than 1. In other words, the theorem gives the minimum percentage of items that must lie within a specified number of standard deviations of the mean.

Implications: If $k = 2$, at least 75% of the data lie within 2 standard deviations of the mean. How? $1 - 1/4 = 0.75$, or 75%. If $k = 3$, this fraction rises to at least 89% of the data ($1 - 1/9 \approx 0.89$).

If $k = 4$, this fraction rises to at least 94% of the data ($1 - 1/16 \approx 0.94$).

Example: A microeconomics exam has a mean of 72 with a standard deviation of 4. What percentage of the class scores between 64 and 80 on the exam? Calculate the z-scores for both 64 and 80 to find $k$, and then use Chebyshev's theorem to answer the question. Here $(64 - 72)/4 = -2$ and $(80 - 72)/4 = 2$, so $k = 2$ and at least 75% of the class scores between 64 and 80.

### C. The Empirical Rule

If the data are normally distributed (bell-shaped), the empirical rule tells us that:

- Approximately 68% of the data will be within 1 standard deviation of the mean.
- Approximately 95% of the data will be within 2 standard deviations of the mean.
- Almost all of the data (about 99.7%) will be within 3 standard deviations of the mean.

### D. Detecting Outliers

The empirical rule says that almost all observations will fall within 3 standard deviations of the mean.

Thus, if an observation has a z-score greater than 3 in absolute value, it may be considered an outlier.

What to do about an outlier? If it's an erroneous value (e.g., a typo), try to correct it. If it's valid data, arguments can be made both for and against dropping it from the sample.
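As a recap of parts B and D, the sketch below computes the Chebyshev lower bound for a given k and flags observations whose z-score exceeds 3 in absolute value. It is an illustration under stated assumptions rather than a method from the notes: the names `chebyshev_lower_bound` and `flag_outliers` are hypothetical, the `scores` sample is invented, and the sample standard deviation is assumed for s.

```python
from statistics import mean, stdev


def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2


def flag_outliers(data, threshold=3.0):
    """Return the observations whose |z-score| exceeds threshold."""
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    if s == 0:
        return []  # all values identical, so nothing can be an outlier
    return [x for x in data if abs((x - x_bar) / s) > threshold]


# Chebyshev bounds for the k values discussed in part B.
for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.0%} of the data")

# Hypothetical exam scores, invented for illustration only;
# the value 120 plays the role of a typo-like entry.
scores = [70, 72, 74, 68, 76, 71, 73, 69, 75, 72,
          70, 74, 72, 71, 73, 68, 76, 72, 70, 74, 120]
print(flag_outliers(scores))
```

The function only flags candidates; whether a flagged value is corrected, kept, or dropped remains the judgment call described above.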

