Module 28
Sample Size Determination

This module explores the process of estimating the sample size required to detect differences of a specified magnitude in three common circumstances.

Reviewed 19 July 05 / Module 28

The General Situation

An important issue in planning a new study is determining the sample size required to meet certain conditions. For example, for a study dealing with blood cholesterol levels, these conditions are typically expressed in terms such as:

How large a sample do I need to be able to reject the null hypothesis that two population means are equal if the difference between them is d = 10 mg/dl?

The General Approach

We focus on the sample size required to test a specific hypothesis. In general, there is a formula for calculating the sample size for the test statistic appropriate to a specified hypothesis. Typically, these formulae require the user to specify the α-level and the desired Power = (1 - β), as well as the difference to be detected and the variability of the measure in question.

Importantly, it is usually wise not to calculate a single number for the sample size. Rather, calculate a range of values by varying the assumptions so that you can get a sense of their impact on the projected sample size. Then you can pick a more suitable sample size from this range.

Three Common Situations

In this module, we examine the process of estimating sample size for three common circumstances:

1. One-sample t-test and paired t-test,
2. Two-sample t-test, and
3. Comparison of P1 versus P2 with a z-test.

The tools required for these three situations are broadly applicable and cover many of the circumstances that are typically encountered. There are sophisticated software packages that cover much more than these three, and most professional biostatisticians have them readily available.

1. One-sample t-test and Paired t-test

For testing the hypothesis H0: μ = k vs. H1: μ ≠ k with a two-tailed test, the formula is:

$$ n = \left[ \frac{(z_{1-\alpha/2} + z_{1-\beta})\,\sigma}{d} \right]^2 $$

Note: this formula is used even though the test statistic could be a t-test.

One-Sample Example

We are interested in the size for a sample from a population of blood cholesterol levels. We know that σ is typically about 30 mg/dl for these populations. The following tables show sample sizes for different levels of some of the factors included in the equation for a one-sample t-test for differences between a specified population mean and the true mean.

One-Sample Example (contd.)

α = 0.05, σ = 25, d = 5.0, Power = 0.80

$$ n = \left[ \frac{(z_{1-\alpha/2} + z_{1-\beta})\,\sigma}{d} \right]^2 = \left[ \frac{(1.96 + 0.842)\,25}{5} \right]^2 = (14.01)^2 = 196.28 $$

Rounding up gives n = 197. The tables that follow list values rounded to the nearest integer, so they show 196 for this case.

Sample Size for One-Sample t-test

In each table, columns give Power = 1 - β with $z_{1-\beta}$ in parentheses, and rows give d.

Blood Cholesterol Levels: α = 0.05, σ = 25

| d | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|
| 0.5 | 9,604 | 19,628 | 22,440 | 26,276 | 32,490 |
| 1.0 | 2,401 | 4,907 | 5,610 | 6,569 | 8,123 |
| 3.0 | 267 | 545 | 623 | 730 | 903 |
| 5.0 | 96 | 196 | 224 | 263 | 325 |
| 10.0 | 24 | 49 | 56 | 66 | 81 |
| 20.0 | 6 | 12 | 14 | 16 | 20 |
| 30.0 | 3 | 5 | 6 | 7 | 9 |

Blood Cholesterol Levels: α = 0.05, σ = 30

| d | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|
| 0.5 | 13,830 | 28,264 | 32,314 | 37,838 | 46,786 |
| 1.0 | 3,457 | 7,066 | 8,078 | 9,460 | 11,696 |
| 3.0 | 384 | 785 | 898 | 1,051 | 1,300 |
| 5.0 | 138 | 283 | 323 | 378 | 468 |
| 10.0 | 35 | 71 | 81 | 95 | 117 |
| 20.0 | 9 | 18 | 20 | 24 | 29 |
| 30.0 | 4 | 8 | 9 | 11 | 13 |

Blood Cholesterol Levels: α = 0.05, σ = 35

| d | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|
| 0.5 | 18,824 | 38,471 | 43,982 | 51,502 | 63,681 |
| 1.0 | 4,706 | 9,618 | 10,996 | 12,875 | 15,920 |
| 3.0 | 523 | 1,069 | 1,222 | 1,431 | 1,769 |
| 5.0 | 188 | 385 | 440 | 515 | 637 |
| 10.0 | 47 | 96 | 110 | 129 | 159 |
| 20.0 | 12 | 24 | 27 | 32 | 40 |
| 30.0 | 5 | 11 | 12 | 14 | 18 |

2. Two Sample t-test

For the hypothesis H0: μ1 = μ2 vs. H1: μ1 ≠ μ2 with a two-tailed t-test, the formula is:

$$ N = n_1 + n_2 = \frac{4\sigma^2 (z_{1-\alpha/2} + z_{1-\beta})^2}{(d = \mu_1 - \mu_2)^2} $$

Sample Size for Testing Two-Tailed t-test

H0: μ1 = μ2 vs. H1: μ1 ≠ μ2

How large a sample would be needed for comparing two approaches to cholesterol lowering using α = 0.05, to detect a difference of d = 20 mg/dl or more with Power = 1 - β = 0.90?

The formula is:

$$ N = n_1 + n_2 = \frac{4\sigma^2 (z_{1-\alpha/2} + z_{1-\beta})^2}{(d = \mu_1 - \mu_2)^2} $$

Note: Textbooks do not always clearly indicate whether the formula they provide is for one group only or for both groups combined.

When σ = 30 mg/dl, β = 0.10, α = 0.05: $z_{1-\alpha/2} = 1.96$, Power = 1 - β, $z_{1-\beta} = 1.282$, d = 20 mg/dl

$$ N = n_1 + n_2 = \frac{4(30)^2 (1.96 + 1.282)^2}{(20)^2} = \frac{4 \times 900 \times (3.242)^2}{400} = \frac{37{,}838.03}{400} $$

N = 94.6. Hence about 50 for each group.

Sample Sizes: σ = 25 mg/dl, α = 0.05

Columns give Power = 1 - β with $z_{1-\beta}$ in parentheses; rows give d.

| d | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|
| 0.5 | 38,416 | 78,512 | 89,760 | 105,106 | 129,960 |
| 1 | 9,604 | 19,628 | 22,440 | 26,276 | 32,490 |
| 3 | 1,067 | 2,181 | 2,493 | 2,920 | 3,610 |
| 5 | 384 | 785 | 898 | 1,051 | 1,300 |
| 10 | 96 | 196 | 224 | 263 | 325 |
| 20 | 24 | 49 | 56 | 66 | 81 |
| 30 | 11 | 22 | 25 | 29 | 36 |
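
The two-sample formula and the σ = 25 table above can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the function name `two_sample_total_n` and the `Z_BETA` lookup are hypothetical names introduced here for illustration, and the z-values are the rounded ones tabulated in this module rather than values computed from the normal distribution.

```python
import math

# z-values as tabulated in this module (alpha = 0.05, two-tailed).
Z_ALPHA = 1.96
Z_BETA = {0.5: 0.0, 0.8: 0.842, 0.85: 1.036, 0.9: 1.282, 0.95: 1.645}


def two_sample_total_n(sigma, d, z_alpha=Z_ALPHA, z_beta=0.842):
    """Total N = n1 + n2 = 4 * sigma^2 * (z_alpha + z_beta)^2 / d^2."""
    return 4 * sigma**2 * (z_alpha + z_beta) ** 2 / d**2


# Worked example: sigma = 30 mg/dl, d = 20 mg/dl, Power = 0.90.
total = two_sample_total_n(sigma=30, d=20, z_beta=Z_BETA[0.9])
per_group = math.ceil(total / 2)  # split evenly between groups, rounding up
print(f"N = {total:.1f}, i.e. at least {per_group} per group")

# Vary the assumptions: one row per d, one column per power level (sigma = 25).
for d in (0.5, 1, 3, 5, 10, 20, 30):
    row = [round(two_sample_total_n(25, d, z_beta=z)) for z in Z_BETA.values()]
    print(d, row)
```

Splitting the total evenly gives a little under 50 per group, which the text rounds to "about 50". Looping over d and power, as in the last few lines, is one way to follow the earlier advice to look at a range of sample sizes rather than a single number.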

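Sample Sizes: σ = 30 mg/dl, α = 0.05

Columns give Power = 1 - β with $z_{1-\beta}$ in parentheses; rows give d.

| d | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|
| 0.5 | 55,319 | 113,057 | 129,255 | 151,352 | 187,143 |
| 1 | 13,830 | 28,264 | 32,314 | 37,838 | 46,786 |
| 3 | 1,537 | 3,140 | 3,590 | 4,204 | 5,198 |
| 5 | 553 | 1,131 | 1,293 | 1,514 | 1,871 |
| 10 | 138 | 283 | 323 | 378 | 468 |
| 20 | 35 | 71 | 81 | 95 | 117 |
| 30 | 15 | 31 | 36 | 42 | 52 |

Sample Sizes: σ = 35 mg/dl, α = 0.05

| d | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|
| 0.5 | 75,295 | 153,884 | 175,930 | 206,007 | 254,722 |
| 1 | 18,824 | 38,471 | 43,982 | 51,502 | 63,681 |
| 3 | 2,092 | 4,275 | 4,887 | 5,722 | 7,076 |
| 5 | 753 | 1,539 | 1,759 | 2,060 | 2,547 |
| 10 | 188 | 385 | 440 | 515 | 637 |
| 20 | 47 | 96 | 110 | 129 | 159 |
| 30 | 21 | 43 | 49 | 57 | 71 |

3. Two-sample proportions

H0: P1 = P2 vs. H1: P1 ≠ P2

$$ N = n_1 + n_2 = \frac{4 (z_{1-\alpha/2} + z_{1-\beta})^2 \left[ \frac{P_1 + P_2}{2} \right] \left[ 1 - \frac{P_1 + P_2}{2} \right]}{(d = P_1 - P_2)^2} $$

Example: d = P1 - P2 = 0.7 - 0.5 = 0.2

When β = 0.10, α = 0.05: $z_{1-\alpha/2} = 1.96$, Power = 1 - β, $z_{1-\beta} = 1.282$

(P1 + P2)/2 = (0.7 + 0.5)/2 = 0.6

$$ N = (n_1 + n_2) = \frac{4 (1.96 + 1.282)^2 (0.6)(1 - 0.6)}{(0.2)^2} = \frac{4 (3.242)^2 (0.6)(0.4)}{(0.2)^2} = \frac{10.09}{0.04} = 252.25 $$

N = 252.25

Sample size for testing P1 - P2 with α = 0.05

In each table, columns give Power = 1 - β with $z_{1-\beta}$ in parentheses.

P1 - P2 = 0.1

| P1 | P2 | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|---|
| 0.9 | 0.8 | 196 | 400 | 458 | 536 | 663 |
| 0.8 | 0.7 | 288 | 589 | 673 | 788 | 975 |
| 0.7 | 0.6 | 350 | 714 | 817 | 956 | 1,183 |
| 0.6 | 0.5 | 380 | 777 | 889 | 1,041 | 1,287 |
| 0.5 | 0.4 | 380 | 777 | 889 | 1,041 | 1,287 |
| 0.4 | 0.3 | 350 | 714 | 817 | 956 | 1,183 |
| 0.3 | 0.2 | 288 | 589 | 673 | 788 | 975 |
| 0.2 | 0.1 | 196 | 400 | 458 | 536 | 663 |
| 0.1 | 0.0 | 73 | 149 | 171 | 200 | 247 |

P1 - P2 = 0.2

| P1 | P2 | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|---|
| 0.9 | 0.7 | 61 | 126 | 144 | 168 | 208 |
| 0.8 | 0.6 | 81 | 165 | 188 | 221 | 273 |
| 0.7 | 0.5 | 92 | 188 | 215 | 252 | 312 |
| 0.6 | 0.4 | 96 | 196 | 224 | 263 | 325 |
| 0.5 | 0.3 | 92 | 188 | 215 | 252 | 312 |
| 0.4 | 0.2 | 81 | 165 | 188 | 221 | 273 |
| 0.3 | 0.1 | 61 | 126 | 144 | 168 | 208 |
| 0.2 | 0.0 | 35 | 71 | 81 | 95 | 117 |

P1 - P2 = 0.3

| P1 | P2 | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|---|
| 0.9 | 0.6 | 32 | 65 | 75 | 88 | 108 |
| 0.8 | 0.5 | 39 | 79 | 91 | 106 | 131 |
| 0.7 | 0.4 | 42 | 86 | 99 | 116 | 143 |
| 0.6 | 0.3 | 42 | 86 | 99 | 116 | 143 |
| 0.5 | 0.2 | 39 | 79 | 91 | 106 | 131 |
| 0.4 | 0.1 | 32 | 65 | 75 | 88 | 108 |
| 0.3 | 0.0 | 22 | 44 | 51 | 60 | 74 |

P1 - P2 = 0.4

| P1 | P2 | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|---|
| 0.9 | 0.5 | 20 | 41 | 47 | 55 | 68 |
| 0.8 | 0.4 | 23 | 47 | 54 | 63 | 78 |
| 0.7 | 0.3 | 24 | 49 | 56 | 66 | 81 |
| 0.6 | 0.2 | 23 | 47 | 54 | 63 | 78 |
| 0.5 | 0.1 | 20 | 41 | 47 | 55 | 68 |
| 0.4 | 0.0 | 15 | 31 | 36 | 42 | 52 |

P1 - P2 = 0.5

| P1 | P2 | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|---|
| 0.9 | 0.4 | 14 | 29 | 33 | 38 | 47 |
| 0.8 | 0.3 | 15 | 31 | 36 | 42 | 51 |
| 0.7 | 0.2 | 15 | 31 | 36 | 42 | 51 |
| 0.6 | 0.1 | 14 | 29 | 33 | 38 | 47 |
| 0.5 | 0.0 | 12 | 24 | 27 | 32 | 39 |

P1 - P2 = 0.6

| P1 | P2 | 0.5 (0) | 0.8 (0.842) | 0.85 (1.036) | 0.9 (1.282) | 0.95 (1.645) |
|---|---|---|---|---|---|---|
| 0.9 | 0.3 | 10 | 21 | 24 | 28 | 35 |
| 0.8 | 0.2 | 11 | 22 | 25 | 29 | 36 |
| 0.7 | 0.1 | 10 | 21 | 24 | 28 | 35 |
| 0.6 | 0.0 | 9 | 18 | 21 | 25 | 30 |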
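
The two-proportion formula and its worked example (P1 = 0.7, P2 = 0.5) can be checked with a short Python sketch. The function name `two_proportion_total_n` and the `Z_BETA` lookup are hypothetical names chosen for illustration, and the z-values are the rounded ones tabulated in this module.

```python
# z-values as tabulated in this module (alpha = 0.05, two-tailed).
Z_ALPHA = 1.96
Z_BETA = {0.5: 0.0, 0.8: 0.842, 0.85: 1.036, 0.9: 1.282, 0.95: 1.645}


def two_proportion_total_n(p1, p2, z_alpha=Z_ALPHA, z_beta=0.842):
    """Total N = n1 + n2 = 4 (z_alpha + z_beta)^2 * pbar * (1 - pbar) / d^2,
    where pbar = (p1 + p2) / 2 and d = p1 - p2."""
    p_bar = (p1 + p2) / 2
    d = p1 - p2
    return 4 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / d**2


# Worked example: P1 = 0.7, P2 = 0.5, Power = 0.90 (roughly 252 in total).
example_n = two_proportion_total_n(0.7, 0.5, z_beta=Z_BETA[0.9])
print(f"N = {example_n:.2f}")

# Vary the power assumption for the same pair of proportions.
by_power = {power: round(two_proportion_total_n(0.7, 0.5, z_beta=z))
            for power, z in Z_BETA.items()}
print(by_power)
```

The dictionary printed at the end corresponds to the P1 = 0.7, P2 = 0.5 row of the P1 - P2 = 0.2 table, which gives a quick sense of how much the required sample grows as the desired power increases.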