# Lecture 14 chi-square test, P-value - UCLA Statistics

• Measurement error (review from lecture 13)
• Null hypothesis; alternative hypothesis
• Evidence against the null hypothesis
• Measuring the strength of evidence by P-value
• Pre-setting the significance level
• Conclusion
• Confidence interval

## Some general thoughts about hypothesis testing

• A claim is any statement made about the truth; it could be a theory proposed by a scientist, or a statement from a prosecutor, a manufacturer, or a consumer.
• Data cannot prove a claim, because there may be other data that contradict it.
• Data can be used to reject a claim if they contradict what the claim leads us to expect.
• Put the claim in the null hypothesis $H_0$.
• Come up with an alternative hypothesis and put it as $H_1$.
• Study the data and find a test statistic: an informative summary of the data that is most relevant in differentiating $H_1$ from $H_0$. The test statistic is chosen through experience or statistical training; it depends on the formulation of the problem and on how the data relate to the hypothesis.
• Find the strength of evidence by the P-value: assuming $H_0$ is true, compute the probability that the test statistic from a future set of data will be as large as or larger than the one obtained from the current data.
• If the P-value is very small, then either the null hypothesis is false or you are extremely unlucky, so a statistician will argue that this is strong evidence against the null hypothesis.
• If the P-value is smaller than a pre-specified level (called the significance level, 5% for example), then the null hypothesis is rejected.

## Back to the microarray example

• $H_0$: true SD $\sigma = 0.1$ (denote 0.1 by $\sigma_0$)
• $H_1$: true SD $\sigma > 0.1$ (because this is the main concern; you don't care if the SD is small)

Summary: sample SD $s = \sqrt{\text{sum of squares}/(n-1)} \approx 0.18$, where sum of squares $= (1.1-1.3)^2 + (1.2-1.3)^2 + (1.4-1.3)^2 + (1.5-1.3)^2 = 0.1$ and $n = 4$.

The ratio $s/\sigma_0 \approx 1.8$; is it too big?

The P-value consideration: suppose a future data set ($n = 4$) will be collected. Let $s$ be the sample SD from this future data set; it is random. What is the probability that $s/\sigma_0$ will be as big as or bigger than 1.8, that is, $P(s/\sigma_0 > 1.8)$?

To find this probability we need the chi-square distribution. Recall that sum of squares / true variance follows a chi-square distribution. Therefore, equivalently, we compute

$$P\left(\frac{\text{future sum of squares}}{\sigma_0^2} > \frac{\text{sum of squares from the currently available data}}{\sigma_0^2}\right),$$

where $\sigma_0$ is the value claimed under the null hypothesis. Once again, if the data were generated again, then sum of squares / true variance is random and follows a chi-square distribution with $n-1$ degrees of freedom, where the sum of squares is the sum of squared distances between each data point and the sample mean.

Note: sum of squares $= (n-1) \times$ sample variance $= (n-1)(\text{sample SD})^2$.

The value computed from the available data is $0.10/0.01 = 10$ (sum of squares $= 0.1$, true variance under $H_0$ $= 0.1^2 = 0.01$), so

P-value $= P(\text{chi-square random variable} > \text{value computed from data}) = P(\text{chi-square random variable} > 10.0)$.

For our case $n = 4$, so look at the chi-square distribution with df $= 3$. From the table, the upper-tail cutoffs are 9.348 (tail probability .025) and 11.34 (tail probability .01). Since 10 lies between them, the P-value is between .01 and .025, and we reject the null hypothesis at the 5% significance level.
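The microarray calculation above can be checked numerically. The following is a minimal sketch, not part of the lecture: it uses only the Python standard library, and the helper name `chi2_sf_df3` is a hypothetical name introduced here. It relies on a closed-form expression for the chi-square upper-tail probability that holds only for 3 degrees of freedom, which is the case in this example ($n = 4$).

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df.

    Closed form valid for df = 3 only (assumption of this sketch):
    P(X > x) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# The four measurements from the microarray example.
data = [1.1, 1.2, 1.4, 1.5]
n = len(data)
sigma0 = 0.1  # SD claimed under the null hypothesis

sample_mean = sum(data) / n
sum_of_squares = sum((x - sample_mean) ** 2 for x in data)
sample_sd = math.sqrt(sum_of_squares / (n - 1))

# Test statistic: sum of squares divided by the variance claimed under H0.
statistic = sum_of_squares / sigma0 ** 2
p_value = chi2_sf_df3(statistic)

print(f"sample SD = {sample_sd:.3f}")
print(f"statistic = {statistic:.2f}")
print(f"P-value   = {p_value:.4f}")
```

The computed P-value should fall between .01 and .025, in agreement with the table lookup. For other sample sizes the closed form no longer applies, and a general chi-square routine from a statistics library would be needed instead.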

## Confidence interval

A 95% confidence interval for the true variance $\sigma^2$ is

$$\left(\frac{\text{sum of squares}}{C_2},\ \frac{\text{sum of squares}}{C_1}\right),$$

where $C_1$ and $C_2$ are the cutoff points from the chi-square table with d.f. $= n-1$ such that

$$P(\text{chi-square random variable} > C_1) = .975, \qquad P(\text{chi-square random variable} > C_2) = .025.$$

This interval is derived from $P(C_1 < \text{sum of squares}/\sigma^2 < C_2) = .95$.
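As an illustration of the interval formula, here is a minimal sketch that applies it to the microarray data (sum of squares 0.1, $n = 4$, so df = 3). It is an assumption-laden example rather than the lecture's own computation: instead of reading $C_1$ and $C_2$ from a table, it finds them by bisection on a closed-form upper-tail probability that is valid only for 3 degrees of freedom. The function names `chi2_sf_df3` and `chi2_upper_cut_df3` are hypothetical.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for chi-square with 3 df (df = 3 only)."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def chi2_upper_cut_df3(tail_prob, lo=0.0, hi=100.0):
    """Find c with P(X > c) = tail_prob by bisection.

    The upper-tail probability decreases as c grows, so we move the lower
    bound up while the tail probability is still too large.
    """
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_sf_df3(mid) > tail_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

sum_of_squares = 0.10  # from the microarray example, n = 4

c1 = chi2_upper_cut_df3(0.975)  # P(chi-square > C1) = .975
c2 = chi2_upper_cut_df3(0.025)  # P(chi-square > C2) = .025

# 95% confidence interval for the true variance, and for the true SD.
var_low, var_high = sum_of_squares / c2, sum_of_squares / c1
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"C1 = {c1:.3f}, C2 = {c2:.3f}")
print(f"95% CI for variance: ({var_low:.4f}, {var_high:.4f})")
print(f"95% CI for SD:       ({sd_low:.3f}, {sd_high:.3f})")
```

Taking square roots of the endpoints gives an interval for the SD because the square root is an increasing function. With only four observations the interval is wide, which reflects how little information a sample of this size carries about the variance.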
