
Statistics 512 Notes 7: Hypothesis Testing Continued

Testing a normal mean

Example: A highway patrol officer believes that the average speed of cars traveling over a certain stretch of highway exceeds the posted limit of 55 mph. The speeds of a random sample of 200 cars were recorded. The standard deviation of speeds is known to be 5 mph, and the distribution of speeds is assumed to be normal. Test the patrol officer's claim.
[Figure: histogram of Speeds; horizontal axis from 40 to 65 mph]

Moments (Speeds)
Mean              55.8
Std Dev           4.4676391
Std Err Mean      0.3159098
upper 95% Mean    56.42296
lower 95% Mean    55.17704
N                 200

Suppose $X_1, \ldots, X_n$ are iid $N(\mu, \sigma^2)$ with the variance $\sigma^2$ known. We want to test $H_0: \mu = \mu_0$ vs. $H_1: \mu > \mu_0$. Consider the test statistic

$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

and critical region $C = \{z : z \ge c\}$. What do we need to choose $c$ to be so that the size of the test is 0.05? We have

$$P_{\mu=\mu_0}\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge c\right) = P(Z \ge c),$$

where $Z$ is a standard normal random variable. Thus, we want to choose $c$ to be the 0.95 quantile of the standard normal distribution, which equals 1.645.

For the speed limit data,

$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{55.8 - 55}{5/\sqrt{200}} = 2.26.$$

Since $z > 1.645$, we reject the null hypothesis; there is strong evidence that the average speed is above 55 mph.
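The calculation above can be reproduced in a few lines. The following is a minimal sketch using only the Python standard library; the function name `one_sided_z_test` is a hypothetical helper introduced for illustration, not something from the notes.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Test H0: mu = mu0 vs. H1: mu > mu0 when sigma is known.

    Hypothetical helper for illustration; returns (z, c, reject).
    """
    z = (xbar - mu0) / (sigma / sqrt(n))   # test statistic
    c = NormalDist().inv_cdf(1 - alpha)    # (1 - alpha) quantile of N(0, 1)
    return z, c, z >= c                    # reject when z is in {z : z >= c}

# Speed data from the example: sample mean 55.8, known sigma 5, n = 200.
z, c, reject = one_sided_z_test(xbar=55.8, mu0=55, sigma=5, n=200)
print(f"z = {z:.2f}, c = {c:.3f}, reject H0: {reject}")
# prints: z = 2.26, c = 1.645, reject H0: True
```

Changing `alpha` changes only the cutoff `c`; the statistic `z` depends on the data and $\mu_0$ alone.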
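Suppose we wanted to test $H_0: \mu \le \mu_0$ vs. $H_1: \mu > \mu_0$. The size of the test with test statistic $z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ and critical region $C = \{z : z \ge c\}$ is

$$\max_{\mu \le \mu_0} P_\mu\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge c\right).$$

We have

$$P_\mu\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge c\right) = P_\mu\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \ge c + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right) = P\left(Z \ge c + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right) = 1 - \Phi\left(c + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right),$$

where $\Phi$ is the standard normal CDF. Because $P\left(Z \ge c + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right)$ is an increasing function of $\mu$, the maximum over $\mu \le \mu_0$ is attained at $\mu = \mu_0$, so the size of the test is

$$P_{\mu=\mu_0}\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge c\right).$$

Thus a test of size 0.05 for testing $H_0: \mu \le \mu_0$ vs. $H_1: \mu > \mu_0$ is the same as the test of size 0.05 for testing $H_0: \mu = \mu_0$ vs. $H_1: \mu > \mu_0$: the critical region is $C = \{z : z \ge 1.645\}$, where $z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$.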
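Power function: The power function of the test with critical region $C = \{z : z \ge 1.645\}$ is the following:

$$\gamma_C(\mu) = P_\mu\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge 1.645\right) = P_\mu\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} \ge 1.645 + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right)$$

$$= P_\mu\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \ge 1.645 + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right) = 1 - \Phi\left(1.645 + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right).$$

For $H_0: \mu = 0$ vs. $H_a: \mu > 0$ and $\sigma = 1$, the power function is shown below for n=10 and n=100.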

[Figure: Power when n=10; power (0.0 to 1.0) plotted against mu (-1.0 to 2.0)]

[Figure: Power when n=100; power (0.0 to 1.0) plotted against mu (-1.0 to 2.0)]
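The two power curves can be tabulated directly from the closed form $1 - \Phi\left(1.645 + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right)$. The sketch below assumes $\mu_0 = 0$ and $\sigma = 1$, as read from the setup for the plots; the function name `power` and the grid of $\mu$ values are illustrative choices, not part of the notes.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1

def power(mu, mu0=0.0, sigma=1.0, n=10, alpha=0.05):
    """Probability of rejecting H0 when the true mean is mu.

    Hypothetical helper; defaults (mu0 = 0, sigma = 1) are assumptions
    matching the setup described for the plots.
    """
    c = STD_NORMAL.inv_cdf(1 - alpha)        # about 1.645 for alpha = 0.05
    shift = (mu0 - mu) / (sigma / sqrt(n))   # (mu0 - mu) / (sigma / sqrt(n))
    return 1 - STD_NORMAL.cdf(c + shift)

# Tabulate power on a coarse grid of mu values for both sample sizes.
grid = (-0.5, 0.0, 0.25, 0.5, 1.0)
for n in (10, 100):
    row = ", ".join(f"mu={mu:+.2f}: {power(mu, n=n):.3f}" for mu in grid)
    print(f"n = {n:3d} -> {row}")
```

At $\mu = \mu_0$ the power equals the size, 0.05, for every $n$; for any fixed $\mu > \mu_0$ the power is larger with n=100 than with n=10, which is why the second curve rises more steeply.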

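Two-sided tests: Suppose we want to test $H_0: \mu = \mu_0$ vs. $H_1: \mu \ne \mu_0$. Using the test statistic $z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ still seems reasonable, but now it makes sense to reject for both very large and very small values of $z$. We can use a critical region of the form $C = \{z : |z| \ge c\}$. A test of size 0.05 has critical region $C = \{z : |z| \ge 1.96\}$ because

$$P_{\mu=\mu_0}\left(\left|\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right| \ge c\right) = P_{\mu=\mu_0}(|Z| \ge c)$$

and $P(|Z| \ge 1.96) = 0.05$.

Duality between tests and confidence intervals

Suppose we want to test $H_0: \mu = \mu_0$ vs. $H_1: \mu \ne \mu_0$ and use the rejection region $C = \{z : |z| \ge 1.96\}$. Then the set of $\mu_0$ for which $H_0: \mu = \mu_0$ is not rejected is

$$\left\{\mu_0 : \left|\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right| < 1.96\right\} = \left\{\mu_0 : -1.96 < \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < 1.96\right\} = \left\{\mu_0 : \bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu_0 < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right\},$$

which is the 95% confidence interval for $\mu$ that we have used.

In general, there is a duality between tests and confidence intervals. Suppose we have a family of tests of size $\alpha$ of $H_0: \theta = \theta_0$ vs. $H_1: \theta \ne \theta_0$ for each $\theta_0$. Then $\{\theta_0 : \text{test of } H_0: \theta = \theta_0 \text{ vs. } H_1: \theta \ne \theta_0 \text{ is not rejected}\}$ is a $(1 - \alpha)$ confidence interval for $\theta$.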
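The normal-mean case of this duality can be checked numerically. The sketch below reuses the speed data with known $\sigma = 5$ and verifies, for a few candidate values of $\mu_0$ chosen purely for illustration, that the two-sided test fails to reject exactly when $\mu_0$ lies inside the interval. The helper name `not_rejected` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Speed data from the example; sigma = 5 is treated as known.
xbar, sigma, n = 55.8, 5.0, 200
se = sigma / sqrt(n)
c = NormalDist().inv_cdf(0.975)   # about 1.96 for a size-0.05 two-sided test

# 95% confidence interval obtained by inverting the test. It uses the known
# sigma, so it differs slightly from the interval in the Moments table,
# which is based on the sample standard deviation.
lower, upper = xbar - c * se, xbar + c * se

def not_rejected(mu0):
    """True when the two-sided z-test of H0: mu = mu0 does not reject."""
    return abs((xbar - mu0) / se) < c

for mu0 in (55.0, 55.5, 56.0, 56.5):          # illustrative candidate values
    in_interval = lower < mu0 < upper
    assert in_interval == not_rejected(mu0)   # the duality, value by value
    print(f"mu0 = {mu0}: in CI = {in_interval}, not rejected = {not_rejected(mu0)}")

print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

In particular, $\mu_0 = 55$ falls outside the interval, in agreement with the rejection of the posted-limit hypothesis at the two-sided 0.05 level.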

Proof: Let $CI(X_1, \ldots, X_n) = \{\theta_0 : \text{test of } H_0: \theta = \theta_0 \text{ vs. } H_1: \theta \ne \theta_0 \text{ is not rejected}\}$. Then

$$P_{\theta_0}[\theta_0 \in CI(X_1, \ldots, X_n)] = 1 - \text{size}(\text{test of } H_0: \theta = \theta_0 \text{ vs. } H_1: \theta \ne \theta_0) = 1 - \alpha.$$

Conversely, suppose we have a $(1 - \alpha)$ confidence interval $CI(X_1, \ldots, X_n)$ for $\theta$. Then a test of size at most $\alpha$ of $H_0: \theta = \theta_0$ vs. $H_1: \theta \ne \theta_0$ is to reject the null hypothesis if and only if $\theta_0$ does not belong to the confidence region.

Proof: We have $P_{\theta_0}[\theta_0 \notin CI(X_1, \ldots, X_n)] \le \alpha$ because $CI(X_1, \ldots, X_n)$ is a $(1 - \alpha)$ confidence interval. Thus, the test is of size at most $\alpha$.

Large sample tests for mean

One of the issues that came up in a recent municipal election was the high cost of housing. A candidate seeking to unseat an incumbent claimed that the average family spends more than 30% of its annual income on housing. A housing expert was asked to investigate the claim. A random sample of 125 households was drawn, and each household was asked to report the percentage of household income spent on housing costs. Is there strong evidence in favor of the candidate's claim?

[Figure: histogram of Costs; horizontal axis from 15 to 50 (percent of income)]

Moments (Costs)
Mean              31.952
Std Dev           7.1907826
Std Err Mean      0.6431632
upper 95% Mean    33.225
lower 95% Mean    30.679
N                 125

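We want to test $H_0: \mu \le 30$ vs. $H_a: \mu > 30$. More generally, test $H_0: \mu \le \mu_0$ vs. $H_a: \mu > \mu_0$.

Test statistic:

$$t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}, \quad \text{where } S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

is the sample variance. Consider the test with critical region $\{t : t > c\}$. By the central limit theorem,

$$P_\mu\left(\frac{\bar{X} - \mu_0}{S/\sqrt{n}} > c\right) = P_\mu\left(\frac{\bar{X} - \mu_0}{S/\sqrt{n}} + \frac{\mu_0 - \mu}{S/\sqrt{n}} > c + \frac{\mu_0 - \mu}{S/\sqrt{n}}\right) = P_\mu\left(\frac{\bar{X} - \mu}{S/\sqrt{n}} > c + \frac{\mu_0 - \mu}{S/\sqrt{n}}\right) \approx 1 - \Phi\left(c + \frac{\mu_0 - \mu}{S/\sqrt{n}}\right).$$

Note that the approximate probability of rejecting the null hypothesis is an increasing function of $\mu$, so that the size is equal to the probability of rejecting the null hypothesis when $\mu = \mu_0$. Thus, if we choose $c = \Phi^{-1}(1 - \alpha) = z_\alpha$, where $\Phi$ is the standard normal CDF, then the approximate size of the test that has critical region $\{t : t > c\}$ is $1 - \Phi(\Phi^{-1}(1 - \alpha)) = \alpha$.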
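The large-sample procedure just derived needs only the sample mean, the sample standard deviation, and $n$. A minimal sketch follows, applied to the summary statistics in the Moments table for the housing costs; the function name `large_sample_test` is a hypothetical helper, and the result is only approximate because it rests on the central limit theorem.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_test(xbar, s, n, mu0, alpha=0.05):
    """Approximate size-alpha test of H0: mu <= mu0 vs. Ha: mu > mu0.

    Hypothetical helper for illustration; returns (t, z_alpha, reject).
    """
    t = (xbar - mu0) / (s / sqrt(n))           # statistic with S in place of sigma
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # c = Phi^{-1}(1 - alpha)
    return t, z_alpha, t > z_alpha             # critical region {t : t > c}

# Summary statistics from the Moments table for Costs.
t, z_alpha, reject = large_sample_test(xbar=31.952, s=7.1907826, n=125, mu0=30)
print(f"t = {t:.2f}, cutoff = {z_alpha:.3f}, reject H0: {reject}")
```

Working from summary statistics rather than raw data is a convenience here, since the notes report only the moments of the sample.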

For the data on the percentage of annual income that families spend on housing,

$$t = \frac{31.952 - 30}{7.191/\sqrt{125}} = 3.03.$$

Since $t > 1.645$, we reject the null hypothesis at the 0.05 significance level; there is strong evidence for the candidate's claim that the average family spends more than 30% of its annual income on housing.

t-test for normal mean

Suppose $X_1, \ldots, X_n$ are iid $N(\mu, \sigma^2)$ with the variance $\sigma^2$ unknown. Suppose we want to test $H_0: \mu = \mu_0$ vs. $H_a: \mu > \mu_0$. Consider the test statistic

$$t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}.$$

The test with rejection region $\{t : t > t_{\alpha, n-1}\}$ [where $t_{\alpha, n-1}$ is the $(1 - \alpha)$ quantile of the t-distribution with $n - 1$ degrees of freedom, i.e., $\alpha = P(T > t_{\alpha, n-1})$] has exact size $\alpha$ because, when $\mu = \mu_0$, $t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$ has a t-distribution with $n - 1$ degrees of freedom.
Note the difference between the rejection rules $\{t : t > t_{\alpha, n-1}\}$ and $\{t : t > z_\alpha\}$. The large-sample rule $\{t : t > z_\alpha\}$ has approximate size $\alpha$, while $\{t : t > t_{\alpha, n-1}\}$ has exact size $\alpha$. Of course, for the exact result we have to assume that the $X_i$ have a normal distribution, and in practice we may not be willing to assume that the population is normal. In general, t critical values are larger than z critical values (i.e., $t_{\alpha, n-1} > z_\alpha$), so the t-test is conservative relative to the large-sample test. For this reason, many statisticians use the t-test in practice even if they do not believe the data are normally distributed. Note that $\lim_{n \to \infty} t_{\alpha, n-1} = z_\alpha$.

How well does the t-test work in moderate-sized samples when the data are not normal, i.e., what is its true size in moderate-sized samples? We will look at this question using the Monte Carlo method (Section 5.8) on Thursday.

Review of hypothesis testing

Goal: Decide between two hypotheses about a parameter of interest $\theta$:

$$H_0: \theta \in \Theta_0 \quad \text{vs.} \quad H_1: \theta \in \Theta_1,$$

where $\Theta_0 \cup \Theta_1 = \Theta$.

Null vs. alternative hypothesis: The alternative hypothesis is the hypothesis for which we are trying to see whether there is strong evidence. The null hypothesis is the default hypothesis that we will retain unless there is strong evidence for the alternative hypothesis.

Test statistic and critical region: A test is defined by a test statistic and a critical region. The critical region is the set of values of the test statistic for which we will reject the null hypothesis.

Errors in hypothesis testing: Type I error (rejecting the null hypothesis when it is true) and Type II error (failing to reject the null hypothesis when it is false).

Size of test, power of test:
Power function of test $= \gamma_C(\theta) = P_\theta(W(X_1, \ldots, X_n) \in C)$ = probability of rejecting the null hypothesis when the true parameter is $\theta$.
Size of test $= \max_{\theta \in \Theta_0} \gamma_C(\theta)$.
Power at an alternative $\theta \in \Theta_1$ $= \gamma_C(\theta)$.

Neyman-Pearson paradigm: Choose the size of the test to be reasonably small to protect against Type I error, typically 0.05 or 0.01. Among tests that have the prescribed size, choose the most powerful test.

P-values: A measure of evidence against the null hypothesis. The p-value is the size of the smallest-sized test in a family of tests for which we would reject the null hypothesis.

In Chapter 8, we will discuss how to choose most powerful tests.
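To make the p-value item in the review concrete, here is a minimal sketch for the two one-sided examples in these notes. For the family of tests $\{z : z \ge z_\alpha\}$ indexed by $\alpha$, the smallest size at which the observed statistic is rejected is $1 - \Phi(z_{\text{obs}})$. The helper name `one_sided_p_value` is hypothetical, and the housing value is approximate since that test relies on the normal approximation.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def one_sided_p_value(stat):
    """Smallest alpha for which {z : z >= z_alpha} contains the observed stat."""
    return 1 - STD_NORMAL.cdf(stat)

# Observed statistics from the two examples in these notes.
z_speed = (55.8 - 55) / (5 / sqrt(200))               # known sigma
t_housing = (31.952 - 30) / (7.1907826 / sqrt(125))   # large-sample statistic

p_speed = one_sided_p_value(z_speed)
p_housing = one_sided_p_value(t_housing)
print(f"speed:   z = {z_speed:.2f}, p-value = {p_speed:.4f}")
print(f"housing: t = {t_housing:.2f}, approximate p-value = {p_housing:.4f}")

# Rejecting at level alpha is the same as having p-value <= alpha.
for alpha in (0.10, 0.05, 0.01, 0.001):
    cutoff = STD_NORMAL.inv_cdf(1 - alpha)
    assert (z_speed >= cutoff) == (p_speed <= alpha)
    assert (t_housing >= cutoff) == (p_housing <= alpha)
```

Reporting the p-value rather than a single reject/retain decision lets a reader apply whichever size they prefer.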
