
1.017/1.010 Class 16
Testing Hypotheses about a Single Population

Formulating Hypothesis Testing Problems

Hypotheses about a random variable x are often formulated in terms of its distributional properties. For example, if the property of interest is a:

Null hypothesis H0: a = a0

The objective of hypothesis testing is to decide whether or not to reject this hypothesis. The decision is based on an estimator â of a:

Reject H0: If the observed estimate â lies in the rejection region Ra0 (â ∈ Ra0)

Do not reject H0: Otherwise (â ∉ Ra0)

Select rejection region to obtain desired error properties:

                   Test Result
True situation     Do not reject H0 (â ∉ Ra0)        Reject H0 (â ∈ Ra0)
H0 true            P(H0|H0) = 1 - α                  P(~H0|H0) = α (Type I Error)
H0 false           P(H0|~H0) = β (Type II Error)     P(~H0|~H0) = 1 - β

Type I error probability α is called the test significance level.

Deriving Hypothesis Rejection Regions for Large Sample Tests

A hypothesis test is often based on a standardized statistic that depends on the unknown true property and its estimate. The basic concepts are the same as those used to derive confidence intervals (see Class 14).

An example is the z statistic:

$$ z(\hat{a}, a) = \frac{\hat{a} - a}{SD[\hat{a}]} $$

If the estimate is unbiased, then E[z] = 0 and Var[z] = 1.

Define a rejection region Rz0 in terms of z as:

$$ R_{z0}: \quad z(\hat{a}, a_0) \le z_L \quad \text{or} \quad z(\hat{a}, a_0) \ge z_U $$

As the rejection region grows, the Type I error probability increases and the Type II error probability decreases (the test is more likely to reject the hypothesis).

As the rejection region shrinks, the Type I error probability decreases and the Type II error probability increases (the test is less likely to reject the hypothesis).
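The trade-off described above can be checked numerically. The sketch below is only illustrative: it assumes that z(â, a0) is normally distributed with unit variance, and that under the alternative its mean is shifted by a hypothetical amount `shift` = (true a - a0) / SD[â]. The function name and the numerical values are invented for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # unit normal (assumed distribution of z under H0)


def error_probabilities(z_lower, z_upper, shift):
    """Return (alpha, beta) for the rejection region z <= z_lower or z >= z_upper.

    shift is a hypothetical value of (a_true - a0) / SD[a_hat], needed for beta.
    """
    # Type I error: probability of landing in the rejection region when H0 is true.
    alpha = std_normal.cdf(z_lower) + (1.0 - std_normal.cdf(z_upper))
    # Type II error: probability of landing outside the rejection region when
    # z is assumed normal with mean `shift` and unit variance.
    beta = std_normal.cdf(z_upper - shift) - std_normal.cdf(z_lower - shift)
    return alpha, beta


wide = error_probabilities(-1.0, 1.0, shift=2.0)    # large rejection region
narrow = error_probabilities(-3.0, 3.0, shift=2.0)  # small rejection region
print("wide region   (alpha, beta):", wide)
print("narrow region (alpha, beta):", narrow)
```

The wide region gives the larger α and the smaller β, and the narrow region gives the reverse.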

The usual practice is to select the rejection region to ensure that the Type I error probability is equal to a specified value α.

For a two-sided test, require that the Type I error probability be distributed equally between the intervals below zL (probability = α/2) and above zU (probability = α/2).

These probabilities are:

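$$ P[z(\hat{a}, a) \le z_L \mid H_0] = P[z(\hat{a}, a_0) \le z_L] = F_z(z_L) = \frac{\alpha}{2} $$

$$ P[z(\hat{a}, a) \ge z_U \mid H_0] = P[z(\hat{a}, a_0) \ge z_U] = 1 - F_z(z_U) = \frac{\alpha}{2} $$

$$ z_L = F_z^{-1}\left(\frac{\alpha}{2}\right) \qquad z_U = F_z^{-1}\left(1 - \frac{\alpha}{2}\right) $$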

For large samples z (aˆ , a0 ) has a unit normal distribution. Use the
MATLAB function norminv to evaluate Fz-1 .
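As a rough illustration of the limits zL and zU, the sketch below evaluates Fz-1 in Python rather than MATLAB. It assumes that `statistics.NormalDist().inv_cdf` from the Python standard library is an acceptable stand-in for norminv; the function name `two_sided_critical_values` is invented for this example.

```python
from statistics import NormalDist


def two_sided_critical_values(alpha):
    """Return (z_L, z_U) for a two-sided large-sample test at level alpha."""
    std_normal = NormalDist()  # unit normal: mean 0, standard deviation 1
    z_lower = std_normal.inv_cdf(alpha / 2.0)        # Fz^-1(alpha / 2)
    z_upper = std_normal.inv_cdf(1.0 - alpha / 2.0)  # Fz^-1(1 - alpha / 2)
    return z_lower, z_upper


z_lower, z_upper = two_sided_critical_values(0.05)
print(round(z_lower, 2), round(z_upper, 2))  # -1.96 1.96
```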

If the definition of z is applied, a two-sided rejection region Ra0 can also be written directly in terms of the estimate â:

$$ R_{a0}: \quad \hat{a} \le a_L = a_0 + F_z^{-1}\left(\frac{\alpha}{2}\right) SD[\hat{a}] $$

$$ \hat{a} \ge a_U = a_0 + F_z^{-1}\left(1 - \frac{\alpha}{2}\right) SD[\hat{a}] $$

p Values

The p value is the largest significance level at which H0 is not rejected.


For a symmetric two-sided rejection region and a large sample:

$$ \frac{p}{2} = 1 - F_z\left(\frac{\hat{a} - a_0}{SD[\hat{a}]}\right) \qquad \hat{a} \ge a_0 $$

$$ \frac{p}{2} = F_z\left(\frac{\hat{a} - a_0}{SD[\hat{a}]}\right) \qquad \hat{a} \le a_0 $$

For large samples use the MATLAB function normcdf to compute p from â and SD[â].
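The two p value cases can be combined into a single calculation. The sketch below uses Python's `statistics.NormalDist().cdf` as an assumed stand-in for normcdf; the function name `two_sided_p_value` is invented here. It relies on the symmetry of the unit normal, so that taking |z| covers both the â ≥ a0 and â ≤ a0 cases.

```python
from statistics import NormalDist


def two_sided_p_value(a_hat, a0, sd_estimate):
    """Two-sided large-sample p value for H0: a = a0."""
    z = (a_hat - a0) / sd_estimate
    # For z >= 0 this is 1 - Fz(z); for z <= 0 it equals Fz(z) by symmetry.
    half_p = 1.0 - NormalDist().cdf(abs(z))
    return 2.0 * half_p


print(two_sided_p_value(1.96, 0.0, 1.0))   # close to 0.05
print(two_sided_p_value(-1.96, 0.0, 1.0))  # same value by symmetry
```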

Special Case -- Sample mean

Consider a hypothesis about the value of the population mean a = E[x]:

H0: a = E[x] = a0

Base the test on the sample mean estimator mx. Obtain SD[mx] from the sample standard deviation:

$$ SD[m_x] = \frac{SD[x]}{\sqrt{N}} \approx \frac{s_x}{\sqrt{N}} $$
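A short sketch of how mx and SD[mx] might be computed from data. The sample values are made up, and the use of `statistics.stdev` (which divides by N - 1) for sx is an assumption, since the notes do not say which divisor is intended.

```python
import math
from statistics import mean, stdev


def sample_mean_and_sd(sample):
    """Return (m_x, approximate SD[m_x]) = (sample mean, s_x / sqrt(N))."""
    n = len(sample)
    m_x = mean(sample)
    s_x = stdev(sample)  # sample standard deviation, N - 1 divisor (assumed)
    return m_x, s_x / math.sqrt(n)


# Hypothetical data, for illustration only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
m_x, sd_m_x = sample_mean_and_sd(data)
print(m_x, sd_m_x)
```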

Example: Testing whether mean is significantly different from zero

Suppose a0 = 0, sx = 3, N = 9, mx = 1.2 and α = .05:

 0.05  3
Ra 0 : m x ≤ a L = 0 + Fz-1   = −1.96
 2  9
 0.05  3
m x ≥ aU = 0 + Fz-1 1 −  = +1.96
 2  9

In this case the hypothesis is not rejected since mx = 1.2 does not lie in Ra0. The two-sided p-value is (see plot):

$$ 1 - \frac{p}{2} = F_z\left(\frac{m_x - a_0}{s_x / \sqrt{N}}\right) = F_z\left(\frac{1.2 - 0}{3 / \sqrt{9}}\right) = F_z(1.2) \approx 0.885 $$

$$ p \approx 0.23 $$

[Plot not reproduced; labeled region: 1 - p/2]
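The example above can be reproduced with a few lines of Python. This is a sketch under the same large-sample assumption, using `statistics.NormalDist` in place of the MATLAB functions norminv and normcdf; the variable names are chosen for this illustration.

```python
import math
from statistics import NormalDist

# Values from the example: a0 = 0, sx = 3, N = 9, mx = 1.2, alpha = 0.05
a0, s_x, n, m_x, alpha = 0.0, 3.0, 9, 1.2, 0.05

std_normal = NormalDist()
sd_m_x = s_x / math.sqrt(n)  # approximate SD of the sample mean

# Limits of the two-sided rejection region for the sample mean
a_lower = a0 + std_normal.inv_cdf(alpha / 2.0) * sd_m_x
a_upper = a0 + std_normal.inv_cdf(1.0 - alpha / 2.0) * sd_m_x
reject = m_x <= a_lower or m_x >= a_upper

# Two-sided p value (m_x >= a0 here, so p / 2 = 1 - Fz(z))
z = (m_x - a0) / sd_m_x
cdf_at_z = std_normal.cdf(z)
p_value = 2.0 * (1.0 - cdf_at_z)

print(round(a_lower, 2), round(a_upper, 2), reject)
print(round(cdf_at_z, 3), round(p_value, 2))
```

With these inputs the limits come out at -1.96 and +1.96, H0 is not rejected, Fz(1.2) is about 0.885, and p is about 0.23.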

Copyright 2003 Massachusetts Institute of Technology


Last modified Oct. 8, 2003
