Class 9.
07 Fall 2004
Handout: Two Sample Hypothesis Testing and Inference for Difference in Means
Hypothesis Testing
I. Two independent samples from Normal distributions.
Suppose $X_1,\dots,X_{n_1}$ is an independent sample from a Normal($\mu_1$, $\sigma_1^2$) distribution. Independently of the first sample, suppose $Y_1,\dots,Y_{n_2}$ is an independent sample from a Normal($\mu_2$, $\sigma_2^2$) distribution (possibly different from the first one):
                     Group 1                        Group 2                        Known?
Mean                 $\mu_1$                        $\mu_2$                        Unknown
Variance             $\sigma_1^2$                   $\sigma_2^2$                   Either
Data                 $X_1,\dots,X_{n_1}$            $Y_1,\dots,Y_{n_2}$            Known
Sample size          $n_1$                          $n_2$                          Known
Sample Mean          $m_1 = \bar{X}$                $m_2 = \bar{Y}$                Known
Standard Deviation   $SD_1$                         $SD_2$                         Known
Distribution         Normal($\mu_1$, $\sigma_1^2$)  Normal($\mu_2$, $\sigma_2^2$)  Assumed
A reasonable estimate for the difference of the population means $\mu_1 - \mu_2$ is
$$m_1 - m_2 = \bar{X} - \bar{Y}.$$
Note that
$$E(m_1 - m_2) = \mu_1 - \mu_2$$
and
$$SE(m_1 - m_2) = \sqrt{Var(m_1 - m_2)} = \sqrt{Var(m_1) + Var(m_2)} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$$
for independent samples $X_1,\dots,X_{n_1}$ and $Y_1,\dots,Y_{n_2}$.

For testing
$$H_0: \mu_1 = \mu_2 \tag{1}$$
against
$H_1$: 1) $\mu_1 \neq \mu_2$ or
2) $\mu_1 < \mu_2$ or
3) $\mu_1 > \mu_2$
use the test statistic
$$d_{obt} = \frac{m_1 - m_2}{SE(m_1 - m_2)}$$
which follows some distribution $d$.

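Theorem. Under the above assumptions about the two samples $X_1,\dots,X_{n_1}$ and $Y_1,\dots,Y_{n_2}$, for testing $H_0: \mu_1 - \mu_2 = 0$ at the $\alpha$ significance level, where $d_{crit}(\alpha)$ denotes the upper-$\alpha$ critical value of the distribution $d$:
1) vs $H_1: \mu_1 - \mu_2 \neq 0$. Reject $H_0$ if $|d_{obt}| \ge d_{crit}(\alpha/2)$.
2) vs $H_1: \mu_1 - \mu_2 < 0$. Reject $H_0$ if $d_{obt} \le -d_{crit}(\alpha)$.
3) vs $H_1: \mu_1 - \mu_2 > 0$. Reject $H_0$ if $d_{obt} \ge d_{crit}(\alpha)$.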
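The three rejection rules can be sketched in Python. This is only an illustration: it assumes the reference distribution $d$ is the standard Normal, so that $d_{crit}(\alpha)$ is its upper-$\alpha$ quantile. The function names `d_crit` and `reject_h0` and the labels "two-sided", "less" and "greater" are choices made for this sketch, not part of the handout.

```python
from statistics import NormalDist

def d_crit(alpha):
    """Upper-alpha critical value, ASSUMING d is the standard Normal."""
    return NormalDist().inv_cdf(1 - alpha)

def reject_h0(d_obt, alpha, alternative):
    """Apply the rejection rule that matches the alternative hypothesis."""
    if alternative == "two-sided":   # H1: mu1 - mu2 != 0
        return abs(d_obt) >= d_crit(alpha / 2)
    if alternative == "less":        # H1: mu1 - mu2 < 0
        return d_obt <= -d_crit(alpha)
    if alternative == "greater":     # H1: mu1 - mu2 > 0
        return d_obt >= d_crit(alpha)
    raise ValueError("unknown alternative: " + alternative)

# Hypothetical observed statistic, for illustration only.
d_obt = 1.8
for alt in ("two-sided", "less", "greater"):
    print(alt, reject_h0(d_obt, 0.05, alt))
```

The same observed statistic can lead to different decisions because the two-sided rule splits $\alpha$ between both tails while each one-sided rule puts all of it in one tail.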
Computation of $SE(m_1 - m_2)$ and choice of distribution $d$:
1. $\sigma_1$ and $\sigma_2$ are known
$$SE(m_1 - m_2) = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$$
Test statistic
$$d_{obt} = z_{obt} = \frac{m_1 - m_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$
follows the standard Normal distribution $z$.
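As a numerical illustration of the known-$\sigma$ case, the sketch below computes $z_{obt}$ and applies the two-sided rule at $\alpha = 0.05$. All numbers are hypothetical and the function name `z_test_known_sigma` is an assumption of this example.

```python
from math import sqrt
from statistics import NormalDist

def z_test_known_sigma(m1, m2, sigma1, sigma2, n1, n2, alpha=0.05):
    """Return (z_obt, reject) for H0: mu1 = mu2 vs H1: mu1 != mu2."""
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)   # SE(m1 - m2)
    z_obt = (m1 - m2) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # d_crit(alpha/2) for z
    return z_obt, abs(z_obt) >= z_crit

# Hypothetical summary statistics, for illustration only.
z_obt, reject = z_test_known_sigma(m1=52.0, m2=50.0,
                                   sigma1=4.0, sigma2=3.0, n1=40, n2=50)
print(round(z_obt, 3), reject)
```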
2. $\sigma_1$ and $\sigma_2$ are unknown, but $n_1$ and $n_2$ are large ($\ge 30$)
In this case we can omit the Normality assumption.
$$SE(m_1 - m_2) = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2} \approx \sqrt{SD_1^2/n_1 + SD_2^2/n_2}$$
and the test statistic
$$d_{obt} = z_{obt} = \frac{m_1 - m_2}{\sqrt{SD_1^2/n_1 + SD_2^2/n_2}}$$
approximately follows the standard Normal distribution $z$.
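When only raw data are available, the sample means and standard deviations have to be computed first. A minimal sketch, assuming the data are held in two Python lists; the function name `large_sample_z`, the `min_n` guard and the toy data are assumptions of this example.

```python
from math import sqrt
from statistics import mean, variance

def large_sample_z(x, y, min_n=30):
    """Return z_obt for H0: mu1 = mu2, with SD^2 used in place of sigma^2."""
    n1, n2 = len(x), len(y)
    if n1 < min_n or n2 < min_n:
        raise ValueError("both samples should have at least min_n observations")
    # variance() is the sample variance SD^2 (divisor n - 1).
    se = sqrt(variance(x) / n1 + variance(y) / n2)
    return (mean(x) - mean(y)) / se

# Toy data, for illustration only: y is x shifted up by 2.
x = [float(v) for v in range(30)]
y = [v + 2.0 for v in x]
z_obt = large_sample_z(x, y)
print(round(z_obt, 3))
```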
3. $\sigma_1$ and $\sigma_2$ are unknown, and $n_1$ and $n_2$ are not large enough ($< 30$)
a) $\sigma_1^2 = \sigma_2^2 = \sigma^2$ (unknown)
$$SE(m_1 - m_2) = \sqrt{\sigma^2 (1/n_1 + 1/n_2)}$$
with pooled estimate of $\sigma^2$:
$$\sigma_{pooled}^2 = \frac{(n_1 - 1) SD_1^2 + (n_2 - 1) SD_2^2}{n_1 + n_2 - 2}$$
and the test statistic
$$d_{obt} = t_{obt} = \frac{m_1 - m_2}{\sqrt{\sigma_{pooled}^2 (1/n_1 + 1/n_2)}}$$
follows the t distribution with $df = n_1 + n_2 - 2$ degrees of freedom.
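The pooled computation can be sketched as follows. The numbers are hypothetical and the function name `pooled_t` is an assumption of this example. The Python standard library has no t quantile function, so the critical value would still have to come from a t table or other software.

```python
from math import sqrt

def pooled_t(m1, m2, sd1, sd2, n1, n2):
    """Return (t_obt, df) for the equal-variance two-sample t test."""
    df = n1 + n2 - 2
    # Pooled estimate of the common variance sigma^2.
    var_pooled = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(var_pooled * (1 / n1 + 1 / n2))        # SE(m1 - m2)
    return (m1 - m2) / se, df

# Hypothetical summary statistics, for illustration only.
t_obt, df = pooled_t(m1=10.0, m2=8.0, sd1=2.0, sd2=2.0, n1=8, n2=8)
print(t_obt, df)
```

With equal sample standard deviations the pooled variance is just that common value squared, which makes this example easy to verify by hand.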
b) $\sigma_1^2 \neq \sigma_2^2$ are unknown
$$SE(m_1 - m_2) = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2} \approx \sqrt{SD_1^2/n_1 + SD_2^2/n_2}$$
and the test statistic
$$d_{obt} = t_{obt} = \frac{m_1 - m_2}{\sqrt{SD_1^2/n_1 + SD_2^2/n_2}}$$
approximately follows the t distribution with degrees of freedom:
$$df = \frac{(SD_1^2/n_1 + SD_2^2/n_2)^2}{\dfrac{(SD_1^2/n_1)^2}{n_1 - 1} + \dfrac{(SD_2^2/n_2)^2}{n_2 - 1}}$$
(estimated by MATLAB)
or alternatively $df \approx \min(n_1 - 1, n_2 - 1)$.
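The degrees-of-freedom formula is easy to get wrong by hand, so a short sketch may help. The function names `welch_df` and `conservative_df` and the inputs are assumptions of this example; the handout itself refers to MATLAB for this step.

```python
def welch_df(sd1, sd2, n1, n2):
    """Approximate degrees of freedom for the unequal-variance t test."""
    v1 = sd1 ** 2 / n1          # SD_1^2 / n_1
    v2 = sd2 ** 2 / n2          # SD_2^2 / n_2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

def conservative_df(n1, n2):
    """The simpler alternative: min(n1 - 1, n2 - 1)."""
    return min(n1 - 1, n2 - 1)

# Hypothetical summary statistics, for illustration only.
df_formula = welch_df(sd1=2.0, sd2=4.0, n1=10, n2=15)
df_simple = conservative_df(10, 15)
print(round(df_formula, 2), df_simple)
```

The formula value is generally not an integer and lies between $\min(n_1 - 1, n_2 - 1)$ and $n_1 + n_2 - 2$, so the simpler alternative is the more cautious choice.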
If instead of testing (1), we want to test
$$H_0: \mu_1 - \mu_2 = d \tag{2}$$
against
$H_1: \mu_1 - \mu_2 \neq d$ ($<$ or $>$)
use the test statistic
$$d_{obt} = \frac{m_1 - m_2 - d}{SE(m_1 - m_2)}$$
II. Proportions
For a random variable $X$ drawn from a Binomial($n_1$, $p_1$) distribution, and an independent random variable $Y$ drawn from a Binomial($n_2$, $p_2$) distribution, let $\hat{p}_1 = X/n_1$ and $\hat{p}_2 = Y/n_2$. For testing
$$H_0: p_1 = p_2 \tag{3}$$
against
$H_1$: 1) $p_1 \neq p_2$ or
2) $p_1 < p_2$ or
3) $p_1 > p_2$
use the test statistic
$$d_{obt} = \frac{\hat{p}_1 - \hat{p}_2}{SE(\hat{p}_1 - \hat{p}_2)}.$$
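Since for a Binomial($n$, $p$) random variable $X$ and $\hat{p} = X/n$, $Var(\hat{p}) = \frac{p(1-p)}{n}$,
$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{Var(\hat{p}_1 - \hat{p}_2)} = \sqrt{Var(\hat{p}_1) + Var(\hat{p}_2)} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$$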
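Then, with the unknown $p_1$ and $p_2$ estimated by $\hat{p}_1$ and $\hat{p}_2$, the test statistic
$$d_{obt} = z_{obt} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$$
has approximately the Normal $z$ distribution if $n_1 p_1$, $n_1(1-p_1)$, $n_2 p_2$, and $n_2(1-p_2)$ are all $\ge 10$.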
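A minimal sketch of the two-proportion computation, using the sample proportions $X/n_1$ and $Y/n_2$ in the standard error. The counts are hypothetical and the function names are assumptions of this example. The size check uses the observed counts as stand-ins for $n_i p_i$ and $n_i(1-p_i)$, which is itself an assumption because the true proportions are unknown.

```python
from math import sqrt

def two_proportion_z(x, n1, y, n2):
    """Return z_obt for H0: p1 = p2 from success counts x and y."""
    p1_hat = x / n1
    p2_hat = y / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return (p1_hat - p2_hat) / se

def counts_large_enough(x, n1, y, n2, threshold=10):
    """Check successes and failures in both groups against the threshold."""
    return min(x, n1 - x, y, n2 - y) >= threshold

# Hypothetical counts, for illustration only.
z_obt = two_proportion_z(x=60, n1=100, y=45, n2=100)
ok = counts_large_enough(60, 100, 45, 100)
print(round(z_obt, 3), ok)
```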
III. Dependent samples (Paired data)
For paired measurements $(X_1, Y_1),\dots,(X_n, Y_n)$ (e.g., measurements before and after) the previous theory does not hold.
If the sample $X_1,\dots,X_n$ (an independent sample from a Normal($\mu_1$, $\sigma_1^2$) distribution) is not independent of the sample $Y_1,\dots,Y_n$ (an independent sample from a Normal($\mu_2$, $\sigma_2^2$) distribution), then
$$SE(m_1 - m_2) = \sqrt{Var(m_1 - m_2)} = \sqrt{Var(m_1) + Var(m_2) - 2\,Cov(m_1, m_2)} \neq \sqrt{Var(m_1) + Var(m_2)}$$
since in general $Cov(m_1, m_2) \neq 0$ for non-independent data!
In such a case, testing (1) is equivalent to testing a one-sample hypothesis for the data $D_1 (= X_1 - Y_1),\dots,D_n (= X_n - Y_n)$:
$$H_0: \mu_D = 0 \tag{4}$$
against
$H_1$: 1) $\mu_D \neq 0$ or
2) $\mu_D < 0$ or
3) $\mu_D > 0$.
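The reduction to a one-sample problem can be sketched as below. The handout does not restate the one-sample statistic here, so this example assumes the usual one-sample t statistic $t_{obt} = \bar{D} / (SD_D/\sqrt{n})$ with $n - 1$ degrees of freedom. The data and the function name `paired_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Return (t_obt, df) for H0: mu_D = 0, where D_i = X_i - Y_i."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    d = [xi - yi for xi, yi in zip(x, y)]   # differences D_1, ..., D_n
    n = len(d)
    se = stdev(d) / sqrt(n)                 # SE of the mean difference
    return mean(d) / se, n - 1

# Hypothetical before/after measurements, for illustration only.
before = [12, 15, 11, 14, 13]
after = [10, 14, 10, 11, 12]
t_obt, df = paired_t(before, after)
print(round(t_obt, 3), df)
```

Working with the differences sidesteps the covariance between the two sample means, since only the variability of the $D_i$ enters the standard error.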
Confidence Intervals
For the test statistic
$$d_{obt} = \frac{m_1 - m_2}{SE(m_1 - m_2)}$$
we reject $H_0$ if $|d_{obt}| \ge d_{crit}(\alpha/2)$. If
$$-d_{crit}(\alpha/2) < d_{obt} < d_{crit}(\alpha/2)$$
we conclude that the evidence against $H_0$ is not statistically significant at the $\alpha$ significance level.
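The confidence interval for $\mu_1 - \mu_2$ is computed by inverting the non-rejection region. A hypothesized difference $\mu_1 - \mu_2 = d$, as in (2), is not rejected when
$$-d_{crit}(\alpha/2) < \frac{m_1 - m_2 - d}{SE(m_1 - m_2)} < d_{crit}(\alpha/2)$$
$$\Leftrightarrow \quad (m_1 - m_2) - d_{crit}(\alpha/2)\,SE(m_1 - m_2) < d < (m_1 - m_2) + d_{crit}(\alpha/2)\,SE(m_1 - m_2)$$
which gives the $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$:
$$\big((m_1 - m_2) - d_{crit}(\alpha/2)\,SE(m_1 - m_2);\ (m_1 - m_2) + d_{crit}(\alpha/2)\,SE(m_1 - m_2)\big)$$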
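A minimal sketch of the interval computation, assuming the standard Normal is the appropriate reference distribution (known $\sigma$ or large samples). The summary statistics are hypothetical and the function name `z_confidence_interval` is an assumption of this example.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(m1, m2, se, alpha=0.05):
    """Return the (1 - alpha)100% interval for mu1 - mu2, ASSUMING d is z."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # d_crit(alpha/2)
    diff = m1 - m2
    return diff - z_crit * se, diff + z_crit * se

# Hypothetical inputs: sigma1 = 4, sigma2 = 3, n1 = 40, n2 = 50.
se = sqrt(4.0 ** 2 / 40 + 3.0 ** 2 / 50)
lower, upper = z_confidence_interval(m1=52.0, m2=50.0, se=se)
print(round(lower, 3), round(upper, 3))
```

An interval that excludes 0 corresponds to rejecting $H_0: \mu_1 - \mu_2 = 0$ with the two-sided test at the same $\alpha$, which is the inversion described in this section.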