
Chapter 6: Confidence Intervals and Hypothesis Testing

When analyzing data, we can't just accept the sample mean or sample proportion as the official
mean or proportion. When we compute the statistics $\bar{x}$ and $\hat{p}$ (sample mean and sample
proportion) from different samples, we get different answers due to sampling variability.

So we have to perform statistical inference:

Confidence Interval: when you want to estimate a population parameter.

Significance Testing: when we want to assess the evidence provided by the data in favor of
some claim about the population.

Section 6.1: Confidence Intervals

Confidence intervals allow us to estimate a range of values for the population mean or proportion.

The true mean or proportion for the population exists and is a fixed number, but we don't know
it!

Using sample statistics, we get an estimate of where we expect the population parameter to be.

If we take a single sample, our single confidence interval "net" may or may not include the
population parameter.

However, if we take many samples of the same size and create a confidence interval from each
sample statistic, over the long run 95% of our confidence intervals will contain the true
population parameter (if we are using a 95% confidence level).
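The long-run interpretation can be checked with a small simulation. The sketch below is only an illustration: the population values (mean 50, standard deviation 10), the sample size, and the number of trials are all made-up assumptions, and it uses the known-standard-deviation interval for a mean that these notes give later in this section.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage_rate(mu, sigma, n, level, trials, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf((1 + level) / 2)  # critical value z*
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        # Each sample produces its own interval; count the ones that catch mu.
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

# Hypothetical population, chosen only for illustration.
rate = coverage_rate(mu=50.0, sigma=10.0, n=25, level=0.95, trials=5000)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

Any single interval either contains the true mean or it doesn't; the 95% describes the share of intervals that do over many repetitions.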

If you increase your sample size (n), you decrease your margin of error.

If you increase your confidence level (C), then you increase your margin of error.

A smaller margin of error is good because we get a smaller range of where to expect the true
population parameter.
Confidence interval formulas look like: estimate ± margin of error.
We write the intervals as (lower bound, upper bound).

Confidence Interval for a Population Mean, $\mu$:

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$

where $z^*$ is the value on the standard normal curve with area C between $-z^*$ and $z^*$.
z*    1.645    1.960    2.576
C     90%      95%      99%

(Table D in the back of the book contains more values, but these are the most common.)
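As a cross-check on the table, the critical values can be recovered from the standard normal distribution, and the interval can be computed directly. This is a minimal sketch using Python's standard library; the function names `z_star` and `z_interval` and the numbers in the usage lines are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist

def z_star(level):
    """Critical value with central area `level` under the standard normal curve."""
    return NormalDist().inv_cdf((1 + level) / 2)

def z_interval(x_bar, sigma, n, level):
    """(lower, upper) bounds of x_bar +/- z* * sigma / sqrt(n)."""
    margin = z_star(level) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

for level in (0.90, 0.95, 0.99):
    print(f"C = {level:.0%}: z* = {z_star(level):.3f}")

# Hypothetical sample summary, not taken from the notes.
lower, upper = z_interval(x_bar=100.0, sigma=15.0, n=36, level=0.95)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

The area below $z^*$ is $(1 + C)/2$ because the leftover area $1 - C$ is split evenly between the two tails.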

Sample Size, n, for Desired Margin of Error, m:

$$n = \left( \frac{z^* \sigma}{m} \right)^2$$

Note that it is the sample size, n, that influences the margin of error. The population size has
nothing to do with it.
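The sample-size formula can be sketched in a few lines. The function name and the input values below are hypothetical; the one choice not spelled out in the notes is rounding up, which is the usual convention so that the achieved margin of error does not exceed m.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, level):
    """Smallest n for which z* * sigma / sqrt(n) is at most `margin`."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    # Round up: a fractional n is not possible, and rounding down
    # would leave the margin of error slightly too large.
    return math.ceil((z_star * sigma / margin) ** 2)

# Hypothetical inputs, chosen only for illustration.
n = required_sample_size(sigma=15.0, margin=2.0, level=0.95)
print(f"Required sample size: {n}")
```

The population size does not appear anywhere among the arguments, which matches the note above.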

Ways to reduce your margin of error:
1.) Increase sample size
2.) Use a lower level of confidence (smaller C)
3.) Reduce $\sigma$
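The three options in the list can be compared numerically. The baseline values below ($\sigma = 10$, n = 25, C = 95%) are assumptions made up for this sketch; each alternative changes exactly one of them.

```python
import math
from statistics import NormalDist

def margin_of_error(sigma, n, level):
    """Margin of error z* * sigma / sqrt(n) for a z-interval."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    return z_star * sigma / math.sqrt(n)

# Hypothetical baseline, then one change at a time.
baseline = margin_of_error(sigma=10.0, n=25, level=0.95)
larger_n = margin_of_error(sigma=10.0, n=100, level=0.95)      # 1.) bigger sample
lower_level = margin_of_error(sigma=10.0, n=25, level=0.90)    # 2.) smaller C
smaller_sigma = margin_of_error(sigma=5.0, n=25, level=0.95)   # 3.) smaller sigma

for label, m in [("baseline", baseline), ("n = 100", larger_n),
                 ("C = 90%", lower_level), ("sigma = 5", smaller_sigma)]:
    print(f"{label:>10}: margin of error = {m:.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.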

Be careful! You can only use the formula $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$ under certain circumstances:

Data must be an SRS from the population.

Do not use if the sampling is anything more complicated than an SRS.

Data must be collected correctly (no bias). The margin of error covers only random
sampling errors. Undercoverage and nonresponse are not covered.

Outliers can have a big effect on the confidence interval. (This makes sense because we
use the mean and standard deviation to get a CI.)

You must know the standard deviation of the population, $\sigma$.

EXAMPLE 1: A questionnaire of spending habits was given to a random sample of college
students. Each student was asked to record and report the amount of money they spent on
textbooks in a semester. The sample of 130 students resulted in an average of $422 with
standard deviation of $57.

a) Give a 90% confidence interval for the mean amount of money spent by college
students on textbooks.

b) Is it true that 90% of the students spent the amount of money found in the
interval from part (a)? Explain your answer.

c) What is the margin of error for the 90% confidence interval?

d) How many students should you sample if you want a margin of error of $5 for a
90% confidence interval?
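One way to check hand calculations for parts (a), (c), and (d) is sketched below. It assumes the reported standard deviation of $57 can be treated as the known population standard deviation $\sigma$, which the example does not state explicitly; part (b) is a conceptual question and is not addressed by the code.

```python
import math
from statistics import NormalDist

# Values from Example 1; treating 57 as the known sigma is an assumption.
x_bar, sigma, n, level = 422.0, 57.0, 130, 0.90

z_star = NormalDist().inv_cdf((1 + level) / 2)

# Parts (a) and (c): margin of error and interval.
margin = z_star * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"margin of error: {margin:.2f}")
print(f"90% CI: ({lower:.2f}, {upper:.2f})")

# Part (d): sample size for a $5 margin of error, rounded up.
target_margin = 5.0
n_needed = math.ceil((z_star * sigma / target_margin) ** 2)
print(f"students needed for a $5 margin: {n_needed}")
```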

EXAMPLE 2: A sample of 12 STAT 301 students yields the following Exam 1 scores:

78  62  99  85  94  53  88  90  86  92  75  92

Assume that the population standard deviation is 10. The sample mean can be calculated using
SPSS or a calculator to be 82.83.

(Note: Do NOT use any SPSS confidence intervals; they are good only for Chapter 7, not this
type of CI. You must get these Z confidence intervals by hand.)

a) Find the 90% confidence interval for the mean score for STAT 301 students.

b) Find the 95% confidence interval.

c) Find the 99% confidence interval.

d) How do the margins of error in (a), (b), and (c) change as the confidence level
increases? Why?
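The three intervals can be checked against hand calculations with a short loop. This sketch uses the scores and the population standard deviation of 10 given in Example 2; the variable names are arbitrary.

```python
import math
from statistics import NormalDist, fmean

scores = [78, 62, 99, 85, 94, 53, 88, 90, 86, 92, 75, 92]
sigma = 10.0  # population standard deviation stated in the example
x_bar = fmean(scores)
n = len(scores)

margins = {}
for level in (0.90, 0.95, 0.99):
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    margins[level] = z_star * sigma / math.sqrt(n)
    lower, upper = x_bar - margins[level], x_bar + margins[level]
    print(f"C = {level:.0%}: margin = {margins[level]:.2f}, "
          f"CI = ({lower:.2f}, {upper:.2f})")
```

Only $z^*$ changes from one confidence level to the next, so the margin of error grows in step with it.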

Section 6.2: Hypothesis Testing

The 4 steps common to all tests of significance:

1. State the null hypothesis $H_0$ and the alternative hypothesis $H_a$.

2. Calculate the value of the test statistic.

3. Draw a picture of what $H_a$ looks like, and find the P-value.

4. State your conclusion about the data in a sentence, using the P-value and/or comparing
the P-value to a significance level for your evidence.

STEP 1: State the null hypothesis $H_0$ and the alternative hypothesis $H_a$.

To do a significance test, you need 2 hypotheses:

$H_0$, Null Hypothesis: the statement being tested, usually phrased as "no effect" or "no
difference."

$H_a$, Alternative Hypothesis: the statement we hope or suspect is true instead of $H_0$.

Hypotheses always refer to some population or model, not to a particular outcome.

Hypotheses can be one-sided or two-sided.

One-sided hypothesis: covers just part of the range for your parameter

$H_0: \mu = 10$          $H_0: \mu = 10$
$H_a: \mu < 10$    OR    $H_a: \mu > 10$

Two-sided hypothesis: covers the whole possible range for your parameter

$H_0: \mu = 10$
$H_a: \mu \neq 10$

Even though $H_a$ is what we hope or believe to be true, our test gives evidence for or against
$H_0$ only.

We never prove $H_0$ true; we can only state whether we have enough evidence to reject $H_0$
(which is evidence in favor of $H_a$, but not proof that $H_a$ is true) or that we don't have enough
evidence to reject $H_0$.


Example (Exercise 6.37, p. 418):
Each of the following situations requires a significance test about a population mean. State the
appropriate null hypothesis $H_0$ and alternative hypothesis $H_a$ in each case:

a. Census Bureau data show that the mean household income in the area served by a shopping
mall is $72,500 per year. A market research firm questions shoppers at the mall to find out
whether the mean household income of mall shoppers is higher than that of the general
population.

b. Last year, your company's service technicians took an average of 1.8 hours to respond to trouble
calls from business customers who had purchased service contracts. Do this year's data show a
different average response time?

STEP 2: Calculate the value of the test statistic.

A test statistic measures compatibility between $H_0$ and the data. The formula for the test statistic
will vary between different types of problems. In problems like those we study in Chapter 6, the test
statistic will be the Z-score.

STEP 3: Draw a picture of what $H_a$ looks like, and find the P-value.

P-value: the probability, computed assuming that $H_0$ is true, that the test statistic would take a value as
extreme or more extreme than that actually observed due to random fluctuation. It is a measure of how
unusual your sample results are.

The smaller the P-value, the stronger the evidence against $H_0$ provided by the data.

Calculate the P-value by using the sampling distribution of the test statistic (only the normal
distribution for Chapter 6).
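The phrase "computed assuming that $H_0$ is true" can be made concrete with a simulation: draw many samples from a population where $H_0$ holds, compute the Z-score of each, and see how often it is at least as large as the observed one. Everything in this sketch is a made-up illustration (the observed Z-score of 1.5, the sample size, the number of trials), and it covers only the one-sided "greater than" case.

```python
import math
import random
from statistics import NormalDist, fmean

def simulated_p_value(z_obs, n, trials, seed=0):
    """Estimate P(Z >= z_obs) from samples drawn with H0 true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    mu_0, sigma = 0.0, 1.0     # arbitrary choice; the Z-score is standardized
    count = 0
    for _ in range(trials):
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu_0) / (sigma / math.sqrt(n))
        if z >= z_obs:         # at least as extreme as what was observed
            count += 1
    return count / trials

z_obs = 1.5  # hypothetical observed test statistic
exact = 1 - NormalDist().cdf(z_obs)
approx = simulated_p_value(z_obs, n=20, trials=20000)
print(f"normal-curve P-value: {exact:.4f}, simulated: {approx:.4f}")
```

The simulated proportion should land close to the area under the normal curve to the right of the observed Z-score, which is what the P-value measures.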

STEP 4: Compare your P-value to a significance level. State your conclusion about the data in a
sentence.
Compare the P-value to a significance level, $\alpha$.

If the P-value $\leq \alpha$, we can reject $H_0$.

If you can reject $H_0$, your results are significant.

If you do not reject $H_0$, your results are not significant.

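Z Test for a Population Mean

To test the hypothesis $H_0: \mu = \mu_0$ based on an SRS of size n from a population with unknown mean $\mu$ and
known standard deviation $\sigma$, compute the test statistic:

$$Z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

The P-values for a test of $H_0$ against:

$H_a: \mu > \mu_0$ is $P(Z \geq Z_0)$

$H_a: \mu < \mu_0$ is $P(Z \leq Z_0)$

$H_a: \mu \neq \mu_0$ is $2 \cdot P(Z \geq |Z_0|)$

These P-values are exact if the population is normally distributed, and are approximately correct for
large n in other cases.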
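The test statistic and the three P-value rules can be sketched as a single function. The name `z_test`, the `alternative` labels, and the numbers in the usage lines are hypothetical choices for this illustration, not part of the course materials.

```python
import math
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alternative):
    """Return (Z0, P-value) for H0: mu = mu_0 with known sigma."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":        # Ha: mu > mu_0
        p = 1 - std_normal.cdf(z)
    elif alternative == "less":         # Ha: mu < mu_0
        p = std_normal.cdf(z)
    elif alternative == "two-sided":    # Ha: mu != mu_0
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p

# Hypothetical sample summary, chosen only for illustration.
z, p = z_test(x_bar=52.0, mu_0=50.0, sigma=6.0, n=36, alternative="two-sided")
print(f"Z0 = {z:.2f}, P-value = {p:.4f}")
```

For the same data, the two-sided P-value is twice the one-sided P-value in the direction of the observed difference, so the choice of $H_a$ should be made before looking at the data.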

EXAMPLES

1. Last year the government made a claim that the average income of the American people was
$33,950. However, a sample of 50 people taken recently showed an average income of $34,076
with a population standard deviation of $324. Is the government's estimate too low? Conduct a
significance test to see if the true mean is more than the reported average. Use $\alpha = 0.01$.

2. An environmentalist collects a liter of water from 45 different locations along the banks of a
stream. He measures the amount of dissolved oxygen in each specimen. The mean oxygen level
is 4.62 mg, with an overall standard deviation of 0.92. A water purifying company claims that the
mean level of oxygen in the water is 5 mg. Conduct a hypothesis test with $\alpha = 0.001$ to determine
whether the mean oxygen level is less than 5 mg.

3. An agroeconomist examines the cellulose content of a variety of alfalfa hay. Suppose that the
cellulose content in the population has a standard deviation of 8 mg. A sample of 15 cuttings has a
mean cellulose content of 145 mg. A previous study claimed that the mean cellulose content was 140
mg. Perform a hypothesis test to determine if the mean cellulose content is different from 140 mg if
$\alpha = 0.05$.

How does $\alpha$ relate to confidence intervals?
If you have a 2-sided test, and if $\alpha$ and the confidence level add to 100%, you can reject $H_0$ if $\mu_0$ (the
number you were checking) is not in the confidence interval.

a) Find a 95% confidence interval for the mean cellulose content from the above example.

b) Now try the test from Example 3 again, using the confidence interval from part (a) to do
the hypothesis test. (The result should be the same.)
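The agreement between the two approaches can be checked numerically with the cellulose numbers from Example 3. This is a sketch for checking hand work, with arbitrary variable names; it makes the two-sided decision once from the P-value and once from the 95% confidence interval.

```python
import math
from statistics import NormalDist

# Values from Example 3 (alfalfa cellulose content).
x_bar, mu_0, sigma, n, alpha = 145.0, 140.0, 8.0, 15, 0.05

std_normal = NormalDist()
standard_error = sigma / math.sqrt(n)

# Decision 1: two-sided Z test, reject when P-value <= alpha.
z = (x_bar - mu_0) / standard_error
p_value = 2 * (1 - std_normal.cdf(abs(z)))
reject_by_p_value = p_value <= alpha

# Decision 2: confidence interval with level 1 - alpha,
# reject when mu_0 falls outside the interval.
z_star = std_normal.inv_cdf(1 - alpha / 2)
lower = x_bar - z_star * standard_error
upper = x_bar + z_star * standard_error
reject_by_interval = not (lower <= mu_0 <= upper)

print(f"Z0 = {z:.3f}, P-value = {p_value:.4f}, reject: {reject_by_p_value}")
print(f"95% CI = ({lower:.2f}, {upper:.2f}), reject: {reject_by_interval}")
```

Both decisions rest on the same comparison of $|Z_0|$ with $z^*$, which is why they always agree for a two-sided test when $\alpha$ and C add to 100%.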

Annual Drinking Water Quality Report, 2004, Town of Brookston, IN
I'm pleased to report that our drinking water is safe and meets federal and state requirements.
Test Results (MCL is the maximum contaminant level, the highest level of a contaminant that is allowed
in drinking water.)
Contaminant            Violation Y/N   Level Detected    Unit of measurement   MCL
Beta/photon emitters   N               2.1 ± 3.2         mrem/yr
Alpha emitters                         0 ± 1.6           pCi/l                 15
Barium                                 0.216             ppm
Copper                                 0.039 to 0.453    ppm                   1.3
Fluoride                               0.01              ppm
Sodium                                 0.0               ppm                   N/A

One of these violation reports should actually be a "yes" instead of a "no." Which one is it and why?
What hypotheses go along with these confidence intervals?
Note: When I called the Town of Brookston office to ask them about this, the water manager called the
state EPA office to get more information. What they told him was that, yes, technically I was correct, but
that they don't use the confidence intervals that are reported. Apparently these are the FEDERAL EPA
rules. They only use the mean. I tried to get sample size or other information, but I wasn't able to learn
anything more.

P-values can be more informative than a reject/do-not-reject $H_0$ decision based on $\alpha$. As the P-value gets smaller, the
evidence for rejecting $H_0$ gets stronger.
Just because we use $\alpha = 0.05$ a lot doesn't mean that's the level you have to use; it's just the most
common. There's nothing particularly special about that level.
In a large sample, even tiny deviations from the null hypothesis can be statistically significant, even if they are too small to matter in practice.
If we fail to reject $H_0$, it may be because $H_0$ is true or because our sample size is insufficient to detect
the alternative.
Plot your data and look at your P-value to determine your conclusions. Could outliers be part of the
problem?
A confidence interval actually estimates the size of an effect rather than simply asking if it is too large to
reasonably occur by chance alone.
You must have a well-designed experiment in order for statistical inference to work. Randomization is
important.
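The points about sample size can be illustrated by holding a small difference fixed and changing only n. The numbers here are invented for the sketch: a sample mean of 100.2 against a hypothesized mean of 100, with a known standard deviation of 10.

```python
import math
from statistics import NormalDist

def two_sided_p(x_bar, mu_0, sigma, n):
    """Two-sided P-value for H0: mu = mu_0 with known sigma."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical values: the same 0.2-unit difference at three sample sizes.
p_values = {n: two_sided_p(x_bar=100.2, mu_0=100.0, sigma=10.0, n=n)
            for n in (25, 2500, 40000)}

for n, p in p_values.items():
    print(f"n = {n:>6}: P-value = {p:.5f}")
```

With the small sample the test fails to reject $H_0$ even though the difference is there, and with the very large sample the same difference is highly significant. This is one reason to report a confidence interval for the size of the effect alongside the P-value.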
