When analyzing data, we can't just accept the sample mean or sample proportion as the official mean or proportion. When we estimate the statistics \(\bar{x}\) and \(\hat{p}\) (sample mean and sample proportion), we get different answers due to variability.
So we have to perform statistical inference:
Confidence Interval: when you want to estimate a population parameter
Significance Testing: when we want to assess the evidence provided by the data in favor of some claim about the population.
Section 6.1: Confidence Intervals
Confidence intervals allow us to estimate a range of values for the population mean or proportion.
The true mean or proportion for the population exists and is a fixed number, but we don't know it!
Using sample statistics we get an estimate of where we expect the population parameter to be.
If we take a single sample, our single confidence interval "net" may or may not include the population parameter. However, if we take many samples of the same size and create a confidence interval from each sample statistic, over the long run 95% of our confidence intervals will contain the true population parameter (if we are using a 95% confidence level).
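The long-run interpretation above can be checked with a small simulation. This is only a sketch: the population values (mean 50, standard deviation 10), the sample size of 25, and the function name `coverage_rate` are made up for illustration, and the population is assumed to be normal with a known standard deviation.

```python
import random
from statistics import NormalDist, mean

def coverage_rate(mu, sigma, n, conf_level, n_samples, seed=0):
    """Fraction of z-intervals (known sigma) that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf((1 + conf_level) / 2)
    margin = z_star * sigma / n ** 0.5
    hits = 0
    for _ in range(n_samples):
        # Draw one sample of size n and build its confidence interval.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / n_samples

# Hypothetical population: mu = 50, sigma = 10; 2000 samples of size 25.
rate = coverage_rate(mu=50.0, sigma=10.0, n=25, conf_level=0.95, n_samples=2000)
print(f"Share of intervals containing mu: {rate:.3f}")
```

Any one interval either contains the true mean or it doesn't; the share printed here should land near 0.95, which is what the confidence level describes.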
If you increase your sample size (n), you decrease your margin of error.
If you increase your confidence level (C), then you increase your margin of error.
A smaller margin of error is good because we get a smaller range of where to expect the true population parameter.
Confidence interval formulas look like estimate ± margin of error.
We write the intervals as (lower bound, upper bound).
Confidence Interval for a Population Mean, \(\mu\):
\[ \bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} \]
where \(z^*\) is the value on the standard normal curve with area C between \(-z^*\) and \(z^*\).
z*    1.645    1.960    2.576
C     90%      95%      99%
(Table D in the back of the book contains more values, but these are the most common.)
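For confidence levels that are not in the table, \(z^*\) can be computed from the standard normal curve. The sketch below uses Python's standard library; the helper names `z_star` and `z_interval` and the numbers in the comparison (mean 100, standard deviation 15) are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def z_star(conf_level):
    """Critical value with central area conf_level under the standard normal curve."""
    return NormalDist().inv_cdf((1 + conf_level) / 2)

def z_interval(x_bar, sigma, n, conf_level):
    """(lower bound, upper bound) for the population mean, sigma known."""
    margin = z_star(conf_level) * sigma / sqrt(n)
    return (x_bar - margin, x_bar + margin)

for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.0%}: z* = {z_star(c):.3f}")

# Hypothetical numbers: same mean and sigma, two different sample sizes.
narrow = z_interval(x_bar=100.0, sigma=15.0, n=100, conf_level=0.95)
wide = z_interval(x_bar=100.0, sigma=15.0, n=25, conf_level=0.95)
print("n = 100:", narrow)
print("n = 25: ", wide)
```

Quadrupling the sample size halves the margin of error, because n enters the formula through its square root.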
Sample Size, n, for Desired Margin of Error, m:
\[ n = \left( \frac{z^* \sigma}{m} \right)^2 \]
Note that it is the sample size, n, that influences the margin of error. The population size has nothing to do with it.
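A minimal sketch of the sample size calculation follows. The function name `required_sample_size` and the inputs (standard deviation 10, margin of error 2) are hypothetical. The result is rounded up, since a sample size must be a whole number and rounding down would give a margin of error slightly larger than the one requested.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, margin, conf_level):
    """Smallest n whose margin of error is at most `margin` (sigma known)."""
    z_star = NormalDist().inv_cdf((1 + conf_level) / 2)
    return ceil((z_star * sigma / margin) ** 2)

# Hypothetical inputs: sigma = 10, desired margin of error m = 2, C = 95%.
n_needed = required_sample_size(sigma=10.0, margin=2.0, conf_level=0.95)
print("Sample size needed:", n_needed)
```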
Ways to reduce your margin of error:
1.) Increase sample size
2.) Use a lower level of confidence (smaller C)
3.) Reduce \(\sigma\)
Be careful!!!! You can only use the formula \(\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}\) under certain circumstances:
Data must be an SRS from the population.
Do not use if the sampling is anything more complicated than an SRS.
Data must be collected correctly (no bias). The margin of error covers only random sampling errors. Undercoverage and nonresponse are not covered.
Outliers can have a big effect on the confidence interval. (This makes sense because we use the mean and standard deviation to get a CI.)
You must know the standard deviation of the population, \(\sigma\).
EXAMPLE 1: A questionnaire of spending habits was given to a random sample of college students. Each student was asked to record and report the amount of money they spent on textbooks in a semester. The sample of 130 students resulted in an average of $422 with standard deviation of $57.
a) Give a 90% confidence interval for the mean amount of money spent by college students on textbooks.
b) Is it true that 90% of the students spent the amount of money found in the interval from part (a)? Explain your answer.
c) What is the margin of error for the 90% confidence interval?
d) How many students should you sample if you want a margin of error of $5 for a 90% confidence interval?
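One way to check the arithmetic for parts (a), (c), and (d) is sketched below. It assumes the reported standard deviation of $57 can be treated as the known population standard deviation \(\sigma\), as the Z procedure in this section requires; the variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Values from Example 1. Assumption: the $57 standard deviation is used as sigma.
n, x_bar, sigma = 130, 422.0, 57.0
z_star = NormalDist().inv_cdf(0.95)  # 90% confidence leaves 5% in each tail

margin = z_star * sigma / sqrt(n)            # part (c)
interval = (x_bar - margin, x_bar + margin)  # part (a)
n_for_5 = ceil((z_star * sigma / 5.0) ** 2)  # part (d), rounded up

print(f"Margin of error: {margin:.2f}")
print(f"90% CI: ({interval[0]:.2f}, {interval[1]:.2f})")
print("Sample size for a $5 margin:", n_for_5)
```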
EXAMPLE 2: A sample of 12 STAT 301 students yields the following Exam 1 scores:
78  62  99  85  94  53  88  90  86  92  75  92
Assume that the population standard deviation is 10. The sample mean can be calculated using SPSS or calculator to be 82.83.
(Note: Do NOT use any SPSS confidence intervals; they are good only for Chapter 7, not this type of CI. You must get these Z confidence intervals by hand.)
a) Find the 90% confidence interval for the mean score for STAT 301 students.
b) Find the 95% confidence interval.
c) Find the 99% confidence interval.
d) How do the margins of error in (a), (b), and (c) change as the confidence level increases? Why?
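The by-hand results for Example 2 can be checked with a short script. This is a sketch using the scores and the stated population standard deviation of 10; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean

scores = [78, 62, 99, 85, 94, 53, 88, 90, 86, 92, 75, 92]
sigma = 10.0  # population standard deviation given in the example
x_bar = mean(scores)
std_error = sigma / sqrt(len(scores))

margins = {}
for conf_level in (0.90, 0.95, 0.99):
    z_star = NormalDist().inv_cdf((1 + conf_level) / 2)
    margins[conf_level] = z_star * std_error
    lower, upper = x_bar - margins[conf_level], x_bar + margins[conf_level]
    print(f"{conf_level:.0%} CI: ({lower:.2f}, {upper:.2f}), "
          f"margin = {margins[conf_level]:.2f}")
```

Only \(z^*\) changes from one interval to the next, so the margin of error grows with the confidence level.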
Section 6.2: Hypothesis Testing
The 4 steps common to all tests of significance:
1. State the null hypothesis \(H_0\) and the alternative hypothesis \(H_a\).
2. Calculate the value of the test statistic.
3. Draw a picture of what \(H_a\) looks like, and find the P-value.
4. State your conclusion about the data in a sentence, using the P-value and/or comparing the P-value to a significance level for your evidence.
STEP 1: State the null hypothesis \(H_0\) and the alternative hypothesis \(H_a\).
To do a significance test, you need 2 hypotheses:
\(H_0\), Null Hypothesis: the statement being tested, usually phrased as "no effect" or "no difference."
\(H_a\), Alternative Hypothesis: the statement we hope or suspect is true instead of \(H_0\).
Hypotheses always refer to some population or model, not to a particular outcome.
Hypotheses can be one-sided or two-sided.
One-sided hypothesis: covers just part of the range for your parameter
\(H_0: \mu = 10\), \(H_a: \mu < 10\)
OR
\(H_0: \mu = 10\), \(H_a: \mu > 10\)
Two-sided hypothesis: covers the whole possible range for your parameter
\(H_0: \mu = 10\), \(H_a: \mu \neq 10\)
Even though \(H_a\) is what we hope or believe to be true, our test gives evidence for or against \(H_0\) only.
We never prove \(H_0\) true; we can only state whether we have enough evidence to reject \(H_0\) (which is evidence in favor of \(H_a\), but not proof that \(H_a\) is true) or that we don't have enough evidence to reject \(H_0\).
Example (Exercise 6.37, p. 418):
Each of the following situations requires a significance test about a population mean. State the appropriate null hypothesis \(H_0\) and alternative hypothesis \(H_a\) in each case:
a. Census Bureau data shows that the mean household income in the area served by a shopping mall is $72,500 per year. A market research firm questions shoppers at the mall to find out whether the mean household income of mall shoppers is higher than that of the general population.
b. Last year, your company's service technicians took an average of 1.8 hours to respond to trouble calls from business customers who had purchased service contracts. Do this year's data show a different average response time?
STEP 2: Calculate the value of the test statistic.
A test statistic measures compatibility between \(H_0\) and the data. The formula for the test statistic will vary between different types of problems. In problems like those we studied in Chapter 6, the test statistic will be the Z-score.
STEP 3: Draw a picture of what \(H_a\) looks like, and find the P-value.
P-value: the probability, computed assuming that \(H_0\) is true, that the test statistic would take a value as extreme or more extreme than that actually observed due to random fluctuation. It is a measure of how unusual your sample results are.
The smaller the P-value, the stronger the evidence against \(H_0\) provided by the data.
Calculate the P-value by using the sampling distribution of the test statistic (only the normal distribution for Chapter 6).
STEP 4: Compare your P-value to a significance level. State your conclusion about the data in a sentence.
Compare the P-value to a significance level, \(\alpha\).
If the P-value \(\le \alpha\), we can reject \(H_0\).
If you can reject \(H_0\), your results are significant.
If you do not reject \(H_0\), your results are not significant.
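Z Test for a Population Mean
To test the hypothesis \(H_0: \mu = \mu_0\) based on an SRS of size n from a population with unknown mean \(\mu\) and known standard deviation \(\sigma\), compute the test statistic:
\[ Z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \]
The P-values for a test of \(H_0\) against:
\(H_a: \mu > \mu_0\) is \(P(Z \ge Z_0)\)
\(H_a: \mu < \mu_0\) is \(P(Z \le Z_0)\)
\(H_a: \mu \neq \mu_0\) is \(2 \cdot P(Z \ge |Z_0|)\)
These P-values are exact if the population is normally distributed, and are approximately correct for large n in other cases.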
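The test statistic and the three P-value rules can be sketched as a small function. The function name `z_test`, the labels for the alternatives, and the numbers in the usage lines are assumptions for illustration; they are not from the textbook or SPSS.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alternative):
    """Return (Z_0, P-value) for H0: mu = mu_0 with sigma known.

    alternative is 'greater' (Ha: mu > mu_0), 'less' (Ha: mu < mu_0),
    or 'two-sided' (Ha: mu != mu_0).
    """
    z_0 = (x_bar - mu_0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z_0)             # P(Z >= Z_0)
    elif alternative == "less":
        p_value = std_normal.cdf(z_0)                 # P(Z <= Z_0)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z_0)))  # 2 * P(Z >= |Z_0|)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z_0, p_value

# Hypothetical data: x_bar = 52, mu_0 = 50, sigma = 6, n = 36.
z_0, p_two = z_test(52.0, 50.0, 6.0, 36, "two-sided")
_, p_greater = z_test(52.0, 50.0, 6.0, 36, "greater")
_, p_less = z_test(52.0, 50.0, 6.0, 36, "less")
print(f"Z_0 = {z_0:.2f}, two-sided P-value = {p_two:.4f}")
```

With the same data, the two-sided P-value is twice the smaller of the two one-sided P-values, which is why the direction of \(H_a\) has to be chosen before looking at the data.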
EXAMPLES
1. Last year the government made a claim that the average income of the American people was $33,950. However, a sample of 50 people taken recently showed an average income of $34,076 with a population standard deviation of $324. Is the government's estimate too low? Conduct a significance test to see if the true mean is more than the reported average. Use \(\alpha = 0.01\).
2. An environmentalist collects a liter of water from 45 different locations along the banks of a stream. He measures the amount of dissolved oxygen in each specimen. The mean oxygen level is 4.62 mg, with the overall standard deviation of 0.92. A water purifying company claims that the mean level of oxygen in the water is 5 mg. Conduct a hypothesis test with \(\alpha = 0.001\) to determine whether the mean oxygen level is less than 5 mg.
3. An agroeconomist examines the cellulose content of a variety of alfalfa hay. Suppose that the cellulose content in the population has a standard deviation of 8 mg. A sample of 15 cuttings has a mean cellulose content of 145 mg. A previous study claimed that the mean cellulose content was 140 mg. Perform a hypothesis test to determine if the mean cellulose content is different from 140 mg if \(\alpha = 0.05\).
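The arithmetic in steps 2 through 4 for these three examples can be checked with the sketch below. It assumes each stated standard deviation is the known population \(\sigma\) (the second example calls it the "overall" standard deviation); the helper `z_statistic` and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(x_bar, mu_0, sigma, n):
    """Z_0 = (x_bar - mu_0) / (sigma / sqrt(n))."""
    return (x_bar - mu_0) / (sigma / sqrt(n))

std_normal = NormalDist()

# Example 1: Ha: mu > 33950, alpha = 0.01
z1 = z_statistic(34076, 33950, 324, 50)
p1 = 1 - std_normal.cdf(z1)

# Example 2: Ha: mu < 5, alpha = 0.001 (0.92 treated as sigma, an assumption)
z2 = z_statistic(4.62, 5, 0.92, 45)
p2 = std_normal.cdf(z2)

# Example 3: Ha: mu != 140, alpha = 0.05
z3 = z_statistic(145, 140, 8, 15)
p3 = 2 * (1 - std_normal.cdf(abs(z3)))

for label, z, p, alpha in [("Example 1", z1, p1, 0.01),
                           ("Example 2", z2, p2, 0.001),
                           ("Example 3", z3, p3, 0.05)]:
    decision = "reject H0" if p <= alpha else "do not reject H0"
    print(f"{label}: Z_0 = {z:.2f}, P-value = {p:.4f}, alpha = {alpha}: {decision}")
```

The first two examples have test statistics of nearly the same size but lead to different decisions, because the second uses a much stricter significance level.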
How does \(\alpha\) relate to confidence intervals?
If you have a 2-sided test, and if \(\alpha\) and the confidence level add to 100%, you can reject \(H_0\) if \(\mu_0\) (the number you were checking) is not in the confidence interval.
a) Find a 95% confidence interval for the mean cellulose content from the above example.
b) Now try the test from example number 3 again, using the confidence interval from part (a) to do the hypothesis test. (The result should be the same.)
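The agreement between the two approaches can be checked numerically for the cellulose example. This is a sketch with illustrative variable names; it builds the 95% interval, runs the two-sided test at \(\alpha = 0.05\), and compares the two decisions.

```python
from math import sqrt
from statistics import NormalDist

# Values from the cellulose example: x_bar = 145, mu_0 = 140, sigma = 8, n = 15.
x_bar, mu_0, sigma, n = 145.0, 140.0, 8.0, 15
alpha = 0.05  # pairs with a 95% confidence level

std_error = sigma / sqrt(n)
z_star = NormalDist().inv_cdf(1 - alpha / 2)
lower, upper = x_bar - z_star * std_error, x_bar + z_star * std_error

z_0 = (x_bar - mu_0) / std_error
p_value = 2 * (1 - NormalDist().cdf(abs(z_0)))

reject_by_interval = not (lower <= mu_0 <= upper)
reject_by_p_value = p_value <= alpha

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print(f"Two-sided P-value: {p_value:.4f}")
print("Reject H0 by interval:", reject_by_interval)
print("Reject H0 by P-value: ", reject_by_p_value)
```

The two decisions always match for a two-sided Z test, because \(\mu_0\) falls outside the interval exactly when \(|Z_0|\) exceeds \(z^*\).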
Annual Drinking Water Quality Report, 2004, Town of Brookston, IN
"I'm pleased to report that our drinking water is safe and meets federal and state requirements."
Test Results (MCL is the maximum contaminant level, the highest level of a contaminant that is allowed in drinking water.)
Contaminant            Violation Y/N   Level Detected    Unit of measurement   MCL
Beta/photon emitters   N               2.1 ± 3.2         mrem/yr
Alpha emitters                         0 ± 1.6           pCi/l                 15
Barium                                 0.216             ppm
Copper                                 0.039 to 0.453    ppm                   1.3
Fluoride                               0.01              ppm
Sodium                                 0.0               ppm                   N/A
One of these violation reports should actually be a "yes" instead of a "no." Which one is it and why?
What hypotheses go along with these confidence intervals?
Note: When I called the town of Brookston office to ask them about this, the water manager called the state EPA office to get more information. What they told him was that, yes, technically I was correct, but that they don't use the confidence intervals that are reported. Apparently these are the FEDERAL EPA rules. They only use the mean. I tried to get sample size or other information, but I wasn't able to learn anything more.
P-values can be more informative than a reject/do not reject \(H_0\) decision based on \(\alpha\). As the P-value gets smaller, the evidence for rejecting \(H_0\) gets stronger.
Just because we use \(\alpha = 0.05\) a lot doesn't mean that's the level you have to use; it's just the most common. There's nothing particularly special about that level.
In a large sample, even tiny deviations from the null hypothesis can be statistically significant, whether or not they matter in practice.
If we fail to reject \(H_0\), it may be because \(H_0\) is true or because our sample size is insufficient to detect the alternative.
Plot your data and look at your P-value to determine your conclusions. Could outliers be part of the problem?
A confidence interval actually estimates the size of an effect rather than simply asking if it is too large to reasonably occur by chance alone.
You must have a well-designed experiment in order for statistical inference to work. Randomization is important.
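The remarks above about sample size can be illustrated with a quick calculation. The numbers are hypothetical: the sample mean is assumed to sit 0.1 units away from \(\mu_0\) with \(\sigma = 1\), and only n changes. The function name `two_sided_p_value` is also an assumption.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(difference, sigma, n):
    """P-value of a two-sided Z test when x_bar - mu_0 equals `difference`."""
    z_0 = difference / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z_0)))

# Hypothetical: the same small difference (0.1 with sigma = 1) at three sample sizes.
difference, sigma = 0.1, 1.0
p_values = {n: two_sided_p_value(difference, sigma, n) for n in (10, 100, 1000)}
for n, p in p_values.items():
    print(f"n = {n:>4}: P-value = {p:.4f}")
```

The same small difference is far from significant at n = 10 and clearly significant at n = 1000, so a P-value by itself says nothing about how large or important the effect is.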