https://onlinecourses.science.psu.edu/stat510/print/book/export/html/47
Published on STAT 510 (https://onlinecourses.science.psu.edu/stat510)
1.1 Overview of Time Series Characteristics
In this lesson, we'll describe some important features that we must consider when describing and modeling a time series. This is meant to be an introductory overview, illustrated by example, and not a complete look at how we model a univariate time series. Here, we'll only consider univariate time series. We'll examine relationships between two or more time series later on.
Definition:
A univariate time series is a sequence of measurements of the same variable collected over time. Most often, the measurements are made at regular time intervals.
One difference from standard linear regression is that the data are not necessarily independent and not necessarily identically distributed. One defining characteristic of a time series is that it is a list of observations where the ordering matters. Ordering is very important because there is dependency, and changing the order could change the meaning of the data.
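To illustrate why ordering matters, here is a small sketch that is not part of the original lesson. The toy values and the helper name `lag1_correlation` are assumptions made purely for illustration. Reordering the same six values leaves the mean unchanged, but the correlation between consecutive observations changes sign.

```python
# Illustrative sketch: reordering a series keeps its mean but changes
# the dependence between consecutive observations.

def lag1_correlation(x):
    """Sample correlation between each value and the value just before it."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    numer = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    return numer / denom

original = [1, 2, 3, 4, 5, 6]    # toy series (hypothetical values)
reordered = [1, 6, 2, 5, 3, 4]   # same values, different order

mean_original = sum(original) / len(original)
mean_reordered = sum(reordered) / len(reordered)
r1_original = lag1_correlation(original)
r1_reordered = lag1_correlation(reordered)

print(mean_original, mean_reordered)   # identical means
print(r1_original, r1_reordered)       # positive versus negative dependence
```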
Basic Objectives of the Analysis
The basic objective usually is to determine a model that describes the pattern of the time series. Uses for such a model are:
1. To describe the important features of the time series pattern.
2. To explain how the past affects the future or how two time series can interact.
3. To forecast future values of the series.
4. To possibly serve as a control standard for a variable that measures the quality of product in some manufacturing situations.
Types of Models
There are two basic types of time domain models.
1. Models that relate the present value of a series to past values and past prediction errors; these are called ARIMA models (for Autoregressive Integrated Moving Average). We'll spend substantial time on these.
2. Ordinary regression models that use time indices as x-variables. These can be helpful for an initial description of the data and form the basis of several simple forecasting methods.
Important Characteristics to Consider First
Some important questions to consider when first looking at a time series are:
Is there a trend, meaning that, on average, the measurements tend to increase (or decrease) over time?
Is there seasonality, meaning that there is a regularly repeating pattern of highs and lows related to calendar time such as seasons, quarters, months, days of the week, and so on?
Are there outliers? In regression, outliers are far away from your line. With time series data, your outliers are far away from your other data.
Is there a long-run cycle or period unrelated to seasonality factors?
Is there constant variance over time, or is the variance non-constant?
Are there any abrupt changes to either the level of the series or the variance?
Example 1
The following plot is a time series plot of the annual number of earthquakes in the world with seismic magnitude over 7.0, for 99 consecutive years. By a time series plot, we simply mean that the variable is plotted against time.
Some features of the plot:
There is no consistent trend (upward or downward) over the entire time span. The series appears to slowly wander up and down. The horizontal line drawn at quakes = 20.2 indicates the mean of the series. Notice that the series tends to stay on the same side of the mean (above or below) for a while and then wanders to the other side.
Almost by definition, there is no seasonality as the data are annual data.
There are no obvious outliers.
It's difficult to judge whether the variance is constant or not.
One of the simplest ARIMA type models is a model in which we use a linear model to predict the value at the present time using the value at the previous time. This is called an AR(1) model, standing for autoregressive model of order 1. The order of the model indicates how many previous times we use to predict the present time.
A start in evaluating whether an AR(1) might work is to plot values of the series against lag 1 values of the series. Let $x_t$ denote the value of the series at any particular time $t$, so $x_{t-1}$ denotes the value of the series one time before time $t$. That is, $x_{t-1}$ is the lag 1 value of $x_t$. As a short example, here are the first five values in the earthquake series along with their lag 1 values:
t    x_t    x_{t-1} (lag 1 value)
1    13     *
2    14     13
3    8      14
4    10     8
5    16     10
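The lag 1 column can be built by shifting the series by one position. The following is a minimal sketch using only the five values listed above; the variable names are illustrative choices, and `None` stands in for the `*` shown for the first observation.

```python
# First five values of the earthquake series, as listed in the table above.
x = [13, 14, 8, 10, 16]

# Shift the series by one position to get the lag 1 values.
# The first observation has no previous value, so None plays the role of "*".
lag1 = [None] + x[:-1]

for t, (value, lagged) in enumerate(zip(x, lag1), start=1):
    print(t, value, lagged)

# Complete (x_t, x_{t-1}) pairs, usable for a scatterplot or a regression.
pairs = [(value, lagged) for value, lagged in zip(x, lag1) if lagged is not None]
```

Only four complete pairs result from five observations, which is why one case is lost whenever a lag 1 variable is used.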
For the complete earthquake dataset, here's a plot of $x_t$ versus $x_{t-1}$:
Although it's only a moderately strong relationship, there is a positive linear association, so an AR(1) model might be a useful model.
The AR(1) model
Theoretically, the AR(1) model is written
$$x_t = \delta + \phi_1 x_{t-1} + w_t$$
Assumptions:
$w_t \overset{iid}{\sim} N(0, \sigma^2_w)$, meaning that the errors are independently distributed with a normal distribution that has mean 0 and constant variance.
Properties of the errors $w_t$ are independent of $x$.
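One way to get a feel for these assumptions is to generate data from the model. The sketch below is not from the lesson: the function name `simulate_ar1`, the argument names (`delta` for the intercept, `phi1` for the lag 1 coefficient, `sigma_w` for the error standard deviation), and all parameter values are assumptions chosen for illustration.

```python
import random

def simulate_ar1(delta, phi1, sigma_w, x0, n, seed=0):
    """Generate n values from x_t = delta + phi1 * x_{t-1} + w_t,
    with the errors w_t drawn independently from N(0, sigma_w^2)."""
    rng = random.Random(seed)
    series = []
    previous = x0
    for _ in range(n):
        w = rng.gauss(0.0, sigma_w)          # iid normal error
        current = delta + phi1 * previous + w
        series.append(current)
        previous = current
    return series

# Hypothetical parameter values, not estimates from the earthquake data.
simulated = simulate_ar1(delta=9.0, phi1=0.5, sigma_w=6.0, x0=18.0, n=99)

# With sigma_w = 0 the errors vanish and the recursion is deterministic.
noiseless = simulate_ar1(delta=2.0, phi1=0.5, sigma_w=0.0, x0=10.0, n=4)
print(noiseless)
```

Setting the error standard deviation to zero isolates the deterministic part of the recursion, which makes it easy to check the code against the equation by hand.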
This is essentially the ordinary simple linear regression equation, but there is one difference. Although it's not usually true, in ordinary least squares regression we assume that the x-variable is not random but instead is something we can control. That's not the case here, but in our first encounter with time series we'll overlook that and use ordinary regression methods. We'll do things the right way later in the course.
Following is Minitab output for the AR(1) regression in this example:
quakes = 9.19 + 0.543 lag1
98 cases used, 1 cases contain missing values
Predictor   Coef    SE Coef   T      P
Constant    9.191   1.819     5.05   0.000
lag1
S = 6.12239   R-Sq = 29.7%   R-Sq(adj) = 29.0%
We see that the slope coefficient is significantly different from 0, so the lag 1 variable is a helpful predictor. The $R^2$ value is relatively weak at 29.7%, though, so the model won't give us great predictions.
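The regression of a series on its own lag 1 values can be sketched with the usual simple least squares formulas. The lesson uses Minitab, so the function below (`fit_lag1_regression`, a hypothetical name) is only an illustration of the idea. The full 99-year dataset is not reproduced in this lesson, so the sketch runs on a made-up series that follows a known linear recursion exactly and will not reproduce the 9.19 and 0.543 reported above.

```python
def fit_lag1_regression(x):
    """Least squares fit of x_t on x_{t-1}.
    Returns (intercept, slope, r_squared)."""
    y = x[1:]        # responses: x_t for t = 2, ..., n
    z = x[:-1]       # predictors: the matching lag 1 values x_{t-1}
    m = len(y)
    y_bar = sum(y) / m
    z_bar = sum(z) / m
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    slope = s_zy / s_zz
    intercept = y_bar - slope * z_bar
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * zi)) ** 2 for zi, yi in zip(z, y))
    r_squared = 1 - ss_resid / ss_total
    return intercept, slope, r_squared

# Hypothetical series that follows x_t = 2 + 0.5 * x_{t-1} exactly,
# so the fit should recover an intercept near 2 and a slope near 0.5.
toy = [10.0, 7.0, 5.5, 4.75, 4.375]
intercept, slope, r_squared = fit_lag1_regression(toy)
print(intercept, slope, r_squared)
```

As in the Minitab output, one case is lost because the first observation has no lag 1 value.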
Residual Analysis
In traditional regression, a plot of residuals versus fits is a useful diagnostic tool. The ideal for this plot is a horizontal band of points. Following is a plot of residuals versus predicted values for our estimated model. It doesn't show any serious problems. There might be one possible outlier at a fitted value of about 28.
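The quantities in a residuals versus fits plot are straightforward to compute. As a small sketch, the code below applies the reported equation (quakes = 9.19 + 0.543 lag1) to the first five values of the series listed earlier. It covers only those five values, not the full dataset behind the plot, and the variable names are illustrative.

```python
# Coefficients from the reported fit: quakes = 9.19 + 0.543 * lag1
INTERCEPT = 9.19
SLOPE = 0.543

# First five values of the earthquake series (from the short table in Example 1)
x = [13, 14, 8, 10, 16]

rows = []
for t in range(1, len(x)):              # the first observation has no lag 1 value
    fitted = INTERCEPT + SLOPE * x[t - 1]
    residual = x[t] - fitted            # observed minus fitted
    rows.append((fitted, residual))

for fitted, residual in rows:
    print(round(fitted, 3), round(residual, 3))
```

Plotting the second element of each pair against the first gives the residuals versus fits display described above.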
Example 2
The following plot shows a time series of quarterly production of beer in Australia for 18 years.
Some important features are:
There is an upward trend, possibly a curved one.
There is seasonality, a regularly repeating pattern of highs and lows related to quarters of the year.
There might be increasing variation as we move across time, although that's uncertain.
There are ARIMA methods for dealing with series that exhibit both trend and seasonality, but for this example we'll use ordinary regression methods.
Classical regression methods for trend and seasonal effects
To use traditional regression methods, we might model the pattern in the beer production data as a combination of trend over time and quarterly effect variables.
Suppose that the observed series is $x_t$, for $t = 1, 2, \dots, n$.
For a linear trend, use $t$ (the time index) as a predictor variable in a regression.
For a quadratic trend, we might consider using both $t$ and $t^2$.
For quarterly data, with possible seasonal (quarterly) effects, we can define indicator variables such as $S_j = 1$ if the observation is in quarter $j$ of a year and 0 otherwise. There are 4 such indicators.
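The predictors described above can be assembled one row per time point. The following is a minimal sketch; the function name `design_row` is hypothetical, and the code assumes that the first observation (t = 1) falls in quarter 1, which the lesson does not state.

```python
def design_row(t):
    """Predictors for time index t (1-based): [t, t^2, S1, S2, S3, S4].
    Assumption: the observation at t = 1 falls in quarter 1."""
    quarter = (t - 1) % 4 + 1
    indicators = [1 if quarter == j else 0 for j in (1, 2, 3, 4)]
    return [t, t ** 2] + indicators

# Two years of quarterly time indices
design = [design_row(t) for t in range(1, 9)]
for row in design:
    print(row)
```

Each row has exactly one indicator equal to 1, marking the quarter in which that observation falls.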
To add a quadratic trend, which may be the case in our example, the model is
$$x_t = \beta_1 t + \beta_2 t^2 + \alpha_1 S_1 + \alpha_2 S_2 + \alpha_3 S_3 + \alpha_4 S_4 + \epsilon_t$$
with iid errors $\epsilon_t$.
Note that we've deleted the intercept from the model. This isn't necessary, but if we include it we'll have to drop one of the seasonal effect variables from the model to avoid collinearity issues.
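The collinearity arises because the four quarterly indicators always sum to 1, which is exactly the intercept column. The sketch below checks this directly; it assumes the first observation falls in quarter 1, and dropping the fourth indicator is an arbitrary choice made for illustration.

```python
# Quarter indicators S1..S4 for eight consecutive quarters
# (assumption: the first observation falls in quarter 1).
indicators = [[1 if (t - 1) % 4 + 1 == j else 0 for j in (1, 2, 3, 4)]
              for t in range(1, 9)]

intercept_column = [1] * len(indicators)
indicator_sums = [sum(row) for row in indicators]

# The intercept column equals S1 + S2 + S3 + S4 in every row,
# so including all five columns would make them linearly dependent.
print(indicator_sums == intercept_column)

# Keeping the intercept and dropping one indicator (here S4) removes the redundancy.
reduced = [[1] + row[:3] for row in indicators]
```

With this reduced coding, the coefficients on the remaining indicators would be interpreted relative to the dropped quarter.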
Back to Example 2: Following is the Minitab output for a model with a quadratic trend and seasonal effects. All factors are statistically significant.
Predictor    Coef      SE Coef   T       P
Noconstant
Time         0.5881    0.2193    2.68    0.009
tsqrd
quarter_1    261.930   3.937     66.52   0.000
quarter_2    212.165   3.968     53.48   0.000
quarter_3    228.415   3.994     57.18   0.000
quarter_4    310.880   4.018     77.37   0.000
Residual Analysis
For this example, the plot of residuals versus fits doesn't look too bad, although we might be concerned by the string of positive residuals at the far right.
When data are gathered over time, we typically are concerned with whether a value at the present time can be predicted from values at past times. We saw this in the earthquake data of Example 1 when we used an AR(1) structure to model the data. For residuals, however, the desirable result is that the correlation is 0 between residuals separated by any given time span. In other words, residuals should be unrelated to each other.
Sample Autocorrelation Function (ACF)
The sample autocorrelation function (ACF) for a series gives correlations between the series $x_t$ and lagged values of the series for lags of 1, 2, 3, and so on. The lagged values can be written as $x_{t-1}$, $x_{t-2}$, $x_{t-3}$, and so on. The ACF gives correlations between $x_t$ and $x_{t-1}$, $x_t$ and $x_{t-2}$, and so on.
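A sample ACF can be computed in a few lines. The lesson does not give the formula at this point, so the sketch below assumes the commonly used estimator, in which the lagged cross-products of deviations from the overall mean are divided by the total sum of squared deviations. The function name `sample_acf` and the toy data are hypothetical.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, r_1, ..., r_max_lag of a series,
    using deviations from the overall mean and a common denominator."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    acf = []
    for h in range(max_lag + 1):
        # Pair each x_t with the value h steps earlier, x_{t-h}
        numer = sum((x[t] - mean) * (x[t - h] - mean) for t in range(h, n))
        acf.append(numer / denom)
    return acf

toy = [1, 2, 3, 4, 5]            # hypothetical values for illustration
acf_values = sample_acf(toy, max_lag=2)
print(acf_values)
```

The lag 0 value is always 1, since a series is perfectly correlated with itself.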
The ACF can be used to identify the possible structure of time series data. That can be tricky going as there often isn't a single clear-cut interpretation of a sample autocorrelation function. We'll get started on that in Lesson 1.2 this week. The ACF of the residuals for a model is also useful. The ideal for a sample ACF of residuals is that there aren't any significant correlations for any lag.
Following is the ACF of the residuals for Example 1, the earthquake example, where we used an AR(1) model. The lag (time span between observations) is shown along the horizontal, and the autocorrelation is on the vertical. The red lines indicate bounds for statistical significance. This is a good ACF for residuals. Nothing is significant; that's what we want for residuals.
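The lesson does not say how the significance bounds on the plot were computed. A common approximation, assumed here, places them at plus or minus 1.96 divided by the square root of the number of observations. The sketch below uses that assumption; the function names and the residual autocorrelation values are hypothetical and are not taken from the earthquake fit.

```python
import math

def acf_bounds(n, z=1.96):
    """Approximate significance bounds (+/- z / sqrt(n)) for a sample ACF.
    Assumption: the large-sample approximation for an uncorrelated series."""
    bound = z / math.sqrt(n)
    return -bound, bound

def significant_lags(acf_values, n):
    """Lags (counted from 1) whose autocorrelation falls outside the bounds."""
    lower, upper = acf_bounds(n)
    return [lag for lag, r in enumerate(acf_values, start=1)
            if r < lower or r > upper]

# Hypothetical residual autocorrelations for lags 1 to 4, with n = 98 residuals
example_acf = [0.05, -0.12, 0.25, 0.01]
flagged = significant_lags(example_acf, n=98)
print(acf_bounds(98), flagged)
```

An empty list of flagged lags would correspond to the "nothing is significant" situation described above.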
The ACF of the residuals for the quadratic trend plus seasonality model we used for Example 2 looks good too. Again, there appears to be no significant autocorrelation in the residuals. The ACF of the residuals follows:
Lesson 1.2 will give more details about the ACF. Lesson 1.3 will give some R code for examples in Lessons 1.1 and 1.2.
Source URL: https://onlinecourses.science.psu.edu/stat510/node/47