
Lecture 11: An introduction to time series

Moursli Mohamed Reda


University of Gothenburg

February 5, 2013


What we previously covered


- The definition of a random process $y_t$
- The white noise process, usually a property we seek in the error term of a regression
- Weak stationarity: an essential set of assumptions about the moments of our process, which guarantees a well-behaved series
- The assumptions required for the Gauss-Markov theorem to apply
- The importance of properly handling trends in the data
- Serial correlation

Recap
Autocorrelation Function
- We generalize the concept of correlation to that of autocorrelation when looking at the linear dependence within a (weakly) stationary series $z_t$.
- The autocorrelation between $z_t$ and $z_{t-s}$ is given by
  $$\rho_s = \frac{\text{Cov}(z_t, z_{t-s})}{\sqrt{\text{Var}(z_t)\,\text{Var}(z_{t-s})}} = \frac{\text{Cov}(z_t, z_{t-s})}{\text{Var}(z_t)}$$
- In the second equality, the constant variance assumption implied by weak stationarity is used.
- We can also estimate the lag-1 sample autocorrelation analogously, as shown next.


- Assume we have a sample of returns $r_t$, $t = 1, \ldots, T$, with sample mean $\bar{r}$. The lag-1 sample autocorrelation of $r_t$ is then given by
  $$\hat{\rho}_1 = \frac{\sum_{t=2}^{T} (r_t - \bar{r})(r_{t-1} - \bar{r})}{\sum_{t=1}^{T} (r_t - \bar{r})^2}$$
- In general, the lag-s sample autocorrelation of $r_t$ is defined as
  $$\hat{\rho}_s = \frac{\sum_{t=s+1}^{T} (r_t - \bar{r})(r_{t-s} - \bar{r})}{\sum_{t=1}^{T} (r_t - \bar{r})^2}, \qquad 0 \le s < T - 1$$
- The statistics $\hat{\rho}_1, \hat{\rho}_2, \ldots$ defined above are called the ACF of $r_t$, and allow us to capture the linear dynamics of a time series.
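As a sketch, the lag-s sample autocorrelation can be computed directly from this definition; the code below is a minimal illustration (the helper name sample_acf is ours, not from the lecture).

```python
import numpy as np

def sample_acf(r, max_lag):
    """Sample autocorrelations rho_1, ..., rho_max_lag of a series r."""
    r = np.asarray(r, dtype=float)
    T = len(r)
    dev = r - r.mean()
    denom = np.sum(dev**2)  # sum_{t=1}^{T} (r_t - rbar)^2
    return np.array([np.sum(dev[s:] * dev[:T - s]) / denom
                     for s in range(1, max_lag + 1)])

# Example on simulated white noise: all autocorrelations should be near zero
rng = np.random.default_rng(0)
print(sample_acf(rng.standard_normal(500), max_lag=5))
```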

The following four figures show the ACF and PACF of the monthly US stock market return index, and those of the US dividend yield.

[Figure: ACF of the US return index (rusa), 12 lags, with Bartlett's formula MA(q) 95% confidence bands]


[Figure: PACF of the US return index (rusa), 12 lags, with 95% confidence bands, se = 1/sqrt(n)]


[Figure: ACF of the US dividend yield (pdusa), 36 lags, with Bartlett's formula MA(q) 95% confidence bands]


[Figure: PACF of the US dividend yield (pdusa), 36 lags, with 95% confidence bands, se = 1/sqrt(n)]
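Plots like these can be reproduced with statsmodels; the sketch below uses a simulated stand-in series, since the lecture's rusa data are not included here.

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Stand-in for the monthly US return series used in the figures
rng = np.random.default_rng(1)
rusa = rng.standard_normal(360)

fig, axes = plt.subplots(2, 1, figsize=(6, 6))
plot_acf(rusa, lags=12, ax=axes[0], title="ACF, 12 lags")
plot_pacf(rusa, lags=12, ax=axes[1], title="PACF, 12 lags")
plt.tight_layout()
plt.show()
```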


Weakly Dependent Series


- We now define another concept closely related to stationarity, but slightly different:
- A series is said to be weakly dependent when its values become more nearly independent the further apart in time they are.
- Assume we have $x_t$ and $x_{t+h}$; the larger $h$ is, the more independent the two become, which in turn implies that
  $$\text{Corr}[x_t, x_{t+h}] \to 0 \quad \text{as} \quad h \to \infty$$
- In this case, a covariance stationary series is said to be asymptotically uncorrelated.


AR models
Introduction
- A natural starting point for a forecasting model is to use past values of $y_t$ to forecast $y_t$.
- An AR model is a regression model in which $y_t$ is regressed on its own lagged values.
- The pth order AR model, denoted AR(p), is written
  $$y_t = \alpha + \phi_1 y_{t-1} + \ldots + \phi_p y_{t-p} + u_t$$
  where $\{u_t\}$ is white noise with variance $\sigma^2$.
- The coefficients $\phi_1, \ldots, \phi_p$ do not have causal interpretations.
- If $\phi_1 = \ldots = \phi_p = 0$, then $y_{t-1}, \ldots, y_{t-p}$ are not useful for forecasting $y_t$.
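As a minimal sketch, an AR(p) model can be estimated with statsmodels' AutoReg (the simulated series and the value $\phi_1 = 0.7$ are assumptions for illustration):

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

# Simulate an AR(1) with phi_1 = 0.7
rng = np.random.default_rng(0)
T, phi = 500, 0.7
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + rng.standard_normal()

# Fit an AR(1): y_t regressed on a constant and y_{t-1}
res = AutoReg(y, lags=1).fit()
print(res.params)  # intercept near 0, slope near 0.7
```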

AR models
Properties
- Consider the AR(1) model:
  $$y_t = \phi_1 y_{t-1} + u_t$$
- What are the mean, variance and covariances of $\{y_t\}$?
- The mean can be obtained from
  $$E(y_t) = \phi_1 E(y_{t-1}) + E(u_t) = \phi_1 E(y_{t-1})$$
  If $\{y_t\}$ is stationary, then $E(y_t) = E(y_{t-1}) = \mu$, and so we get $\mu = \phi_1 \mu$, or
  $$\mu = \frac{0}{1 - \phi_1} = 0$$
- Consider next the variance. It holds that
  $$\text{var}(y_t) = \phi_1^2 \text{var}(y_{t-1}) + \text{var}(u_t) = \phi_1^2 \text{var}(y_{t-1}) + \sigma^2$$

The variance, continued:

- If $\{y_t\}$ is stationary, then $\text{var}(y_t) = \text{var}(y_{t-1}) = \gamma_0$, giving
  $$\gamma_0 = \phi_1^2 \gamma_0 + \sigma^2$$
  or
  $$\gamma_0 = \frac{\sigma^2}{1 - \phi_1^2}$$
- The covariance at one lag is given by
  $$\gamma_1 = E(y_t y_{t-1}) = E((\phi_1 y_{t-1} + u_t) y_{t-1}) = \phi_1 E(y_{t-1}^2) + E(u_t y_{t-1}) = \phi_1 \gamma_0$$
- The general pattern is
  $$\gamma_k = \phi_1^k \gamma_0$$

- The kth order autocorrelation is given by
  $$\rho_k = \frac{\gamma_k}{\gamma_0} = \frac{\phi_1^k \gamma_0}{\gamma_0} = \phi_1^k$$
- We need $|\phi_1| \neq 1$, as otherwise
  $$\gamma_0 = \frac{\sigma^2}{1 - \phi_1^2} = \frac{\sigma^2}{0} = \infty$$
- We also cannot have $|\phi_1| > 1$, because then
  $$\gamma_0 = \frac{\sigma^2}{1 - \phi_1^2} < 0$$
- In other words, we need $|\phi_1| < 1$ in order to ensure that $\{y_t\}$ is well-behaved.
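These moment results are easy to check by simulation; a minimal sketch, with $\phi_1 = 0.6$ and $\sigma^2 = 1$ assumed:

```python
import numpy as np

# Simulate a long AR(1) with phi_1 = 0.6, sigma^2 = 1
rng = np.random.default_rng(0)
T, phi = 100_000, 0.6
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + rng.standard_normal()

print(np.var(y), 1 / (1 - phi**2))               # gamma_0 = sigma^2 / (1 - phi_1^2)
print(np.corrcoef(y[1:], y[:-1])[0, 1], phi)     # rho_1 = phi_1
print(np.corrcoef(y[2:], y[:-2])[0, 1], phi**2)  # rho_2 = phi_1^2
```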


Estimation
- Consider the AR(1) model:
  $$y_t = \phi_1 y_{t-1} + u_t$$
- As long as $y_{t-1}$ is independent of $u_t$, this model is just like any other regression model, and can be estimated using OLS.
- One problem: write
  $$\hat{\phi}_1 = \frac{\sum_{t=2}^{T} y_{t-1} y_t}{\sum_{t=2}^{T} y_{t-1}^2} = \frac{\sum_{t=2}^{T} y_{t-1} (\phi_1 y_{t-1} + u_t)}{\sum_{t=2}^{T} y_{t-1}^2} = \phi_1 + \frac{\frac{1}{T} \sum_{t=2}^{T} y_{t-1} u_t}{\frac{1}{T} \sum_{t=2}^{T} y_{t-1}^2}$$
- At this point, we want to use the LLN to show that the last term goes to zero. However, since $\{y_t\}$ is serially correlated, our usual LLN and CLT results cannot be used here.
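A minimal sketch of this OLS estimator, computed directly from the formula above on a simulated series ($\phi_1 = 0.5$ assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
T, phi = 1000, 0.5
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + rng.standard_normal()

# OLS slope: sum of y_{t-1} y_t over sum of y_{t-1}^2
phi_hat = np.sum(y[:-1] * y[1:]) / np.sum(y[:-1] ** 2)
print(phi_hat)  # consistent for phi_1 = 0.5 in large samples
```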

- Fortunately, there are extensions of the LLN and CLT that apply when the observations are serially dependent.
- Assumptions:
  1. $E(u_t | y_{t-1}) = 0$.
  2. $\{y_t\}$ is stationary.
  3. $y_t$ and $y_{t-k}$ become independent as $k$ increases.
  4. $E(y_t^4)$ is nonzero and finite.
- Let $I_{t-1} = \{y_{t-1}, y_{t-2}, \ldots, y_1\}$ be the past history of $y_t$. In time series, if
  $$E(u_t | I_{t-1}) = 0$$
  then $u_t$ is called a martingale difference sequence (MDS). Assumption 1 implies this.

- Assumption 2 is the time series counterpart of the "identically distributed" part of the iid assumption when applied to $(y_1, \ldots, y_T)$.
- Assumption 3 is the time series counterpart of the "independently distributed" part of iid. It replaces the usual assumption that $y_t$ and $y_{t-k}$ are independent with the time series requirement that they become independent as $k$ increases. This assumption is sometimes referred to as weak dependence or ergodicity.
- Under assumptions 1-4, the OLS estimators are asymptotically normally distributed, which also implies that the usual OLS standard errors, t and F statistics, and LM statistics are asymptotically valid.


Non-Stationarity
- A non-stationary process is a process that violates one of the stationarity assumptions mentioned above; it is also called a unit root process.
- A stationary process is an I(0) process, while a non-stationary (unit root) process is an I(1) process.
- A non-stationary series that is I(1) can be rendered stationary by first differencing:
- If we have a random walk as follows:
  $$y_t = \beta_0 + \beta_1 y_{t-1} + e_t$$
  with $\beta_1 = 1$ and $e_t$ weakly stationary, then by first differencing the series we get
  $$\Delta y_t = y_t - y_{t-1} = \beta_0 + e_t$$
  which is stationary.
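A minimal simulation sketch (with $\beta_0 = 0$ assumed): the level of a random walk wanders without settling, while its first difference is stationary.

```python
import numpy as np

rng = np.random.default_rng(0)
e = rng.standard_normal(1000)
y = np.cumsum(e)   # random walk: y_t = y_{t-1} + e_t
dy = np.diff(y)    # first difference recovers e_t

# The level's sample variance depends heavily on the window; the difference's does not
print(np.var(y[:500]), np.var(y[500:]))    # typically very different
print(np.var(dy[:500]), np.var(dy[500:]))  # both close to 1
```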

Identifying whether a series has a unit root


- A unit root can sometimes be identified by graphical inspection, when there is a clear trend in the data.
- Otherwise there are several ways to decide on the non-stationarity of a series (we will not cover a formal procedure in this chapter).
- The first order autocorrelation offers an informal way of doing so: from the AR(1) model we saw that stationarity requires $|\phi_1| < 1$.
- So if we estimate the first order autocorrelation $\hat{\rho}_1$ and it is close to 1, then we can suspect non-stationarity.
- The reason for this is that $\hat{\rho}_1$ is a consistent estimator of $\phi_1$.
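A sketch of this informal check on simulated data (the series names and the value $\phi_1 = 0.5$ are ours):

```python
import numpy as np

def rho1(y):
    """Lag-1 sample autocorrelation."""
    dev = y - y.mean()
    return np.sum(dev[1:] * dev[:-1]) / np.sum(dev**2)

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(500))  # unit root series
ar = np.zeros(500)
for t in range(1, 500):
    ar[t] = 0.5 * ar[t - 1] + rng.standard_normal()

print(rho1(walk))  # close to 1: suspect non-stationarity
print(rho1(ar))    # close to 0.5: consistent with a stationary AR(1)
```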


The deterministic trend model


- A deterministic trend is a nonrandom function of time, for example, $t$ or $t^2$.
- Consider the following deterministic trend model:
  $$y_t = \alpha + \delta t + u_t$$
  where $u_t$ is iid.
- In the deterministic trend model, $\{y_t\}$ is non-stationary.
- Proof: Note that
  $$E(y_t) = \alpha + \delta t + E(u_t) = \alpha + \delta t$$
  which depends on $t$, thus violating stationarity condition 1 (a constant mean). $\square$

The Effect of the Presence of a Trend

- Consider the R-squared from a regression of $y_t$ on a constant and trend:
  $$R^2 = 1 - \frac{SSR}{SST} \to_p 1$$
  where
  $$SSR = \sum_{t=1}^{T} \hat{u}_t^2, \qquad \hat{u}_t = y_t - \hat{\alpha} - \hat{\delta} t$$
- Thus, $R^2 \to_p 1$ even though the observations do not get closer to the regression line as $T \to \infty$. This is therefore a spurious result!
- The reason why $SSR/T$ is not exploding in the same way as $SST/T$ is that $u_t = y_t - \alpha - \delta t$ is stationary.

- The spuriously high $R^2$ is due to the fact that while SSR accounts for the trend, SST does not.
- This result does not only hold in a regression of $y_t$ on a constant and trend, but extends to all regressions where the dependent variable is trending and the trend is included on the right-hand side.
- Solution: replace $y_t$ with the de-trended series $\hat{u}_t$ when running regressions.
- The point here is that the mean and variance have no meaning if $\{y_t\}$ is non-stationary, as the mean is time-varying and the variance is exploding.
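A sketch of the spurious $R^2$ and the de-trending solution on simulated data (the trend slope 0.05 is assumed for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 500
t = np.arange(T)
y = 1.0 + 0.05 * t + rng.standard_normal(T)  # deterministic trend + iid noise

# Regressing y on a constant and trend gives an R^2 near 1 ...
trend_fit = sm.OLS(y, sm.add_constant(t)).fit()
print(trend_fit.rsquared)  # close to 1, driven by the trend alone

# ... so work with the de-trended residuals instead
u_hat = trend_fit.resid
print(u_hat.var())  # stationary, variance near 1
```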


The stochastic trend model


- Unlike a deterministic trend, a stochastic trend is random and varies over time.
- An important example of a stochastic trend is a random walk:
  $$y_t = y_{t-1} + u_t$$
  where $u_t \sim \text{iid}(0, \sigma^2)$.
- Because this is an AR(1) with unit AR slope coefficient, the characteristic equation is given by
  $$1 - \phi_1 z = 1 - z = 0$$
  which has a root at unity, $z = 1$.
- We therefore say that $y_t$ has a unit root.



The DF test
- How do you detect trends? Plot the data.
- There is also a regression-based test for a random walk: the Dickey-Fuller (DF) test for a unit root.
- The AR(1) model can be written as
  $$\Delta y_t = (\phi_1 - 1) y_{t-1} + u_t = \theta y_{t-1} + u_t$$
- The unit root hypothesis is given by
  $$H_0: \theta = 0 \ (\text{or } \phi_1 = 1) \qquad \text{vs.} \qquad H_1: \theta < 0 \ (\text{or } |\phi_1| < 1)$$
- This is a one-sided test; if there is no unit root, $y_t$ is stationary.

- The DF statistic is the usual t-statistic for testing $\theta = 0$:
  $$t_{DF} = \frac{\hat{\theta}}{SE(\hat{\theta})}$$
- The difference is that the distribution is no longer standard normal. New critical values are therefore needed.
- How do we treat the presence of nonzero intercept and trend terms?
- The decision to use the intercept-only DF test or the intercept-and-trend DF test depends on what the alternative is and what the data look like.
  - In the intercept-only specification, the alternative is that $y_t$ is stationary around a constant.
  - In the intercept-and-trend specification, the alternative is that $y_t$ is stationary around a linear time trend.

- When we have serial correlation in the error term, we would rather use an augmented version of the DF test.
- The ADF test adds lags to the standard DF regression, based on the persistence of the serial correlation in the error term.
- The general form of the ADF regression looks like:
  $$\Delta y_t = (\phi_1 - 1) y_{t-1} + \sum_{i=1}^{l} \gamma_i \Delta y_{t-i} + u_t$$
- A common problem in testing for the presence of a unit root is the low power of the tests used.
- Another issue is that if the data has pronounced shifts, then tests such as the DF or the ADF are not really suitable.
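As a sketch, the ADF test is available in statsmodels; here it is applied to a simulated random walk, with the lag length chosen by AIC:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(500))  # series with a unit root

# Intercept-only specification; lags selected by AIC
stat, pvalue, usedlag, nobs, crit, icbest = adfuller(walk, regression="c", autolag="AIC")
print(stat, pvalue)  # large p-value: cannot reject the unit root null
print(crit)          # DF critical values, not standard normal ones
```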

Serial Correlation
- We have seen that for the Gauss-Markov theorem to hold, we should have no serial correlation among the error terms of our model, and the errors should be homoskedastic.
- If we have serial correlation in the errors, then:
  - The OLS estimator is no longer BLUE.
  - The standard errors and test statistics are not valid (not even asymptotically).
- However, the R-squared is still a good measure of goodness of fit, as long as we assume stationarity and weak dependence (the variances are constant).


Testing for Serial Correlation


- There are several ways to test for the presence of serial correlation.
- We look at a t-test for AR(1) serial correlation with strictly exogenous regressors:
  $$y_t = \beta_0 + \beta_1 x_{t1} + \ldots + \beta_k x_{tk} + u_t$$
  $$u_t = \rho u_{t-1} + e_t$$
- We assume the following:
  $$E[e_t | u_{t-1}, \ldots] = 0, \qquad \text{Var}(e_t | u_{t-1}) = \sigma_e^2$$
- The null hypothesis is $H_0: \rho = 0$; however, the issue is that $u_t$ is unobserved.
- The solution is to use the residuals $\hat{u}_t$ from the OLS estimation to approximate $u_t$:
  - Regress $\hat{u}_t$ on $\hat{u}_{t-1}$.
  - Finally, use the t-statistic associated with $\hat{\rho}$ to test the null hypothesis. A sketch of the procedure follows below.
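A minimal sketch of this two-step test on simulated data (the regressor x, the coefficients, and $\rho = 0.5$ are assumptions for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 300
x = rng.standard_normal(T)

# Simulate AR(1) errors with rho = 0.5
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x + u

# Step 1: OLS of y on x, keep the residuals u_hat
u_hat = sm.OLS(y, sm.add_constant(x)).fit().resid

# Step 2: regress u_hat on its own lag; the t-statistic tests H0: rho = 0
ar_fit = sm.OLS(u_hat[1:], u_hat[:-1]).fit()
print(ar_fit.params[0], ar_fit.tvalues[0])  # rho_hat near 0.5, large t-statistic
```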

Correcting for Serial Correlation


- We can correct for serial correlation by using the FGLS procedure. Start from
  $$y_t = \beta_0 + \beta_1 x_t + u_t$$
  $$\rho y_{t-1} = \rho \beta_0 + \rho \beta_1 x_{t-1} + \rho u_{t-1}$$
- If we compute $y_t - \rho y_{t-1}$, we get new quasi-differenced series for $y_t$ and $x_t$, and thus we can rewrite our regression as follows:
  $$\tilde{y}_t = y_t - \rho y_{t-1}, \qquad \tilde{x}_t = x_t - \rho x_{t-1},$$
  $$\tilde{y}_t = \beta_0 (1 - \rho) + \beta_1 \tilde{x}_t + e_t$$

- The steps are as follows (see the sketch after this list):
  1. First we run OLS on our regression and extract the residuals.
  2. Then we regress the residuals on their own lag and extract $\hat{\rho}$.
  3. Apply OLS to the quasi-differenced equation to estimate the coefficients.
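A minimal sketch of these steps (a single-pass Cochrane-Orcutt-style correction; the simulated data and $\rho = 0.5$ are assumptions):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 300
x = rng.standard_normal(T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x + u

# Step 1: OLS, extract the residuals
u_hat = sm.OLS(y, sm.add_constant(x)).fit().resid

# Step 2: estimate rho from a regression of u_hat on its own lag
rho_hat = sm.OLS(u_hat[1:], u_hat[:-1]).fit().params[0]

# Step 3: OLS on the quasi-differenced data
y_tilde = y[1:] - rho_hat * y[:-1]
x_tilde = x[1:] - rho_hat * x[:-1]
fgls = sm.OLS(y_tilde, sm.add_constant(x_tilde)).fit()
print(fgls.params)  # slope near 2; intercept estimates beta_0 * (1 - rho)
```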

- In order to get serial correlation-robust standard errors instead, we follow the Newey-West method.
- The robust standard errors applied in the case of serial correlation are similar to the ones used for heteroskedasticity. Consider:
  $$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + u_t$$


- The interest is to get a serial correlation-robust standard error for $\hat{\beta}_1$. The way to go is:
  1. Estimate the main model using OLS and extract $se(\hat{\beta}_1)$ and the OLS residuals $\hat{u}_t$.
  2. Run a regression of $x_{t1}$ on the other explanatory variables:
     $$x_{t1} = \delta_0 + \delta_2 x_{t2} + r_t$$
  3. Compute the residuals $\hat{r}_t$ and then form $\hat{a}_t = \hat{r}_t \hat{u}_t$ for each $t$.
  4. Define $\hat{v}$ as in equation 12.42 in the book, by choosing the truncation lag $g$.
  5. Finally, compute the robust $se(\hat{\beta}_1)$.
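In practice, the same Newey-West correction is available directly; a sketch using statsmodels' HAC covariance, where maxlags plays the role of $g$ (set to 4 here purely for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 300
x1, x2 = rng.standard_normal(T), rng.standard_normal(T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u

X = sm.add_constant(np.column_stack([x1, x2]))
ols = sm.OLS(y, X).fit()
nw = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})

# Newey-West standard errors are typically larger under positive serial correlation
print(ols.bse)
print(nw.bse)
```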

