
Lecture 11: An introduction to time series

Moursli Mohamed Reda


University of Gothenburg

February 5, 2013


What we previously covered


- The definition of a random process $y_t$
- The white noise process, usually a property we seek in the error term of a regression
- Weak stationarity: an essential set of assumptions about the moments of our process, which guarantees a well-behaved series
- The assumptions required for the Gauss-Markov theorem to apply
- The importance of properly handling trends in the data
- Serial correlation

Recap
Autocorrelation Function
- We generalize the concept of correlation to that of autocorrelation when looking at the linear dependence within a (weakly) stationary series $z_t$.
- The autocorrelation between $z_t$ and $z_{t-s}$ is given by
  $$\rho_s = \frac{\text{Cov}(z_t, z_{t-s})}{\sqrt{\text{Var}(z_t)\,\text{Var}(z_{t-s})}} = \frac{\text{Cov}(z_t, z_{t-s})}{\text{Var}(z_t)}$$
- In the second equality, the constant variance assumption implied by weak stationarity is used.
- We can also estimate the lag-1 sample autocorrelation analogously, as shown next.


- Assume we have a sample of returns $r_t$, $t = 1, \ldots, T$, with sample mean $\bar{r}$. The lag-1 sample autocorrelation of $r_t$ is then given by
  $$\hat{\rho}_1 = \frac{\sum_{t=2}^{T} (r_t - \bar{r})(r_{t-1} - \bar{r})}{\sum_{t=1}^{T} (r_t - \bar{r})^2}$$
- In general, the lag-s sample autocorrelation of $r_t$ is defined as
  $$\hat{\rho}_s = \frac{\sum_{t=s+1}^{T} (r_t - \bar{r})(r_{t-s} - \bar{r})}{\sum_{t=1}^{T} (r_t - \bar{r})^2}, \qquad 0 \le s < T - 1$$
- The statistics $\hat{\rho}_1, \hat{\rho}_2, \ldots$ defined above are called the ACF of $r_t$, and allow us to capture the linear dynamics of a time series.
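As a sketch, the lag-s sample autocorrelation can be computed directly from this definition; the code below is a minimal illustration (the helper name sample_acf is ours, not from the lecture).

```python
import numpy as np

def sample_acf(r, max_lag):
    """Sample autocorrelations rho_1, ..., rho_max_lag of a series r."""
    r = np.asarray(r, dtype=float)
    T = len(r)
    dev = r - r.mean()
    denom = np.sum(dev**2)  # sum_{t=1}^{T} (r_t - rbar)^2
    return np.array([np.sum(dev[s:] * dev[:T - s]) / denom
                     for s in range(1, max_lag + 1)])

# Example on simulated white noise: all autocorrelations should be near zero
rng = np.random.default_rng(0)
print(sample_acf(rng.standard_normal(500), max_lag=5))
```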

The following four figures show the ACF and PACF of the monthly US stock market return index, and those of the US dividend yield.

[Figure: ACF of the US return index (rusa), 12 lags, with Bartlett's formula MA(q) 95% confidence bands]


[Figure: PACF of the US return index (rusa), 12 lags, with 95% confidence bands, se = 1/sqrt(n)]


[Figure: ACF of the US dividend yield (pdusa), 36 lags, with Bartlett's formula MA(q) 95% confidence bands]


[Figure: PACF of the US dividend yield (pdusa), 36 lags, with 95% confidence bands, se = 1/sqrt(n)]
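Plots like these can be reproduced with statsmodels; the sketch below uses a simulated stand-in series, since the lecture's rusa data are not included here.

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Stand-in for the monthly US return series used in the figures
rng = np.random.default_rng(1)
rusa = rng.standard_normal(360)

fig, axes = plt.subplots(2, 1, figsize=(6, 6))
plot_acf(rusa, lags=12, ax=axes[0], title="ACF, 12 lags")
plot_pacf(rusa, lags=12, ax=axes[1], title="PACF, 12 lags")
plt.tight_layout()
plt.show()
```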


Weakly Dependent Series


- We now define another concept closely related to stationarity, but slightly different:
- A series is said to be weakly dependent when its values become more nearly independent the further apart in time they are.
- Assume we have $x_t$ and $x_{t+h}$; the larger $h$ is, the more independent the two become, which in turn implies that
  $$\text{Corr}[x_t, x_{t+h}] \to 0 \quad \text{as} \quad h \to \infty$$
- In this case, a covariance stationary series is said to be asymptotically uncorrelated.


AR models
Introduction
- A natural starting point for a forecasting model is to use past values of $y_t$ to forecast $y_t$.
- An AR model is a regression model in which $y_t$ is regressed on its own lagged values.
- The pth order AR model, denoted AR(p), is written
  $$y_t = \alpha + \phi_1 y_{t-1} + \ldots + \phi_p y_{t-p} + u_t$$
  where $\{u_t\}$ is white noise with variance $\sigma^2$.
- The coefficients $\phi_1, \ldots, \phi_p$ do not have causal interpretations.
- If $\phi_1 = \ldots = \phi_p = 0$, then $y_{t-1}, \ldots, y_{t-p}$ are not useful for forecasting $y_t$.
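As a minimal sketch, an AR(p) model can be estimated with statsmodels' AutoReg (the simulated series and the value $\phi_1 = 0.7$ are assumptions for illustration):

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

# Simulate an AR(1) with phi_1 = 0.7
rng = np.random.default_rng(0)
T, phi = 500, 0.7
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + rng.standard_normal()

# Fit an AR(1): y_t regressed on a constant and y_{t-1}
res = AutoReg(y, lags=1).fit()
print(res.params)  # intercept near 0, slope near 0.7
```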

AR models
Properties
- Consider the AR(1) model:
  $$y_t = \phi_1 y_{t-1} + u_t$$
- What are the mean, variance and covariances of $\{y_t\}$?
- The mean can be obtained from
  $$E(y_t) = \phi_1 E(y_{t-1}) + E(u_t) = \phi_1 E(y_{t-1})$$
  If $\{y_t\}$ is stationary, then $E(y_t) = E(y_{t-1}) = \mu$, and so we get $\mu = \phi_1 \mu$, or
  $$\mu = \frac{0}{1 - \phi_1} = 0$$
- Consider next the variance. It holds that
  $$\text{var}(y_t) = \phi_1^2 \text{var}(y_{t-1}) + \text{var}(u_t) = \phi_1^2 \text{var}(y_{t-1}) + \sigma^2$$

The variance, continued:

- If $\{y_t\}$ is stationary, then $\text{var}(y_t) = \text{var}(y_{t-1}) = \gamma_0$, giving
  $$\gamma_0 = \phi_1^2 \gamma_0 + \sigma^2$$
  or
  $$\gamma_0 = \frac{\sigma^2}{1 - \phi_1^2}$$
- The covariance at one lag is given by
  $$\gamma_1 = E(y_t y_{t-1}) = E((\phi_1 y_{t-1} + u_t) y_{t-1}) = \phi_1 E(y_{t-1}^2) + E(u_t y_{t-1}) = \phi_1 \gamma_0$$
- The general pattern is
  $$\gamma_k = \phi_1^k \gamma_0$$

- The kth order autocorrelation is given by
  $$\rho_k = \frac{\gamma_k}{\gamma_0} = \frac{\phi_1^k \gamma_0}{\gamma_0} = \phi_1^k$$
- We need $|\phi_1| \neq 1$, as otherwise
  $$\gamma_0 = \frac{\sigma^2}{1 - \phi_1^2} = \frac{\sigma^2}{0} = \infty$$
- We also cannot have $|\phi_1| > 1$, because then
  $$\gamma_0 = \frac{\sigma^2}{1 - \phi_1^2} < 0$$
- In other words, we need $|\phi_1| < 1$ in order to ensure that $\{y_t\}$ is well-behaved.
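These moment results are easy to check by simulation; a minimal sketch, with $\phi_1 = 0.6$ and $\sigma^2 = 1$ assumed:

```python
import numpy as np

# Simulate a long AR(1) with phi_1 = 0.6, sigma^2 = 1
rng = np.random.default_rng(0)
T, phi = 100_000, 0.6
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + rng.standard_normal()

print(np.var(y), 1 / (1 - phi**2))               # gamma_0 = sigma^2 / (1 - phi_1^2)
print(np.corrcoef(y[1:], y[:-1])[0, 1], phi)     # rho_1 = phi_1
print(np.corrcoef(y[2:], y[:-2])[0, 1], phi**2)  # rho_2 = phi_1^2
```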


Estimation
- Consider the AR(1) model:
  $$y_t = \phi_1 y_{t-1} + u_t$$
- As long as $y_{t-1}$ is independent of $u_t$, this model is just like any other regression model, and can be estimated using OLS.
- One problem: write
  $$\hat{\phi}_1 = \frac{\sum_{t=2}^{T} y_{t-1} y_t}{\sum_{t=2}^{T} y_{t-1}^2} = \frac{\sum_{t=2}^{T} y_{t-1} (\phi_1 y_{t-1} + u_t)}{\sum_{t=2}^{T} y_{t-1}^2} = \phi_1 + \frac{\frac{1}{T} \sum_{t=2}^{T} y_{t-1} u_t}{\frac{1}{T} \sum_{t=2}^{T} y_{t-1}^2}$$
- At this point, we want to use the LLN to show that the last term goes to zero. However, since $\{y_t\}$ is serially correlated, our usual LLN and CLT results cannot be used here.
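A minimal sketch of this OLS estimator, computed directly from the formula above on a simulated series ($\phi_1 = 0.5$ assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
T, phi = 1000, 0.5
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + rng.standard_normal()

# OLS slope: sum of y_{t-1} y_t over sum of y_{t-1}^2
phi_hat = np.sum(y[:-1] * y[1:]) / np.sum(y[:-1] ** 2)
print(phi_hat)  # consistent for phi_1 = 0.5 in large samples
```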

- Fortunately, there are extensions of the LLN and CLT that apply when the observations are serially dependent.
- Assumptions:
  1. $E(u_t | y_{t-1}) = 0$.
  2. $\{y_t\}$ is stationary.
  3. $y_t$ and $y_{t-k}$ become independent as $k$ increases.
  4. $E(y_t^4)$ is nonzero and finite.
- Let $I_{t-1} = \{y_{t-1}, y_{t-2}, \ldots, y_1\}$ be the past history of $y_t$. In time series, if
  $$E(u_t | I_{t-1}) = 0$$
  then $u_t$ is called a martingale difference sequence (MDS). Assumption 1 implies this.

- Assumption 2 is the time series counterpart of the "identically distributed" part of the iid assumption when applied to $(y_1, \ldots, y_T)$.
- Assumption 3 is the time series counterpart of the "independently distributed" part of iid. It replaces the usual assumption that $y_t$ and $y_{t-k}$ are independent with the time series requirement that they become independent as $k$ increases. This assumption is sometimes referred to as weak dependence or ergodicity.
- Under assumptions 1-4, the OLS estimators are asymptotically normally distributed, which also implies that the usual OLS standard errors, t and F statistics, and LM statistics are asymptotically valid.


Non-Stationarity
- A non-stationary process is a process that violates one of the stationarity assumptions mentioned above; it is also called a unit root process.
- A stationary process is an I(0) process, while a non-stationary (unit root) process is an I(1) process.
- A non-stationary series that is I(1) can be rendered stationary by first differencing:
- If we have a random walk as follows:
  $$y_t = \beta_0 + \beta_1 y_{t-1} + e_t$$
  with $\beta_1 = 1$ and $e_t$ weakly stationary, then by first differencing the series we get
  $$\Delta y_t = y_t - y_{t-1} = \beta_0 + e_t$$
  which is stationary.
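A minimal simulation sketch (with $\beta_0 = 0$ assumed): the level of a random walk wanders without settling, while its first difference is stationary.

```python
import numpy as np

rng = np.random.default_rng(0)
e = rng.standard_normal(1000)
y = np.cumsum(e)   # random walk: y_t = y_{t-1} + e_t
dy = np.diff(y)    # first difference recovers e_t

# The level's sample variance depends heavily on the window; the difference's does not
print(np.var(y[:500]), np.var(y[500:]))    # typically very different
print(np.var(dy[:500]), np.var(dy[500:]))  # both close to 1
```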

Identifying whether a series has a unit root


- A unit root can sometimes be identified by graphical inspection, when there is a clear trend in the data.
- Otherwise there are several ways to decide on the non-stationarity of a series (we will not cover a formal procedure in this chapter).
- The first order autocorrelation offers an informal way of doing so: from the AR(1) model we saw that stationarity requires $|\phi_1| < 1$.
- So if we estimate the first order autocorrelation $\hat{\rho}_1$ and it is close to 1, then we can suspect non-stationarity.
- The reason for this is that $\hat{\rho}_1$ is a consistent estimator of $\phi_1$.
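A sketch of this informal check on simulated data (the series names and the value $\phi_1 = 0.5$ are ours):

```python
import numpy as np

def rho1(y):
    """Lag-1 sample autocorrelation."""
    dev = y - y.mean()
    return np.sum(dev[1:] * dev[:-1]) / np.sum(dev**2)

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(500))  # unit root series
ar = np.zeros(500)
for t in range(1, 500):
    ar[t] = 0.5 * ar[t - 1] + rng.standard_normal()

print(rho1(walk))  # close to 1: suspect non-stationarity
print(rho1(ar))    # close to 0.5: consistent with a stationary AR(1)
```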


The deterministic trend model


- A deterministic trend is a nonrandom function of time, for example, $t$ or $t^2$.
- Consider the following deterministic trend model:
  $$y_t = \alpha + \delta t + u_t$$
  where $u_t$ is iid.
- In the deterministic trend model, $\{y_t\}$ is non-stationary.
- Proof: Note that
  $$E(y_t) = \alpha + \delta t + E(u_t) = \alpha + \delta t$$
  which depends on $t$, thus violating stationarity condition 1 (a constant mean). $\square$

The Effect of the Presence of a Trend

- Consider the R-squared from a regression of $y_t$ on a constant and trend:
  $$R^2 = 1 - \frac{SSR}{SST} \to_p 1$$
  where
  $$SSR = \sum_{t=1}^{T} \hat{u}_t^2, \qquad \hat{u}_t = y_t - \hat{\alpha} - \hat{\delta} t$$
- Thus, $R^2 \to_p 1$ even though the observations do not get closer to the regression line as $T \to \infty$. This is therefore a spurious result!
- The reason why $SSR/T$ is not exploding in the same way as $SST/T$ is that $u_t = y_t - \alpha - \delta t$ is stationary.

- The spuriously high $R^2$ is due to the fact that while SSR accounts for the trend, SST does not.
- This result does not only hold in a regression of $y_t$ on a constant and trend, but extends to all regressions where the dependent variable is trending and the trend is included on the right-hand side.
- Solution: replace $y_t$ with the de-trended series $\hat{u}_t$ when running regressions.
- The point here is that the mean and variance have no meaning if $\{y_t\}$ is non-stationary, as the mean is time-varying and the variance is exploding.
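A sketch of the spurious $R^2$ and the de-trending solution on simulated data (the trend slope 0.05 is assumed for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 500
t = np.arange(T)
y = 1.0 + 0.05 * t + rng.standard_normal(T)  # deterministic trend + iid noise

# Regressing y on a constant and trend gives an R^2 near 1 ...
trend_fit = sm.OLS(y, sm.add_constant(t)).fit()
print(trend_fit.rsquared)  # close to 1, driven by the trend alone

# ... so work with the de-trended residuals instead
u_hat = trend_fit.resid
print(u_hat.var())  # stationary, variance near 1
```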


The stochastic trend model


- Unlike a deterministic trend, a stochastic trend is random and varies over time.
- An important example of a stochastic trend is a random walk:
  $$y_t = y_{t-1} + u_t$$
  where $u_t \sim \text{iid}(0, \sigma^2)$.
- Because this is an AR(1) with unit AR slope coefficient, the characteristic equation is given by
  $$1 - \phi_1 z = 1 - z = 0$$
  which has a root at unity, $z = 1$.
- We therefore say that $y_t$ has a unit root.



The DF test
- How do you detect trends? Plot the data.
- There is also a regression-based test for a random walk: the Dickey-Fuller (DF) test for a unit root.
- The AR(1) model can be written as
  $$\Delta y_t = (\phi_1 - 1) y_{t-1} + u_t = \theta y_{t-1} + u_t$$
- The unit root hypothesis is given by
  $$H_0: \theta = 0 \ (\text{or } \phi_1 = 1) \qquad \text{vs.} \qquad H_1: \theta < 0 \ (\text{or } |\phi_1| < 1)$$
- This is a one-sided test; if there is no unit root, $y_t$ is stationary.

- The DF statistic is the usual t-statistic for testing $\theta = 0$:
  $$t_{DF} = \frac{\hat{\theta}}{SE(\hat{\theta})}$$
- The difference is that the distribution is no longer standard normal. New critical values are therefore needed.
- How do we treat the presence of nonzero intercept and trend terms?
- The decision to use the intercept-only DF test or the intercept-and-trend DF test depends on what the alternative is and what the data look like.
  - In the intercept-only specification, the alternative is that $y_t$ is stationary around a constant.
  - In the intercept-and-trend specification, the alternative is that $y_t$ is stationary around a linear time trend.

- When we have serial correlation in the error term, we would rather use an augmented version of the DF test.
- The ADF test adds lags to the standard DF regression, based on the persistence of the serial correlation in the error term.
- The general form of the ADF regression looks like:
  $$\Delta y_t = (\phi_1 - 1) y_{t-1} + \sum_{i=1}^{l} \gamma_i \Delta y_{t-i} + u_t$$
- A common problem in testing for the presence of a unit root is the low power of the tests used.
- Another issue is that if the data has pronounced shifts, then tests such as the DF or the ADF are not really suitable.
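As a sketch, the ADF test is available in statsmodels; here it is applied to a simulated random walk, with the lag length chosen by AIC:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(500))  # series with a unit root

# Intercept-only specification; lags selected by AIC
stat, pvalue, usedlag, nobs, crit, icbest = adfuller(walk, regression="c", autolag="AIC")
print(stat, pvalue)  # large p-value: cannot reject the unit root null
print(crit)          # DF critical values, not standard normal ones
```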

Serial Correlation
- We have seen that for the Gauss-Markov theorem to hold, we should have no serial correlation among the error terms of our model, and the errors should be homoskedastic.
- If we have serial correlation in the errors, then:
  - The OLS estimator is no longer BLUE.
  - The standard errors and test statistics are not valid (not even asymptotically).
- However, the R-squared is still a good measure of goodness of fit, as long as we assume stationarity and weak dependence (the variances are constant).


Testing for Serial Correlation


- There are several ways to test for the presence of serial correlation.
- We look at a t-test for AR(1) serial correlation with strictly exogenous regressors:
  $$y_t = \beta_0 + \beta_1 x_{t1} + \ldots + \beta_k x_{tk} + u_t$$
  $$u_t = \rho u_{t-1} + e_t$$
- We assume the following:
  $$E[e_t | u_{t-1}, \ldots] = 0, \qquad \text{Var}(e_t | u_{t-1}) = \sigma_e^2$$
- The null hypothesis is $H_0: \rho = 0$; however, the issue is that $u_t$ is unobserved.
- The solution is to use the residuals $\hat{u}_t$ from the OLS estimation to approximate $u_t$:
  - Regress $\hat{u}_t$ on $\hat{u}_{t-1}$.
  - Finally, use the t-statistic associated with $\hat{\rho}$ to test the null hypothesis. A sketch of the procedure follows below.
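A minimal sketch of this two-step test on simulated data (the regressor x, the coefficients, and $\rho = 0.5$ are assumptions for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 300
x = rng.standard_normal(T)

# Simulate AR(1) errors with rho = 0.5
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x + u

# Step 1: OLS of y on x, keep the residuals u_hat
u_hat = sm.OLS(y, sm.add_constant(x)).fit().resid

# Step 2: regress u_hat on its own lag; the t-statistic tests H0: rho = 0
ar_fit = sm.OLS(u_hat[1:], u_hat[:-1]).fit()
print(ar_fit.params[0], ar_fit.tvalues[0])  # rho_hat near 0.5, large t-statistic
```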

Correcting for Serial Correlation


- We can correct for serial correlation by using the FGLS procedure. Start from
  $$y_t = \beta_0 + \beta_1 x_t + u_t$$
  $$\rho y_{t-1} = \rho \beta_0 + \rho \beta_1 x_{t-1} + \rho u_{t-1}$$
- If we compute $y_t - \rho y_{t-1}$, we get new quasi-differenced series for $y_t$ and $x_t$, and thus we can rewrite our regression as follows:
  $$\tilde{y}_t = y_t - \rho y_{t-1}, \qquad \tilde{x}_t = x_t - \rho x_{t-1},$$
  $$\tilde{y}_t = \beta_0 (1 - \rho) + \beta_1 \tilde{x}_t + e_t$$

- The steps are as follows (see the sketch after this list):
  1. First we run OLS on our regression and extract the residuals.
  2. Then we regress the residuals on their own lag and extract $\hat{\rho}$.
  3. Apply OLS to the quasi-differenced equation to estimate the coefficients.
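A minimal sketch of these steps (a single-pass Cochrane-Orcutt-style correction; the simulated data and $\rho = 0.5$ are assumptions):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 300
x = rng.standard_normal(T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x + u

# Step 1: OLS, extract the residuals
u_hat = sm.OLS(y, sm.add_constant(x)).fit().resid

# Step 2: estimate rho from a regression of u_hat on its own lag
rho_hat = sm.OLS(u_hat[1:], u_hat[:-1]).fit().params[0]

# Step 3: OLS on the quasi-differenced data
y_tilde = y[1:] - rho_hat * y[:-1]
x_tilde = x[1:] - rho_hat * x[:-1]
fgls = sm.OLS(y_tilde, sm.add_constant(x_tilde)).fit()
print(fgls.params)  # slope near 2; intercept estimates beta_0 * (1 - rho)
```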

- In order to get serial correlation-robust standard errors instead, we follow the Newey-West method.
- The robust standard errors applied in the case of serial correlation are similar to the ones used for heteroskedasticity. Consider:
  $$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + u_t$$


- The interest is to get a serial correlation-robust standard error for $\hat{\beta}_1$. The way to go is:
  1. Estimate the main model using OLS and extract $se(\hat{\beta}_1)$ and the OLS residuals $\hat{u}_t$.
  2. Run a regression of $x_{t1}$ on the other explanatory variables:
     $$x_{t1} = \delta_0 + \delta_2 x_{t2} + r_t$$
  3. Compute the residuals $\hat{r}_t$ and then form $\hat{a}_t = \hat{r}_t \hat{u}_t$ for each $t$.
  4. Define $\hat{v}$ as in equation 12.42 in the book, by choosing the truncation lag $g$.
  5. Finally, compute the robust $se(\hat{\beta}_1)$.
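In practice, the same Newey-West correction is available directly; a sketch using statsmodels' HAC covariance, where maxlags plays the role of $g$ (set to 4 here purely for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 300
x1, x2 = rng.standard_normal(T), rng.standard_normal(T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u

X = sm.add_constant(np.column_stack([x1, x2]))
ols = sm.OLS(y, X).fit()
nw = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})

# Newey-West standard errors are typically larger under positive serial correlation
print(ols.bse)
print(nw.bse)
```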

