Lectures on modelling non-stationary time series

by

Roberto Golinelli

Department of Economics, Strada Maggiore 45, 40125 Bologna (Italy)
golinell@spbo.unibo.it - www.dse.unibo.it/golinelli

CIDE's PhD Lectures, Bertinoro (FO), June 2005

Outline
1. Univariate preliminary analysis
2. The stationarity issue in AR models: the unit root tests
3. Unit roots and spurious regressions
4. The dynamic specification (ARDL)
5. Long run relationships and cointegrated variables
6. Modelling systems
7. Guidelines for the preparation of applied econometrics projects
8. Reading list and acknowledgements

1. UNIVARIATE PRELIMINARY ANALYSIS

From a statistical p. o. v., a time series is a sequence of random variables ordered in time; we introduce the concept of STOCHASTIC PROCESS (SP): {xt}, t = 1, 2, ..., T. The probability structure of a stochastic process is determined by its joint distribution.

Example of an SP: the white noise model

[1]  xt = c + εt,   εt ~ n.i.d.(0, σ²)

xt is normally and independently distributed over time, with constant variance σ² and constant mean c.

Q: IS IT AN APPROPRIATE MODEL FOR MACROECONOMIC TIME SERIES?
Eviews/phil/series u (unemployment rate)/descript. stats

[Histogram of U. Series: U; Sample 1960 1999; Observations 40; Mean 6.945875; Median 5.616000; Maximum 12.25100; Minimum 2.835000; Std. Dev. 3.130564; Skewness 0.428753; Kurtosis 1.673676; Jarque-Bera 4.157416 (Probability 0.125092)]

A corresponding artificial series can be generated with the same sample mean and standard deviation as the historical u:
genr uaswn = 6.94 + 3.13*nrnd
genr meanline = 6.94
plot u uaswn meanline
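Outside EViews the same artificial series can be sketched in a few lines; a minimal Python equivalent of the genr commands above (numpy's standard normal standing in for nrnd; the seed and variable names are chosen for illustration):

```python
import numpy as np

# fixed seed so the sketch is reproducible; nrnd in EViews draws n.i.d.(0, 1)
rng = np.random.default_rng(0)

T = 40                      # sample 1960-1999, as for the historical u
mean_u, sd_u = 6.94, 3.13   # sample mean and s.d. of the historical u

uaswn = mean_u + sd_u * rng.standard_normal(T)   # genr uaswn = 6.94 + 3.13*nrnd
meanline = np.full(T, mean_u)                    # genr meanline = 6.94
```

Plotting uaswn against the historical u reproduces the comparison the slides discuss: the artificial series has the right mean and spread but none of the persistence.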

[Figure: U, UASWN and MEANLINE, annual data 1960-1999]
[Figure: LQR, LQRASWN and MEANLINE, annual data 1955-1997]
The white noise (WN) model for the unemployment rate in Italy would state that u randomly fluctuates around a constant mean (6.94) with constant variance (3.13²). But the white noise model does not fit the actual data for u, because it does not feature the most common characteristic of economic time series: PERSISTENCE. In fact, the actual u is by far more persistent than the simple WN process below and above the natural rate of about 7%.
Q: IS THIS RESULT PECULIAR TO UNEMPLOYMENT?
Eviews/lqr/series lqr (logs of capacity utilisation ratio).
plot lqr lqraswn meanline
From the plot above it is evident that capacity utilisation has a completely different path with respect to the unemployment rate: lqr is markedly less persistent than u. However, the capacity utilisation ratio still persists more than the corresponding artificial series (generated as a white noise realisation).

A: WE HAVE TO FIND OTHER REFERENCE MODELS. MORE REALISTIC STATISTICAL MODELS ARE COMBINATIONS OF DIFFERENT AUTOREGRESSIVE (AR) AND MOVING AVERAGE (MA) COMPONENTS; THEY ARE CALLED ARMA MODELS.

Example of another SP: the AR(1) model

[2]  xt = c + α xt-1 + εt,   εt ~ n.i.d.(0, σ²) is a WN

The variable xt is not independently distributed over time because it depends on xt-1. In a model for u, we can estimate the c and α parameters of equation [2] by using the OLS method:
ls u c u(-1)
Dependent Variable: U
Method: Least Squares
Sample(adjusted): 1961 1999
Included observations: 39 after adjusting endpoints

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.123144      0.215444     0.571584      0.5711
U(-1)       1.011932      0.028916     34.99588      0.0000

[Figure: actual, fitted and residual from the AR(1) regression for U, 1961-1999]

The residual tests are all fine (white noise errors); the AR(2) model fits equally well; the sum of the two α estimates is close to one. Now, also try with the lqr variable:
ls lqr c lqr(-1)
Dependent Variable: LQR
Sample(adjusted): 1952 1997
Included observations: 46 after adjusting endpoints

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C          -0.022713      0.007179    -3.163683      0.0028
LQR(-1)     0.604811      0.119754     5.050447      0.0000

R-squared 0.366970; Adjusted R-squared 0.352583; S.E. of regression 0.014579; Mean dependent var -0.057308; S.D. dependent var 0.018119; Durbin-Watson stat 1.696


From the previous regression output (for u) we note that: the estimate of the α parameter is very close to one; the AR(1) model fits unemployment quite well. Since the residuals are estimates of εt (a white noise process), we have to check the classical assumptions by using the diagnostic (misspecification) tests:

Under the null            AR(1) residuals
no autocorrelation        rejected
no heteroskedasticity     not rejected
normality                 not rejected

In this case a first order model is enough to avoid residual problems, and the alpha estimate is equal to about 0.6 (< 1). Note that the dynamics of the capacity utilisation rate (less persistent than the unemployment rate) is more difficult to fit with an AR model (R² = 0.367, against 0.976).
Preliminary findings:
a) data persistence is explained by AR models;
b) the sum of the AR parameter estimates is often close to one;
c) the more persistent the path, the easier it is to fit the data by AR models and the closer to one is the sum of the alpha estimates.
In addition, note that not all economic series are untrended; in the case of trended variables we must introduce deterministic components in our statistical models in order to (potentially) account for this further feature.
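Preliminary finding b) is easy to reproduce with simulated data; a hedged Python sketch (plain OLS standing in for the ls command; the parameter values and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# simulate a persistent AR(1): x_t = c + alpha*x_{t-1} + eps_t
alpha_true, c_true, T = 0.95, 0.35, 500
eps = rng.standard_normal(T)
x = np.empty(T)
x[0] = c_true / (1 - alpha_true)      # start at the unconditional mean
for t in range(1, T):
    x[t] = c_true + alpha_true * x[t - 1] + eps[t]

# OLS of x_t on a constant and x_{t-1}: the analogue of "ls u c u(-1)"
X = np.column_stack([np.ones(T - 1), x[:-1]])
c_hat, alpha_hat = np.linalg.lstsq(X, x[1:], rcond=None)[0]
print(round(float(alpha_hat), 3))   # close to (typically slightly below) 0.95
```

The more persistent the simulated path (alpha_true closer to one), the closer the OLS estimate sits to one, as in finding c).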

We can react to the autocorrelation by increasing the order of the AR process to 2; the AR(2) model is written as:

[3]  xt = c + α1 xt-1 + α2 xt-2 + εt,   εt ~ WN

where there is one more parameter, and the dynamics is extended to the second lag.
ls u c u(-1) u(-2) (results not reported).

From an economic p. o. v., the nature of the previous u and lqr variables excludes the presence of a deterministic trend, both being measured by ratios. On the other side, there are many variables whose levels can continuously grow over time (output, real wages, prices, etc.). For example, if we define the logs of the real wage as: genr lwp = log(w/p), and plot lwp, we can note that its path over time is trended, explained by some causal effects (e.g. labour productivity growth). The same applies to ly (logs of real output), equally trended, or to consumer price levels p. The previous statistical models can easily be extended to this feature by including a deterministic trend (t); e.g. the equation [1] becomes:

[1']  xt = c + δ t + εt

and we can fit this WN plus deterministic trend model to the actual lwp data:
ls lwp c @trend
[Figure: actual, fitted and residual from the WN plus trend regression for LWP, 1960-1999]

The WN plus trend model does not fit the data, and the regression residuals are very persistent (strong positive autocorrelation): we have to introduce wage dynamics with the AR(1) plus trend model (an extension of equation [2]):

[2']  xt = c + δ t + α xt-1 + εt

ls lwp c @trend lwp(-1)
Dependent Variable: LWP
Method: Least Squares
Sample(adjusted): 1961 1999
Included observations: 39 after adjusting endpoints

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.538676      0.260299     2.069449      0.0457
@TREND     -0.001136      0.000675    -1.683461      0.1009
LWP(-1)     0.961725      0.021464     44.80662      0.0000

The inclusion of dynamics is very important indeed: thanks to the dynamics the residuals are now fine (results not reported); the relevance of the time trend vanishes, while the autoregressive parameter estimate is close to one (as in many cases of AR model estimates). Some first tentative conclusions confirm the previous preliminary findings:
a) despite the inclusion of a deterministic trend, lwp persistence needs an AR dynamics; in general, many economic series can be represented by AR models of different orders, with or without deterministic trends;
b) the (sum of the) AR parameter estimates is very often close to one.


Q: WHAT DOES POINT B) IMPLY IN TERMS OF THE STATISTICAL PROPERTIES OF AR MODELS? The next step will be the study of the statistical properties of AR models with (or without) unit roots: a unit root is found in the SP of an AR model when the sum of the alpha parameters is equal to one (necessary condition).

2. THE STATIONARITY ISSUE IN AR MODELS: THE UNIT ROOT TESTS

Consider the AR(1) model in equation [2] and, for the moment, let's ignore the deterministic components:

[2]  xt = α xt-1 + εt,   εt ~ n.i.d.(0, σ²) is a WN

By introducing the lag operator L:
L0 xt = xt;  L xt = xt-1;  L2 xt = LLxt = Lxt-1 = xt-2
we can redefine equation [2]:
xt = α L xt + εt;  (1 - αL) xt = εt;  xt = εt/(1 - αL)
and if |α| < 1 we have that:
1/(1 - αL) = 1 + αL + α²L² + α³L³ + ... = Σi=0..∞ α^i L^i
(a geometric series converges if the absolute ratio of successive terms is less than 1)

[2*]  xt = (1 + αL + α²L² + α³L³ + ...) εt = εt + α εt-1 + α² εt-2 + α³ εt-3 + ...

In equation [2*] the AR(1) process is written in the corresponding MA(∞) form (Wold representation).
E(xt) = 0 (this result depends on the absence of deterministic components);
Var(xt) = E[xt - E(xt)]² = E[εt + α εt-1 + α² εt-2 + α³ εt-3 + ...]² = E[εt² + α² εt-1² + α⁴ εt-2² + α⁶ εt-3² + ...] = σ² [1 + α² + α⁴ + α⁶ + ...] = σ²/(1 - α²) = Var(xt-k)
Cov(xt, xt-k) = E{[xt - E(xt)] [xt-k - E(xt-k)]} = E{[εt + α εt-1 + α² εt-2 + ... + α^k εt-k + α^(k+1) εt-k-1 + α^(k+2) εt-k-2 + ...] [εt-k + α εt-k-1 + α² εt-k-2 + ...]} = α^k σ²/(1 - α²) = α^k Var(xt) = α^k Var(xt-k)


Autocorrelation coefficient of order k: ρk = Cov(xt,xt-k)/Var(xt) = Cov(xt,xt-k)/Var(xt-k) = α^k
If |α| < 1, the AR(1) model is STATIONARY, since its moments do not depend on t. The autocorrelation coefficient ρk decreases when k increases (the memory of the process decreases with k).
Example: if α = 0.6 (as in the case of the AR(1) model for the capacity utilisation ratio in logs) then:
xt = εt + 0.6 εt-1 + 0.36 εt-2 + 0.216 εt-3 + 0.13 εt-4 + 0.08 εt-5 + 0.047 εt-6 + ...
after six periods, the shock is no longer economically significant. An easy way to appreciate the path of a shock is to draw the impulse-response function of the series.

IMPULSE-RESPONSES IN THE STATIONARY AR(1) MODEL

horizon   timing   impulse   shocked x'i                                  responses = x'i - xi
0         t        s         x't = xt + s                                 s
1         t+1      0         x't+1 = α x't + εt+1 = α xt + α s + εt+1 = xt+1 + α s    α s
2         t+2      0         x't+2 = α x't+1 + εt+2 = xt+2 + α² s         α² s
3         t+3      0         ...                                          α³ s
...       ...      ...       ...                                          ...
h         t+h      0         ...                                          α^h s
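Both the ρk = α^k autocorrelations and the geometric decay of the responses above can be checked by simulation; a minimal Python sketch with α = 0.6, as in the lqr example (sample size and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

alpha, T = 0.6, 100_000     # long sample so the moments settle down
eps = rng.standard_normal(T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = alpha * x[t - 1] + eps[t]

def rho(x, k):
    """Sample autocorrelation coefficient of order k."""
    xd = x - x.mean()
    return float((xd[k:] * xd[:-k]).sum() / (xd ** 2).sum())

# theory: rho_k = alpha**k (0.6, 0.36, 0.216, ...) and
# Var(x) = sigma**2/(1 - alpha**2) = 1/0.64 = 1.5625
print([round(rho(x, k), 2) for k in (1, 2, 3)])
print(round(x.var(), 2))
```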

The responses decrease because |α| < 1. Example: the AR(1) model for the capacity utilisation ratio.
Eviews/lqr/quick/estimate VAR/lqr 1 1/impulse h=10 (multiple graphs)
[Figure: response of LQR to one S.D. LQR innovation, horizons 1-10: a positive response decaying towards zero]

What is depicted is the path of a transitory shock: given that the lqr variable is explained by a stationary AR(1) model (|0.6| < 1), the response to the impulse vanishes over time. A transitory shock can be interpreted as a demand shock: an increase in demand (positive shock) causes a short run increase in output, but leaves the long run potential output of the economy (given by the supply side) unaffected.


Contrast the stationarity case (|α| < 1) with the unit root case (α = 1 in equation [2]):

xt = xt-1 + εt

Repeated backwards substitution allows us to write:


xt = x0 + εt + εt-1 + εt-2 + εt-3 + ... + ε2 + ε1
where x0 is assumed to be a fixed initial value for the process. In a process with unit roots, second moments depend on time (non-stationarity):
E(xt) = x0
Var(xt) = E[xt - x0]² = E[Σi=1..t εi]² = t σ²

While in the stationary case a shock (innovation) has an effect on x that diminishes with t (transitory shock), in the unit root case it has a sustained (permanent) effect. In the case of the unemployment rate in Italy, a 0.5% shock is very persistent: after 20 years it is still 0.5% (permanent shock). The unit root model has an infinite memory.

DETERMINISTIC COMPONENTS
The previous outcomes do not substantially change when deterministic components are added to the AR model:
yt = dt + xt
where dt are the deterministic variables and xt is a zero mean AR(1) process. By using the Wold representation of the AR(1) model:
yt = dt + εt/(1 - αL)
(1 - αL) yt = (1 - αL) dt + εt
yt = α yt-1 + dt - α dt-1 + εt
case (a): dt = δ0 (only the constant term)
yt = (1 - α) δ0 + α yt-1 + εt
by defining (1 - α) δ0 = c, we have equation [2].
case (b): dt = δ0 + δ1 t (linear trend)
yt = α yt-1 + δ0 + δ1 t - α [δ0 + δ1 (t-1)] + εt
   = α yt-1 + δ0 + δ1 t - α δ0 - α δ1 t + α δ1 + εt
   = (1 - α) δ0 + α δ1 + δ1 (1 - α) t + α yt-1 + εt
by defining (1 - α) δ0 + α δ1 = c and δ1 (1 - α) = δ, we have equation [2'].
Summary of two useful models:
(a)  yt = c + α yt-1 + εt
(b)  yt = c + δ t + α yt-1 + εt

Cov(xt, xt-k) = E{[xt - x0] [xt-k - x0]} = E{[εt + εt-1 + ... + εt-k + εt-k-1 + ... + ε2 + ε1] [εt-k + εt-k-1 + ... + ε2 + ε1]} = (t - k) σ² = Var(xt-k)
ρk = Cov(xt,xt-k)/[Var(xt) Var(xt-k)]^0.5 = (t - k) σ²/σ² [t (t - k)]^0.5 = [(t - k)/t]^0.5
As t → ∞:  Var(xt) → ∞;  Cov(xt,xt-k) → ∞;  ρk → 1.
Eviews/phil/quick/estimate VAR/u 1 2/impulse h=20 (multiple graphs): the unemployment rate in Italy.
[Figure: response of U to one S.D. U innovation, horizons 1-20: the response does not die out]
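The Var(xt) = tσ² result for the random walk, which is what makes its shocks permanent, is easy to verify across replications; a small Python sketch (sizes and seed chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

T, N = 50, 20_000                    # horizon and number of replications
eps = rng.standard_normal((N, T))    # innovations with sigma**2 = 1
x = eps.cumsum(axis=1)               # x_t = eps_1 + ... + eps_t  (x_0 = 0)

# the cross-replication variance grows linearly in t: Var(x_t) = t*sigma**2
var_t = x.var(axis=0)
print(var_t[0], var_t[9], var_t[49])   # roughly 1, 10 and 50
```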


E(yt) = E(dt) + E(xt) = dt:
(a) E(yt) = δ0
(b) E(yt) = δ0 + δ1 t
Var(yt) = E[yt - E(yt)]² = E(xt)² = Var(xt)
Cov(yt, yt-k) = E{[yt - E(yt)] [yt-k - E(yt-k)]} = E{xt xt-k} = Cov(xt, xt-k)
The deterministic variables in yt only change the mean (which is in any case non-stochastic); the second moments are the same as those of the (zero mean) xt variable, and the stationarity condition is still |α| < 1.

if |α| < 1:
model (a) is a stationary AR(1) model with mean δ0 ≠ 0 (MEAN REVERTING), for stationary non drifting time series;
model (b) is a stationary AR(1) plus trend model (TREND REVERTING), for drifting trend stationary (TS) time series.
if α = 1:
model (a): yt = yt-1 + εt is the RANDOM WALK, for non drifting difference stationary (DS) time series;
model (b): yt = δ1 + yt-1 + εt is the RANDOM WALK WITH DRIFT, for drifting difference stationary (DS) time series.

We can summarise the univariate unit root concept by following Johansen (1997), who notes that the specific unit root model:
Δxt = εt, or: xt = xt-1 + εt
can also be written as:
xt = x0 + ε1 + ε2 + ... + εt
This is the random walk: a person starts walking from a square (x0) and takes steps (εi) of random size and direction; xt is his position after t steps, when he started at x0. By modelling a variable as a random walk we do not try to reproduce its sample path, because we decide that these details are not important to explain (in fact, we model them as random): only the qualitative behaviour of the path matters. It is a float: once it has reached a level, it stays there until it reaches a new level. On the other hand, in the model:
xt = α xt-1 + εt
α xt-1 is the predictable part of the movement and εt the unpredictable (random) part. When α = 1 we talk about random walks because the non stationary AR is a first order model.
γ (= α - 1) represents the glue of the process: if γ → 0 (then α → 1), neighbouring values of xt are more often close together, and we get a wave-like behaviour; while if γ → -1 (then α → 0), neighbouring values of xt are almost unrelated (independent). In general, when -2 < γ < 0, i.e. -1 < α < 1, the path of xt exhibits mean reversion.
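The glue interpretation of γ can be eyeballed in simulation: with γ close to 0 (α close to 1) neighbouring values stick together, while with γ close to -1 (α close to 0) they are nearly unrelated. A minimal Python sketch (sample size and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

def ar1_path(alpha, T=5_000):
    """Simulate x_t = alpha*x_{t-1} + eps_t from x_0 = 0."""
    eps = rng.standard_normal(T)
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = alpha * x[t - 1] + eps[t]
    return x

def rho1(x):
    """Sample 1st order autocorrelation coefficient."""
    xd = x - x.mean()
    return float((xd[1:] * xd[:-1]).sum() / (xd ** 2).sum())

sticky = ar1_path(0.99)   # gamma = -0.01: neighbouring values close together
loose = ar1_path(0.01)    # gamma = -0.99: neighbouring values almost unrelated
print(round(rho1(sticky), 2), round(rho1(loose), 2))
```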


Summary exercise: Practice with the stationary AR(1) model

Use model (a) above: yt = c + α yt-1 + εt, with εt ~ n.i.d.(0, σ²). Under the stationarity condition |α| < 1 we have that E(yt) = E(yt-1) = μ0, and we can simply summarise first and second moments.
E(yt) = c + α E(yt-1) + E(εt), hence: μ0 = c/(1 - α)
If we substitute this definition in model (a), and take the expected value of the square:
(yt - μ0) = α (yt-1 - μ0) + εt
E(yt - μ0)² = α² E(yt-1 - μ0)² + E(εt)²
Var(yt) = γ0 = σ²/(1 - α²)
Finally, multiply the demeaned equation by (yt-k - μ0):
(yt - μ0)(yt-k - μ0) = α (yt-1 - μ0)(yt-k - μ0) + εt (yt-k - μ0)
if we define E[(yt - μ0)(yt-k - μ0)] = γk, then: γk = α γk-1, and: γk = α^k γ0, ρk = α^k (note that ρ0 = 1).

Simulation analysis can be used to verify a number of stylised facts (procedure: simular1.prg). In order to simulate an AR(1) model we need to set three parameters, μ0, α and the ratio = σy/μ0, which deliver the genuine parameters of the AR(1) model:
c = μ0 (1 - α),   σ = ratio · μ0 · (1 - α²)^0.5
In the procedure: μ0 = %0, α = %1, ratio = %2, sign of α = %3 (0 = positive, 1 = negative).

' %0 = mean (id number)
' %1 = alpha (e.g. -0.6 or 0.6 -----> 60)
' %2 = ratio between s.d.(y)/mean(y) (e.g. 50% ---> 50)
' %3 = sign of alpha (0=positive, 1=negative)
scalar s = %2/100*%0*(1-( (-1)^%3 *%1/100)^2)^.5
smpl 1970.1 2000.4
rndseed %0
genr e%0%1%2_%3 = s*nrnd
' set the initial value (a random number from the long run mean and variance of y;
' alternatively you can start from zero, or deterministically from the mean %0)
smpl 1970.1 1970.1
genr y%0%1%2_%3 = %0 + e%0%1%2_%3/(1-( (-1)^%3 *%1/100)^2)^.5
' iterate the other values
smpl 1970.2 2000.4
genr y%0%1%2_%3 = %0*(1-((-1)^%3*%1/100)) + (-1)^%3*%1/100*y%0%1%2_%3(-1) + e%0%1%2_%3
smpl 1970.1 2000.4

With alternative simulations we can assess:
(a) the issue of the initial value of dynamic processes: what happens if we start far away from the mean? (e.g. 100 90 5 0)
(b) that the persistence of the path of yt depends on the α parameter (with 0 ≤ α < 1): it grows as α → 1
(c) what happens when -1 < α ≤ 0 (e.g. 100 0/40/80/99 50 0/1)
(d) what role is played by the ratio
Hints:
open a new working file your_name.wf1 (quarterly data from 1970.1 to 2000.4)
open the program simular1.prg
run the various scenarios (in parentheses above)
compare plots, correlograms, impulse-response functions
save all useful results
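For readers without EViews, a Python sketch of what simular1.prg does (the % arguments become plain function parameters; note that the ratio is passed directly as a fraction rather than as a percentage, and the seed is illustrative):

```python
import numpy as np

def simulate_ar1(mean, alpha, ratio, T=124, seed=0):
    """Simulate y_t = c + alpha*y_{t-1} + e_t with the simular1.prg
    parametrisation: c = mean*(1 - alpha) and
    s.d.(e) = ratio*mean*sqrt(1 - alpha**2), so that E(y) = mean and
    s.d.(y)/mean = ratio."""
    rng = np.random.default_rng(seed)
    s = ratio * mean * np.sqrt(1 - alpha ** 2)    # the scalar s in the program
    e = s * rng.standard_normal(T)
    y = np.empty(T)
    # initial value drawn from the long run distribution of y
    y[0] = mean + e[0] / np.sqrt(1 - alpha ** 2)
    for t in range(1, T):
        y[t] = mean * (1 - alpha) + alpha * y[t - 1] + e[t]
    return y

# the "100 90 5 0" scenario: mean 100, alpha 0.90, s.d.(y) equal to 5% of mean
y = simulate_ar1(100.0, 0.90, 0.05)
print(round(y.mean(), 1), round(y.std(), 1))
```

Running the function with the other scenarios listed above (alpha near one, negative alpha, different ratios) reproduces the stylised facts the exercise asks for.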


HOW TO TEST DS VS TS MODELS
The data generating process (DGP) is DS if α = 1, while it is TS (or mean reverting) if |α| < 1. The main inferential point is to find a significance test for the α estimates. The testing models are two: model (a) includes a constant term, model (b) includes both a constant and a trend. Models (a) and (b) can be conveniently reparametrised in order to ease inference:
(a)  Δyt = c + γ yt-1 + εt
(b)  Δyt = c + δ t + γ yt-1 + εt
where: γ = α - 1

THE DICKEY-FULLER (DF) UNIT ROOT TEST
H0: γ = 0 (α = 1)  the yt variable is a random walk
H1: γ < 0 (α < 1)  the yt variable is a stationary AR(1)
Under the null, yt is first order integrated, I(1), because it is stationary after one difference. Under the alternative it is a mean reverting AR(1) in model (a), or a trend stationary variable in model (b); in both cases yt is I(0), because it is stationary without (zero) differencing. The choice of the deterministic components [i.e. model (a) or (b)?] depends on:
- the economic nature of the variable: model (a) for ratios and rates, model (b) for levels;
- the historical pattern: model (a) for non drifting variables (remember that level variables are often drifting).

Sometimes the residuals from models (a) or (b) are autocorrelated: if so, they tell us that a first order autoregressive dynamics is not enough. In these cases we have to pass from the DF (first order) test to the Augmented Dickey-Fuller test; the ADF(p) test analyses a more general p+1 order dynamics. The corresponding (a) and (b) models for ADF(p) testing are:
(a)  Δyt = c + γ yt-1 + Σi=1..p θi Δyt-i + εt
(b)  Δyt = c + δ t + γ yt-1 + Σi=1..p θi Δyt-i + εt
Note that the augmentation is made in order to obtain white noise residuals; the augmentation is mainly suggested for high frequency observations (e.g. monthly, quarterly). A rule of thumb states that the augmentation of the DF test is often quite similar (or equal) to the data periodicity.
Example. If the variable yt is quarterly, it is appropriate to start with a fourth-fifth order dynamics. Suppose we are using model (a), and that a fourth order dynamics is appropriate to explain the path of the variable under scrutiny [an AR(4) model with constant and without trend]:
yt = c + α1 yt-1 + α2 yt-2 + α3 yt-3 + α4 yt-4 + εt
it can be conveniently rearranged (by adding and subtracting lagged terms) as:
Δyt = c + (α1 + α2 + α3 + α4 - 1) yt-1 - (α2 + α3 + α4) Δyt-1 - (α3 + α4) Δyt-2 - α4 Δyt-3 + εt
This specification coincides with the ADF(3) testing model by defining:
γ = α1 + α2 + α3 + α4 - 1;  θ1 = -(α2 + α3 + α4);  θ2 = -(α3 + α4);  θ3 = -α4
H0: γ = 0 (α1 + α2 + α3 + α4 = 1)  yt is I(1); not a random walk, but simply DS
H1: γ < 0 (α1 + α2 + α3 + α4 < 1)  yt is I(0), a stationary AR(4) model
The critical values of the ADF(p) test are the same as those of the DF test. The choice of the starting augmentation order depends on:
- data periodicity (see above)
- significance of the θi estimates
- white noise residuals
After preliminary estimation, non significant augmentation parameters can be dropped in order to enjoy more efficient estimates. For this reason, Campbell-Perron (1991) intuitively suggest a dropping down procedure from pmax. Such a procedure has then been supported (and refined) by the findings of Hall (1994) and Ng-Perron (1995, 2001). In particular, simulations carried out e.g. in Ng-Perron (1995) show a strong association between the choice of p and the severity of size distortions (over-rejections) and/or the extent of power loss (too few rejections).

Some ADF unit-root test applications
The unemployment data for Italy
Eviews/phil/u/view/line graph/unit root test/levels/intercept

ADF Test Statistic: -0.470304
MacKinnon critical values for rejection of the hypothesis of a unit root: 1% -3.6117; 5% -2.9399; 10% -2.6080

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(U)
Method: Least Squares
Sample(adjusted): 1962 1999
Included observations: 38 after adjusting endpoints

Variable    Coefficient   Std. Error   t-Statistic   Prob.
U(-1)      -0.013118      0.027892    -0.470304      0.6411
D(U(-1))    0.432667      0.157136     2.753456      0.0093
C           0.218240      0.202729     1.076510      0.2891

R-squared 0.179620; Adjusted R-squared 0.132741; S.E. of regression 0.500489; Sum squared resid 8.767134; Log likelihood -26.05472; Durbin-Watson stat 1.845302; Mean dependent var 0.219526; S.D. dependent var 0.537428; Akaike info criterion 1.529196; Schwarz criterion 1.658479; F-statistic 3.831583; Prob(F-statistic) 0.031280


The same test is accomplished for the first differences of u:

ADF Test Statistic: -4.190608
MacKinnon critical values: 1% -3.6171; 5% -2.9422; 10% -2.6092

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(U,2)
Method: Least Squares
Sample(adjusted): 1963 1999
Included observations: 37 after adjusting endpoints

Variable     Coefficient   Std. Error   t-Statistic   Prob.
D(U(-1))    -0.745106      0.177804    -4.190608      0.0002
D(U(-1),2)   0.230343      0.163098     1.412300      0.1669
C            0.177567      0.089767     1.978085      0.0561

R-squared 0.352465; Adjusted R-squared 0.314374; S.E. of regression 0.489707; Sum squared resid 8.153625; Log likelihood -24.52030; Durbin-Watson stat 2.098816; Mean dependent var 0.010108; S.D. dependent var 0.591415; Akaike info criterion 1.487584; Schwarz criterion 1.618199; F-statistic 9.253393; Prob(F-statistic) 0.000619


The unemployment rate in Italy is generated by a statistical process with one unit root: u is I(1), and Δu is I(0). This fact is apparently impossible, since u is a ratio limited between zero and 100%; the same can be said with reference to other ratios or rates (interest rates, the inflation rate, etc.). An explanation comes by quoting, among others, Hall, Anderson, Granger (1992, note 5): "The conclusion that yields to maturity are integrated processes can not be true in a very strict sense because integrated series are unbounded, while nominal yields are bounded below by zero. Nevertheless it is evident from the data that the statistical characteristics of yields are closer to those of I(1) series than I(0) series, so that for the purposes of building models of the term structure it is appropriate to treat these yield series as if they were I(1)."
Practice: does the previous result change if we take u in logs (variable lu) instead of in levels? Try both reference models (a) and (b); do the answers change with the models?
Another application: the logs of capacity utilisation in Italy.
ADF Test Statistic: -3.723136
MacKinnon critical values: 1% -3.5814; 5% -2.9271; 10% -2.6013

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(LQR)
Method: Least Squares
Sample(adjusted): 1953 1997
Included observations: 45 after adjusting endpoints

Variable     Coefficient   Std. Error   t-Statistic   Prob.
LQR(-1)     -0.493511      0.132552    -3.723136      0.0006
D(LQR(-1))   0.250065      0.149150     1.676600      0.1010
C           -0.028317      0.007893    -3.587552      0.0009

R-squared 0.248158; Adjusted R-squared 0.212356; S.E. of regression 0.014446; Sum squared resid 0.008765; Log likelihood 128.3800; Durbin-Watson stat 1.998996; Mean dependent var -4.54E-05; S.D. dependent var 0.016277; Akaike info criterion -5.572444; Schwarz criterion -5.452000; F-statistic 6.931416; Prob(F-statistic) 0.002504

The logs of the capacity utilisation ratio are I(0), as also suggested by the profile of the impulse responses.
Example: do the Treasury bills interest rates have a unit root?
Eviews/termine/ plot rbot3 rbot6 rbot12
[Figure: RBOT3, RBOT6 and RBOT12, 1980-1998]

ADF(12) test for rbot3 (levels):
ADF Test Statistic: -0.990188
MacKinnon critical values: 1% -3.4625; 5% -2.8752; 10% -2.5740

ADF(12) test for d(rbot3) (first differences):
ADF Test Statistic: -3.475083
MacKinnon critical values: 1% -3.4627; 5% -2.8753; 10% -2.5740


The variable rbot3 is I(1); for an explanation see above, and read the following quotation.

"Yet, interest rates are almost certainly stationary in levels. Interest rates were about 6% in ancient Babylon; they are about 6% now. The chances of a process with a random walk component displaying this behaviour are infinitesimal." Pr(|r1991 < 100%| given r4000BC = 6%) is infinitesimal if the interest rates are, or contain, a random walk; it is near one if interest rates are an AR(1) with a coefficient of 0.99, Cochrane (1991, p. 208).
The last sentence introduces the issue of the relevance of the time span, rather than the number of observations. Given an economic variable, e.g. the inflation rate, different data periodicities imply different time spans (inflation data, see below):

name   periodicity   time span   # of observ.
lypc   1             1895-1997   103 (1)
lpq    4 (2)         70q1-97q4   112
lpm    12            72m1-98m7   319

[Figure: LPC (left scale) and DLPC (right scale), annual data 1890-1997]
and in fact, the ADF test results are:


ADF Test Statistic: -3.264109
MacKinnon critical values: 1% -4.0485; 5% -3.4531; 10% -3.1519

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(LPC)
Sample(adjusted): 1894 1997
Included observations: 104 after adjusting endpoints

Variable       Coefficient   Std. Error   t-Statistic   Prob.
LPC(-1)       -0.042398      0.012989    -3.264109      0.0015
D(LPC(-1))     0.769148      0.097420     7.895145      0.0000
D(LPC(-2))     0.044517      0.124285     0.358188      0.7210
D(LPC(-3))    -0.005133      0.098845    -0.051929      0.9587
C             -0.420880      0.134976    -3.118186      0.0024
@TREND(1890)   0.004323      0.001332     3.245647      0.0016

Given the number of observations T, the wider the time span, the higher the power of unit root tests (case 1). Given the time span, higher frequency data do not relevantly improve the power of the tests (case 2). The power gain from increasing the data span is bigger than the power gain from increasing the sample size while leaving the data span fixed; it would be surprising if simple time disaggregation helped in the estimation of long run relations, Hendry (1986, OBES).
An example: is the inflation rate I(1) or I(0)? If we use annual data (lypc.wf1), we have:


While, in first differences, the null is rejected at 1%:

ADF Test Statistic: -3.725069
MacKinnon critical values: 1% -3.4946; 5% -2.8895; 10% -2.5815
Result: the inflation rate is I(0), and price levels are I(1). But a completely different picture emerges with high frequency data (quarterly, lpq.wf1, and monthly, lpm.wf1); in fact, the common outcome there is that the inflation rate is I(1), and price levels are I(2). This is understandable if we look at the plots below:

[Figures: quarterly inflation, 4*DLP vs D4LP, 1972-1998; monthly inflation, 12*DLP vs D12LP, 1970-1996]
THE ELLIOTT-ROTHENBERG-STOCK (DF-GLS) UNIT ROOT TEST

Fact: while the presence/absence of a unit root has important implications, many remain skeptical about the conclusions drawn from such tests. Why?
Remember:
Size of the test: the probability that the test actually rejects the null when the null is true.
Power of the test: the probability that the test correctly rejects the null when the alternative is true.
The ADF test suffers from severe size distortions (over-rejection of the unit root hypothesis) when the moving-average polynomial of the first differenced series has a large negative root, and it has low power when the root of the autoregressive polynomial is close to, but less than, unity. Elliott, Rothenberg, Stock (1996) find that local GLS detrending of the data yields substantial power gains. Ng, Perron (2001) show that size and power may be further improved when the truncation lag is appropriately selected (e.g. with their specific MAIC p-selection rule).

DF-GLS and ADF share the same alternative univariate models, (a) without or (b) with a deterministic linear trend, and test the same hypotheses:
H0: yt has a unit root
H1: yt is stationary
The DF-GLS test is accomplished in two steps.
1st step: GLS detrending.
case (a), where α* = 1 - 7/T:
y(α)1 = y1;  c(α)1 = 1
y(α)t = yt - α* yt-1;  c(α)t = 1 - α*   for t = 2, 3, ..., T
case (b), where α* = 1 - 13.5/T:
y(α)1 = y1;  c(α)1 = 1;  t(α)1 = 1
y(α)t = yt - α* yt-1;  c(α)t = 1 - α*;  t(α)t = t - α*(t-1)   for t = 2, 3, ..., T
OLS estimates b0 and b1 of the β0 and β1 parameters:
case (a): y(α)t = β0 c(α)t + et, and the detrended series is defined as ydt = yt - b0
case (b): y(α)t = β0 c(α)t + β1 t(α)t + et, and ydt = yt - (b0 + b1 t)
2nd step: ADF test on the detrended series ydt:
Δydt = γ ydt-1 + Σi=1..p θi Δydt-i + εt
The DF-GLS test corresponds to the Student t of the OLS estimate of γ in the equation above (note that all the deterministic variables are excluded, and the equation is the same in both cases). While the DF-GLS t-ratio follows the ADF distribution (but without constant) in case (a), the asymptotic distribution differs when case (b) is considered (critical values are simulated in the ERS paper).
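A minimal Python sketch of the two DF-GLS steps in case (b), on illustrative data (augmentation lags and the t-ratio computation are omitted for brevity; this is a sketch of the mechanics, not a full implementation of the test):

```python
import numpy as np

rng = np.random.default_rng(5)

# illustrative trending data: a random walk with drift
T = 200
y = np.cumsum(0.1 + 0.3 * rng.standard_normal(T))

# 1st step, case (b): local GLS detrending with alpha* = 1 - 13.5/T
a = 1 - 13.5 / T
trend = np.arange(1.0, T + 1)
ya = np.concatenate(([y[0]], y[1:] - a * y[:-1]))          # y(alpha)
ca = np.concatenate(([1.0], np.full(T - 1, 1 - a)))        # c(alpha)
ta = np.concatenate(([1.0], trend[1:] - a * trend[:-1]))   # t(alpha)
b0, b1 = np.linalg.lstsq(np.column_stack([ca, ta]), ya, rcond=None)[0]
yd = y - (b0 + b1 * trend)                                 # detrended series

# 2nd step: DF regression on yd with NO deterministic terms; the DF-GLS
# statistic would be the t-ratio of gamma
dyd = np.diff(yd)
gamma = np.linalg.lstsq(yd[:-1, None], dyd, rcond=None)[0][0]
print(round(float(gamma), 3))
```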


Concluding remarks (Stock and Watson, ch. 12, 2002)
If the dependent variable and/or the regressors are non stationary, then the autoregressive coefficients are biased towards zero, the OLS t-statistics are non-normal under the null, and regressions may be spurious. Hence, the conventional hypothesis tests, confidence intervals and forecasts are unreliable. The precise problems created by the non-stationarity, and their solutions, depend on its nature. The main sources of non-stationarity are trends and breaks.
Trend: a persistent long run movement of a variable over time. It is of two types:
- deterministic (a non-random function of time)
- stochastic (random, varies over time)
(examples: ly and dlpc in lypc.wf1)
Many econometricians think it is more appropriate to model economic time series as having stochastic rather than deterministic trends. "Economics is complicated stuff. It is hard to reconcile the predictability implied by a deterministic trend with the complications and surprises faced year after year by workers, businesses, and governments (... examples ...). For these reasons, our treatment of trends in economic time series focuses on stochastic rather than deterministic trends." (p. 458)
Break: arises when the population regression function changes over the course of the sample. In economics it occurs because of changes in economic policy and/or in the structure of the economy, because of inventions, etc. Usually it entails changes in the regression parameters and poorer-than-expected forecasting performance. Again, the nature of the break suggests the best solution (switching/evolving parameter regressions).

3. UNIT ROOTS AND SPURIOUS REGRESSIONS

Most econometric analyses are based on sample variance and covariance estimates among variables. Non-stationarity causes problems (unconditional moments are not defined): a likely result is spurious regression, and the use of standard large-sample theory for valid estimation and inference in the linear model is not allowed.
Historical background (since the 1920s). Spurious correlation is an observed sample correlation between two series which, though appearing statistically significant, is a reflection of a common trend rather than of any genuine underlying association. Allen (1949, p. 156): there is a strong positive correlation between the birth rate and the number of storks in Sweden, since each has been declining for various reasons. Correlation is a statistical concept which is neutral as regards causal relations (an economic concept).
Non-stationarity cancels a number of standard statistical properties and tools. Consider the model:
yt = c + β zt + εt
where we suppose that yt and zt are independent. Classical assumptions of regression:
I. the regressors are either deterministic or stationary random variables (uncorrelated with the error term);
II. E(εt) = 0; E(εt)² = σ²; E(εt εt-k) = 0 for all k > 0.


If both assumptions hold, then under H0: β = 0, Pr(|t| > 1.96) = 0.05; while, if yt and zt are independent I(1) variables, assumption I. clearly fails, and the effect on the t-statistic distribution is: under H0: β = 0, Pr(|t| > 1.96) ≈ 0.753 (i.e. the appearance of a falsely significant regression well over the 5% significance level: problems in the size of the test). In addition, such regressions are characterised by:

R2 = 1 − Σt(yt − ŷt)2 / Σt(yt − ȳ)2 is very high, close to one (since the variables are both trended, the ratio above can be very small).
A very low Durbin-Watson (DW) statistic of 1st order autocorrelation, close to zero: DW ≈ 2(1 − ρ̂), where ρ̂ is the 1st order autocorrelation coefficient of the regression residuals. If DW ≈ 0, then ρ̂ ≈ 1, which suggests the regression residuals are probably I(1).
Example: spurious regressions with artificial data. Eviews/new/workfile/quarterly/1970.1-2000.4 (T=124). open/program/montecarlo.prg/ run it various times and check: t, R2 and DW.

Suggested remedies in the literature
Granger-Newbold (1974, JE) suggest a rule of thumb to detect spurious regressions: when R2 >> DW. The remedy they suggest is to impose a Δ = (1−L) filter on I(1) series in order to make them stationary, and improve inference: ls D(Y) C D(Z). In this way, the t-statistics are no longer significant (as expected, since the two variables were independently simulated), R2 is close to zero, and the DW test suggests non autocorrelated residuals (close to 2).
Sims-Stock-Watson (1990, E) note that, in levels static regressions, t-statistics test the following hypotheses:
H0: β = 0 ⇒ yt = c + εt, which is FALSE because yt is I(1);
H1: β ≠ 0, which is FALSE too, since yt and zt are not related;
⇒ both the null and the alternative hypotheses are false; this raises further inference problems. The main problem with these spurious regressions is that nothing in the model accounts for the persistence of yt (only the residuals do, since zt is not related to yt). The remedy they suggest is to add lags in order to reach white noise residuals (since the persistence is caught by the dynamic specification): ls Y C Z Y(-1) Z(-1). In addition, the true model (y is a random walk) is nested in the dynamic model. Things improve, but we still miss statistical foundations for inference (with I(1) variables, t-statistics are non standard and R2 is uninformative). Some preliminary findings: dynamics matters very much (white noise residuals); always remember what model underlies both the null and the alternative hypotheses.


Example of a crazy regression: US consumers look at the logs of UK incomes when they purchase goods (logs of US consumption)! Static model. Eviews/ardlusuk/ ls lcus c lyuk
Dependent Variable: LCUS
Sample(adjusted): 1959:1 1998:1; Included observations: 157 after adjusting endpoints
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -5.612676     0.160374     -34.99740     0.0000
LYUK        1.208592     0.014419      83.81657     0.0000
R-squared 0.978413; S.E. of regression 0.052291; Durbin-Watson stat 0.140469; Mean dependent var 7.824778; Akaike info criterion -3.051334; Prob(F-statistic) 0.000000

4. THE DYNAMIC SPECIFICATION (ARDL)
The problems related to non stationarity can be partly solved in a dynamic framework, because many potentially right models are nested in it. Empirical analysis of level (long run) relationships has been an integral part of time series econometrics and pre-dates the literature on unit roots and cointegration, see Hendry-Pagan-Sargan (1984). The fundamental contribution of this literature is on the specification and estimation of level relationships, rather than on testing for their presence, since co-integration theory was missing (not yet fully established). Q: HOW TO USE ECONOMIC THEORIES WHEN CONSTRUCTING AN EMPIRICAL MODEL? Two extreme approaches (see Granger, 1999, ch. 1):
(1) Theory contains the only pure truth, so has to be at the basis of the model, leaving little place for stochastics, uncertainty or exogenous shocks to the system. (2) Theory is useless; better atheoretical models based just on examination of the data and using any apparent regularities and relationships found in it. Most applied economists take a middle ground, using theory to provide the initial specification (variables of interest) and then data exploration techniques to extend or refine the starting model, leading to a form that better represents data.


If we impose the first difference transformation, ls d(lcus) c d(lyuk), as suggested by Granger-Newbold (results not reported), we obtain positive and negative findings: a non significant t-statistic and very low R2 (positive side); we lose the levels and the information of economic theory (negative side); residual autocorrelation (negative side). Reconsider now the spurious regression for US consumption in the context of the dynamic model, by augmenting the static regression with lags up to one year (four lags): ls lcus c lyuk(0 to -4) lcus(-1 to -4), as suggested by Sims-Stock-Watson (results not reported). Main findings: the residuals are white noise; the sum of the lagged consumption parameter estimates is close to one, and the sum of the UK income parameter estimates is close to zero.

Theory: static statements about economic relations (long run relations)
Reality (data): dynamic fashion


A: TO PRODUCE A BRIDGE FROM THE PRISTINE THEORY TO THE MORE PRAGMATIC DATA ANALYSIS

Given the level zt-1, y*t-1 = θ zt-1 measures the target level, and [yt-1 − y*t-1] is the error(equilibrium)-correction term. The ECM form of the model may be seen as comprising the short run transitory effect and the long run relationship, and it describes how the long run solution is achieved via error correction feedback. In fact, if −2 < λ < 0 (which corresponds to −1 < α1 < 1), equation [4''] equilibrates in the presence of a discrepancy between yt-1 and y*t-1: it guarantees that, in the long run, y will converge to its target y*. If yt is not on its long run path, and yt-1 > y*t-1 {yt-1 < y*t-1}, the ECM representation with −2 < λ < 0 ensures that there is pressure, from the error-correction term, for Δyt < 0 {Δyt > 0}. In other terms, if −2 < λ < 0, in disequilibrium yt will move towards its long run path, both from above and from below, and the movement will be in proportion to the last period's error [yt-1 − y*t-1]. λ also measures the speed (and the path) of the adjustment of yt to disequilibrium:
λ > 0: the model is explosive
λ = 0: the model does not adjust
−1 < λ < 0: stable process of adjustment
λ = −1: the model adjusts in one period
−2 < λ < −1: overshooting adjustment
λ = −2: the model continuously oscillates
λ < −2: the model is explosive
Big forecast problems arise in the presence of level-breaking relationships in y*t (see Clements-Hendry, 1999).
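The λ cases above can be checked with a small deterministic recursion, Δyt = λ(yt-1 − y*), with the noise and short run terms switched off (hypothetical λ values, illustrative only):

```python
# Deterministic adjustment paths: y_t = y* + (1 + lam) * (y_{t-1} - y*),
# i.e. Dy_t = lam * (y_{t-1} - y*), for the lambda cases listed above.
def path(lam, y0=1.0, ystar=0.0, steps=10):
    y, out = y0, []
    for _ in range(steps):
        y = ystar + (1.0 + lam) * (y - ystar)
        out.append(y)
    return out

monotone = path(-0.5)    # -1 < lam < 0: stable monotone convergence
one_shot = path(-1.0)    # lam = -1: full adjustment in one period
overshoot = path(-1.5)   # -2 < lam < -1: converges with oscillations
explosive = path(0.5)    # lam > 0: diverges from the target

print(monotone[:3], one_shot[:2], overshoot[:3], explosive[-1])
# prints [0.5, 0.25, 0.125] [0.0, 0.0] [-0.5, 0.25, -0.125] 57.6650390625
```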

When economic theory proposes an equilibrium relationship between two variables, this may be seen as the long run steady-state solution of a dynamic model. Define the simple 1st order Auto-Regressive Distributed-Lags (ARDL) model:
[4] yt = c + α1 yt-1 + β0 zt + β1 zt-1 + εt, where εt ~ n.i.d.(0, σ2).
A long run relation is something that the dynamic process would satisfy if all errors were switched off, and the equations would then bring the process back to a set of values where the long run relation is satisfied (steady state). The long run steady-state non-stochastic solution of eq. [4] is obtained by setting yt = yt-1 = y*; zt = zt-1 = z*; εt = 0:
y* = c/(1−α1) + [(β0+β1)/(1−α1)] z*
and the long run (level) relationship is measured by the parameter θ = (β0+β1)/(1−α1). Model [4] can be reparametrised in a convenient way in order to better understand the mechanism of adjustment towards the long run relationship:
[4'] Δyt = c + β0 Δzt + (α1−1) [yt-1 − (β0+β1)/(1−α1) zt-1] + εt
By defining λ = (α1−1) and, as above, θ = (β0+β1)/(1−α1), we obtain the specification of the error correction mechanism (ECM) model:
[4''] Δyt = c + β0 Δzt + λ [yt-1 − θ zt-1] + εt
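A quick numerical check that the ARDL model [4] and its ECM reparametrisation are the same model (all parameter values are hypothetical, chosen only for illustration):

```python
# Numerical check that ARDL [4] and its ECM reparametrisation are identical:
# y_t  = c + a1*y_{t-1} + b0*z_t + b1*z_{t-1}     (shocks switched off)
# Dy_t = c + b0*Dz_t + lam*(y_{t-1} - theta*z_{t-1})
import numpy as np

c, a1, b0, b1 = 0.1, 0.7, 0.4, 0.2       # hypothetical parameter values
theta = (b0 + b1) / (1 - a1)             # long run parameter (2.0 here)
lam = a1 - 1                             # loading parameter (-0.3 here)

rng = np.random.default_rng(0)
z = np.cumsum(rng.standard_normal(200))  # an arbitrary I(1) forcing path
y = np.zeros(200)
for t in range(1, 200):
    y[t] = c + a1 * y[t - 1] + b0 * z[t] + b1 * z[t - 1]

# reconstruct Dy from the ECM form and compare with the ARDL-generated Dy
dy_ecm = c + b0 * np.diff(z) + lam * (y[:-1] - theta * z[:-1])
print(np.allclose(np.diff(y), dy_ecm), round(theta, 6), round(lam, 6))
# -> True 2.0 -0.3
```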


Testing for the existence of a level relationship
It is more convenient to use another parametrisation of [4]:
[4*] Δyt = c + β0 Δzt + π1 yt-1 + π2 zt-1 + εt
where π1 = (α1−1) = λ and π2 = (β0+β1) = −λθ.
There are two testing possibilities:
i. H0: π1 = 0 (t-statistic);
ii. H0: π1 = π2 = 0 (F-statistic).
Note that the t and F distributions are not standard; asymptotic critical values are tabulated by Pesaran-Shin-Smith (2001), and they depend on: whether the variables are I(0) or I(1); the presence of the deterministic trend and/or the constant; the number k of explanatory (forcing) variables. Example: unrestricted constant and no trend (5% asymptotic c.v.)
          t               F
k     I(0)    I(1)    I(0)   I(1)
1    -2.86   -3.22    4.94   5.73
2    -2.86   -3.53    3.79   4.85
3    -2.86   -3.78    3.23   4.35
4    -2.86   -3.99    2.86   4.01

The null hypothesis of both (t and F) tests is the absence of a long run relationship. Critical values are tabulated ad hoc: one set of c.v. is obtained by assuming that all the variables are I(0), the other set by assuming that all the variables are I(1). In the case where zt and εt are correlated, the ARDL procedure requires estimation of an augmented version of the original model. Hence, the important issue in the application of the ARDL procedure is the choice of the order of the distributed lag function on yt and zt. Example: does a stable long run relationship between consumption and income exist in the US? Eviews/ardlusuk/ardl
genr dlcus = d(lcus) genr dlyus = d(lyus) ls dlcus c dlcus(-1 to -3) dlyus(0 to -3) lcus(-1) lyus(-1)

Both t and F tests do not reject H0 ⇒ a spurious relation? Probably it is better to think in terms of omitted variables,
e.g. the quarterly inflation rate: pius (Deaton consumption model) is added to previous ARDL specification:
genr dpius = d(pius) ls dlcus c dlcus(-1 to -3) dlyus(0 to -3) dpius(0 to -3) lcus(-1) lyus(-1) pius(-1)

Main features of the ARDL approach: Developed in the field of the Auto Regressive Distributed Lags models (dynamic specification, Sargan, LSE, etc.).

Given LM(4) autocorrelated residuals, we added dlcus lags 4 and 5: the results are in /equation/ardlfinal/
ls dlcus c dlcus(-1 to -5) dlyus(0 to -3) dpius(0 to -3) lcus(-1) lyus(-1) pius(-1)

(model used in estimating/testing for the long run relationship).


Both t and F tests reject H0 either by using I(0) or I(1) c.v., since the statistics are t = -3.5 and F = 8.46, while the corresponding I(0) 5% c.v. are t = -2.86 and F = 3.79, and the I(1) 5% c.v. are t = -3.53 and F = 4.85 (see Pesaran, Shin, Smith, 2001, p. T.2 and T.4).

5. LONG RUN RELATIONSHIPS AND COINTEGRATED VARIABLES
Ex ante, by pretesting with the ADF test, suppose we know that yt and zt are I(1): does a long run relationship between y and z exist? Are y and z co-integrated?
Cointegration definition: two integrated I(d) variables are co-integrated if there exists a linear combination of them which is integrated I(c), with c < d. The case d = 1 and c = 0 is interesting in that cointegration implies an ECM representation (EqCM is the update used by Hendry & co.) which allows us to rewrite a dynamic model in I(1) levels as a dynamic model which involves only I(0) variables.
THE ENGLE AND GRANGER PROCEDURE:
[A] OLS estimation of the static (cointegration) regression:
yt = c + β zt + ut
Note that if the combination ut = yt − (c + β zt) is I(0), then the integrated y and z variables are also cointegrated.
[B] Unit root test on the cointegration regression residuals:
ût = yt − (ĉ + β̂ zt)
[C] If y and z are cointegrated, then β̂ (the OLS estimator) is superconsistent.

The estimates of the long run parameters are: θy = 0.105/0.112 = 0.94; θpi = −0.251/0.112 = −2.24. The estimate of the loading parameter is λ = −0.112: each quarter, lcus adjusts by about 10% towards the target (equilibrium) level given by: lcus* = 0.94 lyus − 2.24 pius. Often the absence of a long run relationship is the symptom of the omission of relevant variables. In fact, the partial long run relationship (lcus − lyus) is very persistent (i.e. it reverts slowly), while the additional information from the pius path can further explain savings inertia, and form a cointegrated relationship together with (lcus − lyus). plot lcus-lyus pius
[Figure: lcus−lyus (left scale) and pius (right scale), 1960–2000]
[D] Dynamic (short run) ECM modelling of Δyt, Δzt and ût. Note that, under the hypothesis of cointegration, all the variables in the ECM model are I(0).


RATIONALE OF STEPS [A, B]: COINTEGRATION AND COMMON TRENDS
At the univariate level, suppose:
yt = τ1t + v1t, where:
τ1t = μ1 + τ1t-1 + ε1t, ε1t ~ i.i.d.(0, σ211)
v1t = ν1 + ρ1 v1t-1 + η1t, η1t ~ i.i.d.(0, σ212)
τ1t is a random walk (the non stationary component of yt), and v1t is an AR(1) with |ρ1| < 1 (the stationary component of yt). The same is supposed for zt:
zt = τ2t + v2t, where:
τ2t = μ2 + τ2t-1 + ε2t, ε2t ~ i.i.d.(0, σ221)
v2t = ν2 + ρ2 v2t-1 + η2t, η2t ~ i.i.d.(0, σ222)
In the static regression yt = c + β zt + ut, we have that:
ut = yt − β zt − c = τ1t + v1t − β(τ2t + v2t) − c = [τ1t − β τ2t] + [v1t − β v2t] − c
where [τ1t − β τ2t] collects the I(1) components, and [v1t − β v2t] the I(0) components. The cointegration condition is [τ1t − β τ2t] = 0 (when combined, the I(1) components of the y and z variables cancel each other). Under the cointegration condition, the error term is ut = (v1t − β v2t − c) ~ I(0), while, if the cointegration condition is not satisfied, ut ~ I(1). In other terms, ut ~ I(0) means that τ1t = β τ2t: the I(1) component of yt is the same as that of zt up to a scalar β, the parameter that converts τ2t into τ1t. τ2t is the stochastic common trend of yt and zt; in fact, under the cointegration assumption, we have that:
zt = τ2t + v2t
yt = β τ2t + v1t

τ2t is the common source of nonstationarity; by substituting the definition of τ2t into the cointegrated yt we have:
yt = β(zt − v2t) + v1t = β zt + (v1t − β v2t)
Again, the static regression residuals are stationary (though autocorrelated) if yt and zt are cointegrated.
INTUITION BEHIND STEP [C]: SUPERCONSISTENCY
If yt and zt are cointegrated, the OLS method yields a superconsistent estimator of the cointegrating parameters, since the effect of the common trend dominates the effect of the stationary component. The omission of the dynamics is not very relevant if the variables are I(1) and cointegrated. The cointegrated combination is a strong linear relationship, but with relevant biases in small samples (see the Banerjee et al. (1993) results). Example with simulated data. Eviews/new/workfile/undated/1 200/open/program/supercon.prg/: run the program, and it will display different dispersions around the regression lines in I(0) variables, yi0 against zi0, and in I(1) variables, yi1 against zi1. All data were simulated with a long run parameter equal to 1, with both I(0) and I(1) variables, and a fixed adjustment parameter. The regression output from the I(1) variables, ls yi1 c zi1, shows that 200 observations are enough to enjoy the superconsistency of the OLS estimator of cointegrated relations. The recursive estimation gives the intuition of the issue: with samples of fewer than 100 observations, the amount of the


bias is very relevant (this result confirms Banerjee et al. (1993) outcomes).
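The supercon.prg experiment can be replicated as follows (a sketch under assumed parameter values; the long run slope is set to 1):

```python
# Superconsistency sketch (supercon.prg analogue): the OLS slope of the
# static regression between cointegrated I(1) variables has an average
# absolute bias that shrinks quickly as T grows (true long run slope = 1).
import numpy as np

rng = np.random.default_rng(3)

def mean_abs_bias(T, reps=200):
    biases = []
    for _ in range(reps):
        z = np.cumsum(rng.standard_normal(T))              # I(1) regressor
        u = np.zeros(T)
        for t in range(1, T):
            u[t] = 0.5 * u[t - 1] + rng.standard_normal()  # stationary error
        y = z + u                                          # long run slope 1
        zc = z - z.mean()
        slope = zc @ (y - y.mean()) / (zc @ zc)            # static OLS slope
        biases.append(abs(slope - 1.0))
    return float(np.mean(biases))

b50, b200 = mean_abs_bias(50), mean_abs_bias(200)
print(f"mean |bias|: T=50 -> {b50:.3f}, T=200 -> {b200:.3f}")
```

The average bias at T = 200 is a fraction of the one at T = 50, which is the small-sample pattern the recursive estimates display.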
[Figure: scatters of YI0 against ZI0 and of YI1 against ZI1]
Previous results with simulated I(1) time series suggest a further question: why does the static (cointegration) regression estimate the cointegration (long run) parameter even though the DGP is dynamic? Hypotheses: (i) y and z are I(1) and cointegrated; (ii) the true DGP is (see eq. [4] above):
yt = c + α1 yt-1 + β0 zt + β1 zt-1 + εt, εt ~ n.i.d.(0, σ2)
Fact: if y and z are cointegrated, the OLS regression of y on z alone (with no lags) yields a slope that is a (super)consistent estimator of the long run parameter θ = (β0+β1)/(1−α1), because the OLS criterion of minimising the sum of squared residuals forces the estimate towards (β0+β1)/(1−α1). In fact, if β0 ≠ (β0+β1)/(1−α1):
yt = β0 zt + u1t, where u1t = c + β1 zt-1 + α1 yt-1 + εt
while, if θ = (β0+β1)/(1−α1):
yt = θ zt + u2t, where u2t = [c − β1 Δzt − α1 Δyt + εt]/(1−α1)

[Figure: recursive estimate of the long run parameter, samples from 20 to 200 observations]

The residuals of the static regression are autocorrelated (LM test and correlogram) but stationary. Following the Engle and Granger (1987) approach, white noise residuals are not essential at this stage. Save the residuals and perform the ADF test without deterministic components (CRDW and CRDF have different c.v., see Engle-Granger, 1987). The same regression with I(0) variables, ls yi0 c zi0, shows a short term (and not long term) parameter estimate, and autocorrelated residuals; by adding the lagged dependent variable, we are able to find a consistent estimate of the long run parameter.


Given the hypotheses (i) and (ii), u1t ~ I(1) and u2t ~ I(0). Since the sum of squares of an I(1) variable increases without bound as the sample goes to infinity, the OLS estimator will pick the estimate such that the corresponding residuals are closer to an estimate of u2t than of u1t. Of course, this fact per se does not prevent u2t from being autocorrelated.


Example: the Engle and Granger application to the Phillips curve for Italy. Eviews/phil/ [1st step] ls dlw c lu DW = 0.4, R2 = 0.35

Another symptom of the absence of a stable (cointegrated) long run relationship among the variables of interest is given by the recursive estimate plots of the long run parameters:

Dependent Variable: DLW
Method: Least Squares; Sample(adjusted): 1961 1999; Included observations: 39 after adjusting endpoints
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C           0.247763     0.031744      7.804992     0.0000
LU         -0.075388     0.016687     -4.517788     0.0001

[Figure: recursive C(1) and C(2) estimates with ± 2 S.E. bands, 1970–1995]

What is the reason for such a result? The scatter of the wage-unemployment trade-off can be suggestive: scat lu dlw (connect/line)
[Figures: actual, fitted and residuals of the static regression; scatter of DLW against LU]
The figure above suggests very persistent residuals, and the ADF residual test reported below confirms the visual inspection:
ADF Test Statistic: -1.220092 (c.v. are in Engle-Granger, 1987)


The impression is that the omission of inflation from the information set prevents the other two variables from being cointegrated. In fact, the sudden rise of inflation at the beginning of the 1970s pushed up nominal wage growth without a corresponding reduction in the unemployment rate.


ls dlw c dlp lu DW = 1.91, R2 = 0.92


Dependent Variable: DLW
Method: Least Squares; Sample(adjusted): 1961 1999; Included observations: 39 after adjusting endpoints
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C           0.152943     0.012837     11.91409      0.0000
DLP         0.832169     0.052361     15.89303      0.0000
LU         -0.059509     0.006058     -9.823363     0.0000
ADF Test Statistic: -4.154235


Now there is cointegration (see the ADF test above). How do we test that the wage growth elasticity to inflation is one? With a restricted regression: ls dlw-dlp c lu, whose results say that cointegration still holds; the residuals are autocorrelated (i.e. persistent)
[Figure: actual, fitted and residuals of the restricted regression, 1965–1999]

genr ecm = resid
[2nd step] Short term dynamics among I(0) variables. The starting point is a dynamic model up to the 4th order (3rd, because the short run dynamics is in differences). Given that the residuals are white noise, we test for dropping some lags: the F deletion test for all the 2nd and 4th order lags is not rejected, with a p-value = 89.6%. The restricted model is depicted below:
Dependent Variable: D(DLW)
Method: Least Squares; Sample(adjusted): 1963 1999; Included observations: 37 after adjusting endpoints
Variable     Coefficient   Std. Error   t-Statistic   Prob.
C             0.005225     0.003472      1.504584     0.1429
D(DLP)        0.835032     0.156380      5.339758     0.0000
D(DLP(-1))   -0.010759     0.224236     -0.047979     0.9621
D(DLW(-1))   -0.214205     0.171202     -1.251178     0.2205
D(LU)        -0.175406     0.043813     -4.003544     0.0004
D(LU(-1))    -0.068000     0.046539     -1.461134     0.1544
ECM(-1)      -0.848525     0.232109     -3.655714     0.0010
R-squared 0.762838; Adjusted R-squared 0.715405; S.E. of regression 0.017325; Sum squared resid 0.009005; Log likelihood 101.4359; Durbin-Watson stat 1.915861; Mean dependent var -0.002961; S.D. dependent var 0.032476; Akaike info criterion -5.104643; Schwarz criterion -4.799875; F-statistic 16.08260; Prob(F-statistic) 0.000000


but stationary:
Durbin-Watson statistic: 1.535165; ADF Test Statistic: -4.790140

and the estimated long run cointegrated relationship is: dlw − dlp = 0.134 − 0.0563 lu. Note that the estimated parameter for the unemployment rate (in logs) is very similar to the previous one.

Residual checks do not show relevant problems. The restriction to zero of the non-significant parameters is not rejected, with a p-value = 29.5%. The restricted model presents very stable recursive estimates (results not reported). The retained uniequational model is:
Δdlwt = 0.003 + 0.76 Δdlpt − 0.16 Δlut − 0.86 ecmt-1 + ε̂t


6. MODELLING SYSTEMS
Problems with the single equation approach. Fundamentally, there are problems when: i. not all the right hand side variables in the cointegration vector are weakly exogenous (loss of information); ii. there is more than one cointegrating vector (when the number of variables of interest is > 2). In the previous long run Phillips curve we were interested in analysing the long run relationship between real wage growth and (the log of) the unemployment rate, so point ii. was not a problem, since between two I(1) variables there can be at most one cointegration relationship. Instead, point i. is still a potential problem, given that in modelling the ECM equation for wages we did not model the determinants of the other variables that enter the right hand side of that equation (the inflation and unemployment rates). This fact does not lead to a loss of information (long run estimator inefficiency) only if the cointegration relationship does not enter the other two equations for short run inflation and unemployment: this is the definition of weak exogeneity. In addition, both the 2nd step Engle-Granger short run estimator and the ARDL model are conditional models, since simultaneous explanatory variables also appear as regressors. The potential advantages of not modelling additional variables and of reducing the number of short run explanatory variables in the equation require that the conditioning variables are weakly exogenous.

Weak exogeneity issue. Though testing for weak exogeneity is better implemented within system (Johansen) cointegration, a first check can be accomplished by the following steps, with reference to the previous Phillips curve analysis. Eviews/phil/system phillips contains 3 reduced form equations, where only I(0) variables are included:
D(DLW) = C(11) + C(12)*D(DLW(-1)) + C(13)*D(DLP(-1)) + C(14)*D(LU(-1)) + C(15)*ECM(-1) D(DLP) = C(21) + C(22)*D(DLW(-1)) + C(23)*D(DLP(-1)) + C(24)*D(LU(-1)) + C(25)*ECM(-1) D(LU) = C(31) + C(32)*D(DLW(-1)) + C(33)*D(DLP(-1)) + C(34)*D(LU(-1)) + C(35)*ECM(-1)

System SUR estimation is required in order to test for the null c(25)=0, c(35)=0: F = 4.77 [9.2%] ⇒ DLP and LU levels contribute to the definition of the Phillips curve long run equilibrium (hidden inside the ECM term), but do not converge to that equilibrium. The D(DLP) and D(LU) equations do not contain information about the long run parameters, since the cointegration relationship does not enter these equations (Eviews/phil/system: philrestr). Note that in the first equation the restricted SUR estimate of the loading parameter is quite similar to the one we reported above (see the Engle-Granger 2nd step). If you add D(DLP) and D(LU) to the first equation, you will reproduce the OLS results (think of the orthogonality condition in OLS). Eviews/phil/system: philcond. It is valid to condition on DLP and LU and follow the uniequational approach because they are weakly exogenous.


Cointegration rank issue. When we operate in a multivariate I(1) framework (n > 2 variables of interest) we can have up to (n−1) cointegrating relationships (r, the cointegration rank), and the single equation approach can lead to serious trouble if there are multiple cointegrating vectors. Example: suppose that the true model has r = 2:
ΔR6t = c + λ1 (R6t-1 − R3t-1) + λ2 (R6t-1 − R12t-1) + εt
     = c + λ1 R6t-1 − λ1 R3t-1 + λ2 R6t-1 − λ2 R12t-1 + εt
     = c + (λ1+λ2) [R6t-1 − λ1/(λ1+λ2) R3t-1 − λ2/(λ1+λ2) R12t-1] + εt
In single equation dynamic modelling we estimate as long run elasticities λ1/(λ1+λ2) and λ2/(λ1+λ2), which in the true model are instead mixtures of the cointegrating parameters (1, −1) and (1, −1) with the loading factors (λ1, λ2). In fact, from Eviews/termine/ we have that:
ARDL: ls drbot6 c drbot6(-1 to -2) drbot3(0 to -2) drbot12(0 to -2) rbot3(-1) rbot6(-1) rbot12(-1)
E-G 1st step: ls rbot6 c rbot3 rbot12

Why is the cointegration rank < n? Generalisation to the case of n variables of the common trend analysis (vector approach):
[5] xt = τt + vt, with τt ~ I(1) and vt ~ I(0), i.e.
(x1t, x2t, ..., xnt)' = (τ1t, τ2t, ..., τnt)' + (v1t, v2t, ..., vnt)'

If there is one cointegration relationship (r = 1), there exists an (n×1) vector β such that:
[6] β'τt = β1τ1t + β2τ2t + ... + βnτnt = 0
Premultiplying [5] by β' and substituting [6], we have:
β'xt = β'τt + β'vt = β'vt
The linear combination β'xt is I(0), because a combination of I(0) variables is always I(0). With one cointegration relationship, one of the n trends can be expressed as a linear combination of the other trends; e.g. (if we normalise for β1) from [6] we have:
τ1t = −β2/β1 τ2t − ... − βn/β1 τnt
In general, there can be multiple linear relationships among the trends, and the number of these relationships is the COINTEGRATING RANK r. In this case, there are r < n linear relations such that B'τt = 0, where B is an (n×r) matrix and 0 is an (r×1) vector. Hence B'xt = B'vt: the r relationships (combinations) among the n variables of interest are stationary.

The ARDL cointegration approach is in: equation/ardl. The Engle-Granger 1st step cointegration is in: equation/eg1step. It is worth noting that: cointegration still persists; the long run parameters are unidentified (mixed).


If r = n, then an n×n full rank matrix B would exist such that B'τt = 0; then B'xt = B'vt ⇒ xt = (B')-1B'vt = vt. But this result is impossible, since xt ~ I(1) and vt ~ I(0) by definition. The non stationarity accounting:
number of variables in xt (n) − number of cointegrating relationships (r) = number of common stochastic trends (n−r)

Johansen's approach is based on an unrestricted vector autoregressive model (UVAR). Let's start with a quite general VAR(2) model with n I(1) variables of interest and standard errors εt:
[7] Xt = c0 + c1 t + A1 Xt-1 + A2 Xt-2 + εt
which is the multivariate analogue of the previous AR(2) model:
xt = c + δ t + α1 xt-1 + α2 xt-2 + εt
which (do remember the basics of the ADF test model) can be reparametrised as:
Δxt = c + δ t + π xt-1 + γ1 Δxt-1 + εt, where π = α1+α2−1 and γ1 = −α2.
In the same way, the VAR model in [7] can be reparametrised:
[7'] ΔXt = c0 + c1 t + Π Xt-1 + Γ1 ΔXt-1 + εt
where a particularly relevant role is played by the n×n matrix Π = A1 + A2 − I, defined as the long run multiplier matrix. In [7'], changes of each variable in X are predicted by a linear combination of past values of all variables, provided that rank(Π) ≠ 0. In the ADF case the main point was testing for unit roots: under the null π = 0 ⇒ presence of a unit root ⇒ stationarity can be achieved only by taking the first differences of the xt variable. The first step of Johansen's cointegration approach is to test for rank(Π). If the null H0: rank(Π) = 0 is not rejected, then the system (VAR) becomes stationary only by imposing n unit roots on the n variables in the Xt vector. On the other side, we noted above that rank(Π) = n is impossible, since the n variables are I(1). Hence, by definition, the cointegrating rank is 0 ≤ r < n.

There is no point in imposing n unit roots on the variables of the system (xt), since r unit roots simplify (evaporate) thanks to cointegration among the n variables in xt.
PUTTING ALL PREVIOUS THINGS TOGETHER

Single equation dynamic modelling (e.g. Pesaran et al.) and the 2-step (Engle-Granger) approaches are both valid (i.e. we can use them without loss of information) ONLY IF two conditions are satisfied: cointegration rank r = 1; weak exogeneity of the explanatory (forcing) variables for the long run parameters of interest. otherwise, we must follow the procedure proposed by Soren Johansen within the framework of the Vector AutoRegressive (VAR) model that achieves both the results of: testing for the cointegration rank; imposing identifying restrictions on the reduced rank regressions.


The matrix Π is a reduced rank matrix, and it can be decomposed as Π = αβ', where α and β are n×r matrices. The r linear combinations are such that β'Xt ~ I(0). From equation [7'], the (reduced form) vector error correction model (VEC) is obtained by substituting the matrix Π with αβ':
[8] ΔXt = c0 + c1 t + αβ'Xt-1 + Γ1 ΔXt-1 + εt
As far as the deterministic components of the VAR are concerned, Eviews allows for 5 different cases (as do many other packages):
1) no intercepts, no trends: c0 = c1 = 0 (unlikely to be relevant);
2) restricted intercepts, no trends: c0 restricted, c1 = 0 (non trended variables);
3) unrestricted intercepts, no trends: c0 ≠ 0, c1 = 0 (for unit root models with drifts);
4) unrestricted intercepts, restricted trends: c0 ≠ 0, c1 restricted (linear deterministic trends in the data);
5) unrestricted intercepts, unrestricted trends: c0 ≠ 0, c1 ≠ 0 (quadratic deterministic trends in the data).
The null hypothesis of the trace test is Hr: rank(Π) = r against the alternative hypothesis of (trend-)stationarity Hn: rank(Π) = n (full rank). The trace statistic is a log-likelihood ratio statistic, and the appropriate critical values for all five cases are reported by Eviews. To determine the number of cointegrating relations r, subject to the assumptions made about the trends in the series, we can proceed sequentially from r = 0 to r = n−1 until we fail to reject. The first row in the upper table tests the hypothesis of no cointegration, the second row tests the hypothesis of one cointegrating relation, the third row tests

the hypothesis of two cointegrating relations, and so on, all against the alternative hypothesis of full rank, i.e. all series in the VAR are stationary. After the value of r is estimated, the second step is to identify β. When r = 1 there are no problems: the normalisation (unit) restriction on the parameter of what economic theory suggests is the dependent variable yields a unique estimate up to a scaling parameter. However, when r > 1, the problem of identification arises. The appropriate procedure is to estimate the cointegrating relationships subject to a priori restrictions from economic theory. Suppose there are r cointegrating relations and β is an n×r matrix; then we need at least r restrictions (including the normalisation restriction) on each of the r cointegrating relationships. The exact identification of the whole set of cointegrating parameters requires r×r restrictions. Remember: the source of these identifying restrictions is usually a priori theory. The role of theory in providing these restrictions is discussed in Pesaran (1997). Unfortunately, Eviews 3.1 is very rough in approaching the issue; since version 4 it has been considerably improved.
General practical advice
The order p of the VAR often plays a crucial role in the subsequent analyses, and particular attention must be devoted to obtaining vector white noise residuals. On the other side, after the order is selected sufficiently long, the remaining


observations must still be enough for asymptotic theory to work reasonably well ⇒ a difficult balancing act. It is also important to remember that the lag specification that EViews prompts you to enter in the VAR refers to lags of the first difference terms in the VEC. For example, "1 1" specifies a model involving a regression of the first differences on one lag of the first difference. The VAR approach is highly data intensive, particularly if n is large: when n = 4 and p = 5, each equation of the VAR contains 20 unknown parameters (plus possible deterministic components). The five cointegrating VAR cases presume that the variables are I(0) or I(1), and that the nature of the trends in the Xt variables has been ascertained (by plotting; no econometric theories). Sometimes the trace test outcome is overridden by a priori information from the long run predictions of a suitable economic model (sensitivity analysis to the choice of r is also important). Eviews/phil/open group: dlwp lu /view/cointegration test/ model 3/
Series: DLWP LU   Lags interval: 1 to 1

 Eigenvalue   Likelihood   5 Percent        1 Percent        Hypothesized
              Ratio        Critical Value   Critical Value   No. of CE(s)
 0.325332     15.51374     15.41            20.04            None *
 0.025426     0.952944     3.76             6.65             At most 1

 *(**) denotes rejection of the hypothesis at 5%(1%) significance level
 L.R. test indicates 1 cointegrating equation(s) at 5% significance level

 Unnormalized Cointegrating Coefficients:
   DLWP         LU
   12.29458     0.620128
  -2.007333    -0.481619

 Normalized Cointegrating Coefficients: 1 Cointegrating Equation(s)
   DLWP         LU           C
   1.000000     0.050439    -0.122536
               (0.00732)

It is worth noting that Johansen's approach provides long-run estimates quite similar to those of the Engle-Granger first-step approach since, as we previously noted, the latter meets both requirements to avoid the main specification and estimation problems.

Eviews: procs/make a vector autoregression/1 1/VEC:

 Sample(adjusted): 1963 1999
 Included observations: 37 after adjusting endpoints
 Standard errors & t-statistics in parentheses

 Cointegrating Eq:   CointEq1
 DLWP(-1)            1.000000
 LU(-1)              0.050439
                    (0.00732)
                    (6.88642)
 C                  -0.122536

 Error Correction:   D(DLWP)      D(LU)
 CointEq1           -0.828431    -0.201773
                    (0.24889)    (1.03614)
                   (-3.32854)   (-0.19474)
 D(DLWP(-1))        -0.014596    -0.978637
                    (0.18953)    (0.78904)
                   (-0.07701)   (-1.24029)
 D(LU(-1))          -0.072898     0.149233
                    (0.04059)    (0.16899)
                   (-1.79586)    (0.88310)
 C                   0.000246     0.027836
                    (0.00360)    (0.01497)
                    (0.06849)    (1.85958)

 R-squared           0.368937     0.143114
 Adj. R-squared      0.311568     0.065216
 S.E. equation       0.020244     0.084276
 Log likelihood      93.91271     41.14123
 Mean dependent     -0.002026     0.034860
 S.D. dependent      0.024398     0.087166

 Determinant Residual Covariance   1.71E-06
 Log Likelihood                    140.6207


The next steps are the weak exogeneity test, together with both long-run parameter estimation and short-run dynamics modelling; then, residual tests (var_we). Version 4 of Eviews is required here.
 Vector Error Correction Estimates
 Sample(adjusted): 1963 1999
 Included observations: 37 after adjusting endpoints
 Standard errors in ( ) & t-statistics in [ ]
 Cointegration Restrictions: B(1,1)=1, A(2,1)=0
 Convergence achieved after 3 iterations.
 Restrictions identify all cointegrating vectors
 LR test for binding restrictions (rank = 1):
 Chi-square(1)  0.040194   Probability  0.841102

 Cointegrating Eq:   CointEq1
 DLWP(-1)            1.000000
 LU(-1)              0.050082
                    (0.00768)
                   [ 6.52274]
 C                  -0.121878

 Error Correction:   D(DLWP)      D(LU)
 CointEq1           -0.851481     0.000000
                    (0.21370)    (0.00000)
                   [-3.98449]    [ NA ]
 D(DLWP(-1))        -0.014914    -0.984184
                    (0.18923)    (0.78829)
                   [-0.07881]   [-1.24850]
 D(LU(-1))          -0.073151     0.149749
                    (0.04059)    (0.16907)
                   [-1.80235]   [ 0.88572]
 C                   0.000254     0.027807
                    (0.00359)    (0.01497)
                   [ 0.07055]   [ 1.85741]

 R-squared           0.369650     0.143011
 Adj. R-squared      0.312345     0.065102
 Sum sq. resids      0.013508     0.234409
 S.E. equation       0.020232     0.084281
 F-statistic         6.450618     1.835631
 Log likelihood      93.93361     41.13899
 Akaike AIC         -4.861276    -2.007513
 Schwarz SC         -4.687123    -1.833360
 Mean dependent     -0.002026     0.034860
 S.D. dependent      0.024398     0.087166

Residual tests (var_we):

 VEC Residual Serial Correlation LM Tests
 H0: no serial correlation at lag order h; included observations: 37
 Lags   LM-Stat     Prob
 1      3.035956    0.5518
 2      7.999531    0.0916
 3      3.388368    0.4951
 Probs from chi-square (4 df.)

 VEC Residual Normality Tests
 H0: residuals are multivariate normal; included observations: 37
 Component   Skewness     Chi-sq      df   Prob.
 1           0.612554     2.313868    1    0.1282
 2          -0.908903     5.094309    1    0.0240
 Joint                    7.408177    2    0.0246
 Component   Kurtosis     Chi-sq      df   Prob.
 1           3.517597     0.413023    1    0.5204
 2           3.805923     1.001330    1    0.3170
 Joint                    1.414353    2    0.4930
 Component   Jarque-Bera  df   Prob.
 1           2.726891     2    0.2558
 2           6.095639     2    0.0475
 Joint       8.822530     4    0.0657

 VEC Residual Heteroskedasticity Tests: Includes Cross Terms
 Included observations: 37
 Joint test: Chi-sq 20.47036   df 27   Prob. 0.8104
 Individual components:
 Dependent   R-squared   F(9,27)     Prob.    Chi-sq(9)   Prob.
 res1*res1   0.092625    0.306241    0.9662   3.427126    0.9449
 res2*res2   0.292728    1.241647    0.3122   10.83092    0.2875
 res2*res1   0.211538    0.804877    0.6155   7.826916    0.5517

Previous single-equation results are strongly confirmed.


Another example can be drawn from the Italian Treasury bill interest rates. Eviews/termine/select rbot3 rbot6 rbot12/ double click/open group/view/coint./ 1 1/case 2
 Sample: 1979:01 1998:09   Included observations: 223
 Test assumption: No deterministic trend in the data
 Series: RBOT3 RBOT6 RBOT12   Lags interval: 1 to 1

 Eigenvalue   Likelihood   5 Percent        1 Percent        Hypothesized
              Ratio        Critical Value   Critical Value   No. of CE(s)
 0.236692     86.26466     34.91            41.07            None **
 0.103856     26.03375     19.96            24.60            At most 1 **
 0.007064     1.580939     9.24             12.97            At most 2

 *(**) denotes rejection of the hypothesis at 5%(1%) significance level
 L.R. test indicates 2 cointegrating equation(s) at 5% significance level

 Unnormalized Cointegrating Coefficients:
   RBOT3        RBOT6        RBOT12       C
  -22.66285     26.88986    -4.529554     0.042690
   1.086596     16.60378   -17.67262     -0.017085
   1.133854    -1.632210     0.407011    -0.055662

 Normalized Cointegrating Coefficients: 1 Cointegrating Equation(s)
   RBOT3        RBOT6        RBOT12       C
   1.000000    -1.186517     0.199867    -0.001884
               (0.09561)    (0.09712)    (0.00123)
 Log likelihood  2850.842

 Normalized Cointegrating Coefficients: 2 Cointegrating Equation(s)
   RBOT3        RBOT6        RBOT12       C
   1.000000     0.000000    -0.986434    -0.002881
                            (0.02471)    (0.00326)
   0.000000     1.000000    -0.999818    -0.000840
                            (0.01949)    (0.00257)
 Log likelihood  2863.069

As before, in what follows we have to switch to Eviews 4 in order to deepen the multivariate cointegration analysis.

Step (a): imposing long-run over-identification restrictions

 Vector Error Correction Estimates
 Sample(adjusted): 1980:03 1998:09
 Included observations: 223 after adjusting endpoints
 Standard errors in ( ) & t-statistics in [ ]
 Cointegration Restrictions:
   B(1,1)=1, B(1,2)=0, B(2,1)=0, B(2,2)=1, B(1,3)=-1, B(2,3)=-1
 Convergence achieved after 2 iterations.
 Restrictions identify all cointegrating vectors
 LR test for binding restrictions (rank = 2):
 Chi-square(2)  1.926163   Probability  0.381715

 Cointegrating Eq:   CointEq1     CointEq2
 RBOT3(-1)           1.000000     0.000000
 RBOT6(-1)           0.000000     1.000000
 RBOT12(-1)         -1.000000    -1.000000
 C                  -0.001166    -0.000818
                    (0.00094)    (0.00075)
                   [-1.23619]   [-1.09389]

 Error Correction:   D(RBOT3)     D(RBOT6)     D(RBOT12)
 CointEq1           -0.528178     0.025085    -0.041694
                    (0.14886)    (0.13285)    (0.11349)
                   [-3.54818]   [ 0.18882]   [-0.36737]
 CointEq2            0.369898    -0.193926     0.091514
                    (0.20690)    (0.18465)    (0.15775)
                   [ 1.78778]   [-1.05022]   [ 0.58013]

Step (b): weak exogeneity test for the 12-month TB rate:

 Vector Error Correction Estimates
 Sample(adjusted): 1980:03 1998:09
 Included observations: 223 after adjusting endpoints
 Standard errors in ( ) & t-statistics in [ ]
 Cointegration Restrictions:
   B(1,1)=1, B(1,2)=0, B(2,1)=0, B(2,2)=1, B(1,3)=-1, B(2,3)=-1,
   A(3,1)=0, A(3,2)=0
 Convergence achieved after 2 iterations.
 LR test for binding restrictions (rank = 2):
 Chi-square(4)  2.286812   Probability  0.683171

 Cointegrating Eq:   CointEq1     CointEq2
 RBOT3(-1)           1.000000     0.000000
 RBOT6(-1)           0.000000     1.000000
 RBOT12(-1)         -1.000000    -1.000000
 C                  -0.001085    -0.000729
                    (0.00095)    (0.00075)
                   [-1.14796]   [-0.97187]

 Error Correction:   D(RBOT3)     D(RBOT6)     D(RBOT12)
 CointEq1           -0.485189     0.070152     0.000000
                    (0.09290)    (0.05318)    (0.00000)
                   [-5.22280]   [ 1.31917]   [ NA ]
 CointEq2            0.275717    -0.292604     0.000000
                    (0.12920)    (0.07396)    (0.00000)
                   [ 2.13405]   [-3.95632]   [ NA ]
 D(RBOT3(-1))        0.132715     0.176900     0.028579
                    (0.13405)    (0.11962)    (0.10222)
                   [ 0.99002]   [ 1.47880]   [ 0.27958]
 D(RBOT6(-1))       -0.168470    -0.155026     0.015922
                    (0.20015)    (0.17860)    (0.15262)
                   [-0.84174]   [-0.86799]   [ 0.10432]
 D(RBOT12(-1))       0.022412     0.109891     0.060543
                    (0.20417)    (0.18220)    (0.15569)
                   [ 0.10977]   [ 0.60315]   [ 0.38886]

Exercise: cointegration and forward (ft) and spot (st) D-Mark/US $ exchange rates (Zivot, 2000)

Economic theory (in pills ...)
The Forward Rate Unbiasedness Hypothesis (FRUH) is based on the rational expectations and risk neutrality hypotheses, and defines:
(1) the rational expectations forecast error: ut+1 = st+1 - Et(st+1), such that Et(ut+1) = 0;
(2) risk neutrality of the forward rate: ft = Et(st+1).
Together, (1) + (2) lead to the relationships:
  level:        st+1 = ft + ut+1
  difference:   Dst+1 = (ft - st) + ut+1
where ft - st is the forward premium. The FRUH hypothesis can be tested in both (level and difference) models.

Variables of interest (SF_data.wf1, Eviews quarterly database, period 86.1-00.4):
  ldmus    DM-US$ spot exchange rate (logs)
  ldmusf   DM-US$ forward (3-m) exchange rate (logs)

Empirical analysis
- spot, forward and premium plots;
- unit root tests (testing down from pmax = 6):
    ldmus     ADF(5) = -2.699
    ldmusf    ADF(4) = -1.824
    ldmus_fp  ADF(0) = -6.085**
- cointegrated VAR approach, from UVAR(1) [lags 1 1]: diagnostic checks on residuals are OK, then pass to the rank test [lags 0 1]: r = 1;
- estimate the VEqCM (with restricted long run):

 Vector Error Correction Estimates
 Sample(adjusted): 1986:2 2000:4
 Included observations: 59 after adjusting endpoints
 Standard errors in ( ) & t-statistics in [ ]
 Cointegration Restrictions: B(1,1)=1, B(1,2)=-1
 Restrictions identify all cointegrating vectors
 LR test for binding restrictions (rank = 1):
 Chi-square(1)  3.054771   Probability  0.080500

 Cointegrating Eq:   CointEq1
 LDMUSF(-1)          1.000000
 LDMUS(-1)          -1.000000

 Error Correction:   D(LDMUSF)    D(LDMUS)
 CointEq1            0.088322     0.826023
                    (0.42940)    (0.37079)
                   [ 0.20569]   [ 2.22773]

 R-squared          -0.000313     0.078012
 Sum sq. resids      0.229573     0.171185
 S.E. equation       0.062914     0.054327
 Log likelihood      79.98022     88.63774
 Mean dependent     -0.002014    -0.001662
 S.D. dependent      0.062904     0.056579

In addition, the theory predicts that the forward rate is weakly exogenous, and that the loading parameter in the spot rate equation is 1. In order to check these restrictions, pass to the system

SYS_DMUS_FS:
  D(LDMUSF) = C(1)*( LDMUSF(-1) - 1*LDMUS(-1) )
  D(LDMUS)  = C(2)*( LDMUSF(-1) - 1*LDMUS(-1) )

obtain the FIML estimate, and impose the Wald test c(1)=0, c(2)=1. The results are reproduced below:

 System: SYS_DMUS_FS
 Estimation Method: Full Information Maximum Likelihood (Marquardt)
 Sample: 1986:2 2000:4   Included observations: 59
 Total system (balanced) observations 118
 Convergence achieved after 1 iteration

         Coefficient   Std. Error   z-Statistic   Prob.
 C(1)    0.088327      0.448442     0.196963      0.8439
 C(2)    0.826024      0.372911     2.215073      0.0268

 Log Likelihood                    245.2717
 Determinant residual covariance   8.40E-07

 Equation: D(LDMUSF) = C(1)*( LDMUSF(-1) - 1*LDMUS(-1) )
 Observations: 59
 R-squared          -0.000313    Mean dependent var  -0.002014
 Adjusted R-squared -0.000313    S.D. dependent var   0.062904
 S.E. of regression  0.062914    Sum squared resid    0.229573
 Durbin-Watson stat  1.991218

 Equation: D(LDMUS) = C(2)*( LDMUSF(-1) - 1*LDMUS(-1) )
 Observations: 59
 R-squared           0.078012    Mean dependent var  -0.001662
 Adjusted R-squared  0.078012    S.D. dependent var   0.056579
 S.E. of regression  0.054327    Sum squared resid    0.171185
 Durbin-Watson stat  1.959340

 Wald Test: System: SYS_DMUS_FS
 Null Hypothesis: C(1)=0, C(2)=1
 Chi-square  4.134558   Probability  0.126530

The final (empirical) model is:
  Dft = eft                       (ft is a random walk)
  Dst = ft-1 - st-1 + est,  i.e.  st = ft-1 + est

It is clear that our empirical model is in line with the FRUH approach, where eft is the forward shock and est = st - ft-1 is the realised profit/loss from speculation. Note that both errors are white noise (unpredictable) processes.

MODELLING A SMALL MACROECONOMIC SYSTEM: FROM VAR TO SEM (Bagliano-Golinelli-Morana, 2003)

The theoretical model
Following Gerlach-Svensson (2001) we nest in the inflation equation both the Phillips curve and the price gap effects (Hallman-Porter-Small, 1991):

  pt = pet + ay qrt-1 + am [pt-1 - p*t-1] + ept

where:
  pet = pt-1                                  (rw model of expectations)
  qrt = yt - y*t                              (output gap)
  y*t = y0 + y*t-1 + eyt                      (rw model of potential output)
  mt - pt = m0 + m1 yt + m2 (lt - st) + emt   (real money demand)
  p*t = mt - [m0 + m1 y*t + m2 (l*t - s*t)]   (P-star)

The long run solution of the model:
  p = pe
  y = y*
  l* = f0f + p
  l* = f0s + s*
  m* - p* = f0 + m1 y*,   where f0 = m0 + m2 f0s

Univariate analysis
..... output omitted .....
The structural model long run solution predicts the following stochastic properties of the variables:
  pe and y* are I(1);  p, y ~ I(1);  qr ~ I(0)
  l and p are CI(1,1);  l and s are CI(1,1):  l, s ~ I(1); (l-s) ~ I(0); (l-p) ~ I(0)
  m-p and y are CI(1,1):  m-p ~ I(1); [(m-p) - y] ~ I(1)
The outcomes of the univariate Dickey-Fuller (DF) integration tests confirm the predictions of the structural model long run solution: see Bagliano-Golinelli-Morana (2002, Table 1). The structural model thus predicts a stochastic behaviour of the variables that is coherent with the univariate DF test outcomes.

Multivariate analysis
Johansen's (1995) cointegrated VAR approach is applied to the vector of the endogenous variables of the system:
  xt = [(m-p)t  yt  lt  st  pt  qrt]
Note: from the univariate analysis, the variables m-p, y, l, s, and p are I(1); qr is I(0).
The UVAR(3) model is chosen on the basis of both the AIC criterion and residual diagnostic test results that support third order dynamics. The corresponding Johansen rank test results are:

Eviews/BaGoMo_CUP/BGM_UVAR/view/coint.test/3

 Sample: 1981:4 1997:3   Included observations: 64
 Trend assumption: Linear deterministic trend
 Series: MP S Y L DP YGAP
 Lags interval (in first differences): 1 to 2
 Unrestricted Cointegration Rank Test
 Hypothesized                Trace       5 Percent   1 Percent
 No. of CE(s)   Eigenvalue   Statistic   CV          CV
 None **        0.410572     106.9796    94.15       103.18
 At most 1 *    0.324775     73.14907    68.52       76.07
 At most 2 *    0.284830     48.01566    47.21       54.46
 At most 3      0.225796     26.56064    29.68       35.65
 At most 4      0.113139     10.18178    15.41       20.04
 At most 5      0.038272     2.497522    3.76        6.65
 *(**) denotes rejection of the hypothesis at the 5%(1%) level

rank = 4 (at 10% significance level) ....
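The P-star construction used in the inflation equation above is just arithmetic once the money-demand parameters are given. A minimal sketch, with entirely hypothetical parameter values and simulated (log) series rather than the euro-area estimates:

```python
# Sketch: build p* and the price gap from a calibrated money demand.
# All parameters and series below are hypothetical illustrations.
import numpy as np

m0, m1, m2 = 0.5, 1.65, -0.4        # hypothetical money-demand params

rng = np.random.default_rng(3)
T = 100
m = np.cumsum(rng.normal(0.010, 0.02, T))        # log nominal money
p = np.cumsum(rng.normal(0.005, 0.01, T))        # log price level
y_star = np.cumsum(rng.normal(0.005, 0.01, T))   # potential output proxy
l_star, s_star = 0.05, 0.03                      # equilibrium rates

# P-star: p* = m - [m0 + m1*y* + m2*(l* - s*)]
p_star = m - (m0 + m1 * y_star + m2 * (l_star - s_star))

# Price gap p - p*: the term entering the inflation equation with
# coefficient am in the Gerlach-Svensson specification.
price_gap = p - p_star
print("last price gap:", price_gap[-1])
```

In the empirical work the inputs would of course be the estimated long-run parameters and fitted potential output, not calibrated constants.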


... because it is in line with the theoretical model outlined above. Then, we can suppose the following four long run relationships: a simple money demand function; a relation between the inflation rate and the long term interest rate (Fisher parity); a relation between the short and the long term interest rates (term structure of interest rates); output gap stationarity. Are the empirical realisations in line with such an interpretation?
a) start with the unrestricted VAR (UVAR): the Haavelmo distribution;
b) impose the cointegration rank restrictions, and
c) test for the long run over-identifying restrictions (CVAR).
Eviews/BaGoMo_CUP_V4/BGM_CVARlong/
 Vector Error Correction Estimates
 Sample: 1981:4 1997:3   Included observations: 64
 Standard errors in ( ) & t-statistics in [ ]
 Cointegration Restrictions:
   B(1,1)=1, B(1,3)=0, B(1,4)=0, B(1,5)=0, B(1,6)=0,
   B(2,1)=0, B(2,2)=0, B(2,3)=1, B(2,4)=-1, B(2,5)=0, B(2,6)=0,
   B(3,1)=0, B(3,2)=0, B(3,3)=0, B(3,4)=1, B(3,5)=-1, B(3,6)=0,
   B(4,1)=0, B(4,2)=0, B(4,3)=0, B(4,4)=0, B(4,5)=0, B(4,6)=1
 Convergence achieved after 6 iterations.
 Restrictions identify all cointegrating vectors
 LR test for binding restrictions (rank = 4):
 Chi-square(7)  15.48266   Probability  0.030287

 Cointegrating Eq:  CointEq1     CointEq2     CointEq3     CointEq4
 MP(-1)             1.000000     0.000000     0.000000     0.000000
 Y(-1)             -1.663368     0.000000     0.000000     0.000000
                   (0.01312)
                  [-126.746]
 S(-1)              0.000000     1.000000     0.000000     0.000000
 L(-1)              0.000000    -1.000000     1.000000     0.000000
 DP(-1)             0.000000     0.000000    -1.000000     0.000000
 YGAP(-1)           0.000000     0.000000     0.000000     1.000000
 C                  12.16091     0.007432    -0.051891     0.007849

 Error Correction:  D(MP)       D(Y)        D(S)        D(L)        D(DP)       D(YGAP)
 CointEq1          -0.157779    0.090382   -0.087051    0.043514    0.397329   -0.100884
                   (0.05669)   (0.09040)   (0.06425)   (0.05339)   (0.14666)   (0.09198)
                  [-2.78299]  [ 0.99976]  [-1.35493]  [ 0.81500]  [ 2.70920]  [-1.09675]
 CointEq2           0.083287   -0.211621    0.007442    0.136887   -0.305398   -0.086949
                   (0.07453)   (0.11885)   (0.08446)   (0.07019)   (0.19281)   (0.12093)
                  [ 1.11743]  [-1.78054]  [ 0.08811]  [ 1.95015]  [-1.58394]  [-0.71900]
 CointEq3          -0.111188   -0.166579   -0.113507   -0.127045    0.577208   -0.130594
                   (0.07612)   (0.12138)   (0.08626)   (0.07169)   (0.19691)   (0.12350)
                  [-1.46067]  [-1.37234]  [-1.31583]  [-1.77221]  [ 2.93128]  [-1.05740]
 CointEq4          -0.071663    0.111143   -0.001656    0.027848    0.297839   -0.181116
                   (0.04907)   (0.07825)   (0.05561)   (0.04621)   (0.12694)   (0.07962)
                  [-1.46036]  [ 1.42037]  [-0.02979]  [ 0.60260]  [ 2.34628]  [-2.27483]

 (... short run components omitted ...)

d) test for the long run over-identifying restrictions (CVAR) plus a number of restrictions on loading parameters

Eviews/BaGoMo_CUP/var/BGM_CVARload/

 Vector Error Correction Estimates
 Sample: 1981:4 1997:3   Included observations: 64
 Standard errors in ( ) & t-statistics in [ ]
 Cointegration Restrictions: the same as above, plus the additional
 loading restrictions
   A(2,1)=0, A(3,1)=0, A(4,1)=0, A(6,1)=0,
   A(1,2)=0, A(3,2)=0, A(5,2)=0, A(6,2)=0,
   A(1,3)=0, A(2,3)=0, A(3,3)=0, A(6,3)=0,
   A(1,4)=0, A(2,4)=0, A(4,4)=0
 Convergence achieved after 8 iterations.
 Restrictions identify all cointegrating vectors
 LR test for binding restrictions (rank = 4):
 Chi-square(22)  38.00057   Probability  0.018319

 Cointegrating Eq:  CointEq1     CointEq2     CointEq3     CointEq4
 MP(-1)             1.000000     0.000000     0.000000     0.000000
 Y(-1)             -1.645856     0.000000     0.000000     0.000000
                   (0.02185)
 S(-1)              0.000000     1.000000     0.000000     0.000000
 L(-1)              0.000000    -1.000000     1.000000     0.000000
 DP(-1)             0.000000     0.000000    -1.000000     0.000000
 YGAP(-1)           0.000000     0.000000     0.000000     1.000000
 C                  12.03693     0.007432    -0.051891     0.007849

 Error Correction:  D(MP)       D(Y)        D(S)        D(L)        D(DP)       D(YGAP)
 CointEq1          -0.072375    0.000000    0.000000    0.000000    0.390804    0.000000
                   (0.03119)                                       (0.11450)
                  [-2.32083]    [ NA ]      [ NA ]      [ NA ]    [ 3.41305]    [ NA ]
 CointEq2           0.000000   -0.203311    0.000000    0.169642    0.000000    0.000000
                               (0.11461)               (0.06102)
                    [ NA ]    [-1.77394]    [ NA ]    [ 2.78021]    [ NA ]      [ NA ]
 CointEq3           0.000000    0.000000    0.000000   -0.125265    0.524819    0.000000
                                                       (0.05799)   (0.16397)
                    [ NA ]      [ NA ]      [ NA ]    [-2.16003]  [ 3.20077]    [ NA ]
 CointEq4           0.000000    0.000000    0.058235    0.000000    0.274028   -0.123909
                                           (0.02894)               (0.09958)   (0.04424)
                    [ NA ]      [ NA ]    [ 2.01196]    [ NA ]    [ 2.75175]  [-2.80081]
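The "LR test for binding restrictions" reported throughout these EViews outputs is simply twice the log-likelihood drop caused by the over-identifying restrictions, referred to a chi-square with as many degrees of freedom as restrictions. A small sketch (the log-likelihood levels below are made up; only the difference matters, and it is chosen to reproduce the Chi-square(22) = 38.00057 statistic from the last output):

```python
# Sketch: likelihood-ratio test for binding (over-identifying)
# restrictions, as printed by EViews after a restricted VEC.
from scipy.stats import chi2

def lr_test(loglik_u, loglik_r, df):
    """LR statistic 2*(logL_u - logL_r) and its chi-square p-value."""
    stat = 2.0 * (loglik_u - loglik_r)
    return stat, chi2.sf(stat, df)

# Hypothetical log-likelihoods whose gap matches the output above.
stat, p = lr_test(loglik_u=100.0, loglik_r=100.0 - 38.00057 / 2, df=22)
print("LR stat:", round(stat, 5), "p-value:", round(p, 6))
```

A p-value below the chosen significance level (here about 0.018) signals that the imposed long-run and loading restrictions are jointly rejected at 5%, though not at 1%.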

Main results from the multivariate analysis:
- the elasticity of money demand to income is significantly bigger than one: this confirms a usual result in the literature, sometimes explained by the omission of wealth;
- the long-run effect of inflation on the interest rate is suggested by the Fisher equation;
- positive deviations from the long run relationship between real money and GDP cause upward pressures on inflation and output, and the equilibrium-correcting effect on real money;
- as far as interest rate-inflation disequilibria are concerned, they significantly affect only the level of the interest rate;
- increases in the capacity utilisation rate have positive effects on inflation (Phillips curve effect), and feed back on own levels;
- both P-star and Phillips curve effects are significant explanations of short run inflation behaviour (as suggested by the Gerlach-Svensson paper).

Short run estimates (omitted) tell us that it is possible to model the system further. Such further short run parameter restrictions can be imposed on the model without changing (re-estimating) the long run (cointegration) estimates.

The Structural Econometric Model (SEM) is the outcome of this further modelling phase. Here, integration problems are solved by the imposition of the cointegrated combinations, so that the general-to-specific modelling procedure can rely on statistics that have standard distributions.


SOME GUIDELINES FOR THE PREPARATION OF APPLIED ECONOMETRICS PROJECTS
(from M. Hashem Pesaran's Lectures)

1. Introduction. This section should identify the issues and state clearly the purpose of the analysis. It should also give a brief account of the literature, together with specific references.
2. Theoretical considerations. This section should describe the economic model (or the relationship) to be analysed, and usually contains a brief review of the existing theory and related evidence, the econometric specification, and the a priori information concerning the parameters of the economic model (i.e. their signs, range of variation, or their most likely values).
3. Data sources and descriptions. This section should discuss the data used in the study and give their sources. In particular, attention should be paid to the relationship between the theoretical concepts in the economic model (see section 2) and the available data.
4. Econometric considerations. This section should describe the econometric methods, i.e. the estimation method, the inference procedure, diagnostic checks, etc.
5. Empirical results. This section should report the results, comment on their statistical and economic significance, and suggest ways in which the results may be improved and extended.
6. Conclusions. This section should give a very brief account of the main findings of the research.
7. Bibliography. A complete list of the references cited.

READING LIST

Introductory readings
- Kennedy P. (2003), A guide to econometrics, 5th ed., Blackwell
- Cuthbertson K., Hall S.G., Taylor M.P. (1992), Applied econometric techniques, Philip Allan
- Granger C.W.J. (1999), Empirical Modelling in Economics, Cambridge University Press
- Stock J.H., Watson M.W. (2003), Introduction to Econometrics, Addison Wesley, part four

Classical time series analysis
- Box G.E.P., Jenkins G.M., Reinsel G.C. (1994), Time series analysis: forecasting and control, 3rd edition, Prentice Hall, ch. 1-9
- Enders W. (1995), Applied econometric time series, Wiley, ch. 2
- Granger C.W.J., Newbold P. (1986), Forecasting economic time series, Academic Press, ch. 1-3
- Maddala G.S. (1992), Introduction to econometrics, 2nd edition, Macmillan, ch. 13
- Mills T.C. (1990), Time series techniques for economists, Cambridge University Press, ch. 5-8
- Mills T.C. (1993), The econometric modelling of financial time series, Cambridge University Press, ch. 2
- Pindyck R.S., Rubinfeld D.L. (1991), Econometric models and economic forecasts, 3rd edition, McGraw-Hill, ch. 15-19

Trends and unit roots
- Campbell J.Y., Perron P. (1991), Pitfalls and opportunities: What macroeconomists should know about unit roots, in Blanchard O.J., Fischer S. (eds.), NBER Macroeconomics Annual 1991, MIT Press
- Cochrane J. (1991), Comment, in Campbell and Perron, cit.
- Elliott G., Rothenberg T.J., Stock J.H. (1996), Efficient tests for an autoregressive unit root, Econometrica, vol. 64(4)


- Enders W. (1995), cit., Wiley, ch. 3-4
- Hall A. (1994), Testing for a unit root in time series with pretest data-based model selection, Journal of Business and Economic Statistics, n. 12
- Maddala G.S. (1992), cit., ch. 6.10
- Mills T.C. (1990), cit., ch. 11.2
- Mills T.C. (1993), cit., ch. 3.1
- Ng S., Perron P. (1995), Unit root test in ARIMA models with data dependent methods for the selection of the truncation lag, Journal of the American Statistical Association, n. 90
- Ng S., Perron P. (2001), Lag length selection and the construction of unit root tests with good size and power, Econometrica, 69(6)
- Perron P. (1997), Further evidence on breaking trend functions in macroeconomic variables, Journal of Econometrics, vol. 80
- Stock J.H. (1994), Unit roots, structural breaks and trends, in Engle R.F., McFadden D.L. (eds.), Handbook of econometrics, vol. 4, ch. 47

Spurious regressions
- Granger C.W.J., Newbold P. (1974), Spurious regressions in econometrics, Journal of Econometrics, n. 2
- Granger C.W.J., Newbold P. (1986), cit., ch. 6.4
- Phillips P.C.B. (1986), Understanding spurious regression in econometrics, Journal of Econometrics, vol. 33
- Sims C., Stock J., Watson M. (1990), Inference in linear time series models with some unit roots, Econometrica, vol. 58

Dynamic specification
- Hendry D.F., Pagan A.R., Sargan J.D. (1984), Dynamic specification, in Griliches Z., Intriligator M.D. (eds.), Handbook of Econometrics, vol. II, North Holland

Cointegration analysis
- Bagliano F., Golinelli R., Morana C. (2003), Inflation modelling in the euro area, Cambridge University Press (Golinelli web page)
- Dickey D.A., Rossana R.J. (1994), Cointegrated time series: a guide to estimation and hypothesis testing, Oxford Bulletin of Economics and Statistics, vol. 56, n. 3
- Enders W. (1995), cit., Wiley, ch. 6
- Gerlach S. and Svensson L.E.O. (2000), Money and inflation in the Euro area: a case of monetary indicators?, NBER Working Papers
- Granger C.W.J. and Swanson N. (1996), Future developments in the study of cointegrated variables, Oxford Bulletin of Economics and Statistics, vol. 58
- Gregory A.W. and Hansen B.E. (1996a), Residual-based tests for cointegration in models with regime shifts, Journal of Econometrics, vol. 70
- Gregory A.W. and Hansen B.E. (1996b), Tests for cointegration in models with regime and trend shifts, Oxford Bulletin of Economics and Statistics, vol. 58
- Hakkio C.S., Rush M. (1991), Cointegration: how short is the long run?, Journal of International Money and Finance, vol. 10
- Hall A.D., Anderson H.M., Granger C.W.J. (1992), A cointegration analysis of treasury bill yields, The Review of Economics and Statistics, vol. 74
- Hallman J.J., Porter R.D., Small D.H. (1991), Is the price level tied to the M2 monetary aggregate in the long run?, American Economic Review, vol. 81, pp. 841-858
- Harris R. (1995), Cointegration analysis in econometric modelling, Prentice Hall
- Johansen S. (1997), Mathematical and statistical modelling of cointegration, EUI Working Paper, ECO No. 97/14
- Maddala G.S. (1992), cit., ch. 14
- Mills T.C. (1993), cit., ch. 6.1-6.6
- Mills T.C. (1998), Recent developments in modelling nonstationary vector autoregressions, Journal of Econometric Surveys, vol. 12, n. 3
- Pesaran M.H. (1997), The role of economic theory in modelling the long run, Economic Journal, vol. 107


- Pesaran M.H., Shin Y., Smith R.J. (2001), Bounds testing approaches to the analysis of level relationships, Journal of Applied Econometrics (special issue in honour of J.D. Sargan on the theme Studies in Empirical Macroeconometrics, eds. Hendry D.F. and Pesaran M.H.)
- Stock J.H., Watson M.W. (2001), Vector Autoregressions, Journal of Economic Perspectives, vol. 15, n. 4, Fall, pp. 101-115
- Watson M.W. (1994), Vector autoregression and cointegration, in Engle R.F., McFadden D.L. (eds.), Handbook of econometrics, vol. 4, ch. 47
- Zivot E. (2000), Cointegration and forward and spot exchange rate regressions, Journal of International Money and Finance, vol. 19, pp. 785-812

Textbooks
- Banerjee A., Dolado J.J., Galbraith J.W., Hendry D.F. (1993), Co-integration, error correction and the econometric analysis of non-stationary data, Oxford University Press
- Clements M.P., Hendry D.F. (1999), Forecasting Non-Stationary Economic Time Series, The MIT Press
- Hamilton J. (1994), Time series analysis, Princeton University Press
- Hendry D.F. (1995), Dynamic econometrics, Oxford University Press
- Johansen S. (1995), Likelihood-based inference in cointegrated vector autoregressive models, Oxford University Press

Applied textbooks
- Favero C. (2001), Applied macroeconometrics, Oxford University Press
- Lütkepohl H. and Krätzig M. (eds.) (2004), Applied Time Series Econometrics, Cambridge University Press
- Patterson K. (2000), An introduction to applied econometrics: a time series approach, Macmillan Press
- Rao B.B. (ed.) (1994), Cointegration for the applied economist, St. Martin's Press
- Vogelvang B. (2004), Econometrics: Theory and Applications with EViews, FT Prentice Hall

Very useful: you cannot miss ...
- Lucchetti Jack R. (2001), Appunti di analisi delle serie storiche [Notes on time series analysis], downloadable from my home page
- Mosconi R. (2000), Malcolm for Rats software

ACKNOWLEDGEMENTS
Many thanks are due to Luigi Bidoia, Maria Elena Bontempi, Michele Burattoni, Juri Marcucci, and to the students of CIDE, Prometeia and SDIC courses for their comments and suggestions. The usual caveats apply.
