Pseudo Maximum Likelihood

Pseudo-Maximum Likelihood Estimation of ARCH(∞) Models
Author(s): Peter M. Robinson and Paolo Zaffaroni

Source: The Annals of Statistics, Vol. 34, No. 3 (Jun., 2006), pp. 1049-1074
Published by: Institute of Mathematical Statistics
Stable URL: http://www.jstor.org/stable/25463451
Accessed: 01/03/2010 21:33
The Annals of Statistics
2006, Vol. 34, No. 3, 1049-1074
DOI: 10.1214/009053606000000245
© Institute of Mathematical Statistics, 2006

PSEUDO-MAXIMUM LIKELIHOOD ESTIMATION OF ARCH(∞) MODELS

By Peter M. Robinson¹ and Paolo Zaffaroni
London School of Economics and Imperial College London
Strong consistency and asymptotic normality of the Gaussian pseudo-maximum likelihood estimate of the parameters in a wide class of ARCH(∞) processes are established. The conditions are shown to hold in case of exponential and hyperbolic decay in the ARCH weights, though in the latter case a faster decay rate is required for the central limit theorem than for the law of large numbers. Particular parameterizations are discussed.
1. Introduction. ARCH(∞) processes comprise a wide class of models for conditional heteroscedasticity in time series. Consider, for $t \in \mathbb{Z} = \{0, \pm 1, \ldots\}$, the equations

(1)  $x_t = \sigma_t \varepsilon_t$,

(2)  $\sigma_t^2 = \omega_0 + \sum_{j=1}^{\infty} \psi_{0j} x_{t-j}^2$,

where

(3)  $\omega_0 > 0, \quad \psi_{0j} > 0 \ (j \ge 1), \quad \sum_{j=1}^{\infty} \psi_{0j} < \infty$,

and $\{\varepsilon_t\}$ is a sequence of independent identically distributed (i.i.d.) unobservable real-valued random variables. We shall assume that a strictly stationary solution $x_t$ to (1) and (2) exists almost surely (a.s.) under (3), and call it an ARCH(∞) process. We consider a parametric version, in which we know functions $\psi_j(\zeta)$ of the $r \times 1$ vector $\zeta$, for $r < \infty$, such that, for some unknown $\zeta_0$,

(4)  $\psi_j(\zeta_0) = \psi_{0j}, \quad j \ge 1$.

Also, $\omega_0$ is unknown and $x_t$ is unobservable but we observe

(5)  $y_t = \mu_0 + x_t$
Received October 2003; revised January 2005.
Supported by a Leverhulme Trust Personal Research Professorship and ESRC Grants

R000238212 and R000239936.
AMS 2000 subject classifications. Primary 62M10; secondary 62F12.
Key words and phrases. ARCH(∞) models, pseudo-maximum likelihood estimation, asymptotic inference.
for some unknown $\mu_0$.
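To make the data-generating mechanism in (1), (2) and (5) concrete, the following is a minimal simulation sketch rather than anything taken from the paper. It assumes Gaussian $\varepsilon_t$, truncates the infinite sum in (2) at the number of supplied weights, sets pre-sample values of $x_t$ to zero and discards a burn-in period. The function name `simulate_arch_inf` and all numerical values are hypothetical.

```python
import math
import random

def simulate_arch_inf(T, omega0, mu0, psi0, burn_in=500, seed=0):
    """Simulate y_t = mu0 + x_t with x_t = sigma_t * eps_t and
    sigma_t^2 = omega0 + sum_j psi0[j-1] * x_{t-j}^2 (sum truncated at len(psi0))."""
    rng = random.Random(seed)
    J = len(psi0)
    x = [0.0] * J  # pre-sample values set to zero (an assumption of this sketch)
    for _ in range(burn_in + T):
        # x[-j] holds x_{t-j} for the observation about to be generated
        sigma2 = omega0 + sum(psi0[j - 1] * x[-j] ** 2 for j in range(1, J + 1))
        eps = rng.gauss(0.0, 1.0)  # Gaussian innovations (an assumption)
        x.append(math.sqrt(sigma2) * eps)
    return [mu0 + xt for xt in x[-T:]]

# Hypothetical weights decaying like j^{-d-1} with d = 0.5; they sum to less than 1.
psi0 = [0.3 * j ** -1.5 for j in range(1, 201)]
y = simulate_arch_inf(T=1000, omega0=0.1, mu0=0.05, psi0=psi0)
```

Truncating the lag sum changes the process slightly when the weights decay slowly, so the output is only an approximation to an ARCH(∞) path.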

ARCH(∞) processes, extending the ARCH($m$), $m < \infty$, process of Engle [11] and the GARCH($n, m$) process of Bollerslev [4], were considered by Robinson [29] as a class of parametric alternatives in testing for serial independence of $y_t$. Empirical evidence of Whistler [35] and Ding, Granger and Engle [10] has suggested the possibility of long memory autocorrelation in the squares of financial data. Taking [contrary to the first requirement in (3)] $\omega_0 = 0$, such long memory in $x_t^2$ driven by (1) and (2) was considered by Robinson [29], the $\psi_{0j}$ being the autoregressive weights of a fractionally integrated process, implying $\sum_{j=1}^{\infty} \psi_{0j} = 1$; see also Ding and Granger [9]. For such $\psi_{0j}$, and the same objective function as was employed to generate the tests of Robinson [29], Koulikov [20] established asymptotic statistical properties of estimates of $\zeta_0$. On the other hand, under our assumption $\omega_0 > 0$, Giraitis, Kokoszka and Leipus [13] found that such $\psi_{0j}$ are inconsistent with covariance stationarity of $x_t$, which holds when $\sum_{j=1}^{\infty} \psi_{0j} < 1$. Finite variance of $x_t$ implies summability of coefficients of a linear moving average in martingale differences representation of $x_t^2$; see [37]. In this paper we do not assume finite variance of $x_t$, but rather that $x_t$ has a finite fractional moment of degree less than 2. The first requirement in (3) was shown by Kazakevicius and Leipus [18] to be necessary for existence of an $x_t$ satisfying (1) and (2). The intermediate requirement in (3) is sufficient but not necessary for a.s. positivity of $\sigma_t^2$, and is imposed here to facilitate a clearer focus on the $\psi_{0j}$, which decay, possibly slowly, but never vanish.
We wish to estimate the $(r + 2) \times 1$ vector $\theta_0 = (\omega_0, \mu_0, \zeta_0')'$ on the basis of observations $y_t$, $t = 1, \ldots, T$, the prime denoting transposition. The case when $\mu_0$ is known, for example, $\mu_0 = 0$, is covered by a simplified version of our treatment. If the $y_t$ were instead unobserved regression errors, we have $\mu_0 = 0$, but would then need to replace $x_t$ by residuals in what follows; the details of this extension would be relatively straightforward. Another relatively straightforward extension would cover simultaneous estimation of the regression parameters $\omega_0$ and $\zeta_0$, after replacing $\mu_0$ by a more general parametric function; as in (1), (2) and (5), efficiency gain is afforded by simultaneous estimation.
Under stronger restrictions than $\sum_{j=1}^{\infty} \psi_{0j} < 1$, Giraitis and Robinson [14] considered discrete-frequency Whittle estimation of $\zeta_0$, based on the squared observations $y_t^2$ (with $\mu_0$ known to be zero), this being asymptotically equivalent to constrained least squares regression of $y_t^2$ on the $y_{t-s}^2$, $s > 0$, a method employed in special cases of (2) by Engle [11] and Bollerslev [4]. In these the spectral density of $y_t^2$, when it exists, has a convenient closed form. This property, along with availability of the fast Fourier transform, makes discrete-frequency Whittle estimation based on the $y_t^2$ a computationally attractive option for point estimation, even in very long financial time series. However, it has a number of disadvantages, as discussed by Giraitis and Robinson [14]: it is not only asymptotically inefficient under Gaussian $\varepsilon_t$, but never asymptotically efficient; it requires finiteness of fourth moments of $y_t$ for consistency and of eighth moments for asymptotic normality, which are sometimes considered unacceptable for financial data; its limit covariance matrix is relatively complicated to estimate; it is less well motivated in ARCH models than in stochastic volatility and nonlinear moving average models, such as those of Taylor [33], Robinson and Zaffaroni [30, 31], Harvey [15], Breidt, Crato and de Lima [5] and Zaffaroni [36], where the actual likelihood is computationally relatively intractable, while Whittle estimation also plays a less special role in the short-memory-in-$y_t^2$ ARCH models of Giraitis and Robinson [14] than in the long-memory-in-$y_t^2$ models of the previous five references, where it entails automatic "compensation" for possible lack of square-integrability of the spectrum of $y_t^2$. Mikosch and Straumann [26] have shown that a finite fourth moment is necessary for consistency of Whittle estimates, and that convergence rates are slowed by fat tails in $\varepsilon_t$.
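For Gaussian $\varepsilon_t$, a widely-used approximate maximum likelihood estimate is defined as follows. Denote by $\theta = (\omega, \mu, \zeta')'$ any admissible value of $\theta_0$ and define

$x_t(\mu) = y_t - \mu$,

$\sigma_t^2(\theta) = \omega + \sum_{j=1}^{\infty} \psi_j(\zeta)\, x_{t-j}^2(\mu)$

for $t \in \mathbb{Z}$, and

$\bar\sigma_t^2(\theta) = \omega + 1(t \ge 2) \sum_{j=1}^{t-1} \psi_j(\zeta)\, x_{t-j}^2(\mu)$

for $t \ge 1$, where $1(\cdot)$ denotes the indicator function. Define also

$q_t(\theta) = \dfrac{x_t^2(\mu)}{\sigma_t^2(\theta)} + \ln \sigma_t^2(\theta), \qquad \bar q_t(\theta) = \dfrac{x_t^2(\mu)}{\bar\sigma_t^2(\theta)} + \ln \bar\sigma_t^2(\theta), \qquad 1 \le t \le T$,

$Q_T(\theta) = T^{-1} \sum_{t=1}^{T} q_t(\theta), \qquad \bar Q_T(\theta) = T^{-1} \sum_{t=1}^{T} \bar q_t(\theta)$,

$\tilde\theta_T = \arg\min_{\theta \in \Theta} Q_T(\theta), \qquad \hat\theta_T = \arg\min_{\theta \in \Theta} \bar Q_T(\theta)$,

where $\Theta$ is a prescribed compact subset of $\mathbb{R}^{r+2}$. The quantities with over-bar are introduced due to $y_t$ being unobservable for $t \le 0$; $\tilde\theta_T$ is uncomputable. Because we do not assume Gaussianity in the asymptotic theory, we refer to $\hat\theta_T$ as a pseudo-maximum likelihood estimate (PMLE).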
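As an illustration of how the computable objective $\bar Q_T(\theta)$ can be evaluated, here is a small sketch. It is not the authors' code: the function name `qbar_T` and the way the weights are passed (a callable `psi` returning $\psi_j(\zeta)$ for a fixed $\zeta$) are assumptions made for the example.

```python
import math

def qbar_T(y, omega, mu, psi):
    """Truncated Gaussian pseudo-likelihood objective Q-bar_T(theta).

    y     : observations y_1, ..., y_T
    omega : intercept of the conditional variance
    mu    : location parameter
    psi   : callable, psi(j) returns the ARCH weight psi_j for j >= 1
    """
    x2 = [(yt - mu) ** 2 for yt in y]  # x_t^2(mu), stored at index t - 1
    total = 0.0
    for t in range(1, len(y) + 1):
        # The sum over j = 1..t-1 is empty for t = 1, matching the indicator 1(t >= 2).
        sigma2 = omega + sum(psi(j) * x2[t - j - 1] for j in range(1, t))
        total += x2[t - 1] / sigma2 + math.log(sigma2)
    return total / len(y)

# Toy check with two observations and hypothetical weights psi_j = 0.5^j.
value = qbar_T([1.0, 2.0], omega=1.0, mu=0.0, psi=lambda j: 0.5 ** j)
```

The direct double loop costs $O(T^2)$ operations per evaluation, which is acceptable for a sketch; a numerical minimizer over the compact set $\Theta$ would call such a function repeatedly.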
We establish strong consistency of $\hat\theta_T$ and asymptotic normality of $T^{1/2}(\hat\theta_T - \theta_0)$, as $T \to \infty$, for a class of $\psi_j(\zeta)$ sequences. In the case of the first property this is accomplished by first showing strong consistency of $\tilde\theta_T$ and then that $\hat\theta_T - \tilde\theta_T \to 0$, a.s. In the case of the second we likewise first show it for $T^{1/2}(\tilde\theta_T - \theta_0)$ and then show that $\hat\theta_T - \tilde\theta_T = o_p(T^{-1/2})$, but the latter property, and thus the asymptotic normality of $T^{1/2}(\hat\theta_T - \theta_0)$, is achieved only under a restricted set of possible $\zeta_0$ values, and this seems of practical concern in relation to some popular choices of the $\psi_j(\zeta)$. These results are presented in the following section, along with a description of regularity conditions and partial proof details. The structure of the proof is similar in several respects to earlier ones for the GARCH case of (2), especially that of Berkes, Horváth and Kokoszka [3]. Sections 3 and 4 apply the results to particular models.
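2. Assumptions and main results. Our assumptions are as follows.

ASSUMPTION A($q$), $q \ge 2$. The $\varepsilon_t$ are i.i.d. random variables with $E\varepsilon_0 = 0$, $E\varepsilon_0^2 = 1$, $E|\varepsilon_0|^q < \infty$ and probability density function $f(\varepsilon)$ satisfying

$f(\varepsilon) = O\big(L(|\varepsilon|^{-1})\, |\varepsilon|^{b}\big)$ as $\varepsilon \to 0$,

for $b > -1$ and a function $L$ that is slowly varying at the origin.

ASSUMPTION B. There exist $\omega_L, \omega_U, \mu_L, \mu_U$ such that $0 < \omega_L < \omega_U < \infty$, $-\infty < \mu_L < \mu_U < \infty$, and a compact set $\Upsilon \subset \mathbb{R}^r$ such that $\Theta = [\omega_L, \omega_U] \times [\mu_L, \mu_U] \times \Upsilon$.

ASSUMPTION C. $\theta_0$ is an interior point of $\Theta$.

ASSUMPTION D. For all $j \ge 1$,

(6)  $\inf_{\zeta \in \Upsilon} \psi_j(\zeta) > 0$;

(7)  $\sup_{\zeta \in \Upsilon} \psi_j(\zeta) \le K j^{-d-1}$ for some $d > 0$;

(8)  $\psi_{0j} \le K \psi_{0k}$ for $1 \le k < j$,

where $K$ throughout denotes a generic, positive constant.

ASSUMPTION E. There exists a strictly stationary and ergodic solution $x_t$ to (1) and (2), and for some

(9)  $p \in \big((d + 1)^{-1}, 1\big)$,

with $d$ as in Assumption D, we have

(10)  $E|x_0|^{2p} < \infty$.

ASSUMPTION F($l$). For all $j \ge 1$, $\psi_j(\zeta)$ has continuous $k$th derivative on $\Upsilon$ such that, with $\zeta_i$ denoting the $i$th element of $\zeta$,

(11)  $\left| \dfrac{\partial^k \psi_j(\zeta)}{\partial \zeta_{i_1} \cdots \partial \zeta_{i_k}} \right| \le K \psi_j^{1-\eta}(\zeta)$

for all $\eta > 0$ and all $i_h = 1, \ldots, r$, $h = 1, \ldots, k$, $k \le l$.

ASSUMPTION G. For each $\zeta \in \Upsilon$, there exist integers $j_i = j_i(\zeta)$, $i = 1, \ldots, r$, such that $1 \le j_1(\zeta) < \cdots < j_r(\zeta) < \infty$ and

$\mathrm{rank}\{\Psi_{(j_1, \ldots, j_r)}(\zeta)\} = r$,

where

$\Psi_{(j_1, \ldots, j_r)}(\zeta) = \big(\psi_{j_1}^{(1)}(\zeta), \ldots, \psi_{j_r}^{(1)}(\zeta)\big), \qquad \psi_j^{(1)}(\zeta) = \dfrac{\partial \psi_j(\zeta)}{\partial \zeta}$.

ASSUMPTION H. There exists

(12)  $d_0 > \tfrac{1}{2}$

such that

(13)  $\psi_{0j} \le K j^{-1-d_0}$,

and (10) holds for

(14)  $p \in \big(4/(2d_0 + 3), 1\big)$.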
Assumption A($q$) allows some asymmetry in $\varepsilon_t$, but implies the less primitive condition (which does not even require existence of a density) employed in a similar context by Berkes, Horváth and Kokoszka [3]. Assumptions B and C are standard. Inequalities (7) and (13) together imply $d_0 \ge d$, while (8) with (3) is milder than monotonicity but implies $\psi_{0j} = o(j^{-1})$ as $j \to \infty$. We take $\eta > 0$ in Assumption F($l$) because $\psi_j(\zeta) < 1$ for all large enough $j$, by (7). Assumption G is crucial to the proof of consistency, being used in Lemmas 9 and 10 to show that in the limit $\theta_0$ globally minimizes $Q_T(\theta)$; it also ensures nonsingularity of the matrix $H_0$ in Proposition 2 and Theorem 2 below. This and other assumptions are discussed in Sections 3 and 4 in connection with some parameterizations of interest.

We present asymptotic results for the uncomputable $\tilde\theta_T$ as propositions, those for $\hat\theta_T$ as theorems. All these, and the corollaries in Sections 3 and 4 and lemmas in Section 5, assume (1)-(5).
PROPOSITION 1. For some $\delta > 0$, let Assumptions A($2 + \delta$), B, C, D, E, F(1) and G hold. Then

$\tilde\theta_T \to \theta_0$ a.s. as $T \to \infty$.

PROOF. The proof follows as in, for example, [17], Theorem 6, from uniform a.s. convergence over $\Theta$ of $Q_T(\theta)$ to $Q(\theta) = E q_0(\theta)$ established in Lemma 7, the fact that $Q_T(\tilde\theta_T) \le Q_T(\theta_0)$, and Lemma 10.

THEOREM 1. For some $\delta > 0$, let Assumptions A($2 + \delta$), B, C, D, E, F(1) and G hold. Then

(15)  $\hat\theta_T \to \theta_0$ a.s. as $T \to \infty$.

PROOF. From Lemmas 7 and 8, $\bar Q_T(\theta)$ converges uniformly to $Q(\theta)$ a.s., whence the proof is as indicated for Proposition 1.
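Denote by $\kappa_j$ the $j$th cumulant of $\varepsilon_t$ and introduce

$G_0 = (2 + \kappa_4) M - 2\kappa_3 (N + N') + P, \qquad H_0 = M + \tfrac{1}{2} P$,

where

$M = E(\tau_0 \tau_0'), \qquad N = E(\sigma_0^{-1} \tau_0)\, e_2', \qquad P = E(\sigma_0^{-2})\, e_2 e_2'$,

for $\tau_0 = \tau_0(\theta_0)$, $\tau_t(\theta) = (\partial/\partial\theta) \log \sigma_t^2(\theta)$, and $e_2$ the second column of the $(r + 2) \times (r + 2)$ identity matrix. In case $\mu_0$ is known (e.g., to be zero), we omit the second row and column from $M$, and have instead $G_0 = (2 + \kappa_4) M$, $H_0 = M$. In case $\varepsilon_t$ is Gaussian, $\kappa_3 = \kappa_4 = 0$, so $G_0 = 2H_0 = 2M + P$.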
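PROPOSITION 2. Let Assumptions A(4), B, C, D, E, F(3) and G hold. Then

$T^{1/2}(\tilde\theta_T - \theta_0) \to_d N(0, H_0^{-1} G_0 H_0^{-1})$ as $T \to \infty$.

PROOF. Write

$Q_T^{(1)}(\theta) = \dfrac{\partial Q_T(\theta)}{\partial\theta} = T^{-1} \sum_{t=1}^{T} u_t(\theta)$,

where

$u_t(\theta) = \tau_t(\theta)\big(1 - \chi_t(\theta)\big) + \sigma_t^{-2}(\theta)\, \nu_t(\theta)$,

with

$\chi_t(\theta) = \dfrac{x_t^2(\mu)}{\sigma_t^2(\theta)}, \qquad \nu_t(\theta) = \dfrac{\partial x_t^2(\mu)}{\partial\theta}$.

By the mean value theorem,

(16)  $0 = Q_T^{(1)}(\tilde\theta_T) = Q_T^{(1)}(\theta_0) + \tilde H_T (\tilde\theta_T - \theta_0)$,

where $\tilde H_T$ has as its $i$th row the $i$th row of $H_T(\theta) = T^{-1} \sum_{t=1}^{T} h_t(\theta)$ evaluated at $\theta = \tilde\theta_T^{(i)}$, where $h_t(\theta) = (\partial^2/\partial\theta\,\partial\theta')\, q_t(\theta)$, $\|\tilde\theta_T^{(i)} - \theta_0\| \le \|\tilde\theta_T - \theta_0\|$, where we define $\|A\| = \{\mathrm{tr}(A'A)\}^{1/2}$ for any real matrix $A$. Now $u_t(\theta_0) = \tau_t(\theta_0)(1 - \varepsilon_t^2) - 2 e_2 \varepsilon_t/\sigma_t$ is, by Lemmas 2, 3 and 7, a stationary ergodic martingale difference vector with finite variance, so from Brown [6] and the Cramér-Wold device, $T^{1/2} Q_T^{(1)}(\theta_0) \to_d N(0, G_0)$ as $T \to \infty$. Finally, by Lemma 7 and Theorem 1, $\tilde H_T \to_p H_0$, whence the proof is completed in standard fashion.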
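Define

$\bar u_t(\theta) = \dfrac{\partial \bar q_t(\theta)}{\partial\theta}, \qquad \bar g_t(\theta) = \bar u_t(\theta)\, \bar u_t'(\theta), \qquad \bar h_t(\theta) = \dfrac{\partial^2 \bar q_t(\theta)}{\partial\theta\, \partial\theta'}$,

$\bar G_T(\theta) = T^{-1} \sum_{t=1}^{T} \bar g_t(\theta), \qquad \bar H_T(\theta) = T^{-1} \sum_{t=1}^{T} \bar h_t(\theta)$.

THEOREM 2. Let Assumptions A(4), B, C, D, E, F(3), G and H hold. Then

(17)  $T^{1/2}(\hat\theta_T - \theta_0) \to_d N(0, H_0^{-1} G_0 H_0^{-1})$ as $T \to \infty$,

and $H_0^{-1} G_0 H_0^{-1}$ is strongly consistently estimated by $\bar H_T^{-1}(\hat\theta_T)\, \bar G_T(\hat\theta_T)\, \bar H_T^{-1}(\hat\theta_T)$.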
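The covariance estimate in Theorem 2 has the familiar "sandwich" form. The sketch below shows only the matrix arithmetic, under the assumption that the averaged Hessian $\bar H_T(\hat\theta_T)$ and the score vectors $\bar u_t(\hat\theta_T)$ have already been obtained by some other means (for instance numerical differentiation). The helper names `mat_mul`, `mat_inv` and `sandwich` are hypothetical.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_inv(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            raise ValueError("matrix is numerically singular")
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def sandwich(H, scores):
    """Return H^{-1} G H^{-1}, where G is the average outer product of the scores."""
    T, k = len(scores), len(H)
    G = [[sum(u[i] * u[j] for u in scores) / T for j in range(k)] for i in range(k)]
    Hinv = mat_inv(H)
    return mat_mul(mat_mul(Hinv, G), Hinv)

# Toy inputs: a diagonal Hessian average and two score vectors.
V = sandwich([[2.0, 0.0], [0.0, 4.0]], [[1.0, 0.0], [0.0, 2.0]])
```

Dividing the diagonal of such an estimate by $T$ and taking square roots would give approximate standard errors for the elements of $\hat\theta_T$, provided the conditions of Theorem 2 hold.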
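PROOF. We have

$0 = \bar Q_T^{(1)}(\hat\theta_T) = \bar Q_T^{(1)}(\theta_0) + \hat H_T (\hat\theta_T - \theta_0)$,

where $\bar Q_T^{(1)}(\theta) = (\partial/\partial\theta)\, \bar Q_T(\theta)$ and $\hat H_T$ has as its $i$th row the $i$th row of $\bar H_T(\theta)$ evaluated at $\theta = \hat\theta_T^{(i)}$, for $\|\hat\theta_T^{(i)} - \theta_0\| \le \|\hat\theta_T - \theta_0\|$. Thus, from (16),

$\hat\theta_T - \tilde\theta_T = (\tilde H_T^{-1} - \hat H_T^{-1})\, Q_T^{(1)}(\theta_0) - \hat H_T^{-1} \{\bar Q_T^{(1)}(\theta_0) - Q_T^{(1)}(\theta_0)\}$,

where the inverses exist a.s. for all sufficiently large $T$ by Lemma 9. In view of Proposition 2 and Lemma 8, (17) follows on showing that

$\bar Q_T^{(1)}(\theta_0) - Q_T^{(1)}(\theta_0) = o_p(T^{-1/2})$.

The left-hand side can be written $(B_{1T} + B_{2T} + B_{3T})/T$, where

$B_{1T} = \sum_{t=1}^{T} \varepsilon_t^2\, b_{1t}, \qquad B_{2T} = \sum_{t=1}^{T} (\varepsilon_t^2 - 1)\, b_{2t}, \qquad B_{3T} = -2 e_2 \sum_{t=1}^{T} \varepsilon_t\, b_{3t}$,

with

$b_{1t} = -\dfrac{\bar\sigma_t^{2(1)} (\sigma_t^2 - \bar\sigma_t^2)}{\bar\sigma_t^2\, \bar\sigma_t^2}, \qquad b_{2t} = \dfrac{\sigma_t^{2(1)}}{\sigma_t^2} - \dfrac{\bar\sigma_t^{2(1)}}{\bar\sigma_t^2}, \qquad b_{3t} = \dfrac{\sigma_t^2 - \bar\sigma_t^2}{\sigma_t\, \bar\sigma_t^2}$,

for $\sigma_t^2 = \sigma_t^2(\theta_0)$, $\bar\sigma_t^2 = \bar\sigma_t^2(\theta_0)$, $\sigma_t^{2(1)} = \sigma_t^{2(1)}(\theta_0)$, $\bar\sigma_t^{2(1)} = \bar\sigma_t^{2(1)}(\theta_0)$, with $\sigma_t^{2(1)}(\theta) = (\partial/\partial\theta)\, \sigma_t^2(\theta)$, $\bar\sigma_t^{2(1)}(\theta) = (\partial/\partial\theta)\, \bar\sigma_t^2(\theta)$. We show that $B_{iT} = o_p(T^{1/2})$, $i = 1, 2, 3$. For the remainder of this proof, we drop the zero subscript in $\psi_{0j}$.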
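Consider first $B_{1T}$. We have

(18)  $\bar\sigma_t^{2(1)} = \Big(1,\ -2 \sum_{j=1}^{t-1} \psi_j x_{t-j},\ \sum_{j=1}^{t-1} \psi_j^{(1)\prime} x_{t-j}^2\Big)'$,

where $\psi_j^{(1)} = \psi_j^{(1)}(\zeta_0)$. From Assumption F(1),

$\|\bar\sigma_t^{2(1)}\| \le 1 + 2 \sum_{j=1}^{t-1} \psi_j |x_{t-j}| + K \sum_{j=1}^{t-1} \psi_j^{1-\eta} x_{t-j}^2$,

for all $\eta > 0$. Now

$\sum_{j=1}^{t-1} \psi_j |x_{t-j}| \le \Big(\sum_{j=1}^{t-1} \psi_j x_{t-j}^2\Big)^{1/2} \Big(\sum_{j=1}^{\infty} \psi_j\Big)^{1/2} \le K \bar\sigma_t$,

so since $\bar\sigma_t^2 \ge \omega_0 > 0$,

$\sum_{j=1}^{t-1} \psi_j |x_{t-j}| \,/\, \bar\sigma_t^2 \le K$.

From (8),

$\sum_{j=1}^{t-1} \psi_j^{1-\eta} x_{t-j}^2 \le K \psi_t^{-\eta} \sum_{j=1}^{t-1} \psi_j x_{t-j}^2 \le K \psi_t^{-\eta}\, \bar\sigma_t^2$.

It follows that

(19)  $\|\bar\sigma_t^{2(1)}\| / \bar\sigma_t^2 \le K \psi_t^{-\eta}$.

On the other hand, by the $c_r$-inequality ([23], page 157) and (10),

(20)  $E(\sigma_t^2 - \bar\sigma_t^2)^p \le K \sum_{j=t}^{\infty} \psi_j^p\, E|x_{t-j}|^{2p} \le K \sum_{j=t}^{\infty} \psi_j^p$.

Thus, by (8) and (14),

(21)  $E\|b_{1t}\|^p \le K \psi_t^{-p\eta} \sum_{j=t}^{\infty} \psi_j^p \le K \sum_{j=t}^{\infty} \psi_j^{p(1-\eta)} \le K t^{1 - p(d_0+1)(1-\eta)}$,

choosing $\eta < 1 - 1/\{p(d_0 + 1)\}$, which (14) enables. Applying the $c_r$-inequality again,

$E\|B_{1T}\|^p \le K \sum_{t=1}^{T} E|\varepsilon_0|^{2p}\, E\|b_{1t}\|^p$.

Applying (21), this is $O(1)$ when $p > 2/(d_0 + 1)$, while when $p \le 2/(d_0 + 1)$, we may choose $\eta$ so small to bound it by

$K T^{2 - p(d_0+1)(1-\eta)} \le K T^{p/2 - \{1 + 2(d_0+1)(1-\eta)\}[p/2 - 2/\{1 + 2(d_0+1)(1-\eta)\}]} = o(T^{p/2})$,

using (12) [which requires (13)] and arbitrariness of $\eta$. Thus, $B_{1T} = o_p(T^{1/2})$ by Markov's inequality.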
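Consider $B_{2T}$. By independence of $\varepsilon_t$ and $b_{2t}$, by the $c_r$-inequality when $p \le \tfrac{1}{2}$, and by the inequality of von Bahr and Esseen [34] and the fact that the $\varepsilon_t^2$ are i.i.d. with mean 1 when $p > \tfrac{1}{2}$,

$E\|B_{2T}\|^{2p} \le K \sum_{t=1}^{T} (E|\varepsilon_0|^{4p} + 1)\, E\|b_{2t}\|^{2p} \le K \sum_{t=1}^{T} \big(E\|b_{4t}\|^{2p} + E\|b_{5t}\|^{2p}\big)$,

where

$b_{4t} = \dfrac{\sigma_t^{2(1)} - \bar\sigma_t^{2(1)}}{\sigma_t^2}, \qquad b_{5t} = \dfrac{\bar\sigma_t^{2(1)} (\bar\sigma_t^2 - \sigma_t^2)}{\sigma_t^2\, \bar\sigma_t^2}$.

Thus, from Assumptions F(1) and H,

$\|b_{4t}\| \le \dfrac{K}{\sigma_t^2} \Big( \sum_{j=t}^{\infty} \psi_j |x_{t-j}| + \sum_{j=t}^{\infty} \|\psi_j^{(1)}\|\, x_{t-j}^2 \Big)$

$\qquad \le \dfrac{K}{\sigma_t^2} \Big[ \Big\{\sum_{j=t}^{\infty} \psi_j\Big\}^{1/2} \Big\{\sum_{j=t}^{\infty} \psi_j x_{t-j}^2\Big\}^{1/2} + \Big\{\sum_{j=t}^{\infty} \|\psi_j^{(1)}\|^2 \psi_j^{-1} x_{t-j}^2\Big\}^{1/2} \Big\{\sum_{j=t}^{\infty} \psi_j x_{t-j}^2\Big\}^{1/2} \Big]$

$\qquad \le K \Big[ t^{-d_0/2} + \Big\{\sum_{j=t}^{\infty} j^{-(d_0+1)(1-2\eta)} x_{t-j}^2\Big\}^{1/2} \Big]$,

so

$E\|b_{4t}\|^{2p} \le K t^{-p d_0} + K \sum_{j=t}^{\infty} j^{-(d_0+1)p(1-2\eta)} \le K t^{1 - (d_0+1)p(1-2\eta)}$

for sufficiently small $\eta$. Thus, $\sum_{t=1}^{T} E\|b_{4t}\|^{2p}$ is $O(1)$ for $p > 2/(d_0 + 1)$, while for $p \le 2/(d_0 + 1)$, it is bounded by

$K T^{2 - (d_0+1)p(1-2\eta)} \le K T^{p - (d_0+2)\{p - 2/(d_0+2)\} + 2(d_0+1)p\eta} = o(T^p)$,

from (14) and arbitrariness of $\eta$. Also, $\|b_{5t}\| \le K \|\bar\sigma_t^{2(1)}/\bar\sigma_t^2\| (\sigma_t^2 - \bar\sigma_t^2)^{1/2}$, so from (19) and (20) we have $E\|b_{5t}\|^{2p} \le K t^{1 - (d_0+1)p(1-2\eta)}$ and proceeding as before,

$\sum_{t=1}^{T} E\|b_{5t}\|^{2p} = o(T^p)$,

and thence, $B_{2T} = o_p(T^{1/2})$.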
Next,

$E\|B_{3T}\|^{2p} \le K E\Big|\sum_{t=1}^{T} \varepsilon_t b_{3t}\Big|^{2p} \le K \sum_{t=1}^{T} E\, b_{3t}^{2p}$,

applying the $c_r$-inequality when $p \le \tfrac{1}{2}$ and von Bahr and Esseen [34] when $p > \tfrac{1}{2}$. Now $b_{3t} \le (\sigma_t^2 - \bar\sigma_t^2)^{1/2}\, \bar\sigma_t^{-2}$, so from (20),

$E\|B_{3T}\|^{2p} \le K \sum_{t=1}^{T} \sum_{j=t}^{\infty} \psi_j^p$

$\qquad \le K \big\{ 1(p > 2/(d_0+1)) + (\ln T)\, 1(p = 2/(d_0+1)) + T^{2 - p(d_0+1)}\, 1(p < 2/(d_0+1)) \big\}$

$\qquad = o(T^p)$,

much as before. Thence, $B_{3T} = o_p(T^{1/2})$.

It remains to consider the last statement of the theorem, which follows on standard application of Propositions 1 and 2, Theorem 1 and Lemmas 7 and 8.
In earlier versions of this paper we checked the conditions in the case of GARCH($n, m$) models in which the $\psi_j(\zeta)$ decay exponentially and we allow the possibility that the GARCH coefficients lie in a subspace of dimension less than $m + n$; the details are available from the authors on request. However, the literature on asymptotic theory for estimates of GARCH models is now extensive, recent references including [3, 7, 12, 16, 22, 32], along with investigations of the properties of the models themselves; see recently [2, 18, 25]. We focus instead on alternative models which have received less attention, and for which our theoretical framework is primarily intended.

We introduce the generating function

(22)  $\psi(z; \zeta) = \sum_{j=1}^{\infty} \psi_j(\zeta)\, z^j, \qquad |z| \le 1$.
3. Fractional GARCH models. A slowly decaying class of ARCH(∞) weights was considered by Robinson [29], Ding and Granger [9] and Koulikov [20], generated by

(23)  $\psi(z; \zeta) = 1 - (1 - z)^d, \qquad 0 < d < 1$,

where $r = 1$ and formally

(24)  $(1 - z)^d = \sum_{j=0}^{\infty} \dfrac{\Gamma(j - d)}{\Gamma(-d)\, \Gamma(j + 1)}\, z^j$.

In these references $\omega_0 = 0$ was assumed in (2), but we assume $\omega_0 > 0$ and generalize (23) as follows. Introduce the functions $a_j = a_j(\zeta)$, $b_j = b_j(\zeta)$ and, for $m \ge 1$, $n \ge 0$, $n + m \ge r - 1$,

(25)  $a(z; \zeta) = \sum_{j=1}^{m} a_j z^j, \qquad b(z; \zeta) = 1 - \sum_{j=1}^{n} b_j z^j\, 1(n \ge 1)$;

and for all $\zeta \in \Upsilon$,

(26)  $a_j > 0, \ j = 1, \ldots, m; \qquad b_j > 0, \ j = 1, \ldots, n$;

(27)  $b(z; \zeta) \ne 0, \qquad |z| \le 1$;

(28)  $a(z; \zeta)$ and $b(z; \zeta)$ have no common zeros in $z$.

Now take $\psi(z; \zeta)$ in (22) to be given by

(29)  $\psi(z; \zeta) = \dfrac{a(z; \zeta)\{1 - (1 - z)^d\}}{z\, b(z; \zeta)}$,

with $d = d(\zeta)$ satisfying

(30)  $d \in (0, 1)$.

We call $x_t$ based on (29) a fractional GARCH, FGARCH($n, d_0, m$) process, for $d_0 = d(\zeta_0)$.
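The weights implied by (29) can be computed by expanding the two factors as power series and convolving them. The sketch below does this under stated assumptions: the coefficients of $(1 - z)^d$ are generated by the standard binomial recursion $\pi_k = \pi_{k-1}(k - 1 - d)/k$, the coefficients $c_j$ of $a(z)/b(z)$ come from the recursion $c_j = a_j + \sum_i b_i c_{j-i}$, and the function names `frac_coeffs` and `fgarch_weights` are hypothetical.

```python
def frac_coeffs(d, J):
    """Coefficients d_0, ..., d_{J-1} of z^k in z^{-1} {1 - (1 - z)^d}."""
    out, pi = [], 1.0  # pi holds the coefficient of z^k in (1 - z)^d
    for k in range(1, J + 1):
        pi *= (k - 1 - d) / k  # binomial recursion for (1 - z)^d
        out.append(-pi)        # d_{k-1} = -pi_k
    return out

def fgarch_weights(a, b, d, J):
    """psi_1, ..., psi_J implied by psi(z) = a(z) {1 - (1 - z)^d} / (z b(z))."""
    m, n = len(a), len(b)
    c = [0.0] * (J + 1)  # c[j] is the coefficient of z^j in a(z)/b(z); c[0] = 0
    for j in range(1, J + 1):
        aj = a[j - 1] if j <= m else 0.0
        c[j] = aj + sum(b[i - 1] * c[j - i] for i in range(1, min(n, j - 1) + 1))
    dk = frac_coeffs(d, J)
    # psi_j is the convolution sum over k = 0..j-1 of c_{j-k} d_k
    return [sum(c[j - k] * dk[k] for k in range(j)) for j in range(1, J + 1)]

# With a(z) = z and b(z) = 1 this reproduces the weights of (23) for d = 0.5.
psi = fgarch_weights(a=[1.0], b=[], d=0.5, J=5)
```

For `a=[1.0]`, `b=[]` the partial sums of the weights approach 1 slowly from below, which is the lack of covariance stationarity of (23) noted in the text; the convolution costs $O(J^2)$ operations.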
COROLLARY 1. Let $\psi(z; \zeta)$ be given by (29) and (25) with $m \ge 1$, $n \ge 0$, and let $d$ and the $a_j$, $b_j$ be continuously differentiable. For some $\delta > 0$, let Assumptions A($2 + \delta$), B, C and E hold, with all $\zeta \in \Upsilon$ satisfying (26)-(28), (30) and

$\mathrm{rank}\left\{ \dfrac{\partial (a_1, \ldots, a_m, b_1, \ldots, b_n, d)}{\partial \zeta} \right\} = r$.

Then (15) is true. Let also $d$ and the $a_j$, $b_j$ be thrice continuously differentiable and $d_0 > \tfrac{1}{2}$. Then (17) is true.
PROOF. Denoting by $c_j$ ($j \ge 1$) and $d_j$ ($j \ge 0$) the coefficients of $z^j$ in the expansions of $a(z; \zeta)/b(z; \zeta)$, $z^{-1}\{1 - (1 - z)^d\}$, respectively, we have $\psi_j(\zeta) = \sum_{k=0}^{j-1} c_{j-k} d_k$, $j \ge 1$. From [3], the $c_j$ are bounded above and below by positive, exponentially decaying sequences when $n \ge 1$, and are all nonnegative when $n = 0$. Since the $d_j$ are all positive, it follows that (6) holds. Also, Stirling's approximation indicates that $j^{-d-1}/K \le d_j \le K j^{-d-1}$, so the $\psi_j(\zeta)$ satisfy the same inequalities. Compactness of $\Upsilon$, smoothness of $d$, and (30), imply $d(\zeta) \ge d$, to check (7). The above argument indicates that $\psi_{0j} \le K j^{-d_0-1} \le K k^{-d_0-1} \le K \psi_{0k}$ for $j > k \ge 1$, so (8) holds, and thus Assumption D. With regard to (11), note that $(\partial/\partial d)\, \psi(z; \zeta) = -\{a(z; \zeta)/b(z; \zeta)\}\, z^{-1} (1 - z)^d \ln(1 - z)$, where the coefficient of $z^j$ in $-z^{-1}(1 - z)^d \ln(1 - z)$ is $\sum_{k=1}^{j} k^{-1} d_{j-k} \le K (\ln j)\, j^{-d-1} \le K j^{-(d+1)(1-\eta)} \le K \psi_j^{1-\eta}(\zeta)$ for any $\eta > 0$. Derivatives with respect to the $a_j$, $b_j$ are dominated, and higher derivatives can be dealt with similarly, to complete the checking of Assumption F($l$). To check Assumption G, suppress reference to $\zeta$ in $a$, $b$, $\psi$ and write

$\phi(z) = b(z)^{-1} \{1 - (1 - z)^d\}, \qquad \gamma(z) = b(z)^{-1} a(z)$,

and note that

$\dfrac{\partial \psi(z)}{\partial a_j} = z^{j-1} \phi(z), \qquad j = 1, \ldots, m$,

$\dfrac{\partial \psi(z)}{\partial b_j} = z^{j-1} \gamma(z)\, \phi(z), \qquad j = 1, \ldots, n$,

$\dfrac{\partial \psi(z)}{\partial d} = -\dfrac{\gamma(z)}{z} (1 - z)^d \log(1 - z)$.
Choose $j_i(\zeta) = i$ for $i = 1, \ldots, m + n$, $\zeta \in \Upsilon$, leaving $j_{m+n+1}(\zeta)$ to be determined subsequently. Fix $\zeta$ and write $U = \Psi_{(j_1, \ldots, j_r)}(\zeta)$, partitioning it in the ratio $m + n : 1$ and calling its $(i, j)$th submatrix $U_{ij}$. We first show that the $(m + n) \times (m + n)$ matrix $U_{11}$ is nonsingular. Write $R$ for the $n \times (m + n)$ matrix with $(i, j)$th element $\gamma_{j-i}$, and $S$ for the $(m + n) \times (m + n)$ matrix with $(i, j)$th element $\phi_{j-i+1}$, where $\phi_j = \gamma_j = 0$ for $j \le 0$, and for $j > 0$, $\phi_j$ and $\gamma_j$ are respectively given by

$\phi(z) = \sum_{j=1}^{\infty} \phi_j z^j, \qquad \gamma(z) = \sum_{j=1}^{\infty} \gamma_j z^j$,

these series converging absolutely for $|z| \le 1$ in view of (30). Noting that $\psi_j^{(1)}$ is given by $(\partial/\partial\zeta)\, \psi(z) = \sum_{j=1}^{\infty} \psi_j^{(1)} z^j$, we find that the first $m$ rows of $U_{11}$ can be written $(I_m, O) S$, where $I_m$ is the $m$-rowed identity matrix, $O$ is the $m \times n$ matrix of zeroes and, when $n \ge 1$, the last $n$ rows of $U_{11}$ can be written $RS$. Now $S$ is upper-triangular with nonzero diagonal elements. Thus, for $n = 0$, $U_{11} = S$ is nonsingular. For $n \ge 1$, $U_{11}$ is nonsingular if and only if the $n \times n$ matrix $R_2$ having $(i, j)$th element $\gamma_{m+j-i}$ and consisting of the last $n$ columns of $R$ is nonsingular. This is not so if and only if the $\gamma_j$, $j = m, \ldots, m + n - 1$, are generated by a homogeneous linear difference equation of degree $n - 1$, that is, if there exist scalars $\lambda_0, \lambda_1, \ldots, \lambda_{n-1}$, not all zero, such that

$\lambda_0 \gamma_j - \sum_{i=1}^{n-1} \lambda_i \gamma_{j-i} = 0, \qquad j = m, \ldots, m + n - 1$.

But it follows from (25) and (27) that they are generated by the linear difference equation

$\gamma_j - \sum_{i=1}^{n-1} b_i \gamma_{j-i} = \pi_j, \qquad j = m, \ldots, m + n - 1$,

where $\pi_m = a_m + b_n \gamma_{m-n}$, $\pi_j = b_n \gamma_{j-n}$ for $j = m + 1, \ldots, m + n - 1$. Since $b_n \ne 0$, the $\pi_j$ are all zero if and only if $\gamma_{m-n} = -a_m/b_n$ and $\gamma_j = 0$ for $j = m + 1 - n, \ldots, m - 1$. But this implies $\gamma_m = 0$ also, and thence, $\gamma_j = 0$, all $j \ge m + 1$. For $m \le n$, this is inconsistent with the requirement $a_j > 0$, $j = 1, \ldots, m$, and for $m > n$, it implies $a$ has a factor $b$, which is inconsistent with (28). Thus, $U_{11}$ is nonsingular when $n \ge 1$. Nonsingularity of $U$ follows if $U_{22} \ne U_{21} U_{11}^{-1} U_{12}$. For large enough $j_{m+n+1} = j_{m+n+1}(\zeta)$, this must be true because $U_{22}$ decays like $(\ln j_{m+n+1})\, j_{m+n+1}^{-d-1}$, whereas the elements of $U_{12}$ are $O(\beta^{j_{m+n+1}})$ for some $\beta \in (0, 1)$. Thus Assumption G is true, and thence (15). Clearly (13) is true, so under the additional conditions so is Assumption H, and thence (17).
For $m = 1$, $n = 0$, (29) reduces to (23) when $a_1 = 1$, while when $a_1 \in (0, 1)$, it gives model (4.24) of Ding and Granger [9]. The important difference between these two cases is that the covariance stationarity condition $\psi(1; \zeta_0) < 1$ is satisfied in the second but not in the first. In general with (29), as with the GARCH model, $x_t$ is covariance stationary when $a(1; \zeta_0) < b(1; \zeta_0)$ but not otherwise. We compare (29) with

(31)  $\psi(z; \zeta) = 1 - \dfrac{1 - a(z; \zeta)}{b(z; \zeta)} (1 - z)^d$,

with $d$ again satisfying (30) and $a$ and $b$ again given as in (25), though we now allow $m = 0$, meaning $a(z; \zeta) \equiv 0$. Thus, with $m = n = 0$, (31) reduces to (23). ARCH(∞) models with $\psi$ given by (31) were proposed by Baillie, Bollerslev and Mikkelsen [1] and called FIGARCH($n, d_0, m$). In general, though (31) also gives hyperbolically decaying $\psi_{0j}$, it differs in some notable respects. Application of (26)-(28) again ensures positivity of $\psi_j(\zeta)$ in case of FGARCH and facilitates the above proof, but sufficient conditions in FIGARCH are less apparent in general, though Baillie, Bollerslev and Mikkelsen [1] indicated that they can be obtained. Also, unlike FGARCH, FIGARCH $x_t$ never has finite variance.
The requirement $d_0 > \tfrac{1}{2}$ for the central limit theorem in Corollary 1 would also be imposed in a corresponding result for FIGARCH. This is automatically satisfied in GARCH models but if only $d_0 \in (0, \tfrac{1}{2}]$ in (13) is possible in the general setting of Section 3, it appears that the asymptotic bias in $\hat\theta_T$ is of order at least $T^{-1/2}$, whereas that for $\tilde\theta_T$ is always $o(T^{-1/2})$. Assumption H copes with the replacement of $\sigma_t^2(\theta)$ by $\bar\sigma_t^2(\theta)$, the truncation error varying inversely with $d_0$. Inspection of the proof of Theorem 2 indicates that this bias problem is due to the term $\hat H_T^{-1} B_{1T}$. The factor $\sigma_t^2 - \bar\sigma_t^2$ in $b_{1t}$ is nonnegative, and if $j^{-d_0-1}$ is an exact rate for $\psi_{0j}$, $\sigma_t^2 - \bar\sigma_t^2$ exceeds $t^{-d_0}/K$ as $t \to \infty$ with probability approaching one. So far as the factor $\bar\sigma_t^{2(1)}/\bar\sigma_t^2$ in $b_{1t}$ is concerned, the second element of $\bar\sigma_t^{2(1)}$ [see (18)] has zero mean, but the first is positive, and though the $\psi_j^{(1)}$ can have elements of either sign, whenever $d_0 \le \tfrac{1}{2}$ it seems unlikely that the last $r$ elements of $B_{1T}$ can be $o_p(T^{1/2})$. Nor is there scope for relaxing (12) by strengthening other conditions. With regard to implications for choice of $p$, when $d_0 \ge 2d + \tfrac{1}{2}$, (14) entails no restriction over (9).
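The interplay between the moment conditions (9) and (14) is simple arithmetic, and can be checked with a few lines of code. This is only an illustrative sketch; the function name `moment_exponent_range` is hypothetical, and it takes the decay exponents $d$ of (7) and $d_0$ of (13) as given.

```python
def moment_exponent_range(d, d0):
    """Return the open interval of p allowed by (9) and (14) together, and a flag
    that is True when (14) adds no restriction beyond (9).

    d  : exponent in (7), must be positive
    d0 : exponent in (13), must exceed 1/2 as required by (12)
    """
    if d <= 0 or d0 <= 0.5:
        raise ValueError("need d > 0 as in (7) and d0 > 1/2 as in (12)")
    lower_9 = 1.0 / (d + 1.0)           # lower end of the interval in (9)
    lower_14 = 4.0 / (2.0 * d0 + 3.0)   # lower end of the interval in (14)
    lower = max(lower_9, lower_14)
    return (lower, 1.0), lower_14 <= lower_9

interval, no_extra_restriction = moment_exponent_range(d=0.25, d0=2.0)
```

The flag is true exactly when $4/(2d_0 + 3) \le 1/(d + 1)$, which rearranges to $d_0 \ge 2d + \tfrac{1}{2}$, the condition stated in the text.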
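Though results of Giraitis, Kokoszka and Leipus [13] indicate existence of a stationary solution of (1)-(3) when $\psi(1; \zeta_0) < 1$, Kazakevicius and Leipus [19] have questioned the existence of strictly stationary FIGARCH processes, and thus the relevance of Assumption E here. The same reservations can be expressed about FGARCH when $a(1; \zeta_0) \ge b(1; \zeta_0)$, and more generally about ARCH(∞) processes with $\psi(1; \zeta_0) \ge 1$. A sufficient condition for (10) can be deduced as follows. Recursive substitution gives

$\sigma_t^2 = \omega_0 + \omega_0 \sum_{l=1}^{\infty} \Big( \sum_{j_1=1}^{\infty} \cdots \sum_{j_l=1}^{\infty} \psi_{0j_1} \cdots \psi_{0j_l}\, \varepsilon_{t-j_1}^2\, \varepsilon_{t-j_1-j_2}^2 \cdots \varepsilon_{t-j_1-\cdots-j_l}^2 \Big)$,

so by the $c_r$-inequality,

$\sigma_t^{2p} \le K + K \sum_{l=1}^{\infty} \Big( \sum_{j_1=1}^{\infty} \cdots \sum_{j_l=1}^{\infty} \psi_{0j_1}^p \cdots \psi_{0j_l}^p\, |\varepsilon_{t-j_1}|^{2p} \cdots |\varepsilon_{t-j_1-\cdots-j_l}|^{2p} \Big)$.

Thus, from Lemma 2,

$E|x_t|^{2p} \le E|\sigma_t|^{2p} \le K + K \sum_{l=0}^{\infty} \Big( E|\varepsilon_0|^{2p} \sum_{j=1}^{\infty} \psi_{0j}^p \Big)^{l}$.

The last bound is finite if and only if

(32)  $E|\varepsilon_0|^{2p} \sum_{j=1}^{\infty} \psi_{0j}^p < 1$.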
Thus, (10) holds if there is a $p$ satisfying (9) and (32). Recursive substitution and the $c_r$-inequality were also used by Nelson ([27], Corollary) to upper-bound $E|\sigma_t|^{2p}$ in the GARCH(1,1) case, but he employed the simple dynamic structure available there, and (32) does not reduce to his necessary and sufficient condition. If $\psi(1; \zeta_0) < 1$, (32) adds nothing because we already know that $E x_0^2 < \infty$ here, but if $\psi(1; \zeta_0) \ge 1$, the second factor on the left-hand side of (32) exceeds 1 and increases with $p$; the question is whether the first factor, which is less than 1 and decreases with $p$ [due to Assumption A($q$)], can over-compensate. Analytic verification of (32) for given $\zeta_0$, $p$ seems in general infeasible, and numerical verification highly problematic when the $\psi_j$ decay slowly. However, consider the family of densities

(33)  $f(\varepsilon) = \exp\big[-\{a(\gamma)|\varepsilon|\}^{1/\gamma}\big] \big/ \{2\gamma\, \Gamma(\gamma)\, a(\gamma)\}$

for $\gamma > 0$, where $a(\gamma) = \{\Gamma(\gamma)/\Gamma(3\gamma)\}^{1/2}$ (also used by Nelson [28] to model the innovation of the exponential GARCH model). We have $E\varepsilon_0 = 0$, $E\varepsilon_0^2 = 1$ as necessary, Assumption A($q$) is satisfied for all $q > 0$, and $E|\varepsilon_0|^{2p} = \Gamma((2p + 1)\gamma)/\{\Gamma(\gamma)^{1-p}\, \Gamma(3\gamma)^{p}\}$. In case $\gamma = 0.5$, (33) is the normal density, for which $\hat\theta_T$ is asymptotically efficient. Here $E|\varepsilon_0|^{2p} = 2^p\, \Gamma(p + 0.5)/\sqrt{\pi}$, and numerical calculations for FIGARCH($0, d_0, 0$) cast doubt on (32). In case $\gamma = 1$, (33) is the Laplace density, with $E|\varepsilon_0|^{2p} = 2^{-p}\, \Gamma(2p + 1)$. As $\gamma$ increases, $E|\varepsilon_0|^{2p}$ can be made small for fixed $p < 1$; for example, with $p = 0.95$, it is 0.64 when $\gamma = 10$ and 0.42 when $\gamma = 20$.
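The moment formula for the family (33) is easy to evaluate numerically, which gives a quick check on the figures quoted above. The following sketch works on the log scale with `math.lgamma` to avoid overflow for large $\gamma$; the function name `ged_abs_moment` is hypothetical.

```python
import math

def ged_abs_moment(p, gamma):
    """E|eps_0|^(2p) = Gamma((2p+1) gamma) / {Gamma(gamma)^(1-p) Gamma(3 gamma)^p}."""
    log_m = (math.lgamma((2 * p + 1) * gamma)
             - (1 - p) * math.lgamma(gamma)
             - p * math.lgamma(3 * gamma))
    return math.exp(log_m)

print(round(ged_abs_moment(0.95, 10), 2))  # 0.64, as stated in the text
print(round(ged_abs_moment(0.95, 20), 2))  # 0.42, as stated in the text
```

Multiplying such a value by a (necessarily truncated) sum $\sum_j \psi_{0j}^p$ would give only a lower bound on the left-hand side of (32), which is one reason numerical verification is delicate when the weights decay slowly.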
4. Generalized exponential and hyperbolic models. FGARCH($n, d_0, m$) [and FIGARCH($n, d_0, m$)] processes require $d_0 \in (0, 1)$. For $d = 1$, (29) reduces to (23), and for $d > 1$, at least one coefficient in the expansion of (23) is negative, leading to the possibility of negative $\psi_j(\zeta)$. Because FGARCH $\psi_j(\zeta)$ decay like $j^{-d-1}$, a large mathematical gap is left relative to GARCH processes. Even if exponential decay is anticipated, there is a case for more direct modeling of the $\psi_j(\zeta)$ than provided by GARCH($n, m$), since it is the $\psi_j(\zeta)$ and their derivatives that must be formed in point and interval estimation based on the PMLE.
Consider the choices

(34)  $\psi_j(\zeta) = \sum_{i=1}^{m} \Gamma(f_i + 1)^{-1}\, e_i\, d^{f_i+1}\, j^{f_i}\, e^{-dj}$,

(35)  $\psi_j(\zeta) = \sum_{i=1}^{m} \Gamma(f_i + 1)^{-1}\, e_i\, d\, \ln^{f_i}(j + 1)\, (j + 1)^{-d-1}$,

where $d = d(\zeta)$ and the $e_i = e_i(\zeta)$, $f_i = f_i(\zeta)$ are such that $\Upsilon$ satisfies

(36)  $d \in (0, \infty)$,

(37)  $e_i > 0, \qquad i = 1, \ldots, m$,

(38)  $0 \le f_1 < \cdots < f_m < \infty$,

with $2m + 1 \ge r$. Given (1)-(4) and (22), we call $x_t$ generated by (34) a generalized exponential, GEXP($m$), process, and $x_t$ generated by (35) a generalized hyperbolic, GHYP($m$), process. Condition (38) is sufficient but not necessary for $\psi_j(\zeta) > 0$, all $j \ge 1$. By choosing $m$ large enough in (34) or (35), any finite $\psi(1; \zeta)$ can be arbitrarily well approximated, but (34) and (35) can also achieve parsimony. For real $x > 1$, $x^f e^{-dx}$ and $(\ln x)^f x^{-d-1}$ decay monotonically if $f = 0$, and for $f > 0$, have single maxima at $f/d$ and $e^{f/(d+1)}$, respectively. Thus, with $m = 1$ and $f_1 = 0$, we have monotonic decay in (34) and (35); otherwise, both can exhibit lack of monotonicity, while eventually decaying exponentially or hyperbolically. The scale factors in (34) and (35) are so expressed because $x^f e^{-dx}$ and $(\ln x)^f x^{-d-1}$ integrate over $(0, \infty)$ to $\Gamma(f + 1)/d^{f+1}$ and $\Gamma(f + 1)/d$, respectively, so that $\psi(1; \zeta) \approx \sum_{i=1}^{m} e_i$ in both cases, but the approximation may not be very close and the "integrated" case is less easy to distinguish than in GARCH and FGARCH models (though it would be possible to alternatively scale the weights by infinite sums to achieve equality).
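A short numerical sketch can illustrate the generalized exponential weights (34), including the location of the hump and the quality of the approximation $\psi(1; \zeta) \approx \sum_i e_i$. The function name `gexp_weights` and the parameter values are hypothetical, and only (34) is covered here.

```python
import math

def gexp_weights(e, f, d, J):
    """psi_1, ..., psi_J from (34): sum_i e_i d^(f_i+1) j^(f_i) exp(-d j) / Gamma(f_i+1)."""
    return [sum(ei * d ** (fi + 1) * j ** fi * math.exp(-d * j) / math.gamma(fi + 1)
                for ei, fi in zip(e, f))
            for j in range(1, J + 1)]

# Hypothetical GEXP(2) parameters: e = (0.4, 0.3), f = (0, 2), d = 0.2.
psi = gexp_weights(e=[0.4, 0.3], f=[0.0, 2.0], d=0.2, J=400)
total = sum(psi)  # to be compared with e_1 + e_2 = 0.7

# Integer location of the maximum of j^f exp(-d j) for f = 2, d = 0.2 (f/d = 10).
peak = max(range(1, 401), key=lambda j: j ** 2.0 * math.exp(-0.2 * j))
```

In this example the sum of the weights falls somewhat short of $e_1 + e_2$, consistent with the remark that the integral approximation need not be very close.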
The following corollary covers (34) and (35) simultaneously, and implies the special case when the $f_i$ are specified a priori, for example, to be nonnegative integers; strictly speaking, when the true value of $f_1$ is unknown, Assumption C prevents it from being zero.

COROLLARY 2. Let $\psi(z; \zeta)$ be given by (22) and (34) or (35) with $m \ge 1$ and let $d$ and the $e_i$, $f_i$ be continuously differentiable. For some $\delta > 0$, let Assumptions A($2 + \delta$), B, C and E hold, with all $\zeta \in \Upsilon$ satisfying (36)-(38) and

$\mathrm{rank}\left\{ \dfrac{\partial (e_1, f_1, \ldots, e_m, f_m, d)}{\partial \zeta} \right\} = r$.

Then (15) is true. Let also $d$ and the $e_i$, $f_i$ be thrice continuously differentiable and Assumption A(4) hold, and $d_0 = d(\zeta_0) > \tfrac{1}{2}$ in case of (35). Then (17) is true.
PROOF. Given (36)-(38) and the proofs of Corollaries 1 and 2, the verification of Assumptions D and F($l$) is straightforward. We check Assumption G for (35) only, a very similar type of proof holding for (34). We have
1
wlE(u\j,...,u'mjy\d^
L vj J
where
= - i= l,...,r,
uu (lnlntf + 1) (d/dfi)lnr(fi+l), l)'ln^(; + 1),
m
= + 1^
Vj -j2eir(fi+iyllnfi+lu
and $E$ is the diagonal matrix whose $(2i - 1)$st diagonal element is $e_i$, and whose even diagonal elements are all 1. Fixing $\zeta$, we show first that the leading $(r - 1) \times (r - 1)$ submatrix of $\Psi_{(j_1, \ldots, j_r)}(\zeta)$ has full rank, equivalently, that $U_m$ has full rank, where, for $i = 1, \ldots, m$, the $(2i) \times (2i)$ matrix $U_i$ has $(k, \ell)$th $2 \times 1$ sub-vector $u_{k j_\ell}$, $k = 1, \ldots, i$, $\ell = 1, \ldots, 2i$. Suppose, for some $i = 1, \ldots, m - 1$ and given $j_1, \ldots, j_{2i}$, that $U_i$ has full rank, and partition the rows and columns of $U_{i+1}$ in the ratio $2i : 2$, calling its $(k, \ell)$th submatrix $U_{k\ell}$ (so $U_{11} = U_i$). Take $j_{2i+2} = j_{2i+1}^2$. Because $\ln \ln x$ strictly increases in $x > 1$, it follows that $U_{22}$ is nonsingular and $\|U_{22}^{-1}\| = O(\ln \ln j_{2i+1}\, \ln^{-f_{i+1}} j_{2i+1})$. Noting that $\|U_{12}\| = O(\ln \ln j_{2i+1}\, \ln^{f_i} j_{2i+1})$, while $U_{11}$ and $U_{21}$ depend only on $j_1, \ldots, j_{2i}$, we can choose $j_{2i+1}$ such that $U_{11} - U_{12} U_{22}^{-1} U_{21}$ differs negligibly from $U_{11}$. Thus, $U_{i+1}$ has full rank. Since, for $f_1 > 0$, $U_1$ has full rank (e.g., when $j_1 = 1$, $j_2 = 2$), it follows by induction that $U_m$ has full rank. Since $v_j$ is dominated by a term of order $\ln^{f_m+1} j$, while $\|u_{ij}\| = O(\ln \ln j\, \ln^{f_i} j)$, a similar argument shows that $j_r$ can then be chosen large enough, to complete verification of Assumption G.
5. Technical lemmas. Define

00 oo
= = 'cou+
af(9) a>+J2 irj(0xf_j, of YJ supV; (?)*,-,
7=1 7=1?6T
LEMMA 1. Under Assumptions B and D, for all $\theta \in \Theta$, $t \in \mathbb{Z}$,
< < a.s.

K~xof(9) a2(9) Kaf(9)
Proof. A simple extension of [21], Lemma 1.
LEMMA 2. Under Assumptions A(2), B, C, D and E, for all $t \in \mathbb{Z}$,
<E < < KE\xt\2p < K,

(39) E\xt\2p < Ea2p mpa2p(9) KEofp
(40) inf a2(9) > 0, sup a2(9) < < oo a.s.,

Kaf
(41) < K.
?sup|lnar2(^)|
PROOF. With respect to (39), the first inequality follows from Jensen's inequality, the second is obvious, the third follows from Lemma 1, the fourth follows from the $c_r$-inequality, (7) and (9), while the last one is due to (10). The proof of (40) uses Lemma 1, $\sigma_t^2(\theta) \ge \omega_L$, (10) and [23], page 121. To prove (41), $|\ln x| \le x + x^{-1}$ for $x > 0$ and Lemma 2 give

$$E\sup_{\theta\in\Theta}\bigl|\ln\sigma_t^{2}(\theta)\bigr| \le \rho^{-1}E\sup_{\theta\in\Theta}\sigma_t^{2\rho}(\theta) + \rho^{-1}E\Bigl\{\inf_{\theta\in\Theta}\sigma_t^{2\rho}(\theta)\Bigr\}^{-1} \le K.$$
LEMMA 3. Under Assumptions D, E and F(l), for all $\theta \in \Theta$, $\sigma_t^2(\theta)$, $q_t(\theta)$ and their first $l$ derivatives are strictly stationary and ergodic.

PROOF. Follows straightforwardly from the assumptions.
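LEMMA 4. Under Assumption A(2), for positive integer $k < (b+1)n/2$,

$$E\Bigl(\sum_{j=1}^{n}\varepsilon_j^{2}\Bigr)^{-k} < \infty. \tag{42}$$

PROOF. Denote by $M_X(t) = E(e^{tX})$ the moment-generating function of a random variable $X$. By Cressie et al. [8], the left-hand side of (42) is proportional to

$$\int_0^{\infty} t^{k-1} M_{\sum_j \varepsilon_j^2}(-t)\,dt = \int_0^{\infty} t^{k-1} M_{\varepsilon^2}^{n}(-t)\,dt \le \int_0^{1} t^{k-1}\,dt + \int_1^{\infty} t^{k-1} M_{\varepsilon^2}^{n}(-t)\,dt. \tag{43}$$

It suffices to show that the last integral is bounded. For all $\delta > 0$, there exists $\eta > 0$ such that $L(\varepsilon^{-1}) \le \varepsilon^{-\delta}$, $\varepsilon \in (0, \eta)$, so

$$M_{\varepsilon^2}(-t) = \int_{-\infty}^{\infty} e^{-t\varepsilon^2} f(\varepsilon)\,d\varepsilon \le K\int_0^{\eta} e^{-t\varepsilon^2}\varepsilon^{b-\delta}\,d\varepsilon + 2e^{-t\eta^2}.$$

The last integral is bounded by

$$Kt^{(\delta-b-1)/2}\int_0^{\infty} e^{-\varepsilon}\varepsilon^{(b-\delta-1)/2}\,d\varepsilon \le Kt^{(\delta-b-1)/2}.$$

Thus, (43) is finite if $k + n(\delta - b - 1)/2 < 0$, that is, since $\delta$ is arbitrary, if $k < (b+1)n/2$. □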
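
The identity from Cressie et al. [8] used above, $E(X^{-k}) = \Gamma(k)^{-1}\int_0^\infty t^{k-1}M_X(-t)\,dt$, can be checked numerically in a case where everything is known in closed form. The following is only an illustrative sketch under an added assumption: the $\varepsilon_j$ are taken to be standard Gaussian (so $b = 0$ and $\sum_{j=1}^n \varepsilon_j^2$ is chi-squared with $n$ degrees of freedom), which the paper does not require. The function names and the midpoint quadrature are choices made for this example.

```python
import math

def negative_moment_via_mgf(k, mgf_neg, n_grid=200_000):
    """Approximate E[X^{-k}] = (1/Gamma(k)) * int_0^inf t^(k-1) M_X(-t) dt.

    mgf_neg(t) must return M_X(-t) = E[exp(-t X)] for t >= 0.
    The half-line is mapped to (0, 1) by t = s / (1 - s) and the
    integral is approximated with the midpoint rule.
    """
    h = 1.0 / n_grid
    total = 0.0
    for i in range(n_grid):
        s = (i + 0.5) * h
        t = s / (1.0 - s)
        jacobian = 1.0 / (1.0 - s) ** 2
        total += t ** (k - 1) * mgf_neg(t) * jacobian
    return total * h / math.gamma(k)

def chi2_mgf_neg(n):
    # Assumption for this example: eps_j are N(0, 1), so the sum of n
    # squares has M(-t) = (1 + 2 t)^(-n/2).
    return lambda t: (1.0 + 2.0 * t) ** (-n / 2.0)

n = 5  # with b = 0 the lemma allows integer k < n / 2, i.e. k = 1, 2
first = negative_moment_via_mgf(1, chi2_mgf_neg(n))
second = negative_moment_via_mgf(2, chi2_mgf_neg(n))

# Closed forms for a chi-squared variable with n degrees of freedom.
print(first, 1.0 / (n - 2))
print(second, 1.0 / ((n - 2) * (n - 4)))
```

For $k \ge n/2$ the integrand $t^{k-1}(1+2t)^{-n/2}$ is no longer integrable at infinity, which is the Gaussian instance of the condition $k < (b+1)n/2$.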
The previous version of the paper included a longer, independently obtained, proof of the following lemma which we have been able to shorten in one respect by using an idea of Berkes, Horvath and Kokoszka [3] in a corresponding lemma covering the GARCH(n, m) case.
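LEMMA 5. Under Assumptions A(q), B, C and D, for $p < q/2$,

$$E\sup_{\theta\in\Theta}\Bigl(\frac{\sigma_t^{2}}{\sigma_t^{2}(\theta)}\Bigr)^{p} \le K < \infty.$$

PROOF. We have

$$\sigma_t^{2} = \omega_0 + \psi_{01}x_{t-1}^{2} + \sum_{j=2}^{\infty}\psi_{0j}x_{t-j}^{2} \le \omega_0 + \psi_{01}\sigma_{t-1}^{2}\varepsilon_{t-1}^{2} + K\sigma_{t-1}^{2}$$

from (8). Thus, $\sigma_t^{2}/\sigma_{t-1}^{2} \le K(1 + \varepsilon_{t-1}^{2})$ and thence, for fixed $j \ge 1$, $\sigma_t^{2}/\sigma_{t-j}^{2} \le Kh_{tj}$, where $h_{tj} = \prod_{i=1}^{j}(1 + \varepsilon_{t-i}^{2})$. For any $M < \infty$,

$$\frac{\sigma_t^{2}}{\sigma_t^{2}(\theta)} \le K\frac{\sigma_t^{2}}{\sigma_t^{*2}(\theta)} \le K\Bigl\{\frac{\omega}{\sigma_t^{2}} + \sum_{j=1}^{M}\psi_j(\zeta)\varepsilon_{t-j}^{2}\frac{\sigma_{t-j}^{2}}{\sigma_t^{2}}\Bigr\}^{-1} \le \frac{Kh_{tM}}{\bigl\{\inf_{\zeta\in\Upsilon}\inf_{j=1,\dots,M}\psi_j(\zeta)\bigr\}\sum_{j=1}^{M}\varepsilon_{t-j}^{2}}.$$

The proof can now be completed much as in the proof of Lemma 5.1 of [3], using Hölder's inequality as there but employing our Lemma 4 and taking $M > 2pq/[(b+1)(q-2p)]$. □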
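LEMMA 6. Under Assumptions A(2), B, C, D, E and F(l), for all $p > 0$ and $k \le l$,

$$E\sup_{\theta\in\Theta}\Bigl|\frac{1}{\sigma_t^{2}(\theta)}\frac{\partial^{k}\sigma_t^{2}(\theta)}{\partial\theta_{i_1}\cdots\partial\theta_{i_k}}\Bigr|^{p} < \infty, \tag{44}$$

$$E\sup_{\theta\in\Theta}\Bigl|\frac{1}{\bar\sigma_t^{2}(\theta)}\frac{\partial^{k}\bar\sigma_t^{2}(\theta)}{\partial\theta_{i_1}\cdots\partial\theta_{i_k}}\Bigr|^{p} < \infty. \tag{45}$$

PROOF. Take $i_1 \le i_2 \le \cdots \le i_k$. First assume $i_1 \ge 3$, whence

$$\frac{\partial^{k}\sigma_t^{2}(\theta)}{\partial\theta_{i_1}\cdots\partial\theta_{i_k}} = \sum_{j=1}^{\infty}\xi_j(\zeta)x_{t-j}^{2}(\mu),$$

where $\xi_j(\zeta) = \partial^{k}\psi_j(\zeta)/\partial\zeta_{i_1-2}\cdots\partial\zeta_{i_k-2}$. Now

$$\Bigl|\sum_{j=1}^{\infty}\xi_j(\zeta)x_{t-j}^{2}(\mu)\Bigr| \le K\sum_{j=1}^{\infty}|\xi_j(\zeta)|x_{t-j}^{2} + K\sum_{j=1}^{\infty}|\xi_j(\zeta)|,$$

so using Lemma 1,

$$\Bigl|\frac{1}{\sigma_t^{2}(\theta)}\frac{\partial^{k}\sigma_t^{2}(\theta)}{\partial\theta_{i_1}\cdots\partial\theta_{i_k}}\Bigr| \le K\frac{\sum_{j=1}^{\infty}|\xi_j(\zeta)|x_{t-j}^{2}}{\sigma_t^{*2}(\theta)} + K\sum_{j=1}^{\infty}|\xi_j(\zeta)|.$$

It suffices to take $p > 1$. By Hölder's inequality,

$$\sum_{j=1}^{\infty}|\xi_j(\zeta)|x_{t-j}^{2} \le \Bigl\{\sum_{j=1}^{\infty}|\xi_j(\zeta)|^{p/\rho}\psi_j(\zeta)^{1-p/\rho}x_{t-j}^{2}\Bigr\}^{\rho/p}\Bigl\{\sum_{j=1}^{\infty}\psi_j(\zeta)x_{t-j}^{2}\Bigr\}^{1-\rho/p},$$

so

$$\Bigl\{\frac{\sum_{j=1}^{\infty}|\xi_j(\zeta)|x_{t-j}^{2}}{\sigma_t^{*2}(\theta)}\Bigr\}^{p} \le K\Bigl\{\sum_{j=1}^{\infty}|\xi_j(\zeta)|^{p/\rho}\psi_j(\zeta)^{1-p/\rho}x_{t-j}^{2}\Bigr\}^{\rho}.$$

By Assumption F(l), for all $\eta > 0$,

$$\sup_{\zeta\in\Upsilon}|\xi_j(\zeta)|^{p}\psi_j(\zeta)^{\rho-p} \le K\sup_{\zeta\in\Upsilon}\psi_j(\zeta)^{\rho-p\eta} \le Kj^{-(\underline{d}+1)(\rho-p\eta)}.$$

Since $\rho(\underline{d}+1) > 1$, we may choose $\eta$ such that $(\underline{d}+1)(\rho - p\eta) > 1$. Thus,

$$E\sup_{\theta\in\Theta}\Bigl\{\frac{\sum_{j=1}^{\infty}|\xi_j(\zeta)|x_{t-j}^{2}}{\sigma_t^{*2}(\theta)}\Bigr\}^{p} < \infty.$$

The above proof implies that also

$$\sup_{\zeta\in\Upsilon}\Bigl\{\sum_{j=1}^{\infty}|\xi_j(\zeta)|\Bigr\}^{p} < \infty,$$

whence the proof of (44) with $i_1 \ge 3$ is concluded. Next take $i_1 = 2$. If $i_2 > 2$,

$$\frac{\partial^{k}\sigma_t^{2}(\theta)}{\partial\theta_{i_1}\cdots\partial\theta_{i_k}} = -2\sum_{j=1}^{\infty}\xi_j(\zeta)x_{t-j}(\mu), \tag{46}$$

where now $\xi_j(\zeta) = \partial^{k-1}\psi_j(\zeta)/\partial\zeta_{i_2-2}\cdots\partial\zeta_{i_k-2}$, while if $i_2 = 2$, $i_3 > 2$,

$$\frac{\partial^{k}\sigma_t^{2}(\theta)}{\partial\theta_{i_1}\cdots\partial\theta_{i_k}} = 2\sum_{j=1}^{\infty}\xi_j(\zeta),$$

where now $\xi_j(\zeta) = \partial^{k-2}\psi_j(\zeta)/\partial\zeta_{i_3-2}\cdots\partial\zeta_{i_k-2}$. In the first of these cases the proof is seen to be very similar to that above after noting that, by the Cauchy inequality, (46) is bounded by

$$K\Bigl\{\sum_{j=1}^{\infty}|\xi_j(\zeta)|x_{t-j}^{2}\Bigr\}^{1/2}\Bigl\{\sum_{j=1}^{\infty}|\xi_j(\zeta)|\Bigr\}^{1/2} + K\sum_{j=1}^{\infty}|\xi_j(\zeta)|,$$

while in the second it is more immediate; we thus omit the details. We are left with the cases $i_1 = i_2 = i_3 = 2$ and $i_1 = 1$, both of which are trivial. The details for (45) are very similar (the truncations in numerator and denominator match), and are thus omitted.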
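Define

$$g_t(\theta) = u_t(\theta)u_t'(\theta), \qquad G_T(\theta) = T^{-1}\sum_{t=1}^{T}g_t(\theta).$$

LEMMA 7. For some $\delta > 0$, under Assumptions A(2 + δ), B, C, D, E and F(1),

$$\sup_{\theta\in\Theta}|Q_T(\theta) - Q(\theta)| \to 0 \quad \text{a.s. as } T \to \infty, \tag{47}$$

and $Q(\theta)$ is continuous in $\theta$. If also Assumption F(2) holds,

$$\sup_{\theta\in\Theta}\|G_T(\theta) - G(\theta)\| \to 0 \quad \text{a.s. as } T \to \infty, \tag{48}$$

and $G(\theta)$ is continuous in $\theta$. If also Assumption F(3) holds,

$$\sup_{\theta\in\Theta}\|H_T(\theta) - H(\theta)\| \to 0 \quad \text{a.s. as } T \to \infty, \tag{49}$$

and $H(\theta)$ is continuous in $\theta$.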
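PROOF. To prove (47), note first that, by Lemmas 1, 2, 3 and 5,

$$\sup_{\theta\in\Theta}E|q_0(\theta)| \le \sup_{\theta\in\Theta}E|\ln\sigma_0^{2}(\theta)| + \sup_{\theta\in\Theta}E\chi_0(\theta) < \infty.$$

Thus, by ergodicity

$$Q_T(\theta) \to Q(\theta) \quad \text{a.s.},$$

for all $\theta \in \Theta$. Then uniform convergence follows on establishing the equicontinuity property

$$\sup_{\tilde\theta:\,\|\tilde\theta-\theta\|<\varepsilon}|Q_T(\tilde\theta) - Q_T(\theta)| \to 0 \quad \text{a.s.},$$

as $\varepsilon \to 0$, and continuity of $Q(\theta)$. By the mean value theorem it suffices to show that

$$\sup_{\theta\in\Theta}\Bigl\|\frac{\partial Q_T(\theta)}{\partial\theta}\Bigr\| + \sup_{\theta\in\Theta}\Bigl\|\frac{\partial Q(\theta)}{\partial\theta}\Bigr\| < \infty \quad \text{a.s.},$$

which, by Loève ([23], page 121) and identity of distribution, is implied by $E\sup_{\Theta}\|u_0(\theta)\| < \infty$. By the definition of $u_t(\theta)$, and $x_t^{2}(\mu) \le K(x_t^{2} + 1)$, $\|\nu_t(\theta)\| \le K(|x_t| + 1)$, we have

$$\|u_t(\theta)\| \le K\Bigl\{\|\tau_t(\theta)\|\Bigl(1 + \frac{x_t^{2}}{\sigma_t^{2}(\theta)}\Bigr) + \frac{x_t^{2}}{\sigma_t^{2}(\theta)} + 1\Bigr\}.$$

Thus, $E\sup_{\Theta}\|u_0(\theta)\|$ is bounded by a constant times

$$E\sup_{\theta\in\Theta}\|\tau_0(\theta)\| + \Bigl[E\sup_{\theta\in\Theta}\Bigl\{\frac{\sigma_0^{2}}{\sigma_0^{2}(\theta)}\Bigr\}^{p}\Bigr]^{1/p}\Bigl[E\sup_{\theta\in\Theta}\|\tau_0(\theta)\|^{p/(p-1)}\Bigr]^{1-1/p} + E\sup_{\theta\in\Theta}\Bigl\{\frac{\sigma_0^{2}}{\sigma_0^{2}(\theta)}\Bigr\} + 1$$

for all $p > 1$. On choosing $p < 1 + \delta/2$, this is finite by Lemmas 5 and 6. (Our use of Lemmas 5 and 6 is similar to Berkes, Horvath and Kokoszka's [3] use of their Lemmas 5.1 and 5.2 in the GARCH(n, m) case.) This completes the proof of (47). Then (48) and (49) follow by applying analogous arguments to those above, and so we omit the details; indeed, (48) and (49) are only used in the proof of consistency of $G_T(\theta_T)$, $H_T(\theta_T)$ for $G_0$, $H_0$, where convergence over only a neighborhood of $\theta_0$ would suffice.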
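LEMMA 8. Under Assumptions A(2 + δ), B, C, D, E and F(1),

$$\sup_{\theta\in\Theta}|Q_T(\theta) - \bar Q_T(\theta)| \to 0 \quad \text{a.s. as } T \to \infty. \tag{50}$$

If also Assumption F(2) holds,

$$\sup_{\theta\in\Theta}\|G_T(\theta) - \bar G_T(\theta)\| \to 0 \quad \text{a.s. as } T \to \infty. \tag{51}$$

If also Assumption F(3) holds,

$$\sup_{\theta\in\Theta}\|H_T(\theta) - \bar H_T(\theta)\| \to 0 \quad \text{a.s. as } T \to \infty. \tag{52}$$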
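PROOF. We have $Q_T(\theta) - \bar Q_T(\theta) = A_T(\theta) + B_T(\theta)$, where

$$A_T(\theta) = T^{-1}\sum_{t=1}^{T}\ln\Bigl[\frac{\sigma_t^{2}(\theta)}{\bar\sigma_t^{2}(\theta)}\Bigr], \qquad B_T(\theta) = T^{-1}\sum_{t=1}^{T}x_t^{2}(\mu)\bigl\{\sigma_t^{-2}(\theta) - \bar\sigma_t^{-2}(\theta)\bigr\}.$$

Because

$$\sigma_t^{2}(\theta) = \bar\sigma_t^{2}(\theta) + \sum_{j=0}^{\infty}\psi_{j+t}(\zeta)x_{-j}^{2}(\mu),$$

$\ln(1 + x) \le x$ for $x > 0$ and $\bar\sigma_t^{2}(\theta) \ge \omega_L > 0$, it follows that

$$|A_T(\theta)| \le KT^{-1}\sum_{t=1}^{T}\bigl\{\sigma_t^{2}(\theta) - \bar\sigma_t^{2}(\theta)\bigr\} \le KT^{-1}\sum_{t=1}^{T}\sum_{j=t}^{\infty}\psi_j(\zeta)x_{t-j}^{2}(\mu) \le KT^{-1}\sum_{t=0}^{\infty}\Bigl\{\sum_{j=t+1}^{t+T}\psi_j(\zeta)\Bigr\}x_{-t}^{2}(\mu).$$

Now from (7),

$$\sup_{\zeta\in\Upsilon}\sum_{j=t+1}^{t+T}\psi_j(\zeta) \le K\sum_{j=t+1}^{t+T}j^{-\underline{d}-1} \le K\min(t+1, T)(t+1)^{-\underline{d}-1}.$$

Thus,

$$\sup_{\theta\in\Theta}|A_T(\theta)| \le KT^{-1}\sum_{t=0}^{T}(t+1)^{-\underline{d}}(x_{-t}^{2} + 1) + K\sum_{t=T}^{\infty}t^{-\underline{d}-1}(x_{-t}^{2} + 1). \tag{53}$$

From the $c_r$-inequality, (9) and (10), $\sum_{t=1}^{\infty}(t+1)^{-\underline{d}-1}x_{-t}^{2}$ has finite $\rho$th moment, and thus, by Loève ([23], page 121), is a.s. finite. Thus, the second term of (53) tends to zero a.s. as $T \to \infty$, while the first does so for the same reasons combined with the Kronecker lemma. Next,

$$|B_T(\theta)| \le KT^{-1}\sum_{t=1}^{T}\chi_t(\theta)\sum_{j=t}^{\infty}\psi_j(\zeta)x_{t-j}^{2}(\mu) \le KT^{-1}\sum_{t=1}^{T}\chi_t(\theta)\sum_{j=t}^{\infty}j^{-\underline{d}-1}(x_{t-j}^{2} + 1). \tag{54}$$

From previous remarks, $\sum_{j=t}^{\infty}j^{-\underline{d}-1}(x_{t-j}^{2} + 1) \to 0$ a.s. Also, for each $\theta$, a.s.

$$T^{-1}\sum_{t=1}^{T}\chi_t(\theta) \to E\chi_0(\theta) \le KE\Bigl\{\frac{\sigma_0^{2}}{\sigma_0^{2}(\theta)}\Bigr\} + K \le K$$

by ergodicity and Lemma 5. Thus, (54) $\to 0$ a.s. by the Toeplitz lemma. The convergence is uniform in $\theta$ because, from the proof of Lemma 7, for all $\theta \in \Theta$,

$$\sup_{\tilde\theta:\,\|\tilde\theta-\theta\|<\varepsilon}|\chi_0(\tilde\theta) - \chi_0(\theta)| \to 0 \quad \text{a.s.},$$

as $\varepsilon \to 0$. This completes the proof of (50). We omit the proofs of (51) and (52) as they involve the same kind of arguments.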
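
The quantity driving the proof above is the gap $\sigma_t^2(\theta) - \bar\sigma_t^2(\theta) = \sum_{j \ge t}\psi_j(\zeta)x_{t-j}^2(\mu)$ between the infinite-past and truncated conditional variances. The sketch below illustrates that gap on simulated data. Everything specific in it is an assumption made for illustration and not taken from the paper: the weights $\psi_j = c\,j^{-d-1}$ with $c = 0.2$, $d = 0.5$, Gaussian innovations, $\mu = \mu_0 = 0$, and a finite presample of 300 values standing in for the infinite past.

```python
import math
import random

def hyperbolic_weights(c, d, n_lags):
    # Hypothetical weights psi_j = c * j^(-d-1); the paper only needs a
    # hyperbolic upper bound of this type, not this exact form.
    return [c * j ** (-d - 1.0) for j in range(1, n_lags + 1)]

def simulate(omega, psi, n_total, rng):
    # x_t = sigma_t * eps_t with Gaussian eps_t, started from an empty past.
    x = []
    for t in range(n_total):
        var = omega
        for j in range(1, t + 1):
            var += psi[j - 1] * x[t - j] ** 2
        x.append(math.sqrt(var) * rng.gauss(0.0, 1.0))
    return x

def truncated_and_full(omega, psi, x, n_pre):
    # Sample dates t = 1..T are stored after n_pre presample values.
    # "truncated" uses lags 1..t-1 only; "full" adds the presample lags.
    n_obs = len(x) - n_pre
    truncated, full = [], []
    for t in range(1, n_obs + 1):
        idx = n_pre + t - 1
        var = omega
        for j in range(1, t):
            var += psi[j - 1] * x[idx - j] ** 2
        truncated.append(var)
        for j in range(t, idx + 1):
            var += psi[j - 1] * x[idx - j] ** 2
        full.append(var)
    return truncated, full

rng = random.Random(0)
n_pre, n_obs, omega = 300, 200, 1.0
psi = hyperbolic_weights(0.2, 0.5, n_pre + n_obs)
x = simulate(omega, psi, n_pre + n_obs, rng)
truncated, full = truncated_and_full(omega, psi, x, n_pre)
gaps = [f - b for f, b in zip(full, truncated)]

print("gap at t = 1, 2, 3:     ", gaps[:3])
print("gap at t = T-2, T-1, T: ", gaps[-3:])
```

The gaps are nonnegative by construction; how quickly they shrink with $t$ depends on the assumed decay rate $d$, in line with the role of $\underline{d}$ in (53) and (54).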
LEMMA 9. For some $\delta > 0$, under Assumptions A(2 + δ), B, C, D, E, F(1) and G, $M(\theta)$ is finite and positive definite for all $\theta \in \Theta$.
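PROOF. Fix $\theta \in \Theta$. Finiteness of $M(\theta)$ follows from Lemma 6. Positive definiteness follows (by an argument similar to that of Lumsdaine [24] in the GARCH(1, 1) case) if, for all nonnull $(r + 2) \times 1$ vectors $\lambda$, $\lambda'M(\theta)\lambda = E\{\lambda'\tau_0(\theta)\}^{2} > 0$, that is, that

$$\lambda'\tau_0(\theta)\sigma_0^{2}(\theta) \ne 0 \quad \text{a.s.}, \tag{55}$$

since $0 < \sigma_0^{2}(\theta) < \infty$ a.s. Define

$$\tau_{t\omega}(\theta) = \frac{\partial}{\partial\omega}\ln\sigma_t^{2}(\theta) = \sigma_t^{-2}(\theta),$$

$$\tau_{t\mu}(\theta) = \frac{\partial}{\partial\mu}\ln\sigma_t^{2}(\theta) = -2\sigma_t^{-2}(\theta)\sum_{j=1}^{\infty}\psi_j(\zeta)x_{t-j}(\mu),$$

$$\tau_{t\zeta}(\theta) = \frac{\partial}{\partial\zeta}\ln\sigma_t^{2}(\theta) = \sigma_t^{-2}(\theta)\sum_{j=1}^{\infty}\psi_j^{(1)}(\zeta)x_{t-j}^{2}(\mu),$$

so that $\tau_t(\theta) = (\tau_{t\omega}(\theta), \tau_{t\mu}(\theta), \tau_{t\zeta}'(\theta))'$. Write $\lambda = (\lambda_1, \lambda_2, \lambda_3')'$, where $\lambda_1$ and $\lambda_2$ are scalar and $\lambda_3$ is $r \times 1$. Consider first the case $\lambda_1 = \lambda_2 = 0$, $\lambda_3 \ne 0$. Suppose (55) does not hold. Then we must have

$$\sum_{j=1}^{\infty}\lambda_3'\psi_j^{(1)}(\zeta)x_{t-j}^{2}(\mu) = 0 \quad \text{a.s.}$$

If $\lambda_3'\psi_1^{(1)}(\zeta) \ne 0$, it follows that

$$(\sigma_{t-1}\varepsilon_{t-1} + \mu_0 - \mu)^{2} = -\{\lambda_3'\psi_1^{(1)}(\zeta)\}^{-1}\sum_{j=2}^{\infty}\lambda_3'\psi_j^{(1)}(\zeta)x_{t-j}^{2}(\mu). \tag{56}$$

Since $\sigma_{t-1} > 0$ a.s., the left-hand side involves the nondegenerate random variable $\varepsilon_{t-1}$, which is independent of the right-hand side, so (56) cannot hold. Thus, $\lambda_3'\psi_1^{(1)}(\zeta) = 0$. Repeated application of this argument indicates that, for all $\zeta$, $\lambda_3'\psi_j^{(1)}(\zeta) = 0$, $j = 1, \dots, j_r(\zeta)$. This is contradicted by Assumption G, so (55) must hold in this case. Next consider the case $\lambda_1 = 0$, $\lambda_2 \ne 0$, $\lambda_3 = 0$. If (55) does not hold, we must have

$$\sum_{j=1}^{\infty}\psi_j(\zeta)x_{t-j}(\mu) = 0 \quad \text{a.s.} \tag{57}$$

Let $k$ be the smallest integer such that $\psi_k(\zeta) \ne 0$. Then (57) implies

$$\varepsilon_{t-k} = \sigma_{t-k}^{-1}\Bigl\{\mu - \mu_0 - \psi_k^{-1}(\zeta)\sum_{j=k+1}^{\infty}\psi_j(\zeta)x_{t-j}(\mu)\Bigr\}.$$

But the left-hand side is nondegenerate and independent of the right-hand side, so (57) cannot hold. Next consider the case $\lambda_1 = 0$, $\lambda_2 \ne 0$, $\lambda_3 \ne 0$. If (55) is not true, then, taking $\lambda_2 = 1$, we must have

$$\sum_{j=1}^{\infty}\bigl\{\lambda_3'\psi_j^{(1)}(\zeta)x_{t-j}(\mu) - 2\psi_j(\zeta)\bigr\}x_{t-j}(\mu) = 0 \quad \text{a.s.} \tag{58}$$

Let $k$ be the smallest integer such that either $\lambda_3'\psi_k^{(1)}(\zeta) \ne 0$ or $\psi_k(\zeta) \ne 0$; the preceding argument indicates that there exists such $k$. Then we have

$$\bigl\{\lambda_3'\psi_k^{(1)}(\zeta)(\sigma_{t-k}\varepsilon_{t-k} + \mu_0 - \mu) - 2\psi_k(\zeta)\bigr\}\{\sigma_{t-k}\varepsilon_{t-k} + \mu_0 - \mu\} = -\sum_{j=k+1}^{\infty}\bigl\{\lambda_3'\psi_j^{(1)}(\zeta)x_{t-j}(\mu) - 2\psi_j(\zeta)\bigr\}x_{t-j}(\mu) \quad \text{a.s.}$$

The left-hand side is a.s. nonzero and involves the nondegenerate random variable $\varepsilon_{t-k}$, which is independent of the right-hand side, so (58) cannot hold. We are left with the cases where $\lambda_1 \ne 0$. Taking $\lambda_1 = 1$ and noting that $\sigma_t^{2}(\theta)\tau_{t\omega}(\theta) \equiv 1$, the preceding arguments indicate that there exist no $\lambda_2$ and $\lambda_3$ such that

$$\lambda_2\sigma_t^{2}(\theta)\tau_{t\mu}(\theta) + \lambda_3'\sigma_t^{2}(\theta)\tau_{t\zeta}(\theta) = -1 \quad \text{a.s.}$$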
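LEMMA 10. For some $\delta > 0$, under Assumptions A(2 + δ), B, C, D, E, F(1) and H,

$$\inf Q(\theta) > Q(\theta_0).$$

PROOF. We have

$$Q(\theta) - Q(\theta_0) = E\Bigl\{\frac{\sigma_0^{2}}{\sigma_0^{2}(\theta)} - \ln\Bigl[\frac{\sigma_0^{2}}{\sigma_0^{2}(\theta)}\Bigr] - 1\Bigr\} + (\mu - \mu_0)^{2}E\Bigl\{\frac{1}{\sigma_0^{2}(\theta)}\Bigr\}.$$

The second term on the right-hand side is zero only when $\mu = \mu_0$ and is positive otherwise. Because $x - \ln x - 1 \ge 0$ for $x > 0$, with equality only when $x = 1$, it remains to rule out

$$\ln\sigma_0^{2}(\theta) = \ln\sigma_0^{2} \quad \text{a.s., some } \theta \ne \theta_0. \tag{59}$$

By the mean value theorem, (59) implies that $(\theta - \theta_0)'\tau_0(\bar\theta) = 0$ a.s., for $\theta \ne \theta_0$ and some $\bar\theta$ such that $\|\bar\theta - \theta_0\| \le \|\theta - \theta_0\|$. But by Lemma 9 there is no such $\bar\theta$. □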
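
The decomposition of $Q(\theta) - Q(\theta_0)$ in the proof above can be illustrated with a small numerical sketch. The helper names `divergence` and `q_gap` and the input values are hypothetical, introduced only for this example; the code replaces the expectations by averages over supplied variance values rather than reproducing any computation from the paper.

```python
import math

def divergence(ratio):
    # x - ln x - 1, which is >= 0 for x > 0 with equality only at x = 1.
    return ratio - math.log(ratio) - 1.0

def q_gap(true_var, model_var, mu, mu0):
    # Sample analogue of Q(theta) - Q(theta_0): the average of
    # divergence(sigma_0^2 / sigma_0^2(theta)) plus
    # (mu - mu0)^2 times the average of 1 / sigma_0^2(theta).
    n = len(true_var)
    first = sum(divergence(s0 / s) for s0, s in zip(true_var, model_var)) / n
    second = (mu - mu0) ** 2 * sum(1.0 / s for s in model_var) / n
    return first + second

true_var = [1.0, 2.0, 0.5]            # hypothetical values of sigma_0^2
scaled = [1.5 * s for s in true_var]  # a misspecified variance path

print(q_gap(true_var, true_var, 0.0, 0.0))  # both terms vanish
print(q_gap(true_var, scaled, 0.0, 0.0))    # first term is positive
print(q_gap(true_var, true_var, 0.3, 0.0))  # second term is positive
```

Both terms are nonnegative, so the gap can vanish only if the variance ratio equals one and $\mu = \mu_0$, which is the situation the proof excludes for $\theta \ne \theta_0$ via Lemma 9.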
Acknowledgments. We thank an Associate Editor and referees for a number

of helpful comments that have led to a considerable improvement in the paper, and
Fabrizio Iacone for help with the numerical calculations referred to in Section 3.
REFERENCES
[1] Baillie, R. T., Bollerslev, T. and Mikkelsen, H. O. (1996). Fractionally integrated generalized autoregressive conditional heteroskedasticity. J. Econometrics 74 3-30. MR1409033
[2] Basrak, B., Davis, R. A. and Mikosch, T. (2002). Regular variation of GARCH processes. Stochastic Process. Appl. 99 95-115. MR1894253
[3] Berkes, I., Horvath, L. and Kokoszka, P. (2003). GARCH processes: Structure and estimation. Bernoulli 9 201-227. MR1997027
[4] Bollerslev, T. (1986). Generalized autoregressive conditional heteroscedasticity. J. Econometrics 31 307-327. MR0853051
[5] Breidt, F. J., Crato, N. and de Lima, P. (1998). The detection and estimation of long memory in stochastic volatility. J. Econometrics 83 325-348. MR1613867
[6] Brown, B. (1971). Martingale central limit theorems. Ann. Math. Statist. 42 59-66. MR0290428
[7] Comte, F. and Lieberman, O. (2003). Asymptotic theory for multivariate GARCH processes. J. Multivariate Anal. 84 61-84. MR1965823
[8] Cressie, N., Davis, A., Folks, J. L. and Policello, G., II (1981). The moment-generating function and negative integer moments. Amer. Statist. 35 148-150. MR0632425
[9] Ding, Z. and Granger, C. W. J. (1996). Modelling volatility persistence of speculative returns: A new approach. J. Econometrics 73 185-215. MR1410004
[10] Ding, Z., Granger, C. W. J. and Engle, R. F. (1993). A long memory property of stock market returns and a new model. J. Empirical Finance 1 83-106.
[11] Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50 987-1007. MR0666121
[12] Francq, C. and Zakoian, J.-M. (2004). Maximum likelihood estimation of pure GARCH and ARMA-GARCH processes. Bernoulli 10 605-637. MR2076065
[13] Giraitis, L., Kokoszka, P. and Leipus, R. (2000). Stationary ARCH models: Dependence structure and central limit theorem. Econometric Theory 16 3-22. MR1749017
[14] Giraitis, L. and Robinson, P. M. (2001). Whittle estimation of ARCH models. Econometric Theory 17 608-631. MR1841822
[15] Harvey, A. C. (1998). Long memory in stochastic volatility. In Forecasting Volatility in the Financial Markets (J. Knight and S. Satchell, eds.) 307-320. Butterworth-Heinemann, Oxford.
[16] Jeantheau, T. (1998). Strong consistency of estimators for multivariate ARCH models. Econometric Theory 14 70-86. MR1613694
[17] Jennrich, R. I. (1969). Asymptotic properties of non-linear least squares estimators. Ann. Math. Statist. 40 633-643. MR0238419
[18] Kazakevicius, V. and Leipus, R. (2002). On stationarity in the ARCH(∞) model. Econometric Theory 18 1-16. MR1885347
[19] Kazakevicius, V. and Leipus, R. (2003). A new theorem on the existence of invariant distributions with applications to ARCH processes. J. Appl. Probab. 40 147-162. MR1953772
[20] Koulikov, D. (2003). Long memory ARCH(∞) models: Specification and quasi-maximum likelihood estimation. Working Paper 163, Centre for Analytical Finance, Univ. Aarhus. Available at www.cls.dk/caf/wp/wp-163.pdf.
[21] Lee, S. and Hansen, B. (1994). Asymptotic theory for the GARCH(1,1) quasi-maximum likelihood estimator. Econometric Theory 10 29-52. MR1279689
[22] Ling, S. and McAleer, M. (2003). Asymptotic theory for a vector ARMA-GARCH model. Econometric Theory 19 280-310. MR1966031
[23] Loeve, M. (1977). Probability Theory 1, 4th ed. Springer, New York. MR0651017
[24] Lumsdaine, R. (1996). Consistency and asymptotic normality of the quasi-maximum likelihood estimator in IGARCH(1,1) and covariance stationary GARCH(1,1) models. Econometrica 64 575-596. MR1385558
[25] Mikosch, T. and Starica, C. (2000). Limit theory for the sample autocorrelations and extremes of a GARCH(1,1) process. Ann. Statist. 28 1427-1451. MR1805791
[26] Mikosch, T. and Straumann, D. (2002). Whittle estimation in a heavy-tailed GARCH(1,1) model. Stochastic Process. Appl. 100 187-222. MR1919613
[27] Nelson, D. (1990). Stationarity and persistence in the GARCH(1,1) model. Econometric Theory 6 318-334. MR1085577
[28] Nelson, D. (1991). Conditional heteroskedasticity in asset returns: A new approach. Econometrica 59 347-370. MR1097532
[29] Robinson, P. M. (1991). Testing for strong serial correlation and dynamic conditional heteroskedasticity in multiple regression. J. Econometrics 47 67-84. MR1087207
[30] Robinson, P. M. and Zaffaroni, P. (1997). Modelling nonlinearity and long memory in time series. In Nonlinear Dynamics and Time Series (C. D. Cutler and D. T. Kaplan, eds.) 161-170. Amer. Math. Soc., Providence, RI. MR1426620
[31] Robinson, P. M. and Zaffaroni, P. (1998). Nonlinear time series with long memory: A model for stochastic volatility. J. Statist. Plann. Inference 68 359-371. MR1629599
[32] Straumann, D. and Mikosch, T. (2006). Quasi-maximum likelihood estimation in conditionally heteroscedastic time series: A stochastic recurrence equations approach. Ann. Statist. To appear.
[33] Taylor, S. J. (1986). Modelling Financial Time Series. Wiley, Chichester.
[34] von Bahr, B. and Esseen, C.-G. (1965). Inequalities for the rth absolute moment of a sum of random variables, 1 ≤ r ≤ 2. Ann. Math. Statist. 36 299-303. MR0170407
[35] Whistler, D. E. N. (1990). Semiparametric models of daily and intra-daily exchange rate volatility. Ph.D. dissertation, Univ. London.
[36] Zaffaroni, P. (2003). Gaussian inference on certain long-range dependent volatility models. J. Econometrics 115 199-258. MR1984776
[37] Zaffaroni, P. (2004). Stationarity and memory of ARCH(∞) models. Econometric Theory 20 147-160. MR2028355
Department of Economics
London School of Economics
Houghton Street
London WC2A 2AE
United Kingdom
E-mail: p.m.robinson@lse.ac.uk

Tanaka Business School
Imperial College London
South Kensington Campus
London SW7 2AZ
United Kingdom
E-mail: p.zaffaroni@imperial.ac.uk


Received October 2003; revised January 2005.
