
ASYMPTOTIC EXPANSIONS FOR I.I.D. SUMS VIA LOWER-ORDER CONVOLUTIONS WITH GAUSSIANS

KENNETH S. BERENHAUT, JAMES W. CHERNESKY, JR., AND ROSS P. HILTON

ABSTRACT. In this paper we introduce new asymptotic expansions for probability functions of sums of independent and identically distributed random variables. Results are obtained by efficiently employing information provided by lower-order convolutions. In comparison with Edgeworth-type theorems, advantages include improved asymptotic results in the case of symmetric random variables and ease of computation of main error terms and asymptotic crossing points. The first-order estimate can perform quite well against the corresponding renormalized saddlepoint approximation and, pointwise, requires evaluation of only a single convolution integral. While the new expansions are fairly straightforward, the implications are promising and may spur further related work.

1. INTRODUCTION

In this paper we introduce new asymptotic expansions for probability functions of sums of independent and identically distributed (i.i.d.) random variables, employing information provided by lower-order convolutions. In particular, suppose X is a random variable with mean \mu and finite variance \sigma^2, and X_1, X_2, \ldots is a sequence of i.i.d. random variables, each with the same distribution as X. In addition, suppose that X has probability function f, where f is either a density or a probability mass function. Let f^{(i)} be the probability function for the partial sum S_i = X_1 + X_2 + \cdots + X_i.
We are interested in expansions of the form

(1)   f^{(n)}(M) = \sum_{i=0}^{k} C_i(n, f, M) + o\!\left(\frac{1}{n^{(k+1)/2}}\right),

uniformly in M, for k \ge 0. The well-known classical Edgeworth expansions are of the form in (1) (see Section 2, below, as well as, for instance, Gnedenko and Kolmogorov (1954, Chapters 8 and 9) and Petrov (1975, Chapter 7)).
The main idea, here, is to make efficient use of possibly available direct information on f^{(i)} for small values of i in approximating \{f^{(n)}\}_{n \ge 1}. In particular, for 0 \le h \le n, suppose \psi_h = \psi_h^{(n)} is the probability function for

(2)   S_n = S_{n,h} = X_1 + \cdots + X_h + N_{h+1} + \cdots + N_n = S_h + N_{h+1} + \cdots + N_n,

2000 Mathematics Subject Classification. 60F05, 60E05.
Key words and phrases. Edgeworth expansions, Local approximations, Symmetric distributions, Hermite polynomials, Lattice distributions, Convolutions, Saddlepoint approximations, Cumulants, Normal distribution.

where the N_i are i.i.d. (normal) N(\mu, \sigma^2) random variables, independent of the X_i. Note that \psi_h is the probability function for the independent sum of S_h = X_1 + X_2 + \cdots + X_h and a normal random variable with mean (n-h)\mu and variance (n-h)\sigma^2. Assuming knowledge of the distribution of S_h, \psi_h(M), for fixed M, may potentially be computed via conditioning. In fact, one may make use of existing work regarding sums of particular random variables with Gaussian random variables (see for instance Nason (2006) for discussion of the value of results in that direction).
Now, for convenience of notation, set

(3)   \tilde\psi_k(M) = \frac{n}{k}\cdot\frac{\psi_k(M) - \psi_0(M)}{\psi_0(M)},

for 1 \le k \le n, f_n = f^{(n)} for n \ge 1, and

(4)   \tilde f_n(M) = \frac{f_n(M) - \psi_0(M)}{\psi_0(M)}.

Equivalently to (1), we are interested in expansions of the form

(5)   \tilde f_n(M) = \sum_{i=1}^{k} C_i(n, f, M) + o\!\left(\frac{1}{n^{k/2}}\right),

uniformly in M, for k \ge 1.
By employing some existing results on Edgeworth-type expansions for general sums of not necessarily identically distributed random variables (see Section 2, below), we will prove that, under mild assumptions on the distribution of X, for M in the support set of S_n and \Delta the forward difference operator (see Equation (43), below), Equation (5) holds with

(6)   C_i(n, f, M) = \binom{n-1}{i-1}\, \Delta^{i-1} \tilde\psi_1(M)

(see Theorem 2, below). Note that C_1(n, f, M) = \tilde\psi_1(M), C_2(n, f, M) = (n-1)(\tilde\psi_2(M) - \tilde\psi_1(M)) and C_3(n, f, M) = \binom{n-1}{2}(\tilde\psi_3(M) - 2\tilde\psi_2(M) + \tilde\psi_1(M)).
The following simple example, for sums of Bernoulli random variables, is illustrative of the main concept.

Example 1. Suppose X_1, X_2, \ldots are independent identically distributed Bernoulli random variables satisfying P(X_1 = 1) = p. We then have

(7)   \psi_1(M) = p\, g_{n-1,p}(M-1) + (1-p)\, g_{n-1,p}(M)

and

(8)   \psi_2(M) = p^2\, g_{n-2,p}(M-2) + 2p(1-p)\, g_{n-2,p}(M-1) + (1-p)^2\, g_{n-2,p}(M),
where, for i \ge 1, g_{i,p} is the density function of a N(ip, ip(1-p)) random variable. Equations (5) and (6) give, for k = 1,

(9)   \tilde f_n(M) = \tilde\psi_1(M) + o\!\left(\frac{1}{n^{1/2}}\right) = n\,\frac{\psi_1(M) - \psi_0(M)}{\psi_0(M)} + o\!\left(\frac{1}{n^{1/2}}\right),

that is, since \psi_0(M) = O(1/\sqrt{n}), employing (4),

       f_n(M) = n(\psi_1(M) - \psi_0(M)) + \psi_0(M) + o\!\left(\frac{1}{n}\right)

(10)          = n\,\psi_1(M) - (n-1)\,\psi_0(M) + o\!\left(\frac{1}{n}\right).

Similarly, for k = 2, we have

       \tilde f_n(M) = \tilde\psi_1(M) + (n-1)(\tilde\psi_2(M) - \tilde\psi_1(M)) + o\!\left(\frac{1}{n}\right)
                     = n\,\frac{\psi_1(M) - \psi_0(M)}{\psi_0(M)}
                       + (n-1)\left(\frac{n}{2}\cdot\frac{\psi_2(M) - \psi_0(M)}{\psi_0(M)} - n\,\frac{\psi_1(M) - \psi_0(M)}{\psi_0(M)}\right)
                       + o\!\left(\frac{1}{n}\right),

and thus,

       f_n(M) = \psi_0(M) + n(\psi_1(M) - \psi_0(M)) + \frac{n(n-1)}{2}(\psi_2(M) - \psi_0(M))
                - n(n-1)(\psi_1(M) - \psi_0(M)) + o\!\left(\frac{1}{n^{3/2}}\right)

(11)          = \frac{n(n-1)}{2}\,\psi_2(M) - n(n-2)\,\psi_1(M) + \frac{(n-1)(n-2)}{2}\,\psi_0(M) + o\!\left(\frac{1}{n^{3/2}}\right).
Now, combining (7) and (10) gives

       f_n(M) = n\left(p\, g_{n-1,p}(M-1) + (1-p)\, g_{n-1,p}(M)\right) - (n-1)\, g_{n,p}(M) + o\!\left(\frac{1}{n}\right)
              = \frac{np}{\sqrt{2\pi (n-1)p(1-p)}}\, e^{-\frac{1}{2}\frac{(M-1-(n-1)p)^2}{(n-1)p(1-p)}}
                + \frac{n(1-p)}{\sqrt{2\pi (n-1)p(1-p)}}\, e^{-\frac{1}{2}\frac{(M-(n-1)p)^2}{(n-1)p(1-p)}}
                - \frac{n-1}{\sqrt{2\pi np(1-p)}}\, e^{-\frac{1}{2}\frac{(M-np)^2}{np(1-p)}} + o\!\left(\frac{1}{n}\right)

(12)          = \frac{n\left(p\, e^{-\frac{1}{2}\frac{(M-1-(n-1)p)^2}{(n-1)p(1-p)}} + (1-p)\, e^{-\frac{1}{2}\frac{(M-(n-1)p)^2}{(n-1)p(1-p)}}\right)}{\sqrt{2\pi (n-1)p(1-p)}}
                - \frac{n-1}{\sqrt{2\pi np(1-p)}}\, e^{-\frac{1}{2}\frac{(M-np)^2}{np(1-p)}} + o\!\left(\frac{1}{n}\right).
The estimate in (12) may be compared with the first-order Edgeworth estimate (see Proposition 2, below)

       f_n(M) = \psi_0(M)\left(1 + \frac{1}{\sqrt{n}}\cdot\frac{1-2p}{6\sqrt{p(1-p)}}\left(\left(\frac{M-np}{\sqrt{np(1-p)}}\right)^3 - 3\,\frac{M-np}{\sqrt{np(1-p)}}\right)\right) + o\!\left(\frac{1}{n}\right)

(13)          = \frac{1}{\sqrt{2\pi np(1-p)}}\, e^{-\frac{1}{2}\frac{(M-np)^2}{np(1-p)}}
                \left(1 + \frac{1-2p}{6\sqrt{np(1-p)}}\left(\left(\frac{M-np}{\sqrt{np(1-p)}}\right)^3 - 3\,\frac{M-np}{\sqrt{np(1-p)}}\right)\right) + o\!\left(\frac{1}{n}\right).
A simplified version of the estimate in (12), for (n, p) = (50, 3/5), is given by

(14)   f_{50}(M) \approx \frac{\sqrt{3}}{84\sqrt{\pi}}\left(300\, e^{-\frac{(5M-152)^2}{588}} + 200\, e^{-\frac{(5M-147)^2}{588}} - 343\sqrt{2}\, e^{-\frac{(M-30)^2}{24}}\right),
while the corresponding estimate in (13) is

(15)   f_{50}(M) \approx \frac{\sqrt{6}}{51840\sqrt{\pi}}\, e^{-\frac{(M-30)^2}{24}}\left(-M^3 + 90M^2 - 2664M + 30240\right).

The differences between the estimates and targets in (14) and (15) are plotted in Figure 1, below. Note the overall marked improvement of the estimate in (14) over that in (15).

FIGURE 1. Line plots of the differences between the estimates and targets in (14) (dashed) and (15) (solid).

For M = 31, we have

(16)   f_{50}(31) = \binom{50}{31}\left(\frac{3}{5}\right)^{31}\left(\frac{2}{5}\right)^{19} \approx 0.1108631156,

while the corresponding estimates in (14) and (15) are approximately 0.110961693 and 0.1113597512, with errors of 9.85774 \times 10^{-5} and 4.966356 \times 10^{-4}, respectively. For the central value M = 25, the respective errors are 8.49161 \times 10^{-6} and 3.4325981 \times 10^{-4}, a forty-fold improvement. We will return to this example in Example 3, below. □
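The computations above are easily reproduced. The following minimal base R sketch evaluates the first- and second-order estimates (10) and (11), with \psi_0, \psi_1 and \psi_2 as in (7)-(8), and checks them against the exact binomial probability at M = 31 (the values n = 50 and p = 3/5 are those used above):

    # First- and second-order estimates for the Bernoulli case of Example 1
    n <- 50; p <- 3/5
    g <- function(i, x) dnorm(x, mean = i * p, sd = sqrt(i * p * (1 - p)))  # g_{i,p}
    psi0 <- function(M) g(n, M)
    psi1 <- function(M) p * g(n - 1, M - 1) + (1 - p) * g(n - 1, M)         # (7)
    psi2 <- function(M) p^2 * g(n - 2, M - 2) +
      2 * p * (1 - p) * g(n - 2, M - 1) + (1 - p)^2 * g(n - 2, M)           # (8)
    est1 <- function(M) n * psi1(M) - (n - 1) * psi0(M)                     # (10)
    est2 <- function(M) n * (n - 1) / 2 * psi2(M) - n * (n - 2) * psi1(M) +
      (n - 1) * (n - 2) / 2 * psi0(M)                                       # (11)
    M <- 31
    c(exact = dbinom(M, n, p), first = est1(M), second = est2(M))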

Due to its simplicity, it is worth stating the lower-order expansion in (10) as a formal result (see Theorem 2, below, for the more general expansion in terms of \tilde f_n). Note that, beyond the mean and variance of X, for a single value of M the estimate in (17) requires only computation of one convolution integral (for further discussion of computational considerations see Examples 3, 4 and 5, below).

Theorem 1. Suppose X_1 satisfies E(|X_1|^5) < \infty and either

      \frac{1}{\sigma\sqrt{n}}\, S_n has a bounded density r_N for some n = N

or¹

      X_1 is integer valued with maximal span equal to one.

Then

(17)   f_n(M) = n\,\psi_1(M) - (n-1)\,\psi_0(M) + o\!\left(\frac{1}{n}\right),

uniformly in -\infty < M < \infty, where \psi_1 and \psi_0 are the probability functions for S_{n,1} = X_1 + N_2 + \cdots + N_n and S_{n,0} = N_1 + N_2 + \cdots + N_n in (2), respectively.

Proof. This follows from setting k = 1 in Theorem 2, below, and employing the transformations in (4) and (3). □

Note that Theorem 4, below, implies that if, in addition, X_1 is a symmetric random variable satisfying E(|X_1|^7) < \infty, then the o(1/n) error term in (17) may be replaced with o(1/n^2).

For ease of notation, in regard to lattice random variables, we will restrict attention to those with maximal span h = 1. It is not difficult to modify the given results for other values of h.
We now turn to a simple continuous random variable example.

Example 2. Consider the case where X_1 is a uniform random variable on the interval (0, 1). In this case (see for instance Mood, Graybill, and Boes (1974, p. 238)),

(18)   f_n(M) = \begin{cases} \dfrac{1}{(n-1)!} \displaystyle\sum_{0 \le j \le M} (-1)^j \binom{n}{j} (M-j)^{n-1}, & \text{if } 0 < M < n, \\ 0, & \text{otherwise.} \end{cases}
Theorem 1 gives

       f_n(M) = n\,\psi_1(M) - (n-1)\,\psi_0(M) + o\!\left(\frac{1}{n}\right)

(19)          = n\left(\Phi_{n-1}(M) - \Phi_{n-1}(M-1)\right) - (n-1)\,\phi_n(M) + o\!\left(\frac{1}{n}\right),

where \Phi_i is the cumulative distribution function of a N(i/2, i/12) random variable, and \phi_i is the associated density.
¹X is said to be a lattice random variable if there exist constants c and h > 0 such that X takes on values of the form c + ih, i \in Z, with probability one. The constant h is called a span of the distribution.
Since the distribution of X_1 is symmetric, the first-order Edgeworth estimate in this case is simply \phi_n(M) = \psi_0(M) (see Proposition 2, below). In Section 4, below (see also the comment following Theorem 1), it is shown that the expansion in (6) can have distinct advantages with regard to accuracy in the case of symmetric random variables.
The following figure contains plots of the differences (n\psi_1(M) - (n-1)\psi_0(M)) - f_n(M) and (the in general much larger in absolute size) \psi_0(M) - f_n(M), for n = 100.

FIGURE 2. Plots of the differences \psi_0(M) - f_n(M) (left) and (n\psi_1(M) - (n-1)\psi_0(M)) - f_n(M) (right) for n = 100 and X_1 uniform on (0, 1).
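The comparison in Figure 2 can be reproduced pointwise. A short base R sketch follows, comparing the estimate (19) with the exact density (18); since the alternating sum in (18) is numerically delicate for large n, a moderate value (here n = 12, an assumption made only for this illustration) is used:

    # Estimate (19) versus the exact uniform-sum density (18)
    n <- 12
    f_exact <- function(M) {                        # equation (18), for 0 < M < n
      j <- 0:floor(M)
      sum((-1)^j * choose(n, j) * (M - j)^(n - 1)) / factorial(n - 1)
    }
    Phi <- function(i, x) pnorm(x, mean = i / 2, sd = sqrt(i / 12))
    phi <- function(i, x) dnorm(x, mean = i / 2, sd = sqrt(i / 12))
    est <- function(M)                              # equation (19)
      n * (Phi(n - 1, M) - Phi(n - 1, M - 1)) - (n - 1) * phi(n, M)
    M <- 6.5
    c(exact = f_exact(M), estimate = est(M), normal = phi(n, M))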

We now turn to a comparison of the estimate in (17) with the standard saddlepoint approximation (see, for instance, Goutis and Casella (1999) and the references therein).

Example 3. (Comparison with a standard saddlepoint approximation). Returning to the Bernoulli random variables considered in Example 1, note that the saddlepoint approximation in this case is given by (see, for instance, Goutis and Casella (1999))

(20)   f_n(M) \approx \sqrt{\frac{\left(1-p+pe^{t(y)}\right)^2}{2\pi\, p e^{t(y)} (1-p)\, n}}\; \exp\!\left(n\left(\log\!\left(1-p+pe^{t(y)}\right) - \frac{M\, t(y)}{n}\right)\right),

for M \in \{0, 1, \ldots, n\}, where y = M/n and t(y) satisfies pe^{t(y)}/(1-p+pe^{t(y)}) = y, i.e.

(21)   t(y) = \log\!\left(\frac{y(1-p)}{p(1-y)}\right).

Substitution in (20) gives

(22)   f_n(M) \approx \frac{1}{\sqrt{2\pi}} \sqrt{\frac{n}{M(n-M)}}\; \frac{n^n}{M^M (n-M)^{n-M}}\, (1-p)^{n-M} p^M,

i.e. the result of applying Stirling's approximation to the factorials in the binomial probability function.
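For reference, a base R sketch of the closed form (22) follows; working on the log scale keeps the computation stable, and the expression is not defined at the endpoints M = 0 and M = n (the values n = 20 and p = 0.6 match the comparison below):

    # Saddlepoint approximation (22) for the binomial probability function
    n <- 20; p <- 0.6
    sp <- function(M)                     # equation (22), for 1 <= M <= n - 1
      exp(0.5 * (log(n) - log(2 * pi * M * (n - M))) +
          n * log(n) - M * log(M) - (n - M) * log(n - M) +
          M * log(p) + (n - M) * log(1 - p))
    c(saddlepoint = sp(12), exact = dbinom(12, n, p))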
Figure 3 gives the normal, first-order Edgeworth, saddlepoint and standardized saddlepoint approximations (where the standardized saddlepoint is standardized to sum to one), along with the approximation from Theorem 1, for n = 20 and p = 0.6.

n
0.004

t Theorem 1
s Saddlepoint
d Saddlepoint (st) n
e Edgeworth s
e
n s
0.002

n Normal s s
s e s
Difference from target

e n n
s s
s t n
n s e
0.000

e e e s t t dt d d d st e n
e
st
n
d e
st
n
d e
st
n
d e
st
n
d st
d s
d st
d et
d t et d st
d e
st
n nt d d dt dt d
d
t t
d
n e n
e e
e
n n
e
0.002

n e

n
0.004

0 5 10 15 20

F IGURE 3. Line plots of the differences between the estimates and tar-
gets for n = 20 and p = 0.5 in Example 3.

Table 1 gives the absolute error and rank placement for each of the approximations, for the 21 values in the support set of S_n; the respective mean absolute errors (MAE) are given in the last row of the table. The approximation in (17) fares quite well through much of the support set. For the corresponding symmetric case (p = 0.5 and n = 20), the mean absolute errors are given in Table 2. A corresponding plot is given in Figure 4; note that in this case the Edgeworth and normal approximations coincide (only the Edgeworth is shown). In this case the estimate in Theorem 1 marginally outperforms the saddlepoint with respect to mean absolute error.
ASYMPTOTIC EXPANSIONS VIA LOWER-ORDER CONVOLUTIONS 9
TABLE 1. Absolute errors, rank placements and mean absolute errors
for Example 3; as in Figure 3, t, s, d, e and n denote the estimates in
the case of Theorem 1, the saddlepoint, the standardized saddlepoint, the
Edgeworth, and the normal approximation, respectively.
M Thm 1 Saddlept Saddlept(st) Edgew Normal t s d e n
0 0.000000 0.000000 0.000000 0.000000 0.000000 1 2 2 5 4
1 0.000000 0.000000 0.000000 0.000001 0.000000 3 2 1 5 4
2 0.000000 0.000000 0.000000 0.000007 0.000001 3 2 1 5 4
3 0.000000 0.000001 0.000001 0.000031 0.000003 1 3 2 5 4
4 0.000006 0.000006 0.000002 0.000095 0.000038 2 3 1 5 4
5 0.000018 0.000024 0.000005 0.000198 0.000189 2 3 1 5 4
6 0.000012 0.000077 0.000006 0.000231 0.000572 2 3 1 4 5
7 0.000057 0.000207 0.000004 0.000062 0.001095 2 4 1 3 5
8 0.000141 0.000471 0.000044 0.000786 0.001105 2 3 1 4 5
9 0.000030 0.000905 0.000125 0.001358 0.000313 1 4 2 5 3
10 0.000285 0.001473 0.000226 0.000712 0.002901 2 4 1 3 5
11 0.000325 0.002036 0.000282 0.001159 0.004340 2 4 1 3 5
12 0.000158 0.002386 0.000223 0.002386 0.002386 1 3 2 3 5
13 0.000507 0.002362 0.000048 0.001377 0.001804 2 5 1 3 4
14 0.000170 0.001964 0.000153 0.000757 0.004370 2 4 1 3 5
15 0.000308 0.001358 0.000269 0.001668 0.003339 2 3 1 4 5
16 0.000278 0.000772 0.000260 0.000917 0.000598 2 4 1 5 3
17 0.000020 0.000356 0.000174 0.000086 0.001119 1 4 3 2 5
18 0.000128 0.000132 0.000086 0.000392 0.001195 2 3 1 4 5
19 0.000054 0.000041 0.000034 0.000231 0.000618 3 2 1 4 5
20 0.000008 0.000037 0.000037 0.000062 0.000195 1 2 2 4 5
MAE 0.000119 0.000696 0.000094 0.000596 0.001247

FIGURE 4. Line plots of the differences between the estimates and targets for n = 20 and p = 0.5 in Example 3 (t: Theorem 1; s: saddlepoint; d: standardized saddlepoint; e: Edgeworth; vertical axis: difference from target).
TABLE 2. Mean absolute errors for n = 20 and p = 0.5 in Example 3.

Thm 1 Saddlept Saddlept(st) Edgeworth Normal


MAE 0.000044 0.000647 0.000049 0.000575 0.000575

Note that, formally, the standard saddlepoint approximation requires inverting the derivative of the cumulant generating function, whereas the approximation presented in Theorem 1 is, in a sense, cumulant-free, but in its place requires knowledge of the probability function f. For discussion of saddlepoint approximations in the case of intractable cumulant generating functions see Kolassa (1991).


In the next and final example of this section, we consider a case where a closed form for the probability function of S_n is not as readily available. In order to be able to compare estimates, we make use of the valuable distr and distrEx packages in R (see Ruckdeschel, Kohl, Stabla, and Camphausen (2006)). The packages allow for distributional computations for arbitrary probability functions and moderately sized n.

Example 4. Suppose X is distributed as a Beta random variable with parameters \alpha = 0.5 and \beta = 0.7, i.e.

(23)   f(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1} (1-x)^{\beta-1} = \frac{\Gamma(1.2)}{\Gamma(0.5)\Gamma(0.7)}\, x^{-0.5} (1-x)^{-0.3}, \quad 0 \le x \le 1,

where \Gamma is the gamma function given by \Gamma(z) = \int_0^{\infty} e^{-t} t^{z-1}\, dt, for z > 0.
Note that here the cumulant generating function is given by \log\left({}_1F_1(\alpha;\, \alpha+\beta;\, t)\right), where {}_1F_1 is the confluent hypergeometric function, and the standard saddlepoint approximation is not as readily available (see Butler (2007, p. 16) for discussion). For some work on densities of sums of Beta random variables, see for instance Pham-Gia and Turkkan (1998).
Figure 5 provides a comparison of the performance of the normal approximation, the first-order Edgeworth approximation and the approximation given in Theorem 1. The approximate mean absolute errors are 0.000934, 0.000420 and 0.000068, respectively. Again the estimate in (17) performs quite well throughout the support set.
FIGURE 5. Line plots of the differences between the estimates and targets for \alpha = 0.5 and \beta = 0.7 in Example 4 (t: Theorem 1; e: Edgeworth; n: normal; vertical axis: difference from target).

As suggested earlier, to obtain the estimate in Theorem 1, for a given X_1 with density f, mean \mu and variance \sigma^2, at a given point M (and any n), requires numeric evaluation of only the integral

(24)   I(M) = \int_{-\infty}^{\infty} f(x)\, \eta_{n-1}(M-x)\, dx,

where \eta_i, here, is the density of a normal random variable with mean i\mu and variance i\sigma^2. The estimate for f_n(M) is then given by

(25)   \hat f_n(M) = n I(M) - (n-1)\, \eta_n(M).

The computation of the estimate in (25) can be carried out in a straightforward manner, for any given n, M and f, in Maple or some other standard mathematical software package. Similarly, feasible computation of an estimate of order k essentially requires knowledge of f^{(i)}, for 1 \le i \le k. Higher-order Edgeworth expansions can be a good deal easier to compute if one only has knowledge of f.
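By way of illustration, the following base R sketch evaluates (24)-(25) pointwise for the Beta(0.5, 0.7) case of Example 4; integrate() copes with the integrable endpoint singularities of the density, and the choices n = 20 and M = 8 are assumptions made only for this illustration:

    # Pointwise evaluation of (24)-(25) for Example 4
    a <- 0.5; b <- 0.7; n <- 20
    mu <- a / (a + b)                                # mean of X
    s2 <- a * b / ((a + b)^2 * (a + b + 1))          # variance of X
    est <- function(M) {
      I <- integrate(function(x) dbeta(x, a, b) *
                       dnorm(M - x, (n - 1) * mu, sqrt((n - 1) * s2)),
                     lower = 0, upper = 1)$value     # convolution integral (24)
      n * I - (n - 1) * dnorm(M, n * mu, sqrt(n * s2))  # equation (25)
    }
    est(8)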

The remainder of the paper proceeds as follows. Section 2 provides some background information on Edgeworth expansions and related concepts, while Section 3 includes a proof of the expansion indicated in (6) (see Theorem 2). Section 4 concludes the paper with a discussion of advantages and implications.
2. PRELIMINARIES AND NOTATION
In this section, we introduce some preliminaries and notation useful in the context of
expansions of the type in (1). The casual reader may wish to skip to the statements of
the main results in Section 3.
Suppose Y_1, Y_2, \ldots is a sequence of independent and not necessarily identically distributed random variables with mean \mu and finite variance \sigma^2. Define the central moments \{\mu_{j,k}\} of Y_j via

(26)   \mu_{j,k} = E(Y_j - \mu)^k, \quad k \ge 1,

and the cumulants \{\kappa_{j,k}\}_{k \ge 1} of Y_j via

(27)   \log\!\left(E\!\left(e^{itY_j}\right)\right) = \sum_{k=1}^{\infty} \kappa_{j,k}\, \frac{(it)^k}{k!}.

For fixed j, the first four \{\kappa_{j,k}\} are given by \kappa_{j,1} = \mu, \kappa_{j,2} = \mu_{j,2} = \sigma^2, \kappa_{j,3} = \mu_{j,3} and \kappa_{j,4} = \mu_{j,4} - 3\mu_{j,2}^2 (see for instance Gnedenko and Kolmogorov (1954, Chapter 2, Section 15)).
The Hermite polynomials \{H_k\} can be defined via

(28)   H_k(x) = (-1)^k\, e^{x^2/2}\, \frac{d^k}{dx^k}\, e^{-x^2/2},

and satisfy

(29)   H_k(x) = k! \sum_{j=0}^{\lfloor k/2 \rfloor} \frac{(-1)^j\, x^{k-2j}}{j!\, (k-2j)!\, 2^j},

for k \ge 0. The first six Hermite polynomials are H_0(x) = 1, H_1(x) = x, H_2(x) = x^2 - 1, H_3(x) = x^3 - 3x, H_4(x) = x^4 - 6x^2 + 3, and H_5(x) = x^5 - 10x^3 + 15x. Note that these polynomials satisfy the recurrence relation

(30)   H_i(x) = x H_{i-1}(x) - (i-1) H_{i-2}(x),

for i \ge 2. For further information on Hermite polynomials see for instance Szego (1975, Chapter 5).
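The recurrence (30) also gives a convenient means of evaluating H_k numerically; a minimal base R sketch:

    # Hermite polynomials via the recurrence (30)
    hermite <- function(k, x) {
      if (k == 0) return(x^0)
      if (k == 1) return(x)
      Hm2 <- x^0; Hm1 <- x
      for (i in 2:k) {
        H <- x * Hm1 - (i - 1) * Hm2    # recurrence (30)
        Hm2 <- Hm1; Hm1 <- H
      }
      H
    }
    hermite(5, 2)   # H_5(2) = 2^5 - 10 * 2^3 + 15 * 2 = -18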
Next, define the standardized average cumulants

(31)   \bar\lambda_{j,k} = \frac{1}{k!\, \sigma^k} \left(\frac{\sum_{i=1}^{j} \kappa_{i,k}}{j}\right),

for j, k \ge 1, and the polynomials

(32)   q_{a,n}(x) = \sum_{v=1}^{a} H_{a+2v}(x) \sum_{k} \prod_{m=1}^{a} \frac{1}{k_m!}\, \bar\lambda_{n,m+2}^{\,k_m}, \quad a, n \ge 1,

where the second sum in (32) is over a-tuples k = (k_1, k_2, \ldots, k_a) with k_i \ge 0, \sum_{i=1}^{a} i k_i = a, and \sum_{i=1}^{a} k_i = v. For convenience, unless otherwise stated, all sums over k, throughout, will refer to sums over such a-tuples.
In addition, set V_j(x) = P(Y_j < x) and v_j(t) = E(e^{itY_j}). Note that if \{Y_j\} are identically distributed, (32) becomes

(33)   q_{a,n}(x) = q_a(x) = \sum_{v=1}^{a} H_{a+2v}(x) \sum_{k} \prod_{m=1}^{a} \frac{1}{k_m!}\, \bar\lambda_{m+2}^{\,k_m}, \quad a, n \ge 1,

where, for k \ge 1,

(34)   \bar\lambda_k = \frac{\kappa_k}{\sigma^k\, k!},

with \kappa_k = \kappa_{1,k}.
Now, consider the partial sum S_n = \sum_{i=1}^{n} Y_i for n \ge 1, and note that E[S_n] = n\mu and Var(S_n) = n\sigma^2. Petrov (see Petrov (1975, Theorem 7.14; p. 206)) provides the following result.
Proposition 1. Suppose \{Y_j\} have zero mean and common finite variance \sigma^2,

(35)   \limsup_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} E\!\left(|Y_j|^{k+2}\right) < \infty,

and

(36)   \frac{1}{n} \sum_{j=1}^{n} \int_{|x| > n^{\delta}} |x|^{k+2}\, dV_j(x) \to 0,

for some integer k \ge 1 and some positive \delta < 1/2. In addition, suppose that

(37)   \int_{|t| > \varepsilon} \prod_{j=1}^{n} |v_j(t)|\, dt = o\!\left(n^{-(k+1)/2}\right)

for every fixed \varepsilon > 0. Then for all sufficiently large n there exists an everywhere continuous density r_n(x) of the variable

(38)   \frac{1}{\sigma\sqrt{n}}\, S_n

and

(39)   r_n(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} + \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \sum_{a=1}^{k} \frac{q_{a,n}(x)}{n^{a/2}} + o\!\left(\frac{1}{n^{k/2}}\right),

uniformly in x (-\infty < x < \infty).

In the case of identically distributed random variables, Petrov (see Petrov (1975, Theorems 7.13 and 7.15)) also proves the following result.

Proposition 2. Suppose X_1 satisfies E(|X_1|^{k+2}) < \infty for some k \ge 1, and X_1 satisfies either

      \frac{1}{\sigma\sqrt{n}}\, S_n has a bounded density r_N for some n = N

or

      X_1 is integer valued with maximal span equal to one.

Then Equation (5) holds with

(40)   C_i(n, f, M) = \frac{q_i(x)}{n^{i/2}} =: E_i(M), \quad i \ge 1,

say, uniformly in -\infty < M < \infty, where

(41)   x = \frac{M - n\mu}{\sigma\sqrt{n}}.
Some lower-order Edgeworth approximations implied by Proposition 2 are the following:

       \tilde f_n(M) = \frac{1}{\sqrt{n}}\, H_3(x)\, \bar\lambda_3 + o\!\left(\frac{1}{\sqrt{n}}\right),

       \tilde f_n(M) = \frac{1}{\sqrt{n}}\, H_3(x)\, \bar\lambda_3 + \frac{1}{n}\left(\frac{H_6(x)}{2!}\, (\bar\lambda_3)^2 + H_4(x)\, \bar\lambda_4\right) + o\!\left(\frac{1}{n}\right),

       \tilde f_n(M) = \frac{1}{\sqrt{n}}\, H_3(x)\, \bar\lambda_3 + \frac{1}{n}\left(\frac{H_6(x)}{2!}\, (\bar\lambda_3)^2 + H_4(x)\, \bar\lambda_4\right)
(42)                  + \frac{1}{n^{3/2}}\left(\frac{H_9(x)}{3!}\, (\bar\lambda_3)^3 + H_7(x)\, \bar\lambda_3 \bar\lambda_4 + H_5(x)\, \bar\lambda_5\right) + o\!\left(\frac{1}{n^{3/2}}\right).
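For concreteness, the first of the approximations in (42) is easily coded; a base R sketch follows (the function arguments mu, s and k3, i.e. the mean, standard deviation and third cumulant of X_1, are names introduced here only for illustration), checked against the Bernoulli values underlying (13):

    # First-order Edgeworth estimate (the k = 1 case of Proposition 2)
    edgeworth1 <- function(M, n, mu, s, k3) {
      x  <- (M - n * mu) / (s * sqrt(n))        # standardization (41)
      l3 <- k3 / (s^3 * factorial(3))           # lambda-bar_3, as in (34)
      dnorm(M, n * mu, s * sqrt(n)) * (1 + (x^3 - 3 * x) * l3 / sqrt(n))
    }
    p <- 0.6  # Bernoulli(p): kappa_3 = p(1-p)(1-2p)
    edgeworth1(31, 50, p, sqrt(p * (1 - p)), p * (1 - p) * (1 - 2 * p))

This returns approximately 0.11136, in agreement with the value following (16).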
In the next section, we prove our main result.

3. MAIN RESULT

In this section, we employ Propositions 1 and 2 to prove our main result. While the results that follow are quite straightforward to state, some care must be taken in tracking error terms.
Note that, throughout, \Delta will denote the standard forward difference operator (see, for instance, Riordan (1979, p. 201)), i.e. for a \ge 0 and a given function \psi,

(43)   \Delta^a \psi(x) = \sum_{i=0}^{a} (-1)^{a-i} \binom{a}{i}\, \psi(x+i).
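As a small aside, (43), together with the combinatorial identity (44) of Lemma 1, below, is easy to check numerically; a base R sketch (with illustrative values n = 10 and v = k = 3, for which the result should be exactly 1):

    # Forward difference (43) and a spot check of (44) for v <= k
    fdiff <- function(psi, x, a)
      sum((-1)^(a - 0:a) * choose(a, 0:a) * psi(x + 0:a))
    n <- 10; v <- 3; k <- 3
    sum(sapply(1:k, function(j)
      choose(n - 1, j - 1) * fdiff(function(h) h^(v - 1), 1, j - 1))) / n^(v - 1)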
The following elementary combinatorial identity will be helpful.

Lemma 1. Suppose k, v \ge 1. Then,

(44)   \frac{1}{n^{v-1}} \sum_{j=1}^{k} \binom{n-1}{j-1}\, \Delta^{j-1}(1)^{v-1} = \begin{cases} O(n^{k-v}) = o(1), & \text{if } v > k, \\ 1, & \text{otherwise,} \end{cases}

where \Delta^{j-1}(1)^{v-1} is an abbreviation for \Delta^{j-1} x^{v-1} evaluated at x = 1.
Proof. First, note that standard identities for binomial coefficients and the Stirling numbers of the second kind \{S_{n,k}\} (see for instance Charalambides and Singh (1988), Guo (2005) and Riordan (1979, p. 202)) give that

       \Delta^{j-1}(1)^{v-1} = \Delta^{j-1}(0)^{v-1} + \Delta^{j}(0)^{v-1}
                             = (j-1)!\, S_{v-1,j-1} + j!\, S_{v-1,j}
                             = (j-1)!\, (S_{v-1,j-1} + j S_{v-1,j}) = (j-1)!\, S_{v,j},

and

(45)   \sum_{j=1}^{v} (n)_j\, S_{v,j} = n^v,

where (n)_j = n(n-1)\cdots(n-j+1). Hence, since \Delta^{j-1}(1)^{v-1} = 0 for j > v, if k \ge v we have

       \sum_{j=1}^{k} \binom{n-1}{j-1}\, \Delta^{j-1}(1)^{v-1} = \sum_{j=1}^{v} \binom{n-1}{j-1}\, \Delta^{j-1}(1)^{v-1}
       = \sum_{j=1}^{v} \binom{n-1}{j-1} (j-1)!\, S_{v,j}
       = \sum_{j=1}^{v} \frac{(n-1)!}{(n-j)!}\, S_{v,j}
(46)   = \frac{1}{n} \sum_{j=1}^{v} (n)_j\, S_{v,j} = \frac{1}{n}\, n^v = n^{v-1}.

On the other hand, if k < v, \sum_{j=1}^{k} \binom{n-1}{j-1}\, \Delta^{j-1}(1)^{v-1} is a polynomial of degree at most k-1 in n (for large n), and hence,

(47)   \frac{1}{n^{v-1}} \sum_{j=1}^{k} \binom{n-1}{j-1}\, \Delta^{j-1}(1)^{v-1} = O(n^{k-v}),

and the result follows. □

Applying Proposition 1 to the mean-corrected variables \{Y_i\}_{i \ge 1} given by

(48)   Y_i = \begin{cases} X_i - \mu, & \text{if } 1 \le i \le h, \\ N_i - \mu, & \text{otherwise,} \end{cases}

with \{X_i\}, \{N_i\} as in (2), gives the following result.
Lemma 2. Suppose E(|X_1|^{k+2}) < \infty for some integer k \ge 1. Then, for 1 \le h \le n (and \tilde\psi_h as defined in (3)),

(49)   \tilde\psi_h(M) = \frac{n}{h}\cdot\frac{\psi_h(M) - \psi_0(M)}{\psi_0(M)} = \frac{n}{h}\left(\sum_{a=1}^{k} \frac{q_{a,n}(x)}{n^{a/2}} + o\!\left(\frac{1}{n^{k/2}}\right)\right),

uniformly in M (-\infty < M < \infty), where x is as in (41) and

(50)   q_{a,n}(x) = \sum_{v=1}^{a} H_{a+2v}(x) \sum_{k} \prod_{m=1}^{a} \frac{1}{k_m!} \left(\frac{h\, \bar\lambda_{m+2}}{n}\right)^{k_m} = \sum_{v=1}^{a} C_{a,v} \left(\frac{h}{n}\right)^{v},

with, for 1 \le v \le a,

(51)   C_{a,v} = H_{a+2v}(x) \sum_{k} \prod_{m=1}^{a} \frac{\bar\lambda_{m+2}^{\,k_m}}{k_m!},

\bar\lambda_i = \kappa_i/(\sigma^i\, i!), and \kappa_i and \sigma^2 the i-th cumulant and the variance of X, respectively.

Proof. We apply Proposition 1. Since E(|N_i|^j) < \infty for all i \ge h+1 and j \ge 0, and E(|X|^{k+2}) < \infty, it suffices to verify that (37) holds. This follows directly from the fact that almost all Y_i are Gaussian. In particular, |v_i(t)| \le 1 for 1 \le i \le h and |v_i(t)| = e^{-\frac{1}{2}\sigma^2 t^2} for i > h. Hence for \varepsilon > 0 and n > h,

(52)   \int_{|t| > \varepsilon} \prod_{j=1}^{n} |v_j(t)|\, dt \le \int_{|t| > \varepsilon} |v_{h+1}(t)|^{n-h}\, dt \le C e^{-Kn},

for constants C, K > 0 (depending on \varepsilon and h).
Since it follows from (27) that all cumulants of order greater than two for a normal random variable are zero (see for instance Davison (2003, p. 45)), \bar\lambda_{n,m+2} in (32) becomes

(53)   \bar\lambda_{n,m+2} = \frac{1}{(m+2)!\, \sigma^{m+2}} \left(\frac{\sum_{i=1}^{n} \kappa_{i,m+2}}{n}\right) = \frac{h}{n}\, \bar\lambda_{m+2},

and the theorem follows. □

Note that for a-tuples k = (k_1, k_2, \ldots, k_a) in (51), \sum_{m=1}^{a} k_m = v, and hence C_{a,v} in (51) is comprised of terms of order v in the \{\bar\lambda_i\}.
We are now in a position to state and prove our main theorem. Note that all results, from here forward, are stated in terms of \{\tilde f_n\} and \{\tilde\psi_i\} (see Equations (4) and (3) for the simple transformations to \{f^{(n)}\} and \{\psi_i\}, respectively).
Theorem 2. Suppose X_1 satisfies E(|X_1|^{3k+2}) < \infty, for some k \ge 1, and either

      \frac{1}{\sigma\sqrt{n}}\, S_n has a bounded density r_N for some n = N

or

      X_1 is integer valued with maximal span equal to one.

Then Equation (5) holds with

(54)   C_i(n, f, M) = \binom{n-1}{i-1}\, \Delta^{i-1} \tilde\psi_1(M) =: V_i(M), \quad i \ge 1,

say, uniformly in -\infty < M < \infty.

Proof. For 1 \le h \le k, we have via Lemma 2 that

       \tilde\psi_h(M) = \frac{n}{h} \sum_{a=1}^{3k} \frac{1}{n^{a/2}} \sum_{v=1}^{a} H_{a+2v}(x) \sum_{k} \prod_{m=1}^{a} \frac{1}{k_m!} \left(\frac{h\, \bar\lambda_{m+2}}{n}\right)^{k_m} + o\!\left(\frac{n}{n^{3k/2}}\right)

(55)                   = \sum_{a=1}^{3k} \left(\frac{1}{\sqrt{n}}\right)^{a} \sum_{v=1}^{a} C_{a,v} \left(\frac{h}{n}\right)^{v-1} + o\!\left(\frac{n}{n^{3k/2}}\right).

Hence, Lemma 1 and the fact that \Delta^{j-1}(n/h) = o(n^j) give

       \sum_{j=1}^{k} \binom{n-1}{j-1}\, \Delta^{j-1} \tilde\psi_1(M)
       = \sum_{a=1}^{3k} \left(\frac{1}{\sqrt{n}}\right)^{a} \sum_{v=1}^{a} C_{a,v}\, \frac{1}{n^{v-1}} \sum_{j=1}^{k} \binom{n-1}{j-1}\, \Delta^{j-1}(1)^{v-1} + o\!\left(\frac{n^k}{n^{3k/2}}\right)
       = \sum_{a=1}^{k} \left(\frac{1}{\sqrt{n}}\right)^{a} \sum_{v=1}^{a} C_{a,v} + \sum_{a=k+1}^{3k} \left(\frac{1}{\sqrt{n}}\right)^{a} \left(\sum_{v=1}^{k} C_{a,v} + o(1)\right) + o\!\left(\frac{1}{n^{k/2}}\right)
       = \sum_{a=1}^{k} \left(\frac{1}{\sqrt{n}}\right)^{a} \sum_{v=1}^{a} C_{a,v} + o\!\left(\frac{1}{n^{k/2}}\right)

(56)   = \tilde f_n(M) + o\!\left(\frac{1}{n^{k/2}}\right). □


The main error term for the kth-order estimate in (54) can actually be computed quite simply (assuming a non-zero third cumulant), as can be seen in the next theorem.

Theorem 3. Suppose the assumptions of Theorem 2 are satisfied, and in addition E(|X_1|^{3k+3}) < \infty. Then,

(57)   \tilde f_n(M) - \sum_{i=1}^{k} V_i(M) = \left(\frac{1}{\sqrt{n}}\right)^{k+1} H_{3(k+1)}(x)\, \frac{(\bar\lambda_3)^{k+1}}{(k+1)!} + o\!\left(\frac{1}{n^{(k+1)/2}}\right),

uniformly in -\infty < M < \infty, where x is as in (41).

Proof. Arguing as in (56), we have

       \sum_{j=1}^{k} V_j(M) = \sum_{a=1}^{3k+1} \left(\frac{1}{\sqrt{n}}\right)^{a} \sum_{v=1}^{a} C_{a,v}\, \frac{1}{n^{v-1}} \sum_{j=1}^{k} \binom{n-1}{j-1}\, \Delta^{j-1}(1)^{v-1} + o\!\left(\frac{n^k}{n^{(3k+1)/2}}\right)
       = \sum_{a=1}^{k} \left(\frac{1}{\sqrt{n}}\right)^{a} \sum_{v=1}^{a} C_{a,v} + \sum_{a=k+2}^{3k+1} \left(\frac{1}{\sqrt{n}}\right)^{a} \left(\sum_{v=1}^{k} C_{a,v} + o(1)\right)
         + \left(\frac{1}{\sqrt{n}}\right)^{k+1} \left(\sum_{v=1}^{k} C_{k+1,v} + o(1)\right) + o\!\left(\frac{1}{n^{(k+1)/2}}\right)

(58)   = \tilde f_n(M) - \left(\frac{1}{\sqrt{n}}\right)^{k+1} C_{k+1,k+1} + o\!\left(\frac{1}{n^{(k+1)/2}}\right).

Since the only a-tuple with a = v = k+1 is (k+1, 0, \ldots, 0), we have C_{k+1,k+1} = H_{3(k+1)}(x)\, (\bar\lambda_3)^{k+1}/(k+1)!, and the result follows. □
In the next section we will discuss some implications of Theorem 2.

4. SOME IMPLICATIONS

In this section we will consider some implications of Theorem 2, including, in particular, improved errors in the case of symmetric distributions, as well as facility in determining asymptotic crossing points of estimates with the target.

4.1. Symmetric distributions. The kth-order estimate in (54) is in general considerably better than the corresponding kth-order estimate in (40) in the case of a symmetric distribution (under the assumption of non-zero even-order cumulants). In fact, we have the following theorem for symmetric random variables.

Theorem 4. Suppose that X_1 is a symmetric random variable satisfying the hypotheses of Theorem 2 and E(|X_1|^{4k+3}) < \infty. Then,

(59)   \tilde f_n(M) = \sum_{i=1}^{k} V_i(M) + o\!\left(\frac{1}{n^{(2k+1)/2}}\right),

uniformly in M (-\infty < M < \infty), where \{V_i\} is as in (54).


Proof. Note that since X_1 is symmetric, \kappa_i = 0 for odd i \ge 3. Hence,

       \sum_{j=1}^{k} V_j(M) = \sum_{a=1}^{4k+1} \left(\frac{1}{\sqrt{n}}\right)^{a} \sum_{v=1}^{a} C_{a,v}\, \frac{1}{n^{v-1}} \sum_{j=1}^{k} \binom{n-1}{j-1}\, \Delta^{j-1}(1)^{v-1} + o\!\left(\frac{n^k}{n^{(4k+1)/2}}\right)
       = \sum_{a=1}^{2k+1} \left(\frac{1}{\sqrt{n}}\right)^{a} \sum_{v=1}^{\min(a,k)} C_{a,v}
         + \sum_{a=k+1}^{2k+1} \left(\frac{1}{\sqrt{n}}\right)^{a} \sum_{v=k+1}^{a} C_{a,v}\, \frac{1}{n^{v-1}} \sum_{j=1}^{k} \binom{n-1}{j-1}\, \Delta^{j-1}(1)^{v-1}
         + \sum_{a=2k+2}^{4k+1} \left(\frac{1}{\sqrt{n}}\right)^{a} \left(\sum_{v=1}^{k} C_{a,v} + o(1)\right) + o\!\left(\frac{1}{n^{(2k+1)/2}}\right)
       = \sum_{a=1}^{2k+1} \left(\frac{1}{\sqrt{n}}\right)^{a} \sum_{v=1}^{a} C_{a,v} + o\!\left(\frac{1}{n^{(2k+1)/2}}\right)
       = \tilde f_n(M) + o\!\left(\frac{1}{n^{(2k+1)/2}}\right).

In the second to last equality, we have used the fact that for k+1 \le v \le a \le 2k+1, a-tuples (k_1, k_2, \ldots, k_a) with k_i \ge 0, \sum_{i=1}^{a} k_i = v and \sum_{i=1}^{a} i k_i = a must satisfy k_1 > 0, and hence, since \bar\lambda_3 = \kappa_3 = 0, C_{a,v} = 0 for such (a, v). □

Note that the assumption of symmetry in the statement of Theorem 4 could be replaced with the weaker assumption that \kappa_i = 0 for all odd i \ge 3 (see for instance Churchill (1946) for discussion of how the two classes of distributions can differ greatly), or, more simply, just \kappa_3 = 0. In fact, by a similar argument, the error term in Theorem 4 can be further improved to o(1/n^{((J-1)k+1)/2}) if the symmetry assumption is replaced with that of \kappa_i = 0 for 3 \le i \le J (assuming E(|X_1|^{(J+1)k+3}) < \infty).
Without additional hypotheses, the corresponding symmetry assumption allows for only moderate improvements on the Edgeworth expansions in Proposition 2. For instance, we have the following theorem.

Theorem 5. Suppose that X_1 is a symmetric integer valued random variable with maximal span equal to one, satisfying E(|X_1|^{2\lfloor k/2 \rfloor + 3}) < \infty for some k \ge 1. Then,

(60)   \tilde f_n(M) = \sum_{i=1}^{k} E_i(M) + o\!\left(\frac{1}{n^{(2\lfloor k/2 \rfloor + 1)/2}}\right),

uniformly in M (-\infty < M < \infty), where E_i is as in (40).


Proof. The modest improvement here (from k/2 to k/2 + 1/2 in the exponent of the error term), in the case of even k, is due to the fact that in that case k+1 is odd, and hence every term in the inner sum in (33) for a = k+1 includes at least one odd m \ge 1 for which k_m > 0; since \bar\lambda_{m+2} = 0 for such m, the sum is zero. □

Note that there is a higher moment assumption in Theorem 4 as compared with Theorem 5, but the error in (59) is significantly smaller than that in (60). A comparison of error terms is given in Table 3.

TABLE 3. Order of Error Terms (Symmetric Case)

       k                         1          2          3          4          5
       Edgeworth expansion       1/n^{1/2}  1/n^{3/2}  1/n^{3/2}  1/n^{5/2}  1/n^{5/2}
       Expansion given in (54)   1/n^{3/2}  1/n^{5/2}  1/n^{7/2}  1/n^{9/2}  1/n^{11/2}

Arguing similarly to the proof of Theorem 3, we have the following regarding the main error term in the case of symmetric distributions.

Theorem 6. Suppose the assumptions of Theorem 4 are satisfied, and in addition E(|X_1|^{4k+5}) < \infty, for some integer k \ge 1. Then,

(61)   \tilde f_n(M) - \sum_{i=1}^{k} V_i(M) = \frac{1}{n^{k+1}}\, H_{4(k+1)}(x)\, \frac{(\bar\lambda_4)^{k+1}}{(k+1)!} + o\!\left(\frac{1}{n^{(2k+3)/2}}\right),

uniformly in -\infty < M < \infty, where x is as in (41). □
4.2. Asymptotic crossing points. Asymptotic crossing points between the kth-order estimate in (54) and the target probability function can be computed quite simply, as can be seen in the next theorem. Note that the result, and the discussion that follows, is in terms of unstandardized random variables; the convergent nature of the crossing points is, of course, in standardized terms.

Theorem 7. Suppose the assumptions of Theorem 3 are satisfied. Then, if \kappa_3 \ne 0, the (unstandardized) asymptotic crossing points of the estimate with the target are

(62)   M = n\mu + \gamma_i\, \sigma\sqrt{n}, \quad 1 \le i \le 3(k+1),

where \gamma_1 < \gamma_2 < \cdots < \gamma_{3(k+1)} are the 3(k+1) real and simple zeroes of the Hermite polynomial H_{3(k+1)}.

Proof. This follows directly from Theorem 3 and the fact that all zeroes of Hermite polynomials are real and simple (see for instance Szego (1975, Section 6.2)). □

Note that, without additional assumptions, the main error term in (40) is simply

(63)   \left(\frac{1}{\sqrt{n}}\right)^{k+1} q_{k+1}(x),

i.e. a potentially several-term linear combination of Hermite polynomials in x. In this case, the many results on the locations of zeroes of Hermite polynomials (see for instance Szego (1975, Chapter 6)) cannot be directly applied. In fact, the zeroes of q_{k+1}(x) need not be real, as is seen in the following example.

Example 5. Consider the case k = 2. We have, by Theorem 3,

       \tilde f_n(M) - V_1(M) - V_2(M) = \frac{1}{n^{3/2}}\, H_9(x)\, \frac{(\bar\lambda_3)^3}{3!} + o\!\left(\frac{1}{n^{3/2}}\right).

The asymptotic crossing points here are M = n\mu + \gamma_i\, \sigma\sqrt{n}, i = 1, 2, \ldots, 9, where the \gamma_i are the zeros of the polynomial

       H_9(x) = x^9 - 36x^7 + 378x^5 - 1260x^3 + 945x,

i.e.

       x \approx 0, \pm 1.023255664, \pm 2.076847979, \pm 3.205429003, \pm 4.512745863.

In the case of the Edgeworth expansion, we have

       \tilde f_n(M) - E_1(M) - E_2(M) = \frac{H_9(x)\, \frac{(\bar\lambda_3)^3}{3!} + H_7(x)\, \bar\lambda_4 \bar\lambda_3 + H_5(x)\, \bar\lambda_5}{n^{3/2}} + o\!\left(\frac{1}{n^{3/2}}\right).

Similar to the above, the asymptotic crossing points of the estimate with the target are M = n\mu + \gamma_i\, \sigma\sqrt{n}, where the \gamma_i are the distinct real zeros of the polynomial Z defined via

(64)   Z(x) = H_9(x)\, \frac{(\bar\lambda_3)^3}{3!} + H_7(x)\, \bar\lambda_4 \bar\lambda_3 + H_5(x)\, \bar\lambda_5.

Note that the zeroes of Z are dependent on the distribution of X_1 through the cumulants \kappa_3, \kappa_4 and \kappa_5, as well as the variance \sigma^2. In the case where X is supported on \{0, 1, 2\} with probabilities P(X = 0) = 7/22, P(X = 1) = 14/22 and P(X = 2) = 1/22, we have

       (\sigma^2, \kappa_3, \kappa_4, \kappa_5) = (35/121, -21/1331, -574/14641, 21525/161051)

and

(65)   Z(x) = \frac{\sqrt{35}}{7350000}\, x \left(417255 - 263190x^2 + 21762x^4 + 446x^6 - x^8\right).

There are seven real zeros of Z, namely

(66)   x \approx 0, \pm 1.375929290, \pm 2.897073323, \pm 22.12179176,

and two complex zeros

(67)   x \approx \pm 7.325291\, i.
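The zeros above are easily located numerically; a base R sketch using polyroot() (coefficients are supplied in ascending order of powers, and Z is taken up to its constant factor \sqrt{35}/7350000, as in (65)):

    # Zeros of H_9 and of Z in Example 5 via polyroot()
    H9 <- c(0, 945, 0, -1260, 0, 378, 0, -36, 0, 1)   # H_9(x), ascending powers
    sort(Re(polyroot(H9)))                            # all nine zeros are real
    Z <- c(0, 417255, 0, -263190, 0, 21762, 0, 446, 0, -1)
    polyroot(Z)                                       # seven real zeros, two purely imaginary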

Note that the analogous result to Theorem 7 for symmetric distributions follows from
Theorem 6.
For further work on asymptotic crossing points see for instance Finner, Roters, and Dickhaus (2007) and Berenhaut, Chen, and Tran (2008) and the references therein.

Acknowledgements. We are very thankful to a referee for comments and insights that improved this manuscript.

REFERENCES
Berenhaut, K. S., Chen, D., & Tran, V. (2008). On the compatibility of Dyson's conditions. Statist. Probab. Lett., 78(17), 3110–3113.
Butler, R. W. (2007). Saddlepoint approximations with applications. Cambridge: Cambridge University Press.
Charalambides, C. A., & Singh, J. (1988). A review of the Stirling numbers, their generalizations and statistical applications. Comm. Statist. Theory Methods, 17(8), 2533–2595.
Churchill, E. (1946). Information given by odd moments. Ann. Math. Statistics, 17, 244–246.
Davison, A. C. (2003). Statistical models (Vol. 11). Cambridge: Cambridge University Press.
Finner, H., Roters, M., & Dickhaus, T. (2007). Characterizing density crossing points. Amer. Statist., 61(1), 28–33.
Gnedenko, B. V., & Kolmogorov, A. N. (1954). Limit distributions for sums of independent random variables. Cambridge, Mass.: Addison-Wesley Publishing Company, Inc. (Translated and annotated by K. L. Chung. With an Appendix by J. L. Doob.)
Goutis, C., & Casella, G. (1999). Explaining the saddlepoint approximation. Amer. Statist., 53(3), 216–224.
Guo, L. (2005). Baxter algebras, Stirling numbers and partitions. J. Algebra Appl., 4(2), 153–164.
Kolassa, J. E. (1991). Saddlepoint approximations in the case of intractable cumulant generating functions. In Selected Proceedings of the Sheffield Symposium on Applied Probability (Sheffield, 1989) (Vol. 18, pp. 236–255). Hayward, CA: Inst. Math. Statist.
Mood, A., Graybill, F., & Boes, D. (1974). Introduction to the theory of statistics (3rd ed.). Auckland: McGraw-Hill.
Nason, G. P. (2006). On the sum of t and Gaussian random variables. Statist. Probab. Lett., 76(12), 1280–1286.
Petrov, V. V. (1975). Sums of independent random variables. New York: Springer-Verlag. (Translated from the Russian by A. A. Brown. Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 82.)
Pham-Gia, T., & Turkkan, N. (1998). Distribution of the linear combination of two general beta variables and applications. Comm. Statist. Theory Methods, 27(7), 1851–1869.
Riordan, J. (1979). Combinatorial identities. Huntington, N.Y.: Robert E. Krieger Publishing Co. (Reprint of the 1968 original.)
Ruckdeschel, P., Kohl, M., Stabla, T., & Camphausen, F. (2006, May). S4 classes for distributions. R News, 6(2), 2–6.
Szego, G. (1975). Orthogonal polynomials (4th ed.). Providence, R.I.: American Mathematical Society. (American Mathematical Society, Colloquium Publications, Vol. XXIII.)
DEPARTMENT OF MATHEMATICS, WAKE FOREST UNIVERSITY, WINSTON-SALEM, NC 27109
E-mail address: berenhks@wfu.edu

DEPARTMENT OF MATHEMATICS, WAKE FOREST UNIVERSITY, WINSTON-SALEM, NC 27109
E-mail address: jjc03001@gmail.com

H. MILTON STEWART SCHOOL OF INDUSTRIAL AND SYSTEMS ENGINEERING, GEORGIA INSTITUTE OF TECHNOLOGY, ATLANTA, GA 30332-0205
E-mail address: rhilton3@gatech.edu
