
Handout for Chapters 1-3 of Bouchaud

March 19, 2012

1 Definitions
1.1 Typical values of random variables

The first part of this handout contains the important definitions that are introduced in the first chapter of Bouchaud's book. A probability density P(x) is such that the probability that x lies between a and b is ∫_a^b P(x) dx. Also define P_<(x) ≡ P(X < x) and P_>(x) ≡ 1 − P_<(x). There are several typical values that describe a distribution. The most probable value is x* = argmax_x P(x), the median x_med is such that P_<(x_med) = P_>(x_med) = 1/2, and the expected value is m ≡ ⟨x⟩. For unimodal and symmetric distributions these three values coincide. Furthermore define the mean absolute deviation (or MAD) as
E_mad = ∫ |x − x_med| P(x) dx    (1)

and the root mean square (or RMS) σ as


σ = sqrt( ⟨(x − m)²⟩ ).    (2)

The full width at half maximum w_{1/2} is defined such that


P(x* ± w_{1/2}/2) = P(x*)/2.    (3)

Finally, the quantile width w is determined by the condition


P( x ∉ [x* − w/2, x* + w/2] ) = a    (4)

with a a number between 0 and 1, usually a = 0.1 or a = 0.05.


Remark: Mean and variance are better suited for calculations because of their analytic properties, e.g. the additivity of the variance under convolution. The median and the MAD, on the other hand, are often more robust, i.e. less sensitive to rare events, and are therefore preferable for some statistics. For distributions with very fat tails the mean square deviation may even be infinite; in that case the typical values must be described using quantiles.
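The following minimal sketch (assuming NumPy and SciPy are available; the Student-t sample and the injected outlier are illustrative choices) compares these typical values on a fat-tailed sample and shows the robustness difference mentioned above.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    x = stats.t.rvs(df=3, size=100_000, random_state=rng)  # fat-tailed Student-t sample

    def summary(sample):
        m = sample.mean()                          # expected value m
        med = np.median(sample)                    # median x_med
        rms = np.sqrt(np.mean((sample - m) ** 2))  # RMS sigma, Eq. (2)
        mad = np.mean(np.abs(sample - med))        # MAD E_mad, Eq. (1)
        return m, med, rms, mad

    print(summary(x))
    # One extreme event: mean and RMS move substantially, median and MAD much less.
    print(summary(np.append(x, 1e4)))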

The characteristic function P̂(z) is the Fourier transform of the probability density,

P̂(z) = ∫ e^{izx} P(x) dx.    (5)

From the characteristic function it is possible to recover the probability density by the inverse Fourier transform,

P(x) = (1/2π) ∫ e^{−izx} P̂(z) dz.    (6)

From the characteristic function it is possible to obtain the moments as


m_n = ⟨x^n⟩ = (−i)^n d^n P̂(z)/dz^n |_{z=0},    (7)

the cumulants by c_n = (−i)^n d^n log P̂(z)/dz^n |_{z=0}, and the normalized cumulants as λ_n ≡ c_n / σ^n.

Particularly important normalized cumulants are the skewness λ_3 = ⟨(x − m)³⟩/σ³ and the (excess) kurtosis κ = λ_4 = ⟨(x − m)⁴⟩/σ⁴ − 3. Note that for a Gaussian the kurtosis is 0, and that one can use it to measure how "far" a distribution is from a Gaussian (distributions with κ > 0 are called leptokurtic).
Remark: A moment of order n is only defined if P(x) has tails decaying faster than 1/|x|^{n+1}.
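As a quick numerical illustration (assuming NumPy and SciPy; the sample sizes and the Student-t tail index are arbitrary choices), one can check that the excess kurtosis is close to zero for a Gaussian sample and clearly positive (leptokurtic) for a fat-tailed one:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    samples = {
        "gaussian": rng.normal(size=1_000_000),
        "student t": stats.t.rvs(df=10, size=1_000_000, random_state=rng),
    }

    for name, s in samples.items():
        m, sd = s.mean(), s.std()
        skew = np.mean((s - m) ** 3) / sd**3        # lambda_3
        kurt = np.mean((s - m) ** 4) / sd**4 - 3.0  # kappa = lambda_4 (excess kurtosis)
        print(f"{name}: skewness = {skew:+.3f}, excess kurtosis = {kurt:+.3f}")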
1.2 Important distributions

The second part of the first chapter summarizes some important distributions for econophysics. The Log-normal distribution is a distribution such that log x is normally distributed. Its density is
P_LN(x) = 1/(x sqrt(2πσ²)) exp( −log²(x/x_0) / (2σ²) )    (8)

and its moments are m_n = x_0^n exp(n²σ²/2). It is used if one assumes that the rates of return, rather than the absolute changes of prices, are independent random variables. Note that it is empirically hard to distinguish from the inverse gamma distribution.

The Lévy distribution has "fatter tails" than the Gaussian in the sense that it puts heavier weight on the tails. It has a power-law behavior for large arguments (Pareto tails) and is stable under addition. The density behaves asymptotically as

L_μ(x) ≈ μ A_±^μ / |x|^{1+μ}    for x → ±∞,  0 < μ < 2.    (9)

For μ = 1 we obtain the Cauchy distribution,


L_1(x) = A / (x² + π² A²).    (10)

As the skewness is not well defined for the Lévy distribution, one can define an asymmetry parameter β = (A_+^μ − A_−^μ)/(A_+^μ + A_−^μ) instead. The characteristic function of the symmetric Lévy distribution is

L̂_μ(z) = exp( −a_μ |z|^μ ).    (11)
Remarks: By taking the limit μ → 2 we can obtain the Gaussian distribution from the Lévy distribution. Furthermore, μ is hard to estimate from empirical data due to the existence of subleading terms.
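The difficulty of estimating μ can be illustrated numerically. The sketch below (a rough illustration only; it assumes SciPy's levy_stable sampler, and μ = 1.5 and the fitting windows are arbitrary choices) fits the apparent slope of log P_>(|x|) versus log |x| over different tail windows and typically obtains different answers, precisely because of the subleading terms:

    import numpy as np
    from scipy import stats

    true_mu = 1.5
    rng = np.random.default_rng(2)
    x = stats.levy_stable.rvs(alpha=true_mu, beta=0.0, size=200_000, random_state=rng)

    x = np.sort(np.abs(x))[::-1]                   # |x| in decreasing order
    p_tail = np.arange(1, x.size + 1) / x.size     # empirical P_>(|x|)

    for top in (10_000, 100):                      # number of largest events used in the fit
        slope, _ = np.polyfit(np.log(x[:top]), np.log(p_tail[:top]), 1)
        print(f"{top} largest events: apparent mu = {-slope:.2f} (true {true_mu})")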

The Truncated Lévy distribution is useful when one observes that Pareto tails only appear in an intermediate regime, while beyond that regime the distribution decays exponentially. The characteristic function is
L̂_μ^{(t)}(z) = exp( −a_μ (α² + z²)^{μ/2} cos(μ arctan(|z|/α)) / cos(πμ/2) )    (12)

Note that for α → 0 one recovers a pure Lévy distribution, while for α → ∞ it turns into a Gaussian.

Other distributions worth mentioning are the hyperbolic distribution


P_H(x) = 1/(2 x_0 K_1(α x_0)) exp( −α sqrt(x_0² + x²) )    (13)

and the Student distribution


P_S(x) = (1/sqrt(π)) · Γ((1+μ)/2)/Γ(μ/2) · a^μ / (a² + x²)^{(1+μ)/2}.    (14)

Note that the Student distribution has power-law tails. For μ = 1 it is a Cauchy distribution and for μ → ∞ it turns into a Gaussian.

2 Maximum and addition of random variables


2.1 Maximum of iid random variables

The order of magnitude of the largest event Λ_max is defined by

P_>(Λ_max) = 1/N.    (15)

With the maximum value x_max = max_{i=1,2,...,N}(x_i), the cumulative distribution is obtained as

P(x_max < Λ) = (P_<(Λ))^N = (1 − P_>(Λ))^N ≈ exp(−N P_>(Λ)).    (16)

For large N, the distribution of x_max only depends on the asymptotic behavior of the distribution of x.

Now define u by setting x_max = Λ_max + u/α. For exponential tails, P_>(x) ≈ exp(−αx), we have Λ_max = log(N)/α; in this case u is Gumbel distributed. For power-law tails, P_>(x) ≈ (A_+/x)^μ, we have Λ_max = A_+ N^{1/μ} and u is Fréchet distributed. Now we consider not only the largest value but order all values in decreasing order and call the n-th largest value Λ[n]. The probability distribution P_n of Λ[n] is then given by:
P_n(Λ[n]) = N · C(N−1, n−1) · P(x = Λ[n]) · (P(x > Λ[n]))^{n−1} · (P(x < Λ[n]))^{N−n},    (17)

where C(N−1, n−1) denotes the binomial coefficient.

Let us now fix n and denote the maximum of P_n by Λ[n]. In the case of an exponential tail, we have Λ[n] = log(N/n)/α. In the case of power-law tails, we have Λ[n] = A_+ (N/n)^{1/μ}. This can be used to determine the exponent μ of a power-law distribution.
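A minimal sketch of this rank-ordering technique (NumPy assumed; the Pareto tail index and the range of ranks used in the fit are illustrative choices): for power-law tails the n-th largest of N samples behaves as Λ[n] ≈ A_+ (N/n)^{1/μ}, so the slope of log Λ[n] versus log n estimates −1/μ.

    import numpy as np

    mu, N = 1.5, 100_000                       # true tail exponent and sample size
    rng = np.random.default_rng(3)
    x = rng.pareto(mu, size=N) + 1.0           # Pareto sample with P_>(x) = x^(-mu), x >= 1

    ranked = np.sort(x)[::-1]                  # Lambda[1] >= Lambda[2] >= ...
    n = np.arange(1, N + 1)
    use = slice(9, 1000)                       # intermediate ranks, away from the noisy extremes
    slope, _ = np.polyfit(np.log(n[use]), np.log(ranked[use]), 1)
    print(f"estimated mu = {-1.0 / slope:.2f} (true value {mu})")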

2.2 Sum of random variables

We are often interested in the sum of random variables since the sum of small price changes describes the price changes over longer times. Consider the sum of N independent random variables, X = X1 + X2 + ... + XN , where Xi is distributed according to Pi (Xi ). Then the distribution of X can be obtained as
P(x, N) = ∫ P_1(x_1) ··· P_{N−1}(x_{N−1}) P_N(x − x_1 − ··· − x_{N−1}) ∏_{i=1}^{N−1} dx_i = [P_1]^{*N}(x),    (18)

where the last equality holds for iid variables and *N denotes the N-fold convolution.

The cumulants of a distribution convoluted N times with itself simply add,

c_{n,N} = N c_{n,1},    (19)

where c_{n,1} are the cumulants of the elementary distribution. The normalized cumulants of the sum are then

λ_n^N = c_{n,N} / (c_{2,N})^{n/2} = N^{1−n/2} c_{n,1} / (c_{2,1})^{n/2}.    (20)

The normalized cumulants for n > 2 thus vanish as N grows (like N^{1−n/2}), as we expect from the central limit theorem. Furthermore, the sum of variables with a power-law tail has a power-law tail itself, with a different tail amplitude.
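A quick numerical check of the additivity of cumulants (NumPy assumed; the exponential distribution, whose excess kurtosis is 6, is an arbitrary choice): the excess kurtosis of the sum of N iid variables should decay as κ_1/N.

    import numpy as np

    rng = np.random.default_rng(4)
    kappa_1 = 6.0                              # excess kurtosis of the exponential distribution

    for N in (1, 4, 16, 64):
        sums = rng.exponential(size=(200_000, N)).sum(axis=1)   # 200k realizations of the sum
        m, s = sums.mean(), sums.std()
        kappa_N = np.mean((sums - m) ** 4) / s**4 - 3.0
        print(f"N = {N:3d}: measured kurtosis {kappa_N:.3f}  vs  kappa_1/N = {kappa_1 / N:.3f}")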
2.3 Stable distribution and self-similarity

A distribution is called stable (also scale-invariant) if the law of the sum of variables has the same shape as the elementary distribution. By saying that two distributions have the same shape we mean that they coincide up to some dilation and translation. The family of all stable laws coincides with the family of Lévy distributions (including the Gaussian distribution).
2.4 Central Limit Theorems

Under certain conditions, the sum of a large number of i.i.d. random variables is distributed according to a stable law.
Convergence to a Gaussian: Let (X_n)_n be an i.i.d. sequence of random variables with mean m and finite variance σ². Then their sum asymptotically tends towards a Gaussian, in the sense that

lim_{N→∞} P( u_1 ≤ (Σ_{i=1}^{N} X_i − Nm) / (σ√N) ≤ u_2 ) = ∫_{u_1}^{u_2} (1/√(2π)) e^{−u²/2} du.    (21)

Remarks:

- All summands are of the same order and much smaller than the sum (∼ σ N^{1/2}); none of them contributes a finite fraction to the value of the sum.

- The condition that the (X_n)_n be independent can be relaxed: it suffices that they are not too correlated, i.e. the correlation function ⟨X_i X_j⟩ − m² must decay sufficiently fast (see the example below).

- The condition that the (X_n)_n be identically distributed can be relaxed: it suffices that their distributions are not too different, such that none of the variances dominates over all the others.

- For finite values of N, the CLT only gives a good approximation in a central region around the mean value (which depends on the elementary distribution of the X_i's); it does not apply to the tails.

Example: Let (X_n)_n be a sequence of not necessarily independent random variables, and let C_ij := ⟨X_i X_j⟩ − m². Assume that the process is stationary, i.e. C_ij = C(|i − j|) only depends on |i − j|. For the variance of the sum X we find (setting m to 0):

⟨X²⟩ = Σ_{i,j=1}^{N} C_ij = N σ² + 2 Σ_{l=1}^{N−1} (N − l) C(l).    (22)

Hence, for the variance of the sum not to grow faster than N (as in the CLT), C(l) has to vanish faster than 1/l.
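The following sketch (NumPy assumed; the AR(1) process and its parameters are illustrative choices) checks Eq. (22) on a stationary process with C(l) = σ²ρ^l, which decays fast enough for the variance of the sum to keep growing linearly in N.

    import numpy as np

    rho, sigma2, N, trials = 0.5, 1.0, 200, 20_000
    rng = np.random.default_rng(5)

    # Simulate stationary AR(1) paths: X_{i+1} = rho*X_i + noise, variance sigma2, C(l) = sigma2*rho^l.
    x = np.zeros((trials, N))
    x[:, 0] = rng.normal(scale=np.sqrt(sigma2), size=trials)
    noise = rng.normal(scale=np.sqrt(sigma2 * (1 - rho**2)), size=(trials, N))
    for i in range(1, N):
        x[:, i] = rho * x[:, i - 1] + noise[:, i]

    var_sum = x.sum(axis=1).var()                                     # <X^2> from simulation
    l = np.arange(1, N)
    predicted = N * sigma2 + 2 * np.sum((N - l) * sigma2 * rho**l)    # Eq. (22)
    print(var_sum, predicted)                                         # both close to N*sigma2*(1+rho)/(1-rho)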
Convergence to a Lévy distribution: Let (X_n)_n be an i.i.d. sequence of random variables which are asymptotically distributed as a power law with exponent μ < 2 and tail amplitude A^μ = A_+^μ + A_−^μ (in particular, the variance is infinite). Then their sum asymptotically tends towards a Lévy distribution.

Remarks:

- The largest summand is of the same order as the sum (∼ N^{1/μ}); it contributes a finite fraction to the value of the sum (see the sketch below).

- Similar relaxations and caveats apply as in the Gaussian case.
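A small illustration of the first point (NumPy assumed; positive Pareto variables with μ = 0.5 are used for simplicity): with an infinite-variance power-law tail the single largest summand carries a finite fraction of the sum, whereas for a finite-variance distribution this fraction vanishes as N grows.

    import numpy as np

    rng = np.random.default_rng(6)
    trials, N = 2_000, 5_000

    samplers = {
        "power law, mu = 0.5": lambda size: rng.pareto(0.5, size=size) + 1.0,  # infinite variance
        "exponential": lambda size: rng.exponential(size=size),                # finite variance
    }

    for name, draw in samplers.items():
        x = draw((trials, N))
        frac = x.max(axis=1) / x.sum(axis=1)   # share of the sum carried by the largest term
        print(f"{name}: mean max/sum = {frac.mean():.3f}")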


2.5 Large deviations

Let X be the sum of N random variables with mean m, variance σ² and finite cumulants. Then the CLT

- states that we can approximate the distribution of X for large enough N by a Gaussian distribution, but it

- does not specify the quality of the approximation for finite N.

Typically, the approximation is valid in a central region around the mean m, but fails for large deviations from m. To determine this 'central region', consider the normalized random variable U = (X − Nm)/(σ√N), which has mean 0 and variance 1. We want to approximate tail probabilities P_>(u) for fixed N by the respective values computed with a Gaussian distribution, P_{G>}(u). To guarantee that

P_>(u) ≈ P_{G>}(u) = (1/2) erfc( u/√2 )    (23)

holds, N should be chosen such that (λ_3 denotes the skewness, λ_4 the kurtosis):


N ≫ N* = λ_3², if λ_3 ≠ 0;    N ≫ N* = λ_4, if λ_3 = 0.

The region where the relative error is small is given as follows:

- if λ_3 ≠ 0, then |x − Nm| ≪ σ λ_3^{−1/3} N^{2/3};

- if λ_3 = 0, then |x − Nm| ≪ σ λ_4^{−1/4} N^{3/4}.

Warning: If the elementary distribution decreases as a power law (so that high-order cumulants are infinite), the central region is much more restricted.
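A rough Monte Carlo check of the above (NumPy and SciPy assumed; the exponential distribution, for which λ_3 = 2 and hence N* = 4, and the value u = 3 are arbitrary choices): the Gaussian tail estimate (23) is poor for small N and improves as N grows well beyond N*.

    import numpy as np
    from scipy.special import erfc

    rng = np.random.default_rng(7)
    u = 3.0                                        # look three standard deviations into the tail
    gauss_tail = 0.5 * erfc(u / np.sqrt(2.0))      # P_G>(u) from Eq. (23)

    for N in (2, 8, 32, 128):
        sums = rng.exponential(size=(200_000, N)).sum(axis=1)
        uu = (sums - N) / np.sqrt(N)               # m = 1 and sigma = 1 for the exponential
        print(f"N = {N:4d}: P_>(3) = {np.mean(uu > u):.5f}  vs  Gaussian {gauss_tail:.5f}")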

The behavior outside the central region is studied in the mathematical theory of Large Deviations. There, the asymptotic decrease of tail probabilities is characterized explicitly e.g. using the so-called Cramer function.
2.6 Conclusions: survival and vanishing of tails

"The CLT teaches us that if the number of terms in a sum is large, the sum becomes (nearly) a Gaussian variable. This sum can represent the temporal aggregation of the daily fluctuations of a financial asset, or the aggregation, in a portfolio, of different stocks. The Gaussian (or non-Gaussian) nature of this sum is thus of crucial importance for risk control, since the extreme tails of the distribution correspond to the most "dangerous" fluctuations. Fluctuations are never Gaussian in the far-tails: one can explicitly show that if the elementary distribution decays as a power-law (or as an exponential, which formally corresponds to μ = ∞), the distribution of the sum decays in the very same manner outside the central region, i.e. much more slowly than the Gaussian. The CLT simply ensures that these tail regions are expelled more and more towards large values of X when N grows, and that their associated probability is smaller and smaller. When confronted with a concrete problem, one must decide whether N is large enough to be satisfied with a Gaussian description of the risks. In particular, if N is less than the characteristic value N* defined above, the Gaussian approximation is bad." [Bouchaud and Potters]

3 Divisibility and path integrals


3.1 Divisibility

We call a random variable X N-divisible if it is possible to find N independent identically distributed random variables X_i, 1 ≤ i ≤ N, such that

X = Σ_{i=1}^{N} X_i.    (24)

Since Gaussian and Lévy distributions are stable, we know that Gaussian or Lévy distributed random variables are divisible. (Since convolution becomes multiplication in Fourier space, this is equivalent to the question whether P̂(z)^{1/N} is the Fourier transform of a probability distribution.) By taking the limit N → ∞ we pass from the previous problem to finding infinitesimal increments dX(t) such that

X(T) = ∫_0^T dX(t),    (25)

where the time becomes continuous rather than discrete. If those infinitesimal increments exist, then X is called infinitely divisible. If X is a Gaussian variable, then the increments are also Gaussian; therefore the sample path X(t) is continuous, and the trajectory X(t) is a Brownian motion. If X is a Lévy stable random variable, then the increments are also Lévy distributed and the sample path has discontinuities (jumps) with finite probability.
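A minimal sketch of the two kinds of paths (NumPy assumed; step counts and scales are arbitrary choices): Gaussian increments produce a continuous-looking Brownian path, whereas Cauchy (Lévy with μ = 1) increments produce occasional jumps that dominate the trajectory.

    import numpy as np

    rng = np.random.default_rng(8)
    n_steps, dt = 10_000, 1e-3

    brownian = np.cumsum(rng.normal(scale=np.sqrt(dt), size=n_steps))  # Gaussian increments ~ sqrt(dt)
    cauchy = np.cumsum(rng.standard_cauchy(size=n_steps) * dt)         # Cauchy increments scale like dt

    for name, path in [("brownian", brownian), ("cauchy (Levy, mu = 1)", cauchy)]:
        steps = np.abs(np.diff(path))
        # For the Levy path a single jump dwarfs the typical step; for Brownian motion it does not.
        print(f"{name}: largest step / median step = {steps.max() / np.median(steps):.1f}")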
3.2 Path integrals

Let us define a trajectory as Γ = {x_1, x_2, ..., x_N}, where x_0 is the starting point. The probability to observe the trajectory Γ for iid increments is
P(Γ) = ∏_{i=0}^{N−1} P_1(x_{i+1} − x_i),    (26)

with P_1 the elementary distribution of the increments. One can average some quantity O over the paths, for example O = exp( −Σ_{i=0}^{N−1} V(x_i) ) with V some function. Then
⟨O⟩ = ∫ ∏_{i=0}^{N−1} P_1(x_{i+1} − x_i) e^{−V(x_i)} dx_i.    (27)

It is interesting to define an average restricted to trajectories ending at a fixed point x:

⟨O⟩(x, N) ≡ ∫ δ(x_N − x) ∏_{i=0}^{N−1} P_1(x_{i+1} − x_i) e^{−V(x_i)} dx_i,    (28)

and this is called a path integral. Of course one has ⟨O⟩_N = ∫ ⟨O⟩(x, N) dx. From this it is possible to obtain the following recursion:
⟨O⟩(x, N+1) = e^{−V(x)} ∫ P_1(x − x′) ⟨O⟩(x′, N) dx′.    (29)
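The recursion (29) can be iterated numerically on a grid: each step is a convolution with the elementary distribution P_1 followed by multiplication with exp(−V(x) dt). The sketch below (NumPy assumed; the grid, the harmonic potential and all parameters are illustrative choices) does exactly that for a Gaussian P_1.

    import numpy as np

    L, dt, m, sigma = 10.0, 0.01, 0.0, 1.0
    x = np.linspace(-L, L, 1001)               # odd number of points keeps the
    dx = x[1] - x[0]                           # convolution exactly centred

    p1 = np.exp(-((x - m * dt) ** 2) / (2 * sigma**2 * dt))   # Gaussian elementary distribution P_1
    p1 /= p1.sum() * dx
    V = 0.5 * x**2                             # example potential V(x)

    O = np.zeros_like(x)                       # initial condition delta(x - x0) with x0 = 0
    O[np.argmin(np.abs(x))] = 1.0 / dx

    for _ in range(100):                       # iterate Eq. (29) up to T = 100*dt = 1
        O = np.exp(-V * dt) * np.convolve(O, p1, mode="same") * dx

    print(O.sum() * dx)                        # <O>_N = integral of <O>(x, N) over x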

In the continuous limit (i.e. V → V dt) and assuming that P_1 is very sharply peaked around its mean m dt with variance σ² dt, one can Taylor expand ⟨O⟩(x′, N) around x in powers of x′ − x, obtaining

lim_{dt→0} [⟨O⟩(x, N+1) − ⟨O⟩(x, N)] / dt = ∂⟨O⟩(x, T)/∂T    (30)

∂⟨O⟩(x, T)/∂T = −m ∂⟨O⟩(x, T)/∂x + (σ²/2) ∂²⟨O⟩(x, T)/∂x² − V(x) ⟨O⟩(x, T)    (31)

(neglecting terms of order dt^{3/2}). This means that ⟨O⟩(x, T) satisfies a partial differential equation. For computational purposes it is often useful to do the opposite, i.e. to represent the solution of a partial differential equation as a path integral (easily simulated via Monte Carlo). For P_1 Gaussian one can find:
⟨O⟩(x, T) = ∫_{x(0)=x_0}^{x(T)=x} exp( −∫_0^T [ (1/(2σ²)) (dx/dt − m)² + V(x) ] dt ) Dx,    (32)

where Dx is the measure over path space. This ⟨O⟩(x, T) is the solution of the previous PDE with initial condition ⟨O⟩(x, T = 0) = δ(x − x_0). This is the so-called Feynman-Kac integral.
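A minimal Monte Carlo sketch of the Feynman-Kac representation (NumPy assumed; drift, volatility, potential and time step are illustrative choices): the unrestricted average ⟨O⟩_T = ⟨exp(−∫_0^T V(x(t)) dt)⟩ is estimated by simulating Gaussian-increment paths and averaging the exponential weight. Up to discretization and sampling error, it should agree with the grid recursion sketched earlier for matching parameters.

    import numpy as np

    rng = np.random.default_rng(9)
    n_paths, n_steps, dt = 50_000, 200, 0.005          # total time T = 1
    m, sigma, x0 = 0.0, 1.0, 0.0

    def V(x):                                          # same example potential as above
        return 0.5 * x**2

    increments = m * dt + sigma * np.sqrt(dt) * rng.normal(size=(n_paths, n_steps))
    paths = x0 + np.cumsum(increments, axis=1)         # one Gaussian-increment path per row
    weights = np.exp(-dt * V(paths).sum(axis=1))       # exp(-sum_i V(x_i) dt) along each path

    print(weights.mean())                              # Monte Carlo estimate of <O>_T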
