
An Introduction to Stochastic Calculus with Applications

to Finance
Ovidiu Calin
Department of Mathematics
Eastern Michigan University
Ypsilanti, MI 48197 USA
ocalin@emich.edu
Preface
The goal of this work is to introduce elementary Stochastic Calculus to senior undergraduate as well as master's students with Mathematics, Economics and Business majors. The author's goal was to capture as much as possible of the spirit of elementary Calculus, to which the students have already been exposed at the beginning of their majors. This assumes a presentation that mimics similar properties of deterministic Calculus, which facilitates the understanding of the more complicated concepts of Stochastic Calculus. Since deterministic Calculus books usually start with a brief presentation of elementary functions, and then continue with limits and other properties of functions, we employed here a similar approach, starting with elementary stochastic processes, different types of limits, and pursuing with properties of stochastic processes. The chapters regarding differentiation and integration follow the same pattern. For instance, there is a product rule, a chain-type rule and an integration by parts in Stochastic Calculus, which are modifications of the well-known rules from elementary Calculus.
Since deterministic Calculus can be used for modeling regular business problems, in the second part of the book we deal with stochastic modeling of business applications, such as Financial Derivatives, whose modeling is based solely on Stochastic Calculus.
In order to make the book available to a wider audience, we sacrificed rigor for clarity. Most of the time we assumed maximal regularity conditions for which the computations hold and the statements are valid. This will be found attractive by both Business and Economics students, who might otherwise get lost in a very profound mathematical textbook where the forest's scenery is obscured by the sight of the trees. An important feature of this textbook is the large number of solved problems and examples, which will benefit both the beginner and the advanced student.
This book grew from a series of lectures and courses given by the author at Eastern Michigan University (USA), Kuwait University (Kuwait) and Fu-Jen University (Taiwan). Several students read the first draft of these notes and provided valuable feedback, supplying a list of corrections, which is far from exhaustive. Any typos or
comments regarding the present material are welcome.
The Author,
Ann Arbor, October 2012
Contents
I Stochastic Calculus 3
1 Basic Notions 5
1.1 Probability Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Sample Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Events and Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Distribution Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.6 Basic Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.7 Independent Random Variables . . . . . . . . . . . . . . . . . . . . . . . 12
1.8 Integration in Probability Measure . . . . . . . . . . . . . . . . . . . . . 13
1.9 Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.10 Radon-Nikodym's Theorem . . . . . . . . . . . . . . . . . . . . . . . 15
1.11 Conditional Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.12 Inequalities of Random Variables . . . . . . . . . . . . . . . . . . . . . . 18
1.13 Limits of Sequences of Random Variables . . . . . . . . . . . . . . . . . 24
1.14 Properties of Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.15 Stochastic Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2 Useful Stochastic Processes 33
2.1 The Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.2 Geometric Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.3 Integrated Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.4 Exponential Integrated Brownian Motion . . . . . . . . . . . . . . . . . 43
2.5 Brownian Bridge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.6 Brownian Motion with Drift . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.7 Bessel Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.8 The Poisson Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.9 Interarrival times . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.10 Waiting times . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.11 The Integrated Poisson Process . . . . . . . . . . . . . . . . . . . . . . . 51
2.12 Submartingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3 Properties of Stochastic Processes 57
3.1 Stopping Times . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.2 Stopping Theorem for Martingales . . . . . . . . . . . . . . . . . . . . . 61
3.3 The First Passage of Time . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.4 The Arc-sine Laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.5 More on Hitting Times . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.6 The Inverse Laplace Transform Method . . . . . . . . . . . . . . . . . . 74
3.7 Limits of Stochastic Processes . . . . . . . . . . . . . . . . . . . . . . . . 80
3.8 Convergence Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.9 The Martingale Convergence Theorem . . . . . . . . . . . . . . . . . . . 86
3.10 The Squeeze Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.11 Quadratic Variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.11.1 The Quadratic Variation of $W_t$ . . . . . . . . . . . . . . . 88
3.11.2 The Quadratic Variation of $N_t - \lambda t$ . . . . . . . . . . . 90
4 Stochastic Integration 95
4.1 Nonanticipating Processes . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.2 Increments of Brownian Motions . . . . . . . . . . . . . . . . . . . . . . 95
4.3 The Ito Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.4 Examples of Ito integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.4.1 The case $F_t = c$, constant . . . . . . . . . . . . . . . . . . 98
4.4.2 The case $F_t = W_t$ . . . . . . . . . . . . . . . . . . . . . . 98
4.5 Properties of the Ito Integral . . . . . . . . . . . . . . . . . . . . . . . . 99
4.6 The Wiener Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.7 Poisson Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.8 A Worked-Out Example: the case $F_t = M_t$ . . . . . . . . . . . . . 107
4.9 The distribution function of $X_T = \int_0^T g(t)\, dN_t$ . . . . . . . . . 112
5 Stochastic Differentiation 115
5.1 Differentiation Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.2 Basic Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.3 Ito's Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
5.3.1 Ito's formula for diffusions . . . . . . . . . . . . . . . . . . . . 119
5.3.2 Ito's formula for Poisson processes . . . . . . . . . . . . . . . . 121
5.3.3 Ito's multidimensional formula . . . . . . . . . . . . . . . . . . 122
6 Stochastic Integration Techniques 125
6.1 Fundamental Theorem of Stochastic Calculus . . . . . . . . . . . . . . . 125
6.2 Stochastic Integration by Parts . . . . . . . . . . . . . . . . . . . . . . . 127
6.3 The Heat Equation Method . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.4 Table of Usual Stochastic Integrals . . . . . . . . . . . . . . . . . . . . . 137
7 Stochastic Differential Equations 139
7.1 Definitions and Examples . . . . . . . . . . . . . . . . . . . . . . . . 139
7.2 Finding Mean and Variance from the Equation . . . . . . . . . . . . 140
7.3 The Integration Technique . . . . . . . . . . . . . . . . . . . . . . . . 146
7.4 Exact Stochastic Equations . . . . . . . . . . . . . . . . . . . . . . . 151
7.5 Integration by Inspection . . . . . . . . . . . . . . . . . . . . . . . . 153
7.6 Linear Stochastic Differential Equations . . . . . . . . . . . . . . . . 155
7.7 The Method of Variation of Parameters . . . . . . . . . . . . . . . . 161
7.8 Integrating Factors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
7.9 Existence and Uniqueness . . . . . . . . . . . . . . . . . . . . . . . . 166
8 Applications of Brownian Motion 169
8.1 The Generator of an Ito Diffusion . . . . . . . . . . . . . . . . . . . 169
8.2 Dynkin's Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
8.3 Kolmogorov's Backward Equation . . . . . . . . . . . . . . . . . . . 173
8.4 Exit Time from an Interval . . . . . . . . . . . . . . . . . . . . . . . . . 174
8.5 Transience and Recurrence of Brownian Motion . . . . . . . . . . . . . . 175
8.6 Application to Parabolic Equations . . . . . . . . . . . . . . . . . . . . . 179
9 Martingales and Girsanov's Theorem 183
9.1 Examples of Martingales . . . . . . . . . . . . . . . . . . . . . . . . . 183
9.2 Girsanov's Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
II Applications to Finance 195
10 Modeling Stochastic Rates 197
10.1 An Introductory Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 197
10.2 Langevin's Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
10.3 Equilibrium Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
10.3.1 The Rendleman and Bartter Model . . . . . . . . . . . . . . . . . 200
10.3.2 The Vasicek Model . . . . . . . . . . . . . . . . . . . . . . . . . . 201
10.3.3 The Cox-Ingersoll-Ross Model . . . . . . . . . . . . . . . . . . . 203
10.4 No-arbitrage Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
10.4.1 The Ho and Lee Model . . . . . . . . . . . . . . . . . . . . . . . 205
10.4.2 The Hull and White Model . . . . . . . . . . . . . . . . . . . . . 205
10.5 Nonstationary Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
10.5.1 Black, Derman and Toy Model . . . . . . . . . . . . . . . . . . . 206
10.5.2 Black and Karasinski Model . . . . . . . . . . . . . . . . . . . . . 207
11 Bond Valuation and Yield Curves 209
11.1 The Case of a Brownian Motion Spot Rate . . . . . . . . . . . . . . . . 209
11.2 The Case of Vasicek's Model . . . . . . . . . . . . . . . . . . . . . . 211
11.3 The Case of CIR's Model . . . . . . . . . . . . . . . . . . . . . . . . 213
11.4 The Case of a Mean Reverting Model with Jumps . . . . . . . . . . . . 214
11.5 The Case of a Model with pure Jumps . . . . . . . . . . . . . . . . . . . 218
12 Modeling Stock Prices 221
12.1 Constant Drift and Volatility Model . . . . . . . . . . . . . . . . . . . . 221
12.2 When Does the Stock Reach a Certain Barrier? . . . . . . . . . . . . . . 225
12.3 Time-dependent Drift and Volatility Model . . . . . . . . . . . . . . . . 226
12.4 Models for Stock Price Averages . . . . . . . . . . . . . . . . . . . . . . 227
12.5 Stock Prices with Rare Events . . . . . . . . . . . . . . . . . . . . . . . 234
13 Risk-Neutral Valuation 239
13.1 The Method of Risk-Neutral Valuation . . . . . . . . . . . . . . . . . . . 239
13.2 Call Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
13.3 Cash-or-nothing Contract . . . . . . . . . . . . . . . . . . . . . . . . . . 242
13.4 Log-contract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
13.5 Power-contract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
13.6 Forward Contract on the Stock . . . . . . . . . . . . . . . . . . . . . . . 245
13.7 The Superposition Principle . . . . . . . . . . . . . . . . . . . . . . . . . 245
13.8 General Contract on the Stock . . . . . . . . . . . . . . . . . . . . . . . 246
13.9 Call Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
13.10 General Options on the Stock . . . . . . . . . . . . . . . . . . . . . . . 247
13.11 Packages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
13.12 Asian Forward Contracts . . . . . . . . . . . . . . . . . . . . . . . . . . 251
13.13 Asian Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
13.14 Forward Contracts with Rare Events . . . . . . . . . . . . . . . . . . . 261
13.15 All-or-Nothing Lookback Options (Needs work!) . . . . . . . . . . . . . 262
13.16 Perpetual Look-back Options . . . . . . . . . . . . . . . . . . . . . . . . 265
13.17 Immediate Rebate Options . . . . . . . . . . . . . . . . . . . . . . . . . 266
13.18 Deferred Rebate Options . . . . . . . . . . . . . . . . . . . . . . . . . . 266
14 Martingale Measures 267
14.1 Martingale Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
14.1.1 Is the stock price $S_t$ a martingale? . . . . . . . . . . . . . 267
14.1.2 Risk-neutral World and Martingale Measure . . . . . . . . . . . . 269
14.1.3 Finding the Risk-Neutral Measure . . . . . . . . . . . . . . . . . 270
14.2 Risk-neutral World Density Functions . . . . . . . . . . . . . . . . . . . 271
14.3 Correlation of Stocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
14.4 Self-financing Portfolios . . . . . . . . . . . . . . . . . . . . . . . . . 275
14.5 The Sharpe Ratio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
14.6 Risk-neutral Valuation for Derivatives . . . . . . . . . . . . . . . . . . . 276
15 Black-Scholes Analysis 279
15.1 Heat Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
15.2 What is a Portfolio? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
15.3 Risk-less Portfolios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
15.4 Black-Scholes Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
15.5 Delta Hedging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
15.6 Tradeable securities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
15.7 Risk-less investment revised . . . . . . . . . . . . . . . . . . . . . . . . . 289
15.8 Solving the Black-Scholes . . . . . . . . . . . . . . . . . . . . . . . . . . 292
15.9 Black-Scholes and Risk-neutral Valuation . . . . . . . . . . . . . . . . . 295
15.10 Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . 295
15.11 Risk-less Portfolios for Rare Events . . . . . . . . . . . . . . . . . . 296
16 Black-Scholes for Asian Derivatives 301
16.1 Weighted averages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
16.2 Setting up the Black-Scholes Equation . . . . . . . . . . . . . . . . . . . 303
16.3 Weighted Average Strike Call Option . . . . . . . . . . . . . . . . . . . . 304
16.4 Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
16.5 Asian Forward Contracts on Weighted Averages . . . . . . . . . . . . . . 310
17 American Options 313
17.1 Perpetual American Options . . . . . . . . . . . . . . . . . . . . . . . . 313
17.1.1 Present Value of Barriers . . . . . . . . . . . . . . . . . . . . . . 313
17.1.2 Perpetual American Calls . . . . . . . . . . . . . . . . . . . . . . 317
17.1.3 Perpetual American Puts . . . . . . . . . . . . . . . . . . . . . . 318
17.2 Perpetual American Log Contract . . . . . . . . . . . . . . . . . . . . . 321
17.3 Perpetual American Power Contract . . . . . . . . . . . . . . . . . . . . 322
17.4 Finitely Lived American Options . . . . . . . . . . . . . . . . . . . . . . 323
17.4.1 American Call . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
17.4.2 American Put . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
17.4.3 MacMillan-Barone-Adesi-Whaley Approximation . . . . . . . 326
17.4.4 Black's Approximation . . . . . . . . . . . . . . . . . . . . . . 326
17.4.5 Roll-Geske-Whaley Approximation . . . . . . . . . . . . . . . . . 326
17.4.6 Other Approximations . . . . . . . . . . . . . . . . . . . . . . . . 326
18 Hints and Solutions 327
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
Part I
Stochastic Calculus
Chapter 1
Basic Notions
1.1 Probability Space
The modern theory of probability stems from the work of A. N. Kolmogorov published in 1933. Kolmogorov associates a random experiment with a probability space, which is a triplet $(\Omega, \mathcal{F}, P)$, consisting of the set of outcomes, $\Omega$, a $\sigma$-field, $\mathcal{F}$, with Boolean algebra properties, and a probability measure, $P$. In the following sections, each of these elements will be discussed in more detail.
1.2 Sample Space
A random experiment in the theory of probability is an experiment whose outcomes cannot be determined in advance. These experiments are done mentally most of the time.

When an experiment is performed, the set of all possible outcomes is called the sample space, and we shall denote it by $\Omega$. In financial markets one can regard this also as the states of the world, understanding by this all possible states the world might have. The number of states of the world that affect the stock market is huge. These would contain all possible values for the vector parameters that describe the world, and their number is practically infinite.

For some simple experiments the sample space is much smaller. For instance, flipping a coin produces the sample space with two states $\{H, T\}$, while rolling a die yields a sample space with six states. Choosing randomly a number between 0 and 1 corresponds to a sample space which is the entire segment $(0, 1)$.
All subsets of the sample space $\Omega$ form a set denoted by $2^{\Omega}$. The reason for this notation is that the set of parts of $\Omega$ can be put into a bijective correspondence with the set of binary functions $f: \Omega \to \{0, 1\}$. The number of elements of this set is $2^{|\Omega|}$, where $|\Omega|$ denotes the cardinal of $\Omega$. If the set is finite, $|\Omega| = n$, then $2^{\Omega}$ has $2^n$ elements. If $\Omega$ is infinitely countable (i.e. can be put into a bijective correspondence with the set of natural numbers), then $2^{\Omega}$ is infinite and its cardinal is the same as that of the real number set $\mathbb{R}$. As a matter of fact, if $\Omega$ represents all possible states of the financial world, then $2^{\Omega}$ describes all possible events which might happen in the market; this is supposed to be a full description of the total information of the financial world.

The following couple of examples provide instances of the set $2^{\Omega}$ in the finite and infinite cases.
Example 1.2.1 Flip a coin and measure the occurrence of outcomes by 0 and 1: associate a 0 if the outcome does not occur and a 1 if the outcome occurs. We obtain the following four possible assignments:
$$\{H \to 0, T \to 0\}, \quad \{H \to 0, T \to 1\}, \quad \{H \to 1, T \to 0\}, \quad \{H \to 1, T \to 1\},$$
so the set of subsets of $\{H, T\}$ can be represented as 4 sequences of length 2 formed with 0 and 1: $(0,0)$, $(0,1)$, $(1,0)$, $(1,1)$. These correspond in order to the sets
$$\emptyset, \quad \{T\}, \quad \{H\}, \quad \{H, T\},$$
which is the set $2^{\{H,T\}}$.
Example 1.2.2 Pick a natural number at random. Any subset of the sample space corresponds to a sequence formed with 0 and 1. For instance, the subset $\{1, 3, 5, 6\}$ corresponds to the sequence $10101100000\ldots$ having 1 on the 1st, 3rd, 5th and 6th places and 0 in rest. It is known that the number of these sequences is infinite and can be put into a bijective correspondence with the real number set $\mathbb{R}$. This can also be written as $|2^{\mathbb{N}}| = |\mathbb{R}|$, and stated by saying that the set of all subsets of the natural numbers $\mathbb{N}$ has the same cardinal as the real number set $\mathbb{R}$.
1.3 Events and Probability
The set of parts $2^{\Omega}$ satisfies the following properties:
1. It contains the empty set $\emptyset$;
2. If it contains a set $A$, then it also contains its complement $\bar{A} = \Omega \setminus A$;
3. It is closed with regard to unions, i.e., if $A_1, A_2, \ldots$ is a sequence of sets, then their union $A_1 \cup A_2 \cup \cdots$ also belongs to $2^{\Omega}$.
Any subset $\mathcal{F}$ of $2^{\Omega}$ that satisfies the previous three properties is called a $\sigma$-field. The sets belonging to $\mathcal{F}$ are called events. This way, the complement of an event, or the union of events, is also an event. We say that an event occurs if the outcome of the experiment is an element of that subset.
The chance of occurrence of an event is measured by a probability function $P: \mathcal{F} \to [0, 1]$ which satisfies the following two properties:
1. $P(\Omega) = 1$;
2. For any mutually disjoint events $A_1, A_2, \ldots \in \mathcal{F}$,
$$P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots.$$
The triplet $(\Omega, \mathcal{F}, P)$ is called a probability space. This is the main setup in which probability theory works.
Example 1.3.1 In the case of flipping a coin, the probability space has the following elements: $\Omega = \{H, T\}$, $\mathcal{F} = \{\emptyset, \{H\}, \{T\}, \{H, T\}\}$ and $P$ defined by $P(\emptyset) = 0$, $P(\{H\}) = \frac{1}{2}$, $P(\{T\}) = \frac{1}{2}$, $P(\{H, T\}) = 1$.
Example 1.3.2 Consider a finite sample space $\Omega = \{s_1, \ldots, s_n\}$, with the $\sigma$-field $\mathcal{F} = 2^{\Omega}$, and probability given by $P(A) = |A|/n$, $A \in \mathcal{F}$. Then $(\Omega, 2^{\Omega}, P)$ is called the classical probability space.
1.4 Random Variables
Since the $\sigma$-field $\mathcal{F}$ provides the knowledge about which events are possible on the considered probability space, $\mathcal{F}$ can be regarded as the information component of the probability space $(\Omega, \mathcal{F}, P)$. A random variable $X$ is a function that assigns a numerical value to each state of the world, $X: \Omega \to \mathbb{R}$, such that the values taken by $X$ are known to someone who has access to the information $\mathcal{F}$. More precisely, given any two numbers $a, b \in \mathbb{R}$, all the states of the world for which $X$ takes values between $a$ and $b$ form a set that is an event (an element of $\mathcal{F}$), i.e.
$$\{\omega \in \Omega;\; a < X(\omega) < b\} \in \mathcal{F}.$$
Another way of saying this is that $X$ is an $\mathcal{F}$-measurable function. It is worth noting that in the case of the classical probability space the knowledge is maximal, since $\mathcal{F} = 2^{\Omega}$, and hence the measurability of random variables is automatically satisfied. From now on, instead of the "measurable" terminology we shall use the more suggestive word "predictable". This will make more sense in a future section, when we introduce conditional expectations.
Example 1.4.1 Let $X(\omega)$ be the number of people who want to buy houses, given the state of the market $\omega$. Is $X$ predictable? This would mean that given two numbers, say $a = 10{,}000$ and $b = 50{,}000$, we know all the market situations for which there are at least $10{,}000$ and at most $50{,}000$ people willing to purchase houses. Many times, in theory, it makes sense to assume that we have enough knowledge to consider $X$ predictable.
Example 1.4.2 Consider the experiment of flipping three coins. In this case $\Omega$ is the set of all possible triplets. Consider the random variable $X$ which gives the number of tails obtained. For instance $X(HHH) = 0$, $X(HHT) = 1$, etc. The sets
$$\{\omega;\; X(\omega) = 0\} = \{HHH\}, \quad \{\omega;\; X(\omega) = 1\} = \{HHT, HTH, THH\},$$
$$\{\omega;\; X(\omega) = 3\} = \{TTT\}, \quad \{\omega;\; X(\omega) = 2\} = \{HTT, THT, TTH\}$$
belong to $2^{\Omega}$, and hence $X$ is a random variable.

Figure 1.1: If any pullback $X^{-1}\big((a, b)\big)$ is known, then the random variable $X: \Omega \to \mathbb{R}$ is $2^{\Omega}$-measurable.
Example 1.4.3 A graph is a set of elements, called nodes, and a set of unordered pairs of nodes, called edges. Consider the set of nodes $\mathcal{N} = \{n_1, n_2, \ldots, n_k\}$ and the set of edges $\mathcal{E} = \{(n_i, n_j),\; 1 \le i, j \le k,\; i \ne j\}$. Define the probability space $(\Omega, \mathcal{F}, P)$, where
the sample space is the complete graph, $\Omega = \mathcal{N} \cup \mathcal{E}$;
the $\sigma$-field $\mathcal{F}$ is the set of all subgraphs of $\Omega$;
the probability is given by $P(G) = n(G)/k$, where $n(G)$ is the number of nodes of the graph $G$.
As an example of a random variable we consider $Y: \mathcal{F} \to \mathbb{R}$, $Y(G) =$ the total number of edges of the graph $G$. Since, given $\mathcal{F}$, one can count the total number of edges of each subgraph, it follows that $Y$ is $\mathcal{F}$-measurable, and hence it is a random variable.
1.5 Distribution Functions
Let $X$ be a random variable on the probability space $(\Omega, \mathcal{F}, P)$. The distribution function of $X$ is the function $F_X: \mathbb{R} \to [0, 1]$ defined by
$$F_X(x) = P(\omega;\; X(\omega) \le x).$$
It is worth observing that since $X$ is a random variable, the set $\{\omega;\; X(\omega) \le x\}$ belongs to the information set $\mathcal{F}$.
Figure 1.2: a Normal distribution; b Log-normal distribution; c Gamma distributions; d Beta distributions.
The distribution function is non-decreasing and satisfies the limits
$$\lim_{x \to -\infty} F_X(x) = 0, \qquad \lim_{x \to +\infty} F_X(x) = 1.$$
If we have
$$\frac{d}{dx} F_X(x) = p(x),$$
then we say that $p(x)$ is the probability density function of $X$. A useful property which follows from the Fundamental Theorem of Calculus is
$$P(a < X < b) = P(\omega;\; a < X(\omega) < b) = \int_a^b p(x)\, dx.$$
In the case of discrete random variables the aforementioned integral is replaced by the following sum
$$P(a < X < b) = \sum_{a < x < b} P(X = x).$$
For more details the reader is referred to a traditional probability book, such as Wackerly et al. [6].
1.6 Basic Distributions
We shall recall a few basic distributions, which are most often seen in applications.
Normal distribution A random variable $X$ is said to have a normal distribution if its probability density function is given by
$$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)},$$
with $\mu$ and $\sigma > 0$ constant parameters, see Fig. 1.2a. The mean and variance are given by
$$E[X] = \mu, \qquad Var[X] = \sigma^2.$$
If $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$, we shall write
$$X \sim N(\mu, \sigma^2).$$
Exercise 1.6.1 Let $\alpha, \beta \in \mathbb{R}$. Show that if $X$ is normally distributed, with $X \sim N(\mu, \sigma^2)$, then $Y = \alpha X + \beta$ is also normally distributed, with $Y \sim N(\alpha\mu + \beta, \alpha^2\sigma^2)$.
Log-normal distribution Let $X$ be normally distributed with mean $\mu$ and variance $\sigma^2$. Then the random variable $Y = e^X$ is said to be log-normally distributed. The mean and variance of $Y$ are given by
$$E[Y] = e^{\mu + \frac{\sigma^2}{2}}, \qquad Var[Y] = e^{2\mu + \sigma^2}(e^{\sigma^2} - 1).$$
The density function of the log-normally distributed random variable $Y$ is given by
$$p(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}, \qquad x > 0,$$
see Fig. 1.2b.
Exercise 1.6.2 Given that the moment generating function of a normally distributed random variable $X \sim N(\mu, \sigma^2)$ is $m(t) = E[e^{tX}] = e^{\mu t + t^2\sigma^2/2}$, show that
(a) $E[Y^n] = e^{n\mu + n^2\sigma^2/2}$, where $Y = e^X$.
(b) the mean and variance of the log-normal random variable $Y = e^X$ are
$$E[Y] = e^{\mu + \sigma^2/2}, \qquad Var[Y] = e^{2\mu + \sigma^2}(e^{\sigma^2} - 1).$$
Gamma distribution A random variable $X$ is said to have a gamma distribution with parameters $\alpha > 0$, $\beta > 0$ if its density function is given by
$$p(x) = \frac{x^{\alpha - 1} e^{-x/\beta}}{\beta^{\alpha} \Gamma(\alpha)}, \qquad x \ge 0,$$
where $\Gamma(\alpha)$ denotes the gamma function,¹ see Fig. 1.2c. The mean and variance are
$$E[X] = \alpha\beta, \qquad Var[X] = \alpha\beta^2.$$
The case $\alpha = 1$ is known as the exponential distribution, see Fig. 1.3a. In this case
$$p(x) = \frac{1}{\beta}\, e^{-x/\beta}, \qquad x \ge 0.$$
The particular case when $\alpha = n/2$ and $\beta = 2$ becomes the $\chi^2$-distribution with $n$ degrees of freedom. This also characterizes the sum of squares of $n$ independent standard normal random variables.
Beta distribution A random variable $X$ is said to have a beta distribution with parameters $\alpha > 0$, $\beta > 0$ if its probability density function is of the form
$$p(x) = \frac{x^{\alpha - 1}(1 - x)^{\beta - 1}}{B(\alpha, \beta)}, \qquad 0 \le x \le 1,$$
where $B(\alpha, \beta)$ denotes the beta function.² See Fig. 1.2d for two particular density functions. In this case
$$E[X] = \frac{\alpha}{\alpha + \beta}, \qquad Var[X] = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}.$$
Poisson distribution A discrete random variable $X$ is said to have a Poisson probability distribution if
$$P(X = k) = \frac{\lambda^k}{k!}\, e^{-\lambda}, \qquad k = 0, 1, 2, \ldots,$$
with $\lambda > 0$ parameter, see Fig. 1.3b. In this case $E[X] = \lambda$ and $Var[X] = \lambda$.
Pearson 5 distribution Let $\alpha, \beta > 0$. A random variable $X$ with the density function
$$p(x) = \frac{1}{\beta\Gamma(\alpha)}\, \frac{e^{-\beta/x}}{(x/\beta)^{\alpha + 1}}, \qquad x \ge 0,$$
is said to have a Pearson 5 distribution³ with positive parameters $\alpha$ and $\beta$. It can be shown that
$$E[X] = \begin{cases} \dfrac{\beta}{\alpha - 1}, & \text{if } \alpha > 1 \\ \infty, & \text{otherwise,} \end{cases} \qquad Var(X) = \begin{cases} \dfrac{\beta^2}{(\alpha - 1)^2(\alpha - 2)}, & \text{if } \alpha > 2 \\ \infty, & \text{otherwise.} \end{cases}$$
¹ Recall the definition of the gamma function: $\Gamma(\alpha) = \int_0^{\infty} y^{\alpha - 1} e^{-y}\, dy$; if $\alpha = n$ is an integer, then $\Gamma(n) = (n - 1)!$.
² Two definition formulas for the beta function: $B(\alpha, \beta) = \dfrac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)} = \int_0^1 y^{\alpha - 1}(1 - y)^{\beta - 1}\, dy$.
³ The Pearson family of distributions was designed by Pearson between 1890 and 1895. There are several Pearson distributions, this one being distinguished by the number 5.
Figure 1.3: a Exponential distribution; b Poisson distribution.
The mode of this distribution is equal to $\dfrac{\beta}{\alpha + 1}$.
The Inverse Gaussian distribution Let $\mu, \lambda > 0$. A random variable $X$ has an inverse Gaussian distribution with parameters $\mu$ and $\lambda$ if its density function is given by
$$p(x) = \sqrt{\frac{\lambda}{2\pi x^3}}\, e^{-\frac{\lambda(x - \mu)^2}{2\mu^2 x}}, \qquad x > 0. \tag{1.6.1}$$
We shall write $X \sim IG(\mu, \lambda)$. Its mean, variance and mode are given by
$$E[X] = \mu, \qquad Var(X) = \frac{\mu^3}{\lambda}, \qquad Mode(X) = \mu\left(\sqrt{1 + \frac{9\mu^2}{4\lambda^2}} - \frac{3\mu}{2\lambda}\right).$$
This distribution will be used to model the time when a Brownian motion with drift exceeds a certain barrier for the first time.
1.7 Independent Random Variables
Roughly speaking, two random variables $X$ and $Y$ are independent if the occurrence of one of them does not change the probability density of the other. More precisely, if for any sets $A, B \subset \mathbb{R}$ the events
$$\{\omega;\; X(\omega) \in A\}, \qquad \{\omega;\; Y(\omega) \in B\}$$
are independent,⁴ then $X$ and $Y$ are called independent random variables.

Proposition 1.7.1 Let $X$ and $Y$ be independent random variables with probability density functions $p_X(x)$ and $p_Y(y)$. Then the joint probability density function of $(X, Y)$ is given by $p_{X,Y}(x, y) = p_X(x)\, p_Y(y)$.

⁴ In Probability Theory two events $A_1$ and $A_2$ are called independent if $P(A_1 \cap A_2) = P(A_1)P(A_2)$.
Proof: Using the independence of the sets, we have⁵
$$p_{X,Y}(x, y)\, dxdy = P(x < X < x + dx,\; y < Y < y + dy)$$
$$= P(x < X < x + dx)\, P(y < Y < y + dy)$$
$$= p_X(x)\, dx\; p_Y(y)\, dy = p_X(x) p_Y(y)\, dxdy.$$
Dropping the factor $dxdy$ yields the desired result. We note that the converse also holds true.
1.8 Integration in Probability Measure
The notion of expectation is based on integration on measure spaces. In this section we briefly recall the definition of the integral with respect to the probability measure $P$.
Let $X: \Omega \to \mathbb{R}$ be a random variable on the probability space $(\Omega, \mathcal{F}, P)$. A partition $(\Omega_i)_{1 \le i \le n}$ of $\Omega$ is a family of subsets $\Omega_i \subset \Omega$ satisfying
1. $\Omega_i \cap \Omega_j = \emptyset$, for $i \ne j$;
2. $\bigcup_{i=1}^n \Omega_i = \Omega$.
Each $\Omega_i$ is an event with the associated probability $P(\Omega_i)$. A simple function is a sum of characteristic functions, $f = \sum_{i=1}^n c_i \chi_{\Omega_i}$. This means $f(\omega) = c_k$ for $\omega \in \Omega_k$. The integral of the simple function $f$ is defined by
$$\int_{\Omega} f\, dP = \sum_{i=1}^n c_i P(\Omega_i).$$
If $X: \Omega \to \mathbb{R}$ is a random variable such that there is a sequence of simple functions $(f_n)_{n \ge 1}$ satisfying:
1. $f_n$ is fundamental in probability: $\forall \epsilon > 0$, $\lim_{n, m \to \infty} P(\omega;\; |f_n(\omega) - f_m(\omega)| \ge \epsilon) = 0$;
2. $f_n$ converges to $X$ in probability: $\forall \epsilon > 0$, $\lim_{n \to \infty} P(\omega;\; |f_n(\omega) - X(\omega)| \ge \epsilon) = 0$,
then the integral of $X$ is defined as the following limit of integrals:
$$\int_{\Omega} X\, dP = \lim_{n \to \infty} \int_{\Omega} f_n\, dP.$$
From now on, the integral notations $\int_{\Omega} X\, dP$ and $\int_{\Omega} X(\omega)\, dP(\omega)$ will be used interchangeably. In the rest of the chapter the integral notation will be used formally, without requiring a direct use of the previous definition.
⁵ We are using the useful approximation $P(x < X < x + dx) = \int_x^{x+dx} p(u)\, du = p(x)\, dx$.
1.9 Expectation
A random variable $X: \Omega \to \mathbb{R}$ is called integrable if
$$\int_{\Omega} |X(\omega)|\, dP(\omega) = \int_{\mathbb{R}} |x| p(x)\, dx < \infty,$$
where $p(x)$ denotes the probability density function of $X$. The previous identity is based on changing the domain of integration from $\Omega$ to $\mathbb{R}$.
The expectation of an integrable random variable $X$ is defined by
$$E[X] = \int_{\Omega} X(\omega)\, dP(\omega) = \int_{\mathbb{R}} x p(x)\, dx.$$
Customarily, the expectation of $X$ is denoted by $\mu$ and it is also called the mean. In general, for any continuous⁶ function $h: \mathbb{R} \to \mathbb{R}$, we have
$$E[h(X)] = \int_{\Omega} h\big(X(\omega)\big)\, dP(\omega) = \int_{\mathbb{R}} h(x) p(x)\, dx.$$

Proposition 1.9.1 The expectation operator $E$ is linear, i.e. for any integrable random variables $X$ and $Y$:
1. $E[cX] = cE[X]$, $\forall c \in \mathbb{R}$;
2. $E[X + Y] = E[X] + E[Y]$.

Proof: It follows from the fact that the integral is a linear operator.

Proposition 1.9.2 Let $X$ and $Y$ be two independent integrable random variables. Then
$$E[XY] = E[X]E[Y].$$

Proof: This is a variant of Fubini's theorem, which in this case states that a double integral is a product of two simple integrals. Let $p_X$, $p_Y$, $p_{X,Y}$ denote the probability densities of $X$, $Y$ and $(X, Y)$, respectively. Since $X$ and $Y$ are independent, by Proposition 1.7.1 we have
$$E[XY] = \iint xy\, p_{X,Y}(x, y)\, dxdy = \int x p_X(x)\, dx \int y p_Y(y)\, dy = E[X]E[Y].$$
⁶ In general, measurable.
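Proposition 1.9.2 lends itself to a quick Monte Carlo confirmation. The sketch below (an illustration added here, assuming the NumPy library) draws independent samples of $X$ and $Y$ and compares the empirical $E[XY]$ with $E[X]E[Y]$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

X = rng.normal(2.0, 1.0, size=n)        # X ~ N(2, 1), independent of Y
Y = rng.exponential(3.0, size=n)        # Y exponential with mean 3

print((X * Y).mean())                   # approx E[X]E[Y] = 2 * 3 = 6
print(X.mean() * Y.mean())
```

Note that the identity generally fails for dependent variables; replacing `Y` with, say, `X**2` makes the two printed values differ.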
1.10 Radon-Nikodym's Theorem
This section is concerned with existence and uniqueness results that will be useful later in defining conditional expectations. Since this section is rather theoretical, it can be skipped at a first reading.

Proposition 1.10.1 Consider the probability space $(\Omega, \mathcal{F}, P)$, and let $\mathcal{G}$ be a $\sigma$-field included in $\mathcal{F}$. If $X$ is a $\mathcal{G}$-predictable random variable such that
$$\int_A X\, dP = 0, \qquad \forall A \in \mathcal{G},$$
then $X = 0$ a.s.

Proof: In order to show that $X = 0$ almost surely, it suffices to prove that $P\big(\omega;\; X(\omega) = 0\big) = 1$. We shall show first that $X$ takes values as small as possible with probability one, i.e. $\forall \epsilon > 0$ we have $P(|X| < \epsilon) = 1$. To do this, let $A = \{\omega;\; X(\omega) \ge \epsilon\}$. Then
$$0 \le P(X \ge \epsilon) = \int_A dP = \frac{1}{\epsilon} \int_A \epsilon\, dP \le \frac{1}{\epsilon} \int_A X\, dP = 0,$$
and hence $P(X \ge \epsilon) = 0$. Similarly $P(X \le -\epsilon) = 0$. Therefore
$$P(|X| < \epsilon) = 1 - P(X \ge \epsilon) - P(X \le -\epsilon) = 1 - 0 - 0 = 1.$$
Taking $\epsilon \to 0$ leads to $P(|X| = 0) = 1$. This can be formalized as follows. Let $\epsilon = \frac{1}{n}$ and consider $B_n = \{\omega;\; |X(\omega)| < \frac{1}{n}\}$, with $P(B_n) = 1$. Then
$$P(X = 0) = P(|X| = 0) = P\Big(\bigcap_{n \ge 1} B_n\Big) = \lim_{n \to \infty} P(B_n) = 1.$$

Corollary 1.10.2 If $X$ and $Y$ are $\mathcal{G}$-predictable random variables such that
$$\int_A X\, dP = \int_A Y\, dP, \qquad \forall A \in \mathcal{G},$$
then $X = Y$ a.s.

Proof: Since $\int_A (X - Y)\, dP = 0$, $\forall A \in \mathcal{G}$, by Proposition 1.10.1 we have $X - Y = 0$ a.s.

Theorem 1.10.3 (Radon-Nikodym) Let $(\Omega, \mathcal{F}, P)$ be a probability space and $\mathcal{G}$ be a $\sigma$-field included in $\mathcal{F}$. Then for any random variable $X$ there is a $\mathcal{G}$-predictable random variable $Y$ such that
$$\int_A X\, dP = \int_A Y\, dP, \qquad \forall A \in \mathcal{G}. \tag{1.10.2}$$
We shall omit the proof but discuss a few aspects.
1. All $\sigma$-fields $\mathcal{G} \subset \mathcal{F}$ contain the impossible and the certain events, $\emptyset, \Omega \in \mathcal{G}$. Making $A = \Omega$ yields
$$\int_{\Omega} X\, dP = \int_{\Omega} Y\, dP,$$
which is $E[X] = E[Y]$.
2. Radon-Nikodym's theorem states the existence of $Y$. In fact $Y$ is unique almost surely. In order to show this, assume there are two $\mathcal{G}$-predictable random variables $Y_1$ and $Y_2$ with the aforementioned property. Then (1.10.2) yields
$$\int_A Y_1\, dP = \int_A Y_2\, dP, \qquad \forall A \in \mathcal{G}.$$
Applying Corollary 1.10.2 yields $Y_1 = Y_2$ a.s.
3. Since $E[X] = \int_{\Omega} X\, dP$ is the expectation of the random variable $X$ given the full knowledge $\mathcal{F}$, the random variable $Y$ plays the role of the expectation of $X$ given the partial information $\mathcal{G}$. The next section will deal with this concept in detail.
1.11 Conditional Expectation
Let $X$ be a random variable on the probability space $(\Omega, \mathcal{F}, P)$. Let $\mathcal{G}$ be a $\sigma$-field contained in $\mathcal{F}$. Since $X$ is $\mathcal{F}$-predictable, the expectation of $X$, given the information $\mathcal{F}$, must be $X$ itself. This shall be written as $E[X|\mathcal{F}] = X$ (for details see Example 1.11.3). It is natural to ask what the expectation of $X$ is, given the information $\mathcal{G}$. This is a random variable denoted by $E[X|\mathcal{G}]$ satisfying the following properties:
1. $E[X|\mathcal{G}]$ is $\mathcal{G}$-predictable;
2. $\int_A E[X|\mathcal{G}]\, dP = \int_A X\, dP$, $\forall A \in \mathcal{G}$.
$E[X|\mathcal{G}]$ is called the conditional expectation of $X$ given $\mathcal{G}$.
We owe a few explanations regarding the correctness of the aforementioned definition. The existence of the $\mathcal{G}$-predictable random variable $E[X|\mathcal{G}]$ is assured by the Radon-Nikodym theorem. The almost sure uniqueness is an application of Proposition 1.10.1 (see the discussion at point 2 of section 1.10).
It is worth noting that the expectation of $X$, denoted by $E[X]$, is a number, while the conditional expectation $E[X|\mathcal{G}]$ is a random variable. When are they equal and what is their relationship? The answer is inferred by the following solved exercises.

Example 1.11.1 Show that if $\mathcal{G} = \{\emptyset, \Omega\}$, then $E[X|\mathcal{G}] = E[X]$.
Proof: We need to show that $E[X]$ satisfies conditions 1 and 2. The first one is obviously satisfied, since any constant is $\mathcal{G}$-predictable. The latter condition is checked on each set of $\mathcal{G}$. We have
$$\int_{\Omega} X\, dP = E[X] = E[X] \int_{\Omega} dP = \int_{\Omega} E[X]\, dP;$$
$$\int_{\emptyset} X\, dP = \int_{\emptyset} E[X]\, dP.$$
Example 1.11.2 Show that $E[E[X|\mathcal{G}]] = E[X]$, i.e. all conditional expectations have the same mean, which is the mean of $X$.

Proof: Using the definition of expectation and taking $A = \Omega$ in the second relation of the aforementioned definition yields
$$E[E[X|\mathcal{G}]] = \int_{\Omega} E[X|\mathcal{G}]\, dP = \int_{\Omega} X\, dP = E[X],$$
which ends the proof.
Example 1.11.3 The conditional expectation of $X$ given the total information $\mathcal{F}$ is the random variable $X$ itself, i.e.
$$E[X|\mathcal{F}] = X.$$

Proof: The random variables $X$ and $E[X|\mathcal{F}]$ are both $\mathcal{F}$-predictable (from the definition of the random variable). From the definition of the conditional expectation we have
$$\int_A E[X|\mathcal{F}]\, dP = \int_A X\, dP, \qquad \forall A \in \mathcal{F}.$$
Corollary 1.10.2 implies that $E[X|\mathcal{F}] = X$ almost surely.

General properties of the conditional expectation are stated below without proof. The proof involves more or less simple manipulations of integrals and can be taken as an exercise for the reader.
Proposition 1.11.4 Let $X$ and $Y$ be two random variables on the probability space $(\Omega, \mathcal{F}, P)$. We have
1. Linearity:
$$E[aX + bY|\mathcal{G}] = aE[X|\mathcal{G}] + bE[Y|\mathcal{G}], \qquad \forall a, b \in \mathbb{R};$$
2. Factoring out the predictable part:
$$E[XY|\mathcal{G}] = XE[Y|\mathcal{G}]$$
if $X$ is $\mathcal{G}$-predictable. In particular, $E[X|\mathcal{G}] = X$.
3. Tower property:
$$E[E[X|\mathcal{G}]|\mathcal{H}] = E[X|\mathcal{H}], \qquad \text{if } \mathcal{H} \subset \mathcal{G};$$
4. Positivity:
$$E[X|\mathcal{G}] \ge 0, \qquad \text{if } X \ge 0;$$
5. Expectation of a constant is a constant:
$$E[c|\mathcal{G}] = c.$$
6. An independent condition drops out:
$$E[X|\mathcal{G}] = E[X],$$
if $X$ is independent of $\mathcal{G}$.
Exercise 1.11.5 Prove property 3 (the tower property) given in the previous proposition.

Exercise 1.11.6 Let $X$ be a random variable on the probability space $(\Omega, \mathcal{F}, P)$, which is independent of the $\sigma$-field $\mathcal{G} \subset \mathcal{F}$. Consider the characteristic function of a set $A$ defined by
$$\chi_A(\omega) = \begin{cases} 1, & \text{if } \omega \in A \\ 0, & \text{if } \omega \notin A. \end{cases}$$
Show the following:
(a) $\chi_A$ is $\mathcal{G}$-predictable for any $A \in \mathcal{G}$;
(b) $P(A) = E[\chi_A]$;
(c) $X$ and $\chi_A$ are independent random variables;
(d) $E[\chi_A X] = E[X]P(A)$ for any $A \in \mathcal{G}$;
(e) $E[X|\mathcal{G}] = E[X]$.
1.12 Inequalities of Random Variables
This section prepares the reader for the limits of sequences of random variables and limits of stochastic processes. We shall start with a classical inequality result regarding expectations:

Theorem 1.12.1 (Jensen's inequality) Let $\varphi: \mathbb{R} \to \mathbb{R}$ be a convex function and let $X$ be an integrable random variable on the probability space $(\Omega, \mathcal{F}, P)$. If $\varphi(X)$ is integrable, then
$$\varphi(E[X]) \le E[\varphi(X)]$$
almost surely (i.e. the inequality might fail on a set of probability zero).
Figure 1.4: Jensen's inequality $\varphi(E[X]) < E[\varphi(X)]$ for a convex function $\varphi$.
Proof: We shall assume $\varphi$ twice differentiable, with $\varphi''$ continuous. Let $\mu = E[X]$. Expand $\varphi$ in a Taylor series about $\mu$ and get
$$\varphi(x) = \varphi(\mu) + \varphi'(\mu)(x - \mu) + \frac{1}{2}\varphi''(\xi)(x - \mu)^2,$$
with $\xi$ in between $x$ and $\mu$. Since $\varphi$ is convex, $\varphi'' \ge 0$, and hence
$$\varphi(x) \ge \varphi(\mu) + \varphi'(\mu)(x - \mu),$$
which means the graph of $\varphi(x)$ is above the tangent line at $\big(\mu, \varphi(\mu)\big)$. Replacing $x$ by the random variable $X$ and taking the expectation yields
$$E[\varphi(X)] \ge E[\varphi(\mu) + \varphi'(\mu)(X - \mu)] = \varphi(\mu) + \varphi'(\mu)(E[X] - \mu) = \varphi(\mu) = \varphi(E[X]),$$
which proves the result.
Fig. 1.4 provides a graphical interpretation of Jensen's inequality. If the distribution of $X$ is symmetric, then the distribution of $\varphi(X)$ is skewed, with $\varphi(E[X]) < E[\varphi(X)]$.
It is worth noting that the inequality is reversed for $\varphi$ concave. We shall next present a couple of applications.
A random variable $X: \Omega \to \mathbb{R}$ is called square integrable if
$$E[X^2] = \int_{\Omega} |X(\omega)|^2\, dP(\omega) = \int_{\mathbb{R}} x^2 p(x)\, dx < \infty.$$
Application 1.12.2 If $X$ is a square integrable random variable, then it is integrable.

Proof: Jensen's inequality with $\varphi(x) = x^2$ becomes
$$E[X]^2 \le E[X^2].$$
Since the right side is finite, it follows that $E[X] < \infty$, so $X$ is integrable.
Application 1.12.3 If $m_X(t)$ denotes the moment generating function of the random variable $X$ with mean $\mu$, then
$$m_X(t) \ge e^{t\mu}.$$

Proof: Applying Jensen's inequality with the convex function $\varphi(x) = e^x$ yields
$$e^{E[X]} \le E[e^X].$$
Substituting $tX$ for $X$ yields
$$e^{E[tX]} \le E[e^{tX}]. \tag{1.12.3}$$
Using the definition of the moment generating function $m_X(t) = E[e^{tX}]$ and that $E[tX] = tE[X] = t\mu$, then (1.12.3) leads to the desired inequality.
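Jensen's inequality is easy to observe numerically. The following sketch (an illustration added here, assuming the NumPy library) checks $\varphi(E[X]) \le E[\varphi(X)]$ for the convex function $\varphi(x) = e^x$ and a standard normal $X$, which is exactly the bound $m_X(1) \ge e^{\mu}$ from Application 1.12.3:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(0.0, 1.0, size=1_000_000)   # mu = 0, sigma = 1

lhs = np.exp(X.mean())    # phi(E[X]), approx e^0 = 1
rhs = np.exp(X).mean()    # E[phi(X)] = m_X(1), approx e^{1/2} = 1.6487

print(lhs, rhs)           # lhs <= rhs, as Jensen's inequality predicts
```

The gap between the two values reflects the convexity of $\varphi$; for a linear $\varphi$ the two sides would coincide.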
The variance of a square integrable random variable $X$ is defined by
$$Var(X) = E[X^2] - E[X]^2.$$
By Application 1.12.2 we have $Var(X) \ge 0$, so there is a constant $\sigma_X \ge 0$, called the standard deviation, such that
$$\sigma_X^2 = Var(X).$$
Exercise 1.12.4 Prove the following identity:
$$Var[X] = E[(X - E[X])^2].$$

Exercise 1.12.5 Prove that a non-constant random variable has a non-zero standard deviation.

Exercise 1.12.6 Prove the following extension of Jensen's inequality: If $\varphi$ is a convex function, then for any $\sigma$-field $\mathcal{G} \subset \mathcal{F}$ we have
$$\varphi(E[X|\mathcal{G}]) \le E[\varphi(X)|\mathcal{G}].$$

Exercise 1.12.7 Show the following:
(a) $|E[X]| \le E[|X|]$;
(b) $|E[X|\mathcal{G}]| \le E[|X|\,|\mathcal{G}]$, for any $\sigma$-field $\mathcal{G} \subset \mathcal{F}$;
(c) $|E[X]|^r \le E[|X|^r]$, for $r \ge 1$;
(d) $|E[X|\mathcal{G}]|^r \le E[|X|^r\,|\mathcal{G}]$, for any $\sigma$-field $\mathcal{G} \subset \mathcal{F}$ and $r \ge 1$.
Theorem 1.12.8 (Markov's inequality) For any $\epsilon, p > 0$, we have the following inequality:
$$P(\omega;\; |X(\omega)| \ge \epsilon) \le \frac{E[|X|^p]}{\epsilon^p}.$$

Proof: Let $A = \{\omega;\; |X(\omega)| \ge \epsilon\}$. Then
$$E[|X|^p] = \int_{\Omega} |X(\omega)|^p\, dP(\omega) \ge \int_A |X(\omega)|^p\, dP(\omega) \ge \int_A \epsilon^p\, dP(\omega) = \epsilon^p \int_A dP(\omega) = \epsilon^p P(A) = \epsilon^p P(|X| \ge \epsilon).$$
Dividing by $\epsilon^p$ leads to the desired result.
Theorem 1.12.9 (Tchebychev's inequality) If $X$ is a random variable with mean $\mu$ and variance $\sigma^2$, then
$$P(\omega;\; |X(\omega) - \mu| \ge \epsilon) \le \frac{\sigma^2}{\epsilon^2}.$$

Proof: Let $A = \{\omega;\; |X(\omega) - \mu| \ge \epsilon\}$. Then
$$\sigma^2 = Var(X) = E[(X - \mu)^2] = \int_{\Omega} (X - \mu)^2\, dP \ge \int_A (X - \mu)^2\, dP \ge \epsilon^2 \int_A dP = \epsilon^2 P(A) = \epsilon^2 P(\omega;\; |X(\omega) - \mu| \ge \epsilon).$$
Dividing by $\epsilon^2$ leads to the desired inequality.
The next result deals with exponentially decreasing bounds on tail distributions.

Theorem 1.12.10 (Chernoff bounds) Let $X$ be a random variable. Then for any $\lambda > 0$ we have
1. $P(X \ge \lambda) \le \dfrac{E[e^{tX}]}{e^{t\lambda}}$, $\forall t > 0$;
2. $P(X \le \lambda) \le \dfrac{E[e^{tX}]}{e^{t\lambda}}$, $\forall t < 0$.

Proof: 1. Let $t > 0$ and denote $Y = e^{tX}$. By Markov's inequality,
$$P(Y \ge e^{t\lambda}) \le \frac{E[Y]}{e^{t\lambda}}.$$
Then we have
$$P(X \ge \lambda) = P(tX \ge t\lambda) = P(e^{tX} \ge e^{t\lambda}) = P(Y \ge e^{t\lambda}) \le \frac{E[Y]}{e^{t\lambda}} = \frac{E[e^{tX}]}{e^{t\lambda}}.$$
2. The case $t < 0$ is similar.
In the following we shall present an application of the Chernoff bounds for normally distributed random variables.
Let $X$ be a random variable normally distributed with mean $\mu$ and variance $\sigma^2$. It is known that its moment generating function is given by
$$m(t) = E[e^{tX}] = e^{\mu t + \frac{1}{2}t^2\sigma^2}.$$
Using the first Chernoff bound we obtain
$$P(X \ge \lambda) \le \frac{m(t)}{e^{t\lambda}} = e^{(\mu - \lambda)t + \frac{1}{2}t^2\sigma^2}, \qquad \forall t > 0,$$
which implies
$$P(X \ge \lambda) \le e^{\min_{t > 0} \left[(\mu - \lambda)t + \frac{1}{2}t^2\sigma^2\right]}.$$
It is easy to see that the quadratic function $f(t) = (\mu - \lambda)t + \frac{1}{2}t^2\sigma^2$ reaches its minimum value for $t = \dfrac{\lambda - \mu}{\sigma^2}$. Since $t > 0$, $\lambda$ needs to satisfy $\lambda > \mu$. Then
$$\min_{t > 0} f(t) = f\Big(\frac{\lambda - \mu}{\sigma^2}\Big) = -\frac{(\lambda - \mu)^2}{2\sigma^2}.$$
Substituting into the previous formula, we obtain the following result:

Proposition 1.12.11 If $X$ is a normally distributed variable, with $X \sim N(\mu, \sigma^2)$, then for any $\lambda > \mu$
$$P(X \ge \lambda) \le e^{-\frac{(\lambda - \mu)^2}{2\sigma^2}}.$$
Exercise 1.12.12 Let $X$ be a Poisson random variable with mean $\lambda > 0$.
(a) Show that the moment generating function of $X$ is $m(t) = e^{\lambda(e^t - 1)}$;
(b) Use a Chernoff bound to show that
$$P(X \ge k) \le e^{\lambda(e^t - 1) - tk}, \qquad t > 0.$$
Markov's, Tchebychev's and Chernoff's inequalities will be useful later when computing limits of random variables.
The next inequality is called Tchebychev's inequality for monotone sequences of numbers.
Lemma 1.12.13 Let $(a_i)$ and $(b_i)$ be two sequences of real numbers such that either
$$a_1 \le a_2 \le \cdots \le a_n, \quad b_1 \le b_2 \le \cdots \le b_n$$
or
$$a_1 \ge a_2 \ge \cdots \ge a_n, \quad b_1 \ge b_2 \ge \cdots \ge b_n.$$
If $(\lambda_i)$ is a sequence of non-negative numbers such that $\sum_{i=1}^n \lambda_i = 1$, then
$$\Big(\sum_{i=1}^n \lambda_i a_i\Big)\Big(\sum_{i=1}^n \lambda_i b_i\Big) \le \sum_{i=1}^n \lambda_i a_i b_i.$$
Proof: Since the sequences $(a_i)$ and $(b_i)$ are either both increasing or both decreasing,
$$(a_i - a_j)(b_i - b_j) \ge 0.$$
Multiplying by the non-negative quantity $\lambda_i \lambda_j$ and summing over $i$ and $j$ we get
$$\sum_{i,j} \lambda_i \lambda_j (a_i - a_j)(b_i - b_j) \ge 0.$$
Expanding yields
$$\Big(\sum_j \lambda_j\Big)\Big(\sum_i \lambda_i a_i b_i\Big) - \Big(\sum_i \lambda_i a_i\Big)\Big(\sum_j \lambda_j b_j\Big) - \Big(\sum_j \lambda_j a_j\Big)\Big(\sum_i \lambda_i b_i\Big) + \Big(\sum_i \lambda_i\Big)\Big(\sum_j \lambda_j a_j b_j\Big) \ge 0.$$
Using $\sum_j \lambda_j = 1$ the expression becomes
$$\sum_i \lambda_i a_i b_i \ge \Big(\sum_i \lambda_i a_i\Big)\Big(\sum_j \lambda_j b_j\Big),$$
which ends the proof.
Next we present a meaningful application of the previous inequality.
Proposition 1.12.14 Let $X$ be a random variable and $f$ and $g$ be two functions, both increasing or both decreasing. Then
$$E[f(X)g(X)] \ge E[f(X)]E[g(X)]. \tag{1.12.4}$$
Proof: If $X$ is a discrete random variable, with outcomes $x_1, \ldots, x_n$, inequality (1.12.4) becomes
$$\sum_j f(x_j) g(x_j) p(x_j) \ge \sum_j f(x_j) p(x_j) \sum_j g(x_j) p(x_j),$$
where $p(x_j) = P(X = x_j)$. Denoting $a_j = f(x_j)$, $b_j = g(x_j)$, and $\lambda_j = p(x_j)$, the inequality transforms into
$$\sum_j a_j b_j \lambda_j \ge \sum_j a_j \lambda_j \sum_j b_j \lambda_j,$$
which holds true by Lemma 1.12.13.
If $X$ is a continuous random variable with the density function $p: I \to \mathbb{R}$, the inequality (1.12.4) can be written in the integral form
$$\int_I f(x)g(x)p(x)\, dx \ge \int_I f(x)p(x)\, dx \int_I g(x)p(x)\, dx. \tag{1.12.5}$$
Let $x_0 < x_1 < \cdots < x_n$ be a partition of the interval $I$, with $\Delta x = x_{k+1} - x_k$. Using Lemma 1.12.13 we obtain the following inequality between Riemann sums:
$$\sum_j f(x_j)g(x_j)p(x_j)\Delta x \ge \Big(\sum_j f(x_j)p(x_j)\Delta x\Big)\Big(\sum_j g(x_j)p(x_j)\Delta x\Big),$$
where $a_j = f(x_j)$, $b_j = g(x_j)$, and $\lambda_j = p(x_j)\Delta x$. Taking the limit $\|\Delta x\| \to 0$ we obtain (1.12.5), which leads to the desired result.
Exercise 1.12.15 Show the following inequalities:
(a) $E[X^2] \ge E[X]^2$;
(b) $E[X \sinh(X)] \ge E[X]E[\sinh(X)]$;
(c) $E[X^6] \ge E[X]E[X^5]$;
(d) $E[X^6] \ge E[X^3]^2$.

Exercise 1.12.16 For any $n, k \ge 1$, show that
$$E[X^{2(n+k+1)}] \ge E[X^{2k+1}]E[X^{2n+1}].$$
1.13 Limits of Sequences of Random Variables
Consider a sequence $(X_n)_{n \ge 1}$ of random variables defined on the probability space $(\Omega, \mathcal{F}, P)$. There are several ways of making sense of the limit expression $X = \lim_{n \to \infty} X_n$, and they will be discussed in the following sections.
Almost Certain Limit
The sequence $X_n$ converges almost certainly to $X$ if, for all states of the world $\omega$ except a set of probability zero, we have
$$\lim_{n \to \infty} X_n(\omega) = X(\omega).$$
More precisely, this means
$$P\big(\omega;\; \lim_{n \to \infty} X_n(\omega) = X(\omega)\big) = 1,$$
and we shall write ac-lim$_{n \to \infty} X_n = X$. An important example where this type of limit occurs is the Strong Law of Large Numbers:
If $X_n$ is a sequence of independent and identically distributed random variables with the same mean $\mu$, then ac-lim$_{n \to \infty} \dfrac{X_1 + \cdots + X_n}{n} = \mu$.
It is worth noting that this type of convergence is also known under the name of strong convergence. This is the reason why the aforementioned theorem bears its name.
Example 1.13.1 Let $\Omega = \{H, T\}$ be the sample space obtained when a coin is flipped. Consider the random variables $X_n: \Omega \to \{0, 1\}$, where $X_n$ denotes the number of heads obtained at the $n$-th flip. Obviously, the $X_n$ are i.i.d., with the distribution given by $P(X_n = 0) = P(X_n = 1) = 1/2$, and the mean $E[X_n] = 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{1}{2}$. Then $X_1 + \cdots + X_n$ is the number of heads obtained after $n$ flips of the coin. By the law of large numbers, $\frac{1}{n}(X_1 + \cdots + X_n)$ tends to $1/2$ strongly, as $n \to \infty$.
Mean Square Limit
Another possibility of convergence is to look at the mean square deviation of $X_n$ from $X$. We say that $X_n$ converges to $X$ in the mean square if
$$\lim_{n \to \infty} E[(X_n - X)^2] = 0.$$
More precisely, this should be interpreted as
$$\lim_{n \to \infty} \int_{\Omega} \big(X_n(\omega) - X(\omega)\big)^2\, dP(\omega) = 0.$$
This limit will be abbreviated by ms-lim$_{n \to \infty} X_n = X$. The mean square convergence is useful when defining the Ito integral.

Example 1.13.1 Consider a sequence $X_n$ of random variables such that there is a constant $k$ with $E[X_n] \to k$ and $Var(X_n) \to 0$ as $n \to \infty$. Show that ms-lim$_{n \to \infty} X_n = k$.
Proof: Since we have
$$E[|X_n - k|^2] = E[X_n^2 - 2kX_n + k^2] = E[X_n^2] - 2kE[X_n] + k^2 = \big(E[X_n^2] - E[X_n]^2\big) + \big(E[X_n]^2 - 2kE[X_n] + k^2\big) = Var(X_n) + \big(E[X_n] - k\big)^2,$$
the right side tends to 0 when taking the limit $n \to \infty$.

Exercise 1.13.2 Show the following relation:
$$E[(X - Y)^2] = Var[X] + Var[Y] + \big(E[X] - E[Y]\big)^2 - 2Cov(X, Y).$$

Exercise 1.13.3 If $X_n$ tends to $X$ in mean square, with $E[X^2] < \infty$, show that:
(a) $E[X_n] \to E[X]$ as $n \to \infty$;
(b) $E[X_n^2] \to E[X^2]$ as $n \to \infty$;
(c) $Var[X_n] \to Var[X]$ as $n \to \infty$;
(d) $Cov(X_n, X) \to Var[X]$ as $n \to \infty$.

Exercise 1.13.4 If $X_n$ tends to $X$ in mean square, show that $E[X_n|\mathcal{H}]$ tends to $E[X|\mathcal{H}]$ in mean square.
Limit in Probability or Stochastic Limit
The random variable $X$ is the stochastic limit of $X_n$ if for $n$ large enough the probability of deviation from $X$ can be made smaller than any arbitrary $\epsilon$. More precisely, for any $\epsilon > 0$,
$$\lim_{n \to \infty} P\big(\omega;\; |X_n(\omega) - X(\omega)| \le \epsilon\big) = 1.$$
This can also be written as
$$\lim_{n \to \infty} P\big(\omega;\; |X_n(\omega) - X(\omega)| > \epsilon\big) = 0.$$
This limit is denoted by st-lim$_{n \to \infty} X_n = X$.
It is worth noting that both almost certain convergence and convergence in mean square imply the stochastic convergence. Hence, the stochastic convergence is weaker than the aforementioned two convergence cases. This is the reason why it is also called the weak convergence. One application is the Weak Law of Large Numbers:
If $X_1, X_2, \ldots$ are identically distributed with expected value $\mu$ and if any finite number of them are independent, then st-lim$_{n \to \infty} \dfrac{X_1 + \cdots + X_n}{n} = \mu$.

Proposition 1.13.5 The convergence in the mean square implies the stochastic convergence.
Proof: Let ms-lim$_{n \to \infty} Y_n = Y$. Let $\epsilon > 0$ be arbitrarily fixed. Applying Markov's inequality with $X = Y_n - Y$, $p = 2$ and the given $\epsilon$, yields
$$0 \le P(|Y_n - Y| \ge \epsilon) \le \frac{1}{\epsilon^2} E[|Y_n - Y|^2].$$
The right side tends to 0 as $n \to \infty$. Applying the Squeeze Theorem we obtain
$$\lim_{n \to \infty} P(|Y_n - Y| \ge \epsilon) = 0,$$
which means that $Y_n$ converges stochastically to $Y$.
Example 1.13.6 Let $X_n$ be a sequence of random variables such that $E[|X_n|] \to 0$ as $n \to \infty$. Prove that st-lim$_{n \to \infty} X_n = 0$.

Proof: Let $\epsilon > 0$ be arbitrarily fixed. We need to show
$$\lim_{n \to \infty} P\big(\omega;\; |X_n(\omega)| \ge \epsilon\big) = 0. \tag{1.13.6}$$
From Markov's inequality (see Theorem 1.12.8) we have
$$0 \le P\big(\omega;\; |X_n(\omega)| \ge \epsilon\big) \le \frac{E[|X_n|]}{\epsilon}.$$
Using the Squeeze Theorem we obtain (1.13.6).

Remark 1.13.7 The conclusion still holds true even in the case when there is a $p > 0$ such that $E[|X_n|^p] \to 0$ as $n \to \infty$.
Limit in Distribution
We say the sequence $X_n$ converges in distribution to $X$ if for any continuous bounded function $\varphi(x)$ we have
$$\lim_{n \to \infty} E[\varphi(X_n)] = E[\varphi(X)].$$
This type of limit is even weaker than the stochastic convergence, i.e. it is implied by it.
An application of the limit in distribution is obtained if we consider $\varphi(x) = e^{itx}$. In this case, if $X_n$ converges in distribution to $X$, then the characteristic function of $X_n$ converges to the characteristic function of $X$. In particular, the probability density of $X_n$ approaches the probability density of $X$.
It can be shown that the convergence in distribution is equivalent to
$$\lim_{n \to \infty} F_n(x) = F(x),$$
whenever $F$ is continuous at $x$, where $F_n$ and $F$ denote the distribution functions of $X_n$ and $X$, respectively. This is the reason why this convergence bears its name.
Remark 1.13.8 The almost certain convergence implies the stochastic convergence, and the stochastic convergence implies the limit in distribution. The proof of these statements is beyond the goal of this book. The interested reader can consult a graduate text in probability theory.
1.14 Properties of Limits
Lemma 1.14.1 If ms-lim$_{n \to \infty} X_n = 0$ and ms-lim$_{n \to \infty} Y_n = 0$, then
1. ms-lim$_{n \to \infty} (X_n + Y_n) = 0$;
2. ms-lim$_{n \to \infty} (X_n Y_n) = 0$.
Proof: Since ms-lim$_{n \to \infty} X_n = 0$, then $\lim_{n \to \infty} E[X_n^2] = 0$. Applying the Squeeze Theorem to the inequality⁷
$$0 \le E[X_n]^2 \le E[X_n^2]$$
yields $\lim_{n \to \infty} E[X_n] = 0$. Then
$$\lim_{n \to \infty} Var[X_n] = \lim_{n \to \infty} E[X_n^2] - \lim_{n \to \infty} E[X_n]^2 = 0.$$
Similarly, we have $\lim_{n \to \infty} E[Y_n^2] = 0$, $\lim_{n \to \infty} E[Y_n] = 0$ and $\lim_{n \to \infty} Var[Y_n] = 0$. Then $\lim_{n \to \infty} \sigma_{X_n} = \lim_{n \to \infty} \sigma_{Y_n} = 0$. Using the correlation formula for the two random variables $X_n$ and $Y_n$,
$$Corr(X_n, Y_n) = \frac{Cov(X_n, Y_n)}{\sigma_{X_n} \sigma_{Y_n}},$$
and the fact that $|Corr(X_n, Y_n)| \le 1$, yields
$$0 \le |Cov(X_n, Y_n)| \le \sigma_{X_n} \sigma_{Y_n}.$$
Since $\lim_{n \to \infty} \sigma_{X_n} \sigma_{Y_n} = 0$, from the Squeeze Theorem it follows that
$$\lim_{n \to \infty} Cov(X_n, Y_n) = 0.$$
Taking $n \to \infty$ in the relation
$$Cov(X_n, Y_n) = E[X_n Y_n] - E[X_n]E[Y_n]$$
yields $\lim_{n \to \infty} E[X_n Y_n] = 0$. Using the previous relations, we have
$$\lim_{n \to \infty} E[(X_n + Y_n)^2] = \lim_{n \to \infty} E[X_n^2 + 2X_n Y_n + Y_n^2] = \lim_{n \to \infty} E[X_n^2] + 2\lim_{n \to \infty} E[X_n Y_n] + \lim_{n \to \infty} E[Y_n^2] = 0,$$
which means ms-lim$_{n \to \infty} (X_n + Y_n) = 0$.

⁷ This follows from the fact that $Var[X_n] \ge 0$.
Proposition 1.14.2 If the sequences of random variables $X_n$ and $Y_n$ converge in the mean square, then
1. ms-lim$_{n \to \infty} (X_n + Y_n) =$ ms-lim$_{n \to \infty} X_n +$ ms-lim$_{n \to \infty} Y_n$;
2. ms-lim$_{n \to \infty} (cX_n) = c\,$ms-lim$_{n \to \infty} X_n$, $\forall c \in \mathbb{R}$.

Proof: 1. Let ms-lim$_{n \to \infty} X_n = L$ and ms-lim$_{n \to \infty} Y_n = M$. Consider the sequences $X'_n = X_n - L$ and $Y'_n = Y_n - M$. Then ms-lim$_{n \to \infty} X'_n = 0$ and ms-lim$_{n \to \infty} Y'_n = 0$. Applying Lemma 1.14.1 yields
$$\text{ms-lim}_{n \to \infty} (X'_n + Y'_n) = 0.$$
This is equivalent with
$$\text{ms-lim}_{n \to \infty} (X_n + Y_n - L - M) = 0,$$
which becomes
$$\text{ms-lim}_{n \to \infty} (X_n + Y_n) = L + M.$$
1.15 Stochastic Processes
A stochastic process on the probability space $(\Omega, \mathcal{F}, P)$ is a family of random variables $X_t$ parameterized by $t \in T$, where $T \subset \mathbb{R}$. If $T$ is an interval we say that $X_t$ is a stochastic process in continuous time. If $T = \{1, 2, 3, \ldots\}$ we shall say that $X_t$ is a stochastic process in discrete time. The latter case describes a sequence of random variables. The aforementioned types of convergence can be easily extended to continuous time. For instance, $X_t$ converges in the strong sense to $X$ as $t \to \infty$ if
$$P\big(\omega;\; \lim_{t \to \infty} X_t(\omega) = X(\omega)\big) = 1.$$
The evolution in time of a given state of the world $\omega \in \Omega$ given by the function $t \mapsto X_t(\omega)$ is called a path or realization of $X_t$. The study of stochastic processes using computer simulations is based on retrieving information about the process $X_t$ given a large number of its realizations.
Consider that all the information accumulated until time $t$ is contained in the $\sigma$-field $\mathcal{F}_t$. This means that $\mathcal{F}_t$ contains the information of which events have already occurred until time $t$, and which did not. Since the information is growing in time, we have
$$\mathcal{F}_s \subset \mathcal{F}_t \subset \mathcal{F}$$
for any $s, t \in T$ with $s \le t$. The family $\mathcal{F}_t$ is called a filtration.
A stochastic process $X_t$ is called adapted to the filtration $\mathcal{F}_t$ if $X_t$ is $\mathcal{F}_t$-predictable, for any $t \in T$.
Example 1.15.1 Here are a few examples of filtrations:
1. $\mathcal{F}_t$ represents the information about the evolution of a stock until time $t$, with $t > 0$.
2. $\mathcal{F}_t$ represents the information about the evolution of a Black-Jack game until time $t$, with $t > 0$.
Example 1.15.2 If $X$ is a random variable, consider the conditional expectation
$$X_t = E[X|\mathcal{F}_t].$$
From the definition of conditional expectation, the random variable $X_t$ is $\mathcal{F}_t$-predictable, and can be regarded as the measurement of $X$ at time $t$ using the information $\mathcal{F}_t$. If the accumulated knowledge $\mathcal{F}_t$ increases and eventually equals the $\sigma$-field $\mathcal{F}$, then $X = E[X|\mathcal{F}]$, i.e. we obtain the entire random variable. The process $X_t$ is adapted to $\mathcal{F}_t$.
Example 1.15.3 Don Joe is asking a doctor how long he still has to live. The age at which he will pass away is a random variable, denoted by $X$. Given his medical condition today, which is contained in $\mathcal{F}_t$, the doctor infers that Mr. Joe will die at the age of $X_t = E[X|\mathcal{F}_t]$. The stochastic process $X_t$ is adapted to the medical knowledge $\mathcal{F}_t$.
We shall next define an important type of stochastic process.⁸

Definition 1.15.4 A process $X_t$, $t \in T$, is called a martingale with respect to the filtration $\mathcal{F}_t$ if
1. $X_t$ is integrable for each $t \in T$;
2. $X_t$ is adapted to the filtration $\mathcal{F}_t$;
3. $X_s = E[X_t|\mathcal{F}_s]$, $\forall s < t$.
Remark 1.15.5 The first condition states that the unconditional forecast is finite: $E[|X_t|] = \int_{\Omega} |X_t|\, dP < \infty$. Condition 2 says that the value $X_t$ is known, given the information set $\mathcal{F}_t$. This can also be stated by saying that $X_t$ is $\mathcal{F}_t$-predictable. The third relation asserts that the best forecast of unobserved future values is the last observation on $X_t$.

⁸ The concept of martingale was introduced by Lévy in 1934.
Remark 1.15.6 If the third condition is replaced by
3′. $X_s \le E[X_t|\mathcal{F}_s]$, $\forall s \le t$,
then $X_t$ is called a submartingale; and if it is replaced by
3″. $X_s \ge E[X_t|\mathcal{F}_s]$, $\forall s \le t$,
then $X_t$ is called a supermartingale.
It is worth noting that $X_t$ is a submartingale if and only if $-X_t$ is a supermartingale.
Example 1.15.1 Let $X_t$ denote Mr. Li Zhu's salary after $t$ years of work at the same company. Since $X_t$ is known at time $t$ and it is bounded above, as all salaries are, the first two conditions hold. Being honest, Mr. Zhu expects today that his future salary will be the same as today's, i.e. $X_s = E[X_t|\mathcal{F}_s]$, for $s < t$. This means that $X_t$ is a martingale.
If Mr. Zhu is optimistic and believes as of today that his future salary will increase, then $X_t$ is a submartingale.
Exercise 1.15.7 If $X$ is an integrable random variable on $(\Omega, \mathcal{F}, P)$, and $\mathcal{F}_t$ is a filtration, prove that $X_t = E[X|\mathcal{F}_t]$ is a martingale.

Exercise 1.15.8 Let $X_t$ and $Y_t$ be martingales with respect to the filtration $\mathcal{F}_t$. Show that for any $a, b, c \in \mathbb{R}$ the process $Z_t = aX_t + bY_t + c$ is an $\mathcal{F}_t$-martingale.
Exercise 1.15.9 Let $X_t$ and $Y_t$ be martingales with respect to the filtration $\mathcal{F}_t$.
(a) Is the process $X_t Y_t$ always a martingale with respect to $\mathcal{F}_t$?
(b) What about the processes $X_t^2$ and $Y_t^2$?
Exercise 1.15.10 Two processes $X_t$ and $Y_t$ are called conditionally uncorrelated, given $\mathcal{F}_t$, if
$$E[(X_t - X_s)(Y_t - Y_s)|\mathcal{F}_s] = 0, \qquad \forall\, 0 \le s < t < \infty.$$
Let $X_t$ and $Y_t$ be martingale processes. Show that the process $Z_t = X_t Y_t$ is a martingale if and only if $X_t$ and $Y_t$ are conditionally uncorrelated. Assume that $X_t$, $Y_t$ and $Z_t$ are integrable.
In the following, if $X_t$ is a stochastic process, the minimum amount of information resulting from knowing the process $X_t$ until time $t$ is denoted by $\mathcal{F}_t = \sigma(X_s;\; s \le t)$. In the case of a discrete process, we have $\mathcal{F}_n = \sigma(X_k;\; k \le n)$.
Exercise 1.15.11 Let $X_n$, $n \ge 0$, be a sequence of integrable independent random variables, with $E[X_n] < \infty$ for all $n \ge 0$. Let $S_0 = X_0$, $S_n = X_0 + \cdots + X_n$. Show the following:
(a) $S_n - E[S_n]$ is an $\mathcal{F}_n$-martingale.
(b) If $E[X_n] = 0$ and $E[X_n^2] < \infty$, $\forall n \ge 0$, then $S_n^2 - Var(S_n)$ is an $\mathcal{F}_n$-martingale.
(c) If $E[X_n] \ge 0$, then $S_n$ is an $\mathcal{F}_n$-submartingale.
Exercise 1.15.12 Let $X_n$, $n \ge 0$, be a sequence of independent, integrable random variables such that $E[X_n] = 1$ for $n \ge 0$. Prove that $P_n = X_0 X_1 \cdots X_n$ is an $\mathcal{F}_n$-martingale.
Exercise 1.15.13 (a) Let X be a normally distributed random variable with mean
,= 0 and variance
2
. Prove that there is a unique ,= 0 such that E[e
X
] = 1.
(b) Let (X
i
)
i0
be a sequence of identically normally distributed random variables with
mean ,= 0. Consider the sum S
n
=

n
j=0
X
j
. Show that Z
n
= e
Sn
is a martingale,
with dened in part (a).
In section 9.1 we shall encounter several processes which are martingales.
Chapter 2
Useful Stochastic Processes
This chapter deals with the most common used stochastic processes and their basic
properties. The two main basic processes are the Brownian motion and the Poisson
process. The other processes described in this chapter are derived from the previous
two.
2.1 The Brownian Motion
The observation made rst by the botanist Robert Brown in 1827, that small pollen
grains suspended in water have a very irregular and unpredictable state of motion, led
to the denition of the Brownian motion, which is formalized in the following:
Denition 2.1.1 A Brownian motion process is a stochastic process B
t
, t 0, which
satises
1. The process starts at the origin, B
0
= 0;
2. B
t
has stationary, independent increments;
3. The process B
t
is continuous in t;
4. The increments B
t
B
s
are normally distributed with mean zero and variance
[t s[,
B
t
B
s
N(0, [t s[).
The process X
t
= x + B
t
has all the properties of a Brownian motion that starts
at x. Since B
t
B
s
is stationary, its distribution function depends only on the time
interval t s, i.e.
P(B
t+s
B
s
a) = P(B
t
B
0
a) = P(B
t
a).
It is worth noting that even if B
t
is continuous, it is nowhere dierentiable. From
condition 4 we get that B
t
is normally distributed with mean E[B
t
] = 0 and V ar[B
t
] = t
B
t
N(0, t).
33
34
This implies also that the second moment is E[B
2
t
] = t. Let 0 < s < t. Since the
increments are independent, we can write
E[B
s
B
t
] = E[(B
s
B
0
)(B
t
B
s
) +B
2
s
] = E[B
s
B
0
]E[B
t
B
s
] +E[B
2
s
] = s.
Consequently, B
s
and B
t
are not independent.
Condition 4 has also a physical explanation. A pollen grain suspended in water is
kicked by a very large numbers of water molecules. The inuence of each molecule on
the grain is independent of the other molecules. These eects are average out into a
resultant increment of the grain coordinate. According to the Central Limit Theorem,
this increment has to be normal distributed.
Proposition 2.1.2 A Brownian motion process B
t
is a martingale with respect to the
information set T
t
= (B
s
; s t).
Proof: The integrability of B
t
follows from Jensens inequality
E[[B
t
[]
2
E[B
2
t
] = V ar(B
t
) = [t[ < .
B
t
is obviously T
t
-predictable. Let s < t and write B
t
= B
s
+ (B
t
B
s
). Then
E[B
t
[T
s
] = E[B
s
+ (B
t
B
s
)[T
s
]
= E[B
s
[T
s
] +E[B
t
B
s
[T
s
]
= B
s
+E[B
t
B
s
] = B
s
+E[B
ts
B
0
] = B
s
,
where we used that B
s
is T
s
-predictable (from where E[B
s
[T
s
] = B
s
) and that the
increment B
t
B
s
is independent of previous values of B
t
contained in the information
set T
t
= (B
s
; s t).
A process with similar properties as the Brownian motion was introduced by Wiener.
Denition 2.1.3 A Wiener process W
t
is a process adapted to a ltration T
t
such that
1. The process starts at the origin, W
0
= 0;
2. W
t
is an T
t
-martingale with E[W
2
t
] < for all t 0 and
E[(W
t
W
s
)
2
] = t s, s t;
3. The process W
t
is continuous in t.
Since W
t
is a martingale, its increments are unpredictable
1
and hence E[W
t
W
s
] =
0; in particular E[W
t
] = 0. It is easy to show that
V ar[W
t
W
s
] = [t s[, V ar[W
t
] = t.
1
This follows from E[Wt Ws] = E[Wt Ws|Fs] = E[Wt|Fs] Ws = Ws Ws = 0.
35
Exercise 2.1.4 Show that a Brownian process B
t
is a Winer process.
The only property B
t
has and W
t
seems not to have is that the increments are nor-
mally distributed. However, there is no distinction between these two processes, as the
following result states.
Theorem 2.1.5 (Levy) A Wiener process is a Brownian motion process.
In stochastic calculus we often need to use innitesimal notation and its properties.
If dW
t
denotes the innitesimal increment of a Wiener process in the time interval dt,
the aforementioned properties become dW
t
N(0, dt), E[dW
t
] = 0, and E[(dW
t
)
2
] =
dt.
Proposition 2.1.6 If W
t
is a Wiener process with respect to the information set T
t
,
then Y
t
= W
2
t
t is a martingale.
Proof: Y
t
is integrable since
E[[Y
t
[] E[W
2
t
+t] = 2t < , t > 0.
Let s < t. Using that the increments W
t
W
s
and (W
t
W
s
)
2
are independent of the
information set T
s
and applying Proposition 1.11.4 yields
E[W
2
t
[T
s
] = E[(W
s
+W
t
W
s
)
2
[T
s
]
= E[W
2
s
+ 2W
s
(W
t
W
s
) + (W
t
W
s
)
2
[T
s
]
= E[W
2
s
[T
s
] +E[2W
s
(W
t
W
s
)[T
s
] +E[(W
t
W
s
)
2
[T
s
]
= W
2
s
+ 2W
s
E[W
t
W
s
[T
s
] +E[(W
t
W
s
)
2
[T
s
]
= W
2
s
+ 2W
s
E[W
t
W
s
] +E[(W
t
W
s
)
2
]
= W
2
s
+t s,
and hence E[W
2
t
t[T
s
] = W
2
s
s, for s < t.
The following result states the memoryless property of Brownian motion
2
W
t
.
Proposition 2.1.7 The conditional distribution of W
t+s
, given the present W
t
and the
past W
u
, 0 u < t, depends only on the present.
Proof: Using the independent increment assumption, we have
P(W
t+s
c[W
t
= x, W
u
, 0 u < t)
= P(W
t+s
W
t
c x[W
t
= x, W
u
, 0 u < t)
= P(W
t+s
W
t
c x)
= P(W
t+s
c[W
t
= x).
2
These type of processes are called Marcov processes.
36
Since W
t
is normally distributed with mean 0 and variance t, its density function is

t
(x) =
1

2t
e

x
2
2t
.
Then its distribution function is
F
t
(x) = P(W
t
x) =
1

2t
_
x

u
2
2t
du
The probability that W
t
is between the values a and b is given by
P(a W
t
b) =
1

2t
_
b
a
e

u
2
2t
du, a < b.
Even if the increments of a Brownian motion are independent, their values are still
correlated.
Proposition 2.1.8 Let 0 s t. Then
1. Cov(W
s
, W
t
) = s;
2. Corr(W
s
, W
t
) =
_
s
t
.
Proof: 1. Using the properties of covariance
Cov(W
s
, W
t
) = Cov(W
s
, W
s
+W
t
W
s
)
= Cov(W
s
, W
s
) +Cov(W
s
, W
t
W
s
)
= V ar(W
s
) +E[W
s
(W
t
W
s
)] E[W
s
]E[W
t
W
s
]
= s +E[W
s
]E[W
t
W
s
]
= s,
since E[W
s
] = 0.
We can also arrive at the same result starting from the formula
Cov(W
s
, W
t
) = E[W
s
W
t
] E[W
s
]E[W
t
] = E[W
s
W
t
].
Using that conditional expectations have the same expectation, factoring the pre-
dictable part out, and using that W
t
is a martingale, we have
E[W
s
W
t
] = E[E[W
s
W
t
[T
s
]] = E[W
s
E[W
t
[T
s
]]
= E[W
s
W
s
] = E[W
2
s
] = s,
so Cov(W
s
, W
t
) = s.
37
2. The correlation formula yields
Corr(W
s
, W
t
) =
Cov(W
s
, W
t
)
(W
t
)(W
s
)
=
s

t
=
_
s
t
.
Remark 2.1.9 Removing the order relation between s and t, the previous relations
can also be stated as
Cov(W
s
, W
t
) = mins, t;
Corr(W
s
, W
t
) =

mins, t
maxs, t
.
The following exercises state the translation and the scaling invariance of the Brow-
nian motion.
Exercise 2.1.10 For any t
0
0, show that the process X
t
= W
t+t
0
W
t
0
is a Brownian
motion. This can be also stated as saying that the Brownian motion is translation
invariant.
Exercise 2.1.11 For any > 0, show that the process X
t
=
1

W
t
is a Brownian
motion. This says that the Brownian motion is invariant by scaling.
Exercise 2.1.12 Let 0 < s < t < u. Show the following multiplicative property
Corr(W
s
, W
t
)Corr(W
t
, W
u
) = Corr(W
s
, W
u
).
Exercise 2.1.13 Find the expectations E[W
3
t
] and E[W
4
t
].
Exercise 2.1.14 (a) Use the martingale property of W
2
t
t to nd E[(W
2
t
t)(W
2
s
s)];
(b) Evaluate E[W
2
t
W
2
s
];
(c) Compute Cov(W
2
t
, W
2
s
);
(d) Find Corr(W
2
t
, W
2
s
).
Exercise 2.1.15 Consider the process Y
t
= tW1
t
, t > 0, and dene Y
0
= 0.
(a) Find the distribution of Y
t
;
(b) Find the probability density of Y
t
;
(c) Find Cov(Y
s
, Y
t
);
(d) Find E[Y
t
Y
s
] and V ar(Y
t
Y
s
) for s < t.
It is worth noting that the process Y
t
= tW1
t
, t > 0 with Y
0
= 0 is a Brownian motion.
38
50 100 150 200 250 300 350
1.0
0.5
50 100 150 200 250 300 350
2
3
4
5
a b
Figure 2.1: a Three simulations of the Brownian motion process W
t
; b Two simulations
of the geometric Brownian motion process e
Wt
.
Exercise 2.1.16 The process X
t
= [W
t
[ is called Brownian motion reected at the
origin. Show that
(a) E[[W
t
[] =
_
2t/;
(b) V ar([W
t
[) = (1
2

)t.
Exercise 2.1.17 Let 0 < s < t. Find E[W
2
t
[T
s
].
Exercise 2.1.18 Let 0 < s < t. Show that
(a) E[W
3
t
[T
s
] = 3(t s)W
s
+W
3
s
;
(b) E[W
4
t
[T
s
] = 3(t s)
2
+ 6(t s)W
2
s
+W
4
s
.
Exercise 2.1.19 Show that the following processes are Brownian motions
(a) X
t
= W
T
W
Tt
, 0 t T;
(b) Y
t
= W
t
, t 0.
2.2 Geometric Brownian Motion
The process X
t
= e
Wt
, t 0 is called geometric Brownian motion. A few simulations
of this process are contained in Fig.2.1 b. The following result will be useful in the
following.
Lemma 2.2.1 E[e
Wt
] = e

2
t/2
, for 0.
Proof: Using the denition of expectation
E[e
Wt
] =
_
e
x

t
(x) dx =
1

2t
_
e

x
2
2t
+x
dx
= e

2
t/2
,
39
where we have used the integral formula
_
e
ax
2
+bx
dx =
_

a
e
b
2
4a
, a > 0
with a =
1
2t
and b = .
Proposition 2.2.2 The geometric Brownian motion X
t
= e
Wt
is log-normally dis-
tributed with mean e
t/2
and variance e
2t
e
t
.
Proof: Since W
t
is normally distributed, then X
t
= e
Wt
will have a log-normal distri-
bution. Using Lemma 2.2.1 we have
E[X
t
] = E[e
Wt
] = e
t/2
E[X
2
t
] = E[e
2Wt
] = e
2t
,
and hence the variance is
V ar[X
t
] = E[X
2
t
] E[X
t
]
2
= e
2t
(e
t/2
)
2
= e
2t
e
t
.
The distribution function of X
t
= e
Wt
can be obtained by reducing it to the distri-
bution function of a Brownian motion.
F
X
t
(x) = P(X
t
x) = P(e
Wt
x)
= P(W
t
ln x) = F
W
t
(ln x)
=
1

2t
_
ln x

u
2
2t
du.
The density function of the geometric Brownian motion X
t
= e
Wt
is given by
p(x) =
d
dx
F
X
t
(x) =
_

_
1
x

2t
e
(ln x)
2
/(2t)
, if x > 0,
0, elsewhere.
Exercise 2.2.3 Show that
E[e
WtWs
] = e
ts
2
, s < t.
Exercise 2.2.4 Let X
t
= e
Wt
.
(a) Show that X
t
is not a martingale.
(b) Show that e

t
2
X
t
is a martingale.
(c) Show that for any constant c R, the process Y
t
= e
cWt
1
2
c
2
t
is a martingale.
40
Exercise 2.2.5 If X
t
= e
Wt
, nd Cov(X
s
, X
t
)
(a) by direct computation;
(b) by using Exercise 2.2.4 (b).
Exercise 2.2.6 Show that
E[e
2W
2
t
] =
_
(1 4t)
1/2
, 0 t < 1/4
, otherwise.
2.3 Integrated Brownian Motion
The stochastic process
Z
t
=
_
t
0
W
s
ds, t 0
is called integrated Brownian motion. Obviously, Z
0
= 0.
Let 0 = s
0
< s
1
< < s
k
< s
n
= t, with s
k
=
kt
n
. Then Z
t
can be written as a
limit of Riemann sums
Z
t
= lim
n
n

k=1
W
s
k
s = t lim
n
W
s
1
+ +W
sn
n
,
where s = s
k+1
s
k
=
t
n
. We are tempted to apply the Central Limit Theorem at this
point, but W
s
k
are not independent, so we rst need to transform the sum into a sum
of independent normally distributed random variables. A straightforward computation
shows that
W
s
1
+ +W
sn
= n(W
s
1
W
0
) + (n 1)(W
s
2
W
s
1
) + + (W
sn
W
s
n1
)
= X
1
+X
2
+ +X
n
. (2.3.1)
Since the increments of a Brownian motion are independent and normally distributed,
we have
X
1
N
_
0, n
2
s
_
X
2
N
_
0, (n 1)
2
s
_
X
3
N
_
0, (n 2)
2
s
_
.
.
.
X
n
N
_
0, s
_
.
Recall now the following variant of the Central Limit Theorem:
41
Theorem 2.3.1 If X
j
are independent random variables normally distributed with
mean
j
and variance
2
j
, then the sum X
1
+ + X
n
is also normally distributed
with mean
1
+ +
n
and variance
2
1
+ +
2
n
.
Then
X
1
+ +X
n
N
_
0, (1 + 2
2
+ 3
2
+ +n
2
)s
_
= N
_
0,
n(n + 1)(2n + 1)
6
s
_
,
with s =
t
n
. Using (2.3.1) yields
t
W
s
1
+ +W
sn
n
N
_
0,
(n + 1)(2n + 1)
6n
2
t
3
_
.
Taking the limit we get
Z
t
N
_
0,
t
3
3
_
.
Proposition 2.3.2 The integrated Brownian motion Z
t
has a normal distribution with
mean 0 and variance t
3
/3.
Remark 2.3.3 The aforementioned limit was taken heuristically, without specifying
the type of the convergence. In order to make this to work, the following result is
usually used:
If X
n
is a sequence of normal random variables that converges in mean square to X,
then the limit X is normal distributed, with E[X
n
] E[X] and V ar(X
n
) V ar(X),
as n .
The mean and the variance can also be computed in a direct way as follows. By
Fubinis theorem we have
E[Z
t
] = E[
_
t
0
W
s
ds] =
_
R
_
t
0
W
s
ds dP
=
_
t
0
_
R
W
s
dP ds =
_
t
0
E[W
s
] ds = 0,
since E[W
s
] = 0. Then the variance is given by
V ar[Z
t
] = E[Z
2
t
] E[Z
t
]
2
= E[Z
2
t
]
= E[
_
t
0
W
u
du
_
t
0
W
v
dv] = E[
_
t
0
_
t
0
W
u
W
v
dudv]
=
_
t
0
_
t
0
E[W
u
W
v
] dudv =
__
[0,t][0,t]
minu, v dudv
=
__
D
1
minu, v dudv +
__
D
2
minu, v dudv, (2.3.2)
42
where
D
1
= (u, v); u > v, 0 u t, D
2
= (u, v); u < v, 0 u t
The rst integral can be evaluated using Fubinis theorem
__
D
1
minu, v dudv =
__
D
1
v dudv
=
_
t
0
_
_
u
0
v dv
_
du =
_
t
0
u
2
2
du =
t
3
6
.
Similarly, the latter integral is equal to
__
D
2
minu, v dudv =
t
3
6
.
Substituting in (2.3.2) yields
V ar[Z
t
] =
t
3
6
+
t
3
6
=
t
3
3
.
Exercise 2.3.4 (a) Prove that the moment generating function of Z
t
is
m(u) = e
u
2
t
3
/6
.
(b) Use the rst part to nd the mean and variance of Z
t
.
Exercise 2.3.5 Let s < t. Show that the covariance of the integrated Brownian motion
is given by
Cov
_
Z
s
, Z
t
_
= s
2
_
t
2

s
6
_
, s < t.
Exercise 2.3.6 Show that
(a) Cov(Z
t
, Z
t
Z
th
) =
1
2
t
2
h + o(h), where o(h) denotes a quantity such that
lim
h0
o(h)/h = 0;
(b) Cov(Z
t
, W
t
) =
t
2
2
.
Exercise 2.3.7 Show that
E[e
Ws+Wu
] = e
u+s
2
e
min{s,u}
.
Exercise 2.3.8 Consider the process X
t
=
_
t
0
e
Ws
ds.
(a) Find the mean of X
t
;
(b) Find the variance of X
t
.
Exercise 2.3.9 Consider the process Z
t
=
_
t
0
W
u
du, t > 0.
(a) Show that E[Z
T
[T
t
] = Z
t
+W
t
(T t), for any t < T;
(b) Prove that the process M
t
= Z
t
tW
t
is an T
t
-martingale.
43
2.4 Exponential Integrated Brownian Motion
If Z
t
=
_
t
0
W
s
ds denotes the integrated Brownian motion, the process
V
t
= e
Zt
is called exponential integrated Brownian motion. The process starts at V
0
= e
0
= 1.
Since Z
t
is normally distributed, then V
t
is log-normally distributed. We compute the
mean and the variance in a direct way. Using Exercises 2.2.5 and 2.3.4 we have
E[V
t
] = E[e
Zt
] = m(1) = e
t
3
6
E[V
2
t
] = E[e
2Zt
] = m(2) = e
4t
3
6
= e
2t
3
3
V ar(V
t
) = E[V
2
t
] E[V
t
]
2
= e
2t
3
3
e
t
3
3
Cov(V
s
, V
t
) = e
t+3s
2
.
Exercise 2.4.1 Show that E[V
T
[T
t
] = V
t
e
(Tt)Wt+
(Tt)
3
3
for t < T.
2.5 Brownian Bridge
The process X
t
= W
t
tW
1
is called the Brownian bridge xed at both 0 and 1. Since
we can also write
X
t
= W
t
tW
t
tW
1
+tW
t
= (1 t)(W
t
W
0
) t(W
1
W
t
),
using that the increments W
t
W
0
and W
1
W
t
are independent and normally dis-
tributed, with
W
t
W
0
N(0, t), W
1
W
t
N(0, 1 t),
it follows that X
t
is normally distributed with
E[X
t
] = (1 t)E[(W
t
W
0
)] tE[(W
1
W
t
)] = 0
V ar[X
t
] = (1 t)
2
V ar[(W
t
W
0
)] +t
2
V ar[(W
1
W
t
)]
= (1 t)
2
(t 0) +t
2
(1 t)
= t(1 t).
This can be also stated by saying that the Brownian bridge tied at 0 and 1 is a Gaussian
process with mean 0 and variance t(1 t), so X
t
N
_
0, t(1 t)
_
.
Exercise 2.5.1 Let X
t
= W
t
tW
1
, 0 t 1 be a Brownian bridge xed at 0 and 1.
Let Y
t
= X
2
t
. Show that Y
0
= Y
1
= 0 and nd E[Y
t
] and V ar(Y
t
).
44
2.6 Brownian Motion with Drift
The process Y
t
= t +W
t
, t 0, is called Brownian motion with drift. The process Y
t
tends to drift o at a rate . It starts at Y
0
= 0 and it is a Gaussian process with mean
E[Y
t
] = t +E[W
t
] = t
and variance
V ar[Y
t
] = V ar[t +W
t
] = V ar[W
t
] = t.
Exercise 2.6.1 Find the distribution and the density functions of the process Y
t
.
2.7 Bessel Process
This section deals with the process satised by the Euclidean distance from the origin
to a particle following a Brownian motion in R
n
. More precisely, if W
1
(t), , W
n
(t) are
independent Brownian motions, let W(t) = (W
1
(t), , W
n
(t)) be a Brownian motion
in R
n
, n 2. The process
R
t
= dist(O, W(t)) =
_
W
1
(t)
2
+ +W
n
(t)
2
is called n-dimensional Bessel process.
The probability density of this process is given by the following result.
Proposition 2.7.1 The probability density function of R
t
, t > 0 is given by
p
t
() =
_

_
2
(2t)
n/2
(n/2)

n1
e

2
2t
, 0;
0, < 0
with

_
n
2
_
=
_
_
_
(
n
2
1)! for n even;
(
n
2
1)(
n
2
2)
3
2
1
2

, for n odd.
Proof: Since the Brownian motions W
1
(t), . . . , W
n
(t) are independent, their joint
density function is
f
W
1
Wn
(t) = f
W
1
(t) f
Wn
(t)
=
1
(2t)
n/2
e
(x
2
1
++x
2
n
)/(2t)
, t > 0.
In the next computation we shall use the following formula of integration that
follows from the use of polar coordinates
45
_
{|x|}
f(x) dx = (S
n1
)
_

0
r
n1
g(r) dr, (2.7.3)
where f(x) = g([x[) is a function on R
n
with spherical symmetry, and where
(S
n1
) =
2
n/2
(n/2)
is the area of the (n 1)-dimensional sphere in R
n
.
Let 0. The distribution function of R
t
is
F
R
() = P(R
t
) =
_
{Rt}
f
W
1
Wn
(t) dx
1
dx
n
=
_
x
2
1
++x
2
n

2
1
(2t)
n/2
e
(x
2
1
++x
2
n
)/(2t)
dx
1
dx
n
=
_

0
r
n1
_
_
S(0,1)
1
(2t)
n/2
e
(x
2
1
++x
2
n
)/(2t)
d
_
dr
=
(S
n1
)
(2t)
n/2
_

0
r
n1
e
r
2
/(2t)
dr.
Dierentiating yields
p
t
() =
d
d
F
R
() =
(S
n1
)
(2t)
n/2

n1
e

2
2t
=
2
(2t)
n/2
(n/2)

n1
e

2
2t
, > 0, t > 0.
It is worth noting that in the 2-dimensional case the aforementioned density becomes
a particular case of a Weibull distribution with parameters m = 2 and = 2t, called
Walds distribution
p
t
(x) =
1
t
xe

x
2
2t
, x > 0, t > 0.
Exercise 2.7.2 Let P(R
t
t) be the probability of a 2-dimensional Brownian motion
to be inside of the disk D(0, ) at time t > 0. Show that

2
2t
_
1

2
4t
_
< P(R
t
t) <

2
2t
.
Exercise 2.7.3 Let R
t
be a 2-dimensional Bessel process. Show that
(a) E[R
t
] =

2t/2;
(b) V ar(R
t
) = 2t
_
1

4
_
.
46
Exercise 2.7.4 Let X
t
=
R
t
t
, t > 0, where R
t
is a 2-dimensional Bessel process. Show
that X
t
0 as t in mean square.
2.8 The Poisson Process
A Poisson process describes the number of occurrences of a certain event before time t,
such as: the number of electrons arriving at an anode until time t; the number of cars
arriving at a gas station until time t; the number of phone calls received on a certain
day until time t; the number of visitors entering a museum on a certain day until time
t; the number of earthquakes that occurred in Chile during the time interval [0, t]; the
number of shocks in the stock market from the beginning of the year until time t; the
number of twisters that might hit Alabama from the beginning of the century until
time t.
The denition of a Poisson process is stated more precisely in the following.
Denition 2.8.1 A Poisson process is a stochastic process N
t
, t 0, which satises
1. The process starts at the origin, N
0
= 0;
2. N
t
has stationary, independent increments;
3. The process N
t
is right continuous in t, with left hand limits;
4. The increments N
t
N
s
, with 0 < s < t, have a Poisson distribution with
parameter (t s), i.e.
P(N
t
N
s
= k) =

k
(t s)
k
k!
e
(ts)
.
It can be shown that condition 4 in the previous denition can be replaced by the
following two conditions:
P(N
t
N
s
= 1) = (t s) +o(t s) (2.8.4)
P(N
t
N
s
2) = o(t s), (2.8.5)
where o(h) denotes a quantity such that lim
h0
o(h)/h = 0. Then the probability
that a jump of size 1 occurs in the innitesimal interval dt is equal to dt, and the
probability that at least 2 events occur in the same small interval is zero. This implies
that the random variable dN
t
may take only two values, 0 and 1, and hence satises
P(dN
t
= 1) = dt (2.8.6)
P(dN
t
= 0) = 1 dt. (2.8.7)
Exercise 2.8.2 Show that if condition 4 is satised, then conditions (2.8.4 2.8.5)
hold.
47
Exercise 2.8.3 Which of the following expressions are o(h)?
(a) f(h) = 3h
2
+h;
(b) f(h) =

h + 5;
(c) f(h) = hln [h[;
(d) f(h) = he
h
.
The fact that N
t
N
s
is stationary can be stated as
P(N
t+s
N
s
n) = P(N
t
N
0
n) = P(N
t
n) =
n

k=0
(t)
k
k!
e
t
.
From condition 4 we get the mean and variance of increments
E[N
t
N
s
] = (t s), V ar[N
t
N
s
] = (t s).
In particular, the random variable N
t
is Poisson distributed with E[N
t
] = t and
V ar[N
t
] = t. The parameter is called the rate of the process. This means that the
events occur at the constant rate .
Since the increments are independent, we have for 0 < s < t
E[N
s
N
t
] = E[(N
s
N
0
)(N
t
N
s
) +N
2
s
]
= E[N
s
N
0
]E[N
t
N
s
] +E[N
2
s
]
= s (t s) + (V ar[N
s
] +E[N
s
]
2
)
=
2
st +s. (2.8.8)
As a consequence we have the following result:
Proposition 2.8.4 Let 0 s t. Then
1. Cov(N
s
, N
t
) = s;
2. Corr(N
s
, N
t
) =
_
s
t
.
Proof: 1. Using (2.8.8) we have
Cov(N
s
, N
t
) = E[N
s
N
t
] E[N
s
]E[N
t
]
=
2
st +s st
= s.
2. Using the formula for the correlation yields
Corr(N
s
, N
t
) =
Cov(N
s
, N
t
)
(V ar[N
s
]V ar[N
t
])
1/2
=
s
(st)
1/2
=
_
s
t
.
It worth noting the similarity with Proposition 2.1.8.
48
Proposition 2.8.5 Let N
t
be T
t
-adapted. Then the process M
t
= N
t
t is an T
t
-
martingale.
Proof: Let s < t and write N
t
= N
s
+ (N
t
N
s
). Then
E[N
t
[T
s
] = E[N
s
+ (N
t
N
s
)[T
s
]
= E[N
s
[T
s
] +E[N
t
N
s
[T
s
]
= N
s
+E[N
t
N
s
]
= N
s
+(t s),
where we used that N
s
is T
s
-predictable (and hence E[N
s
[T
s
] = N
s
) and that the
increment N
t
N
s
is independent of previous values of N
s
and the information set T
s
.
Subtracting t yields
E[N
t
t[T
s
] = N
s
s,
or E[M
t
[T
s
] = M
s
. Since it is obvious that M
t
is integrable and T
t
-adapted, it follows
that M
t
is a martingale.
It is worth noting that the Poisson process N
t
is not a martingale. The martingale
process M
t
= N
t
t is called the compensated Poisson process.
Exercise 2.8.6 Compute E[N
2
t
[T
s
] for s < t. Is the process N
2
t
an T
s
-martingale?
Exercise 2.8.7 (a) Show that the moment generating function of the random variable
N
t
is
m
N
t
(x) = e
t(e
x
1)
.
(b) Deduct the expressions for the rst few moments
E[N
t
] = t
E[N
2
t
] =
2
t
2
+t
E[N
3
t
] =
3
t
3
+ 3
2
t
2
+t
E[N
4
t
] =
4
t
4
+ 6
3
t
3
+ 7
2
t
2
+t.
(c) Show that the rst few central moments are given by
E[N
t
t] = 0
E[(N
t
t)
2
] = t
E[(N
t
t)
3
] = t
E[(N
t
t)
4
] = 3
2
t
2
+t.
Exercise 2.8.8 Find the mean and variance of the process X
t
= e
Nt
.
49
Exercise 2.8.9 (a) Show that the moment generating function of the random variable
M
t
is
m
M
t
(x) = e
t(e
x
x1)
.
(b) Let s < t. Verify that
E[M
t
M
s
] = 0,
E[(M
t
M
s
)
2
] = (t s),
E[(M
t
M
s
)
3
] = (t s),
E[(M
t
M
s
)
4
] = (t s) + 3
2
(t s)
2
.
Exercise 2.8.10 Let s < t. Show that
V ar[(M
t
M
s
)
2
] = (t s) + 2
2
(t s)
2
.
2.9 Interarrival times
For each state of the world, , the path t N
t
() is a step function that exhibits unit
jumps. Each jump in the path corresponds to an occurrence of a new event. Let T
1
be the random variable which describes the time of the 1st jump. Let T
2
be the time
between the 1st jump and the second one. In general, denote by T
n
the time elapsed
between the (n 1)th and nth jumps. The random variables T
n
are called interarrival
times.
Proposition 2.9.1 The random variables T
n
are independent and exponentially dis-
tributed with mean E[T
n
] = 1/.
Proof: We start by noticing that the events T
1
> t and N
t
= 0 are the same,
since both describe the situation that no events occurred until after time t. Then
P(T
1
> t) = P(N
t
= 0) = P(N
t
N
0
= 0) = e
t
,
and hence the distribution function of T
1
is
F
T
1
(t) = P(T
1
t) = 1 P(T
1
> t) = 1 e
t
.
Dierentiating yields the density function
f
T
1
(t) =
d
dt
F
T
1
(t) = e
t
.
It follows that T
1
is has an exponential distribution, with E[T
1
] = 1/.
In order to show that the random variables T
1
and T
2
are independent, if suces to
show that
P(T
2
t) = P(T
2
t[T
1
= s),
50
i.e. the distribution function of T
2
is independent of the values of T
1
. We note rst
that from the independent increments property
P
_
0 jumps in (s, s +t], 1 jump in (0, s]
_
= P(N
s+t
N
s
= 0, N
s
N
0
= 1)
= P(N
s+t
N
s
= 0)P(N
s
N
0
= 1) = P
_
0 jumps in (s, s +t]
_
P
_
1 jump in (0, s]
_
.
Then the conditional distribution of T
2
is
F(t[s) = P(T
2
t[T
1
= s) = 1 P(T
2
> t[T
1
= s)
= 1
P(T
2
> t, T
1
= s)
P(T
1
= s)
= 1
P
_
0 jumps in (s, s +t], 1 jump in (0, s]
_
P(T
1
= s)
= 1
P
_
0 jumps in (s, s +t]
_
P
_
1 jump in (0, s]
_
P
_
1 jump in (0, s]
_ = 1 P
_
0 jumps in (s, s +t]
_
= 1 P(N
s+t
N
s
= 0) = 1 e
t
,
which is independent of s. Then T
2
is independent of T
1
and exponentially distributed.
A similar argument for any T
n
leads to the desired result.
2.10 Waiting times
The random variable S
n
= T
1
+ T
2
+ + T
n
is called the waiting time until the nth
jump. The event S
n
t means that there are n jumps that occurred before or at
time t, i.e. there are at least n events that happened up to time t; the event is equal
to N
t
n. Hence the distribution function of S
n
is given by
F
Sn
(t) = P(S
n
t) = P(N
t
n) = e
t

k=n
(t)
k
k!

Dierentiating we obtain the density function of the waiting time S
n
f
Sn
(t) =
d
dt
F
Sn
(t) =
e
t
(t)
n1
(n 1)!

Writing
f
Sn
(t) =
t
n1
e
t
(1/)
n
(n)
,
it turns out that S
n
has a gamma distribution with parameters = n and = 1/. It
follows that
E[S
n
] =
n

, V ar[S
n
] =
n

51
Figure 2.2: The Poisson process N
t
and the waiting times S
1
, S
2
, S
n
. The shaded
rectangle has area n(S
n+1
t).
The relation lim
n
E[S
n
] = states that the expectation of the waiting time is un-
bounded as n .
Exercise 2.10.1 Prove that
d
dt
F
Sn
(t) =
e
t
(t)
n1
(n 1)!

Exercise 2.10.2 Using that the interarrival times T
1
, T
2
, are independent and ex-
ponentially distributed, compute directly the mean E[S
n
] and variance V ar(S
n
).
2.11 The Integrated Poisson Process
The function u N
u
is continuous with the exception of a set of countable jumps of
size 1. It is known that such functions are Riemann integrable, so it makes sense to
dene the process
U
t
=
_
t
0
N
u
du,
called the integrated Poisson process. The next result provides a relation between the
process U
t
and the partial sum of the waiting times S
k
.
Proposition 2.11.1 The integrated Poisson process can be expressed as
U
t
= tN
t

Nt

k=1
S
k
.
52
Let N
t
= n. Since N
t
is equal to k between the waiting times S
k
and S
k+1
, the process
U
t
, which is equal to the area of the subgraph of N
u
between 0 and t, can be expressed
as
U
t
=
_
t
0
N
u
du = 1 (S
2
S
1
) + 2 (S
3
S
2
) + +n(S
n+1
S
n
) n(S
n+1
t).
Since S
n
< t < S
n+1
, the dierence of the last two terms represents the area of last
the rectangle, which has the length t S
n
and the height n. Using associativity, a
computation yields
1 (S
2
S
1
) + 2 (S
3
S
2
) + +n(S
n+1
S
n
) = nS
n+1
(S
1
+S
2
+ +S
n
).
Substituting in the aforementioned relation yields
U
t
= nS
n+1
(S
1
+S
2
+ +S
n
) n(S
n+1
t)
= nt (S
1
+S
2
+ +S
n
)
= tN
t

Nt

k=1
S
k
,
where we replaced n by N
t
.
The conditional distribution of the waiting times is provided by the following useful
result.
Theorem 2.11.2 Given that N
t
= n, the waiting times S
1
, S
2
, , S
n
have the the
joint density function given by
f(s
1
, s
2
, , s
n
) =
n!
t
n
, 0 < s
1
s
2
s
n
< t.
This is the same as the density of an ordered sample of size n from a uniform distribution
on the interval (0, t). A naive explanation of this result is as follows. If we know
that there will be exactly n events during the time interval (0, t), since the events
can occur at any time, each of them can be considered uniformly distributed, with
the density f(s
k
) = 1/t. Since it makes sense to consider the events independent,
taking into consideration all possible n! permutaions, the joint density function becomes
f(s
1
, , s
n
) = n!f(s
1
) f(s
n
) =
n!
t
n
.
Exercise 2.11.3 Find the following means
(a) E[U
t
].
(b) E
_
Nt

k=1
S
k

.
53
Exercise 2.11.4 Show that V ar(U
t
) =
t
3
3
.
Exercise 2.11.5 Can you apply a similar proof as in Proposition 2.3.2 to show that
the integrated Poisson process U
t
is also a Poisson process?
Exercise 2.11.6 Let Y : N be a discrete random variable. Then for any random
variable X we have
E[X] =

y0
E[X[Y = y]P(Y = y).
Exercise 2.11.7 Use Exercise 2.11.6 to solve Exercise 2.11.3 (b).
Exercise 2.11.8 (a) Let T
k
be the kth interarrival time. Show that
E[e
T
k
] =

+
, > 0.
(b) Let n = N
t
. Show that
U
t
= nt [nT
1
+ (n 1)T
2
+ + 2T
n1
+T
n
].
(c) Find the conditional expectation
E
_
e
Ut

N
t
= n
_
.
(Hint: If know that there are exactly n jumps in the interval [0, T], it makes sense
to consider the arrival time of the jumps T
i
independent and uniformly distributed on
[0, T]).
(d) Find the expectation
E
_
e
Ut
_
.
2.12 Submartingales
A stochastic process X
t
on the probability space (, T, P) is called submartingale with
respect to the ltration T
t
if:
(a)
_

[X
t
[ dP < (X
t
integrable);
(b) X
t
is known if T
t
is given (X
t
is adaptable to T
t
);
(c) E[X
t+s
[T
t
] X
t
, t, s 0 (future predictions exceed the present value).
Example 2.12.1 We shall prove that the process X
t
= t + W
t
, with > 0 is a
submartingale.
54
The integrability follows from the inequality [X
t
()[ t + [W
t
()[ and integrability
of W
t
. The adaptability of X
t
is obvious, and the last property follows from the
computation:
E[X
t+s
[T
t
] = E[t +W
t+s
[T
t
] +s > E[t +W
t+s
[T
t
]
= t +E[W
t+s
[T
t
] = t +W
t
= X
t
,
where we used that W
t
is a martingale.
Example 2.12.2 We shall show that the square of the Brownian motion, W
2
t
, is a
submartingale.
Using that W
2
t
t is a martingale, we have
E[W
2
t+s
[T
t
] = E[W
2
t+s
(t +s)[T
t
] +t +s = W
2
t
t +t +s
= W
2
t
+s W
2
t
.
The following result supplies examples of submatingales starting from martingales
or submartingales.
Proposition 2.12.3 (a) If X
t
is a martingale and a convex function such that (X
t
)
is integrable, then the process Y
t
= (X
t
) is a submartingale.
(b) If X
t
is a submartingale and an increasing convex function such that (X
t
) is
integrable, then the process Y
t
= (X
t
) is a submartingale.
(c) If X
t
is a martingale and f(t) is an increasing, integrable function, then Y
t
=
X
t
+f(t) is a submartingale.
Proof: (a) Using Jensens inequality for conditional probabilities, Exercise 1.12.6, we
have
E[Y
t+s
[T
t
] = E[(X
t+s
)[T
t
]
_
E[X
t+s
[T
t
]
_
= (X
t
) = Y
t
.
(b) From the submatingale property and monotonicity of we have

_
E[X
t+s
[T
t
]
_
(X
t
).
Then apply a similar computation as in part (a).
(c) We shall check only the forcast property, since the other are obvious.
E[Y
t+s
[T
t
] = E[X
t+s
+f(t +s)[T
t
] = E[X
t+s
[T
t
] +f(t +s)
= X
t
+f(t +s) X
t
+f(t) = Y
t
, s, t > 0.
55
Corollary 2.12.4 (a) Let X
t
be a martingale. Then X
2
t
, [X
t
[, e
Xt
are submartingales.
(b) Let > 0. Then e
t+Wt
is a submartingale.
Proof: (a) Results from part (a) of Proposition 2.12.3.
(b) It follows from Example 2.12 and part (b) of Proposition 2.12.3.
The following result provides important inequalities involving submartingales.
Proposition 2.12.5 (Doobs Submartingale Inequality) (a) Let X
t
be a non-negative
submartingale. Then
P(sup
st
X
s
x)
E[X
t
]
x
, x > 0.
(b) If X
t
is a right continuous submartingale, then for any x > 0
P(sup
st
X
t
x)
E[X
+
t
]
x
,
where X
+
t
= maxX
t
, 0.
Exercise 2.12.6 Let x > 0. Show the inequalities:
(a) P(sup
st
W
2
s
x)
t
x
.
(b) P(sup
st
[W
s
[ x)
_
2t/
x
.
Exercise 2.12.7 Show that st-lim
t
sup
st
[W
s
[
t
= 0.
Exercise 2.12.8 Show that for any martingale X
t
we have the inequality
P(sup
st
X
2
t
> x)
E[X
2
t
]
x
, x > 0.
It is worth noting that Doobs inequality implies Markovs inequality. Since sup
st
X
s

X
t
, then P(X
t
x) P(sup
st
X
s
x). Then Doobs inequality
P(sup
st
X
s
x)
E[X
t
]
x
implies Markovs inequality (see Theorem 1.12.8)
P(X
t
x)
E[X
t
]
x
.
56
Exercise 2.12.9 Let N
t
denote the Poisson process and consider the information set
T
t
= N
s
; s t.
(a) Show that N
t
is a submartingale;
(b) Is N
2
t
a submartingale?
Exercise 2.12.10 It can be shown that for any 0 < < we have the inequality
E
_

t
_
N
t
t

_
2
_

4

2

Using this inequality prove that mslim
t
N
t
t
= .
Chapter 3
Properties of Stochastic
Processes
3.1 Stopping Times
Consider the probability space (, T, P) and the ltration (T
t
)
t0
, i.e.
T
t
T
s
T, t < s.
Assume that the decision to stop playing a game before or at time t is determined by
the information T
t
available at time t. Then this decision can be modeled by a random
variable : [0, ] which satises
; () t T
t
.
This means that given the information set T
t
, we know whether the event ; () t
had occurred or not. We note that the possibility = is included, since the decision
to continue the game for ever is also a possible event. However, we ask the condition
P( < ) = 1. A random variable with the previous properties is called a stopping
time.
The next example illustrates a few cases when a decision is or is not a stopping time.
In order to accomplish this, think of the situation that is the time when some random
event related to a given stochastic process occurs rst.
Example 3.1.1 Let T
t
be the information available until time t regarding the evolution
of a stock. Assume the price of the stock at time t = 0 is $50 per share. The following
decisions are stopping times:
(a) Sell the stock when it reaches for the rst time the price of $100 per share;
(b) Buy the stock when it reaches for the rst time the price of $10 per share;
(c) Sell the stock at the end of the year;
57
58
(d) Sell the stock either when it reaches for the rst time $80 or at the end of the
year.
(e) Keep the stock either until the initial investment doubles or until the end of the
year;
The following decisions are not stopping times:
(f) Sell the stock when it reaches the maximum level it will ever be;
(g) Keep the stock until the initial investment at least doubles.
Part (f) is not a stopping time because it requires information about the future that is
not contained in T
t
. For part (g), since the initial stock price is S
0
= $50, the general
theory of stock prices state
P(S
t
2S
0
) = P(S
t
100) < 1,
i.e. there is a positive probability that the stock never doubles its value. This contra-
dicts the condition P( = ) = 0. In part (e) there are two conditions; the latter one
has the occurring probability equal to 1.
Exercise 3.1.2 Show that any positive constant, = c, is a stopping time with respect
to any ltration.
Exercise 3.1.3 Let () = inft > 0; [W
t
()[ > K, with K > 0 constant. Show that
is a stopping time with respect to the ltration T
t
= (W
s
; s t).
The random variable is called the rst exit time of the Brownian motion W
t
from the
interval (K, K). In a similar way one can dene the rst exit time of the process X
t
from the interval (a, b):
() = inft > 0; X
t
() / (a, b) = inft > 0; X
t
() > b or X
t
() < a).
Let X
0
< a. The rst entry time of X
t
in the interval (a, b) is dened as
() = inft > 0; X
t
() (a, b) = inft > 0; X
t
() > a or X
t
() < b).
If let b = , we obtain the rst hitting time of the level a
() = inft > 0; X
t
() > a).
We shall deal with hitting times in more detail in section 3.3.
Exercise 3.1.4 Let X
t
be a continuous stochastic process. Prove that the rst exit
time of X
t
from the interval (a, b) is a stopping time.
We shall present in the following some properties regarding operations with stopping
times. Consider the notations
1

2
= max
1
,
2
,
1

2
= min
1
,
2
,
n
=
sup
n1

n
and
n
= inf
n1

n
.
59
Proposition 3.1.5 Let
1
and
2
be two stopping times with respect to the ltration
T
t
. Then
1.
1

2
2.
1

2
3.
1
+
2
are stopping times.
Proof: 1. We have
;
1

2
t = ;
1
t ;
2
t T
t
,
since ;
1
t T
t
and ;
2
t T
t
. From the subadditivity of P we get
P
_
;
1

2
=
_
= P
_
;
1
= ;
2
=
_
P
_
;
1
=
_
+P
_
;
2
=
_
.
Since
P
_
;
i
=
_
= 1 P
_
;
i
<
_
= 0, i = 1, 2,
it follows that P
_
;
1

2
=
_
= 0 and hence P
_
;
1

2
<
_
= 1. Then

1

2
is a stopping time.
2. The event ;
1

2
t T
t
if and only if ;
1

2
> t T
t
.
;
1

2
> t = ;
1
> t ;
2
> t T
t
,
since ;
1
> t T
t
and ;
2
> t T
t
(the -algebra T
t
is closed to complements).
The fact that
1

2
< almost surely has a similar proof.
3. We note that
1
+
2
t if there is a c (0, t) such that

1
c,
2
t c.
Using that the rational numbers are dense in R, we can write
;
1
+
2
t =
_
0<c<t,cQ
_
;
1
c ;
2
t c
_
T
t
,
since
;
1
c T
c
T
t
, ;
2
t c T
tc
T
t
.
Writing
;
1
+
2
= = ;
1
= ;
2
=
yields
P
_
;
1
+
2
=
_
P
_
;
1
=
_
+P
_
;
2
=
_
= 0,
60
Hence P
_
;
1
+
2
<
_
= 1. It follows that
1
+
2
is a stopping time.
A ltration T
t
is called right-continuous if T
t
=

n=1
T
t+
1
n
, for t > 0. This means
that the information available at time t is a good approximation for any future in-
nitesimal information T
t+
; or, equivalently, nothing more can be learned by peeking
innitesimally far into the future.
Exercise 3.1.6 (a) Let T
t
= W
s
; s t, where W
t
is a Brownian motion. Show
that T
t
is right-continuous.
(b) Let A
t
= N
s
; s t, where N
t
is a Poisson motion. Is A
t
right-continuous?
Proposition 3.1.7 Let T
t
be right-continuous and (
n
)
n1
be a sequence of bounded
stopping times.
(a) Then sup
n
and inf
n
are stopping times.
(b) If the sequence (
n
)
n1
converges to , ,= 0, then is a stopping time.
Proof: (a) The fact that
n
is a stopping time follows from
;
n
t

n1
;
n
t T
t
,
and from the boundedness, which implies P(
n
< ) = 1.
In order to show that
n
is a stopping time we shall proceed as in the following.
Using that T
t
is right-continuous and closed to complements, it suces to show that
;
n
t T
t
. This follows from
;
n
t =

n1
;
n
> t T
t
.
(b) Let lim
n

n
= . Then there is an increasing (or decreasing) subsequence
n
k
of
stopping times that tends to , so sup
n
k
= (or inf
n
k
= ). Since
n
k
are stopping
times, by part (a), it follows that is a stopping time.
The condition that
n
is bounded is signicant, since if take
n
= n as stopping
times, then sup
n

n
= with probability 1, which does not satisfy the stopping time
denition.
Exercise 3.1.8 Let be a stopping time.
(a) Let c 1 be a constant. Show that c is a stopping time.
(b) Let f : [0, ) R be a continuous, increasing function satisfying f(t) t.
Prove that f() is a stopping time.
(c) Show that e

is a stopping time.
61
Exercise 3.1.9 Let be a stopping time and c > 0 a constant. Prove that + c is a
stopping time.
Exercise 3.1.10 Let a be a constant and dene = inft 0; W
t
= a. Is a
stopping time?
Exercise 3.1.11 Let be a stopping time. Consider the following sequence
n
=
(m + 1)2
n
if m2
n
< (m + 1)2
n
(stop at the rst time of the form k2
n
after
). Prove that
n
is a stopping time.
3.2 Stopping Theorem for Martingales
The next result states that in a fair game, the expected nal fortune of a gambler, who
is using a stopping time to quit the game, is the same as the expected initial fortune.
From the nancially point of view, the theorem says that if you buy an asset at some
initial time and adopt a strategy of deciding when to sell it, then the expected price at
the selling time is the initial price; so one cannot make money by buying and selling an
asset whose price is a martingale. Fortunately, the price of a stock is not a martingale,
and people can still expect to make money buying and selling stocks.
If (M
t
)
t0
is an T
t
-martingale, then taking the expectation in
E[M
t
[T
s
] = M
s
, s < t
and using Example 1.11.2 yields
E[M
t
] = E[M
s
], s < t.
In particular, E[M
t
] = E[M
0
], for any t > 0. The next result states necessary conditions
under which this identity holds if t is replaced by any stopping time .
Theorem 3.2.1 (Optional Stopping Theorem) Let (M
t
)
t0
be a right continuous
T
t
-martingale and be a stopping time with respect to T
t
. If either one of the following
conditions holds:
1. is bounded, i.e. N < such that N;
2. c > 0 such that E[[M
t
[] c, t > 0,
then E[M

] = E[M
0
]. If M
t
is an T
t
-submartingale, then E[M

] E[M
0
].
Proof: We shall sketch the proof for the rst case only. Taking the expectation in
relation
M

= M
t
+ (M

M
t
)1
{>t}
,
see Exercise 3.2.3, yields
E[M

] = E[M
t
] +E[M

1
{>t}
] E[M
t
1
{>t}
].
62
Since M
t
is a martingale, see Exercise 3.2.4 (b), then E[M
t
] = E[M
0
]. The previous
relation becomes
E[M

] = E[M
0
] +E[M

1
{>t}
] E[M
t
1
{>t}
], t > 0.
Taking the limit yields
E[M

] = E[M
0
] + lim
t
E[M

1
{>t}
] lim
t
E[M
t
1
{>t}
]. (3.2.1)
We shall show that both limits are equal to zero.
Since [M

1
{>t}
[ [M

[, t > 0, and M

is integrable, see Exercise 3.2.4 (a), by


the dominated convergence theorem we have
lim
t
E[M

1
{>t}
] = lim
t
_

1
{>t}
dP =
_

lim
t
M

1
{>t}
dP = 0.
For the second limit
lim
t
E[M
t
1
{>t}
] = lim
t
_

M
t
1
{>t}
dP = 0,
since for t > N the integrand vanishes. Hence relation (3.2.1) yields E[M

] = E[M
0
].
It is worth noting that the previous theorem is a special case of the more general
Optional Stopping Theorem of Doob:
Theorem 3.2.2 Let M
t
be a right continuous martingale and , be two bounded
stopping times, with . Then M

, M

are integrable and


E[M

[T

] = M

a.s.
In particular, taking expectations, we have
E[M

] = E[M

] a.s.
In the case when M
t
is a submartingale then E[M

] E[M

] a.s.
Exercise 3.2.3 Show that
M

= M
t
+ (M

M
t
)1
{>t}
,
where
1
{>t}
() =
_
1, () > t;
0, () t
is the indicator function of the set > t.
Exercise 3.2.4 Let M
t
be a right continuous martingale and be a stopping time.
Show that
(a) M

is integrable;
(b) M
t
is a martingale.
Exercise 3.2.5 Show that if let = 0 in Theorem 3.2.2 yields Theorem 3.2.1.
63
a
Ta
W
t
50 100 150 200 250 300 350
1.5
2.0
2.5
Figure 3.1: The rst hitting time T
a
given by W
Ta
= a.
3.3 The First Passage of Time
The rst passage of time is a particular type of hitting time, which is useful in nance
when studying barrier options and lookback options. For instance, knock-in options
enter into existence when the stock price hits for the rst time a certain barrier before
option maturity. A lookback option is priced using the maximum value of the stock
until the present time. The stock price is not a Brownian motion, but it depends on
one. Hence the need for studying the hitting time for the Brownian motion.
The rst result deals with the rst hitting time for a Brownian motion to reach the
barrier a R, see Fig.3.1.
Lemma 3.3.1 Let T
a
be the rst time the Brownian motion W
t
hits a. Then the
distribution function of T
a
is given by
P(T
a
t) =
2

2
_

|a|/

t
e
y
2
/2
dy.
Proof: If A and B are two events, then
P(A) = P(A B) +P(A B)
= P(A[B)P(B) +P(A[B)P(B). (3.3.2)
Let a > 0. Using formula (3.3.2) for A = ; W
t
() a and B = ; T
a
() t
yields
P(W
t
a) = P(W
t
a[T
a
t)P(T
a
t)
+P(W
t
a[T
a
> t)P(T
a
> t) (3.3.3)
If T
a
> t, the Brownian motion did not reach the barrier a yet, so we must have W
t
< a.
Therefore
P(W
t
a[T
a
> t) = 0.
64
If T
a
t, then W
Ta
= a. Since the Brownian motion is a Markov process, it starts
fresh at T
a
. Due to symmetry of the density function of a normal variable, W
t
has
equal chances to go up or go down after the time interval t T
a
. It follows that
P(W
t
a[T
a
t) =
1
2
.
Substituting into (3.3.3) yields
P(T
a
t) = 2P(W
t
a)
=
2

2t
_

a
e
x
2
/(2t)
dx =
2

2
_

a/

t
e
y
2
/2
dy.
If a < 0, symmetry implies that the distribution of T
a
is the same as that of T
a
, so
we get
P(T
a
t) = P(T
a
t) =
2

2
_

a/

t
e
y
2
/2
dy.
Remark 3.3.2 The previous proof is based on a more general principle called the Re-
ection Principle: If is a stopping time for the Brownian motion W
t
, then the Brow-
nian motion reected at is also a Brownian motion.
Theorem 3.3.3 Let a R be xed. Then the Brownian motion hits a (in a nite
amount of time) with probability 1.
Proof: The probability that W
t
hits a (in a nite amount of time) is
P(T
a
< ) = lim
t
P(T
a
t) = lim
t
2

2
_

|a|/

t
e
y
2
/2
dy
=
2

2
_

0
e
y
2
/2
dy = 1,
where we used the well known integral
_

0
e
y
2
/2
dy =
1
2
_

e
y
2
/2
dy =
1
2

2.
The previous result stated that the Brownian motion hits the barrier a almost
surely. The next result shows that the expected time to hit the barrier is innite.
Proposition 3.3.4 The random variable T
a
has a Pearson 5 distribution given by
p(t) =
[a[

2
e

a
2
2t
t

3
2
, t > 0.
It has the mean E[T
a
] = and the mode
a
2
3
.
65
a 3
2
1 2 3 4
0.1
0.2
0.3
0.4
Figure 3.2: The distribution of the rst hitting time T
a
.
Proof: Dierentiating in the formula of distribution function
1
F
Ta
(t) = P(T
a
t) =
2

2
_

a/

t
e
y
2
/2
dy
yields the following probability density function
p(t) =
dF
Ta
(t)
dt
=
a

2
e

a
2
2t
t

3
2
, t > 0.
This is a Pearson 5 distribution with parameters = 1/2 and = a
2
/2. The expecta-
tion is
E[T
a
] =
_

0
tp(t) dt =
a

2
_

0
1

t
e

a
2
2t
dt.
Using the inequality e

a
2
2t
> 1
a
2
2t
, t > 0, we have the estimation
E[T
a
] >
a

2
_

0
1

t
dt
a
3
2

2
_

0
1
t
3/2
dt = , (3.3.4)
since
_

0
1

t
dt is divergent and
_

0
1
t
3/2
dt is convergent.
The mode of T
a
is given by

+ 1
=
a
2
2(
1
2
+ 1)
=
a
2
3
.
1
One may use Leibnitzs formula
d
dt

(t)
(t)
f(u) du = f((t))

(t) f((t))

(t).
66
Remark 3.3.5 The distribution has a peak at a
2
/3. Then if we need to pick a small
time interval [t dt, t +dt] in which the probability that the Brownian motion hits the
barrier a is maximum, we need to choose t = a
2
/3.
Remark 3.3.6 Formula (3.3.4) states that the expected waiting time for W
t
to reach
the barrier a is innite. However, the expected waiting time for the Brownian motion
W
t
to hit either a or a is nite, see Exercise 3.3.9.
Corollary 3.3.7 A Brownian motion process returns to the origin in a nite amount
time with probability 1.
Proof: Choose a = 0 and apply Theorem 3.3.3.
Exercise 3.3.8 Try to apply the proof of Lemma 3.3.1 for the following stochastic
processes
(a) X
t
= t +W
t
, with , > 0 constants;
(b) X
t
=
_
t
0
W
s
ds.
Where is the diculty?
Exercise 3.3.9 Let a > 0 and consider the hitting time

a
= inft > 0; W
t
= a or W
t
= a = inft > 0; [W
t
[ = a.
Prove that E[
a
] = a
2
.
Exercise 3.3.10 (a) Show that the distribution function of the process
X
t
= max
s[0,t]
W
s
is given by
P(X
t
a) =
2

2
_
a/

t
0
e
y
2
/2
dy.
(b) Show that E[X
t
] =
_
2t/ and V ar(X
t
) = t
_
1
2

_
.
Exercise 3.3.11 (a) Find the distribution of Y
t
= [W
t
[, t 0;
(b) Show that E
_
max
0tT
[W
t
[

=
_
T
2
.
The fact that a Brownian motion returns to the origin or hits a barrier almost surely
is a property characteristic to the rst dimension only. The next result states that in
larger dimensions this is no longer possible.
67
Theorem 3.3.12 Let (a, b) R
2
. The 2-dimensional Brownian motion W(t) =
_
W
1
(t), W
2
(t)
_
(with W
1
(t) and W
2
(t) independent) hits the point (a, b) with proba-
bility zero. The same result is valid for any n-dimensional Brownian motion, with
n 2.
However, if the point (a, b) is replaced by the disk D

(x
0
) = x R
2
; [x x
0
[ ,
there is a dierence in the behavior of the Brownian motion from n = 2 to n > 2.
Theorem 3.3.13 The 2-dimensional Brownian motion W(t) =
_
W
1
(t), W
2
(t)
_
hits
the disk D

(x
0
) with probability one.
Theorem 3.3.14 Let n > 2. The n-dimensional Brownian motion W(t) =
_
W
1
(t), W
n
(t)
_
hits the ball D

(x
0
) with probability
P =
_
[x
0
[

_
2n
< 1.
The previous results can be stated by saying that that Brownian motion is transient
in R
n
, for n > 2. If n = 2 the previous probability equals 1. We shall come back with
proofs to the aforementioned results in a later chapter.
Remark 3.3.15 If life spreads according to a Brownian motion, the aforementioned
results explain why life is more extensive on earth rather than in space. The probability
for a form of life to reach a planet of radius R situated at distance d is
R
d
. Since d is
large the probability is very small, unlike in the plane, where the probability is always
1.
Exercise 3.3.16 Is the one-dimensional Brownian motion transient or recurrent in
R?
3.4 The Arc-sine Laws
In this section we present a few results which provide certain probabilities related with
the behavior of a Brownian motion in terms the arc-sine of a quotient of two time
instances. These results are generally known under the name of law of arc-sines.
The following result will be used in the proof of the rst law of Arc-sine.
Proposition 3.4.1 (a) If X : N is a discrete random variable, then for any
subset A , we have
P(A) =

xN
P(A[X = x)P(X = x).
(b) If X : R is a continuous random variable, then
P(A) =
_
P(A[X = x)dP =
_
P(A[X = x)f
X
(x) dx.
68
t
t 1
2
W
t
a
50 100 150 200 250 300 350
0.6
0.4
0.2
0.2
0.4
Figure 3.3: The event A(a; t
1
, t
2
) in the Law of Arc-sine.
Proof: (a) The sets X
1
(x) = X = x = ; X() = x form a partition of the
sample space , i.e.:
(i) =

x
X
1
(x);
(ii) X
1
(x) X
1
(y) = for x ,= y.
Then A =
_
x
_
A X
1
(x)
_
=
_
x
_
A X = x
_
, and hence
P(A) =

x
P
_
A X = x
_
=

x
P(A X = x)
P(X = x)
P(X = x)
=

x
P(A[X = x)P(X = x).
(b) In the case when X is continuous, the sum is replaced by an integral and the
probability P(X = x) by f
X
(x)dx, where f
X
is the density function of X.
The zero set of a Brownian motion W
t
is dened by t 0; W
t
= 0. Since W
t
is
continuous, the zero set is closed with no isolated points almost surely. The next result
deals with the probability that the zero set does not intersect the interval (t
1
, t
2
).
Theorem 3.4.2 (The law of Arc-sine) The probability that a Brownian motion W
t
does not have any zeros in the interval (t
1
, t
2
) is equal to
P(W
t
,= 0, t
1
t t
2
) =
2

arcsin
_
t
1
t
2
.
Proof: Let A(a; t
1
, t
2
) denote the event that the Brownian motion W
t
takes on the
value a between t
1
and t
2
. In particular, A(0; t
1
, t
2
) denotes the event that W
t
has (at
least) a zero between t
1
and t
2
. Substituting A = A(0; t
1
, t
2
) and X = W
t
1
into the
formula provided by Proposition 3.4.1
69
P(A) =
_
P(A[X = x)f
X
(x) dx
yields
P
_
A(0; t
1
, t
2
)
_
=
_
P
_
A(0; t
1
, t
2
)[W
t
1
= x
_
f
W
t
1
(x) dx (3.4.5)
=
1

2t
1
_

P
_
A(0; t
1
, t
2
)[W
t
1
= x
_
e

x
2
2t
1
dx
Using the properties of W
t
with respect to time translation and symmetry we have
P
_
A(0; t
1
, t
2
)[W
t
1
= x
_
= P
_
A(0; 0, t
2
t
1
)[W
0
= x
_
= P
_
A(x; 0, t
2
t
1
)[W
0
= 0
_
= P
_
A([x[; 0, t
2
t
1
)[W
0
= 0
_
= P
_
A([x[; 0, t
2
t
1
)
_
= P
_
T
|x|
t
2
t
1
_
,
the last identity stating that W
t
hits [x[ before t
2
t
1
. Using Lemma 3.3.1 yields
P
_
A(0; t
1
, t
2
)[W
t
1
= x
_
=
2
_
2(t
2
t
1
)
_

|x|
e

y
2
2(t
2
t
1
)
dy.
Substituting into (3.4.5) we obtain
P
_
A(0; t
1
, t
2
)
_
=
1

2t
1
_

_
2
_
2(t
2
t
1
)
_

|x|
e

y
2
2(t
2
t
1
)
dy
_
e

x
2
2t
1
dx
=
1

_
t
1
(t
2
t
1
)
_

0
_

|x|
e

y
2
2(t
2
t
1
)

x
2
2t
1
dydx.
The above integral can be evaluated to get (see Exercise 3.4.3 )
P
_
A(0; t
1
, t
2
)
_
= 1
2

arcsin
_
t
1
t
2
.
Using P(W
t
,= 0, t
1
t t
2
) = 1P
_
A(0; t
1
, t
2
)
_
we obtain the desired result.
Exercise 3.4.3 Use polar coordinates to show
1

_
t
1
(t
2
t
1
)
_

0
_

|x|
e

y
2
2(t
2
t
1
)

x
2
2t
1
dydx = 1
2

arcsin
_
t
1
t
2
.
Exercise 3.4.4 Find the probability that a 2-dimensional Brownian motion W(t) =
_
W
1
(t), W
2
(t)
_
stays in the same quadrant for the time interval t (t
1
, t
2
).
70
Exercise 3.4.5 Find the probability that a Brownian motion W
t
does not take the
value a in the interval (t
1
, t
2
).
Exercise 3.4.6 Let a ,= b. Find the probability that a Brownian motion W
t
does not
take any of the values a, b in the interval (t
1
, t
2
). Formulate and prove a generaliza-
tion.
We provide below without proof a few similar results dealing with arc-sine probabilities.
The rst result deals with the amount of time spent by a Brownian motion on the
positive half-axis.
Theorem 3.4.7 (Arc-sine Law of Levy) Let L
+
t
=
_
t
0
sgn
+
W
s
ds be the amount of
time a Brownian motion W
t
is positive during the time interval [0, t]. Then
P(L
+
t
) =
2

arcsin
_

t
.
The next result deals with the Arc-sine law for the last exit time of a Brownian
motion from 0.
Theorem 3.4.8 (Arc-sine Law of exit from 0) Let
t
= sup0 s t; W
s
= 0.
Then
P(
t
) =
2

arcsin
_

t
, 0 t.
The Arc-sine law for the time the Brownian motion attains its maximum on the
interval [0, t] is given by the next result.
Theorem 3.4.9 (Arc-sine Law of maximum) Let M
t
= max
0st
W
s
and dene

t
= sup0 s t; W
s
= M
t
.
Then
P(
t
s) =
2

arcsin
_
s
t
, 0 s t, t > 0.
3.5 More on Hitting Times
In this section we shall deal with results regarding hitting times of Brownian motion
with drift. They will be useful in the sequel when pricing barrier options.
Theorem 3.5.1 Let X
t
= t +W
t
denote a Brownian motion with nonzero drift rate
, and consider , > 0. Then
P(X
t
goes up to before down to ) =
e
2
1
e
2
e
2
.
71
Proof: Let T = inft > 0; X
t
= or X
t
= be the rst exit time of X
t
from the
interval (, ), which is a stopping time, see Exercise 3.1.4. The exponential process
M
t
= e
cWt
c
2
2
t
, t 0
is a martingale, see Exercise 2.2.4(c). Then E[M
t
] = E[M
0
] = 1. By the Optional
Stopping Theorem (see Theorem 3.2.1), we get E[M
T
] = 1. This can be written as
1 = E[e
cW
T

1
2
c
2
T
] = E[e
cX
T
(c+
1
2
c
2
)T
]. (3.5.6)
Choosing c = 2 yields E[e
2X
T
] = 1. Since the random variable X
T
takes only the
values and , if let p

= P(X
T
= ), the previous relation becomes
e
2
p

+e
2
(1 p

) = 1.
Solving for p

yields
p

=
e
2
1
e
2
e
2
. (3.5.7)
Noting that
p

= P(X
t
goes up to before down to )
leads to the desired answer.
It is worth noting how the previous formula changes in the case when the drift rate
is zero, i.e. when = 0, and X
t
= W
t
. The previous probability is computed by taking
the limit 0 and using LHospitals rule
lim
0
e
2
1
e
2
e
2
= lim
0
2e
2
2e
2
+ 2e
2
=

+
.
Hence
P(W
t
goes up to before down to ) =

+
.
Taking the limit we recover the following result
P(W
t
hits ) = 1.
If = we obtain
P(W
t
goes up to before down to ) =
1
2
,
which shows that the Brownian motion is equally likely to go up or down an amount
in a given time interval.
If T

and T

denote the times when the process X


t
reaches and , respectively,
then the aforementioned probabilities can be written using inequalities. For instance
the rst identity becomes
P(T

) =
e
2
1
e
2
e
2
.
72
Exercise 3.5.2 Let X
t
= t + W
t
denote a Brownian motion with nonzero drift rate
, and consider > 0.
(a) If > 0 show that
P(X
t
goes up to ) = 1.
(b) If < 0 show that
P(X
t
goes up to ) = e
2
< 1.
Formula (a) can be written equivalently as
P(sup
t0
(W
t
+t) ) = 1, 0,
while formula (b) becomes
P(sup
t0
(W
t
+t) ) = e
2
, < 0,
or
P(sup
t0
(W
t
t) ) = e
2
, > 0,
which is known as one of the Doobs inequalities. This can be also described in terms
of stopping times as follows. Dene the stopping time

= inft > 0; W
t
t .
Using
P(

< ) = P
_
sup
t0
(W
t
t)
_
yields the identities
P(

< ) = e
2
, > 0,
P(

< ) = 1, 0.
Exercise 3.5.3 Let X
t
= t + W
t
denote a Brownian motion with nonzero drift rate
, and consider > 0. Show that the probability that X
t
never hits is given by
_
1 e
2
, if > 0
0, if < 0.
Recall that T is the rst time when the process X
t
hits or .
Exercise 3.5.4 (a) Show that
E[X
T
] =
e
2
+e
2

e
2
e
2
.
(b) Find E[X
2
T
];
(c) Compute V ar(X
T
).
73
The next result deals with the time one has to wait (in expectation) for the process
X
t
= t +W
t
to reach either or .
Proposition 3.5.5 The expected value of T is
E[T] =
e
2
+e
2

(e
2
e
2
)
.
Proof: Using that W
t
is a martingale, with E[W
t
] = E[W
0
] = 0, applying the Optional
Stopping Theorem, Theorem 3.2.1, yields
0 = E[W
T
] = E[X
T
T] = E[X
T
] E[T].
Then by Exercise 3.5.4(a) we get
E[T] =
E[X
T
]

=
e
2
+be
2

(e
2
e
2
)
.
Exercise 3.5.6 Take the limit 0 in the formula provided by Proposition 3.5.5 to
nd the expected time for a Brownian motion to hit either or .
Exercise 3.5.7 Find E[T
2
] and V ar(T).
Exercise 3.5.8 (Walds identities) Let T be a nite stopping time for the Brownian
motion W
t
. Show that
(a) E[W
T
] = 0;
(b) E[W
2
T
] = E[T].
The previous techniques can be also applied to right continuous martingales. Let
a > 0 and consider the hitting time of the Poisson process of the barrier a
= inft > 0; N
t
a.
Proposition 3.5.9 The expected waiting time for N
t
to reach the barrier a is E[] =
a

.
Proof: Since M
t
= N
t
t is a right continuous martingale, by the Optional Stopping
Theorem E[M

] = E[M
0
] = 0. Then E[N

] = 0 and hence E[] =


1

E[N

] =
a

.
74
3.6 The Inverse Laplace Transform Method
In this section we shall use the Optional Stopping Theorem in conjunction with the
inverse Laplace transform to obtain the probability density for hitting times.
A. The case of standard Brownian motion Let x > 0. The rst hitting time
= T
x
= inft > 0; W
t
= x is a stopping time. Since M
t
= e
cWt
1
2
c
2
t
, t 0, is a
martingale, with E[M
t
] = E[M
0
] = 1, by the Optional Stopping Theorem, see Theorem
3.2.1, we have
E[M

] = 1.
This can be written equivalently as E[e
cW
e

1
2
c
2

] = 1. Using W

= x, we get
E[e

1
2
c
2

] = e
cx
.
It is worth noting that c > 0. This is implied from the fact that e

1
2
c
2

< 1 and
, x > 0.
Substituting s =
1
2
c
2
, the previous relation becomes
E[e
s
] = e

2sx
. (3.6.8)
This relation has a couple of useful applications.
Proposition 3.6.1 The moments of the rst hitting time are all innite E[
n
] = ,
n 1.
Proof: The nth moment of can be obtained by dierentiating and taking s = 0
d
n
ds
n
E[e
s
]

s=0
= E[()
n
e
s
]

s=0
= (1)
n
E[
n
].
Using (3.6.8) yields
E[
n
] = (1)
n
d
n
ds
n
e

2sx

s=0
.
Since by induction we have
d
n
ds
n
e

2sx
= (1)
n
e

2sx
n1

k=0
M
k
2
r
k
/2
x
nk
s
(n+k)/2
,
with M
k
, r
k
positive integers, it easily follows that E[
n
] = .
For instance, in the case n = 1 we have
E[] =
d
ds
e

2sx

s=0
= lim
s0
+
e

2sx
x
2

2sx
= +.
Another application involves the inverse Laplace transform to get the probability
density. This way we can retrieve the result of Proposition 3.3.4.
75
Proposition 3.6.2 The probability density of the hitting time is given by
p(t) =
[x[

2t
3
e

x
2
2t
, t > 0. (3.6.9)
Proof: Let x > 0. The expectation
E[e
s
] =
_

0
e
s
p() d = /p()(s)
is the Laplace transform of p(). Applying the inverse Laplace transform yields
p() = /
1
E[e
s
]() = /
1
e

2sx
()
=
x

2
3
e

x
2
2
, > 0.
In the case x < 0 we obtain
p() =
x

2
3
e

x
2
2
, > 0,
which leads to (3.6.9).
The computation on the inverse Laplace transform /
1
e

2sx
() is beyound the
goal of this book. The reader can obtain the value of this inverse Laplace transform
using the Mathematica software. However, the more mathematical interested reader
is reered to consult the method of complex integration in a book on inverse Laplace
transforms.
Another application of formula (3.6.8) is the following inequality.
Proposition 3.6.3 (Cherno bound) Let denote the rst hitting time when the
Brownian motion W
t
hits the barrier x, x > 0. Then
P( ) e

x
2
2
, > 0.
Proof: Let s = t in the part 2 of Theorem 1.12.10 and use (3.6.8) to get
P( )
E[e
tX
]
e
t
=
E[e
sX
]
e
s
= e
sx

2s
, s > 0.
Then P( ) e
min
s>0
f(s)
, where f(s) = s x

2s. Since f

(s) =
x

2s
, then
f(s) reaches its minimum at the critical point s
0
=
x
2
2
2
. The minimum value is
min
s>0
f(s) = f(s
0
) =
x
2
2
.
Substituting in the previous inequality leads to the required result.
76
B. The case of Brownian motion with drift
Consider the Brownian motion with drift X
t
= t +W
t
, with , > 0. Let
= inft > 0; X
t
= x
denote the rst hitting time of the barrier x, with x > 0. We shall compute the
distribution of the random variable and its rst two moments.
Applying the Optional Stopping Theorem (Theorem 3.2.1) to the martingale M
t
=
e
cWt
1
2
c
2
t
yields
E[M

] = E[M
0
] = 1.
Using that W

=
1

(X

) and X

= x, the previous relation becomes


E[e
(
c

+
1
2
c
2
)
] = e

x
. (3.6.10)
Substituting s =
c

+
1
2
c
2
and completing to a square yields
2s +

2

2
=
_
c +

_
2
.
Solving for c we get the solutions
c =

+
_
2s +

2

2
, c =


_
2s +

2

2
.
Assume c < 0. Then substituting the second solution into (3.6.10) yields
E[e
s
] = e
1

2
(+

2s
2
+
2
)x
.
This relation is contradictory since e
s
< 1 while e
1

2
(+

2s+
2
)x
> 1, where we used
that s, x, > 0. Hence it follows that c > 0. Substituting the rst solution into (3.6.10)
leads to
E[e
s
] = e
1

2
(

2s
2
+
2
)x
.
We arrived at the following result:
Proposition 3.6.4 Assume , x > 0. Let be the time the process X
t
= t + W
t
hits x for the rst time. Then we have
E[e
s
] = e
1

2
(

2s
2
+
2
)x
, s > 0. (3.6.11)
Proposition 3.6.5 Let be the time the process X
t
= t + W
t
hits x, with x > 0
and > 0.
(a) Then the density function of is given by
p() =
x

2
3/2
e

(x)
2
2
2
, > 0. (3.6.12)
77
(b) The mean and variance of are
E[] =
x

, V ar() =
x
2

3
.
Proof: (a) Let p() be the density function of . Since
E[e
s
] =
_

0
e
s
p() d = /p()(s)
is the Laplace transform of p(), applying the inverse Laplace transform yields
p() = /
1
E[e
s
] = /
1
e
1

2
(

2s
2
+
2
)x

=
x

2
3/2
e

(x)
2
2
2
, > 0.
It is worth noting that the computation of the previous inverse Laplace transform is
non-elementary; however, it can be easily computed using the Mathematica software.
(b) The moments are obtained by dierentiating the moment generating function
and taking the value at s = 0
E[] =
d
ds
E[e
s
]

s=0
=
d
ds
e
1

2
(

2s
2
+
2
)x

s=0
=
x
_

2
+ 2s
e
1

2
(

2s
2
+
2
)x

s=0
=
x

.
E[
2
] = (1)
2
d
2
ds
2
E[e
s
]

s=0
=
d
2
ds
2
e
1

2
(

2s
2
+
2
)x

s=0
=
x(
2
+x
_

2
+ 2s
2
)
(
2
+ 2s
2
)
3/2
e
1

2
(

2s
2
+
2
)x

s=0
=
x
2

3
+
x
2

2
.
Hence
V ar() = E[
2
] E[]
2
=
x
2

3
.
It is worth noting that we can arrive at the formula E[] =
x

in the following heuristic


way. Taking the expectation in the equation + W

= x yields E[] = x, where


we used that E[W

] = 0 for any nite stopping time (see Exercise 3.5.8 (a)). Solving
for E[] yields the aforementioned formula.
Even if the computations are more or less similar with the previous result, we shall
treat next the case of the negative barrier in its full length. This is because of its
particular importance in being useful in pricing perpetual American puts.
78
Proposition 3.6.6 Assume , x > 0. Let be the time the process X
t
= t + W
t
hits x for the rst time.
(a) We have
E[e
s
] = e
1

2
(+

2s
2
+
2
)x
, s > 0. (3.6.13)
(b) Then the density function of is given by
p() =
x

2
3/2
e

(x+)
2
2
2
, > 0. (3.6.14)
(c) The mean of is
E[] =
x

2x

2
.
Proof: (a) Consider the stopping time = inft > 0; X
t
= x. By the Optional
Stopping Theorem (Theorem 3.2.1) applied to the martingale M
t
= e
cWt
c
2
2
t
yields
1 = M
0
= E[M

] = E
_
e
cW
c
2
2

= E
_
e
c

(X )
c
2
2

= E
_
e

x
c


c
2
2

= e

x
E
_
e
(
c

+
c
2
2
)

.
Therefore
E
_
e
(
c

+
c
2
2
)

= e
c

x
. (3.6.15)
If let s =
c

+
c
2
2
, then solving for c yields c =


_
2s +

2

2
, but only the negative
solution works out; this comes from the fact that both terms of the equation (3.6.15)
have to be less than 1. Hence (3.6.15) becomes
E[e
s
] = e
1

2
(+

2s
2
+
2
)x
, s > 0.
(b) Relation (3.6.13) can be written equivalently as
/
_
p()
_
= e
1

2
(+

2s
2
+
2
)x
.
Taking the inverse Laplace transform, and using Mathematica software to compute it,
we obtain
p() = /
1
_
e
1

2
(+

2s
2
+
2
)x
_
() =
x

2
3/2
e

(x+)
2
2
2
, > 0.
(c) Dierentiating and evaluating at s = 0 we obtain
E[] =
d
ds
E[e
s
]

s=0
= e

2
x
x

2
_

2
=
x

2x

2
.
79
Exercise 3.6.7 Assume the hypothesis of Proposition 3.6.6 satised. Find V ar().
Exercise 3.6.8 Find the modes of distributions (3.6.12) and (3.6.14). What do you
notice?
Exercise 3.6.9 Let X
t
= 2t + 3W
t
and Y
t
= 6t +W
t
.
(a) Show that the expected times for X
t
and Y
t
to reach any barrier x > 0 are the
same.
(b) If X
t
and Y
t
model the prices of two stocks, which one would you like to own?
Exercise 3.6.10 Does 4t + 2W
t
hit 9 faster (in expectation) than 5t + 3W
t
hits 14?
Exercise 3.6.11 Let be the rst time the Brownian motion with drift X
t
= t +W
t
hits x, where , x > 0. Prove the inequality
P( ) e

x
2
+
2

2
2
+x
, > 0.
C. The double barrier case
In the following we shall consider the case of double barrier. Consider the Brownian
motion with drift X
t
= t +W
t
, > 0. Let , > 0 and dene the stopping time
T = inft > 0; X
t
= or X
t
= .
Relation (3.5.6) states
E[e
cX
T
e
(c+
1
2
c
2
)T
] = 1.
Since the random variables T and X
T
are independent (why?), we have
E[e
cX
T
]E[e
(c+
1
2
c
2
)T
] = 1.
Using E[e
cX
T
] = e
c
p

+e
c
(1 p

), with p

given by (3.5.7), then


E[e
(c+
1
2
c
2
)T
] =
1
e
c
p

+e
c
(1 p

If substitute s = c +
1
2
c
2
, then
E[e
sT
] =
1
e
(+

2s+
2
)
p

+e
(+

2s+
2
)
(1 p

)
(3.6.16)
The probability density of the stopping time T is obtained by taking the inverse Laplace
transform of the right side expression
p(T) = /
1
_
1
e
(+

2s+
2
)
p

+e
(+

2s+
2
)
(1 p

)
_
(),
an expression which is not feasible for having closed form solution. However, expression
(3.6.16) would be useful for computing the price for double barrier derivatives.
80
Exercise 3.6.12 Use formula (3.6.16) to nd the expectation E[T].
Exercise 3.6.13 Denote by M
t
= N
t
t the compensated Poisson process and let
c > 0 be a constant.
(a) Show that
X
t
= e
cMtt(e
c
c1)
is an T
t
-martingale, with T
t
= (N
u
; u t).
(b) Let a > 0 and T = inft > 0; M
t
> a be the rst hitting time of the level a. Use
the Optional Stopping Theorem to show that
E[e
sT
] = e
(s)a
, s > 0,
where : [0, ) [0, ) is the inverse function of f(x) = e
x
x 1.
(c) Show that E[T] = .
(d) Can you use the inverse Laplace transform to nd the probability density function
of T?
3.7 Limits of Stochastic Processes
Let (X
t
)
t0
be a stochastic process. We can make sense of the limit expression X =
lim
t
X
t
, in a similar way as we did in section 1.13 for sequences of random variables.
We shall rewrite the denitions for the continuous case.
Almost Certain Limit
The process X
t
converges almost certainly to X, if for all states of the world , except
a set of probability zero, we have
lim
t
X
t
() = X().
We shall write ac-lim
t
X
t
= X. It is also sometimes called strong convergence.
Mean Square Limit
We say that the process X
t
converges to X in the mean square if
lim
t
E[(X
t
X)
2
] = 0.
In this case we write ms-lim
t
X
t
= X.
81
Limit in Probability or Stochastic Limit
The stochastic process X
t
converges in stochastic limit to X if
lim
t
P
_
; [X
t
() X()[ >
_
= 0.
This limit is abbreviated by st-lim
t
X
t
= X.
It is worth noting that, like in the case of sequences of random variables, both almost
certain convergence and convergence in mean square imply the stochastic convergence,
which implies the limit in distribution.
Limit in Distribution
We say that X
t
converges in distribution to X if for any continuous bounded function
(x) we have
lim
t
(X
t
) = (X).
It is worth noting that the stochastic convergence implies the convergence in distribu-
tion.
3.8 Convergence Theorems
The following property is a reformulation of Exercise 1.13.1 in the continuous setup.
Proposition 3.8.1 Consider a stochastic process X
t
such that E[X
t
] k, a constant,
and V ar(X
t
) 0 as t . Then ms-lim
t
X
t
= k.
It is worthy to note that the previous statement holds true if the limit to innity is
replaced by a limit to any other number.
Next we shall provide a few applications.
Application 3.8.2 If > 1/2, then
ms-lim
t
W
t
t

= 0.
Proof: Let X
t
=
W
t
t

. Then E[X
t
] =
E[W
t
]
t

= 0, and V ar[X
t
] =
1
t
2
V ar[W
t
] =
t
t
2
=
1
t
21
, for any t > 0. Since
1
t
21
0 as t , applying Proposition 3.8.1 yields
ms-lim
t
W
t
t

= 0.
Corollary 3.8.3 ms-lim
t
W
t
t
= 0.
82
Application 3.8.4 Let Z
t
=
_
t
0
W
s
ds. If > 3/2, then
ms-lim
t
Z
t
t

= 0.
Proof: Let X
t
=
Z
t
t

. Then E[X
t
] =
E[Z
t
]
t

= 0, and V ar[X
t
] =
1
t
2
V ar[Z
t
] =
t
3
3t
2
=
1
3t
23
, for any t > 0. Since
1
3t
23
0 as t , applying Proposition 3.8.1 leads to
the desired result.
Application 3.8.5 For any p > 0, c 1 we have
ms-lim
t
e
Wtct
t
p
= 0.
Proof: Consider the process X
t
=
e
Wtct
t
p
=
e
Wt
t
p
e
ct
. Since
E[X
t
] =
E[e
Wt
]
t
p
e
ct
=
e
t/2
t
p
e
ct
=
1
e
(c
1
2
)t
1
t
p
0, as t
V ar[X
t
] =
V ar[e
Wt
]
t
2p
e
2ct
=
e
2t
e
t
t
2p
e
2ct
=
1
t
2p
_
1
e
2t(c1)

1
e
t(2c1)
_
0,
as t , Proposition 3.8.1 leads to the desired result.
Application 3.8.6 Show that
ms-lim
t
max
0st
W
s
t
= 0.
Proof: Let X
t
=
max
0st
W
s
t
. Since by Exercise 3.3.10
E[ max
0st
W
s
] = 0
V ar( max
0st
W
s
) = 2

t,
then
E[X
t
] = 0
V ar[X
t
] =
2

t
t
2
0, t .
Apply Proposition 3.8.1 to get the desired result.
83
Remark 3.8.7 One of the strongest result regarding limits of Brownian motions is
called the law of iterated logarithms and was rst proved by Lamperti:
lim
t
sup
W
t
_
2t ln(ln t)
= 1,
almost certainly.
Exercise 3.8.8 Use the law of iterated logarithms to show
lim
t0
+
W
t
_
2t ln ln(1/t)
= 1.
Application 3.8.9 We shall show that ac-lim
t
W
t
t
= 0.
From the law of iterated logarithms
W
t
_
2t ln(ln t)
< 1 for t large. Then
W
t
t
=
W
t
_
2t ln(ln t)
_
2 ln(ln t)

t
<
_
2 ln(ln t)

t
.
Let
t
=
_
2 ln(ln t)

t
. Then
W
t
t
<
t
for t large. As an application of the lHospital
rule, it is not hard to see that
t
satises the following limits

t
0, t

t , t .
In order to show that ac-lim
t
W
t
t
= 0, it suces to prove
P
_
;

W
t
()
t

<
t
_
1, t . (3.8.17)
We have
P
_
;

W
t
()
t

<
t
_
= P
_
; t
t
< W
t
() < t
t
_
=
_
tt
tt
1

2t
e

u
2
2t
du
=
_
t

t
t

t
1

2
e

v
2
2
dv
_

2
e

v
2
2
dv = 1, t ,
where we used
t

t , as t , which proves (3.8.17).


Proposition 3.8.10 Let X
t
be a stochastic process. Then
ms-lim
t
X
t
= 0 ms-lim
t
X
2
t
= 0.
84
Proof: Left as an exercise.
Exercise 3.8.11 Let X
t
be a stochastic process. Show that
ms-lim
t
X
t
= 0 ms-lim
t
[X
t
[ = 0.
Another convergence result can be obtained if we consider the continuous analog of
Example 1.13.6:
Proposition 3.8.12 Let X
t
be a stochastic process such that there is a p > 0 such that
E[[X
t
[
p
] 0 as t . Then st-lim
t
X
t
= 0.
Application 3.8.13 We shall show that for any > 1/2
st-lim
t
W
t
t

= 0.
Proof: Consider the process X
t
=
W
t
t

. By Proposition 3.8.10 it suces to show


ms-lim
t
X
2
t
= 0. Since the mean square convergence implies the stochastic convergence,
we get st-lim
t
X
2
t
= 0. Since
E[[X
t
[
2
] = E[X
2
t
] = E
_
W
2
t
t
2
_
=
E[W
2
t
]
t
2
=
t
t
2
=
1
t
21
0, t ,
then Proposition 3.8.12 yields st-lim
t
X
t
= 0.
The following result can be regarded as the LHospitals rule for sequences:
Lemma 3.8.14 (Cesaro-Stoltz) Let x
n
and y
n
be two sequences of real numbers,
n 1. If the limit lim
n
x
n+1
x
n
y
n+1
y
n
exists and is equal to L, then the following limit
exists
lim
n
x
n
y
n
= L.
Proof: (Sketch) Assume there are dierentiable functions f and g such that f(n) = x
n
and g(n) = y
n
. (How do you construct these functions?) From Cauchys theorem
2
there is a c
n
(n, n + 1) such that
L = lim
n
x
n+1
x
n
y
n+1
y
n
= lim
n
f(n + 1) f(n)
g(n + 1) g(n)
= lim
n
f

(c
n
)
g

(c
n
)
.
2
This says that if f and g are dierentiable on (a, b) and continuous on [a, b], then there is a c (a, b)
such that
f(a) f(b)
g(a) g(b)
=
f

(c)
g

(c)
.
85
Since c
n
as n , we can also write the aforementioned limit as
lim
t
f

(t)
g

(t)
= L.
(Here one may argue against this, but we recall the freedom of choice for the functions
f and g such that c
n
can be any number between n and n+1). By lHospitals rule we
get
lim
t
f(t)
g(t)
= L.
Making t = n yields lim
t
x
n
y
n
= L.
The next application states that if a sequence is convergent, then the arithmetic
average of its terms is also convergent, and the sequences have the same limit.
Example 3.8.1 Let a
n
be a convergent sequence with lim
n
a
n
= L. Let
A
n
=
a
1
+a
2
+ +a
n
n
be the arithmetic average of the rst n terms. Then A
n
is convergent and
lim
n
A
n
= L.
Proof: This is an application of Cesaro-Stoltz lemma. Consider the sequences x
n
=
a
1
+a
2
+ +a
n
and y
n
= n. Since
x
n+1
x
n
y
n+1
y
n
=
(a
1
+ +a
n+1
) (a
1
+ +a
n
)
(n + 1) n
=
a
n+1
1
,
then
lim
n
x
n+1
x
n
y
n+1
y
n
= lim
n
a
n+1
= L.
Applying the Cesaro-Stoltz lemma yields
lim
n
A
n
= lim
n
x
n
y
n
= L.
Exercise 3.8.15 Let b
n
be a convergent sequence with lim
n
b
n
= L. Let
G
n
= (b
1
b
2
b
n
)
1/n
be the geometric average of the rst n terms. Show that G
n
is convergent and
lim
n
G
n
= L.
86
The following result extends the Cesaro-Stoltz lemma to sequences of random vari-
ables.
Proposition 3.8.16 Let X
n
be a sequence of random variables on the probability space
(, T, P), such that
ac- lim
n
X
n+1
X
n
Y
n+1
Y
n
= L.
Then
ac- lim
n
X
n
Y
n
= L.
Proof: Consider the sets
A = ; lim
n
X
n+1
() X
n
()
Y
n+1
() Y
n
()
= L
B = ; lim
n
X
n
()
Y
n
()
= L.
Since for any given state of the world , the sequences x
n
= X
n
() and y
n
= Y
n
()
are numerical sequences, Lemma 3.8.14 yields the inclusion A B. This implies
P(A) P(B). Since P(A) = 1, it follows that P(B) = 1, which leads to the desired
conclusion.
Example 3.8.17 Let S
n
denote the price of a stock on day n, and assume that
ac-lim
n
S
n
= L.
Then
ac-lim
n
S
1
+ +S
n
n
= L and ac-lim
n
(S
1
S
n
)
1/n
= L.
This says that if almost all future simulations of the stock price approach the steady
state limit L, the arithmetic and geometric averages converge to the same limit. The
statement is a consequence of Proposition 3.8.16 and follows a similar proof as Example
3.8.1. Asian options have payos depending on these type of averages, as we shall see
in Part II.
3.9 The Martingale Convergence Theorem
We state now, without proof, a result which is a powerful way of proving the almost
certain convergence. We start with the discrete version:
Theorem 3.9.1 Let X
n
be a martingale with bounded means
M > 0 such that E[[X
n
[] M, n 1. (3.9.18)
Then there is L < such that ac- lim
n
X
n
= L, i.e.
P
_
; lim
n
X
n
() = L
_
= 1.
87
Since E[[X
n
[]
2
E[X
2
n
], the condition (3.9.18) can be replaced by its stronger version
M > 0 such that E[X
2
n
] M, n 1.
The following result deals with the continuous version of the Martingale Conver-
gence Theorem. Denote the innite knowledge by T

=
_

t
T
t
_
.
Theorem 3.9.2 Let X
t
be an T
t
-martingale such that
M > 0 such that E[[X
t
[] < M, t > 0.
Then there is an T

-measurable random variable X

such that X
t
X

a.c. as
t .
The next exrcise deals with a process that is a.c-convergent but is not ms-convergent.
Exercise 3.9.3 It is known that X
t
= e
Wtt/2
is a martingale. Since
E[[X
t
[] = E[e
Wtt/2
] = e
t/2
E[e
Wt
] = e
t/2
e
t/2
= 1,
by the Martingale Convergence Theorem there is a number L such that X
t
L a.c. as
t .
(a) What is the limit L? How did you make your guess?
(b) Show that
E[[X
t
1[
2
] = V ar(X
t
) +
_
E(X
t
) 1
_
2
.
(c) Show that X
t
does not converge in the mean square to 1.
(d) Prove that the sequence X
t
is a.c-convergent but it is not ms-convergent.
3.10 The Squeeze Theorem
The following result is the analog of the Squeeze Theorem from usual Calculus.
Theorem 3.10.1 Let X
n
, Y
n
, Z
n
be sequences of random variables on the probability
space (, T, P) such that
X
n
Y
n
Z
n
a.s. n 1.
If X
n
and Z
n
converge to L as n almost certainly (or in mean square, or stochastic
or in distribution), then Y
n
converges to L in a similar mode.
Proof: For any state of the world consider the sequences x
n
= X
n
(), y
n
=
Y
n
() and z
n
= Z
n
() and apply the usual Squeeze Theorem to them.
88
Remark 3.10.2 The previous theorem remains valid if n is replaced by a continuous
positive parameter t.
Example 3.10.1 Show that ac-lim
t
W
t
sin(W
t
)
t
= 0.
Proof: Consider the sequences X
t
= 0, Y
t
=
W
t
sin(W
t
)
t
and Z
t
=
W
t
t
. From
Application 3.8.9 we have ac-lim
t
Z
t
= 0. Applying the Squeeze Theorem we obtain
the desired result.
Exercise 3.10.3 Use the Squeeze Theorem to nd the following limits:
(a) ac-lim
t
sin(W
t
)
t
;
(b) ac-lim
t0
t cos W
t
;
(c) ac- lim
t
e
t
(sin W
t
)
2
.
3.11 Quadratic Variations
For some stochastic processes the sum of squares of consecutive increments tends in
mean square to a nite number, as the norm of the partition decreases to zero. We
shall encounter in the following a few important examples that will be useful when
dealing with stochastic integrals.
3.11.1 The Quadratic Variation of W
t
The next result states that the quadratic variation of the Brownian motion W
t
on the
interval [0, T] is T. More precisely, we have:
Proposition 3.11.1 Let T > 0 and consider the equidistant partition 0 = t
0
< t
1
<
t
n1
< t
n
= T. Then
ms- lim
n
n1

i=0
(W
t
i+1
W
t
i
)
2
= T. (3.11.19)
Proof: Consider the random variable
X
n
=
n1

i=0
(W
t
i+1
W
t
i
)
2
.
89
Since the increments of a Brownian motion are independent, Proposition 4.2.1 yields
E[X
n
] =
n1

i=0
E[(W
t
i+1
W
t
i
)
2
] =
n1

i=0
(t
i+1
t
i
)
= t
n
t
0
= T;
V ar(X
n
) =
n1

i=0
V ar[(W
t
i+1
W
t
i
)
2
] =
n1

i=0
2(t
i+1
t
i
)
2
= n 2
_
T
n
_
2
=
2T
2
n
,
where we used that the partition is equidistant. Since X
n
satises the conditions
E[X
n
] = T, n 1;
V ar[X
n
] 0, n ,
by Proposition 3.8.1 we obtain ms-lim
n
X
n
= T, or
ms- lim
n
n1

i=0
(W
t
i+1
W
t
i
)
2
= T. (3.11.20)
Exercise 3.11.2 Prove that the quadratic variation of the Brownian motion W
t
on
[a, b] is equal to b a.
The Fundamental Relation dW
2
t
= dt
The relation discussed in this section can be regarded as the fundamental relation of
Stochastic Calculus. We shall start by recalling relation (3.11.20)
ms- lim
n
n1

i=0
(W
t
i+1
W
t
i
)
2
= T. (3.11.21)
The right side can be regarded as a regular Riemann integral
T =
_
T
0
dt,
while the left side can be regarded as a stochastic integral with respect to dW
2
t
_
T
0
(dW
t
)
2
= ms- lim
n
n1

i=0
(W
t
i+1
W
t
i
)
2
.
90
Substituting into (3.11.21) yields
_
T
0
(dW
t
)
2
=
_
T
0
dt, T > 0.
The dierential form of this integral equation is
dW
2
t
= dt.
In fact, this expression also holds in the mean square sense, as it can be inferred from
the next exercise.
Exercise 3.11.3 Show that
(a) E[dW
2
t
dt] = 0;
(b) V ar(dW
2
t
dt) = o(dt);
(c) ms-lim
dt0
(dW
2
t
dt) = 0.
Roughly speaking, the process dW
2
t
, which is the square of innitesimal increments
of a Brownian motion, is totally predictable. This relation plays a central role in
Stochastic Calculus and will be useful when dealing with Itos lemma.
The following exercise states that dtdW
t
= 0, which is another important stochastic
relation useful in Itos lemma.
Exercise 3.11.4 Consider the equidistant partition 0 = t
0
< t
1
< t
n1
< t
n
= T.
Then
ms- lim
n
n1

i=0
(W
t
i+1
W
t
i
)(t
i+1
t
i
) = 0. (3.11.22)
3.11.2 The Quadratic Variation of N
t
t
The following result deals with the quadratic variation of the compensated Poisson
process M
t
= N
t
t.
Proposition 3.11.5 Let a < b and consider the partition a = t
0
< t
1
< < t
n1
<
t
n
= b. Then
ms lim
n0
n1

k=0
(M
t
k+1
M
t
k
)
2
= N
b
N
a
, (3.11.23)
where |
n
| := sup
0kn1
(t
k+1
t
k
).
91
Proof: For the sake of simplicity we shall use the following notations:
t
k
= t
k+1
t
k
, M
k
= M
t
k+1
M
t
k
, N
k
= N
t
k+1
N
t
k
.
The relation we need to prove can also be written as
ms-lim
n
n1

k=0
_
(M
k
)
2
N
k

= 0.
Let
Y
k
= (M
k
)
2
N
k
= (M
k
)
2
M
k
t
k
.
It suces to show that
E
_
n1

k=0
Y
k
_
= 0, (3.11.24)
lim
n
V ar
_
n1

k=0
Y
k
_
= 0. (3.11.25)
The rst identity follows from the properties of Poisson processes (see Exercise 2.8.9)
E
_
n1

k=0
Y
k
_
=
n1

k=0
E[Y
k
] =
n1

k=0
E[(M
k
)
2
] E[N
k
]
=
n1

k=0
(t
k
t
k
) = 0.
For the proof of the identity (3.11.25) we need to nd rst the variance of Y
k
.
V ar[Y
k
] = V ar[(M
k
)
2
(M
k
+t
k
)] = V ar[(M
k
)
2
M
k
]
= V ar[(M
k
)
2
] +V ar[M
k
] 2Cov[M
2
k
, M
k
]
= t
k
+ 2
2
t
2
k
+t
k
2
_
E[(M
k
)
3
] E[(M
k
)
2
]E[M
k
]
_
= 2
2
(t
k
)
2
,
where we used Exercise 2.8.9 and the fact that E[M
k
] = 0. Since M
t
is a process
with independent increments, then Cov[Y
k
, Y
j
] = 0 for i ,= j. Then
V ar
_
n1

k=0
Y
k
_
=
n1

k=0
V ar[Y
k
] + 2

k=j
Cov[Y
k
, Y
j
] =
n1

k=0
V ar[Y
k
]
= 2
2
n1

k=0
(t
k
)
2
2
2
|
n
|
n1

k=0
t
k
= 2
2
(b a)|
n
|,
92
and hence V ar
_
n1

k=0
Y
n
_
0 as |
n
| 0. According to the Example 1.13.1, we obtain
the desired limit in mean square.
The previous result states that the quadratic variation of the martingale M
t
between
a and b is equal to the jump of the Poisson process between a and b.
The Fundamental Relation dM
2
t
= dN
t
Recall relation (3.11.23)
ms- lim
n
n1

k=0
(M
t
k+1
M
t
k
)
2
= N
b
N
a
. (3.11.26)
The right side can be regarded as a Riemann-Stieltjes integral
N
b
N
a
=
_
b
a
dN
t
,
while the left side can be regarded as a stochastic integral with respect to (dM
t
)
2
_
b
a
(dM
t
)
2
:= ms- lim
n
n1

k=0
(M
t
k+1
M
t
k
)
2
.
Substituting in (3.11.26) yields
_
b
a
(dM
t
)
2
=
_
b
a
dN
t
,
for any a < b. The equivalent dierential form is
(dM
t
)
2
= dN
t
. (3.11.27)
The Relations dt dM
t
= 0, dW
t
dM
t
= 0
In order to show that dt dM
t
= 0 in the mean square sense, we need to prove the limit
ms- lim
n
n1

k=0
(t
k+1
t
k
)(M
t
k+1
M
t
k
) = 0. (3.11.28)
This can be thought as a vanishing integral of the increment process dM
t
with respect
to dt
_
b
a
dM
t
dt = 0, a, b R.
93
Denote
X
n
=
n1

k=0
(t
k+1
t
k
)(M
t
k+1
M
t
k
) =
n1

k=0
t
k
M
k
.
In order to show (3.11.28) it suces to prove that
1. E[X
n
] = 0;
2. lim
n
V ar[X
n
] = 0.
Using the additivity of the expectation and Exercise 2.8.9, (ii)
E[X
n
] = E
_
n1

k=0
t
k
M
k
_
=
n1

k=0
t
k
E[M
k
] = 0.
Since the Poisson process N
t
has independent increments, the same property holds for
the compensated Poisson process M
t
. Then t
k
M
k
and t
j
M
j
are independent
for k ,= j, and using the properties of variance we have
V ar[X
n
] = V ar
_
n1

k=0
t
k
M
k
_
=
n1

k=0
(t
k
)
2
V ar[M
k
] =
n1

k=0
(t
k
)
3
,
where we used
V ar[M
k
] = E[(M
k
)
2
] (E[M
k
])
2
= t
k
,
see Exercise 2.8.9 (ii). If we let |
n
| = max
k
t
k
, then
V ar[X
n
] =
n1

k=0
(t
k
)
3
|
n
|
2
n1

k=0
t
k
= (b a)|
n
|
2
0
as n . Hence we proved the stochastic dierential relation
dt dM
t
= 0. (3.11.29)
For showing the relation dW
t
dM
t
= 0, we need to prove
ms- lim
n
Y
n
= 0, (3.11.30)
where we have denoted
Y
n
=
n1

k=0
(W
k+1
W
k
)(M
t
k+1
M
t
k
) =
n1

k=0
W
k
M
k
.
94
Since the Brownian motion W
t
and the process M
t
have independent increments and
W
k
is independent of M
k
, we have
E[Y
n
] =
n1

k=0
E[W
k
M
k
] =
n1

k=0
E[W
k
]E[M
k
] = 0,
where we used E[W
k
] = E[M
k
] = 0. Using also E[(W
k
)
2
] = t
k
, E[(M
k
)
2
] =
t
k
, and invoking the independence of W
k
and M
k
, we get
V ar[W
k
M
k
] = E[(W
k
)
2
(M
k
)
2
] (E[W
k
M
k
])
2
= E[(W
k
)
2
]E[(M
k
)
2
] E[W
k
]
2
E[M
k
]
2
= (t
k
)
2
.
Then using the independence of the terms in the sum, we get
V ar[Y
n
] =
n1

k=0
V ar[W
k
M
k
] =
n1

k=0
(t
k
)
2
|
n
|
n1

k=0
t
k
= (b a)|
n
| 0,
as n . Since Y
n
is a random variable with mean zero and variance decreasing to
zero, it follows that Y
n
0 in the mean square sense. Hence we proved that
dW
t
dM
t
= 0. (3.11.31)
Exercise 3.11.6 Show the following stochastic dierential relations:
(a) dt dN
t
= 0; (b) dW
t
dN
t
= 0; (c) dt dW
t
= 0;
(d) (dN
t
)
2
= dN
t
; (e) (dM
t
)
2
= dN
t
; (f) (dM
t
)
4
= dN
t
.
The relations proved in this section will be useful in the Part II when developing
the stochastic model of a stock price that exhibits jumps modeled by a Poisson process.
Chapter 4
Stochastic Integration
This chapter deals with one of the most useful stochastic integrals, called the Ito inte-
gral. This type of integral was introduced in 1944 by the Japanese mathematician K.
Ito, and was originally motivated by a construction of diusion processes.
4.1 Nonanticipating Processes
Consider the Brownian motion W
t
. A process F
t
is called a nonanticipating process
if F
t
is independent of any future increment W
t
W
t
for any t and t

with t < t

.
Consequently, the process F
t
is independent of the behavior of the Brownian motion
in the future, i.e. it cannot anticipate the future. For instance, W
t
, e
Wt
, W
2
t
W
t
+t
are examples of nonanticipating processes, while W
t+1
or
1
2
(W
t+1
W
t
)
2
are not.
Nonanticipating processes are important because the Ito integral concept applies
only to them.
If T
t
denotes the information known until time t, where this information is generated
by the Brownian motion W
s
; s t, then any T
t
-adapted process F
t
is nonanticipat-
ing.
4.2 Increments of Brownian Motions
In this section we shall discuss a few basic properties of the increments of a Brownian
motion, which will be useful when computing stochastic integrals.
Proposition 4.2.1 Let W
t
be a Brownian motion. If s < t, we have
1. E[(W
t
W
s
)
2
] = t s.
2. V ar[(W
t
W
s
)
2
] = 2(t s)
2
.
Proof: 1. Using that W
t
W
s
N(0, t s), we have
E[(W
t
W
s
)
2
] = E[(W
t
W
s
)
2
] (E[W
t
W
s
])
2
= V ar(W
t
W
s
) = t s.
95
96
2. Dividing by the standard deviation yields the standard normal random variable
W
t
W
s

t s
N(0, 1). Its square,
(W
t
W
s
)
2
t s
is
2
-distributed with 1 degree of free-
dom.
1
Its mean is 1 and its variance is 2. This implies
E
_
(W
t
W
s
)
2
t s
_
= 1 = E[(W
t
W
s
)
2
] = t s;
V ar
_
(W
t
W
s
)
2
t s
_
= 2 = V ar[(W
t
W
s
)
2
] = 2(t s)
2
.
Remark 4.2.2 The innitesimal version of the previous result is obtained by replacing
t s with dt
1. E[dW
2
t
] = dt;
2. V ar[dW
2
t
] = 2dt
2
.
We shall see in an upcoming section that in fact dW
2
t
and dt are equal in a mean square
sense.
Exercise 4.2.3 Show that
(a) E[(W
t
W
s
)
4
] = 3(t s)
2
;
(b) E[(W
t
W
s
)
6
] = 15(t s)
3
.
4.3 The Ito Integral
The Ito integral is dened in a way that is similar to the Riemann integral. The
Ito integral is taken with respect to innitesimal increments of a Brownian motion,
dW
t
, which are random variables, while the Riemann integral considers integration
with respect to the predictable innitesimal changes dt. It is worth noting that the Ito
integral is a random variable, while the Riemann integral is just a real number. Despite
this fact, there are several common properties and relations between these two types
of integrals.
Consider 0 a < b and let F
t
= f(W
t
, t) be a nonanticipating process with
E
_
_
b
a
F
2
t
dt
_
< . (4.3.1)
The role of the previous condition will be made more clear when we shall discuss
about the martingale property of a the Ito integral. Divide the interval [a, b] into n
subintervals using the partition points
a = t
0
< t
1
< < t
n1
< t
n
= b,
1
A
2
-distributed random variable with n degrees of freedom has mean n and variance 2n.
97
and consider the partial sums
S
n
=
n1

i=0
F
t
i
(W
t
i+1
W
t
i
).
We emphasize that the intermediate points are the left endpoints of each interval, and
this is the way they should be always chosen. Since the process F
t
is nonanticipative,
the random variables F
t
i
and W
t
i+1
W
t
i
are independent; this is an important feature
in the denition of the Ito integral.
The Ito integral is the limit of the partial sums S
n
ms-lim
n
S
n
=
_
b
a
F
t
dW
t
,
provided the limit exists. It can be shown that the choice of partition does not inuence
the value of the Ito integral. This is the reason why, for practical purposes, it suces
to assume the intervals equidistant, i.e.
t
i+1
t
i
=
(b a)
n
, i = 0, 1, , n 1.
The previous convergence is in the mean square sense, i.e.
lim
n
E
__
S
n

_
b
a
F
t
dW
t
_
2
_
= 0.
Existence of the Ito integral
It is known that the Ito stochastic integral
_
b
a
F
t
dW
t
exists if the process F
t
= f(W
t
, t)
satises the following properties:
1. The paths t F
t
() are continuous on [a, b] for any state of the world ;
2. The process F
t
is nonanticipating for t [a, b];
3. E
_
_
b
a
F
2
t
dt
_
< .
For instance, the following stochastic integrals exist:
_
T
0
W
2
t
dW
t
,
_
T
0
sin(W
t
) dW
t
,
_
b
a
cos(W
t
)
t
dW
t
.
4.4 Examples of Ito integrals
As in the case of the Riemann integral, using the denition is not an ecient way of
computing integrals. The same philosophy applies to Ito integrals. We shall compute
in the following two simple Ito integrals. In later sections we shall introduce more
ecient methods for computing Ito integrals.
98
4.4.1 The case F
t
= c, constant
In this case the partial sums can be computed explicitly
S
n
=
n1

i=0
F
t
i
(W
t
i+1
W
t
i
) =
n1

i=0
c(W
t
i+1
W
t
i
)
= c(W
b
W
a
),
and since the answer does not depend on n, we have
_
b
a
c dW
t
= c(W
b
W
a
).
In particular, taking c = 1, a = 0, and b = T, since the Brownian motion starts at 0,
we have the following formula:
_
T
0
dW
t
= W
T
.
4.4.2 The case F
t
= W
t
We shall integrate the process W
t
between 0 and T. Considering an equidistant parti-
tion, we take t
k
=
kT
n
, k = 0, 1, , n 1. The partial sums are given by
S
n
=
n1

i=0
W
t
i
(W
t
i+1
W
t
i
).
Since
xy =
1
2
[(x +y)
2
x
2
y
2
],
letting x = W
t
i
and y = W
t
i+1
W
t
i
yields
W
t
i
(W
t
i+1
W
t
i
) =
1
2
W
2
t
i+1

1
2
W
2
t
i

1
2
(W
t
i+1
W
t
i
)
2
.
Then after pair cancelations the sum becomes
S
n
=
1
2
n1

i=0
W
2
t
i+1

1
2
n1

i=0
W
2
t
i

1
2
n1

i=0
(W
t
i+1
W
t
i
)
2
=
1
2
W
2
tn

1
2
n1

i=0
(W
t
i+1
W
t
i
)
2
.
Using t
n
= T, we get
S
n
=
1
2
W
2
T

1
2
n1

i=0
(W
t
i+1
W
t
i
)
2
.
99
Since the rst term on the right side is independent of n, using Proposition 3.11.1, we
have
ms- lim
n
S
n
=
1
2
W
2
T
ms- lim
n
1
2
n1

i=0
(W
t
i+1
W
t
i
)
2
(4.4.2)
=
1
2
W
2
T

1
2
T. (4.4.3)
We have now obtained the following explicit formula of a stochastic integral:
_
T
0
W
t
dW
t
=
1
2
W
2
T

1
2
T.
In a similar way one can obtain
_
b
a
W
t
dW
t
=
1
2
(W
2
b
W
2
a
)
1
2
(b a).
It is worth noting that the right side contains random variables depending on the limits
of integration a and b.
Exercise 4.4.1 Show the following identities:
(a) E[
_
T
0
dW
t
] = 0;
(b) E[
_
T
0
W
t
dW
t
] = 0;
(c) V ar[
_
T
0
W
t
dW
t
] =
T
2
2
.
4.5 Properties of the Ito Integral
We shall start with some properties which are similar with those of the Riemannian
integral.
Proposition 4.5.1 Let f(W
t
, t), g(W
t
, t) be nonanticipating processes and c R.
Then we have
1. Additivity:
_
T
0
[f(W
t
, t) +g(W
t
, t)] dW
t
=
_
T
0
f(W
t
, t) dW
t
+
_
T
0
g(W
t
, t) dW
t
.
2. Homogeneity:
_
T
0
cf(W
t
, t) dW
t
= c
_
T
0
f(W
t
, t) dW
t
.
3. Partition property:
_
T
0
f(W
t
, t) dW
t
=
_
u
0
f(W
t
, t) dW
t
+
_
T
u
f(W
t
, t) dW
t
, 0 < u < T.
100
Proof: 1. Consider the partial sum sequences
X
n
=
n1

i=0
f(W
t
i
, t
i
)(W
t
i+1
W
t
i
)
Y
n
=
n1

i=0
g(W
t
i
, t
i
)(W
t
i+1
W
t
i
).
Since ms-lim
n
X
n
=
_
T
0
f(W
t
, t) dW
t
and ms-lim
n
Y
n
=
_
T
0
g(W
t
, t) dW
t
, using Propo-
sition 1.14.2 yields
_
T
0
_
f(W
t
, t) +g(W
t
, t)
_
dW
t
= ms-lim
n
n1

i=0
_
f(W
t
i
, t
i
) +g(W
t
i
, t
i
)
_
(W
t
i+1
W
t
i
)
= ms-lim
n
_
n1

i=0
_
f(W
t
i
, t
i
)(W
t
i+1
W
t
i
) +
n1

i=0
g(W
t
i
, t
i
)(W
t
i+1
W
t
i
)
_
= ms-lim
n
(X
n
+Y
n
) = ms-lim
n
X
n
+ ms-lim
n
Y
n
=
_
T
0
f(W
t
, t) dW
t
+
_
T
0
g(W
t
, t) dW
t
.
The proofs of parts 2 and 3 are left as an exercise for the reader.
Some other properties, such as monotonicity, do not hold in general. It is possible
to have a nonnegative random variable F
t
for which the random variable
_
T
0
F
t
dW
t
has
negative values. More precisely, let F
t
= 1. Then F
t
> 0 but
_
T
0
1 dW
t
= W
T
it is not
always positive. The probability to be negative is P(W
T
< 0) = 1/2.
Some of the random variable properties of the Ito integral are given by the following
result:
Proposition 4.5.2 We have
1. Zero mean:
E
_
_
b
a
f(W
t
, t) dW
t
_
= 0.
2. Isometry:
E
__
_
b
a
f(W
t
, t) dW
t
_
2
_
= E
_
_
b
a
f(W
t
, t)
2
dt
_
.
3. Covariance:
E
__
_
b
a
f(W
t
, t) dW
t
__
_
b
a
g(W
t
, t) dW
t
__
= E
_
_
b
a
f(W
t
, t)g(W
t
, t) dt
_
.
101
We shall discuss the previous properties giving rough reasons why they hold true.
The detailed proofs are beyond the goal of this book.
1. The Ito integral I =
_
b
a
f(W
t
, t) dW
t
is the mean square limit of the partial sums
S
n
=

n1
i=0
f
t
i
(W
t
i+1
W
t
i
), where we denoted f
t
i
= f(W
t
i
, t
i
). Since f(W
t
, t) is a
nonanticipative process, then f
t
i
is independent of the increments W
t
i+1
W
t
i
, and
hence we have
E[S
n
] = E
_
n1

i=0
f
t
i
(W
t
i+1
W
t
i
)
_
=
n1

i=0
E[f
t
i
(W
t
i+1
W
t
i
)]
=
n1

i=0
E[f
t
i
]E[(W
t
i+1
W
t
i
)] = 0,
because the increments have mean zero. Applying the Squeeze Theorem in the double
inequality
0
_
E[S
n
I]
_
2
E[(S
n
I)
2
] 0, n
yields E[S
n
] E[I] 0. Since E[S
n
] = 0 it follows that E[I] = 0, i.e. the Ito integral
has zero mean.
2. Since the square of the sum of partial sums can be written as
S
2
n
=
_
n1

i=0
f
t
i
(W
t
i+1
W
t
i
)
_
2
=
n1

i=0
f
2
t
i
(W
t
i+1
W
t
i
)
2
+ 2

i=j
f
t
i
(W
t
i+1
W
t
i
)f
t
j
(W
t
j+1
W
t
j
),
using the independence yields
E[S
2
n
] =
n1

i=0
E[f
2
t
i
]E[(W
t
i+1
W
t
i
)
2
]
+2

i=j
E[f
t
i
]E[(W
t
i+1
W
t
i
)]E[f
t
j
]E[(W
t
j+1
W
t
j
)]
=
n1

i=0
E[f
2
t
i
](t
i+1
t
i
),
which are the Riemann sums of the integral
_
b
a
E[f
2
t
] dt = E
_
_
b
a
f
2
t
dt
_
, where the last
identity follows from Fubinis theorem. Hence E[S
2
n
] converges to the aforementioned
integral.
102
3. Consider the partial sums
S
n
=
n1

i=0
f
t
i
(W
t
i+1
W
t
i
), V
n
=
n1

j=0
g
t
j
(W
t
j+1
W
t
j
).
Their product is
S
n
V
n
=
_
n1

i=0
f
t
i
(W
t
i+1
W
t
i
)
__
n1

j=0
g
t
j
(W
t
j+1
W
t
j
)
_
=
n1

i=0
f
t
i
g
t
i
(W
t
i+1
W
t
i
)
2
+
n1

i=j
f
t
i
g
t
j
(W
t
i+1
W
t
i
)(W
t
j+1
W
t
j
)
Using that f
t
and g
t
are nonanticipative and that
E[(W
t
i+1
W
t
i
)(W
t
j+1
W
t
j
)] = E[W
t
i+1
W
t
i
]E[W
t
j+1
W
t
j
] = 0, i ,= j
E[(W
t
i+1
W
t
i
)
2
] = t
i+1
t
i
,
it follows that
E[S
n
V
n
] =
n1

i=0
E[f
t
i
g
t
i
]E[(W
t
i+1
W
t
i
)
2
]
=
n1

i=0
E[f
t
i
g
t
i
](t
i+1
t
i
),
which is the Riemann sum for the integral
_
b
a
E[f
t
g
t
] dt.
From 1 and 2 it follows that the random variable
_
b
a
f(W
t
, t) dW
t
has mean zero
and variance
V ar
_
_
b
a
f(W
t
, t) dW
t
_
= E
_
_
b
a
f(W
t
, t)
2
dt
_
.
From 1 and 3 it follows that
Cov
_
_
b
a
f(W
t
, t) dW
t
,
_
b
a
g(W
t
, t) dW
t
_
=
_
b
a
E[f(W
t
, t)g(W
t
, t)] dt.
Corollary 4.5.3 (Cauchys integral inequality) Let f
t
= f(W
t
, t) and g
t
= g(W
t
, t).
Then
_
_
b
a
E[f
t
g
t
] dt
_
2

_
_
b
a
E[f
2
t
] dt
__
_
b
a
E[g
2
t
] dt
_
.
103
Proof: It follows from the previous theorem and from the correlation formula [Corr(X, Y )[ =
[Cov(X, Y )[
[V ar(X)V ar(Y )]
1/2
1.
Let T
t
be the information set at time t. This implies that f
t
i
and W
t
i+1
W
t
i
are
known at time t, for any t
i+1
t. It follows that the partial sum S
n
=
n1

i=0
f
t
i
(W
t
i+1

W
t
i
) is T
t
-predictable. The following result, whose proof is omited for technical reasons,
states that this is also valid after taking the limit in the mean square:
Proposition 4.5.4 The Ito integral
_
t
0
f
s
dW
s
is T
t
-predictable.
The following two results state that if the upper limit of an Ito integral is replaced
by the parameter t we obtain a continuous martingale.
Proposition 4.5.5 For any s < t we have
E
_
_
t
0
f(W
u
, u) dW
u
[T
s
_
=
_
s
0
f(W
u
, u) dW
u
.
Proof: Using part 3 of Proposition 4.5.2 we get
E
_
_
t
0
f(W
u
, u) dW
u
[T
s
_
= E
_
_
s
0
f(W
u
, u) dW
u
+
_
t
s
f(W
u
, u) dW
u
[T
s
_
= E
_
_
s
0
f(W
u
, u) dW
u
[T
s
_
+E
_
_
t
s
f(W
u
, u) dW
u
[T
s
_
. (4.5.4)
Since
_
s
0
f(W
u
, u) dW
u
is T
s
-predictable (see Proposition 4.5.4), by part 2 of Proposition
1.11.4
E
_
_
s
0
f(W
u
, u) dW
u
[T
s
_
=
_
s
0
f(W
u
, u) dW
u
.
Since
_
t
s
f(W
u
, u) dW
u
contains only information between s and t, it is independent of
the information set T
s
, so we can drop the condition in the expectation; using that Ito
integrals have zero mean we obtain
E
_
_
t
s
f(W
u
, u) dW
u
[T
s
_
= E
_
_
t
s
f(W
u
, u) dW
u
_
= 0.
Substituting into (4.5.4) yields the desired result.
Proposition 4.5.6 Consider the process X
t
=
_
t
0
f(W
s
, s) dW
s
. Then X
t
is continu-
ous, i.e. for almost any state of the world , the path t X
t
() is continuous.
104
Proof: A rigorous proof is beyond the purpose of this book. We shall provide a rough
sketch. Assume the process f(W
t
, t) satises E[f(W
t
, t)
2
] < M, for some M > 0. Let
t
0
be xed and consider h > 0. Consider the increment Y
h
= X
t
0
+h
X
t
0
. Using the
aforementioned properties of the Ito integral we have
E[Y
h
] = E[X
t
0
+h
X
t
0
] = E
_
_
t
0
+h
t
0
f(W
t
, t) dW
t
_
= 0
E[Y
2
h
] = E
__
_
t
0
+h
t
0
f(W
t
, t) dW
t
_
2
_
=
_
t
0
+h
t
0
E[f(W
t
, t)
2
] dt
< M
_
t
0
+h
t
0
dt = Mh.
The process Y
h
has zero mean for any h > 0 and its variance tends to 0 as h 0.
Using a convergence theorem yields that Y
h
tends to 0 in mean square, as h 0. This
is equivalent with the continuity of X
t
at t
0
.
Proposition 4.5.7 Let X
t
=
_
t
0
f(W
s
, s) dW
s
, with E
_ _

0
f
2
(s, W
s
) ds

< . Then
X
t
is a continuous T
t
-martingale.
Proof: We shall check in the following the properties of a martingale.
Integrability: Using properties of Ito integrals
E[X
2
t
] = E
_
_
_
t
0
f(W
s
, s) dW
s
_
2
= E
_
_
t
0
f
2
(W
s
, s) ds

< E
_
_

0
f
2
(W
s
, s) ds

< ,
and then from the inequality E[[X
t
[]
2
E[X
2
t
] we obtain E[[X
t
[] < , for all t 0.
Predictability: X
t
is T
t
-predictable from Proposition 4.5.4.
Forecast: E[X
t
[T
s
] = X
s
for s < t by Proposition 4.5.5.
Continuity: See Proposition 4.5.6.
4.6 The Wiener Integral
The Wiener integral is a particular case of the Ito stochastic integral. It is obtained by
replacing the nonanticipating stochastic process f(W
t
, t) by the deterministic function
f(t). The Wiener integral
_
b
a
f(t) dW
t
is the mean square limit of the partial sums
S
n
=
n1

i=0
f(t
i
)(W
t
i+1
W
t
i
).
All properties of Ito integrals also hold for Wiener integrals. The Wiener integral is a
random variable with zero mean
E
_
_
b
a
f(t) dW
t
_
= 0
105
and variance
E
__
_
b
a
f(t) dW
t
_
2
_
=
_
b
a
f(t)
2
dt.
However, in the case of Wiener integrals we can say something about their distribution.
Proposition 4.6.1 The Wiener integral I(f) =
_
b
a
f(t) dW
t
is a normal random vari-
able with mean 0 and variance
V ar[I(f)] =
_
b
a
f(t)
2
dt := |f|
2
L
2
.
Proof: Since increments W
t
i+1
W
t
i
are normally distributed with mean 0 and variance
t
i+1
t
i
, then
f(t
i
)(W
t
i+1
W
t
i
) N(0, f(t
i
)
2
(t
i+1
t
i
)).
Since these random variables are independent, by the Central Limit Theorem (see
Theorem 2.3.1), their sum is also normally distributed, with
S
n
=
n1

i=0
f(t
i
)(W
t
i+1
W
t
i
) N
_
0,
n1

i=0
f(t
i
)
2
(t
i+1
t
i
)
_
.
Taking n and max
i
|t
i+1
t
i
| 0, the normal distribution tends to
N
_
0,
_
b
a
f(t)
2
dt
_
.
The previous convergence holds in distribution, and it still needs to be shown in the
mean square. However, we shall omit this essential proof detail.
Exercise 4.6.2 Show that the random variable X =
_
T
1
1

t
dW
t
is normally distributed
with mean 0 and variance ln T.
Exercise 4.6.3 Let Y =
_
T
1

t dW
t
. Show that Y is normally distributed with mean
0 and variance (T
2
1)/2.
Exercise 4.6.4 Find the distribution of the integral
_
t
0
e
ts
dW
s
.
Exercise 4.6.5 Show that X
t
=
_
t
0
(2tu) dW
u
and Y
t
=
_
t
0
(3t4u) dW
u
are Gaussian
processes with mean 0 and variance
7
3
t
3
.
Exercise 4.6.6 Show that ms- lim
t0
1
t
_
t
0
udW
u
= 0.
Exercise 4.6.7 Find all constants a, b such that X
t
=
_
t
0
_
a +
bu
t
_
dW
u
is normally
distributed with variance t.
106
Exercise 4.6.8 Let n be a positive integer. Prove that
Cov
_
W
t
,
_
t
0
u
n
dW
u
_
=
t
n+1
n + 1
.
Formulate and prove a more general result.
4.7 Poisson Integration
In this section we deal with the integration with respect to the compensated Poisson
process M
t
= N
t
t, which is a martingale. Consider 0 a < b and let F
t
= F(t, M
t
)
be a nonanticipating process with
E
_
_
b
a
F
2
t
dt
_
< .
Consider the partition
a = t
0
< t
1
< < t
n1
< t
n
= b
of the interval [a, b], and associate the partial sums
S
n
=
n1

i=0
F
t
i
(M
t
i+1
M
t
i
),
where F
t
i
is the left-hand limit at t
i
. For predictability reasons, the intermediate
points are the left-handed limit to the endpoints of each interval. Since the process F
t
is nonanticipative, the random variables F
t
i
and M
t
i+1
M
t
i
are independent.
The integral of F
t
with respect to M
t
is the mean square limit of the partial sum
S
n
ms-lim
n
S
n
=
_
T
0
F
t
dM
t
,
provided the limit exists. More precisely, this convergence means that
lim
n
E
__
S
n

_
b
a
F
t
dM
t
_
2
_
= 0.
Exercise 4.7.1 Let c be a constant. Show that
_
b
a
c dM
t
= c(M
b
M
a
).
107
4.8 An Work Out Example: the case F
t
= M
t
We shall integrate the process M
t
between 0 and T with respect to M
t
. Consider the
equidistant partition points t
k
=
kT
n
, k = 0, 1, , n 1. Then the partial sums are
given by
S
n
=
n1

i=0
M
t
i
(M
t
i+1
M
t
i
).
Using xy =
1
2
[(x +y)
2
x
2
y
2
], by letting x = M
t
i
and y = M
t
i+1
M
t
i
, we get
M
t
i
(M
t
i+1
M
t
i
) =
1
2
(M
t
i+1
M
t
i
+M
t
i
)
2

1
2
M
2
t
i

1
2
(M
t
i+1
M
t
i
)
2
.
Let J be the set of jump instances between 0 and T. Using that M
t
i
= M
t
i
for t
i
/ J,
and M
t
i
= 1 +M
t
i
for t
i
J yields
M
t
i+1
M
t
i
+M
t
i
=
_
M
t
i+1
, if t
i
/ J
M
t
i+1
1, if t
i
J.
Splitting the sum, canceling in pairs, and applying the dierence of squares formula we
have
S
n
=
1
2
n1

i=0
(M
t
i+1
M
t
i
+M
t
i
)
2

1
2
n1

i=0
M
2
t
i

1
2
n1

i=0
(M
t
i+1
M
t
i
)
2
=
1
2

t
i
J
(M
t
i+1
1)
2
+
1
2

t
i
/ J
M
2
t
i+1

1
2

t
i
/ J
M
2
t
i

1
2

t
i
J
M
2
t
i

1
2
n1

i=0
(M
t
i+1
M
t
i
)
2
=
1
2

t
i
J
_
(M
t
i+1
1)
2
M
2
t
i
_
+
1
2
M
2
tn

1
2
n1

i=0
(M
t
i+1
M
t
i
)
2
=
1
2

t
i
J
(M
t
i+1
1 M
t
i
. .
=0
)(M
t
i+1
1 +M
t
i
) +
1
2
M
2
tn

1
2
n1

i=0
(M
t
i+1
M
t
i
)
2
=
1
2
M
2
tn

1
2
n1

i=0
(M
t
i+1
M
t
i
)
2
.
Hence we have arrived at the following formula
_
T
0
M
t
dM
t
=
1
2
M
2
T

1
2
N
T
.
Similarly, one can obtain
_
b
a
M
t
dM
t
=
1
2
(M
2
b
M
2
a
)
1
2
(N
b
N
a
).
108
Exercise 4.8.1 (a) Show that E
_
_
b
a
M
t
dM
t
_
= 0,
(b) Find V ar
_
_
b
a
M
t
dM
t
_
.
Remark 4.8.2 (a) Let be a xed state of the world and assume the sample path
t N
t
() has a jump in the interval (a, b). Even if beyound the scope of this book,
one can be shown that the integral
_
b
a
N
t
() dN
t
does not exist in the Riemann-Stieltjes sense.
(b) Let N
t
denote the left-hand limit of N
t
. It can be shown that N
t
is predictable,
while N
t
is not.
The previous remarks provide the reason why in the following we shall work with
M
t
instead of M
t
: the integral
_
b
a
M
t
dN
t
might not exist, while
_
b
a
M
t
dN
t
does
exist.
Exercise 4.8.3 Show that
_
T
0
N
t
dM
t
=
1
2
(N
2
t
N
t
)
_
t
0
N
t
dt.
Exercise 4.8.4 Find the variance of
_
T
0
N
t
dM
t
.
The following integrals with respect to a Poisson process N
t
are considered in the
Riemann-Stieltjes sense.
Proposition 4.8.5 For any continuous function f we have
(a) E
_
_
t
0
f(s) dN
s
_
=
_
t
0
f(s) ds;
(b) E
__
_
t
0
f(s) dN
s
_
2
_
=
_
t
0
f(s)
2
ds +
2
_
_
t
0
f(s) ds
_
2
;
(c) E
_
e

t
0
f(s) dNs
_
= e

t
0
(e
f(s)
1) ds
.
Proof: (a) Consider the equidistant partition 0 = s
0
< s
1
< < s
n
= t, with
s
k+1
s
k
= s. Then
109
E
_
_
t
0
f(s) dN
s
_
= lim
n
E
_
n1

i=0
f(s
i
)(N
s
i+1
N
s
i
)
_
= lim
n
n1

i=0
f(s
i
)E
_
N
s
i+1
N
s
i
_
= lim
n
n1

i=0
f(s
i
)(s
i+1
s
i
) =
_
t
0
f(s) ds.
(b) Using that N
t
is stationary and has independent increments, we have respectively
E[(N
s
i+1
N
s
i
)
2
] = E[N
2
s
i+1
s
i
] = (s
i+1
s
i
) +
2
(s
i+1
s
i
)
2
= s +
2
(s)
2
,
E[(N
s
i+1
N
s
i
)(N
s
j+1
N
s
j
)] = E[(N
s
i+1
N
s
i
)]E[(N
s
j+1
N
s
j
)]
= (s
i+1
s
i
)(s
j+1
s
j
) =
2
(s)
2
.
Applying the expectation to the formula
_
n1

i=0
f(s
i
)(N
s
i+1
N
s
i
)
_
2
=
n1

i=0
f(s
i
)
2
(N
s
i+1
N
s
i
)
2
+2

i=j
f(s
i
)f(s
j
)(N
s
i+1
N
s
i
)(N
s
j+1
N
s
j
)
yields
E
__
n1

i=0
f(s
i
)(N
s
i+1
N
s
i
)
_
2
_
=
n1

i=0
f(s
i
)
2
(s +
2
(s)
2
) + 2

i=j
f(s
i
)f(s
j
)
2
(s)
2
=
n1

i=0
f(s
i
)
2
s +
2
_
n1

i=0
f(s
i
)
2
(s)
2
+ 2

i=j
f(s
i
)f(s
j
)(s)
2
_
=
n1

i=0
f(s
i
)
2
s +
2
_
n1

i=0
f(s
i
) s
_
2

_
t
0
f(s)
2
ds +
2
_
_
t
0
f(s) ds
_
2
, as n .
(c) Using that N
t
is stationary with independent increments and has the moment
110
generating function E[e
kNt
] = e
(e
k
1)t
, we have
E
_
e

t
0
f(s) dNs
_
= lim
n
E
_
e

n1
i=0
f(s
i
)(Ns
i+1
Ns
i
)
_
= lim
n
E
_
n1

i=0
e
f(s
i
)(Ns
i+1
Ns
i
)
_
= lim
n
n1

i=0
E
_
e
f(s
i
)(Ns
i+1
Ns
i
)
_
= lim
n
n1

i=0
E
_
e
f(s
i
)(N
s
i+1
s
i
)
_
= lim
n
n1

i=0
e
(e
f(s
i
)
1)(s
i+1
s
i
)
= lim
n
e

n1
i=0
(e
f(s
i
)
1)(s
i+1
s
i
)
= e

t
0
(e
f(s)
1) ds
.
Since f is continuous, the Poisson integral
_
t
0
f(s) dN
s
can be computed in terms
of the waiting times S
k
_
t
0
f(s) dN
s
=
Nt

k=1
f(S
k
).
This formula can be used to give a proof for the previous result. For instance, taking
the expectation and using conditions over N
t
= n, yields
E
_
_
t
0
f(s) dN
s
_
= E
_
Nt

k=1
f(S
k
)
_
=

n0
E
_
n

k=1
f(S
k
)[N
t
= n
_
P(N
t
= n)
=

n0
n
t
_
t
0
f(x) dx
(t)
n
n!
e
t
= e
t
_
t
0
f(x) dx
1
t

n0
(t)
n
(n 1)!
= e
t
_
t
0
f(x) dxe
t
=
_
t
0
f(x) dx.
Exercise 4.8.6 Solve parts (b) and (c) of Proposition 4.8.5 using a similar idea with
the one presented above.
Exercise 4.8.7 Show that
E
__
_
t
0
f(s) dM
s
_
2
_
=
_
t
0
f(s)
2
ds,
where M
t
= N
t
t is the compensated Poisson process.
Exercise 4.8.8 Prove that
V ar
_
_
t
0
f(s) dN
s
_
=
_
t
0
f(s)
2
dN
s
.
111
Exercise 4.8.9 Find
E
_
e

t
0
f(s) dMs
_
.
Proposition 4.8.10 Let T
t
= (N
s
; 0 s t). Then for any constant c, the process
M
t
= e
cNt+(1e
c
)t
, t 0
is an T
t
-martingale.
Proof: Let s < t. Since N
t
N
s
is independent of T
s
and N
t
is stationary, we have
E[e
c(NtNs)
[T
s
] = E[e
c(NtNs)
] = E[e
cN
ts
]
= e
(e
c
1)(ts)
.
On the other side, taking out the predictable part yields
E[e
c(NtNs)
[T
s
] = e
cNs
E[e
cNt
[T
s
].
Equating the last two relations we arrive at
E[e
cNt+(1e
c
)t
[T
s
] = e
cNs+(1e
c
)s
,
which is equivalent with the martingale condition E[M
t
[T
s
] = M
s
.
We shall present an application of the previous result. Consider the waiting time
until the nth jump, S
n
= inft > 0; N
t
= n, which is a stopping time, and the ltration
T
t
= (N
s
; 0 s t). Since
M
t
= e
cNt+(1e
c
)t
is an T
t
-martingale, by the Optional Stopping Theorem (Theorem 3.2.1) we have
E[M
Sn
] = E[M
0
] = 1, which is equivalent with E[e
(1e
c
)Sn
] = e
cn
. Substituting
s = (1e
c
), then c = ln(1+
s

). Since s, > 0, then c > 0. The previous expression


becomes
E[e
sSn
] = e
nln(1+
s

)
=
_

+s
_
x
.
Since the expectation on the left side is the Laplace transform of the probability density
of S
n
, then
p(S
n
) = /
1
E[e
sSn
] = /
1
__

+s
_
x
_
=
e
t
t
n1

n
(n)
,
which shows that S
n
has a gamma distribution.
112
4.9 The distribution function of X
T
=
_
T
0
g(t) dN
t
In this section we consider the function g(t) continuous. Let S
1
< S
2
< < S
Nt
denote the waiting times until time t. Since the increments dN
t
are equal to 1 at S
k
and 0 otherwise, the integral can be written as
X
T
=
_
T
0
g(t) dN
t
= g(S
1
) + +g(S
Nt
).
The distribution function of the random variable X
T
=
_
T
0
g(t) dN
t
can be obtained
conditioning over the N
t
P(X
T
u) =

k0
P(X
T
u[N
T
= k) P(N
T
= k)
=

k0
P(g(S
1
) + +g(S
Nt
) u[N
T
= k) P(N
T
= k)
=

k0
P(g(S
1
) + +g(S
k
) u) P(N
T
= k). (4.9.5)
Considering S
1
, S
2
, , S
k
independent and uniformly distributed over the interval
[0, T], we have
P
_
g(S
1
) + +g(S
k
) u
_
=
_
D
k
1
T
k
dx
1
dx
k
=
vol(D
k
)
T
k
,
where
D
k
= g(x
1
) +g(x
2
) + +g(x
k
) u 0 x
1
, , x
k
T.
Substituting back in (4.9.5) yields
P(X
T
u) =

k0
P(g(S
1
) + +g(S
k
) u) P(N
T
= k)
=

k0
vol(D
k
)
T
k

k
T
k
k!
e
T
= e
T

k0

k
vol(D
k
)
k!
. (4.9.6)
In general, the volume of the k-dimensional solid D
k
is not obvious easy to obtain.
However, there are simple cases when this can be computed explicitly.
A Particular Case. We shall do an explicit computation of the partition function of
X
T
=
_
T
0
s
2
dN
s
. In this case the solid D
k
is the intersection between the k-dimensional
ball of radius

u centered at the origin and the k-dimensional cube [0, T]
k
. There are
three possible shapes for D
k
, which depend on the size of

u:
(a) if 0

u < T, then D
k
is a
1
2
k
-part of a k-dimensional sphere;
(b) if T

u < T

k, then D
k
has a complicated shape;
113
(c) if T

k

u, then D
k
is the entire k-dimensional cube, and then vol(D
k
) = T
k
.
Since the volume of the k-dimensional ball of radius R is given by

k/2
R
k
(
k
2
+ 1)
, then
the volume of D
k
in case (a) becomes
vol(D
k
) =

k/2
u
k/2
2
k
(
k
2
+ 1)
.
Substituting in (4.9.6) yields
P(X
T
u) = e
T

k0
(
2
u)
k/2
k!(
k
2
+ 1)
, 0

u < T.
It is worth noting that for u , the inequality T

k

u is satised for all
k 0; hence relation (4.9.6) yields
lim
u
P(X
T
u) = e
T

k0

k
T
k
k!
= e
kT
e
kT
= 1.
The computation in case (b) is more complicated and will be omitted.
Exercise 4.9.1 Calculate the expectation E
_
_
T
0
e
ks
dN
s
_
and the variance V ar
_
_
T
0
e
ks
dN
s
_
.
Exercise 4.9.2 Compute the distribution function of X
t
=
_
T
0
s dN
s
.
114
Chapter 5
Stochastic Dierentiation
5.1 Dierentiation Rules
Most stochastic processes are not dierentiable. For instance, the Brownian motion
process W
t
is a continuous process which is nowhere dierentiable. Hence, derivatives
like
dWt
dt
do not make sense in stochastic calculus. The only quantities allowed to be
used are the innitesimal changes of the process, in our case, dW
t
.
The innitesimal change of a process
The change in the process X
t
between instances t and t + t is given by X
t
=
X
t+t
X
t
. When t is innitesimally small, we obtain the innitesimal change of a
process X
t
dX
t
= X
t+dt
X
t
.
Sometimes it is useful to use the equivalent formula X
t+dt
= X
t
+dX
t
.
5.2 Basic Rules
The following rules are the analog of some familiar dierentiation rules from elementary
Calculus.
The constant multiple rule
If X
t
is a stochastic process and c is a constant, then
d(c X
t
) = c dX
t
.
The verication follows from a straightforward application of the innitesimal change
formula
d(c X
t
) = c X
t+dt
c X
t
= c(X
t+dt
X
t
) = c dX
t
.
115
116
The sum rule
If X
t
and Y
t
are two stochastic processes, then
d(X
t
+Y
t
) = dX
t
+dY
t
.
The verication is as in the following:
d(X
t
+Y
t
) = (X
t+dt
+Y
t+dt
) (X
t
+Y
t
)
= (X
t+dt
X
t
) + (Y
t+dt
Y
t
)
= dX
t
+dY
t
.
The dierence rule
If X
t
and Y
t
are two stochastic processes, then
d(X
t
Y
t
) = dX
t
dY
t
.
The proof is similar to the one for the sum rule.
The product rule
If X
t
and Y
t
are two stochastic processes, then
d(X
t
Y
t
) = X
t
dY
t
+Y
t
dX
t
+dX
t
dY
t
.
The proof is as follows:
d(X
t
Y
t
) = X
t+dt
Y
t+dt
X
t
Y
t
= X
t
(Y
t+dt
Y
t
) +Y
t
(X
t+dt
X
t
) + (X
t+dt
X
t
)(Y
t+dt
Y
t
)
= X
t
dY
t
+Y
t
dX
t
+dX
t
dY
t
,
where the second identity is veried by direct computation.
If the process X
t
is replaced by the deterministic function f(t), then the aforemen-
tioned formula becomes
d(f(t)Y
t
) = f(t) dY
t
+Y
t
df(t) +df(t) dY
t
.
Since in most practical cases the process Y
t
is an Ito diusion
dY
t
= a(t, W
t
)dt +b(t, W
t
)dW
t
,
using the relations dt dW
t
= dt
2
= 0, the last term vanishes
df(t) dY
t
= f

(t)dtdY
t
= 0,
and hence
d(f(t)Y
t
) = f(t) dY
t
+Y
t
df(t).
117
This relation looks like the usual product rule.
The quotient rule
If X
t
and Y
t
are two stochastic processes, then
d
_
X
t
Y
t
_
=
Y
t
dX
t
X
t
dY
t
dX
t
dY
t
Y
2
t
+
X
t
Y
3
t
(dY
t
)
2
.
The proof follows from Itos formula and will be addressed in section 5.3.3.
When the process Y
t
is replaced by the deterministic function f(t), and X
t
is an
Ito diusion, then the previous formula becomes
d
_
X
t
f(t)
_
=
f(t)dX
t
X
t
df(t)
f(t)
2
.
Example 5.2.1 We shall show that
d(W
2
t
) = 2W
t
dW
t
+dt.
Applying the product rule and the fundamental relation (dW
t
)
2
= dt, yields
d(W
2
t
) = W
t
dW
t
+W
t
dW
t
+dW
t
dW
t
= 2W
t
dW
t
+dt.
Example 5.2.2 Show that
d(W
3
t
) = 3W
2
t
dW
t
+ 3W
t
dt.
Applying the product rule and the previous exercise yields
d(W
3
t
) = d(W
t
W
2
t
) = W
t
d(W
2
t
) +W
2
t
dW
t
+d(W
2
t
) dW
t
= W
t
(2W
t
dW
t
+dt) +W
2
t
dW
t
+dW
t
(2W
t
dW
t
+dt)
= 2W
2
t
dW
t
+W
t
dt +W
2
t
dW
t
+ 2W
t
(dW
t
)
2
+dt dW
t
= 3W
2
t
dW
t
+ 3W
t
dt,
where we used (dW
t
)
2
= dt and dt dW
t
= 0.
Example 5.2.3 Show that d(tW
t
) = W
t
dt +t dW
t
.
Using the product rule and dt dW
t
= 0, we get
d(tW
t
) = W
t
dt +t dW
t
+dt dW
t
= W
t
dt +t dW
t
.
Example 5.2.4 Let Z
t
=
_
t
0
W
u
du be the integrated Brownian motion. Show that
dZ
t
= W
t
dt.
118
The innitesimal change of Z
t
is
dZ
t
= Z
t+dt
Z
t
=
_
t+dt
t
W
s
ds = W
t
dt,
since W
s
is a continuous function in s.
Example 5.2.5 Let A
t
=
1
t
Z
t
=
1
t
_
t
0
W
u
du be the average of the Brownian motion on
the time interval [0, t]. Show that
dA
t
=
1
t
_
W
t

1
t
Z
t
_
dt.
We have
dA
t
= d
_
1
t
_
Z
t
+
1
t
dZ
t
+d
_
1
t
_
dZ
t
=
1
t
2
Z
t
dt +
1
t
W
t
dt +
1
t
2
W
t
dt
2
..
=0
=
1
t
_
W
t

1
t
Z
t
_
dt.
Exercise 5.2.1 Let G
t
=
1
t
_
t
0
e
Wu
du be the average of the geometric Brownian motion
on [0, t]. Find dG
t
.
5.3 Itos Formula
Itos formula is the analog of the chain rule from elementary Calculus. We shall start
by reviewing a few concepts regarding function approximations.
Let f be a dierentiable function of a real variable x. Let x
0
be xed and consider
the changes x = x x
0
and f(x) = f(x) f(x
0
). It is known from Calculus that
the following second order Taylor approximation holds
f(x) = f

(x)x +
1
2
f

(x)(x)
2
+O(x)
3
.
When x is innitesimally close to x
0
, we replace x by the dierential dx and obtain
df(x) = f

(x)dx +
1
2
f

(x)(dx)
2
+O(dx)
3
. (5.3.1)
In the elementary Calculus, all terms involving terms of equal or higher order to dx
2
are neglected; then the aforementioned formula becomes
df(x) = f

(x)dx.
119
Now, if we consider x = x(t) to be a dierentiable function of t, substituting into the
previous formula we obtain the dierential form of the well known chain rule
df
_
x(t)
_
= f

_
x(t)
_
dx(t) = f

_
x(t)
_
x

(t) dt.
We shall present a similar formula for the stochastic environment. In this case the
deterministic function x(t) is replaced by a stochastic process X
t
. The composition
between the dierentiable function f and the process X
t
is denoted by F
t
= f(X
t
).
Neglecting the increment powers higher than or equal to (dX
t
)
3
, the expression
(5.3.1) becomes
dF
t
= f

_
X
t
_
dX
t
+
1
2
f

_
X
t
__
dX
t
_
2
. (5.3.2)
In the computation of dX
t
we may take into the account stochastic relations such as
dW
2
t
= dt, or dt dW
t
= 0.
5.3.1 Itos formula for diusions
The previous formula is a general case of Itos formula. However, in most cases the
increments dX
t
are given by some particular relations. An important case is when the
increment is given by
dX
t
= a(W
t
, t)dt +b(W
t
, t)dW
t
.
A process X
t
satisfying this relation is called an Ito diusion.
Theorem 5.3.1 (Itos formula for diusions) If X
t
is an Ito diusion, and F
t
=
f(X
t
), then
dF
t
=
_
a(W
t
, t)f

(X
t
) +
b(W
t
, t)
2
2
f

(X
t
)
_
dt +b(W
t
, t)f

(X
t
) dW
t
. (5.3.3)
Proof: We shall provide a formal proof. Using the relations (dW_t)^2 = dt and dt^2 = dW_t dt = 0, we have
(dX_t)^2 = ( a(W_t, t)dt + b(W_t, t)dW_t )^2
= a(W_t, t)^2 dt^2 + 2a(W_t, t)b(W_t, t) dW_t dt + b(W_t, t)^2 (dW_t)^2
= b(W_t, t)^2 dt.
Substituting into (5.3.2) yields
dF_t = f'(X_t) dX_t + (1/2) f''(X_t) (dX_t)^2
= f'(X_t) ( a(W_t, t)dt + b(W_t, t)dW_t ) + (1/2) f''(X_t) b(W_t, t)^2 dt
= [ a(W_t, t) f'(X_t) + (b(W_t, t)^2 / 2) f''(X_t) ] dt + b(W_t, t) f'(X_t) dW_t.
In the case X_t = W_t we obtain the following consequence:
Corollary 5.3.2 Let F_t = f(W_t). Then
dF_t = (1/2) f''(W_t) dt + f'(W_t) dW_t.   (5.3.4)
Particular cases
1. If f(x) = x^α, with α constant, then f'(x) = αx^{α−1} and f''(x) = α(α−1)x^{α−2}. Then (5.3.4) becomes the following useful formula
d(W_t^α) = (1/2) α(α−1) W_t^{α−2} dt + α W_t^{α−1} dW_t.
A couple of useful cases easily follow:
d(W_t^2) = 2W_t dW_t + dt
d(W_t^3) = 3W_t^2 dW_t + 3W_t dt.
2. If f(x) = e^{kx}, with k constant, then f'(x) = ke^{kx} and f''(x) = k^2 e^{kx}. Therefore
d(e^{kW_t}) = k e^{kW_t} dW_t + (1/2) k^2 e^{kW_t} dt.
In particular, for k = 1, we obtain the increments of a geometric Brownian motion
d(e^{W_t}) = e^{W_t} dW_t + (1/2) e^{W_t} dt.
3. If f(x) = sin x, then
d(sin W_t) = cos W_t dW_t − (1/2) sin W_t dt.
Exercise 5.3.3 Use the previous rules to find the following increments
(a) d(W_t e^{W_t})
(b) d(3W_t^2 + 2e^{5W_t})
(c) d(e^{t + W_t^2})
(d) d( (t + W_t)^n )
(e) d( (1/t) ∫_0^t W_u du )
(f) d( (1/t^α) ∫_0^t e^{W_u} du ), where α is a constant.
In the case when the function f = f(t, x) is also time dependent, the analog of (5.3.1) is given by
df(t, x) = ∂_t f(t, x) dt + ∂_x f(t, x) dx + (1/2) ∂²_x f(t, x) (dx)^2 + O(dx)^3 + O(dt)^2.   (5.3.5)
Substituting x = X_t yields
df(t, X_t) = ∂_t f(t, X_t) dt + ∂_x f(t, X_t) dX_t + (1/2) ∂²_x f(t, X_t) (dX_t)^2.   (5.3.6)
If X_t is an Ito diffusion we obtain an extra term in formula (5.3.3):
dF_t = [ ∂_t f(t, X_t) + a(W_t, t) ∂_x f(t, X_t) + (b(W_t, t)^2 / 2) ∂²_x f(t, X_t) ] dt + b(W_t, t) ∂_x f(t, X_t) dW_t.   (5.3.7)
Exercise 5.3.4 Show that
d(tW_t^2) = (t + W_t^2) dt + 2tW_t dW_t.
Exercise 5.3.5 Find the following increments
(a) d(tW_t)      (c) d(t^2 cos W_t)
(b) d(e^t W_t)   (d) d(sin t · W_t^2).
5.3.2 Ito's formula for Poisson processes
Consider the process F_t = F(M_t), where M_t = N_t − λt is the compensated Poisson process. Ito's formula for the process F_t takes the following integral form. For a proof the reader can consult Kuo [9].
Proposition 5.3.6 Let F be a twice differentiable function. Then for any a < t we have
F_t = F_a + ∫_a^t F'(M_{s−}) dM_s + Σ_{a<s≤t} ( ΔF(M_s) − F'(M_{s−}) ΔM_s ),
where ΔM_s = M_s − M_{s−} and ΔF(M_s) = F(M_s) − F(M_{s−}).
We shall apply the aforementioned result for the case F_t = F(M_t) = M_t^2. We have
M_t^2 = M_a^2 + 2 ∫_a^t M_{s−} dM_s + Σ_{a<s≤t} ( M_s^2 − M_{s−}^2 − 2M_{s−}(M_s − M_{s−}) ).   (5.3.8)
Since the jumps in N_s are of size 1, we have (ΔN_s)^2 = ΔN_s. Since the difference of the processes M_s and N_s is continuous, ΔM_s = ΔN_s. Using these formulas we have
M_s^2 − M_{s−}^2 − 2M_{s−}(M_s − M_{s−}) = (M_s − M_{s−})(M_s + M_{s−} − 2M_{s−}) = (M_s − M_{s−})^2 = (ΔM_s)^2 = (ΔN_s)^2 = ΔN_s = N_s − N_{s−}.
Since the sum of the jumps between a and t is Σ_{a<s≤t} ΔN_s = N_t − N_a, formula (5.3.8) becomes
M_t^2 = M_a^2 + 2 ∫_a^t M_{s−} dM_s + N_t − N_a.   (5.3.9)
The differential form is
d(M_t^2) = 2M_{t−} dM_t + dN_t,
which is equivalent to
d(M_t^2) = (1 + 2M_{t−}) dM_t + λ dt,
since dN_t = dM_t + λ dt.
Exercise 5.3.7 Show that
∫_0^T M_{t−} dM_t = (1/2)( M_T^2 − N_T ).
Exercise 5.3.8 Use Ito's formula for the Poisson process to find the conditional expectation E[M_t^2 | F_s] for s < t.
5.3.3 Ito's multidimensional formula
If the process F_t depends on several Ito diffusions, say F_t = f(t, X_t, Y_t), then a similar formula to (5.3.7) leads to
dF_t = ∂f/∂t (t, X_t, Y_t) dt + ∂f/∂x (t, X_t, Y_t) dX_t + ∂f/∂y (t, X_t, Y_t) dY_t
+ (1/2) ∂²f/∂x² (t, X_t, Y_t) (dX_t)^2 + (1/2) ∂²f/∂y² (t, X_t, Y_t) (dY_t)^2
+ ∂²f/∂x∂y (t, X_t, Y_t) dX_t dY_t.
Particular cases
In the case when F_t = f(X_t, Y_t), with X_t = W_t^1, Y_t = W_t^2 independent Brownian motions, we have
dF_t = ∂f/∂x dW_t^1 + ∂f/∂y dW_t^2 + (1/2) ∂²f/∂x² (dW_t^1)^2 + (1/2) ∂²f/∂y² (dW_t^2)^2 + ∂²f/∂x∂y dW_t^1 dW_t^2
= ∂f/∂x dW_t^1 + ∂f/∂y dW_t^2 + (1/2) ( ∂²f/∂x² + ∂²f/∂y² ) dt,
since (dW_t^1)^2 = (dW_t^2)^2 = dt and dW_t^1 dW_t^2 = 0 for independent Brownian motions. The expression
Δf = (1/2) ( ∂²f/∂x² + ∂²f/∂y² )
is called the Laplacian of f. We can rewrite the previous formula as
dF_t = ∂f/∂x dW_t^1 + ∂f/∂y dW_t^2 + Δf dt.
A function f with Δf = 0 is called harmonic. The aforementioned formula in the case of harmonic functions takes the simple form
dF_t = ∂f/∂x dW_t^1 + ∂f/∂y dW_t^2.   (5.3.10)
Exercise 5.3.9 Let W_t^1, W_t^2 be two independent Brownian motions. If the function f is harmonic, show that F_t = f(W_t^1, W_t^2) is a martingale. Is the converse true?
Exercise 5.3.10 Use the previous formulas to find dF_t in the following cases
(a) F_t = (W_t^1)^2 + (W_t^2)^2
(b) F_t = ln[ (W_t^1)^2 + (W_t^2)^2 ].
Exercise 5.3.11 Consider the Bessel process R_t = √( (W_t^1)^2 + (W_t^2)^2 ), where W_t^1 and W_t^2 are two independent Brownian motions. Prove that
dR_t = (1 / (2R_t)) dt + (W_t^1 / R_t) dW_t^1 + (W_t^2 / R_t) dW_t^2.
Example 5.3.1 (The product rule) Let X_t and Y_t be two processes. Show that
d(X_t Y_t) = Y_t dX_t + X_t dY_t + dX_t dY_t.
Consider the function f(x, y) = xy. Since
∂_x f = y,  ∂_y f = x,  ∂²_x f = ∂²_y f = 0,  ∂_x ∂_y f = 1,
then Ito's multidimensional formula yields
d(X_t Y_t) = d( f(X_t, Y_t) ) = ∂_x f dX_t + ∂_y f dY_t + (1/2) ∂²_x f (dX_t)^2 + (1/2) ∂²_y f (dY_t)^2 + ∂_x ∂_y f dX_t dY_t
= Y_t dX_t + X_t dY_t + dX_t dY_t.
Example 5.3.2 (The quotient rule) Let X_t and Y_t be two processes. Show that
d( X_t / Y_t ) = ( Y_t dX_t − X_t dY_t − dX_t dY_t ) / Y_t^2 + ( X_t / Y_t^3 ) (dY_t)^2.
Consider the function f(x, y) = x/y. Since
∂_x f = 1/y,  ∂_y f = −x/y^2,  ∂²_x f = 0,  ∂²_y f = 2x/y^3,  ∂_x ∂_y f = −1/y^2,
then applying Ito's multidimensional formula yields
d( X_t / Y_t ) = d( f(X_t, Y_t) ) = ∂_x f dX_t + ∂_y f dY_t + (1/2) ∂²_x f (dX_t)^2 + (1/2) ∂²_y f (dY_t)^2 + ∂_x ∂_y f dX_t dY_t
= (1/Y_t) dX_t − (X_t / Y_t^2) dY_t + (X_t / Y_t^3)(dY_t)^2 − (1/Y_t^2) dX_t dY_t
= ( Y_t dX_t − X_t dY_t − dX_t dY_t ) / Y_t^2 + ( X_t / Y_t^3 ) (dY_t)^2.
Chapter 6
Stochastic Integration Techniques
Computing a stochastic integral starting from the definition of the Ito integral is a quite inefficient method. As in elementary Calculus, several methods can be developed to compute stochastic integrals. In order to keep the analogy with elementary Calculus, we have called them the Fundamental Theorem of Stochastic Calculus and Integration by Parts. Integration by substitution is more complicated in the stochastic environment and we have considered only a particular case of it, which we called the method of the heat equation.
6.1 Fundamental Theorem of Stochastic Calculus
Consider a process X_t whose increments satisfy the equation dX_t = f(t, W_t)dW_t. Integrating formally between a and t yields
∫_a^t dX_s = ∫_a^t f(s, W_s) dW_s.   (6.1.1)
The integral on the left side can be computed as follows. If we consider the partition a = t_0 < t_1 < ... < t_{n−1} < t_n = t, then
∫_a^t dX_s = ms-lim_{n→∞} Σ_{j=0}^{n−1} ( X_{t_{j+1}} − X_{t_j} ) = X_t − X_a,
since the terms cancel in pairs. Substituting into formula (6.1.1) yields X_t = X_a + ∫_a^t f(s, W_s) dW_s, and hence dX_t = d( ∫_a^t f(s, W_s) dW_s ), since X_a is a constant.
Theorem 6.1.1 (The Fundamental Theorem of Stochastic Calculus)
(i) For any a < t, we have
d( ∫_a^t f(s, W_s) dW_s ) = f(t, W_t) dW_t.
(ii) If Y_t is a stochastic process such that Y_t dW_t = dF_t, then
∫_a^b Y_t dW_t = F_b − F_a.
We shall provide a few applications of the aforementioned theorem.
Example 6.1.1 Verify the stochastic formula
∫_0^t W_s dW_s = W_t^2 / 2 − t/2.
Let X_t = ∫_0^t W_s dW_s and Y_t = W_t^2 / 2 − t/2. From Ito's formula,
dY_t = d( W_t^2 / 2 ) − d( t/2 ) = (1/2)(2W_t dW_t + dt) − (1/2)dt = W_t dW_t,
and from the Fundamental Theorem of Stochastic Calculus
dX_t = d( ∫_0^t W_s dW_s ) = W_t dW_t.
Hence dX_t = dY_t, or d(X_t − Y_t) = 0. Since the process X_t − Y_t has zero increments, X_t − Y_t = c, a constant. Taking t = 0 yields
c = X_0 − Y_0 = ∫_0^0 W_s dW_s − ( W_0^2/2 − 0/2 ) = 0,
and hence c = 0. It follows that X_t = Y_t, which verifies the desired relation.
Example 6.1.2 Verify the formula
∫_0^t sW_s dW_s = (t/2)( W_t^2 − t/2 ) − (1/2) ∫_0^t W_s^2 ds.
Consider the stochastic processes X_t = ∫_0^t sW_s dW_s, Y_t = (t/2)( W_t^2 − t/2 ), and Z_t = (1/2) ∫_0^t W_s^2 ds. The Fundamental Theorem yields
dX_t = tW_t dW_t,  dZ_t = (1/2) W_t^2 dt.
Applying Ito's formula, we get
dY_t = d( (t/2)( W_t^2 − t/2 ) ) = (1/2) d(tW_t^2) − d( t^2/4 )
= (1/2)( (t + W_t^2)dt + 2tW_t dW_t ) − (1/2) t dt
= (1/2) W_t^2 dt + tW_t dW_t.
We can easily see that
dX_t = dY_t − dZ_t.
This implies d(X_t − Y_t + Z_t) = 0, i.e. X_t − Y_t + Z_t = c, a constant. Since X_0 = Y_0 = Z_0 = 0, it follows that c = 0. This proves the desired relation.
Example 6.1.3 Show that
∫_0^t ( W_s^2 − s ) dW_s = (1/3) W_t^3 − tW_t.
Consider the function f(t, x) = (1/3)x^3 − tx, and let F_t = f(t, W_t). Since ∂_t f = −x, ∂_x f = x^2 − t, and ∂²_x f = 2x, Ito's formula provides
dF_t = ∂_t f dt + ∂_x f dW_t + (1/2) ∂²_x f (dW_t)^2
= −W_t dt + ( W_t^2 − t ) dW_t + (1/2) 2W_t dt
= ( W_t^2 − t ) dW_t.
From the Fundamental Theorem we get
∫_0^t ( W_s^2 − s ) dW_s = ∫_0^t dF_s = F_t − F_0 = F_t = (1/3) W_t^3 − tW_t.
6.2 Stochastic Integration by Parts
Consider the process F_t = f(t)g(W_t), with f and g differentiable. Using the product rule yields
dF_t = df(t) g(W_t) + f(t) dg(W_t)
(the cross term df(t) dg(W_t) vanishes since dt dW_t = dt^2 = 0), so
dF_t = f'(t)g(W_t)dt + f(t)( g'(W_t)dW_t + (1/2) g''(W_t)dt )
= f'(t)g(W_t)dt + (1/2) f(t)g''(W_t)dt + f(t)g'(W_t)dW_t.
Writing the relation in integral form, we obtain the first integration by parts formula:
∫_a^b f(t)g'(W_t) dW_t = f(t)g(W_t) |_a^b − ∫_a^b f'(t)g(W_t) dt − (1/2) ∫_a^b f(t)g''(W_t) dt.
This formula is to be used when integrating a product between a function of t and a function of the Brownian motion W_t for which an antiderivative is known. The following two particular cases are important and useful in applications.
1. If g(W_t) = W_t, the aforementioned formula takes the simple form
∫_a^b f(t) dW_t = f(t)W_t |_{t=a}^{t=b} − ∫_a^b f'(t)W_t dt.   (6.2.2)
It is worth noting that the left side is a Wiener integral.
2. If f(t) = 1, then the formula becomes
∫_a^b g'(W_t) dW_t = g(W_t) |_{t=a}^{t=b} − (1/2) ∫_a^b g''(W_t) dt.   (6.2.3)
Application 1 Consider the Wiener integral I_T = ∫_0^T t dW_t. From the general theory, see Proposition 4.6.1, it is known that I_T is a random variable normally distributed with mean 0 and variance
Var[I_T] = ∫_0^T t^2 dt = T^3 / 3.
Recall the definition of the integrated Brownian motion
Z_t = ∫_0^t W_u du.
Formula (6.2.2) yields a relationship between I_T and the integrated Brownian motion
I_T = ∫_0^T t dW_t = TW_T − ∫_0^T W_t dt = TW_T − Z_T,
and hence I_T + Z_T = TW_T. This relation can be used to compute the covariance between I_T and Z_T:
Cov(I_T + Z_T, I_T + Z_T) = Var[TW_T]
Var[I_T] + Var[Z_T] + 2 Cov(I_T, Z_T) = T^2 Var[W_T]
T^3/3 + T^3/3 + 2 Cov(I_T, Z_T) = T^3
Cov(I_T, Z_T) = T^3 / 6,
where we used that Var[Z_T] = T^3/3. The processes I_t and Z_t are not independent. Their correlation coefficient is 0.5, as the following calculation shows:
Corr(I_T, Z_T) = Cov(I_T, Z_T) / ( Var[I_T] Var[Z_T] )^{1/2} = (T^3/6) / (T^3/3) = 1/2.
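The correlation can be estimated by Monte Carlo simulation. The sketch below (an illustration assuming NumPy; sample sizes and seed are arbitrary) approximates I_T by an Ito sum and Z_T by a Riemann sum over many independent paths:

    import numpy as np

    rng = np.random.default_rng(3)
    T, n, paths = 1.0, 2_000, 20_000
    dt = T / n
    t = np.arange(n) * dt                      # left endpoints t_j

    dW = rng.normal(0.0, np.sqrt(dt), (paths, n))
    W_left = np.concatenate((np.zeros((paths, 1)),
                             np.cumsum(dW, axis=1)[:, :-1]), axis=1)

    I = (t * dW).sum(axis=1)                   # int_0^T t dW_t
    Z = (W_left * dt).sum(axis=1)              # int_0^T W_t dt
    print(np.corrcoef(I, Z)[0, 1])             # close to 0.5

Note that the theoretical correlation 1/2 is independent of T, since both the covariance and the two variances scale like T^3.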
Application 2 If we let g(x) = x^2/2 in formula (6.2.3), we get
∫_a^b W_t dW_t = ( W_b^2 − W_a^2 ) / 2 − (1/2)(b − a).
It is worth noting that letting a = 0 and b = T, we retrieve a formula that was proved by direct methods in chapter 2:
∫_0^T W_t dW_t = W_T^2 / 2 − T/2.
Similarly, letting g(x) = x^3/3 in (6.2.3) yields
∫_a^b W_t^2 dW_t = W_t^3 / 3 |_a^b − ∫_a^b W_t dt.
Application 3
Choosing f(t) = e^{λt} and g(x) = sin x, we shall compute the stochastic integral ∫_0^T e^{λt} cos W_t dW_t using the formula of integration by parts:
∫_0^T e^{λt} cos W_t dW_t = ∫_0^T e^{λt} (sin W_t)' dW_t
= e^{λt} sin W_t |_0^T − ∫_0^T (e^{λt})' sin W_t dt − (1/2) ∫_0^T e^{λt} (cos W_t)' dt
= e^{λT} sin W_T − λ ∫_0^T e^{λt} sin W_t dt + (1/2) ∫_0^T e^{λt} sin W_t dt
= e^{λT} sin W_T − ( λ − 1/2 ) ∫_0^T e^{λt} sin W_t dt.
The particular case λ = 1/2 leads to the following exact formula for a stochastic integral:
∫_0^T e^{t/2} cos W_t dW_t = e^{T/2} sin W_T.   (6.2.4)
In a similar way, we can obtain an exact formula for the stochastic integral ∫_0^T e^{λt} sin W_t dW_t as follows:
∫_0^T e^{λt} sin W_t dW_t = −∫_0^T e^{λt} (cos W_t)' dW_t
= −e^{λt} cos W_t |_0^T + λ ∫_0^T e^{λt} cos W_t dt − (1/2) ∫_0^T e^{λt} cos W_t dt.
Taking λ = 1/2 yields the closed form formula
∫_0^T e^{t/2} sin W_t dW_t = 1 − e^{T/2} cos W_T.   (6.2.5)
A consequence of the last two formulas and of Euler's formula
e^{iW_t} = cos W_t + i sin W_t
is
∫_0^T e^{t/2 + iW_t} dW_t = i( 1 − e^{T/2 + iW_T} ).
The proof details are left to the reader.
A general form of the integration by parts formula
In general, if X_t and Y_t are two Ito diffusions, the product formula gives
d(X_t Y_t) = X_t dY_t + Y_t dX_t + dX_t dY_t.
Integrating between the limits a and b,
∫_a^b d(X_t Y_t) = ∫_a^b X_t dY_t + ∫_a^b Y_t dX_t + ∫_a^b dX_t dY_t.
From the Fundamental Theorem,
∫_a^b d(X_t Y_t) = X_b Y_b − X_a Y_a,
so the previous formula takes the following form of integration by parts:
∫_a^b X_t dY_t = X_b Y_b − X_a Y_a − ∫_a^b Y_t dX_t − ∫_a^b dX_t dY_t.
This formula is of theoretical value. In practice, the term dX_t dY_t needs to be computed using the rules (dW_t)^2 = dt and dt dW_t = 0.
Exercise 6.2.1 (a) Use integration by parts to get
∫_0^T 1/(1 + W_t^2) dW_t = tan^{−1}(W_T) + ∫_0^T W_t / (1 + W_t^2)^2 dt,  T > 0.
(b) Show that
E[tan^{−1}(W_T)] = ∫_0^T E[ W_t / (1 + W_t^2)^2 ] dt.
(c) Prove the double inequality
−3√3/16 ≤ x / (1 + x^2)^2 ≤ 3√3/16,  x ∈ R.
(d) Use part (c) to obtain
−(3√3/16) T ≤ ∫_0^T W_t / (1 + W_t^2)^2 dt ≤ (3√3/16) T.
(e) Use part (d) to get
−(3√3/16) T ≤ E[tan^{−1}(W_T)] ≤ (3√3/16) T.
(f) Does part (e) contradict the inequality −π/2 < tan^{−1}(W_T) < π/2?
Exercise 6.2.2 (a) Show the relation
∫_0^T e^{W_t} dW_t = e^{W_T} − 1 − (1/2) ∫_0^T e^{W_t} dt.
(b) Use part (a) to find E[e^{W_t}].
Exercise 6.2.3 (a) Use integration by parts to show
∫_0^T W_t e^{W_t} dW_t = 1 + W_T e^{W_T} − e^{W_T} − (1/2) ∫_0^T e^{W_t}(1 + W_t) dt;
(b) Use part (a) to find E[W_t e^{W_t}];
(c) Show that Cov(W_t, e^{W_t}) = t e^{t/2};
(d) Prove that Corr(W_t, e^{W_t}) = √( t / (e^t − 1) ), and compute the limits as t → 0 and t → ∞.
Exercise 6.2.4 (a) Let T > 0. Show the following relation using integration by parts:
∫_0^T 2W_t / (1 + W_t^2) dW_t = ln(1 + W_T^2) − ∫_0^T (1 − W_t^2) / (1 + W_t^2)^2 dt.
(b) Show that for any real number x the following double inequality holds:
−1/8 ≤ (1 − x^2) / (1 + x^2)^2 ≤ 1.
(c) Use part (b) to show that
−T/8 ≤ ∫_0^T (1 − W_t^2) / (1 + W_t^2)^2 dt ≤ T.
(d) Use parts (a) and (c) to get
−T/8 ≤ E[ln(1 + W_T^2)] ≤ T.
(e) Use Jensen's inequality to get
E[ln(1 + W_T^2)] ≤ ln(1 + T).
Does this contradict the upper bound provided in (d)?
6.3 The Heat Equation Method
In elementary Calculus, integration by substitution is the inverse application of the chain rule. In the stochastic environment, this will be the inverse application of Ito's formula. This is difficult to apply in general, but there is a particular case of great importance.
Let φ(t, x) be a solution of the equation
∂_t φ + (1/2) ∂²_x φ = 0.   (6.3.6)
This is called the heat equation without sources. The non-homogeneous equation
∂_t φ + (1/2) ∂²_x φ = G(t, x)   (6.3.7)
is called the heat equation with sources. The function G(t, x) represents the density of heat sources, while the function φ(t, x) is the temperature at the point x at time t in a one-dimensional wire. If the heat source is time independent, then G = G(x), i.e. G is a function of x only.
Example 6.3.1 Find all solutions of the equation (6.3.6) of type φ(t, x) = a(t) + b(x).
Substituting into equation (6.3.6) yields
(1/2) b''(x) = −a'(t).
Since the left side is a function of x only, while the right side is a function of the variable t, the only case in which the previous equation is satisfied is when both sides are equal to the same constant C. This is called a separation constant. Therefore a(t) and b(x) satisfy the equations
a'(t) = −C,  (1/2) b''(x) = C.
Integrating yields a(t) = −Ct + C_0 and b(x) = Cx^2 + C_1 x + C_2. It follows that
φ(t, x) = C(x^2 − t) + C_1 x + C_3,
with C_0, C_1, C_2, C_3 arbitrary constants.
Example 6.3.2 Find all solutions of the equation (6.3.6) of the type φ(t, x) = a(t)b(x).
Substituting into the equation and dividing by a(t)b(x) yields
a'(t)/a(t) + (1/2) b''(x)/b(x) = 0.
There is a separation constant C such that a'(t)/a(t) = −C and b''(x)/b(x) = 2C. There are three distinct cases to discuss:
1. C = 0. In this case a(t) = a_0 and b(x) = b_1 x + b_0, with a_0, b_0, b_1 real constants. Then
φ(t, x) = a(t)b(x) = c_1 x + c_0,  c_0, c_1 ∈ R,
is just a linear function in x.
2. C > 0. Let λ > 0 be such that 2C = λ^2. Then a'(t) = −(λ^2/2) a(t) and b''(x) = λ^2 b(x), with solutions
a(t) = a_0 e^{−λ^2 t/2},  b(x) = c_1 e^{λx} + c_2 e^{−λx}.
The corresponding solution of (6.3.6) is
φ(t, x) = e^{−λ^2 t/2} ( c_1 e^{λx} + c_2 e^{−λx} ),  c_1, c_2 ∈ R.
3. C < 0. Let λ > 0 be such that 2C = −λ^2. Then a'(t) = (λ^2/2) a(t) and b''(x) = −λ^2 b(x). Solving yields
a(t) = a_0 e^{λ^2 t/2},  b(x) = c_1 sin(λx) + c_2 cos(λx).
The corresponding solution of (6.3.6) in this case is
φ(t, x) = e^{λ^2 t/2} ( c_1 sin(λx) + c_2 cos(λx) ),  c_1, c_2 ∈ R.
In particular, the functions x, x^2 − t, e^{x − t/2}, e^{−x − t/2}, e^{t/2} sin x and e^{t/2} cos x, or any linear combination of them, are solutions of the heat equation (6.3.6). However, there are other solutions which are not of the previous type.
Exercise 6.3.1 Prove that φ(t, x) = (1/3)x^3 − tx is a solution of the heat equation (6.3.6).
Exercise 6.3.2 Show that φ(t, x) = t^{−1/2} e^{−x^2/(2t)} is a solution of the heat equation (6.3.6) for t > 0.
Exercise 6.3.3 Let φ = u(ξ), with ξ = x/(2√t), t > 0. Show that φ satisfies the heat equation (6.3.6) if and only if u'' + 2ξu' = 0.
Exercise 6.3.4 Let erfc(x) = (2/√π) ∫_x^∞ e^{−r^2} dr. Show that φ = erfc( x/(2√t) ) is a solution of the equation (6.3.6).
Exercise 6.3.5 (the fundamental solution) Show that φ(t, x) = (1/√(4πt)) e^{−x^2/(4t)}, t > 0, satisfies the equation (6.3.6).
Sometimes it is useful to generate new solutions of the heat equation from other solutions. Below we present a few ways to accomplish this:
(i) by linear combination: if φ_1 and φ_2 are solutions, then a_1 φ_1 + a_2 φ_2 is a solution, with a_1, a_2 constants;
(ii) by translation: if φ(t, x) is a solution, then φ(t − τ, x − ξ) is a solution, where (τ, ξ) is a translation vector;
(iii) by affine transforms: if φ(t, x) is a solution, then φ(λ^2 t, λx) is a solution, for any constant λ;
(iv) by differentiation: if φ(t, x) is a solution, then ∂^{n+m} φ / (∂x^n ∂t^m) (t, x) is a solution;
(v) by convolution: if φ(t, x) is a solution, then so are
∫_a^b φ(t, x − ξ) f(ξ) dξ  and  ∫_a^b φ(t − τ, x) g(τ) dτ.
For more detail on the subject the reader can consult Widder [17] and Cannon [4].
Theorem 6.3.6 Let φ(t, x) be a solution of the heat equation (6.3.6) and denote f(t, x) = ∂_x φ(t, x). Then
∫_a^b f(t, W_t) dW_t = φ(b, W_b) − φ(a, W_a).
Proof: Let F_t = φ(t, W_t). Applying Ito's formula we get
dF_t = ∂_x φ(t, W_t) dW_t + ( ∂_t φ + (1/2) ∂²_x φ ) dt.
Since ∂_t φ + (1/2) ∂²_x φ = 0 and ∂_x φ(t, W_t) = f(t, W_t), we have
dF_t = f(t, W_t) dW_t.
Applying the Fundamental Theorem yields
∫_a^b f(t, W_t) dW_t = ∫_a^b dF_t = F_b − F_a = φ(b, W_b) − φ(a, W_a).
Application 6.3.7 Show that
∫_0^T W_t dW_t = (1/2) W_T^2 − (1/2) T.
Choose the solution of the heat equation (6.3.6) given by φ(t, x) = x^2 − t. Then f(t, x) = ∂_x φ(t, x) = 2x. Theorem 6.3.6 yields
∫_0^T 2W_t dW_t = ∫_0^T f(t, W_t) dW_t = φ(t, W_t) |_0^T = W_T^2 − T.
Dividing by 2 leads to the desired result.
Application 6.3.8 Show that
∫_0^T ( W_t^2 − t ) dW_t = (1/3) W_T^3 − TW_T.
Consider the function φ(t, x) = (1/3)x^3 − tx, which is a solution of the heat equation (6.3.6), see Exercise 6.3.1. Then f(t, x) = ∂_x φ(t, x) = x^2 − t. Applying Theorem 6.3.6 yields
∫_0^T ( W_t^2 − t ) dW_t = ∫_0^T f(t, W_t) dW_t = φ(t, W_t) |_0^T = (1/3) W_T^3 − TW_T.
Application 6.3.9 Let λ > 0. Prove the identity
∫_0^T e^{−λ^2 t/2 + λW_t} dW_t = (1/λ) ( e^{−λ^2 T/2 + λW_T} − 1 ).
Consider the function φ(t, x) = e^{−λ^2 t/2 + λx}, which is a solution of the homogeneous heat equation (6.3.6), see Example 6.3.2. Then f(t, x) = ∂_x φ(t, x) = λ e^{−λ^2 t/2 + λx}. Apply Theorem 6.3.6 to get
∫_0^T λ e^{−λ^2 t/2 + λW_t} dW_t = ∫_0^T f(t, W_t) dW_t = φ(t, W_t) |_0^T = e^{−λ^2 T/2 + λW_T} − 1.
Dividing by the constant λ ends the proof.
In particular, for λ = 1 the aforementioned formula becomes
∫_0^T e^{−t/2 + W_t} dW_t = e^{−T/2 + W_T} − 1.   (6.3.8)
Application 6.3.10 Let λ > 0. Prove the identity
∫_0^T e^{λ^2 t/2} cos(λW_t) dW_t = (1/λ) e^{λ^2 T/2} sin(λW_T).
From Example 6.3.2 we know that φ(t, x) = e^{λ^2 t/2} sin(λx) is a solution of the heat equation. Applying Theorem 6.3.6 to the function f(t, x) = ∂_x φ(t, x) = λ e^{λ^2 t/2} cos(λx) yields
∫_0^T λ e^{λ^2 t/2} cos(λW_t) dW_t = ∫_0^T f(t, W_t) dW_t = φ(t, W_t) |_0^T = e^{λ^2 t/2} sin(λW_t) |_0^T = e^{λ^2 T/2} sin(λW_T).
Divide by λ to end the proof.
If we choose λ = 1 we recover a result already familiar to the reader from section 6.2:
∫_0^T e^{t/2} cos W_t dW_t = e^{T/2} sin W_T.   (6.3.9)
Application 6.3.11 Let λ > 0. Show that
∫_0^T e^{λ^2 t/2} sin(λW_t) dW_t = (1/λ) ( 1 − e^{λ^2 T/2} cos(λW_T) ).
Choose φ(t, x) = e^{λ^2 t/2} cos(λx), a solution of the heat equation. Apply Theorem 6.3.6 to the function f(t, x) = ∂_x φ(t, x) = −λ e^{λ^2 t/2} sin(λx) to get
∫_0^T (−λ) e^{λ^2 t/2} sin(λW_t) dW_t = φ(t, W_t) |_0^T = e^{λ^2 t/2} cos(λW_t) |_0^T = e^{λ^2 T/2} cos(λW_T) − 1,
and then divide by −λ.
Application 6.3.12 Let 0 < a < b. Show that
∫_a^b t^{−3/2} W_t e^{−W_t^2/(2t)} dW_t = a^{−1/2} e^{−W_a^2/(2a)} − b^{−1/2} e^{−W_b^2/(2b)}.   (6.3.10)
From Exercise 6.3.2 we have that φ(t, x) = t^{−1/2} e^{−x^2/(2t)} is a solution of the homogeneous heat equation. Since f(t, x) = ∂_x φ(t, x) = −t^{−3/2} x e^{−x^2/(2t)}, applying Theorem 6.3.6 yields the desired result. The reader can easily fill in the details.
Integration techniques will be used when solving stochastic differential equations in the next chapter.
Exercise 6.3.13 Find the value of the following stochastic integrals
(a) ∫_0^1 e^t cos( √2 W_t ) dW_t
(b) ∫_0^3 e^{2t} cos( 2W_t ) dW_t
(c) ∫_0^4 e^{−t + √2 W_t} dW_t.
Exercise 6.3.14 Let φ(t, x) be a solution of the following non-homogeneous heat equation with time-dependent and uniform heat source G(t):
∂_t φ + (1/2) ∂²_x φ = G(t).
Denote f(t, x) = ∂_x φ(t, x). Show that
∫_a^b f(t, W_t) dW_t = φ(b, W_b) − φ(a, W_a) − ∫_a^b G(t) dt.
How does the formula change if the heat source G is constant?
6.4 Table of Usual Stochastic Integrals
We now present a user-friendly table, which lists identities that are among the most important of our standard techniques. This table is far too complicated to be memorized in full. However, the first couple of identities in this table are the most memorable, and should be remembered.
Let a < b and 0 < T. Then we have:
1. ∫_a^b dW_t = W_b − W_a;
2. ∫_0^T W_t dW_t = W_T^2 / 2 − T/2;
3. ∫_0^T ( W_t^2 − t ) dW_t = W_T^3 / 3 − TW_T;
4. ∫_0^T t dW_t = TW_T − ∫_0^T W_t dt;
5. ∫_0^T W_t^2 dW_t = W_T^3 / 3 − ∫_0^T W_t dt;
6. ∫_0^T e^{t/2} cos W_t dW_t = e^{T/2} sin W_T;
7. ∫_0^T e^{t/2} sin W_t dW_t = 1 − e^{T/2} cos W_T;
8. ∫_0^T e^{−t/2 + W_t} dW_t = e^{−T/2 + W_T} − 1;
9. ∫_0^T e^{λ^2 t/2} cos(λW_t) dW_t = (1/λ) e^{λ^2 T/2} sin(λW_T);
10. ∫_0^T e^{λ^2 t/2} sin(λW_t) dW_t = (1/λ) ( 1 − e^{λ^2 T/2} cos(λW_T) );
11. ∫_0^T e^{−λ^2 t/2 + λW_t} dW_t = (1/λ) ( e^{−λ^2 T/2 + λW_T} − 1 );
12. ∫_a^b t^{−3/2} W_t e^{−W_t^2/(2t)} dW_t = a^{−1/2} e^{−W_a^2/(2a)} − b^{−1/2} e^{−W_b^2/(2b)};
13. d( ∫_a^t f(s, W_s) dW_s ) = f(t, W_t) dW_t;
14. ∫_a^b Y_t dW_t = F_b − F_a, if Y_t dW_t = dF_t;
15. ∫_a^b f(t) dW_t = f(t)W_t |_a^b − ∫_a^b f'(t) W_t dt;
16. ∫_a^b g'(W_t) dW_t = g(W_t) |_a^b − (1/2) ∫_a^b g''(W_t) dt.
Chapter 7
Stochastic Differential Equations
7.1 Definitions and Examples
Let X_t be a continuous stochastic process. If small changes in the process X_t can be written as a linear combination of small changes in t and small increments of the Brownian motion W_t, we may write
dX_t = a(t, W_t, X_t) dt + b(t, W_t, X_t) dW_t   (7.1.1)
and call it a stochastic differential equation. In fact, this differential relation has the following integral meaning:
X_t = X_0 + ∫_0^t a(s, W_s, X_s) ds + ∫_0^t b(s, W_s, X_s) dW_s,   (7.1.2)
where the last integral is taken in the Ito sense. Relation (7.1.2) is taken as the definition of the stochastic differential equation (7.1.1). However, since it is convenient to use stochastic differentials informally, we shall approach stochastic differential equations by analogy with ordinary differential equations, and try to present the same methods of solving equations in the new stochastic environment.
The functions a(t, W_t, X_t) and b(t, W_t, X_t) are called the drift rate and the volatility, respectively. A process X_t is called a (strong) solution of the stochastic equation (7.1.1) if it satisfies the equation. We shall start with an example.
Example 7.1.1 (The Brownian bridge) Let a, b ∈ R. Show that the process
X_t = a(1 − t) + bt + (1 − t) ∫_0^t 1/(1 − s) dW_s,  0 ≤ t < 1,
is a solution of the stochastic differential equation
dX_t = ( (b − X_t)/(1 − t) ) dt + dW_t,  0 ≤ t < 1,  X_0 = a.
We shall perform a routine verification to show that X_t is a solution. First we compute the quotient (b − X_t)/(1 − t):
b − X_t = b − a(1 − t) − bt − (1 − t) ∫_0^t 1/(1 − s) dW_s
= (b − a)(1 − t) − (1 − t) ∫_0^t 1/(1 − s) dW_s,
and dividing by 1 − t yields
(b − X_t)/(1 − t) = b − a − ∫_0^t 1/(1 − s) dW_s.   (7.1.3)
Using
d( ∫_0^t 1/(1 − s) dW_s ) = ( 1/(1 − t) ) dW_t,
the product rule yields
dX_t = a d(1 − t) + b dt + d(1 − t) ∫_0^t 1/(1 − s) dW_s + (1 − t) d( ∫_0^t 1/(1 − s) dW_s )
= ( b − a − ∫_0^t 1/(1 − s) dW_s ) dt + dW_t
= ( (b − X_t)/(1 − t) ) dt + dW_t,
where the last identity comes from (7.1.3). We have just verified that the process X_t is a solution of the given stochastic equation. The question of how this solution was obtained in the first place is the subject of study for the next few sections.
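The pulling effect of the drift (b − X_t)/(1 − t) can be seen in a simulation. The following sketch (an illustration assuming NumPy; endpoints, step count and seed are arbitrary) applies the Euler-Maruyama scheme X_{j+1} = X_j + a(t_j, X_j)Δt + ΔW_j to the Brownian bridge equation and observes that the path ends near b:

    import numpy as np

    rng = np.random.default_rng(5)
    a, b, n = 1.0, 2.0, 100_000
    t = np.linspace(0.0, 1.0, n + 1)
    dt = t[1] - t[0]

    X = np.empty(n + 1)
    X[0] = a
    for j in range(n):                         # Euler-Maruyama up to just before t = 1
        drift = (b - X[j]) / (1.0 - t[j])
        X[j + 1] = X[j] + drift * dt + rng.normal(0.0, np.sqrt(dt))
    print(X[-1])   # close to b = 2; the drift (b - X)/(1 - t) forces the endpoint

Note that the drift blows up as t → 1; the last Euler step uses 1 − t_j = Δt, which pins the terminal value to b up to noise of size O(√Δt).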
7.2 Finding Mean and Variance from the Equation
For most practical purposes, the most important information one needs to know about a process is its mean and variance. These can be found directly from the stochastic equation in some particular cases, without solving the equation explicitly. We shall deal with this problem in the present section.
Taking the expectation in (7.1.2) and using the property of the Ito integral as a zero mean random variable yields
E[X_t] = X_0 + ∫_0^t E[a(s, W_s, X_s)] ds.   (7.2.4)
Applying the Fundamental Theorem of Calculus we obtain
d/dt E[X_t] = E[a(t, W_t, X_t)].
We note that X_t is not differentiable, but its expectation E[X_t] is. This equation can be solved exactly in a few particular cases.
1. If a(t, W_t, X_t) = a(t), then d/dt E[X_t] = a(t), with the exact solution E[X_t] = X_0 + ∫_0^t a(s) ds.
2. If a(t, W_t, X_t) = α(t)X_t + β(t), with α(t) and β(t) continuous deterministic functions, then
d/dt E[X_t] = α(t) E[X_t] + β(t),
which is a linear differential equation in E[X_t]. Its solution is given by
E[X_t] = e^{A(t)} ( X_0 + ∫_0^t e^{−A(s)} β(s) ds ),   (7.2.5)
where A(t) = ∫_0^t α(s) ds. It is worth noting that the expectation E[X_t] does not depend on the volatility term b(t, W_t, X_t).
Exercise 7.2.1 If dX_t = (2X_t + e^{2t})dt + b(t, W_t, X_t)dW_t, then
E[X_t] = e^{2t} ( X_0 + t ).
Proposition 7.2.2 Let X_t be a process satisfying the stochastic equation
dX_t = α(t)X_t dt + b(t)dW_t.
Then the mean and variance of X_t are given by
E[X_t] = e^{A(t)} X_0
Var[X_t] = e^{2A(t)} ∫_0^t e^{−2A(s)} b^2(s) ds,
where A(t) = ∫_0^t α(s) ds.
Proof: The expression of E[X_t] follows directly from formula (7.2.5) with β = 0. In order to compute the second moment we first compute
(dX_t)^2 = b^2(t) dt;
d(X_t^2) = 2X_t dX_t + (dX_t)^2 = 2X_t ( α(t)X_t dt + b(t)dW_t ) + b^2(t)dt
= ( 2α(t)X_t^2 + b^2(t) ) dt + 2b(t)X_t dW_t,
where we used Ito's formula. If we let Y_t = X_t^2, the previous equation becomes
dY_t = ( 2α(t)Y_t + b^2(t) ) dt + 2b(t) √(Y_t) dW_t.
Applying formula (7.2.5) with α(t) replaced by 2α(t) and β(t) by b^2(t) yields
E[Y_t] = e^{2A(t)} ( Y_0 + ∫_0^t e^{−2A(s)} b^2(s) ds ),
which is equivalent to
E[X_t^2] = e^{2A(t)} ( X_0^2 + ∫_0^t e^{−2A(s)} b^2(s) ds ).
It follows that the variance is
Var[X_t] = E[X_t^2] − (E[X_t])^2 = e^{2A(t)} ∫_0^t e^{−2A(s)} b^2(s) ds.
Remark 7.2.3 We note that the previous equation is of linear type. This shall be solved explicitly in a future section.
The mean and variance of a given stochastic process can be computed by working out the associated stochastic equation. We shall provide a few examples next.
Example 7.2.1 Find the mean and variance of e^{kW_t}, with k constant.
From Ito's formula
d(e^{kW_t}) = k e^{kW_t} dW_t + (1/2) k^2 e^{kW_t} dt,
and integrating yields
e^{kW_t} = 1 + k ∫_0^t e^{kW_s} dW_s + (1/2) k^2 ∫_0^t e^{kW_s} ds.
Taking expectations we have
E[e^{kW_t}] = 1 + (1/2) k^2 ∫_0^t E[e^{kW_s}] ds.
If we let f(t) = E[e^{kW_t}], then differentiating the previous relation yields the differential equation
f'(t) = (1/2) k^2 f(t)
with the initial condition f(0) = E[e^{kW_0}] = 1. The solution is f(t) = e^{k^2 t/2}, and hence
E[e^{kW_t}] = e^{k^2 t/2}.
The variance is
Var(e^{kW_t}) = E[e^{2kW_t}] − (E[e^{kW_t}])^2 = e^{2k^2 t} − e^{k^2 t} = e^{k^2 t} ( e^{k^2 t} − 1 ).
Example 7.2.2 Find the mean of the process W_t e^{W_t}.
We shall set up a stochastic differential equation for W_t e^{W_t}. Using the product formula and Ito's formula yields
d(W_t e^{W_t}) = e^{W_t} dW_t + W_t d(e^{W_t}) + dW_t d(e^{W_t})
= e^{W_t} dW_t + (W_t + dW_t)( e^{W_t} dW_t + (1/2) e^{W_t} dt )
= ( (1/2) W_t e^{W_t} + e^{W_t} ) dt + ( e^{W_t} + W_t e^{W_t} ) dW_t.
Integrating and using that W_0 e^{W_0} = 0 yields
W_t e^{W_t} = ∫_0^t ( (1/2) W_s e^{W_s} + e^{W_s} ) ds + ∫_0^t ( e^{W_s} + W_s e^{W_s} ) dW_s.
Since the expectation of an Ito integral is zero, we have
E[W_t e^{W_t}] = ∫_0^t ( (1/2) E[W_s e^{W_s}] + E[e^{W_s}] ) ds.
Let f(t) = E[W_t e^{W_t}]. Using E[e^{W_s}] = e^{s/2}, the previous integral equation becomes
f(t) = ∫_0^t ( (1/2) f(s) + e^{s/2} ) ds.
Differentiating yields the following linear differential equation
f'(t) = (1/2) f(t) + e^{t/2}
with the initial condition f(0) = 0. Multiplying by e^{−t/2} yields the exact equation
( e^{−t/2} f(t) )' = 1.
The solution is f(t) = t e^{t/2}. Hence we have obtained
E[W_t e^{W_t}] = t e^{t/2}.
Exercise 7.2.4 Find (a) E[W_t^2 e^{W_t}]; (b) E[W_t e^{kW_t}].
Example 7.2.3 Show that for any integer k ≥ 0 we have
E[W_t^{2k}] = ( (2k)! / (2^k k!) ) t^k,  E[W_t^{2k+1}] = 0.
In particular, E[W_t^4] = 3t^2 and E[W_t^6] = 15t^3.
From Ito's formula we have
d(W_t^n) = n W_t^{n−1} dW_t + ( n(n−1)/2 ) W_t^{n−2} dt.
Integrating, we get
W_t^n = n ∫_0^t W_s^{n−1} dW_s + ( n(n−1)/2 ) ∫_0^t W_s^{n−2} ds.
Since the expectation of the first integral on the right side is zero, taking the expectation yields the following recursive relation:
E[W_t^n] = ( n(n−1)/2 ) ∫_0^t E[W_s^{n−2}] ds.
Using the initial values E[W_t] = 0 and E[W_t^2] = t, the method of mathematical induction implies that E[W_t^{2k+1}] = 0 and E[W_t^{2k}] = ( (2k)! / (2^k k!) ) t^k.
Exercise 7.2.5 (a) Is W_t^4 − 3t^2 an F_t-martingale?
(b) What about W_t^3?
Example 7.2.4 Find E[sin W_t].
From Ito's formula
d(sin W_t) = cos W_t dW_t − (1/2) sin W_t dt,
then integrating yields
sin W_t = ∫_0^t cos W_s dW_s − (1/2) ∫_0^t sin W_s ds.
Taking expectations we arrive at the integral equation
E[sin W_t] = −(1/2) ∫_0^t E[sin W_s] ds.
Let f(t) = E[sin W_t]. Differentiating yields the equation f'(t) = −(1/2) f(t) with f(0) = E[sin W_0] = 0. The unique solution is f(t) = 0. Hence
E[sin W_t] = 0.
Exercise 7.2.6 Let σ be a constant. Show that
(a) E[sin(σW_t)] = 0;
(b) E[cos(σW_t)] = e^{−σ^2 t/2};
(c) E[sin(t + σW_t)] = e^{−σ^2 t/2} sin t;
(d) E[cos(t + σW_t)] = e^{−σ^2 t/2} cos t.
Exercise 7.2.7 Use the previous exercise and the definition of expectation to show that
(a) ∫_{−∞}^{∞} e^{−x^2} cos x dx = π^{1/2} e^{−1/4};
(b) ∫_{−∞}^{∞} e^{−x^2/2} cos x dx = √(2π/e).
Exercise 7.2.8 Using expectations show that
(a) ∫_{−∞}^{∞} x e^{−ax^2 + bx} dx = √(π/a) ( b/(2a) ) e^{b^2/(4a)};
(b) ∫_{−∞}^{∞} x^2 e^{−ax^2 + bx} dx = √(π/a) ( 1/(2a) ) ( 1 + b^2/(2a) ) e^{b^2/(4a)};
(c) Can you apply a similar method to find a closed form expression for the integral ∫_{−∞}^{∞} x^n e^{−ax^2 + bx} dx?
Exercise 7.2.9 Using the result given by Example 7.2.3 show that
(a) E[cos(tW_t)] = e^{−t^3/2};
(b) E[sin(tW_t)] = 0;
(c) E[e^{tW_t}] = e^{t^3/2}.
For general drift rates we cannot find the mean, but in the case of concave drift rates we can find an upper bound for the expectation E[X_t]. The following result will be useful.
Lemma 7.2.10 (Gronwall's inequality) Let f(t) be a non-negative function satisfying the inequality
f(t) ≤ C + M ∫_0^t f(s) ds
for 0 ≤ t ≤ T, with C, M constants. Then
f(t) ≤ C e^{Mt},  0 ≤ t ≤ T.
Proposition 7.2.11 Let X_t be a continuous stochastic process such that
dX_t = a(X_t)dt + b(t, W_t, X_t) dW_t,
with the function a(·) satisfying the following conditions:
1. a(x) ≥ 0, for 0 ≤ x ≤ T;
2. a''(x) < 0, for 0 ≤ x ≤ T;
3. a'(0) = M.
Then E[X_t] ≤ X_0 e^{Mt}, for 0 ≤ X_t ≤ T.
Proof: From the mean value theorem there is ξ ∈ (0, x) such that
a(x) = a(x) − a(0) = (x − 0) a'(ξ) ≤ x a'(0) = Mx,   (7.2.6)
where we used that a'(x) is a decreasing function. Applying Jensen's inequality for concave functions yields
E[a(X_t)] ≤ a(E[X_t]).
Combining with (7.2.6) we obtain E[a(X_t)] ≤ M E[X_t]. Substituting in the identity (7.2.4) implies
E[X_t] ≤ X_0 + M ∫_0^t E[X_s] ds.
Applying Gronwall's inequality we obtain E[X_t] ≤ X_0 e^{Mt}.
Exercise 7.2.12 State the previous result in the particular case when a(x) = sin x, with 0 ≤ x ≤ π.
Not in all cases can the mean and the variance be obtained directly from the stochastic equation. In those cases we need more powerful methods that produce closed form solutions. In the next sections we shall discuss several methods of solving stochastic differential equations.
7.3 The Integration Technique
We shall start with the simple case when both the drift and the volatility are just functions of time t.
Proposition 7.3.1 The solution X_t of the stochastic differential equation
dX_t = a(t)dt + b(t)dW_t
is Gaussian distributed with mean X_0 + ∫_0^t a(s) ds and variance ∫_0^t b^2(s) ds.
Proof: Integrating the equation yields
X_t − X_0 = ∫_0^t dX_s = ∫_0^t a(s) ds + ∫_0^t b(s) dW_s.
By the property of Wiener integrals, ∫_0^t b(s) dW_s is Gaussian distributed with mean 0 and variance ∫_0^t b^2(s) ds. Then X_t is Gaussian (as a sum of a predictable function and a Gaussian), with
E[X_t] = E[ X_0 + ∫_0^t a(s) ds + ∫_0^t b(s) dW_s ] = X_0 + ∫_0^t a(s) ds + E[ ∫_0^t b(s) dW_s ] = X_0 + ∫_0^t a(s) ds,
Var[X_t] = Var[ X_0 + ∫_0^t a(s) ds + ∫_0^t b(s) dW_s ] = Var[ ∫_0^t b(s) dW_s ] = ∫_0^t b^2(s) ds,
which ends the proof.
Exercise 7.3.2 Solve the following stochastic differential equations for t ≥ 0 and determine the mean and the variance of the solution:
(a) dX_t = cos t dt − sin t dW_t, X_0 = 1.
(b) dX_t = e^t dt + √t dW_t, X_0 = 0.
(c) dX_t = ( t/(1 + t^2) ) dt + t^{3/2} dW_t, X_0 = 1.
If the drift and the volatility depend on both variables t and W_t, the stochastic differential equation
dX_t = a(t, W_t)dt + b(t, W_t)dW_t,  t ≥ 0,
defines an Ito diffusion. Integrating yields the solution
X_t = X_0 + ∫_0^t a(s, W_s) ds + ∫_0^t b(s, W_s) dW_s.
There are several cases when both integrals can be computed explicitly.
Example 7.3.1 Find the solution of the stochastic differential equation
dX_t = dt + W_t dW_t,  X_0 = 1.
Integrating between 0 and t, we get
X_t = 1 + ∫_0^t ds + ∫_0^t W_s dW_s = 1 + t + W_t^2/2 − t/2 = 1 + (1/2)( W_t^2 + t ).
Example 7.3.2 Solve the stochastic differential equation
dX_t = (W_t − 1)dt + W_t^2 dW_t,  X_0 = 0.
Let Z_t = ∫_0^t W_s ds denote the integrated Brownian motion process. Integrating the equation between 0 and t, and using formula 5 from the table of Section 6.4, ∫_0^t W_s^2 dW_s = (1/3)W_t^3 − Z_t, yields
X_t = ∫_0^t (W_s − 1) ds + ∫_0^t W_s^2 dW_s = Z_t − t + (1/3) W_t^3 − Z_t = (1/3) W_t^3 − t.
Example 7.3.3 Solve the stochastic differential equation
dX_t = t^2 dt + e^{t/2} cos W_t dW_t,  X_0 = 0,
and find E[X_t] and Var(X_t).
Integrating yields
X_t = ∫_0^t s^2 ds + ∫_0^t e^{s/2} cos W_s dW_s = t^3/3 + e^{t/2} sin W_t,   (7.3.7)
where we used (6.3.9). Even if the process X_t is not Gaussian, we can still compute its mean and variance. By Ito's formula we have
d(sin W_t) = cos W_t dW_t − (1/2) sin W_t dt.
Integrating between 0 and t yields
sin W_t = ∫_0^t cos W_s dW_s − (1/2) ∫_0^t sin W_s ds,
where we used that sin W_0 = sin 0 = 0. Taking the expectation in the previous relation yields
E[sin W_t] = E[ ∫_0^t cos W_s dW_s ] − (1/2) ∫_0^t E[sin W_s] ds.
From the properties of the Ito integral, the first expectation on the right side is zero. Denoting φ(t) = E[sin W_t], we obtain the integral equation
φ(t) = −(1/2) ∫_0^t φ(s) ds.
Differentiating yields the differential equation
φ'(t) = −(1/2) φ(t)
with the solution φ(t) = k e^{−t/2}. Since k = φ(0) = E[sin W_0] = 0, it follows that φ(t) = 0. Hence
E[sin W_t] = 0.
Taking the expectation in (7.3.7) leads to
E[X_t] = E[ t^3/3 ] + e^{t/2} E[sin W_t] = t^3/3.
Since the variance of predictable functions is zero,
Var[X_t] = Var[ t^3/3 + e^{t/2} sin W_t ] = ( e^{t/2} )^2 Var[sin W_t]
= e^t E[sin^2 W_t] = ( e^t / 2 )( 1 − E[cos 2W_t] ).   (7.3.8)
In order to compute the last expectation we use Ito's formula,
d(cos 2W_t) = −2 sin 2W_t dW_t − 2 cos 2W_t dt,
and integrate to get
cos 2W_t = cos 2W_0 − 2 ∫_0^t sin 2W_s dW_s − 2 ∫_0^t cos 2W_s ds.
Taking the expectation and using that Ito integrals have zero expectation yields
E[cos 2W_t] = 1 − 2 ∫_0^t E[cos 2W_s] ds.
If we denote m(t) = E[cos 2W_t], the previous relation becomes an integral equation
m(t) = 1 − 2 ∫_0^t m(s) ds.
Differentiate and get
m'(t) = −2 m(t),
with the solution m(t) = k e^{−2t}. Since k = m(0) = E[cos 2W_0] = 1, we have m(t) = e^{−2t}. Substituting into (7.3.8) yields
Var[X_t] = ( e^t / 2 )( 1 − e^{−2t} ) = ( e^t − e^{−t} ) / 2 = sinh t.
In conclusion, the solution X_t has the mean and the variance given by
E[X_t] = t^3/3,  Var[X_t] = sinh t.
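Both moments can be estimated directly from the closed form (7.3.7), since W_t is normal with variance t. A minimal sketch (an illustration assuming NumPy; t, sample size and seed are arbitrary):

    import numpy as np

    rng = np.random.default_rng(8)
    t, samples = 1.0, 2_000_000
    W = rng.normal(0.0, np.sqrt(t), samples)
    X = t**3 / 3 + np.exp(t / 2) * np.sin(W)
    print(X.mean(), t**3 / 3)      # mean:     0.3333...
    print(X.var(), np.sinh(t))     # variance: 1.1752...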
Example 7.3.4 Solve the following stochastic differential equation
e^{t/2} dX_t = dt + e^{W_t} dW_t,  X_0 = 0,
and find the distribution of the solution X_t and its mean and variance.
Dividing by e^{t/2}, integrating between 0 and t, and using formula (6.3.8) yields
X_t = ∫_0^t e^{−s/2} ds + ∫_0^t e^{−s/2 + W_s} dW_s
= 2(1 − e^{−t/2}) + e^{−t/2} e^{W_t} − 1
= 1 + e^{−t/2} ( e^{W_t} − 2 ).
Since e^{W_t} is a geometric Brownian motion, using Proposition 2.2.2 yields
E[X_t] = E[ 1 + e^{−t/2}( e^{W_t} − 2 ) ] = 1 − 2e^{−t/2} + e^{−t/2} E[e^{W_t}] = 2 − 2e^{−t/2}.
Var(X_t) = Var[ 1 + e^{−t/2}( e^{W_t} − 2 ) ] = Var[ e^{−t/2} e^{W_t} ] = e^{−t} Var[e^{W_t}]
= e^{−t} ( e^{2t} − e^t ) = e^t − 1.
The process X_t has the following distribution:
F(y) = P(X_t ≤ y) = P( 1 + e^{−t/2}( e^{W_t} − 2 ) ≤ y )
= P( W_t ≤ ln( 2 + e^{t/2}(y − 1) ) )
= P( W_t/√t ≤ (1/√t) ln( 2 + e^{t/2}(y − 1) ) )
= N( (1/√t) ln( 2 + e^{t/2}(y − 1) ) ),
where N(u) = (1/√(2π)) ∫_{−∞}^u e^{−s^2/2} ds is the distribution function of a standard normally distributed random variable.
Example 7.3.5 Solve the stochastic differential equation
dX_t = dt + t^{−3/2} W_t e^{−W_t^2/(2t)} dW_t,  X_1 = 1.
Integrating between 1 and t and applying formula (6.3.10) yields
X_t = X_1 + ∫_1^t ds + ∫_1^t s^{−3/2} W_s e^{−W_s^2/(2s)} dW_s
= 1 + t − 1 + e^{−W_1^2/2} − t^{−1/2} e^{−W_t^2/(2t)}
= t + e^{−W_1^2/2} − t^{−1/2} e^{−W_t^2/(2t)},  t ≥ 1.
7.4 Exact Stochastic Equations
The stochastic differential equation
dX_t = a(t, W_t)dt + b(t, W_t)dW_t   (7.4.9)
is called exact if there is a differentiable function f(t, x) such that
a(t, x) = ∂_t f(t, x) + (1/2) ∂²_x f(t, x)   (7.4.10)
b(t, x) = ∂_x f(t, x).   (7.4.11)
Assume the equation is exact. Then substituting in (7.4.9) yields
dX_t = ( ∂_t f(t, W_t) + (1/2) ∂²_x f(t, W_t) ) dt + ∂_x f(t, W_t) dW_t.
Applying Ito's formula, the previous equation becomes
dX_t = d( f(t, W_t) ),
which implies X_t = f(t, W_t) + c, with c constant.
Solving the partial differential equations system (7.4.10)-(7.4.11) requires the following steps:
1. Integrate partially with respect to x in the second equation to obtain f(t, x) up to an additive function T(t);
2. Substitute into the first equation and determine the function T(t);
3. The solution is X_t = f(t, W_t) + c, with c determined from the initial condition on X_t.
Example 7.4.1 Solve the stochastic differential equation
dX_t = e^t (1 + W_t^2) dt + (1 + 2e^t W_t) dW_t,  X_0 = 0.
In this case a(t, x) = e^t(1 + x^2) and b(t, x) = 1 + 2e^t x. The associated system is
e^t (1 + x^2) = ∂_t f(t, x) + (1/2) ∂²_x f(t, x)
1 + 2e^t x = ∂_x f(t, x).
Integrating partially in x in the second equation yields
f(t, x) = ∫ (1 + 2e^t x) dx = x + e^t x^2 + T(t).
Then ∂_t f = e^t x^2 + T'(t) and ∂²_x f = 2e^t. Substituting in the first equation yields
e^t (1 + x^2) = e^t x^2 + T'(t) + e^t.
This implies T'(t) = 0, or T = c constant. Hence f(t, x) = x + e^t x^2 + c, and X_t = f(t, W_t) = W_t + e^t W_t^2 + c. Since X_0 = 0, it follows that c = 0. The solution is
X_t = W_t + e^t W_t^2.
Example 7.4.2 Find the solution of
dX_t = ( 2tW_t^3 + 3t^2 (1 + W_t) ) dt + ( 3t^2 W_t^2 + 1 ) dW_t,  X_0 = 0.
The coefficient functions are a(t, x) = 2tx^3 + 3t^2(1 + x) and b(t, x) = 3t^2 x^2 + 1. The associated system is given by
2tx^3 + 3t^2 (1 + x) = ∂_t f(t, x) + (1/2) ∂²_x f(t, x)
3t^2 x^2 + 1 = ∂_x f(t, x).
Integrating partially in the second equation yields
f(t, x) = ∫ (3t^2 x^2 + 1) dx = t^2 x^3 + x + T(t).
Then ∂_t f = 2tx^3 + T'(t) and ∂²_x f = 6t^2 x, and substituting into the first equation we get
2tx^3 + 3t^2 (1 + x) = 2tx^3 + T'(t) + (1/2) 6t^2 x.
After cancellations we get T'(t) = 3t^2, so T(t) = t^3 + c. Then
f(t, x) = t^2 x^3 + x + t^3 + c = t^2 (x^3 + t) + x + c.
The solution process is given by X_t = f(t, W_t) = t^2 (W_t^3 + t) + W_t + c. Using X_0 = 0 we get c = 0. Hence the solution is X_t = t^2 (W_t^3 + t) + W_t.
The next result deals with a condition regarding the exactness of the stochastic differential equation.
Theorem 7.4.1 If the stochastic differential equation (7.4.9) is exact, then the coefficient functions a(t, x) and b(t, x) satisfy the condition
∂_x a = ∂_t b + (1/2) ∂²_x b.   (7.4.12)
Proof: If the stochastic equation is exact, there is a function f(t, x) satisfying the system (7.4.10)-(7.4.11). Differentiating the first equation of the system with respect to x yields
∂_x a = ∂_t ∂_x f + (1/2) ∂²_x ∂_x f.
Substituting b = ∂_x f yields the desired relation.
Remark 7.4.2 The equation (7.4.12) has the meaning of a heat equation. The function b(t, x) represents the temperature measured at x at the instant t, while ∂_x a is the density of heat sources. The function a(t, x) can be regarded as the potential from which the density of heat sources is derived by taking the gradient in x.
It is worth noting that equation (7.4.12) is just a necessary condition for exactness. This means that if this condition is not satisfied, then the equation is not exact. In that case we need to try a different method to solve the equation.
Example 7.4.3 Is the stochastic differential equation
dX_t = (1 + W_t^2)dt + (t^4 + W_t^2)dW_t
exact?
Collecting the coefficients, we have a(t, x) = 1 + x^2 and b(t, x) = t^4 + x^2. Since ∂_x a = 2x, ∂_t b = 4t^3, and ∂²_x b = 2, the condition (7.4.12) is not satisfied, and hence the equation is not exact.
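The exactness test (7.4.12) is mechanical and lends itself to symbolic computation. The following sketch (an illustration assuming SymPy; the coefficients are those of Example 7.4.3) evaluates the defect ∂_x a − ∂_t b − (1/2)∂²_x b:

    import sympy as sp

    t, x = sp.symbols('t x')
    a = 1 + x**2          # drift coefficient of Example 7.4.3
    b = t**4 + x**2       # volatility coefficient

    lhs = sp.diff(a, x)                          # a_x
    rhs = sp.diff(b, t) + sp.diff(b, x, 2) / 2   # b_t + b_xx/2
    print(sp.simplify(lhs - rhs))   # 2*x - 4*t**3 - 1  !=  0, so not exact

Replacing a and b with the coefficients of Example 7.4.1 or 7.4.2 makes the printed defect identically zero, in agreement with the computations above.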
7.5 Integration by Inspection
When solving a stochastic differential equation by inspection we look for opportunities to apply the product or the quotient formulas:
d( f(t) Y_t ) = f(t) dY_t + Y_t df(t),
d( X_t / f(t) ) = ( f(t) dX_t − X_t df(t) ) / f(t)^2.
For instance, if a stochastic differential equation can be written as
dX_t = f'(t) W_t dt + f(t) dW_t,
the product rule brings the equation into the exact form
dX_t = d( f(t) W_t ),
which after integration leads to the solution
X_t = X_0 + f(t) W_t.
Example 7.5.1 Solve
dX_t = (t + W_t^2) dt + 2tW_t dW_t,  X_0 = a.
We can write the equation as
dX_t = W_t^2 dt + t (2W_t dW_t + dt),
which can be contracted to
dX_t = W_t^2 dt + t d(W_t^2).
Using the product rule we can bring it to the exact form
dX_t = d( tW_t^2 ),
with the solution X_t = tW_t^2 + a.
Example 7.5.2 Solve the stochastic differential equation
dX_t = (W_t + 3t^2) dt + t dW_t.
If we rewrite the equation as
dX_t = 3t^2 dt + (W_t dt + t dW_t),
we note the exact expression formed by the last two terms, W_t dt + t dW_t = d(tW_t). Then
dX_t = d(t^3) + d(tW_t),
which is equivalent to d(X_t) = d(t^3 + tW_t). Hence X_t = t^3 + tW_t + c, c ∈ R.
Example 7.5.3 Solve the stochastic differential equation
e^{−2t} dX_t = (1 + 2W_t^2) dt + 2W_t dW_t.
Multiply by e^{2t} to get
dX_t = e^{2t} (1 + 2W_t^2) dt + 2e^{2t} W_t dW_t.
After regrouping, this becomes
dX_t = (2e^{2t} dt) W_t^2 + e^{2t} (2W_t dW_t + dt).
Since d(e^{2t}) = 2e^{2t} dt and d(W_t^2) = 2W_t dW_t + dt, the previous relation becomes
dX_t = d(e^{2t}) W_t^2 + e^{2t} d(W_t^2).
By the product rule, the right side becomes exact:
dX_t = d( e^{2t} W_t^2 ),
and hence the solution is X_t = e^{2t} W_t^2 + c, c ∈ R.
Example 7.5.4 Solve the equation
t^3 dX_t = (3t^2 X_t + t) dt + t^6 dW_t,  X_1 = 0.
The equation can be written as
t^3 dX_t − 3X_t t^2 dt = t dt + t^6 dW_t.
Divide by t^6:
( t^3 dX_t − X_t d(t^3) ) / (t^3)^2 = t^{−5} dt + dW_t.
Applying the quotient rule yields
d( X_t / t^3 ) = d( −t^{−4}/4 ) + dW_t.
Integrating between 1 and t yields
X_t / t^3 = −t^{−4}/4 + W_t − W_1 + c,
so
X_t = c t^3 − 1/(4t) + t^3 (W_t − W_1),  c ∈ R.
Using X_1 = 0 yields c = 1/4 and hence the solution is
X_t = (1/4)( t^3 − 1/t ) + t^3 (W_t − W_1).
Exercise 7.5.1 Solve the following stochastic differential equations by the inspection method:
(a) dX_t = (1 + W_t)dt + (t + 2W_t)dW_t, X_0 = 0;
(b) t^2 dX_t = (2t^3 − W_t)dt + t dW_t, X_1 = 0;
(c) e^{−t/2} dX_t = (1/2) W_t dt + dW_t, X_0 = 0;
(d) dX_t = 2tW_t dW_t + W_t^2 dt, X_0 = 0;
(e) dX_t = ( 1 + (1/(2√t)) W_t ) dt + √t dW_t, X_1 = 0.
7.6 Linear Stochastic Differential Equations
Consider the stochastic differential equation with drift term linear in X_t:
dX_t = ( α(t)X_t + β(t) ) dt + b(t, W_t) dW_t,  t ≥ 0.
This can also be written as
dX_t − α(t)X_t dt = β(t)dt + b(t, W_t)dW_t.
Let A(t) = ∫_0^t α(s) ds. Multiplying by the integrating factor e^{−A(t)}, the left side of the previous equation becomes an exact expression:
e^{−A(t)} ( dX_t − α(t)X_t dt ) = e^{−A(t)} β(t)dt + e^{−A(t)} b(t, W_t)dW_t
d( e^{−A(t)} X_t ) = e^{−A(t)} β(t)dt + e^{−A(t)} b(t, W_t)dW_t.
Integrating yields
e^{−A(t)} X_t = X_0 + ∫_0^t e^{−A(s)} β(s) ds + ∫_0^t e^{−A(s)} b(s, W_s) dW_s
X_t = X_0 e^{A(t)} + e^{A(t)} ( ∫_0^t e^{−A(s)} β(s) ds + ∫_0^t e^{−A(s)} b(s, W_s) dW_s ).
The first integral within the previous parentheses is a Riemann integral, and the latter one is an Ito stochastic integral. Sometimes, in practical applications, these integrals can be computed explicitly.
When b(t, W_t) = b(t), the latter integral becomes a Wiener integral. In this case the solution X_t is Gaussian with mean and variance given by
E[X_t] = X_0 e^{A(t)} + e^{A(t)} ∫_0^t e^{−A(s)} β(s) ds
Var[X_t] = e^{2A(t)} ∫_0^t e^{−2A(s)} b(s)^2 ds.
Another important particular case is when α(t) = α ≠ 0 and β(t) = β are constants and b(t, W_t) = b(t). The equation in this case is
dX_t = ( αX_t + β ) dt + b(t)dW_t,  t ≥ 0,
and the solution takes the form
X_t = X_0 e^{αt} + (β/α)( e^{αt} − 1 ) + ∫_0^t e^{α(t−s)} b(s) dW_s.
Example 7.6.1 Solve the linear stochastic differential equation
dX_t = (2X_t + 1)dt + e^{2t} dW_t.
Write the equation as
dX_t − 2X_t dt = dt + e^{2t} dW_t
and multiply by the integrating factor e^{−2t} to get
d( e^{−2t} X_t ) = e^{−2t} dt + dW_t.
Integrate between 0 and t and multiply by e^{2t} to obtain
X_t = X_0 e^{2t} + e^{2t} ∫_0^t e^{−2s} ds + e^{2t} ∫_0^t dW_s
= X_0 e^{2t} + (1/2)( e^{2t} − 1 ) + e^{2t} W_t.
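The closed form above can be compared pathwise against a direct discretization of the equation. The sketch below (an illustration assuming NumPy; seed and step count are arbitrary) runs the Euler-Maruyama scheme on dX_t = (2X_t + 1)dt + e^{2t}dW_t with X_0 = 1 along one simulated path and evaluates the formula at the same terminal noise W_T:

    import numpy as np

    rng = np.random.default_rng(9)
    X0, T, n = 1.0, 1.0, 50_000
    dt = T / n
    dW = rng.normal(0.0, np.sqrt(dt), n)

    X, t = X0, 0.0
    for dw in dW:                     # Euler-Maruyama for dX = (2X+1)dt + e^{2t} dW
        X += (2*X + 1) * dt + np.exp(2*t) * dw
        t += dt

    W_T = dW.sum()
    exact = X0*np.exp(2*T) + 0.5*(np.exp(2*T) - 1) + np.exp(2*T)*W_T
    print(X, exact)   # agree up to the discretization error of the scheme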
Example 7.6.2 Solve the linear stochastic differential equation
dX_t = (2 − X_t)dt + e^{−t} W_t dW_t.
Multiplying by the integrating factor e^t yields
e^t ( dX_t + X_t dt ) = 2e^t dt + W_t dW_t.
Since e^t (dX_t + X_t dt) = d(e^t X_t), integrating between 0 and t we get
e^t X_t = X_0 + ∫_0^t 2e^s ds + ∫_0^t W_s dW_s.
Dividing by e^t and performing the integration yields
X_t = X_0 e^{−t} + 2(1 − e^{−t}) + (1/2) e^{−t} ( W_t^2 − t ).
Example 7.6.3 Solve the linear stochastic differential equation
dX_t = ( (1/2) X_t + 1 ) dt + e^t cos W_t dW_t.
Write the equation as
dX_t − (1/2) X_t dt = dt + e^t cos W_t dW_t
and multiply by the integrating factor e^{−t/2} to get
d( e^{−t/2} X_t ) = e^{−t/2} dt + e^{t/2} cos W_t dW_t.
Integrating yields
e^{−t/2} X_t = X_0 + ∫_0^t e^{−s/2} ds + ∫_0^t e^{s/2} cos W_s dW_s.
Multiply by e^{t/2} and use formula (6.3.9) to obtain the solution
X_t = X_0 e^{t/2} + 2( e^{t/2} − 1 ) + e^t sin W_t.
Exercise 7.6.1 Solve the following linear stochastic differential equations:
(a) dX_t = (4X_t − 1)dt + 2dW_t;
(b) dX_t = (3X_t − 2)dt + e^{3t} dW_t;
(c) dX_t = (1 + X_t)dt + e^t W_t dW_t;
(d) dX_t = (4X_t + t)dt + e^{4t} dW_t;
(e) dX_t = ( t + (1/2) X_t ) dt + e^t sin W_t dW_t;
(f) dX_t = X_t dt + e^t dW_t.
In the following we present an important example of a stochastic differential equation which can be solved by the method presented in this section.
Proposition 7.6.2 (The mean-reverting Ornstein-Uhlenbeck process) Let m and σ be two constants. Then the solution X_t of the stochastic equation
dX_t = (m − X_t)dt + σ dW_t   (7.6.13)
is given by
X_t = m + (X_0 − m)e^{−t} + σ ∫_0^t e^{s−t} dW_s.   (7.6.14)
X_t is Gaussian with mean and variance given by
E[X_t] = m + (X_0 − m)e^{−t}
Var(X_t) = (σ^2/2)( 1 − e^{−2t} ).
Proof: Adding X_t dt to both sides and multiplying by the integrating factor e^t we get
d( e^t X_t ) = m e^t dt + σ e^t dW_t,
which after integration yields
e^t X_t = X_0 + m( e^t − 1 ) + σ ∫_0^t e^s dW_s.
Hence
X_t = X_0 e^{−t} + m − m e^{−t} + e^{−t} σ ∫_0^t e^s dW_s = m + (X_0 − m)e^{−t} + σ ∫_0^t e^{s−t} dW_s.
Since X_t is the sum of a predictable function and a Wiener integral, we can use Proposition 4.6.1 and it follows that X_t is Gaussian, with
E[X_t] = m + (X_0 − m)e^{−t} + E[ σ ∫_0^t e^{s−t} dW_s ] = m + (X_0 − m)e^{−t}
Var(X_t) = Var[ σ ∫_0^t e^{s−t} dW_s ] = σ^2 e^{−2t} ∫_0^t e^{2s} ds
= σ^2 e^{−2t} ( e^{2t} − 1 ) / 2 = (1/2) σ^2 ( 1 − e^{−2t} ).
The name mean-reverting comes from the fact that
lim_{t→∞} E[X_t] = m.
The variance tends exponentially to its limiting value, lim_{t→∞} Var[X_t] = σ^2/2, so in the long run the process fluctuates about the level m with stationary variance σ^2/2.
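These long-run statistics are easy to reproduce by simulation. The following sketch (an illustration assuming NumPy; all parameters and the seed are arbitrary) integrates (7.6.13) with the Euler-Maruyama scheme over many paths and compares the empirical mean and variance at a large time with the formulas of Proposition 7.6.2:

    import numpy as np

    rng = np.random.default_rng(10)
    m, sigma, X0 = 4.0, 0.5, 0.0
    T, n, paths = 8.0, 4_000, 10_000
    dt = T / n

    X = np.full(paths, X0)
    for _ in range(n):                 # Euler-Maruyama for dX = (m - X)dt + sigma dW
        X += (m - X) * dt + sigma * rng.normal(0.0, np.sqrt(dt), paths)

    print(X.mean(), m + (X0 - m) * np.exp(-T))           # mean, near m = 4
    print(X.var(), sigma**2 / 2 * (1 - np.exp(-2 * T)))  # variance, near sigma^2/2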
Proposition 7.6.3 (The Brownian bridge) For a, b ∈ R fixed, the stochastic differential equation
dX_t = ( (b − X_t)/(1 − t) ) dt + dW_t,  0 ≤ t < 1,  X_0 = a,
has the solution
X_t = a(1 − t) + bt + (1 − t) ∫_0^t 1/(1 − s) dW_s,  0 ≤ t < 1.   (7.6.15)
The solution has the property lim_{t→1} X_t = b, almost certainly.
Proof: If we let Y_t = b − X_t, the equation becomes linear in Y_t:
dY_t + ( 1/(1 − t) ) Y_t dt = −dW_t.
Multiplying by the integrating factor ρ(t) = 1/(1 − t) yields
d( Y_t / (1 − t) ) = −( 1/(1 − t) ) dW_t,
which leads by integration to
Y_t / (1 − t) = c − ∫_0^t 1/(1 − s) dW_s.
Making t = 0 yields c = b − a, so
( b − X_t ) / (1 − t) = b − a − ∫_0^t 1/(1 − s) dW_s.
Solving for X_t yields
X_t = a(1 − t) + bt + (1 − t) ∫_0^t 1/(1 − s) dW_s,  0 ≤ t < 1.
Let U_t = (1 − t) ∫_0^t 1/(1 − s) dW_s. First we notice that
E[U_t] = (1 − t) E[ ∫_0^t 1/(1 − s) dW_s ] = 0,
Var(U_t) = (1 − t)^2 Var[ ∫_0^t 1/(1 − s) dW_s ] = (1 − t)^2 ∫_0^t 1/(1 − s)^2 ds
= (1 − t)^2 ( 1/(1 − t) − 1 ) = t(1 − t).
In order to show ac-lim_{t→1} X_t = b, we need to prove
P( ω; lim_{t→1} X_t(ω) = b ) = 1.
Since X_t = a(1 − t) + bt + U_t, it suffices to show that
P( ω; lim_{t→1} U_t(ω) = 0 ) = 1.   (7.6.16)
We evaluate the probability of the complementary event,
P( ω; lim_{t→1} U_t(ω) ≠ 0 ) = P( ω; |U_t(ω)| > ε, ∀t ),
for some ε > 0. Since by Markov's inequality
P( ω; |U_t(ω)| > ε ) < Var(U_t)/ε^2 = t(1 − t)/ε^2
holds for any 0 ≤ t < 1, choosing t → 1 implies that
P( ω; |U_t(ω)| > ε, ∀t ) = 0,
which implies (7.6.16).
The process (7.6.15) is called the Brownian bridge because it joins X_0 = a with X_1 = b. Since X_t is the sum of a deterministic linear function in t and a Wiener integral, it follows that it is a Gaussian process, with mean and variance
E[X_t] = a(1 − t) + bt
Var(X_t) = Var(U_t) = t(1 − t).
It is worth noting that the variance is maximum at the midpoint t = 1/2 and zero at the endpoints t = 0 and t = 1.
Example 7.6.4 Find Cov(X_s, X_t), 0 < s < t, for the following cases:
(a) X_t is a mean-reverting Ornstein-Uhlenbeck process;
(b) X_t is a Brownian bridge process.
Stochastic equations with respect to a Poisson process
Similar techniques can be applied in the case when the Brownian motion process W_t is replaced by a Poisson process N_t with constant rate λ. For instance, the stochastic differential equation
dX_t = 3X_t dt + e^{3t} dN_t,  X_0 = 1,
can be solved by multiplying by the integrating factor e^{−3t} to obtain
d( e^{−3t} X_t ) = dN_t.
Integrating yields e^{−3t} X_t = N_t + C. Making t = 0 yields C = 1, so the solution is given by X_t = e^{3t}(1 + N_t).
The following equation,
dX_t = (m − X_t)dt + σ dN_t,
is similar to the equation defining the mean-reverting Ornstein-Uhlenbeck process. As we shall see, in this case the process is no longer mean-reverting. It reverts, though, to a certain constant. A similar method yields the solution
X_t = m + (X_0 − m)e^{−t} + σ e^{−t} ∫_0^t e^s dN_s.
Since from Proposition 4.8.5 and Exercise 4.8.8 we have
E[ ∫_0^t e^s dN_s ] = λ ∫_0^t e^s ds = λ( e^t − 1 )
Var[ ∫_0^t e^s dN_s ] = λ ∫_0^t e^{2s} ds = (λ/2)( e^{2t} − 1 ),
it follows that
E[X_t] = m + (X_0 − m)e^{−t} + σλ( 1 − e^{−t} )
Var(X_t) = (λσ^2/2)( 1 − e^{−2t} ).
It is worth noting that in this case the process X_t is no longer Gaussian.
7.7 The Method of Variation of Parameters
Let's start by considering the following stochastic equation,
dX_t = σX_t dW_t,   (7.7.17)
with σ constant. This is the equation which, in physics, is known to model linear noise. Dividing by X_t yields
dX_t / X_t = σ dW_t.
Switch to the integral form,
∫ dX_t / X_t = ∫ σ dW_t,
and integrate blindly to get ln X_t = σW_t + c, with c an integration constant. This leads to the pseudo-solution
X_t = e^{σW_t + c}.
The nomination pseudo stands for the fact that X_t does not satisfy the initial equation. We shall find a correct solution by letting the parameter c be a function of t. In other words, we are looking for a solution of the following type:
X_t = e^{σW_t + c(t)},   (7.7.18)
where the function c(t) is subject to be determined. Using Ito's formula we get
dX_t = d( e^{σW_t + c(t)} ) = e^{σW_t + c(t)} ( c'(t) + σ^2/2 ) dt + σ e^{σW_t + c(t)} dW_t
= X_t ( c'(t) + σ^2/2 ) dt + σ X_t dW_t.
Substituting the last term from the initial equation (7.7.17) yields
dX_t = X_t ( c'(t) + σ^2/2 ) dt + dX_t,
which leads to the equation
c'(t) + σ^2/2 = 0
with the solution c(t) = −(σ^2/2) t + k. Substituting into (7.7.18) yields
X_t = e^{σW_t − (σ^2/2)t + k}.
The value of the constant k is determined by taking t = 0. This leads to X_0 = e^k. Hence we have obtained the solution of the equation (7.7.17):
X_t = X_0 e^{σW_t − (σ^2/2)t}.
Example 7.7.1 Use the method of variation of parameters to solve the equation
dX_t = X_t W_t dW_t.
Dividing by X_t and converting the differential equation into the equivalent integral form, we get
∫ (1/X_t) dX_t = ∫ W_t dW_t.
The right side is a well-known stochastic integral given by
∫ W_t dW_t = W_t^2/2 − t/2 + C.
The left side will be integrated blindly according to the rules of elementary Calculus,
∫ (1/X_t) dX_t = ln X_t + C.
Equating the last two relations and solving for X_t we obtain the pseudo-solution
X_t = e^{W_t^2/2 − t/2 + c},
with c constant. In order to get a correct solution, we let c depend on t; since the correction will turn out to involve the Brownian path, we allow c(t) to be a process of finite variation (so that dc dW_t = (dc)^2 = 0), and look for a solution of the form
X_t = e^{W_t^2/2 − t/2 + c(t)}.
Writing X_t = e^{g_t} with g_t = W_t^2/2 − t/2 + c(t), Ito's formula for W_t^2/2 gives
dg_t = W_t dW_t + (1/2)dt − (1/2)dt + dc(t) = W_t dW_t + dc(t),  (dg_t)^2 = W_t^2 dt,
so that
dX_t = X_t ( dg_t + (1/2)(dg_t)^2 ) = X_t W_t dW_t + X_t dc(t) + (1/2) X_t W_t^2 dt.
Subtracting the initial equation dX_t = X_t W_t dW_t yields
0 = X_t ( dc(t) + (1/2) W_t^2 dt ).
This equation is satisfied if we choose dc(t) = −(1/2) W_t^2 dt, that is,
c(t) = −(1/2) ∫_0^t W_s^2 ds + k.
Unlike the deterministic case, the "constant" c is forced to depend on the path of the Brownian motion. The constant k is obtained by letting t = 0. Hence the solution is given by
X_t = X_0 exp( W_t^2/2 − t/2 − (1/2) ∫_0^t W_s^2 ds ).
Example 7.7.2 Use the method of variation of parameters to solve the stochastic differential equation
dX_t = αX_t dt + σX_t dW_t,
with α and σ constants.
After dividing by X_t we bring the equation into the equivalent integral form
∫ dX_t / X_t = ∫ α dt + ∫ σ dW_t.
Integrate on the left blindly and get
ln X_t = αt + σW_t + c,
where c is an integration constant. We arrive at the following pseudo-solution:
X_t = e^{αt + σW_t + c}.
Assume the constant c is replaced by a function c(t), so we are looking for a solution of the form
X_t = e^{αt + σW_t + c(t)}.   (7.7.19)
Apply Ito's formula and get
dX_t = X_t ( α + c'(t) + σ^2/2 ) dt + σX_t dW_t.
Subtracting the initial equation yields
( c'(t) + σ^2/2 ) dt = 0,
which is satisfied for c'(t) = −σ^2/2, with the solution c(t) = −(σ^2/2)t + k, k ∈ R. Substituting into (7.7.19) yields the solution
X_t = e^{αt + σW_t − (σ^2/2)t + k} = e^{(α − σ^2/2)t + σW_t + k} = X_0 e^{(α − σ^2/2)t + σW_t}.
7.8 Integrating Factors
The method of integrating factors can be applied to a class of stochastic differential equations of the type
dX_t = f(t, X_t)dt + g(t)X_t dW_t,   (7.8.20)
where f and g are continuous deterministic functions. The integrating factor is given by
ρ_t = e^{ −∫_0^t g(s) dW_s + (1/2)∫_0^t g^2(s) ds }.
The equation can be brought into the following exact form:
d( ρ_t X_t ) = ρ_t f(t, X_t) dt.
Substituting Y_t = ρ_t X_t, we obtain that Y_t satisfies the deterministic differential equation
dY_t = ρ_t f(t, Y_t/ρ_t) dt,
which can be solved by either integration or as an exact equation. We shall exemplify the method of integrating factors with a few examples.
Example 7.8.1 Solve the stochastic differential equation
dX_t = r dt + σX_t dW_t,   (7.8.21)
with r and σ constants.
The integrating factor is given by ρ_t = e^{ (1/2)σ^2 t − σW_t }. Using Ito's formula, we can easily check that
dρ_t = ρ_t ( σ^2 dt − σ dW_t ).
Using dt^2 = dt dW_t = 0 and (dW_t)^2 = dt, we obtain
dX_t dρ_t = −σ^2 ρ_t X_t dt.
Multiplying by ρ_t, the initial equation becomes
ρ_t dX_t − σρ_t X_t dW_t = rρ_t dt,
and adding and subtracting σ^2 ρ_t X_t dt on the left side yields
ρ_t dX_t − σρ_t X_t dW_t + σ^2 ρ_t X_t dt − σ^2 ρ_t X_t dt = rρ_t dt.
This can be written as
ρ_t dX_t + X_t dρ_t + dρ_t dX_t = rρ_t dt,
which, by virtue of the product rule, becomes
d( ρ_t X_t ) = rρ_t dt.
Integrating yields
ρ_t X_t = ρ_0 X_0 + r ∫_0^t ρ_s ds,
and hence the solution is
X_t = (1/ρ_t) X_0 + (r/ρ_t) ∫_0^t ρ_s ds
= X_0 e^{σW_t − (1/2)σ^2 t} + r ∫_0^t e^{ −(1/2)σ^2 (t−s) + σ(W_t − W_s) } ds.
Exercise 7.8.1 Let σ be a constant. Solve the following stochastic differential equations by the method of integrating factors:
(a) dX_t = σX_t dW_t;
(b) dX_t = X_t dt + σX_t dW_t;
(c) dX_t = (1/X_t) dt + σX_t dW_t, X_0 > 0.
Exercise 7.8.2 Let X_t be the solution of the stochastic equation dX_t = σX_t dW_t, with σ constant. Let A_t = (1/t) ∫_0^t X_s dW_s be the stochastic average of X_t. Find the stochastic equation satisfied by A_t, and the mean and variance of A_t.
7.9 Existence and Uniqueness
An exploding solution
Consider the non-linear stochastic differential equation
dX_t = X_t^3 dt + X_t^2 dW_t,  X_0 = 1/a.   (7.9.22)
We shall look for a solution of the type X_t = f(W_t). Ito's formula yields
dX_t = f'(W_t) dW_t + (1/2) f''(W_t) dt.
Equating the coefficients of dt and dW_t in the last two equations yields
f'(W_t) = X_t^2  ⟹  f'(W_t) = f(W_t)^2   (7.9.23)
(1/2) f''(W_t) = X_t^3  ⟹  f''(W_t) = 2f(W_t)^3.   (7.9.24)
We note that equation (7.9.23) implies (7.9.24) by differentiation. So it suffices to solve only the ordinary differential equation
f'(x) = f(x)^2,  f(0) = 1/a.
Separating and integrating we have
∫ df / f^2 = ∫ dx  ⟹  f(x) = 1/(a − x).
Hence a solution of equation (7.9.22) is
X_t = 1/(a − W_t).
Let T_a be the first time the Brownian motion W_t hits a. Then the process X_t is defined only for 0 ≤ t < T_a. T_a is a random variable with P(T_a < ∞) = 1 and E[T_a] = ∞, see section 3.3.
The following theorem is the analog of Picard's uniqueness result from ordinary differential equations:

Theorem 7.9.1 (Existence and Uniqueness) Consider the stochastic differential equation

$$dX_t = b(t, X_t)\,dt + \sigma(t, X_t)\,dW_t, \qquad X_0 = c,$$

where $c$ is a constant and $b$ and $\sigma$ are continuous functions on $[0, T] \times \mathbb{R}$ satisfying

1. $|b(t, x)| + |\sigma(t, x)| \le C(1 + |x|)$, $\forall x \in \mathbb{R}$, $t \in [0, T]$;
2. $|b(t, x) - b(t, y)| + |\sigma(t, x) - \sigma(t, y)| \le K|x - y|$, $\forall x, y \in \mathbb{R}$, $t \in [0, T]$,

with $C, K$ positive constants. Then there is a unique solution process $X_t$ that is continuous and satisfies

$$E\Big[\int_0^T X_t^2\, dt\Big] < \infty.$$

The first condition says that the drift and volatility increase no faster than a linear function in $x$. The second condition states that the functions are Lipschitz in the second argument.
Example 7.9.1 Consider the stochastic differential equation

$$dX_t = \Big(\sqrt{1 + X_t^2} + \frac{1}{2}X_t\Big)\,dt + \sqrt{1 + X_t^2}\,dW_t, \qquad X_0 = x_0.$$

(a) Solve the equation;
(b) Show that there is a unique solution.
Chapter 8

Applications of Brownian Motion

8.1 The Generator of an Ito Diffusion

Let $(X_t)_{t \ge 0}$ be a stochastic process with $X_0 = x_0$. In this section we shall deal with the operator associated with $X_t$. This is an operator which describes infinitesimally the rate of change of a function which depends smoothly on $X_t$.

More precisely, the generator of the stochastic process $X_t$ is the second order partial differential operator $A$ defined by

$$Af(x) = \lim_{t \searrow 0}\frac{E[f(X_t)] - f(x)}{t},$$

for any smooth function (at least of class $C^2$) with compact support $f : \mathbb{R}^n \to \mathbb{R}$. Here $E$ stands for the expectation operator taken at $t = 0$, i.e.,

$$E[f(X_t)] = \int_{\mathbb{R}^n} f(y)\, p_t(x, y)\, dy,$$

where $p_t(x, y)$ is the transition density of $X_t$.
In the following we shall find the generator associated with the Ito diffusion

$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW(t), \qquad t \ge 0,\; X_0 = x, \qquad (8.1.1)$$

where $W(t) = \big(W_1(t), \ldots, W_m(t)\big)$ is an $m$-dimensional Brownian motion, and $b : \mathbb{R}^n \to \mathbb{R}^n$ and $\sigma : \mathbb{R}^n \to \mathbb{R}^{n \times m}$ are measurable functions.

The main tool used in deriving the formula for the generator $A$ is Ito's formula in several variables. If we let $F_t = f(X_t)$, then using Ito's formula we have

$$dF_t = \sum_i \frac{\partial f}{\partial x_i}(X_t)\, dX^i_t + \frac{1}{2}\sum_{i,j}\frac{\partial^2 f}{\partial x_i \partial x_j}(X_t)\, dX^i_t\, dX^j_t, \qquad (8.1.2)$$
where $X_t = (X^1_t, \ldots, X^n_t)$ satisfies the Ito diffusion (8.1.1) on components, i.e.,

$$dX^i_t = b_i(X_t)\,dt + [\sigma(X_t)\, dW(t)]_i = b_i(X_t)\,dt + \sum_k \sigma_{ik}\, dW_k(t). \qquad (8.1.3)$$

Using the stochastic relations $dt^2 = dt\, dW_k(t) = 0$ and $dW_k(t)\, dW_r(t) = \delta_{kr}\, dt$, a computation provides

$$dX^i_t\, dX^j_t = \Big(b_i\, dt + \sum_k \sigma_{ik}\, dW_k(t)\Big)\Big(b_j\, dt + \sum_k \sigma_{jk}\, dW_k(t)\Big) = \Big(\sum_k \sigma_{ik}\, dW_k(t)\Big)\Big(\sum_r \sigma_{jr}\, dW_r(t)\Big)$$

$$= \sum_{k,r} \sigma_{ik}\sigma_{jr}\, dW_k(t)\, dW_r(t) = \sum_k \sigma_{ik}\sigma_{jk}\, dt = (\sigma\sigma^T)_{ij}\, dt.$$

Therefore

$$dX^i_t\, dX^j_t = (\sigma\sigma^T)_{ij}\, dt. \qquad (8.1.4)$$
Substituting (8.1.3) and (8.1.4) into (8.1.2) yields

$$dF_t = \Big(\frac{1}{2}\sum_{i,j}\frac{\partial^2 f}{\partial x_i \partial x_j}(X_t)(\sigma\sigma^T)_{ij} + \sum_i b_i(X_t)\frac{\partial f}{\partial x_i}(X_t)\Big)\,dt + \sum_{i,k}\frac{\partial f}{\partial x_i}(X_t)\,\sigma_{ik}(X_t)\, dW_k(t).$$

Integrating we obtain

$$F_t = F_0 + \int_0^t \Big(\frac{1}{2}\sum_{i,j}\frac{\partial^2 f}{\partial x_i \partial x_j}(\sigma\sigma^T)_{ij} + \sum_i b_i\frac{\partial f}{\partial x_i}\Big)(X_s)\, ds + \sum_{i,k}\int_0^t \sigma_{ik}\frac{\partial f}{\partial x_i}(X_s)\, dW_k(s).$$
Since $F_0 = f(X_0) = f(x)$ and $E[f(x)] = f(x)$, applying the expectation operator in the previous relation we obtain

$$E[F_t] = f(x) + E\Big[\int_0^t \Big(\frac{1}{2}\sum_{i,j}(\sigma\sigma^T)_{ij}\frac{\partial^2 f}{\partial x_i \partial x_j} + \sum_i b_i\frac{\partial f}{\partial x_i}\Big)(X_s)\, ds\Big]. \qquad (8.1.5)$$

Using the commutation between the operator $E$ and the integral $\int_0^t$ yields

$$\lim_{t \searrow 0}\frac{E[F_t] - f(x)}{t} = \frac{1}{2}\sum_{i,j}(\sigma\sigma^T)_{ij}\frac{\partial^2 f(x)}{\partial x_i \partial x_j} + \sum_k b_k\frac{\partial f(x)}{\partial x_k}.$$
We conclude the previous computations with the following result.

Theorem 8.1.1 The generator of the Ito diffusion (8.1.1) is given by

$$A = \frac{1}{2}\sum_{i,j}(\sigma\sigma^T)_{ij}\frac{\partial^2}{\partial x_i \partial x_j} + \sum_k b_k\frac{\partial}{\partial x_k}. \qquad (8.1.6)$$

The matrix $\sigma$ is called the dispersion and the product $\sigma\sigma^T$ is called the diffusion matrix. These names are related to their physical significance. Substituting (8.1.6) into (8.1.5) we obtain the following formula

$$E[f(X_t)] = f(x) + E\Big[\int_0^t Af(X_s)\, ds\Big], \qquad (8.1.7)$$

for any $f \in C^2_0(\mathbb{R}^n)$.
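As a numerical illustration of the limit defining the generator, one can estimate $(E[f(X_t)] - f(x))/t$ for a small $t$ by Monte Carlo and compare it with $Af(x)$ computed from (8.1.6). The sketch below (not part of the original text) uses the assumed one-dimensional diffusion $dX_t = -X_t\,dt + dW_t$ and $f(x) = x^2$, for which (8.1.6) gives $Af(x) = 1 - 2x^2$.

```python
import numpy as np

# Sketch: estimate A f(x) = (E[f(X_t)] - f(x)) / t for small t by
# Monte Carlo, for dX_t = -X_t dt + dW_t (so b(x) = -x, sigma = 1)
# and f(x) = x^2.  Theory: A f(x) = (1/2) f''(x) + b(x) f'(x) = 1 - 2 x^2.
rng = np.random.default_rng(2)
x, t, n_steps, n_paths = 0.5, 0.01, 100, 200_000
dt = t / n_steps

X = np.full(n_paths, x)
for _ in range(n_steps):
    X += -X * dt + rng.normal(0.0, np.sqrt(dt), n_paths)

estimate = (np.mean(X**2) - x**2) / t
print(estimate, 1 - 2 * x**2)   # ~0.5 vs exact 0.5
```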
Exercise 8.1.2 Find the generator operator associated with the $n$-dimensional Brownian motion.

Exercise 8.1.3 Find the Ito diffusion corresponding to the generator $Af(x) = f''(x) + f'(x)$.

Exercise 8.1.4 Let $\Delta_G = \frac{1}{2}\big(\partial^2_{x_1} + x_1^2\,\partial^2_{x_2}\big)$ be the Grushin operator.
(a) Find the diffusion process associated with the generator $\Delta_G$.
(b) Find the diffusion and dispersion matrices and show that they are degenerate.

Exercise 8.1.5 Let $X_t$ and $Y_t$ be two one-dimensional independent Ito diffusions with infinitesimal generators $A_X$ and $A_Y$. Let $Z_t = (X_t, Y_t)$ with the infinitesimal generator $A_Z$. Show that $A_Z = A_X + A_Y$.

Exercise 8.1.6 Let $X_t$ be an Ito diffusion with infinitesimal generator $A_X$. Consider the process $Y_t = (t, X_t)$. Show that the infinitesimal generator of $Y_t$ is given by $A_Y = \partial_t + A_X$.
8.2 Dynkin's Formula

Formula (8.1.7) holds under more general conditions, when $t$ is a stopping time. First we need the following result, which deals with a continuity-type property in the upper limit of an Ito integral.

Lemma 8.2.1 Let $g$ be a bounded measurable function and $\tau$ be a stopping time for $X_t$ with $E[\tau] < \infty$. Then

$$\lim_{k \to \infty} E\Big[\int_0^{k \wedge \tau} g(X_s)\, dW_s\Big] = E\Big[\int_0^{\tau} g(X_s)\, dW_s\Big]; \qquad (8.2.8)$$

$$\lim_{k \to \infty} E\Big[\int_0^{k \wedge \tau} g(X_s)\, ds\Big] = E\Big[\int_0^{\tau} g(X_s)\, ds\Big]. \qquad (8.2.9)$$
Proof: Let $|g| < K$. Using the properties of Ito integrals, we have

$$E\Big[\Big(\int_0^{\tau} g(X_s)\, dW_s - \int_0^{k \wedge \tau} g(X_s)\, dW_s\Big)^2\Big] = E\Big[\Big(\int_{k \wedge \tau}^{\tau} g(X_s)\, dW_s\Big)^2\Big] = E\Big[\int_{k \wedge \tau}^{\tau} g^2(X_s)\, ds\Big] \le K^2 E[\tau - k \wedge \tau] \to 0, \quad k \to \infty.$$

Since $E[X]^2 \le E[X^2]$, it follows that

$$E\Big[\int_0^{\tau} g(X_s)\, dW_s - \int_0^{k \wedge \tau} g(X_s)\, dW_s\Big] \to 0, \quad k \to \infty,$$

which is equivalent to relation (8.2.8).

The second relation can be proved similarly and is left as an exercise for the reader.
Exercise 8.2.2 Assume the hypothesis of the previous lemma. Let $1_{\{s < \tau\}}$ be the characteristic function of the interval $(-\infty, \tau)$:

$$1_{\{s < \tau\}}(u) = \begin{cases} 1, & \text{if } u < \tau,\\ 0, & \text{otherwise.}\end{cases}$$

Show that

(a) $\displaystyle\int_0^{k \wedge \tau} g(X_s)\, dW_s = \int_0^{k} 1_{\{s < \tau\}}\, g(X_s)\, dW_s$;

(b) $\displaystyle\int_0^{k \wedge \tau} g(X_s)\, ds = \int_0^{k} 1_{\{s < \tau\}}\, g(X_s)\, ds$.
Theorem 8.2.3 (Dynkin's formula) Let $f \in C^2_0(\mathbb{R}^n)$, and let $X_t$ be an Ito diffusion starting at $x$. If $\tau$ is a stopping time with $E[\tau] < \infty$, then

$$E[f(X_{\tau})] = f(x) + E\Big[\int_0^{\tau} Af(X_s)\, ds\Big], \qquad (8.2.10)$$

where $A$ is the infinitesimal generator of $X_t$.

Proof: Replace $t$ by $k \wedge \tau$ in (8.1.7); by Exercise 8.2.2 this amounts to inserting the characteristic function $1_{\{s < \tau\}}$ under the integrals, and we obtain

$$E[f(X_{k \wedge \tau})] = f(x) + E\Big[\int_0^{k \wedge \tau} Af(X_s)\, ds\Big]. \qquad (8.2.11)$$

Since by Lemma 8.2.1 (b)

$$E[f(X_{k \wedge \tau})] \to E[f(X_{\tau})], \qquad k \to \infty,$$

$$E\Big[\int_0^{k \wedge \tau} Af(X_s)\, ds\Big] \to E\Big[\int_0^{\tau} Af(X_s)\, ds\Big], \qquad k \to \infty,$$

using Exercise 8.2.2, relation (8.2.11) yields (8.2.10).

Exercise 8.2.4 Write Dynkin's formula for the case of a function $f(t, X_t)$. Use Exercise 8.1.6.

In the following sections we shall present a few important results of stochastic calculus that can be obtained as direct consequences of Dynkin's formula.
8.3 Kolmogorov's Backward Equation

For any function $f \in C^2_0(\mathbb{R}^n)$ let $v(t, x) = E[f(X_t)]$, given that $X_0 = x$. As usual, $E$ denotes the expectation at time $t = 0$. Then $v(0, x) = f(x)$, and differentiating in Dynkin's formula (8.1.7)

$$v(t, x) = f(x) + \int_0^t E[Af(X_s)]\, ds$$

yields

$$\frac{\partial v}{\partial t} = E[Af(X_t)] = AE[f(X_t)] = Av(t, x).$$

We arrived at the following result.

Theorem 8.3.1 (Kolmogorov's backward equation) For any $f \in C^2_0(\mathbb{R}^n)$ the function $v(t, x) = E[f(X_t)]$ satisfies the following Cauchy problem

$$\frac{\partial v}{\partial t} = Av, \qquad t > 0,$$

$$v(0, x) = f(x),$$

where $A$ denotes the generator of the Ito diffusion (8.1.1).
8.4 Exit Time from an Interval

Let $X_t = x_0 + W_t$ be a one-dimensional Brownian motion starting at $x_0$. Consider the exit time of the process $X_t$ from the strip $(a, b)$:

$$\tau = \inf\{t > 0;\; X_t \notin (a, b)\}.$$

Assuming $E[\tau] < \infty$, applying Dynkin's formula yields

$$E[f(X_{\tau})] = f(x_0) + E\Big[\int_0^{\tau}\frac{1}{2}\frac{d^2}{dx^2}f(X_s)\, ds\Big]. \qquad (8.4.12)$$

Choosing $f(x) = x$ in (8.4.12) we obtain

$$E[X_{\tau}] = x_0. \qquad (8.4.13)$$

Exercise 8.4.1 Prove relation (8.4.13) using the Optional Stopping Theorem for the martingale $X_t$.

Let $p_a = P(X_{\tau} = a)$ and $p_b = P(X_{\tau} = b)$ be the exit probabilities from the interval $(a, b)$. Obviously, $p_a + p_b = 1$, since the probability that the Brownian motion stays forever inside the bounded interval is zero. Using the definition of the expectation, relation (8.4.13) yields

$$a p_a + b(1 - p_a) = x_0.$$
Solving for $p_a$ and $p_b$ we get the following exit probabilities:

$$p_a = \frac{b - x_0}{b - a}, \qquad (8.4.14)$$

$$p_b = 1 - p_a = \frac{x_0 - a}{b - a}. \qquad (8.4.15)$$

It is worth noting that if $b \to \infty$ then $p_a \to 1$, and if $a \to -\infty$ then $p_b \to 1$. This can be stated by saying that a Brownian motion starting at $x_0$ reaches any level (below or above $x_0$) with probability 1.

Next we shall compute the mean of the exit time, $E[\tau]$. Choosing $f(x) = x^2$ in (8.4.12) yields

$$E[(X_{\tau})^2] = x_0^2 + E[\tau].$$
Figure 8.1: The Brownian motion $X_t$ in the ball $B(0, R)$.

From the definition of the mean and formulas (8.4.14)-(8.4.15) we obtain

$$E[\tau] = a^2 p_a + b^2 p_b - x_0^2 = a^2\,\frac{b - x_0}{b - a} + b^2\,\frac{x_0 - a}{b - a} - x_0^2$$

$$= \frac{1}{b - a}\big(ba^2 - ab^2 + x_0(b - a)(b + a)\big) - x_0^2 = -ab + x_0(b + a) - x_0^2 = (b - x_0)(x_0 - a). \qquad (8.4.16)$$
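The exit probabilities (8.4.14) and the mean exit time (8.4.16) lend themselves to a direct Monte Carlo check. The sketch below (illustrative, using a crude random-walk approximation of Brownian motion with time step $dt$) assumes $a = 0$, $b = 1$, $x_0 = 0.3$, for which $p_a = 0.7$ and $E[\tau] = 0.21$; a smaller $dt$ reduces the discretization bias at the cost of runtime.

```python
import numpy as np

# Sketch: Monte Carlo check of the exit probability (8.4.14) and the
# mean exit time (8.4.16) for X_t = x0 + W_t leaving the interval (a, b).
rng = np.random.default_rng(3)
a, b, x0, dt, n_paths = 0.0, 1.0, 0.3, 1e-4, 5_000

hits_a, exit_times = 0, []
for _ in range(n_paths):
    x, t = x0, 0.0
    while a < x < b:
        x += rng.normal(0.0, np.sqrt(dt))   # Brownian increment over dt
        t += dt
    hits_a += x <= a
    exit_times.append(t)

print(hits_a / n_paths, (b - x0) / (b - a))        # ~0.7
print(np.mean(exit_times), (b - x0) * (x0 - a))    # ~0.21
```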
Exercise 8.4.2 (a) Show that the equation $x^2 - (b - a)x + E[\tau] = 0$ cannot have complex roots;
(b) Prove that $E[\tau] \le \dfrac{(b - a)^2}{4}$;
(c) Find the point $x_0 \in (a, b)$ such that the expectation of the exit time, $E[\tau]$, is maximum.
8.5 Transience and Recurrence of Brownian Motion

We shall consider first the expectation of the exit time from a ball. Then we shall extend it to an annulus and compute the transience probabilities.

1. Consider the process $X_t = a + W(t)$, where $W(t) = \big(W_1(t), \ldots, W_n(t)\big)$ is an $n$-dimensional Brownian motion, and $a = (a_1, \ldots, a_n) \in \mathbb{R}^n$ is a fixed vector, see Fig. 8.1. Let $R > 0$ be such that $R > |a|$. Consider the exit time of the process $X_t$ from the ball $B(0, R)$:

$$\tau = \inf\{t > 0;\; |X_t| > R\}. \qquad (8.5.17)$$

Assuming $E[\tau] < \infty$ and letting $f(x) = |x|^2 = x_1^2 + \cdots + x_n^2$ in Dynkin's formula

$$E[f(X_{\tau})] = f(x) + E\Big[\int_0^{\tau}\frac{1}{2}\Delta f(X_s)\, ds\Big]$$

yields

$$R^2 = |a|^2 + E\Big[\int_0^{\tau} n\, ds\Big],$$

and hence

$$E[\tau] = \frac{R^2 - |a|^2}{n}. \qquad (8.5.18)$$

In particular, if the Brownian motion starts from the center, i.e. $a = 0$, the expectation of the exit time is

$$E[\tau] = \frac{R^2}{n}.$$
(i) Since $R^2/2 > R^2/3$, the previous relation implies that it takes longer, on average, for a Brownian motion to exit a disk of radius $R$ than to exit a ball of the same radius.
(ii) The expected time for a Brownian motion to leave the interval $(-R, R)$ is twice the expected time for a 2-dimensional Brownian motion to exit the disk $B(0, R)$.

Exercise 8.5.1 Prove that $E[\tau] < \infty$, where $\tau$ is given by (8.5.17).

Exercise 8.5.2 Apply the Optional Stopping Theorem for the martingale $W_t^2 - t$ to show that $E[\tau] = R^2$, where

$$\tau = \inf\{t > 0;\; |W_t| > R\}$$

is the first exit time of the Brownian motion from $(-R, R)$.
2. Let $b \in \mathbb{R}^n$ be such that $b \notin \overline{B(0, R)}$, i.e. $|b| > R$, and consider the annulus

$$A_k = \{x;\; R < |x| < kR\},$$

where $k > 0$ is such that $b \in A_k$. Consider the process $X_t = b + W(t)$ and let

$$\tau_k = \inf\{t > 0;\; X_t \notin A_k\}$$

be the first exit time of $X_t$ from the annulus $A_k$. Let $f : A_k \to \mathbb{R}$ be defined by

$$f(x) = \begin{cases} \ln |x|, & \text{if } n = 2,\\[4pt] \dfrac{1}{|x|^{n-2}}, & \text{if } n > 2.\end{cases}$$

A straightforward computation shows that $\Delta f = 0$. Substituting into Dynkin's formula

$$E[f(X_{\tau_k})] = f(b) + E\Big[\int_0^{\tau_k}\Big(\frac{1}{2}\Delta f\Big)(X_s)\, ds\Big]$$

yields

$$E[f(X_{\tau_k})] = f(b). \qquad (8.5.19)$$

This can be stated by saying that the value of $f$ at a point $b$ in the annulus is equal to the expected value of $f$ at the first exit time of a Brownian motion starting at $b$.

Since $|X_{\tau_k}|$ is a random variable with two outcomes, we have

$$E[f(X_{\tau_k})] = p_k f(R) + q_k f(kR),$$

where $p_k = P(|X_{\tau_k}| = R)$, $q_k = P(|X_{\tau_k}| = kR)$ and $p_k + q_k = 1$. Substituting in (8.5.19) yields

$$p_k f(R) + q_k f(kR) = f(b). \qquad (8.5.20)$$

There are the following two distinguished cases:

(i) If $n = 2$ we obtain

$$p_k \ln R + q_k(\ln k + \ln R) = \ln |b|.$$

Using $p_k = 1 - q_k$ and solving for $p_k$ yields

$$p_k = 1 - \frac{\ln(|b|/R)}{\ln k}.$$

Hence

$$P(\tau < \infty) = \lim_{k \to \infty} p_k = 1,$$

where $\tau = \inf\{t > 0;\; |X_t| < R\}$ is the first time $X_t$ hits the ball $B(0, R)$. Hence in $\mathbb{R}^2$ a Brownian motion hits any ball with probability 1. This is stated equivalently by saying that the Brownian motion is recurrent in $\mathbb{R}^2$.

(ii) If $n > 2$ the equation (8.5.20) becomes

$$\frac{p_k}{R^{n-2}} + \frac{q_k}{k^{n-2}R^{n-2}} = \frac{1}{|b|^{n-2}}.$$

Taking the limit $k \to \infty$ yields

$$\lim_{k \to \infty} p_k = \Big(\frac{R}{|b|}\Big)^{n-2} < 1.$$

Then in $\mathbb{R}^n$, $n > 2$, a Brownian motion starting outside of a ball hits it with a probability less than 1. This is usually stated by saying that the Brownian motion is transient.
3. We shall recover the previous results using the $n$-dimensional Bessel process

$$\mathcal{R}_t = \mathrm{dist}\big(0, W(t)\big) = \sqrt{W_1(t)^2 + \cdots + W_n(t)^2}.$$

Consider the process $Y_t = \rho + \mathcal{R}_t$, with $0 < \rho < R$. The generator of $Y_t$ is the Bessel operator of order $n$

$$A = \frac{1}{2}\frac{d^2}{dx^2} + \frac{n - 1}{2x}\frac{d}{dx},$$

see section 2.7. Consider the exit time

$$\tau = \inf\{t > 0;\; Y_t > R\}.$$

Applying Dynkin's formula

$$E[f(Y_{\tau})] = f(Y_0) + E\Big[\int_0^{\tau}(Af)(Y_s)\, ds\Big]$$

for $f(x) = x^2$ yields $R^2 = \rho^2 + E\big[\int_0^{\tau} n\, ds\big]$. This leads to

$$E[\tau] = \frac{R^2 - \rho^2}{n},$$

which recovers (8.5.18) with $\rho = |a|$.

Figure 8.2: The Brownian motion $X_t$ in the annulus $A_{r,R}$.
In the following assume $n \ge 3$ and consider the annulus

$$A_{r,R} = \{x \in \mathbb{R}^n;\; r < |x| < R\}.$$

Consider the stopping time $\tau = \inf\{t > 0;\; X_t \notin A_{r,R}\} = \inf\{t > 0;\; Y_t \notin (r, R)\}$, where $Y_0 = \rho \in (r, R)$. Applying Dynkin's formula for $f(x) = x^{2-n}$ yields $E[f(Y_{\tau})] = f(\rho)$. This can be written as

$$p_r\, r^{2-n} + p_R\, R^{2-n} = \rho^{2-n},$$

where

$$p_r = P(|X_{\tau}| = r), \qquad p_R = P(|X_{\tau}| = R), \qquad p_r + p_R = 1.$$

Solving for $p_r$ and $p_R$ yields

$$p_r = \frac{\big(\frac{R}{\rho}\big)^{n-2} - 1}{\big(\frac{R}{r}\big)^{n-2} - 1}, \qquad p_R = \frac{\big(\frac{r}{\rho}\big)^{n-2} - 1}{\big(\frac{r}{R}\big)^{n-2} - 1}.$$

The transience probability is obtained by taking the limit to infinity

$$p_r = \lim_{R \to \infty} p_{r,R} = \lim_{R \to \infty}\frac{\rho^{2-n}R^{n-2} - 1}{r^{2-n}R^{n-2} - 1} = \Big(\frac{r}{\rho}\Big)^{n-2},$$

where $p_r$ is the probability that a Brownian motion starting outside the ball of radius $r$ will hit the ball, see Fig. 8.2.
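The annulus exit probability derived above can be checked by simulation. The sketch below (illustrative; assumed values $n = 3$, $r = 1$, $R = 5$, $\rho = 2$, giving $p_r = 0.375$) discretizes the 3-dimensional Brownian motion with step $dt$ and records which boundary is hit first.

```python
import numpy as np

# Sketch: Monte Carlo check of the annulus exit probability p_r for a
# 3-dimensional Brownian motion started at |Y_0| = rho inside
# A_{r,R} = {r < |x| < R}: p_r = ((R/rho)^{n-2} - 1) / ((R/r)^{n-2} - 1).
rng = np.random.default_rng(4)
n, r, R, rho, dt, n_paths = 3, 1.0, 5.0, 2.0, 1e-3, 5_000

hits_inner = 0
for _ in range(n_paths):
    x = np.array([rho, 0.0, 0.0])
    while r < np.linalg.norm(x) < R:
        x += rng.normal(0.0, np.sqrt(dt), n)
    hits_inner += np.linalg.norm(x) <= r

p_r_exact = ((R / rho)**(n - 2) - 1) / ((R / r)**(n - 2) - 1)
print(hits_inner / n_paths, p_r_exact)   # ~0.375
```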
Exercise 8.5.3 Solve the equation $\frac{1}{2}f''(x) + \frac{n-1}{2x}f'(x) = 0$ by looking for a solution of monomial type $f(x) = x^k$.
8.6 Application to Parabolic Equations

This section deals with solving first and second order parabolic equations using the integral of the cost function along a certain characteristic solution. The first order equations are related to predictable characteristic curves, while the second order equations depend on stochastic characteristic curves.

1. Predictable characteristics. Let $\varphi(s)$ be the solution of the following one-dimensional ODE

$$\frac{dX(s)}{ds} = a\big(s, X(s)\big), \qquad t \le s \le T,$$

$$X(t) = x,$$

and define the cumulative cost between $t$ and $T$ along the solution

$$u(t, x) = \int_t^T c\big(s, \varphi(s)\big)\, ds, \qquad (8.6.21)$$

where $c$ denotes a continuous cost function. Differentiating both sides with respect to $t$,

$$\frac{d}{dt}u\big(t, \varphi(t)\big) = \frac{d}{dt}\int_t^T c\big(s, \varphi(s)\big)\, ds,$$

$$\partial_t u + \partial_x u\,\varphi'(t) = -c\big(t, \varphi(t)\big).$$

Hence (8.6.21) is a solution of the following final value problem

$$\partial_t u(t, x) + a(t, x)\,\partial_x u(t, x) = -c(t, x),$$

$$u(T, x) = 0.$$

It is worth mentioning that this is a variant of the method of characteristics.¹ The curve given by the solution $\varphi(s)$ is called a characteristic curve.

¹This is a well known method of solving linear partial differential equations.
Exercise 8.6.1 Using the previous method solve the following final boundary problems:

(a) $\partial_t u + x\,\partial_x u = x$, with $u(T, x) = 0$.

(b) $\partial_t u + tx\,\partial_x u = \ln x$, $x > 0$, with $u(T, x) = 0$.
2. Stochastic characteristics. Consider the diffusion

$$dX_s = a(s, X_s)\,ds + b(s, X_s)\,dW_s, \qquad t \le s \le T,$$

$$X_t = x,$$

and define the stochastic cumulative cost function

$$u(t, X_t) = \int_t^T c(s, X_s)\, ds, \qquad (8.6.22)$$

with the conditional expectation

$$u(t, x) = E\big[u(t, X_t) \mid X_t = x\big] = E\Big[\int_t^T c(s, X_s)\, ds \,\Big|\, X_t = x\Big].$$

Taking increments in both sides of (8.6.22) yields

$$du(t, X_t) = d\int_t^T c(s, X_s)\, ds.$$

Applying Ito's formula on one side and the Fundamental Theorem of Calculus on the other, we obtain

$$\partial_t u(t, X_t)\,dt + \partial_x u(t, X_t)\,dX_t + \frac{1}{2}\partial_x^2 u(t, X_t)\,(dX_t)^2 = -c(t, X_t)\,dt.$$

Taking the expectation $E[\,\cdot \mid X_t = x]$ on both sides yields

$$\partial_t u(t, x)\,dt + \partial_x u(t, x)\,a(t, x)\,dt + \frac{1}{2}\partial_x^2 u(t, x)\,b^2(t, x)\,dt = -c(t, x)\,dt.$$

Hence, the expected cost

$$u(t, x) = E\Big[\int_t^T c(s, X_s)\, ds \,\Big|\, X_t = x\Big]$$

is a solution of the following second order parabolic equation

$$\partial_t u + a(t, x)\,\partial_x u + \frac{1}{2}b^2(t, x)\,\partial_x^2 u(t, x) = -c(t, x),$$

$$u(T, x) = 0.$$

This represents the probabilistic interpretation of the solution of a parabolic equation.
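The probabilistic representation above can be used directly as a numerical method: simulate the stochastic characteristics and average the accumulated cost. The sketch below (an added illustration) takes the assumed data $a = 0$, $b = 1$, $c(s, x) = x^2$, for which the final value problem has the closed-form solution $u(t, x) = x^2(T - t) + (T - t)^2/2$, and compares it with the Monte Carlo estimate.

```python
import numpy as np

# Sketch: Monte Carlo evaluation of
#   u(t, x) = E[ int_t^T c(s, X_s) ds | X_t = x ]
# for the illustrative choice dX_s = dW_s (a = 0, b = 1) and c(s, x) = x^2;
# the closed form is u(t, x) = x^2 (T - t) + (T - t)^2 / 2.
rng = np.random.default_rng(5)
t, T, x, n_steps, n_paths = 0.0, 1.0, 0.7, 200, 100_000
dt = (T - t) / n_steps

X = np.full(n_paths, x)
cost = np.zeros(n_paths)
for _ in range(n_steps):
    cost += X**2 * dt          # accumulate the running cost c(s, X_s) ds
    X += rng.normal(0.0, np.sqrt(dt), n_paths)

print(np.mean(cost), x**2 * (T - t) + (T - t)**2 / 2)   # ~0.99 vs 0.99
```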
Exercise 8.6.2 Solve the following final boundary problems:

(a) $\partial_t u + \mu\,\partial_x u + \frac{1}{2}\sigma^2\,\partial_x^2 u = x$, with $u(T, x) = 0$.

(b) $\partial_t u + \mu\,\partial_x u + \frac{1}{2}\sigma^2\,\partial_x^2 u = e^x$, with $u(T, x) = 0$.

(c) $\partial_t u + \mu x\,\partial_x u + \frac{1}{2}\sigma^2 x^2\,\partial_x^2 u = x$, with $u(T, x) = 0$.
Chapter 9

Martingales and Girsanov's Theorem

9.1 Examples of Martingales

In this section we shall use the knowledge acquired in previous chapters to present a few important examples of martingales. These will be useful in the proof of Girsanov's theorem in the next section.

We start by recalling that a process $X_t$, $0 \le t \le T$, is an $\mathcal{F}_t$-martingale if

1. $E[|X_t|] < \infty$ ($X_t$ is integrable for each $t$);
2. $X_t$ is $\mathcal{F}_t$-predictable;
3. the forecast of future values is the last observation: $E[X_t \mid \mathcal{F}_s] = X_s$, $\forall s < t$.
We shall present three important examples of martingales and some of their particular cases.

Example 9.1.1 If $v(s)$ is a continuous function on $[0, T]$, then

$$X_t = \int_0^t v(s)\, dW_s$$

is an $\mathcal{F}_t$-martingale.

Taking out the predictable part,

$$E[X_t \mid \mathcal{F}_s] = E\Big[\int_0^s v(\tau)\, dW_{\tau} + \int_s^t v(\tau)\, dW_{\tau} \,\Big|\, \mathcal{F}_s\Big] = X_s + E\Big[\int_s^t v(\tau)\, dW_{\tau} \,\Big|\, \mathcal{F}_s\Big] = X_s,$$

where we used that $\int_s^t v(\tau)\, dW_{\tau}$ is independent of $\mathcal{F}_s$ and that the conditional expectation equals the usual expectation

$$E\Big[\int_s^t v(\tau)\, dW_{\tau} \,\Big|\, \mathcal{F}_s\Big] = E\Big[\int_s^t v(\tau)\, dW_{\tau}\Big] = 0.$$
Example 9.1.2 Let $X_t = \int_0^t v(s)\, dW_s$ be a process as in Example 9.1.1. Then

$$M_t = X_t^2 - \int_0^t v^2(s)\, ds$$

is an $\mathcal{F}_t$-martingale.

The process $X_t$ satisfies the stochastic equation $dX_t = v(t)\,dW_t$. By Ito's formula

$$d(X_t^2) = 2X_t\, dX_t + (dX_t)^2 = 2v(t)X_t\, dW_t + v^2(t)\,dt. \qquad (9.1.1)$$

Integrating between $s$ and $t$ yields

$$X_t^2 - X_s^2 = 2\int_s^t X_{\tau}\, v(\tau)\, dW_{\tau} + \int_s^t v^2(\tau)\, d\tau.$$

Then separating the predictable from the unpredictable we have

$$E[M_t \mid \mathcal{F}_s] = E\Big[X_t^2 - \int_0^t v^2(\tau)\, d\tau \,\Big|\, \mathcal{F}_s\Big] = E\Big[X_t^2 - X_s^2 - \int_s^t v^2(\tau)\, d\tau + X_s^2 - \int_0^s v^2(\tau)\, d\tau \,\Big|\, \mathcal{F}_s\Big]$$

$$= X_s^2 - \int_0^s v^2(\tau)\, d\tau + E\Big[X_t^2 - X_s^2 - \int_s^t v^2(\tau)\, d\tau \,\Big|\, \mathcal{F}_s\Big] = M_s + 2E\Big[\int_s^t X_{\tau}\, v(\tau)\, dW_{\tau} \,\Big|\, \mathcal{F}_s\Big] = M_s,$$

where we used relation (9.1.1) and the fact that $\int_s^t X_{\tau}\, v(\tau)\, dW_{\tau}$ is totally unpredictable given the information set $\mathcal{F}_s$. In the following we shall mention a few particular cases.

1. If $v(s) = 1$, then $X_t = W_t$. In this case $M_t = W_t^2 - t$ is an $\mathcal{F}_t$-martingale.

2. If $v(s) = s$, then $X_t = \int_0^t s\, dW_s$, and hence

$$M_t = \Big(\int_0^t s\, dW_s\Big)^2 - \frac{t^3}{3}$$

is an $\mathcal{F}_t$-martingale.
Example 9.1.3 Let $u : [0, T] \to \mathbb{R}$ be a continuous function. Then

$$M_t = e^{\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}$$

is an $\mathcal{F}_t$-martingale for $0 \le t \le T$.

Consider the process $U_t = \int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds$. Then

$$dU_t = u(t)\,dW_t - \frac{1}{2}u^2(t)\,dt,$$

$$(dU_t)^2 = u^2(t)\,dt.$$

Then Ito's formula yields

$$dM_t = d(e^{U_t}) = e^{U_t}\,dU_t + \frac{1}{2}e^{U_t}(dU_t)^2 = e^{U_t}\Big(u(t)\,dW_t - \frac{1}{2}u^2(t)\,dt + \frac{1}{2}u^2(t)\,dt\Big) = u(t)M_t\, dW_t.$$

Integrating between $s$ and $t$ yields

$$M_t = M_s + \int_s^t u(\tau)M_{\tau}\, dW_{\tau}.$$

Since $\int_s^t u(\tau)M_{\tau}\, dW_{\tau}$ is independent of $\mathcal{F}_s$, then

$$E\Big[\int_s^t u(\tau)M_{\tau}\, dW_{\tau} \,\Big|\, \mathcal{F}_s\Big] = E\Big[\int_s^t u(\tau)M_{\tau}\, dW_{\tau}\Big] = 0,$$

and hence

$$E[M_t \mid \mathcal{F}_s] = E\Big[M_s + \int_s^t u(\tau)M_{\tau}\, dW_{\tau} \,\Big|\, \mathcal{F}_s\Big] = M_s.$$
Remark 9.1.4 The condition that $u(s)$ is continuous on $[0, T]$ can be relaxed by requiring only

$$u \in L^2[0, T] = \Big\{u : [0, T] \to \mathbb{R};\; u \text{ measurable and } \int_0^T |u(s)|^2\, ds < \infty\Big\}.$$

It is worth noting that the conclusion still holds if the function $u(s)$ is replaced by a stochastic process $u(t, \omega)$ satisfying Novikov's condition

$$E\big[e^{\frac{1}{2}\int_0^T u^2(s,\omega)\, ds}\big] < \infty.$$

The previous process has a distinguished importance in the theory of martingales and it will be useful when proving the Girsanov theorem.

Definition 9.1.5 Let $u \in L^2[0, T]$ be a deterministic function. Then the stochastic process

$$M_t = e^{\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}$$

is called the exponential process induced by $u$.
Particular cases of exponential processes.

1. Let $u(s) = \sigma$, a constant. Then $M_t = e^{\sigma W_t - \frac{\sigma^2}{2}t}$ is an $\mathcal{F}_t$-martingale.

2. Let $u(s) = s$. Integrating in $d(tW_t) = t\,dW_t + W_t\,dt$ yields

$$\int_0^t s\, dW_s = tW_t - \int_0^t W_s\, ds.$$

Let $Z_t = \int_0^t W_s\, ds$ be the integrated Brownian motion. Then

$$M_t = e^{\int_0^t s\, dW_s - \frac{1}{2}\int_0^t s^2\, ds} = e^{tW_t - \frac{t^3}{6} - Z_t}$$

is an $\mathcal{F}_t$-martingale.
Example 9.1.6 Let $X_t$ be a solution of $dX_t = u(t)\,dt + dW_t$, with $u(s)$ a bounded function. Consider the exponential process

$$M_t = e^{-\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}. \qquad (9.1.2)$$

Then $Y_t = M_t X_t$ is an $\mathcal{F}_t$-martingale.

In Example 9.1.3 (applied with $u$ replaced by $-u$) we obtained $dM_t = -u(t)M_t\, dW_t$. Then

$$dM_t\, dX_t = -u(t)M_t\, dt.$$

The product rule yields

$$dY_t = M_t\, dX_t + X_t\, dM_t + dM_t\, dX_t = M_t\big(u(t)\,dt + dW_t\big) - X_t\, u(t)M_t\, dW_t - u(t)M_t\, dt = M_t\big(1 - u(t)X_t\big)\,dW_t.$$

Integrating between $s$ and $t$ yields

$$Y_t = Y_s + \int_s^t M_{\tau}\big(1 - u(\tau)X_{\tau}\big)\, dW_{\tau}.$$

Since $\int_s^t M_{\tau}\big(1 - u(\tau)X_{\tau}\big)\, dW_{\tau}$ is independent of $\mathcal{F}_s$,

$$E\Big[\int_s^t M_{\tau}\big(1 - u(\tau)X_{\tau}\big)\, dW_{\tau} \,\Big|\, \mathcal{F}_s\Big] = E\Big[\int_s^t M_{\tau}\big(1 - u(\tau)X_{\tau}\big)\, dW_{\tau}\Big] = 0,$$

and hence

$$E[Y_t \mid \mathcal{F}_s] = Y_s.$$
Exercise 9.1.7 Prove that $(W_t + t)\, e^{-W_t - \frac{1}{2}t}$ is an $\mathcal{F}_t$-martingale.

Exercise 9.1.8 Let $h$ be a continuous function. Using the properties of the Wiener integral and of log-normal random variables, show that

$$E\big[e^{\int_0^t h(s)\, dW_s}\big] = e^{\frac{1}{2}\int_0^t h(s)^2\, ds}.$$

Exercise 9.1.9 Let $M_t$ be the exponential process (9.1.2). Use the previous exercise to show that for any $t > 0$:

(a) $E[M_t] = 1$;  (b) $E[M_t^2] = e^{\int_0^t u(s)^2\, ds}$.

Exercise 9.1.10 Let $\mathcal{F}_t = \sigma\{W_u;\; u \le t\}$. Show that the following processes are $\mathcal{F}_t$-martingales:

(a) $e^{t/2}\cos W_t$;
(b) $e^{t/2}\sin W_t$.
Recall that the Laplacian of a twice differentiable function $f$ is defined by $\Delta f(x) = \sum_{j=1}^n \partial^2_{x_j} f$.

Example 9.1.11 Consider a smooth function $f : \mathbb{R}^n \to \mathbb{R}$ such that
(i) $\Delta f = 0$;
(ii) $E\big[|f(W_t)|\big] < \infty$, $\forall t > 0$.
Then the process $X_t = f(W_t)$ is an $\mathcal{F}_t$-martingale.

Proof: It follows from the more general Example 9.1.13.

Exercise 9.1.12 Let $W_1(t)$ and $W_2(t)$ be two independent Brownian motions. Show that $X_t = e^{W_1(t)}\cos W_2(t)$ is a martingale.

Example 9.1.13 Let $f : \mathbb{R}^n \to \mathbb{R}$ be a smooth function such that
(i) $E\big[|f(W_t)|\big] < \infty$;
(ii) $E\big[\int_0^t |\Delta f(W_s)|\, ds\big] < \infty$.
Then the process $X_t = f(W_t) - \frac{1}{2}\int_0^t \Delta f(W_s)\, ds$ is a martingale.
Proof: For $0 \le s < t$ we have

$$E[X_t \mid \mathcal{F}_s] = E[f(W_t) \mid \mathcal{F}_s] - E\Big[\frac{1}{2}\int_0^t \Delta f(W_u)\, du \,\Big|\, \mathcal{F}_s\Big]$$

$$= E[f(W_t) \mid \mathcal{F}_s] - \frac{1}{2}\int_0^s \Delta f(W_u)\, du - \int_s^t E\Big[\frac{1}{2}\Delta f(W_u) \,\Big|\, \mathcal{F}_s\Big]\, du. \qquad (9.1.3)$$

Let $p(t, y, x)$ be the probability density function of $W_t$. Integrating by parts and using that $p$ satisfies Kolmogorov's backward equation, we have

$$E\Big[\frac{1}{2}\Delta f(W_u) \,\Big|\, \mathcal{F}_s\Big] = \frac{1}{2}\int p(u - s, W_s, x)\,\Delta f(x)\, dx = \frac{1}{2}\int \Delta_x p(u - s, W_s, x)\, f(x)\, dx = \int \partial_u p(u - s, W_s, x)\, f(x)\, dx.$$

Then, using the Fundamental Theorem of Calculus, we obtain

$$\int_s^t E\Big[\frac{1}{2}\Delta f(W_u) \,\Big|\, \mathcal{F}_s\Big]\, du = \int_s^t \partial_u\Big(\int p(u - s, W_s, x)\, f(x)\, dx\Big)\, du$$

$$= \int p(t - s, W_s, x)\, f(x)\, dx - \lim_{\varepsilon \searrow 0}\int p(\varepsilon, W_s, x)\, f(x)\, dx$$

$$= E[f(W_t) \mid \mathcal{F}_s] - \int \delta(x - W_s)\, f(x)\, dx = E[f(W_t) \mid \mathcal{F}_s] - f(W_s).$$

Substituting in (9.1.3) yields

$$E[X_t \mid \mathcal{F}_s] = E[f(W_t) \mid \mathcal{F}_s] - \frac{1}{2}\int_0^s \Delta f(W_u)\, du - E[f(W_t) \mid \mathcal{F}_s] + f(W_s) \qquad (9.1.4)$$

$$= f(W_s) - \frac{1}{2}\int_0^s \Delta f(W_u)\, du \qquad (9.1.5)$$

$$= X_s. \qquad (9.1.6)$$

Hence $X_t$ is an $\mathcal{F}_t$-martingale.
Exercise 9.1.14 Use Example 9.1.13 to show that the following processes are martingales:

(a) $X_t = W_t^2 - t$;
(b) $X_t = W_t^3 - 3\int_0^t W_s\, ds$;
(c) $X_t = \frac{1}{n(n-1)}W_t^n - \frac{1}{2}\int_0^t W_s^{n-2}\, ds$;
(d) $X_t = e^{cW_t} - \frac{1}{2}c^2\int_0^t e^{cW_s}\, ds$, with $c$ constant;
(e) $X_t = \sin(cW_t) + \frac{1}{2}c^2\int_0^t \sin(cW_s)\, ds$, with $c$ constant.

Exercise 9.1.15 Let $f : \mathbb{R}^n \to \mathbb{R}$ be a function such that
(i) $E\big[|f(W_t)|\big] < \infty$;
(ii) $\Delta f = \lambda f$, with $\lambda$ constant.
Show that the process $X_t = f(W_t) - \frac{\lambda}{2}\int_0^t f(W_s)\, ds$ is a martingale.
9.2 Girsanov's Theorem

In this section we shall present and prove a particular version of Girsanov's theorem, which will suffice for the purpose of later applications. The application of Girsanov's theorem to finance is to show that in the study of security markets the differences between the mean rates of return can be removed.

We shall recall first a few basic notions. Let $(\Omega, \mathcal{F}, P)$ be a probability space. When dealing with an $\mathcal{F}_t$-martingale on the aforementioned probability space, the filtration $\mathcal{F}_t$ is considered to be the sigma-algebra generated by the given Brownian motion $W_t$, i.e. $\mathcal{F}_t = \sigma\{W_u;\; 0 \le u \le t\}$. By default, a martingale is considered with respect to the probability measure $P$, in the sense that the expectations involve an integration with respect to $P$:

$$E^P[X] = \int_{\Omega} X(\omega)\, dP(\omega).$$

We have not used the superscript until now since there was no doubt which probability measure was used. In this section we shall also use another probability measure given by

$$dQ = M_T\, dP,$$

where $M_T$ is an exponential process. This means that $Q : \mathcal{F} \to \mathbb{R}$ is given by

$$Q(A) = \int_A dQ = \int_A M_T\, dP, \qquad A \in \mathcal{F}.$$

Since $M_T > 0$ and $M_0 = 1$, using the martingale property of $M_t$ yields

$$Q(A) > 0, \qquad A \ne \emptyset;$$

$$Q(\Omega) = \int_{\Omega} M_T\, dP = E^P[M_T] = E^P[M_T \mid \mathcal{F}_0] = M_0 = 1,$$

which shows that $Q$ is a probability on $\mathcal{F}$, and hence $(\Omega, \mathcal{F}, Q)$ becomes a probability space. The following transformation of expectation from the probability measure $Q$ to $P$ will be useful in Part II. If $X$ is a random variable, then

$$E^Q[X] = \int_{\Omega} X(\omega)\, dQ(\omega) = \int_{\Omega} X(\omega)M_T(\omega)\, dP(\omega) = E^P[XM_T].$$

The following result will play a central role in proving Girsanov's theorem:
Lemma 9.2.1 Let $X_t$ be the Ito process

$$dX_t = u(t)\,dt + dW_t, \qquad X_0 = 0, \quad 0 \le t \le T,$$

with $u(s)$ a bounded function. Consider the exponential process

$$M_t = e^{-\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}.$$

Then $X_t$ is an $\mathcal{F}_t$-martingale with respect to the measure

$$dQ(\omega) = M_T(\omega)\,dP(\omega).$$

Proof: We need to prove that $X_t$ is an $\mathcal{F}_t$-martingale with respect to $Q$, so it suffices to show the following three properties:

1. Integrability of $X_t$. This part usually follows from standard manipulations of norm estimations. We shall do it here in detail, but we shall omit it in other proofs. Integrating in the equation of $X_t$ between $0$ and $t$ yields

$$X_t = \int_0^t u(s)\, ds + W_t. \qquad (9.2.7)$$

We start with an estimation of the expectation with respect to $P$:

$$E^P[X_t^2] = E^P\Big[\Big(\int_0^t u(s)\, ds\Big)^2 + 2\int_0^t u(s)\, ds\, W_t + W_t^2\Big] = \Big(\int_0^t u(s)\, ds\Big)^2 + 2\int_0^t u(s)\, ds\, E^P[W_t] + E^P[W_t^2]$$

$$= \Big(\int_0^t u(s)\, ds\Big)^2 + t < \infty, \qquad 0 \le t \le T,$$

where the last inequality follows from the norm estimation

$$\int_0^t u(s)\, ds \le \int_0^t |u(s)|\, ds \le \sqrt{t}\,\Big(\int_0^t |u(s)|^2\, ds\Big)^{1/2} \le \sqrt{t}\,\Big(\int_0^T |u(s)|^2\, ds\Big)^{1/2} = T^{1/2}\,\|u\|_{L^2[0,T]}.$$

Next we obtain an estimation with respect to $Q$:

$$E^Q[|X_t|]^2 = \Big(\int_{\Omega} |X_t|\, M_T\, dP\Big)^2 \le \int_{\Omega} |X_t|^2\, dP \int_{\Omega} M_T^2\, dP = E^P[X_t^2]\, E^P[M_T^2] < \infty,$$

since $E^P[X_t^2] < \infty$ and $E^P[M_T^2] = e^{\int_0^T u(s)^2\, ds} = e^{\|u\|^2_{L^2[0,T]}}$, see Exercise 9.1.9.
2. $\mathcal{F}_t$-predictability of $X_t$. This follows from equation (9.2.7) and the fact that $W_t$ is $\mathcal{F}_t$-predictable.

3. Conditional expectation of $X_t$. From Examples 9.1.3 and 9.1.6 recall that for any $0 \le t \le T$:

$M_t$ is an $\mathcal{F}_t$-martingale with respect to the probability measure $P$;
$X_t M_t$ is an $\mathcal{F}_t$-martingale with respect to the probability measure $P$.

We need to verify that

$$E^Q[X_t \mid \mathcal{F}_s] = X_s, \qquad \forall s \le t,$$

which can be written as

$$\int_A X_t\, dQ = \int_A X_s\, dQ, \qquad \forall A \in \mathcal{F}_s.$$

Since $dQ = M_T\, dP$, the previous relation becomes

$$\int_A X_t\, M_T\, dP = \int_A X_s\, M_T\, dP, \qquad \forall A \in \mathcal{F}_s.$$

This can be written in terms of conditional expectation as

$$E^P[X_t M_T \mid \mathcal{F}_s] = E^P[X_s M_T \mid \mathcal{F}_s]. \qquad (9.2.8)$$

We shall prove this identity by showing that both terms are equal to $X_s M_s$. Since $X_s$ is $\mathcal{F}_s$-predictable and $M_t$ is a martingale, the right side term becomes

$$E^P[X_s M_T \mid \mathcal{F}_s] = X_s\, E^P[M_T \mid \mathcal{F}_s] = X_s M_s, \qquad s \le T.$$

Let $s < t$. Using the tower property (see Proposition 1.11.4, part 3), the left side term becomes

$$E^P[X_t M_T \mid \mathcal{F}_s] = E^P\big[E^P[X_t M_T \mid \mathcal{F}_t] \,\big|\, \mathcal{F}_s\big] = E^P\big[X_t\, E^P[M_T \mid \mathcal{F}_t] \,\big|\, \mathcal{F}_s\big] = E^P\big[X_t M_t \,\big|\, \mathcal{F}_s\big] = X_s M_s,$$

where we used that $M_t$ and $X_t M_t$ are martingales and $X_t$ is $\mathcal{F}_t$-predictable. Hence (9.2.8) holds and $X_t$ is an $\mathcal{F}_t$-martingale with respect to the probability measure $Q$.
Lemma 9.2.2 Consider the process

$$X_t = \int_0^t u(s)\, ds + W_t, \qquad 0 \le t \le T,$$

with $u \in L^2[0, T]$ a deterministic function, and let $dQ = M_T\, dP$. Then

$$E^Q[X_t^2] = t.$$
Proof: Denote $U(t) = \int_0^t u(s)\, ds$. Then

$$E^Q[X_t^2] = E^P[X_t^2 M_T] = E^P[U^2(t)M_T + 2U(t)W_t M_T + W_t^2 M_T]$$

$$= U^2(t)\, E^P[M_T] + 2U(t)\, E^P[W_t M_T] + E^P[W_t^2 M_T]. \qquad (9.2.9)$$

From Exercise 9.1.9 (a) we have $E^P[M_T] = 1$. In order to compute $E^P[W_t M_T]$ we use the tower property and the martingale property of $M_t$:

$$E^P[W_t M_T] = E\big[E^P[W_t M_T \mid \mathcal{F}_t]\big] = E\big[W_t\, E^P[M_T \mid \mathcal{F}_t]\big] = E[W_t M_t]. \qquad (9.2.10)$$

Using the product rule,

$$d(W_t M_t) = M_t\, dW_t + W_t\, dM_t + dW_t\, dM_t = \big(M_t - u(t)M_t W_t\big)\,dW_t - u(t)M_t\, dt,$$

where we used $dM_t = -u(t)M_t\, dW_t$. Integrating between $0$ and $t$ yields

$$W_t M_t = \int_0^t \big(M_s - u(s)M_s W_s\big)\, dW_s - \int_0^t u(s)M_s\, ds.$$

Taking the expectation and using the property of Ito integrals we have

$$E[W_t M_t] = -\int_0^t u(s)\, E[M_s]\, ds = -\int_0^t u(s)\, ds = -U(t). \qquad (9.2.11)$$

Substituting into (9.2.10) yields

$$E^P[W_t M_T] = -U(t). \qquad (9.2.12)$$

For computing $E^P[W_t^2 M_T]$ we proceed in a similar way:

$$E^P[W_t^2 M_T] = E\big[E^P[W_t^2 M_T \mid \mathcal{F}_t]\big] = E\big[W_t^2\, E^P[M_T \mid \mathcal{F}_t]\big] = E[W_t^2 M_t]. \qquad (9.2.13)$$

Using the product rule yields

$$d(W_t^2 M_t) = M_t\, d(W_t^2) + W_t^2\, dM_t + d(W_t^2)\,dM_t$$

$$= M_t(2W_t\, dW_t + dt) - W_t^2\, u(t)M_t\, dW_t - (2W_t\, dW_t + dt)\, u(t)M_t\, dW_t$$

$$= M_t W_t\big(2 - u(t)W_t\big)\,dW_t + \big(M_t - 2u(t)W_t M_t\big)\,dt.$$

Integrating between $0$ and $t$,

$$W_t^2 M_t = \int_0^t M_s W_s\big(2 - u(s)W_s\big)\, dW_s + \int_0^t \big(M_s - 2u(s)W_s M_s\big)\, ds,$$

and taking the expected value we get

$$E[W_t^2 M_t] = \int_0^t \big(E[M_s] - 2u(s)\, E[W_s M_s]\big)\, ds = \int_0^t \big(1 + 2u(s)U(s)\big)\, ds = t + U^2(t),$$

where we used (9.2.11). Substituting into (9.2.13) yields

$$E^P[W_t^2 M_T] = t + U^2(t). \qquad (9.2.14)$$

Substituting (9.2.12) and (9.2.14) into relation (9.2.9) yields

$$E^Q[X_t^2] = U^2(t) - 2U(t)^2 + t + U^2(t) = t, \qquad (9.2.15)$$

which ends the proof of the lemma.
Now we are prepared to prove one of the most important results of stochastic calculus.

Theorem 9.2.3 (Girsanov Theorem) Let $u \in L^2[0, T]$ be a deterministic function. Then the process

$$X_t = \int_0^t u(s)\, ds + W_t, \qquad 0 \le t \le T,$$

is a Brownian motion with respect to the probability measure $Q$ given by

$$dQ = e^{-\int_0^T u(s)\, dW_s - \frac{1}{2}\int_0^T u(s)^2\, ds}\, dP.$$

Proof: In order to prove that $X_t$ is a Brownian motion on the probability space $(\Omega, \mathcal{F}, Q)$ we shall apply Levy's characterization theorem, see Theorem 2.1.5. Hence it suffices to show that $X_t$ is a Wiener process. Lemma 9.2.1 implies that the process $X_t$ satisfies the following properties:

1. $X_0 = 0$;
2. $X_t$ is continuous in $t$;
3. $X_t$ is a square integrable $\mathcal{F}_t$-martingale on the space $(\Omega, \mathcal{F}, Q)$.

The only property we still need to show is

4. $E^Q[(X_t - X_s)^2] = t - s$, $\forall s < t$.

Using Lemma 9.2.2, the martingale property of $X_t$ under $Q$, the additivity and the tower property of expectations, we have

$$E^Q[(X_t - X_s)^2] = E^Q[X_t^2] - 2E^Q[X_t X_s] + E^Q[X_s^2] = t - 2E^Q[X_t X_s] + s$$

$$= t - 2E^Q\big[E^Q[X_t X_s \mid \mathcal{F}_s]\big] + s = t - 2E^Q\big[X_s\, E^Q[X_t \mid \mathcal{F}_s]\big] + s = t - 2E^Q[X_s^2] + s = t - 2s + s = t - s,$$

which proves relation 4.
Choosing $u(s) = \lambda$, a constant, we obtain the following consequence that will be useful later in finance applications.

Corollary 9.2.4 Let $W_t$ be a Brownian motion on the probability space $(\Omega, \mathcal{F}, P)$. Then the process

$$X_t = \lambda t + W_t, \qquad 0 \le t \le T,$$

is a Brownian motion on the probability space $(\Omega, \mathcal{F}, Q)$, where

$$dQ = e^{-\frac{1}{2}\lambda^2 T - \lambda W_T}\, dP.$$
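Corollary 9.2.4 can be illustrated numerically by reweighting simulated paths with the density $M_T$. The following sketch (not from the original text; $\lambda = 0.8$ and $T = 1$ are assumed values) checks that, under $Q$, the drifted process $X_t = \lambda t + W_t$ has mean $0$ and variance $t$ at $t = T$.

```python
import numpy as np

# Sketch: check Corollary 9.2.4 by reweighting.  Under dQ = M_T dP with
# M_T = exp(-lam*W_T - lam^2*T/2), the process X_t = lam*t + W_t should
# have Q-mean 0 and Q-variance t; we test this at t = T by Monte Carlo.
rng = np.random.default_rng(6)
lam, T, n_paths = 0.8, 1.0, 1_000_000

W_T = rng.normal(0.0, np.sqrt(T), n_paths)
X_T = lam * T + W_T
M_T = np.exp(-lam * W_T - 0.5 * lam**2 * T)   # Radon-Nikodym density

print(np.mean(M_T))              # ~1  (Q is a probability measure)
print(np.mean(X_T * M_T))        # ~0  (E^Q[X_T] = 0)
print(np.mean(X_T**2 * M_T))     # ~T  (E^Q[X_T^2] = T)
```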
Part II

Applications to Finance

Chapter 10

Modeling Stochastic Rates

Elementary Calculus provides powerful methods of modeling phenomena from the real world. However, the real world is imperfect, and in order to study it, one needs to employ methods of Stochastic Calculus.
10.1 An Introductory Problem

The model in an ideal world

Consider the amount of money $M(t)$ at time $t$ invested in a bank account that pays interest at a constant rate $r$. The differential equation which models this problem is

$$dM(t) = rM(t)\,dt. \qquad (10.1.1)$$

Given the initial investment $M(0) = M_0$, the account balance at time $t$ is given by the solution of the equation, $M(t) = M_0 e^{rt}$.

The model in the real world

In the real world the interest rate $r$ is not constant. It may be assumed constant only for a very small amount of time, such as one day or one week. The interest rate changes unpredictably in time, which means that it is a stochastic process. This can be modeled in several different ways. For instance, we may assume that the interest rate at time $t$ is given by the continuous stochastic process $r_t = r + \sigma W_t$, where $\sigma > 0$ is a constant that controls the volatility of the rate, and $W_t$ is a Brownian motion process. The process $r_t$ represents a diffusion that starts at $r_0 = r$, with constant mean $E[r_t] = r$ and variance proportional to the time elapsed, $Var[r_t] = \sigma^2 t$. With this change in the model, the account balance at time $t$ becomes a stochastic process $M_t$ that satisfies the stochastic equation

$$dM_t = (r + \sigma W_t)M_t\, dt, \qquad t \ge 0. \qquad (10.1.2)$$
Solving the equation

In order to solve this equation, we write it as $dM_t - r_t M_t\, dt = 0$ and multiply by the integrating factor $e^{-\int_0^t r_s\, ds}$. We can check that

$$d\big(e^{-\int_0^t r_s\, ds}\big) = -e^{-\int_0^t r_s\, ds}\, r_t\, dt,$$

$$dM_t\, d\big(e^{-\int_0^t r_s\, ds}\big) = 0,$$

since $dt^2 = dt\, dW_t = 0$. Using the product rule, the equation becomes exact:

$$d\big(M_t\, e^{-\int_0^t r_s\, ds}\big) = 0.$$

Integrating yields the solution

$$M_t = M_0\, e^{\int_0^t r_s\, ds} = M_0\, e^{\int_0^t (r + \sigma W_s)\, ds} = M_0\, e^{rt + \sigma Z_t},$$

where $Z_t = \int_0^t W_s\, ds$ is the integrated Brownian motion process. Since the moment generating function of $Z_t$ is $m(\sigma) = E[e^{\sigma Z_t}] = e^{\sigma^2 t^3/6}$ (see Exercise 2.3.4), we obtain

$$E[e^{\sigma Z_t}] = e^{\sigma^2 t^3/6};$$

$$Var[e^{\sigma Z_t}] = m(2\sigma) - m(\sigma)^2 = e^{\sigma^2 t^3/3}\big(e^{\sigma^2 t^3/3} - 1\big).$$

Then the mean and variance of the solution $M_t = M_0\, e^{rt + \sigma Z_t}$ are

$$E[M_t] = M_0\, e^{rt}\, E[e^{\sigma Z_t}] = M_0\, e^{rt + \sigma^2 t^3/6};$$

$$Var[M_t] = M_0^2\, e^{2rt}\, Var[e^{\sigma Z_t}] = M_0^2\, e^{2rt + \sigma^2 t^3/3}\big(e^{\sigma^2 t^3/3} - 1\big).$$
Conclusion

We shall make a few interesting remarks. If $M(t)$ and $M_t$ represent the balance at time $t$ in the ideal and real worlds, respectively, then

$$E[M_t] = M_0\, e^{rt}\, e^{\sigma^2 t^3/6} > M_0\, e^{rt} = M(t).$$

This means that we expect to have more money in the real world account than in the ideal world account. Similarly, a bank can expect to make more money when lending at a stochastic interest rate than at a constant interest rate. This inequality is due to the convexity of the exponential function. If $X_t = rt + \sigma Z_t$, then Jensen's inequality yields

$$E[e^{X_t}] \ge e^{E[X_t]} = e^{rt}.$$
10.2 Langevin's Equation

We shall consider another stochastic extension of the equation (10.1.1). We shall allow for continuously random deposits and withdrawals, which can be modeled by an unpredictable term, given by $\alpha\, dW_t$, with $\alpha$ constant. The obtained equation

$$dM_t = rM_t\, dt + \alpha\, dW_t, \qquad t \ge 0, \qquad (10.2.3)$$

is called Langevin's equation.

Solving the equation

We shall solve it as a linear stochastic equation. Multiplying by the integrating factor $e^{-rt}$ yields

$$d\big(e^{-rt}M_t\big) = \alpha\, e^{-rt}\, dW_t.$$

Integrating we obtain

$$e^{-rt}M_t = M_0 + \alpha\int_0^t e^{-rs}\, dW_s.$$

Hence the solution is

$$M_t = M_0\, e^{rt} + \alpha\int_0^t e^{r(t-s)}\, dW_s. \qquad (10.2.4)$$

This is called the Ornstein-Uhlenbeck process. Since the last term is a Wiener integral, by Proposition 7.3.1 we have that $M_t$ is Gaussian with the mean

$$E[M_t] = M_0\, e^{rt} + E\Big[\alpha\int_0^t e^{r(t-s)}\, dW_s\Big] = M_0\, e^{rt}$$

and variance

$$Var[M_t] = Var\Big[\alpha\int_0^t e^{r(t-s)}\, dW_s\Big] = \frac{\alpha^2}{2r}\big(e^{2rt} - 1\big).$$

It is worth noting that the expected balance is equal to the ideal world balance $M_0 e^{rt}$. The variance for $t$ small is approximately equal to $\alpha^2 t$, which is the variance of $\alpha W_t$.
If the constant $\alpha$ is replaced by an unpredictable function $\alpha(t, W_t)$, the equation becomes

$$dM_t = rM_t\, dt + \alpha(t, W_t)\,dW_t, \qquad t \ge 0.$$

Using a similar argument we arrive at the following solution:

$$M_t = M_0\, e^{rt} + \int_0^t e^{r(t-s)}\,\alpha(s, W_s)\, dW_s. \qquad (10.2.5)$$

This process is not Gaussian. Its mean and variance are given by

$$E[M_t] = M_0\, e^{rt},$$

$$Var[M_t] = \int_0^t e^{2r(t-s)}\, E[\alpha^2(s, W_s)]\, ds.$$

In the particular case when $\alpha(t, W_t) = e^{\sqrt{2r}\,W_t}$, using Application 6.3.9 with $\lambda = \sqrt{2r}$, we can work out for (10.2.5) an explicit form of the solution:

$$M_t = M_0\, e^{rt} + \int_0^t e^{r(t-s)}\, e^{\sqrt{2r}\,W_s}\, dW_s = M_0\, e^{rt} + e^{rt}\int_0^t e^{-rs}\, e^{\sqrt{2r}\,W_s}\, dW_s$$

$$= M_0\, e^{rt} + e^{rt}\,\frac{1}{\sqrt{2r}}\big(e^{-rt + \sqrt{2r}\,W_t} - 1\big) = M_0\, e^{rt} + \frac{1}{\sqrt{2r}}\big(e^{\sqrt{2r}\,W_t} - e^{rt}\big).$$

The previous model for the interest rate is not a realistic one, because it allows for negative rates. The process $\sigma W_t$ hits the level $-r$ after a finite time with probability 1. However, it might be a good stochastic model for something such as the evolution of a population, since in this case the rate can be negative.
10.3 Equilibrium Models

Let $r_t$ denote the spot rate at time $t$. This is the rate at which one can invest for the shortest period of time.¹ For the sake of simplicity, we assume the interest rate $r_t$ satisfies an equation with one source of uncertainty

$$dr_t = m(r_t)\,dt + \sigma(r_t)\,dW_t. \qquad (10.3.6)$$

The drift rate and volatility of the spot rate $r_t$ do not depend explicitly on the time $t$. There are several classical choices for $m(r_t)$ and $\sigma(r_t)$ that will be discussed in the following sections.

¹The longer the investment period, the larger the rate. The instantaneous rate or spot rate is the rate at which one can invest for a short period of time, such as one day or one second.

10.3.1 The Rendleman and Bartter Model

The model introduced in 1986 by Rendleman and Bartter [14] assumes that the short-time rate satisfies the process

$$dr_t = \mu r_t\, dt + \sigma r_t\, dW_t.$$

The growth rate $\mu$ and the volatility $\sigma$ are considered constants. This equation has been solved in Example 7.7.2 and its solution is given by

$$r_t = r_0\, e^{(\mu - \frac{\sigma^2}{2})t + \sigma W_t}.$$
Figure 10.1: A simulation of $dr_t = a(b - r_t)\,dt + \sigma\, dW_t$, with $r_0 = 1.25$, $a = 3$, $\sigma = 1\%$, $b = 1.2$; the horizontal line marks the equilibrium level.
The distribution of $r_t$ is log-normal. Using Example 7.2.1, the mean and variance become

$$E[r_t] = r_0\, e^{(\mu - \frac{\sigma^2}{2})t}\, E[e^{\sigma W_t}] = r_0\, e^{(\mu - \frac{\sigma^2}{2})t}\, e^{\sigma^2 t/2} = r_0\, e^{\mu t},$$

$$Var[r_t] = r_0^2\, e^{2(\mu - \frac{\sigma^2}{2})t}\, Var[e^{\sigma W_t}] = r_0^2\, e^{2(\mu - \frac{\sigma^2}{2})t}\, e^{\sigma^2 t}\big(e^{\sigma^2 t} - 1\big) = r_0^2\, e^{2\mu t}\big(e^{\sigma^2 t} - 1\big).$$

The next two models incorporate the mean reverting phenomenon of interest rates. This means that in the long run the rate converges towards an average level. These models are more realistic and are based on economic arguments.
10.3.2 The Vasicek Model

Vasicek's assumption is that the short-term interest rates should satisfy the mean reverting stochastic differential equation

$$dr_t = a(b - r_t)\,dt + \sigma\, dW_t, \qquad (10.3.7)$$

with $a, b, \sigma$ positive constants, see Vasicek [16].

Assuming the spot rates are deterministic, we take $\sigma = 0$ and obtain the ODE

$$dr_t = a(b - r_t)\,dt.$$

Solving it as a linear equation yields the solution

$$r_t = b + (r_0 - b)e^{-at}.$$

This implies that the rate $r_t$ is pulled towards level $b$ at the rate $a$. This means that, if $r_0 > b$, then $r_t$ is decreasing towards $b$, and if $r_0 < b$, then $r_t$ is increasing towards the horizontal asymptote $b$. The term $\sigma\, dW_t$ in Vasicek's model adds some "white noise" to the process. In the following we shall find an explicit formula for the spot rate $r_t$ in the stochastic case.
Proposition 10.3.1 The solution of the equation (10.3.7) is given by

$$r_t = b + (r_0 - b)e^{-at} + \sigma e^{-at}\int_0^t e^{as}\, dW_s.$$

The process $r_t$ is Gaussian with mean and variance

$$E[r_t] = b + (r_0 - b)e^{-at};$$

$$Var[r_t] = \frac{\sigma^2}{2a}\big(1 - e^{-2at}\big).$$

Proof: Multiplying by the integrating factor $e^{at}$ yields

$$d\big(e^{at}r_t\big) = abe^{at}\,dt + \sigma e^{at}\,dW_t.$$

Integrating between $0$ and $t$ and dividing by $e^{at}$ we get

$$r_t = r_0\, e^{-at} + be^{-at}\big(e^{at} - 1\big) + \sigma e^{-at}\int_0^t e^{as}\, dW_s = b + (r_0 - b)e^{-at} + \sigma e^{-at}\int_0^t e^{as}\, dW_s.$$

Since the spot rate $r_t$ is the sum of the predictable function $r_0 e^{-at} + be^{-at}(e^{at} - 1)$ and a multiple of a Wiener integral, from Proposition 4.6.1 it follows that $r_t$ is Gaussian, with

$$E[r_t] = b + (r_0 - b)e^{-at},$$

$$Var(r_t) = Var\Big[\sigma e^{-at}\int_0^t e^{as}\, dW_s\Big] = \sigma^2 e^{-2at}\int_0^t e^{2as}\, ds = \frac{\sigma^2}{2a}\big(1 - e^{-2at}\big).$$

The following consequence explains the name of mean reverting rate.

Remark 10.3.2 Since $\lim_{t \to \infty} E[r_t] = \lim_{t \to \infty}\big(b + (r_0 - b)e^{-at}\big) = b$, it follows that the spot rate $r_t$ tends to $b$ as $t \to \infty$. The variance tends in the long run to $\frac{\sigma^2}{2a}$.

Since $r_t$ is normally distributed, the Vasicek model has been criticized because it allows for negative interest rates and unbounded large rates. See Fig. 10.1 for a simulation of the short-term interest rates for the Vasicek model.
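A path of the kind plotted in Fig. 10.1 can be produced with a simple Euler scheme. The sketch below is an illustration added here; the parameters follow the figure caption, while the horizon $T = 10$ is an assumption made so that the long-run statistics of Proposition 10.3.1 are visible.

```python
import numpy as np

# Sketch: Euler simulation of the Vasicek model
#   dr_t = a (b - r_t) dt + sigma dW_t
# with the parameters of Fig. 10.1.
rng = np.random.default_rng(7)
a, b, sigma, r0 = 3.0, 1.2, 0.01, 1.25
T, n = 10.0, 6000
dt = T / n

r = np.empty(n + 1)
r[0] = r0
for k in range(n):
    r[k + 1] = r[k] + a * (b - r[k]) * dt + sigma * np.sqrt(dt) * rng.normal()

# Long-run behavior matches Proposition 10.3.1: mean -> b, sd -> sigma/sqrt(2a)
print(r[n // 2:].mean(), b)
print(r[n // 2:].std(), sigma / np.sqrt(2 * a))
```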
Exercise 10.3.3 Let $0 \le s < t$. Find the following expectations:

(a) $E\big[W_t\int_0^s W_u\, e^{au}\, du\big]$;

(b) $E\big[\int_0^t W_u\, e^{au}\, du\,\int_0^s W_v\, e^{av}\, dv\big]$.
Figure 10.2: Comparison between a simulation in the CIR and Vasicek models, with parameter values $a = 3$, $\sigma = 15\%$, $r_0 = 12$, $b = 10$ (the horizontal line marks the equilibrium level). Note that the CIR process tends to be more volatile than Vasicek's.
Exercise 10.3.4 (a) Find the probability that $r_t$ is negative.
(b) What happens with this probability when $t \to \infty$?
(c) Find the rate of change of this probability.
(d) Compute $Cov(r_s, r_t)$.
10.3.3 The Cox-Ingersoll-Ross Model

The Cox-Ingersoll-Ross (CIR) model assumes that the spot rates verify the stochastic equation

$$dr_t = a(b - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t, \qquad (10.3.8)$$

with $a, b, \sigma$ constants, see [5]. Two main advantages of this model are:

the process exhibits mean reversion;
it is not possible for the interest rates to become negative.

A process that satisfies equation (10.3.8) is called a CIR process. This is not a Gaussian process. In the following we shall compute its first two moments.

To get the first moment, we integrate the equation (10.3.8) between $0$ and $t$, which yields

$$r_t = r_0 + abt - a\int_0^t r_s\, ds + \sigma\int_0^t \sqrt{r_s}\, dW_s.$$

Taking the expectation we obtain

$$E[r_t] = r_0 + abt - a\int_0^t E[r_s]\, ds.$$
Denote $\mu(t) = E[r_t]$. Then differentiating with respect to $t$ in

$$\mu(t) = r_0 + abt - a\int_0^t \mu(s)\, ds$$

yields the differential equation $\mu'(t) = ab - a\mu(t)$. Multiplying by the integrating factor $e^{at}$ we obtain

$$d\big(e^{at}\mu(t)\big) = abe^{at}\,dt.$$

Integrating and using that $\mu(0) = r_0$ provides the solution

$$\mu(t) = b + e^{-at}(r_0 - b).$$

Hence

$$\lim_{t \to \infty} E[r_t] = \lim_{t \to \infty}\big(b + e^{-at}(r_0 - b)\big) = b,$$

which shows that the process is mean reverting.
We compute in the following the second moment $\mu_2(t) = E[r_t^2]$. By Ito's formula we have

$$d(r_t^2) = 2r_t\, dr_t + (dr_t)^2 = 2r_t\, dr_t + \sigma^2 r_t\, dt = 2r_t\big[a(b - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t\big] + \sigma^2 r_t\, dt$$

$$= \big[(2ab + \sigma^2)r_t - 2ar_t^2\big]\,dt + 2\sigma r_t^{3/2}\, dW_t.$$

Integrating we get

$$r_t^2 = r_0^2 + \int_0^t \big[(2ab + \sigma^2)r_s - 2ar_s^2\big]\, ds + 2\sigma\int_0^t r_s^{3/2}\, dW_s.$$

Taking the expectation yields

$$\mu_2(t) = r_0^2 + \int_0^t \big[(2ab + \sigma^2)\mu(s) - 2a\mu_2(s)\big]\, ds.$$

Differentiating again:

$$\mu_2'(t) = (2ab + \sigma^2)\mu(t) - 2a\mu_2(t).$$

Solving as a linear differential equation in $\mu_2(t)$ yields

$$d\big(\mu_2(t)e^{2at}\big) = (2ab + \sigma^2)\,e^{2at}\mu(t)\,dt.$$

Substituting the value of $\mu(t)$ and integrating yields

$$\mu_2(t)\,e^{2at} = r_0^2 + (2ab + \sigma^2)\Big[\frac{b}{2a}\big(e^{2at} - 1\big) + \frac{r_0 - b}{a}\big(e^{at} - 1\big)\Big].$$

Hence the second moment has the formula

$$\mu_2(t) = r_0^2\, e^{-2at} + (2ab + \sigma^2)\Big[\frac{b}{2a}\big(1 - e^{-2at}\big) + \frac{r_0 - b}{a}\big(1 - e^{-at}\big)e^{-at}\Big].$$
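The moment formulas derived above can be checked against a simulation of the kind shown in Fig. 10.2. The sketch below is illustrative: it uses an Euler scheme in which the square root is applied to $\max(r, 0)$ (a common "full truncation" fix, an assumption made here so the discretized paths remain well defined), with the parameters of the figure caption.

```python
import numpy as np

# Sketch: Euler scheme for the CIR equation
#   dr_t = a (b - r_t) dt + sigma sqrt(r_t) dW_t,
# checking the first-moment formula mu(t) = b + e^{-a t}(r_0 - b).
rng = np.random.default_rng(8)
a, b, sigma, r0 = 3.0, 10.0, 0.15, 12.0
T, n, n_paths = 1.0, 1000, 10_000
dt = T / n

r = np.full(n_paths, r0)
for _ in range(n):
    r += a * (b - r) * dt \
         + sigma * np.sqrt(np.maximum(r, 0.0) * dt) * rng.normal(size=n_paths)

mu_exact = b + np.exp(-a * T) * (r0 - b)
print(np.mean(r), mu_exact)   # Monte Carlo vs closed form
```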
Exercise 10.3.5 Use a similar method to find a recursive formula for the moments of a CIR process.
10.4 No-arbitrage Models

In the following models the drift rate is a function of time, which is chosen such that the model is consistent with the term structure.

10.4.1 The Ho and Lee Model

The first no-arbitrage model was proposed in 1986 by Ho and Lee [7]. The model was presented initially in the form of a binomial tree. The continuous time-limit of this model is

$$dr_t = \theta(t)\,dt + \sigma\, dW_t.$$

In this model $\theta(t)$ is the average direction in which $r_t$ moves, and it is considered independent of $r_t$, while $\sigma$ is the standard deviation of the short rate. The solution process is Gaussian and is given by

$$r_t = r_0 + \int_0^t \theta(s)\, ds + \sigma W_t.$$

If $F(0, t)$ denotes the forward rate at time $t$ as seen at time $0$, it is known that $\theta(t) = F_t(0, t) + \sigma^2 t$. In this case the solution becomes

$$r_t = r_0 + F(0, t) + \frac{\sigma^2}{2}t^2 + \sigma W_t.$$
10.4.2 The Hull and White Model

The model proposed by Hull and White [8] is an extension of the Ho and Lee model that incorporates mean reversion:

$$dr_t = \big(\theta(t) - ar_t\big)\,dt + \sigma\, dW_t,$$

with $a$ and $\sigma$ constants. We can solve the equation by multiplying by the integrating factor $e^{at}$:

$$d\big(e^{at}r_t\big) = \theta(t)\,e^{at}\,dt + \sigma e^{at}\,dW_t.$$

Integrating between $0$ and $t$ yields

$$r_t = r_0\, e^{-at} + e^{-at}\int_0^t \theta(s)\,e^{as}\, ds + \sigma e^{-at}\int_0^t e^{as}\, dW_s. \qquad (10.4.9)$$

Since the first two terms are deterministic and the last is a Wiener integral, the process $r_t$ is Gaussian.

The function $\theta(t)$ can be calculated from the term structure $F(0, t)$ as

$$\theta(t) = \partial_t F(0, t) + aF(0, t) + \frac{\sigma^2}{2a}\big(1 - e^{-2at}\big).$$

Then

$$\int_0^t \theta(s)\,e^{as}\, ds = \int_0^t \partial_s F(0, s)\,e^{as}\, ds + a\int_0^t F(0, s)\,e^{as}\, ds + \frac{\sigma^2}{2a}\int_0^t \big(1 - e^{-2as}\big)e^{as}\, ds$$

$$= F(0, t)\,e^{at} - r_0 + \frac{\sigma^2}{a^2}\big(\cosh(at) - 1\big),$$

where we used that $F(0, 0) = r_0$. The deterministic part of $r_t$ becomes

$$r_0\, e^{-at} + e^{-at}\int_0^t \theta(s)\,e^{as}\, ds = F(0, t) + \frac{\sigma^2}{a^2}\,e^{-at}\big(\cosh(at) - 1\big).$$

An algebraic manipulation shows that

$$e^{-at}\big(\cosh(at) - 1\big) = \frac{1}{2}\big(1 - e^{-at}\big)^2.$$

Substituting into (10.4.9) yields

$$r_t = F(0, t) + \frac{\sigma^2}{2a^2}\big(1 - e^{-at}\big)^2 + \sigma e^{-at}\int_0^t e^{as}\, dW_s.$$

The mean and variance are:

$$E[r_t] = F(0, t) + \frac{\sigma^2}{2a^2}\big(1 - e^{-at}\big)^2,$$

$$Var(r_t) = \sigma^2 e^{-2at}\, Var\Big[\int_0^t e^{as}\, dW_s\Big] = \sigma^2 e^{-2at}\int_0^t e^{2as}\, ds = \frac{\sigma^2}{2a}\big(1 - e^{-2at}\big).$$
10.5 Nonstationary Models

These models assume both $\theta$ and $\sigma$ as functions of time. In the following we shall discuss two models with this property.

10.5.1 Black, Derman and Toy Model

The binomial tree of Black, Derman and Toy [2] is equivalent to the following continuous model of the short-time rate

$$d(\ln r_t) = \Big(\theta(t) + \frac{\sigma'(t)}{\sigma(t)}\ln r_t\Big)\,dt + \sigma(t)\,dW_t.$$

Making the substitution $u_t = \ln r_t$, we obtain a linear equation in $u_t$:

$$du_t = \Big(\theta(t) + \frac{\sigma'(t)}{\sigma(t)}u_t\Big)\,dt + \sigma(t)\,dW_t.$$

The equation can be written equivalently as

$$\frac{\sigma(t)\,du_t - d\sigma(t)\, u_t}{\sigma^2(t)} = \frac{\theta(t)}{\sigma(t)}\,dt + dW_t,$$

which after using the quotient rule becomes

$$d\Big(\frac{u_t}{\sigma(t)}\Big) = \frac{\theta(t)}{\sigma(t)}\,dt + dW_t.$$

Integrating and solving for $u_t$ leads to

$$u_t = \frac{u_0}{\sigma(0)}\,\sigma(t) + \sigma(t)\int_0^t \frac{\theta(s)}{\sigma(s)}\, ds + \sigma(t)\,W_t.$$

This implies that $u_t$ is Gaussian and hence $r_t = e^{u_t}$ is log-normal for each $t$. Using $u_0 = \ln r_0$ and

$$e^{\frac{u_0}{\sigma(0)}\sigma(t)} = e^{\frac{\sigma(t)}{\sigma(0)}\ln r_0} = r_0^{\frac{\sigma(t)}{\sigma(0)}},$$

we obtain the following explicit formula for the spot rate

$$r_t = r_0^{\frac{\sigma(t)}{\sigma(0)}}\, e^{\sigma(t)\int_0^t \frac{\theta(s)}{\sigma(s)}\, ds}\, e^{\sigma(t)W_t}. \qquad (10.5.10)$$

Since $\sigma(t)W_t$ is normally distributed with mean $0$ and variance $\sigma^2(t)\,t$, the log-normal variable $e^{\sigma(t)W_t}$ has

$$E[e^{\sigma(t)W_t}] = e^{\sigma^2(t)\,t/2},$$

$$Var[e^{\sigma(t)W_t}] = e^{\sigma(t)^2 t}\big(e^{\sigma(t)^2 t} - 1\big).$$

Hence

$$E[r_t] = r_0^{\frac{\sigma(t)}{\sigma(0)}}\, e^{\sigma(t)\int_0^t \frac{\theta(s)}{\sigma(s)}\, ds}\, e^{\sigma^2(t)\,t/2},$$

$$Var[r_t] = r_0^{\frac{2\sigma(t)}{\sigma(0)}}\, e^{2\sigma(t)\int_0^t \frac{\theta(s)}{\sigma(s)}\, ds}\, e^{\sigma(t)^2 t}\big(e^{\sigma(t)^2 t} - 1\big).$$

Exercise 10.5.1 (a) Solve the Black, Derman and Toy model in the case when $\sigma$ is constant.
(b) Show that in this case the spot rate $r_t$ is log-normally distributed.
(c) Find the mean and the variance of $r_t$.
10.5.2 Black and Karasinski Model

In 1991 Black and Karasinski [3] proposed the following more general model for the spot rates:

$$d(\ln r_t) = \big(\theta(t) - a(t)\ln r_t\big)\,dt + \sigma(t)\,dW_t.$$

Exercise 10.5.2 Find the explicit formula for $r_t$ in this case.
Chapter 11

Bond Valuation and Yield Curves

A bond is a contract between a buyer and a financial institution (bank, government, etc.) by which the financial institution agrees to pay a certain principal to the buyer at a determined time $T$ in the future, plus some periodic coupon payments made during the lifetime of the contract. If there are no coupons to be paid, the bond is called a zero coupon bond or a discount bond. The price of a bond at any time $0 \le t \le T$ is denoted by $P(t, T)$.

If the spot rates are constant, $r_t = r$, then the price at time $t$ of a discount bond that pays off $\$1$ at time $T$ is given by

$$P(t, T) = e^{-r(T-t)}.$$

If the spot interest rates depend deterministically on time, $r_t = r(t)$, the formula becomes

$$P(t, T) = e^{-\int_t^T r(s)\, ds}.$$

However, the spot rates $r_t$ are in general stochastic. In this case, the expression $e^{-\int_t^T r_s\, ds}$ is a random variable, and the price of the bond can be calculated as the expectation

$$P(t, T) = E_t\big[e^{-\int_t^T r_s\, ds}\big],$$

where $E_t$ is the expectation as of time $t$. If $\bar{r} = \frac{1}{T - t}\int_t^T r_s\, ds$ denotes the average value of the spot rate in the time interval between $t$ and $T$, the previous formula can be written in the equivalent form

$$P(t, T) = E_t\big[e^{-\bar{r}(T-t)}\big].$$
11.1 The Case of a Brownian Motion Spot Rate

Assume the spot rate $r_t$ satisfies the stochastic equation

$$dr_t = \sigma\, dW_t,$$

with $\sigma > 0$ constant. We shall show that the price of the zero-coupon bond that pays off $\$1$ at time $T$ is given by

$$P(t, T) = e^{-r_t(T-t) + \frac{1}{6}\sigma^2(T-t)^3}.$$

Let $0 < t < T$ be fixed. Solving the stochastic differential equation, we have for any $t < s < T$

$$r_s = r_t + \sigma(W_s - W_t).$$

Integrating yields

$$\int_t^T r_s\, ds = r_t(T - t) + \sigma\int_t^T (W_s - W_t)\, ds.$$

Then taking the exponential, we obtain

$$e^{-\int_t^T r_s\, ds} = e^{-r_t(T-t)}\, e^{-\sigma\int_t^T (W_s - W_t)\, ds}.$$

The price of the bond at time $t$ is given by

$$P(t, T) = E\big[e^{-\int_t^T r_s\, ds} \,\big|\, \mathcal{F}_t\big] = e^{-r_t(T-t)}\, E\big[e^{-\sigma\int_t^T (W_s - W_t)\, ds} \,\big|\, \mathcal{F}_t\big]$$

$$= e^{-r_t(T-t)}\, E\big[e^{-\sigma\int_t^T (W_s - W_t)\, ds}\big] = e^{-r_t(T-t)}\, E\big[e^{-\sigma\int_0^{T-t} W_s\, ds}\big] = e^{-r_t(T-t)}\, e^{\frac{\sigma^2}{2}\frac{(T-t)^3}{3}}.$$

In the second identity we took the $\mathcal{F}_t$-predictable part out of the expectation, while in the third identity we dropped the condition since $W_s - W_t$ is independent of the information set $\mathcal{F}_t$ for any $t < s$. The fourth identity invoked stationarity. The last identity follows from Exercise 1.6.2 (b) and from the fact that $\sigma\int_0^{T-t} W_s\, ds$ is normally distributed with mean $0$ and variance $\frac{\sigma^2(T-t)^3}{3}$.

It is worth noting that the price of the bond, $P(t, T)$, depends only on the spot rate at time $t$, $r_t$, and the time difference $T - t$.
Exercise 11.1.1 Spot rates which exhibit positive jumps of size $\sigma > 0$ satisfy the stochastic equation

$$dr_t = \sigma\, dN_t,$$

where $N_t$ is the Poisson process. Find the price of the zero-coupon bond that pays off $\$1$ at time $T$.
11.2 The Case of Vasicek's Model

The next result regarding bond pricing is due to Vasicek [16]. Its initial proof is based on partial differential equations, while here we provide an approach based solely on expectations.

Proposition 11.2.1 Assume the spot rate $r_t$ satisfies Vasicek's mean reverting model

$$dr_t = a(b - r_t)\,dt + \sigma\, dW_t.$$

Then the price of the zero-coupon bond that pays off $\$1$ at time $T$ is given by

$$P(t, T) = A(t, T)\, e^{-B(t,T)\, r_t}, \qquad (11.2.1)$$

where

$$B(t, T) = \frac{1 - e^{-a(T-t)}}{a},$$

$$A(t, T) = e^{\frac{(B(t,T) - T + t)(a^2 b - \sigma^2/2)}{a^2} - \frac{\sigma^2 B(t,T)^2}{4a}}.$$
We start the deduction of the previous formula by integrating the stochastic differential equation between $t$ and $T$, which yields

$$a\int_t^T r_s\, ds = -(r_T - r_t) + \sigma(W_T - W_t) + ab(T - t).$$

Taking the exponential we obtain

$$e^{-\int_t^T r_s\, ds} = e^{\frac{1}{a}(r_T - r_t) - \frac{\sigma}{a}(W_T - W_t) - b(T-t)}. \qquad (11.2.2)$$

Multiplying the stochastic differential equation by $e^{as}$ yields the exact equation

$$d\big(e^{as}r_s\big) = abe^{as}\,ds + \sigma e^{as}\,dW_s.$$

Integrating between $t$ and $T$ we get

$$e^{aT}r_T - e^{at}r_t = b\big(e^{aT} - e^{at}\big) + \sigma\int_t^T e^{as}\, dW_s.$$

Solving for $r_T$ and subtracting $r_t$ yields

$$r_T - r_t = -r_t\big(1 - e^{-a(T-t)}\big) + b\big(1 - e^{-a(T-t)}\big) + \sigma e^{-aT}\int_t^T e^{as}\, dW_s.$$

Dividing by $a$ and using the notation for $B(t, T)$ yields

$$\frac{1}{a}(r_T - r_t) = -r_t\, B(t, T) + b\, B(t, T) + \frac{\sigma}{a}\,e^{-aT}\int_t^T e^{as}\, dW_s.$$

Substituting into (11.2.2) and using that $W_T - W_t = \int_t^T dW_s$, we get

$$e^{-\int_t^T r_s\, ds} = e^{-r_t B(t,T)}\, e^{bB(t,T) - b(T-t)}\, e^{\frac{\sigma}{a}\int_t^T (e^{a(s-T)} - 1)\, dW_s}. \qquad (11.2.3)$$

Taking the predictable part out and dropping the independent condition, the price of the zero-coupon bond at time $t$ becomes

$$P(t, T) = E\big[e^{-\int_t^T r_s\, ds} \,\big|\, \mathcal{F}_t\big] = e^{-r_t B(t,T)}\, e^{bB(t,T) - b(T-t)}\, E\big[e^{\frac{\sigma}{a}\int_t^T (e^{a(s-T)} - 1)\, dW_s}\big] = e^{-r_t B(t,T)}\, A(t, T),$$

where

$$A(t, T) = e^{bB(t,T) - b(T-t)}\, E\big[e^{\frac{\sigma}{a}\int_t^T (e^{a(s-T)} - 1)\, dW_s}\big]. \qquad (11.2.4)$$

As a Wiener integral, $\int_t^T (e^{a(s-T)} - 1)\, dW_s$ is normally distributed with mean zero and variance

$$\int_t^T \big(e^{a(s-T)} - 1\big)^2\, ds = e^{-2aT}\,\frac{e^{2aT} - e^{2at}}{2a} - 2e^{-aT}\,\frac{e^{aT} - e^{at}}{a} + (T - t) = -\frac{a}{2}B(t, T)^2 - B(t, T) + (T - t).$$

Then the log-normal variable $e^{\frac{\sigma}{a}\int_t^T (e^{a(s-T)} - 1)\, dW_s}$ has the mean

$$E\big[e^{\frac{\sigma}{a}\int_t^T (e^{a(s-T)} - 1)\, dW_s}\big] = e^{\frac{1}{2}\frac{\sigma^2}{a^2}\big[-\frac{a}{2}B(t,T)^2 - B(t,T) + (T-t)\big]}.$$

Substituting into (11.2.4) and using that

$$bB(t, T) - b(T - t) + \frac{1}{2}\frac{\sigma^2}{a^2}\Big[-\frac{a}{2}B(t, T)^2 - B(t, T) + (T - t)\Big]$$

$$= -\frac{\sigma^2}{4a}B(t, T)^2 + (T - t)\Big(\frac{\sigma^2}{2a^2} - b\Big) + \Big(b - \frac{\sigma^2}{2a^2}\Big)B(t, T) = -\frac{\sigma^2}{4a}B(t, T)^2 + \frac{1}{a^2}\big(a^2 b - \sigma^2/2\big)\big(B(t, T) - T + t\big),$$

we obtain

$$A(t, T) = e^{\frac{(B(t,T) - T + t)(a^2 b - \sigma^2/2)}{a^2} - \frac{\sigma^2 B(t,T)^2}{4a}}.$$

It is worth noting that $P(t, T)$ depends only on the time to maturity, $T - t$, and the spot rate $r_t$.

Exercise 11.2.2 Find the price of an infinitely lived bond in the case when the spot rates satisfy Vasicek's model.
Figure 11.1: The yield on a zero-coupon bond versus maturity time $T - t$ in the case of Vasicek's model: a. $a = 1$, $b = 0.5$, $\sigma = 0.6$, $r_t = 0.1$; b. $a = 1$, $b = 0.5$, $\sigma = 0.8$, $r_t = 0.1$; c. $a = 1$, $b = 0.5$, $\sigma = 0.97$, $r_t = 0.1$; d. $a = 0.3$, $b = 0.5$, $\sigma = 0.3$, $r_t = 0.1$.

The term structure. Let $R(t, T)$ be the continuously compounded interest rate at time $t$ for a term of $T - t$. Using the formula for the bond price $P(t, T) = e^{-R(t,T)(T-t)}$ we get

$$R(t, T) = -\frac{1}{T - t}\ln P(t, T).$$

Using the formula for the bond price (11.2.1) yields

$$R(t, T) = -\frac{1}{T - t}\ln A(t, T) + \frac{1}{T - t}B(t, T)\, r_t.$$

A few possible shapes of the term structure $R(t, T)$ in the case of Vasicek's model are given in Fig. 11.1.
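Yield curves of the kind plotted in Fig. 11.1 are straightforward to compute from the closed-form expressions for $A(t, T)$ and $B(t, T)$. The sketch below (an added illustration, using the parameter set of panel a) evaluates $R(t, T)$ on a grid of maturities.

```python
import numpy as np

# Sketch: the Vasicek zero-coupon bond price (11.2.1) and the yield
# R(t,T) = -ln P(t,T) / (T - t), for the parameters of Fig. 11.1a.
def vasicek_yield(tau, a, b, sigma, r):
    B = (1 - np.exp(-a * tau)) / a
    lnA = (B - tau) * (a**2 * b - sigma**2 / 2) / a**2 - sigma**2 * B**2 / (4 * a)
    lnP = lnA - B * r
    return -lnP / tau

tau = np.linspace(0.1, 10.0, 50)     # time to maturity T - t
print(vasicek_yield(tau, a=1.0, b=0.5, sigma=0.6, r=0.1))
```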
11.3 The Case of CIR's Model

In the case when $r_t$ satisfies the CIR model (10.3.8), the zero-coupon bond price has a similar form as in the case of Vasicek's model:

$$P(t, T) = A(t, T)\, e^{-B(t,T)\, r_t}.$$

The functions $A(t, T)$ and $B(t, T)$ are given in this case by

$$B(t, T) = \frac{2\big(e^{\gamma(T-t)} - 1\big)}{(\gamma + a)\big(e^{\gamma(T-t)} - 1\big) + 2\gamma},$$

$$A(t, T) = \Big[\frac{2\gamma\, e^{(a+\gamma)(T-t)/2}}{(\gamma + a)\big(e^{\gamma(T-t)} - 1\big) + 2\gamma}\Big]^{2ab/\sigma^2},$$

where $\gamma = \big(a^2 + 2\sigma^2\big)^{1/2}$. For details the reader can consult [5].
11.4 The Case of a Mean Reverting Model with Jumps

We shall consider a model for the short-term interest rate $r_t$ that is mean reverting and incorporates jumps. Let $N_t$ be the Poisson process of constant rate $\lambda$, and let $M_t = N_t - \lambda t$ denote the compensated Poisson process, which is a martingale.

Consider the following model for the spot rate

$$dr_t = a(b - r_t)\,dt + \sigma\, dM_t, \qquad (11.4.5)$$

with $a, b, \sigma$ positive constants. It is worth noting the similarity with Vasicek's model, which is obtained by replacing the process $dM_t$ by $dW_t$, where $W_t$ is a one-dimensional Brownian motion.

Setting aside the uncertainty source $dM_t$, the drift implies that the rate $r_t$ is pulled towards level $b$ at the rate $a$. This means that, if for instance $r_0 > b$, then $r_t$ is decreasing towards $b$. The term $\sigma\, dM_t$ adds jumps of size $\sigma$ to the process. A few realizations of the process $r_t$ are given in Fig. 11.2.

The stochastic differential equation (11.4.5) can be solved explicitly using the method of integrating factors, see [13]. Multiplying by $e^{at}$ we get an exact equation; integrating between $0$ and $t$ yields the following closed form formula for the spot rate

$$r_t = b + (r_0 - b)e^{-at} + \sigma e^{-at}\int_0^t e^{as}\, dM_s. \qquad (11.4.6)$$

Since the integral with respect to $dM_t$ is a martingale, the expectation of the spot rate is

$$E[r_t] = b + (r_0 - b)e^{-at}.$$

In the long run $E[r_t]$ tends to $b$, which shows that the process is mean reverting. Using the formula

$$E\Big[\Big(\int_0^t f(s)\, dM_s\Big)^2\Big] = \lambda\int_0^t f(s)^2\, ds,$$

the variance of the spot rate can be computed as follows:

$$Var(r_t) = \sigma^2 e^{-2at}\, Var\Big[\int_0^t e^{as}\, dM_s\Big] = \sigma^2 e^{-2at}\, E\Big[\Big(\int_0^t e^{as}\, dM_s\Big)^2\Big] = \frac{\lambda\sigma^2}{2a}\big(1 - e^{-2at}\big).$$
Figure 11.2: Realizations of the spot rate $r_t$ (Poisson diffusion solution paths, rate versus time).
It is worth noting that in the long run the variance tends to the constant $\frac{\lambda\sigma^2}{2a}$. Another feature implied by this formula is that the variance is proportional to the frequency of jumps $\lambda$.

In the following we shall value a zero-coupon bond. Let $\mathcal{F}_t$ be the $\sigma$-algebra generated by the random variables $\{N_s;\; s \le t\}$. This contains all information about the jumps that occurred until time $t$. It is known that the price at time $t$ of a zero-coupon bond with expiration time $T$ is given by

$$P(t, T) = E\big[e^{-\int_t^T r_s\, ds} \,\big|\, \mathcal{F}_t\big].$$

One of the main results is given in the following.
Proposition 11.4.1 Assume the spot rate $r_t$ satisfies the mean reverting model with jumps

$$dr_t = a(b - r_t)\,dt + \sigma\, dM_t.$$

Then the price of the zero-coupon bond that pays off $\$1$ at time $T$ is given by

$$P(t, T) = A(t, T)\, e^{-B(t,T)\, r_t},$$

where

$$B(t, T) = \frac{1 - e^{-a(T-t)}}{a},$$

$$A(t, T) = \exp\Big\{\Big(b - \frac{\lambda\sigma}{a}\Big)B(t, T) - \Big[\lambda\Big(1 - \frac{\sigma}{a}\Big) + b\Big](T - t) + \lambda\, e^{-\sigma/a}\int_0^{T-t} e^{\frac{\sigma}{a}e^{-ax}}\, dx\Big\}.$$
Proof: Integrating the stochastic differential equation between $t$ and $T$ yields

$$a\int_t^T r_s\, ds = -(r_T - r_t) + \sigma(M_T - M_t) + ab(T - t).$$

Taking the exponential we obtain

$$e^{-\int_t^T r_s\, ds} = e^{\frac{1}{a}(r_T - r_t) - \frac{\sigma}{a}(M_T - M_t) - b(T-t)}. \qquad (11.4.7)$$

Multiplying the stochastic differential equation by $e^{as}$ yields the exact equation

$$d\big(e^{as}r_s\big) = abe^{as}\,ds + \sigma e^{as}\,dM_s.$$

Integrating between $t$ and $T$ we get

$$e^{aT}r_T - e^{at}r_t = b\big(e^{aT} - e^{at}\big) + \sigma\int_t^T e^{as}\, dM_s.$$

Solving for $r_T$ and subtracting $r_t$ yields

$$r_T - r_t = -r_t\big(1 - e^{-a(T-t)}\big) + b\big(1 - e^{-a(T-t)}\big) + \sigma e^{-aT}\int_t^T e^{as}\, dM_s.$$

Dividing by $a$ and using the notation for $B(t, T)$ yields

$$\frac{1}{a}(r_T - r_t) = -r_t\, B(t, T) + b\, B(t, T) + \frac{\sigma}{a}\,e^{-aT}\int_t^T e^{as}\, dM_s.$$

Substituting into (11.4.7) and using that $M_T - M_t = \int_t^T dM_s$, we get

$$e^{-\int_t^T r_s\, ds} = e^{-r_t B(t,T)}\, e^{bB(t,T) - b(T-t)}\, e^{\frac{\sigma}{a}\int_t^T (e^{a(s-T)} - 1)\, dM_s}. \qquad (11.4.8)$$

Taking the predictable part out and dropping the independent condition, the price of the zero-coupon bond at time $t$ becomes

$$P(t, T) = E\big[e^{-\int_t^T r_s\, ds} \,\big|\, \mathcal{F}_t\big] = e^{-r_t B(t,T)}\, e^{bB(t,T) - b(T-t)}\, E\big[e^{\frac{\sigma}{a}\int_t^T (e^{a(s-T)} - 1)\, dM_s}\big] = e^{-r_t B(t,T)}\, A(t, T),$$

where

$$A(t, T) = e^{bB(t,T) - b(T-t)}\, E\big[e^{\frac{\sigma}{a}\int_t^T (e^{a(s-T)} - 1)\, dM_s}\big]. \qquad (11.4.9)$$
We shall compute in the following the right side expectation. From Bertoin [1], p. 8, the exponential process

$$e^{-\int_0^t u(s)\, dN_s + \lambda\int_0^t (1 - e^{-u(s)})\, ds}$$

is an $\mathcal{F}_t$-martingale, with $\mathcal{F}_t = \sigma(N_u;\; u \le t)$. This implies

$$E\big[e^{-\int_t^T u(s)\, dN_s} \,\big|\, \mathcal{F}_t\big] = e^{-\lambda\int_t^T (1 - e^{-u(s)})\, ds}.$$

Using $dM_t = dN_t - \lambda\,dt$ yields

$$E\big[e^{-\int_t^T u(s)\, dM_s} \,\big|\, \mathcal{F}_t\big] = E\big[e^{-\int_t^T u(s)\, dN_s} \,\big|\, \mathcal{F}_t\big]\, e^{\lambda\int_t^T u(s)\, ds} = e^{-\lambda\int_t^T (1 - u(s) - e^{-u(s)})\, ds}.$$

Let $u(s) = -\frac{\sigma}{a}\big(e^{a(s-T)} - 1\big)$ and substitute in (11.4.9); then after changing the variable of integration we obtain

$$A(t, T) = e^{bB(t,T) - b(T-t)}\, E\big[e^{\frac{\sigma}{a}\int_t^T (e^{a(s-T)} - 1)\, dM_s} \,\big|\, \mathcal{F}_t\big]$$

$$= e^{bB(t,T) - b(T-t)}\, e^{-\lambda\int_0^{T-t}\big[1 + \frac{\sigma}{a}(e^{-ax} - 1) - e^{\frac{\sigma}{a}(e^{-ax} - 1)}\big]\, dx}$$

$$= e^{bB(t,T) - b(T-t)}\, e^{-\lambda(T-t) - \frac{\lambda\sigma}{a}\big(B(t,T) - (T-t)\big)}\, e^{\lambda\int_0^{T-t} e^{\frac{\sigma}{a}(e^{-ax} - 1)}\, dx}$$

$$= e^{(b - \frac{\lambda\sigma}{a})B(t,T)}\, e^{-[\lambda(1 - \frac{\sigma}{a}) + b](T-t)}\, e^{\lambda e^{-\sigma/a}\int_0^{T-t} e^{\frac{\sigma}{a}e^{-ax}}\, dx}$$

$$= \exp\Big\{\Big(b - \frac{\lambda\sigma}{a}\Big)B(t, T) - \Big[\lambda\Big(1 - \frac{\sigma}{a}\Big) + b\Big](T - t) + \lambda\, e^{-\sigma/a}\int_0^{T-t} e^{\frac{\sigma}{a}e^{-ax}}\, dx\Big\}. \qquad (11.4.10)$$
Evaluation using Special Functions The solution of the initial value problem
f

(x) =
e
x
x
, x > 0
f(0) =
is the exponential integral function f(x) = Ei(x), x > 0. This is a special func-
tion that can be evaluated numerically in MATHEMATICA by calling the function
ExpIntegralEi[x]. For instance, for any 0 < <
_

e
t
t
dt = Ei() Ei().
The reader can nd more details regarding the exponential integral function in Abramovitz
and Stegun [10]. The last integral in the expression of A(t, T) can be evaluated using
this special function. Substituting t =

a
e
ax
we have
_
Tt
0
e

a
e
ax
dx =
1
a
_
a

a
e
a(Tt)
e
t
t
dt
=
1
a
_
Ei
_

a
_
Ei
_

a
e
a(Tt)
__
.
218
11.5 The Case of a Model with pure Jumps
Consider the spot rate r
t
satisfying the stochastic dierential equation
dr
t
= dM
t
, (11.5.11)
where is a positive constant denoting the volatility of the rate. This model is obtained
when the rate at which r
t
is pulled toward b is a = 0, so there is no mean reverting
eect. This type of behavior can be noticed during a short time in a highly volatile
market; in this case the behavior of r
t
is most inuenced by the jumps.
The solution of (11.5.11) is
r
t
= r
0
+M
t
= r
0
t +N
t
,
which is an T
t
-martingale. The rate r
t
has jumps of size that occur at the arrival
times t
k
, which are exponentially distributed.
Next we shall compute the value P(t, T) at time t of a zero-coupon bond that pays
the amount of $1 at maturity T. This is given by the conditional expectation
P(t, T) = E[e

T
t
rs ds
[T
t
].
Integrating between t and s in equation (11.5.11) yields
r
s
= r
t
+(M
s
M
t
), t < s < T.
And then
_
T
t
r
s
ds = r
t
(T t) +
_
T
t
(M
s
M
t
) ds.
Taking out the predictable part and dropping the independent condition yields
P(t, T) = E[e

T
t
rs ds
[T
t
]
= e
rt(Tt)
E[e

T
t
(MsMt) ds
[T
t
]
= e
rt(Tt)
E[e

Tt
0
M d
]
= e
rt(Tt)
E[e

Tt
0
(N ) d
]
= e
rt(Tt)+
1
2
(Tt)
2
E[e

Tt
0
N d
]. (11.5.12)
We need to work out the expectation. Using integration by parts,

_
T
0
N
t
dt =
_
TN
T

_
T
0
t dN
t
_
=
_
T
_
T
0
dN
t

_
T
0
t dN
t
_
=
_
T
0
(T t) dN
t
,
219
so
e

T
0
Nt dt
= e

T
0
(Tt) dNt
.
Using the formula
E
_
e

T
0
u(t) dNt
_
= e

T
0
(1e
u(t)
) dt
we have
E
_
e

T
0
Nt dt
_
= E
_
e

T
0
(Tt) dNt
_
= e

T
0
(1e
(Tt)
) dt
= e
T
e

(e
T
1)
= e

_
T+
1

(e
T
1)
_
.
Replacing T by T t and t by yields
E
_
e

Tt
0
N d
_
= e

_
Tt+
1

(e
(Tt)
1)
_
.
Substituting in (11.5.12) yields the formula for the bond price
P(t, T) = e
rt(Tt)+
1
2
(Tt)
2
e

_
Tt+
1

(e
(Tt)
1)
_
= exp( +r
t
)(T t) +
1
2
(T t)
2

_
e
(Tt)
1
_
.
Proposition 11.5.1 Assume the spot rate r
t
satises dr
t
= dM
t
, with > 0 con-
stant. Then the price of the zero-coupon bond that pays o $1 at time T is given
by
P(t, T) = exp( +r
t
)(T t) +
1
2
(T t)
2

_
e
(Tt)
1
_
.
The yield curve is given by
R(t, T) =
1
T t
ln P(t, T)
= r
t
+

2
(T t) +

(T t)
(e
(Tt)
1).
Exercise 11.5.2 The price of an interest rate derivative security with maturity time
T and payo P
T
(r
T
) has the price at time t = 0 given by P
0
= E
0
[e

T
0
rs ds
P
T
(r
T
)].
(The zero-coupon bond is obtained for P
T
(r
T
) = 1). Find the price of an interest rate
derivative with the payo P
T
(r
T
) = r
T
.
Exercise 11.5.3 Assume the spot rate satises the equation dr
t
= r
t
dM
t
, where M
t
is the compensated Poisson process. Find a solution of the form r
t
= r
0
e
(t)
(1 + )
Nt
,
where N
t
denotes the Poisson process.
220
Chapter 12
Modeling Stock Prices
The price of a stock can be modeled by a continuous stochastic process which is the
sum of a predictable and an unpredictable part. However, this type of model does not
take into account market crashes. If those are to be taken into consideration, the stock
price needs to contain a third component which models unexpected jumps. We shall
discuss these models in the present chapter.
12.1 Constant Drift and Volatility Model
Let S
t
denote the price of a stock at time t. If T
t
denotes the information set at time t,
then S
t
is a continuous process that is T
t
-adapted. The return on the stock during the
time interval t measures the percentage increase in the stock price between instances
t and t + t and is given by
S
t+t
S
t
S
t
. When t is innitesimally small, we obtain
the instantaneous return,
dS
t
S
t
. This is supposed to be the sum of two components:
the predictable part dt
the noisy part due to unexpected news dW
t
.
Adding these parts yields
dS
t
S
t
= dt +dW
t
,
which leads to the stochastic equation
dS
t
= S
t
dt +S
t
dW
t
. (12.1.1)
The parameters and are positive constants which represent the drift and volatility
of the stock. This equation has been solved in Example 7.7.2 by applying the method
of variation of parameters. The solution is
S
t
= S
0
e
(

2
2
)t+Wt
, (12.1.2)
221
222
50 100 150 200 250 300 350
1.05
1.10
1.15
Figure 12.1: Two distinct simulations for the stochastic equation
dS
t
= 0.15S
t
dt + 0.07S
t
dW
t
, with S
0
= 1.
where S
0
denotes the price of the stock at time t = 0. It is worth noting that the stock
price is T
t
-adapted, positive, and it has a log-normal distribution. Using Exercise 1.6.2,
the mean and variance are
E[S
t
] = S
0
e
t
(12.1.3)
V ar[S
t
] = S
2
0
e
2t
(e

2
t
1). (12.1.4)
See Fig.12.1 for two simulations of the stock price.
Exercise 12.1.1 Let T
u
be the information set at time u. Find E[S
t
[T
u
] and V ar[S
t
[T
u
]
for u t.
1
How these formulas become in the case s = t?
Exercise 12.1.2 Find the stochastic process followed by ln S
t
. What are the values of
E[ln(S
t
)] and V ar[ln(S
t
)]?
Exercise 12.1.3 Find the stochastic dierential equations associated with the following
processes
(a)
1
S
t
(b) S
n
t
(c) (S
t
1)
2
.
Exercise 12.1.4 (a) Show that E[S
2
t
] = S
2
0
e
(2+
2
)t
.
(b) Find a similar formula for E[S
n
t
], with n positive integer.
Exercise 12.1.5 (a) Find the expectation E[S
t
W
t
].
(b) Find the correlation function
t
= Corr(S
t
, W
t
). What happens for t large?
1
The conditional variance is dened by V ar(X|F) = E[X
2
|F] E[X|F]
2
.
223
The next result deals with the probability that the stock price reaches a certain
barrier before another barrier.
Theorem 12.1.6 Let S
u
and S
d
be xed, such that S
d
< S
0
< S
u
. The probability
that the stock price S
t
hits the upper value S
u
before the lower value S
d
is
p =
d

1
d

,
where S
u
/S
0
= u, S
d
/S
0
= d, and = 1 2/
2
.
Proof: Let X
t
= mt +W
t
. Theorem 3.5.1 provides
P(X
t
goes up to before down to ) =
e
2m
1
e
2m
e
2m
. (12.1.5)
Choosing the following values for the parameters
m =



2
, =
ln u

, =
lnd

,
we have the sequence of identities
P(X
t
goes up to before down to )
= P(X
t
goes up to before down to )
= P(S
0
e
Xt
goes up to S
0
e

before down to S
0
e

)
= P(S
t
goes up to S
u
before down to S
d
).
Using (12.1.5) yields
P(S
t
goes up to S
u
before down to S
d
) =
e
2m
1
e
2m
e
2m
. (12.1.6)
Since a computation shows that
e
2m
= e
(
2

2
1) lnd
= d
1
2

2
= d

e
2m
= e
(
2

2
+1) lnu
= u
1
2

2
= u

,
formula (12.1.6) becomes
P(S
t
goes up to S
u
before down to S
d
) =
d

1
d

,
which ends the proof.
224
Corollary 12.1.7 Let S
u
> S
0
> 0 be xed. Then
P(S
t
hits S
u
) =
_
S
0
S
u
_
1
2

2
for some t > 0.
Proof: Taking d = 0 implies S
d
= 0. Since S
t
never reaches zero,
P(S
t
hits S
u
) = P(S
t
goes up to S
u
before down to S
d
= 0)
=
d

1
d

d=0
=
1
u

=
_
S
0
S
u
_
1
2

2
.
Exercise 12.1.8 A stock has S
0
= $10, = 0.15, = 0.20. What is the probability
that the stock goes up to $15 before it goes down to $5?
Exercise 12.1.9 Let 0 < S
0
< S
u
. What is the probability that S
t
hits S
u
for some
time t > 0?
The next result deals with the probability of a stock to reach a certain barrier.
Proposition 12.1.10 Let b > 0. The probability that the stock S
t
will ever reach the
level b is
P(sup
t0
S
t
b) =
_

_
_
b
S
0
_2

2
1
, if r <

2
2
1, if r

2
2
(12.1.7)
Proof: Using S
t
= S
0
e
(r

2
2
)t+Wt
we have
P(S
t
b for some t 0) = P
_
S
0
e
(r

2
2
)t+Wt
b, for some t 0
_
= P
_
W
t
+ (r

2
2
)t ln
b
S
0
, for some t 0
_
= P
_
W
t
t , for some t 0
_
,
where =
1

ln
b
S
0
and =

2

r

. Using Exercise 3.5.2 the previous probability is


equal to
P
_
W
t
t , for some t 0
_
=
_
1, if 0
e
2
, if > 0,
which is equivalent to (12.1.7).
225
12.2 When Does the Stock Reach a Certain Barrier?
Let a > S
0
and consider the rst time when the stock reaches the barrier a
T
a
= inft > 0; S
t
= a.
Since we have
S
t
= a = S
0
e
(
1
2

2
)t+Wt
= a
= (
1
2

2
)t +W
t
= ln(a/S
0
)
= t +W
t
= x,
with x = ln(a/S
0
) and =
1
2

2
, using Proposition 3.6.5 (a) it follows that
T
a
= inft > 0; t +W
t
= x
has the inverse gaussian probability density
p() =
x

2
3

3
e

(x)
2
2
3
=
ln(a/S
0
)

2
3

3
e

_
ln(a/S
0
)
_
2
/(2
3
)
.
The mean and the variance of the hitting time T
a
are given by Proposition 3.6.5 (b)
E[T
a
] =
x

=
ln(a/S
0
)
(
1
2

2
)
V ar(T
a
) =
x

3
=
ln(a/S
0
)
(
1
2

2
)
3
.
Exercise 12.2.1 Consider the doubling time of a stock T
2
= inft > 0; S
t
= 2S
0
.
(a) Find E[T
2
] and V ar(T
2
). Do these values depend of S
0
?
(b) The expected return of a stock is = 0.15 and its volatility = 0.20. Find the
expected time when the stock dubles its value.
Exercise 12.2.2 Let S
t
= max
ut
S
u
and S
t
= min
ut
S
u
be the running maximum and
minimum of the stock. Find the distribution functions of S
t
and S
t
.
Exercise 12.2.3 (a) What is the probability that the stock S
t
reaches level a, a > S
0
,
before time T?
(b) What is the probability that the stock S
t
reaches level a, a > S
0
, before time T
2
and
after time T
1
?
We shall use the properties of the hitting time later when pricing perpetual lookback
options and rebate options.
226
12.3 Time-dependent Drift and Volatility Model
This model considers the drift = (t) and volatility = (t) to be deterministic
functions of time. In this case the equation (12.1.1) becomes
dS
t
= (t)S
t
dt +(t)S
t
dW
t
. (12.3.8)
We shall solve the equation using the method of integrating factors presented in section
7.8. Multiplying by the integrating factor

t
= e

t
0
(s) dWs+
1
2

t
0

2
(s) ds
the equation (12.3.8) becomes d(
t
S
t
) =
t
(t)S
t
dt. Substituting Y
t
=
t
S
t
yields the
deterministic equation dY
t
= (t)Y
t
dt with the solution
Y
t
= Y
0
e

t
0
(s) ds
.
Substituting back S
t
=
1
t
Y
t
, we obtain the closed-form solution of equation (12.3.8)
S
t
= S
0
e

t
0
((s)
1
2

2
(s)) ds+

t
0
(s) dWs
.
Proposition 12.3.1 The solution S
t
is T
t
-adapted and log-normally distributed, with
mean and variance given by
E[S
t
] = S
0
e

t
0
(s) ds
V ar[S
t
] = S
2
0
e
2

t
0
(s) ds
_
e

t
0

2
(s) ds
1
_
.
Proof: Let X
t
=
_
t
0
((s)
1
2

2
(s)) ds +
_
t
0
(s) dW
s
. Since X
t
is a sum of a predictable
integral function and a Wiener integral, it is normally distributed, see Proposition 4.6.1,
with
E[X
t
] =
_
t
0
_
(s)
1
2

2
(s)
_
ds
V ar[X
t
] = V ar
_
_
t
0
(s) dW
s
_
=
_
t
0

2
(s) ds.
Using Exercise 1.6.2, the mean and variance of the log-normal random variable S
t
=
S
0
e
Xt
are given by
E[S
t
] = S
0
e

t
0
(

2
2
) ds+
1
2

t
0

2
ds
= S
0
e

t
0
(s) ds
V ar[S
t
] = S
2
0
e
2

t
0
(
1
2

2
) ds+

t
0

2
ds
_
e

t
0

2
1
_
= S
2
0
e
2

t
0
(s) ds
_
e

t
0

2
(s) ds
1
_
.
227
If the average drift and average squared volatility are dened as
=
1
t
_
t
0
(s) ds

2
=
1
t
_
t
0

2
(s) ds,
the aforementioned formulas can also be written as
E[S
t
] = S
0
e
t
V ar[S
t
] = S
2
0
e
2t
(e

2
t
1).
It is worth noting that we have obtained formulas similar to (12.1.3)(12.1.4).
12.4 Models for Stock Price Averages
In this section we shall provide stochastic dierential equations for several types of
averages on stocks. These averages are used as underlying assets in the case of Asian
options. The most common type of average is the arithmetic one, which is used in the
denition of the stock index.
Let S
t
1
, S
t
2
, , S
tn
be a sample of stock prices at n instances of time t
1
< t
2
<
< t
n
. The most common types of discrete averages are:
The arithmetic average
A(t
1
, t
2
, , t
n
) =
1
n
n

k=1
S
t
k
.
The geometric average
G(t
1
, t
2
, , t
n
) =
_
n

k=1
S
t
k
_1
n
.
The harmonic average
H(t
1
, t
2
, , t
n
) =
n
n

k=1
1
S
t
k
.
The well-known inequality of means states that we always have
H(t
1
, t
2
, , t
n
) G(t
1
, t
2
, , t
n
) A(t
1
, t
2
, , t
n
), (12.4.9)
with identity in the case of constant stock prices.
228
In the following we shall obtain expressions for continuous sampled averages and
nd their associated stochastic equations.
The continuously sampled arithmetic average
Let t
n
= t and assume t
k+1
t
k
=
t
n
. Using the denition of the integral as a limit of
Riemann sums, we have
lim
n
1
n
n

k=1
S
t
k
= lim
n
1
t
n

k=1
S
t
k
t
n
=
1
t
_
t
0
S
u
du.
It follows that the continuously sampled arithmetic average of stock prices between 0
and t is given by
A
t
=
1
t
_
t
0
S
u
du.
Using the formula for S
t
we obtain
A
t
=
S
0
t
_
t
0
e
(
2
/2)u+Wu
du.
This integral can be computed explicitly only in the case = 0. It is worth noting
that A
t
is neither normal nor log-normal, a fact that makes the price of Asian options
on arithmetic averages hard to evaluate.
Let I
t
=
_
t
0
S
u
du. The Fundamental Theorem of Calculus implies dI
t
= S
t
dt. Then
the quotient rule yields
dA
t
= d
_
I
t
t
_
=
dI
t
t I
t
dt
t
2
=
S
t
t dt I
t
dt
t
2
=
1
t
(S
t
A
t
)dt,
i.e. the continuous arithmetic average A
t
satises
dA
t
=
1
t
(S
t
A
t
)dt.
If A
t
< S
t
, the right side is positive and hence dA
t
> 0, i.e. the average A
t
goes up.
Similarly, if A
t
> S
t
, then the average A
t
goes down. This shows that the average A
t
tends to trace the stock values S
t
.
By lHospitals rule we have
A
0
= lim
t0
I
t
t
= lim
t0
S
t
= S
0
.
Using that the expectation commutes with integrals, we have
E[A
t
] =
1
t
_
t
0
E[S
u
] du =
1
t
_
t
0
S
0
e
u
du = S
0
e
t
1
t
.
229
Hence
E[A
t
] =
_
S
0
e
t
1
t
, if t > 0
S
0
, if t = 0.
(12.4.10)
In the following we shall compute the variance V ar[A
t
]. Since
V ar[A
t
] =
1
t
2
E[I
2
t
] E[A
t
]
2
, (12.4.11)
it suces to nd E[I
2
t
]. We need rst the following result:
Lemma 12.4.1 (i) We have
E[I
t
S
t
] =
S
2
0
+
2
[e
(2+
2
)t
e
t
].
(ii) The processes A
t
and S
t
are not independent.
Proof: (i) Using Itos formula
d(I
t
S
t
) = dI
t
S
t
+I
t
dS
t
+dI
t
dS
t
= S
2
t
dt +I
t
(S
t
dt +S
t
dW
t
) +S
t
dt dS
t
. .
=0
= (S
2
t
+I
t
S
t
)dt +I
t
S
t
dW
t
.
Using I
0
S
0
= 0, integrating between 0 and t yields
I
t
S
t
=
_
t
0
(S
2
u
+I
u
S
u
) du +
_
t
0
I
u
S
u
dW
u
.
Since the expectation of the Ito integral is zero, we have
E[I
t
S
t
] =
_
t
0
(E[S
2
u
] +E[I
u
S
u
]) du.
Using Exercise 12.1.4 this becomes
E[I
t
S
t
] =
_
t
0
_
S
2
0
e
(2+
2
)u
+E[I
u
S
u
]
_
du.
If we denote g(t) = E[I
t
S
t
], dierentiating yields the ODE
g

(t) = S
2
0
e
(2+
2
)t
+g(t),
with the initial condition g(0) = 0. This can be solved as a linear dierential equation
in g(t) by multiplying by the integrating factor e
t
. The solution is
g(t) =
S
2
0
+
2
[e
(2+
2
)t
e
t
].
230
(ii) Since A
t
and I
t
are proportional, it suces to show that I
t
and S
t
are not inde-
pendent. This follows from part (i) and the fact that
E[I
t
S
t
] ,= E[I
t
]E[S
t
] =
S
2
0

(e
t
1)e
t
.
Next we shall nd E[I
2
t
]. Using dI
t
= S
t
dt, then (dI
t
)
2
= 0 and hence Itos formula
yields
d(I
2
t
) = 2I
t
dI
t
+ (dI
t
)
2
= 2I
t
S
t
dt.
Integrating between 0 and t and using I
0
= 0 leads to
I
2
t
= 2
_
t
0
I
u
S
u
du.
Taking the expectation and using Lemma 12.4.1 we obtain
E[I
2
t
] = 2
_
t
0
E[I
u
S
u
] du =
2S
2
0
+
2
_
e
(2+
2
)t
1
2 +
2

e
t
1

_
. (12.4.12)
Substituting into (12.4.11) yields
V ar[A
t
] =
S
2
0
t
2
_
2
+
2
_
e
(2+
2
)t
1
2 +
2

e
t
1

(e
t
1)
2

2
_
.
Concluding the previous calculations, we have the following result:
Proposition 12.4.2 The arithmetic average A
t
satises the stochastic equation
dA
t
=
1
t
(S
t
A
t
)dt, A
0
= S
0
.
Its mean and variance are given by
E[A
t
] = S
0
e
t
1
t
, t > 0
V ar[A
t
] =
S
2
0
t
2
_
2
+
2
_
e
(2+
2
)t
1
2 +
2

e
t
1

(e
t
1)
2

2
_
.
Exercise 12.4.3 Find approximative formulas for E[A
t
] and V ar[A
t
] for t small, up
to the order O(t
2
). (Recall that f(t) = O(t
2
) if lim
t
f(t)
t
2
= c < . )
231
The continuously sampled geometric average
Dividing the interval (0, t) into equal subintervals of length t
k+1
t
k
=
t
n
, we have
G(t
1
, . . . , t
n
) =
_
n

k=1
S
t
k
_
1/n
= e
ln
_

n
k=1
St
k
_
1/n
= e
1
n

n
k=1
ln St
k
= e
1
t

n
k=1
lnSt
k
t
n
.
Using the denition of the integral as a limit of Riemann sums
G
t
= lim
n
_
n

k=1
S
t
k
_
1/n
= lim
n
e
1
t

n
k=1
lnSt
k
t
n
= e
1
t

t
0
ln Su du
.
Therefore, the continuously sampled geometric average of stock prices between instances
0 and t is given by
G
t
= e
1
t

t
0
ln Su du
. (12.4.13)
Theorem 12.4.4 G
t
has a log-normal distribution, with the mean and variance given
by
E[G
t
] = S
0
e
(

2
6
)
t
2
V ar[G
t
] = S
2
0
e
(

2
6
)t
_
e

2
t
3
1
_
.
Proof: Using
ln S
u
= ln
_
S
0
e
(

2
2
)u+Wu
_
= ln S
0
+ (

2
2
)u +W
u
,
then taking the logarithm yields
lnG
t
=
1
t
_
t
0
_
ln S
0
+(

2
2
)u+W
u
_
du = ln S
0
+(

2
2
)
t
2
+

t
_
t
0
W
u
du. (12.4.14)
Since the integrated Brownian motion Z
t
=
_
t
0
W
u
du is Gaussian with Z
t
N(0, t
3
/3),
it follows that ln G
t
has a normal distribution
ln G
t
N
_
ln S
0
+ (

2
2
)
t
2
,

2
t
3
_
. (12.4.15)
This implies that G
t
has a log-normal distribution. Using Exercise 1.6.2, we obtain
E[G
t
] = e
E[lnGt]+
1
2
V ar[lnGt]
= e
ln S
0
+(

2
2
)
t
2
+

2
t
6
= S
0
e
(

2
6
)
t
2
.
V ar[G
t
] = e
2E[lnGt]+V ar[ln Gt]
_
e
V ar[ln Gt]
1
_
= e
2 ln S
0
+(

2
6
)t
_
e

2
t
3
1
_
= S
2
0
e
(

2
6
)t
_
e

2
t
3
1
_
.
232
Corollary 12.4.5 The geometric average G
t
is given by the closed-form formula
G
t
= S
0
e
(

2
2
)
t
2
+

t
0
Wu du
.
Proof: Take the exponential in the formula (12.4.14).
An important consequence of the fact that G
t
is log-normal is that Asian options
on geometric averages have closed-form solutions.
Exercise 12.4.6 (a) Show that ln G
t
satises the stochastic dierential equation
d(ln G
t
) =
1
t
_
ln S
t
ln G
t
_
dt.
(b) Show that G
t
satises
dG
t
=
1
t
G
t
_
ln S
t
ln G
t
_
dt.
The continuously sampled harmonic average
Let S
t
k
be values of a stock evaluated at the sampling dates t
k
, i = 1, . . . , n. Their
harmonic average is dened by
H(t
1
, , t
n
) =
n
n

k=1
1
S(t
k
)
.
Consider t
k
=
kt
n
. Then the continuously sampled harmonic average is obtained by
taking the limit as n in the aforementioned relation
lim
n
n
n

k=1
1
S(t
k
)
= lim
n
t
n

k=1
1
S(t
k
)
t
n
=
t
_
t
0
1
S
u
du
.
Hence, the continuously sampled harmonic average is dened by
H
t
=
t
_
t
0
1
S
u
du
.
We may also write H
t
=
t
I
t
, where I
t
=
_
t
0
1
S
u
du satises
dI
t
=
1
S
t
dt, I
0
= 0, d
_
1
I
t
_
=
1
S
t
I
2
t
dt.
233
From the lHospitals rule we get
H
0
= lim
t0
H
t
= lim
t0
t
I
t
= S
0
.
Using the product rule we obtain the following:
dH
t
= t d
_
1
I
t
_
+
1
I
t
dt +dt d
_
1
I
t
_
=
1
I
t
_
1
t
S
t
I
t
_
dt =
1
t
H
t
_
1
H
t
S
t
_
dt,
so
dH
t
=
1
t
H
t
_
1
H
t
S
t
_
dt. (12.4.16)
If at the instance t we have H
t
< S
t
, it follows from the equation that dH
t
> 0, i.e. the
harmonic average increases. Similarly, if H
t
> S
t
, then dH
t
< 0, i.e H
t
decreases. It is
worth noting that the converses are also true. The random variable H
t
is not normally
distributed nor log-normally distributed.
Exercise 12.4.7 Show that
H
t
t
is a decreasing function of t. What is its limit as
t ?
Exercise 12.4.8 Show that the continuous analog of inequality (12.4.9) is
H
t
G
t
A
t
.
Exercise 12.4.9 Let S
t
k
be the stock price at time t
k
. Consider the power of the
arithmetic average of S

t
k
A

(t
1
, , t
n
) =
_

n
k=1
S

t
k
n
_

.
(a) Show that the aforementioned expression tends to
A

t
=
_
1
t
_
t
0
S

u
du
_

,
as n .
(b) Find the stochastic dierential equation satised by A

t
.
(c) What does A

t
become in the particular cases = 1?
234
Exercise 12.4.10 The stochastic average of stock prices between 0 and t is dened by
X
t
=
1
t
_
t
0
S
u
dW
u
,
where W
u
is a Brownian motion process.
(a) Find dX
t
, E[X
t
] and V ar[X
t
].
(b) Show that X
t
= R
t
A
t
, where R
t
=
S
t
S
0
t
is the raw average of the
stock price and A
t
=
1
t
_
t
0
S
u
du is the continuous arithmetic average.
12.5 Stock Prices with Rare Events
In order to model the stock price when rare events are taken into account, we shall
combine the eect of two stochastic processes:
the Brownian motion process W
t
, which models regular events given by innites-
imal changes in the price, and which is a continuous process;
the Poisson process N
t
, which is discontinuous and models sporadic jumps in the
stock price that corresponds to shocks in the market.
Since E[dN
t
] = dt, the Poisson process N
t
has a positive drift and we need to com-
pensate by subtracting t from N
t
. The resulting process M
t
= N
t
t is a martingale,
called the compensated Poisson process, that models unpredictable jumps of size 1 at a
constant rate . It is worth noting that the processes W
t
and M
t
involved in modeling
the stock price are assumed to be independent.
Let S
t
= lim
ut
S
u
denote the value of the stock before a possible jump occurs at time
t. To set up the model, we assume the instantaneous return on the stock,
dSt
S
t
, to be
the sum of the following three components:
the predictable part dt;
the noisy part due to unexpected news dW
t
;
the rare events part due to unexpected jumps dM
t
,
where , and are constants, corresponding to the drift rate of the stock, volatility
and instantaneous return jump size.
2
Adding yields
dS
t
S
t
= dt +dW
t
+dM
t
.
2
In this model the jump size is constant; there are models where the jump size is a random variable,
see Merton [11].
235
Hence, the dynamics of a stock price, subject to rare events, are modeled by the
following stochastic dierential equation
dS
t
= S
t
dt +S
t
dW
t
+S
t
dM
t
. (12.5.17)
It is worth noting that in the case of zero jumps, = 0, the previous equation becomes
the classical stochastic equation (12.1.1).
Using that W
t
and M
t
are martingales, we have
E[S
t
dM
t
[T
t
] = S
t
E[dM
t
[T
t
] = 0,
E[S
t
dW
t
[T
t
] = S
t
E[dW
t
[T
t
] = 0.
This shows the unpredictability of the last two terms, i.e. given the information set
T
t
at time t, it is not possible to predict any future increments in the next interval of
time dt. The term S
t
dW
t
captures regular events of insignicant size, while S
t
dM
t
captures rare events of large size. The rare events term, S
t
dM
t
, incorporates jumps
proportional to the stock price and is given in terms of the Poisson process N
t
as
S
t
dM
t
= S
t
d(N
t
t) = S
t
dN
t
S
t
dt.
Substituting into equation (12.5.17) yields
dS
t
= ( )S
t
dt +S
t
dW
t
+S
t
dN
t
. (12.5.18)
The constant represents the rate at which the jumps of the Poisson process N
t
occur.
This is the same as the rate of rare events in the market, and can be determined from
historical data.
The following result provides an explicit solution for the stock price when rare events
are taken into account.
Proposition 12.5.1 The solution of the stochastic equation (12.5.18) is given by
S
t
= S
0
e
(

2
2
)t+Wt
(1 +)
Nt
, (12.5.19)
where
is the stock price drift rate;
is the volatility of the stock;
is the rate at which rare events occur;
is the size of jump in the expected return when a rare event occurs.
Proof: We shall construct rst the solution and then show that it veries the equation
(12.5.18). If t
k
denotes the kth jump time, then N
t
k
= k. Since there are no jumps
before t
1
, the stock price just before this time is satisfying the stochastic dierential
equation
dS
t
= ( )S
t
dt +S
t
dW
t
236
with the solution given by the usual formula
S
t
1

= S
0
e
(

2
2
)t
1
+Wt
1
.
Since
dS
t
1
S
t
1

=
S
t
1
S
t
1

S
t
1

= , then S
t
1
= (1 + )S
t
1

. Substituting in the aforemen-


tioned formula yields
S
t
1
= S
t
1

(1 +) = S
0
e
(

2
2
)t
1
+Wt
1
(1 +).
Since there is no jump between t
1
and t
2
, a similar procedure leads to
S
t
2
= S
t
2

(1 +) = S
t
1
e
(

2
2
)(t
2
t
1
)+(Wt
2
Wt
1
)
(1 +)
= S
0
e
(

2
2
)t
2
+Wt
2
(1 +)
2
.
Inductively, we arrive at
S
t
k
= S
0
e
(

2
2
)t
k
+Wt
k
(1 +)
k
.
This formula holds for any t [t
k
, t
k+1
); replacing k by N
t
yields the desired formula
(12.5.19).
In the following we shall show that (12.5.19) is a solution of the stochastic dierential
equation (12.5.18). If denote
U
t
= S
0
e
(

2
2
)t+Wt
, V
t
= (1 +)
Nt
,
we have S
t
= U
t
V
t
and hence
dS
t
= V
t
dU
t
+U
t
dV
t
+dU
t
dV
t
. (12.5.20)
We already know that U
t
veries
dU
t
= ( )U
t
dt +U
t
dW
t
= ( )U
t
dt +U
t
dW
t
,
since U
t
is a continuous process, i.e., U
t
= U
t
. Then the rst term of (12.5.20) becomes
V
t
dU
t
= ( )S
t
dt +S
t
dW
t
.
In order to compute the second term of (12.5.20) we write
dV
t
= V
t
V
t
= (1 +)
Nt
(1 +)
Nt
=
_
(1 +)
1+N
t
(1 +)
N
t
, if t = t
k
(1 +)
Nt
(1 +)
Nt
, if t ,= t
k
=
_
(1 +)
N
t
, if t = t
k
0, if t ,= t
k
= (1 +)
N
t
_
1, if t = t
k
0, if t ,= t
k
= (1 +)
N
t
dN
t
,
237
so U
t
dV
t
= U
t
(1 +)
N
t
dN
t
= S
t
dN
t
. Since dtdN
t
= dN
t
dW
t
= 0, the last term of
(12.5.20) becomes dU
t
dV
t
= 0. Substituting back into (12.5.20) yields the equation
dS
t
= ( )S
t
dt +S
t
dW
t
+S
t
dN
t
.
Formula (12.5.19) provides the stock price at time t if exactly N
t
jumps have oc-
curred and all jumps in the return of the stock are equal to .
It is worth noting that if
k
denotes the jump of the instantaneous return at time
t
k
, a similar proof leads to the formula
S
t
= S
0
e
(

2
2
)t+Wt
Nt

k=1
(1 +
k
),
where = E[
k
]. The random variables
k
are assumed independent and identically
distributed. They are also independent of W
t
and N
t
. For more details the reader is
referred to Merton [11].
For the following exercises S
t
is given by (12.5.19).
Exercise 12.5.2 Find E[S
t
] and V ar[S
t
].
Exercise 12.5.3 Find E[ln S
t
] and V ar[ln S
t
].
Exercise 12.5.4 Compute the conditional expectation E[S
t
[T
u
] for u < t.
Remark 12.5.5 Besides stock, the underlying asset of a derivative can be also a stock
index, or foreign currency. When use the risk neutral valuation for derivatives on a
stock index that pays a continuous dividend yield at a rate q, the drift rate is replace
by r q.
In the case of foreign currency that pays interest at the foreign interest rate r
f
, the
drift rate is replace by r r
f
.
238
Chapter 13
Risk-Neutral Valuation
13.1 The Method of Risk-Neutral Valuation
This valuation method is based on the risk-neutral valuation principle, which states
that the price of a derivative on an asset S
t
is not aected by the risk preference of the
market participants; so we may assume they have the same risk aversion. In this case
the valuation of the derivative price f
t
at time t is done as in the following:
1. Assume the expected return of the asset S
t
is the risk-free rate, = r.
2. Calculate the expected payo of the derivative as of time t, under condition 1.
3. Discount at the risk-free rate from time T to time t.
The rst two steps require considering the expectation as of time t in a risk-neutral
world. This expectation is denoted by

E
t
[ ] and has the meaning of a conditional
expectation given by E[ [T
t
, = r]. The method states that if a derivative has the
payo f
T
, its price at any time t prior to maturity T is given by
f
t
= e
r(Tt)

E
t
[f
T
].
The rate r is considered constant, but the method can be easily adapted for time
dependent rates.
In the following we shall present explicit computations for the most common Euro-
pean type
1
derivative prices using the risk-neutral valuation method.
13.2 Call Option
A call option is a contract which gives the buyer the right of buying the stock at time T
for the price K. The time T is called maturity time or expiration date and K is called
the strike price. It is worth noting that a call option is a right (not an obligation!) of
1
A derivative is of European type if can be exercised only at the expiration time.
239
240
the buyer, which means that if the price of the stock at maturity, S
T
, is less than the
strike price, K, then the buyer may choose not to exercise the option. If the price S
T
exceeds the strike price K, then the buyer exercises the right to buy the stock, since he
pays K dollars for something which worth S
T
in the market. Buying at price K and
selling at S
T
yields a prot of S
T
K.
Consider a call option with maturity date T, strike price K with the underlying
stock price S
t
having constant volatility > 0. The payo at maturity time is f
T
=
max(S
T
K, 0), see Fig.13.1 a. The price of the call at any prior time 0 t T is
given by the expectation in a risk-neutral world
c(t) =

E
t
[e
r(Tt)
f
T
] = e
r(Tt)

E
t
[f
T
]. (13.2.1)
If we let x = ln(S
T
/S
t
), using the log-normality of the stock price in a risk-neutral
world
S
T
= S
t
e
(r

2
2
)(Tt)+(W
T
Wt)
,
and it follows that x has the normal distribution
x N
_
(r

2
2
)(T t),
2
(T t)
_
.
Then the density function of x is
p(x) =
1

_
2(T t)
e

_
x(r

2
2
)(Tt)
_
2
2
2
(Tt)
.
We can write the expectation as

E
t
[f
T
] =

E
t
[max(S
T
K, 0)] =

E
t
[max(S
t
e
x
K, 0)]
=
_

max(S
t
e
x
K, 0)p(x) dx =
_

ln(K/St)
(S
t
e
x
K)p(x) dx
= I
2
I
1
, (13.2.2)
with notations
I
1
=
_

ln(K/St)
Kp(x) dx, I
2
=
_

ln(K/St)
S
t
e
x
p(x) dx
With the substitution y =
x(r

2
2
)(Tt)

Tt
, the rst integral becomes
I
1
= K
_

ln(K/St)
p(x) dx = K
_

d
2
1

2
e

y
2
2
dy
= K
_
d
2

2
e

y
2
2
dy = KN(d
2
),
241
where
d
2
=
ln(S
t
/K) + (r

2
2
)(T t)

T t
,
and
N(u) =
1

2
_
u

e
z
2
/2
dz
denotes the standard normal distribution function.
Using the aforementioned substitution the second integral can be computed by
completing the square
I
2
= S
t
_

ln(K/St)
e
x
p(x) dx = S
t
_

d
2
1

2
e

1
2
y
2
+y

Tt+(r

2
2
)(Tt)
dx
= S
t
_

d
2
1

2
e

1
2
(y

Tt)
2
e
r(Tt)
dy (let z = y

T t)
= S
t
e
r(Tt)
_

d
2

Tt
1

2
e

z
2
2
dz = S
t
e
r(Tt)
_
d
1

2
e

z
2
2
dz
= S
t
e
r(Tt)
N(d
1
),
where
d
1
= d
2
+

T t =
ln(S
t
/K) + (r +

2
2
)(T t)

T t
.
Substituting back into (13.2.2) and then into (13.2.1) yields
c(t) = e
r(Tt)
(I
2
I
1
) = e
r(Tt)
[S
t
e
r(Tt)
N(d
1
) KN(d
2
)]
= S
t
N(d
1
) Ke
r(Tt)
N(d
2
).
We have obtained the well known formula of Black and Scholes:
Proposition 13.2.1 The price of a European call option at time t is given by
c(t) = S
t
N(d
1
) Ke
r(Tt)
N(d
2
).
Exercise 13.2.2 Show that
(a) lim
St
c(t)
S
t
= 1;
(b) lim
St0
c(t) = 0;
(c) c(t) S
t
Ke
r(Tt)
for S
t
large.
Exercise 13.2.3 Show that
dc(t)
dS
t
= N(d
1
). This expression is called the delta of a
call option.
242
a b
Figure 13.1: a The payo of a call option; b The payo of a cash-or-nothing contract.
13.3 Cash-or-nothing Contract
A nancial security that pays 1 dollar if the stock price S
T
K and 0 otherwise, is
called a bet contract, or cash-or-nothing contract, see Fig.13.1 b. The payo can be
written as
f
T
=
_
1, if S
T
K
0, if S
T
< K.
Substituting S
t
= e
Xt
, the payo becomes
f
T
=
_
1, if X
T
ln K
0, if X
T
< ln K,
where X
T
has the normal distribution
X
T
N
_
ln S
t
+ (

2
2
)(T t),
2
(T t)
_
.
The expectation in the risk-neutral world as of time t is

E
t
[f
T
] = E[f
T
[T
t
, = r] =
_

f
T
(x)p(x) dx
=
_
+
ln K
1

T t
e

[xlnS
t
(r

2
2
)(Tt)]
2
2
2
(Tt)
dx
=
_

d
2
1

2
e

y
2
2
dy =
_
d
2

2
e

y
2
2
dy = N(d
2
),
where we used the substitution y =
xlnSt(r

2
2
)(Tt)

Tt
and the notation
d
2
=
ln S
t
ln K + (r

2
2
)(T t)

T t
.
243
a b
Figure 13.2: a The payo of a box-bet; b The payo of an asset-or-nothing contract.
The price at time t of a bet contract is
f
t
= e
r(Tt)

E
t
[f
T
] = e
r(Tt)
N(d
2
). (13.3.3)
Exercise 13.3.1 Let 0 < K
1
< K
2
. Find the price of a nancial derivative which pays
at maturity $1 if K
1
S
t
K
2
and zero otherwise, see Fig.13.2 a. This is a box-bet
and its payo is given by
f
T
=
_
1, if K
1
S
T
K
2
0, otherwise.
Exercise 13.3.2 An asset-or-nothing contract pays S
T
if S
T
> K at maturity time T,
and pays 0 otherwise, see Fig.13.2 b. Show that the price of the contract at time t is
f
t
= S
t
N(d
1
).
Exercise 13.3.3 (a) Find the price at time t of a derivative which pays at maturity
f
T
=
_
S
n
T
, if S
T
K
0, otherwise.
(b) Show that the value of the contract can be written as f
t
= g
t
N(d
2
+ n

T t),
where g
t
is the value at time t of a power contract at time t given by (13.5.5).
(c) Recover the result of Exercise 13.3.2 in the case n = 1.
13.4 Log-contract
A nancial security that pays at maturity f
T
= ln S
T
is called a log-contract. Since the
stock is log-normally distributed,
ln S
T
N
_
ln S
t
+ (

2
2
)(T t),
2
(T t)
_
,
244
the risk-neutral expectation at time t is

E
t
[f
T
] = E[ln S
T
[T
t
, = r] = ln S
t
+ (r

2
2
)(T t),
and hence the price of the log-contract is given by
f
t
= e
r(Tt)

E
t
[f
T
] = e
r(Tt)
_
lnS
t
+ (r

2
2
)(T t)
_
. (13.4.4)
Exercise 13.4.1 Find the price at time t of a square log-contract whose payo is given
by f
T
= (ln S
T
)
2
.
13.5 Power-contract
The nancial derivative which pays at maturity the nth power of the stock price, S
n
T
,
is called a power contract. Since S
T
has has a log-normal distribution with
S
T
= S
t
e
(

2
2
)(Tt)+(W
T
Wt)
,
the nth power of the stock, S
n
T
, is also log-normally distributed, with
g
T
= S
n
T
= S
n
t
e
n(

2
2
)(Tt)+n(W
T
Wt)
.
Then the expectation at time t in the risk-neutral world is

E
t
[g
T
] = E[g
T
[T
t
, = r] = S
n
t
e
n(r

2
2
)(Tt)
E[e
n(W
T
Wt)
[T
t
]
= S
n
t
e
n(r

2
2
)(Tt)
E[e
n(W
T
Wt)
] = S
n
t
e
n(r

2
2
)(Tt)
E[e
nW
Tt
]
= S
n
t
e
n(r

2
2
)(Tt)
e
1
2
n
2

2
(Tt)
.
The price of the power-contract is obtained by discounting to time t
g
t
= e
r(Tt)
E[g
T
[T
t
, = r] = S
n
t
e
r(Tt)
e
n(r

2
2
)(Tt)
e
1
2
n
2

2
(Tt)
= S
n
t
e
(n1)(r+
n
2
2
)(Tt)
.
Hence the value of a power contract at time t is given by
g
t
= S
n
t
e
(n1)(r+
n
2
2
)(Tt)
. (13.5.5)
It is worth noting that if n = 1, i.e. if the payo is g
T
= S
T
, then the price of the
contract at any time t T is g
t
= S
t
, i.e. the stock price itself. This will be used
shortly when valuing forward contracts.
In the case n = 2, i.e. if the contract pays S
2
T
at maturity, then the price is g
t
=
S
2
t
e
(r+
2
)(Tt)
.
Exercise 13.5.1 Let n 1 be an integer. Find the price of a power call option whose
payo is given by f
T
= max(S
n
T
K, 0).
245
13.6 Forward Contract on the Stock
A forward contract pays at maturity the dierence between the stock price, S
T
, and
the delivery price of the asset, K. The price at time t is
f
t
= e
r(Tt)

E
t
[S
T
K] = e
r(Tt)

E
t
[S
T
] e
r(Tt)
K = S
t
e
r(Tt)
K,
where we used that K is a constant and that

E
t
[S
T
] = S
t
e
r(Tt)
.
Exercise 13.6.1 Let n 2, 3. Find the price of the power forward contract that
pays at maturity f
T
= (S
T
K)
n
.
13.7 The Superposition Principle
If the payo of a derivative, f
T
, can be written as a linear combination of payos
f
T
=
n

i=1
c
i
h
i,T
with c
i
constants, then the price at time t is given by
f
t
=
n

i=1
c
i
h
i,t
where h
i,t
is the price at time t of a derivative that pays at maturity h
i,T
. We shall
successfully use this method in the situation when the payo f
T
can be decomposed into
simpler payos, for which we can evaluate directly the price of the associate derivative.
In this case the price of the initial derivative, f
t
, is obtained as a combination of the
prices of the easier to valuate derivatives.
The reason underlying the aforementioned superposition principle is the linearity
of the expectation operator

E
f
t
= e
r(Tt)

E[f
T
] = e
r(Tt)

E[
n

i=1
c
i
h
i,T
]
= e
r(Tt)
n

i=1
c
i

E[h
i,T
] =
n

i=1
c
i
h
i,t
.
This principle is also connected with the absence of arbitrage opportunities
2
in the
market. Consider two portfolios of derivatives with equal values at the maturity time
T
n

i=1
c
i
h
i,T
=
m

j=1
a
j
g
j,T
.
2
An arbitrage opportunity deals with the practice of making prots by taking simultaneous long
and short positions in the market
246
If we take this common value to be the payo of a derivative, f
T
, then by the afore-
mentioned principle, the portfolios have the same value at any prior time t T
n

i=1
c
i
h
i,t
=
m

j=1
a
j
g
j,t
.
The last identity can also result from the absence of arbitrage opportunities in the
market. If there is a time t at which the identity fails, then buying the cheaper portfolio
and selling the more expensive one will lead to an arbitrage prot.
The superposition principle can be used to price package derivatives such as spreads,
straddles, strips, straps and strangles. We shall deal with these type of derivatives in
proposed exercises.
13.8 General Contract on the Stock
By a general contract on the stock we mean a derivative that pays at maturity the
amount g
T
= G(S
T
), where G is a given analytic function. In the case G(S
T
) =
S
n
T
, we have a power contract. If G(S
T
) = lnS
T
, we have a log-contract. If choose
G(S
T
) = S
T
K, we obtain a forward contract. Since G is analytic we shall write
G(x) =

n0
c
n
x
n
, where c
n
= G
(n)
(0)/n!.
Decomposing into power contracts and using the superposition principle we have

E
t
[g
T
] =

E
t
[G(S
T
)] =

E
t
_

n0
c
n
S
n
T

n0
c
n

E
t
_
S
n
T

n0
c
n
S
n
t
e
n(r

2
2
)(Tt)
e
1
2
n
2

2
(Tt)
=

n0
c
n
_
S
t
e
n(r

2
2
)(Tt)
_
n
_
e
1
2

2
(Tt)
_
n
2
.
Hence the value at time t of a general contract is
g
t
= e
r(Tt)

n0
c
n
_
S
t
e
n(r

2
2
)(Tt)
_
n
_
e
1
2

2
(Tt)
_
n
2
.
Exercise 13.8.1 Find the value at time t of an exponential contract that pays at ma-
turity the amount e
S
T
.
247
13.9 Call Option
In the following we price a European call using the superposition principle. The payo
of a call option can be decomposed as
c
T
= max(S
T
K, 0) = h
1,T
Kh
2,T
,
with
h
1,T
=
_
S
T
, if S
T
K
0, if S
T
< K,
h
2,T
=
_
1, if S
T
K
0, if S
T
< K.
These are the payos of asset-or-nothing and of cash-or-nothing derivatives. From
section 13.3 and Exercise 13.3.2 we have h
1,t
= S
t
N(d
1
), h
2,t
= e
r(Tt)
N(d
2
). By
superposition we get the price of a call at time t
c
t
= h
1,t
Kh
2,t
= S
t
N(d
1
) Ke
r(Tt)
N(d
2
).
Exercise 13.9.1 (Put option) (a) Consider the payo h
1,T
=
_
1, if S
T
K
0, if S
T
> K.
Show that
h
1,t
= e
r(Tt)
N(d
2
), t T.
(b) Consider the payo h
2,T
=
_
S
T
, if S
T
K
0, if S
T
> K.
Show that
h
2,t
= S
t
N(d
1
), t T.
(c) The payo of a put is p
T
= max(K S
T
, 0). Verify that
p
T
= Kh
1,T
h
2,T
and use the superposition principle to nd the price p
t
of a put.
13.10 General Options on the Stock
Let G be an increasing analytic function with the inverse G
1
and K be a positive
constant. A general call option is a contract with the payo
f
T
=
_
G(S
T
) K, if S
T
G
1
(K)
0, otherwise.
(13.10.6)
We note the payo function f
T
is continuous. We shall work out the value of the
contract at time t, f
t
, using the superposition method. Since the payo can be written
as the linear combination
f
T
= h
1,T
Kh
2,T
,
248
with
h
1,T
=
_
G(S
T
), if S
T
G
1
(K)
0, otherwise,
h
2,T
=
_
1, if S
T
G
1
(K)
0, otherwise,
then
f
t
= h
1,t
Kh
2,t
. (13.10.7)
We had already computed the value h
2,t
. In this case we have h
2,t
= e
r(Tt)
N(d
G
2
),
where d
G
2
is obtained by replacing K with G
1
(K) in the formula of d
2
d
G
2
=
ln S
t
ln
_
G
1
(K)
_
+ (r

2
2
)(T t)

T t
. (13.10.8)
We shall compute in the following h
1,t
. Let G(S
T
) =

n0
c
n
S
n
T
with c
n
= G
(n)
(0)/n!.
Then h
1,T
=

n0
c
n
f
(n)
T
, where
f
(n)
T
=
_
S
n
T
, if S
T
G
1
(K)
0, otherwise.
By Exercise 13.3.3 the price at time t for a contract with the payo f
(n)
T
is
f
(n)
t
= S
n
t
e
(n1)(r+
n
2
2
)(Tt)
N(d
G
2
+n

T t).
The value at time t of h
1,T
is given by
h
1,t
=

n0
c
n
f
(n)
t
=

n0
c
n
S
n
t
e
(n1)(r+
n
2
2
)(Tt)
N(d
G
2
+n

T t).
Substituting in (13.10.7) we obtain the following value at time t of a general call option
with the payo (13.10.6)
f
t
=

n0
c
n
S
n
t
e
(n1)(r+
n
2
2
)(Tt)
N(d
G
2
+n

T t) Ke
r(Tt)
N(d
G
2
), (13.10.9)
with c
n
=
G
(n)
n!
and d
G
2
given by (13.10.8).
It is worth noting that in the case G(S
T
) = S
T
we have n = 1, d
2
= d
G
2
, d
1
=
d
2
+(T t) formula (13.10.9) becomes the value of a plain vanilla call option
f
t
= S
t
N(d
1
) Ke
r(Tt)
.
249
13.11 Packages
Packages are derivatives whose payos are linear combinations of payos of options,
cash and underlying asset. They can be priced using the superposition principle. Some
of these packages are used in hedging techniques.
The Bull Spread Let 0 < K
1
< K
2
. A derivative with the payo
f
T
=
_
_
_
0, if S
T
K
1
S
T
K
1
, if K
1
< S
T
K
2
K
2
K
1
, if K
2
< S
T
is called a bull spread, see Fig.13.3 a. A market participant enters a bull spread
position when the stock price is expected to increase. The payo f
T
can be written as
the dierence of the payos of two calls with strike prices K
1
and K
2
:
f
T
= c
1
(T) c
2
(T),
c
1
(T) =
_
0, if S
T
K
1
S
T
K
1
, if K
1
< S
T
c
2
(T) =
_
0, if S
T
K
2
S
T
K
2
, if K
2
< S
T
Using the superposition principle, the price of a bull spread at time t is
f
t
= c
1
(t) c
2
(t)
= S
t
N
_
d
1
(K
1
)
_
K
1
e
r(Tt)
N
_
d
2
(K
1
)
_

_
S
t
N
_
d
1
(K
2
)
_
K
2
e
r(Tt)
N
_
d
2
(K
2
)
_
_
= S
t
[N
_
d
1
(K
1
)
_
N
_
d
1
(K
2
)
_
] e
r(Tt)
[N
_
d
2
(K
1
)
_
N
_
d
2
(K
2
)
_
],
with
d
2
(K
i
) =
ln S
t
ln K
i
+ (r

2
2
)(T t)

T t
, d
1
(K
i
) = d
2
(K
i
) +

T t, i = 1, 2.
The Bear Spread Let 0 < K
1
< K
2
. A derivative with the payo
f
T
=
_
_
_
K
2
K
1
, if S
T
K
1
K
2
S
T
, if K
1
< S
T
K
2
0, if K
2
< S
T
is called a bear spread, see Fig.13.3 b. A long position in this derivative leads to prots
when the stock price is expected to decrease.
Exercise 13.11.1 Find the price of a bear spread at time t, with t < T.
250
a b
Figure 13.3: a The payo of a bull spread; b The payo of a bear spread.
a b
Figure 13.4: a The payo of a buttery spread; b The payo of a straddle.
The Buttery Spread Let 0 < K
1
< K
2
< K
3
, with K
2
= (K
1
+K
3
)/2. A buttery
spread is a derivative with the payo given by
f
T
=
_

_
0, if S
T
K
1
S
T
K
1
, if K
1
< S
T
K
2
K
3
S
T
, if K
2
S
T
< K
3
0, if K
3
S
T
.
A short position in a buttery spread leads to prots when a small move in the stock
price occurs, see Fig.13.4 a.
Exercise 13.11.2 Find the price of a buttery spread at time t, with t < T.
Straddles A derivative with the payo f
T
= [S
T
K[ is called a straddle, see see
Fig.13.4 b. A long position in a straddle leads to prots when a move in any direction
of the stock price occurs.
251
a b
Figure 13.5: a The payo for a strangle; b The payo for a condor.
Exercise 13.11.3 (a) Show that the payo of a straddle can be written as
f
T
=
_
K S
T
, if S
T
K
S
T
K, if K < S
T
.
(b) Find the price of a straddle at time t, with t < T.
Strangles Let 0 < K
1
< K
2
and K = (K
2
+K
1
)/2, K

= (K
2
K
1
)/2. A derivative
with the payo f
T
= max([S
T
K[ K

, 0) is called a strangle, see Fig.13.5 a. A long


position in a strangle leads to prots when a small move in any direction of the stock
price occurs.
Exercise 13.11.4 (a) Show that the payo of a strangle can be written as
f
T
=
_
_
_
K
1
S
T
, if S
T
K
1
0, if K
1
< S
T
K
2
S
T
K
2
, if K
2
S
T
.
(b) Find the price of a strangle at time t, with t < T;
(c) Which is cheaper: a straddle or a strangle?
Exercise 13.11.5 Let 0 < K
1
< K
2
< K
3
< K
4
, with K
4
K
3
= K
2
K
1
. Find the
value at time t of a derivative with the payo f
T
given in Fig.13.5 b.
13.12 Asian Forward Contracts
Forward Contracts on the Arithmetic Average Let A
T
=
1
T
_
t
0
S
u
du denote the
continuous arithmetic average of the asset price between 0 and T. It sometimes makes
252
sense for two parties to make a contract in which one party pays the other at maturity
time T the dierence between the average price of the asset, A
T
, and the xed delivery
price K. The payo of a forward contract on arithmetic average is
f
T
= A
T
K.
For instance, if the asset is natural gas, it makes sense to make a deal on the average
price of the asset, since the price is volatile and can become expensive during the winter
season.
Since the risk-neutral expectation at time t = 0,

E
0
[A
T
], is given by E[A
T
] where
is replaced by r, the price of the forward contract at t = 0 is
f
0
= e
rT

E
0
[f
T
] = e
rT
(

E
0
[A
T
] K)
= e
rT
_
S
0
e
rT
1
rT
K
_
= S
0
1 e
rT
rT
e
rT
K
=
S
0
rT
e
rT
_
S
0
rT
+K
_
.
Hence the value of a forward contract on the arithmetic average at time t = 0 is given
by
f
0
=
S
0
rT
e
rT
_
S
0
rT
+K
_
. (13.12.10)
It is worth noting that the price of a forward contract on the arithmetic average
is cheaper than the price of a usual forward contract on the asset. To see that, we
substitute x = rT in the inequality e
x
> 1 x, x > 0, to get
1 e
rT
rT
< 1.
This implies the inequality
S
0
1 e
rT
rT
e
rT
K < S
0
e
rT
K.
Since the left side is the price of an Asian forward contract on the arithmetic average,
while the right side is the price of a usual forward contract, we obtain the desired
inequality.
Formula (13.12.10) provides the price of the contract at time t = 0. What is the
price at any time t < T? One might be tempted to say that replacing T by T t and
S
0
by S
t
in the formula of f
0
leads to the corresponding formula for f
t
. However, this
does not hold true, as the next result shows:
253
Proposition 13.12.1 The value at time t of a contract that pays at maturity f
T
=
A
T
K is given by
f
t
= e
r(Tt)
_
t
T
A
t
K
_
+
1
rT
S
t
_
1 e
r(Tt)
_
. (13.12.11)
Proof: We start by computing the risk-neutral expectation

E
t
[A
T
]. Splitting the
integral into a predictable and an unpredictable part, we have

E
t
[A
T
] =

E
_
1
T
_
T
0
S
u
du[T
t
_
=

E
_
1
T
_
t
0
S
u
du +
1
T
_
T
t
S
u
du[T
t
_
=
1
T
_
t
0
S
u
du +

E
_
1
T
_
T
t
S
u
du[T
t
_
=
1
T
_
t
0
S
u
du +
1
T
_
T
t

E[S
u
[T
t
] du
=
1
T
_
t
0
S
u
du +
1
T
_
T
t
S
t
e
r(ut)
du,
where we replaced by r in the formula

E[S
u
[T
t
] = S
t
e
(ut)
. Integrating we obtain

E
t
[A
T
] =
1
T
_
t
0
S
u
du +
1
T
_
T
t
S
t
e
r(ut)
du
=
t
T
A
t
+
1
T
S
t
e
rt
e
rT
e
rt
r
=
t
T
A
t
+
1
rT
S
t
_
e
r(Tt)
1
_
. (13.12.12)
Using (13.12.12), the risk-neutral valuation provides
f
t
= e
r(Tt)

E
t
[A
T
K] = e
r(Tt)

E
t
[A
T
] e
r(Tt)
K
= e
r(Tt)
_
t
T
A
t
K
_
+
1
rT
S
t
_
1 e
r(Tt)
_
.
Exercise 13.12.2 Show that lim
t0
f
t
= f
0
, where f
t
is given by (13.12.11) and f
0
by
(13.12.10).
Exercise 13.12.3 Find the value at time t of a contract that pays at maturity date
the dierence between the asset price and its arithmetic average, f
T
= S
T
A
T
.
254
Forward Contracts on the Geometric Average We shall consider in the following
Asian forward contracts on the geometric average. This is a derivative that pays at
maturity the dierence f
T
= G
T
K, where G
T
is the continuous geometric average
of the asset price between 0 and T and K is a xed delivery price.
We shall work out rst the value of the contract at t = 0. Substituting = r in
the rst relation provided by Theorem 12.4.4, the risk-neutral expectation of G
T
as of
time t = 0 is

E
0
[G
T
] = S
0
e
1
2
(r

2
6
)T
.
Then
f
0
= e
rT

E
0
[G
T
K] = e
rT

E
0
[G
T
] e
rT
K
= e
rT
S
0
e
1
2
(r

2
6
)T
e
rT
K
= S
0
e

1
2
(r+

2
6
)T
e
rT
K.
Thus, the price of a forward contract on geometric average at t = 0 is given by
f
0
= S
0
e

1
2
(r+

2
6
)T
e
rT
K. (13.12.13)
As in the case of forward contracts on arithmetic average, the value at time 0 <
t < T cannot be obtain from (13.12.13) by replacing blindly T and S
0
by T t and S
t
,
respectively. The correct relation is given by the following result:
Proposition 13.12.4 The value at time t of a contract which pays at maturity G
T
K
is
f
t
= G
t
T
t
S
1
t
T
t
e
r(Tt)+(r

2
2
)
(Tt)
2
2T
+

2
T
2
(Tt)
3
6
e
r(Tt)
K. (13.12.14)
Proof: Since for t < u
ln S
u
= ln S
t
+ (

2
2
)(u t) +(W
u
W
t
),
we have
_
T
0
ln S
u
du =
_
t
0
ln S
u
du +
_
T
t
ln S
u
du
=
_
t
0
ln S
u
du +
_
T
t
_
ln S
t
+ (

2
2
)(u t) +(W
u
W
t
)
_
du
=
_
t
0
ln S
u
du + (T t) ln S
t
+ (

2
2
)
_
T
2
t
2
2
t(T t)
_
+
_
T
t
(W
u
W
t
) du.
255
The geometric average becomes
G
T
= e
1
T

T
0
lnSu du
= e
1
T

t
0
ln Su du
S
1
t
T
t
e
1
T
(

2
2
)(
T+t
2
t)(Tt)
e

T
t
(WuWt) du
= G
t
T
t
S
1
t
T
t
e
(

2
2
)
(Tt)
2
2T
e

T
t
(WuWt) du
, (13.12.15)
where we used that
e
1
T

T
0
ln Su du
= e
t
T
lnGt
= G
t
T
t
.
Relation (13.12.15) provides G
T
in terms of G
t
and S
t
. Taking the predictable part
out and replacing by r we have

E
t
[G
T
] = G
t
T
t
S
1
t
T
t
e
(r

2
2
)
(Tt)
2
2T
E
_
e

T
t
(WuWt) du
[T
t
_
. (13.12.16)
Since the jump W
u
W
t
is independent of the information set T
t
, the condition can
be dropped
E
_
e

T
t
(WuWt) du
[T
t
_
= E
_
e

T
t
(WuWt) du
_
.
Integrating by parts yields
_
T
t
(W
u
W
t
) du =
_
T
t
W
u
du (T t)W
t
= TW
T
tW
t

_
T
t
udW
u
TW
t
+tW
t
= T(W
T
W
t
)
_
T
t
udW
u
=
_
T
t
T dW
u

_
T
t
udW
u
=
_
T
t
(T u) dW
u
,
which is a Wiener integral. This is normal distributed with mean 0 and variance
_
T
t
(T u)
2
du =
(T t)
3
3
.
Then
_
T
t
(W
u
W
t
) du N
_
0,
(T t)
3
3
_
and hence
E
_
e

T
t
(WuWt) du
_
= E
_
e

T
t
(Tu) dWu
_
= e
1
2

2
T
2
(Tt)
3
3
= e

2
T
2
(Tt)
3
6
.
Substituting into (13.12.16) yields

E
t
[G
T
] = G
t
T
t
S
1
t
T
t
e
(r

2
2
)
(Tt)
2
2T
e

2
T
2
(Tt)
3
6
. (13.12.17)
256
Hence the value of the contract at time t is given by
f
t
= e
r(Tt)

E
t
[G
T
K] = G
t
T
t
S
1
t
T
t
e
r(Tt)+(r

2
2
)
(Tt)
2
2T
+

2
T
2
(Tt)
3
6
e
r(Tt)
K.
Exercise 13.12.5 Show that lim
t0
f
t
= f
0
, where f
t
is given by (13.12.14) and f
0
by
(13.12.13).
Exercise 13.12.6 Which is cheaper: an Asian forward contract on A
t
or an Asian
forward contract on G
t
?
Exercise 13.12.7 Using Corollary 12.4.5 nd a formula of G
T
in terms of G
t
, and
then compute the risk-neutral world expectation

E
t
[G
T
].
13.13 Asian Options
There are several types of Asian options depending on how the payo is related to the
average stock price:
Average Price options:
Call: f
T
= max(S
ave
K, 0)
Put: f
T
= max(K S
ave
, 0).
Average Strike options:
Call: f
T
= max(S
T
S
ave
, 0)
Put: f
T
= max(S
ave
S
T
, 0).
The average asset price S
ave
can be either the arithmetic or the geometric average of
the asset price between 0 and T.
Geometric average price options When the asset is the geometric average, G
T
, we
shall obtain closed form formulas for average price options. Since G
T
is log-normally
distributed, the pricing procedure is similar with the one used for the usual options
on stock. We shall do this by using the superposition principle and the following two
results. The rst one is a cash-or-nothing type contract where the underlying asset is
the geometric mean of the stock between 0 and T.
Lemma 13.13.1 The value at time t = 0 of a derivative, which pays at maturity $1
if the geometric average G
T
K and 0 otherwise, is given by
h
0
= e
rT
N(

d
2
),
257
where

d
2
=
ln S
0
ln K + (

2
2
)
T
2

_
T/3
.
Proof: The payo can be written as
h
T
=
_
1, if G
T
K
0, if G
T
< K
=
_
1, if X
T
ln K
0, if X
T
< ln K,
where X
T
= lnG
T
has the normal distribution
X
T
N
_
ln S
0
+ (

2
2
)
T
2
,

2
T
3
_
,
see formula (12.4.15). Let p(x) be the probability density of the random variable X
T
p(x) =
1

2
_
T/3
e
[xlnS
0
(

2
2
)
T
2
]
2
/(
2
2
T
3
)
. (13.13.18)
The risk neutral expectation of the payo at time t = 0 is

E
0
[h
T
] =
_
h
T
(x)p(x) dx =
_

ln K
p(x) dx
=
1

2
_
T/3
_

ln K
e
[xlnS
0
(r

2
2
)
T
2
]
2
/(
2
2
T
3
)
dx,
where was replaced by r. Substituting
y =
x ln S
0
(r

2
2
)
T
2

_
T/3
, (13.13.19)
yields

E
0
[h
T
] =
1

2
_

d
2
e
y
2
/2
dy =
1

2
_

d
2

e
y
2
/2
dy
= N(

d
2
),
where

d
2
=
ln S
0
ln K + (r

2
2
)
T
2

_
T/3
.
Discounting to the free interest rate yields the price at time t = 0
h
0
= e
rT

E
0
[h
T
] = e
rT
N(

d
2
).
The following result deals with the price of an average-or-nothing derivative on the
geometric average.
258
Lemma 13.13.2 The value at time t = 0 of a derivative, which pays at maturity G
T
if G
T
K and 0 otherwise, is given by the formula
g
0
= e

1
2
(r+

2
6
)T
S
t
N(

d
1
),
where

d
1
=
ln S
0
ln K + (r +

2
6
)
T
2

_
T/3
.
Proof: Since the payo can be written as
g
T
=
_
G
T
, if G
T
K
0, if G
T
< K
=
_
e
X
T
, if X
T
ln K
0, if X
T
< ln K,
with X
T
= ln G
T
, the risk neutral expectation at the time t = 0 of the payo is

E
0
[g
T
] =
_

g
T
(x)p(x) dx =
_

ln K
e
x
p(x) dx,
where p(x) is given by (13.13.18), with replaced by r. Using the substitution
(13.13.19) and completing the square yields

E
0
[g
T
] =
1

2
_
T/3
_

lnK
e
x
e
[xlnS
0
(r

2
2
)
T
2
]
2
/(
2
2
T
3
)
dx
=
1

2
S
0
e
1
2
(r

2
6
)T
_

d
2
e

1
2
[y

T/3]
2
dy
If we let

d
1
=

d
2
+
_
T/3
=
ln S
0
ln K + (r

2
2
)
T
2

_
T/3
+
_
T/3
=
ln S
0
ln K + (r +

2
6
)
T
2

_
T/3
,
the previous integral becomes, after substituting z = y
_
T/3,
1

2
S
0
e
1
2
(r

2
6
)T
_

d
1
e

1
2
z
2
dz = S
0
e
1
2
(r

2
6
)T
N(

d
1
).
Then the risk neutral expectation of the payo is

E
0
[g
T
] = S
0
e
1
2
(r

2
6
)T
N(

d
1
).
259
The value of the derivative at time t is obtained by discounting at the interest rate r
g
0
= e
rT

E
0
[g
T
] = e
rT
S
0
e
1
2
(r

2
6
)T
N(

d
1
)
= e

1
2
(r+

2
6
)T
S
0
N(

d
1
).
Proposition 13.13.3 The value at time t of a geometric average price call option is
f
0
= e

1
2
(r+

2
6
)T
S
0
N(

d
1
) Ke
rT
N(

d
2
).
Proof: Since the payo f
T
= max(G
T
K, 0) can be decomposed as
f
T
= g
T
Kh
T
,
with
g
T
=
_
G
T
, if G
T
K
0, if G
T
< K,
h
T
=
_
1, if G
T
K
0, if G
T
< K,
applying the superposition principle and Lemmas 13.13.1 and 13.13.2 yields
f
0
= g
0
Kh
0
= e

1
2
(r+

2
6
)T
S
0
N(

d
1
) Ke
rT
N(

d
2
).
Exercise 13.13.4 Find the value at time t = 0 of a price put option on a geometric
average, i.e. a derivative with the payo f
T
= max(K G
T
, 0).
Arithmetic average price options There is no simple closed-form solution for a
call or for a put on the arithmetic average A
t
. However, there is an approximate
solution based on computing exactly the rst two moments of the distribution of A
t
,
and applying the risk-neutral valuation assuming that the distribution is log-normally
with the same two moments. This idea was developed by Turnbull and Wakeman [15],
and works pretty well for volatilities up to about 20%.
The following result provides the mean and variance of a normal distribution in
terms of the rst two moments of the associated log-normally distribution.
Proposition 13.13.5 Let Y be a log-normally distributed random variable, having the
rst two moments given by
m
1
= E[Y ], m
2
= E[Y
2
].
Then ln Y has the normal distribution ln Y N(,
2
), with the mean and variance
given respectively by
= ln
m
2
1

m
2
,
2
= ln
m
2
m
2
1
(13.13.20)
260
Proof: Using Exercise 1.6.2 we have
m
1
= E[Y ] = e
+

2
2
, m
2
= E[Y
2
] = e
2+2
2
.
Taking a logarithm yields
+

2
2
= ln m
1
, 2 + 2
2
= ln m
2
.
Solving for and yields (13.13.20).
Assume the arithmetic average A
t
=
It
t
has a log-normally distribution. Then
ln A
t
= ln I
t
ln t is normal, so ln I
t
is normal, and hence I
t
is log-normally distributed.
Since I
T
=
_
T
0
S
u
du, using (12.4.12) yields
m
1
= E[I
T
] = S
0
e
T
1

;
m
2
= E[I
2
T
] =
2S
2
0
+
2
_
e
(2+
2
)T
1
2 +
2

e
T
1

_
.
Using Proposition 13.13.5 it follows that ln A
t
is normally distributed, with
ln A
T
N
_
ln
m
2
1

m
2
ln t, ln
m
2
m
2
1
_
. (13.13.21)
Relation (13.13.21) represents the normal approximation of ln A
t
. We shall price the
arithmetic average price call under this condition.
In the next two exercises we shall assume that the distribution of A
T
is given by the
log-normal distribution (13.13.21).
Exercise 13.13.6 Using a method similar to the one used in Lemma 13.13.1, show
that an approximate value at time 0 of a derivative, which pays at maturity $1 if the
arithmetic average A
T
K and 0 otherwise, is given by
h
0
= e
rT
N(

d
2
),
with

d
2
=
ln(m
2
1
/

m
2
) ln K ln t
_
ln(m
2
/m
2
1
)
, (13.13.22)
where in the expressions of m
1
and m
2
we replaced by r.
Exercise 13.13.7 Using a method similar to the one used in Lemma 13.13.2, show
that the approximate value at time 0 of a derivative, which pays at maturity A
T
if
A
T
K and 0 otherwise, is given by the formula
a
0
= S
0
1 e
rT
rt
N(

d
1
),
261
where (CHECK THIS AGAIN!)

d
1
= ln(m
2
/m
2
1
) +
ln
m
2
1

m
2
+ ln t ln K
_
ln(m
2
/m
2
1
)
, (13.13.23)
where in the expressions of m
1
and m
2
we replaced by r.
Proposition 13.13.8 The approximate value at t = 0 of an arithmetic average price
call is given by
f
0
=
S
0
(1 e
rT
)
r
N(

d
1
) Ke
rT
N(

d
2
),
with

d
1
and

d
2
given by formulas (13.13.22) and (13.13.23).
Exercise 13.13.9 (a) Prove Proposition 13.13.8.
(b) How does the formula change if the value is taken at time t instead of time 0?
13.14 Forward Contracts with Rare Events

We shall evaluate the price of a forward contract on a stock which follows a stochastic process with rare events, where the number of events $n = N_T$ until time $T$ is assumed Poisson distributed. As usual, $T$ denotes the maturity of the forward contract.

Let the jump ratios be $Y_j = S_{t_j}/S_{t_j^-}$, where the events occur at times $0 < t_1 < t_2 < \cdots < t_n$, with $n = N_T$. The stock price at maturity is given by Merton's formula
$$S_T = S_0\, e^{(\mu - \frac{1}{2}\sigma^2)T + \sigma W_T} \prod_{j=1}^{N_T} Y_j,$$
where $\beta = E[Y_j]$ is the expected jump size, and $Y_1, \ldots, Y_n$ are considered independent among themselves and also with respect to $W_t$. Conditioning over $N_T = n$ yields
$$E\Big[\prod_{j=1}^{N_T} Y_j\Big] = \sum_{n \geq 0} E\Big[\prod_{j=1}^{N_T} Y_j \,\Big|\, N_T = n\Big]\, P(N_T = n) = \sum_{n \geq 0} \prod_{j=1}^{n} E[Y_j]\, P(N_T = n) = \sum_{n \geq 0} \beta^n\, \frac{(\lambda T)^n}{n!}\, e^{-\lambda T} = e^{-\lambda T} e^{\beta\lambda T}.$$
Since $W_T$ is independent of $N_T$ and of the $Y_j$, we have
$$E[S_T] = S_0\, e^{(\mu - \frac{\sigma^2}{2})T}\, E[e^{\sigma W_T}]\, E\Big[\prod_{j=1}^{N_T} Y_j\Big] = S_0\, e^{\mu T} e^{-\lambda T} e^{\beta\lambda T} = S_0\, e^{(\mu - \lambda(1-\beta))T}.$$
Since the payoff at maturity of the forward contract is $f_T = S_T - K$, with $K$ the delivery price, the value of the contract at time $t = 0$ is obtained by the method of risk-neutral valuation
$$f_0 = e^{-rT}\, \widetilde{E}_0[S_T - K] = e^{-rT}\big(S_0\, e^{(r - \lambda(1-\beta))T} - K\big) = S_0\, e^{-\lambda(1-\beta)T} - K e^{-rT}.$$
Replacing $T$ by $T - t$ and $S_0$ by $S_t$ yields the value of the contract at time $t$
$$f_t = S_t\, e^{-\lambda(1-\beta)(T-t)} - K e^{-r(T-t)}, \qquad 0 \leq t \leq T. \tag{13.14.24}$$
It is worth noting that if the rate of jump occurrence is $\lambda = 0$, we obtain the familiar result
$$f_t = S_t - K e^{-r(T-t)}.$$
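Formula (13.14.24) can be checked against a Monte Carlo simulation of the jump model. The sketch below assumes, purely for concreteness, log-normal jump ratios $Y_j$ with mean $\beta$ (a hypothetical choice; any jump distribution with the same mean would do), simulated with the drift $\mu$ replaced by $r$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for the jump model
S0, K, r, sigma, lam, T = 100.0, 95.0, 0.05, 0.2, 0.5, 1.0
beta = 1.1                 # expected jump ratio E[Y_j]
jump_sigma = 0.15          # volatility of ln Y_j (assumed log-normal jumps)
n_paths = 200_000

N = rng.poisson(lam * T, n_paths)              # number of jumps per path
W = np.sqrt(T) * rng.standard_normal(n_paths)  # W_T
# sum of N i.i.d. normal log-jumps, each with mean ln(beta) - jump_sigma^2/2
jump_mu = np.log(beta) - 0.5 * jump_sigma**2
log_jumps = jump_mu * N + jump_sigma * np.sqrt(N) * rng.standard_normal(n_paths)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W + log_jumps)

mc_value = np.exp(-r * T) * (ST.mean() - K)
formula = S0 * np.exp(-lam * (1 - beta) * T) - K * np.exp(-r * T)
print(mc_value, formula)   # the two values should be close
```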
13.15 All-or-Nothing Lookback Options (Needs work!)

Consider a contract that pays the cash amount $K$ at time $T$ if the stock price $S_t$ did ever reach or exceed the level $z$ until time $T$, and the amount 0 otherwise. The payoff is
$$V_T = \begin{cases} K, & \text{if } \overline{S}_T \geq z \\ 0, & \text{otherwise}, \end{cases}$$
where $\overline{S}_T = \max_{t \leq T} S_t$ is the running maximum of the stock.

In order to compute the exact value of the option we need to find the probability density of the running maximum
$$X_t = \overline{S}_t = \max_{s \leq t} S_s = S_0\, e^{\max_{s \leq t}(\alpha s + \sigma W_s)},$$
where $\alpha = r - \frac{\sigma^2}{2}$. Let $Y_t = \max_{s \leq t}(\alpha s + \sigma W_s)$.

Let $T_x$ be the first time the process $\alpha t + \sigma W_t$ reaches level $x$, with $x > 0$. The probability density of $T_x$ is given by Proposition 3.6.5
$$p(\tau) = \frac{x}{\sigma\sqrt{2\pi\tau^3}}\, e^{-\frac{(x - \alpha\tau)^2}{2\sigma^2\tau}}.$$
The probability function of $Y_t$ is given by
$$P(Y_t \leq x) = 1 - P(Y_t > x) = 1 - P\Big(\max_{0 \leq s \leq t}(\alpha s + \sigma W_s) > x\Big) = 1 - P(T_x \leq t) = 1 - \int_0^t \frac{x}{\sigma\sqrt{2\pi\tau^3}}\, e^{-\frac{(x-\alpha\tau)^2}{2\sigma^2\tau}}\, d\tau = \int_t^{\infty} \frac{x}{\sigma\sqrt{2\pi\tau^3}}\, e^{-\frac{(x-\alpha\tau)^2}{2\sigma^2\tau}}\, d\tau.$$
Then the probability function of $X_t$ becomes
$$P(X_t \leq u) = P(e^{Y_t} \leq u/S_0) = P\big(Y_t \leq \ln(u/S_0)\big) = \int_t^{\infty} \frac{\ln(u/S_0)}{\sigma\sqrt{2\pi\tau^3}}\, e^{-\frac{(\ln(u/S_0) - \alpha\tau)^2}{2\sigma^2\tau}}\, d\tau.$$
Let $S_0 < z$. What is the probability that the stock $S_t$ hits the barrier $z$ before time $T$? This can be formalized as the probability $P(\max_{t \leq T} S_t > z)$, which can be computed using the previous probability function:
$$P(\overline{S}_T > z) = 1 - P\big(\max_{t \leq T} S_t \leq z\big) = 1 - P(X_T \leq z) = 1 - \int_T^{\infty} \frac{\ln(z/S_0)}{\sigma\sqrt{2\pi\tau^3}}\, e^{-\frac{(\ln(z/S_0) - \alpha\tau)^2}{2\sigma^2\tau}}\, d\tau = \int_0^T \frac{\ln(z/S_0)}{\sigma\sqrt{2\pi\tau^3}}\, e^{-\frac{(\ln(z/S_0) - \alpha\tau)^2}{2\sigma^2\tau}}\, d\tau,$$
where $\alpha = r - \sigma^2/2$. The exact value of the all-or-nothing lookback option at time $t = 0$ is
$$V_0 = e^{-rT} E[V_T] = E\big[V_T \,\big|\, \max_{t \leq T} S_t \geq z\big]\, P\big(\max_{t \leq T} S_t \geq z\big)\, e^{-rT} + \underbrace{E\big[V_T \,\big|\, \max_{t \leq T} S_t < z\big]}_{=0}\, P\big(\max_{t \leq T} S_t < z\big)\, e^{-rT}$$
$$= e^{-rT} K\, P\big(\max_{t \leq T} S_t \geq z\big) = e^{-rT} K \int_0^T \frac{\ln(z/S_0)}{\sigma\sqrt{2\pi\tau^3}}\, e^{-\frac{(\ln(z/S_0) - \alpha\tau)^2}{2\sigma^2\tau}}\, d\tau. \tag{13.15.25}$$
Since the previous integral does not have an elementary expression, we shall work out
some lower and upper bounds.
Proposition 13.15.1 We have
$$e^{-rT} K\left(1 - \frac{1}{\sigma}\sqrt{\frac{2}{\pi T}}\, \ln\frac{z}{S_0}\right) < V_0 < \frac{K S_0}{z}.$$
Proof: First, we shall find an upper bound for the option price using Doob's inequality. The stock price at time $t$ is given by
$$S_t = S_0\, e^{(r - \frac{\sigma^2}{2})t + \sigma W_t}.$$
Under normal market conditions we have $r > \frac{\sigma^2}{2}$, and by Corollary 2.12.4, part (b), it follows that $S_t$ is a submartingale (with respect to the filtration induced by $W_t$). Applying Doob's submartingale inequality, Proposition 2.12.5, we have
$$P\big(\max_{t \leq T} S_t \geq z\big) \leq \frac{1}{z}\, E[S_T] = \frac{S_0}{z}\, e^{rT}.$$
Then the expected value of $V_T$ can be estimated as
$$E[V_T] = K\, P\big(\max_{t \leq T} S_t \geq z\big) + 0 \cdot P\big(\max_{t \leq T} S_t < z\big) \leq \frac{K S_0}{z}\, e^{rT}.$$
Discounting, we obtain an upper bound for the value of the option at time $t = 0$:
$$V_0 = e^{-rT} E[V_T] \leq \frac{K S_0}{z}.$$
In order to obtain a lower bound, we write the integral away from the singularity at $\tau = 0$ and use that $e^{-u} \leq 1$ for $u \geq 0$:
$$V_0 = e^{-rT} K - e^{-rT} K \int_T^{\infty} \frac{\ln(z/S_0)}{\sigma\sqrt{2\pi\tau^3}}\, e^{-\frac{(\ln(z/S_0) - \alpha\tau)^2}{2\sigma^2\tau}}\, d\tau > e^{-rT} K - e^{-rT} K \int_T^{\infty} \frac{\ln(z/S_0)}{\sigma\sqrt{2\pi\tau^3}}\, d\tau = e^{-rT} K\left(1 - \frac{1}{\sigma}\sqrt{\frac{2}{\pi T}}\, \ln\frac{z}{S_0}\right).$$
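The bounds can be verified numerically by evaluating the integral in (13.15.25) with a simple quadrature; the parameter values below are hypothetical, and the midpoint rule is used to avoid the singularity of the integrand at $\tau = 0$.

```python
import math

def barrier_hit_value(S0, z, K, r, sigma, T, n=100_000):
    """Evaluate the all-or-nothing lookback price (13.15.25) by the
    midpoint rule, together with the bounds of Proposition 13.15.1."""
    x = math.log(z / S0)
    alpha = r - 0.5 * sigma**2
    h = T / n
    integral = 0.0
    for i in range(n):
        tau = (i + 0.5) * h                    # midpoint avoids tau = 0
        integral += (x / (sigma * math.sqrt(2 * math.pi * tau**3))
                     * math.exp(-(x - alpha * tau)**2 / (2 * sigma**2 * tau))) * h
    V0 = math.exp(-r * T) * K * integral
    lower = math.exp(-r * T) * K * (1 - x * math.sqrt(2 / (math.pi * T)) / sigma)
    upper = K * S0 / z
    return lower, V0, upper

print(barrier_hit_value(S0=100, z=120, K=1.0, r=0.08, sigma=0.3, T=1.0))
```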
Exercise 13.15.2 Find the value of an asset-or-nothing look-back option whose payoff is
$$V_T = \begin{cases} \overline{S}_T, & \text{if } \overline{S}_T \geq K \\ 0, & \text{otherwise}. \end{cases}$$
$K$ denotes the strike price.

Exercise 13.15.3 Find the value of a price call look-back option with the payoff given by
$$V_T = \begin{cases} \overline{S}_T - K, & \text{if } \overline{S}_T \geq K \\ 0, & \text{otherwise}. \end{cases}$$

Exercise 13.15.4 Find the value of a price put look-back option whose payoff is given by
$$V_T = \begin{cases} K - \overline{S}_T, & \text{if } \overline{S}_T < K \\ 0, & \text{otherwise}. \end{cases}$$

Exercise 13.15.5 Evaluate the following strike look-back options, where $\underline{S}_T = \min_{t \leq T} S_t$ denotes the running minimum of the stock:
(a) $V_T = S_T - \underline{S}_T$ (call);
(b) $V_T = \overline{S}_T - S_T$ (put).
It is worth noting that $(S_T - \underline{S}_T)^+ = S_T - \underline{S}_T$; this explains why the payoff is not given as a piecewise function.
Hint: see p. 739 of McDonald for a closed-form formula.
Exercise 13.15.6 Find a put-call parity for look-back options.

Exercise 13.15.7 Starting from the formula (see K&S p. 265)
$$P\Big(\sup_{t \leq T}(W_t - \theta t) \geq \beta\Big) = 1 - N\Big(\frac{\beta + \theta T}{\sqrt{T}}\Big) + e^{-2\theta\beta}\, N\Big(\frac{\theta T - \beta}{\sqrt{T}}\Big),$$
with $\beta > 0$, $\theta \in \mathbb{R}$ and $N(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^x e^{-z^2/2}\, dz$, calculate the price at $t = 0$ of a contract that pays $K$ at time $T$ if the stock price $S_t$ did ever reach or exceed the level $z$ before time $T$.
13.16 Perpetual Look-back Options

Consider a contract that pays $K$ at the time the stock price $S_t$ reaches level $b$ for the first time. What is the price of such a contract at any time $t \geq 0$?

This type of contract is called perpetual look-back because the time horizon is infinite, i.e. there is no expiration date for the contract.

The contract pays $K$ at time $\tau_b$, where $\tau_b = \inf\{t > 0;\; S_t \geq b\}$. The value at time $t = 0$ is obtained by discounting at the risk-free interest rate $r$:
$$V_0 = E\big[K e^{-r\tau_b}\big] = K\, E\big[e^{-r\tau_b}\big]. \tag{13.16.26}$$
This expectation will be worked out using formula (3.6.11). First, we notice that the time $\tau_b$ at which $S_t = b$ is the same as the time for which $S_0\, e^{(r - \sigma^2/2)t + \sigma W_t} = b$, or equivalently,
$$\Big(r - \frac{\sigma^2}{2}\Big)t + \sigma W_t = \ln\frac{b}{S_0}.$$
Applying Proposition 3.6.4 with $\mu = r - \sigma^2/2$, $s = r$, and $x = \ln(b/S_0)$ yields
$$E\big[e^{-r\tau_b}\big] = e^{\frac{1}{\sigma^2}\big(r - \frac{\sigma^2}{2} - \sqrt{2r\sigma^2 + (r - \frac{\sigma^2}{2})^2}\big)\ln\frac{b}{S_0}} = \Big(\frac{b}{S_0}\Big)^{\big(r - \frac{\sigma^2}{2} - \sqrt{2r\sigma^2 + (r - \frac{\sigma^2}{2})^2}\big)/\sigma^2}.$$
Substituting in (13.16.26) yields the price of the perpetual look-back option at $t = 0$:
$$V_0 = K\Big(\frac{b}{S_0}\Big)^{\big(r - \frac{\sigma^2}{2} - \sqrt{2r\sigma^2 + (r - \frac{\sigma^2}{2})^2}\big)/\sigma^2}. \tag{13.16.27}$$
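A quick numerical sketch of (13.16.27) with hypothetical parameters; since the exponent is negative, the price decreases as the barrier $b$ grows:

```python
import math

def perpetual_lookback(K, S0, b, r, sigma):
    # price of a contract paying K the first time S_t reaches b  (13.16.27)
    mu = r - 0.5 * sigma**2
    expo = (mu - math.sqrt(2 * r * sigma**2 + mu**2)) / sigma**2
    return K * (b / S0) ** expo

for b in (110, 120, 150):
    print(b, perpetual_lookback(K=1.0, S0=100, b=b, r=0.05, sigma=0.2))
```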
Exercise 13.16.1 Find the price of a contract that pays $K$ when the stock price doubles its initial value.
13.17 Immediate Rebate Options

This contract pays the amount $K$ at the time $T_a$ when the barrier is hit, provided this happens before maturity. The discounted payoff at $t = 0$ is
$$f_0 = \begin{cases} K e^{-rT_a}, & \text{if } T_a \leq T \\ 0, & \text{otherwise}. \end{cases}$$
The price of the contract at $t = 0$ is
$$V_0 = E[f_0] = E[f_0 \mid T_a < T]\, P(T_a < T) + \underbrace{E[f_0 \mid T_a \geq T]}_{=0}\, P(T_a \geq T) = K\, E\big[e^{-rT_a}\, 1_{\{T_a < T\}}\big] = K \int_0^T e^{-r\tau}\, p_a(\tau)\, d\tau,$$
where
$$p_a(\tau) = \frac{\ln\frac{a}{S_0}}{\sigma\sqrt{2\pi\tau^3}}\, e^{-\big(\ln\frac{a}{S_0} - \alpha\tau\big)^2/(2\sigma^2\tau)},$$
and $\alpha = r - \frac{\sigma^2}{2}$. Question: why is formula (22.20) in McDonald different? Are the two formulas equivalent?
13.18 Deferred Rebate Options

This contract pays $K$ provided a certain barrier has been hit before maturity. It is called deferred because the payment $K$ is made only at the expiration time $T$. If $T_a$ is the first time the stock reaches the barrier $a$, the payoff can be formalized as
$$V_T = \begin{cases} K, & \text{if } T_a < T \\ 0, & \text{if } T_a \geq T. \end{cases}$$
The value of the contract at time $t = 0$ is
$$V_0 = e^{-rT} E[V_T] = e^{-rT} K\, P(T_a < T) = K e^{-rT} \ln\frac{a}{S_0} \int_0^T \frac{1}{\sigma\sqrt{2\pi\tau^3}}\, e^{-\big(\ln\frac{a}{S_0} - \alpha\tau\big)^2/(2\sigma^2\tau)}\, d\tau,$$
with $\alpha = r - \frac{\sigma^2}{2}$.
Chapter 14
Martingale Measures

14.1 Martingale Measures

An $\mathcal{F}_t$-predictable stochastic process $X_t$ on the probability space $(\Omega, \mathcal{F}, P)$ is not always a martingale. However, it might become a martingale with respect to another probability measure $Q$ on $\mathcal{F}$. This is called a martingale measure. The main result of this section is finding a martingale measure with respect to which the discounted stock price is a martingale. This measure plays an important role in the mathematical explanation of risk-neutral valuation.

14.1.1 Is the stock price $S_t$ a martingale?

Since the stock price $S_t$ is an $\mathcal{F}_t$-predictable and non-explosive process, the only condition which needs to be satisfied to be an $\mathcal{F}_t$-martingale is
$$E[S_t \mid \mathcal{F}_u] = S_u, \qquad \forall u < t. \tag{14.1.1}$$
Heuristically speaking, this means that given all information in the market at time $u$, $\mathcal{F}_u$, the expected price of any future stock price is the price of the stock at time $u$, i.e. $S_u$. This does not make sense, since in this case the investor would prefer investing the money in a bank at a risk-free interest rate, rather than buying a stock with zero expected return. Then (14.1.1) does not hold. The next result shows how to fix this problem.
Proposition 14.1.1 Let $\mu$ be the rate of return of the stock $S_t$. Then
$$E[e^{-\mu t} S_t \mid \mathcal{F}_u] = e^{-\mu u} S_u, \qquad u < t, \tag{14.1.2}$$
i.e. $e^{-\mu t} S_t$ is an $\mathcal{F}_t$-martingale.

Proof: The process $e^{-\mu t} S_t$ is non-explosive since
$$E\big[|e^{-\mu t} S_t|\big] = e^{-\mu t} E[S_t] = e^{-\mu t} S_0\, e^{\mu t} = S_0 < \infty.$$
Since $S_t$ is $\mathcal{F}_t$-predictable, so is $e^{-\mu t} S_t$. Using formula (12.1.2) and taking out the predictable part yields
$$E[S_t \mid \mathcal{F}_u] = E\big[S_0\, e^{(\mu - \frac{1}{2}\sigma^2)t + \sigma W_t} \,\big|\, \mathcal{F}_u\big] = E\big[S_0\, e^{(\mu - \frac{1}{2}\sigma^2)u + \sigma W_u}\, e^{(\mu - \frac{1}{2}\sigma^2)(t-u) + \sigma(W_t - W_u)} \,\big|\, \mathcal{F}_u\big]$$
$$= E\big[S_u\, e^{(\mu - \frac{1}{2}\sigma^2)(t-u) + \sigma(W_t - W_u)} \,\big|\, \mathcal{F}_u\big] = S_u\, e^{(\mu - \frac{1}{2}\sigma^2)(t-u)}\, E\big[e^{\sigma(W_t - W_u)} \,\big|\, \mathcal{F}_u\big]. \tag{14.1.3}$$
Since the increment $W_t - W_u$ is independent of all values $W_s$, $s \leq u$, it is also independent of $\mathcal{F}_u$. By Proposition 1.11.4, part 6, the conditional expectation becomes a usual expectation
$$E\big[e^{\sigma(W_t - W_u)} \,\big|\, \mathcal{F}_u\big] = E\big[e^{\sigma(W_t - W_u)}\big].$$
Since $\sigma(W_t - W_u) \sim N\big(0, \sigma^2(t-u)\big)$, from Exercise 1.6.2 (b) we get
$$E\big[e^{\sigma(W_t - W_u)}\big] = e^{\frac{1}{2}\sigma^2(t-u)}.$$
Substituting back into (14.1.3) yields
$$E[S_t \mid \mathcal{F}_u] = S_u\, e^{(\mu - \frac{1}{2}\sigma^2)(t-u)}\, e^{\frac{1}{2}\sigma^2(t-u)} = S_u\, e^{\mu(t-u)},$$
which is equivalent to
$$E[e^{-\mu t} S_t \mid \mathcal{F}_u] = e^{-\mu u} S_u.$$
The conditional expectation $E[S_t \mid \mathcal{F}_u]$ can be expressed in terms of the conditional density function as
$$E[S_t \mid \mathcal{F}_u] = \int_0^{\infty} S_t\, p(S_t \mid \mathcal{F}_u)\, dS_t, \tag{14.1.4}$$
where $S_t$ is taken as an integration variable.

Exercise 14.1.2 (a) Find the formula for the conditional density function $p(S_t \mid \mathcal{F}_u)$ defined by (14.1.4).
(b) Verify the formula
$$E[S_t \mid \mathcal{F}_0] = E[S_t]$$
in two different ways, either by using part (a), or by using the independence of $S_t$ with respect to $\mathcal{F}_0$.

The martingale relation (14.1.2) can be written equivalently as
$$\int e^{-\mu t} S_t\, p(S_t \mid \mathcal{F}_u)\, dS_t = e^{-\mu u} S_u, \qquad u < t.$$
This way, $dP(x) = p(x \mid \mathcal{F}_u)\, dx$ becomes a martingale measure for $e^{-\mu t} S_t$.
14.1.2 Risk-neutral World and Martingale Measure

Since the rate of return $\mu$ might not be known from the beginning, and it depends on each particular stock, a meaningful question would be:

Under what martingale measure does the discounted stock price, $M_t = e^{-rt} S_t$, become a martingale?

The constant $r$ denotes, as usual, the risk-free interest rate. Assume such a martingale measure exists. Then we must have
$$\widetilde{E}_u[e^{-rt} S_t] = \widetilde{E}[e^{-rt} S_t \mid \mathcal{F}_u] = e^{-ru} S_u,$$
where $\widetilde{E}$ denotes the expectation with respect to the requested martingale measure. The previous relation can also be written as
$$e^{-r(t-u)}\, \widetilde{E}[S_t \mid \mathcal{F}_u] = S_u, \qquad u < t.$$
This states that the expectation of the stock price, discounted at the risk-free interest rate over the time interval $t - u$, is the price of the stock, $S_u$. Since this does not involve any of the riskiness of the stock, we may think of it as an expectation in the risk-neutral world. The aforementioned formula can be written in the compound mode as
$$\widetilde{E}[S_t \mid \mathcal{F}_u] = S_u\, e^{r(t-u)}, \qquad u < t. \tag{14.1.5}$$
This formula can be obtained from the conditional expectation $E[S_t \mid \mathcal{F}_u] = S_u\, e^{\mu(t-u)}$ by substituting $\mu = r$ and replacing $E$ by $\widetilde{E}$, which corresponds to the definition of the expectation in a risk-neutral world. Therefore, the valuation of derivatives in section 13 was done by using the aforementioned martingale measure, under which $e^{-rt} S_t$ is a martingale. In the next section we shall determine this measure explicitly.

Exercise 14.1.3 Consider the following two games that consist in flipping a fair coin and taking decisions:
A. If the coin lands Heads, you win \$2; otherwise you lose \$1.
B. If the coin lands Heads, you win \$20,000; otherwise you lose \$10,000.
(a) Which game involves more risk? Explain your answer.
(b) Which game would you choose to play, and why?
(c) Are you risk-neutral in your decision?

The risk-neutral measure is the measure with respect to which investors are not risk-averse. In the case of the previous exercise, it is the measure with respect to which all players are indifferent whether they choose option A or B.
14.1.3 Finding the Risk-Neutral Measure

The solution of the stochastic differential equation of the stock price
$$dS_t = \mu S_t\, dt + \sigma S_t\, dW_t$$
can be written as
$$S_t = S_0\, e^{\mu t}\, e^{\sigma W_t - \frac{1}{2}\sigma^2 t}.$$
Then $e^{-\mu t} S_t = S_0\, e^{\sigma W_t - \frac{1}{2}\sigma^2 t}$ is an exponential process. By Example 9.1.3, particular case 1, this process is an $\mathcal{F}_t$-martingale, where $\mathcal{F}_t = \sigma\{W_u;\, u \leq t\}$ is the information available in the market until time $t$. Hence $e^{-\mu t} S_t$ is a martingale, a result proved also by Proposition 14.1.1. The probability space where this martingale exists is $(\Omega, \mathcal{F}, P)$.

In the following we shall change the rate of return $\mu$ into the risk-free rate $r$ and change the probability measure such that the discounted stock price becomes a martingale. The discounted stock price can be expressed in terms of the Brownian motion with drift (a hat was added in order to distinguish it from the former Brownian motion $W_t$)
$$\widehat{W}_t = \frac{\mu - r}{\sigma}\, t + W_t \tag{14.1.6}$$
as in the following
$$e^{-rt} S_t = e^{-rt} S_0\, e^{\mu t}\, e^{\sigma W_t - \frac{1}{2}\sigma^2 t} = S_0\, e^{\sigma \widehat{W}_t - \frac{1}{2}\sigma^2 t}.$$
If we let $\lambda = \frac{\mu - r}{\sigma}$ in Corollary 9.2.4 of Girsanov's theorem, it follows that $\widehat{W}_t$ is a Brownian motion on the probability space $(\Omega, \mathcal{F}, Q)$, where
$$dQ = e^{-\frac{1}{2}\left(\frac{\mu - r}{\sigma}\right)^2 T - \frac{\mu - r}{\sigma} W_T}\, dP.$$
As an exponential process, $e^{\sigma \widehat{W}_t - \frac{1}{2}\sigma^2 t}$ becomes a martingale on this space. Consequently, $e^{-rt} S_t$ is a martingale process with respect to the probability measure $Q$. This means
$$E_Q[e^{-rt} S_t \mid \mathcal{F}_u] = e^{-ru} S_u, \qquad u < t,$$
where $E_Q[\,\cdot \mid \mathcal{F}_u]$ denotes the conditional expectation in the measure $Q$, and it is given by
$$E_Q[X_t \mid \mathcal{F}_u] = E_P\Big[X_t\, e^{-\frac{1}{2}\left(\frac{\mu - r}{\sigma}\right)^2 T - \frac{\mu - r}{\sigma} W_T} \,\Big|\, \mathcal{F}_u\Big].$$
The measure $Q$ is called the equivalent martingale measure, or the risk-neutral measure. The expectation taken with respect to this measure is called the expectation in the risk-neutral world. Customarily we shall use the notations
$$\widetilde{E}[e^{-rt} S_t] = E_Q[e^{-rt} S_t], \qquad \widetilde{E}_u[e^{-rt} S_t] = E_Q[e^{-rt} S_t \mid \mathcal{F}_u].$$
It is worth noting that $\widetilde{E}[e^{-rt} S_t] = \widetilde{E}_0[e^{-rt} S_t]$, since $e^{-rt} S_t$ is independent of the initial information set $\mathcal{F}_0$.
The importance of the process $\widehat{W}_t$ is contained in the following useful result.

Proposition 14.1.4 The probability measure that makes the discounted stock price, $e^{-rt} S_t$, a martingale changes the rate of return $\mu$ into the risk-free interest rate $r$, i.e.
$$dS_t = r S_t\, dt + \sigma S_t\, d\widehat{W}_t.$$
Proof: The proof is a straightforward verification using (14.1.6):
$$dS_t = \mu S_t\, dt + \sigma S_t\, dW_t = r S_t\, dt + (\mu - r) S_t\, dt + \sigma S_t\, dW_t = r S_t\, dt + \sigma S_t\Big(\frac{\mu - r}{\sigma}\, dt + dW_t\Big) = r S_t\, dt + \sigma S_t\, d\widehat{W}_t.$$
We note that the solution of the previous stochastic equation is
$$S_t = S_0\, e^{rt}\, e^{\sigma \widehat{W}_t - \frac{1}{2}\sigma^2 t}.$$
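The martingale property under $Q$ can be illustrated with a short Monte Carlo sketch (hypothetical parameters): simulating with $\widehat{W}_t$ as the driving Brownian motion recovers $\widetilde{E}[e^{-rt} S_t] = S_0$, while the same expectation under $P$ is biased upward.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical parameters
S0, mu, r, sigma, t = 100.0, 0.12, 0.05, 0.3, 2.0
n = 500_000

# Under Q, W_hat is a Brownian motion and S_t = S0 e^{rt} e^{sigma W_hat - sigma^2 t/2}
W_hat = np.sqrt(t) * rng.standard_normal(n)
S_t = S0 * np.exp(r * t) * np.exp(sigma * W_hat - 0.5 * sigma**2 * t)
print(np.exp(-r * t) * S_t.mean())    # close to S0 = 100

# Under the physical measure P the discounted price drifts away from S0:
W = np.sqrt(t) * rng.standard_normal(n)
S_t_P = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)
print(np.exp(-r * t) * S_t_P.mean())  # close to S0 * exp((mu - r) t), about 115
```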
Exercise 14.1.5 Assume $\mu \neq r$ and let $u < t$.
(a) Find $E_P[e^{-rt} S_t \mid \mathcal{F}_u]$ and show that $e^{-rt} S_t$ is not a martingale with respect to the probability measure $P$.
(b) Find $E_Q[e^{-\mu t} S_t \mid \mathcal{F}_u]$ and show that $e^{-\mu t} S_t$ is not a martingale with respect to the probability measure $Q$.
14.2 Risk-neutral World Density Functions

The purpose of this section is to establish formulas for the densities of the Brownian motions $W_t$ and $\widehat{W}_t$ with respect to both probability measures $P$ and $Q$, and to discuss their relationship. This will clear some confusion that appears in practical applications when we need to choose the right probability density. Throughout, write $\eta = \frac{\mu - r}{\sigma}$ for the drift in (14.1.6).

The densities of $W_t$ and $\widehat{W}_t$ with respect to $P$ and $Q$ will be denoted respectively by $p_P$, $p_Q$ and $\widehat{p}_P$, $\widehat{p}_Q$. Since $W_t$ and $\widehat{W}_t$ are Brownian motions on the spaces $(\Omega, \mathcal{F}, P)$ and $(\Omega, \mathcal{F}, Q)$, respectively, they have the following normal probability densities
$$p_P(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{x^2}{2t}} = p(x); \qquad \widehat{p}_Q(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{x^2}{2t}} = p(x).$$
The associated distribution functions are
$$F^P_{W_t}(x) = P(W_t \leq x) = \int_{\{W_t \leq x\}} dP(\omega) = \int_{-\infty}^x p(u)\, du;$$
$$F^Q_{\widehat{W}_t}(x) = Q(\widehat{W}_t \leq x) = \int_{\{\widehat{W}_t \leq x\}} dQ(\omega) = \int_{-\infty}^x p(u)\, du.$$
Expressing $W_t$ in terms of $\widehat{W}_t$ and using that $\widehat{W}_t$ is normally distributed with respect to $Q$, we get the distribution function of $W_t$ with respect to $Q$ as
$$F^Q_{W_t}(x) = Q(W_t \leq x) = Q(\widehat{W}_t - \eta t \leq x) = Q(\widehat{W}_t \leq x + \eta t) = \int_{\{\widehat{W}_t \leq x + \eta t\}} dQ(\omega) = \int_{-\infty}^{x + \eta t} p(y)\, dy = \int_{-\infty}^{x + \eta t} \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{y^2}{2t}}\, dy.$$
Differentiating yields the density function
$$p_Q(x) = \frac{d}{dx} F^Q_{W_t}(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{1}{2t}(x + \eta t)^2}.$$
It is worth noting that $p_Q(x)$ can be decomposed as
$$p_Q(x) = e^{-\eta x - \frac{1}{2}\eta^2 t}\, p(x),$$
which makes the connection with the Girsanov theorem.

The distribution function of $\widehat{W}_t$ with respect to $P$ can be worked out in a similar way:
$$F^P_{\widehat{W}_t}(x) = P(\widehat{W}_t \leq x) = P(W_t + \eta t \leq x) = P(W_t \leq x - \eta t) = \int_{\{W_t \leq x - \eta t\}} dP(\omega) = \int_{-\infty}^{x - \eta t} p(y)\, dy = \int_{-\infty}^{x - \eta t} \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{y^2}{2t}}\, dy,$$
so the density function is
$$\widehat{p}_P(x) = \frac{d}{dx} F^P_{\widehat{W}_t}(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{1}{2t}(x - \eta t)^2}.$$
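The decomposition above is easy to confirm numerically; a minimal sketch with hypothetical parameters:

```python
import numpy as np

# Numerical check of p_Q(x) = exp(-eta x - eta^2 t / 2) p(x)
mu, r, sigma, t = 0.12, 0.05, 0.3, 1.5
eta = (mu - r) / sigma

x = np.linspace(-3, 3, 7)
p  = np.exp(-x**2 / (2 * t)) / np.sqrt(2 * np.pi * t)              # W_t density under P
pQ = np.exp(-(x + eta * t)**2 / (2 * t)) / np.sqrt(2 * np.pi * t)  # W_t density under Q

print(np.allclose(pQ, np.exp(-eta * x - 0.5 * eta**2 * t) * p))    # True
```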
14.3 Correlation of Stocks

Consider two stock prices driven by the same novelty term
$$dS_1 = \mu_1 S_1\, dt + \sigma_1 S_1\, dW_t \tag{14.3.7}$$
$$dS_2 = \mu_2 S_2\, dt + \sigma_2 S_2\, dW_t. \tag{14.3.8}$$
Since the underlying Brownian motions are perfectly correlated, one may be tempted to think that the stock prices $S_1$ and $S_2$ are also perfectly correlated. The following result shows that in general the stock prices are only positively correlated:

Proposition 14.3.1 The correlation coefficient between the stock prices $S_1$ and $S_2$ driven by the same Brownian motion is
$$Corr(S_1, S_2) = \frac{e^{\sigma_1\sigma_2 t} - 1}{(e^{\sigma_1^2 t} - 1)^{1/2}\, (e^{\sigma_2^2 t} - 1)^{1/2}} > 0.$$
In particular, if $\sigma_1 = \sigma_2$, then $Corr(S_1, S_2) = 1$.
Proof: Since
$$S_1(t) = S_1(0)\, e^{\mu_1 t - \frac{1}{2}\sigma_1^2 t}\, e^{\sigma_1 W_t}, \qquad S_2(t) = S_2(0)\, e^{\mu_2 t - \frac{1}{2}\sigma_2^2 t}\, e^{\sigma_2 W_t},$$
from Exercise 14.3.4 and the formula $E[e^{k W_t}] = e^{k^2 t/2}$ we have
$$Corr(S_1, S_2) = Corr(e^{\sigma_1 W_t}, e^{\sigma_2 W_t}) = \frac{Cov(e^{\sigma_1 W_t}, e^{\sigma_2 W_t})}{\sqrt{Var(e^{\sigma_1 W_t})\, Var(e^{\sigma_2 W_t})}} = \frac{E[e^{(\sigma_1 + \sigma_2) W_t}] - E[e^{\sigma_1 W_t}]\, E[e^{\sigma_2 W_t}]}{\sqrt{Var(e^{\sigma_1 W_t})\, Var(e^{\sigma_2 W_t})}}$$
$$= \frac{e^{\frac{1}{2}(\sigma_1 + \sigma_2)^2 t} - e^{\frac{1}{2}\sigma_1^2 t}\, e^{\frac{1}{2}\sigma_2^2 t}}{\big[e^{\sigma_1^2 t}(e^{\sigma_1^2 t} - 1)\, e^{\sigma_2^2 t}(e^{\sigma_2^2 t} - 1)\big]^{1/2}} = \frac{e^{\sigma_1\sigma_2 t} - 1}{(e^{\sigma_1^2 t} - 1)^{1/2}\, (e^{\sigma_2^2 t} - 1)^{1/2}}.$$
If $\sigma_1 = \sigma_2 = \sigma$ then the previous formula provides
$$Corr(S_1, S_2) = \frac{e^{\sigma^2 t} - 1}{e^{\sigma^2 t} - 1} = 1,$$
i.e. the stocks are perfectly correlated if they have the same volatility.

Corollary 14.3.2 The stock prices $S_1$ and $S_2$ are strongly positively correlated for small values of $t$:
$$Corr(S_1, S_2) \to 1 \quad \text{as } t \to 0.$$
Figure 14.1: The correlation function $f(t) = \dfrac{e^{\sigma_1\sigma_2 t} - 1}{(e^{\sigma_1^2 t} - 1)^{1/2}\,(e^{\sigma_2^2 t} - 1)^{1/2}}$ in the case $\sigma_1 = 0.15$, $\sigma_2 = 0.40$.

This fact has the following financial interpretation. If some stocks are driven by the same unpredictable news, when one stock increases, then the other one tends to increase too, at least for a small amount of time. In the case when some bad news affects an entire financial market, the risk becomes systemic, and hence if one stock fails, all the others tend to decrease as well, leading to a severe strain on the financial market.
Corollary 14.3.3 The stock price correlation gets weak as $t$ gets large:
$$Corr(S_1, S_2) \to 0 \quad \text{as } t \to \infty.$$
Proof: It follows from the asymptotic correspondence
$$\frac{e^{\sigma_1\sigma_2 t} - 1}{(e^{\sigma_1^2 t} - 1)^{1/2}\,(e^{\sigma_2^2 t} - 1)^{1/2}} \sim \frac{e^{\sigma_1\sigma_2 t}}{e^{\frac{\sigma_1^2 + \sigma_2^2}{2} t}} = e^{-\frac{(\sigma_1 - \sigma_2)^2}{2} t} \to 0, \qquad t \to \infty.$$
It follows that in the long run any two stocks tend to become uncorrelated, see Fig. 14.1.
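The closed-form correlation can be compared with a Monte Carlo estimate; a short sketch using the volatilities of Fig. 14.1 and a hypothetical time horizon:

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo check of Proposition 14.3.1
sigma1, sigma2, t, n = 0.15, 0.40, 5.0, 1_000_000

W = np.sqrt(t) * rng.standard_normal(n)          # common Brownian driver
X1, X2 = np.exp(sigma1 * W), np.exp(sigma2 * W)  # drifts cancel in Corr

mc_corr = np.corrcoef(X1, X2)[0, 1]
exact = (np.exp(sigma1 * sigma2 * t) - 1) / np.sqrt(
    (np.exp(sigma1**2 * t) - 1) * (np.exp(sigma2**2 * t) - 1))
print(mc_corr, exact)   # the two numbers should agree closely
```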
Exercise 14.3.4 If $X$ and $Y$ are random variables and $\alpha, \beta \in \mathbb{R}$, show that
$$Corr(\alpha X, \beta Y) = \begin{cases} Corr(X, Y), & \text{if } \alpha\beta > 0 \\ -Corr(X, Y), & \text{if } \alpha\beta < 0. \end{cases}$$

Exercise 14.3.5 Find the following:
(a) $Cov\big(dS_1(t), dS_2(t)\big)$;
(b) $Corr\big(dS_1(t), dS_2(t)\big)$.
14.4 Self-financing Portfolios

The value of a portfolio that contains $\theta_j(t)$ units of stock $S_j(t)$ at time $t$ is given by
$$V(t) = \sum_{j=1}^n \theta_j(t)\, S_j(t).$$
The portfolio is called self-financing if
$$dV(t) = \sum_{j=1}^n \theta_j(t)\, dS_j(t).$$
This means that an infinitesimal change in the value of the portfolio is due only to infinitesimal changes in the stock values. All portfolios will be assumed self-financing, unless otherwise stated.
14.5 The Sharpe Ratio

If $\mu$ is the expected return on the stock $S_t$, the risk premium is defined as the difference $\mu - r$, where $r$ is the risk-free interest rate. The Sharpe ratio, $\eta$, is the quotient between the risk premium and the stock price volatility
$$\eta = \frac{\mu - r}{\sigma}.$$
The following result shows that the Sharpe ratio is an important invariant for the family of stocks driven by the same uncertainty source. It is also known under the name of the market price of risk for assets.

Proposition 14.5.1 Let $S_1$ and $S_2$ be two stocks satisfying equations (14.3.7)–(14.3.8). Then their Sharpe ratios are equal:
$$\frac{\mu_1 - r}{\sigma_1} = \frac{\mu_2 - r}{\sigma_2}. \tag{14.5.9}$$
Proof: Eliminating the term $dW_t$ from equations (14.3.7)–(14.3.8) yields
$$\frac{\sigma_2}{S_1}\, dS_1 - \frac{\sigma_1}{S_2}\, dS_2 = (\mu_1\sigma_2 - \mu_2\sigma_1)\, dt. \tag{14.5.10}$$
Consider the portfolio $P(t) = \theta_1(t) S_1(t) - \theta_2(t) S_2(t)$, with $\theta_1(t) = \frac{\sigma_2}{S_1(t)}$ and $\theta_2(t) = \frac{\sigma_1}{S_2(t)}$. Using the properties of self-financing portfolios, we have
$$dP(t) = \theta_1(t)\, dS_1(t) - \theta_2(t)\, dS_2(t) = \frac{\sigma_2}{S_1}\, dS_1 - \frac{\sigma_1}{S_2}\, dS_2.$$
Substituting in (14.5.10) yields $dP = (\mu_1\sigma_2 - \mu_2\sigma_1)\, dt$, i.e. $P$ is a risk-less portfolio. Since the portfolio earns interest at the risk-free interest rate, we have $dP = rP\, dt$. Then equating the coefficients of $dt$ yields
$$\mu_1\sigma_2 - \mu_2\sigma_1 = r P(t).$$
Using the definition of $P(t)$, the previous relation becomes
$$\mu_1\sigma_2 - \mu_2\sigma_1 = r\theta_1 S_1 - r\theta_2 S_2 = r\sigma_2 - r\sigma_1,$$
that is, $\sigma_2(\mu_1 - r) = \sigma_1(\mu_2 - r)$, which is equivalent to (14.5.9).

Using Proposition 14.1.4, relations (14.3.7)–(14.3.8) can be written as
$$dS_1 = r S_1\, dt + \sigma_1 S_1\, d\widehat{W}_t, \qquad dS_2 = r S_2\, dt + \sigma_2 S_2\, d\widehat{W}_t,$$
where the risk-neutral process $d\widehat{W}_t$ is the same in both equations:
$$d\widehat{W}_t = \frac{\mu_1 - r}{\sigma_1}\, dt + dW_t = \frac{\mu_2 - r}{\sigma_2}\, dt + dW_t.$$
This shows that the process $\widehat{W}_t$ is a Brownian motion with drift, where the drift is the Sharpe ratio.
14.6 Risk-neutral Valuation for Derivatives

The risk-neutral process $d\widehat{W}_t$ plays an important role in the risk-neutral valuation of derivatives. In this section we shall prove that if $f_T$ is the price of a derivative at the maturity time, then $f_t = \widetilde{E}[e^{-r(T-t)} f_T \mid \mathcal{F}_t]$ is the price of the derivative at the time $t$, for any $t < T$.

In other words, the discounted expected payoff of a derivative in the risk-neutral world is the price of the derivative at the earlier instant of time. This is based on the fact that $e^{-rt} f_t$ is an $\mathcal{F}_t$-martingale with respect to the risk-neutral measure $Q$ introduced previously.

In particular, the idea of the proof can be applied to the stock $S_t$. Applying the product rule
$$d(e^{-rt} S_t) = d(e^{-rt}) S_t + e^{-rt}\, dS_t + \underbrace{d(e^{-rt})\, dS_t}_{=0} = -r e^{-rt} S_t\, dt + e^{-rt}(r S_t\, dt + \sigma S_t\, d\widehat{W}_t) = \sigma e^{-rt} S_t\, d\widehat{W}_t.$$
If $u < t$, integrating between $u$ and $t$,
$$e^{-rt} S_t = e^{-ru} S_u + \int_u^t \sigma e^{-rs} S_s\, d\widehat{W}_s,$$
and taking the risk-neutral expectation with respect to the information set $\mathcal{F}_u$ yields
$$\widetilde{E}[e^{-rt} S_t \mid \mathcal{F}_u] = \widetilde{E}\Big[e^{-ru} S_u + \int_u^t \sigma e^{-rs} S_s\, d\widehat{W}_s \,\Big|\, \mathcal{F}_u\Big] = e^{-ru} S_u + \widetilde{E}\Big[\int_u^t \sigma e^{-rs} S_s\, d\widehat{W}_s \,\Big|\, \mathcal{F}_u\Big] = e^{-ru} S_u + \widetilde{E}\Big[\int_u^t \sigma e^{-rs} S_s\, d\widehat{W}_s\Big] = e^{-ru} S_u,$$
since the stochastic integral $\int_u^t \sigma e^{-rs} S_s\, d\widehat{W}_s$ is independent of $\mathcal{F}_u$ and has zero mean. It follows that $e^{-rt} S_t$ is an $\mathcal{F}_t$-martingale in the risk-neutral world. The following fundamental result can be shown using a proof similar to the one encountered previously:
Theorem 14.6.1 If $f_t = f(t, S_t)$ is the price of a derivative at time $t$, then $e^{-rt} f_t$ is an $\mathcal{F}_t$-martingale in the risk-neutral world, i.e.
$$\widetilde{E}[e^{-rt} f_t \mid \mathcal{F}_u] = e^{-ru} f_u, \qquad 0 < u < t.$$
Proof: Using Ito's formula and the risk-neutral process $dS = rS\, dt + \sigma S\, d\widehat{W}_t$, the process followed by $f_t$ is
$$df_t = \frac{\partial f}{\partial t}\, dt + \frac{\partial f}{\partial S}\, dS + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}(dS)^2 = \frac{\partial f}{\partial t}\, dt + \frac{\partial f}{\partial S}(rS\, dt + \sigma S\, d\widehat{W}_t) + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 f}{\partial S^2}\, dt$$
$$= \Big(\frac{\partial f}{\partial t} + rS\frac{\partial f}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 f}{\partial S^2}\Big)\, dt + \sigma S\frac{\partial f}{\partial S}\, d\widehat{W}_t = r f\, dt + \sigma S\frac{\partial f}{\partial S}\, d\widehat{W}_t,$$
where in the last identity we used that $f$ satisfies the Black-Scholes equation. Applying the product rule we obtain
$$d(e^{-rt} f_t) = d(e^{-rt}) f_t + e^{-rt}\, df_t + \underbrace{d(e^{-rt})\, df_t}_{=0} = -r e^{-rt} f_t\, dt + e^{-rt}\Big(r f_t\, dt + \sigma S\frac{\partial f}{\partial S}\, d\widehat{W}_t\Big) = \sigma e^{-rt} S\frac{\partial f}{\partial S}\, d\widehat{W}_t.$$
Integrating between $u$ and $t$ we get
$$e^{-rt} f_t = e^{-ru} f_u + \int_u^t \sigma e^{-rs} S\frac{\partial f_s}{\partial S}\, d\widehat{W}_s,$$
which assures that $e^{-rt} f_t$ is a martingale, since $\widehat{W}_s$ is a Brownian motion process. Using that $e^{-ru} f_u$ is $\mathcal{F}_u$-predictable, and that $\int_u^t \sigma e^{-rs} S\frac{\partial f_s}{\partial S}\, d\widehat{W}_s$ is independent of the information set $\mathcal{F}_u$, we have
$$\widetilde{E}[e^{-rt} f_t \mid \mathcal{F}_u] = e^{-ru} f_u + \widetilde{E}\Big[\int_u^t \sigma e^{-rs} S\frac{\partial f_s}{\partial S}\, d\widehat{W}_s\Big] = e^{-ru} f_u.$$
Exercise 14.6.2 Show the following:
(a) $\widetilde{E}\big[e^{\sigma(W_t - W_u)} \mid \mathcal{F}_u\big] = e^{(r - \mu + \frac{1}{2}\sigma^2)(t-u)}$, $u < t$;
(b) $\widetilde{E}\big[\frac{S_t}{S_u} \mid \mathcal{F}_u\big] = e^{(\mu - \frac{1}{2}\sigma^2)(t-u)}\, \widetilde{E}\big[e^{\sigma(W_t - W_u)} \mid \mathcal{F}_u\big]$, $u < t$;
(c) $\widetilde{E}\big[\frac{S_t}{S_u} \mid \mathcal{F}_u\big] = e^{r(t-u)}$, $u < t$.

Exercise 14.6.3 Find the following risk-neutral world conditional expectations:
(a) $\widetilde{E}\big[\int_0^t S_u\, du \mid \mathcal{F}_s\big]$, $s < t$;
(b) $\widetilde{E}\big[S_t \int_0^t S_u\, du \mid \mathcal{F}_s\big]$, $s < t$;
(c) $\widetilde{E}\big[\int_0^t S_u\, dW_u \mid \mathcal{F}_s\big]$, $s < t$;
(d) $\widetilde{E}\big[S_t \int_0^t S_u\, dW_u \mid \mathcal{F}_s\big]$, $s < t$;
(e) $\widetilde{E}\big[\big(\int_0^t S_u\, du\big)^2 \mid \mathcal{F}_s\big]$, $s < t$.

Exercise 14.6.4 Use risk-neutral valuation to find the price of a derivative that pays at maturity the following payoffs:
(a) $f_T = T S_T$;
(b) $f_T = \int_0^T S_u\, du$;
(c) $f_T = \int_0^T S_u\, dW_u$.
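As an illustration of the risk-neutral valuation principle, the following sketch prices the payoff $f_T = \int_0^T S_u\, du$ of Exercise 14.6.4(b) by Monte Carlo and compares it with the closed form $S_0(1 - e^{-rT})/r$ that follows from $\widetilde{E}[S_u] = S_0 e^{ru}$; the parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

S0, r, sigma, T = 100.0, 0.05, 0.2, 1.0
n_paths, n_steps = 20_000, 250
dt = T / n_steps

# simulate S under the risk-neutral dynamics dS = r S dt + sigma S dW_hat
Z = rng.standard_normal((n_paths, n_steps))
logS = np.log(S0) + np.cumsum((r - 0.5 * sigma**2) * dt
                              + sigma * np.sqrt(dt) * Z, axis=1)
S = np.exp(logS)
integral = S.sum(axis=1) * dt                 # Riemann sum for int_0^T S_u du

mc_price = np.exp(-r * T) * integral.mean()
closed_form = S0 * (1 - np.exp(-r * T)) / r   # e^{-rT} int_0^T S0 e^{ru} du
print(mc_price, closed_form)
```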
Chapter 15
Black-Scholes Analysis

15.1 Heat Equation

This section is devoted to a basic discussion of the heat equation. Its importance resides in the remarkable fact that the Black-Scholes equation, which is the main equation of derivatives calculus, can be reduced to this type of equation.

Let $u(\tau, x)$ denote the temperature in an infinite rod at point $x$ and time $\tau$. In the absence of exterior heat sources the heat diffuses according to the following parabolic differential equation
$$\frac{\partial u}{\partial \tau} - \frac{\partial^2 u}{\partial x^2} = 0, \tag{15.1.1}$$
called the heat equation. If the initial heat distribution is known and is given by $u(0, x) = f(x)$, then we have an initial value problem for the heat equation.

Solving this equation involves a convolution between the initial temperature $f(x)$ and the fundamental solution of the heat equation $G(\tau, x)$, which will be defined shortly.

Definition 15.1.1 The function
$$G(\tau, x) = \frac{1}{\sqrt{4\pi\tau}}\, e^{-\frac{x^2}{4\tau}}, \qquad \tau > 0,$$
is called the fundamental solution of the heat equation (15.1.1).

We recall the most important properties of the function $G(\tau, x)$.
• $G(\tau, x)$ has the properties of a probability density¹, i.e.
1. $G(\tau, x) > 0$, $\forall x \in \mathbb{R}$, $\tau > 0$;

¹ In fact it is a Gaussian probability density.
Figure 15.1: The function $G(\tau, x)$ tends to the Dirac measure $\delta(x)$ as $\tau \to 0$, and flattens out as $\tau \to \infty$.

2. $\int_{\mathbb{R}} G(\tau, x)\, dx = 1$, $\forall \tau > 0$.

• It satisfies the heat equation
$$\frac{\partial G}{\partial \tau} - \frac{\partial^2 G}{\partial x^2} = 0, \qquad \tau > 0.$$
• $G$ tends to the Dirac measure $\delta(x)$ as $\tau$ gets closer to the initial time:
$$\lim_{\tau \to 0} G(\tau, x) = \delta(x),$$
where the Dirac measure can be defined using integration as
$$\int_{\mathbb{R}} \varphi(x)\, \delta(x)\, dx = \varphi(0),$$
for any smooth function $\varphi$ with compact support. Consequently, we also have
$$\int_{\mathbb{R}} \varphi(x)\, \delta(x - y)\, dx = \varphi(y).$$
One can think of $\delta(x)$ as a measure with infinite value at $x = 0$ and zero for the rest of the values, and with the integral equal to 1, see Fig. 15.1.

The physical significance of the fundamental solution $G(\tau, x)$ is that it describes the heat evolution in the infinite rod after an initial heat impulse of infinite size is applied at $x = 0$.
Proposition 15.1.2 The solution of the initial value heat equation
$$\frac{\partial u}{\partial \tau} - \frac{\partial^2 u}{\partial x^2} = 0, \qquad u(0, x) = f(x)$$
is given by the convolution between the fundamental solution and the initial temperature
$$u(\tau, x) = \int_{\mathbb{R}} G(\tau, y - x)\, f(y)\, dy, \qquad \tau > 0.$$
Proof: Substituting $z = y - x$, the solution can be written as
$$u(\tau, x) = \int_{\mathbb{R}} G(\tau, z)\, f(x + z)\, dz. \tag{15.1.2}$$
Differentiating under the integral yields
$$\frac{\partial u}{\partial \tau} = \int_{\mathbb{R}} \frac{\partial G(\tau, z)}{\partial \tau}\, f(x + z)\, dz,$$
$$\frac{\partial^2 u}{\partial x^2} = \int_{\mathbb{R}} G(\tau, z)\, \frac{\partial^2 f(x+z)}{\partial x^2}\, dz = \int_{\mathbb{R}} G(\tau, z)\, \frac{\partial^2 f(x+z)}{\partial z^2}\, dz = \int_{\mathbb{R}} \frac{\partial^2 G(\tau, z)}{\partial z^2}\, f(x+z)\, dz,$$
where we applied integration by parts twice and used the fact that
$$\lim_{z \to \pm\infty} G(\tau, z) = \lim_{z \to \pm\infty} \frac{\partial G(\tau, z)}{\partial z} = 0.$$
Since $G$ satisfies the heat equation,
$$\frac{\partial u}{\partial \tau} - \frac{\partial^2 u}{\partial x^2} = \int_{\mathbb{R}} \Big(\frac{\partial G(\tau, z)}{\partial \tau} - \frac{\partial^2 G(\tau, z)}{\partial z^2}\Big)\, f(x+z)\, dz = 0.$$
Since the limit and the integral commute², using the properties of the Dirac measure, we have
$$u(0, x) = \lim_{\tau \to 0} u(\tau, x) = \lim_{\tau \to 0} \int_{\mathbb{R}} G(\tau, z)\, f(x+z)\, dz = \int_{\mathbb{R}} \delta(z)\, f(x+z)\, dz = f(x).$$
Hence (15.1.2) satisfies the initial value heat equation.

² This is allowed by the dominated convergence theorem.
It is worth noting that the solution $u(\tau, x) = \int_{\mathbb{R}} G(\tau, y - x)\, f(y)\, dy$ provides the temperature at any point in the rod for any time $\tau > 0$, but it cannot provide the temperature for $\tau < 0$, because of the singularity the fundamental solution exhibits at $\tau = 0$. We can reformulate this by saying that the heat equation is semi-deterministic, in the sense that given the present, we can know the future but not the past.

The semi-deterministic character of diffusion phenomena can be exemplified with a drop of ink which starts diffusing in a bucket of water at time $t = 0$. We can determine the density of the ink at any time $t > 0$ at any point $x$ in the bucket. However, given the density of ink at a time $t > 0$, it is not possible to trace back in time the ink distribution density and to find the initial point where the drop started its diffusion.

The semi-deterministic behavior occurs in the study of derivatives too. In the case of the Black-Scholes equation, which is a backwards heat equation³, given the present value of the derivative, we can find the past values but not the future ones. This is the capital difficulty in foreseeing the prices of stock market instruments from the present prices. This difficulty will be overcome by working the price backwards from the given final condition, which is the payoff at maturity.

³ This comes from the fact that at some point $\tau$ becomes $T - t$ due to a substitution.
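A numerical illustration of Proposition 15.1.2, with a hypothetical step-function initial temperature: the convolution is computed by a Riemann sum and the heat equation is then checked by finite differences.

```python
import numpy as np

tau = 0.1
x = np.linspace(-5, 5, 2001)
dx = x[1] - x[0]
f = (np.abs(x) < 1).astype(float)       # initial temperature u(0, x)

def G(tau, x):
    # fundamental solution of the heat equation (Definition 15.1.1)
    return np.exp(-x**2 / (4 * tau)) / np.sqrt(4 * np.pi * tau)

# u(tau, x) = int G(tau, y - x) f(y) dy, approximated by a Riemann sum
u = np.array([np.sum(G(tau, x - xi) * f) * dx for xi in x])

# check du/dtau = d2u/dx2 at interior grid points
dtau = 1e-4
u2 = np.array([np.sum(G(tau + dtau, x - xi) * f) * dx for xi in x])
lhs = (u2 - u) / dtau
rhs = np.gradient(np.gradient(u, dx), dx)
print(np.max(np.abs(lhs[100:-100] - rhs[100:-100])))  # small residual
```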
15.2 What is a Portfolio?

A portfolio is a position in the market that consists of long and short positions in one or more stocks and other securities. The value of a portfolio can be represented algebraically as a linear combination of stock prices and other security values:
$$P = \sum_{j=1}^n a_j S_j + \sum_{k=1}^m b_k F_k.$$
The market participant holds $a_j$ units of stock $S_j$ and $b_k$ units of derivative $F_k$. The coefficients are positive for long positions and negative for short positions. For instance, a portfolio given by $P = 2F - 3S$ means that we buy 2 securities and sell 3 units of stock (a position with 2 securities long and 3 stocks short).

The portfolio is self-financing if
$$dP = \sum_{j=1}^n a_j\, dS_j + \sum_{k=1}^m b_k\, dF_k.$$
15.3 Risk-less Portfolios

A portfolio $P$ is called risk-less if the increments $dP$ are completely predictable. In this case the increment value $dP$ should equal the interest earned in the time interval $dt$ on the portfolio $P$. This can be written as
$$dP = r P\, dt, \tag{15.3.3}$$
where $r$ denotes the risk-free interest rate. For the sake of simplicity the rate $r$ will be assumed constant throughout this section.

Let's assume now that the portfolio $P$ depends on only one stock $S$ and one derivative $F$, whose underlying asset is $S$. The portfolio depends also on time $t$, so
$$P = P(t, S, F).$$
We are interested in deriving the stochastic differential equation followed by the portfolio $P$. We note that at this moment the portfolio is not assumed risk-less. By Ito's formula we get
$$dP = \frac{\partial P}{\partial t}\, dt + \frac{\partial P}{\partial S}\, dS + \frac{\partial P}{\partial F}\, dF + \frac{1}{2}\frac{\partial^2 P}{\partial S^2}(dS)^2 + \frac{1}{2}\frac{\partial^2 P}{\partial F^2}(dF)^2. \tag{15.3.4}$$
The stock $S$ is assumed to follow the geometric Brownian motion
$$dS = \mu S\, dt + \sigma S\, dW_t, \tag{15.3.5}$$
where the expected return rate $\mu$ on the stock and the stock's volatility $\sigma$ are constants. Since the derivative $F$ depends on time and on the underlying stock, we can write $F = F(t, S)$. Applying Ito's formula yields
$$dF = \frac{\partial F}{\partial t}\, dt + \frac{\partial F}{\partial S}\, dS + \frac{1}{2}\frac{\partial^2 F}{\partial S^2}(dS)^2 = \Big(\frac{\partial F}{\partial t} + \mu S\frac{\partial F}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}\Big)\, dt + \sigma S\frac{\partial F}{\partial S}\, dW_t, \tag{15.3.6}$$
where we have used (15.3.5). Taking squares in relations (15.3.5) and (15.3.6), and using the stochastic relations $(dW_t)^2 = dt$ and $dt^2 = dW_t\, dt = 0$, we get
$$(dS)^2 = \sigma^2 S^2\, dt, \qquad (dF)^2 = \sigma^2 S^2\Big(\frac{\partial F}{\partial S}\Big)^2\, dt.$$
Substituting back in (15.3.4), and collecting the predictable and unpredictable parts, yields
$$dP = \Big[\frac{\partial P}{\partial t} + \mu S\Big(\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S}\Big) + \frac{\partial P}{\partial F}\Big(\frac{\partial F}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}\Big) + \frac{1}{2}\sigma^2 S^2\Big(\frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2\Big)\Big]\, dt + \sigma S\Big(\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S}\Big)\, dW_t. \tag{15.3.7}$$
Looking at the unpredictable component, we have the following result:
Proposition 15.3.1 The portfolio $P$ is risk-less if and only if $\dfrac{dP}{dS} = 0$.

Proof: A portfolio $P$ is risk-less if and only if its unpredictable component is identically zero, i.e.
$$\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S} = 0.$$
Since the total derivative of $P$ is given by
$$\frac{dP}{dS} = \frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S},$$
the previous relation becomes $\dfrac{dP}{dS} = 0$.

Definition 15.3.2 The amount $\Delta_P = \dfrac{dP}{dS}$ is called the delta of the portfolio $P$.

The previous result can be reformulated by saying that a portfolio is risk-less if and only if its delta vanishes. In practice this can hold only for a short amount of time, so the portfolio needs to be re-balanced periodically. The process of making a portfolio risk-less involves a procedure called delta hedging, through which the portfolio's delta becomes zero or very close to this value.

Assume $P$ is a risk-less portfolio, so
$$\frac{dP}{dS} = \frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S} = 0. \tag{15.3.8}$$
Then equation (15.3.7) simplifies to
$$dP = \Big[\frac{\partial P}{\partial t} + \frac{\partial P}{\partial F}\Big(\frac{\partial F}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}\Big) + \frac{1}{2}\sigma^2 S^2\Big(\frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2\Big)\Big]\, dt. \tag{15.3.9}$$
Comparing with (15.3.3) yields
$$\frac{\partial P}{\partial t} + \frac{\partial P}{\partial F}\Big(\frac{\partial F}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}\Big) + \frac{1}{2}\sigma^2 S^2\Big(\frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2\Big) = rP. \tag{15.3.10}$$
This is the equation satisfied by a risk-free financial instrument, $P = P(t, S, F)$, that depends on time $t$, stock $S$ and derivative price $F$.
15.4 Black-Scholes Equation

This section deals with a parabolic partial differential equation satisfied by all European-type securities, called the Black-Scholes equation. This was initially used by Black and Scholes to find the value of options. It is a deterministic equation obtained by eliminating the unpredictable component of the derivative through a risk-less portfolio. The main reason this is possible is the fact that both the derivative $F$ and the stock $S$ are driven by the same source of uncertainty.

The next result holds in a market with the following restrictive conditions:
• the risk-free rate $r$ and the stock volatility $\sigma$ are constant;
• there are no arbitrage opportunities;
• there are no transaction costs.

Proposition 15.4.1 If $F(t, S)$ is a derivative defined for $t \in [0, T]$, then
$$\frac{\partial F}{\partial t} + rS\frac{\partial F}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 F}{\partial S^2} = rF. \tag{15.4.11}$$
Proof: Equation (15.3.10) works under the general hypothesis that $P = P(t, S, F)$ is a risk-free financial instrument that depends on time $t$, stock $S$ and derivative $F$. We shall consider $P$ to be the following particular portfolio
$$P = F - \Delta S.$$
This means to take a long position in the derivative and a short position in $\Delta$ units of stock (assuming $\Delta$ positive). The partial derivatives in this case are
$$\frac{\partial P}{\partial t} = 0, \quad \frac{\partial P}{\partial F} = 1, \quad \frac{\partial P}{\partial S} = -\Delta, \quad \frac{\partial^2 P}{\partial F^2} = 0, \quad \frac{\partial^2 P}{\partial S^2} = 0.$$
From the risk-less property (15.3.8) we get $\Delta = \dfrac{\partial F}{\partial S}$. Substituting into equation (15.3.10) yields
$$\frac{\partial F}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 F}{\partial S^2} = rF - rS\frac{\partial F}{\partial S},$$
which is equivalent to the desired equation.
However, the Black-Scholes equation is derived most often in a less rigorous way. It is based on the assumption that the number $\Delta = \frac{\partial F}{\partial S}$, which appears in the formula of the risk-less portfolio $P = F - \Delta S$, is considered constant during the time interval $\Delta t$. If we consider the increments over the time interval $\Delta t$
$$\Delta W_t = W_{t+\Delta t} - W_t, \qquad \Delta S = S_{t+\Delta t} - S_t, \qquad \Delta F = F(t + \Delta t, S_t + \Delta S) - F(t, S),$$
then Ito's formula yields
$$\Delta F = \Big(\frac{\partial F}{\partial t}(t, S) + \mu S\frac{\partial F}{\partial S}(t, S) + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}(t, S)\Big)\, \Delta t + \sigma S\frac{\partial F}{\partial S}(t, S)\, \Delta W_t.$$
On the other side, the increments in the stock are given by
$$\Delta S = \mu S\, \Delta t + \sigma S\, \Delta W_t.$$
Since both increments $\Delta F$ and $\Delta S$ are driven by the same uncertainty source, $\Delta W_t$, we can eliminate it by multiplying the latter equation by $\frac{\partial F}{\partial S}$ and subtracting it from the former:
$$\Delta F - \frac{\partial F}{\partial S}(t, S)\, \Delta S = \Big(\frac{\partial F}{\partial t}(t, S) + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}(t, S)\Big)\, \Delta t.$$
The left side can be regarded as the increment $\Delta P$ of the portfolio
$$P = F - \frac{\partial F}{\partial S}\, S.$$
This portfolio is risk-less because its increments are totally deterministic, so it must also satisfy $\Delta P = r P\, \Delta t$. The number $\frac{\partial F}{\partial S}$ is assumed constant for small intervals of time $\Delta t$. Even if this assumption is not rigorous enough, the procedure still leads to the right equation. This is obtained by equating the coefficients of $\Delta t$ in the last two equations
$$\frac{\partial F}{\partial t}(t, S) + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 F}{\partial S^2}(t, S) = r\Big(F - \frac{\partial F}{\partial S}\, S\Big),$$
which is equivalent to the Black-Scholes equation.
15.5 Delta Hedging

The proof of the Black-Scholes equation is based on the fact that the portfolio $P = F - \frac{\partial F}{\partial S}S$ is risk-less. Since the delta of the derivative $F$ is
$$\Delta_F = \frac{dF}{dS} = \frac{\partial F}{\partial S},$$
the portfolio $P = F - \Delta_F S$ is risk-less. This leads to the delta-hedging procedure, by which selling $\Delta_F$ units of the underlying stock $S$ yields a risk-less investment.

Exercise 15.5.1 Find the value of the portfolio $P = F - \Delta_F S$ in the case when $F$ is a call option.
In this case
$$P = F - \Delta_F S = c - N(d_1) S = -K e^{-r(T-t)} N(d_2).$$
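A quick numerical check of this identity using the Black-Scholes call price (hypothetical parameters):

```python
import math

def N(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Black-Scholes call price and its delta N(d1)
S, K, r, sigma, T_minus_t = 100.0, 95.0, 0.05, 0.2, 0.5
d2 = (math.log(S / K) + (r - 0.5 * sigma**2) * T_minus_t) / (sigma * math.sqrt(T_minus_t))
d1 = d2 + sigma * math.sqrt(T_minus_t)
c = S * N(d1) - K * math.exp(-r * T_minus_t) * N(d2)

# the delta-hedged portfolio P = c - N(d1) S
P = c - N(d1) * S
print(P, -K * math.exp(-r * T_minus_t) * N(d2))   # the two values coincide
```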
15.6 Tradeable securities

A derivative $F(t, S)$ that is a solution of the Black-Scholes equation is called tradeable. Its name comes from the fact that it can be traded (either on an exchange or over-the-counter). The Black-Scholes equation constitutes the equilibrium relation that provides the traded price of the derivative. We shall deal next with a few examples of tradeable securities.

Example 15.6.1 (i) It is easy to show that $F = S$ is a solution of the Black-Scholes equation. Hence the stock is a tradeable security.
(ii) If $K$ is a constant, then $F = e^{rt} K$ is a tradeable security.
(iii) If $S$ is the stock price, then $F = e^S$ is not a tradeable security, since $F$ does not satisfy equation (15.4.11).

Exercise 15.6.1 Show that $F = \ln S$ is not a tradeable security.

Exercise 15.6.2 Find all constants $\alpha$ such that $S^{\alpha}$ is tradeable.
Substituting $F = S^{\alpha}$ into equation (15.4.11) we obtain
$$\alpha r S\cdot S^{\alpha - 1} + \frac{1}{2}\sigma^2 S^2\, \alpha(\alpha - 1) S^{\alpha - 2} = r S^{\alpha}.$$
Dividing by $S^{\alpha}$ yields $\alpha r + \frac{1}{2}\sigma^2\alpha(\alpha - 1) = r$. This can be factorized as
$$\frac{1}{2}\sigma^2(\alpha - 1)\Big(\alpha + \frac{2r}{\sigma^2}\Big) = 0,$$
with two distinct solutions $\alpha_1 = 1$ and $\alpha_2 = -\frac{2r}{\sigma^2}$. Hence there are only two tradeable securities that are powers of the stock: the stock itself, $S$, and $S^{-2r/\sigma^2}$. In particular, $S^2$ is not tradeable, since $-2r/\sigma^2 \neq 2$ (the left side is negative). The role of these two cases will be clarified by the next result.
Proposition 15.6.3 The general form of a tradeable derivative, which does not depend explicitly on time, is given by
$$F(S) = C_1 S + C_2 S^{-2r/\sigma^2}, \tag{15.6.12}$$
with $C_1$, $C_2$ constants.

Proof: If the derivative depends solely on the stock, $F = F(S)$, then the Black-Scholes equation becomes the ordinary differential equation
$$rS\frac{dF}{dS} + \frac{1}{2}\sigma^2 S^2\frac{d^2 F}{dS^2} = rF. \tag{15.6.13}$$
This is an Euler-type equation, which can be solved by using the substitution $S = e^x$. The derivatives $\frac{d}{dS}$ and $\frac{d}{dx}$ are related by the chain rule
$$\frac{d}{dx} = \frac{dS}{dx}\frac{d}{dS}.$$
Since $\frac{dS}{dx} = \frac{de^x}{dx} = e^x = S$, it follows that $\frac{d}{dx} = S\frac{d}{dS}$. Using the product rule,
$$\frac{d^2}{dx^2} = S\frac{d}{dS}\Big(S\frac{d}{dS}\Big) = S\frac{d}{dS} + S^2\frac{d^2}{dS^2},$$
and hence
$$S^2\frac{d^2}{dS^2} = \frac{d^2}{dx^2} - \frac{d}{dx}.$$
Substituting into (15.6.13) yields
$$\frac{1}{2}\sigma^2\frac{d^2 G(x)}{dx^2} + \Big(r - \frac{1}{2}\sigma^2\Big)\frac{dG(x)}{dx} = r G(x),$$
where $G(x) = F(e^x) = F(S)$. The associated indicial equation
$$\frac{1}{2}\sigma^2\lambda^2 + \Big(r - \frac{1}{2}\sigma^2\Big)\lambda = r$$
has solutions $\lambda_1 = 1$, $\lambda_2 = -2r/\sigma^2$, so the general solution has the form
$$G(x) = C_1 e^x + C_2 e^{-\frac{2r}{\sigma^2} x},$$
which is equivalent to (15.6.12).
Exercise 15.6.4 Show that the price of a forward contract, which is given by $F(t, S) = S - K e^{-r(T-t)}$, satisfies the Black-Scholes equation, i.e. a forward contract is a tradeable derivative.

Exercise 15.6.5 Show that the bond $F(t) = e^{-r(T-t)} K$ is a tradeable security.

Exercise 15.6.6 Let $d_1$ and $d_2$ be given by
$$d_1 = d_2 + \sigma\sqrt{T - t}, \qquad d_2 = \frac{\ln(S_t/K) + \big(r - \frac{\sigma^2}{2}\big)(T - t)}{\sigma\sqrt{T - t}}.$$
Show that the following functions satisfy the Black-Scholes equation:
(a) $F_1(t, S) = S N(d_1)$
(b) $F_2(t, S) = e^{-r(T-t)} N(d_2)$
(c) $F_3(t, S) = S N(d_1) - K e^{-r(T-t)} N(d_2)$.
To which well-known derivatives do these formulas correspond?
15.7 Risk-less investment revised

A risk-less investment, $P(t, S, F)$, which depends on time $t$, stock price $S$ and a derivative $F$ having $S$ as underlying asset, satisfies equation (15.3.10). Using the Black-Scholes equation satisfied by the derivative $F$,
$$\frac{\partial F}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 F}{\partial S^2} = rF - rS\frac{\partial F}{\partial S},$$
equation (15.3.10) becomes
$$\frac{\partial P}{\partial t} + \frac{\partial P}{\partial F}\Big(rF - rS\frac{\partial F}{\partial S}\Big) + \frac{1}{2}\sigma^2 S^2\Big(\frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2\Big) = rP.$$
Using the risk-less condition (15.3.8)
$$\frac{\partial P}{\partial S} = -\frac{\partial P}{\partial F}\frac{\partial F}{\partial S}, \tag{15.7.14}$$
the previous equation becomes
$$\frac{\partial P}{\partial t} + rS\frac{\partial P}{\partial S} + rF\frac{\partial P}{\partial F} + \frac{1}{2}\sigma^2 S^2\Big(\frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2\Big) = rP. \tag{15.7.15}$$
In the following we shall find an equivalent expression for the last term on the left side. Differentiating (15.7.14) with respect to $F$ yields
$$\frac{\partial^2 P}{\partial F\partial S} = -\frac{\partial^2 P}{\partial F^2}\frac{\partial F}{\partial S} - \frac{\partial P}{\partial F}\frac{\partial^2 F}{\partial F\partial S} = -\frac{\partial^2 P}{\partial F^2}\frac{\partial F}{\partial S},$$
where we used
$$\frac{\partial^2 F}{\partial F\partial S} = \frac{\partial}{\partial S}\Big(\frac{\partial F}{\partial F}\Big) = \frac{\partial}{\partial S}(1) = 0.$$
Multiplying by $\frac{\partial F}{\partial S}$ implies
$$\frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2 = -\frac{\partial^2 P}{\partial F\partial S}\frac{\partial F}{\partial S}.$$
Substituting in the aforementioned equation yields the Black-Scholes equation for portfolios
$$\frac{\partial P}{\partial t} + rS\frac{\partial P}{\partial S} + rF\frac{\partial P}{\partial F} + \frac{1}{2}\sigma^2 S^2\Big(\frac{\partial^2 P}{\partial S^2} - \frac{\partial^2 P}{\partial F\partial S}\frac{\partial F}{\partial S}\Big) = rP. \tag{15.7.16}$$
We have seen in section 15.4 that $P = F - \frac{\partial F}{\partial S}S$ is a risk-less investment, in fact a risk-less portfolio. We shall discuss in the following another risk-less investment.

Application 15.7.1 If a risk-less investment $P$ has the variables $S$ and $F$ separable, i.e. it is the sum $P(S, F) = f(F) + g(S)$, with $f$ and $g$ smooth functions, then
$$P(S, F) = F + c_1 S + c_2 S^{-2r/\sigma^2},$$
with $c_1$, $c_2$ constants. The derivative $F$ is given by the formula
$$F(t, S) = -c_1 S - c_2 S^{-2r/\sigma^2} + c_3 e^{rt}, \qquad c_3 \in \mathbb{R}.$$
Since $P$ has separable variables, the mixed derivative term vanishes, and equation (15.7.16) becomes
$$S g'(S) + \frac{\sigma^2}{2r}\, S^2 g''(S) - g(S) = f(F) - F f'(F).$$
There is a separation constant $C$ such that
$$f(F) - F f'(F) = C, \qquad S g'(S) + \frac{\sigma^2}{2r}\, S^2 g''(S) - g(S) = C.$$
Dividing the first equation by $F^2$ yields the exact equation
$$\Big(\frac{1}{F}\, f(F)\Big)' = -\frac{C}{F^2},$$
with the solution $f(F) = c_0 F + C$. To solve the second equation, let $\alpha = \frac{\sigma^2}{2r}$. Then the substitution $S = e^x$ leads to the ordinary differential equation with constant coefficients
$$\alpha h''(x) + (1 - \alpha) h'(x) - h(x) = C,$$
where $h(x) = g(e^x) = g(S)$. The associated indicial equation
$$\alpha\lambda^2 + (1 - \alpha)\lambda - 1 = 0$$
has the solutions $\lambda_1 = 1$, $\lambda_2 = -\frac{1}{\alpha}$. The general solution is the sum of the particular solution $h_p(x) = -C$ and the solution of the associated homogeneous equation, which is $h_0(x) = c_1 e^x + c_2 e^{-\frac{x}{\alpha}}$. Then
$$h(x) = c_1 e^x + c_2 e^{-\frac{x}{\alpha}} - C.$$
Going back to the variable $S$, we get the general form of $g(S)$:
$$g(S) = c_1 S + c_2 S^{-2r/\sigma^2} - C,$$
with $c_1$, $c_2$ constants. Since the constant $C$ cancels by addition, we have the following formula for the risk-less investment with separable variables $F$ and $S$:
$$P(S, F) = f(F) + g(S) = c_0 F + c_1 S + c_2 S^{-2r/\sigma^2}.$$
Dividing by $c_0$, we may assume $c_0 = 1$. We shall find the derivative $F(t, S)$ which enters the previous formula. Substituting in (15.7.14) yields
$$-\frac{\partial F}{\partial S} = c_1 - \frac{2r}{\sigma^2}\, c_2 S^{-1 - 2r/\sigma^2},$$
which after partial integration in $S$ gives
$$F(t, S) = -c_1 S - c_2 S^{-2r/\sigma^2} + \phi(t),$$
where the integration constant $\phi(t)$ is a function of $t$. The sum of the first two terms is a derivative of the form (15.6.12). The remaining function $\phi(t)$ also has to satisfy the Black-Scholes equation, and hence it is of the form $\phi(t) = c_3 e^{rt}$, with $c_3$ constant. Then the derivative $F$ is given by
$$F(t, S) = -c_1 S - c_2 S^{-2r/\sigma^2} + c_3 e^{rt}.$$
It is worth noting that substituting in the formula of $P$ yields $P = c_3 e^{rt}$, which agrees with the formula of a risk-less investment.
Example 15.7.1 Find the function $g(S)$ such that the product $P = F g(S)$ is a risk-less investment, with $F = F(t, S)$ a derivative. Find the expression of the derivative $F$ in terms of $S$ and $t$.

Proof: Substituting $P = F g(S)$ into equation (15.7.15) and simplifying by $rF$ yields
$$S\frac{dg(S)}{dS} + \frac{\sigma^2}{2r}\, S^2\frac{d^2 g(S)}{dS^2} = 0.$$
Substituting $S = e^x$ and $h(x) = g(e^x) = g(S)$ yields
$$h''(x) + \Big(\frac{2r}{\sigma^2} - 1\Big) h'(x) = 0.$$
Integrating leads to the solution
$$h(x) = C_1 + C_2\, e^{(1 - \frac{2r}{\sigma^2})x}.$$
Going back to the variable $S$,
$$g(S) = h(\ln S) = C_1 + C_2\, e^{(1 - \frac{2r}{\sigma^2})\ln S} = C_1 + C_2\, S^{1 - \frac{2r}{\sigma^2}}.$$
15.8 Solving the Black-Scholes Equation

In this section we shall solve the Black-Scholes equation and show that its solution coincides with the one provided by the risk-neutral valuation in section 13. This way, the Black-Scholes equation provides an alternative approach for European-type derivatives, using partial differential equations instead of expectations.

Consider a European-type derivative $F$, with the payoff at maturity $T$ given by $f_T$, which is a function of the stock price at maturity, $S_T$. Then $F(t, S)$ satisfies the following final condition partial differential equation
$$\frac{\partial F}{\partial t} + rS\frac{\partial F}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 F}{\partial S^2} = rF, \qquad F(T, S_T) = f_T(S_T).$$
This means the solution is known at the final time $T$ and we need to find its expression at any time $t$ prior to $T$, i.e.
$$f_t = F(t, S_t), \qquad 0 \leq t < T.$$
First we shall transform the equation into an equation with constant coefficients. Substituting $S = e^x$, and using the identities
$$S\frac{\partial}{\partial S} = \frac{\partial}{\partial x}, \qquad S^2\frac{\partial^2}{\partial S^2} = \frac{\partial^2}{\partial x^2} - \frac{\partial}{\partial x},$$
the equation becomes
$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2 V}{\partial x^2} + \Big(r - \frac{1}{2}\sigma^2\Big)\frac{\partial V}{\partial x} = rV,$$
where $V(t, x) = F(t, e^x)$. Using the time scaling $\tau = \frac{1}{2}\sigma^2(T - t)$, the chain rule provides
$$\frac{\partial}{\partial t} = \frac{\partial \tau}{\partial t}\frac{\partial}{\partial \tau} = -\frac{1}{2}\sigma^2\frac{\partial}{\partial \tau}.$$
Denote $k = \frac{2r}{\sigma^2}$. Substituting in the aforementioned equation yields
$$\frac{\partial W}{\partial \tau} = \frac{\partial^2 W}{\partial x^2} + (k - 1)\frac{\partial W}{\partial x} - kW, \tag{15.8.17}$$
where $W(\tau, x) = V(t, x)$. Next we shall get rid of the last two terms on the right side of the equation by using a crafted substitution.

Consider $W(\tau, x) = e^{\phi}\, u(\tau, x)$, where $\phi = \alpha x + \beta\tau$, with $\alpha$, $\beta$ constants that will be determined such that the equation satisfied by $u(\tau, x)$ has on the right side only the second derivative in $x$.
Since
$$\frac{\partial W}{\partial x} = e^{\phi}\Big(\alpha u + \frac{\partial u}{\partial x}\Big), \qquad \frac{\partial^2 W}{\partial x^2} = e^{\phi}\Big(\alpha^2 u + 2\alpha\frac{\partial u}{\partial x} + \frac{\partial^2 u}{\partial x^2}\Big), \qquad \frac{\partial W}{\partial \tau} = e^{\phi}\Big(\beta u + \frac{\partial u}{\partial \tau}\Big),$$
substituting in (15.8.17), dividing by $e^{\phi}$ and collecting the derivatives yields
$$\frac{\partial u}{\partial \tau} = \frac{\partial^2 u}{\partial x^2} + (2\alpha + k - 1)\frac{\partial u}{\partial x} + \big(\alpha^2 + \alpha(k - 1) - k - \beta\big)\, u.$$
The constants $\alpha$ and $\beta$ are chosen such that the coefficients of $\frac{\partial u}{\partial x}$ and $u$ vanish:
$$2\alpha + k - 1 = 0, \qquad \alpha^2 + \alpha(k - 1) - k - \beta = 0.$$
Solving yields
$$\alpha = -\frac{k - 1}{2}, \qquad \beta = \alpha^2 + \alpha(k - 1) - k = -\frac{(k + 1)^2}{4}.$$
The function $u(\tau, x)$ satisfies the heat equation
$$\frac{\partial u}{\partial \tau} = \frac{\partial^2 u}{\partial x^2}$$
with the initial condition expressible in terms of $f_T$:
$$u(0, x) = e^{-\phi(0, x)}\, W(0, x) = e^{-\alpha x}\, V(T, x) = e^{-\alpha x}\, F(T, e^x) = e^{-\alpha x}\, f_T(e^x).$$
From the general theory of the heat equation, the solution can be expressed as the convolution between the fundamental solution and the initial condition:
$$u(\tau, x) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{4\pi\tau}}\, e^{-\frac{(y-x)^2}{4\tau}}\, u(0, y)\, dy.$$
The previous substitutions yield the following relation between $F$ and $u$:
$$F(t, S) = F(t, e^x) = V(t, x) = W(\tau, x) = e^{\phi(\tau, x)}\, u(\tau, x),$$
so $F(T, e^x) = e^{\alpha x}\, u(0, x)$. This implies
$$F(t, e^x) = e^{\phi(\tau, x)}\, u(\tau, x) = e^{\phi(\tau, x)} \int \frac{1}{\sqrt{4\pi\tau}}\, e^{-\frac{(y-x)^2}{4\tau}}\, u(0, y)\, dy = e^{\phi(\tau, x)} \int \frac{1}{\sqrt{4\pi\tau}}\, e^{-\frac{(y-x)^2}{4\tau}}\, e^{-\alpha y}\, F(T, e^y)\, dy.$$
With the substitution $y - x = s\sqrt{2\tau}$ this becomes
$$F(t, e^x) = e^{\phi(\tau, x)} \int \frac{1}{\sqrt{2\pi}}\, e^{-\frac{s^2}{2} - \alpha(x + s\sqrt{2\tau})}\, F(T, e^{x + s\sqrt{2\tau}})\, ds.$$
Completing the square as
$$-\frac{s^2}{2} - \alpha(x + s\sqrt{2\tau}) = -\frac{1}{2}\Big(s - \frac{k - 1}{2}\sqrt{2\tau}\Big)^2 + \frac{(k - 1)^2}{4}\tau + \frac{k - 1}{2}\, x,$$
after cancellations, the previous integral becomes
$$F(t, e^x) = e^{-\frac{(k+1)^2}{4}\tau}\, \frac{1}{\sqrt{2\pi}} \int e^{-\frac{1}{2}\big(s - \frac{k-1}{2}\sqrt{2\tau}\big)^2}\, e^{\frac{(k-1)^2}{4}\tau}\, F(T, e^{x + s\sqrt{2\tau}})\, ds.$$
Using
$$e^{-\frac{(k+1)^2}{4}\tau}\, e^{\frac{(k-1)^2}{4}\tau} = e^{-k\tau} = e^{-r(T-t)}, \qquad \tau(k - 1) = \Big(r - \frac{1}{2}\sigma^2\Big)(T - t),$$
after the substitution $z = x + s\sqrt{2\tau}$ we get
$$F(t, e^x) = e^{-r(T-t)}\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(z - x - (k-1)\tau)^2}{4\tau}}\, F(T, e^z)\, \frac{1}{\sqrt{2\tau}}\, dz = e^{-r(T-t)}\, \frac{1}{\sqrt{2\pi\sigma^2(T-t)}} \int_{-\infty}^{\infty} e^{-\frac{\left[z - x - \left(r - \frac{1}{2}\sigma^2\right)(T-t)\right]^2}{2\sigma^2(T-t)}}\, F(T, e^z)\, dz.$$
Since $e^x = S_t$, considering the probability density
$$p(z) = \frac{1}{\sqrt{2\pi\sigma^2(T-t)}}\, e^{-\frac{\left[z - \ln S_t - \left(r - \frac{1}{2}\sigma^2\right)(T-t)\right]^2}{2\sigma^2(T-t)}},$$
the previous expression becomes
$$F(t, S_t) = e^{-r(T-t)} \int_{-\infty}^{\infty} p(z)\, f_T(e^z)\, dz = e^{-r(T-t)}\, \widetilde{E}_t[f_T],$$
with $f_T(S_T) = F(T, S_T)$ and $\widetilde{E}_t$ the risk-neutral expectation operator as of time $t$, which was introduced and used in section 13.
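The final formula can be verified directly: the sketch below integrates a call payoff against the density $p(z)$ and recovers the Black-Scholes price (hypothetical parameters).

```python
import math

def bs_call(S, K, r, sigma, T):
    # closed-form Black-Scholes call price
    N = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))
    d2 = (math.log(S / K) + (r - 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d1 = d2 + sigma * math.sqrt(T)
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

# Price by integrating the payoff against the density p(z) derived above
S, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
m = math.log(S) + (r - 0.5 * sigma**2) * T     # mean of z = ln S_T
v = sigma**2 * T                               # variance of z
n, z_lo, z_hi = 20_000, m - 10 * math.sqrt(v), m + 10 * math.sqrt(v)
h = (z_hi - z_lo) / n
total = 0.0
for i in range(n):
    z = z_lo + (i + 0.5) * h
    p = math.exp(-(z - m)**2 / (2 * v)) / math.sqrt(2 * math.pi * v)
    total += p * max(math.exp(z) - K, 0.0) * h
print(math.exp(-r * T) * total, bs_call(S, K, r, sigma, T))  # nearly equal
```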
15.9 Black-Scholes and Risk-neutral Valuation

The conclusion of the computation in the last section is of capital importance for derivatives calculus. It shows the equivalence between the Black-Scholes equation and risk-neutral valuation. It turns out that instead of computing the risk-neutral expectation of the payoff, as in the case of risk-neutral valuation, we may choose to solve the Black-Scholes equation directly, imposing the final condition to be the payoff.

In many cases solving a partial differential equation is simpler than evaluating the expectation integral. This is due to the fact that we may look for a solution dictated by the particular form of the payoff $f_T$. We shall apply this in finding put-call parities for different types of derivatives.

Consequently, all derivatives evaluated by the risk-neutral valuation are solutions of the Black-Scholes equation. The only distinction is their payoff. A few of them are given in the next example.

Example 15.9.1 (a) The price of a European call option is the solution $F(t, S)$ of the Black-Scholes equation satisfying
$$f_T(S_T) = \max(S_T - K, 0).$$
(b) The price of a European put option is the solution $F(t, S)$ of the Black-Scholes equation with the final condition
$$f_T(S_T) = \max(K - S_T, 0).$$
(c) The value of a forward contract is the solution of the Black-Scholes equation with the final condition
$$f_T(S_T) = S_T - K.$$

It is worth noting that the superposition principle discussed in section 13 can be explained now by the fact that the solution space of the Black-Scholes equation is a linear space. This means that a linear combination of solutions is also a solution.

Another interesting feature of the Black-Scholes equation is its independence of the stock drift rate $\mu$. Then its solutions must have the same property. This explains why, in the risk-neutral valuation, the value of $\mu$ does not appear explicitly in the solution.

Asian options satisfy similar Black-Scholes equations, with small differences, as we shall see in Chapter 16.
15.10 Boundary Conditions

We have solved the Black-Scholes equation for a call option, under the assumption that there is a unique solution. The Black-Scholes equation is of first order in the time variable $t$ and of second order in the stock variable $S$, so it needs one final condition at $t = T$ and two boundary conditions, for $S = 0$ and $S \to \infty$.

Figure 15.2: (NEEDS TO BE REDONE) The graph of the option price before maturity in the case $K = 40$, $\sigma = 30\%$, $r = 8\%$, and $T - t = 1$.

In the case of a call option, the final condition is given by the payoff:
$$F(T, S_T) = \max\{S_T - K, 0\}.$$
When $S \to 0$, the option does not get exercised, so the first boundary condition is
$$F(t, 0) = 0.$$
When $S \to \infty$ the price becomes linear,
$$F(t, S) \sim S - K,$$
the graph of $F(\cdot, S)$ having a slant asymptote, see Fig. 15.2.
15.11 Risk-less Portfolios for Rare Events

Consider the derivative $P = P(t, S, F)$, which depends on the time $t$, the stock price $S$ and the derivative $F$, whose underlying asset is $S$. We shall find the stochastic differential equation followed by $P$, under the hypothesis that the stock exhibits rare events, i.e.
$$dS = \mu S\, dt + \sigma S\, dW_t + \rho S\, dM_t, \tag{15.11.18}$$
where the constants $\mu$, $\sigma$, $\rho$ denote the drift rate, the volatility and the jump in the stock price in the case of a rare event. The processes $W_t$ and $M_t = N_t - \lambda t$ denote the Brownian motion and the compensated Poisson process, respectively. The constant $\lambda > 0$ denotes the rate of occurrence of the rare events in the market.

By Ito's formula we get
$$dP = \frac{\partial P}{\partial t}\, dt + \frac{\partial P}{\partial S}\, dS + \frac{\partial P}{\partial F}\, dF + \frac{1}{2}\frac{\partial^2 P}{\partial S^2}(dS)^2 + \frac{1}{2}\frac{\partial^2 P}{\partial F^2}(dF)^2. \tag{15.11.19}$$
We shall use the following stochastic relations:
$$(dW_t)^2 = dt, \quad (dM_t)^2 = dN_t, \quad dt^2 = dt\, dW_t = dt\, dM_t = dW_t\, dM_t = 0,$$
see section 3.11.2. Then
$$(dS)^2 = \sigma^2 S^2\, dt + \rho^2 S^2\, dN_t = (\sigma^2 + \lambda\rho^2) S^2\, dt + \rho^2 S^2\, dM_t, \tag{15.11.20}$$
where we used $dM_t = dN_t - \lambda\, dt$. It is worth noting that the unpredictable part of $(dS)^2$ depends only on the rare events, and does not depend on the regular daily events.

Exercise 15.11.1 If $S$ satisfies (15.11.18), find the following:
(a) $E[(dS)^2]$;  (b) $E[dS]$;  (c) $Var[dS]$.
Using Ito's formula, the infinitesimal change in the value of the derivative $F = F(t, S)$ is given by
$$dF = \frac{\partial F}{\partial t}\, dt + \frac{\partial F}{\partial S}\, dS + \frac{1}{2}\frac{\partial^2 F}{\partial S^2}(dS)^2 = \Big[\frac{\partial F}{\partial t} + \mu S\frac{\partial F}{\partial S} + \frac{1}{2}(\sigma^2 + \lambda\rho^2) S^2\frac{\partial^2 F}{\partial S^2}\Big]\, dt + \sigma S\frac{\partial F}{\partial S}\, dW_t + \Big[\frac{1}{2}\rho^2 S^2\frac{\partial^2 F}{\partial S^2} + \rho S\frac{\partial F}{\partial S}\Big]\, dM_t, \tag{15.11.21}$$
where we have used (15.11.18) and (15.11.20). The increment $dF$ has two independent sources of uncertainty, $dW_t$ and $dM_t$, both with mean equal to 0.

Taking the square yields
$$(dF)^2 = \sigma^2 S^2\Big(\frac{\partial F}{\partial S}\Big)^2\, dt + \Big[\frac{1}{2}\rho^2 S^2\frac{\partial^2 F}{\partial S^2} + \rho S\frac{\partial F}{\partial S}\Big]^2\, dN_t. \tag{15.11.22}$$
Substituting back in (15.11.19), we obtain the unpredictable part of $dP$ as the sum of two components
$$\sigma S\Big(\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S}\Big)\, dW_t + \Big[\rho S\Big(\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S}\Big) + \frac{1}{2}\rho^2 S^2\Big(\frac{\partial P}{\partial F}\frac{\partial^2 F}{\partial S^2} + \frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\frac{\partial^2 F}{\partial S^2}\Big)\Big]\, dM_t.$$
The risk-less condition for the portfolio $P$ is obtained when the coefficients of $dW_t$ and $dM_t$ vanish:
$$\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S} = 0 \tag{15.11.23}$$
$$\frac{\partial P}{\partial F}\frac{\partial^2 F}{\partial S^2} + \frac{\partial^2 P}{\partial S^2} + \frac{\partial^2 P}{\partial F^2}\frac{\partial^2 F}{\partial S^2} = 0. \tag{15.11.24}$$
These relations can be further simplified. If we differentiate (15.11.23) with respect to $S$,
$$\frac{\partial^2 P}{\partial S^2} + \frac{\partial P}{\partial F}\frac{\partial^2 F}{\partial S^2} = -\frac{\partial^2 P}{\partial S\partial F}\frac{\partial F}{\partial S}.$$
Substituting into (15.11.24) yields
$$\frac{\partial^2 P}{\partial F^2}\frac{\partial^2 F}{\partial S^2} = \frac{\partial^2 P}{\partial S\partial F}\frac{\partial F}{\partial S}. \tag{15.11.25}$$
Differentiating (15.11.23) with respect to $F$ we get
$$\frac{\partial^2 P}{\partial F\partial S} = -\frac{\partial^2 P}{\partial F^2}\frac{\partial F}{\partial S} - \frac{\partial P}{\partial F}\frac{\partial^2 F}{\partial F\partial S} = -\frac{\partial^2 P}{\partial F^2}\frac{\partial F}{\partial S}, \tag{15.11.26}$$
since
$$\frac{\partial^2 F}{\partial F\partial S} = \frac{\partial}{\partial S}\Big(\frac{\partial F}{\partial F}\Big) = 0.$$
Multiplying (15.11.26) by $\frac{\partial F}{\partial S}$ yields
$$\frac{\partial^2 P}{\partial F\partial S}\frac{\partial F}{\partial S} = -\frac{\partial^2 P}{\partial F^2}\Big(\frac{\partial F}{\partial S}\Big)^2,$$
and substituting in the right side of (15.11.25) leads to the equation
$$\frac{\partial^2 P}{\partial F^2}\Big[\frac{\partial^2 F}{\partial S^2} + \Big(\frac{\partial F}{\partial S}\Big)^2\Big] = 0.$$
We have arrived at the following result:

Proposition 15.11.2 Let $F = F(t, S)$ be a derivative with the underlying asset $S$. The investment $P = P(t, S, F)$ is risk-less if and only if
$$\frac{\partial P}{\partial S} + \frac{\partial P}{\partial F}\frac{\partial F}{\partial S} = 0, \qquad \frac{\partial^2 P}{\partial F^2}\Big[\frac{\partial^2 F}{\partial S^2} + \Big(\frac{\partial F}{\partial S}\Big)^2\Big] = 0.$$
There are two risk-less conditions because there are two unpredictable components in the increments of $dP$: one due to regular changes and the other due to rare events. The first condition is equivalent to the vanishing of the total derivative, $\frac{dP}{dS} = 0$, and corresponds to offsetting the regular risk.

The second condition vanishes either if $\frac{\partial^2 P}{\partial F^2} = 0$ or if $\frac{\partial^2 F}{\partial S^2} + \big(\frac{\partial F}{\partial S}\big)^2 = 0$. In the first case $P$ is linear in $F$. For instance, if $P = F - f(S)$, the first condition yields $f'(S) = \frac{\partial F}{\partial S}$. In the second case, denote $U(t, S) = \frac{\partial F}{\partial S}$. Then we need to solve the partial differential equation
$$\frac{\partial U}{\partial S} + U^2 = 0.$$
Future research directions:
1. Solve the above equation.
2. Find the predictable part of $dP$.
3. Get an analog of the Black-Scholes equation in this case.
4. Evaluate a call option in this case.
5. Is the risk-neutral valuation still working and why?
Chapter 16
Black-Scholes for Asian Derivatives

In this chapter we shall develop the Black-Scholes equation in the case of Asian derivatives and we shall discuss the particular cases of options and forward contracts on weighted averages. In the case of the latter contracts we obtain closed-form solutions, while for the former ones we apply the variable-reduction method to decrease the number of variables and discuss the solution.

16.1 Weighted averages

In many practical problems the asset price needs to be considered with a certain weight. For instance, when computing car insurance, more weight is given to recent accidents than to accidents that occurred 10 years ago.

In the following we shall define the weight function and provide several examples. Let $\rho : [0, T] \to \mathbb{R}$ be a weight function, i.e. a function satisfying
1. $\rho > 0$;
2. $\int_0^T \rho(t)\, dt = 1$.

The stock weighted average with respect to the weight $\rho$ is defined as
$$S_{ave} = \int_0^T \rho(t)\, S_t\, dt.$$

Example 16.1.1 (a) The uniform weight is obtained for $\rho(t) = \frac{1}{T}$. In this case
$$S_{ave} = \frac{1}{T}\int_0^T S_t\, dt$$
is the continuous arithmetic average of the stock on the time interval $[0, T]$.
(b) The linear weight is obtained if $\rho(t) = \frac{2t}{T^2}$. In this case the weight is proportional to time:
$$S_{ave} = \frac{2}{T^2}\int_0^T t\, S_t\, dt.$$
(c) The exponential weight is obtained for $\rho(t) = \frac{k e^{kt}}{e^{kT} - 1}$. If $k > 0$, the weight is increasing, so recent data are weighted more than old data; if $k < 0$, the weight is decreasing. The exponential weighted average is given by
$$S_{ave} = \frac{k}{e^{kT} - 1}\int_0^T e^{kt}\, S_t\, dt.$$
Exercise 16.1.2 Consider the polynomial weighted average
$$S^{(n)}_{ave} = \frac{n + 1}{T^{n+1}}\int_0^T t^n\, S_t\, dt.$$
Find the limit $\lim_{n \to \infty} S^{(n)}_{ave}$ in the cases $0 < T < 1$, $T = 1$, and $T > 1$.

In all previous examples $\rho(t) = \rho(t, T) = \frac{f(t)}{g(T)}$, with $\int_0^T f(t)\, dt = g(T)$, so $g'(T) = f(T)$ and $g(0) = 0$. The average becomes
$$S_{ave}(T) = \frac{1}{g(T)}\int_0^T f(u)\, S_u\, du = \frac{I_T}{g(T)},$$
with $I_t = \int_0^t f(u)\, S_u\, du$ satisfying $dI_t = f(t)\, S_t\, dt$. From the quotient rule we get
$$dS_{ave}(t) = \frac{dI_t\, g(t) - I_t\, dg(t)}{g(t)^2} = \Big[\frac{f(t)}{g(t)}\, S_t - \frac{g'(t)}{g(t)}\, \frac{I_t}{g(t)}\Big]\, dt = \frac{f(t)}{g(t)}\Big[S_t - \frac{g'(t)}{f(t)}\, S_{ave}(t)\Big]\, dt = \frac{f(t)}{g(t)}\big(S_t - S_{ave}(t)\big)\, dt,$$
since $g'(t) = f(t)$. The initial condition is obtained by l'Hospital's rule:
$$S_{ave}(0) = \lim_{t\to 0} S_{ave}(t) = \lim_{t\to 0}\frac{I_t}{g(t)} = \lim_{t\to 0}\frac{f(t)\, S_t}{g'(t)} = S_0 \lim_{t\to 0}\frac{f(t)}{g'(t)} = S_0.$$
Proposition 16.1.3 The weighted average $S_{ave}(t)$ satisfies the stochastic differential equation
$$dX_t = \frac{f(t)}{g(t)}(S_t - X_t)\, dt, \qquad X_0 = S_0.$$
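The equation of Proposition 16.1.3 can be integrated path by path with an Euler scheme; the sketch below uses the exponential weight of Example 16.1.1(c) (hypothetical parameters) and compares $X_T$ with $I_T/g(T)$ computed along the same path.

```python
import numpy as np

rng = np.random.default_rng(4)

# Euler scheme for the exponentially weighted average: f(t) = k e^{kt},
# g(t) = e^{kt} - 1. All parameter values are hypothetical.
S0, mu, sigma, k, T, n = 100.0, 0.1, 0.2, 1.0, 1.0, 10_000
dt = T / n

S, X, I = S0, S0, 0.0          # I approximates I_t = int_0^t f(u) S_u du
for i in range(n):
    t = (i + 0.5) * dt          # midpoint avoids the singularity of f/g at 0
    f, g = k * np.exp(k * t), np.exp(k * t) - 1
    X += f / g * (S - X) * dt   # dX = (f/g)(S - X) dt
    I += f * S * dt
    dW = np.sqrt(dt) * rng.standard_normal()
    S *= np.exp((mu - 0.5 * sigma**2) * dt + sigma * dW)

print(X, I / (np.exp(k * T) - 1))   # the two averages should be close
```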
Exercise 16.1.4 Let $x(t) = E[S_{ave}(t)]$.
(a) Show that $x(t)$ satisfies the ordinary differential equation
$$x'(t) = \frac{f(t)}{g(t)}\big(S_0\, e^{\mu t} - x(t)\big), \qquad x(0) = S_0.$$
(b) Find $x(t)$.

Exercise 16.1.5 Let $y(t) = E[S^2_{ave}(t)]$.
(a) Find the stochastic differential equation satisfied by $S^2_{ave}(t)$;
(b) Find the ordinary differential equation satisfied by $y(t)$;
(c) Solve the previous equation to get $y(t)$ and compute $Var[S_{ave}(t)]$.
16.2 Setting up the Black-Scholes Equation

Consider an Asian derivative whose value at time $t$, $F(t, S_t, S_{ave}(t))$, depends on the variables $t$, $S_t$, and $S_{ave}(t)$. Using the stochastic process of $S_t$,
$$dS_t = \mu S_t\, dt + \sigma S_t\, dW_t,$$
and Proposition 16.1.3, an application of Ito's formula together with the stochastic formulas
$$dt^2 = 0, \quad (dW_t)^2 = dt, \quad (dS_t)^2 = \sigma^2 S_t^2\, dt, \quad (dS_{ave})^2 = 0$$
yields
$$dF = \frac{\partial F}{\partial t}\, dt + \frac{\partial F}{\partial S_t}\, dS_t + \frac{1}{2}\frac{\partial^2 F}{\partial S_t^2}(dS_t)^2 + \frac{\partial F}{\partial S_{ave}}\, dS_{ave} = \Big[\frac{\partial F}{\partial t} + \mu S_t\frac{\partial F}{\partial S_t} + \frac{1}{2}\sigma^2 S_t^2\frac{\partial^2 F}{\partial S_t^2} + \frac{f(t)}{g(t)}(S_t - S_{ave})\frac{\partial F}{\partial S_{ave}}\Big]\, dt + \sigma S_t\frac{\partial F}{\partial S_t}\, dW_t.$$
Let $\Delta_F = \frac{\partial F}{\partial S_t}$. Consider the following portfolio at time $t$:
$$P(t) = F - \Delta_F S_t,$$
obtained by buying one derivative $F$ and selling $\Delta_F$ units of stock. The change in the portfolio value during the time $dt$ does not depend on $W_t$:
$$dP = dF - \Delta_F\, dS_t = \Big[\frac{\partial F}{\partial t} + \frac{1}{2}\sigma^2 S_t^2\frac{\partial^2 F}{\partial S_t^2} + \frac{f(t)}{g(t)}(S_t - S_{ave})\frac{\partial F}{\partial S_{ave}}\Big]\, dt, \tag{16.2.1}$$
so the portfolio $P$ is risk-less. Since no arbitrage opportunities are allowed, investing the value $P$ at time $t$ in a bank at the risk-free rate $r$ for the time interval $dt$ yields
$$dP = rP\, dt = \Big(rF - rS_t\frac{\partial F}{\partial S_t}\Big)\, dt. \tag{16.2.2}$$
Equating (16.2.1) and (16.2.2) yields the following form of the Black-Scholes equation for Asian derivatives on weighted averages:
$$\frac{\partial F}{\partial t} + rS_t\frac{\partial F}{\partial S_t} + \frac{1}{2}\sigma^2 S_t^2\frac{\partial^2 F}{\partial S_t^2} + \frac{f(t)}{g(t)}(S_t - S_{ave})\frac{\partial F}{\partial S_{ave}} = rF.$$
16.3 Weighted Average Strike Call Option
In this section we shall use the reduction variable method to decrease the number of
variables from three to two. Since S
ave
(t) =
I
t
g(t)
, it is convenient to consider the
derivative as a function of t, S
t
and I
t
V (t, S
t
, I
t
) = F(t, S
t
, S
ave
).
A computation similar to the previous one yields the simpler equation
V
t
+rS
t
V
S
t
+
1
2

2
S
2
t

2
V
S
2
t
+f(t)S
t
V
I
t
= rV. (16.3.3)
The payo at maturity of an average strike call option can be written in the following
form
V
T
= V (T, S
T
, I
T
) = maxS
T
S
ave
(T), 0
= maxS
T

I
T
g(T)
, 0 = S
T
max1
1
g(T)
I
T
S
T
, 0
= S
T
L(T, R
T
),
where
R
t
=
I
t
S
t
, L(t, R) = max1
1
g(t)
R, 0.
305
Since at maturity the variable S
T
is separated from T and R
T
, we shall look for a
solution of equation (16.3.3) of the same type for any t T, i.e. V (t, S, I) = SG(t, R)
. Since
V
t
= S
G
t
,
V
I
= S
G
R
1
S
=
G
R
;
V
S
= G+S
G
R
R
S
= GR
G
R
;

2
V
S
2
=

S
(GR
G
R
) =
G
R
R
S

R
S
G
R
R

2
G
R
2
R
S
;
= R

2
G
R
2
I
S
;
S
2

2
V
S
2
= RI

2
G
R
2
,
substituting in (16.3.3) and using that
RI
S
= R
2
, after cancelations yields
G
t
+
1
2

2
R
2

2
G
R
2
+ (f(t) rR)
G
R
= 0. (16.3.4)
This is a partial dierential equation in only two variables, t and R. It can be solved
explicitly sometimes, depending on the form of the nal condition G(T, R
T
) and ex-
pression of the function f(t).
In the case of a weighted average strike call option the nal condition is
G(T, R
T
) = max1
R
T
g(T)
, 0. (16.3.5)
Example 16.3.1 In the case of the arithmetic average the function G(t, R) satises
the partial dierential equation
G
t
+
1
2

2
R
2

2
G
R
2
+ (1 rR)
G
R
= 0
with the nal condition G(T, R
T
) = max1
R
T
T
, 0.
Example 16.3.2 In the case of the exponential average the function G(t, R) satises
the equation
G
t
+
1
2

2
R
2

2
G
R
2
+ (ke
kt
rR)
G
R
= 0 (16.3.6)
with the nal condition G(T, R
T
) = max1
R
T
e
kT
1
, 0.
Neither of the previous two nal condition problems can be solved explicitly.
306
20 40 60 80 100
0.2
0.4
0.6
0.8
1.0
20 40 60 80 100
0.2
0.4
0.6
0.8
1.0
(a) (b)
Figure 16.1: The prole of the solution H(t, R): (a) at expiration; (b) when there is
T t time left before expiration.
16.4 Boundary Conditions
The partial dierential equation (16.3.4) is of rst order in t and second order in R.
We need to specify one condition at t = T (the payo at maturity), which is given
by (16.3.5), and two conditions for R = 0 and R , which specify the behavior of
solution G(t, R) at two limiting positions of the variable R.
Taking R 0 in equation (16.3.4) and using Exercise 16.4.1 yields the rst bound-
ary condition for G(t, R)
_
G
t
+f
G
R
_

R=0
= 0. (16.4.7)
The term
G
R

R=0
represents the slope of G(t, R) with respect to R at R = 0, while
G
t

R=0
is the variation of the price G with respect to time t when R = 0.
Another boundary condition is obtained by specifying the behavior of G(t, R) for
large values of R. If R
t
, we must have S
t
0, because
R
t
=
1
S
t
_
t
0
f(u)S
u
du
and
_
t
0
f(u)S
u
du > 0 for t > 0. In this case we are better o not exercising the option
(since otherwise we get a negative payo), so the boundary condition is
lim
R
G(R, t) = 0. (16.4.8)
It can be shown in the theory of partial dierential equations that equation (16.3.4)
together with the nal condition (16.3.5), see Fig.16.1(a), and boundary conditions
(16.4.7) and (16.4.8) has a unique solution G(t, R), see Fig.16.1(b).
307
Exercise 16.4.1 Let f be a bounded dierentiable function. Show that
(a) lim
x0
xf

(x) = 0;
(b) lim
x0
x
2
f

(x) = 0.
There is no close form solution for the weighted average strike call option. Even in
the simplest case, when the average is arithmetic, the solution is just approximative,
see section 13.13. In real life the price is worked out using the Monte-Carlo simulation.
This is based on averaging a large number, n, of simulations of the process R
t
in the
risk-neutral world, i.e. assuming = r. For each realization, the associated payo
G
T,j
= max1
R
T,j
g(T)
is computed, with j n. Here R
T,j
represents the value of R
at time T in the jth realization. The average
1
n
n

j=1
G
T,j
is a good approximation of the payo expectation E[G
T
]. Discounting under the risk-
free rate we get the price at time t
G(t, R) = e
r(Tt)
_
1
n
n

j=1
G
T,j
_
.
It is worth noting that the term on the right is an approximation of the risk neutral
conditional expectation

E[G
T
[T
t
].
When simulating the process R
t
, it is convenient to know its stochastic dierential
equation. Using
dI
t
= f(t)S
t
dt, d
_
1
S
t
_
=
1
S
t
_
(
2
)dt dW
t
_
dt,
the product rule yields
dR
t
= d
_
I
t
S
t
_
= d
_
I
t
1
S
t
_
= dI
t
1
S
t
+I
t
d
_
1
S
t
_
+dI
t
d
_
1
S
t
_
= f(t)dt +R
t
_
(
2
)dt dW
t
_
.
Collecting terms yields the following stochastic dierential equation for R
t
:
dR
t
= R
t
dW
t
+
_
f(t) + (
2
)R
t
_
dt. (16.4.9)
308
The initial condition is R
0
=
I
0
S
0
= 0, since I
0
= 0.
Can we solve explicitly this equation? Can we nd the mean and variance of R
t
?
We shall start by nding the mean E[R
t
]. The equation can be written as
dR
t
(
2
)R
t
dt = f(t)dt R
t
dW
t
.
Multiplying by e
(
2
)t
yields the exact equation
d
_
e
(
2
)t
R
t
_
= e
(
2
)t
f(t)dt e
(
2
)t
R
t
dW
t
.
Integrating yields
e
(
2
)t
R
t
=
_
t
0
e
(
2
)u
f(u) du
_
t
0
e
(
2
)u
R
u
dW
u
.
The rst integral is deterministic while the second is an Ito integral. Using that the
expectations of Ito integrals vanish, we get
E[e
(
2
)t
R
t
] =
_
t
0
e
(
2
)u
f(u) du
and hence
E[R
t
] = e
(
2
)t
_
t
0
e
(
2
)u
f(u) du.
Exercise 16.4.2 Find E[R
2
t
] and V ar[R
t
].
Equation (16.4.9) is a linear equation of the type discussed in section 7.8. Multi-
plying by the integrating factor

t
= e
Wt+
1
2

2
t
the equation is transformed into an exact equation
d(
t
R
t
) =
_

t
f(t) + (
2
)
t
R
t
_
dt.
Substituting Y
t
=
t
R
t
yields
dY
t
=
_

t
f(t) + (
2
)Y
t
_
dt,
which can be written as
dY
t
(
2
)Y
t
dt =
t
f(t)dt.
Multiplying by e
(
2
)t
yields the exact equation
d(e
(
2
)t
Y
t
) = e
(
2
)t

t
f(t)dt,
309
which can be solved by integration
e
(
2
)t
Y
t
=
_
t
0
e
(
2
)u

u
f(u) du.
Going back to the variable R
t
= Y
t
/
t
, we obtain the following closed form expression
R
t
=
_
t
0
e
(
1
2

2
)(ut)+(WuWt)
f(u) du. (16.4.10)
Exercise 16.4.3 Find E[R
t
] by taking the expectation in formula (16.4.10).
It is worth noting that we can arrive at formula (16.4.10) directly, without going
through solving a stochastic dierential equation. We shall show this procedure in the
following.
Using the well-known formulas for the stock price
S
u
= S
0
e
(
1
2

2
)u+Wu
, S
t
= S
0
e
(
1
2

2
)t+Wt
,
and dividing, yields
S
u
S
t
= e
(
1
2

2
)(ut)+(WuWt)
.
Then we get
R
t
=
I
t
S
t
=
1
S
t
_
t
0
S
u
f(u) du
=
_
t
0
S
u
S
t
f(u) du =
_
t
0
e
(
1
2

2
)(ut)+(WuWt)
f(u) du,
which is formula (16.4.10).
Exercise 16.4.4 Find an explicit formula for R
t
in terms of the integrated Brownian
motion Z
()
t
=
_
t
0
e
Wu
du, in the case of an exponential weight with k =
1
2

2
, see
Example 16.1.1(c).
Exercise 16.4.5 (a) Find the price of a derivative G which satises
G
t
+
1
2

2
R
2

2
G
R
2
+ (1 rR)
G
R
= 0
with the payo G(T, R
T
) = R
2
T
.
(b) Find the value of an Asian derivative V
t
on the arithmetic average, that has the
payo
V
T
= V (T, S
T
, I
T
) =
I
2
T
S
T
,
where I
T
=
_
T
0
S
t
dt.
Exercise 16.4.6 Use a computer simulation to nd the value of an Asian arithmetic
average strike option with r = 4%, = 50%, S
0
= $40, and T = 0.5 years.
310
16.5 Asian Forward Contracts on Weighted Averages
Since the payo of this derivative is given by
V
T
= S
T
S
ave
(T) = S
T
_
1
R
T
g(T)
_
,
the reduction variable method suggests considering a solution of the type V (t, S
t
, I
t
) =
S
t
G(t, R
t
), where G(t, T) satises equation (16.3.4) with the nal condition G(T, R
T
) =
1
R
T
g(T)
. Since this is linear in R
T
, this implies looking for a solution G(t, R
t
) in the
following form
G(t, R
t
) = a(t)R
t
+b(t), (16.5.11)
with functions a(t) and b(t) subject to be determined. Substituting into (16.3.4) and
collecting R
t
yields
(a

(t) ra(t))R
t
+b

(t) +f(t)a(t) = 0.
Since this polynomial in R
t
vanishes for all values of R
t
, then its coecients are iden-
tically zero, so
a

(t) ra(t) = 0, b

(t) +f(t)a(t) = 0.
When t = T we have
G(T, R
T
) = a(T)R
T
+b(T) = 1
R
T
g(T)
.
Equating the coecients of R
T
yields the nal conditions
a(T) =
1
g(T)
, b(T) = 1.
The coecient a(t) satises the ordinary dierential equation
a

(t) = ra(t)
a(T) =
1
g(T)
which has the solution
a(t) =
1
g(T)
e
r(Tt)
.
The coecient b(t) satises the equation
b

(t) = f(t)a(t)
b(T) = 1
311
with the solution
b(t) = 1 +
_
T
t
f(u)a(u) du.
Substituting in (16.5.11) yields
G(t, R) =
1
g(T)
e
r(Tt)
R
t
+ 1 +
_
T
t
f(u)a(u) du
= 1
1
g(T)
_
R
t
e
r(Tt)
+
_
T
t
f(u)e
r(Tu)
du

.
Then going back into the variable I
t
= S
t
R
t
yields
V (t, S
t
, I
t
) = S
t
G(t, R
t
)
= S
t

1
g(T)
_
I
t
e
r(Tt)
+S
t
_
T
t
f(u)e
r(Tu)
du

.
Using that (u) =
f(u)
g(T)
and going back to the initial variable S
ave
(t) = I
t
/g(t) yields
F(t, S
t
, S
ave
(t)) = V (t, S
t
, I
t
)
= S
t

g(t)
g(T)
S
ave
(t)e
r(Tt)
S
t
_
T
t
(u)e
r(Tu)
du.
We have arrived at the following result:
Proposition 16.5.1 The value at time t of an Asian forward contract on a weighted
average with the weight function (t), i.e. an Asian derivative with the payo F
T
=
S
T
S
ave
(T), is given by
F(t, S
t
, S
ave
(t)) = S
t
_
1
_
T
t
(u)e
r(Tu)
du
_

g(t)
g(T)
e
r(Tt)
S
ave
(t).
It is worth noting that the previous price can be written as a linear combination of
S
t
and S
ave
(t)
F(t, S
t
, S
ave
(t)) = (t)S
t
+(t)S
ave
(t),
where
(t) = 1
_
T
t
(u)e
r(Tu)
du
(t) =
g(t)
g(T)
e
r(Tt)
=
_
t
0
f(u) du
_
T
0
f(u) du
e
r(Tt)
.
In the rst formula (u)e
r(Tu)
is the discounted weight at time u, and (t) is 1 minus
the total discounted weight between t and T. One can easily check that (T) = 1 and
(T) = 1.
312
Exercise 16.5.2 Find the value at time t of an Asian forward contract on an arith-
metic average A
t
=
_
t
0
S
u
du.
Exercise 16.5.3 (a) Find the value at time t of an Asian forward contract on an
exponential weighted average with the weight given by Example 16.1.1 (c).
(b) What happens if k = r? Why?
Exercise 16.5.4 Find the value at time t of an Asian power contract with the payo
F
T
=
_ _
T
0
S
u
du
_
n
.
Chapter 17
American Options
American options are options that are allowed to be execised at any time before ma-
turity. Because of this advantage, they tend to be more expensive than the European
counterparts. Exact pricing formulas exist just for perpetuities, while they are missing
for nitely lived American options.
17.1 Perpetual American Options
A perpetual American option is an American option that never expires. These contracts
can be exercised at any time t, 0 t . Even if nding the optimal exercise time for
nite maturity American options is a delicate matter, in the case of perpetual American
calls and puts there is always possible to nd the optimal exercise time and to derive
a close form pricing formula (see Merton, [12]).
17.1.1 Present Value of Barriers
Our goal in this section will be to learn how to compute the present value of a contract
that pays a xed cash amount at a stochastic time dened by the rst passage of time
of a stock. These formulas will be the main ingredient in pricing perpetual American
options over the next couple of sections.
Reaching the barrier from below Let S
t
denote the stock price with initial
value S
0
and consider a positive number b such that b > S
0
. We recall the rst passage
of time
b
when the stock S
t
hits for the rst time the barrier b, see Fig. 17.1

b
= inft > 0; S
t
= b.
Consider a contract that pays $1 at the time when the stock reaches the barrier b for
the rst time. Under the constant interest rate assumption, the value of the contract
at time t = 0 is obtained by discounting the value of $1 at the rate r for the period
b
and taking the expectation in the risk neutral world
f
0
=

E[e
r
b
].
313
314

b
b
S
t
500 1000 1500 2000 2500 3000 3500
10
15
20
25
30
Figure 17.1: The rst passage of time when S
t
hits the level b from below.
In the following we shall compute the right side of the previous expression using
two dierent approaches. Using that the stock price in the risk-neutral world is given
by the expression S
t
= S
0
e
(r

2
2
)t+Wt
, with r >

2
2
, then
M
t
= e
rt
S
t
= S
0
e
Wt

2
2
t
, t 0
is a martingale. Applying the Optional Stopping Theorem (Theorem 3.2.1) yields
E[M

b
] = E[M
0
], which is equivalent to
E[e
r
b
S

b
] = S
0
.
Since S

b
= b, the previous relation implies
E[e
r
b
] =
S
0
b
,
where the expectation is taken in the risk-neutral world. Hence, we arrived at the
following result:
Proposition 17.1.1 The value at time t = 0 of $1 received at the time when the stock
reaches level b from below is
f
0
=
S
0
b
.
Exercise 17.1.2 In the previous proof we had applied the Optional Stopping Theorem
(Theorem 3.2.1). Show that the hypothesis of the theorem are satised.
Exercise 17.1.3 Let 0 < S
0
< b and assume r >

2
2
.
(a) Show that P(S
t
reaches b) = 1. Compare with Exercise 3.5.2.
(b) Prove the identity P(;
b
() < ) = 1.
315
The result of Proposition 17.1.1 can be also obtained directly as a consequence of
Proposition 3.6.4. Using the expression of the stock price in the risk-neutral world,
S
t
= S
0
e
(r

2
2
)t+Wt
, we have

b
= inft > 0; S
t
= b = inft > 0; (r

2
2
)t +W
t
= ln
b
S
0

= inft > 0; t +W
t
= x,
where x = ln
b
S
0
> 0 and = r

2
2
> 0. Choosing s = r in Proposition 3.6.4 yields
E[e
r
b
] = e
1

2
(

2s
2
+
2
)x
= e
1

2
_
r

2
2

2r
2
+(r

2
2
)
2
_
ln
b
S
0
= e
ln
b
S
0
= e
ln
S
0
b
=
S
0
b
,
where we used that
r

2
2

_
2r
2
+
_
r

2
2
_
2
= r

2
2

_
_
r +

2
2
_
2
= r

2
2
r

2
2
=
2
.
Exercise 17.1.4 Let S
0
< b. Find the probability density function of the hitting time

b
.
Exercise 17.1.5 Assume the stock pays continuous dividends at the constant rate >
0 and let b > 0 such that S
0
< b.
(a) Prove that the value at time t = 0 of $1 received at the time when the stock reaches
level b from below is
f
0
=
_
S
0
b
_
h
1
,
where
h
1
=
1
2

r

2
+
_
_
r

2

1
2
_
2
+
2r

2
.
(b) Show that h
1
(0) = 1 and h
1
() is an increasing function for > 0.
(c) Find the limit of the rate of change lim

1
().
(d) Work out a formula for the sensitivity of the value f
0
with respect to the dividend
rate and compute the long run value of this rate.
Reaching the barrier from above Sometimes a stock can reach a barrier b from
above. Let S
0
be the initial value of the stock S
t
and assume the inequality b < S
0
.
Consider again the rst passage of time
b
= inft > 0; S
t
= b, see Fig. 17.2.
In this paragraph we compute the value of a contract that pays $1 at the time when
the stock reaches the barrier b for the rst time, which is given by f
0
=

E[e
r
b
]. We
shall keep the assumption that the interest rate r is constant and r >

2
2
.
316
b

b
S
t
500 1000 1500 2000 2500 3000 3500
4
6
8
10
12
Figure 17.2: The rst passage of time when S
t
hits the level b from above.
Proposition 17.1.6 The value at time t = 0 of $1 received at the time when the stock
reaches level b from above is
f
0
=
_
S
0
b
_2r

2
.
Proof: The reader might be tempted to use the Optional Stopping Theorem, but we
refrain from applying it in this case (Why?) We should rather use a technique which
reduces the problem to Proposition 3.6.6. Following this idea, we write

b
= inft > 0; S
t
= b = inft > 0;
_
r

2
2
_
t +W
t
= ln
b
S
0

= inft > 0; t +W
t
= x,
where x = ln
S
0
b
> 0, = r

2
2
. Choosing s = r in Proposition 3.6.6 yields
E[e
r
b
] = e
1

2
(+

2r
2
+
2
)x
= e
1

2
_
r

2
2
+

2r
2
+(r

2
2
)
2
_
x
= e

2r

2
x
= e

2r

2
ln
S
0
b
=
_
S
0
b
_

2r

2
.
In the previous computation we used that
r

2
2
+
_
2r
2
+
_
r

2
2
_
2
= r

2
2
+
_
_
r +

2
2
_
2
= r

2
2
+r +

2
2
= 2r.
Exercise 17.1.7 Assume the stock pays continuous dividends at the constant rate >
0 and let b > 0 such that b < S
0
.
317
(a) Use a similar method as in the proof of Proposition 17.1.6 to prove that the value
at time t = 0 of $1 received at the time when the stock reaches level b from above is
f
0
=
_
S
0
b
_
h
2
,
where
h
2
=
1
2

r

2

_
_
r

2

1
2
_
2
+
2r

2
.
(b) What is the value of the contract at any time t, with 0 t <
b
?
Perpetual American options have simple exact pricing formulas. This is because of
the time invariance property of their values. Since the time to expiration for these
type of options is the same (i.e innity), the option exercise problem looks the same
at every instance of time. Consequently, their value do not depend on the time to
expiration.
17.1.2 Perpetual American Calls
A perpetual American call is a call option that never expires, i.e. is a contract that
gives the holder the right to buy the stock for the price K at any instance of time
0 t +. The innity is included to cover the case when the option is never
exercised.
When the call is exercised the holder receives S

K, where denotes the exercise


time. Assume the holder has the strategy to exercise the call whenever the stock S
t
reaches the barrier b, with b > K subject to be determined later. Then at exercise time

b
the payo is b K, where

b
= inft > 0; S
t
= b.
We note that it makes sense to choose the barrier such that S
0
< b. The value of
this amount at time t = 0 is obtained discounting at the interest rate r and using
Proposition 17.1.1
f(b) = E[(b K)e
r
b
] = (b K)
S
0
b
=
_
1
K
b
_
S
0
.
We need to choose the value of the barrier b > 0 for which f(b) reaches its maximum.
Since 1
K
b
is an increasing function of b, the optimum value can be evaluated as
max
b>0
f(b) = max
b>0
_
1
K
b
_
S
0
= lim
b
_
1
K
b
_
S
0
= S
0
.
This is reached for the optimal barrier b

= , which corresponds to the innite


exercise time
b
= . Hence, it is never optimal to exercise a perpetual call option on
a nondividend paying stock.
The next exercise covers the case of the dividend paying stock. The method is
similar with the one described previously.
318
Exercise 17.1.8 Consider a stock that pays continuous dividends at rate > 0.
(a) Assume a perpetual call is exercised whenever the stock reaches the barrier b from
below. Show that the discounted value at time t = 0 is
f(b) = (b K)
_
S
0
h
_
h
1
,
where
h
1
=
1
2

r

2
+
_
_
r

2

1
2
_
2
+
2r

2
.
(b) Use dierentiation to show that the maximum value of f(b) is realized for
b

= K
h
1
h
1
1
.
(c) Prove the price of perpetual call
f(b

) = max
b>0
f(b) =
K
h
1
1
_
h
1
1
h
1
S
0
K
_
h
1
.
(d) Let
b
be the exercise time of the perpetual call. When do you expect to exercise
the call? (Find E[
b
]).
17.1.3 Perpetual American Puts
A perpetual American put is a put option that never expires, i.e. is a contract that gives
the holder the right to sell the stock for the price K at any instance of time 0 t < .
Assume the put is exercised when S
t
reaches the barrier b. Then its payo, KS
t
=
K b, has a couple of noteworthy features. First, if we choose b too large, we loose
option value, which eventually vanishes for b K. Second, if we pick b too small,
the chances that the stock S
t
will hit b are also small (see Exercise 17.1.9), fact that
diminishes the put value. It follows that the optimum exercise barrier, b

, is somewhere
in between the two previous extreme values.
Exercise 17.1.9 Let 0 < b < S
0
and let t > 0 xed.
(a) Show that the following inequality holds in the risk neutral world
P(S
t
< b) e

1
2
2
t
[ln(S
0
/b)+(r
2
/2)t]
2
.
(b) Use the Squeeze Theorem to show that lim
b0
+
P(S
t
< b) = 0.
Since a put is an insurance that gets exercised when the the stock declines, it makes
sense to assume that at the exercise time,
b
, the stock reaches the barrier b from above,
319
i.e 0 < b < S
0
. Using Proposition 17.1.6 we obtain the value of the contract that pays
K b at time
b
f(b) = E[(K b)e
r
b
] = (K b)E[e
r
b
] = (K b)
_
S
0
b
_

2r

2
.
We need to pick the optimal value b

for which f(b) is maximum


f(b

) = max
0<b<S
0
f(b).
It is useful to notice that the functions f(b) and g(b) = (Kb)b
2r

2
reach the maximum
for the same value of b. For the same of simplicity, denote =
2r

2
. Then
g(b) = Kb

b
+1
g

(b) = Kb
1
( + 1)b

= b
1
[K ( + 1)b],
and the equation g

(b) = 0 has the solution b

=

+1
K. Since g

(b) > 0 for b < b

and
g

(b) < 0 for b > b

, it follows that b

is a maximum point for the function g(b) and


hence for the function f(b). Substituting for the value of the optimal value of the
barrier becomes
b

=
2r/
2
2r/
2
+ 1
K =
K
1 +

2
2r
. (17.1.1)
The condition b

< K is obviously satised, while the condition b

< S
0
is equivalent
with
K <
_
1 +

2
2r
_
S
0
.
The value of the perpetual put is obtained computing the value at b

f(b

) = max f(b) = (K b

)
_
S
0
b

2r

2
=
_
K
K
1 +

2
2r
__
S
0
K
_
1 +

2
2r
__

2r

2
=
K
1 +
2r

2
_
S
0
K
_
1 +

2
2r
__

2r

2
.
Hence the price of a perpetual put is
K
1 +
2r

2
_
S
0
K
_
1 +

2
2r
__

2r

2
.
The optimal exercise time of the put,
b
, is when the stock hits the optimal barrier

b
= inft > 0; S
t
= b

= inft > 0; S
t
= K/
_
1 +

2
2r
_
.
320
But when is the expected exercise time of a perpetual American put? To answer this
question we need to compute E[
b
]. Substituting
=
b
, x = ln
S
0
b
, = r

2
2
(17.1.2)
in Proposition 3.6.6 (c) yields
E[
b
] =
x

2x

2
=
ln
S
0
b

r

2
2
e

2
(r

2
2
) ln
S
0
b

= ln
__
S
0
b

_
1
r

2
2
_
e
(1
2r

2
) ln
S
0
b

= ln
__
S
0
b

_
1
r

2
2
__
S
0
b

_
1
2r

2
.
Hence the expected time when the holder should exercise the put is given by the exact
formula
E[
b
] = ln
__
S
0
b

_
1
r

2
2
__
S
0
b

_
1
2r

2
,
with b

given by (17.1.1). The probability density function of the optimal exercise time

b
can be found from Proposition 3.6.6 (b) using substitutions (17.1.2)
p() =
ln
S
0
b

2
3/2
e

_
ln
S
0
b

+(r

2
2
)
_
2
2
2
, > 0. (17.1.3)
Exercise 17.1.10 Consider a stock that pays continuous dividends at rate > 0.
(a) Assume a perpetual put is exercised whenever the stock reaches the barrier b from
above. Show that the discounted value at time t = 0 is
g(b) = (K b)
_
S
0
h
_
h
2
,
where
h
2
=
1
2

r

2

_
_
r

2

1
2
_
2
+
2r

2
.
(b) Use dierentiation to show that the maximum value of g(b) is realized for
b

= K
h
2
h
2
1
.
(c) Prove the price of perpetual put
g(b

) = max
b>0
g(b) =
K
1 h
2
_
h
2
1
h
2
S
0
K
_
h
2
.
321
e
5 10 15 20 25 30
0.3
0.2
0.1
0.1
0.2
0.3
Figure 17.3: The graph of the function g(b) =
ln b
b
, b > 0.
17.2 Perpetual American Log Contract
A perpetual American log contract is a contract that never expires and can be exercised
at any time, providing the holder the log of the value of the stock, ln S
t
, at the exercise
time t. It is interesting to note that these type of contracts are always optimal to be
exercised, and their pricing formula is fairly uncomplicated.
Assume the contract is exercised when the stock S
t
reaches the barrier b, with
S
0
< b. If the hitting time of the barrier b is
b
, then its payo is ln S

= ln b.
Discounting at the risk free interest rate, the value of the contract at time t = 0 is
f(b) = E[e
r
b
ln S

] = E[e
r
b
ln b] =
ln b
b
S
0
,
since the barrier is assumed to be reached from below, see Proposition 17.1.1.
The function g(b) =
ln b
b
, b > 0, has the derivative g

(b) =
1 ln b
b
2
, so b

= e is a
global maximum point, see Fig. 17.3. The maximum value is g(b

) = 1/e. Then the


optimal value of the barrier is b

= e, and the price of the contract at t = 0 is


f
0
= max
b>0
f(b) = S
0
max
b>0
g(b) =
S
0
e
.
In order for the stock to reach the optimum barrier b

from below we need to require


the condition S
0
< e. Hence we arrived at the following result:
Proposition 17.2.1 Let S
0
< e. Then the optimal exercise price of a perpetual Amer-
ican log contract is
= inft > 0; S
t
= e,
and its value at t = 0 is
S
0
e
.
Remark 17.2.2 If S
0
> e, then it is optimal to exercise the perpetual log contract as
soon as possible.
322
Exercise 17.2.3 Consider a stock that pays continuous dividends at a rate > 0, and
assume that S
0
< e
1/h
1
, where
h
1
=
1
2

r

2
+
_
_
r

2

1
2
_
2
+
2r

2
.
(a) Assume a perpetual log contract is exercised whenever the stock reaches the barrier
b from below. Show that the discounted value at time t = 0 is
f(b) = ln b
_
S
0
b
_
h
1
.
(b) Use dierentiation to show that the maximum value of f(b) is realized for
b

= e
1/h
1
.
(c) Prove the price of perpetual log contract
f(b

) = max
b>0
f(b) =
S
h
1
0
h
1
e
.
(d) Show that the higher the dividend rate , the lower the optimal exercise time is.
17.3 Perpetual American Power Contract
A perpetual American power contract is a contract that never expires and can be exer-
cised at any time, providing the holder the n-th power of the value of the stock, (S
t
)

,
at the exercise time t, where ,= 0. (If = 0 the payo is a constant, which is equal
to 1).
1. Case > 0. Since we expect the value of the payo to increase over time, we
assume the contract is exercised when the stock S
t
reaches the barrier b, from below.
If the hitting time of the barrier b is
b
, then its payo is (S

. Discounting at the
risk free interest rate, the value of the contract at time t = 0 is
f(b) = E[e
r
b
(S

] = E[e
r
b
b

] = b

S
0
b
= b
1
S
0
,
where we used Proposition 17.1.1. We shall discuss the following cases:
(i) If > 1, then the optimal barrier is b

= , and hence, it is never optimal to


exercise the contract in this case.
(ii) If 0 < < 1, the function f(b) is decreasing, so its maximum is reached for
b

= S
0
, which corresponds to
b
= 0. Hence is is optimal to exercise the contract as
soon as possible.
(iii) In the case = 1, the value of f(b) is constant, and the contract can be
exercised at any time.
323
2. Case < 0. The payo value, (S
t
)

, is expected to decrease, so we assume the


exercise occurs when the stock reaches the barrier b from above. Discounting to initial
time t = 0 and using Proposition 17.1.6 yields
f(b) = E[e
r
b
(S

] = E[e
r
b
b

] = b

_
S
0
b
_

2r

2
= b
+
2r

2
S

2r

2
0
.
(i) If <
2r

2
, then f(b) is decreasing, so its maximum is reached for b

= S
0
,
which corresponds to
b
= 0. Hence is is optimal to exercise the contract as soon as
possible.
(ii) If
2r

2
< < 0, then the maximum of f(b) occurs for b

= . Hence it is
never optimal to exercise the contract.
Exercise 17.3.1 Consider a stock that pays continuous dividends at a rate > 0, and
assume > h
1
, with h
1
=
1
2

2
+
_
_
r

2

1
2
_
2
+
2r

2
. Show that the perpetual power
contract with payo S

t
is never optimal to be exercised.
Exercise 17.3.2 (Perpetual American power put) Consider an perpetual American-
type contract with the payo (KS
t
)
2
, where K > 0 is the strike price. Find the optimal
exercise time and the contract value at t = 0.
17.4 Finitely Lived American Options
Exact pricing formulas are great when they exist and can be easily implemented. Even
if we cherish all closed form pricing formulas we can get, there is also a time when
exact formulas are not possible and approximations are in order. If we run into a
problem whose solution cannot be found explicitly, it would still be very valuable to
know something about its approximate quantitative behavior. This will be the case of
nitely lived American options.
17.4.1 American Call
The case of Non-dividend Paying Stock
The holder has the right to buy a stock for the price K at any time before or at the
expiration T. The strike price K and expiration T are specied at the beginning of
the contract. The payo at time t is (S
t
K)
+
= maxS
t
K, 0. The price of the
American call at time t = 0 is given by
f
0
= max
0T

E[e
r
(S

K)
+
],
where the maximum is taken over all stopping times less than or equal to T.
324
Theorem 17.4.1 It is not optimal to exercise an American call on a non-dividend pay-
ing stock early. It is optimal to exercise the call at maturity, T, if at all. Consequently,
the price of an American call is equal to the price of the corresponding European call.
Proof: The heuristic idea of the proof is based on the observation that the dierence
S
t
K tends to be larger as time goes on. As a result, there is always hope for a larger
payo and the later we exercise the better. In the following we shall formalize this
idea mathematically using the submartingale property of the stock together with the
Optional Stopping Theorem.
Let X
t
= e
rt
S
t
and f(t) = Ke
rt
. Since X
t
is a martingale in the risk-neutral-
world, see Proposition 14.1.1, and f(t) is an increasing, integrable function, applying
Proposition 2.12.3 (c) it follows that
Y
t
= X
t
+f(t) = e
rt
(S
t
K)
is an T
t
-submartingale, where T
t
is the information set provided by the underlying
Brownian motion W
t
.
Since the hokey-stick function (x) = x
+
= maxx, 0 is convex, then by Proposi-
tion 2.12.3 (b), the process Z
t
= (Y
t
) = e
rt
(S
t
K)
+
is a submartingale. Applying
Doobs stopping theorem (see Theorem 3.2.2) for stopping times and T, with T,
we obtain

E[Z

]

E[Z
T
]. This means

E[e
r
(S

K)
+
]

E[e
rT
(S
T
K)
+
],
i.e. the maximum of the American call price is realized for the optimum exercise time

= T. The maximum value is given by the right side, which denotes the price of an
European call option.
With a slight modication in the proof we can treat the problem of American power
contract.
Proposition 17.4.2 (American power contract) Consider a contract with matu-
rity date T, which pays, when exercised, S
n
t
, where n > 1, and t T. Then it is not
optimal to exercise this contract early.
Proof: Using that M
t
= e
rt
S
t
is a martingale (in the risk neutral world), then
X
t
= M
n
t
is a submartingale. Since Y
t
= e
rt
S
n
t
=
_
e
rt
S
t
_
n
e
(n1)rt
= X
t
e
(n1)rt
,
then for s < t
E[Y
t
[T
s
] = E[X
t
e
(n1)rt
[T
s
] > E[X
t
e
(n1)rs
[T
s
]
= e
(n1)rs
E[X
t
[T
s
] e
(n1)rs
X
s
= Y
s
,
325
so Y
t
is a submartingale. Applying Doobs stopping theorem (see Theorem 3.2.2)
for stopping times and T, with T, we obtain

E[Y

]

E[Y
T
], or equivalently,

E[e
r
S
n

]

E[e
rT
S
n
T
], which implies
max
T

E[e
r
S
n

] =

E[e
rT
S
n
T
].
Then it is optimal to exercise the contract at maturity T.
Exercise 17.4.3 Consider an American future contract with maturity date T and de-
livery price K, i.e. a contract with payo at maturity S
T
K, which can be exercised
at any time t T. Show that it is not optimal to exercise this contract early.
Exercise 17.4.4 Consider an American option contract with maturity date T, and
time-dependent strike price K(t), i.e. a contract with payo at maturity S
T
K(T),
which can be exercised at any time.
(a) Show that it is not optimal to exercise this contract early in the following two cases:
(i) If K(t) is a decreasing function;
(ii) If K(t) is an increasing function with K(t) < e
rt
.
(b) What happens if K(t) is increasing and K(t) > e
rt
?
The case of Dividend Paying Stock
When the stock pays dividends it is optimal to exercise the American call early. An
exact solution of this problem is hard to get explicitly, or might not exist. However,
there are some asymptotic solutions that are valid close to expiration (see Wilmott)
and analytic approximations given by MacMillan, Barone-Adesi and Whaley.
In the following we shall discuss why it is dicult to nd an exact optimal exercise
time for an American call on a dividend paying stock. First, consider two contracts:
1. Consider a contract by which one can acquire a stock, S
t
, at any time before or
at time T. This can be seen as a contract with expiration T, that pays when exercised
the stock price, S
t
. Assume the stock pays dividends at a continuous rate > 0. When
should the contract be exercised in order to maximize its value?
The value of the contract at time t = 0 is max
T
e
r
S

, where the maximum is


taken over all stopping times less than or equal to T. Since the stock price, which
pays dividends at rate , is given by
S
t
= S
0
e
(r

2
2
)t+Wt
,
then M
t
= e
(r)t
S
t
is a martingale (in the risk neutral world). Therefore e
rt
S
t
=
e
t
M
t
. Let X
t
= e
rt
S
t
. Then for 0 < s < t

E[X
t
[T
s
] =

E[e
t
M
t
[T
s
] <

E[e
s
M
t
[T
s
]
= e
s

E[M
t
[T
s
] = e
s
M
s
= e
rs
S
s
= M
s
,
326
so X
t
is a supermartingale (i.e. X
t
is a submartingale). Applying the Optional
Stopping Theorem for the stopping time we obtain

E[X

]

E[X
0
] = S
0
.
Hence it is optimal to exercise the contract at the initial time t = 0. This makes sense,
since in this case we have a longer period of time during which dividends are collected.
2. Consider a contract by which one has to pay the amount of cash K at any time
before or at time T. Given the time value of money
e
rt
K > e
rT
K, t < T,
it is always optimal to defer the payment until time T.
3. Now consider a combination of the previous two contracts. This new contract
pays S
t
K and can be exercised at any time t, with t T. Since it is not clear when
is optimal to exercise this contract, we shall consider two limiting cases. Let

denote
its optimal exercise time.
(i) When K 0
+
, then

0
+
, because we approach the conditions of case 1,
and also assume continuity conditions on the price.
(ii) If K , then the latter the pay day, the better, i.e.

.
The optimal exercise time,

, is somewhere between 0 and T, with the tendency


of moving towards T as K gets large.
17.4.2 American Put
17.4.3 Mac Millan-Barone-Adesi-Whaley Approximation
TO COME
17.4.4 Blacks Approximation
TO COME
17.4.5 Roll-Geske-Whaley Approximation
TO COME
17.4.6 Other Approximations
Chapter 18
Hints and Solutions
Chapter 1
1.6.1 Let X N(,
2
). Then the distribution function of Y is
F
Y
(y) = P(Y < y) = P(X + < y) = P
_
X <
y

_
=
1

2
_ y

(x)
2
2
2
dx =
1

2
_
y

_
z(+)
_
2
2
2

2
dz
=
1

_
y

(z

)
2
2(

)
2
dz,
with

= +,

= .
1.6.2 (a) Making t = n yields E[Y
n
] = E[e
nX
] = e
n+n
2

2
/2
.
(b) Let n = 1 and n = 2 in (a) to get the rst two moments and then use the formula
of variance.
1.11.5 The tower property
E
_
E[X[(][1

= E[X[1], 1 (
is equivalent with
_
A
E[X[(] dP =
_
X dP, A 1.
Since A (, the previous relation holds by the denition on E[X[(].
1.11.6 (a) Direct application of the denition.
(b) P(A) =
_
A
dP =
_

A
() dP() = E[
A
].
(d) E[
A
X] = E[
A
]E[X] = P(A)E[X].
327
328
(e) We have the sequence of equivalencies
E[X[(] = E[X]
_
A
E[X] dP =
_
A
X dP, A (
E[X]P(A) =
_
A
X dP E[X]P(A) = E[
A
],
which follows from (d).
1.12.4 If = E[X] then
E[(X E[X])
2
] = E[X
2
2X +
2
] = E[X
2
] 2
2
+
2
= E[X
2
] E[X]
2
= V ar[X].
1.12.5 From 1.12.4 we have V ar(X) = 0 X = E[X], i.e. X is a constant.
1.12.6 The same proof as the one in Jensens inequality.
1.12.7 It follows from Jensens inequality or using properties of integrals.
1.12.12 (a) m(t) = E[e
tX
] =

k
e
tk
k
e

k!
= e
(e
t
1)
;
(b) It follows from the rst Cherno bound.
1.12.16 Choose f(x) = x
2k+1
and g(x) = x
2n+1
.
1.13.2 By direct computation we have
E[(X Y )
2
] = E[X
2
] +E[Y
2
] 2E[XY ]
= V ar(X) +E[X]
2
+V ar[Y ] +E[Y ]
2
2E[X]E[Y ]
+2E[X]E[Y ] 2E[XY ]
= V ar(X) +V ar[Y ] + (E[X] E[Y ])
2
2Cov(X, Y ).
1.13.3 (a) Since
E[(X X
n
)
2
] E[X X
n
]
2
= (E[X
n
] E[X])
2
0,
the Squeeze theorem yields lim
n
(E[X
n
] E[X]) = 0.
(b) Writing
X
2
n
X
2
= (X
n
X)
2
2X(X X
n
),
and taking the expectation we get
E[X
2
n
] E[X
2
] = E[(X
n
X)
2
] 2E[X(X X
n
)].
329
The right side tends to zero since
E[(X
n
X)
2
] 0
[E[X(X X
n
)][
_

[X(X X
n
)[ dP

_
_

X
2
dP
_
1/2
_
_

(X X
n
)
2
dP
_
1/2
=
_
E[X
2
]E[(X X
n
)
2
] 0.
(c) It follows from part (b).
(d) Apply Exercise 1.13.2.
1.13.4 Using Jensens inequality we have
E
__
E[X
n
[1] E[X[1]
_
2

= E
__
E[X
n
X[1]
_
2

E
_
E[(X
n
X)
2
[1]

= E[(X
n
X)
2
] 0,
as n 0.
1.15.7 The integrability of X
t
follows from
E[[X
t
[] = E
_
[E[X[T
t
][

E
_
E[[X[ [T
t
]

= E[[X[] < .
X
t
is T
t
-predictable by the denition of the conditional expectation. Using the tower
property yields
E[X
t
[T
s
] = E
_
E[X[T
t
][T
s

= E[X[T
s
] = X
s
, s < t.
1.15.8 Since
E[[Z
t
[] = E[[aX
t
+bY
t
+c[] [a[E[[X
t
[] +[b[E[[Y
t
[] +[c[ <
then Z
t
is integrable. For s < t, using the martingale property of X
t
and Y
t
we have
E[Z
t
[T
s
] = aE[X
t
[T
s
] +bE[Y
t
[T
s
] +c = aX
s
+bY
s
+c = Z
s
.
1.15.9 In general the answer is no. For instance, if X
t
= Y
t
the process X
2
t
is not a
martingale, since the Jensens inequality
E[X
2
t
[T
s
]
_
E[X
t
[T
s
]
_
2
= X
2
s
is not necessarily an identity. For instance B
2
t
is not a martingale, with B
t
the Brownian
motion process.
330
1.15.10 It follows from the identity
E[(X
t
X
s
)(Y
t
Y
s
)[T
s
] = E[X
t
Y
t
X
s
Y
s
[T
s
].
1.15.11 (a) Let Y
n
= S
n
E[S
n
]. We have
Y
n+k
= S
n+k
E[S
n+k
]
= Y
n
+
k

j=1
X
n+j

k

j=1
E[X
n+j
].
Using the properties of expectation we have
E[Y
n+k
[T
n
] = Y
n
+
k

j=1
E[X
n+j
[T
n
]
k

j=1
E[E[X
n+j
][T
n
]
= Y
n
+
k

j=1
E[X
n+j
]
k

j=1
E[X
n+j
]
= Y
n
.
(b) Let Z
n
= S
2
n
V ar(S
n
). The process Z
n
is an T
n
-martingale i
E[Z
n+k
Z
n
[T
n
] = 0.
Let U = S
n+k
S
n
. Using the independence we have
Z
n+k
Z
n
= (S
2
n+k
S
2
n
)
_
V ar(S
n+k
V ar(S
n
)
_
= (S
n
+U)
2
S
2
n

_
V ar(S
n+k
V ar(S
n
)
_
= U
2
+ 2US
n
V ar(U),
so
E[Z
n+k
Z
n
[T
n
] = E[U
2
] + 2S
n
E[U] V ar(U)
= E[U
2
] (E[U
2
] E[U]
2
)
= 0,
since E[U] = 0.
1.15.12 Let T
n
= (X
k
; k n). Using the independence
E[[P
n
[] = E[[X
0
[] E[[X
n
[] < ,
so [P
n
[ integrable. Taking out the predictable part we have
E[P
n+k
[T
n
] = E[P
n
X
n+1
X
n+k
[T
n
] = P
n
E[X
n+1
X
n+k
[T
n
]
= P
n
E[X
n+1
] E[X
n+k
] = P
n
.
331
1.15.13 (a) Since the random variable Y = X is normally distributed with mean
and variance
2

2
, then
E[e
X
] = e
+
1
2

2
.
Hence E[e
X
] = 1 i +
1
2

2
= 0 which has the nonzero solution = 2/
2
.
(b) Since e
X
i
are independent, integrable and satisfy E[e
X
i
] = 1, by Exercise
1.15.12 we get that the product Z
n
= e
Sn
= e
X
1
e
Xn
is a martingale.
Chapter 2
2.1.4 B
t
starts at 0 and is continuous in t. By Proposition 2.1.2 B
t
is a martingale
with E[B
2
t
] = t < . Since B
t
B
s
N(0, [t s[), then E[(B
t
B
s
)
2
] = [t s[.
2.1.10 It is obvious that X
t
= W
t+t
0
W
t
0
satises X
0
= 0 and that X
t
is continuous
in t. The increments are normal distributed X
t
X
s
= W
t+t
0
W
s+t
0
N(0, [t s[).
If 0 < t
1
< < t
n
, then 0 < t
0
< t
1
+ t
0
< < t
n
+ t
0
. The increments
X
t
k+1
X
t
k
= W
t
k+1
+t
0
W
t
k
+t
0
are obviously independent and stationary.
2.1.11 For any > 0, show that the process X
t
=
1

W
t
is a Brownian motion. This
says that the Brownian motion is invariant by scaling. Let s < t. Then X
t
X
s
=
1

(W
t
W
s
)
1

N
_
0, (t s)
_
= N(0, t s). The other properties are obvious.
2.1.12 Apply Property 2.1.8.
2.1.13 Using the moment generating function, we get E[W
3
t
] = 0, E[W
4
t
] = 3t
2
.
2.1.14 (a) Let s < t. Then
E[(W
2
t
t)(W
2
s
s)] = E
_
E[(W
2
t
t)(W
2
s
s)][T
s
_
= E
_
(W
2
s
s)E[(W
2
t
t)][T
s
_
= E
_
(W
2
s
s)
2
_
= E
_
W
4
s
2sW
2
s
+s
2
_
= E[W
4
s
] 2sE[W
2
s
] +s
2
= 3s
2
2s
2
+s
2
= 2s
2
.
(b) Using part (a) we have
2s
2
= E[(W
2
t
t)(W
2
s
s)] = E[W
2
s
W
2
t
] sE[W
2
t
] tE[W
2
s
] +ts
= E[W
2
s
W
2
t
] st.
Therefore E[W
2
s
W
2
t
] = ts + 2s
2
.
(c) Cov(W
2
t
, W
2
s
) = E[W
2
s
W
2
t
] E[W
2
s
]E[W
2
t
] = ts + 2s
2
ts = 2s
2
.
(d) Corr(W
2
t
, W
2
s
) =
2s
2
2ts
=
s
t
, where we used
V ar(W
2
t
) = E[W
4
t
] E[W
2
t
]
2
= 3t
2
t
2
= 2t
2
.
332
2.1.15 (a) The distribution function of Y
t
is given by
F(x) = P(Y
t
x) = P(tW
1/t
x) = P(W
1/t
x/t)
=
_
x/t
0

1/t
(y) dy =
_
x/t
0
_
t/(2)e
ty
2
/2
dy
=
_
x/

t
0
1

2
e
u
2
/2
du.
(b) The probability density of Y
t
is obtained by dierentiating F(x)
p(x) = F

(x) =
d
dx
_
x/

t
0
1

2
e
u
2
/2
du =
1

2
e
x
2
/2
,
and hence Y
t
N(0, t).
(c) Using that Y
t
has independent increments we have
Cov(Y
s
, Y
t
) = E[Y
s
Y
t
] E[Y
s
]E[Y
t
] = E[Y
s
Y
t
]
= E
_
Y
s
(Y
t
Y
s
) +Y
2
s
_
= E[Y
s
]E[Y
t
Y
s
] +E[Y
2
s
]
= 0 +s = s.
(d) Since
Y
t
Y
s
= (t s)(W
1/t
W
0
) s(W
1/s
W
1/t
)
E[Y
t
Y
s
] = (t s)E[W
1/t
] sE[W
1/s
W
1/t
] = 0,
and
V ar(Y
t
Y
s
) = E[(Y
t
Y
s
)
2
] = (t s)
2
1
t
+s
2
(
1
s

1
t
)
=
(t s)
2
+s(t s)
t
= t s.
2.1.16 (a) Applying the denition of expectation we have
E[[W
t
[] =
_

[x[
1

2t
e

x
2
2t
dx =
_

0
2x
1

2t
e

x
2
2t
dx
=
1

2t
_

0
e

y
2t
dy =
_
2t/.
(b) Since E[[W
t
[
2
] = E[W
2
t
] = t, we have
V ar([W
t
[) = E[[W
t
[
2
] E[[W
t
[]
2
= t
2t

= t(1
2

).
2.1.17 By the martingale property of W
2
t
t we have
E[W
2
t
[T
s
] = E[W
2
t
t[T
s
] +t = W
2
s
+t s.
333
2.1.18 (a) Expanding
(W
t
W
s
)
3
= W
3
t
3W
2
t
W
s
+ 3W
t
W
2
s
W
3
s
and taking the expectation
E[(W
t
W
s
)
3
[T
s
] = E[W
3
t
[T
s
] 3W
s
E[W
2
t
] + 3W
2
s
E[W
t
[T
s
] W
3
s
= E[W
3
t
[T
s
] 3(t s)W
s
W
3
s
,
so
E[W
3
t
[T
s
] = 3(t s)W
s
+W
3
s
,
since
E[(W
t
W
s
)
3
[T
s
] = E[(W
t
W
s
)
3
] = E[W
3
ts
] = 0.
(b) Hint: Start from the expansion of (W
t
W
s
)
4
.
2.2.3 Using that e
WtWs
is stationary, we have
E[e
WtWs
] = E[e
W
ts
] = e
1
2
(ts)
.
2.2.4 (a)
E[X
t
[T
s
] = E[e
Wt
[T
s
] = E[e
WtWs
e
Ws
[T
s
]
= e
Ws
E[e
WtWs
[T
s
] = e
Ws
E[e
WtWs
]
= e
Ws
e
t/2
e
s/2
.
(b) This can be written also as
E[e
t/2
e
Wt
[T
s
] = e
s/2
e
Ws
,
which shows that e
t/2
e
Wt
is a martingale.
(c) From the stationarity we have
E[e
cWtcWs
] = E[e
c(WtWs)
] = E[e
cW
ts
] = e
1
2
c
2
(ts)
.
Then for any s < t we have
E[e
cWt
[T
s
] = E[e
c(WtWs)
e
cWs
[T
s
] = e
cWs
E[e
c(WtWs)
[T
s
]
= e
cWs
E[e
c(WtWs)
] = e
cWs
e
1
2
c
2
(ts)
= Y
s
e
1
2
c
2
t
.
Multiplying by e

1
2
c
2
t
yields the desired result.
2.2.5 (a) Using Exercise 2.2.3 we have
Cov(X
s
, X
t
) = E[X
s
X
t
] E[X
s
]E[X
t
] = E[X
s
X
t
] e
t/2
e
s/2
= E[e
Ws+Wt
] e
t/2
e
s/2
= E[e
WtWs
e
2(WsW
0
)
] e
t/2
e
s/2
= E[e
WtWs
]E[e
2(WsW
0
)
] e
t/2
e
s/2
= e
ts
2
e
2s
e
t/2
e
s/2
= e
t+3s
2
e
(t+s)/2
.
334
(b) Using Exercise 2.2.4 (b), we have
E[X
s
X
t
] = E
_
E[X
s
X
t
[T
s
]
_
= E
_
X
s
E[X
t
[T
s
]
_
= e
t/2
E
_
X
s
E[e
t/2
X
t
[T
s
]
_
= e
t/2
E
_
X
s
e
s/2
X
s
]
_
= e
(ts)/2
E[X
2
s
] = e
(ts)/2
E[e
2Ws
]
= e
(ts)/2
e
2s
= e
t+3s
2
,
and continue like in part (a).
2.2.6 Using the denition of the expectation we have
E[e
2W
2
t
] =
_
e
2x
2

t
(x) dx =
1

2t
_
e
2x
2
e

x
2
2t
dx
=
1

2t
_
e

14t
2t
x
2
dx =
1

1 4t
,
if 1 4t > 0. Otherwise, the integral is innite. We used the standard integral
_
e
ax
2
=
_
/a, a > 0.
2.3.4 It follows from the fact that Z
t
is normally distributed.
2.3.5 Using the denition of covariance we have
Cov
_
Z
s
, Z
t
_
= E[Z
s
Z
t
] E[Z
s
]E[Z
t
] = E[Z
s
Z
t
]
= E
_
_
s
0
W
u
du
_
t
0
W
v
dv
_
= E
_
_
s
0
_
t
0
W
u
W
v
dudv
_
=
_
s
0
_
t
0
E[W
u
W
v
] dudv =
_
s
0
_
t
0
minu, v dudv
= s
2
_
t
2

s
6
_
, s < t.
2.3.6 (a) Using Exercise 2.3.5
Cov(Z
t
, Z
t
Z
th
) = Cov(Z
t
, Z
t
) Cov(Z
t
, Z
th
)
=
t
3
3
(t h)
2
(
t
2

t h
6
)
=
1
2
t
2
h +o(h).
(b) Using Z
t
Z
th
=
_
t
th
W
u
du = hW
t
+o(h),
Cov(Z
t
, W
t
) =
1
h
Cov(Z
t
, Z
t
Z
th
)
=
1
h
_
1
2
t
2
h +o(h)
_
=
1
2
t
2
.
335
2.3.7 Let s < u. Since W
t
has independent increments, taking the expectation in
e
Ws+Wt
= e
WtWs
e
2(WsW
0
)
we obtain
E[e
Ws+Wt
] = E[e
WtWs
]E[e
2(WsW
0
)
] = e
us
2
e
2s
= e
u+s
2
e
s
= e
u+s
2
e
min{s,t}
.
2.3.8 (a) E[X
t
] =
_
t
0
E[e
Ws
] ds =
_
t
0
E[e
s/2
] ds = 2(e
t/2
1)
(b) Since V ar(X
t
) = E[X
2
t
] E[X
t
]
2
, it suces to compute E[X
2
t
]. Using Exercise
2.3.7 we have
E[X
2
t
] = E
_
_
t
0
e
Wt
ds
_
t
0
e
Wu
du
_
= E
_
_
t
0
_
t
0
e
Ws
e
Wu
dsdu
_
=
_
t
0
_
t
0
E[e
Ws+Wu
] dsdu =
_
t
0
_
t
0
e
u+s
2
e
min{s,t}
dsdu
=
__
D
1
e
u+s
2
e
s
duds +
_ _
D
2
e
u+s
2
e
u
duds
= 2
__
D
2
e
u+s
2
e
u
duds =
4
3
_
1
2
e
2t
2e
t/2
+
3
2
_
,
where D
1
0 s < u t and D
2
0 u < s t. In the last identity we applied
Fubinis theorem. For the variance use the formula V ar(X
t
) = E[X
2
t
] E[X
t
]
2
.
2.3.9 (a) Splitting the integral at t and taking out the predictable part, we have
E[Z
T
[T
t
] = E[
_
T
0
W
u
du[T
t
] = E[
_
t
0
W
u
du[T
t
] +E[
_
T
t
W
u
du[T
t
]
= Z
t
+E[
_
T
t
W
u
du[T
t
]
= Z
t
+E[
_
T
t
(W
u
W
t
+W
t
) du[T
t
]
= Z
t
+E[
_
T
t
(W
u
W
t
) du[T
t
] +W
t
(T t)
= Z
t
+E[
_
T
t
(W
u
W
t
) du] +W
t
(T t)
= Z
t
+
_
T
t
E[W
u
W
t
] du +W
t
(T t)
= Z
t
+W
t
(T t),
336
since E[W
u
W
t
] = 0.
(b) Let 0 < t < T. Using (a) we have
E[Z
T
TW
T
[T
t
] = E[Z
T
[T
t
] TE[W
T
[T
t
]
= Z
t
+W
t
(T t) TW
t
= Z
t
tW
t
.
2.4.1
E[V
T
[T
t
] = E
_
e

t
0
Wu du+

T
t
Wu du
[T
t
_
= e

t
0
Wu du
E
_
e

T
t
Wu du
[T
t
_
= e

t
0
Wu du
E
_
e

T
t
(WuWt) du+(Tt)Wt
[T
t
_
= V
t
e
(Tt)Wt
E
_
e

T
t
(WuWt) du
[T
t
_
= V
t
e
(Tt)Wt
E
_
e

T
t
(WuWt) du
_
= V
t
e
(Tt)Wt
E
_
e

Tt
0
W d
_
= V
t
e
(Tt)Wt
e
1
2
(Tt)
3
3
.
2.6.1
F(x) = P(Y
t
x) = P(t +W
t
x) = P(W
t
x t)
=
_
xt
0
1

2t
e

u
2
2t
du;
f(x) = F

(x) =
1

2t
e

(xt)
2
2t
.
2.7.2 Since
P(R
t
) =
_

0
1
t
xe

x
2
2t
dx,
use the inequality
1
x
2
2t
< e

x
2
2t
< 1
to get the desired result.
2.7.3
E[R
t
] =
_

0
xp
t
(x) dx =
_

0
1
t
x
2
e

x
2
2t
dx
=
1
2t
_

0
y
1/2
e

y
2t
dy =

2t
_

0
z
3
2
1
e
z
dz
=

2t
_
3
2
_
=

2t
2
.
337
Since E[R
2
t
] = E[W
1
(t)
2
+W
2
(t)
2
] = 2t, then
V ar(R
t
) = 2t
2t
4
= 2t
_
1

4
_
.
2.7.4
E[X
t
] =
E[X
t
]
t
=

2t
2t
=
_

2t
0, t ;
V ar(X
t
) =
1
t
2
V ar(R
t
) =
2
t
(1

4
) 0, t .
By Example 1.13.1 we get X
t
0, t in mean square.
2.8.2
P(N
t
N
s
= 1) = (t s)e
(ts)
= (t s)
_
1 (t s) +o(t s)
_
= (t s) +o(t s).
P(N
t
N
s
1) = 1 P(N
t
N
s
= 0) +P(N
t
N
s
= 1)
= 1 e
(ts)
(t s)e
(ts)
= 1
_
1 (t s) +o(t s)
_
(t s)
_
1 (t s) +o(t s)
_
=
2
(t s)
2
= o(t s).
2.8.6 Write rst as
N
2
t
= N
t
(N
t
N
s
) +N
t
N
s
= (N
t
N
s
)
2
+N
s
(N
t
N
s
) + (N
t
t)N
s
+tN
s
,
then
E[N
2
t
[T
s
] = E[(N
t
N
s
)
2
[T
s
] +N
s
E[N
t
N
s
[T
s
] +E[N
t
t[T
s
]N
s
+tN
s
= E[(N
t
N
s
)
2
] +N
s
E[N
t
N
s
] +E[N
t
t]N
s
+tN
s
= (t s) +
2
(t s)
2
+tN
s
+N
2
s
sN
s
+tN
s
= (t s) +
2
(t s)
2
+ 2(t s)N
s
+N
2
s
= (t s) + [N
s
+(t s)]
2
.
Hence E[N
2
t
[T
s
] ,= N
2
s
and hence the process N
2
t
is not an T
s
-martingale.
338
2.8.7 (a)
m
Nt
(x) = E[e
xNt
] =

k0
e
xh
P(N
t
= k)
=

k0
e
xk
e
t

k
t
k
k!
= e
t
e
te
x
= e
t(e
x
1)
.
(b) E[N
2
t
] = m

Nt
(0) =
2
t
2
+t. Similarly for the other relations.
2.8.8 E[X
t
] = E[e
Nt
] = m
Nt
(1) = e
t(e1)
.
2.8.9 (a) Since e
xMt
= e
x(Ntt)
= e
tx
e
xNt
, the moment generating function is
m
Mt
(x) = E[e
xMt
] = e
tx
E[e
xNt
]
= e
tx
e
t(e
x
1)
= e
t(e
x
x1)
.
(b) For instance
E[M
3
t
] = m

Mt
(0) = t.
Since M
t
is a stationary process, E[(M
t
M
s
)
3
] = (t s).
2.8.10
V ar[(M
t
M
s
)
2
] = E[(M
t
M
s
)
4
] E[(M
t
M
s
)
2
]
2
= (t s) + 3
2
(t s)
2

2
(t s)
2
= (t s) + 2
2
(t s)
2
.
2.11.3 (a) E[U
t
] = E
_
_
t
0
N
u
du
_
=
_
t
0
E[N
u
] du =
_
t
0
udu =
t
2
2
.
(b) E
_
Nt

k=1
S
k

= E[tN
t
U
t
] = tt
t
2
2
=
t
2
2
.
2.11.5 The proof cannot go through because a product between a constant and a
Poisson process is not a Poisson process.
2.11.6 Let p
X
(x) be the probability density of X. If p(x, y) is the joint probability
density of X and Y , then p
X
(x) =

y
p(x, y). We have

y0
E[X[Y = y]P(Y = y) =

y0
_
xp
X|Y =y
(x[y)P(Y = y) dx
=

y0
_
xp(x, y)
P(Y = y)
P(Y = y) dx =
_
x

y0
p(x, y) dx
=
_
xp
X
(x) dx = E[X].
339
2.11.8 (a) Since T
k
has an exponential distribution with parameter
E[e
T
k
] =
_

0
e
x
e
x
dx =

+
.
(b) We have
U
t
= T
2
+ 2T
3
+ 3T
4
+ + (n 2)T
n1
+ (n 1)T
n
+ (t S
n
)n
= T
2
+ 2T
3
+ 3T
4
+ + (n 2)T
n1
+ (n 1)T
n
+nt n(T
1
+T
2
+ +T
n
)
= nt [nT
1
+ (n 1)T
2
+ +T
n
].
(c) Using that the arrival times S
k
, k = 1, 2, . . . n, have the same distribution as the
order statistics U
(k)
corresponding to n independent random variables uniformly dis-
tributed on the interval (0, t), we get
E
_
e
Ut

N
t
= n
_
= E[e
(tNt

N
t
k=1
S
k
)
[N
t
= n]
= e
nt
E[e

n
i=1
U
(i)
] = e
nt
E[e

n
i=1
U
i
]
= e
nt
E[e
U
1
] E[e
Un
]
= e
nt
1
t
_
t
0
e
x
1
dx
1

1
t
_
t
0
e
xn
dx
n
=
(1 e
t
)
n

n
t
n

(d) Using Exercise 2.11.6 we have
E
_
e
Ut
] =

n0
P(N
t
= n)E
_
e
Ut

N
t
= n
_
=

n0
e

t
n

n
n!
(1 e
t
)
n

n
t
n
= e
(1e
t
)/

3.11.6 (a) dt dN
t
= dt(dM
t
+dt) = dt dM
t
+dt
2
= 0
(b) dW
t
dN
t
= dW
t
(dM
t
+dt) = dW
t
dM
t
+dW
t
dt = 0
2.12.6 Use Doobs inequality for the submartingales W
2
t
and [W
t
[, and use that E[W
2
t
] =
t and E[[W
t
[] =
_
2t/, see Exercise 2.1.16 (a).
2.12.7 Divide by t in the inequality from Exercise 2.12.6 part (b).
2.12.10 Let = n and = n + 1. Then
E
_
sup
ntn=1
_
N
t
t

_
2
_

4(n + 1)
n
2

340
The result follows by taking n in the sequence of inequalities
0 E
__
N
t
t

_
2
_
E
_
sup
ntn=1
_
N
t
t

_
2
_

4(n + 1)
n
2

Chapter 3
3.1.2 We have
; () t =
_
, if c t
, if c > t
and use that , T
t
.
3.1.3 First we note that
; () < t =
_
0<s<t
; [W
s
()[ > K. (18.0.1)
This can be shown by double inclusion. Let A
s
= ; [W
s
()[ > K.
Let ; () < t, so infs > 0; [W
s
()[ > K < t. Then exists < u < t
such that [W
u
()[ > K, and hence A
u
.
Let

0<s<t
; [W
s
()[ > K. Then there is 0 < s < t such that [W
s
()[ >
K. This implies () < s and since s < t it follows that () < t.
Since W
t
is continuous, then (18.0.1) can be also written as
; () < t =
_
0<r<t,rQ
; [W
r
()[ > K,
which implies ; () < t T
t
since ; [W
r
()[ > K T
t
, for 0 < r < t. Next we
shall show that P( < ) = 1.
P(; () < ) = P
_
_
0<s
; [W
s
()[ > K
_
> P(; [W
s
()[ > K)
= 1
_
|x|<K
1

2s
e

y
2
2s
dy > 1
2K

2s
1, s .
Hence is a stopping time.
3.1.4 Let K
m
= [a +
1
m
, b
1
m
]. We can write
; t =

m1
_
r<t,rQ
; X
r
/ K
m
T
t
,
since ; X
r
/ K
m
= ; X
r
K
m
T
r
T
t
.
3.1.8 (a) We have ; c t = ; t/c T
t/c
T
t
. And P(c < ) = P( <
) = 1.
341
(b) ; f() t = ; f
1
(t) = T
f
1
(t)
T
t
, since f
1
(t) t. If f is bounded,
then it is obvious that P(f() < ) = 1. If lim
t
f(t) = , then P(f() < ) =
P
_
< f
1
()
_
= P( < ) = 1.
(c) Apply (b) with f(x) = e
x
.
3.1.10 If let G(n) = x; [x a[ <
1
n
, then a =

n1
G(n). Then
n
= inft
0; W
t
G(n) are stopping times. Since sup
n

n
= , then is a stopping time.
3.2.3 The relation is proved by verifying two cases:
(i) If ; > t then ( t)() = t and the relation becomes
M

() = M
t
() +M

() M
t
().
(ii) If ; t then ( t)() = () and the relation is equivalent with the
obvious relation
M

= M

.
3.2.5 Taking the expectation in E[M

[T

] = M

yields E[M

] = E[M

], and then
make = 0.
3.3.9 Since M
t
= W
2
T
t is a martingale with E[M
t
] = 0, by the Optional Stopping
Theorem we get E[M
a
] = E[M
0
] = 0, so E[W
2
a

a
] = 0, from where E[
a
] =
E[W
2
a
] = a
2
, since W
a
= a.
3.3.10 (a)
F(a) = P(X
t
a) = 1 P(X
t
> a) = 1 P( max
0st
W
s
> a)
= 1 P(T
a
t) = 1
2

2
_

|a|/

t
e
y
2
/2
dy
=
2

2
_

0
e
y
2
/2
dy
2

2
_

|a|/

t
e
y
2
/2
dy
=
2

2
_
|a|/

t
0
e
y
2
/2
dy.
(b) The density function is p(a) = F

(a) =
2

2t
e
a
2
/(2t)
, a > 0. Then
E[X
t
] =
_

0
xp(x) dx =
2

2t
_

0
xe
x
2
/(2t)
dy =
_
2t

E[X
2
t
] =
2

2t
_

0
x
2
e
x
2
/(2t)
dx =
4t

_

0
y
2
e
y
2
dy
=
2

2t
1
2
_

0
u
1/2
e
u
du =
1

2t
(3/2) = t
V ar(X
t
) = E[X
2
t
] E[X
t
]
2
= t
_
1
2

_
.
342
3.3.16 It is recurrent since P(t > 0 : a < W
t
< b) = 1.
3.4.4 Since
P(W
t
> 0; t
1
t t
2
) =
1
2
P(W
t
,= 0; t
1
t t
2
) =
1

arcsin
_
t
1
t
2
,
using the independence
P(W
1
t
> 0, W
2
t
) = P(W
1
t
> 0)P(W
2
t
> 0) =
1

2
_
arcsin
_
t
1
t
2
_
2
.
The probability for W
t
= (W
1
t
, W
2
t
) to be in one of the quadrants is
4

2
_
arcsin
_
t
1
t
2
_
2
.
3.5.2 (a) We have
P(X
t
goes up to ) = P(X
t
goes up to before down to ) = lim

e
2
1
e
2
e
2
= 1.
3.5.3
P(X
t
never hits ) = P(X
t
goes up to before down to ) = lim

e
2
1
e
2
e
2
.
3.5.4 (a) Use that E[X
T
] = p

(1 p

); (b) E[X
2
T
] =
2
p

+
2
(1 p

), with
p

=
e
2
1
e
2
e
2
; (c) Use V ar(T) = E[T
2
] E[T]
2
.
3.5.7 Since M
t
= W
2
t
t is a martingale, with E[M
t
] = 0, by the Optional Stopping
Theorem we get E[W
2
T
T] = 0. Using W
T
= X
T
T yields
E[X
2
T
2TX
T
+
2
T
2
] = E[T].
Then
E[T
2
] =
E[T](1 + 2E[X
T
]) E[X
2
T
]

2
.
Substitute E[X
T
] and E[X
2
T
] from Exercise 3.5.4 and E[T] from Proposition 3.5.5.
3.6.11 See the proof of Proposition 3.6.3.
3.6.12
E[T] =
d
ds
E[e
sT
]
|
s=0
=
1

(p

+(1 p

)) =
E[X
T
]

.
3.6.13 (b) Applying the Optional Stopping Theorem
E[e
cM
T
T(e
c
c1)
] = E[X
0
] = 1
E[e
caTf(c)
] = 1
E[e
Tf(c)
] = e
ac
.
343
Let s = f(c), so c = (s). Then E[e
sT
] = e
a(s)
.
(c) Dierentiating and taking s = 0 yields
E[T] = ae
a(0)

(0)
= a
1
f

(0)
= ,
so E[T] = .
(d) The inverse Laplace transform /
1
_
e
a(s)
_
cannot be represented by elementary
functions.
3.8.8 Use that if W
t
is a Brownian motion then also tW
1/t
is a Brownian motion.
3.8.11 Use E[([X
t
[ 0)
2
] = E[[X
t
[
2
] = E[(X
t
0)
2
].
3.8.15 Let a
n
= ln b
n
. Then G
n
= e
ln Gn
= e
a
1
++an
n
e
ln L
= L.
3.9.3 (a) L = 1. (b) A computation shows
E[(X
t
1)
2
] = E[X
2
t
2X
t
+ 1] = E[X
2
t
] 2E[X
t
] + 1
= V ar(X
t
) + (E[X
t
] 1)
2
.
(c) Since E[X
t
] = 1, we have E[(X
t
1)
2
] = V ar(X
t
). Since
V ar(X
t
) = e
t
E[e
Wt
] = e
t
(e
2t
e
t
) = e
t
1,
then E[(X
t
1)
2
] does not tend to 0 as t .
3.11.3 (a) E[(dW
t
)
2
dt
2
] = E[(dW
t
)
2
] dt
2
= 0.
(b) V ar((dW
t
)
2
dt) = E[(dW
2
t
dt)
2
] = E[(dW
t
)
4
2dtdW
t
+dt
2
]
= 3st
2
2dt 0 +dt
2
= 4dt
2
.
Chapter 4
4.2.3 (a) Use either the denition or the moment generation function to show that
E[W
4
t
] = 3t
2
. Using stationarity, E[(W
t
W
s
)
4
] = E[W
4
ts
] = 3(t s)
2
.
4.4.1 (a) E[
_
T
0
dW
t
] = E[W
T
] = 0.
(b) E[
_
T
0
W
t
dW
t
] = E[
1
2
W
2
T

1
2
T] = 0.
(c) V ar(
_
T
0
W
t
dW
t
) = E[(
_
T
0
W
t
dW
t
)
2
] = E[
1
4
W
2
T
+
1
4
T
2

1
2
TW
2
T
] =
T
2
2
.
4.6.2 X N(0,
_
T
1
1
t
dt) = N(0, ln T).
344
4.6.3 Y N(0,
_
T
1
tdt) = N
_
0,
1
2
(T
2
1)
_
4.6.4 Normally distributed with zero mean and variance
_
t
0
e
2(ts)
ds =
1
2
(e
2t
1).
4.6.5 Using the property of Wiener integrals, both integrals have zero mean and vari-
ance
7t
3
3
.
4.6.6 The mean is zero and the variance is t/3 0 as t 0.
4.6.7 Since it is a Wiener integral, X
t
is normally distributed with zero mean and
variance
_
t
0
_
a +
bu
t
_
2
du = (a
2
+
b
2
3
+ab)t.
Hence a
2
+
b
2
3
+ab = 1.
4.6.8 Since both W
t
and
_
t
0
f(s) dW
s
have the mean equal to zero,
Cov
_
W
t
,
_
t
0
f(s) dW
s
_
= E[W
t
,
_
t
0
f(s) dW
s
] = E[
_
t
0
dW
s
_
t
0
f(s) dW
s
]
= E[
_
t
0
f(u) ds] =
_
t
0
f(u) ds.
The general result is
Cov
_
W
t
,
_
t
0
f(s) dW
s
_
=
_
t
0
f(s) ds.
Choosing f(u) = u
n
yields the desired identity.
4.8.6 Apply the expectation to
_
Nt

k=1
f(S
k
)
_
2
=
Nt

k=1
f
2
(S
k
) + 2
Nt

k=j
f(S
k
)f(S
j
).
4.9.1
E
_
_
T
0
e
ks
dN
s
_
=

k
(e
kT
1)
V ar
_
_
T
0
e
ks
dN
s
_
=

2k
(e
2kT
1).
Chapter 5
5.2.1 Let X
t
=
_
t
0
e
Wu
du. Then
dG
t
= d(
X
t
t
) =
tdX
t
X
t
dt
t
2
=
te
Wt
dt X
t
dt
t
2
=
1
t
_
e
Wt
G
t
_
dt.
345
5.3.3
(a) e
Wt
(1 +
1
2
W
t
)dt +e
Wt
(1 +W
t
)dW
t
;
(b) (6W
t
+ 10e
5Wt
)dW
t
+ (3 + 25e
5Wt
)dt;
(c) 2e
t+W
2
t
(1 +W
2
t
)dt + 2e
t+W
2
t
W
t
dW
t
;
(d) n(t +W
t
)
n2
_
(t +W
t
+
n1
2
)dt + (t +W
t
)dW
t
_
;
(e)
1
t
_
W
t

1
t
_
t
0
W
u
du
_
dt;
(f)
1
t

_
e
Wt


t
_
t
0
e
Wu
du
_
dt.
5.3.4
d(tW
2
t
) = td(W
2
t
) +W
2
t
dt = t(2W
t
dW
t
+dt) +W
2
t
dt = (t +W
2
t
)dt + 2tW
t
dW
t
.
5.3.5 (a) tdW
t
+W
t
dt;
(b) e
t
(W
t
dt +dW
t
);
(c) (2 t/2)t cos W
t
dt t
2
sinW
t
dW
t
;
(d) (sin t +W
2
t
cos t)dt + 2 sin t W
t
dW
t
;
5.3.7 It follows from (5.3.9).
5.3.8 Take the conditional expectation in
M
2
t
= M
2
s
+ 2
_
t
s
M
u
dM
u
+N
t
N
s
and obtain
E[M
2
t
[T
s
] = M
2
s
+ 2E[
_
t
s
M
u
dM
u
[T
s
] +E[N
t
[T
s
] N
s
= M
2
s
+E[M
t
+t[T
s
] N
s
= M
2
s
+M
s
+t N
s
= M
2
s
+(t s).
5.3.9 Integrating in (5.3.10) yields
F
t
= F
s
+
_
t
s
f
x
dW
1
t
+
_
t
s
f
y
dW
2
t
.
One can check that E[F
t
[T
s
] = F
s
.
5.3.10 (a) dF
t
= 2W
1
t
dW
1
t
+ 2W
2
t
dW
2
t
+ 2dt; (b) dF
t
=
2W
1
t
dW
1
t
+ 2W
2
t
dW
2
t
(W
1
t
)
2
+ (W
2
t
)
2
.
346
5.3.11 Consider the function f(x, y) =
_
x
2
+y
2
. Since
f
x
=
x
_
x
2
+y
2
,
f
y
=
y
_
x
2
+y
2
, f =
1
2
_
x
2
+y
2
, we get
dR
t
=
f
x
dW
1
t
+
f
y
dW
2
t
+ fdt =
W
1
t
R
t
dW
1
t
+
W
2
t
R
t
dW
2
t
+
1
2R
t
dt.
Chapter 6
6.2.1 (a) Use integration formula with g(x) = tan
1
(x).
_
T
0
1
1 +W
2
t
dW
t
=
_
T
0
(tan
1
)

(W
t
) dW
t
= tan
1
W
T
+
1
2
_
T
0
2W
t
(1 +W
t
)
2
dt.
(b) Use E
_
_
T
0
1
1 +W
2
t
dW
t
_
= 0.
(c) Use Calculus to nd minima and maxima of the function (x) =
x
(1 +x
2
)
2
.
6.2.2 (a) Use integration by parts with g(x) = e
x
and get
_
T
0
e
Wt
dW
t
= e
W
T
1
1
2
_
T
0
e
Wt
dt.
(b) Applying the expectation we obtain
E[e
W
T
] = 1 +
1
2
_
T
0
E[e
Wt
] dt.
If let (T) = E[e
W
T
], then satises the integral equation
(T) = 1 +
1
2
_
T
0
(t) dt.
Dierentiating yields the ODE

(T) =
1
2
(T), with (0) = 1. Solving yields (T) =
e
T/2
.
6.2.3 (a) Apply integration by parts with g(x) = (x 1)e
x
to get
_
T
0
W
t
e
Wt
dW
t
=
_
T
0
g

(W
t
) dW
t
= g(W
T
) g(0)
1
2
_
T
0
g

(W
t
) dt.
(b) Applying the expectation yields
347
E[W
T
e
W
T
] = E[e
W
T
] 1 +
1
2
_
T
0
_
E[e
Wt
] +E[W
t
e
Wt
]
_
dt
= e
T/2
1 +
1
2
_
T
0
_
e
t/2
+E[W
t
e
Wt
]
_
dt.
Then (T) = E[W
T
e
W
T
] satises the ODE

(T) (T) = e
T/2
with (0) = 0.
6.2.4 (a) Use integration by parts with g(x) = ln(1 +x
2
).
(e) Since ln(1 +T) T, the upper bound obtained in (e) is better than the one in (d),
without contradicting it.
6.3.1 By straightforward computation.
6.3.2 By computation.
6.3.13 (a)
e

2
sin(

2W
1
); (b)
1
2
e
2T
sin(2W
3
); (c)
1

2
(e

2W
4
4
1).
6.3.14 Apply Itos formula to get
d(t, W
t
) = (
t
(t, W
t
) +
1
2

2
x
(t, W
t
))dt +
x
(t, W
t
)dW
t
= G(t)dt +f(t, W
t
) dW
t
.
Integrating between a and b yields
(t, W
t
)[
b
a
=
_
b
a
G(t) dt +
_
b
a
f(t, W
t
) dW
t
.
Chapter 7
7.2.1 Integrating yields $X_t = X_0+\int_0^t(2X_s+e^{2s})\,ds+\int_0^tb\,dW_s$. Taking the expectation we get
$$E[X_t] = X_0+\int_0^t\big(2E[X_s]+e^{2s}\big)\,ds.$$
Differentiating we obtain $f'(t) = 2f(t)+e^{2t}$, where $f(t) = E[X_t]$, with $f(0) = X_0$. Multiplying by the integrating factor $e^{-2t}$ yields $(e^{-2t}f(t))' = 1$. Integrating yields
$$f(t) = e^{2t}(t+X_0).$$
7.2.4 (a) Using the product rule and Ito's formula, we get
$$d(W_t^2e^{W_t}) = e^{W_t}\Big(1+2W_t+\frac{1}{2}W_t^2\Big)\,dt+e^{W_t}(2W_t+W_t^2)\,dW_t.$$
Integrating and taking expectations yields
$$E[W_t^2e^{W_t}] = \int_0^t\Big(E[e^{W_s}]+2E[W_se^{W_s}]+\frac{1}{2}E[W_s^2e^{W_s}]\Big)\,ds.$$
Since $E[e^{W_s}] = e^{s/2}$ and $E[W_se^{W_s}] = se^{s/2}$, if we let $f(t) = E[W_t^2e^{W_t}]$, we get by differentiation
$$f'(t) = e^{t/2}+2te^{t/2}+\frac{1}{2}f(t),\qquad f(0) = 0.$$
Multiplying by the integrating factor $e^{-t/2}$ yields $(f(t)e^{-t/2})' = 1+2t$. Integrating yields the solution $f(t) = t(1+t)e^{t/2}$. (b) Similar method.
7.2.5 (a) Using Exercise 2.1.18,
$$E[W_t^4-3t^2|\mathcal{F}_s] = E[W_t^4|\mathcal{F}_s]-3t^2 = 3(t-s)^2+6(t-s)W_s^2+W_s^4-3t^2 = (W_s^4-3s^2)+6s^2-6ts+6(t-s)W_s^2\ne W_s^4-3s^2.$$
Hence $W_t^4-3t^2$ is not a martingale. (b)
$$E[W_t^3|\mathcal{F}_s] = W_s^3+3(t-s)W_s\ne W_s^3,$$
and hence $W_t^3$ is not a martingale.
7.2.6 (a) Similar method as in Example 7.2.4.
(b) Applying Ito's formula,
$$d\big(\cos(\sigma W_t)\big) = -\sigma\sin(\sigma W_t)\,dW_t-\frac{1}{2}\sigma^2\cos(\sigma W_t)\,dt.$$
Let $f(t) = E[\cos(\sigma W_t)]$. Then $f'(t) = -\frac{\sigma^2}{2}f(t)$, $f(0) = 1$. The solution is $f(t) = e^{-\frac{\sigma^2}{2}t}$.
(c) Since $\sin(t+\sigma W_t) = \sin t\cos(\sigma W_t)+\cos t\sin(\sigma W_t)$, taking the expectation and using (a) and (b) yields
$$E[\sin(t+\sigma W_t)] = \sin t\,E[\cos(\sigma W_t)] = e^{-\frac{\sigma^2}{2}t}\sin t.$$
(d) Similarly, starting from $\cos(t+\sigma W_t) = \cos t\cos(\sigma W_t)-\sin t\sin(\sigma W_t)$.
7.2.7 From Exercise 7.2.6 (b) we have $E[\cos(W_t)] = e^{-t/2}$. From the definition of expectation,
$$E[\cos(W_t)] = \int_{-\infty}^{\infty}\cos x\,\frac{1}{\sqrt{2\pi t}}\,e^{-\frac{x^2}{2t}}\,dx.$$
Then choose $t = 1/2$ and $t = 1$ to get (a) and (b), respectively.
7.2.8 (a) Using a standard method involving Ito's formula we can get $E[W_te^{bW_t}] = bte^{b^2t/2}$. Let $a = 1/(2t)$. We can write
$$\int xe^{-ax^2+bx}\,dx = \sqrt{2\pi t}\int xe^{bx}\,\frac{1}{\sqrt{2\pi t}}\,e^{-\frac{x^2}{2t}}\,dx = \sqrt{2\pi t}\,E[W_te^{bW_t}] = \sqrt{2\pi t}\,bte^{b^2t/2} = \sqrt{\frac{\pi}{a}}\Big(\frac{b}{2a}\Big)e^{b^2/(4a)}.$$
The same method for (b) and (c).
7.2.9 (a) We have
$$E[\cos(tW_t)] = E\Big[\sum_{n\ge0}(-1)^n\frac{W_t^{2n}t^{2n}}{(2n)!}\Big] = \sum_{n\ge0}(-1)^n\frac{E[W_t^{2n}]\,t^{2n}}{(2n)!} = \sum_{n\ge0}(-1)^n\frac{t^{2n}}{(2n)!}\cdot\frac{(2n)!\,t^n}{2^nn!} = \sum_{n\ge0}(-1)^n\frac{t^{3n}}{2^nn!} = e^{-t^3/2}.$$
(b) Similar computation using $E[W_t^{2n+1}] = 0$.
7.3.2 (a) $X_t = 1+\sin t-\int_0^t\sin s\,dW_s$, $E[X_t] = 1+\sin t$, $Var[X_t] = \int_0^t(\sin s)^2\,ds = \frac{t}{2}-\frac{1}{4}\sin(2t)$;
(b) $X_t = e^t-1+\int_0^t\sqrt{s}\,dW_s$, $E[X_t] = e^t-1$, $Var[X_t] = \frac{t^2}{2}$;
(c) $X_t = 1+\frac{1}{2}\ln(1+t^2)+\int_0^ts^{3/2}\,dW_s$, $E[X_t] = 1+\frac{1}{2}\ln(1+t^2)$, $Var(X_t) = \frac{t^4}{4}$.
7.3.4
$$X_t = 2(1-e^{-t/2})+\int_0^te^{-\frac{s}{2}+W_s}\,dW_s = 1-2e^{-t/2}+e^{-t/2+W_t} = 1+e^{-t/2}\big(e^{W_t}-2\big).$$
Its distribution function is given by
$$F(y) = P(X_t\le y) = P\big(1+e^{-t/2}(e^{W_t}-2)\le y\big) = P\Big(W_t\le\ln\big(2+(y-1)e^{t/2}\big)\Big) = \frac{1}{\sqrt{2\pi t}}\int_{-\infty}^{\ln(2+(y-1)e^{t/2})}e^{-\frac{x^2}{2t}}\,dx.$$
$$E[X_t] = 2(1-e^{-t/2}),\qquad Var(X_t) = Var\big(e^{-t/2}e^{W_t}\big) = e^{-t}\,Var\big(e^{W_t}\big) = e^t-1.$$
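Both moments can be verified by sampling $W_t\sim N(0,t)$ directly in the closed form for $X_t$; the short Python check below is illustrative (the value of $t$ is an arbitrary assumption).

```python
import numpy as np

# Sampling check of X_t = 1 + e^{-t/2}(e^{W_t} - 2):
# E[X_t] = 2(1 - e^{-t/2}) and Var(X_t) = e^t - 1 (t is illustrative).
rng = np.random.default_rng(3)
t = 1.5
W_t = rng.normal(0.0, np.sqrt(t), size=2_000_000)
X_t = 1 + np.exp(-t / 2) * (np.exp(W_t) - 2)
print(X_t.mean(), "vs", 2 * (1 - np.exp(-t / 2)))
print(X_t.var(), "vs", np.exp(t) - 1)
```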
7.5.1 (a)
$$dX_t = (2W_t\,dW_t+dt)+W_t\,dt+t\,dW_t = d(W_t^2)+d(tW_t) = d(tW_t+W_t^2),$$
so $X_t = tW_t+W_t^2$. (b) We have
$$dX_t = \Big(2t-\frac{1}{t^2}W_t\Big)\,dt+\frac{1}{t}\,dW_t = 2t\,dt+d\Big(\frac{1}{t}W_t\Big) = d\Big(t^2+\frac{1}{t}W_t\Big),$$
so $X_t = t^2+\frac{1}{t}W_t-1-W_1$.
(c) $dX_t = -\frac{1}{2}e^{-t/2}W_t\,dt+e^{-t/2}\,dW_t = d(e^{-t/2}W_t)$, so $X_t = e^{-t/2}W_t$. (d) We have
$$dX_t = t(2W_t\,dW_t+dt)-t\,dt+W_t^2\,dt = t\,d(W_t^2)+W_t^2\,dt-\frac{1}{2}\,d(t^2) = d\Big(tW_t^2-\frac{t^2}{2}\Big),$$
so $X_t = tW_t^2-\frac{t^2}{2}$. (e) $dX_t = dt+d(\sqrt{t}\,W_t) = d(t+\sqrt{t}\,W_t)$, so $X_t = t+\sqrt{t}\,W_t-1-W_1$.
7.6.1 (a) $X_t = X_0e^{-4t}+\frac{1}{4}(1-e^{-4t})+2\int_0^te^{-4(t-s)}\,dW_s$;
(b) $X_t = X_0e^{-3t}+\frac{2}{3}(1-e^{-3t})+e^{-3t}W_t$;
(c) $X_t = e^{-t}\big(X_0+1+\frac{1}{2}W_t^2-\frac{t}{2}\big)-1$;
(d) $X_t = X_0e^{4t}-\frac{t}{4}-\frac{1}{16}(1-e^{4t})+e^{4t}W_t$;
(e) $X_t = X_0e^{t/2}-2t-4+5e^{t/2}-e^{t}\cos W_t$;
(f) $X_t = X_0e^{-t}+e^{-t}W_t$.
7.8.1 (a) The integrating factor is
$$\rho_t = e^{-\sigma\int_0^t\,dW_s+\frac{1}{2}\int_0^t\sigma^2\,ds} = e^{-\sigma W_t+\frac{\sigma^2}{2}t},$$
which transforms the equation into the exact form $d(\rho_tX_t) = 0$. Then $\rho_tX_t = X_0$ and hence
$$X_t = X_0e^{\sigma W_t-\frac{\sigma^2}{2}t}.$$
(b) $\rho_t = e^{-\sigma W_t+\frac{\sigma^2}{2}t}$, $d(\rho_tX_t) = \rho_tX_t\,dt$. With $Y_t = \rho_tX_t$ this reads $dY_t = Y_t\,dt$, so $Y_t = Y_0e^t$, $\rho_tX_t = X_0e^t$, and
$$X_t = X_0e^{(1-\frac{\sigma^2}{2})t+\sigma W_t}.$$
7.8.2 $t\,dA_t = dX_t-A_t\,dt$, $E[A_t] = 0$, $Var(A_t) = E[A_t^2] = \dfrac{1}{t^2}\displaystyle\int_0^tE[X_s]^2\,ds = \dfrac{X_0^2}{t}$.
Chapter 8
8.1.2 $A = \dfrac{1}{2}\Delta = \dfrac{1}{2}\displaystyle\sum_{k=1}^n\partial_{x_k}^2$.
8.1.4
$$X_1(t) = x_1^0+W_1(t)$$
$$X_2(t) = x_2^0+\int_0^tX_1(s)\,dW_2(s) = x_2^0+x_1^0W_2(t)+\int_0^tW_1(s)\,dW_2(s).$$
8.4.2 $E[\tau]$ is maximum if $b-x_0 = x_0-a$, i.e. when $x_0 = (a+b)/2$. The maximum value is $(b-a)^2/4$.
8.5.1 Let $\tau_k = \min(k,\tau)$. Apply Dynkin's formula for $\tau_k$ to show that
$$E[\tau_k]\le\frac{1}{n}\big(R^2-|a|^2\big),$$
and take $k\to\infty$.
8.5.3 $x_0$ and $x_0^{2n}$.
8.6.1 (a) We have $a(t,x) = x$, $c(t,x) = x$, $\varphi(s) = xe^{s-t}$ and $u(t,x) = \int_t^Txe^{s-t}\,ds = x(e^{T-t}-1)$.
(b) $a(t,x) = tx$, $c(t,x) = \ln x$, $\varphi(s) = xe^{(s^2-t^2)/2}$ and
$$u(t,x) = \int_t^T\ln\big(xe^{(s^2-t^2)/2}\big)\,ds = (T-t)\Big[\ln x+\frac{T}{6}(T+t)-\frac{t^2}{3}\Big].$$
8.6.2 (a) $u(t,x) = x(T-t)+\frac{1}{2}(T-t)^2$. (b) $u(t,x) = \frac{2}{3}e^x\big(e^{\frac{3}{2}(T-t)}-1\big)$. (c) We have $a(t,x) = \mu x$, $b(t,x) = \sigma x$, $c(t,x) = x$. The associated diffusion is $dX_s = \mu X_s\,ds+\sigma X_s\,dW_s$, $X_t = x$, which is the geometric Brownian motion
$$X_s = xe^{(\mu-\frac{1}{2}\sigma^2)(s-t)+\sigma(W_s-W_t)},\qquad s\ge t.$$
The solution is
$$u(t,x) = E\Big[\int_t^Txe^{(\mu-\frac{1}{2}\sigma^2)(s-t)+\sigma(W_s-W_t)}\,ds\Big] = x\int_t^Te^{(\mu-\frac{1}{2}\sigma^2)(s-t)}E\big[e^{\sigma(W_s-W_t)}\big]\,ds = x\int_t^Te^{(\mu-\frac{1}{2}\sigma^2)(s-t)}e^{\sigma^2(s-t)/2}\,ds = x\int_t^Te^{\mu(s-t)}\,ds = \frac{x}{\mu}\big(e^{\mu(T-t)}-1\big).$$
Chapter 9
9.1.7 Apply Example 9.1.6 with u = 1.
9.1.8 $X_t = \int_0^th(s)\,dW_s\sim N\big(0,\int_0^th^2(s)\,ds\big)$. Then $e^{X_t}$ is log-normal with
$$E[e^{X_t}] = e^{\frac{1}{2}Var(X_t)} = e^{\frac{1}{2}\int_0^th(s)^2\,ds}.$$
9.1.9 (a) Using Exercise 9.1.8 we have
$$E[M_t] = E\Big[e^{-\int_0^tu(s)\,dW_s}\,e^{-\frac{1}{2}\int_0^tu(s)^2\,ds}\Big] = e^{-\frac{1}{2}\int_0^tu(s)^2\,ds}\,E\Big[e^{-\int_0^tu(s)\,dW_s}\Big] = e^{-\frac{1}{2}\int_0^tu(s)^2\,ds}\,e^{\frac{1}{2}\int_0^tu(s)^2\,ds} = 1.$$
(b) Similar computation with (a).
9.1.10 (a) Applying the product rule and Ito's formula we get
$$d\big(e^{t/2}\cos W_t\big) = -e^{t/2}\sin W_t\,dW_t.$$
Integrating yields
$$e^{t/2}\cos W_t = 1-\int_0^te^{s/2}\sin W_s\,dW_s,$$
which is an Ito integral, and hence a martingale; (b) Similarly.
9.1.12 Use that the function $f(x_1,x_2) = e^{x_1}\cos x_2$ satisfies $\Delta f = 0$.
9.1.14 (a) $f(x) = x^2$; (b) $f(x) = x^3$; (c) $f(x) = x^n/(n(n-1))$; (d) $f(x) = e^{cx}$; (e) $f(x) = \sin(cx)$.
Chapter 10
10.3.4 (a) We have $r_t\sim N(\mu,s^2)$, with $\mu = b+(r_0-b)e^{-at}$ and $s^2 = \frac{\sigma^2}{2a}(1-e^{-2at})$. Then
$$P(r_t<0) = \int_{-\infty}^0\frac{1}{\sqrt{2\pi}\,s}\,e^{-\frac{(x-\mu)^2}{2s^2}}\,dx = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{-\mu/s}e^{-v^2/2}\,dv = N(-\mu/s),$$
where by computation
$$-\frac{\mu}{s} = -\frac{1}{\sigma}\sqrt{\frac{2a}{e^{2at}-1}}\,\Big(r_0+b(e^{at}-1)\Big).$$
(b) Since
$$\lim_{t\to\infty}\frac{\mu}{s} = \frac{\sqrt{2a}}{\sigma}\,\lim_{t\to\infty}\frac{r_0+b(e^{at}-1)}{\sqrt{e^{2at}-1}} = \frac{b\sqrt{2a}}{\sigma},$$
then
$$\lim_{t\to\infty}P(r_t<0) = \lim_{t\to\infty}N(-\mu/s) = N\Big(-\frac{b\sqrt{2a}}{\sigma}\Big).$$
It is worth noting that the previous probability is less than 0.5.
(c) The rate of change is
$$\frac{d}{dt}P(r_t<0) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{\mu^2}{2s^2}}\,\frac{d}{dt}\Big(-\frac{\mu}{s}\Big) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{\mu^2}{2s^2}}\,\frac{\sqrt{2a}}{\sigma}\,\frac{a\,e^{2at}\big[r_0-b(1-e^{-at})\big]}{(e^{2at}-1)^{3/2}}.$$
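The probability $N(-\mu/s)$ can be cross-checked by sampling from the exact Gaussian transition law of the Vasicek rate; in the Python sketch below all parameter values are illustrative assumptions.

```python
import numpy as np
from math import erf, sqrt, exp

def N(x):
    # standard normal cumulative distribution function
    return 0.5 * (1 + erf(x / sqrt(2)))

# P(r_t < 0) = N(-mu/s) for the Vasicek rate; all parameters illustrative.
a, b, sigma, r0, t = 1.0, 0.05, 0.1, 0.03, 2.0
mu = b + (r0 - b) * exp(-a * t)
s = sqrt(sigma**2 / (2 * a) * (1 - exp(-2 * a * t)))
print("analytic:", N(-mu / s))

# Monte Carlo on the exact transition law r_t ~ N(mu, s^2)
rng = np.random.default_rng(5)
r_t = rng.normal(mu, s, size=1_000_000)
print("sampled: ", (r_t < 0).mean())
```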
10.3.5 By Ito's formula
$$d(r_t^n) = \Big[na(b-r_t)r_t^{n-1}+\frac{1}{2}n(n-1)\sigma^2r_t^{n-1}\Big]\,dt+n\sigma r_t^{n-\frac{1}{2}}\,dW_t.$$
Integrating between 0 and $t$ and taking the expectation yields
$$\mu_n(t) = r_0^n+\int_0^t\Big[nab\,\mu_{n-1}(s)-na\,\mu_n(s)+\frac{n(n-1)}{2}\sigma^2\,\mu_{n-1}(s)\Big]\,ds.$$
Differentiating yields
$$\mu_n'(t)+na\,\mu_n(t) = \Big(nab+\frac{n(n-1)}{2}\sigma^2\Big)\mu_{n-1}(t).$$
Multiplying by the integrating factor $e^{nat}$ yields the exact equation
$$\big[e^{nat}\mu_n(t)\big]' = e^{nat}\Big(nab+\frac{n(n-1)}{2}\sigma^2\Big)\mu_{n-1}(t).$$
Integrating yields the following recursive formula for the moments:
$$\mu_n(t) = r_0^ne^{-nat}+\Big(nab+\frac{n(n-1)}{2}\sigma^2\Big)\int_0^te^{-na(t-s)}\mu_{n-1}(s)\,ds.$$
10.5.1 (a) The spot rate $r_t$ follows the process
$$d(\ln r_t) = \theta(t)\,dt+\sigma\,dW_t.$$
Integrating yields
$$\ln r_t = \ln r_0+\int_0^t\theta(u)\,du+\sigma W_t,$$
which is normally distributed. (b) Then $r_t = r_0e^{\int_0^t\theta(u)\,du+\sigma W_t}$ is log-normally distributed. (c) The mean and variance are
$$E[r_t] = r_0e^{\int_0^t\theta(u)\,du}\,e^{\frac{1}{2}\sigma^2t},\qquad Var(r_t) = r_0^2e^{2\int_0^t\theta(u)\,du}\,e^{\sigma^2t}\big(e^{\sigma^2t}-1\big).$$
10.5.2 Substitute $u_t = \ln r_t$ and obtain the linear equation
$$du_t+a(t)u_t\,dt = \theta(t)\,dt+\sigma(t)\,dW_t.$$
The equation can be solved by multiplying by the integrating factor $e^{\int_0^ta(s)\,ds}$.
Chapter 11
11.1.1 Apply the same method used in Application 11.1. The price of the bond is
$$P(t,T) = e^{-r_t(T-t)}\,E\Big[e^{-\int_0^{T-t}N_s\,ds}\Big].$$
11.2.2 The price of an infinitely lived bond at time $t$ is given by $\lim_{T\to\infty}P(t,T)$. Using $\lim_{T\to\infty}B(t,T) = \frac{1}{a}$, and
$$\lim_{T\to\infty}A(t,T) = \begin{cases} +\infty, & \text{if } b<\sigma^2/(2a^2)\\[2pt] 0, & \text{if } b>\sigma^2/(2a^2)\\[2pt] e^{(\frac{1}{a}+t)(b-\frac{\sigma^2}{2a^2})-\frac{\sigma^2}{4a^3}}, & \text{if } b = \sigma^2/(2a^2),\end{cases}$$
we can get the price of the bond in all three cases.
Chapter 12
12.1.1 Dividing the equations
$$S_t = S_0e^{(\mu-\frac{\sigma^2}{2})t+\sigma W_t},\qquad S_u = S_0e^{(\mu-\frac{\sigma^2}{2})u+\sigma W_u}$$
yields
$$S_t = S_ue^{(\mu-\frac{\sigma^2}{2})(t-u)+\sigma(W_t-W_u)}.$$
Taking the predictable part out and dropping the independent condition we obtain
$$E[S_t|\mathcal{F}_u] = S_ue^{(\mu-\frac{\sigma^2}{2})(t-u)}E\big[e^{\sigma(W_t-W_u)}\big|\mathcal{F}_u\big] = S_ue^{(\mu-\frac{\sigma^2}{2})(t-u)}E\big[e^{\sigma(W_t-W_u)}\big] = S_ue^{(\mu-\frac{\sigma^2}{2})(t-u)}E\big[e^{\sigma W_{t-u}}\big] = S_ue^{(\mu-\frac{\sigma^2}{2})(t-u)}e^{\frac{1}{2}\sigma^2(t-u)} = S_ue^{\mu(t-u)}.$$
Similarly we obtain
$$E[S_t^2|\mathcal{F}_u] = S_u^2e^{2(\mu-\frac{\sigma^2}{2})(t-u)}E\big[e^{2\sigma(W_t-W_u)}\big|\mathcal{F}_u\big] = S_u^2e^{2\mu(t-u)}e^{\sigma^2(t-u)}.$$
Then
$$Var(S_t|\mathcal{F}_u) = E[S_t^2|\mathcal{F}_u]-E[S_t|\mathcal{F}_u]^2 = S_u^2e^{2\mu(t-u)}e^{\sigma^2(t-u)}-S_u^2e^{2\mu(t-u)} = S_u^2e^{2\mu(t-u)}\big(e^{\sigma^2(t-u)}-1\big).$$
When $u = t$ we get $E[S_t|\mathcal{F}_t] = S_t$ and $Var(S_t|\mathcal{F}_t) = 0$.
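These conditional moments are easy to confirm by simulating $S_t$ from a fixed value $S_u$; in the sketch below all parameter values are illustrative assumptions.

```python
import numpy as np

# Check E[S_t | F_u] = S_u e^{mu(t-u)} and
# Var(S_t | F_u) = S_u^2 e^{2 mu (t-u)} (e^{sigma^2 (t-u)} - 1)
# by sampling S_t from a fixed S_u; all parameter values are illustrative.
rng = np.random.default_rng(6)
mu, sigma, S_u, tau = 0.08, 0.3, 100.0, 0.5   # tau = t - u

Z = rng.normal(size=1_000_000)
S_t = S_u * np.exp((mu - sigma**2 / 2) * tau + sigma * np.sqrt(tau) * Z)

print(S_t.mean(), "vs", S_u * np.exp(mu * tau))
print(S_t.var(), "vs",
      S_u**2 * np.exp(2 * mu * tau) * (np.exp(sigma**2 * tau) - 1))
```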
12.1.2 By Ito's formula
$$d(\ln S_t) = \frac{1}{S_t}\,dS_t-\frac{1}{2}\frac{1}{S_t^2}(dS_t)^2 = \Big(\mu-\frac{\sigma^2}{2}\Big)\,dt+\sigma\,dW_t,$$
so $\ln S_t = \ln S_0+(\mu-\frac{\sigma^2}{2})t+\sigma W_t$, and hence $\ln S_t$ is normally distributed with $E[\ln S_t] = \ln S_0+(\mu-\frac{\sigma^2}{2})t$ and $Var(\ln S_t) = \sigma^2t$.
12.1.3 (a) $d\big(\frac{1}{S_t}\big) = (\sigma^2-\mu)\frac{1}{S_t}\,dt-\frac{\sigma}{S_t}\,dW_t$;
(b) $d(S_t^n) = n\big(\mu+\frac{n-1}{2}\sigma^2\big)S_t^n\,dt+n\sigma S_t^n\,dW_t$;
(c)
$$d\big((S_t-1)^2\big) = d(S_t^2)-2\,dS_t = 2S_t\,dS_t+(dS_t)^2-2\,dS_t = \big[(2\mu+\sigma^2)S_t^2-2\mu S_t\big]\,dt+2\sigma S_t(S_t-1)\,dW_t.$$
12.1.4 (b) Since $S_t = S_0e^{(\mu-\frac{\sigma^2}{2})t+\sigma W_t}$, then $S_t^n = S_0^ne^{(n\mu-n\frac{\sigma^2}{2})t+n\sigma W_t}$ and hence
$$E[S_t^n] = S_0^ne^{(n\mu-n\frac{\sigma^2}{2})t}E\big[e^{n\sigma W_t}\big] = S_0^ne^{(n\mu-n\frac{\sigma^2}{2})t}e^{\frac{1}{2}n^2\sigma^2t} = S_0^ne^{\big(n\mu+\frac{n(n-1)}{2}\sigma^2\big)t}.$$
(a) Let $n = 2$ in (b) and obtain $E[S_t^2] = S_0^2e^{(2\mu+\sigma^2)t}$.
12.1.5 (a) Let $X_t = S_tW_t$. The product rule yields
$$dX_t = W_t\,dS_t+S_t\,dW_t+dS_t\,dW_t = W_t(\mu S_t\,dt+\sigma S_t\,dW_t)+S_t\,dW_t+(\mu S_t\,dt+\sigma S_t\,dW_t)\,dW_t = S_t(\mu W_t+\sigma)\,dt+(1+\sigma W_t)S_t\,dW_t.$$
Integrating,
$$X_s = \int_0^sS_t(\mu W_t+\sigma)\,dt+\int_0^s(1+\sigma W_t)S_t\,dW_t.$$
Let $f(s) = E[X_s]$. Then
$$f(s) = \int_0^s\big(\mu f(t)+\sigma S_0e^{\mu t}\big)\,dt.$$
Differentiating yields
$$f'(s) = \mu f(s)+\sigma S_0e^{\mu s},\qquad f(0) = 0,$$
with the solution $f(s) = \sigma S_0se^{\mu s}$. Hence $E[W_tS_t] = \sigma S_0te^{\mu t}$.
(b) $Cov(S_t,W_t) = E[S_tW_t]-E[S_t]E[W_t] = E[S_tW_t] = \sigma S_0te^{\mu t}$. Then
$$Corr(S_t,W_t) = \frac{Cov(S_t,W_t)}{\sigma_{S_t}\,\sigma_{W_t}} = \frac{\sigma S_0te^{\mu t}}{S_0e^{\mu t}\sqrt{e^{\sigma^2t}-1}\,\sqrt{t}} = \sigma\frac{\sqrt{t}}{\sqrt{e^{\sigma^2t}-1}}\longrightarrow 0,\qquad t\to\infty.$$
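The correlation formula can be confirmed numerically; a brief sketch with illustrative parameter choices:

```python
import numpy as np

# Check Corr(S_t, W_t) = sigma sqrt(t) / sqrt(e^{sigma^2 t} - 1);
# the parameter values are illustrative.
rng = np.random.default_rng(7)
mu, sigma, S0, t = 0.1, 0.25, 1.0, 2.0

W_t = rng.normal(0.0, np.sqrt(t), size=1_000_000)
S_t = S0 * np.exp((mu - sigma**2 / 2) * t + sigma * W_t)

print(np.corrcoef(S_t, W_t)[0, 1],
      "vs", sigma * np.sqrt(t) / np.sqrt(np.exp(sigma**2 * t) - 1))
```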
12.1.8 $d = S_d/S_0 = 0.5$, $u = S_u/S_0 = 1.5$, $\mu = 6.5$, $p = 0.98$.
12.2.2 Let $T_a$ be the time $S_t$ reaches level $a$ for the first time. Then
$$F(a) = P(S_t<a) = P(T_a>t) = \ln\frac{a}{S_0}\int_t^{\infty}\frac{1}{\sigma\sqrt{2\pi\tau^3}}\,e^{-\frac{(\ln\frac{a}{S_0}-\beta\tau)^2}{2\sigma^2\tau}}\,d\tau,$$
where $\beta = \mu-\frac{1}{2}\sigma^2$.
12.2.3 (a) $P(T_a<T) = \ln\dfrac{a}{S_0}\displaystyle\int_0^T\frac{1}{\sigma\sqrt{2\pi\tau^3}}\,e^{-(\ln\frac{a}{S_0}-\beta\tau)^2/(2\sigma^2\tau)}\,d\tau$.
(b) $\displaystyle\int_{T_1}^{T_2}p(\tau)\,d\tau$.
12.4.3 $E[A_t] = S_0\big(1+\frac{\mu t}{2}\big)+O(t^2)$, $Var(A_t) = \dfrac{S_0^2}{3}\sigma^2t+O(t^2)$.
12.4.6 (a) Since $\ln G_t = \frac{1}{t}\int_0^t\ln S_u\,du$, using the product rule yields
$$d(\ln G_t) = d\Big(\frac{1}{t}\Big)\int_0^t\ln S_u\,du+\frac{1}{t}\,d\Big(\int_0^t\ln S_u\,du\Big) = -\frac{1}{t^2}\Big(\int_0^t\ln S_u\,du\Big)\,dt+\frac{1}{t}\ln S_t\,dt = \frac{1}{t}\big(\ln S_t-\ln G_t\big)\,dt.$$
(b) Let $X_t = \ln G_t$. Then $G_t = e^{X_t}$ and Ito's formula yields
$$dG_t = de^{X_t} = e^{X_t}\,dX_t+\frac{1}{2}e^{X_t}(dX_t)^2 = e^{X_t}\,dX_t = G_t\,d(\ln G_t) = \frac{G_t}{t}\big(\ln S_t-\ln G_t\big)\,dt.$$
12.4.7 Using the product rule we have
$$d\Big(\frac{H_t}{t}\Big) = d\Big(\frac{1}{t}\Big)H_t+\frac{1}{t}\,dH_t = -\frac{1}{t^2}H_t\,dt+\frac{1}{t^2}H_t\Big(1-\frac{H_t}{S_t}\Big)\,dt = -\frac{1}{t^2}\frac{H_t^2}{S_t}\,dt,$$
so, for $dt>0$, $d\big(\frac{H_t}{t}\big)<0$ and hence $\frac{H_t}{t}$ is decreasing. For the second part, try to apply l'Hospital's rule.
12.4.8 By continuity, the inequality (12.4.9) is preserved under the limit.
12.4.9 (a) Use that $\frac{1}{n}\sum S_{t_k}^{1/\alpha} = \frac{1}{t}\sum S_{t_k}^{1/\alpha}\frac{t}{n}\longrightarrow\frac{1}{t}\int_0^tS_u^{1/\alpha}\,du$. (b) Let $I_t = \int_0^tS_u^{1/\alpha}\,du$. Then
$$d\Big(\frac{1}{t}I_t\Big) = d\Big(\frac{1}{t}\Big)I_t+\frac{1}{t}\,dI_t = \frac{1}{t}\Big(S_t^{1/\alpha}-\frac{I_t}{t}\Big)\,dt.$$
Let $X_t = A_t^{\alpha}$. Then by Ito's formula we get
$$dX_t = \alpha\Big(\frac{1}{t}I_t\Big)^{\alpha-1}d\Big(\frac{1}{t}I_t\Big)+\frac{1}{2}\alpha(\alpha-1)\Big(\frac{1}{t}I_t\Big)^{\alpha-2}\Big[d\Big(\frac{1}{t}I_t\Big)\Big]^2 = \alpha\Big(\frac{1}{t}I_t\Big)^{\alpha-1}d\Big(\frac{1}{t}I_t\Big) = \alpha\Big(\frac{1}{t}I_t\Big)^{\alpha-1}\frac{1}{t}\Big(S_t^{1/\alpha}-\frac{I_t}{t}\Big)\,dt = \frac{\alpha}{t}\Big(S_t^{1/\alpha}\big(A_t^{\alpha}\big)^{1-1/\alpha}-A_t^{\alpha}\Big)\,dt.$$
(c) If $\alpha = 1$ we get the continuous arithmetic mean, $A_t^1 = \frac{1}{t}\int_0^tS_u\,du$. If $\alpha = -1$ we obtain the continuous harmonic mean,
$$A_t^{-1} = \frac{t}{\int_0^t\frac{1}{S_u}\,du}.$$
12.4.10 (a) By the product rule we have
$$dX_t = d\Big(\frac{1}{t}\Big)\int_0^tS_u\,dW_u+\frac{1}{t}\,d\Big(\int_0^tS_u\,dW_u\Big) = -\frac{1}{t^2}\Big(\int_0^tS_u\,dW_u\Big)\,dt+\frac{1}{t}S_t\,dW_t = -\frac{1}{t}X_t\,dt+\frac{1}{t}S_t\,dW_t.$$
Using the properties of Ito integrals, we have
$$E[X_t] = \frac{1}{t}E\Big[\int_0^tS_u\,dW_u\Big] = 0,$$
$$Var(X_t) = E[X_t^2]-E[X_t]^2 = E[X_t^2] = \frac{1}{t^2}E\Big[\Big(\int_0^tS_u\,dW_u\Big)^2\Big] = \frac{1}{t^2}\int_0^tE[S_u^2]\,du = \frac{1}{t^2}\int_0^tS_0^2e^{(2\mu+\sigma^2)u}\,du = \frac{S_0^2}{t^2}\,\frac{e^{(2\mu+\sigma^2)t}-1}{2\mu+\sigma^2}.$$
(b) The stochastic differential equation of the stock price can be written in the form
$$\sigma S_t\,dW_t = dS_t-\mu S_t\,dt.$$
Integrating between 0 and $t$ yields
$$\sigma\int_0^tS_u\,dW_u = S_t-S_0-\mu\int_0^tS_u\,du.$$
Dividing by $t$ yields the desired relation.
12.5.2 Using independence, $E[S_t] = S_0e^{\mu t}E\big[(1+\rho)^{N_t}\big]$. Then use
$$E\big[(1+\rho)^{N_t}\big] = \sum_{n\ge0}E\big[(1+\rho)^n\big|N_t = n\big]\,P(N_t = n) = \sum_{n\ge0}(1+\rho)^n\,e^{-\lambda t}\frac{(\lambda t)^n}{n!} = e^{\rho\lambda t}.$$
12.5.3 $E[\ln S_t] = \ln S_0+\big(\mu-\frac{\sigma^2}{2}+\lambda\ln(\rho+1)\big)t$.
12.5.4 $E[S_t|\mathcal{F}_u] = S_ue^{(\mu-\frac{\sigma^2}{2})(t-u)}(1+\rho)^{N_{t-u}}$.
Chapter 13
13.2.2 (a) Substitute $\lim_{S_t\to\infty}N(d_1) = \lim_{S_t\to\infty}N(d_2) = 1$ and $\lim_{S_t\to\infty}\frac{K}{S_t} = 0$ in
$$\frac{c(t)}{S_t} = N(d_1)-\frac{K}{S_t}e^{-r(T-t)}N(d_2).$$
(b) Use $\lim_{S_t\to0}N(d_1) = \lim_{S_t\to0}N(d_2) = 0$. (c) It comes from an analysis similar to the one used at (a).
13.2.3 Differentiating in $c(t) = S_tN(d_1)-Ke^{-r(T-t)}N(d_2)$ yields
$$\frac{dc(t)}{dS_t} = N(d_1)+S_t\frac{dN(d_1)}{dS_t}-Ke^{-r(T-t)}\frac{dN(d_2)}{dS_t} = N(d_1)+S_tN'(d_1)\frac{\partial d_1}{\partial S_t}-Ke^{-r(T-t)}N'(d_2)\frac{\partial d_2}{\partial S_t}$$
$$= N(d_1)+S_t\frac{1}{\sqrt{2\pi}}e^{-d_1^2/2}\frac{1}{\sigma\sqrt{T-t}}\frac{1}{S_t}-Ke^{-r(T-t)}\frac{1}{\sqrt{2\pi}}e^{-d_2^2/2}\frac{1}{\sigma\sqrt{T-t}}\frac{1}{S_t}$$
$$= N(d_1)+\frac{1}{\sqrt{2\pi}}\frac{1}{\sigma\sqrt{T-t}}\frac{1}{S_t}e^{-d_2^2/2}\Big(S_te^{(d_2^2-d_1^2)/2}-Ke^{-r(T-t)}\Big).$$
It suffices to show
$$S_te^{(d_2^2-d_1^2)/2}-Ke^{-r(T-t)} = 0.$$
Since
$$d_2^2-d_1^2 = (d_2-d_1)(d_2+d_1) = -\sigma\sqrt{T-t}\,\cdot\,\frac{2\ln(S_t/K)+2r(T-t)}{\sigma\sqrt{T-t}} = -2\big(\ln S_t-\ln K+r(T-t)\big),$$
then we have
$$e^{(d_2^2-d_1^2)/2} = e^{-\ln S_t+\ln K-r(T-t)} = e^{\ln\frac{1}{S_t}}\,e^{\ln K}\,e^{-r(T-t)} = \frac{1}{S_t}Ke^{-r(T-t)}.$$
Therefore
$$S_te^{(d_2^2-d_1^2)/2}-Ke^{-r(T-t)} = S_t\frac{1}{S_t}Ke^{-r(T-t)}-Ke^{-r(T-t)} = 0,$$
which shows that $\frac{dc(t)}{dS_t} = N(d_1)$.
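A central finite difference on the Black-Scholes price reproduces $N(d_1)$; in the short Python check below, the helper `call` and all parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

# Finite-difference check that dc/dS_t = N(d_1); the helper `call` and
# all parameter values below are illustrative assumptions.
def call(S, K, r, sigma, tau):
    d1 = (np.log(S / K) + (r + sigma**2 / 2) * tau) / (sigma * np.sqrt(tau))
    d2 = d1 - sigma * np.sqrt(tau)
    return S * norm.cdf(d1) - K * np.exp(-r * tau) * norm.cdf(d2)

S, K, r, sigma, tau, h = 100.0, 95.0, 0.05, 0.2, 0.75, 1e-4
delta_fd = (call(S + h, K, r, sigma, tau)
            - call(S - h, K, r, sigma, tau)) / (2 * h)
d1 = (np.log(S / K) + (r + sigma**2 / 2) * tau) / (sigma * np.sqrt(tau))
print(delta_fd, "vs", norm.cdf(d1))
```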
13.3.1 The payoff is
$$f_T = \begin{cases}1, & \text{if } \ln K_1\le X_T\le\ln K_2\\ 0, & \text{otherwise},\end{cases}$$
where $X_T\sim N\big(\ln S_t+(\mu-\frac{\sigma^2}{2})(T-t),\ \sigma^2(T-t)\big)$.
$$\widehat{E}_t[f_T] = E[f_T|\mathcal{F}_t,\mu = r] = \int_{-\infty}^{\infty}f_T(x)\,p(x)\,dx = \int_{\ln K_1}^{\ln K_2}\frac{1}{\sigma\sqrt{2\pi(T-t)}}\,e^{-\frac{[x-\ln S_t-(r-\frac{\sigma^2}{2})(T-t)]^2}{2\sigma^2(T-t)}}\,dx$$
$$= \int_{-d_2(K_1)}^{-d_2(K_2)}\frac{1}{\sqrt{2\pi}}e^{-y^2/2}\,dy = \int_{d_2(K_2)}^{d_2(K_1)}\frac{1}{\sqrt{2\pi}}e^{-y^2/2}\,dy = N\big(d_2(K_1)\big)-N\big(d_2(K_2)\big),$$
where
$$d_2(K_1) = \frac{\ln S_t-\ln K_1+(r-\frac{\sigma^2}{2})(T-t)}{\sigma\sqrt{T-t}},\qquad d_2(K_2) = \frac{\ln S_t-\ln K_2+(r-\frac{\sigma^2}{2})(T-t)}{\sigma\sqrt{T-t}}.$$
Hence the value of a box-contract at time $t$ is $f_t = e^{-r(T-t)}\big[N\big(d_2(K_1)\big)-N\big(d_2(K_2)\big)\big]$.
13.3.2 The payoff is
$$f_T = \begin{cases}e^{X_T}, & \text{if } X_T>\ln K\\ 0, & \text{otherwise}.\end{cases}$$
$$f_t = e^{-r(T-t)}\,\widehat{E}_t[f_T] = e^{-r(T-t)}\int_{\ln K}^{\infty}e^x\,p(x)\,dx.$$
The computation is similar to the one of the integral $I_2$ from Proposition 13.2.1.
13.3.3 (a) The payoff is
$$f_T = \begin{cases}e^{nX_T}, & \text{if } X_T>\ln K\\ 0, & \text{otherwise}.\end{cases}$$
$$\widehat{E}_t[f_T] = \int_{\ln K}^{\infty}e^{nx}\,p(x)\,dx = \frac{1}{\sigma\sqrt{2\pi(T-t)}}\int_{\ln K}^{\infty}e^{nx}\,e^{-\frac{1}{2}\frac{[x-\ln S_t-(r-\sigma^2/2)(T-t)]^2}{\sigma^2(T-t)}}\,dx$$
$$= \frac{1}{\sqrt{2\pi}}\int_{-d_2}^{\infty}e^{n\sigma\sqrt{T-t}\,y}\,S_t^n\,e^{n(r-\frac{\sigma^2}{2})(T-t)}\,e^{-\frac{1}{2}y^2}\,dy$$
$$= \frac{1}{\sqrt{2\pi}}S_t^ne^{n(r-\frac{\sigma^2}{2})(T-t)}\int_{-d_2}^{\infty}e^{-\frac{1}{2}y^2+n\sigma\sqrt{T-t}\,y}\,dy$$
$$= \frac{1}{\sqrt{2\pi}}S_t^ne^{n(r-\frac{\sigma^2}{2})(T-t)}e^{\frac{1}{2}n^2\sigma^2(T-t)}\int_{-d_2}^{\infty}e^{-\frac{1}{2}(y-n\sigma\sqrt{T-t})^2}\,dy$$
$$= \frac{1}{\sqrt{2\pi}}S_t^ne^{(nr+n(n-1)\frac{\sigma^2}{2})(T-t)}\int_{-d_2-n\sigma\sqrt{T-t}}^{\infty}e^{-\frac{z^2}{2}}\,dz = S_t^ne^{(nr+n(n-1)\frac{\sigma^2}{2})(T-t)}\,N\big(d_2+n\sigma\sqrt{T-t}\big).$$
The value of the contract at time $t$ is
$$f_t = e^{-r(T-t)}\,\widehat{E}_t[f_T] = S_t^ne^{(n-1)(r+\frac{n\sigma^2}{2})(T-t)}\,N\big(d_2+n\sigma\sqrt{T-t}\big).$$
(b) Using $g_t = S_t^ne^{(n-1)(r+\frac{n\sigma^2}{2})(T-t)}$, we get $f_t = g_t\,N\big(d_2+n\sigma\sqrt{T-t}\big)$.
(c) In the particular case $n = 1$, we have $N(d_2+\sigma\sqrt{T-t}) = N(d_1)$ and the value takes the form $f_t = S_tN(d_1)$.
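The closed form in (a) can be checked against a risk-neutral Monte Carlo valuation of the payoff; the sketch below uses illustrative parameters (here $n = 2$), which are assumptions and not from the text.

```python
import numpy as np
from scipy.stats import norm

# Risk-neutral Monte Carlo check of the power asset-or-nothing price
# f_t = S_t^n e^{(n-1)(r + n sigma^2/2)(T-t)} N(d_2 + n sigma sqrt(T-t));
# all parameter values are illustrative (here n = 2).
rng = np.random.default_rng(8)
S, K, r, sigma, tau, n = 100.0, 105.0, 0.05, 0.2, 1.0, 2

Z = rng.normal(size=1_000_000)
S_T = S * np.exp((r - sigma**2 / 2) * tau + sigma * np.sqrt(tau) * Z)
mc_price = np.exp(-r * tau) * np.where(S_T > K, S_T**n, 0.0).mean()

d2 = (np.log(S / K) + (r - sigma**2 / 2) * tau) / (sigma * np.sqrt(tau))
exact = (S**n * np.exp((n - 1) * (r + n * sigma**2 / 2) * tau)
         * norm.cdf(d2 + n * sigma * np.sqrt(tau)))
print(mc_price, "vs", exact)
```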
13.4.1 Since $\ln S_T\sim N\big(\ln S_t+\big(\mu-\frac{\sigma^2}{2}\big)(T-t),\ \sigma^2(T-t)\big)$, we have
$$\widehat{E}_t[f_T] = E\big[(\ln S_T)^2\big|\mathcal{F}_t,\mu = r\big] = Var\big(\ln S_T\big|\mathcal{F}_t,\mu = r\big)+E\big[\ln S_T\big|\mathcal{F}_t,\mu = r\big]^2 = \sigma^2(T-t)+\Big[\ln S_t+\Big(r-\frac{\sigma^2}{2}\Big)(T-t)\Big]^2,$$
so
$$f_t = e^{-r(T-t)}\Big\{\sigma^2(T-t)+\Big[\ln S_t+\Big(r-\frac{\sigma^2}{2}\Big)(T-t)\Big]^2\Big\}.$$
13.5.1 $\ln S_T^n$ is normally distributed with
$$\ln S_T^n = n\ln S_T\sim N\Big(n\ln S_t+n\Big(\mu-\frac{\sigma^2}{2}\Big)(T-t),\ n^2\sigma^2(T-t)\Big).$$
Redo the computation of Proposition 13.2.1 in this case.
13.6.1 Let $n = 2$.
$$f_t = e^{-r(T-t)}\,\widehat{E}_t[(S_T-K)^2] = e^{-r(T-t)}\,\widehat{E}_t[S_T^2]+e^{-r(T-t)}\big(K^2-2K\,\widehat{E}_t[S_T]\big) = S_t^2e^{(r+\sigma^2)(T-t)}+e^{-r(T-t)}K^2-2KS_t.$$
Let $n = 3$. Then
$$f_t = e^{-r(T-t)}\,\widehat{E}_t[(S_T-K)^3] = e^{-r(T-t)}\,\widehat{E}_t[S_T^3-3KS_T^2+3K^2S_T-K^3]$$
$$= e^{-r(T-t)}\,\widehat{E}_t[S_T^3]-3Ke^{-r(T-t)}\,\widehat{E}_t[S_T^2]+e^{-r(T-t)}\big(3K^2\,\widehat{E}_t[S_T]-K^3\big)$$
$$= e^{2(r+3\sigma^2/2)(T-t)}S_t^3-3Ke^{(r+\sigma^2)(T-t)}S_t^2+3K^2S_t-e^{-r(T-t)}K^3.$$
13.8.1 Choose $c_n = 1/n!$ in the general contract formula.
13.9.1 Similar to the computation done for the call option.
13.11.1 Write the payoff as a difference of two puts, one with strike price $K_1$ and the other with strike price $K_2$. Then apply the superposition principle.
13.11.2 Write the payoff as $f_T = c_1+c_3-2c_2$, where $c_i$ is a call with strike price $K_i$. Apply the superposition principle.
13.11.3 (a) By inspection. (b) Write the payoff as the sum between a call and a put, both with strike price $K$, and then apply the superposition principle.
13.11.4 (a) By inspection. (b) Write the payoff as the sum between a call with strike price $K_2$ and a put with strike price $K_1$, and then apply the superposition principle. (c) The strangle is cheaper.
13.11.5 The payoff can be written as a sum between the payoffs of a $(K_3,K_4)$-bull spread and a $(K_1,K_2)$-bear spread. Apply the superposition principle.
13.12.2 By computation.
13.12.3 The risk-neutral valuation yields
$$f_t = e^{-r(T-t)}\,\widehat{E}_t[S_T-A_T] = e^{-r(T-t)}\,\widehat{E}_t[S_T]-e^{-r(T-t)}\,\widehat{E}_t[A_T]$$
$$= S_t-e^{-r(T-t)}\frac{t}{T}A_t-\frac{1}{rT}S_t\big(1-e^{-r(T-t)}\big) = S_t\Big(1-\frac{1}{rT}+\frac{1}{rT}e^{-r(T-t)}\Big)-e^{-r(T-t)}\frac{t}{T}A_t.$$
13.12.5 We have
$$\lim_{t\to0}f_t = S_0e^{-rT+(r-\frac{\sigma^2}{2})\frac{T}{2}+\frac{\sigma^2T}{6}}-e^{-rT}K = S_0e^{-\frac{1}{2}(r+\frac{\sigma^2}{6})T}-e^{-rT}K.$$
13.12.6 Since $G_T<A_T$, then $\widehat{E}_t[G_T]<\widehat{E}_t[A_T]$ and hence
$$e^{-r(T-t)}\,\widehat{E}_t[G_T-K]<e^{-r(T-t)}\,\widehat{E}_t[A_T-K],$$
so the forward contract on the geometric average is cheaper.
13.12.7 Dividing
$$G_t = S_0e^{(\mu-\frac{\sigma^2}{2})\frac{t}{2}}\,e^{\frac{\sigma}{t}\int_0^tW_u\,du},\qquad G_T = S_0e^{(\mu-\frac{\sigma^2}{2})\frac{T}{2}}\,e^{\frac{\sigma}{T}\int_0^TW_u\,du}$$
yields
$$\frac{G_T}{G_t} = e^{(\mu-\frac{\sigma^2}{2})\frac{T-t}{2}}\,e^{\frac{\sigma}{T}\int_0^TW_u\,du-\frac{\sigma}{t}\int_0^tW_u\,du}.$$
An algebraic manipulation leads to
$$G_T = G_t\,e^{(\mu-\frac{\sigma^2}{2})\frac{T-t}{2}}\,e^{(\frac{1}{T}-\frac{1}{t})\sigma\int_0^tW_u\,du}\,e^{\frac{\sigma}{T}\int_t^TW_u\,du}.$$
13.15.7 Use that
$$P\Big(\max_{t\le T}S_t\le z\Big) = P\Big(\sup_{t\le T}\Big[\Big(\mu-\frac{\sigma^2}{2}\Big)t+\sigma W_t\Big]\le\ln\frac{z}{S_0}\Big) = P\Big(\sup_{t\le T}\Big[\frac{1}{\sigma}\Big(\mu-\frac{\sigma^2}{2}\Big)t+W_t\Big]\le\frac{1}{\sigma}\ln\frac{z}{S_0}\Big).$$
Chapter 15
15.5.1 $P = F-F_S\,S = c-N(d_1)S = -Ke^{-r(T-t)}N(d_2)$.
Chapter 17
17.1.9
$$P(S_t<b) = P\Big(\Big(r-\frac{\sigma^2}{2}\Big)t+\sigma W_t<\ln\frac{b}{S_0}\Big) = P\Big(W_t<\frac{1}{\sigma}\Big[\ln\frac{b}{S_0}-\Big(r-\frac{\sigma^2}{2}\Big)t\Big]\Big)$$
$$= P(W_t<\alpha) = P(W_t>-\alpha)\le e^{-\frac{\alpha^2}{2t}} = e^{-\frac{1}{2\sigma^2t}\left[\ln(S_0/b)+(r-\sigma^2/2)t\right]^2},$$
where $\alpha = \frac{1}{\sigma}\big[\ln\frac{b}{S_0}-\big(r-\frac{\sigma^2}{2}\big)t\big]$ and we used Proposition 1.12.11 with $\mu = 0$.
Index
Brownian motion, 44
  distribution function, 45
Dynkin's formula, 173
generator, 169
  of an Ito diffusion, 169
Ito's formula, 169
Kolmogorov's backward equation, 173