
Continuous Random Variables

If the cumulative distribution function F(x) of a random variable X is continuous, then X is said to be a continuous random variable.

Most continuous random variables that we encounter will have a distribution function F(x) which has a derivative (except perhaps at a finite number of points). The function

f(x) = \frac{dF(x)}{dx}

is called the probability density function of the random variable X. We have

F(x) = \int_{-\infty}^{x} f(t)\,dt

and the following properties:
• f(t) ≥ 0
• \int_{-\infty}^{\infty} f(t)\,dt = 1
Example Consider

F(x) = \begin{cases} 0, & x < 0 \\ x, & 0 \le x \le 1 \\ 1, & x > 1 \end{cases}

This cumulative distribution function possesses a derivative at all points except at x = 0 and x = 1. The corresponding density function is

f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}

For any a and b such that a < b,

P(a < X \le b) = P(X \le b) - P(X \le a) = \int_{-\infty}^{b} f(t)\,dt - \int_{-\infty}^{a} f(t)\,dt = \int_{a}^{b} f(t)\,dt = F(b) - F(a).
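The identity above can be checked numerically for the earlier example, where f(x) = 1 on [0, 1]. This is a sketch using a simple midpoint rule; the endpoints a = 0.2, b = 0.7 are arbitrary illustrative choices.

```python
# Midpoint-rule check that the integral of f from a to b equals F(b) - F(a)
# for the example density f(x) = 1 on [0, 1] (the unit uniform).

def f(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def F(x):
    return min(max(x, 0.0), 1.0)  # CDF of the unit uniform

def integrate(func, lo, hi, n=10_000):
    h = (hi - lo) / n
    return sum(func(lo + (i + 0.5) * h) for i in range(n)) * h

a, b = 0.2, 0.7
lhs = integrate(f, a, b)   # direct integral of the density
rhs = F(b) - F(a)          # 0.7 - 0.2 = 0.5
assert abs(lhs - rhs) < 1e-9
```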

We see that, for a continuous random variable, probabilities do not change if strict and non-strict inequalities are interchanged in events. So, we have

P(a \le X \le b) = P(a < X \le b).

Example Consider the following density function:

f(x) = \begin{cases} kx(1 - x), & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}

We have

1 = \int_{-\infty}^{\infty} f(x)\,dx = \int_{0}^{1} kx(1 - x)\,dx = k\left(\frac{x^2}{2} - \frac{x^3}{3}\right)\bigg|_{0}^{1} = \frac{k}{6}.

Hence, k = 6, and for instance

P\left(X < \tfrac{1}{3}\right) = F\left(\tfrac{1}{3}\right) = \int_{0}^{1/3} 6x(1 - x)\,dx = \frac{7}{27}.
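The computation above can be verified directly from the antiderivative; this small check mirrors each step of the example.

```python
# The antiderivative of x(1-x) is x^2/2 - x^3/3, so the total mass of
# k*x*(1-x) on [0, 1] is k/6, forcing k = 6; then the CDF on [0, 1] is
# F(x) = 3x^2 - 2x^3, and F(1/3) = 7/27.

def total_mass(k):
    return k * (1 / 2 - 1 / 3)   # k * (x^2/2 - x^3/3) evaluated at x = 1

def F(x):
    return 3 * x**2 - 2 * x**3   # antiderivative of 6x(1 - x)

assert abs(total_mass(6) - 1.0) < 1e-12
assert abs(F(1 / 3) - 7 / 27) < 1e-12
```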
In order to obtain an intuitive interpretation of the density function, we observe that for a very small ε,

P\left(a - \tfrac{\varepsilon}{2} \le X \le a + \tfrac{\varepsilon}{2}\right) = \int_{a - \varepsilon/2}^{a + \varepsilon/2} f(t)\,dt \approx \varepsilon f(a).

So, εf(a) is approximately equal to the probability that the random variable takes a value in the interval (a − ε/2, a + ε/2), i.e. εf(a) is a measure of how likely it is that the random variable will be near a.

On the other hand, it should be noted that the values of the probability density function f(x) are not probabilities, and thus it is perfectly acceptable if f(x) > 1.
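This approximation can be illustrated numerically with the density f(x) = 6x(1 − x) from the previous example; a = 0.5 and ε = 10⁻³ are illustrative choices.

```python
# Numeric illustration of P(a - eps/2 <= X <= a + eps/2) ~ eps * f(a)
# for the density f(x) = 6x(1-x), whose CDF on [0, 1] is 3x^2 - 2x^3.

def F(x):
    return 3 * x**2 - 2 * x**3

a, eps = 0.5, 1e-3
exact = F(a + eps / 2) - F(a - eps / 2)  # exact interval probability
approx = eps * 6 * a * (1 - a)           # eps * f(a)
assert abs(exact - approx) / exact < 1e-4
```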

Example Consider a random variable X such that a < X < b and which is equally likely to take a value in any interval

(x − dx, x + dx) ⊂ (a, b),

where dx is a very small (infinitesimal) quantity.
Then, the condition “equally likely” leads to the conclusion that the density function f(x) is a constant on (a, b), say constant c. In other words,

f(x) = \begin{cases} c, & \text{if } a < x < b \\ 0, & \text{otherwise} \end{cases}

According to the definition of density function,

\int_{a}^{b} f(x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx = 1;

on the other hand,

\int_{a}^{b} f(x)\,dx = \int_{a}^{b} c\,dx = c(b - a),

and therefore

f(x) = \begin{cases} \dfrac{1}{b - a}, & \text{if } a < x < b \\ 0, & \text{otherwise} \end{cases}

This is a uniform distribution with the (cumulative) distribution function

F(x) = \begin{cases} 0, & \text{if } x \le a \\ \dfrac{x - a}{b - a}, & \text{if } a < x < b \\ 1, & \text{if } x \ge b \end{cases}
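A minimal sketch of the uniform density and CDF derived above; the endpoints a = 2, b = 5 are illustrative.

```python
# Uniform distribution on (a, b): density 1/(b - a) inside the interval,
# CDF (x - a)/(b - a) clamped to [0, 1].

def uniform_pdf(x, a, b):
    return 1.0 / (b - a) if a < x < b else 0.0

def uniform_cdf(x, a, b):
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

a, b = 2.0, 5.0
assert abs(uniform_pdf(3.0, a, b) - 1 / 3) < 1e-12
assert uniform_cdf(3.5, a, b) == 0.5   # midpoint of (2, 5)
assert uniform_cdf(6.0, a, b) == 1.0   # beyond b
```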
Let X be a continuous random variable having probability density function f(x) such that

\int_{-\infty}^{\infty} |x| f(x)\,dx < \infty;

then the expected value of X (often called the mean and denoted by µ) is defined by

E[X] = \int_{-\infty}^{\infty} x f(x)\,dx.

Let X be a continuous random variable with probability density function f_X(t). Let Y = g(X), where g is a differentiable function. For simplicity we assume that g is strictly increasing. Let F_X(x) and F_Y(y) be the cumulative distribution functions of X and Y, respectively. Then the event {g(X) ≤ y} is the same as {X ≤ g^{-1}(y)}, and therefore F_Y(y) = F_X(g^{-1}(y)). We have

f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{d}{dy} F_X(g^{-1}(y)) = f_X(g^{-1}(y)) \frac{d}{dy} g^{-1}(y).
Hence,

E[Y] = \int_{-\infty}^{\infty} y f_Y(y)\,dy = \int_{-\infty}^{\infty} y f_X(g^{-1}(y)) \frac{d}{dy} g^{-1}(y)\,dy.

Now we make the change of variables x = g^{-1}(y), which gives y = g(x) and

dx = \frac{d}{dy} g^{-1}(y)\,dy.

Hence,

E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx. \qquad (1)

If g is a strictly decreasing function, the proof of (1) is similar.
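Formula (1) can be illustrated numerically: take X uniform on (0, 1) and g(x) = x², which is strictly increasing on the support. Then f_Y(y) = f_X(√y) · (d/dy)√y = 1/(2√y) on (0, 1), and both sides of (1) give E[Y] = 1/3. This is a sketch using midpoint integration rather than a closed-form proof.

```python
# Both sides of formula (1) for X ~ Uniform(0, 1) and g(x) = x^2:
# E[Y] computed from the transformed density f_Y(y) = 1/(2*sqrt(y)),
# versus E[g(X)] computed directly from f_X(x) = 1.
import math

def integrate(func, lo, hi, n=100_000):
    h = (hi - lo) / n
    return sum(func(lo + (i + 0.5) * h) for i in range(n)) * h

via_f_Y = integrate(lambda y: y / (2 * math.sqrt(y)), 0.0, 1.0)
via_lotus = integrate(lambda x: x**2, 0.0, 1.0)   # right-hand side of (1)
assert abs(via_f_Y - 1 / 3) < 1e-4
assert abs(via_lotus - 1 / 3) < 1e-6
```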

The variance of a continuous random variable X with expected value µ and density function f(x) is

Var[X] = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx.

To compute the variance, we can use

Var[X] = E[X^2] - (E[X])^2.
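A quick numeric check of the shortcut for the unit uniform density f(x) = 1 on [0, 1], where E[X] = 1/2, E[X²] = 1/3, and hence Var[X] = 1/3 − 1/4 = 1/12.

```python
# Variance of the unit uniform by the definition and by the shortcut
# Var[X] = E[X^2] - (E[X])^2, via midpoint integration.

def integrate(func, lo, hi, n=10_000):
    h = (hi - lo) / n
    return sum(func(lo + (i + 0.5) * h) for i in range(n)) * h

mu = integrate(lambda x: x, 0.0, 1.0)
var_definition = integrate(lambda x: (x - mu)**2, 0.0, 1.0)
var_shortcut = integrate(lambda x: x**2, 0.0, 1.0) - mu**2
assert abs(var_definition - 1 / 12) < 1e-6
assert abs(var_shortcut - var_definition) < 1e-9
```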
Exponential Distribution

A continuous random variable having density function

f(x) = \lambda e^{-\lambda x}, \qquad 0 \le x < \infty,

for some λ > 0 is said to be an exponential random variable with parameter λ. Its cumulative distribution function is given by

F(x) = \int_{0}^{x} f(t)\,dt = \int_{0}^{x} \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda x}.

Let X be an exponential random variable with parameter λ. Integrating by parts,

E[X] = \int_{0}^{\infty} x f(x)\,dx = \int_{0}^{\infty} x \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda}.

Similarly,

Var[X] = \frac{1}{\lambda^2}.
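A Monte Carlo sketch of these two moments using Python's standard-library exponential generator; λ = 2 is an illustrative choice.

```python
# Sample-based check of E[X] = 1/lambda and Var[X] = 1/lambda^2 for an
# exponential random variable.
import random

random.seed(0)
lam = 2.0
n = 200_000
xs = [random.expovariate(lam) for _ in range(n)]
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
assert abs(mean - 1 / lam) < 0.01     # 1/lambda = 0.5
assert abs(var - 1 / lam**2) < 0.01   # 1/lambda^2 = 0.25
```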

No-memory Property of the Exponential Distribution

Let a > 0 and b > 0, and let X be an exponential random variable with parameter λ. Since for any c > 0

P(X > c) = \int_{c}^{\infty} \lambda e^{-\lambda x}\,dx = e^{-\lambda c},

from the definition of conditional probability,

P(X > a + b \mid X > a) = \frac{P(\{X > a + b\} \cap \{X > a\})}{P(X > a)} = \frac{P(X > a + b)}{P(X > a)} = \frac{e^{-\lambda(a + b)}}{e^{-\lambda a}} = e^{-\lambda b} = P(X > b).
Thus, if X is, for example, the life of some elec-
tronic component, then the memoryless prop-
erty means that, for any age a, the remaining
life distribution is the same as the original life
distribution.
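The no-memory property can be verified directly from the closed-form tail P(X > c) = e^{−λc}; the values of λ, a, and b below are illustrative.

```python
# Check that P(X > a + b | X > a) = P(X > b) for an exponential tail.
import math

lam, a, b = 1.5, 2.0, 0.7

def tail(c):
    return math.exp(-lam * c)   # P(X > c)

conditional = tail(a + b) / tail(a)   # P(X > a + b) / P(X > a)
assert abs(conditional - tail(b)) < 1e-12
```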
In fact the memoryless property is a defining property of the exponential distribution. Let X be a continuous random variable that denotes the lifetime of something that decays spontaneously, not as a result of age. For example, the isotope carbon-14 decays into nitrogen-14 (this is used in carbon-14 dating). This can be described by

P(X > x + y \mid X > y) = P(X > x).

Let F(x) be the cumulative distribution function of X, and let G(x) = P(X > x). Hence,

G(x) = 1 - F(x),

which is called the tail distribution, or sometimes just the tail of X. In our discussion, G(x) is the probability of surviving to age x and is therefore called the survival function. We have

P(X > x) = P(X > x + y \mid X > y) = \frac{P(\{X > x + y\} \cap \{X > y\})}{P(X > y)} = \frac{P(X > x + y)}{P(X > y)}.
Hence,

P(X > x + y) = P(X > x) P(X > y),

or equivalently,

G(x + y) = G(x) G(y).

By subtracting G(y) and dividing by x, and noting that G(0) = P(X > 0) = 1,

\frac{G(x + y) - G(y)}{x} = G(y) \frac{G(x) - 1}{x} = G(y) \frac{G(x) - G(0)}{x}.

Now, taking the limit as x approaches zero,

G'(y) = G'(0) G(y),

and hence,

G(y) = C e^{G'(0) y},

where C is a constant.
Since G(0) = 1, we get C = 1. Furthermore, let f(x) be the probability density function of X. Then,

G'(x) = \frac{d}{dx}(1 - F(x)) = -f(x),

so in particular G'(0) = -f(0), and therefore

G(y) = e^{-f(0) y},

which gives

F(y) = 1 - G(y) = 1 - e^{-f(0) y}.

Hence, X has an exponential distribution with parameter λ = f(0).
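The derivation can be tied to a concrete case: for an exponential tail G(y) = e^{−λy}, the multiplicative property G(x + y) = G(x)G(y) holds, and a numerical derivative at 0 recovers G′(0) = −f(0) = −λ. The value λ = 0.8 is illustrative.

```python
# Verify the functional equation and the rate recovered from G'(0).
import math

lam = 0.8

def G(y):
    return math.exp(-lam * y)   # survival function of the exponential

assert abs(G(1.3 + 0.4) - G(1.3) * G(0.4)) < 1e-12

h = 1e-6
g_prime_0 = (G(h) - G(0)) / h   # forward-difference estimate of G'(0)
f0 = lam                        # f(0) = lam * exp(0) = lam
assert abs(g_prime_0 + f0) < 1e-5
```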

Joint Distribution

Let X and Y be continuous random variables on the same sample space. We also assume the same probability measure. Then, the event {X ≤ x and Y ≤ y} consists of all sample points s ∈ S such that X(s) ≤ x and Y(s) ≤ y. Consider a function f(x, y) such that
• f(x, y) ≥ 0
• \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(u, v)\,dv\,du = 1
and

F(x, y) = P(X \le x, Y \le y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du.

The function f(x, y) is called the joint (compound) probability density function of X and Y, and F(x, y) is called the joint cumulative distribution function of X and Y.

We have

P(a < X < b, c < Y < d) = \int_{a}^{b} \int_{c}^{d} f(x, y)\,dy\,dx.
The intuition is the same as in the one-dimensional case: f(x, y) measures how likely the pair (X, Y) is to take a value in a neighbourhood of the point (x, y).

Example Consider random variables X and Y with joint probability density function

f(x, y) = \begin{cases} 1, & 0 \le x \le 1 \text{ and } 0 \le y \le 1 \\ 0, & \text{otherwise} \end{cases}

Then, the probability that X > Y is

P(X > Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{u} f(u, v)\,dv\,du = \int_{0}^{1} \int_{0}^{u} dv\,du = \frac{1}{2}.
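A Monte Carlo sketch of this example: the joint density f(x, y) = 1 on the unit square corresponds to X and Y being independent uniforms on [0, 1].

```python
# Estimate P(X > Y) for independent uniforms on the unit square.
import random

random.seed(1)
n = 100_000
hits = sum(1 for _ in range(n) if random.random() > random.random())
estimate = hits / n
assert abs(estimate - 0.5) < 0.01
```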

When y → ∞, the event {X ≤ x and Y ≤ y} “approaches” the event {X ≤ x}. Since

F_X(x) = \int_{-\infty}^{x} \int_{-\infty}^{\infty} f(u, y)\,dy\,du,

the marginal density is

f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy.

Similarly,

f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx.

We have

E(X + Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x + y) f(x, y)\,dy\,dx = \int_{-\infty}^{\infty} x f_X(x)\,dx + \int_{-\infty}^{\infty} y f_Y(y)\,dy = E(X) + E(Y),

where the middle equality splits the integrand into two terms and integrates out one variable in each term using the marginal densities.
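A Monte Carlo sketch of this linearity for the unit-square joint density f(x, y) = 1: each marginal is uniform on [0, 1] with mean 1/2, so E(X + Y) = 1.

```python
# Check E(X + Y) = E(X) + E(Y) by sampling from the unit-square density.
import random

random.seed(2)
n = 100_000
xs = [random.random() for _ in range(n)]
ys = [random.random() for _ in range(n)]
mean_sum = sum(x + y for x, y in zip(xs, ys)) / n   # sample E(X + Y)
sum_means = sum(xs) / n + sum(ys) / n               # sample E(X) + E(Y)
assert abs(mean_sum - 1.0) < 0.01
assert abs(mean_sum - sum_means) < 1e-9
```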

