
4.5 Covariance and Correlation


In earlier sections, we discussed the absence or presence of a relationship between two
random variables: independence or nonindependence. But if there is a relationship, it
may be strong or weak. In this section, we discuss two numerical measures of the
strength of a relationship between two random variables: the covariance and the correlation.

Throughout this section, we will use the notation $\mathrm{E}X = \mu_X$, $\mathrm{E}Y = \mu_Y$, $\mathrm{Var}\,X = \sigma_X^2$, and $\mathrm{Var}\,Y = \sigma_Y^2$.

Definition 4.5.1 The covariance of X and Y is the number defined by

$$\mathrm{Cov}(X, Y) = \mathrm{E}\bigl[(X - \mu_X)(Y - \mu_Y)\bigr].$$

Definition 4.5.2 The correlation of X and Y is the number defined by


$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}.$$

The value $\rho_{XY}$ is also called the correlation coefficient.

Theorem 4.5.3 For any random variables X and Y ,

$$\mathrm{Cov}(X, Y) = \mathrm{E}XY - \mu_X \mu_Y.$$
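This identity is easy to verify numerically. The following sketch (the data values are illustrative, not from the text) computes the covariance of an empirical distribution both ways:

```python
import numpy as np

# Check E[(X - mu_X)(Y - mu_Y)] == E[XY] - mu_X * mu_Y on sample data.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.0, 5.0, 3.0])

mu_x, mu_y = x.mean(), y.mean()
cov_def = np.mean((x - mu_x) * (y - mu_y))  # Definition 4.5.1
cov_thm = np.mean(x * y) - mu_x * mu_y      # Theorem 4.5.3

print(cov_def, cov_thm)  # identical up to floating-point error
```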

Theorem 4.5.5 If $X$ and $Y$ are independent random variables, then $\mathrm{Cov}(X, Y) = 0$ and
$\rho_{XY} = 0$.

Theorem 4.5.6 If X and Y are any two random variables and a and b are any two constants,
then
$$\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}X + b^2\,\mathrm{Var}Y + 2ab\,\mathrm{Cov}(X, Y).$$

If X and Y are independent random variables, then

$$\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}X + b^2\,\mathrm{Var}Y.$$
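Because Theorem 4.5.6 is an algebraic identity in the moments, it holds exactly for any empirical distribution as well. A quick numerical check (the distributions and constants below are illustrative assumptions):

```python
import numpy as np

# Verify Var(aX + bY) = a^2 VarX + b^2 VarY + 2ab Cov(X, Y)
# using population-style sample moments (ddof=0 throughout).
rng = np.random.default_rng(42)
x = rng.standard_normal(1000)
y = rng.standard_normal(1000) + 0.5 * x  # make X and Y dependent
a, b = 2.0, -3.0

lhs = np.var(a * x + b * y)
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
rhs = a**2 * np.var(x) + b**2 * np.var(y) + 2 * a * b * cov_xy

print(lhs, rhs)  # equal up to floating-point error
```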

Theorem 4.5.7 For any random variables X and Y ,

a. $-1 \le \rho_{XY} \le 1$.

b. $|\rho_{XY}| = 1$ if and only if there exist numbers $a \neq 0$ and $b$ such that $P(Y = aX + b) = 1$.
If $\rho_{XY} = 1$, then $a > 0$, and if $\rho_{XY} = -1$, then $a < 0$.

Proof: Consider the function h(t) defined by

$$h(t) = \mathrm{E}\bigl[(X - \mu_X)t + (Y - \mu_Y)\bigr]^2 = t^2 \sigma_X^2 + 2t\,\mathrm{Cov}(X, Y) + \sigma_Y^2.$$

Since $h(t) \ge 0$ for all $t$ and $h$ is a quadratic function of $t$, its discriminant must be nonpositive:

$$\bigl(2\,\mathrm{Cov}(X, Y)\bigr)^2 - 4\sigma_X^2 \sigma_Y^2 \le 0.$$

This is equivalent to
$$-\sigma_X \sigma_Y \le \mathrm{Cov}(X, Y) \le \sigma_X \sigma_Y.$$

That is,
$$-1 \le \rho_{XY} \le 1.$$

Also, $|\rho_{XY}| = 1$ if and only if the discriminant is equal to $0$, that is, if and only if $h(t)$ has a
single root $t_0$. But since $\bigl[(X - \mu_X)t + (Y - \mu_Y)\bigr]^2 \ge 0$, $h(t_0) = 0$ if and only if

$$P\bigl((X - \mu_X)t_0 + (Y - \mu_Y) = 0\bigr) = 1.$$

This is $P(Y = aX + b) = 1$ with $a = -t_0$ and $b = \mu_X t_0 + \mu_Y$, where $t_0$ is the root of $h(t)$. Using
the quadratic formula, we see that this root is $t_0 = -\mathrm{Cov}(X, Y)/\sigma_X^2$. Thus $a = -t_0$ has the
same sign as $\rho_{XY}$, proving the final assertion.
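Part (b) is easy to illustrate numerically: for data satisfying $Y = aX + b$ exactly, the sample correlation is $+1$ or $-1$ with sign matching $a$ (a sketch with illustrative values):

```python
import numpy as np

# If Y = aX + b with probability 1, then |corr(X, Y)| = 1,
# with sign equal to the sign of a (Theorem 4.5.7(b)).
x = np.linspace(0.0, 1.0, 101)

r_pos = np.corrcoef(x, 2.0 * x + 3.0)[0, 1]   # a = 2 > 0
r_neg = np.corrcoef(x, -2.0 * x + 3.0)[0, 1]  # a = -2 < 0

print(r_pos, r_neg)  # approximately 1 and -1
```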

Example 4.5.8 (Correlation-I) Let $X$ have a uniform$(0, 1)$ distribution and $Z$ have a
uniform$(0, 0.1)$ distribution. Suppose $X$ and $Z$ are independent. Let $Y = X + Z$ and consider
the random vector $(X, Y)$. The joint pdf of $(X, Y)$ is

$$f(x, y) = 10, \qquad 0 < x < 1,\quad x < y < x + 0.1.$$

Note $f(x, y)$ can be obtained from the relationship $f(x, y) = f(y \mid x) f(x)$. Then

$$\mathrm{Cov}(X, Y) = \mathrm{E}XY - (\mathrm{E}X)(\mathrm{E}Y) = \mathrm{E}X(X + Z) - (\mathrm{E}X)\bigl(\mathrm{E}(X + Z)\bigr) = \sigma_X^2 = \frac{1}{12},$$

where the terms $\mathrm{E}XZ$ and $(\mathrm{E}X)(\mathrm{E}Z)$ cancel because $X$ and $Z$ are independent. The
variance of $Y$ is $\sigma_Y^2 = \mathrm{Var}X + \mathrm{Var}Z = \frac{1}{12} + \frac{1}{1200}$. Thus

$$\rho_{XY} = \frac{1/12}{\sqrt{1/12}\,\sqrt{1/12 + 1/1200}} = \sqrt{\frac{100}{101}}.$$
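A Monte Carlo check of this example (an illustrative sketch; the sample size and seed are arbitrary choices, not from the text):

```python
import numpy as np

# Simulate Example 4.5.8: X ~ uniform(0,1), Z ~ uniform(0,0.1)
# independent, Y = X + Z. The correlation should be close to
# sqrt(100/101) ~= 0.995.
rng = np.random.default_rng(0)
n = 200_000
x = rng.uniform(0.0, 1.0, n)
z = rng.uniform(0.0, 0.1, n)
y = x + z

r = np.corrcoef(x, y)[0, 1]
print(r, np.sqrt(100 / 101))
```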

The next example illustrates that there may be a strong relationship between X and Y , but
if the relationship is not linear, the correlation may be small.

Example 4.5.9 (Correlation-II) Let $X \sim \mathrm{uniform}(-1, 1)$, $Z \sim \mathrm{uniform}(0, 0.1)$, and let $X$ and $Z$ be
independent. Let $Y = X^2 + Z$ and consider the random vector $(X, Y)$. Given $X = x$,
$Y \sim \mathrm{uniform}(x^2, x^2 + 0.1)$, so the joint pdf of $X$ and $Y$ is

$$f(x, y) = 5, \qquad -1 < x < 1,\quad x^2 < y < x^2 + \tfrac{1}{10}.$$

Since $\mathrm{E}X = 0$, $\mathrm{E}X^3 = 0$ by symmetry, and $X$ and $Z$ are independent,

$$\mathrm{Cov}(X, Y) = \mathrm{E}\bigl(X(X^2 + Z)\bigr) - (\mathrm{E}X)\bigl(\mathrm{E}(X^2 + Z)\bigr) = \mathrm{E}X^3 + (\mathrm{E}X)(\mathrm{E}Z) - 0 \cdot \mathrm{E}(X^2 + Z) = 0.$$

Thus, $\rho_{XY} = \mathrm{Cov}(X, Y)/(\sigma_X \sigma_Y) = 0$.
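The same kind of simulation illustrates this example: $Y$ is almost a deterministic function of $X$, yet the sample correlation is essentially zero because the relationship is quadratic rather than linear (sample size and seed are again arbitrary):

```python
import numpy as np

# Simulate Example 4.5.9: X ~ uniform(-1,1), Z ~ uniform(0,0.1)
# independent, Y = X^2 + Z. Strong dependence, near-zero correlation.
rng = np.random.default_rng(0)
n = 200_000
x = rng.uniform(-1.0, 1.0, n)
z = rng.uniform(0.0, 0.1, n)
y = x**2 + z

r = np.corrcoef(x, y)[0, 1]
print(r)  # close to 0
```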

Definition 4.5.10 Let $-\infty < \mu_X < \infty$, $-\infty < \mu_Y < \infty$, $0 < \sigma_X$, $0 < \sigma_Y$, and $-1 < \rho < 1$ be
five real numbers. The bivariate normal pdf with means $\mu_X$ and $\mu_Y$, variances $\sigma_X^2$ and $\sigma_Y^2$,
and correlation $\rho$ is the bivariate pdf given by

$$f(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\left( -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x - \mu_X}{\sigma_X}\right)^2 - 2\rho \left(\frac{x - \mu_X}{\sigma_X}\right)\left(\frac{y - \mu_Y}{\sigma_Y}\right) + \left(\frac{y - \mu_Y}{\sigma_Y}\right)^2 \right] \right)$$

for $-\infty < x < \infty$ and $-\infty < y < \infty$.

The many nice properties of this distribution include these:

a. The marginal distribution of $X$ is $\mathrm{N}(\mu_X, \sigma_X^2)$.

b. The marginal distribution of $Y$ is $\mathrm{N}(\mu_Y, \sigma_Y^2)$.

c. The correlation between $X$ and $Y$ is $\rho_{XY} = \rho$.

d. For any constants $a$ and $b$, the distribution of $aX + bY$ is $\mathrm{N}(a\mu_X + b\mu_Y,\ a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\rho\sigma_X\sigma_Y)$.
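Properties (a)-(d) can be checked by simulation for any particular parameter choice. The sketch below uses one illustrative set of parameters (all specific numbers are assumptions, not from the text):

```python
import numpy as np

# Sample from a bivariate normal and check properties (a)-(d) empirically.
rng = np.random.default_rng(1)
mu_x, mu_y = 1.0, -2.0
sd_x, sd_y, rho = 2.0, 0.5, 0.6
cov = np.array([[sd_x**2,           rho * sd_x * sd_y],
                [rho * sd_x * sd_y, sd_y**2         ]])

xy = rng.multivariate_normal([mu_x, mu_y], cov, size=200_000)
x, y = xy[:, 0], xy[:, 1]

# (a), (b): marginal means are mu_x, mu_y; (c): correlation ~= rho.
r = np.corrcoef(x, y)[0, 1]

# (d): aX + bY has mean a*mu_x + b*mu_y and variance
#      a^2 sd_x^2 + b^2 sd_y^2 + 2ab rho sd_x sd_y.
a, b = 3.0, 4.0
w = a * x + b * y
mean_d = a * mu_x + b * mu_y
var_d = a**2 * sd_x**2 + b**2 * sd_y**2 + 2 * a * b * rho * sd_x * sd_y

print(x.mean(), y.mean(), r, w.mean(), w.var())
```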

Assuming (a) and (b) are true, we will prove (c). Let

$$s = \left(\frac{x - \mu_X}{\sigma_X}\right)\left(\frac{y - \mu_Y}{\sigma_Y}\right) \quad\text{and}\quad t = \frac{x - \mu_X}{\sigma_X}.$$

Then $x = \sigma_X t + \mu_X$, $y = \sigma_Y s/t + \mu_Y$, and the Jacobian of the transformation is $J = \sigma_X \sigma_Y / t$.

With this change of variables, we obtain

$$\rho_{XY} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} s\, f\!\left(\sigma_X t + \mu_X,\ \frac{\sigma_Y s}{t} + \mu_Y\right) \left|\frac{\sigma_X \sigma_Y}{t}\right| ds\, dt$$

$$= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} s \left(2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}\right)^{-1} \exp\left(-\frac{t^2 - 2\rho s + (s/t)^2}{2(1 - \rho^2)}\right) \frac{\sigma_X \sigma_Y}{|t|}\, ds\, dt$$

$$= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) \left[ \int_{-\infty}^{\infty} \frac{s}{\sqrt{2\pi(1 - \rho^2)t^2}} \exp\left(-\frac{(s - \rho t^2)^2}{2(1 - \rho^2)t^2}\right) ds \right] dt.$$

The inner integral is $\mathrm{E}S$, where $S$ is a normal random variable with $\mathrm{E}S = \rho t^2$ and $\mathrm{Var}\,S = (1 - \rho^2)t^2$. Thus,

$$\rho_{XY} = \int_{-\infty}^{\infty} \rho t^2 \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\, dt = \rho,$$

since the remaining integral is the second moment of a standard normal random variable, which equals $1$.
