Background Material
John Stapleton
Table of Contents
1.1 A review of some basic statistical concepts
1.2 Random regressors
1.3 Modelling the conditional mean
1.3.1 Specifying a functional form for the conditional mean
1.3.2 Choosing the regressors
$$E(x) = \sum_{i=1} x_i f(x_i).$$
$$E(x) = \int x f(x)\,dx.$$
For any set of random variables x, y and z, the expectations operator satisfies the following rules:
(ETC3410)
Definition (1.2)
The variance of the random variable x, which we denote by Var(x), is defined as:
$$\begin{aligned}
Var(x) &= E\{[x - E(x)]^2\} \\
&= E[x^2 + E(x)^2 - 2xE(x)] \\
&= E(x^2) + E(x)^2 - 2E(x)E(x) \\
&= E(x^2) + E(x)^2 - 2E(x)^2 \\
&= E(x^2) - E(x)^2.
\end{aligned}$$
Informally, Var (x ) measures how tightly the values of x are clustered around the
mean.
$$Cov(x, y) = E\{[x - E(x)][y - E(y)]\}.$$
$$\begin{aligned}
Cov(x, y) &= E\{[x - E(x)][y - E(y)]\} \\
&= E[xy - xE(y) - yE(x) + E(x)E(y)] \\
&= E(xy) - E(x)E(y) - E(y)E(x) + E(x)E(y) \\
&= E(xy) - E(x)E(y).
\end{aligned}$$
R6 $Var(a) = 0$.
R7 $Var(ax) = a^2\,Var(x)$.
R8 $Var(ax + by) = a^2\,Var(x) + b^2\,Var(y) + 2ab\,Cov(x, y)$.
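Rule R8 can be checked numerically. The sketch below is only an illustration: the simulated data, the seed and the constants a and b are assumptions, not part of the notes. When sample variances and covariances all use the same divisor, the identity holds exactly for the sample moments as well.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)   # correlated with x by construction
a, b = 2.0, -3.0                      # arbitrary illustrative constants

# Left-hand side of R8: sample variance of the linear combination (divisor N)
lhs = np.var(a * x + b * y)

# Right-hand side of R8, using the same divisor N for every moment
cov_xy = np.cov(x, y, ddof=0)[0, 1]
rhs = a**2 * np.var(x) + b**2 * np.var(y) + 2 * a * b * cov_xy

print(lhs, rhs)
```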
Definition (1.4)
Let x and y be two random variables. Then the correlation between x and y, which we denote by Corr(x, y), is defined as:
$$Corr(x, y) = \frac{Cov(x, y)}{SD(x)\,SD(y)},$$
where
$$-1 \le Corr(x, y) \le 1.$$
$$y_i = \beta_0 + \beta_1 x_i + u_i$$
$y_i$ and $u_i$ are assumed to be random variables, but $x_i$ is assumed to be a fixed number which does not change in value from sample to sample.
While this assumption is useful for pedagogical purposes because it simplifies the analysis, it is inappropriate for the nonexperimental data with which we typically work in disciplines such as economics and finance.
Nonexperimental data is data that is not generated by performing a controlled experiment.
$$(y_i, x_i), \quad i = 1, 2, \dots, N$$
we are effectively making a drawing from the joint probability distribution of the random variables $(y_i, x_i)$.
Consider the multivariate linear regression model
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + u_i. \tag{1.1}$$
Let
$$f_J(y_i, x_{i1}, \dots, x_{ik} \mid \theta) = f_C(y_i \mid x_{i1}, \dots, x_{ik}, \theta)\, f_J(x_{i1}, \dots, x_{ik} \mid \theta), \tag{1.2}$$
where:
$f_J(y_i, x_{i1}, \dots, x_{ik} \mid \theta)$ is the joint probability distribution of $(y_i, x_{i1}, \dots, x_{ik})$.
$f_C(y_i \mid x_{i1}, \dots, x_{ik}, \theta)$ is the probability distribution of $y_i$ conditional on $(x_{i1}, \dots, x_{ik})$.
$f_J(x_{i1}, \dots, x_{ik} \mid \theta)$ is the joint probability distribution of $(x_{i1}, \dots, x_{ik})$.
$$f_J(y_i, x_{i1}, \dots, x_{ik} \mid \theta) = f_C(y_i \mid x_{i1}, \dots, x_{ik}, \theta)\, f_J(x_{i1}, \dots, x_{ik} \mid \theta), \tag{1.2}$$
this strategy obviously means that we ignore $f_J(x_{i1}, \dots, x_{ik} \mid \theta)$, and lose any information that it contains regarding the parameter vector $\theta$.
$$\theta = (\beta, \gamma),$$
where $\beta$ is the vector of parameters of interest, and assume that
$$f_J(y_i, x_{i1}, \dots, x_{ik} \mid \theta) = f_C(y_i \mid x_{i1}, \dots, x_{ik}, \beta)\, f_J(x_{i1}, \dots, x_{ik} \mid \gamma). \tag{1.3}$$
Notice that in (1.3) the parameter vector of interest appears only in the conditional distribution of $y_i$.
When (1.3) holds, $(x_{i1}, \dots, x_{ik})$ are said to be weakly exogenous with respect to $\beta$, and there is no loss of information as a result of ignoring $f_J(x_{i1}, \dots, x_{ik} \mid \gamma)$ and focusing exclusively on $f_C(y_i \mid x_{i1}, \dots, x_{ik}, \beta)$.
$$u = y - E(y \mid x_1, x_2, \dots, x_k). \tag{1.4}$$
$$y = E(y \mid x_1, x_2, \dots, x_k) + u. \tag{1.5}$$
$$E(u \mid x_1, x_2, \dots, x_k) = 0. \tag{1.6}$$
Equations (1.5) and (1.6) together imply that we can always express y as
the sum of its true conditional mean and a random error term, which itself
has a conditional mean of zero.
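The step from the definition of the error in (1.4) to the zero conditional mean in (1.6) can be written out as a short worked derivation. It uses only the linearity of conditional expectations and the fact that $E(y \mid x_1, \dots, x_k)$ is itself a function of the conditioning variables.

```latex
\begin{aligned}
E(u \mid x_1, \dots, x_k)
  &= E\big[\, y - E(y \mid x_1, \dots, x_k) \;\big|\; x_1, \dots, x_k \big] \\
  &= E(y \mid x_1, \dots, x_k) - E(y \mid x_1, \dots, x_k) \\
  &= 0.
\end{aligned}
```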
$$\frac{\partial E(y \mid x_1, x_2, \dots, x_k)}{\partial x_j}. \tag{1.7}$$
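$$E(y \mid x_1, x_2) = \alpha + \beta_1 x_1 + \beta_2 x_2, \tag{1.8}$$
$$\begin{aligned}
y &= E(y \mid x_1, x_2) + u \\
&= \alpha + \beta_1 x_1 + \beta_2 x_2 + u.
\end{aligned} \tag{1.9}$$
$$\frac{\partial E(y \mid x_1, x_2)}{\partial x_j} = \beta_j, \quad j = 1, 2. \tag{1.10}$$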
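$$E(y \mid x_1, x_2) = \alpha + \beta_1 x_1^2 + \beta_2 x_2^2, \tag{1.11}$$
$$\begin{aligned}
y &= E(y \mid x_1, x_2) + u \\
&= \alpha + \beta_1 x_1^2 + \beta_2 x_2^2 + u.
\end{aligned} \tag{1.12}$$
$$\frac{\partial E(y \mid x_1, x_2)}{\partial x_1} = 2\beta_1 x_1, \qquad \frac{\partial E(y \mid x_1, x_2)}{\partial x_2} = 2\beta_2 x_2. \tag{1.13}$$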
In some cases, a model specification that allows some of the marginal effects to vary, such as M2, may be more realistic than one that constrains all the marginal effects to be constant. For example, if we wished to study the effect of education on average wages, we might specify the conditional mean of wages as
$$E(\ln y \mid x_1, x_2) = \alpha + \beta_1 \ln x_1 + \beta_2 \ln x_2, \tag{1.15}$$
$$\begin{aligned}
\ln y &= E(\ln y \mid x_1, x_2) + u \\
&= \alpha + \beta_1 \ln x_1 + \beta_2 \ln x_2 + u.
\end{aligned} \tag{1.16}$$
$$\frac{\partial \ln y}{\partial \ln x_j} = \beta_j, \quad j = 1, 2. \tag{1.17}$$
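$$\frac{\partial \ln y}{\partial \ln x} = \lim_{\Delta \ln x \to 0} \frac{\Delta \ln y}{\Delta \ln x} \approx \frac{\Delta \ln y}{\Delta \ln x}, \quad \text{for small } \Delta \ln x.$$
Let
$$\Delta \ln y = \ln y_1 - \ln y_0, \qquad \Delta \ln x = \ln x_1 - \ln x_0.$$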
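Then
$$\begin{aligned}
\Delta \ln y &= \ln y_1 - \ln y_0 \\
&= \ln\left(\frac{y_1}{y_0}\right) \\
&= \ln\left(\frac{y_1}{y_0} - 1 + 1\right) \\
&= \ln\left(\frac{y_1}{y_0} - \frac{y_0}{y_0} + 1\right) \\
&= \ln\left(\frac{y_1 - y_0}{y_0} + 1\right) \\
&\approx \frac{y_1 - y_0}{y_0} \quad \text{for small changes in } y,
\end{aligned}$$
so that
$$100\,\Delta \ln y \approx 100\,\frac{y_1 - y_0}{y_0} = \%\ \text{change in } y.$$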
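Here we have used the approximation $\ln(N + 1) \approx N$, which holds for small $N$. For example, $\ln(0.2 + 1) = 0.18 \approx 0.2$.
Similarly,
$$100\,\Delta \ln x \approx \%\ \text{change in } x.$$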
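The quality of the approximation $\ln(1 + N) \approx N$ can be tabulated directly. The helper name `log_approx_error` and the grid of values below are illustrative assumptions. The sketch suggests the approximation is very tight for changes of a few percent and deteriorates as the proportional change grows.

```python
import math

def log_approx_error(N):
    """Absolute error of approximating ln(1 + N) by N."""
    return abs(math.log(1.0 + N) - N)

# Proportional changes of 1%, 5%, 20% and 50%
for N in (0.01, 0.05, 0.2, 0.5):
    print(f"N = {N:4.2f}  ln(1+N) = {math.log(1.0 + N):.4f}  error = {log_approx_error(N):.4f}")
```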
$$\frac{\partial \ln y}{\partial \ln x} \approx \frac{\Delta \ln y}{\Delta \ln x} = \frac{100\,\Delta \ln y}{100\,\Delta \ln x} \approx \frac{\%\ \text{change in } y}{\%\ \text{change in } x}.$$
For example, if
$$\beta_1 = 2$$
in M3, then a one percent increase in $x_1$, holding $x_2$ fixed, is associated with a two percent increase in y.
M4 The conditional mean of the log of the dependent variable is assumed to be linear in the parameters and in the level of the regressors. (log-level model)
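$$E(\ln y \mid x_1, x_2) = \alpha + \beta_1 x_1 + \beta_2 x_2, \tag{1.18}$$
$$\begin{aligned}
\ln y &= E(\ln y \mid x_1, x_2) + u \\
&= \alpha + \beta_1 x_1 + \beta_2 x_2 + u.
\end{aligned} \tag{1.19}$$
$$\frac{\partial \ln y}{\partial x_j} = \beta_j, \quad j = 1, 2. \tag{1.20}$$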
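$$100\,\beta_j = 100\,\frac{\partial \ln y}{\partial x_j} \approx \frac{100\,\Delta \ln y}{\Delta x_j} \approx \frac{\%\ \text{change in } y}{\Delta x_j}.$$
For example, if
$$\beta_1 = 0.2$$
in M4, then a one-unit increase in $x_1$, holding $x_2$ fixed, is associated with approximately a 20 percent increase in y.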
$$y = e^{[\alpha + \beta_1 x_1 + \beta_2 x_2]}\, e^{u} \tag{1.21}$$
$$\ln y = \alpha + \beta_1 x_1 + \beta_2 x_2 + u, \tag{1.22}$$
$$E(y \mid x_1, x_2) = \frac{1}{1 + e^{-(\alpha + \beta_1 x_1 + \beta_2 x_2)}} \tag{1.23}$$
This model is known as the logit model and is studied in topic 2. The logit
model is intrinsically nonlinear since it cannot be made linear in the
parameters by applying a mathematical transformation.
Intrinsically nonlinear models cannot be estimated by OLS. They are
typically estimated by using the method of maximum likelihood or, less
commonly, the method of nonlinear least squares.
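To make the nonlinearity of (1.23) concrete, the sketch below evaluates the logit conditional mean and its derivative with respect to $x_1$. The function names and the parameter values are hypothetical and chosen only for illustration. The marginal effect $\beta_1 p(1 - p)$ changes with the value of the regressors, unlike in the linear model M1.

```python
import math

def logit_mean(alpha, beta1, beta2, x1, x2):
    """Conditional mean (1.23): 1 / (1 + exp(-(alpha + beta1*x1 + beta2*x2)))."""
    index = alpha + beta1 * x1 + beta2 * x2
    return 1.0 / (1.0 + math.exp(-index))

def marginal_effect_x1(alpha, beta1, beta2, x1, x2):
    """Derivative of (1.23) with respect to x1, which equals beta1 * p * (1 - p)."""
    p = logit_mean(alpha, beta1, beta2, x1, x2)
    return beta1 * p * (1.0 - p)

# Hypothetical parameter values, not taken from the notes
alpha, beta1, beta2 = -1.0, 0.5, 0.25
for x1 in (0.0, 2.0, 6.0):
    print(x1,
          logit_mean(alpha, beta1, beta2, x1, 0.0),
          marginal_effect_x1(alpha, beta1, beta2, x1, 0.0))
```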
$$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{k-1} x_{k-1} + \beta_k x_k + u. \tag{1.24}$$
$$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{k-1} x_{k-1} + \beta_k x_k + u \tag{1.24}$$
but we estimate
$$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{k-1} x_{k-1} + v. \tag{1.25}$$
$$v = \beta_k x_k + u. \tag{1.26}$$
For example, suppose that we are interested in estimating the marginal effect of education on an individual's wage, controlling for experience, race, gender and ability. In this case the conditional mean of interest is
$$\dots + \beta_6\, ability + u. \tag{1.28}$$
In (1.28)
$$v = \beta_6\, ability + u.$$
We will see in Topic 3 that if, as we suspect, education and ability are correlated, the OLS estimator of $\beta_1$ in equation (1.28a) will no longer be "reliable" even in very large samples. More specifically, the OLS estimator of $\beta_1$ will be an inconsistent estimator of the marginal effect of education on the average wage, controlling for differences in experience, race, gender and ability. (The concept of consistency is discussed in section 1.4 below.)
Informally, if we estimate (1.28a) by OLS, the OLS estimate of $\beta_1$ will be an "unreliable" estimate of the marginal effect of education on wages, controlling for experience, race, gender and ability.
In Topic 4 we will discuss how to deal with the problem of endogenous regressors.
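The effect of omitting a regressor that is correlated with an included one can be illustrated with a small simulation. Everything in the sketch is an assumption made for illustration: the data-generating process, the coefficient values (0.5 on education, 0.7 on ability) and the helper `ols` are hypothetical and are not the wage equation of the notes.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients, computed by least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(42)
n = 100_000
ability = rng.normal(size=n)
educ = 0.8 * ability + rng.normal(size=n)      # education is correlated with ability
u = rng.normal(size=n)
wage = 1.0 + 0.5 * educ + 0.7 * ability + u    # hypothetical true model

const = np.ones(n)
b_full = ols(np.column_stack([const, educ, ability]), wage)   # ability included
b_short = ols(np.column_stack([const, educ]), wage)           # ability omitted

# With ability omitted, the coefficient on educ converges to
# 0.5 + 0.7 * Cov(educ, ability) / Var(educ) = 0.5 + 0.7 * 0.8 / 1.64, about 0.84
print(b_full[1], b_short[1])
```

Even with a very large sample, the short regression does not recover 0.5, which is the sense in which the estimator is inconsistent rather than merely imprecise.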
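$$\lim_{n \to \infty} \Pr\left(|\hat\beta_n - \beta| > \varepsilon\right) = 0 \quad \text{for every } \varepsilon > 0. \tag{1.29}$$
$$\operatorname{plim}(\hat\beta_n) = \beta. \tag{1.30}$$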
$$\operatorname{plim}(x_{1n}) = x_1, \qquad \operatorname{plim}(x_{2n}) = x_2.$$
That is, the random variables $x_{1n}$ and $x_{2n}$ converge in probability to the random variables $x_1$ and $x_2$ respectively. Then the following properties can be shown to hold:
$$\operatorname{plim}\left(\frac{1}{x_{1n}}\right) = \frac{1}{x_1}, \quad x_1 \neq 0.$$
$$\operatorname{plim}\left(\frac{x_{1n}}{x_{2n}}\right) = \frac{\operatorname{plim}(x_{1n})}{\operatorname{plim}(x_{2n})} = \frac{x_1}{x_2}, \quad x_2 \neq 0.$$
Although P1, P2, P3 and P4 above have been stated for scalar random
variables, they can be generalized to random vectors and random matrices.
(That is, vectors and matrices whose elements are random variables).
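A minimal simulation can illustrate the ratio property. It treats the special case in which the limits are constants, namely the population means of two samples. The population means (2 and 4), the normal distribution and the function name `sample_means` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_means(n):
    """Sample means of n draws from populations with means 2 and 4 (hypothetical values)."""
    x1 = rng.normal(loc=2.0, scale=1.0, size=n)
    x2 = rng.normal(loc=4.0, scale=1.0, size=n)
    return x1.mean(), x2.mean()

# As n grows, the ratio of sample means settles near plim(x1n) / plim(x2n) = 2 / 4
for n in (10, 1_000, 100_000):
    m1, m2 = sample_means(n)
    print(n, m1, m2, m1 / m2)
```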
$$\operatorname{plim}(\hat\beta_n) = \beta,$$
which means that $\hat\beta_n$ collapses to a single point as n goes to infinity, in which case the limiting distribution of $\hat\beta_n$ is degenerate.
In order to obtain a non-degenerate limiting distribution for a consistent estimator we "normalize" $\hat\beta_n$ as described below.
Formally, we say that $\hat\beta_n$ has a limiting normal distribution if
$$\sqrt{n}\,(\hat\beta_n - \beta) \xrightarrow{d} N(0, V), \tag{1.31}$$
$$\sqrt{n}\,(\hat\beta_n - \beta) \overset{approx}{\sim} N(0, V) \tag{1.32}$$
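$$E(c + dx) = c + dE(x)$$
$$var(c + dx) = d^2\, var(x).$$
Using these results it follows that if
$$\sqrt{n}\,(\hat\beta_n - \beta) \sim N(0, V)$$
then
$$\hat\beta_n - \beta \sim \frac{1}{\sqrt{n}}\, N(0, V),$$
$$\hat\beta_n - \beta \sim N\left(0, \frac{V}{n}\right),$$
$$\hat\beta_n \sim \beta + N\left(0, \frac{V}{n}\right),$$
$$\hat\beta_n \sim N\left(\beta, \frac{V}{n}\right). \tag{1.33}$$
$$\hat\beta_n \overset{asy}{\sim} N\left(\beta, \frac{V}{n}\right) \tag{1.34}$$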
In summary, whenever
$$\sqrt{n}\,(\hat\beta_n - \beta) \xrightarrow{d} N(0, V), \tag{1.31}$$
we write
$$\hat\beta_n \overset{asy}{\sim} N\left(\beta, \frac{V}{n}\right). \tag{1.34}$$
Notice that the asymptotic distribution (1.34) is derived from the limiting distribution (1.31) by assuming that the latter is approximately true in large finite samples.
Obviously, the larger the sample size the more likely it is that the asymptotic distribution is a good approximation to the true finite sample distribution of $\hat\beta_n$.
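The approximation in (1.34) can be examined by simulation. The sketch below takes the sample mean as the estimator and draws from an exponential population with mean 1 and variance V = 1. The population, the sample size and the number of replications are all assumptions chosen for illustration. Although the population is skewed, the simulated variance of the estimator should be close to V/n.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 20_000

# Exponential(1) draws: population mean 1 and variance V = 1 (a non-normal population)
draws = rng.exponential(scale=1.0, size=(reps, n))
b_n = draws.mean(axis=1)             # one estimate of the mean per replication

var_of_estimator = b_n.var()         # simulated variance of the estimator
asymptotic_variance = 1.0 / n        # V / n, as in (1.34)

print(b_n.mean(), var_of_estimator, asymptotic_variance)
```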
Note:
Most estimators used in econometrics satisfy
$$\sqrt{n}\,(\hat\beta_n - \beta) \xrightarrow{d} N(0, V). \tag{1.31}$$
The results stated in (1.31) and (1.34) generalize to the case in which $\hat\beta_n$ is a $k \times 1$ vector rather than a scalar, as assumed above.
In the case in which $\hat\beta_n$ is a $k \times 1$ vector, $\beta$ is also a $k \times 1$ vector and $V/n$ is a $k \times k$ variance matrix.
Avar (bn )
Avar (bn )
$$ME_{exp} = \beta_2 + 2\beta_5,$$
the hypothesis that the marginal effect of educ is equal but opposite in sign to the marginal effect of exper implies that
$$\beta_1 = -(\beta_2 + 2\beta_5), \quad \text{or} \quad \beta_1 + \beta_2 + 2\beta_5 = 0.$$
Since
$$ME_{gender} = \beta_4, \qquad ME_{race} = \beta_3,$$
the hypothesis that the marginal effect of gender is twice that of race implies that
$$\beta_4 = 2\beta_3, \quad \text{or} \quad \beta_4 - 2\beta_3 = 0.$$
Notice that each of these economic hypotheses has been expressed as a restriction on the parameters of the model.
The two hypotheses we wish to test impose the following two linear restrictions on the parameters of the wage equation
$$\begin{aligned}
\beta_1 + \beta_2 + 2\beta_5 &= 0 \\
\beta_4 - 2\beta_3 &= 0
\end{aligned} \tag{1.35}$$
$$\underset{(2 \times 6)}{R}\;\underset{(6 \times 1)}{\beta} = \underset{(2 \times 1)}{r}, \tag{1.36}$$
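$$R = \begin{bmatrix} 0 & 1 & 1 & 0 & 0 & 2 \\ 0 & 0 & 0 & -2 & 1 & 0 \end{bmatrix}, \qquad r = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
$$R\beta = r \;\Rightarrow\; \begin{bmatrix} 0 & 1 & 1 & 0 & 0 & 2 \\ 0 & 0 & 0 & -2 & 1 & 0 \end{bmatrix} \begin{bmatrix} \alpha \\ \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \\ \beta_5 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$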
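$$\begin{bmatrix} \beta_1 + \beta_2 + 2\beta_5 \\ -2\beta_3 + \beta_4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \;\Rightarrow\; \begin{aligned} \beta_1 + \beta_2 + 2\beta_5 &= 0 \\ \beta_4 - 2\beta_3 &= 0 \end{aligned} \tag{1.35}$$
$$\underset{(q \times k)}{R}\;\underset{(k \times 1)}{\beta} = \underset{(q \times 1)}{r}. \tag{1.37}$$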
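The matrix form of the restrictions is easy to verify numerically. In the sketch below, the ordering of the parameter vector as (alpha, beta1, ..., beta5) is an assumption inferred from the zero first column of R, and the numerical values in `beta` are hypothetical, picked so that both restrictions hold.

```python
import numpy as np

# Assumed parameter ordering: (alpha, beta1, beta2, beta3, beta4, beta5)
R = np.array([[0, 1, 1,  0, 0, 2],
              [0, 0, 0, -2, 1, 0]], dtype=float)
r = np.zeros(2)

# Hypothetical parameter vector satisfying both restrictions:
# beta1 + beta2 + 2*beta5 = -0.5 + 0.3 + 0.2 = 0 and beta4 - 2*beta3 = 0.4 - 0.4 = 0
beta = np.array([1.0, -0.5, 0.3, 0.2, 0.4, 0.1])

print(R @ beta - r)                 # both entries are zero up to rounding error
q = np.linalg.matrix_rank(R)        # number of independent restrictions
print(q)
```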
$$R\beta - r = 0. \tag{1.38}$$
$$R\hat\beta - r = 0,$$
where $\hat\beta$ is our estimator of $\beta$.
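Then,
$$\hat\beta \overset{asy}{\sim} N\left(\beta, \frac{V}{n}\right) \tag{1.39}$$
$$R\hat\beta \overset{asy}{\sim} R\,N\left(\beta, \frac{V}{n}\right)$$
$$R\hat\beta \overset{asy}{\sim} N\left(R\beta, \frac{RVR'}{n}\right),$$
and, under the null hypothesis $R\beta = r$,
$$R\hat\beta \overset{asy}{\sim} N\left(r, \frac{RVR'}{n}\right). \tag{1.40}$$
In going from the second line to the third line of the derivation we used the result that
$$Var(R\hat\beta) = R\,Var(\hat\beta)\,R' = \frac{RVR'}{n}.$$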
Under the null hypothesis
$$R\beta - r = 0,$$
$$R\hat\beta - r \overset{asy}{\sim} N\left(0, \frac{RVR'}{n}\right). \tag{1.41}$$
In principle, we could use (1.41) as our test statistic. However, if we did so, the critical value for our test would depend on R, and there would be a different critical value for each possible choice of R.
We can eliminate the dependence on R of the critical value for our test
statistic by transforming our test statistic from a normal random variable
into a chi-square variable. The transformation is achieved by appealing to
the following well known theorem in mathematical statistics.
Theorem
1. Let Z be a $q \times 1$ random vector. If
$$Z \overset{asy}{\sim} N(0, \Sigma),$$
then
$$Z'\Sigma^{-1}Z \overset{asy}{\sim} \chi^2(q).$$
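$$R\hat\beta - r \overset{asy}{\sim} N\left(0, \frac{RVR'}{n}\right), \tag{1.41}$$
with $R\hat\beta - r$ playing the role of Z, we conclude that, under the null hypothesis
$$R\beta - r = 0, \tag{1.42}$$
$$(R\hat\beta - r)' \left[\frac{RVR'}{n}\right]^{-1} (R\hat\beta - r) \overset{asy}{\sim} \chi^2(q), \tag{1.43}$$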
$$W = (R\hat\beta - r)' \left[\frac{R\hat{V}R'}{n}\right]^{-1} (R\hat\beta - r) \overset{asy}{\sim} \chi^2(q), \tag{1.44}$$
where
$$\operatorname{plim}(\hat{V}) = V.$$
$$W = n\,(R\hat\beta - r)' \left[R\hat{V}R'\right]^{-1} (R\hat\beta - r). \tag{1.45}$$
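The Wald statistic can be computed in a few lines once an estimate and its estimated asymptotic variance are available. The sketch below is an illustration under stated assumptions: the function name `wald_statistic`, the simulated regression, and the use of the usual OLS variance estimate (appropriate only under spherical errors) are not taken from the notes. The argument `avar_b` plays the role of $\hat{V}/n$ in (1.44).

```python
import numpy as np

def wald_statistic(b, avar_b, R, r):
    """W = (Rb - r)' [R avar_b R']^{-1} (Rb - r), where avar_b estimates V/n."""
    diff = R @ b - r
    middle = R @ avar_b @ R.T
    return float(diff @ np.linalg.solve(middle, diff))

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, -2.0])          # hypothetical; satisfies beta1 + beta2 = 0
y = X @ beta_true + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)           # OLS estimate
resid = y - X @ b
sigma2_hat = resid @ resid / (n - X.shape[1])
avar_b = sigma2_hat * np.linalg.inv(X.T @ X)    # estimate of V/n, assuming spherical errors

R = np.array([[0.0, 1.0, 1.0]])                 # H0: beta1 + beta2 = 0
r = np.array([0.0])
W = wald_statistic(b, avar_b, R, r)
q = np.linalg.matrix_rank(R)                    # degrees of freedom of the chi-square test
print(W, q)
```

The value of W would then be compared with a critical value from the chi-square distribution with q degrees of freedom.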
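$$\hat\beta \overset{asy}{\sim} N\left(\beta, \frac{V}{n}\right) \tag{1.39}$$
$$R\beta - r = 0 \tag{1.42}$$
if
$$q = \operatorname{rank}\left(\frac{R\hat{V}R'}{n}\right)$$
$$W = (R\hat\beta - r)' \left[\frac{R\hat{V}R'}{n}\right]^{-1} (R\hat\beta - r) \overset{asy}{\sim} \chi^2(q) \tag{1.44}$$
$$R\beta - r = 0 \tag{1.42}$$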
$$\frac{W}{q} \overset{asy}{\sim} F(q, n - k), \tag{1.46}$$
where n is the sample size and k denotes the number of regressors in the model (including the constant).
$$F_{calc} = \frac{W_{calc}}{q} > F_{0.95}(q, n - k),$$
$$H_0: \beta_k = 0$$
in the linear regression equation
$$y = \beta_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$$
(i.e. testing the individual significance of $x_k$), it can be shown that the test statistic
$$W = (R\hat\beta - r)' \left[\frac{R\hat{V}R'}{n}\right]^{-1} (R\hat\beta - r) \overset{asy}{\sim} \chi^2(q) \tag{1.44}$$
$$W_z = \frac{\hat\beta_k}{se(\hat\beta_k)} \overset{asy}{\sim} N(0, 1). \tag{1.47}$$
$$R\beta - r = 0 \tag{1.42}$$
$$LR = 2(l_u - l_r) \overset{asy}{\sim} \chi^2(q), \tag{1.48}$$
(1.49)
or in matrix notation
$$y = X\beta + u. \tag{1.50}$$
(1.51)
(1.52)
and
where
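$$\underset{(n \times n)}{Var(u \mid X)} = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix} = \sigma^2 \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = \sigma^2 I_n.$$
When
$$Var(u \mid X) = \sigma^2 I_n, \tag{1.53}$$
the errors in (1.50) are said to be "spherical", and when (1.53) is violated they are said to be "non-spherical".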
Notice that when the errors are spherical, the error covariance matrix is a scalar identity matrix, that is, an identity matrix multiplied by a scalar $\sigma^2$.
Assumption (1.51) is usually unrealistic for cross-section data, and
assumption (1.52) is usually unrealistic for time series data.
$$Var(u \mid X) = \Omega \neq \sigma^2 I_n, \tag{1.54}$$
where the precise form of $\Omega$ depends on the nature of the departure from sphericity. For example, in the case of conditionally uncorrelated, heteroskedastic errors
$$\Omega = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{bmatrix}.$$
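$$y = X\beta + u. \tag{1.50}$$
Premultiplying (1.50) by $\Omega^{-1/2}$ gives
$$\Omega^{-1/2} y = \Omega^{-1/2} X\beta + \Omega^{-1/2} u,$$
or
$$y^* = X^*\beta + u^*, \tag{1.55}$$
where
$$y^* = \Omega^{-1/2} y, \qquad X^* = \Omega^{-1/2} X, \qquad u^* = \Omega^{-1/2} u.$$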
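Notice that
$$\begin{aligned}
Var(u^* \mid X) &= Var(\Omega^{-1/2} u \mid X) \\
&= \Omega^{-1/2}\, Var(u \mid X)\, \Omega^{-1/2} \\
&= \Omega^{-1/2}\, \Omega\, \Omega^{-1/2} \quad \text{(using (1.54))} \\
&= \Omega^{-1/2}\, \Omega^{1/2}\, \Omega^{1/2}\, \Omega^{-1/2} \\
&= \Omega^{0}\, \Omega^{0} \\
&= I_n.
\end{aligned} \tag{1.56}$$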
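Applying OLS to (1.55),
$$\begin{aligned}
\hat\beta &= (X^{*\prime} X^*)^{-1} X^{*\prime} y^* \\
&= \left[(\Omega^{-1/2} X)'(\Omega^{-1/2} X)\right]^{-1} (\Omega^{-1/2} X)' y^* \\
&= \left[X' \Omega^{-1/2} \Omega^{-1/2} X\right]^{-1} X' \Omega^{-1/2} \Omega^{-1/2} y
\end{aligned} \tag{1.57}$$
we obtain
$$\hat\beta_{GLS} = \left[X' \Omega^{-1} X\right]^{-1} X' \Omega^{-1} y. \tag{1.58}$$
By comparison, the OLS estimator of $\beta$ in
$$y = X\beta + u \tag{1.50}$$
is
$$\hat\beta_{OLS} = (X'X)^{-1} X'y.$$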
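$$\hat\beta_{OLS} = (X'X)^{-1} X'y,$$
$$\Omega = Var(u \mid X)$$
$\hat\beta_{GLS}$ is not a feasible estimator, since it depends on the unknown matrix $\Omega^{-1}$. A feasible GLS (FGLS) estimator is given by
$$\hat\beta_{FGLS} = \left[X' \hat\Omega^{-1} X\right]^{-1} X' \hat\Omega^{-1} y, \tag{1.59}$$
where
$$\operatorname{plim} \hat\Omega = \Omega.$$
That is, $\hat\Omega$ is a consistent estimator of $\Omega$.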
Many of the estimators that we will discuss in this unit are FGLS estimators.
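The equivalence between the GLS formula (1.58) and OLS on the transformed model (1.55) can be checked numerically. The sketch below assumes a diagonal, known $\Omega$ (heteroskedastic, uncorrelated errors), which is the infeasible GLS case. The simulated design, the variance function and the coefficient values are hypothetical. An FGLS version would replace `sigma2_i` with consistent estimates.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(1.0, 5.0, size=n)])
sigma2_i = X[:, 1] ** 2                        # error variances, assumed known here
u = rng.normal(size=n) * np.sqrt(sigma2_i)
beta_true = np.array([1.0, 0.5])               # hypothetical values
y = X @ beta_true + u

# (1.58): (X' Omega^{-1} X)^{-1} X' Omega^{-1} y with Omega = diag(sigma2_i)
omega_inv = np.diag(1.0 / sigma2_i)
b_gls = np.linalg.solve(X.T @ omega_inv @ X, X.T @ omega_inv @ y)

# Same estimator via OLS on the transformed model (1.55),
# where Omega^{-1/2} = diag(1 / sigma_i) rescales each observation
w = 1.0 / np.sqrt(sigma2_i)
X_star, y_star = X * w[:, None], y * w
b_transformed = np.linalg.lstsq(X_star, y_star, rcond=None)[0]

print(b_gls, b_transformed)
```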