
Cheat Sheet for Final Exam

Simple linear regression (SLR): $\hat\beta_1 = \frac{s_{xy}}{s_x^2} = r_{xy}\,\frac{s_y}{s_x}$, $\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$

Simple linear regression (SLR) with intercept only: $\hat\beta_0 = \bar{y}$
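The SLR slope and intercept formulas can be checked numerically; a minimal sketch using hypothetical data (the numbers are made up for illustration):

```python
import numpy as np

# Hypothetical sample data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Sample covariance, variances, and correlation (ddof=1 gives the n-1 denominator)
s_xy = np.cov(x, y, ddof=1)[0, 1]
s2_x = np.var(x, ddof=1)
s_x, s_y = np.sqrt(s2_x), np.std(y, ddof=1)
r_xy = np.corrcoef(x, y)[0, 1]

beta1_hat = s_xy / s2_x                       # slope: s_xy / s_x^2
beta0_hat = y.mean() - beta1_hat * x.mean()   # intercept: ybar - beta1 * xbar

# The two slope expressions agree: s_xy / s_x^2 == r_xy * s_y / s_x
assert np.isclose(beta1_hat, r_xy * s_y / s_x)
```

With these numbers the slope is 1.96 and the intercept 0.14.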


Multiple linear regression (MLR):
Assumptions:
MLR.1: $y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + u$ (population model)
MLR.2: Random sample $\{(y_i, x_{i1}, \ldots, x_{ik})\}_{i=1}^n$ from the population model
MLR.3: In the sample, none of the independent variables is constant and there are no exact linear relationships among the independent variables.
MLR.4: $E(u \mid x_1, \ldots, x_k) = 0$ for all $x$
MLR.5: $Var(u \mid x_1, \ldots, x_k) = \sigma^2$ for all $x$
MLR.6: $u \mid x \sim N(0, \sigma^2)$ ($u$ is normal with constant variance)
For $j = 1, \ldots, k$:
$$Var(\hat\beta_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)}, \qquad se(\hat\beta_j) = \frac{\hat\sigma}{\sqrt{SST_j (1 - R_j^2)}}$$
where $SST_j = \sum_i (x_{ij} - \bar{x}_j)^2 = (n-1) s_{x_j}^2$, $R_j^2$ is the R-squared from the regression of $x_j$ on the other x variables, and $\hat\sigma^2 = \frac{SSR}{n-k-1}$.
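The $SST_j(1 - R_j^2)$ formula for $se(\hat\beta_j)$ can be verified against the usual $\hat\sigma^2 (X'X)^{-1}$ variance matrix; a sketch with simulated data (all values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)          # correlated regressors, so R_1^2 > 0
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k - 1)    # sigma_hat^2 = SSR / (n - k - 1)

# se(beta_1_hat) from the formula: sigma_hat / sqrt(SST_1 (1 - R_1^2))
SST1 = np.sum((x1 - x1.mean()) ** 2)
Z = np.column_stack([np.ones(n), x2])       # regress x1 on the other x variables
gamma = np.linalg.solve(Z.T @ Z, Z.T @ x1)
R1_sq = 1 - np.sum((x1 - Z @ gamma) ** 2) / SST1
se_formula = np.sqrt(sigma2_hat / (SST1 * (1 - R1_sq)))

# The same se from the variance matrix sigma2_hat * (X'X)^{-1}
se_matrix = np.sqrt(sigma2_hat * np.linalg.inv(X.T @ X)[1, 1])
assert np.isclose(se_formula, se_matrix)
```

The two routes agree exactly; this is the partitioned-regression (Frisch-Waugh) identity behind the $1/(1 - R_j^2)$ variance-inflation term.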
Omitted-variables bias: If $\tilde\beta_1$ is the slope estimate from OLS of $y$ on $x_1$ but the true regression model is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$, the expected value of $\tilde\beta_1$ is
$$E(\tilde\beta_1) = \beta_1 + \beta_2 \tilde\delta_1,$$
where $\tilde\delta_1$ is the slope from the simple linear regression of $x_2$ on $x_1$.
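The omitted-variables formula also holds exactly as an in-sample identity between the short and long OLS fits; a sketch with simulated data (coefficients chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)          # x2 correlated with x1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

def slope(u, v):
    """Simple-regression slope of v on u."""
    return np.cov(u, v, ddof=1)[0, 1] / np.var(u, ddof=1)

beta1_tilde = slope(x1, y)                  # short regression: y on x1 only

X = np.column_stack([np.ones(n), x1, x2])   # long regression: y on x1 and x2
b0, b1, b2 = np.linalg.solve(X.T @ X, X.T @ y)

delta1_tilde = slope(x1, x2)                # auxiliary slope: x2 on x1

# In-sample identity: short slope = long slope + b2 * delta1_tilde
assert np.isclose(beta1_tilde, b1 + b2 * delta1_tilde)
```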
R-squared:
$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST} = r_{\hat{y},y}^2 \quad (\text{or } r_{x,y}^2 \text{ in SLR})$$
where $SST = \sum_i (y_i - \bar{y})^2$, $SSE = \sum_i (\hat{y}_i - \bar{y})^2$, $SSR = \sum_i (y_i - \hat{y}_i)^2$
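All of these R-squared expressions coincide when the model includes an intercept; a sketch using a small hypothetical SLR fit:

```python
import numpy as np

# Hypothetical data and an SLR fit
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

SST = np.sum((y - y.mean()) ** 2)       # total sum of squares
SSE = np.sum((y_hat - y.mean()) ** 2)   # explained sum of squares
SSR = np.sum((y - y_hat) ** 2)          # residual sum of squares

R2 = SSE / SST
assert np.isclose(R2, 1 - SSR / SST)                    # second expression
assert np.isclose(R2, np.corrcoef(y_hat, y)[0, 1] ** 2) # r_{yhat,y}^2
assert np.isclose(R2, np.corrcoef(x, y)[0, 1] ** 2)     # r_{x,y}^2 in SLR
```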

Test statistics
1. t-statistic (for testing $H_0: \beta_j = a$)
$$t = \frac{\hat\beta_j - a}{se(\hat\beta_j)}$$
(works for robust or non-robust se)


2. 100(1-$\alpha$)% Confidence Interval is,
$$\hat\beta_j - se(\hat\beta_j)\, t_{\alpha/2}(n-k-1) < \beta_j < \hat\beta_j + se(\hat\beta_j)\, t_{\alpha/2}(n-k-1)$$

3. F-statistic (ur = unrestricted, r = restricted, q = # of restrictions in $H_0$)
$$F = \frac{(R_{ur}^2 - R_r^2)/q}{(1 - R_{ur}^2)/(n-k-1)}$$
or
$$F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)}$$
(only works under finite-sample (normality) assumptions)
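The R-squared and SSR forms of the F-statistic are algebraically identical (both regressions share the same SST); a sketch with simulated data testing $q = 2$ exclusion restrictions (design and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 120, 3
X_full = np.column_stack([np.ones(n)] + [rng.normal(size=n) for _ in range(k)])
y = X_full @ np.array([1.0, 0.5, 0.0, 0.0]) + rng.normal(size=n)

def fit(X, y):
    """OLS fit returning (SSR, R^2)."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ b
    ssr = resid @ resid
    return ssr, 1 - ssr / np.sum((y - y.mean()) ** 2)

q = 2                                   # H0: last two coefficients are zero
ssr_ur, r2_ur = fit(X_full, y)          # unrestricted model
ssr_r, r2_r = fit(X_full[:, :-q], y)    # restricted model (last q columns dropped)

F_r2 = ((r2_ur - r2_r) / q) / ((1 - r2_ur) / (n - k - 1))
F_ssr = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
assert np.isclose(F_r2, F_ssr)
```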
Prediction:
In the MLR model in MLR.1, if one were to predict $E(y \mid x^0)$ then one would use
$$\hat{y} = \hat\beta_0 + \hat\beta_1 x_1^0 + \cdots + \hat\beta_k x_k^0$$
Considered as an estimate of $E(y \mid x^0) = \beta_0 + \beta_1 x_1^0 + \cdots + \beta_k x_k^0$, the variance of this prediction is $V(\hat\beta_0 + \hat\beta_1 x_1^0 + \cdots + \hat\beta_k x_k^0) = V(\hat{y})$. Considered as an estimate of $y^0 = \beta_0 + \beta_1 x_1^0 + \cdots + \beta_k x_k^0 + u^0$, its variance is actually $V(\hat{y}) + V(u^0) = V(\hat{y}) + \sigma^2$ under MLR.5

Time Series
AR(1) model
$$y_t = \beta_0 + \beta_1 y_{t-1} + u_t, \qquad E(u_t \mid y_{t-1}, y_{t-2}, \ldots) = 0$$
An AR(1) model is weakly stationary and weakly dependent if $|\beta_1| < 1$. The forecasts from a stationary AR(1) model are of the form,
$$y_{n+k} = \beta_0 (1 + \beta_1 + \beta_1^2 + \cdots + \beta_1^{k-1}) + \beta_1^k y_n$$
and as $k$ grows this gets closer to $\frac{\beta_0}{1 - \beta_1}$
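The AR(1) forecast formula and its limit can be illustrated directly; a sketch with hypothetical parameter values ($\beta_0 = 2$, $\beta_1 = 0.5$, last observation $y_n = 10$, all made up):

```python
# Hypothetical stationary AR(1): |beta1| < 1, so forecasts mean-revert
beta0, beta1 = 2.0, 0.5
y_n = 10.0                              # last observed value

def forecast(k):
    """k-step-ahead forecast: beta0*(1 + beta1 + ... + beta1^(k-1)) + beta1^k * y_n."""
    return beta0 * sum(beta1 ** j for j in range(k)) + beta1 ** k * y_n

# One step ahead: 2 + 0.5*10 = 7; as k grows, forecasts approach
# the unconditional mean beta0 / (1 - beta1) = 4
print(forecast(1), forecast(5), forecast(50))
```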

Random Walk Model
$$y_t = \beta_0 + y_{t-1} + u_t \text{ where } E(u_t \mid y_{t-1}, y_{t-2}, \ldots) = 0$$
so that if also $V(u_t \mid y_{t-1}, y_{t-2}, \ldots) = \sigma^2$,
$$E(y_t) = t\beta_0, \qquad V(y_t) = t\sigma^2$$

Lagged Effects Model
$$y_t = \alpha_0 + \delta_1 z_{t-1} + \delta_2 z_{t-2} + \cdots + \delta_p z_{t-p} + u_t$$
Instant effect of a change of $z$ on $y$ is zero, the effect of a one-time change in $z$ felt $q \le p$ periods later is $\delta_q$, and the long-run effect of a permanent change in $z$ is $\delta_1 + \delta_2 + \cdots + \delta_p$

Fixed-effects model for panel data
$$y_{it} = \beta_1 x_{it1} + \cdots + \beta_k x_{itk} + a_i + u_{it}$$
$u_{it}$ uncorrelated with x variables, $a_i$ allowed to be correlated with x variables
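The standard way to estimate this model is the within (demeaning) transformation, which wipes out $a_i$ even when it is correlated with the regressors; a sketch with one regressor and simulated data (design and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n_i, T = 50, 4                               # 50 units observed for 4 periods
a = rng.normal(size=n_i)                     # unit effects a_i
x = a[:, None] + rng.normal(size=(n_i, T))   # x deliberately correlated with a_i
y = 2.0 * x + a[:, None] + rng.normal(size=(n_i, T))   # true slope = 2

# Within transformation: subtract each unit's time average, removing a_i,
# then run pooled OLS on the demeaned data (no constant needed).
x_dm = x - x.mean(axis=1, keepdims=True)
y_dm = y - y.mean(axis=1, keepdims=True)
beta_fe = (x_dm.ravel() @ y_dm.ravel()) / (x_dm.ravel() @ x_dm.ravel())
# beta_fe is close to the true slope 2; pooled OLS that ignores a_i
# would be biased upward in this design because a_i enters both x and y.
```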


Linear Probability Model
For a binary outcome $y_i$ we have,
$$P(y_i = 1 \mid x_{1i}, \ldots, x_{ki}) = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki}$$
$$y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki} + u_i$$
$$E(u_i \mid x_{1i}, \ldots, x_{ki}) = 0$$
$$V(u_i \mid x_{1i}, \ldots, x_{ki}) = P(y_i = 1 \mid x_{1i}, \ldots, x_{ki})\,(1 - P(y_i = 1 \mid x_{1i}, \ldots, x_{ki}))$$
and,
$$\frac{\partial P(y_i = 1 \mid x_{1i}, \ldots, x_{ki})}{\partial x_{ji}} = \beta_j$$

Proxy Variable
In the model,
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i$$
if $x_{2i}$ is not observed but we see a variable $z_i$ such that,
$$E(x_{2i} \mid x_{1i}, z_i) = \delta_0 + \delta_1 z_i$$
$$E(u_i \mid x_{1i}, z_i) = 0$$
then we can use $z_i$ in place of $x_{2i}$ and estimate the coefficient $\beta_1$; that is,
$$y_i = (\beta_0 + \beta_2 \delta_0) + \beta_1 x_{1i} + \beta_2 \delta_1 z_i + error_i$$
is a valid regression model.


Instrumental Variables
If we have a violation of ZCM in the model,
$$y_i = \beta_0 + \beta_1 x_i + u_i$$
so that $E(u_i x_i) \ne 0$ but we can find a variable $z_i$ such that,
$$E(z_i u_i) = 0$$
$$E(x_i \mid z_i) = \pi_0 + \pi_1 z_i \text{ with } \pi_1 \ne 0$$
then $z_i$ can be used as an instrument for $x_i$ and,
$$\hat\beta_1 = \frac{\sum_{i=1}^n (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^n (z_i - \bar{z})(x_i - \bar{x})} = \frac{\hat\gamma_1}{\hat\pi_1}$$
where $\hat\gamma_1$ is the slope in the regression of $y$ on $z$ and $\hat\pi_1$ is the slope in the regression of $x$ on $z$.
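The ratio-of-slopes form of the IV estimator can be checked numerically; a sketch with simulated data where $x$ is endogenous by construction (true $\beta_1 = 1.5$; all values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
z = rng.normal(size=n)                              # instrument: independent of u
u = rng.normal(size=n)
x = 1.0 + 0.8 * z + 0.5 * u + rng.normal(size=n)    # x endogenous: correlated with u
y = 2.0 + 1.5 * x + u                               # true beta1 = 1.5

def slope(a, b):
    """Simple-regression slope of b on a."""
    return np.cov(a, b, ddof=1)[0, 1] / np.var(a, ddof=1)

# IV estimator: sum (z - zbar)(y - ybar) / sum (z - zbar)(x - xbar)
beta1_iv = np.sum((z - z.mean()) * (y - y.mean())) / np.sum((z - z.mean()) * (x - x.mean()))

# Identical to the ratio of reduced-form slopes (y on z over x on z)
assert np.isclose(beta1_iv, slope(z, y) / slope(z, x))
# OLS of y on x is inconsistent here, while beta1_iv is close to 1.5
```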