
Three Variable Regression Model

• E(Yi) = B1 + B2X2i + B3X3i   (nonstochastic form, PRF)

• Yi = B1 + B2X2i + B3X3i + ui   (stochastic form, PRF)

• B2 and B3 are called partial regression or partial slope coefficients

• B2 measures the change in the mean value of Y per unit change in X2, holding the value of X3 constant

• Yi = b1 + b2X2i + b3X3i + ei   (SRF)
Assumptions
• Linear relationship
• The Xs are non-stochastic variables.
• No exact linear relationship exists between two or more independent variables (no multicollinearity); e.g. X2i = 3 + 2X3i would violate this
• The error has zero expected value, constant variance, and is normally distributed

• RSS = ∑ei2 = ∑(Yi – Ŷi)2 = ∑(Yi – b1 – b2X2i – b3X3i)2
Least squares estimators
• As in the 2-variable case, we can derive formulae for var(b1), var(b2), and var(b3), and hence their S.E.s

• We can also estimate σ2 as

  σ̂2 = ∑ei2/(n – 3)

• Goodness of fit: R2 = ESS/TSS

  R2 = [b2∑yix2i + b3∑yix3i]/∑yi2   (in deviation form)

• 0 ≤ R2 ≤ 1
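The estimation steps above can be illustrated with a short numerical sketch. The snippet below is not from the slides; it uses made-up data (the variable values and seed are arbitrary) purely to show how b1, b2, b3, σ̂2, and R2 are obtained for the three-variable model.

```python
# Minimal OLS sketch for Yi = b1 + b2*X2i + b3*X3i + ei (illustrative data only)
import numpy as np

rng = np.random.default_rng(0)
n = 32
X2 = rng.uniform(50, 150, n)                      # made-up regressor
X3 = rng.uniform(5, 25, n)                        # made-up regressor
Y = -1300 + 12.7 * X2 + 85.8 * X3 + rng.normal(0, 50, n)

X = np.column_stack([np.ones(n), X2, X3])         # design matrix with intercept
b = np.linalg.lstsq(X, Y, rcond=None)[0]          # b1, b2, b3
e = Y - X @ b                                     # residuals
sigma2_hat = e @ e / (n - 3)                      # sigma-hat^2 = RSS/(n - 3)
se_b = np.sqrt(np.diag(sigma2_hat * np.linalg.inv(X.T @ X)))  # standard errors
R2 = 1 - (e @ e) / np.sum((Y - Y.mean()) ** 2)    # R^2 = ESS/TSS = 1 - RSS/TSS

print(b, se_b, sigma2_hat, R2)
```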
Testing of hypothesis, t-test

• Say, Ŷi = -1336.09 + 12.7413X2i + 85.7640X3i
  se =  (175.2725)   (0.9123)   (8.8019)
  p =    0.000        0.000      0.000
  R2 = 0.89, n = 32

• H0: B1 = 0,  b1/se(b1) ~ t(n – 3)
• H0: B2 = 0,  b2/se(b2) ~ t(n – 3)
• H0: B3 = β,  (b3 – β)/se(b3) ~ t(n – 3)
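A quick check of the reported p-values (a sketch using scipy; the numbers come from the example output above, with n = 32 and k = 3):

```python
# t-ratios and two-sided p-values for the example regression (df = n - 3 = 29)
from scipy import stats

coefs = [-1336.09, 12.7413, 85.7640]
ses   = [175.2725, 0.9123, 8.8019]
df = 32 - 3

for b, se in zip(coefs, ses):
    t = b / se
    p = 2 * stats.t.sf(abs(t), df)      # survival function = 1 - cdf
    print(f"t = {t:.2f}, p = {p:.4f}")  # all |t| >> 2, so p = 0.000 as reported
```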
Testing Joint Hypothesis, F Test
H0: B2 = B3 = 0
Or, H0: R2 = 0
• X2 & X3 explain zero percent of the variation of Y
H1: At least one B ≠ 0
• A test of either hypothesis is called a test of overall significance of the estimated multiple regression
• We know, TSS = ESS + RSS
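The F statistic itself is not written on the slide; the standard formula, computed from either the sums of squares or R2, is
F = [ESS/(k – 1)] / [RSS/(n – k)] = [R2/(k – 1)] / [(1 – R2)/(n – k)] ~ F(k – 1, n – k).
For the example above (R2 = 0.89, n = 32, k = 3): F = (0.89/2)/(0.11/29) ≈ 117, far above the 5% critical value F(2, 29) ≈ 3.33, so H0 is rejected.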
F test
• If the computed F value exceeds the critical F value, we reject the null hypothesis that the impact of the explanatory variables is simultaneously equal to zero

• Otherwise, we cannot reject the null hypothesis

• It may happen that not all the explanatory variables individually have much impact on the dependent variable (i.e., some of the t values may be statistically insignificant), yet all of them collectively influence the dependent variable (H0 is rejected in the F test)

• This happens only when we have the problem of multicollinearity
Specification error
• In this example we have seen that the coefficients on both explanatory variables are individually and collectively significantly different from zero

• If we omit either of these explanatory variables from our model, there would be a specification error

• What would b1, b2 and R2 be in the 2-variable model?
Specification error
• Ŷi = -1336.09 + 12.7413X2i+85.7640X3i
(175.2725) (0.9123) (8.8019)
p=0.000 0.000 0.000
R2 = 0.89, n =32

• Ŷi = -191.66 + 10.48X2
(264.43) (1.79)
R2 = 0.53

• Ŷi = 807.95 + 54.57X3i
(231.95) (23.57)
R2 = 0.15
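The drop in the X2 slope (from 12.74 to 10.48) and in R2 (from 0.89 to 0.53) illustrates omitted-variable bias. A standard result, not shown on the slide: if X3 is omitted, the expected 2-variable slope is E(b2) = B2 + B3·d32, where d32 is the slope from regressing the omitted X3 on X2, so the bias depends on both B3 and how the regressors move together.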
R2 versus Adjusted R2
• The larger the number of explanatory variables in the model, the higher the R2 will be

• However, R2 does not take into account the degrees of freedom (dof)

• Therefore, comparing the R2 values of two models with the same dependent variable but different numbers of explanatory variables is essentially like comparing apples and bananas

• We need a measure of fit that is adjusted for the number of explanatory variables in the model
R2 versus Adjusted R2
• Such a measure is called the adjusted R2 (Adj R2):

  Adj R2 = 1 – (1 – R2)(n – 1)/(n – k)

• If k > 1, Adj R2 ≤ R2; as the number of explanatory variables in the model increases, Adj R2 becomes increasingly smaller than R2
• It enables us to compare two models that have the same dependent variable but different numbers of independent variables

• In our example, it can be shown that Adj R2 = 0.88 < 0.89 = R2
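As a check, using the slide's own numbers (n = 32, k = 3):
Adj R2 = 1 – (1 – 0.89)(31/29) = 1 – 0.11 × 1.069 ≈ 0.882 ≈ 0.88.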
When to add an additional variable?

• We are often faced with the problem of deciding among several competing explanatory variables

• A common practice is to add variables as long as Adj R2 increases, even though its numerical value may be smaller than R2
Computer output & Reporting
The Chicken Consumption
Example

• Explain US Consumption of Chicken

• Time Series Observations - 1950-1984


Variable Definitions
• CHCONS - Chicken consumption in the US

• LDY - Log of disposable income in the US

• PC/PB - Price of chicken relative to the price of 'Best Red Meat'
Data Time plots
Actual plots of the data over time follow
• Note the trends and cycles
• What are the relationships between
the variables?
• Are movements in CHCONS related to
movements in LDY and PC/PB?
[Figure: Time plot, CHCONS actual data, 1950-1984 (x-axis: YEAR; y-axis: CHCONS, 0.0-60.0)]
[Figure: Time plot, LDY actual data (x-axis: Year; y-axis: LDY, 0.0-10.0)]
[Figure: Time plot, PC/PB actual data, 1950-1983 (x-axis: Year; y-axis: PC/PB, 0.0-1.6)]
Chicken Consumption vs.
Income
• There may be a relationship between
CHCONS and LDY

• A simple plot of the two variables seems to reveal this

• Note the positive relationship


[Figure: Scatter plot, CHCONS vs. LDY (x-axis: LDY, 7.0-9.5; y-axis: CHCONS, 0.0-60.0)]
Chicken Consumption vs.
Relative Price of Chicken
• There may also be a relationship
between CHCONS and PC/PB

• A plot of these two variables shows the relationship

• Note the negative relationship


[Figure: Scatter plot, CHCONS vs. PC/PB (x-axis: PC/PB, 0.0-1.6; y-axis: CHCONS, 0.0-60.0)]
CHCONS = f(LDY)
• Simple linear regression captures the
relationship between CHCONS and
LDY, assuming no other relationships

• This regression explains much of the variation in CHCONS, but not everything

• The plotted regression line shows the hypothesized relationship and the actual data
CHCONS = f(LDY)
             LDY      Const.
Coeff       15.86    -92.17
SE(b)        0.53      4.34

R2 = 0.9641    SE(y) = 2.03
F = 879.05     df = 33
SSReg = 3639.12 (also called ESS)    SSResid = 136.61 (also called RSS)
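As a consistency check using the table's own numbers: F = [SSReg/(k – 1)] / [SSResid/(n – k)] = (3639.12/1)/(136.61/33) ≈ 879, matching the reported F = 879.05; the same relation holds for the regressions that follow.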
[Figure: Regression line, CHCONS = f(LDY), fitted line with actual data (x-axis: LDY, 7.0-9.5; y-axis: CHCONS, 0.00-60.00)]
CHCONS = f(PC/PB)
• Another simple regression examines the
relationship between CHCONS and
PC/PB

• While the line explains some of the variation of CHCONS, there is more unexplained error
CHCONS = f(PC/PB)
            PC/PB    Const.
Coeff      -28.83    50.77
SE(b)        2.93     1.75

R2 = 0.746     SE(y) = 5.39
F = 97.14      df = 33
SSReg = 2818.32 (also called ESS)    SSResid = 957.42 (also called RSS)
[Figure: Regression line, CHCONS = f(PC/PB), fitted line with actual data (x-axis: PC/PB, 0.0-1.5; y-axis: CHCONS, 0.00-60.00)]
CHCONS = f(LDY,PC/PB)
             LDY     PC/PB    Const.
Coeff       12.79    -8.08    -63.19
SE(b)        0.54     1.12      4.84

R2 = 0.986     SE(y) = 1.27
F = 1149.89    df = 32
SSReg = 3723.92 (ESS)    SSResid = 51.82 (RSS)
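A sketch of how output like this is typically produced (hypothetical file and column names; the slide's data are not reproduced here):

```python
# Fit CHCONS = f(LDY, PC/PB) by OLS with statsmodels (illustrative only)
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("chicken.csv")            # assumed columns: CHCONS, LDY, PC_PB
model = smf.ols("CHCONS ~ LDY + PC_PB", data=df).fit()
print(model.summary())                     # coefficients, SEs, R2, F, df as above
```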
[Figure: Actual CHCONS vs. predicted values from CHCONS = f(LDY, PC/PB), plotted over YEAR (y-axis: CHCONS, 0.0-60.0)]
• Table 7.8 Gujarati: US Defense budget
outlays 1962 – 1981

Yt= Defense budget outlays for year t ($ Bn)


X2t=GNP for year t ($ Bn)
X3t=US military sales/assistance ($ Bn)
X4t=Aerospace industry sales ($ Bn)
X5t = Dummy for military conflicts involving troops:
    = 0, if troops < 100,000
    = 1, if troops > 100,000
• Table 8.10, Gujarati
Table gives data used by a telephone cable
manufacturer to predict sales to a major
consumer for the period 1968 – 1983

Y=annual sales in MPF (million paired feet)


X2=GNP (billion $)
X3=housing starts (1000 of units)
X4=Unemployment rate (%)
X5=Prime rate lagged 6 months
X6= Customer line gains (%)
• Introduce later
• Table 7.10, Gujarati
Consider the following demand function for money in the US for 1980 – 1998:

  Mt = b1 · Yt^b2 · rt^b3 · e^ut

Where, M = real money demand
       Y = real GDP
       r = interest rate
LTRATE: long-term interest rate (30-year Treasury bond)
TBRATE: 3-month Treasury bill rate
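To estimate this multiplicative form by OLS, it is log-linearized (a standard step, not shown on the slide):
ln Mt = ln b1 + b2 ln Yt + b3 ln rt + ut,
so b2 and b3 can be read directly as the income and interest-rate elasticities of money demand.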
