Solution PR Cat Ice Multiple Regression

Q1.
We estimate a model relating median housing price (price) in the community to
various community characteristics: nox is the amount of nitrogen oxide in the
air, in parts per million; dist is a weighted distance of the community from
five employment centers, in miles; rooms is the average number of rooms in
houses in the community; and stratio is the average student-teacher ratio of
schools in the community. We run the following model:
. reg price nox dist rooms stratio

      Source |       SS       df       MS              Number of obs =     506
-------------+------------------------------           F(  4,   501) =  204.19
       Model |  2.6544e+10     4  6.6359e+09           Prob > F      =  0.0000
    Residual |  1.6282e+10   501  32498608.9           R-squared     =  0.6198
-------------+------------------------------           Adj R-squared =  0.6168
       Total |  4.2826e+10   505    84803032           Root MSE      =  5700.8

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         nox |  -3044.913   353.6792    -8.61   0.000     -3739.79   -2350.036
        dist |  -965.4921   191.4962    -5.04   0.000    -1341.727   -589.2575
       rooms |   6808.769   401.3554    16.96   0.000     6020.222    7597.316
     stratio |  -1269.168   127.3659    -9.96   0.000    -1519.405    -1018.93
       _cons |   23716.16   5120.564     4.63   0.000     13655.73    33776.58
------------------------------------------------------------------------------

(a) There are ten missing values in the output. Fill in the blanks.
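The summary statistics in the output header follow from the sums of squares and degrees of freedom alone. The sketch below recomputes them in Python as a cross-check; the variable names are illustrative and not part of the original solution.

```python
import math

# Sums of squares and degrees of freedom read from the Q1 output.
ss_model, df_model = 2.6544e10, 4
ss_resid, df_resid = 1.6282e10, 501

ss_total = ss_model + ss_resid      # Total SS = Model SS + Residual SS
df_total = df_model + df_resid      # n - 1
n_obs = df_total + 1

# Mean squares are sums of squares divided by their degrees of freedom.
ms_model = ss_model / df_model
ms_resid = ss_resid / df_resid
ms_total = ss_total / df_total

f_stat = ms_model / ms_resid             # overall F statistic
r_squared = ss_model / ss_total          # share of variation explained
adj_r_squared = 1 - ms_resid / ms_total  # penalises extra regressors
root_mse = math.sqrt(ms_resid)           # standard error of the regression

print(n_obs, round(f_stat, 2), round(r_squared, 4),
      round(adj_r_squared, 4), round(root_mse, 1))
```

Because the sums of squares are printed to five significant digits, the recomputed values match the Stata output only up to rounding.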
(b) Which variable(s) is (are) individually significant at the 5% level? Why?
All independent variables are highly significant, as all the p-values are less
than 0.01. [Equivalently, 0 is not contained in any of the 95% CIs.]
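The t statistics and confidence intervals behind this answer can be reproduced from the coefficients and standard errors. This is a minimal Python sketch; the critical value 1.965 is an assumed approximation to the two-sided 5% t value with 501 degrees of freedom, and the helper name t_and_ci is hypothetical.

```python
# (coefficient, standard error) pairs read from the Q1 output.
estimates = {
    "nox": (-3044.913, 353.6792),
    "dist": (-965.4921, 191.4962),
    "rooms": (6808.769, 401.3554),
    "stratio": (-1269.168, 127.3659),
}

# Assumption: two-sided 5% critical value of t with 501 df, roughly 1.965.
T_CRIT = 1.965


def t_and_ci(coef, se, t_crit=T_CRIT):
    """Return the t statistic and the confidence interval for one coefficient."""
    t_stat = coef / se
    half_width = t_crit * se
    return t_stat, (coef - half_width, coef + half_width)


results = {name: t_and_ci(coef, se) for name, (coef, se) in estimates.items()}

# A coefficient is significant at 5% when |t| exceeds the critical value,
# which is the same as its 95% interval excluding zero.
significant = [name for name, (t, _) in results.items() if abs(t) > T_CRIT]

for name, (t, (low, high)) in results.items():
    print(f"{name:8s} t = {t:6.2f}  CI = [{low:.1f}, {high:.1f}]")
```

The interval endpoints differ slightly from the Stata output in the last digits because the critical value is rounded.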
(c) Test whether the independent variables are jointly significant in explaining
median housing prices.
The independent variables are also jointly significant, as F = 204.19 > 2.45
(table F value at alpha = 0.05, v1 = 4, v2 = 120; accept 2.37 as well). The
reported Prob > F = 0.0000 leads to the same conclusion.
*************************************
Q2. Regression analysis can be used to test whether the market efficiently uses
information in valuing stocks. For concreteness, let return be the total return
from holding the firm's stock over the four-year period from the end of 1990 to
the end of 1994. The efficient markets hypothesis says that these returns should
not be systematically related to information known in 1990. If firm
characteristics known at the beginning of the period help to predict stock
returns, then we could use this information in choosing stocks.
For 1990, let dkr be a firm's debt to capital ratio, let eps denote the earnings
per share, let netinc denote net income, and let salary denote total
compensation for the CEO. Using data, we estimate:
. reg return dkr eps netinc salary

      Source |       SS       df       MS              Number of obs =     142
-------------+------------------------------           F(  4,   137) =    1.41
       Model |  8649.26028     4  2162.31507           Prob > F      =  0.2347
    Residual |  210446.917   137  1536.10888           R-squared     =  0.0395
-------------+------------------------------           Adj R-squared =  0.0114
       Total |  219096.178   141   1553.8736           Root MSE      =  39.193

------------------------------------------------------------------------------
      return |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         dkr |   .3205444   .2009106     1.60   0.113    -.0767426    .7178314
         eps |   .0426986   .0781384     0.55   0.586    -.1118147    .1972119
      netinc |  -.0051086   .0046748    -1.09   0.276    -.0143526    .0041354
      salary |   .0034993   .0021935     1.60   0.113    -.0008382    .0078369
       _cons |  -14.37022   6.893616    -2.08   0.039    -28.00187   -.7385647
------------------------------------------------------------------------------

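(b) Which variable(s) is (are) individually significant at the 5% level?
None of the explanatory variables is statistically significant: every slope
p-value exceeds 0.05 (the smallest is 0.113), and each 95% CI contains 0.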
(c) Test whether the explanatory variables are jointly significant.
The explanatory variables are also jointly insignificant at the 5% level, as
F = 1.41 < 2.45 (table value at alpha = 0.05, v1 = 4 and v2 = 120; also accept
2.37).
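The overall F statistic can also be written in terms of R-squared, F = (R^2/k) / ((1 - R^2)/(n - k - 1)), which makes the link between the low R-squared and the small F explicit. The Python sketch below applies this formula to the Q2 output; the function name overall_f is illustrative.

```python
def overall_f(r_squared, n_obs, k):
    """F statistic for H0: all k slope coefficients are zero."""
    return (r_squared / k) / ((1 - r_squared) / (n_obs - k - 1))


# R-squared recomputed from the Q2 output as Model SS / Total SS.
r_squared = 8649.26028 / 219096.178
f_stat = overall_f(r_squared, n_obs=142, k=4)

# Table value used in the answer (alpha = 0.05, v1 = 4, v2 = 120).
F_CRIT = 2.45
reject_h0 = f_stat > F_CRIT

print(round(f_stat, 2), reject_h0)
```

Applied to the Q1 figures (R-squared = 0.6198, n = 506, k = 4), the same function gives a value close to the reported 204.19.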
(d) Is the evidence of predictability of stock returns strong or weak? (1 mark)
It seems very weak. Both the t and F statistics are insignificant. In addition,
less than 4% of the variation in return is explained by the independent
variables (R-squared = 0.0395).
*****************************************
Q3. State what multicollinearity is. What is the effect of multicollinearity on
the efficiency or precision of your estimates? How would you check for it?
Multicollinearity is high correlation (a strong linear relationship) between the
independent variables. It reduces the precision or efficiency of our estimates,
that is, it inflates their standard errors. One way of investigating it is to
find the correlations between the independent variables.
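As a rough illustration of the check described above, the Python sketch below computes the correlation between two regressors. It also reports the variance inflation factor, a commonly used diagnostic that the answer does not mention; with exactly two regressors it equals 1/(1 - r^2). The data and function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_regressors(x1, x2):
    """Variance inflation factor when the model has exactly two regressors."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)


# Hypothetical data: x2 is almost an exact linear function of x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

print(round(pearson_r(x1, x2), 4), round(vif_two_regressors(x1, x2), 1))
```

A correlation near 1 in absolute value (and hence a very large VIF) signals that the two regressors carry almost the same information, so their separate effects are estimated imprecisely.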
*******************************************
Q4. Let rdintens be expenditures on research and development as a percentage of
sales. Using data for 32 firms in the chemical industry, we regress rdintens on
sales and a quadratic term in sales:
. reg rdintens sales salessq

      Source |       SS       df       MS              Number of obs =      32
-------------+------------------------------           F(  2,    29) =    2.53
       Model |  16.1532567     2  8.07662836           Prob > F      =  0.0973
    Residual |  92.6802136    29  3.19586944           R-squared     =  0.1484
-------------+------------------------------           Adj R-squared =  0.0897
       Total |   108.83347    31  3.51075711           Root MSE      =  1.7877

------------------------------------------------------------------------------
    rdintens |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       sales |   .0003006   .0001393     2.16   0.039     .0000157    .0005855
     salessq |  -6.95e-09   3.73e-09    -1.86   0.072    -1.46e-08    6.75e-10
       _cons |   2.612512   .4294418     6.08   0.000     1.734205    3.490819
------------------------------------------------------------------------------

(b) Write the least squares prediction equation.
rdintenshat = 2.612512 + 0.0003006*sales - 0.00000000695*salessq
(c) Test at the 5% level to see if the quadratic term, salessq, is statistically
significant.
The p-value is 0.072, which is bigger than 0.05, so the quadratic term, salessq,
is not statistically significant at 5%. [It is statistically significant at the
10% level, though.]
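(d) At what point does the marginal effect of sales on rdintens become negative?
The marginal effect is 0.0003006 + 2(-0.00000000695)*sales, which is zero at
sales = -(0.0003006)/[2(-0.00000000695)] = 21626 (approximately) and negative
for larger values of sales.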
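The turning point can be checked numerically. The sketch below, with illustrative names, evaluates the marginal effect b1 + 2*b2*sales implied by the estimated quadratic and solves for the value of sales where it is zero.

```python
# Coefficients read from the Q4 output.
b_sales = 0.0003006
b_salessq = -6.95e-09


def marginal_effect(sales):
    """Derivative of predicted rdintens with respect to sales."""
    return b_sales + 2 * b_salessq * sales


# The marginal effect is zero where sales = -b1 / (2 * b2).
turning_point = -b_sales / (2 * b_salessq)

print(round(turning_point))
print(marginal_effect(10000) > 0, marginal_effect(30000) < 0)
```

Below the turning point the marginal effect of sales is positive, and above it the effect is negative, which is the inverted-U shape implied by the negative coefficient on salessq.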
(e) Suggest another variable, besides sales, that could be useful in explaining
rdintens.
(f) Suppose the variable you suggested in part (e) is highly correlated with
sales. If you included that variable in your regression model, what problem may
arise? What would be its effect on the precision of your estimates? (2 marks)
Multicollinearity. High multicollinearity between the independent variables
reduces the precision or efficiency of our estimates.
(g) Suppose the variable you suggested in part (e) is highly correlated with
sales and you do not include it in your regression model. Would your estimates
be unbiased/accurate?
In this case, our assumption E(e|x1,x2) = E(e) will not hold, because the
omitted variable is absorbed into the error term, which will then be strongly
related to our independent variable. So our estimates will not be unbiased or
accurate.
[Here we have a tradeoff between bias and efficiency (accuracy and precision).
Including a relevant variable, in this case, reduces efficiency via
multicollinearity, whereas omitting it biases our estimates.]
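The bias from omitting a relevant, correlated variable can be illustrated with a small simulation. Everything in this Python sketch is hypothetical: the data are generated artificially with true slopes of 1 on both x1 and x2, and x2 is built to be highly correlated with x1. It is not based on the rdintens data.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


random.seed(0)
n = 5000

# x2 is x1 plus a little noise, so the two regressors are highly correlated.
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [a + random.gauss(0, 0.5) for a in x1]

# True model: y = 1*x1 + 1*x2 + e.
y = [1.0 * a + 1.0 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Regressing y on x1 alone pushes x2 into the error term.
short_slope = ols_slope(x1, y)
print(round(short_slope, 2))
```

The slope from the short regression lands near 2 rather than the true value of 1, because x1 picks up the effect of the omitted x2.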
