4.14 a) The expected relationships between CM and each of the other variables are:
1. CM and FLR: negative. When female literacy (FLR) increases, child mortality (CM) is expected to decrease, and vice versa.
2. CM and PGNP: negative. When per capita GNP increases, CM is expected to decrease, and vice versa.
3. CM and TFR: positive. When the total fertility rate (TFR) increases, CM is expected to increase as well.
b) CM = 263.8635153 - 2.390496025*FLR
Variable            Coefficient   Std. Error   t-Statistic   Prob.
C                   263.8635      12.22499     21.58395      0.0000
FLR                 -2.390496     0.213263     -11.20917     0.0000

R-squared           0.669590      Mean dependent var     141.5000
Adjusted R-squared  0.664261      S.D. dependent var     75.97807
S.E. of regression  44.02399      Akaike info criterion  10.43810
Sum squared resid   120163.0      Schwarz criterion      10.50556
Log likelihood      -332.0191     F-statistic            125.6455
Durbin-Watson stat  2.314744      Prob(F-statistic)      0.000000
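The adjusted R² in the output above can be verified from the plain R² via adj. R² = 1 - (1 - R²)(n-1)/(n-k), with n = 64 observations and k = 2 estimated parameters. A minimal check in Python, using only numbers from the output:

```python
# Check: adjusted R-squared implied by the reported R-squared of model b).
# n = 64 observations, k = 2 estimated parameters (intercept and FLR).
n, k = 64, 2
r2 = 0.669590  # R-squared reported in the output above

adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
print(round(adj_r2, 6))  # matches the reported 0.664261
```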
c) CM = 263.6415856 - 2.231585732*FLR - 0.005646594817*PGNP

d) Summary statistics for the regression of CM on FLR, PGNP, and TFR jointly (the residual sum of squares of 91875.38 on 60 df identifies this as the three-regressor model):

R-squared           0.747372      Mean dependent var     141.5000
Adjusted R-squared  0.734740      S.D. dependent var     75.97807
S.E. of regression  39.13127      Akaike info criterion  10.23218
Sum squared resid   91875.38      Schwarz criterion      10.36711
Log likelihood      -323.4298     F-statistic            59.16767
Durbin-Watson stat  2.170318      Prob(F-statistic)      0.000000
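To illustrate how the estimated equation in c) is used, the fitted child-mortality rate can be computed for a hypothetical country. The input values below (FLR = 50, PGNP = 1000) are illustrative assumptions, not observations from the sample:

```python
# Fitted CM from the model c) equation for a hypothetical country.
# FLR = 50 and PGNP = 1000 are made-up inputs, for illustration only.
def fitted_cm(flr, pgnp):
    return 263.6415856 - 2.231585732 * flr - 0.005646594817 * pgnp

print(round(fitted_cm(50, 1000), 2))
```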
Analysis of Variance

Source of Variation   df   Sum of Sq.    Mean Sq.
ESS                    3   271802.6082   90600.8942
RSS                   60   91875.38      1531.25633
TSS                   63   363677.9882
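The F-statistic in the regression output is just the ratio of the two mean squares in this ANOVA table, and R² is ESS/TSS. A quick verification in Python (all figures taken from the ANOVA table above):

```python
# Verify the F-statistic and R-squared implied by the ANOVA table for model d).
ess, rss, tss = 271802.6082, 91875.38, 363677.9882
df_ess, df_rss = 3, 60  # k-1 regressors, n-k residual degrees of freedom

f_stat = (ess / df_ess) / (rss / df_rss)                  # ratio of mean squares
r2 = ess / tss                                            # fraction of variation explained
adj_r2 = 1 - (1 - r2) * (df_ess + df_rss) / df_rss        # adjusted R-squared

print(round(f_stat, 5), round(r2, 4), round(adj_r2, 4))
```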
e) Given the regression results, model d) is the best of the models considered, because its adjusted R-squared is the highest. Specifically, the adjusted R² of model d) is 73.47%, meaning that about 73.47% of the total variation in the dependent variable CM is explained jointly by FLR, PGNP, and TFR. This value is higher than the adjusted R² of models b) and c), so including the additional regressors, as in d), explains CM better. This conclusion is confirmed by the F-statistic, which is highly significant: FLR, PGNP, and TFR should all be included in the model.
f) By choosing a model other than d), we are subject to specification error: relevant variables have been omitted from the model. The estimated coefficients are then subject to omitted-variable bias, and the model may predict poorly.
g) We would add a variable as long as the adjusted R² increases, which happens whenever the absolute t-value of the added variable's coefficient is larger than 1. For example, in c) we add the variable PGNP to the model. We can easily conclude that PGNP should be added by looking at its t-value, which is -2.818703; its absolute value is larger than 1, so adding PGNP increases the adjusted R², in this case from 0.664261 to 0.698081. That is, with PGNP included, the model explains the variation in the dependent variable CM better.
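The decision rule in g) can be written directly as code; a minimal sketch, using the t-value of PGNP quoted above:

```python
# Rule of thumb from part g): adding a regressor raises the adjusted
# R-squared exactly when the absolute t-value of its coefficient exceeds 1.
def raises_adjusted_r2(t_value):
    return abs(t_value) > 1

t_pgnp = -2.818703  # t-value of PGNP when it is added in model c)
print(raises_adjusted_r2(t_pgnp))  # True: adj. R-squared rises, 0.664261 -> 0.698081
```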
4.18 a) ASP = 60188.46737 - 464.0484243*RANK + 0.5395448963*TUITION + 11372.83469*RECRUITER - 13899.25105*ACCEPTANCE
Adjusted R-squared  0.869326      Mean dependent var     104870.3
S.E. of regression  5735.986      S.D. dependent var     15867.67
Sum squared resid   1.45E+09      Akaike info criterion  20.24336
Log likelihood      -490.9623     Schwarz criterion      20.43640
Durbin-Watson stat  2.419268      F-statistic            80.83148
                                  Prob(F-statistic)      0.000000
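To illustrate how the estimated equation in a) is used, the fitted ASP can be computed for a hypothetical school. The input values below (rank 10, tuition of $20,000, recruiter rating 4, acceptance rate 0.30) are purely illustrative assumptions, not observations from the data set:

```python
# Fitted ASP from the estimated equation in 4.18 a) for a hypothetical school.
# All input values are illustrative assumptions, not sample observations.
def fitted_asp(rank, tuition, recruiter, acceptance):
    return (60188.46737
            - 464.0484243 * rank
            + 0.5395448963 * tuition
            + 11372.83469 * recruiter
            - 13899.25105 * acceptance)

print(round(fitted_asp(10, 20000, 4, 0.30), 2))
```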
b) If we include GPA and GMAT in the model, we might misinterpret the determinants of the average starting pay of MBA graduates: the correlations of GPA and GMAT with ASP are positive, but the regression results suggest a negative relationship with ASP. Such results conflict with the correlations and with general expectations, a symptom of multicollinearity.
Correlation   ASP   RANK      GPA       GMAT      EMPGRAD   TUITION   RECRUITER   ACCEPTANCE
ASP           1     -0.8889   0.41545   0.75303   0.44888   0.77345   0.8743      -0.7368
c) No, it does not mean that. The result suggests that, holding the other variables constant, the more expensive the business school is, the higher the average starting pay is likely to be, on average. Tuition is likely acting as a proxy for the quality of the school rather than raising starting pay by itself.
d) Since the variables GPA and GMAT are (nearly) linearly related, i.e. collinear, we cannot assess the individual effect of GPA or GMAT on ASP.
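The point in d) can be seen in miniature: if one regressor were an exact linear multiple of another, the X'X matrix of the normal equations would be singular, so OLS could not attribute the effect to either variable. A small illustration with made-up numbers:

```python
# If GMAT were an exact linear function of GPA, the 2x2 X'X matrix of the
# normal equations would be singular (zero determinant), so the individual
# coefficients could not be identified. Toy numbers, for illustration only.
gpa  = [3.0, 3.2, 3.5, 3.8]
gmat = [2 * g for g in gpa]  # perfectly collinear with gpa

sxx = sum(g * g for g in gpa)
szz = sum(z * z for z in gmat)
sxz = sum(g * z for g, z in zip(gpa, gmat))

det = sxx * szz - sxz ** 2  # determinant of X'X (no intercept, for simplicity)
print(det)  # zero: the normal equations have no unique solution
```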
e) Analysis of Variance

Source of Variation   df   Sum of Sq.   Mean Sq.
ESS                    4   1.09E+10     2.72E+09
RSS                   44   1.45E+09     3.37E+07
TSS                   48   1.23E+10

F statistic = 80.83148
Therefore, the F value is highly significant (Prob(F) is essentially zero), and we reject the null hypothesis that all partial slope coefficients are jointly zero.
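The rejection above can be stated as a simple comparison of the reported F-statistic with a tabulated critical value. The 1% critical value of about 3.78 for F(4, 44) used below is an approximation read from standard F tables, not a figure from the source output:

```python
# Joint-significance check for 4.18 e): compare the reported F-statistic
# with an approximate 1% critical value of F(4, 44). The 3.78 figure is an
# approximation from standard F tables, not from the regression output.
f_stat = 80.83148       # F-statistic reported in the output above
f_crit_1pct = 3.78      # approximate critical value, F(4, 44) at the 1% level

reject_null = f_stat > f_crit_1pct
print(reject_null)  # True: the partial slopes are jointly significant
```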