
5)

Here is the summary of the data:

> summary(teengamb)
      sex             status          income           verbal          gamble
 Min.   :0.0000   Min.   :18.00   Min.   : 0.600   Min.   : 1.00   Min.   :  0.0
 1st Qu.:0.0000   1st Qu.:28.00   1st Qu.: 2.000   1st Qu.: 6.00   1st Qu.:  1.1
 Median :0.0000   Median :43.00   Median : 3.250   Median : 7.00   Median :  6.0
 Mean   :0.4043   Mean   :45.23   Mean   : 4.642   Mean   : 6.66   Mean   : 19.3
 3rd Qu.:1.0000   3rd Qu.:61.50   3rd Qu.: 6.210   3rd Qu.: 8.00   3rd Qu.: 19.4
 Max.   :1.0000   Max.   :75.00   Max.   :15.000   Max.   :10.00   Max.   :156.0

Here are graphs of the data:

B)
Here is the summary of the fit:
> summary(fit)

Call:
lm(formula = teengamb$gamble ~ teengamb$income)

Residuals:
    Min      1Q  Median      3Q     Max
-46.020 -11.874  -3.757  11.934 107.120

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)       -6.325      6.030  -1.049      0.3
teengamb$income    5.520      1.036   5.330 3.05e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 24.95 on 45 degrees of freedom
Multiple R-squared:  0.387,  Adjusted R-squared:  0.3734
F-statistic: 28.41 on 1 and 45 DF,  p-value: 3.045e-06

C)
The LS estimators are the same as the coefficients in the fit above. They are:
> bhat
                     [,1]
(Intercept)     -6.324559
teengamb$income  5.520485
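The agreement between a direct least-squares computation and lm() can be checked with a short script. This is a minimal sketch, assuming the estimates are obtained from the normal equations (X'X) b = X'y; the built-in `cars` data set and the name `fit_check` are stand-ins chosen for illustration, not the data or objects used above.

```r
# Least-squares estimates from the normal equations, compared with lm().
# `cars` is a built-in R data set used only as a stand-in for teengamb.
y <- cars$dist
X <- cbind("(Intercept)" = 1, speed = cars$speed)   # design matrix

# Solve (X'X) b = X'y without forming the inverse explicitly
bhat <- solve(crossprod(X), crossprod(X, y))

fit_check <- lm(dist ~ speed, data = cars)          # hypothetical name
print(cbind(normal_eq = bhat[, 1], lm = coef(fit_check)))
```

The two columns should agree to numerical precision, which is the sense in which the estimators are the same.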

D)
38.7% of the variation in gamble is explained by the covariate income (Multiple R-squared = 0.387).
E)
Observation 24 has the largest residual of 107.1197.
F)
The mean and the median of the residuals are
> mean(resid)
[1] -5.203801e-16
> median(resid)
[1] -3.757382

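The mean is zero up to rounding error. This is guaranteed for any least-squares fit that includes an intercept, so it is consistent with, but does not by itself verify, the zero-mean assumption on the errors. The median (-3.76) lies below the mean, which suggests the residuals are skewed to the right. This goes against our assumption of normality, because in a normal distribution the mean and the median are the same.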
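The behaviour of the residual mean can be illustrated on any fit with an intercept. The sketch below uses the built-in `cars` data as a stand-in (an assumption made for illustration; `fit_check` and `res` are hypothetical names).

```r
# Stand-in data: built-in `cars`, not teengamb.
fit_check <- lm(dist ~ speed, data = cars)
res <- residuals(fit_check)

print(mean(res))     # zero up to floating-point error
print(median(res))   # need not be zero; a gap from the mean hints at skewness

# The normal equations make the residuals orthogonal to every column of
# the design matrix, including the column of ones for the intercept.
print(crossprod(model.matrix(fit_check), res))
```

Because the mean is zero by construction, the comparison that carries information about the shape of the distribution is the one between the median and zero.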
G)
The multiple correlation coefficient is
> sqrt(0.387)
[1] 0.6220932

This is the square root of R^2
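For a least-squares fit with an intercept, the multiple correlation coefficient is also the correlation between the observed and fitted responses, and R^2 equals 1 - RSS/TSS. The sketch below checks both identities on the built-in `cars` data, which is a stand-in chosen for illustration; the variable names are hypothetical.

```r
# Stand-in data: built-in `cars`, not teengamb.
fit_check <- lm(dist ~ speed, data = cars)
y    <- cars$dist
yhat <- fitted(fit_check)

rss <- sum((y - yhat)^2)       # residual sum of squares
tss <- sum((y - mean(y))^2)    # total sum of squares about the mean
r2  <- 1 - rss / tss           # proportion of variation explained

R <- cor(y, yhat)              # multiple correlation coefficient
print(c(r2 = r2, sqrt_r2 = sqrt(r2), R = R))
```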


H)
The 99% confidence interval is
> confint(fit,level=.99)
                     0.5 %   99.5 %
(Intercept)     -22.542419 9.893300
teengamb$income   2.734687 8.306283
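These limits follow the usual form estimate ± t × standard error. The sketch below recomputes them from the estimates, the rounded standard errors, and the 45 residual degrees of freedom printed in the summary of the fit; because the standard errors are rounded, the result matches confint() only to about three decimal places.

```r
# Values copied from the printed output above (standard errors are rounded).
est <- c("(Intercept)" = -6.324559, "teengamb$income" = 5.520485)
se  <- c(6.030, 1.036)
df  <- 45
level <- 0.99

# Two-sided critical value of the t distribution
tcrit <- qt(1 - (1 - level) / 2, df)

ci <- cbind(lower = est - tcrit * se, upper = est + tcrit * se)
print(ci)
```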

I)
The plot for the elliptical confidence region is
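A joint confidence region of this kind can be drawn in base R. This is a minimal sketch, assuming a 95% level and the standard F-based ellipse for two coefficients; the built-in `cars` data and all object names are stand-ins for illustration, not the objects used in this solution.

```r
# Stand-in data: built-in `cars`, not teengamb.
fit_check <- lm(dist ~ speed, data = cars)
b <- coef(fit_check)            # centre of the ellipse
V <- vcov(fit_check)            # estimated covariance of the coefficients
level <- 0.95                   # assumed confidence level

# Boundary: (beta - b)' V^{-1} (beta - b) = 2 * F(level; 2, residual df)
radius <- sqrt(2 * qf(level, 2, df.residual(fit_check)))

theta  <- seq(0, 2 * pi, length.out = 200)
circle <- rbind(cos(theta), sin(theta))              # unit circle, 2 x 200
# t(chol(V)) maps the unit circle onto the shape of the ellipse
boundary <- b + radius * t(chol(V)) %*% circle       # 2 x 200

plot(boundary[1, ], boundary[2, ], type = "l",
     xlab = "(Intercept)", ylab = "slope")
points(b[1], b[2], pch = 19)                         # LS estimate
```

The ellipse is typically tilted because the intercept and slope estimates are correlated, which is why the joint region differs from the rectangle formed by the two individual intervals.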

J)

6)
Here are the histograms of the variables. From the graphs we can see that salary and experience are
symmetric about their mean values.

B)
Here are the plots of salary vs the other covariates. There is a clear relationship between grants and
salary. The relationships between salary and the other covariates seem to be weaker.

C)
Here is the fitted model:
> summary(salfit)

Call:
lm(formula = salary ~ pubqual + exp + grant)

Residuals:
    Min      1Q  Median      3Q     Max
-1.2045 -0.7203 -0.2228  0.6446  1.7378

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.64250    2.00988  -1.812  0.08426 .
pubqual     -0.06757    0.03482  -1.940  0.06587 .
exp         -0.25975    0.20295  -1.280  0.21455
grant        0.31023    0.08449   3.672  0.00142 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.9318 on 21 degrees of freedom
Multiple R-squared:  0.5332,  Adjusted R-squared:  0.4666
F-statistic: 7.997 on 3 and 21 DF,  p-value: 0.0009608

D)
The model fits reasonably well. The adjusted R^2 is 0.4666 (about 47%), and the overall F-test is significant (p-value 0.0009608).
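The adjusted R^2 and the F-statistic can both be recovered from the multiple R^2 and the degrees of freedom. The sketch below uses the values printed in the summary; n = 25 is inferred from 21 residual degrees of freedom plus 4 coefficients, and small differences from the printed values come from R^2 being rounded to four decimals.

```r
# Values taken from the printed summary of the salary fit.
r2 <- 0.5332   # Multiple R-squared (rounded)
p  <- 4        # number of coefficients, including the intercept
n  <- 21 + p   # residual df + number of coefficients

adj_r2  <- 1 - (1 - r2) * (n - 1) / (n - p)
f_stat  <- (r2 / (p - 1)) / ((1 - r2) / (n - p))
p_value <- pf(f_stat, p - 1, n - p, lower.tail = FALSE)

print(c(adj_r2 = adj_r2, f_stat = f_stat, p_value = p_value))
```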
E)
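As we saw in the graphs, success in obtaining grants (grant) is the only covariate that is significant at the 5% level (p-value 0.00142). pubqual is significant only at the 10% level (p-value 0.06587), and exp is not significant (p-value 0.21455).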
F)
The confidence intervals are shown below:
> confint(salfit,level=.90)
                   5 %         95 %
(Intercept) -7.1009842 -0.184022886
pubqual     -0.1274903 -0.007651218
exp         -0.6089731  0.089474792
grant        0.1648437  0.455616502

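Only the interval for exp contains zero, which matches the p-values: exp is the only term with a p-value above 0.10. The interval for grant lies well above zero, so we could have predicted that its p-value is very small (0.00142).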


G)
Here is the plot:

The hypothesis test that we have to look at here is B1=0 and B2=0. Since the point (0, 0) lies inside the confidence region, we fail to reject the null hypothesis that both coefficients are zero.
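Whether a point lies inside an elliptical confidence region can also be checked numerically instead of read off the plot. This is a minimal sketch, assuming a 95% level and the standard F-based region; `in_region` is a hypothetical helper, and the built-in `mtcars` data is a stand-in, since which coefficients B1 and B2 denote is not stated here.

```r
# Hypothetical helper: TRUE if `point` lies inside the joint confidence
# region for the coefficients named in `which`.
in_region <- function(fit, which, point = c(0, 0), level = 0.95) {
  b <- coef(fit)[which]
  V <- vcov(fit)[which, which]
  d <- b - point
  # Quadratic form scaled by the number of coefficients tested
  stat <- as.numeric(t(d) %*% solve(V, d)) / length(d)
  stat <= qf(level, length(d), df.residual(fit))
}

# Stand-in model on built-in data, not the salary data.
fit_check <- lm(mpg ~ wt + hp + qsec, data = mtcars)
print(in_region(fit_check, c("wt", "hp")))
```

A result of TRUE corresponds to failing to reject the joint null hypothesis at the chosen level, and FALSE to rejecting it.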
H)
> p1
$fit
         fit       lwr       upr
1  -7.624190 -17.47420 2.2258216
2  -9.979022 -23.74413 3.7860911
3  -5.269358 -11.50307 0.9643537
4 -14.931345 -36.87058 7.0078888

$se.fit
        1         2         3         4
 4.736462  6.619072  2.997533 10.549667

$df
[1] 21

$residual.scale
[1] 0.9318155
> lower.bound <- p1$fit[,1]-sqrt(2*qf(.95,2,length(x)-2))*p1$se.fit
> upper.bound <- p1$fit[,1]+sqrt(2*qf(.95,2,length(x)-2))*p1$se.fit
> # Simultaneous C.I.
> cbind(prediction=p1$fit[,1],lower.bound,upper.bound)
  prediction lower.bound upper.bound
1  -7.624190   -19.61468    4.366296
2  -9.979022   -26.73539    6.777344
3  -5.269358   -12.85770    2.318982
4 -14.931345   -41.63812   11.775434
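The multiplier sqrt(2*qf(.95,2,length(x)-2)) is the Working-Hotelling form for a straight-line fit with two coefficients. For a model with p coefficients and n - p residual degrees of freedom the general form is sqrt(p*qf(.95,p,n-p)); for the salary model that would mean p = 4 and 21 degrees of freedom. The sketch below shows the general form on a stand-in model fitted to the built-in `mtcars` data; `fit_check`, `new_x`, and `p1_check` are hypothetical names, and the 95% level is an assumption.

```r
# Stand-in model with four coefficients, fitted to built-in data.
fit_check <- lm(mpg ~ wt + hp + qsec, data = mtcars)
new_x <- mtcars[1:4, c("wt", "hp", "qsec")]

p1_check <- predict(fit_check, newdata = new_x, se.fit = TRUE,
                    interval = "confidence")

p    <- length(coef(fit_check))    # number of coefficients
df   <- df.residual(fit_check)     # n - p
mult <- sqrt(p * qf(0.95, p, df))  # simultaneous multiplier

lower.bound <- p1_check$fit[, "fit"] - mult * p1_check$se.fit
upper.bound <- p1_check$fit[, "fit"] + mult * p1_check$se.fit
print(cbind(prediction = p1_check$fit[, "fit"], lower.bound, upper.bound))
```

Taking p and the degrees of freedom from the fitted object keeps the multiplier tied to the model being used, and the simultaneous limits are always at least as wide as the pointwise t-based limits in `$fit`.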

I)
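The predicted salary is negative (-4.26, with an interval from -10.32 to 1.79), which is not a meaningful salary. The model does not make sense at these covariate values, so we cannot say anything about the salary.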
> p2
$fit
        fit      lwr      upr
1 -4.262172 -10.3172 1.792859

$se.fit
[1] 2.75848

$df
[1] 21

$residual.scale
[1] 0.9318155
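The limits in `$fit` can be reproduced from the other components of the output. The sketch below assumes a 95% level and a prediction interval, whose standard error combines `se.fit` with the residual scale; under those assumptions it recovers the printed lwr and upr, whereas a confidence interval for the mean response would use `se.fit` alone and be narrower.

```r
# Values copied from the printed p2 output.
fit_val <- -4.262172
se_fit  <- 2.75848
sigma   <- 0.9318155     # $residual.scale
df      <- 21
level   <- 0.95          # assumed level

tcrit <- qt(1 - (1 - level) / 2, df)

# Prediction interval: uncertainty in the mean plus a new observation's noise
se_pred  <- sqrt(se_fit^2 + sigma^2)
pred_int <- c(lwr = fit_val - tcrit * se_pred, upr = fit_val + tcrit * se_pred)

# Confidence interval for the mean response, for comparison
conf_int <- c(lwr = fit_val - tcrit * se_fit, upr = fit_val + tcrit * se_fit)

print(rbind(prediction = pred_int, confidence = conf_int))
```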
