
Engineering Statistics 5th edition October 01, 2010

CHAPTER 6

Note to Instructor: For computer exercises, the procedure Regression under Stat in Minitab can be used for the
regression analysis except for computing confidence intervals on the regressor variables.

Section 6-2

6-1. a) The regression equation is


Thermal = 0.0249 + 0.129 Density

Predictor Coef StDev T P


Constant 0.024934 0.001786 13.96 0.000
Density 0.128522 0.007738 16.61 0.000
S = 0.0005852 R-Sq = 98.6% R-Sq(adj) = 98.2%

Analysis of Variance
Source DF SS MS F P
Regression 1 0.000094464 0.000094464 275.84 0.000
Residual Error 4 0.000001370 0.000000342
Total 5 0.000095833

$\hat{y} = 0.0249 + 0.129x$

b) 0.0005747
-0.0007088
0.0001486
-0.0004799
-0.0000644
0.0005298

c) SSE = 0.000001370, $\hat{\sigma}^2$ = 0.000000342

d) se($\hat{\beta}_1$) = 0.007738, se($\hat{\beta}_0$) = 0.001786

e) SST = 0.000095833
SSR = 0.000094464, SSE = 0.000001370, and SSR + SSE = 0.000095834
Therefore, SST = SSR + SSE

f) R2 = 98.6%. This is interpreted as 98.6% of the total variability in thermal conductivity can be explained by the
fitted regression model.

g) See the Minitab output given in part a.


Based on the t-tests, we conclude that the slope and intercept are nonzero.
Based on the P-values, both the intercept and the slope have P-values of 0.000, which are less than $\alpha$ = 0.05. We
conclude that the intercept and slope are significantly different from zero.

h) See the Minitab output in part a). Based on the analysis of variance, we can reject the null hypothesis and conclude
that the regression is significant because the P-value of 0.000 is less than $\alpha$ = 0.05.

i) $\beta_0$: 0.024934 $\pm$ 2.776(0.001786); (0.02, 0.03)
$\beta_1$: 0.128522 $\pm$ 2.776(0.007738); (0.107, 0.15)
Zero is not included in either CI, so both the intercept and slope are significantly different from zero. The conclusions
from parts (g), (h) and (i) are the same.


j) Residual plots appear reasonable. The model provides an adequate fit.

[Figure: Residuals Versus Density (response is Conduct)]

[Figure: Residuals Versus the Fitted Values (response is Conduct)]

[Figure: Normal Probability Plot of the Residuals (response is Conduct)]

k) r = 0.993, P-value = 0. Therefore, there is a significant correlation between density and conductivity.
Because the slope is significantly different from zero based on the conclusions from parts (g) and (h), the correlation
coefficient is also significantly different from zero.
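
The arithmetic behind parts (c), (e), (f) and (h) of 6-1 can be reproduced from the ANOVA table alone. The following is a minimal sketch, not the Minitab procedure itself; the variable names are illustrative, and the only inputs are the sums of squares and error degrees of freedom printed in part a).

```python
import math

# Values copied from the Minitab ANOVA table in part a) of 6-1.
ss_regression = 0.000094464
ss_error = 0.000001370
df_error = 4

ss_total = ss_regression + ss_error      # ANOVA identity: SST = SSR + SSE
r_squared = ss_regression / ss_total     # share of variability explained
sigma2_hat = ss_error / df_error         # residual mean square
f_stat = ss_regression / sigma2_hat      # one regressor, so MSR = SSR
t_slope = math.sqrt(f_stat)              # in simple regression |t| for the slope is sqrt(F)

print(f"SST = {ss_total:.9f}")
print(f"R-Sq = {100 * r_squared:.1f}%")
print(f"sigma^2 estimate = {sigma2_hat:.3e}")
print(f"F = {f_stat:.1f}, |t| = {t_slope:.2f}")
```

Small differences from the printed F value are expected because the table entries are rounded.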

6-2. a)

$\hat{y} = 8.51 + 9.25x$


b)

c) SSE = 84, $\hat{\sigma}^2$ = 8

d) se($\hat{\beta}_1$) = 0.05328, se($\hat{\beta}_0$) = 2.648

e) SST = 252383
SSR = 252299, SSE = 84, and SSR + SSE = 252383
SST = SSR + SSE

f) R2 = 100%. This is interpreted as 100% of the total variability in Usage can be explained by the fitted regression
model.

g) See the Minitab output given in part a).


Based on the t-tests, we conclude that the slope and intercept are nonzero.
Based on the P-values, the test of the intercept has P-value = 0.003 and the test of the slope has P-value = 0.000, and
both are less than $\alpha$ = 0.05. We can conclude that the intercept and slope are significantly different from zero.

h) See the Minitab output in part a). Based on the analysis of variance, we can reject the null hypothesis and conclude
that the regression is significant because the P-value = 0.000 < 0.05.

i) $\beta_0$: 8.51 $\pm$ 2.228(2.648); (2.61, 14.41)
$\beta_1$: 9.25 $\pm$ 2.228(0.05328); (9.13, 9.37)
Zero is not included in either CI, so both the intercept and slope are significantly different from zero. The conclusions
from parts (g), (h) and (i) are the same.

j) Residual plots appear reasonable, so the model provides an adequate fit.


k) r = 1.000, P-value = 0. Therefore, we conclude there is a significant correlation between temperature and usage.
Because the slope is significantly different from zero based on the conclusions from parts (g) and (h), the correlation
coefficient is also significantly different from zero.

6-3. a) The regression equation is


Deflect = 0.393 + 0.00333 Temp

Predictor Coef SE Coef T P


Constant 0.39346 0.04258 9.24 0.000
Temp 0.0033285 0.0005815 5.72 0.000

S = 0.006473 R-Sq = 64.5% R-Sq(adj) = 62.6%

Analysis of Variance
Source DF SS MS F P
Regression 1 0.0013727 0.0013727 32.76 0.000
Residual Error 18 0.0007542 0.0000419
Total 19 0.0021270

$\hat{y} = 0.393 + 0.00333x$


b) -0.0054488
0.0072519
0.0065614
-0.0127685
0.0069249
-0.0004269
-0.0027627
-0.0044372
0.0002431
-0.0074196
0.0015643
0.0078738
0.0035833
-0.0077656
-0.0011131
-0.0024386
0.0105570
-0.0054342
-0.0014357
0.0068913
c) SSE = 0.0007542, $\hat{\sigma}^2$ = 0.0000419
d) se($\hat{\beta}_1$) = 0.0005815, se($\hat{\beta}_0$) = 0.04258
e) SST = 0.0021270
SSR = 0.0013727, SSE = 0.0007542, and SSR + SSE = 0.0021269
SST = SSR + SSE, up to rounding
f) $R^2$ = 64.5%. This is interpreted as 64.5% of the total variability in deflection can be explained by the fitted
regression model.
g) See the Minitab output given in part a).
Based on the t-tests, we conclude that the slope and intercept are nonzero.
Based on the P-values, both the intercept and slope have P-values = 0.000 < $\alpha$ = 0.05. We conclude that the intercept and
slope are significantly different from zero.
h) See the Minitab output in part a). Based on the analysis of variance, we can reject the null hypothesis and conclude
that the regression is significant because the P-value of 0.000 is less than $\alpha$ = 0.05.
i) $\beta_0$: 0.39346 $\pm$ 2.101(0.04258); (0.304, 0.483)
$\beta_1$: 0.0033285 $\pm$ 2.101(0.0005815); (0.00211, 0.00455)
Zero is not included in either CI, so both the intercept and slope are significantly different from zero. The conclusions
from parts (g), (h) and (i) are the same.
j) Residual plots appear reasonable, so the model provides an adequate fit.

[Figure: Residuals Versus Temp (response is Deflect)]

[Figure: Residuals Versus the Fitted Values (response is Deflect)]

[Figure: Normal Probability Plot of the Residuals (response is Deflect)]

k) r = 0.803, P-value = 0. Therefore, we conclude there is a significant correlation between temperature and deflection.
Because the slope is significantly different from zero based on the conclusions from parts (g) and (h), the correlation
coefficient is also significantly different from zero.
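
The equivalence stated in part (k) can be checked numerically: in simple linear regression the t statistic for testing that the correlation is zero matches the t statistic for the slope. The sketch below assumes n = 20 observations (the ANOVA table shows 19 total degrees of freedom) and uses only the rounded values printed in part a); the variable names are illustrative.

```python
import math

# Values from the Minitab output for 6-3.
n = 20                               # Total DF = 19, so n = 20 (assumption read off the table)
r_squared = 0.0013727 / 0.0021270    # SSR / SST from the ANOVA table
slope = 0.0033285
se_slope = 0.0005815

# The sample correlation carries the sign of the fitted slope.
r = math.copysign(math.sqrt(r_squared), slope)

# Test statistic for H0: rho = 0.
t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

# Slope t statistic from the coefficient table: estimate / standard error.
t_slope = slope / se_slope

print(f"r = {r:.3f}, t for correlation = {t_corr:.2f}, t for slope = {t_slope:.2f}")
```

The two statistics agree up to the rounding of the printed table entries.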

6-4. a)


b)

c) SSE = 44758, $\hat{\sigma}^2$ = 3730


d) se($\hat{\beta}_1$) = 8.476, se($\hat{\beta}_0$) = 211.3
e) SST = 70917
SSR = 26159, SSE = 44758, and SSR + SSE = 70917
SST = SSR + SSE
f) $R^2$ = 36.9%. This is interpreted as 36.9% of the total variability in turbidity can be explained by the fitted regression
model.
g) See the Minitab output given in part a).
Based on the t-tests, we conclude that the slope is significantly different from zero but the intercept is not.
Based on the P-values, the test for the intercept has P-value = 0.065, which is greater than $\alpha$ = 0.05, and the test for the
slope has P-value = 0.021, which is less than $\alpha$ = 0.05. Only the slope is significantly different from zero at this level.
h) See the Minitab output in part a). Based on the analysis of variance, we can reject the null hypothesis and conclude
that the regression is significant because the P-value is 0.021 < $\alpha$ = 0.05.
i) $\beta_0$: -430 $\pm$ 2.179(211.3); (-890.42, 30.42)
$\beta_1$: 22.4 $\pm$ 2.179(8.476); (3.93, 40.87)
Zero is included in the CI for the intercept but not in the CI for the slope, so only the slope is significantly different
from zero. The conclusions from parts (g), (h) and (i) are the same.

j) The normal probability plot of residuals appears reasonable. The plots of residuals against $\hat{y}_i$ and $x_i$ show a
funnel pattern, so the model does not provide an adequate fit.


k) r = 0.607, P-value = 0.021. Based on this test there is a significant correlation between temperature and turbidity.
Because the slope is significantly different from zero based on the conclusions from parts (g) and (h), the correlation
coefficient is also significantly different from zero. However, the residual plots in part (j) indicate that the model might
not be valid, in which case these tests are not reliable.


6-5. a) The regression equation is


permeability = 40.6 - 2.12 strength

Predictor Coef SE Coef T P


Constant 40.5536 0.7509 54.00 0.000
strength -2.1232 0.2313 -9.18 0.000

S = 1.038 R-Sq = 86.6% R-Sq(adj) = 85.6%

Analysis of Variance
Source DF SS MS F P
Regression 1 90.759 90.759 84.28 0.000
Residual Error 13 13.999 1.077
Total 14 104.757

$\hat{y} = 40.6 - 2.12x$

b) -0.97179
0.00066
1.56517
0.35430
0.21735
0.99417
0.79921
0.83762
0.24199
-1.22252
-0.49351
-0.38411
-0.74715
-2.15947
0.96808

c) SSE = 13.999, $\hat{\sigma}^2$ = 1.077
d) se($\hat{\beta}_1$) = 0.2313, se($\hat{\beta}_0$) = 0.7509
e) SST = 104.757
SSR = 90.759, SSE = 13.999, and SSR + SSE = 104.757
SST = SSR + SSE
f) R2 = 86.6%. This is interpreted as 86.6% of the total variability in permeability can be explained by the fitted
regression model.
g) See the Minitab output given in part a).
Based on the t-tests, we conclude that the slope and intercept are nonzero.
Based on the P-values, both the test that the intercept is zero and the test that the slope is zero have P-values = 0.000 <
$\alpha$ = 0.05. We can conclude that the intercept and slope are significantly different from zero.
h) See the Minitab output in part a). Based on the analysis of variance, we can reject the null hypothesis and conclude
that the regression is significant because the P-value of 0.000 is less than $\alpha$ = 0.05.
i) $\beta_0$: 40.5536 $\pm$ 2.16(0.7509); (38.93, 42.18)
$\beta_1$: -2.1232 $\pm$ 2.16(0.2313); (-2.62, -1.62)
Zeros are not included in the CIs, so both the intercept and the slope are significantly different from zero. The
conclusions from part (g), (h) and (i) are the same.
j) Residual plots appear reasonable, so the model provides an adequate fit.

[Figure: Residuals Versus strength (response is permeabi)]

[Figure: Residuals Versus the Fitted Values (response is permeabi)]

[Figure: Normal Probability Plot of the Residuals (response is permeabi)]

k) r = -0.931, P-value = 0. Therefore, we conclude there is a significant correlation between strength and
permeability. Because the slope is significantly different from zero based on the conclusions from parts (g)
and (h), the correlation coefficient is also significantly different from zero.

6-6. a) The plot below implies that a simple linear regression seems reasonable in this situation.

[Figure: Scatter plot of y versus x]

b) The regression equation is


y = - 10.1 + 0.174 x

Predictor Coef StDev T P


Constant -10.132 1.995 -5.08 0.000
x 0.17429 0.02383 7.31 0.000

S = 1.318 R-Sq = 74.8% R-Sq(adj) = 73.4%

Analysis of Variance

Source DF SS MS F P
Regression 1 92.934 92.934 53.50 0.000
Residual Error 18 31.266 1.737
Total 19 124.200
An estimate of $\sigma^2$ is $\hat{\sigma}^2$ = 1.737

c) $\hat{y} = -10.1 + 0.174(85) = 4.69$. The predicted mean rise in blood pressure level associated with a sound pressure level
of 85 decibels is 4.69 millimeters of mercury.

6-7. a) 0.055215
b) (0.05304, 0.05738)
c) (0.052505, 0.057925)
d) The prediction interval is wider than the confidence interval because it predicts a range for a future observation
whereas the confidence interval predicts a range for the mean response.
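
The reason given in part (d) can be made explicit by comparing the two standard errors. The following is a sketch in the notation used in the worked solutions to 6-15 and 6-16, where $\hat{\sigma}^2$ is the residual mean square, $x_0$ is the point of interest, and $S_{xx}$ is the corrected sum of squares of x:

```latex
\begin{align*}
\text{CI on the mean response:}\quad
  & \hat{\mu}_{Y|x_0} \pm t_{\alpha/2,\,n-2}
    \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}\right]} \\
\text{PI on a future observation:}\quad
  & \hat{y}_0 \pm t_{\alpha/2,\,n-2}
    \sqrt{\hat{\sigma}^2\left[1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}\right]}
\end{align*}
```

The extra 1 inside the second bracket accounts for the variability of the new observation itself, so the prediction interval is always the wider of the two at the same $x_0$.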

6-8. a) 472.499
b) (471.183, 473.816)
c) (467.975, 477.024)
d) The prediction interval is wider than the confidence interval because it predicts a range for a future observation
whereas the confidence interval predicts a range for the mean response.

6-9. The regression equation is $\hat{y} = 0.393 + 0.00333x$


a) 0.63976 = 0.393 + 0.0033(74)
b) (0.6366, 0.6430)


c) (0.6256, 0.6537)
d) The prediction interval is wider than the confidence interval because it predicts a range for a future observation
whereas the confidence interval predicts a range for the mean response.

6-10. The regression equation is $\hat{y} = -430 + 22.4x$

a) 354 = -430 + 22.4(35)
b) (163.2, 544.8)
c) (121.4, 586.6)
d) The prediction interval is wider than the confidence interval because it predicts a range for a future observation
whereas the confidence interval predicts a range for the mean response.

6-11. a) 36.095
b) (35.059, 37.131)
c) (32.802, 39.388)
d) The prediction interval is wider than the confidence interval because it predicts a range for a future observation
whereas the confidence interval predicts a range for the mean response.

6-12. a) 4.683
b) (4.055, 5.312)
c) (1.844, 7.523)
d) The prediction interval is wider than the confidence interval because it predicts a range for a future observation
whereas the confidence interval predicts a range for the mean response.

6-13. a)
The regression equation is
Deflection = 32.0 - 0.277 Stress level

Predictor Coef SE Coef T P


Constant 32.049 2.885 11.11 0.000
Stress level -0.27712 0.04361 -6.35 0.000

S = 1.05743 R-Sq = 85.2% R-Sq(adj) = 83.1%

Analysis of Variance

Source DF SS MS F P
Regression 1 45.154 45.154 40.38 0.000
Residual Error 7 7.827 1.118
Total 8 52.981

$\hat{\sigma}^2 = 1.118$

[Figure: Graph of Stress level vs Deflection]

b) $\hat{y} = 32.05 - 0.277(75) = 11.275$


c) (-0.277)(5) = -1.385

d) $\dfrac{1}{0.277} = 3.61$
e) $\hat{y} = 32.05 - 0.277(78) = 10.444$. There is no residual because there is no observation at x = 78%.
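
Parts (b) through (e) of 6-13 all follow from the fitted line. As an illustration only, the sketch below uses the rounded coefficients quoted in the solution; the function and variable names are hypothetical.

```python
# Rounded coefficients used in the 6-13 solution.
INTERCEPT = 32.05
SLOPE = -0.277


def predict(stress_level):
    """Fitted deflection at a given stress level."""
    return INTERCEPT + SLOPE * stress_level


# Part (c): change in mean deflection for a 5-unit increase in stress level.
change_for_5_units = SLOPE * 5

# Part (d): increase in stress level that lowers mean deflection by one unit.
x_increase_for_unit_drop = 1 / abs(SLOPE)

print(f"{predict(75):.3f}")               # part (b)
print(f"{change_for_5_units:.3f}")        # part (c)
print(f"{x_increase_for_unit_drop:.2f}")  # part (d)
print(f"{predict(78):.3f}")               # part (e)
```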

6-14. a) The regression equation is


BOD = 0.658 + 0.178 Time

Predictor Coef SE Coef T P


Constant 0.6578 0.1657 3.97 0.003
Time 0.17806 0.01400 12.72 0.000

S = 0.287281 R-Sq = 94.7% R-Sq(adj) = 94.1%


Analysis of Variance
Source DF SS MS F P
Regression 1 13.344 13.344 161.69 0.000
Residual Error 9 0.743 0.083
Total 10 14.087
$\hat{y} = 0.658 + 0.178x$
$\hat{\sigma}^2 = 0.083$

b) $\hat{y} = 0.658 + 0.178(13) = 2.972$

c) 0.178(4) = 0.712

d) $\hat{y} = 0.658 + 0.178(8) = 2.082$
$e = y - \hat{y} = 2.1 - 2.082 = 0.018$

e) Fitted yi :
0.83585
1.01391
1.37002
1.72613
2.08225
2.43836
2.79447
3.15058
3.50670
3.86281
4.21892

[Figure: Scatterplot of y-hat vs y]

If the model fit the data perfectly, all the points would lie along the 45-degree line $\hat{y} = y$; that is, the regression
model would estimate the values exactly. Here, the graph of observed vs. predicted values indicates that the simple linear
regression model provides a reasonable fit to the data.

6-15. a) MINITAB output

Predictor Coef SE Coef T P


Constant 0.6649 0.1594 4.17 0.001
x 0.83075 0.08552 9.71 0.000

S = 0.197 R-Sq = 88.7% R-Sq(adj) = 87.8%

Analysis of Variance

Source DF SS MS F P
Regression 1 3.6631 3.6631 94.37 0.000
Residual Error 12 0.4658 0.0388
Total 13 4.1289
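$T_x = \dfrac{\text{Coef}}{\text{SE Coef}} = \dfrac{0.83075}{0.08552} = 9.7141$
P-value$_x$ = 2P(t > |9.71|): for 12 degrees of freedom the one-tail probability is less than 0.0005, so the P-value < 2(0.0005) = 0.001
$R^2_{\text{Adjusted}} = 1 - \dfrac{SS_E/(n-p)}{SS_T/(n-1)} = 1 - \dfrac{0.4658/12}{4.1289/13} = 87.78\%$
$SS_{\text{Total}} = SS_{\text{Reg}} + SS_E = 3.6631 + 0.4658 = 4.1289$
$MS_{\text{Error}} = \dfrac{SS_{\text{Error}}}{DF_{\text{Error}}} = \dfrac{0.4658}{12} = 0.0388$
$S = \sqrt{MS_{\text{Error}}} = \sqrt{0.0388} = 0.197$
$F = \dfrac{MS_{\text{Reg}}}{MS_{\text{Error}}} = \dfrac{3.6631}{0.0388} = 94.41$
P-value$_{\text{regression}}$ = $P(F_{1,12} > 94.41) < 0.01$ (the F-test is upper-tailed, so the tail probability is not doubled)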

b) $\hat{\sigma}^2 = 0.0388$
c) Based on the P-values from the F-test in the ANOVA table and the t-test for x in the output in part (a), $\beta_1$ is
significantly different from zero. The P-values of these two tests are always the same for simple linear regression.
d)
$\hat{\beta}_1 - t_{0.025,12}\,se(\hat{\beta}_1) \le \beta_1 \le \hat{\beta}_1 + t_{0.025,12}\,se(\hat{\beta}_1)$
$0.83075 - 2.179(0.08552) \le \beta_1 \le 0.83075 + 2.179(0.08552)$
$0.83075 - 0.18635 \le \beta_1 \le 0.83075 + 0.18635$
$0.6444 \le \beta_1 \le 1.0171$
Because zero is not included in the 95% CI, the coefficient $\beta_1$ is significantly different from zero.
e) The results from parts (c) and (d) are the same whenever the confidence level equals $1 - \alpha$.
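f) $\hat{y} = 0.6649 + 0.83075x$
$\hat{y} = 0.6649 + 0.83075(2.18) = 2.476$
$e = y - \hat{y} = 2.8 - 2.476 = 0.324$
g) $\hat{\mu}_{Y|1.5} = 0.6649 + 0.83075(1.5) = 1.911$

$se(\hat{\mu}_{Y|1.5}) = \sqrt{\hat{\sigma}^2\left[\dfrac{1}{n} + \dfrac{(x_0-\bar{x})^2}{S_{xx}}\right]} = \sqrt{0.0388\left[\dfrac{1}{14} + \dfrac{(1.5-1.76)^2}{5.326191}\right]} = 0.057$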

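95% CI
$\hat{\mu}_{Y|x_0} - t_{0.025,12}\,se(\hat{\mu}_{Y|x_0}) \le \mu_{Y|x_0} \le \hat{\mu}_{Y|x_0} + t_{0.025,12}\,se(\hat{\mu}_{Y|x_0})$
$1.911 - 2.179(0.057) \le \mu_{Y|x_0} \le 1.911 + 2.179(0.057)$
$1.911 - 0.124 \le \mu_{Y|x_0} \le 1.911 + 0.124$
$1.787 \le \mu_{Y|x_0} \le 2.035$

95% PI
$\sqrt{\hat{\sigma}^2\left[1 + \dfrac{1}{n} + \dfrac{(x_0-\bar{x})^2}{S_{xx}}\right]} = \sqrt{0.0388\left[1 + \dfrac{1}{14} + \dfrac{(1.5-1.76)^2}{5.326191}\right]} = 0.205$

$\hat{y}_0 - t_{0.025,12}\sqrt{\hat{\sigma}^2\left[1 + \dfrac{1}{n} + \dfrac{(x_0-\bar{x})^2}{S_{xx}}\right]} \le Y_0 \le \hat{y}_0 + t_{0.025,12}\sqrt{\hat{\sigma}^2\left[1 + \dfrac{1}{n} + \dfrac{(x_0-\bar{x})^2}{S_{xx}}\right]}$
$1.911 - 2.179(0.205) \le Y_0 \le 1.911 + 2.179(0.205)$
$1.911 - 0.447 \le Y_0 \le 1.911 + 0.447$
$1.464 \le Y_0 \le 2.358$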

The prediction interval is wider than the confidence interval because it predicts a range for a future observation
whereas the confidence interval predicts a range for the mean response.
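
The interval calculations in part (g) of 6-15 can be scripted so that the same steps apply at any $x_0$. The sketch below is illustrative rather than the method used to produce the solution: it hard-codes the quantities quoted above (including the critical value 2.179), and the function name `intervals` is hypothetical.

```python
import math

# Quantities quoted in the 6-15 solution.
B0, B1 = 0.6649, 0.83075
SIGMA2_HAT = 0.0388
N, X_BAR, S_XX = 14, 1.76, 5.326191
T_CRIT = 2.179  # t(0.025, 12), as used in the solution


def intervals(x0):
    """Return (fit, CI on the mean response, PI on a new observation) at x0."""
    fit = B0 + B1 * x0
    leverage = 1 / N + (x0 - X_BAR) ** 2 / S_XX
    se_mean = math.sqrt(SIGMA2_HAT * leverage)        # standard error of the mean response
    se_pred = math.sqrt(SIGMA2_HAT * (1 + leverage))  # adds the new observation's variance
    ci = (fit - T_CRIT * se_mean, fit + T_CRIT * se_mean)
    pi = (fit - T_CRIT * se_pred, fit + T_CRIT * se_pred)
    return fit, ci, pi


fit, ci, pi = intervals(1.5)
print(f"fit = {fit:.3f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% PI = ({pi[0]:.3f}, {pi[1]:.3f})")
```

Because the prediction standard error contains the extra term, the PI is wider than the CI at every $x_0$, not only at 1.5.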

6-16. a) MINITAB output

Predictor Coef SE Coef T P


Constant 0.9788 0.3367 2.91 0.011
x -8.3088 0.5725 -14.51 0.000

S = 0.632 R-Sq = 93.8% R-Sq(adj) = 93.3%

Analysis of Variance

Source DF SS MS F P
Regression 1 84.106 84.106 210.63 0.000
Residual Error 14 5.590 0.3993
Total 15 89.696

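$T_x = \dfrac{\text{Coef}}{\text{SE Coef}} = \dfrac{-8.3088}{0.5725} = -14.5132$
P-value$_x$ = 2P(t > |-14.51|): for 14 degrees of freedom the one-tail probability is less than 0.0005, so the P-value < 2(0.0005) = 0.001
$R^2_{\text{Adjusted}} = 1 - \dfrac{SS_E/(n-p)}{SS_T/(n-1)} = 1 - \dfrac{5.590/14}{89.696/15} = 93.32\%$
$SS_{\text{Total}} = SS_{\text{Reg}} + SS_E = 84.106 + 5.590 = 89.696$
$MS_{\text{Error}} = \dfrac{SS_{\text{Error}}}{DF_{\text{Error}}} = \dfrac{5.590}{14} = 0.3993$
$S = \sqrt{MS_{\text{Error}}} = \sqrt{0.3993} = 0.632$
$F = \dfrac{MS_{\text{Reg}}}{MS_{\text{Error}}} = \dfrac{84.106}{0.3993} = 210.634$
P-value = $P(F_{1,14} > 210.63) < 0.01$ (the F-test is upper-tailed, so the tail probability is not doubled)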

b) $\hat{\sigma}^2 = 0.3993$
c) Based on the P-values from the F-test in the ANOVA table and the t-test for x in the output in part (a), $\beta_1$ is
significantly different from zero. The P-values of these two tests are always the same for simple linear regression.
d)
$\hat{\beta}_1 - t_{0.025,14}\,se(\hat{\beta}_1) \le \beta_1 \le \hat{\beta}_1 + t_{0.025,14}\,se(\hat{\beta}_1)$
$-8.3088 - 2.145(0.5725) \le \beta_1 \le -8.3088 + 2.145(0.5725)$
$-8.3088 - 1.228 \le \beta_1 \le -8.3088 + 1.228$
$-9.5368 \le \beta_1 \le -7.0808$
Because zero is not included in the 95% CI, the coefficient $\beta_1$ is significantly different from zero.
e) The results from parts (c) and (d) are the same whenever the confidence level equals $1 - \alpha$.
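f) $\hat{y} = 0.9788 - 8.3088x$
$\hat{y} = 0.9788 - 8.3088(0.58) = -3.84$
$e = y - \hat{y} = -3.30 - (-3.84) = 0.54$
g) $\hat{\mu}_{Y|0.6} = 0.9788 - 8.3088(0.6) = -4.01$

$se(\hat{\mu}_{Y|0.6}) = \sqrt{\hat{\sigma}^2\left[\dfrac{1}{n} + \dfrac{(x_0-\bar{x})^2}{S_{xx}}\right]} = \sqrt{0.3993\left[\dfrac{1}{16} + \dfrac{(0.6-0.52)^2}{1.218294}\right]} = 0.164$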

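95% CI
$\hat{\mu}_{Y|x_0} - t_{0.025,14}\,se(\hat{\mu}_{Y|x_0}) \le \mu_{Y|x_0} \le \hat{\mu}_{Y|x_0} + t_{0.025,14}\,se(\hat{\mu}_{Y|x_0})$
$-4.01 - 2.145(0.164) \le \mu_{Y|x_0} \le -4.01 + 2.145(0.164)$
$-4.01 - 0.352 \le \mu_{Y|x_0} \le -4.01 + 0.352$
$-4.362 \le \mu_{Y|x_0} \le -3.658$

95% PI
$\sqrt{\hat{\sigma}^2\left[1 + \dfrac{1}{n} + \dfrac{(x_0-\bar{x})^2}{S_{xx}}\right]} = \sqrt{0.3993\left[1 + \dfrac{1}{16} + \dfrac{(0.6-0.52)^2}{1.218294}\right]} = 0.653$

$\hat{y}_0 - t_{0.025,14}\sqrt{\hat{\sigma}^2\left[1 + \dfrac{1}{n} + \dfrac{(x_0-\bar{x})^2}{S_{xx}}\right]} \le Y_0 \le \hat{y}_0 + t_{0.025,14}\sqrt{\hat{\sigma}^2\left[1 + \dfrac{1}{n} + \dfrac{(x_0-\bar{x})^2}{S_{xx}}\right]}$
$-4.01 - 2.145(0.653) \le Y_0 \le -4.01 + 2.145(0.653)$
$-4.01 - 1.401 \le Y_0 \le -4.01 + 1.401$
$-5.411 \le Y_0 \le -2.609$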
The prediction interval is wider than the confidence interval because it predicts a range for a future observation
whereas the confidence interval predicts a range for the mean response.

Section 6-3

6-17. a) The regression equation is


y = 351 - 1.27 x1 - 0.154 x2

Predictor Coef SE Coef T P VIF


Constant 350.99 74.75 4.70 0.018
x1 -1.272 1.169 -1.09 0.356 2.6
x2 -0.15390 0.08953 -1.72 0.184 2.6

S = 25.50 R-Sq = 86.2% R-Sq(adj) = 77.0%

Analysis of Variance

Source DF SS MS F P
Regression 2 12161.6 6080.8 9.35 0.051
Residual Error 3 1950.4 650.1
Total 5 14112.0

Source DF Seq SS
X1 1 10240.4
x2 1 1921.2


b) -24.9866
24.3075
11.8203
-20.4595
12.8296
-3.5113
c) SSE = 1950.4, $\hat{\sigma}^2$ = 650.1

d) R-Sq = 86.2%, R-Sq(adj) = 77.0%; R-Sq(adj) is less than R-Sq because the model contains terms that are not
contributing significantly to the model. The adjusted R2 value will penalize the user for adding terms to the model that
are not significant.

e) See part a). Based on the P-value from the ANOVA table, the regression model is significant at the 0.10 level of
significance.
f) se($\hat{\beta}_0$) = 74.75, se($\hat{\beta}_1$) = 1.169, se($\hat{\beta}_2$) = 0.08953

g) See part a). Based on the P-values for each coefficient, neither regressor is significantly different from zero at the
0.05 level of significance.

h) $\beta_0$: 350.99 $\pm$ 3.182(74.75); (113.14, 588.84)
$\beta_1$: -1.272 $\pm$ 3.182(1.169); (-4.99, 2.45)
$\beta_2$: -0.1539 $\pm$ 3.182(0.08953); (-0.439, 0.131)

i) Obs SRES1 COOK1


1 -1.65529 1.69265
2 1.35770 0.63183
3 0.51526 0.02083
4 -1.05590 0.27192
5 1.09375 1.48548
6 -0.18436 0.00898

[Figure: Residuals Versus x1 (response is y)]

[Figure: Residuals Versus the Fitted Values (response is y)]

[Figure: Normal Probability Plot of the Residuals (response is y)]

j) The VIFs are 2.6. There is no indication of a problem with multicollinearity.
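
The gap between R-Sq and R-Sq(adj) discussed in part (d) can be reproduced from the ANOVA table. This is a minimal sketch using the adjusted R-squared formula from the solutions to 6-15 and 6-16; the function name is hypothetical, and p counts the intercept along with the regressors.

```python
def adjusted_r_squared(ss_error, ss_total, n, p):
    """1 - [SSE/(n - p)] / [SST/(n - 1)], where p counts the intercept."""
    return 1 - (ss_error / (n - p)) / (ss_total / (n - 1))


# 6-17: six observations (Total DF = 5), intercept plus two regressors.
SS_ERROR, SS_TOTAL = 1950.4, 14112.0
r2 = 1 - SS_ERROR / SS_TOTAL
r2_adj = adjusted_r_squared(SS_ERROR, SS_TOTAL, n=6, p=3)

print(f"R-Sq = {100 * r2:.1f}%, R-Sq(adj) = {100 * r2_adj:.1f}%")
```

With only three error degrees of freedom, the penalty for each added term is large, which is why the two values differ by about nine percentage points here.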

6-18. a) The regression equation is


MPG-y = 38.4 - 0.00165 Weight-x1 - 0.0403 Horsepower-x2

Predictor Coef StDev T P VIF


Constant 38.387 3.719 10.32 0.000
Weight-x -0.001648 0.001325 -1.24 0.245 1.8
Horsepow -0.040308 0.006299 -6.40 0.000 1.8


S = 2.135 R-Sq = 91.2% R-Sq(adj) = 89.2%

Analysis of Variance

Source DF SS MS F P
Regression 2 423.41 211.70 46.44 0.000
Residual Error 9 41.03 4.56
Total 11 464.44

Source DF Seq SS
Weight-x 1 236.70
Horsepow 1 186.71

b) 0.16464
-1.41661
2.33925
-1.31445
1.58629
-1.08273
1.12759
-3.77526
1.79269
0.63652
-2.14613
2.08818
c) SSE = 41.03, $\hat{\sigma}^2$ = 4.56

d) R-Sq = 91.2%, R-Sq(adj) = 89.2%. R-Sq(adj) is slightly less than R-Sq. The adjusted R2 value will penalize the user
for adding terms to the model that are not significant.

e) See part a). Based on the P-value from the ANOVA table, the regression model is significant at the 0.10 level of
significance.
f) se($\hat{\beta}_0$) = 3.719, se($\hat{\beta}_1$) = 0.0013, se($\hat{\beta}_2$) = 0.0063

g) See part a). Based on the P-values for each coefficient, only the coefficient of x1 (weight) is not significantly
different from zero at the 0.05 level of significance.

h) $\beta_0$: 38.387 $\pm$ 2.262(3.719); (29.9746, 46.7994)
$\beta_1$: -0.0016 $\pm$ 2.262(0.0013); (-0.0045, 0.0013)
$\beta_2$: -0.0403 $\pm$ 2.262(0.0063); (-0.0546, -0.0260)

i) Obs SRES2 COOK2


1 0.09379 0.00141
2 -0.78074 0.07819
3 1.31029 0.24630
4 -0.64814 0.01519
5 0.80217 0.03558
6 -0.56461 0.02547
7 0.57213 0.01895
8 -1.84772 0.10479
9 0.97075 0.10581
10 0.35819 0.01897
11 -1.18468 0.18208
12 1.53073 1.13238

[Figure: Residuals Versus Horsepow (response is MPG-y)]

[Figure: Residuals Versus Weight-x (response is MPG-y)]

[Figure: Residuals Versus the Order of the Data (response is MPG-y)]

[Figure: Residuals Versus the Fitted Values (response is MPG-y)]

[Figure: Normal Probability Plot of the Residuals (response is MPG-y)]

j) The VIFs are 1.8. There is no indication of a problem with multicollinearity.
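
The link between the t-tests in part (g) and the confidence intervals in part (h) can be sketched in a few lines: a coefficient is significant at the 0.05 level exactly when its 95% interval excludes zero. The code below uses the unrounded estimates and standard errors from the Minitab table in part a) and the critical value 2.262 used in the solution; the helper name `coef_interval` is hypothetical.

```python
def coef_interval(estimate, std_error, t_crit):
    """Two-sided interval: estimate plus or minus t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


T_CRIT = 2.262  # t(0.025, 9), as used in the 6-18 solution

# Estimates and standard errors from the coefficient table in part a).
intervals = {
    "Constant": coef_interval(38.387, 3.719, T_CRIT),
    "Weight": coef_interval(-0.001648, 0.001325, T_CRIT),
    "Horsepower": coef_interval(-0.040308, 0.006299, T_CRIT),
}

# A coefficient is flagged as significant when zero lies outside its interval.
significant = {name: not (lo <= 0 <= hi) for name, (lo, hi) in intervals.items()}

for name, (lo, hi) in intervals.items():
    print(f"{name}: ({lo:.4f}, {hi:.4f}) significant={significant[name]}")
```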

6-19. a) The results from Minitab follow.

Regression Analysis: EX6-19Sat versus EX6-19Age, EX6-19Sev, ...


The regression equation is


Sat = 136 - 0.701 Age - 0.670 Sev - 0.79 Surg - 0.90 Anx

b)


c) SSE = 3083.8, $\hat{\sigma}^2$ = 154.2

d) R-Sq = 71.2%, R-Sq(adj) = 65.5%. R-Sq(adj) is less than R-Sq because the regression equation contains terms that
are not contributing significantly to the model. The adjusted R2 value will penalize the user for adding terms to the
model that are not significant.

e) See part a). Based on the P-value from the ANOVA table, the regression model is significant at the 0.05 level of
significance.

f) se($\hat{\beta}_0$) = 10.63, se($\hat{\beta}_1$) = 0.2336, se($\hat{\beta}_2$) = 0.2185, se($\hat{\beta}_3$) = 5.55, and se($\hat{\beta}_4$) = 1.842

g) See part a). Based on the P-values for each coefficient, Age and Sev are significantly different from zero at
the 0.05 level of significance.

h) $\beta_0$: 136.21 $\pm$ 2.201(10.63); (112.813, 159.607)
$\beta_1$: -0.7006 $\pm$ 2.201(0.2336); (-1.2148, -0.19)
$\beta_2$: -0.6703 $\pm$ 2.201(0.2185); (-1.1512, -0.19)
$\beta_3$: -0.788 $\pm$ 2.201(5.550); (-13.004, 11.428)
$\beta_4$: -0.899 $\pm$ 2.201(1.842); (-4.953, 3.155)

i)
SRES COOK


j) The VIFs are all less than 10. There is no indication of a problem with multicollinearity.

6-20. a) The results from Minitab follow.

Regression Analysis: Density versus Cont, Loss


The regression equation is


Density = - 0.110 + 0.407 Cont + 2.11 Loss

Predictor Coef SE Coef T P VIF


Constant -0.1105 0.2501 -0.44 0.670
Cont 0.4072 0.1682 2.42 0.042 390.1
Loss 2.108 5.834 0.36 0.727 390.1

S = 0.00883422 R-Sq = 99.7% R-Sq(adj) = 99.7%

Analysis of Variance

Source DF SS MS F P
Regression 2 0.23563 0.11782 1509.64 0.000
Residual Error 8 0.00062 0.00008
Total 10 0.23626

Source DF Seq SS
Cont 1 0.23562
Loss 1 0.00001

The regression equation is $\hat{y} = -0.110 + 0.407x_1 + 2.11x_2$, where
$x_1$ = DielectricConst and $x_2$ = LossFactor

b)
-0.0089354
-0.0090847
-0.0030180
0.0025153
0.0074740
0.0087559
0.0079298
0.0098885
0.0008422
-0.0051274
-0.0112404

c) SSE = 0.00062, $\hat{\sigma}^2$ = 0.00008

d) R-Sq = 99.7%, R-Sq(adj) = 99.7%. R-Sq(adj) is equal to R-Sq.

e) See part a). Based on the P-value from the ANOVA table, the regression model is significant at the 0.05 level of
significance.

f) se($\hat{\beta}_0$) = 0.2501, se($\hat{\beta}_1$) = 0.1682, and se($\hat{\beta}_2$) = 5.834

g) See part a). Based on the P-values for each coefficient, only Cont is significantly different from zero at the 0.05
level of significance.

h) $\beta_0$: -0.1105 $\pm$ 2.306(0.2501); (-0.6872, 0.4662)
$\beta_1$: 0.4072 $\pm$ 2.306(0.1682); (0.01933, 0.7951)
$\beta_2$: 2.108 $\pm$ 2.306(5.834); (-11.345, 15.561)

i)
SRES COOK


-1.23832 0.255007
-1.44997 0.692448
-0.37215 0.008618
0.32815 0.011784
0.92827 0.058551
1.08437 0.077203
1.02796 0.109710
1.35664 0.287682
0.11001 0.001337
-0.67570 0.054084
-1.59530 0.485253

[Figure: Residuals Versus Loss (response is Density)]

[Figure: Residuals Versus Cont (response is Density)]

[Figure: Residuals Versus the Fitted Values (response is Density)]

[Figure: Residuals Versus the Order of the Data (response is Density)]

[Figure: Normal Probability Plot of the Residuals (response is Density)]

j) The VIFs are 390.1, far above 10. This indicates a problem with multicollinearity.
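
To see how strong the multicollinearity is, the VIF can be converted back into a correlation. With exactly two regressors, VIF = 1/(1 - r^2), where r is the sample correlation between them. The sketch below relies on that two-regressor identity only; the function name is hypothetical.

```python
import math


def implied_correlation(vif):
    """|r| between two regressors, from VIF = 1 / (1 - r**2)."""
    return math.sqrt(1 - 1 / vif)


print(f"6-20 (VIF = 390.1): |r| = {implied_correlation(390.1):.4f}")
print(f"6-17 (VIF = 2.6):   |r| = {implied_correlation(2.6):.4f}")
```

A VIF of 390.1 corresponds to Cont and Loss being almost perfectly correlated, which is consistent with the large standard error on the Loss coefficient.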

6-21. a) The results from Minitab follow.

Regression Analysis: EX6-21Y versus E6-21X1, EX6-21X2, ...

The regression equation is


Y = - 103 + 0.605 X1 + 8.92 X2 + 1.44 X3 + 0.014 X4

Predictor Coef SE Coef T P VIF


Constant -102.7 207.9 -0.49 0.636
X1 0.6054 0.3689 1.64 0.145 2.3
X2 8.924 5.301 1.68 0.136 2.2
X3 1.437 2.392 0.60 0.567 1.3
X4 0.0136 0.7338 0.02 0.986 1.0

S = 15.5793 R-Sq = 74.5% R-Sq(adj) = 59.9%

Analysis of Variance

Source DF SS MS F P
Regression 4 4957.2 1239.3 5.11 0.030
Residual Error 7 1699.0 242.7
Total 11 6656.3

Source DF Seq SS
X1 1 3758.9
X2 1 1109.4
X3 1 88.9
X4 1 0.1

The regression equation is $\hat{y} = -103 + 0.605x_1 + 8.92x_2 + 1.44x_3 + 0.014x_4$


b) -18.7580
1.8862
23.3109
-8.9565
9.1852
6.6436
4.8136
-0.1568
-17.8502
-12.9376
6.6216
6.1980

c) SSE = 1699.0, $\hat{\sigma}^2$ = 242.7

d) R-Sq = 74.5%, R-Sq(adj) = 59.9%; R-Sq(adj) is less than R-Sq because the regression equation contains terms that are
not contributing significantly to the model. The adjusted R2 value will penalize the user for adding terms to the model that
are not significant.

e) See part a). Based on the P-value from the ANOVA table, the regression model is significant at the 0.05 level of
significance.

f) se($\hat{\beta}_0$) = 207.9, se($\hat{\beta}_1$) = 0.3689, se($\hat{\beta}_2$) = 5.301, se($\hat{\beta}_3$) = 2.392, se($\hat{\beta}_4$) = 0.7338

g) See part a). Based on the P-values for each coefficient, none of the individual regressors is significant at the 0.05 level.

h) $\beta_0$: -102.7 $\pm$ 2.365(207.9); (-594.38, 388.98)
$\beta_1$: 0.6054 $\pm$ 2.365(0.3689); (-0.267, 1.478)
$\beta_2$: 8.924 $\pm$ 2.365(5.301); (-3.613, 21.461)
$\beta_3$: 1.437 $\pm$ 2.365(2.392); (-4.22, 7.094)
$\beta_4$: 0.0136 $\pm$ 2.365(0.7338); (-1.722, 1.75)


i)
SRES COOK
-1.62309 0.430586
0.29990 0.092382
2.08556 0.820139
-0.83969 0.159823
0.67773 0.029524
0.72611 0.200284
0.36035 0.009353
-0.01285 0.000021
-1.67866 0.646149
-0.93670 0.047782
0.47341 0.010787
0.44599 0.010213

[Figure: Residuals Versus X1 (response is Y)]

[Figure: Residuals Versus X2 (response is Y)]

[Figure: Residuals Versus X3 (response is Y)]

[Figure: Residuals Versus X4 (response is Y)]

[Figure: Residuals Versus the Fitted Values (response is Y)]

[Figure: Residuals Versus the Order of the Data (response is Y)]

[Figure: Normal Probability Plot of the Residuals (response is Y)]

j) The VIFs are all less than 10. There is no indication of a problem with multicollinearity.
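
Parts (e) and (g) of 6-21 together show a regression that is significant overall even though no single coefficient is. The sketch below recomputes the overall F statistic from the sequential sums of squares and the individual t statistics from the coefficient table in part a); the critical value 2.365 is the one used in part (h), and the variable names are illustrative.

```python
# Sequential sums of squares and error line from the 6-21 ANOVA output.
seq_ss = {"X1": 3758.9, "X2": 1109.4, "X3": 88.9, "X4": 0.1}
SS_ERROR, DF_ERROR = 1699.0, 7

ss_regression = sum(seq_ss.values())          # sequential SS add up to SSR
ms_regression = ss_regression / len(seq_ss)   # four regression degrees of freedom
ms_error = SS_ERROR / DF_ERROR
f_stat = ms_regression / ms_error             # overall test of all four regressors

# (estimate, standard error) pairs from the coefficient table.
coefs = {
    "X1": (0.6054, 0.3689),
    "X2": (8.924, 5.301),
    "X3": (1.437, 2.392),
    "X4": (0.0136, 0.7338),
}
t_stats = {name: est / se for name, (est, se) in coefs.items()}

T_CRIT = 2.365  # t(0.025, 7), as used in part (h)
any_individually_significant = any(abs(t) > T_CRIT for t in t_stats.values())

print(f"F = {f_stat:.2f}")
print({name: round(t, 2) for name, t in t_stats.items()})
print(f"any |t| > {T_CRIT}: {any_individually_significant}")
```

The F-test assesses the four regressors jointly, while each t-test assesses one regressor given that the other three are already in the model, so the two results need not agree.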

6-22. a) The regression equation is


HFE = 47.2 - 9.74 Emitter + 0.428 Base + 18.2 EtoB

Predictor Coef StDev T P VIF


Constant 47.17 49.58 0.95 0.356
Emitter -9.735 3.692 -2.64 0.018 6.6
Base 0.4283 0.2239 1.91 0.074 2.5
EtoB 18.237 1.312 13.90 0.000 9.3

S = 3.480 R-Sq = 99.4% R-Sq(adj) = 99.3%

Analysis of Variance

Source DF SS MS F P
Regression 3 30532 10177 840.55 0.000
Residual Error 16 194 12
Total 19 30725


Source DF Seq SS
Emitter 1 23959
Base 1 4233
EtoB 1 2340

b) -0.90039
1.83266
-0.31872
-6.78384
-2.18117
-1.51602
1.90876
2.29305
2.01911
-5.96711
-2.21540
3.41999
3.16536
-0.57066
-1.96916
2.64163
-0.93420
6.68822
1.14227
-1.75438
c) SSE = 194, $\hat{\sigma}^2$ = 12

d) R-Sq = 99.4%, R-Sq(adj) = 99.3%. R-Sq(adj) is almost equal to R-Sq.

e) See part a). Based on the P-value from the ANOVA table, the regression model is significant at the 0.10 level of
significance.
f) se($\hat{\beta}_0$) = 49.58, se($\hat{\beta}_E$) = 3.692, se($\hat{\beta}_B$) = 0.2239, se($\hat{\beta}_{EtoB}$) = 1.312

g) See part a). Based on the P-values for each coefficient, Emitter and EtoB are significantly different from zero at the
0.05 level of significance.

h) $\beta_0$: 47.17 ± 2.120(49.58); (-57.9396, 152.2796)
$\beta_E$: -9.735 ± 2.120(3.692); (-17.5620, -1.9080)
$\beta_B$: 0.4283 ± 2.120(0.2239); (-0.0464, 0.9030)
$\beta_{EtoB}$: 18.237 ± 2.120(1.312); (15.4556, 21.0184)
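The intervals in part h) can be checked with a short script. This is a minimal sketch that reuses the coefficients, standard errors, and the critical value t(0.025, 16) = 2.120 quoted above; the function name `coef_interval` is only illustrative.

```python
def coef_interval(estimate, std_err, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * std_err."""
    half_width = t_crit * std_err
    return estimate - half_width, estimate + half_width

# Coefficient and standard error pairs copied from the Minitab table in part a).
# T_CRIT is the tabulated t(0.025, 16) used in the solution, not computed here.
T_CRIT = 2.120
fit = {
    "Constant": (47.17, 49.58),
    "Emitter": (-9.735, 3.692),
    "Base": (0.4283, 0.2239),
    "EtoB": (18.237, 1.312),
}

intervals = {name: coef_interval(b, se, T_CRIT) for name, (b, se) in fit.items()}
for name, (lo, hi) in intervals.items():
    # An interval that contains zero corresponds to a non-significant t-test.
    print(f"{name:8s} {lo:10.4f} {hi:10.4f}  contains zero: {lo <= 0 <= hi}")
```

Only the intervals for the constant and Base contain zero, which agrees with the t-tests in part g).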

i) SRES COOK
-0.27777 0.002938
0.59321 0.023627
-0.10665 0.001012
-2.08750 0.159577
-0.67100 0.016418
-0.45280 0.004106
0.63705 0.035375
0.68266 0.008518
0.67761 0.041745
-1.77388 0.055072
-0.68134 0.016855
1.12620 0.099229
0.93545 0.012570
-0.17625 0.001204
-0.59731 0.010172
0.86378 0.054946
-0.40464 0.052052


2.16785 0.319631
0.54024 0.124644
-0.53676 0.009606

[Figure: Residuals Versus EtoB (response is HFE)]

[Figure: Residuals Versus Base (response is HFE)]

[Figure: Residuals Versus Emitter (response is HFE)]

[Figure: Residuals Versus the Order of the Data (response is HFE)]

[Figure: Residuals Versus the Fitted Values (response is HFE)]

[Figure: Normal Probability Plot of the Residuals (response is HFE)]

j) All the VIFs are less than 10. There is no indication of a problem with multicollinearity.

6-23. a) 128.1
b) (49.0, 207.3)
c) (-40.5, 296.8)
d) The prediction interval is wider than the confidence interval because it covers a single future observation, whereas
the confidence interval covers only the mean response.

6-24. a) 27.398
b) (25.226, 29.570)
c) (22.091, 32.705)
d) The prediction interval is wider than the confidence interval because it covers a single future observation, whereas
the confidence interval covers only the mean response.

6-25. a) 98.35
b) (87.99, 108.71)
c) (76.69, 120.02)
d) The prediction interval is wider than the confidence interval because it covers a single future observation, whereas
the confidence interval covers only the mean response.

6-26. a) 0.97068
b) (0.95909, 0.98226)
c) (0.94724, 0.99411)
d) The prediction interval is wider than the confidence interval because it covers a single future observation, whereas
the confidence interval covers only the mean response.

6-27. a) 287.56
b) (263.77, 311.35)
c) (243.69, 331.44)
d) The prediction interval is wider than the confidence interval because it covers a single future observation, whereas
the confidence interval covers only the mean response.

6-28. a) 91.424.
b) (85.953, 96.895)
c) (83.249, 99.599)

d) The prediction interval is wider than the confidence interval because it covers a single future observation, whereas
the confidence interval covers only the mean response.

6-29. The regression equation is


Useful range (ng) = 239 + 0.334 Brightness (%) - 2.72 Contrast (%)

Predictor Coef SE Coef T P


Constant 238.56 45.23 5.27 0.002
Brightness (%) 0.3339 0.6763 0.49 0.639
Contrast (%) -2.7167 0.6887 -3.94 0.008

S = 36.3493 R-Sq = 75.6% R-Sq(adj) = 67.4%

Analysis of Variance

Source DF SS MS F P
Regression 2 24518 12259 9.28 0.015
Residual Error 6 7928 1321
Total 8 32446

a) $\hat{y} = 238.56 + 0.3339x_1 - 2.7167x_2$
where $x_1$ = % Brightness and $x_2$ = % Contrast
b) $\hat{\sigma}^2 = 1321$
$se(\hat{\beta}_0) = 45.23$, $se(\hat{\beta}_1) = 0.6763$, and $se(\hat{\beta}_2) = 0.6887$

c) Based on the P-values from the t-test for each coefficient, the estimated coefficient of brightness ($\beta_1$) is not
significantly different from zero, while the coefficient of contrast ($\beta_2$) is significant at the 0.05 level of significance.

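d) $\hat{y}_0 = 238.56 + 0.3339(70) - 2.7167(65) = 85.3475$

$se(\hat{\mu}_{Y|70,65}) = 14.5$

$\hat{y}_0 - t_{0.025,6}\sqrt{\hat{\sigma}^2 + [se(\hat{\mu}_{Y|70,65})]^2} \le Y_0 \le \hat{y}_0 + t_{0.025,6}\sqrt{\hat{\sigma}^2 + [se(\hat{\mu}_{Y|70,65})]^2}$

$85.3475 - 2.447\sqrt{1321 + (14.5)^2} \le Y_0 \le 85.3475 + 2.447\sqrt{1321 + (14.5)^2}$

$85.3475 - 95.7540 \le Y_0 \le 85.3475 + 95.7540$
$-10.4065 \le Y_0 \le 181.1015$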

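e) $\hat{\mu}_{Y|70,65} = 238.56 + 0.3339(70) - 2.7167(65) = 85.3475$

$se(\hat{\mu}_{Y|70,65}) = 14.5$
$\hat{\mu}_{Y|70,65} - t_{0.025,6}\,se(\hat{\mu}_{Y|70,65}) \le \mu_{Y|70,65} \le \hat{\mu}_{Y|70,65} + t_{0.025,6}\,se(\hat{\mu}_{Y|70,65})$
$85.3475 - 2.447(14.5) \le \mu_{Y|70,65} \le 85.3475 + 2.447(14.5)$
$85.3475 - 35.4815 \le \mu_{Y|70,65} \le 85.3475 + 35.4815$
$49.866 \le \mu_{Y|70,65} \le 120.829$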
f) The prediction interval is wider than the confidence interval because it predicts a range for a future observation
whereas the confidence interval predicts a range for the mean response for the same values of x1 and x2.
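The arithmetic in parts d) and e) can be reproduced with a small sketch. It uses the fitted coefficients, the standard error of the fit (14.5), the mean square error (1321), and the tabulated t(0.025, 6) = 2.447 from the solution; the function names are illustrative only.

```python
from math import sqrt

def mean_response_ci(y_hat, se_fit, t_crit):
    """Confidence interval on the mean response at a given x."""
    half = t_crit * se_fit
    return y_hat - half, y_hat + half

def prediction_interval(y_hat, se_fit, mse, t_crit):
    """Prediction interval on one future observation at the same x.

    The extra mse term under the square root is what makes this
    interval wider than the confidence interval.
    """
    half = t_crit * sqrt(mse + se_fit ** 2)
    return y_hat - half, y_hat + half

# Values taken from the Minitab output and the solution for 6-29.
y_hat = 238.56 + 0.3339 * 70 - 2.7167 * 65
ci = mean_response_ci(y_hat, se_fit=14.5, t_crit=2.447)
pi = prediction_interval(y_hat, se_fit=14.5, mse=1321, t_crit=2.447)

print(f"fit = {y_hat:.4f}")
print(f"95% CI on mean response: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% PI on new observation: ({pi[0]:.3f}, {pi[1]:.3f})")
```

Note that the lower prediction limit is negative, so the interval extends below zero even though a negative useful range has no physical meaning.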

6-30. The regression equation is


y = - 171 + 7.03 x1 + 12.7 x2

Predictor Coef SE Coef T P


Constant -171.26 28.40 -6.03 0.001
x1 7.029 1.539 4.57 0.004
x2 12.696 1.539 8.25 0.000

S = 3.07827 R-Sq = 93.7% R-Sq(adj) = 91.6%

Analysis of Variance


Source DF SS MS F P
Regression 2 842.37 421.18 44.45 0.000
Residual Error 6 56.85 9.48
Total 8 899.22

a) $\hat{y} = -171.26 + 7.029x_1 + 12.696x_2$

b) $\hat{\sigma}^2 = 9.48$
$se(\hat{\beta}_0) = 28.40$, $se(\hat{\beta}_1) = 1.539$, and $se(\hat{\beta}_2) = 1.539$

c) Based on the P-values from the t-test for each coefficient, both regressors appear to be significant at the 0.05 level of
significance.

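d) $\hat{y}_0 = -171.26 + 7.029(14.5) + 12.696(12.5) = 89.3605$

$se(\hat{\mu}_{Y|14.5,12.5}) = 1.50$

$\hat{y}_0 - t_{0.025,6}\sqrt{\hat{\sigma}^2 + [se(\hat{\mu}_{Y|14.5,12.5})]^2} \le Y_0 \le \hat{y}_0 + t_{0.025,6}\sqrt{\hat{\sigma}^2 + [se(\hat{\mu}_{Y|14.5,12.5})]^2}$

$89.3605 - 2.447\sqrt{9.48 + (1.50)^2} \le Y_0 \le 89.3605 + 2.447\sqrt{9.48 + (1.50)^2}$

$89.3605 - 8.381 \le Y_0 \le 89.3605 + 8.381$
$80.980 \le Y_0 \le 97.742$
e) $\hat{\mu}_{Y|14.5,12.5} = -171.26 + 7.029(14.5) + 12.696(12.5) = 89.3605$
$se(\hat{\mu}_{Y|14.5,12.5}) = 1.50$
$\hat{\mu}_{Y|14.5,12.5} - t_{0.025,6}\,se(\hat{\mu}_{Y|14.5,12.5}) \le \mu_{Y|14.5,12.5} \le \hat{\mu}_{Y|14.5,12.5} + t_{0.025,6}\,se(\hat{\mu}_{Y|14.5,12.5})$
$89.3605 - 2.447(1.50) \le \mu_{Y|14.5,12.5} \le 89.3605 + 2.447(1.50)$
$89.3605 - 3.6705 \le \mu_{Y|14.5,12.5} \le 89.3605 + 3.6705$
$85.690 \le \mu_{Y|14.5,12.5} \le 93.031$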

f) The prediction interval is wider than the confidence interval because it predicts a range for a future observation
whereas the confidence interval predicts a range for the mean response.

6-31. a) MINITAB output

Predictor Coef SE Coef T P


Constant 3.318 1.007 3.29 0.003
x1 0.7417 0.5768 1.29 0.210
x2 9.1142 0.6571 13.87 0.000

S = 0.832643 R-Sq = 88.5% R-Sq(adj) = 87.6%

Analysis of Variance

Source DF SS MS F P
Regression 2 133.366 66.683 96.18 0.000
Residual Error 25 17.332 0.693
Total 27 150.698

$T_{x_1} = \dfrac{\text{Coef}_{x_1}}{\text{SE Coef}_{x_1}} = \dfrac{0.7417}{0.5768} = 1.2859$
$T_{x_2} = \dfrac{\text{Coef}_{x_2}}{\text{SE Coef}_{x_2}} = \dfrac{9.1142}{0.6571} = 13.8682$
P-value$_{x_1}$ = 2P(t > |1.29|); with 25 degrees of freedom we obtain
2(0.10) < P-value < 2(0.25), that is, 0.20 < P-value < 0.50
P-value$_{x_2}$ = 2P(t > |13.87|); with 25 degrees of freedom we obtain P-value < 2(0.0005), that is, P-value < 0.001
$R^2 = 1 - \dfrac{SS_E}{SS_T} = 1 - \dfrac{17.332}{150.698} = 88.50\%$
$DF_{Error} = DF_{Total} - DF_{Reg} = 27 - 2 = 25$
$MS_{Error} = \dfrac{SS_{Error}}{DF_{Error}} = \dfrac{17.332}{25} = 0.6933$
$S = \sqrt{MS_{Error}} = \sqrt{0.6933} = 0.832646$
$F = \dfrac{MS_{Reg}}{MS_{Error}} = \dfrac{66.683}{0.6933} = 96.182$
P-value$_{regression}$ = $P(F_{2,25} > 96.18) < 0.01$ (the F-test is one-sided, so the tail probability is not doubled)
b) $\hat{\sigma}^2 = 0.693$
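The missing entries of the Minitab table follow from the sums of squares and degrees of freedom alone. The sketch below recomputes them; the helper name `anova_summary` and its dictionary keys are illustrative, and the inputs are the values printed in the ANOVA table of part a).

```python
from math import sqrt

def anova_summary(ss_reg, ss_err, df_reg, df_err):
    """Derive R-Sq, MS Error, S and F from an ANOVA decomposition."""
    ss_total = ss_reg + ss_err
    ms_reg = ss_reg / df_reg
    ms_err = ss_err / df_err
    return {
        "r_sq": 1 - ss_err / ss_total,   # coefficient of determination
        "ms_err": ms_err,                # estimate of sigma^2
        "s": sqrt(ms_err),               # residual standard deviation
        "f0": ms_reg / ms_err,           # overall F statistic
    }

# Sums of squares and degrees of freedom from the table in part a).
summary = anova_summary(ss_reg=133.366, ss_err=17.332, df_reg=2, df_err=25)
for key, value in summary.items():
    print(f"{key:7s} {value:.4f}")
```

Small differences in the last digit relative to the solution come from rounding MS Error to 0.6933 before dividing.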

c) Based on the ANOVA from the MINITAB output in part (a), $F_0 = 96.18 > F_{0.05,2,25} = 3.39$ and P-value = 0.000
< $\alpha$ = 0.05. We reject the null hypothesis and conclude that the regression is significant.
d) $\beta_1$: $T_0 = 1.29 < t_{0.025,25} = 2.060$ and P-value = 0.210 > $\alpha$ = 0.05. There is not sufficient evidence to conclude that $\beta_1$
differs from zero.
$\beta_2$: $T_0 = 13.87 > t_{0.025,25} = 2.060$ and P-value = 0.000 < $\alpha$ = 0.05. We reject the null hypothesis and conclude that $\beta_2$ is
different from zero.

e) 95% CI for $\beta_1$

$\hat{\beta}_1 - t_{0.025,25}\,se(\hat{\beta}_1) \le \beta_1 \le \hat{\beta}_1 + t_{0.025,25}\,se(\hat{\beta}_1)$
$0.7417 - 2.060(0.5768) \le \beta_1 \le 0.7417 + 2.060(0.5768)$
$0.7417 - 1.188 \le \beta_1 \le 0.7417 + 1.188$
$-0.4463 \le \beta_1 \le 1.9297$
Because zero is included in the 95% CI, we fail to reject the null hypothesis that $\beta_1$ equals zero.

f) 95% CI for $\beta_2$

$\hat{\beta}_2 - t_{0.025,25}\,se(\hat{\beta}_2) \le \beta_2 \le \hat{\beta}_2 + t_{0.025,25}\,se(\hat{\beta}_2)$
$9.1142 - 2.060(0.6571) \le \beta_2 \le 9.1142 + 2.060(0.6571)$
$9.1142 - 1.354 \le \beta_2 \le 9.1142 + 1.354$
$7.7602 \le \beta_2 \le 10.4682$
Because zero is not included in the 95% CI, we reject the null hypothesis that $\beta_2$ equals zero.

g) Only $\beta_2$ is significantly different from zero (parts d) and f)), and the regression is significant (part c)). The next step
would be to fit a model with x2 as the sole predictor. A residual analysis should be carried out to make sure that the
chosen model is adequate.

6-32. a) MINITAB Output


Predictor Coef SE Coef T P
Constant 6.188 2.704 2.29 0.027
x1 9.6864 0.4989 19.42 0.000
x2 -0.3796 0.2339 -1.62 0.112
x3 2.9447 0.2354 12.51 0.000

S = 0.912336 R-Sq = 90.8% R-Sq(adj) = 90.2%

Analysis of Variance

Source DF SS MS F P
Regression 3 363.01 121.00 145.37 0.000
Residual Error 44 36.62 0.83
Total 47 399.63

$T_{x_1} = \dfrac{\text{Coef}_{x_1}}{\text{SE Coef}_{x_1}} = \dfrac{9.6864}{0.4989} = 19.4156$
$T_{x_2} = \dfrac{\text{Coef}_{x_2}}{\text{SE Coef}_{x_2}} = \dfrac{-0.3796}{0.2339} = -1.6229$
$T_{x_3} = \dfrac{\text{Coef}_{x_3}}{\text{SE Coef}_{x_3}} = \dfrac{2.9447}{0.2354} = 12.5093$
P-value$_{x_1}$ = 2P(t > |19.42|): with 44 degrees of freedom we obtain P-value < 2(0.0005), that is, P-value < 0.001
P-value$_{x_2}$ = 2P(t > |-1.62|): with 44 degrees of freedom we obtain 2(0.05) < P-value < 2(0.1), that is, 0.1 < P-value < 0.2
P-value$_{x_3}$ = 2P(t > |12.51|): with 44 degrees of freedom we obtain P-value < 2(0.0005), that is, P-value < 0.001
$R^2 = 1 - \dfrac{SS_E}{SS_T} = 1 - \dfrac{36.62}{399.63} = 90.84\%$
$MS_{Error} = \dfrac{SS_{Error}}{DF_{Error}} = \dfrac{36.62}{44} = 0.8323$
$S = \sqrt{MS_{Error}} = \sqrt{0.8323} = 0.912305$
$F = \dfrac{MS_{Reg}}{MS_{Error}} = \dfrac{121.00}{0.8323} = 145.38$
P-value = $P(F_{3,44} > 145.38) < 0.01$ (the F-test is one-sided, so the tail probability is not doubled)
b) $\hat{\sigma}^2 = 0.83$

c) Based on the ANOVA from the MINITAB output in part (a), $F_0 = 145.37 > F_{0.05,3,44} = 2.824$ (interpolated) and
P-value = 0.000 < $\alpha$ = 0.05. We reject the null hypothesis and conclude that the regression is significant.

d) $\beta_1$: $T_0 = 19.42 > t_{0.025,44} = 2.017$ (interpolated) and P-value = 0.000 < $\alpha$ = 0.05. We reject the null hypothesis and
conclude that $\beta_1$ is different from zero.
$\beta_2$: $|T_0| = 1.62 < t_{0.025,44} = 2.017$ (interpolated) and P-value = 0.112 > $\alpha$ = 0.05. We fail to reject the null hypothesis that
$\beta_2$ equals zero.
$\beta_3$: $T_0 = 12.51 > t_{0.025,44} = 2.017$ (interpolated) and P-value = 0.000 < $\alpha$ = 0.05. We reject the null hypothesis that $\beta_3$
equals zero.

e) 95% CI for $\beta_1$

$\hat{\beta}_1 - t_{0.025,44}\,se(\hat{\beta}_1) \le \beta_1 \le \hat{\beta}_1 + t_{0.025,44}\,se(\hat{\beta}_1)$
$9.6864 - 2.017(0.4989) \le \beta_1 \le 9.6864 + 2.017(0.4989)$
$9.6864 - 1.006 \le \beta_1 \le 9.6864 + 1.006$
$8.6804 \le \beta_1 \le 10.6924$
Because zero is not included in the CI, we reject the null hypothesis that $\beta_1$ equals zero.

f) 95% CI for $\beta_2$

$\hat{\beta}_2 - t_{0.025,44}\,se(\hat{\beta}_2) \le \beta_2 \le \hat{\beta}_2 + t_{0.025,44}\,se(\hat{\beta}_2)$
$-0.3796 - 2.017(0.2339) \le \beta_2 \le -0.3796 + 2.017(0.2339)$
$-0.3796 - 0.472 \le \beta_2 \le -0.3796 + 0.472$
$-0.8516 \le \beta_2 \le 0.0924$
Because zero is included in the 95% CI, we fail to reject the null hypothesis that $\beta_2$ equals zero.

g) 95% CI for $\beta_3$

$\hat{\beta}_3 - t_{0.025,44}\,se(\hat{\beta}_3) \le \beta_3 \le \hat{\beta}_3 + t_{0.025,44}\,se(\hat{\beta}_3)$
$2.9447 - 2.017(0.2354) \le \beta_3 \le 2.9447 + 2.017(0.2354)$
$2.9447 - 0.4748 \le \beta_3 \le 2.9447 + 0.4748$
$2.4699 \le \beta_3 \le 3.4195$
Because zero is not included in the CI, we reject the null hypothesis that $\beta_3$ equals zero.

h) The tests for $\beta_1$ and $\beta_3$ are significant, while the test for $\beta_2$ is not. Also, the regression is significant (part (c)). A
model with x2 omitted should be considered. Residual analysis should be applied to make sure that the model is adequate.

Section 6-4

6-33. a) The regression equation is


y = 643 + 11.4 x1 - 0.933 x2 - 0.0106 x1x2 - 0.0272 x1^2 +0.000471 x2^2

Predictor Coef StDev T P VIF


Constant 642.685 0.000 * *
x1 11.3862 0.0000 * * 2675.3
x2 -0.933346 0.000000 * * 1283.4
x1x2 -0.0106334 0.0000000 * * 8342.1
x1^2 -0.0271620 0.0000000 * * 502.4
x2^2 0.00047076 0.00000000 * * 3301.5

S = *
Analysis of Variance

Source DF SS MS F P
Regression 5 14112.00 2822.40 * *
Residual Error 0 * *
Total 5 14112.00

Source DF Seq SS
x1 1 10240.37
x2 1 1921.21
x1x2 1 827.86
x1^2 1 1056.39
x2^2 1 66.17

b) Because VIFs are much greater than 10, multicollinearity is present in the second-order model.
c) Because the full model has zero residual degrees of freedom, SSE(Full Model) is not available and the test statistic cannot be computed.

6-34. a) The regression equation is


MPG = 53.5 - 0.0074 W - 0.101 HP +0.000001 W*HP +0.000001 W^2 +0.000079 HP^2

Predictor Coef StDev T P VIF


Constant 53.49 27.17 1.97 0.096
W -0.00736 0.02063 -0.36 0.733 496.2
HP -0.10098 0.08450 -1.19 0.277 368.3
W*HP 0.00000146 0.00002559 0.06 0.956 639.0
W^2 0.00000104 0.00000371 0.28 0.788 767.9
HP^2 0.00007854 0.00005398 1.46 0.196 65.7

S = 1.977 R-Sq = 95.0% R-Sq(adj) = 90.7%

Analysis of Variance

Source DF SS MS F P
Regression 5 440.988 88.198 22.57 0.001
Residual Error 6 23.449 3.908
Total 11 464.438

Source DF Seq SS
W 1 236.699
HP 1 186.707
W*HP 1 9.307
W^2 1 0.001
HP^2 1 8.274

b) Because the VIFs are much greater than 10, multicollinearity is present in the second-order model.
c) The regression equation is


MPG = 38.4 - 0.00165 W - 0.0403 HP

Predictor Coef StDev T P VIF


Constant 38.387 3.719 10.32 0.000
W -0.001648 0.001325 -1.24 0.245 1.8
HP -0.040308 0.006299 -6.40 0.000 1.8

S = 2.135 R-Sq = 91.2% R-Sq(adj) = 89.2%

Analysis of Variance

Source DF SS MS F P
Regression 2 423.41 211.70 46.44 0.000
Residual Error 9 41.03 4.56
Total 11 464.44

$f_0 = \dfrac{[41.03 - 23.449]/(9 - 6)}{23.449/6} = \dfrac{5.8603}{3.908} = 1.4995$

This results in P-value = $P(f_{3,6} > 1.4995) = 0.3073$. Because the P-value > 0.05, we fail to reject H0 and conclude
that the second-order terms do not significantly contribute to the model.
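The extra-sum-of-squares statistic above can be computed directly from the two residual lines of the ANOVA tables. This is a small sketch; `partial_f` is an illustrative helper name, and the P-value is not recomputed here because that would need an F-distribution routine outside the standard library.

```python
def partial_f(sse_reduced, df_reduced, sse_full, df_full):
    """Partial F statistic for testing the terms dropped from the full model."""
    extra_ss = sse_reduced - sse_full     # sum of squares explained by the extra terms
    extra_df = df_reduced - df_full       # number of extra terms
    return (extra_ss / extra_df) / (sse_full / df_full)

# Residual SS and DF: first-order model (part c) versus second-order model (part a).
f0 = partial_f(sse_reduced=41.03, df_reduced=9, sse_full=23.449, df_full=6)
print(f"f0 = {f0:.4f}")
```

The statistic is compared with an F distribution on 3 and 6 degrees of freedom, as in the solution.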

6-35. a) All possible regressions.

Response is Sat

Vars R-Sq R-Sq(adj) Mallows C-p S Age Sev Surg Anx
1 82.1 80.8 4.4 9.3577 X
1 57.0 54.0 27.3 14.487 X
2 87.9 86.1 1.1 7.9723 X X
2 83.0 80.4 5.5 9.4476 X X
3 88.0 85.0 3.0 8.2768 X X X
3 87.9 84.9 3.0 8.2942 X X X
4 88.0 83.6 5.0 8.6446 X X X X

b) Forward selection. Alpha-to-Enter: 0.25

Response is Sat on 4 predictors, with N = 16

Step 1 2
Constant 136.2 146.7

Age -1.43 -1.12


T-Value -8.01 -5.76
P-Value 0.000 0.000

Sev -0.56
T-Value -2.51
P-Value 0.026

S 9.36 7.97
R-Sq 82.07 87.92
R-Sq(adj) 80.79 86.06
Mallows C-p 4.4 1.1

c) Backward elimination. Alpha-to-Remove: 0.1

Response is Sat on 4 predictors, with N = 16

Step 1 2 3
Constant 146.2 146.2 146.7

Age -1.12 -1.12 -1.12


T-Value -5.25 -5.51 -5.76


P-Value 0.000 0.000 0.000

Sev -0.59 -0.59 -0.56


T-Value -2.11 -2.22 -2.51
P-Value 0.058 0.046 0.026

Surg 0.1
T-Value 0.03
P-Value 0.979

Anx 0.5 0.6


T-Value 0.22 0.25
P-Value 0.832 0.809

S 8.64 8.28 7.97


R-Sq 87.98 87.98 87.92
R-Sq(adj) 83.61 84.97 86.06
Mallows C-p 5.0 3.0 1.1

d) The model with only age and severity seems to be the best among all. It has the largest R-Sq(adj) and the smallest Cp and
S values.

6-36. a) All possible regressions.

Response is Density

Vars R-Sq R-Sq(adj) Mallows C-p S Cont Loss
1 99.7 99.7 1.1 0.0083967 X
1 99.5 99.5 6.9 0.010964 X
2 99.7 99.7 3.0 0.0088342 X X

b) Forward selection. Alpha-to-Enter: 0.25

Response is Density on 2 predictors, with N = 11

Step 1
Constant -0.2005

Cont 0.4679
T-Value 57.81
P-Value 0.000

S 0.00840
R-Sq 99.73
R-Sq(adj) 99.70
Mallows C-p 1.1

c) Backward elimination. Alpha-to-Remove: 0.1

Response is Density on 2 predictors, with N = 11

Step 1 2
Constant -0.1105 -0.2005

Cont 0.4072 0.4679


T-Value 2.42 57.81
P-Value 0.042 0.000

Loss 2.1
T-Value 0.36
P-Value 0.727


S 0.00883 0.00840
R-Sq 99.74 99.73
R-Sq(adj) 99.67 99.70
Mallows C-p 3.0 1.1

d) The model with only the dielectric constant (Cont) seems to be the best among all. It has a high R-Sq(adj) and small Cp and S
values.

6-37. a) All possible regressions.

Response is y

Vars R-Sq R-Sq(adj) C-p S x1 x2 x3 x4

1 64.5 60.9 1.7 15.381 X


1 56.5 52.1 3.9 17.022 X
2 73.1 67.2 1.4 14.095 X X
2 64.6 56.8 3.7 16.173 X X
3 74.5 64.9 3.0 14.573 X X X
3 73.2 63.1 3.4 14.944 X X X
4 74.5 59.9 5.0 15.579 X X X X

b) Forward selection. Alpha-to-Enter: 0.25

Response is y on 4 predictors, with N = 12

Step 1 2
Constant -90.1607 0.5287

x2 15.2 10.3
T-Value 4.26 2.36
P-Value 0.002 0.042

x1 0.50
T-Value 1.71
P-Value 0.122

S 15.4 14.1
R-Sq 64.46 73.14
R-Sq(adj) 60.90 67.17
C-p 1.7 1.4

c) Backward elimination. Alpha-to-Remove: 0.1

Response is y on 4 predictors, with N = 12

Step 1 2 3 4
Constant -102.7132 -101.6100 0.5287 -90.1607

x1 0.61 0.61 0.50


T-Value 1.64 1.76 1.71
P-Value 0.145 0.117 0.122

x2 8.9 8.9 10.3 15.2


T-Value 1.68 1.80 2.36 4.26
P-Value 0.136 0.109 0.042 0.002

x3 1.4 1.4
T-Value 0.60 0.65
P-Value 0.567 0.536

x4 0.01
T-Value 0.02
P-Value 0.986


S 15.6 14.6 14.1 15.4


R-Sq 74.47 74.47 73.14 64.46
R-Sq(adj) 59.89 64.90 67.17 60.90
C-p 5.0 3.0 1.4 1.7

d) The model with only x1 and x2 seems to be the best among all. It has a high R-Sq(adj) and small Cp value.

6-38. a) All possible regressions

Response is y

Vars R-Sq R-Sq(adj) C-p S x1 x2 x3

1 99.1 99.0 7.1 3.9403 X


1 78.0 76.8 542.9 19.389 X
2 99.2 99.1 5.7 3.7418 X X
2 99.1 99.0 9.0 4.0433 X X
3 99.4 99.3 4.0 3.4796 X X X

b) Forward selection. Alpha-to-Enter: 0.25

Response is y on 3 predictors, with N = 20

Step 1 2 3
Constant -23.62 66.13 47.17

x3 21.51 20.12 18.24


T-Value 44.28 21.59 13.90
P-Value 0.000 0.000 0.000

x1 -5.4 -9.7
T-Value -1.72 -2.64
P-Value 0.103 0.018

x2 0.43
T-Value 1.91
P-Value 0.074

S 3.94 3.74 3.48


R-Sq 99.09 99.23 99.37
R-Sq(adj) 99.04 99.13 99.25
C-p 7.1 5.7 4.0

c) Backward elimination. Alpha-to-Remove: 0.1

Response is y on 3 predictors, with N = 20

Step 1
Constant 47.17

x1 -9.7
T-Value -2.64
P-Value 0.018

x2 0.43
T-Value 1.91
P-Value 0.074

x3 18.2
T-Value 13.90
P-Value 0.000

S 3.48
R-Sq 99.37


R-Sq(adj) 99.25
C-p 4.0

d) The model that contains all the first-order terms seems to be the best among all. It has the highest R-Sq(adj) and
smallest Cp value.

6-39. a) Note that x2 = 0 if using tool type 302 and x2 = 1 if using tool type 416.

The regression equation is


y = 49.2 + 0.143 x1 - 0.117 x2

Predictor Coef SE Coef T P


Constant 49.244 2.262 21.77 0.000
x1 0.142851 0.008297 17.22 0.000
x2 -0.116925 0.002677 -43.67 0.000

S = 0.6802 R-Sq = 99.3% R-Sq(adj) = 99.2%

Analysis of Variance

Source DF SS MS F P
Regression 2 1083.55 541.77 1171.07 0.000
Residual Error 17 7.86 0.46
Total 19 1091.41

Source DF Seq SS
x1 1 201.09
x2 1 882.46

Unusual Observations
Obs x1 y Fit SE Fit Residual St Resid
7 248 37.520 36.030 0.240 1.490 2.34R

R denotes an observation with a large standardized residual

Because the P-value < 0.01, the regression model is significant at 0.01.

b) Regression model for tool type 302 is

The regression equation is


y-Tool1 = 11.3 + 0.154 Tool1

Predictor Coef SE Coef T P


Constant 11.266 1.324 8.51 0.000
Tool1 0.154042 0.005533 27.84 0.000

S = 0.3671 R-Sq = 99.0% R-Sq(adj) = 98.9%

Analysis of Variance

Source DF SS MS F P
Regression 1 104.46 104.46 775.21 0.000
Residual Error 8 1.08 0.13
Total 9 105.54

The regression model for tool type 302 is significant at 0.01.

Regression model for tool type 416 is

The regression equation is


y-Tool2 = 5.60 + 0.122 Tool2

Predictor Coef SE Coef T P


Constant 5.603 3.944 1.42 0.193
Tool2 0.12160 0.01673 7.27 0.000


S = 0.8053 R-Sq = 86.9% R-Sq(adj) = 85.2%

Analysis of Variance

Source DF SS MS F P
Regression 1 34.277 34.277 52.85 0.000
Residual Error 8 5.189 0.649
Total 9 39.466
Unusual Observations
Obs Tool2 y-Tool2 Fit SE Fit Residual St Resid
7 248 37.520 35.760 0.332 1.760 2.40R

R denotes an observation with a large standardized residual

The regression model for tool type 416 is significant at 0.01.

Supplemental Exercises

6-40. a)

No, a linear relationship does not appear plausible.

b) The regression equation is kWh = 2.44 + 0.009 Dollars


c) Analysis of Variance

Source DF SS MS F P

Regression 1 0.00008 0.00008 0.00 0.950


Residual Error 13 0.26309 0.02024
Total 14 0.26317
The test statistic is
$f_0 = \dfrac{SS_R/k}{SS_E/(n - p)}$
Reject H0 if $f_0 > f_{\alpha,1,13}$, where $f_{0.05,1,13} = 4.67$.
Using the results from the ANOVA table,

$f_0 = \dfrac{0.00008/1}{0.2631/13} = 0.004$

Because 0.004 < 4.67, we fail to reject H0. The regression model is not significant at $\alpha$ = 0.05.
P-value > 0.10 (from the computer output, the P-value = 0.950).

d) Predictor Coef StDev T P


Constant 2.4422 0.6087 4.01 0.001
Dollars 0.0089 0.1387 0.06 0.950

$0.0089 - t_{0.025,13}(0.1387) \le \beta_1 \le 0.0089 + t_{0.025,13}(0.1387)$
$0.0089 - 2.160(0.1387) \le \beta_1 \le 0.0089 + 2.160(0.1387)$
$-0.291 \le \beta_1 \le 0.308$

e) The test statistic is

$t_0 = \dfrac{\hat{\beta}_1}{se(\hat{\beta}_1)}$

Reject H0 if $t_0 < -t_{\alpha/2,n-2}$, where $-t_{0.025,13} = -2.160$, or $t_0 > t_{0.025,13} = 2.160$.

Using the results from the table above,

$t_0 = \dfrac{0.0089}{0.1387} = 0.0642$

Because -2.160 < 0.0642 < 2.160, we fail to reject H0. There is not sufficient evidence to conclude that Dollars is a predictor of
electrical usage at $\alpha$ = 0.05.

f) The test statistic is

$t_0 = \dfrac{\hat{\beta}_0}{se(\hat{\beta}_0)}$
Reject H0 if $t_0 < -t_{\alpha/2,n-2}$, where $-t_{0.025,13} = -2.160$, or $t_0 > t_{0.025,13} = 2.160$.
Using the results from the table above,
$t_0 = \dfrac{2.4422}{0.6087} = 4.012$
Because 4.012 > 2.160, reject H0 and conclude that the intercept differs from zero at $\alpha$ = 0.05.

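6-41. Using $R^2 = 1 - \dfrac{SS_E}{S_{yy}}$,

$F_0 = \dfrac{(n - 2)\left(1 - \frac{SS_E}{S_{yy}}\right)}{\frac{SS_E}{S_{yy}}} = \dfrac{S_{yy} - SS_E}{\frac{SS_E}{n - 2}} = \dfrac{S_{yy} - SS_E}{\hat{\sigma}^2}$

Also,
$SS_E = \sum (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$
$= \sum (y_i - \bar{y} - \hat{\beta}_1 (x_i - \bar{x}))^2$
$= \sum (y_i - \bar{y})^2 + \hat{\beta}_1^2 \sum (x_i - \bar{x})^2 - 2\hat{\beta}_1 \sum (y_i - \bar{y})(x_i - \bar{x})$
$= \sum (y_i - \bar{y})^2 - \hat{\beta}_1^2 \sum (x_i - \bar{x})^2$
$SS_T - SS_E = S_{yy} - SS_E = \hat{\beta}_1^2 \sum (x_i - \bar{x})^2$

Therefore, $F_0 = \dfrac{\hat{\beta}_1^2}{\hat{\sigma}^2 / S_{xx}} = t_0^2$

Because the square of a t random variable with n - 2 degrees of freedom is an F random variable with 1 and n - 2 degrees of
freedom, the usual t-test that compares $|t_0|$ to $t_{\alpha/2,n-2}$ is equivalent to comparing $f_0 = t_0^2$ to $f_{\alpha,1,n-2} = t_{\alpha/2,n-2}^2$.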
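The identity $F_0 = t_0^2$ can also be checked numerically. The sketch below fits a simple linear regression to a small made-up data set (the x and y values are illustrative only, not from the exercise) and computes both statistics from the sums of squares used in the derivation.

```python
def slope_t_and_f(x, y):
    """Return (t0, f0) for H0: beta1 = 0 in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx                      # least squares slope
    ss_e = s_yy - b1 * s_xy               # error sum of squares
    sigma2_hat = ss_e / (n - 2)           # estimate of sigma^2
    t0 = b1 / (sigma2_hat / s_xx) ** 0.5  # t statistic for the slope
    f0 = (s_yy - ss_e) / sigma2_hat       # ANOVA F statistic
    return t0, f0

# Hypothetical data, chosen only so that the fit is not exact.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
t0, f0 = slope_t_and_f(x, y)
print(f"t0^2 = {t0 ** 2:.6f}, f0 = {f0:.6f}")
```

The two printed values agree up to floating-point rounding for any data set with at least three points and a nonzero residual sum of squares.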

6-42. a) From the previous exercise, $f_0 = \dfrac{0.9(23)}{1 - 0.9} = 207$. Reject $H_0: \beta_1 = 0$.
b) Because $f_{0.05,1,23} = 4.28$, $H_0$ is rejected if $\dfrac{23R^2}{1 - R^2} > 4.28$.
That is, $H_0$ is rejected if
$23R^2 > 4.28(1 - R^2)$
$27.28R^2 > 4.28$
$R^2 > 0.157$
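The relation between $R^2$ and the F statistic used in this exercise is easy to evaluate in code. This sketch assumes n = 25 observations, inferred from the 23 error degrees of freedom (n - 2 = 23); the function names are illustrative.

```python
def f_from_r2(r2, n):
    """F statistic for significance of regression, written in terms of R^2."""
    return (n - 2) * r2 / (1 - r2)

def r2_threshold(f_crit, n):
    """Smallest R^2 that leads to rejection, from solving (n-2)R^2/(1-R^2) > f_crit."""
    return f_crit / (f_crit + (n - 2))

N = 25          # assumption: n - 2 = 23 error degrees of freedom
F_CRIT = 4.28   # tabulated f(0.05, 1, 23) from the solution

print(f"f0 at R^2 = 0.9: {f_from_r2(0.9, N):.1f}")
print(f"reject H0 when R^2 > {r2_threshold(F_CRIT, N):.3f}")
```

The threshold shows that, with 25 observations, a fairly small $R^2$ is already enough for a statistically significant slope.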

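6-43. For two random variables $X_1$ and $X_2$,

$V(X_1 - X_2) = V(X_1) + V(X_2) - 2Cov(X_1, X_2)$
Then,
$V(Y_i - \hat{Y}_i) = V(Y_i) + V(\hat{Y}_i) - 2Cov(Y_i, \hat{Y}_i)$
$= \sigma^2 + V(\hat{\beta}_0 + \hat{\beta}_1 x_i) - 2\sigma^2\left[\frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}}\right]$
$= \sigma^2 + \sigma^2\left[\frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}}\right] - 2\sigma^2\left[\frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}}\right]$
$= \sigma^2\left[1 - \left(\frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}}\right)\right]$
a) Because $e_i$ is divided by its standard error (when $\sigma$ is known), $r_i$ has a unit standard deviation.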

b) No, the term in brackets in the denominator is necessary for the standardized residuals to have unit standard
deviation.

c) If $x_i$ is near $\bar{x}$ and n is reasonably large, $r_i$ is approximately equal to the standardized residual.

d) If $x_i$ is far from $\bar{x}$, the standard error of $e_i$ is small. Consequently, extreme points are better fit by least squares
regression than points near the middle range of x. Because the studentized residual at any point has variance of
approximately one, the studentized residuals can be used to compare the fit of points to the regression line over the
range of x.

6-44. a) [Figure: scatter diagram of DCoutput versus WindVel]

The scatter diagram shows definite curvature. A higher-order polynomial model or a transformation of variables may
be appropriate.

b) The regression equation is


DCoutput = 0.131 + 0.241 WindVel


Predictor Coef SE Coef T P


Constant 0.1309 0.1260 1.04 0.310
WindVel 0.24115 0.01905 12.66 0.000

S = 0.2361 R-Sq = 87.4% R-Sq(adj) = 86.9%

$\hat{y} = 0.131 + 0.241x$

c) Analysis of Variance
Source DF SS MS F P
Regression 1 8.9296 8.9296 160.26 0.000
Residual Error 23 1.2816 0.0557
Total 24 10.2112

The P-value from the ANOVA table is approximately zero. Therefore, reject H0 and conclude that the regression model
is significant at $\alpha$ = 0.01. The test can also be conducted in more detail as follows:

The test statistic is

$f_0 = \dfrac{SS_R/k}{SS_E/(n - p)}$
Reject H0 if $f_0 > f_{\alpha,1,23}$, where $f_{0.05,1,23} = 4.28$.
Using the results from the ANOVA table,
$f_0 = \dfrac{8.92961/1}{1.28157/23} = 160.257$
Because 160.257 > 4.28, reject H0 and conclude that the regression model is significant at $\alpha$ = 0.05.

d) [Figure: Residual Plots for DC Output, residuals versus predicted values and residuals versus wind velocity]

The plots exhibit nonrandom patterns that indicate model inadequacy.

e) Examining the residual plots in part d), a transformation on the x-variable, y-variable, or both would be appropriate. A
simple linear regression of y on the transformed variable 1/x may be satisfactory.

f) The following analysis employs the transformed variable, 1/x

The regression equation is


DCoutput = 2.98 - 6.93 1/WindVel

Predictor Coef SE Coef T P


Constant 2.97886 0.04490 66.34 0.000
1/WindVe -6.9345 0.2064 -33.59 0.000

S = 0.09417 R-Sq = 98.0% R-Sq(adj) = 97.9%

$\hat{y} = 2.98 - 6.93x^*$, where $x^* = 1/x$

Analysis of Variance

Source DF SS MS F P
Regression 1 10.007 10.007 1128.43 0.000
Residual Error 23 0.204 0.009
Total 24 10.211
The P-value from the ANOVA table is approximately zero. Therefore, reject H0 and conclude that the regression model
is significant at $\alpha$ = 0.05. The test can also be conducted in more detail as follows:

The test statistic is

$f_0 = \dfrac{SS_R/k}{SS_E/(n - p)}$
Reject H0 if $f_0 > f_{\alpha,1,23}$, where $f_{0.05,1,23} = 4.28$.
Using the results from the ANOVA table,
$f_0 = \dfrac{10.0072/1}{0.203970/23} = 1128.43$
Because 1128.43 > 4.28, reject H0 and conclude that the regression model is significant at $\alpha$ = 0.05.

[Figure: Residual Plots for DC Output using 1/WindVel, residuals versus predicted values and residuals versus 1/WindVel]

From the random appearance of the residuals in the plots, we conclude that the model is adequate. The transformation,
1/(Wind Velocity), appears to be satisfactory as a regressor of DC Output.

6-45. a) p = k + 1 = 2 + 1 = 3
Average size = p/n = 3/25 = 0.12

b) Leverage point criteria:

$h_{ii} > 2(p/n)$
$h_{ii} > 2(0.12)$
$h_{ii} > 0.24$
$h_{17,17} = 0.2593$
$h_{18,18} = 0.2929$
Points 17 and 18 are leverage points.
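The leverage rule in part b) is a one-line comparison once the diagonal elements of the hat matrix are known. The sketch below only includes the two hat values quoted in the solution; the remaining 23 values are not given in the source, so the dictionary is deliberately incomplete.

```python
def leverage_cutoff(n, k):
    """Rule-of-thumb cutoff 2p/n, where p = k + 1 counts the intercept."""
    p = k + 1
    return 2 * p / n

# Only the hat diagonals reported in the solution; other observations are omitted.
hat_diag = {17: 0.2593, 18: 0.2929}

cutoff = leverage_cutoff(n=25, k=2)
flagged = [obs for obs, h in hat_diag.items() if h > cutoff]
print(f"cutoff = {cutoff:.2f}, leverage points: {flagged}")
```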

6-46. a) The regression equation is


y = 3829 - 0.215 x3 + 21.2 x4 + 1.66 x5


Predictor Coef SE Coef T P


Constant 3829 2262 1.69 0.099
x3 -0.2149 0.1088 -1.97 0.056
x4 21.2134 0.9050 23.44 0.000
x5 1.6566 0.5502 3.01 0.005

S = 43.66 R-Sq = 99.3% R-Sq(adj) = 99.3%

$\hat{y} = 3829.26 - 0.215x_3 + 21.213x_4 + 1.657x_5$

b) Analysis of Variance

Source DF SS MS F P
Regression 3 9863398 3287799 1724.42 0.000
Residual Error 36 68638 1907
Total 39 9932036

The P-value from the ANOVA table is approximately zero. Therefore, reject H0 and conclude that the regression model
is significant at $\alpha$ = 0.01. The test can also be conducted in more detail as follows:

$H_0: \beta_3 = \beta_4 = \beta_5 = 0$
$H_1: \beta_j \ne 0$ for at least one j
The test statistic is
$f_0 = \dfrac{SS_R/k}{SS_E/(n - p)}$

Reject H0 if $f_0 > f_{\alpha,3,36}$, where $f_{0.01,3,36} = 4.38$.

Using the results from the ANOVA table,

$f_0 = \dfrac{9863398/3}{68638.2/36} = 1724.42$
Because 1724.42 > 4.38, reject H0 and conclude that the regression model is significant at $\alpha$ = 0.01. The P-value <
0.00001.

c) All at $\alpha$ = 0.01, $t_{0.005,36}$ = 2.72

$H_0: \beta_3 = 0$          $H_0: \beta_4 = 0$          $H_0: \beta_5 = 0$
$H_1: \beta_3 \ne 0$        $H_1: \beta_4 \ne 0$        $H_1: \beta_5 \ne 0$
$t_0 = -1.97$               $t_0 = 23.44$               $t_0 = 3.01$
$|t_0| < t_{\alpha/2,36}$   $|t_0| > t_{\alpha/2,36}$   $|t_0| > t_{\alpha/2,36}$
Fail to reject $H_0$        Reject $H_0$                Reject $H_0$
Potentially the x3 term can be removed from the model.

d) $R^2 = 0.993$, $R^2_{adj} = 0.9925$

The slight decrease in $R^2_{adj}$ may be reflective of the insignificant x3 term.
e) [Figure: Normal Probability Plot of the residuals (cumulative percent versus residuals)]

The normality assumption appears reasonable. The residuals fall along a line.

f) [Figure: Residual Plot, residuals versus predicted values (X 1000)]

The plot is satisfactory. There does not appear to be a nonrandom pattern in the residual vs. predicted plot.

g) [Figure: Residual Plot for y, residuals versus x3 (X 1000)]

There is a slight indication that variance increases as x3 increases. There is a fanning out appearance of the
residuals.

h) Using the equation found in part a):

$\hat{y} = 3829.26 - 0.215(28900) + 21.213(170) + 1.657(1589) = 3854.943$
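The point prediction in part h) is a plain evaluation of the fitted equation. A minimal sketch, using the rounded coefficients from part a) and an illustrative `predict` helper:

```python
# Rounded coefficients of the fitted model from part a).
coef = {"const": 3829.26, "x3": -0.215, "x4": 21.213, "x5": 1.657}

def predict(coef, x):
    """Evaluate the fitted linear model at the regressor values in x."""
    return coef["const"] + sum(coef[name] * value for name, value in x.items())

y_hat = predict(coef, {"x3": 28900, "x4": 170, "x5": 1589})
print(f"predicted y = {y_hat:.3f}")
```

Using the unrounded Minitab coefficients would change the result slightly, since x3 is large and its coefficient is rounded to three decimals.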

6-47. a) The regression equation is


y* = 19.7 - 1.27 x3* + 0.00541 x4 +0.000408 x5

Predictor Coef SE Coef T P


Constant 19.690 9.587 2.05 0.047
x3* -1.2673 0.9594 -1.32 0.195
x4 0.0054140 0.0002711 19.97 0.000
x5 0.0004079 0.0001645 2.48 0.018

S = 0.01314 R-Sq = 99.1% R-Sq(adj) = 99.0%

Analysis of Variance

Source DF SS MS F P
Regression 3 0.68611 0.22870 1323.62 0.000
Residual Error 36 0.00622 0.00017
Total 39 0.69233

The P-value from the ANOVA table is approximately zero. Therefore, reject H0 and conclude that the regression model
is significant at $\alpha$ = 0.01. The test can also be conducted in more detail as follows:

$H_0: \beta_3 = \beta_4 = \beta_5 = 0$
$H_1: \beta_j \ne 0$ for at least one j
The test statistic is
$f_0 = \dfrac{SS_R/k}{SS_E/(n - p)}$
Reject H0 if $f_0 > f_{\alpha,3,36}$, where $f_{0.01,3,36} = 4.38$.
Using the results from the ANOVA table,
$f_0 = \dfrac{0.686112/3}{0.00622033/36} = 1323.62$
Because 1323.62 > 4.38, reject H0 and conclude that the regression model is significant at $\alpha$ = 0.01.
P-value < 0.00001

b) $\alpha$ = 0.01, $t_{0.005,36}$ = 2.72

$H_0: \beta_3 = 0$          $H_0: \beta_4 = 0$          $H_0: \beta_5 = 0$
$H_1: \beta_3 \ne 0$        $H_1: \beta_4 \ne 0$        $H_1: \beta_5 \ne 0$
$t_0 = -1.32$               $t_0 = 19.97$               $t_0 = 2.48$
$|t_0| < t_{\alpha/2,36}$   $|t_0| > t_{\alpha/2,36}$   $|t_0| < t_{\alpha/2,36}$
Fail to reject $H_0$        Reject $H_0$                Fail to reject $H_0$

$\beta_3$: Fail to reject H0. There is not sufficient evidence that the coefficient of ln(x3) in the model differs from zero at $\alpha$ =
0.01.
$\beta_4$: Reject H0. The coefficient of x4 in the model differs from zero at $\alpha$ = 0.01.
$\beta_5$: Fail to reject H0. There is not sufficient evidence that the coefficient of x5 in the model differs from zero at $\alpha$ =
0.01.

c) [Figure: Residual Plots for ln(y), residuals versus predicted values and residuals versus ln(x3)]

Curvature is evident in the residuals plots from this model, whereas non-constant variance was evident in the previous
model.

6-48. a)


The P-value from the ANOVA table is approximately zero. Therefore, reject H0 and conclude that the regression model
is significant at $\alpha$ = 0.01. The test can also be conducted in more detail as follows:

The test statistic is

$f_0 = \dfrac{SS_R/k}{SS_E/(n - p)}$
Reject H0 if $f_0 > f_{\alpha,2,7}$, where $f_{0.05,2,7} = 4.74$.
Using the results from the ANOVA table,

$f_0 = \dfrac{26.633/2}{0.163/7} = 572$
Because 572 > 4.74, reject H0 and conclude that the regression model is significant at $\alpha$ = 0.05.

c) $H_0: \beta_{11} = 0$
$H_1: \beta_{11} \ne 0$
The test statistic is

$t_0 = \dfrac{\hat{\beta}_{11} - \beta_{11,0}}{se(\hat{\beta}_{11})}$
Reject H0 if $t_0 < -t_{\alpha/2,n-p}$, where $-t_{0.025,7} = -2.365$, or $t_0 > t_{0.025,7} = 2.365$.
Using the results from the table given in part a),
$t_0 = \dfrac{0.005699 - 0}{0.0006} = 9.498$
Because 9.498 > 2.365, reject H0 and conclude that the quadratic term contributes significantly to the model at $\alpha$ = 0.05.

d)


There might be a slight indication of non-constant variance. There is greater variability in the residuals as the predicted
values decrease.

e)

Normality assumption is reasonable. The residuals fall along a line.

Engineering Statistics 5th edition, SI

6- 49. Note: Problem 6-49 should read as follows:

a)

The simple linear regression model seems appropriate. The data fall along a straight line.


b)

$\hat{y} = 0.28 + 20.6x$
c) $\hat{y} = 0.28 + 20.6(1) = 20.88$
d)
$\hat{y} = 0.28 + 20.6(0.5) = 10.58$
$e_i = y_i - \hat{y}_i = 11.8 - 10.58 = 1.22$
e) The least squares estimate minimizes ( y i x i ) 2 .
Upon setting the derivative equal to zero, we obtain
2 ( y x )( x ) 2[ y x x 2 ] 0
i i i i i i

Solving for ,
yi xi
2
xi
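
The closed-form estimate for the no-intercept model can be illustrated with a short Python sketch. The function name and the data are hypothetical (the data are not from the exercise; they are generated exactly from y = 2x so the answer is easy to verify by hand).

```python
def slope_through_origin(x, y):
    """Least squares slope for the no-intercept model y = beta * x."""
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    return sum_xy / sum_xx

# Hypothetical data, not from the exercise: points lying exactly on y = 2x.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

beta_hat = slope_through_origin(x, y)

# At the minimizer the derivative condition sum((y_i - beta*x_i) * x_i) = 0 holds.
gradient_term = sum((yi - beta_hat * xi) * xi for xi, yi in zip(x, y))
print(beta_hat, gradient_term)
```

Here the sums are 28 and 14, so the estimate is 2 and the derivative condition is satisfied.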

f)

$\hat{y} = 0.28 + 20.6x$


Examining the plot, the model seems appropriate in this case. In part b) the coefficient for the intercept was not
significantly different from zero (P-value = 0.811) so that the zero intercept model could be expected to provide an
adequate fit in this exercise.

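6-50. a) $\hat\beta \Big/ \sqrt{\hat\sigma^2 / \sum x_i^2}$ has a t distribution with n - 1 degrees of freedom.
b) From the previous exercise, $\hat\beta = 21.031461$, $\hat\sigma = 3.611768$, and $\sum x_i^2 = 14.7073$.
Therefore,

$$t_0 = \frac{21.031461}{\sqrt{(3.611768)^2 / 14.7073}} = 22.3314$$

and $H_0: \beta = 0$ is rejected at usual α values.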
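
A minimal Python check of this test statistic follows. It assumes that 3.611768 is the estimate of σ (not σ²) and that 14.7073 is the sum of the squared x values; that reading reproduces the reported value of 22.3314. The variable names are illustrative.

```python
import math

beta_hat = 21.031461   # slope estimate from the previous exercise
sigma_hat = 3.611768   # assumed to be the estimate of sigma, not sigma squared
sum_x_sq = 14.7073     # assumed to be the sum of x_i squared

# Standard error of the slope in the no-intercept model.
se_beta = math.sqrt(sigma_hat ** 2 / sum_x_sq)

t0 = beta_hat / se_beta
print(round(t0, 4))
```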

6-51. a)

A linear regression model seems appropriate. The data fall along a line.


b)

c) $\hat{y} = 18101.3 - 254.85(20) = 13004.3$

d)



If there were no error, the values would all lie along the 45° line. Yes, the plot indicates that age is a reasonable
regressor variable.

6-52. a)

[Figure: two scatter plots, "Plot of Strength vs z=x-xbar" (z from -12 to 12) and "Plot of Strength vs Age" (Age from 0 to 25); the Strength axis runs from 1600 to 2800 in both.]

The slopes of both regression models are the same, but the y-intercepts differ.

b) The regression equation is

y = 2132 - 37.0 z

Predictor   Coef      SE Coef   T        P
Constant    2132.41   22.15     96.28    0.000
z           -36.962   2.967     -12.46   0.000

S = 99.05   R-Sq = 89.6%   R-Sq(adj) = 89.0%

$\hat{y} = 2132.41 - 36.9618z$


$\hat\beta_0 = 2625.39$, $\hat\beta_1 = -36.9618$
vs.
$\hat\beta_0^* = 2132.41$, $\hat\beta_1^* = -36.9618$

Because the data is shifted by the mean of age, the intercept in the model $Y = \beta_0^* + \beta_1^* z + \epsilon$ is now the average of
strength.
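
The effect of centering the regressor can be illustrated with a small Python sketch. The data below are hypothetical (they are not the exercise data) and `fit_line` is an illustrative helper implementing ordinary least squares for one regressor.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical age and strength values, not taken from the exercise.
age = [2.0, 5.0, 9.0, 14.0, 20.0]
strength = [2550.0, 2440.0, 2300.0, 2100.0, 1880.0]

age_mean = sum(age) / len(age)
strength_mean = sum(strength) / len(strength)
z = [a - age_mean for a in age]  # centered regressor z = x - xbar

b0, b1 = fit_line(age, strength)
b0_star, b1_star = fit_line(z, strength)

print(b1, b1_star)             # slopes agree
print(b0_star, strength_mean)  # centered intercept equals the mean response
```

The two intercepts are related by b0 = b0_star - b1 * age_mean, which is why only the intercept changes when the regressor is shifted.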
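6-53.

$$t_0 = \frac{\hat\beta_1}{\sqrt{\hat\sigma^2 / S_{xx}}}$$

After the transformation,
$\hat\beta_1^* = \frac{b}{a}\hat\beta_1$, $S_{xx}^* = a^2 S_{xx}$, $\bar{x}^* = a\bar{x}$, $\hat\beta_0^* = b\hat\beta_0$, and $\hat\sigma^* = b\hat\sigma$.
Therefore,

$$t_0^* = \frac{b\hat\beta_1 / a}{\sqrt{(b\hat\sigma)^2 / (a^2 S_{xx})}} = \frac{(b/a)\hat\beta_1}{(b/a)\sqrt{\hat\sigma^2 / S_{xx}}} = \frac{\hat\beta_1}{\sqrt{\hat\sigma^2 / S_{xx}}} = t_0$$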
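
The invariance of the slope t statistic under rescaling of x and y can also be checked numerically. The sketch below uses hypothetical data and hypothetical positive scale factors a and b; `slope_t_statistic` is an illustrative helper, not a library function.

```python
import math

def slope_t_statistic(x, y):
    """t0 = b1 / sqrt(sigma2_hat / Sxx) for simple linear regression."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = sse / (n - 2)
    return b1 / math.sqrt(sigma2_hat / sxx)

# Hypothetical data and scale factors (both factors positive).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = 2.54, 0.001

t0 = slope_t_statistic(x, y)
t0_star = slope_t_statistic([a * xi for xi in x], [b * yi for yi in y])
print(t0, t0_star)
```

Up to floating-point error the two statistics agree, matching the algebraic result.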

6-54. $H_0: \beta_1 = 10$
$H_1: \beta_1 \neq 10$
The test statistic is

$$t_0 = \frac{\hat\beta_1 - \beta_{1,0}}{se(\hat\beta_1)}$$

Reject H0 if $t_0 < -t_{\alpha/2,n-2}$, where $-t_{0.005,10} = -3.17$, or if $t_0 > t_{0.005,10} = 3.17$.
Using the data from the referenced exercise,

$$t_0 = \frac{9.21 - 10}{0.0338} = -23.37$$

Because -23.37 < -3.17, reject H0. The coefficient of the regressor is significantly different from 10 at α = 0.01.
P-value ≈ 0.
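
The test of the slope against a nonzero hypothesized value can be reproduced with a few lines of Python. The function name is illustrative; the critical value 3.17 is the tabled value quoted in the solution and is hardcoded rather than computed.

```python
def t_statistic(beta_hat, beta_null, se):
    """t0 = (estimate - hypothesized value) / standard error."""
    return (beta_hat - beta_null) / se

# Estimate, hypothesized value, and standard error as used in the solution.
t0 = t_statistic(beta_hat=9.21, beta_null=10.0, se=0.0338)

# Tabled t_{0.005,10} quoted in the solution (two-sided test at alpha = 0.01).
t_crit = 3.17

reject_h0 = abs(t0) > t_crit
print(round(t0, 2), reject_h0)
```

Note that the statistic is negative because the estimate lies below the hypothesized value of 10.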

6-55. a) All possible regressions.

Response is y*

Vars R-Sq R-Sq(adj) C-p S x1 x2 x3* x4 x5 x6
1 98.8 98.7 54.7 0.015088 X
1 98.5 98.5 72.0 0.016462 X
2 99.1 99.0 32.7 0.013115 X X
2 99.1 99.0 34.4 0.013277 X X
3 99.5 99.4 6.8 0.010145 X X X
3 99.4 99.4 10.9 0.010663 X X X
4 99.5 99.5 4.6 0.0097127 X X X X
4 99.5 99.4 6.3 0.0099487 X X X X
5 99.5 99.5 5.1 0.0096421 X X X X X
5 99.5 99.5 6.6 0.0098471 X X X X X
6 99.5 99.5 7.0 0.0097668 X X X X X X

b) Forward selection. Alpha-to-Enter: 0.25

Response is y* on 6 predictors, with N = 40

Step               1         2         3         4         5
Constant       7.275     6.837     6.698     6.728   -15.490

x4           0.00565   0.00430   0.00332   0.00333   0.00280
T-Value        54.80     11.31      6.52      7.98      5.63
P-Value        0.000     0.000     0.000     0.000     0.000

x2                     0.00003   0.00006   0.00003   0.00002
T-Value                   3.65      4.66      2.52      1.23
P-Value                  0.001     0.000     0.016     0.227

x6                              -0.00159  -0.00330  -0.00461
T-Value                            -2.66     -5.25     -4.86
P-Value                            0.011     0.000     0.000

x5                                         0.00041   0.00026
T-Value                                       4.33      2.11
P-Value                                      0.000     0.042

x3*                                                      2.2
T-Value                                                 1.81
P-Value                                                0.080

S             0.0151    0.0131    0.0122   0.00995   0.00964
R-Sq           98.75     99.08     99.23     99.50     99.54
R-Sq(adj)      98.72     99.03     99.17     99.44     99.48
C-p             54.7      32.7      23.7       6.3       5.1

c) Backward elimination. Alpha-to-Remove: 0.1

Response is y* on 6 predictors, with N = 40

Step               1         2         3
Constant      -16.26    -15.49    -23.51

x1          -0.00004
T-Value        -0.37
P-Value        0.713

x2           0.00002   0.00002
T-Value         1.25      1.23
P-Value        0.220     0.227

x3*              2.3       2.2       3.0
T-Value         1.82      1.81      2.90
P-Value        0.078     0.080     0.006

x4           0.00312   0.00280   0.00296
T-Value         3.14      5.63      6.12
P-Value        0.004     0.000     0.000

x5           0.00027   0.00026   0.00026
T-Value         2.11      2.11      2.07
P-Value        0.042     0.042     0.046

x6          -0.00463  -0.00461  -0.00501
T-Value        -4.81     -4.86     -5.56
P-Value        0.000     0.000     0.000

S            0.00977   0.00964   0.00971
R-Sq           99.55     99.54     99.52
R-Sq(adj)      99.46     99.48     99.47
C-p              7.0       5.1       4.6

d) The model with only x3*, x4, x5, and x6 seems to be the best among all. It has a high R-Sq(adj) and small Cp value.
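
The adjusted R-Sq values that drive this comparison can be reproduced from R-Sq, the sample size, and the number of regressors. The sketch below uses the standard definition of adjusted R-squared; the function name is illustrative, and the inputs are the rounded values printed for the four-regressor model in the backward elimination output (R-Sq = 99.52%, N = 40).

```python
def adjusted_r_squared(r_sq, n, k):
    """Adjusted R-squared for a model with k regressors plus an intercept."""
    return 1.0 - (1.0 - r_sq) * (n - 1) / (n - k - 1)

# Four-regressor model (x3*, x4, x5, x6) from the final backward elimination step.
adj = adjusted_r_squared(r_sq=0.9952, n=40, k=4)
print(round(100 * adj, 2))
```

The result agrees with the reported R-Sq(adj) of 99.47% to within the rounding of the printed R-Sq.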

6-56. a) All possible regressions.

Response is Sat

Vars R-Sq R-Sq(adj) Mallows C-p S Age Sev Surg Anx
1 82.1 80.8 4.4 9.3577 X
1 57.0 54.0 27.3 14.487 X
2 87.9 86.1 1.1 7.9723 X X
2 83.0 80.4 5.5 9.4476 X X
3 88.0 85.0 3.0 8.2768 X X X
3 87.9 84.9 3.0 8.2942 X X X
4 88.0 83.6 5.0 8.6446 X X X X

b) Forward selection. Alpha-to-Enter: 0.25

Response is Sat on 4 predictors, with N = 16

Step                 1        2
Constant         136.2    146.7

Age              -1.43    -1.12
T-Value          -8.01    -5.76
P-Value          0.000    0.000

Sev                       -0.56
T-Value                   -2.51
P-Value                   0.026

S                 9.36     7.97
R-Sq             82.07    87.92
R-Sq(adj)        80.79    86.06
Mallows C-p        4.4      1.1

c) Backward elimination. Alpha-to-Remove: 0.1

Response is Sat on 4 predictors, with N = 16

Step                 1        2        3
Constant         146.2    146.2    146.7

Age              -1.12    -1.12    -1.12
T-Value          -5.25    -5.51    -5.76
P-Value          0.000    0.000    0.000

Sev              -0.59    -0.59    -0.56
T-Value          -2.11    -2.22    -2.51
P-Value          0.058    0.046    0.026

Surg               0.1
T-Value           0.03
P-Value          0.979

Anx                0.5      0.6
T-Value           0.22     0.25
P-Value          0.832    0.809

S                 8.64     8.28     7.97
R-Sq             87.98    87.98    87.92
R-Sq(adj)        83.61    84.97    86.06
Mallows C-p        5.0      3.0      1.1

d) The model with only age and severity seems to be the best among all. It has a large R-Sq(adj) and small Cp and S
values.

6-57. The regression equation is

rads = - 440 + 19.1 mAmps + 68.1 exposure time

Predictor       Coef      SE Coef   T       P
Constant        -440.39   94.20     -4.68   0.000
mAmps           19.147    3.460     5.53    0.000
exposure time   68.080    5.241     12.99   0.000

S = 235.718   R-Sq = 84.3%   R-Sq(adj) = 83.5%

Analysis of Variance

Source           DF   SS         MS        F       P
Regression       2    11076473   5538237   99.67   0.000
Residual Error   37   2055837    55563
Total            39   13132310

a) $\hat{y} = -440.39 + 19.147x_1 + 68.080x_2$
where $x_1$ = mAmps and $x_2$ = ExposureTime
b) $\hat\sigma^2 = 55563$
$se(\hat\beta_0) = 94.20$, $se(\hat\beta_1) = 3.460$, and $se(\hat\beta_2) = 5.241$
c) Based on the P-values from the t-test for each coefficient in the table above, both coefficients differ from zero at the
0.05 level of significance.
d) $\hat{y} = -440.39 + 19.147(15) + 68.080(5) = 187.215$
$se(\hat\mu_{Y|15,5}) = 47.1$
95% PI = (-299.8, 674.3)
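
The point prediction and the 95% prediction interval in part d) can be reproduced with the sketch below. It assumes the usual prediction-interval form, in which the prediction variance is the error variance plus the squared standard error of the estimated mean response, and it hardcodes 2.026 as an approximate t critical value for 37 degrees of freedom (an assumption; this value is not quoted in the solution). Variable names are illustrative.

```python
import math

# Fitted coefficients (intercept, mAmps, exposure time) from the Minitab output.
coef = (-440.39, 19.147, 68.080)
x_new = (15.0, 5.0)  # mAmps = 15, exposure time = 5

y_hat = coef[0] + coef[1] * x_new[0] + coef[2] * x_new[1]

sigma2_hat = 55563.0  # residual mean square from the ANOVA table
se_mean = 47.1        # standard error of the estimated mean response
t_crit = 2.026        # assumed approximate t_{0.025,37}; not from the source

# Prediction variance = error variance + variance of the estimated mean.
half_width = t_crit * math.sqrt(sigma2_hat + se_mean ** 2)
pi_lower = y_hat - half_width
pi_upper = y_hat + half_width
print(round(y_hat, 3), round(pi_lower, 1), round(pi_upper, 1))
```

The computed limits match the reported interval to within rounding; the lower limit is negative because the error variance is large relative to the predicted value.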

6-58. The regression equation is

ARSNAILS = 0.488 - 0.00077 AGE - 0.0227 DRINKUSE - 0.0415 COOKUSE
+ 13.2 ARSWATER

Predictor   Coef        SE Coef    T       P
Constant    0.4875      0.4272     1.14    0.271
AGE         -0.000767   0.003508   -0.22   0.830
DRINKUSE    -0.02274    0.04747    -0.48   0.638
COOKUSE     -0.04150    0.08408    -0.49   0.628
ARSWATER    13.240      1.679      7.89    0.000

S = 0.236010   R-Sq = 81.2%   R-Sq(adj) = 76.5%

Analysis of Variance

Source           DF   SS        MS        F       P
Regression       4    3.84906   0.96227   17.28   0.000
Residual Error   16   0.89121   0.05570
Total            20   4.74028

a) $\hat{y} = 0.4875 - 0.000767x_1 - 0.02274x_2 - 0.04150x_3 + 13.240x_4$
where $x_1$ = AGE, $x_2$ = DrinkUse, $x_3$ = CookUse, and $x_4$ = ARSWater
b) $\hat\sigma^2 = 0.05570$
$se(\hat\beta_0) = 0.4272$, $se(\hat\beta_1) = 0.003508$, $se(\hat\beta_2) = 0.04747$, $se(\hat\beta_3) = 0.08408$, and $se(\hat\beta_4) = 1.679$
c) $\hat{y} = 0.4875 - 0.000767(30) - 0.02274(5) - 0.04150(5) + 13.240(0.135) = 1.9307$
