
Week 3 Practice Quiz 8/8 points (100%)

Practice Quiz, 8 questions

Congratulations! You passed!

1/1
points

1.
Data were collected on 200 high schools students scores on various tests,
including science, math, reading and social studies as well as the gender of
the students. The model output for predicting science scores from the rest of
the variables is shown below. Which of the following is the linear model for
female students?

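$\widehat{\text{science}} = 12.325 + 0.389\,\text{math} + 0.050\,\text{social studies} - 1.675\,\text{reading}$

$\widehat{\text{science}} = 12.325 + 0.389\,\text{math} + 0.050\,\text{social studies} + 0.335\,\text{reading}$

$\widehat{\text{science}} = 14.335 + 0.389\,\text{math} + 0.050\,\text{social studies} + 0.335\,\text{reading}$

$\widehat{\text{science}} = 10.315 + 0.389\,\text{math} + 0.050\,\text{social studies} + 0.335\,\text{reading}$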
Correct
This question refers to the following learning objective(s): Define the
multiple linear regression model as

$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$

where there are $k$ predictors (explanatory variables).
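
As a rough illustration of how a 0/1 indicator folds into the intercept, here is a minimal sketch. The regression output is not reproduced in this copy, so the gender coefficient used below (-2.010 for female) is an assumption, inferred from the gap between the baseline intercept 12.325 and the intercept 10.315 in the option marked correct. The function and key names are hypothetical.

```python
# Sketch: collapse a 0/1 indicator into the intercept to get a group-specific model.
# ASSUMPTION: the "female" coefficient is inferred as 10.315 - 12.325 = -2.010;
# the actual output table is not shown in this copy of the quiz.
coefs = {
    "intercept": 12.325,
    "female": -2.010,
    "math": 0.389,
    "social_studies": 0.050,
    "reading": 0.335,
}

def group_intercept(coefs, indicator, value):
    """Intercept of the model once the indicator is fixed at `value` (0 or 1)."""
    return coefs["intercept"] + coefs[indicator] * value

def predict_science(coefs, female, math, social_studies, reading):
    """Predicted science score for one student."""
    return (group_intercept(coefs, "female", female)
            + coefs["math"] * math
            + coefs["social_studies"] * social_studies
            + coefs["reading"] * reading)

female_intercept = round(group_intercept(coefs, "female", 1), 3)
male_intercept = round(group_intercept(coefs, "female", 0), 3)
print(female_intercept, male_intercept)
```

Under that assumption, only the intercept differs between the two groups; the slopes for math, social studies and reading are shared.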

1/1
points

2.
We modeled the gas mileage of 398 cars built in the 1970s and early 1980s
using engine displacement (in cubic inches), year of manufacture in relation to
1970 (e.g. 4 means the car was built in 1974; 12 means built in 1982, etc.), and
manufacturing site (domestic to the USA = 0; foreign to the USA = 1). The
regression output is provided below. Note that domestic is the reference level
for manufacturing site.

Which of the following is the best interpretation of the slope of year?

All else held constant, the model predicts that later model cars will
get an average of 0.72 additional miles per gallon for each year
difference in the date of manufacture.

Correct
This question refers to the following learning objective(s):

- Interpret the estimate for the intercept ($b_0$) as the expected
value of $y$ when all predictors are equal to 0, on average.
- Interpret the estimate for a slope (say $b_1$) as "All else held
constant, for each unit increase in $x_1$, we would expect $y$ to be
higher/lower on average by $b_1$."

All else held constant, the model predicts that as the date of
manufacture increases by 1 year, gas mileage changes 2.21 times as
fast for foreign cars as it does for domestic cars.

When a particular car is manufactured again in the following year,
its gas mileage will improve by 0.72 miles per gallon.

All else held constant, the model predicts that later model cars will
get an average of 12.48 additional miles per gallon for each year
difference in the date of manufacture.

1/1
points

3.
We modeled the gas mileage of 398 cars built in the 1970s and early 1980s
using engine displacement (in cubic inches), year of manufacture in relation to
1970 (e.g. 4 means the car was built in 1974; 12 means built in 1982, etc.), and
manufacturing site (domestic to the USA = 0; foreign to the USA = 1). The
regression output is provided below. Note that domestic is the reference level
for manufacturing site.

Which of the following is false?

Given information on the manufacturing site of the car and the year
of manufacture, engine displacement is a significant predictor of
gas mileage.

The 95% confidence interval for the slope of the displacement
coefficient can be calculated as $0.04 \pm (1.96 \times 16.42)$.

Correct
16.42 is the T score, not the standard error.

This question refers to the following learning objective(s):

- The significance of the model as a whole is assessed using an F-test.

- $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$

  $H_A$: At least one $\beta_i \neq 0$.

- df = $n - k - 1$ degrees of freedom.

- Usually reported at the bottom of the regression output.

- Note that the p-values associated with each predictor are conditional
on other variables being included in the model, so they can be used to
assess if a given predictor is significant, given that all others are in the
model.

- These p-values are calculated based on a t distribution with $n - k - 1$
degrees of freedom.

- The same degrees of freedom can be used to construct a confidence
interval for the slope parameter of each predictor:

  $b_i \pm t^\star_{n-k-1} \, SE_{b_i}$
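
To make the interval formula concrete, and to show why plugging in the t score gives nonsense, here is a minimal sketch. It assumes the estimate (0.04, magnitude as printed) and the t score (16.42) from the quiz option. The standard error is not given in this copy, so it is back-computed as |estimate| / |t|, which is only approximate because the estimate is rounded. The critical value 1.96 is the one used in the option (reasonable for large degrees of freedom). `slope_ci` is a hypothetical helper name.

```python
# Sketch: confidence interval for a slope, b_i ± critical value × SE(b_i).
# ASSUMPTIONS: estimate and t score are the values printed in the quiz option;
# the standard error is back-computed from them and is therefore approximate.
def slope_ci(estimate, std_error, critical=1.96):
    """Return (lower, upper) bounds of the interval estimate ± critical * std_error."""
    margin = critical * std_error
    return estimate - margin, estimate + margin

estimate = 0.04   # magnitude as printed; the sign is not recoverable from this copy
t_score = 16.42
std_error = abs(estimate) / abs(t_score)   # t = estimate / SE for a test of slope = 0

correct = slope_ci(estimate, std_error)    # uses the standard error
wrong = slope_ci(estimate, t_score)        # the mistake in the false option: t score used as SE
print(correct)
print(wrong)
```

The interval built from the t score is tens of units wide for a slope of a few hundredths, which is a quick sanity check that the wrong quantity was used.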

Given the engine displacement and manufacturing site, year of
manufacture is a significant predictor of gas mileage.

If we add another variable to the model, for example the price of
the car, the p-values associated with year of manufacture, site of
manufacture, and engine displacement may change.

1/1
points
4.
You are considering adding an explanatory variable to an existing multiple
linear regression model. Which of the following statements is generally true
regarding $R^2$ and adjusted $R^2$ as a result of adding the variable?

If the variable is not a meaningful predictor, $R^2$ will decrease and
adjusted $R^2$ will stay about the same.

If the variable is not a meaningful predictor, $R^2$ will be very close to
1 and adjusted $R^2$ will decrease.

If the variable is a meaningful predictor, $R^2$ and adjusted $R^2$ will
both increase.

Correct
This question refers to the following learning objective(s): Note that
$R^2$ will increase with each explanatory variable added to the model,
regardless of whether or not the added variable is a meaningful
predictor of the response variable. Therefore we use adjusted $R^2$,
which applies a penalty for the number of predictors included in the
model, to better assess the strength of a multiple linear regression
model:

$R^2_{adj} = 1 - \frac{SSE/(n-k-1)}{SST/(n-1)}$

where $n$ is the number of cases and $k$ is the number of predictors.

- Note that $R^2_{adj}$ will only increase if the added variable makes a
meaningful contribution to the amount of explained variability in $y$,
i.e. if the gains from adding the variable exceed the penalty.
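
The penalty can be seen numerically with a small sketch of the formula above. All sums of squares and sample sizes below are hypothetical, chosen only to contrast a weak added predictor with a meaningful one.

```python
# Sketch: R^2 versus adjusted R^2 when a predictor is added.
# ASSUMPTION: every number below is made up for illustration.
def r_squared(sse, sst):
    """R^2 = 1 - SSE/SST."""
    return 1 - sse / sst

def adjusted_r_squared(sse, sst, n, k):
    """Adjusted R^2 = 1 - [SSE/(n-k-1)] / [SST/(n-1)] for n cases and k predictors."""
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))

n, sst = 50, 1000.0
models = {
    "base (k=3)": (400.0, 3),
    "plus weak predictor (k=4)": (399.0, 4),      # SSE barely drops
    "plus strong predictor (k=4)": (300.0, 4),    # SSE drops substantially
}

results = {
    name: (r_squared(sse, sst), adjusted_r_squared(sse, sst, n, k))
    for name, (sse, k) in models.items()
}
for name, (r2, adj) in results.items():
    print(f"{name}: R^2 = {r2:.4f}, adjusted R^2 = {adj:.4f}")
```

In this toy setup, $R^2$ rises in both cases, while adjusted $R^2$ falls for the weak predictor and rises for the strong one, and it stays below $R^2$ throughout.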

If the variable is not a meaningful predictor, $R^2$ and adjusted $R^2$
will both decrease.

If the variable is a meaningful predictor, $R^2$ will increase while
adjusted $R^2$ will stay about the same.

If the variable is a meaningful predictor, adjusted $R^2$ will increase
and be higher than $R^2$.

If the variable is a meaningful predictor, $R^2$ will increase while
adjusted $R^2$ will decrease and become closer to the value of $R^2$.

1/1
points

5.
Which of the following is true?

A model selected using the adjusted $R^2$ backwards selection
approach will only contain explanatory variables that are significant
at the 5% level.

Adjusted $R^2$ may or may not be smaller than $R^2$, depending on the
sample size and the number of predictors in the model.

A parsimonious model is the model containing the highest possible
number of predictors.

Adjusted $R^2$ applies a penalty for the number of predictors included
in the regression model.

Correct
This question refers to the following learning objective(s): The general
idea behind backward-selection is to start with the full model and
eliminate one variable at a time until the ideal model is reached.

- p-value method:

(i) Start with the full model.

(ii) Drop the variable with the highest p-value and refit the
model.

(iii) Repeat until all remaining variables are significant.

- adjusted $R^2$ method:

(i) Start with the full model.

(ii) Refit all possible models omitting one variable at a time,
and choose the model with the highest adjusted $R^2$.

(iii) Repeat until maximum possible adjusted $R^2$ is reached.
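
The adjusted $R^2$ procedure above can be sketched as a simple loop. This is a minimal illustration, not a full implementation: the `score` callback stands in for "refit the model and return its adjusted $R^2$", and the predictor names and score table are hypothetical.

```python
# Sketch: backward elimination driven by a scoring criterion such as adjusted R^2.
# ASSUMPTION: `score` is supplied by the caller; in practice it would refit the
# regression on the given predictors and return the adjusted R^2.
def backward_select(predictors, score):
    """Drop one variable at a time while the criterion keeps improving."""
    current = frozenset(predictors)          # (i) start with the full model
    best = score(current)
    while len(current) > 1:
        # (ii) consider every model that omits exactly one variable
        candidates = [current - {p} for p in sorted(current)]
        top = max(candidates, key=score)
        if score(top) <= best:
            break                            # (iii) no removal raises the criterion
        current, best = top, score(top)
    return current, best

# Hypothetical adjusted R^2 values for each candidate model.
toy_scores = {
    frozenset({"x1", "x2", "x3"}): 0.70,
    frozenset({"x1", "x2"}): 0.72,
    frozenset({"x1", "x3"}): 0.60,
    frozenset({"x2", "x3"}): 0.55,
    frozenset({"x1"}): 0.50,
    frozenset({"x2"}): 0.45,
}

selected, best = backward_select(["x1", "x2", "x3"], lambda s: toy_scores[s])
print(sorted(selected), best)
```

Note that nothing in this loop looks at p-values, so the selected model may still contain predictors that are not significant at a given level.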

1/1
points

6.
Which of the following is false about conditions for multiple linear regression?

The residuals plot should show constant variability of residuals
around 0.

It is ideal for there to be no strong relationships between any of the
explanatory variables.

When the residuals are plotted in a histogram, they should appear
normally distributed around 0.

With multiple predictors in the model, it's not necessary for each of
the numerical explanatory variables to have a linear relationship
with the response variable.

Correct
Each numerical explanatory variable should be linearly associated
with the response variable.

This question refers to the following learning objective(s): List the
conditions for multiple linear regression as

(1) linear relationship between each (numerical) explanatory variable
and the response - checked using scatterplots of $y$ vs. each $x$, and
residuals plots of residuals vs. each $x$

(2) nearly normal residuals with mean 0 - checked using a normal
probability plot and histogram of residuals

(3) constant variability of residuals - checked using residuals plots of
residuals vs. $\hat{y}$, and residuals vs. each $x$

(4) independence of residuals (and hence observations) - checked
using a scatterplot of residuals vs. order of data collection (will reveal
non-independence if data have time series structure)

1/1
points

7.
Which of the following is false?

When dealing with collinearity, a useful strategy is to add more
predictors to the model, one at a time, until the bad effects of
collinearity disappear from the analysis.

Correct
Collinearity is a characteristic of existing predictors. Therefore, simply
adding more predictors to the model will not eliminate any collinearity
that's already present.

This question refers to the following learning objective(s): The general
idea behind backward-selection is to start with the full model and
eliminate one variable at a time until the ideal model is reached.

- p-value method:

(i) Start with the full model.

(ii) Drop the variable with the highest p-value and refit the
model.

(iii) Repeat until all remaining variables are significant.

- adjusted $R^2$ method:

(i) Start with the full model.

(ii) Refit all possible models omitting one variable at a time,
and choose the model with the highest adjusted $R^2$.

(iii) Repeat until maximum possible adjusted $R^2$ is reached.

In backwards model selection using adjusted $R^2$ as the criterion, we
drop variables from the model, one at a time, until adjusted $R^2$ is
maximized.

$R^2$ is always greater than or equal to adjusted $R^2$.

1/1
points

8.
True / False. If the F-test assessing the overall significance of a multiple linear
regression yields a significant p-value, all variables included in the model must
be significant predictors.

True

False

Correct
The null hypothesis for the F-test assessing the overall significance of
a multiple linear regression model is

$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$

A significant F-test means that at least one of these $\beta$s is different from
0, not that all of them are different from 0.
