Chapter 17

Correlation and Regression


True/False Questions

The product moment correlation, r, is an index used to determine whether a linear, or
straight-line, relationship exists between X and Y. It indicates the degree to which the
variation in one variable, X, is related to the variation in another variable, Y.
(True, easy, page 497)
When determining the correlation coefficient, r, it does matter which variable is
considered to be the dependent variable and which the independent.
(False, difficult, page 499)
The test statistic is used when determining the statistical significance of the
relationship between two variables measured by using r.
(True, moderate, page 499)
A correlation matrix indicates the coefficient of correlation between each pair of
variables.
(True, easy, page 500)
The order associated with a partial correlation indicates how many variables are being
adjusted or controlled.
(True, moderate, page 501)
The partial correlation coefficient is a measure of the correlation between Y and X
when the linear effects of the other independent variables have been removed from X
but not from Y.
(False, moderate, page 501)
In the absence of ties, Kendall's $\tau$ yields a closer approximation to the Pearson
product moment correlation coefficient, $\rho$, than Spearman's $\rho_s$.
(False, difficult, page 502)
Regression analysis is concerned with the nature and degree of association between
variables and does not imply or assume any causality.
(True, easy, page 503)
The product moment correlation helps us determine the strength of the association
between two metric variables. Regression analysis helps us determine which
variables cause a change in other variables.
(False, easy, page 503)

In bivariate regression, the null hypothesis is that no linear relationship exists
between X and Y, or $H_0: \beta_1 = 0$.
(True, moderate, page 504)
Standardized variables have a mean of 1 and a variance of zero.
(False, easy, page 507)
The term beta coefficient or beta weight is used to denote the standardized regression
coefficient.
(True, moderate, page 507)
The statistical significance of the linear relationship between X and Y may be tested
by examining the hypotheses: H0: 0 ; H1: 0.
(False, moderate, page 508)
The formula for the coefficient of determination is $r^2 = SS_{reg}/SS_y$.
(True, moderate, page 509)
The hypotheses for the test for significance of the coefficient of determination are:
$H_0: R^2_{pop} = 0$; $H_1: R^2_{pop} > 0$.
(True, difficult, page 510)
The standard error of estimate, SEE, is the standard deviation of the actual Y values
from the predicted values.
(True, moderate, page 510)
The general form of the multiple regression model is:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_k X_k + e$
(True, moderate, page 512)
The coefficient of multiple determination is adjusted for the number of dependent
variables and the sample size to account for diminishing returns.
(False, difficult, page 513)
The multiple correlation coefficient, R, can also be viewed as the simple correlation
coefficient, r, between Y and $\hat{Y}$.
(True, moderate, page 515)
$R^2$ cannot decrease as more independent variables are added to the regression
equation.
(True, easy, page 515)
In multiple regression, if the overall null hypothesis is rejected, we know which
specific coefficients ($\beta_i$'s) are nonzero.
(False, moderate, page 516)

A residual is the difference between the observed value of Yi and the value predicted
by the regression equation, $\hat{Y}_i$.
(True, easy, page 517)

If a variable explains a significant proportion of the residual variation, it should be
considered for inclusion in the regression equation.
(True, moderate, page 518)
If an examination of the residuals indicates that the assumptions underlying linear
regression are not met, the researcher can transform the variables in an attempt to
satisfy the assumptions.
(True, difficult, page 518)
The purpose of stepwise regression is to select, from a large number of predictor
variables, a small subset of variables that account for most of the variation in the
dependent or criterion variable.
(True, easy, page 519)
Stepwise procedures result in regression equations that are optimal, in the sense of
producing the largest $R^2$, for a given number of predictors.
(False, easy, page 520)
Multicollinearity arises when intercorrelations among the predictors are very low.
(False, easy, page 521)
When coding for dummy variables, $c - 1$ codes are needed for the c categories of an
independent variable.
(True, easy, page 523)
In regression with dummy variables, the predicted $\hat{Y}$ for each category is the mean of
Y for each category.
(True, moderate, page 524)
Regression in which a single independent variable has been recoded into dummy
variables is equivalent to one-way analysis of variance.
(True, moderate, page 524)
Multiple Choice Questions
31. _____ is best to use to determine how strongly sales are related to advertising
expenditures.
a. Regression analysis
b. Partial correlation coefficient
c. ANOVA
d. Product moment correlation (r)
(d, moderate, page 497)

32. The _____ is a statistic summarizing the strength of association between two metric
variables.
a. regression analysis
b. partial correlation coefficient
c. ANOVA
d. product moment correlation
(d, moderate, page 497)
33. The equation for r involves dividing the _____ by _____.
a. $COV_{xy}$; the product of the variances of X and Y ($S_x^2 S_y^2$)
b. the product of the standard deviations of X and Y ($S_x S_y$); $COV_{xy}$
c. $COV_{xy}$; the product of the standard deviations of X and Y ($S_x S_y$)
d. the product of the variances of X and Y ($S_x^2 S_y^2$); $COV_{xy}$
(c, difficult, page 497)
34. The equation for r is represented as:
a. $COV_{xy}/(S_x^2 S_y^2)$
b. $S_x S_y/COV_{xy}$
c. $COV_{xy}/(S_x S_y)$
d. $S_x^2 S_y^2/COV_{xy}$
(c, moderate, page 497)
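The formula in questions 33 and 34 can be checked numerically. The following is a minimal Python sketch, assuming sample (n - 1) covariance and standard deviations; the function name `pearson_r` and the advertising and sales figures are hypothetical, made up for illustration only.

```python
import math

def pearson_r(x, y):
    """Product moment correlation computed as COVxy / (Sx * Sy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and standard deviations (n - 1 in the denominator).
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov_xy / (s_x * s_y)

# Hypothetical data: advertising expenditures (X) and sales (Y).
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

r = pearson_r(advertising, sales)
r_swapped = pearson_r(sales, advertising)  # swapping X and Y leaves r unchanged
print(round(r, 4), round(r_swapped, 4))
```

Because the covariance and the product of the standard deviations are both symmetric in X and Y, the sketch also illustrates why it does not matter which variable is treated as dependent when computing r.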
35. $r^2$ measures:
a. the proportion of variation in one variable that is explained by the other.
b. the proportion of error variation.
c. the proportion of variation in Y related to the variation of the categories of X.
d. the proportion of variation in Y due to the variation within each of the categories
of X.
(a, moderate, page 499)
36. r = 0 indicates:
a. X and Y have a relationship
b. X and Y don't have a linear relationship
c. X and Y are unrelated
d. X and Y have a linear relationship
(b, moderate, page 499)
37. Which statement about the correlation coefficient, r, is true?
a. The calculation of r assumes that X and Y are metric variables whose distributions
have the same shape.
b. The correlation coefficient computed for a population is denoted by $\rho$ (rho).
c. Data obtained by using rating scales with a small number of categories tends to
deflate r.
d. All of the statements are true.
(d, difficult, page 499)

38. Which statement is not true about correlation matrices?
a. Usually only the lower portion of the matrix is considered.
b. The diagonal elements all equal 0.
c. A correlation matrix indicates the coefficient of correlation between each pair of
variables.
d. The upper triangular portion of the matrix is a mirror image of the lower
triangular portion.
(b, easy, page 500)
39. The _____ is a measure of the association between two variables after controlling or
adjusting for the effects of one or more additional variables.
a. regression analysis
b. partial correlation coefficient
c. ANOVA
d. product moment correlation
(b, moderate, page 500)
40. The question "How strongly are sales related to advertising expenditures when the
effect of price is controlled?" is best answered via _____.
a. regression analysis
b. partial correlation coefficient
c. ANOVA
d. product moment correlation
(b, moderate, page 500)
41. Which statement is not correct about the partial correlation coefficient?
a. Partial correlations can be helpful for detecting spurious relationships.
b. The partial correlation coefficient is generally viewed as more important than the
part correlation coefficient.
c. The partial correlation coefficient represents the correlation between Y and X
when the linear effects of the other independent variables have been removed
from X but not from Y.
d. The partial correlation coefficient can be calculated by a knowledge of the simple
correlations alone.
(c, moderate, pages 500-502)
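Option d of question 41 states that the partial correlation can be calculated from the simple correlations alone. A minimal Python sketch of the standard first-order formula follows; the function name `partial_r` and the three correlation values are hypothetical and chosen only for illustration.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation between X and Y, controlling for Z.

    Uses only the three simple correlations:
    (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical simple correlations: sales-advertising, sales-price, advertising-price.
r_controlled = partial_r(0.6, 0.5, 0.4)

# If the control variable is uncorrelated with both X and Y, nothing is removed.
r_unchanged = partial_r(0.6, 0.0, 0.0)
print(round(r_controlled, 4), r_unchanged)
```

The linear effect of Z is removed from both X and Y here, which is what distinguishes the partial correlation from the part correlation described in option c.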
42. Which of the following is a measure of non-metric correlation?
a. Pearson product moment correlation
b. Spearman's rho
c. Kendall's tau
d. both b and c
(d, easy, page 502)

43. _____ is a statistical procedure for analyzing associative relationships between a
metric dependent variable and one or more independent variables.
a. regression analysis
b. partial correlation coefficient
c. ANOVA
d. product moment correlation
(a, moderate, page 502)
44. _____ is a procedure for deriving a mathematical relationship, in the form of an
equation, between a single metric dependent variable and a single metric independent
variable.
a. Chi-square
b. Part correlation
c. Multiple regression
d. Bivariate regression
(d, moderate, page 503)
45. Which of the following marketing questions would be best answered by bivariate
regression?
a. Are consumers' perceptions of quality related to their perceptions of prices when
the effect of brand image is controlled?
b. Can the variation in market share be accounted for by the size of the sales force?
c. Do retailers, wholesalers, and agents differ in their attitudes toward the firm's
distribution policies?
d. How do advertising levels (high, medium, and low) interact with price levels
(high, medium, and low) to influence a brand's sales?
(b, difficult, page 503)
46. A technique for fitting a straight line to a scattergram by minimizing the square of the
vertical distances of all the points from the line is known as the _____.
a. least-squares procedure
b. scatter diagram plot
c. sum of square errors procedure
d. maximum residual procedure
(a, easy, page 505)
47. The bivariate regression model that accounts for the probabilistic or stochastic nature
of the relationship between X and Y is _____.
a. $\hat{Y} = a + b_1 X_1 + b_2 X_2$
b. $Y = \beta_0 + \beta_1 X_i$
c. $Y_i = \beta_0 + \beta_1 X_i + e_i$
d. $\hat{Y}_i = a + b X_i$
(c, easy, page 506)

48. What is the bivariate regression equation if sample observations are used to predict Y?
a. $\hat{Y} = a + b_1 X_1 + b_2 X_2$
b. $Y = \beta_0 + \beta_1 X_i$
c. $Y_i = \beta_0 + \beta_1 X_i + e_i$
d. $\hat{Y}_i = a + b x_i$
(d, easy, page 506)
49. Which statement is not true about the constant b in the bivariate regression equation
$\hat{Y}_i = a + b X_i$?
a. It is usually referred to as the non-standardized regression coefficient.
b. It is the slope of the regression line and it indicates the expected change in Y when
X is changed by one unit.
c. It is the intercept of the regression line and it indicates the value of Y when X is
zero.
d. It may be computed as $b = COV_{xy}/S_x^2$.
(c, moderate, page 506)
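The roles of the slope b and the intercept a in question 49 can be illustrated with a small least-squares fit. This is a minimal Python sketch, assuming the slope is the covariance divided by the variance of X and that the intercept follows from the means; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Return (a, b) for the fitted line Y-hat = a + b * X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    b = cov_xy / var_x          # slope: expected change in Y per unit change in X
    a = mean_y - b * mean_x     # intercept: predicted Y when X is zero
    return a, b

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
print(round(a, 4), round(b, 4))
```

For these numbers the slope is about 0.6 and the intercept about 2.2, so the intercept description in option c applies to a, not to b.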
50. Which equation depicts the relationship between the standardized and nonstandardized regression coefficients?
a. $B_{yx} = b_{yx}(S_x^2/S_y^2)$
b. $B_{yx}^2 = b_{yx}(S_x/S_y)$
c. $B_{yx} = b_{yx}(S_x/S_y)$
d. $B_{yx}^2 = b_{yx}(S_x^2/S_y^2)$
(c, moderate, page 507)
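The relationship in option c can be verified on a small data set. The sketch below is a hedged illustration in Python: it computes the non-standardized slope, rescales it by Sx/Sy, and compares the result with the simple correlation, which it should equal in the bivariate case. The helper names and the data are hypothetical.

```python
import math

def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

def raw_slope(x, y):
    """Non-standardized bivariate slope b = COVxy / Sx^2."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx  # the (n - 1) factors cancel

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b = raw_slope(x, y)
beta = b * sample_sd(x) / sample_sd(y)  # standardized coefficient B = b (Sx / Sy)
print(round(b, 4), round(beta, 4))
```

With a single predictor, the beta weight printed here coincides with the product moment correlation r for the same data.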
51. The standard deviation of b, or the standard error, is denoted as:
a. $SE_b$
b. $SD_b$
c. $SSY_b$
d. None of the above
(a, moderate, page 508)
52. In bivariate regression, which statement is true concerning the coefficient of
determination, $r^2$?
a. $r^2$ is the square of the simple correlation coefficient obtained by correlating the
two variables.
b. $r^2$ varies between 0 and 1.
c. $r^2$ signifies the proportion of the total variation in Y accounted for by the variation
in X.
d. All are correct.
(d, easy, page 508)
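The three statements in question 52 can be illustrated by computing the coefficient of determination as the ratio of the regression sum of squares to the total sum of squares. This is a minimal Python sketch under that definition; the function name `r_squared` and the data are hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination for a bivariate regression: SSreg / SSy."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    predicted = [a + b * xi for xi in x]
    ss_reg = sum((p - mean_y) ** 2 for p in predicted)  # variation explained by X
    ss_y = sum((yi - mean_y) ** 2 for yi in y)          # total variation in Y
    return ss_reg / ss_y

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r2 = r_squared(x, y)
print(round(r2, 4))
```

For these numbers the ratio is about 0.6, which matches the square of the simple correlation (roughly 0.7746) for the same data.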

53. _____ is the appropriate test statistic to use to determine the significance of the
coefficient of determination in bivariate regression.
a. F statistic
b. t statistic
c. z statistic
d. $\chi^2$
(a, moderate, page 510)
54. To estimate the accuracy of predicted values, $\hat{Y}$, found in bivariate regression, it is
useful to calculate the _____, the standard deviation of the actual Y values from the
predicted values.
a. coefficient of determination
b. standard error of the estimate
c. covariance
d. standard error
(b, difficult, page 510)
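The standard error of the estimate in question 54 can be sketched as follows. The Python below assumes the usual bivariate form, the square root of the residual sum of squares divided by n - 2; the function name and the data are hypothetical.

```python
import math

def standard_error_of_estimate(x, y):
    """SEE for a bivariate regression: sqrt(sum of squared residuals / (n - 2))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual: actual Y minus the value predicted by the fitted line.
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(ss_res / (n - 2))

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

see = standard_error_of_estimate(x, y)
print(round(see, 4))
```

A smaller value indicates that the actual Y values lie closer to the predicted values.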
55. _____ is a statistical technique that simultaneously develops a mathematical
relationship between two or more independent variables and an interval-scaled
dependent variable.
a. Chi-square
b. The least-squares procedure
c. Multiple regression
d. Bivariate regression
(c, moderate, page 511)
56. The general form of the multiple regression model is estimated by which equation?
a. $\hat{Y}_i = a + b X_i$
b. $Y_i = \beta_0 + \beta_1 X_i + e_i$
c. $\hat{Y} = a + b_1 X_1 + b_2 X_2 + b_3 X_3 + \cdots + b_k X_k$
d. $\hat{Y} = a + b_1 X_1 + b_2 X_2$
(c, easy, page 512)
57. Which statistic is associated only with multiple regression and not with bivariate
regression?
a. adjusted $R^2$
b. F test
c. estimated or predicted value ($\hat{Y}$)
d. both a and b
(d, moderate, page 513)
58. The _____ denotes the change in the predicted value, $\hat{Y}$, per unit change in X1 when
the other independent variables, X2 to Xk, are held constant.
a. partial regression coefficient
b. partial correlation coefficient
c. part correlation coefficient
d. part regression coefficient
(a, moderate, page 513)
59. Which statement is not true about partial regression coefficients?
a. The combined effects of X1 and X2 on Y are additive. In other words, if X1 and X2
are each changed by one unit, the expected change in Y would be (b1 + b2).
b. The beta coefficients are the partial regression coefficients obtained when all the
variables (Y, X1, X2, ..., Xk) have been standardized to a mean of 0 and a variance of 1
before estimating the regression equation.
c. Partial regression coefficients have an order associated with them.
d. Both a and b are not true.
(c, moderate, pages 513-514)
60. In multiple regression, if the overall null hypothesis is rejected:
a. the mean value of the dependent variable will be different for different categories
of the independent variable.
b. the means of the independent variables are not equal.
c. there is an association between the independent variables.
d. one or more population partial regression coefficients have a value different from
0.
(d, moderate, page 516)
61. In multiple regression, if the overall null hypothesis is rejected, which statement is
true?
a. We know which specific $\beta_i$'s are nonzero.
b. We can use t = b/SEb to determine which $\beta_i$'s are nonzero.
c. We do not know which $\beta_i$'s are nonzero.
d. Both b and c are correct.
(d, moderate, page 516)
62. _____ is a regression procedure in which the predictor variables enter or leave the
regression equation one at a time.
a. Multiple regression
b. Bivariate regression
c. Dummy variable regression
d. Stepwise regression
(d, easy, page 519)
63. Which of the following is not a problem associated with multicollinearity?
a. The partial regression coefficients may not be estimated precisely. The standard
errors are likely to be high.
b. It becomes difficult to assess the relative importance of the independent variables
in explaining the variation in the dependent variables.
c. Predictor variables may be incorrectly included or removed in stepwise
regression.
d. It becomes difficult to compute the correct test statistic.
(d, difficult, page 521)

64. _____ variables may be used as predictors or independent variables by coding them
as dummy variables.
a. interval
b. categorical
c. ratio
d. all of the above
(b, easy, page 523)
65. The regression equation for a categorical variable with four categories would be
modeled as:
a. $\hat{Y}_i = a + b_1 D_1 + b_2 D_2 + b_3 D_3$
b. $\hat{Y}_i = a + b_1 D_1 + b_2 D_2 + b_3 D_3 + b_4 D_4$
c. $Y = a + b_1 D_1 + b_2 D_2 + b_3 D_3$
d. $Y = a + b_1 D_1 + b_2 D_2 + b_3 D_3 + b_4 D_4$
(a, moderate, page 523)
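The dummy coding behind question 65 can be sketched in Python. The example assumes the last category is the reference (all dummies zero) and sets the coefficients from the category means, which is what a least-squares fit yields when a single categorical predictor is dummy coded. The category labels, data, and function names are hypothetical.

```python
def dummy_code(category, categories):
    """Return c - 1 dummy values; the last category is the reference (all zeros)."""
    return [1 if category == c else 0 for c in categories[:-1]]

# Hypothetical data: four categories, two observations each.
categories = ["A", "B", "C", "D"]
groups = ["A", "A", "B", "B", "C", "C", "D", "D"]
y = [3, 5, 6, 8, 10, 12, 1, 3]

# Mean of Y within each category.
means = {}
for c in categories:
    values = [yi for g, yi in zip(groups, y) if g == c]
    means[c] = sum(values) / len(values)

# Intercept = mean of the reference category; each b_j = category mean - reference mean.
a = means[categories[-1]]
b = [means[c] - a for c in categories[:-1]]

def predict(category):
    """Y-hat = a + b1*D1 + b2*D2 + b3*D3 for the given category."""
    d = dummy_code(category, categories)
    return a + sum(bj * dj for bj, dj in zip(b, d))

for c in categories:
    print(c, dummy_code(c, categories), predict(c), means[c])
```

Only three dummies are needed for four categories, and the predicted value for every category reproduces that category's mean of Y.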
Essay Questions
66. In what ways can regression analysis be used?
Answer
1. Determine whether the independent variables explain a significant variation in the
dependent variable: whether a relationship exists.
2. Determine how much of the variation in the dependent variable can be explained
by the independent variables: strength of the relationship.
3. Determine the structure or form of the relationship: the mathematical equation
relating the independent and dependent variables.
4. Predict the values of the dependent variable.
5. Control for other independent variables when evaluating the contributions of a
specific variable or set of variables.
(moderate, page 503)
67. Briefly explain how a scatter diagram benefits the researcher.
Answer
A scatter diagram is useful for determining the form of the relationship between the
variables. A plot can alert the researcher to patterns in the data, or to possible
problems. Any unusual combinations of the two variables can be easily identified.
(easy, pages 504-505)

68. What are the assumptions made by the regression model in estimating the parameters
and in significance testing?
Answer
1. The error term is normally distributed. For each fixed value of X, the distribution
of Y is normal.
2. The means of all these normal distributions of Y, given X, lie on a straight line
with slope b.
3. The mean of the error term is 0.
4. The variance of the error term is constant. This variance does not depend on the
values assumed by X.
5. The error terms are uncorrelated. In other words, the observations have been
drawn independently.
(difficult, page 511)
69. Given the multiple regression equation, $\hat{Y} = a + b_1 X_1 + b_2 X_2$, and the bivariate equation
$\hat{Y} = a + bX$, why is the partial regression coefficient, b1, different from the regression
coefficient, b, obtained by regressing Y on only X1?
Answer
This happens because X1 and X2 are usually correlated. In bivariate regression, X2 was
not considered and any variation in Y that was shared by X1 and X2 was attributed to
X1. However, in the case of multiple independent variables, this is no longer true.
(difficult, page 513)
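The answer to question 69 can be illustrated numerically. The Python sketch below uses the closed-form least-squares solution for two predictors, written in terms of sums of centered cross-products; the helper name `centered_sum` and the data are hypothetical, with X1 and X2 deliberately chosen to be correlated.

```python
def centered_sum(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))

# Hypothetical data in which X1 and X2 are positively correlated.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
y = [3, 4, 8, 7, 11]

s11 = centered_sum(x1, x1)
s22 = centered_sum(x2, x2)
s12 = centered_sum(x1, x2)
s1y = centered_sum(x1, y)
s2y = centered_sum(x2, y)

# Bivariate regression of Y on X1 alone.
b_bivariate = s1y / s11

# Partial regression coefficients from regressing Y on X1 and X2 together.
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

print(round(b_bivariate, 4), round(b1, 4), round(b2, 4))
```

In this data the bivariate slope is larger than b1 because the variation in Y shared by X1 and X2 is credited entirely to X1 when X2 is left out. If s12 were zero, the two coefficients would coincide.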
