The multiple regression model relates Y to k independent variables through the population slopes β1, β2, …, βk and a random error term:

Yi = β0 + β1X1i + β2X2i + … + βkXki + εi
The error term is normally distributed: for each fixed value of X, the distribution of Y is normal. The mean of the error term is 0. The variance of the error term is constant; it does not depend on the values assumed by X. The error terms are uncorrelated; in other words, the observations have been drawn independently.
Assumptions
- The independent variables are uncorrelated with the residual.
- The model is properly specified.
- The number of observations exceeds the number of parameters.
- The model is linear in the parameters.
- The independent variables are fixed in repeated samples.
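The error-term assumptions above can be illustrated by simulating data from the model; the parameter values below are hypothetical, chosen only for the sketch:

```python
import random

random.seed(42)

# Hypothetical population parameters (illustrative, not from the slides' example)
beta0, beta1, beta2 = 10.0, 2.0, -1.5
sigma = 3.0  # constant SD of the error term (homoscedasticity)

n = 50_000
errors, y = [], []
for _ in range(n):
    x1 = random.uniform(0, 10)
    x2 = random.uniform(0, 10)
    eps = random.gauss(0, sigma)  # normal, mean 0, constant variance
    errors.append(eps)
    y.append(beta0 + beta1 * x1 + beta2 * x2 + eps)

# Sample checks of the assumptions: mean of errors near 0,
# variance of errors near sigma^2 regardless of X
mean_err = sum(errors) / n
var_err = sum(e * e for e in errors) / n - mean_err ** 2
print(round(mean_err, 2), round(var_err, 1))
```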
Mr. Pranav Ranjan & Ms. Razia Sehdev ICTC, LPU
The strength of association in multiple regression is measured by the square of the multiple correlation coefficient, R2, which is also called the coefficient of multiple determination.
Adjusted R2
R2, the coefficient of multiple determination, is adjusted for the number of independent variables and the sample size to account for diminishing returns: after the first few variables, additional independent variables contribute little.
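The adjustment can be computed directly with the standard formula, adjusted R2 = 1 − (1 − R2)(n − 1)/(n − k − 1), using the values from the pie-sales output later in these slides:

```python
# R2 = .52148, n = 15 observations, k = 2 predictors (price, advertising)
r2, n, k = 0.52148, 15, 2
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(round(adj_r2, 5))  # matches the Adjusted R Square of .44172 in the output
```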
Multiple regression output for the pie sales example (sales regressed on price and advertising, n = 15):

ANOVA           df          SS          MS         F   Significance F
Regression       2   29460.027   14730.013   6.53861          0.01201
Residual        12   27033.306    2252.776
Total           14   56493.333

             Coefficients   Standard Error    t Stat   P-value   Lower 95%   Upper 95%
Intercept       306.52619        114.25389   2.68285   0.01993    57.58835   555.46404
Price           -24.97509         10.83213  -2.30565   0.03979   -48.57626    -1.37392
Advertising      74.13096         25.96732   2.85478   0.01449    17.55303   130.70888
b1 = -24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price, net of the effects of changes due to advertising
b2 = 74.131: sales will increase, on average, by 74.131 pies per week for each $100 increase in advertising, net of the effects of changes due to price
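The two slopes combine with the intercept into a prediction equation. A minimal sketch, using the estimated coefficients from the output; the input values ($5.50 price, $350 of advertising) are illustrative, not from the slides:

```python
# Estimated equation: Sales = 306.526 - 24.975*Price + 74.131*Advertising
# where Advertising is measured in hundreds of dollars
b0, b_price, b_adv = 306.52619, -24.97509, 74.13096

def predict_sales(price, advertising_hundreds):
    """Predicted weekly pie sales for a given price ($) and advertising ($100s)."""
    return b0 + b_price * price + b_adv * advertising_hundreds

# Illustrative inputs: price $5.50, advertising $350 (i.e., 3.5 hundreds)
print(round(predict_sales(5.50, 3.5), 1))
```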
52.1% of the variation in pie sales is explained by the variation in price and advertising
Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error      47.46341
Observations        15
44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables
The t-value for Price is t = -2.306, with p-value .0398. The t-value for Advertising is t = 2.855, with p-value .0145. Both p-values are below .05, so each coefficient is significant at the .05 level.
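Each t statistic is simply the coefficient divided by its standard error, compared against a t distribution with n − k − 1 = 12 degrees of freedom. A quick check using the values from the output:

```python
# (coefficient, standard error) pairs from the regression output
coefs = {
    "Price": (-24.97509, 10.83213),
    "Advertising": (74.13096, 25.96732),
}

# t statistic = b / SE(b)
t_stats = {name: b / se for name, (b, se) in coefs.items()}
print({name: round(t, 5) for name, t in t_stats.items()})
```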
Multicollinearity
Multicollinearity arises when intercorrelations among the predictors are very high. It results in several problems, including:
- The partial regression coefficients may not be estimated precisely; the standard errors are likely to be high.
- The magnitudes, and even the signs, of the partial regression coefficients may change from sample to sample.
- It becomes difficult to assess the relative importance of the independent variables in explaining the variation in the dependent variable.
- Predictor variables may be incorrectly included in, or removed from, a stepwise regression.
Multicollinearity
A simple procedure for adjusting for multicollinearity consists of using only one of the variables in a highly correlated set of variables.
Alternatively, the set of independent variables can be transformed into a new set of predictors that are mutually independent by using techniques such as principal components analysis. More specialized techniques, such as ridge regression and latent root regression, can also be used.
Multicollinearity Diagnostics:
Variance Inflation Factor (VIF) measures how much the variance of a regression coefficient is inflated by multicollinearity. A VIF of 1, its minimum value, indicates that the predictor is uncorrelated with the other independent variables. Values somewhat above 1 indicate some association between predictor variables, but generally not enough to cause problems. A maximum acceptable VIF value is typically 10; anything higher indicates a problem with multicollinearity.
Tolerance is the amount of variance in an independent variable that is not explained by the other independent variables. If the other variables explain much of the variance of a particular independent variable, there is a problem with multicollinearity; thus small tolerance values indicate multicollinearity. The cutoff for tolerance is typically .10: a tolerance value below .10 indicates a problem of multicollinearity.
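The two diagnostics are reciprocals of each other: if R2_j is the R2 from regressing predictor X_j on the remaining predictors, then tolerance = 1 − R2_j and VIF = 1/(1 − R2_j), so the usual cutoffs (VIF > 10, tolerance < .10) describe exactly the same condition:

```python
def vif(r2_j):
    """Variance inflation factor for predictor j, given the R^2 from
    regressing X_j on the other predictors. Minimum value is 1."""
    return 1.0 / (1.0 - r2_j)

def tolerance(r2_j):
    """Share of X_j's variance not explained by the other predictors."""
    return 1.0 - r2_j

# No correlation with the other predictors: VIF = 1, tolerance = 1.
# At R^2_j = .90 the cutoffs coincide: VIF = 10 and tolerance = .10.
print(vif(0.0), vif(0.90), tolerance(0.90))
```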