
Linear Regression in Excel

I. SIMPLE LINEAR REGRESSION EXAMPLE


Butler Trucking Company is an independent trucking company in southern California. A major portion of Butler's business involves deliveries throughout its local area. To develop better work schedules, the managers want to estimate the total daily travel time for their drivers. Initially, the managers believed that the total daily travel time would be closely related to the number of miles traveled in making the daily deliveries. A simple random sample of 10 driving assignments is provided in Table 1. Use Excel to make a scatter diagram of these deliveries (to verify that a linear relationship exists) and develop a regression equation expressing this relationship.

Table 1

Driving Assignment    X1=Miles Traveled    Y=Travel Time (hrs.)
 1                    100                  9.3
 2                     50                  4.8
 3                    100                  8.9
 4                    100                  6.5
 5                     50                  4.2
 6                     80                  6.2
 7                     75                  7.4
 8                     65                  6.0
 9                     90                  7.6
10                     90                  6.1

Excel Instructions for Drawing a Scatter Plot

1. Enter the above information in the Excel spreadsheet as shown in Figure 1 below.
2. Click Insert on the toolbar and then click Chart. The Chart Wizard will appear. In step 1 of the wizard, select the XY (Scatter) chart type (Figure 2), then click Next.
3. Your numerical data is contained in cells A2 through B11, so in step 2 of the wizard enter your data range as shown in Figure 3, and click Next.
4. In step 3 of the wizard you can give your chart a title and label your axes. In step 4, specify where you want the chart to be placed. The finished chart is shown in Figure 4.
5. After verifying that a linear trend exists, determine the least squares regression equation.

Figure 1

Figure 2

Figure 3

Figure 4 (scatter plot titled "Butler Trucking Example": Travel Time, axis 0 to 10, plotted against Miles Traveled, axis 0 to 150)
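As a numeric companion to the visual check, the sketch below computes the correlation coefficient and the least squares line directly from the Table 1 data. It is only an illustration in plain Python, not part of the Excel procedure, and the variable names (miles, hours, sxx, sxy) are chosen here for convenience.

```python
# Table 1 data from the Butler Trucking example.
miles = [100, 50, 100, 100, 50, 80, 75, 65, 90, 90]
hours = [9.3, 4.8, 8.9, 6.5, 4.2, 6.2, 7.4, 6.0, 7.6, 6.1]

n = len(miles)
mean_x = sum(miles) / n
mean_y = sum(hours) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in miles)
syy = sum((y - mean_y) ** 2 for y in hours)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(miles, hours))

# Pearson correlation coefficient: values near +1 or -1 suggest a linear trend.
r = sxy / (sxx * syy) ** 0.5

# Least squares slope (b1) and intercept (b0).
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

print(f"r  = {r:.4f}")
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")

# Estimated travel time for a hypothetical 85-mile assignment.
print(f"estimate at 85 miles = {b0 + b1 * 85:.2f} hrs")
```

The values of r, b0 and b1 should agree with Multiple R and the Coefficients column that Excel's Regression tool reports for this data.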

Excel Instructions for Regression Analysis

1. The Regression macro (which is part of the Analysis ToolPak) is standard with Excel; however, it is not always active and available for use. Select the Tools menu. If the Analysis ToolPak is active, you should see a Data Analysis item at the bottom of the menu. If this item is present, skip to step 3.
2. If this item is not there, you need to do one easy step. Select the Add-Ins option under the Tools menu, which brings up the following window.

Figure 5

Click the Analysis ToolPak checkbox, then OK. The Data Analysis item should now be present under the Tools menu in the future.
3. Select the Data Analysis option under the Tools menu and select the Regression option (as shown below).

Figure 6

4. Your dependent variable (y) data is in cells B1 through B11 (including the variable name or label), and your independent variable (x) data is in cells A1 through A11. Click the Labels box to indicate that the first row contains the variable names, and then click OK. See Figure 7.

Figure 7

5. A new worksheet will appear revealing the results of your regression analysis. The results from this analysis are shown below.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.8149057    (Correlation Coefficient)
R Square             0.6640713    (Coefficient of Determination)
Adjusted R Square    0.6220802
Standard Error       1.0017919
Observations         10

ANOVA
              df    SS             MS           F            Significance F
Regression    1     15.87130435    15.871304    15.814578    0.004080177
Residual      8     8.028695652    1.003587
Total         9     23.9

(Significance F is the P value for the ANOVA test.)

                          Coefficients    Standard Error    t Stat       P-value      Lower 95%      Upper 95%    Lower 95.0%    Upper 95.0%
Intercept (b0)            1.273913        1.400744525       0.9094542    0.3896874    -1.95621171    4.5040378    -1.9562117     4.5040378
X1=Miles Traveled (b1)    0.0678261       0.017055637       3.9767547    0.0040802    0.028495691    0.1071565    0.02849569     0.1071565

(The P-value in the X1 row is the P value for the t test for X1.)

Interpreting Results

1. In the Regression Statistics table, you will find the Coefficient of Determination, R2 (R Square), and the Correlation Coefficient, R (Multiple R).
2. The ANOVA table gives the F statistic for testing the claim that there is no significant relationship between your independent and dependent variables. The Significance F value is your p value. Thus you should reject the claim that there is no significant relationship between your independent and dependent variables if p < α, where α is your chosen significance level.
3. The Coefficients column gives the b0 and b1 values for the regression equation. The Intercept value is always b0. The b1 value is next to your independent variable, x.
4. In the P-value column of the coefficient output, the p value for the individual t test for the independent variable is given (in the same row as your independent variable). Recall that this t test tests the claim that there is no relationship between the independent variable and your dependent variable. Thus you should reject the claim that there is no significant relationship between your independent variable and dependent variable if p < α.
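To see where the Regression Statistics and ANOVA figures come from, the following sketch recomputes them from the Table 1 data in plain Python. It is an illustration under the usual simple linear regression formulas, not a description of Excel's internals. The variable names are chosen here, and the p values are left out because the Python standard library has no t or F distribution.

```python
# Table 1 data from the Butler Trucking example.
miles = [100, 50, 100, 100, 50, 80, 75, 65, 90, 90]
hours = [9.3, 4.8, 8.9, 6.5, 4.2, 6.2, 7.4, 6.0, 7.6, 6.1]

n = len(miles)
mean_x = sum(miles) / n
mean_y = sum(hours) / n
sxx = sum((x - mean_x) ** 2 for x in miles)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(miles, hours))

# Least squares coefficients.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
fitted = [b0 + b1 * x for x in miles]

# ANOVA decomposition: SS Total = SS Regression + SS Residual.
ss_total = sum((y - mean_y) ** 2 for y in hours)
ss_resid = sum((y - f) ** 2 for y, f in zip(hours, fitted))
ss_regr = ss_total - ss_resid

df_regr = 1          # one independent variable
df_resid = n - 2     # n minus the two estimated coefficients
ms_regr = ss_regr / df_regr
ms_resid = ss_resid / df_resid

r_square = ss_regr / ss_total
std_error = ms_resid ** 0.5        # "Standard Error" in Regression Statistics
f_stat = ms_regr / ms_resid

# Standard error and t statistic for the slope b1.
se_b1 = std_error / sxx ** 0.5
t_b1 = b1 / se_b1

print(f"R Square = {r_square:.7f}, Standard Error = {std_error:.7f}")
print(f"F = {f_stat:.6f}, t Stat (b1) = {t_b1:.6f}")
```

With a single independent variable, F equals the square of the slope's t statistic, which is why Significance F and the P-value in the X1 row coincide in the output above.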

II. MULTIPLE REGRESSION EXAMPLE


In attempting to identify another independent variable, the managers felt that the number of deliveries could also contribute to the total travel time. Table 2 includes the number of deliveries for each of the random driving assignments provided in Table 1.

Table 2

      A                    B                          C
 1    X1=Miles Traveled    X2=Number of Deliveries    Y=Travel Time (hrs.)
 2    100                  4                          9.3
 3     50                  3                          4.8
 4    100                  4                          8.9
 5    100                  2                          6.5
 6     50                  2                          4.2
 7     80                  2                          6.2
 8     75                  3                          7.4
 9     65                  4                          6.0
10     90                  3                          7.6
11     90                  2                          6.1

To determine the regression equation for this scenario, follow the same Excel steps provided for Simple Linear Regression with the following modifications: Enter your multiple regression data in Excel as shown above. In the step where you specify the input ranges, your dependent variable (y) data is in cells C1 through C11 (including the variable name or label), and your independent variable data (x1 and x2) is in cells A1 through B11. Click the Labels box to indicate that the first row contains the variable names, and then click OK. See Figure 8.

Your output for this multiple regression problem should be similar to the results shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.950678166
R Square             0.903788975
Adjusted R Square    0.876300111
Standard Error       0.573142152
Observations         10

ANOVA
              df    SS             MS          F              Significance F
Regression    2     21.60055651    10.80028    32.87836743    0.00027624
Residual      7     2.299443486    0.328492
Total         9     23.9

                           Coefficients    Standard Error    t Stat      P-value        Lower 95%       Upper 95%      Lower 95.0%     Upper 95.0%
Intercept                  -0.868701467    0.951547725       -0.91294    0.391634304    -3.118752683    1.38134975     -3.118752683    1.38134975
X1=Miles Traveled          0.061134599     0.009888495       6.182397    0.000452961    0.037752041     0.084517156    0.037752041     0.084517156
X2=Number of Deliveries    0.923425367     0.221113461       4.176251    0.004156622    0.400575489     1.446275244    0.400575489     1.446275244

Interpreting Results
1. In the Regression Statistics table, you will find the Adjusted Coefficient of Determination, Adjusted R2, and the Correlation Coefficient, R (Multiple R).
2. The ANOVA table gives the F statistic for testing the claim that there is no significant relationship between all of your independent variables and your dependent variable. The Significance F value is your p value. Thus you should reject the claim that there is no significant relationship between your independent and dependent variables if p < α, where α is your chosen significance level.
3. The Coefficients column gives the b0, b1, and b2 values for the regression equation. The Intercept value is always b0. The b1 value is next to your x1 variable, and b2 is next to your x2 variable.
4. In the P-value column of the coefficient output, the p values for the individual t tests for the independent variables are given. Recall that this t test tests the claim that there is no relationship between the independent variable (in the corresponding row) and your dependent variable. Thus you should reject the claim that there is no significant relationship between your independent variable (in the corresponding row) and dependent variable if p < α.
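The multiple regression coefficients can also be checked outside Excel. The sketch below solves the normal equations (X'X)b = X'y for the Table 2 data in plain Python. This is one standard way to obtain least squares estimates and is not a claim about how Excel computes them. The helper solve and all variable names are hypothetical and chosen for this illustration.

```python
# Table 2 data from the Butler Trucking example.
miles = [100, 50, 100, 100, 50, 80, 75, 65, 90, 90]
deliveries = [4, 3, 4, 2, 2, 2, 3, 4, 3, 2]
hours = [9.3, 4.8, 8.9, 6.5, 4.2, 6.2, 7.4, 6.0, 7.6, 6.1]

# Design matrix with a leading column of ones for the intercept.
X = [[1.0, x1, x2] for x1, x2 in zip(miles, deliveries)]
n, k = len(X), 3  # k = number of estimated coefficients (b0, b1, b2)

# Normal equations: (X'X) b = X'y.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * y for row, y in zip(X, hours)) for i in range(k)]


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    size = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (m[i][size] - tail) / m[i][i]
    return x


b0, b1, b2 = solve(xtx, xty)

# R Square and Adjusted R Square from the residuals.
mean_y = sum(hours) / n
fitted = [b0 + b1 * x1 + b2 * x2 for x1, x2 in zip(miles, deliveries)]
ss_total = sum((y - mean_y) ** 2 for y in hours)
ss_resid = sum((y - f) ** 2 for y, f in zip(hours, fitted))
r_square = 1 - ss_resid / ss_total
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k)

print(f"b0 = {b0:.6f}, b1 = {b1:.6f}, b2 = {b2:.6f}")
print(f"R Square = {r_square:.6f}, Adjusted R Square = {adj_r_square:.6f}")
```

Adjusted R Square penalizes R Square for the number of independent variables, which makes it the better figure for comparing this model with the one-variable model from the first example.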
