A plot of annual sales for the years 2000 through 2007 shows a clear upward linear trend. Fitting a linear
trendline and extending it forward 2 periods, we obtain an R² value of 0.9762, indicating that the year explains
most of the variation in annual sales. Using the trendline equation Sales = Year*11748 - 23400000 to forecast
the year 2008, we obtain predicted annual sales of 189984.
[Figure: Annual Sales by year, 1999 to 2010, with linear trendline y = 11748x - 2.34E+07, R² = 0.9762]
Given the strongly linear annual sales data, the linear regression model generates results similar to the linear
trendline, with an R² of 0.9762 and an equation of Sales = Year*11747.7619 - 23400847, which gives an annual
forecast of 188659.
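As a quick check of the arithmetic, the two equations quoted above can be evaluated for 2008. This is a minimal sketch: the function name `linear_forecast` is illustrative and not part of the original analysis, and the coefficients are copied from the text as printed.

```python
def linear_forecast(year, slope, intercept):
    """Evaluate Sales = Year * slope + intercept."""
    return year * slope + intercept

# Chart trendline, using the rounded coefficients as displayed.
trendline_2008 = linear_forecast(2008, 11748, -23_400_000)

# Regression output, with more decimal places retained.
regression_2008 = linear_forecast(2008, 11747.7619, -23_400_847)

print(round(trendline_2008))   # 189984
print(round(regression_2008))  # 188659
```

Both fits report the same R², so the gap of 1325 between the two forecasts appears to come from rounding in the displayed trendline coefficients rather than from a different model.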
[Figure: Sales by Year, 1999 to 2008]
12/10/2015 Northern Napa Valley Winery
Because the data exhibit a linear trend, double exponential smoothing is also a suitable forecasting method.
Using this model, we obtain a forecast of 189841 for the 2008 annual sales. With the weighting factors
optimized, the forecast is similar to those produced by the linear regression and trendline forecasts.
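For readers unfamiliar with the method, double exponential smoothing (Holt's linear method) can be sketched as follows. The initialization, the grid search over the weighting factors, and the sample numbers are all assumptions made for illustration; the report does not say how the weights were optimized, and the sample series is not the winery data.

```python
from itertools import product

def holt_one_step(series, alpha, beta):
    """Return the one-step-ahead forecasts for series[1:] and the next-period forecast."""
    # Assumed initialization: level = first value, trend = first difference.
    level, trend = series[0], series[1] - series[0]
    fitted = []
    for y in series[1:]:
        fitted.append(level + trend)  # forecast made before observing y
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return fitted, level + trend

def fit_holt(series):
    """Pick (alpha, beta) from a coarse grid by minimizing one-step-ahead MSE."""
    grid = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

    def mse(params):
        fitted, _ = holt_one_step(series, *params)
        return sum((y - f) ** 2 for y, f in zip(series[1:], fitted)) / len(fitted)

    return min(product(grid, grid), key=mse)

# Hypothetical yearly values, for illustration only.
sample = [100, 112, 119, 133, 141, 155]
alpha, beta = fit_holt(sample)
_, next_value = holt_one_step(sample, alpha, beta)
print(alpha, beta, round(next_value, 1))
```

A spreadsheet solver would typically search the weights continuously; the grid here only keeps the sketch dependency-free.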
[Figure: Sales by Year, 2000 to 2008]
Comparing the error measures of the three models, we can see that the linear regression model is the better
estimator of the annual forecast, as it provides a more accurate trendline equation and has the lowest error scores.
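The report does not name the error measures it used. As an assumption, the sketch below computes three common ones (MAD, MSE, and MAPE) so that competing forecasts can be compared on the same footing; the function name and the sample numbers are illustrative.

```python
def error_measures(actual, forecast):
    """Return MAD, MSE, and MAPE (in percent) for paired actual/forecast values."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # MAPE is undefined when an actual value is zero.
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return {"MAD": mad, "MSE": mse, "MAPE": mape}

# Hypothetical values, for illustration only.
print(error_measures([100, 200], [110, 190]))
```

Lower values are better on all three measures. MSE penalizes large misses more heavily, while MAPE makes series of different scale comparable.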
A plot of monthly sales for the period of January 2000 to October 2008 reveals a linear upward trend in which
the magnitude of the seasonal swings increases as the level of the time series increases. Given this seasonal
component, it is difficult to forecast the data set using linear regression alone. The R² value for the model is
0.4478, which indicates that the time period (month/year) by itself explains less than half of the variation in
sales, so the forecast will not accurately represent all of the data components (both linearity and seasonality).
[Figure: Monthly Sales, Jan-00 to Jul-09, showing Sales and Linear (Sales); trendline y = 2.6271x - 88575, R² = 0.4478]
By using classical decomposition to deseasonalize the data, we remove the seasonal component and re-compute the
linear regression to obtain a higher R² value of 0.92287, which indicates that the time period is a significant
factor affecting sales. With this model, the linear trendline function can forecast future sales more accurately.
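The deseasonalizing step can be sketched as below. This assumes multiplicative seasonal indices computed by the ratio-to-centered-moving-average approach, an even season length (12 for monthly data), and a series that starts at the first season; the report does not state which variant of classical decomposition was used, and the function names are illustrative.

```python
def seasonal_indices(series, period=12):
    """Multiplicative seasonal indices via ratio to a centered moving average."""
    if period % 2 or len(series) < 2 * period:
        raise ValueError("needs an even period and at least two full cycles")
    half = period // 2
    ratios = [[] for _ in range(period)]
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]  # period + 1 values
        # Centered moving average: half weight on the two end points.
        cma = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
        ratios[t % period].append(series[t] / cma)
    raw = [sum(r) / len(r) for r in ratios]
    scale = period / sum(raw)  # normalize so the indices average to 1
    return [x * scale for x in raw]

def deseasonalize(series, indices):
    """Divide each observation by the index of its season."""
    period = len(indices)
    return [y / indices[t % period] for t, y in enumerate(series)]

# Hypothetical quarterly values with a repeating pattern, for illustration only.
sample = [50, 100, 150, 100] * 3
idx = seasonal_indices(sample, period=4)
print(idx)
print(deseasonalize(sample, idx))
```

Once the seasonal pattern is divided out, what remains is the trend plus noise, which is why a straight line fits the deseasonalized series so much better.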
Once the linear forecast is computed, we re-seasonalize it using the seasonal indices to restore the seasonal
component of the data.
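A minimal sketch of this step fits a least-squares line to the deseasonalized series, extends it forward, and multiplies each projected value by the index of its season. It assumes multiplicative indices and a series that starts at the first season; the function names and sample numbers are illustrative.

```python
def fit_line(values):
    """Ordinary least squares of values against t = 0, 1, 2, ...; returns (slope, intercept)."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return slope, v_mean - slope * t_mean

def reseasonalized_forecast(deseasonalized, indices, horizon):
    """Extend the trend line `horizon` periods and re-apply the seasonal indices."""
    slope, intercept = fit_line(deseasonalized)
    n, period = len(deseasonalized), len(indices)
    return [(intercept + slope * t) * indices[t % period]
            for t in range(n, n + horizon)]

# Hypothetical deseasonalized values and two-season indices, for illustration only.
print(reseasonalized_forecast([100, 110, 120, 130], [0.5, 1.5], horizon=2))
```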
[Figure: Sales and Forecast by month, Jan-00 to Jul-09]
While multiple linear regression and linear trendlines provide a good starting point for forecasting the data,
the seasonal component plays a significant role in its distribution. Given these traits, the Holt-Winters
exponential smoothing model is well suited to forecasting the data. A comparison of the additive and
multiplicative models shows that the multiplicative model provides a more accurate forecast, because the
seasonal variation increases over time.
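The multiplicative Holt-Winters update can be sketched as follows. The initialization shown is one simple convention among several, and the smoothing weights and sample values are hypothetical; none of them are taken from the report.

```python
def holt_winters_multiplicative(series, period, alpha, beta, gamma, horizon):
    """Forecast `horizon` periods ahead with multiplicative seasonality."""
    first, second = series[:period], series[period : 2 * period]
    # Assumed initialization from the first two full cycles.
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonal = [y / level for y in first]
    for t in range(period, len(series)):
        y, s = series[t], seasonal[t % period]
        prev_level = level
        level = alpha * (y / s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (y / level) + (1 - gamma) * s
    n = len(series)
    # The forecast for step h lands on time index n - 1 + h.
    return [(level + h * trend) * seasonal[(n - 1 + h) % period]
            for h in range(1, horizon + 1)]

# Hypothetical two-season series, for illustration only.
print(holt_winters_multiplicative([50, 150] * 3, period=2,
                                  alpha=0.3, beta=0.1, gamma=0.2, horizon=2))
```

In the additive form the seasonal term is added to the level instead of multiplying it, which keeps the seasonal swing constant. The multiplicative form scales the swing with the level, matching the widening seasonal span described above.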
[Figures: Forecast and Actual by Period, Jan-00 to Dec-10 (two panels)]
Comparing the error measures of the three models, we can see that linear regression with classical
decomposition for seasonality has the lowest error, followed by the Holt-Winters multiplicative model.
Calculating the monthly sales individually with the linear regression model and summing them over 2008 gives a
value of 189022, versus the annual forecasts of 189984 from the linear trendline, 188659 from linear regression,
and 189841 from double exponential smoothing. By the error measures, the linear regression model worked best
for calculating the monthly and annual forecasts individually. However, the monthly Holt-Winters multiplicative
forecasts summed over 2008 were a closer match to the annual forecasts obtained from the linear regression and
linear trendline methods.
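To put the figures quoted above side by side, the short sketch below computes how far each annual forecast sits from the summed monthly regression forecast. The variable names are illustrative; the values are those reported in the text.

```python
monthly_sum_2008 = 189022  # monthly regression forecasts summed over 2008

annual_forecasts = {
    "linear trendline": 189984,
    "linear regression": 188659,
    "double exponential smoothing": 189841,
}

# Signed gap of each annual forecast relative to the summed monthly forecast.
gaps = {name: value - monthly_sum_2008 for name, value in annual_forecasts.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:+d} ({100 * gap / monthly_sum_2008:+.2f}%)")
```

All three annual forecasts fall within about one percent of the summed monthly value, so the methods agree closely on the 2008 total.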