
Demand Forecasting II

Causal Analysis

Chris Caplice
ESD.260/15.770/1.260 Logistics Systems
Sept 2006

Agenda

- Forecasting Evaluation
- Use of Causal Models in Forecasting
- Approach and Methods
  - Ordinary Least Squares (OLS) Regression
  - Other Approaches
- Closing Comments on Forecasting


Forecast Evaluation

How do we determine what is a good forecast?

- Accuracy: closeness to actual observations
- Bias: persistent tendency to over- or under-predict
- Fit versus Forecast: tradeoff between how well the model fits past observations and how useful it is for predicting future ones
- Forecast Optimality: the forecast error matches the underlying random noise distribution

A good forecast is a combination of art and science:

- Science: statistically find a valid model
- Art: find a model that makes sense


Accuracy and Bias Measures

1. Forecast Error:               e_t = x_t − x̂_t
2. Mean Deviation:               MD = (1/n) Σ_{t=1..n} e_t
3. Mean Absolute Deviation:      MAD = (1/n) Σ_{t=1..n} |e_t|
4. Mean Squared Error:           MSE = (1/n) Σ_{t=1..n} e_t²
5. Root Mean Squared Error:      RMSE = √[ (1/n) Σ_{t=1..n} e_t² ]
6. Mean Percent Error:           MPE = (1/n) Σ_{t=1..n} (e_t / D_t)
7. Mean Absolute Percent Error:  MAPE = (1/n) Σ_{t=1..n} |e_t / D_t|

where x_t (= D_t) is the actual demand in period t, x̂_t is the forecast for period t, and n is the number of periods evaluated.

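To keep these definitions concrete, here is a minimal Python/NumPy sketch (mine, not from the slides; the helper name accuracy_measures and the sample numbers are made up) that computes the measures for a pair of actual/forecast series.

```python
import numpy as np

def accuracy_measures(actual, forecast):
    """Error measures from the list above (illustrative helper, not from the slides)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    e = actual - forecast                      # forecast error e_t = x_t - x_hat_t
    return {
        "MD":   e.mean(),                      # mean deviation (signed, so it reveals bias)
        "MAD":  np.abs(e).mean(),              # mean absolute deviation
        "MSE":  (e ** 2).mean(),               # mean squared error
        "RMSE": np.sqrt((e ** 2).mean()),      # root mean squared error
        "MPE":  (e / actual).mean(),           # mean percent error
        "MAPE": np.abs(e / actual).mean(),     # mean absolute percent error
    }

# Example with made-up numbers
print(accuracy_measures([112, 110, 113, 111], [111.5, 110.8, 112.2, 111.3]))
```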

Moving Average Forecasts

[Figure: actual demand (ActDemand) versus MA3, MA10, and MA20 forecasts over roughly 120 periods; demand fluctuates between about 108 and 116 units]

Error measures for the three moving-average forecasts:

         MA3      MA10     MA20
MD       0.05     0.21     0.35
MAD      0.56     1.07     1.41
MSE      0.47     1.67     2.71
RMSE     0.68     1.29     1.65
MAPE     0.50%    0.96%    1.27%

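For readers who want to reproduce this kind of comparison, below is a short Python sketch (not course material) that generates one-step-ahead moving-average forecasts and reports the same error measures; the demand series is simulated, so the numbers will not match the table above.

```python
import numpy as np

def moving_average_forecast(demand, window):
    """One-step-ahead MA(window) forecast: the average of the previous `window` observations."""
    demand = np.asarray(demand, dtype=float)
    f = np.full(len(demand), np.nan)
    for t in range(window, len(demand)):
        f[t] = demand[t - window:t].mean()
    return f

# Simulated level-only demand (hypothetical; not the series behind the table above)
rng = np.random.default_rng(42)
demand = 112 + rng.normal(0.0, 1.0, 120)

for w in (3, 10, 20):
    f = moving_average_forecast(demand, w)
    e = demand[w:] - f[w:]
    print(f"MA{w:>2}: MD={e.mean():+.2f}  MAD={np.abs(e).mean():.2f}  "
          f"RMSE={np.sqrt((e ** 2).mean()):.2f}  MAPE={np.abs(e / demand[w:]).mean():.2%}")
```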

Analysis of the Forecast

Are the forecast errors ~N(0, Var(e))?

- What is the expected value of the errors?
- What is the variance of the errors?

For moving averages, from the actual observations: are the observed errors ~N(0, Var(e))?

For the MA3 data:

- mean error ē = 0.05
- standard deviation of the errors σ_e = 0.69
- D = 1.478

Testing for normality: Chi-Square, Kolmogorov-Smirnov, or other tests.

[Figures: a histogram of the MA3 forecast errors (frequencies up to about 25 per bin, bins spanning roughly −1.5 to +1.3) and a time-series plot of the errors, ranging from about −2.00 to +2.00 over the ~120 periods]

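One way to run the normality check in practice is sketched below in Python with SciPy (my illustration, not from the slides); the error sample is simulated using the MA3 summary statistics quoted above.

```python
import numpy as np
from scipy import stats

# Simulated stand-in for the MA3 forecast errors (the slide reports e_bar = 0.05, sigma_e = 0.69)
rng = np.random.default_rng(0)
errors = rng.normal(0.05, 0.69, 100)

mean_e = errors.mean()
std_e = errors.std(ddof=1)

# Kolmogorov-Smirnov test of the errors against N(mean_e, std_e)
ks_stat, ks_p = stats.kstest(errors, "norm", args=(mean_e, std_e))

# An alternative omnibus normality test (D'Agostino-Pearson)
k2_stat, k2_p = stats.normaltest(errors)

print(f"mean={mean_e:.3f}  std={std_e:.3f}")
print(f"KS: D={ks_stat:.3f}, p={ks_p:.3f}   normaltest: p={k2_p:.3f}")
```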

Corrective Actions to Forecasts

Measures of Bias:

- Cumulative Sum of Errors (C_t)
  - Normalize by dividing by the RMSE (U_t)
  - U_t should be ~0 if the forecast is unbiased
- Smoothed Error Tracking Signal (T_t)
  - T_t = z_t / MAD_t
  - where z_t = βe_t + (1 − β)z_{t−1} and β is a smoothing constant
- Autocorrelation of forecast errors
  - Correlation between successive observations

Corrective Actions:

- Adaptive Forecasting
  - Methods where the smoothing coefficients change over time
  - Found (generally) to be no better than standard methods
- Human Intervention
  - Overrule the model's output and look for the reason
  - Rules of thumb: |T_t| > f or |C_t| > k·RMSE (f ≈ 0.4 and k ≈ 4)
  - Lower values (of k or f) lead to more intervention

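A minimal Python sketch of the two tracking signals and the intervention rule of thumb follows (my illustration, not course code; the error stream, the starting MAD, and the smoothing constant β = 0.1 are assumed values).

```python
import numpy as np

def tracking_signals(errors, beta=0.1, mad0=1.0):
    """Cumulative-sum (C_t) and smoothed-error (T_t) tracking signals for a stream of forecast errors.
    beta is a smoothing constant; mad0 is an assumed starting value for the smoothed MAD."""
    errors = np.asarray(errors, dtype=float)
    C = np.cumsum(errors)                        # cumulative sum of errors, C_t
    T = np.empty(len(errors))
    z, mad = 0.0, mad0
    for t, e in enumerate(errors):
        z = beta * e + (1 - beta) * z            # smoothed error z_t
        mad = beta * abs(e) + (1 - beta) * mad   # smoothed MAD_t
        T[t] = z / mad                           # tracking signal T_t = z_t / MAD_t
    return C, T

# Flag periods where the rules of thumb suggest intervening (f ~ 0.4, k ~ 4)
errors = np.array([0.3, -0.2, 0.5, 1.1, 0.9, 1.4, 1.2, 1.6])   # hypothetical error stream
C, T = tracking_signals(errors)
rmse = np.sqrt((errors ** 2).mean())
flag = (np.abs(T) > 0.4) | (np.abs(C) > 4 * rmse)
print(np.round(T, 2), flag)
```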

Causal Forecasting Models

- Assumes that demand is highly correlated with some environmental factors
- A model is built to relate the independent exogenous factors to the demand
- Examples:
  - Diapers ~ f(birth rates lagged by 1 year)
  - NFL Jerseys ~ f(team and individual performance)
  - New products ~ f(product lifecycle)
  - Promotional Items ~ f(marketing promotions & ads)
  - Regional Sales ~ f(household demographics in area)
  - Umbrellas / Fuel ~ f(weather, temperature, rain, etc.)
- The form of the dependent variable dictates the method used:
  - Continuous: takes any value
  - Discrete: takes only integer values
  - Binary: equal to 0 or 1


OLS Linear Regression

- The relationship is described in terms of a linear model
- The data (x_i, y_i) are the observed pairs from which we try to estimate the coefficients to find the best fit
- The error term, ε, is the unaccounted-for or unexplained portion
- The error terms are assumed to be i.i.d. ~N(0, σ) and capture all of the factors ignored or neglected by the model

    E(Y | x) = β_0 + β_1 x
    StdDev(Y | x) = σ

    Observed:  Y_i = β_0 + β_1 x_i + ε_i    for i = 1, 2, ..., n
    Unknown:   β_0, β_1, σ


OLS Linear Regression

Residuals:

- Predicted (estimated) values ŷ_i are found by using the regression coefficients, b
- Residuals, e_i, are the differences between the actual and predicted values
- Find the b's that minimize the residuals

    ŷ_i = b_0 + b_1 x_i                        for i = 1, 2, ..., n
    e_i = y_i − ŷ_i = y_i − (b_0 + b_1 x_i)    for i = 1, 2, ..., n

How should we measure the residuals?

- Minimize the sum of errors: shows bias, but not accuracy (positive and negative errors cancel)
- Minimize the sum of absolute errors: accurate and shows bias, but analytically intractable
- Minimize the sum of squared errors: shows bias, is accurate, and is tractable

The best model minimizes the residual sum of squares:

    SSE = Σ_{i=1..n} e_i² = Σ_{i=1..n} (y_i − ŷ_i)² = Σ_{i=1..n} (y_i − b_0 − b_1 x_i)²


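As a small illustration (mine, with assumed names and made-up data), the helper below scores a candidate line under the three criteria just listed:

```python
import numpy as np

def residual_criteria(x, y, b0, b1):
    """Score a candidate line y_hat = b0 + b1*x under the three criteria above (illustrative helper)."""
    e = np.asarray(y, dtype=float) - (b0 + b1 * np.asarray(x, dtype=float))
    return {
        "sum_errors":     e.sum(),          # positive and negative errors cancel: reveals bias only
        "sum_abs_errors": np.abs(e).sum(),  # accurate, but hard to minimize analytically
        "sum_sq_errors":  (e ** 2).sum(),   # SSE: the criterion OLS minimizes
    }

print(residual_criteria([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8], b0=0.0, b1=2.0))
```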

OLS Linear Regression

We can find the optimal values of b_0 and b_1 by taking the first-order conditions of the SSE:

    SSE = Σ_{i=1..n} e_i² = Σ_{i=1..n} (y_i − ŷ_i)² = Σ_{i=1..n} (y_i − b_0 − b_1 x_i)²

This gives us the following coefficients:

    b_0 = ȳ − b_1 x̄

    b_1 = Σ_{i=1..n} (x_i − x̄)(y_i − ȳ) / Σ_{i=1..n} (x_i − x̄)²

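A minimal Python sketch of these closed-form estimators (not from the slides), checked against NumPy's built-in straight-line fit:

```python
import numpy as np

def ols_simple(x, y):
    """Closed-form simple OLS: b1 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2), b0 = ybar - b1*xbar."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xbar, ybar = x.mean(), y.mean()
    b1 = ((x - xbar) * (y - ybar)).sum() / ((x - xbar) ** 2).sum()
    b0 = ybar - b1 * xbar
    return b0, b1

# Quick check against NumPy's own straight-line fit
x = np.arange(20.0)
y = 3.0 + 2.0 * x + np.random.default_rng(1).normal(0.0, 1.0, 20)
print(ols_simple(x, y))
print(np.polyfit(x, y, 1)[::-1])   # same (intercept, slope) pair
```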

OLS Linear Regression

- Expansion to multiple variables is straightforward
- So, for k explanatory variables we need to estimate k slope coefficients plus the intercept

    Y_i = β_0 + β_1 x_{1i} + ... + β_k x_{ki} + ε_i    for i = 1, 2, ..., n

    E(Y | x_1, x_2, ..., x_k) = β_0 + β_1 x_1 + β_2 x_2 + ... + β_k x_k
    StdDev(Y | x_1, x_2, ..., x_k) = σ

    SSE = Σ_{i=1..n} e_i² = Σ_{i=1..n} (y_i − ŷ_i)² = Σ_{i=1..n} (y_i − b_0 − b_1 x_{1i} − ... − b_k x_{ki})²
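In practice the coefficients are found numerically; a minimal sketch (mine, not from the slides) using NumPy's least-squares solver, with a made-up two-variable example:

```python
import numpy as np

def ols_multiple(X, y):
    """Multiple-variable OLS via least squares; returns the coefficient vector [b0, b1, ..., bk]."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])          # prepend a column of ones for the intercept
    b, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return b

# Tiny usage example with two explanatory variables
X = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 1.0], [4.0, 1.0]])
y = np.array([3.1, 5.0, 8.9, 10.8])
print(ols_multiple(X, y))   # approximately [b0, b1, b2]
```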

OLS Example

[Figure: the monthly demand below, plotted from January of the first year through August of the second; the axis runs from about 2,000 to 4,500]

Month   Demand
Jan     3,025
Feb     3,047
Mar     3,079
Apr     3,136
May     3,454
Jun     3,661
Jul     3,554
Aug     3,692
Sep     3,407
Oct     3,410
Nov     3,499
Dec     3,598
Jan     3,596
Feb     3,721
Mar     3,745
Apr     3,650
May     4,157
Jun     4,221
Jul     4,238
Aug     4,008

What do you see?


OLS Example

Establish the relationship:

    F_i = f(X_1i, X_2i, ..., X_ni) = β_0 + β_1 X_1i + β_2 X_2i + ... + β_n X_ni

Here:

    F_i = Level + Trend + Season = β_0 + β_1 X_1i + β_2 X_2i

    where X_1i is the period number and X_2i = 1 if a summer month, 0 otherwise

Month   Demand   Period   Summer
Jan     3,025     1       0
Feb     3,047     2       0
Mar     3,079     3       0
Apr     3,136     4       0
May     3,454     5       1
Jun     3,661     6       1
Jul     3,554     7       1
Aug     3,692     8       1
Sep     3,407     9       0
Oct     3,410    10       0
Nov     3,499    11       0
Dec     3,598    12       0
Jan     3,596    13       0
Feb     3,721    14       0
Mar     3,745    15       0
Apr     3,650    16       0
May     4,157    17       1
Jun     4,221    18       1
Jul     4,238    19       1
Aug     4,008    20       1

Points to consider:

- What if the trend is not linear?
- How do I handle seasonality if it impacts the trend?
- How does OLS treat old versus new data?
- How much information do I need to keep on hand?

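The sketch below (mine, not part of the course materials) fits this level-trend-summer model to the table above with NumPy least squares; the coefficients should come out close to the Excel results on the next slide.

```python
import numpy as np

demand = np.array([3025, 3047, 3079, 3136, 3454, 3661, 3554, 3692, 3407, 3410,
                   3499, 3598, 3596, 3721, 3745, 3650, 4157, 4221, 4238, 4008], dtype=float)
period = np.arange(1, 21, dtype=float)
summer = np.array([0, 0, 0, 0, 1, 1, 1, 1, 0, 0,
                   0, 0, 0, 0, 0, 0, 1, 1, 1, 1], dtype=float)

X = np.column_stack([np.ones(20), period, summer])   # level, trend, summer dummy
b, *_ = np.linalg.lstsq(X, demand, rcond=None)
print(np.round(b, 2))   # expected to be close to the Excel output: [2969.14, 48.03, 303.51]
```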

OLS Example (Excel)

[Figure: actual vs. predicted monthly demand, January (year 1) through August (year 2), roughly 2,900 to 4,300 units]

SUMMARY OUTPUT

Regression Statistics
Multiple R            0.979
R Square              0.958
Adjusted R Square     0.953
Standard Error        79.21
Observations          20

ANOVA
             df    SS             MS             F          Significance F
Regression    2    2442766.966    1221383.483    194.673    1.92E-12
Residual     17     106658.421       6274.025
Total        19    2549425.387

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   2,969.14       37.21            79.79    0.0000    2,890.62    3,047.65
Period         48.03        3.20            15.00    0.0000       41.27       54.79
Summer        303.51       37.70             8.05    0.0000      223.97      383.04

    F_i = 2969 + 48 (Period) + 304 (Summer_Flag)



OLS Example (Excel)

Reading the regression output on the previous slide:

- Coefficient of Determination: R² = 1 − ESS/TSS = RSS/TSS
- Standard Error: an estimate of σ, the spread around the regression line
- Sums of Squares:
  - Regression (RSS) = Σ(ŷ − ȳ)²
  - Error (ESS) = Σ(y − ŷ)²
  - Total (TSS) = Σ(y − ȳ)²
- Degrees of Freedom = n − k − 1
- Regression Coefficients, each with its standard error (s_bm)
- t-Statistic (b_m / s_bm): is b_m different from 0? The P-value tells you the confidence level.
- 95% Confidence Intervals for each coefficient (Lower 95% / Upper 95%)
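The quantities above can be computed directly; here is a compact Python sketch (mine, not from the slides). Applied to the 20-month example it should reproduce R² ≈ 0.958, a standard error of about 79.2, and the t-statistics shown in the Excel output.

```python
import numpy as np

def ols_summary(X, y):
    """Coefficients, R^2, standard error of the regression, and coefficient t-statistics for an OLS fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])            # intercept plus k explanatory variables
    n, p = A.shape                                        # p = k + 1
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_hat = A @ b
    ess = ((y - y_hat) ** 2).sum()                        # error (residual) sum of squares
    tss = ((y - y.mean()) ** 2).sum()                     # total sum of squares
    r2 = 1.0 - ess / tss                                  # coefficient of determination
    s = np.sqrt(ess / (n - p))                            # std error of regression, df = n - k - 1
    se_b = s * np.sqrt(np.diag(np.linalg.inv(A.T @ A)))   # standard errors of the coefficients
    return b, r2, s, b / se_b                             # t-statistic: is b_m different from 0?
```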

Coefficient of Determination (R²)

- Measures the goodness of fit of the model
- Captures the amount of variation that the model explains
  - R² = 1 − ESS/TSS = RSS/TSS
  - TSS = ESS + RSS
  - Variation of the observed around the mean = variation of the observed around the estimated + variation of the estimated around the mean
- Generally, a higher R² is better, but . . .
  - The model needs to make sense
  - A high R² does not indicate causality
  - What is "good enough" really depends on how the model is being used
  - The individual coefficients need to be tested


Discrete Choice Models

What if you are predicting demand for one product over another?

- Model Selections (Blue vs. Red Cars)
- Mode Forecasting (pick one of many)

OLS fits a straight line:

    y = b_0 + b_1 X

A logistic (logit) model keeps the predicted probability between 0 and 1:

    Y_i = 1 / (1 + e^(−βX_i))

[Figure: the straight-line OLS fit and the S-shaped logistic curve, each plotted against X]
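A small Python sketch of the logistic response (my illustration; the β values are arbitrary, and estimating β from data, e.g. by maximum likelihood, is not shown):

```python
import numpy as np

def logistic(x, beta0=0.0, beta1=1.0):
    """Logistic response P(Y = 1 | x) = 1 / (1 + exp(-(beta0 + beta1 * x))), always between 0 and 1,
    unlike the OLS line b0 + b1 * x, which can predict values outside [0, 1]."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))

x = np.linspace(-6, 6, 7)
print(np.round(logistic(x), 3))   # S-shaped probabilities rising from near 0 to near 1
```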

Sales Forecasting Methods

Forecasting methods (Source: Dalrymple, 1987; survey of 134 companies):

Expert Opinions
  Sales Force        44.8%
  Executives         37.3%
  Industry Surveys   14.9%

Statistical Models
  Naïve Model        30.6%
  Moving Average     20.9%
  Exp. Smoothing     11.2%
  Regression          6.0%
  Box-Jenkins         3.7%

Sales forecast errors (MAPE) by forecast horizon in years (Source: Mentzer & Cox, 1984):

Level            <.25 yrs   .25–2 yrs   >2 yrs
Industry         11         15          —
Corporate        11         18          —
Product Group    10         15          20
Product Line     11         16          20
Product          16         21          26


Misc. Forecasting Issues

Data Issues:

- Sales data is not demand data
  - Transactions can aggregate and skew actual demand
  - Ordering quantities can dictate sourcing
  - Historical data might not exist
- Demand visibility can be skewed by the level of the echelon
  - Bullwhip effect
  - Collaborative Planning, Forecasting, and Replenishment (CPFR)

- Forecasting vs. Inventory Management
- Statistical Validity vs. Use and Cost of Model
- Demand is not always exogenous

Questions, Comments,
Suggestions?
