
Demand Forecasting II

Causal Analysis

Chris Caplice
ESD.260/15.770/1.260 Logistics Systems
Sept 2006

Agenda

- Forecasting Evaluation
- Use of Causal Models in Forecasting
- Approach and Methods
  - Ordinary Least Squares (OLS) Regression
  - Other Approaches
- Closing Comments on Forecasting


Forecast Evaluation

How do we determine what is a good forecast?

- Accuracy: closeness to actual observations
- Bias: persistent tendency to over- or under-predict
- Fit versus Forecast: tradeoff between how well the model fits past observations and how useful it is for predicting future ones
- Forecast Optimality: the forecast error matches the underlying random noise distribution

A good forecast is a combination of art and science:

- Science: statistically find a valid model
- Art: find a model that makes sense


Accuracy and Bias Measures

1. Forecast Error:               e_t = x_t − x̂_t
2. Mean Deviation:               MD = (1/n) Σ_{t=1..n} e_t
3. Mean Absolute Deviation:      MAD = (1/n) Σ_{t=1..n} |e_t|
4. Mean Squared Error:           MSE = (1/n) Σ_{t=1..n} e_t²
5. Root Mean Squared Error:      RMSE = √[ (1/n) Σ_{t=1..n} e_t² ]
6. Mean Percent Error:           MPE = (1/n) Σ_{t=1..n} (e_t / D_t)
7. Mean Absolute Percent Error:  MAPE = (1/n) Σ_{t=1..n} |e_t / D_t|

where x_t (= D_t) is the actual demand in period t, x̂_t is the forecast for period t, and n is the number of periods evaluated.

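To keep these definitions concrete, here is a minimal Python/NumPy sketch (mine, not from the slides; the helper name accuracy_measures and the sample numbers are made up) that computes the measures for a pair of actual/forecast series.

```python
import numpy as np

def accuracy_measures(actual, forecast):
    """Error measures from the list above (illustrative helper, not from the slides)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    e = actual - forecast                      # forecast error e_t = x_t - x_hat_t
    return {
        "MD":   e.mean(),                      # mean deviation (signed, so it reveals bias)
        "MAD":  np.abs(e).mean(),              # mean absolute deviation
        "MSE":  (e ** 2).mean(),               # mean squared error
        "RMSE": np.sqrt((e ** 2).mean()),      # root mean squared error
        "MPE":  (e / actual).mean(),           # mean percent error
        "MAPE": np.abs(e / actual).mean(),     # mean absolute percent error
    }

# Example with made-up numbers
print(accuracy_measures([112, 110, 113, 111], [111.5, 110.8, 112.2, 111.3]))
```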

Moving Average Forecasts

[Figure: actual demand (ActDemand) versus MA3, MA10, and MA20 forecasts over roughly 120 periods; demand fluctuates between about 108 and 116 units]

Error measures for the three moving-average forecasts:

         MA3      MA10     MA20
MD       0.05     0.21     0.35
MAD      0.56     1.07     1.41
MSE      0.47     1.67     2.71
RMSE     0.68     1.29     1.65
MAPE     0.50%    0.96%    1.27%

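For readers who want to reproduce this kind of comparison, below is a short Python sketch (not course material) that generates one-step-ahead moving-average forecasts and reports the same error measures; the demand series is simulated, so the numbers will not match the table above.

```python
import numpy as np

def moving_average_forecast(demand, window):
    """One-step-ahead MA(window) forecast: the average of the previous `window` observations."""
    demand = np.asarray(demand, dtype=float)
    f = np.full(len(demand), np.nan)
    for t in range(window, len(demand)):
        f[t] = demand[t - window:t].mean()
    return f

# Simulated level-only demand (hypothetical; not the series behind the table above)
rng = np.random.default_rng(42)
demand = 112 + rng.normal(0.0, 1.0, 120)

for w in (3, 10, 20):
    f = moving_average_forecast(demand, w)
    e = demand[w:] - f[w:]
    print(f"MA{w:>2}: MD={e.mean():+.2f}  MAD={np.abs(e).mean():.2f}  "
          f"RMSE={np.sqrt((e ** 2).mean()):.2f}  MAPE={np.abs(e / demand[w:]).mean():.2%}")
```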

Analysis of the Forecast

Are the forecast errors ~N(0, Var(e))?

- What is the expected value of the errors?
- What is the variance of the errors?

For moving averages, from the actual observations: are the observed errors ~N(0, Var(e))?

For the MA3 data:

- mean error ē = 0.05
- standard deviation of the errors σ_e = 0.69
- D = 1.478

Testing for normality: Chi-Square, Kolmogorov-Smirnov, or other tests.

[Figures: a histogram of the MA3 forecast errors (frequencies up to about 25 per bin, bins spanning roughly −1.5 to +1.3) and a time-series plot of the errors, ranging from about −2.00 to +2.00 over the ~120 periods]

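One way to run the normality check in practice is sketched below in Python with SciPy (my illustration, not from the slides); the error sample is simulated using the MA3 summary statistics quoted above.

```python
import numpy as np
from scipy import stats

# Simulated stand-in for the MA3 forecast errors (the slide reports e_bar = 0.05, sigma_e = 0.69)
rng = np.random.default_rng(0)
errors = rng.normal(0.05, 0.69, 100)

mean_e = errors.mean()
std_e = errors.std(ddof=1)

# Kolmogorov-Smirnov test of the errors against N(mean_e, std_e)
ks_stat, ks_p = stats.kstest(errors, "norm", args=(mean_e, std_e))

# An alternative omnibus normality test (D'Agostino-Pearson)
k2_stat, k2_p = stats.normaltest(errors)

print(f"mean={mean_e:.3f}  std={std_e:.3f}")
print(f"KS: D={ks_stat:.3f}, p={ks_p:.3f}   normaltest: p={k2_p:.3f}")
```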

Corrective Actions to Forecasts

Measures of Bias:

- Cumulative Sum of Errors (C_t)
  - Normalize by dividing by the RMSE (U_t)
  - U_t should be ~0 if the forecast is unbiased
- Smoothed Error Tracking Signal (T_t)
  - T_t = z_t / MAD_t
  - where z_t = βe_t + (1 − β)z_{t−1} and β is a smoothing constant
- Autocorrelation of forecast errors
  - Correlation between successive observations

Corrective Actions:

- Adaptive Forecasting
  - Methods where the smoothing coefficients change over time
  - Found (generally) to be no better than standard methods
- Human Intervention
  - Overrule the model's output and look for the reason
  - Rules of thumb: |T_t| > f or |C_t| > k·RMSE (f ≈ 0.4 and k ≈ 4)
  - Lower values (of k or f) lead to more intervention

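A minimal Python sketch of the two tracking signals and the intervention rule of thumb follows (my illustration, not course code; the error stream, the starting MAD, and the smoothing constant β = 0.1 are assumed values).

```python
import numpy as np

def tracking_signals(errors, beta=0.1, mad0=1.0):
    """Cumulative-sum (C_t) and smoothed-error (T_t) tracking signals for a stream of forecast errors.
    beta is a smoothing constant; mad0 is an assumed starting value for the smoothed MAD."""
    errors = np.asarray(errors, dtype=float)
    C = np.cumsum(errors)                        # cumulative sum of errors, C_t
    T = np.empty(len(errors))
    z, mad = 0.0, mad0
    for t, e in enumerate(errors):
        z = beta * e + (1 - beta) * z            # smoothed error z_t
        mad = beta * abs(e) + (1 - beta) * mad   # smoothed MAD_t
        T[t] = z / mad                           # tracking signal T_t = z_t / MAD_t
    return C, T

# Flag periods where the rules of thumb suggest intervening (f ~ 0.4, k ~ 4)
errors = np.array([0.3, -0.2, 0.5, 1.1, 0.9, 1.4, 1.2, 1.6])   # hypothetical error stream
C, T = tracking_signals(errors)
rmse = np.sqrt((errors ** 2).mean())
flag = (np.abs(T) > 0.4) | (np.abs(C) > 4 * rmse)
print(np.round(T, 2), flag)
```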

Causal Forecasting Models

- Assumes that demand is highly correlated with some environmental factors
- A model is built to relate the independent exogenous factors to the demand
- Examples:
  - Diapers ~ f(birth rates lagged by 1 year)
  - NFL Jerseys ~ f(team and individual performance)
  - New products ~ f(product lifecycle)
  - Promotional Items ~ f(marketing promotions & ads)
  - Regional Sales ~ f(household demographics in area)
  - Umbrellas / Fuel ~ f(weather, temperature, rain, etc.)
- The form of the dependent variable dictates the method used:
  - Continuous: takes any value
  - Discrete: takes only integer values
  - Binary: equal to 0 or 1


OLS Linear Regression

- The relationship is described in terms of a linear model
- The data (x_i, y_i) are the observed pairs from which we try to estimate the coefficients to find the best fit
- The error term, ε, is the unaccounted-for or unexplained portion
- The error terms are assumed to be i.i.d. ~N(0, σ) and capture all of the factors ignored or neglected by the model

    E(Y | x) = β_0 + β_1 x
    StdDev(Y | x) = σ

    Observed:  Y_i = β_0 + β_1 x_i + ε_i    for i = 1, 2, ..., n
    Unknown:   β_0, β_1, σ


OLS Linear Regression

Residuals:

- Predicted (estimated) values ŷ_i are found by using the regression coefficients, b
- Residuals, e_i, are the differences between the actual and predicted values
- Find the b's that minimize the residuals

    ŷ_i = b_0 + b_1 x_i                        for i = 1, 2, ..., n
    e_i = y_i − ŷ_i = y_i − (b_0 + b_1 x_i)    for i = 1, 2, ..., n

How should we measure the residuals?

- Minimize the sum of errors: shows bias, but not accuracy (positive and negative errors cancel)
- Minimize the sum of absolute errors: accurate and shows bias, but analytically intractable
- Minimize the sum of squared errors: shows bias, is accurate, and is tractable

The best model minimizes the residual sum of squares:

    SSE = Σ_{i=1..n} e_i² = Σ_{i=1..n} (y_i − ŷ_i)² = Σ_{i=1..n} (y_i − b_0 − b_1 x_i)²


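As a small illustration (mine, with assumed names and made-up data), the helper below scores a candidate line under the three criteria just listed:

```python
import numpy as np

def residual_criteria(x, y, b0, b1):
    """Score a candidate line y_hat = b0 + b1*x under the three criteria above (illustrative helper)."""
    e = np.asarray(y, dtype=float) - (b0 + b1 * np.asarray(x, dtype=float))
    return {
        "sum_errors":     e.sum(),          # positive and negative errors cancel: reveals bias only
        "sum_abs_errors": np.abs(e).sum(),  # accurate, but hard to minimize analytically
        "sum_sq_errors":  (e ** 2).sum(),   # SSE: the criterion OLS minimizes
    }

print(residual_criteria([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8], b0=0.0, b1=2.0))
```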

OLS Linear Regression

We can find the optimal values of b_0 and b_1 by taking the first-order conditions of the SSE:

    SSE = Σ_{i=1..n} e_i² = Σ_{i=1..n} (y_i − ŷ_i)² = Σ_{i=1..n} (y_i − b_0 − b_1 x_i)²

This gives us the following coefficients:

    b_0 = ȳ − b_1 x̄

    b_1 = Σ_{i=1..n} (x_i − x̄)(y_i − ȳ) / Σ_{i=1..n} (x_i − x̄)²

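A minimal Python sketch of these closed-form estimators (not from the slides), checked against NumPy's built-in straight-line fit:

```python
import numpy as np

def ols_simple(x, y):
    """Closed-form simple OLS: b1 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2), b0 = ybar - b1*xbar."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xbar, ybar = x.mean(), y.mean()
    b1 = ((x - xbar) * (y - ybar)).sum() / ((x - xbar) ** 2).sum()
    b0 = ybar - b1 * xbar
    return b0, b1

# Quick check against NumPy's own straight-line fit
x = np.arange(20.0)
y = 3.0 + 2.0 * x + np.random.default_rng(1).normal(0.0, 1.0, 20)
print(ols_simple(x, y))
print(np.polyfit(x, y, 1)[::-1])   # same (intercept, slope) pair
```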

OLS Linear Regression

- Expansion to multiple variables is straightforward
- So, for k explanatory variables we need to estimate k slope coefficients plus the intercept

    Y_i = β_0 + β_1 x_{1i} + ... + β_k x_{ki} + ε_i    for i = 1, 2, ..., n

    E(Y | x_1, x_2, ..., x_k) = β_0 + β_1 x_1 + β_2 x_2 + ... + β_k x_k
    StdDev(Y | x_1, x_2, ..., x_k) = σ

    SSE = Σ_{i=1..n} e_i² = Σ_{i=1..n} (y_i − ŷ_i)² = Σ_{i=1..n} (y_i − b_0 − b_1 x_{1i} − ... − b_k x_{ki})²
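In practice the coefficients are found numerically; a minimal sketch (mine, not from the slides) using NumPy's least-squares solver, with a made-up two-variable example:

```python
import numpy as np

def ols_multiple(X, y):
    """Multiple-variable OLS via least squares; returns the coefficient vector [b0, b1, ..., bk]."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])          # prepend a column of ones for the intercept
    b, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return b

# Tiny usage example with two explanatory variables
X = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 1.0], [4.0, 1.0]])
y = np.array([3.1, 5.0, 8.9, 10.8])
print(ols_multiple(X, y))   # approximately [b0, b1, b2]
```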

OLS Example

[Figure: the monthly demand below, plotted from January of the first year through August of the second; the axis runs from about 2,000 to 4,500]

Month   Demand
Jan     3,025
Feb     3,047
Mar     3,079
Apr     3,136
May     3,454
Jun     3,661
Jul     3,554
Aug     3,692
Sep     3,407
Oct     3,410
Nov     3,499
Dec     3,598
Jan     3,596
Feb     3,721
Mar     3,745
Apr     3,650
May     4,157
Jun     4,221
Jul     4,238
Aug     4,008

What do you see?


OLS Example

Establish the relationship:

    F_i = f(X_1i, X_2i, ..., X_ni) = β_0 + β_1 X_1i + β_2 X_2i + ... + β_n X_ni

Here:

    F_i = Level + Trend + Season = β_0 + β_1 X_1i + β_2 X_2i

    where X_1i is the period number and X_2i = 1 if a summer month, 0 otherwise

Month   Demand   Period   Summer
Jan     3,025     1       0
Feb     3,047     2       0
Mar     3,079     3       0
Apr     3,136     4       0
May     3,454     5       1
Jun     3,661     6       1
Jul     3,554     7       1
Aug     3,692     8       1
Sep     3,407     9       0
Oct     3,410    10       0
Nov     3,499    11       0
Dec     3,598    12       0
Jan     3,596    13       0
Feb     3,721    14       0
Mar     3,745    15       0
Apr     3,650    16       0
May     4,157    17       1
Jun     4,221    18       1
Jul     4,238    19       1
Aug     4,008    20       1

Points to consider:

- What if the trend is not linear?
- How do I handle seasonality if it impacts the trend?
- How does OLS treat old versus new data?
- How much information do I need to keep on hand?

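The sketch below (mine, not part of the course materials) fits this level-trend-summer model to the table above with NumPy least squares; the coefficients should come out close to the Excel results on the next slide.

```python
import numpy as np

demand = np.array([3025, 3047, 3079, 3136, 3454, 3661, 3554, 3692, 3407, 3410,
                   3499, 3598, 3596, 3721, 3745, 3650, 4157, 4221, 4238, 4008], dtype=float)
period = np.arange(1, 21, dtype=float)
summer = np.array([0, 0, 0, 0, 1, 1, 1, 1, 0, 0,
                   0, 0, 0, 0, 0, 0, 1, 1, 1, 1], dtype=float)

X = np.column_stack([np.ones(20), period, summer])   # level, trend, summer dummy
b, *_ = np.linalg.lstsq(X, demand, rcond=None)
print(np.round(b, 2))   # expected to be close to the Excel output: [2969.14, 48.03, 303.51]
```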

OLS Example (Excel)

[Figure: actual vs. predicted monthly demand, January (year 1) through August (year 2), roughly 2,900 to 4,300 units]

SUMMARY OUTPUT

Regression Statistics
Multiple R            0.979
R Square              0.958
Adjusted R Square     0.953
Standard Error        79.21
Observations          20

ANOVA
             df    SS             MS             F          Significance F
Regression    2    2442766.966    1221383.483    194.673    1.92E-12
Residual     17     106658.421       6274.025
Total        19    2549425.387

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   2,969.14       37.21            79.79    0.0000    2,890.62    3,047.65
Period         48.03        3.20            15.00    0.0000       41.27       54.79
Summer        303.51       37.70             8.05    0.0000      223.97      383.04

    F_i = 2969 + 48 (Period) + 304 (Summer_Flag)



OLS Example (Excel)

Reading the regression output on the previous slide:

- Coefficient of Determination: R² = 1 − ESS/TSS = RSS/TSS
- Standard Error: an estimate of σ, the spread around the regression line
- Sums of Squares:
  - Regression (RSS) = Σ(ŷ − ȳ)²
  - Error (ESS) = Σ(y − ŷ)²
  - Total (TSS) = Σ(y − ȳ)²
- Degrees of Freedom = n − k − 1
- Regression Coefficients, each with its standard error (s_bm)
- t-Statistic (b_m / s_bm): is b_m different from 0? The P-value tells you the confidence level.
- 95% Confidence Intervals for each coefficient (Lower 95% / Upper 95%)
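The quantities above can be computed directly; here is a compact Python sketch (mine, not from the slides). Applied to the 20-month example it should reproduce R² ≈ 0.958, a standard error of about 79.2, and the t-statistics shown in the Excel output.

```python
import numpy as np

def ols_summary(X, y):
    """Coefficients, R^2, standard error of the regression, and coefficient t-statistics for an OLS fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])            # intercept plus k explanatory variables
    n, p = A.shape                                        # p = k + 1
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_hat = A @ b
    ess = ((y - y_hat) ** 2).sum()                        # error (residual) sum of squares
    tss = ((y - y.mean()) ** 2).sum()                     # total sum of squares
    r2 = 1.0 - ess / tss                                  # coefficient of determination
    s = np.sqrt(ess / (n - p))                            # std error of regression, df = n - k - 1
    se_b = s * np.sqrt(np.diag(np.linalg.inv(A.T @ A)))   # standard errors of the coefficients
    return b, r2, s, b / se_b                             # t-statistic: is b_m different from 0?
```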

Coefficient of Determination (R²)

- Measures the goodness of fit of the model
- Captures the amount of variation that the model explains
  - R² = 1 − ESS/TSS = RSS/TSS
  - TSS = ESS + RSS
  - Variation of the observed around the mean = variation of the observed around the estimated + variation of the estimated around the mean
- Generally, a higher R² is better, but . . .
  - The model needs to make sense
  - A high R² does not indicate causality
  - What is "good enough" really depends on how the model is being used
  - The individual coefficients need to be tested


Discrete Choice Models

What if you are predicting demand for one product over another?

- Model Selections (Blue vs. Red Cars)
- Mode Forecasting (pick one of many)

OLS fits a straight line:

    y = b_0 + b_1 X

A logistic (logit) model keeps the predicted probability between 0 and 1:

    Y_i = 1 / (1 + e^(−βX_i))

[Figure: the straight-line OLS fit and the S-shaped logistic curve, each plotted against X]
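A small Python sketch of the logistic response (my illustration; the β values are arbitrary, and estimating β from data, e.g. by maximum likelihood, is not shown):

```python
import numpy as np

def logistic(x, beta0=0.0, beta1=1.0):
    """Logistic response P(Y = 1 | x) = 1 / (1 + exp(-(beta0 + beta1 * x))), always between 0 and 1,
    unlike the OLS line b0 + b1 * x, which can predict values outside [0, 1]."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))

x = np.linspace(-6, 6, 7)
print(np.round(logistic(x), 3))   # S-shaped probabilities rising from near 0 to near 1
```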

Sales Forecasting Methods

Forecasting methods (Source: Dalrymple, 1987; survey of 134 companies):

Expert Opinions
  Sales Force        44.8%
  Executives         37.3%
  Industry Surveys   14.9%

Statistical Models
  Naïve Model        30.6%
  Moving Average     20.9%
  Exp. Smoothing     11.2%
  Regression          6.0%
  Box-Jenkins         3.7%

Sales forecast errors (MAPE) by forecast horizon in years (Source: Mentzer & Cox, 1984):

Level            <.25 yrs   .25–2 yrs   >2 yrs
Industry         11         15          —
Corporate        11         18          —
Product Group    10         15          20
Product Line     11         16          20
Product          16         21          26


Misc. Forecasting Issues

Data Issues:

- Sales data is not demand data
  - Transactions can aggregate and skew actual demand
  - Ordering quantities can dictate sourcing
  - Historical data might not exist
- Demand visibility can be skewed by the level of the echelon
  - Bullwhip effect
  - Collaborative Planning, Forecasting, and Replenishment (CPFR)

- Forecasting vs. Inventory Management
- Statistical Validity vs. Use and Cost of Model
- Demand is not always exogenous

Questions, Comments,
Suggestions?
