- Lao Tzu
"I think there is a world market for maybe 5 computers."
- Thomas Watson, Chairman of IBM, 1943
Qualitative Techniques
Expert opinion (or astrologers).
Quantitative Techniques
Time series techniques, such as exponential smoothing.
Causal models: use information about relationships between system elements (e.g., regression).
Why Time Series Analysis?
Components of a time series: Trend, Cyclical, Seasonal, Irregular.
Trend Component
The long-term upward or downward movement in the data.
When a cyclical pattern in the data has a period of less than one year, it is referred to as seasonal variation. When the cyclical pattern has a period of more than one year, we refer to it as cyclical variation.
Irregular Component
Erratic fluctuations (white noise).
Predictive Analytics: QM901.1x
Prof U Dinesh Kumar, IIMB
[Figure: demand vs. time for four patterns — (a) Trend, (b) Cycle (more than one year between peaks, with random movement), (c) Seasonal pattern (less than one year between peaks), (d) Trend with seasonal pattern.]
Time Series Data: Additive and Multiplicative Models

1. Additive Forecasting Model
The series is the sum of trend, seasonality, cyclical and random components:
Y_t = T_t + S_t + C_t + R_t

2. Multiplicative Forecasting Model
The series is the product of trend, seasonality, cyclical and random components:
Y_t = T_t × S_t × C_t × R_t
Time Series Data: Decomposition

Multiplicative Forecasting Model (trend and seasonality only):
Y_t = T_t × S_t
Y_t / S_t = T_t; the series Y_t / S_t is called the deseasonalized data.
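A minimal sketch of deseasonalization under the multiplicative model (function name and data are illustrative, not from the course):

```python
def deseasonalize(y, seasonal_index, period):
    """Under the multiplicative model Y_t = T_t * S_t, dividing each
    observation by its seasonal index leaves the deseasonalized series.
    seasonal_index: one factor per position in the cycle (mean 1)."""
    return [y[t] / seasonal_index[t % period] for t in range(len(y))]
```

For example, with period 2 and seasonal factors 4/3 and 2/3, the series [20, 10, 40, 20] deseasonalizes to the smooth trend [15, 15, 30, 30].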
Time Series Techniques
Moving Average.
Exponential Smoothing.

Moving Average
The forecast for period t+1 (F_{t+1}) is given by the average of the n most recent observations:

F_{t+1} = (1/n) · Σ_{i=t−n+1}^{t} Y_i

For example, with a 7-period moving average:

F_{t+1} = (1/7) · Σ_{i=t−7+1}^{t} Y_i
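The moving-average forecast can be sketched in a few lines of Python (the demand numbers below are made up for illustration):

```python
def moving_average_forecast(y, n):
    """Forecast for the next period: average of the n most recent observations."""
    if len(y) < n:
        raise ValueError("need at least n observations")
    return sum(y[-n:]) / n

# Hypothetical demand series for illustration only.
demand = [42, 45, 49, 44, 47, 50, 48]
forecast = moving_average_forecast(demand, n=3)  # (47 + 50 + 48) / 3
```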
Measures of aggregate error

Mean absolute error: MAE = (1/n) Σ_{t=1}^{n} |E_t|
Mean squared error: MSE = (1/n) Σ_{t=1}^{n} E_t²
Root mean squared error: RMSE = sqrt( (1/n) Σ_{t=1}^{n} E_t² )

where E_t = Y_t − F_t is the forecast error in period t.
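As a sketch, the three measures can be computed directly from paired actuals and forecasts (function name is illustrative):

```python
import math

def error_measures(actual, forecast):
    """Return (MAE, MSE, RMSE) with forecast error E_t = Y_t - F_t."""
    errors = [y - f for y, f in zip(actual, forecast)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    return mae, mse, rmse
```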
DAD Forecasting: MAPE and RMSE
Smoothing Equations
F_{t+1} = α·Y_t + (1 − α)·F_t, with F_1 = Y_1.

Expanding the recursion:
F_t = α·Y_{t−1} + α(1 − α)·Y_{t−2} + α(1 − α)²·Y_{t−3} + ...
Exponential Smoothing Forecast
[Figure: demand and exponential smoothing forecast, 18-09-2014 to 05-02-2015; demand axis 0–60.]
F_{t+1} = 0.8098·Y_t + (1 − 0.8098)·F_t

DAD Forecasting: MAPE and RMSE
Exponential smoothing with α = 0.8098.
Choice of α
Choose α to minimize the mean squared error:

Min_α Σ_{t=1}^{n} (Y_t − F_t)² / n

where F_t = α·Y_{t−1} + (1 − α)·F_{t−1}.
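A minimal sketch of simple exponential smoothing with a grid search over α (the grid-search helper is an illustration of the minimization above, not a method named in the course):

```python
def ses_forecasts(y, alpha):
    """One-step-ahead forecasts with F_1 = Y_1 and
    F_{t+1} = alpha*Y_t + (1-alpha)*F_t."""
    f = [y[0]]
    for t in range(1, len(y)):
        f.append(alpha * y[t - 1] + (1 - alpha) * f[t - 1])
    return f

def best_alpha(y, grid):
    """Pick the alpha in `grid` that minimizes mean squared error."""
    def mse(a):
        f = ses_forecasts(y, a)
        return sum((yt - ft) ** 2 for yt, ft in zip(y, f)) / len(y)
    return min(grid, key=mse)
```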
Double Exponential Smoothing: Holt's Model
Holt's method can be used to forecast when there is a linear trend present in the data. The method requires separate smoothing constants for the slope and the intercept.

Holt's Equations
Equation for intercept (level): L_t = α·Y_t + (1 − α)(L_{t−1} + T_{t−1})
Equation for slope (trend): T_t = β(L_t − L_{t−1}) + (1 − β)·T_{t−1}
Forecast: F_{t+k} = L_t + k·T_t

T_1 can be set to any one of the following values (or use regression to get initial values):
T_1 = Y_2 − Y_1
T_1 = [(Y_2 − Y_1) + (Y_3 − Y_2) + (Y_4 − Y_3)] / 3
T_1 = (Y_n − Y_1) / (n − 1)
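Holt's recursions can be sketched as follows (a minimal illustration, assuming the initialization L_1 = Y_1 and T_1 = Y_2 − Y_1 from the slide):

```python
def holt_forecast(y, alpha, beta, horizon=1):
    """Holt's double exponential smoothing.
    Level L_1 = Y_1, trend T_1 = Y_2 - Y_1; forecasts L_n + k*T_n."""
    level, trend = y[0], y[1] - y[0]
    for t in range(1, len(y)):
        prev = level
        level = alpha * y[t] + (1 - alpha) * (prev + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
    return [level + k * trend for k in range(1, horizon + 1)]
```

On perfectly linear data the method recovers the line exactly: for [1, 2, 3, 4, 5] it forecasts 6, 7, ... regardless of α and β.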
[Figure: demand and Holt forecast over roughly 140 periods; demand axis 0–60.]
Demand forecast with α = 0.8098; β = 0.05.
Forecasting Accuracy
The forecast error is the difference between the actual value and the forecast value for the corresponding period:

E_t = Y_t − F_t

E_t = forecast error at period t
Y_t = actual value at time period t
F_t = forecast for time period t
Measures of aggregate error

Mean absolute percentage error: MAPE = (1/n) Σ_{t=1}^{n} |E_t / Y_t|
Mean squared error: MSE = (1/n) Σ_{t=1}^{n} E_t²
U = sqrt( Σ_{t=1}^{n−1} (Y_{t+1} − F_{t+1})² / Σ_{t=1}^{n−1} (Y_{t+1} − Y_t)² )

The value U is the relative forecasting power of the method against the naïve technique. If U < 1, the technique is better than the naïve forecast; if U > 1, the technique is no better than the naïve forecast.
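Theil's U can be sketched as a ratio of sums of squared errors (a minimal illustration; the function name is an assumption):

```python
import math

def theils_u(actual, forecast):
    """Theil's U: error of the method relative to the naive forecast
    F_{t+1} = Y_t, compared from the second period onward."""
    num = sum((actual[t] - forecast[t]) ** 2 for t in range(1, len(actual)))
    den = sum((actual[t] - actual[t - 1]) ** 2 for t in range(1, len(actual)))
    return math.sqrt(num / den)
```

As a sanity check, feeding in the naïve forecast itself (each forecast equal to the previous actual) gives U = 1 exactly.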
Theil's coefficient for DAD Hospital
[Table: U value for each forecasting method.]

U1 = sqrt( Σ_{t=1}^{n} (Y_t − F_t)² ) / ( sqrt( Σ_{t=1}^{n} Y_t² ) + sqrt( Σ_{t=1}^{n} F_t² ) )

U2 = sqrt( Σ_{t=1}^{n−1} ((F_{t+1} − Y_{t+1}) / Y_t)² / Σ_{t=1}^{n−1} ((Y_{t+1} − Y_t) / Y_t)² )

U1 is bounded between 0 and 1, with values closer to zero indicating greater accuracy.
Seasonal effect
A seasonal effect has a time period of less than one year, such as days, weeks, months or quarters.

Seasonal Adjusting
The seasonal effect can be eliminated from the time series. This process is called deseasonalization or seasonal adjusting.

The seasonal index for day i (or month i) is the ratio of the daily average of day i (or month i) to the average of the daily (or monthly) averages, times 100.
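The seasonal-index definition above can be sketched directly (a minimal illustration; the input layout — one row per complete cycle — is an assumption):

```python
def seasonal_indices(rows):
    """Seasonal index per position in the cycle:
    (average for that position / average of the averages) * 100.
    `rows` is a list of complete cycles, e.g. weeks of daily demand."""
    pos_avgs = [sum(col) / len(col) for col in zip(*rows)]
    grand_avg = sum(pos_avgs) / len(pos_avgs)
    return [100 * a / grand_avg for a in pos_avgs]
```

For two weeks of two-day demand [[10, 20], [30, 40]], the position averages are 20 and 30 against a grand average of 25, giving indices 80 and 120.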
Example: DAD Hospital - Demand for Continental Breakfast
5 October - 1 November 2014 data

Regression Statistics:
  Multiple R           0.268288
  R Square             0.071978
  Adjusted R Square    0.036285
  Standard Error       3.715395
  Observations         28

ANOVA:
  Source       df    SS         MS         F          Significance F
  Regression    1    27.83729   27.83729   2.016587   0.167469
  Residual     26   358.9082    13.80416
  Total        27   386.7455
Ft = Tt x St
Y29 = 46
Forecasting using the method of averages in the presence of seasonality

Auto-correlation
Autocorrelation is the correlation of a variable with itself observed at two time points (e.g., Y_t and Y_{t−1}, or Y_t and Y_{t−3}).
For any lag k, reject H0 if |r_k| > 1.96/√n, where n is the number of observations.
PACF: Demand for Continental Breakfast at DAD hospital

Stationarity
[Figure: examples of non-stationary and stationary series; white noise.]

Auto-regressive models: we use Y_t as the response variable and Y_{t−1}, Y_{t−2}, etc. as explanatory variables.
AR(1) Parameter Estimation
Y_t = β·Y_{t−1} + ε_t

Minimize the sum of squared errors:
Σ_{t=2}^{n} ε_t² = Σ_{t=2}^{n} (Y_t − β·Y_{t−1})²

OLS estimate:
β̂ = Σ_{t=2}^{n} Y_t·Y_{t−1} / Σ_{t=2}^{n} Y_{t−1}²
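The OLS estimate for the intercept-free AR(1) model can be sketched in one short function (illustrative name; the example series is constructed, not course data):

```python
def ar1_ols(y):
    """OLS estimate of beta in Y_t = beta * Y_{t-1} + eps_t (no intercept):
    beta_hat = sum(Y_t * Y_{t-1}) / sum(Y_{t-1}^2), for t = 2..n."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den
```

On a noiseless series that halves each period (true β = 0.5), the estimator recovers 0.5 exactly.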
AR(1) Process
Y_t = β·Y_{t−1} + ε_t
Y_t = β(β·Y_{t−2} + ε_{t−1}) + ε_t

[PACF plot: the first two partial autocorrelations are different from zero.]
ACF Function: DAD Continental Breakfast
Actual vs. Fit: Continental Breakfast Data

AR(2) Model
MAPE = 8.8%
Residual: White Noise

(Y_t − 41.252) = 0.663·(Y_{t−1} − 41.252) + 0.204·(Y_{t−2} − 41.252)

where Y_t = X_t − μ is the mean-centered series.
Moving Average Process
A moving average process is a time series regression model in which the value at time t, Y_t, is a linear function of past errors.

MA(1): Y_t = β_0 + β_1·ε_{t−1} + ε_t

Moving Average Process MA(q):
Y_t = β_0 + β_1·ε_{t−1} + β_2·ε_{t−2} + ... + β_q·ε_{t−q} + ε_t

ACF and PACF patterns:
MA(1): ACF shows a spike at lag 1, then cuts off to zero; the spike is positive if β_1 > 0 and negative if β_1 < 0. PACF shows exponential decay, on the negative side if β_1 > 0 and on the positive side if β_1 < 0.
MA(q): ACF shows spikes at lags 1 to q, then cuts off to zero. PACF shows exponential decay or a sine wave; the exact pattern depends on the signs of β_1, β_2, etc.
ACF Function: DAD Continental Breakfast
SPSS Output
Actual Vs Fitted Value
Residual Plot
ARMA(p,q) Model
An ARMA(p,q) model combines an AR(p) model and an MA(q) model.

ARMA(p,q) Model Identification
ARMA(p,q) models are not easy to identify. We usually start with a pure AR or MA process. The following rule of thumb may be used.
MAPE = 8.8%
Residual: White Noise
Forecasting Model Evaluation

Akaike's Information Criterion
AIC = −2LL + 2m
where m is the number of parameters estimated in the model.

Bayesian Information Criterion
BIC = −2LL + m·ln(n)
where m is the number of parameters estimated in the model and n is the number of observations.

AIC and BIC can be interpreted as distance from the true model.
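Both criteria are one-line functions of the log-likelihood (a minimal sketch; LL here is whatever log-likelihood the fitted model reports):

```python
import math

def aic(log_likelihood, m):
    """AIC = -2LL + 2m, where m = number of estimated parameters."""
    return -2 * log_likelihood + 2 * m

def bic(log_likelihood, m, n):
    """BIC = -2LL + m*ln(n), where n = number of observations."""
    return -2 * log_likelihood + m * math.log(n)
```

For a fixed fit, BIC penalizes extra parameters more heavily than AIC once n exceeds about 7 observations (ln n > 2).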
Final Model Selection
X_t = α_0 + α_1·X_{t−1} + α_2·X_{t−2} + ... + α_p·X_{t−p} + β_1·ε_{t−1} + β_2·ε_{t−2} + ... + β_q·ε_{t−q} + ε_t
where X_t = Y_t − Y_{t−1}
Ljung-Box Test
Ljung-Box is a test on the autocorrelations of residuals. The test statistic is:

Q_m = n(n + 2) Σ_{k=1}^{m} r_k² / (n − k)

n = number of observations in the time series
k = particular time lag checked
m = number of time lags to be tested
r_k = sample autocorrelation function of the residuals at lag k
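The statistic above can be sketched directly from a residual series (a minimal illustration; in practice Q_m is compared to a chi-square critical value):

```python
def sample_acf(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return ck / c0

def ljung_box_q(resid, m):
    """Q_m = n(n+2) * sum_{k=1..m} r_k^2 / (n - k)."""
    n = len(resid)
    return n * (n + 2) * sum(sample_acf(resid, k) ** 2 / (n - k)
                             for k in range(1, m + 1))
```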
Identification: Identify the ARIMA model using the ACF and PACF plots. This gives the values of p, d, and q.
Diagnostics: Check the residuals for any remaining structure; a well-specified model leaves residuals that are white noise.