Intro to Time Series Forecasting (in Excel)
Objectives
Often the task is to predict the next value in a series of periodic observations of a quantity (ex. demand of a product)
The observations form what is called a time
series
When we rely only on past observations of that quantity to predict future occurrences, the approach is called time series forecasting
This forecasting relies on the premise that
past patterns in the time series will carry on
into the future
The forecasting process
Step 1 Collect numerical data from internal sources (ex. previous sales figures, production / inventory quantities) and external sources (national economic indicators, changes in customer demographics)
Step 2 Generate a forecast based on the
numerical data (numerical forecasting
techniques)
Step 3 Check the accuracy of the
forecasting technique (maintain integrity of
forecasts)
The forecasting process
Step 4 Include qualitative judgements not
represented in the numerical data (adjust the
forecasts with qualitative or deterministic
judgements like regression)
Step 5 Apply the forecast in making decisions (basing decisions only on the forecasted value, and not on its uncertainty, can cause serious problems, particularly if the cost of the actual outcome exceeding the forecast is quite different from the cost of it falling below the forecast)
Approaches to Forecast
Qualitative techniques: used when historical data do not exist or are too expensive; gather relevant information, including the opinions of experts
Causal procedures (ex. regression): attempt to use knowledge about one or more factors (independent variables) to predict the value of another factor (dependent variable)
Time series methods: analysis of historical data about the quantity to be predicted and its subsequent extrapolation into the future; useful when hundreds of thousands of items must be forecast
The problem
A sales manager needs to forecast sales for the
coming month for a particular type of car
It is the first of October and we have the prior nine months of sales:

Month      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct
Car Sales  21   23   21   20   21   19   28   32   26   ?
Basic Approaches for One-period Forecasts - Simple Approaches
All-Period Average: assumes that sales in October will behave like the composite of sales in the prior months; sales from all prior months have the same importance
Prior Period: use the most recent month's sales (September) as the October forecast; we give 100% weight to the prior month's sales
To decide which is the better technique, we ask how each technique would have done if it had been used to forecast each month up to October
Forecasting with average and last period
Month               Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct
Actual Car Sales    21    23    21    20    21    19    28    32    26
All-Period Average  -     21.0  22.0  21.7  21.3  21.2  20.8  21.9  23.1  23.4
Prior Period        -     21.0  23.0  21.0  20.0  21.0  19.0  28.0  32.0  26.0
(Forecasts begin in February, the first month with a prior observation.)
Failure of forecasting
The All-period average can be characterized as quite unresponsive to abrupt fluctuations in sales
The Prior Period forecast is extremely responsive to sales fluctuations
We should look at techniques that are
somewhere in between these two extremes
Moving Average
It is usually reasonable to assume that more
recent information is of more value than older
information
A problem with basing a forecast on only the prior period's sales is that something may cause sales to jump abnormally in one period
A logical compromise is to base the forecast on the average of a certain number of the most recent periods; this is the moving average (ex.: the forecast for Oct can be the mean of the prior 3 months)
Smoothed Average
Sales for the month prior to the month to be forecast are the most relevant, sales for the month prior to that are less relevant, and so on
Each prior month receives comparably less relevance
This decline in significance can be represented by an exponential function; this is why we call this forecasting technique exponential smoothing
Exponential smoothing
If we have both a forecast and an actual figure for the prior period, then the new forecast can be calculated as:
New forecast = α × (most recent actual) + (1 − α) × (most recent forecast)
α is a value that is greater than 0 and less than 1. If α = 0.2, then 20% of the weight will go to the prior period's sales and 80% will go to the prior forecast
It is like a MA, but the weights are not constant
Still, for a MA we need n periods of data; for ES we only need 2 data points: the current value and the previous forecast (sometimes provided by some other model)
Exponential smoothing (cont.)
To use ES, we need to determine an appropriate α factor to use
α determines how responsive the forecast will be to jumps in the prior period
A low α means the forecast is unresponsive to jumps
An α higher than 0.5 means the forecast will be extremely responsive to jumps (α = 1 is the same as the naive prior-period forecast)
Compute the forecast for each month
using MA and ES
Month               Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct
Actual Car Sales    21    23    21    20    21    19    28    32    26    ?
3-month MA          -     -     -     21.7  21.3  20.7  20.0  22.7  26.3  28.7
ES (alpha = 0.3)    -     21.0  21.6  21.4  21.0  21.0  20.4  22.7  25.5  25.6
The forecasts for October differ somewhat.
The decision maker needs some means of comparing the quality of the various
possible forecasts.
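The slides do these computations in Excel; as a cross-check, here is a minimal Python sketch of the same two forecasts (seeding ES with the January actual is an assumption that reproduces the table above).

```python
# A quick cross-check of the 3-month MA and ES columns above.
sales = [21, 23, 21, 20, 21, 19, 28, 32, 26]   # Jan..Sep actuals

# 3-month MA: the forecast for a month is the mean of the prior 3 actuals
ma3 = [sum(sales[t - 3:t]) / 3 for t in range(3, len(sales) + 1)]

# ES with alpha = 0.3, seeded with the first actual as the February forecast
alpha, es = 0.3, [sales[0]]
for actual in sales[1:]:
    es.append(alpha * actual + (1 - alpha) * es[-1])

print([round(f, 1) for f in ma3])  # Apr..Oct forecasts: 21.7 ... 28.7
print([round(f, 1) for f in es])   # Feb..Oct forecasts: 21.0 ... 25.6
```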
Comparison of forecasts
In order to evaluate the various forecasting
approaches, we calculate the forecasts that
each would have given in the past periods
and compare them
We provide graphs of the four approaches
used on the automobile data
The forecast patterns are quite different, particularly in the latter months, where the sales level moves up
Comparison of Forecasts
[Four charts on the same scale (0-35, Jan-Sept): Actual sales plotted against the All-period Average, the Prior-period Forecast, the 3-month MA, and ES, respectively.]
The Error of a Forecast
The error of a forecast is the degree to which the forecast is off from the actual amounts
Absolute Error is the distance of the forecast from the actual:
Absolute Error = Actual − Forecast
Relative Error is the percentage by which the forecast differs from the actual:
Relative Error = (Actual − Forecast) / Actual
How do we choose whether absolute or relative error is more appropriate?
If an absolute error of 1 is just as severe when the
actual is low as when the actual is high then an
absolute error model is more appropriate
If an absolute error of 1 is ten times as significant
when actual = 10 as when actual = 100, then a
relative error model is more appropriate
In order to compare the forecasting approaches we
will compute summary measures of the errors over
all periods
We look for approaches that have the smallest
measures of error
We need a forecast that is unbiased and precise
Bias
A simple summary of the errors of a particular
forecast is to average the errors for all periods
If the average is > 0, then we say that the forecast has a positive bias (if < 0, a negative bias)
Bias = the degree to which the forecast tends to overshoot (negative bias) or undershoot (positive bias) the actual amounts
We can compute bias in both absolute and relative terms:
Absolute Bias = average of all absolute errors (Average Error, AE)
Relative Bias = average of all relative errors (Mean Percentage Error, MPE)
Bias (cont.)
If a particular forecast method has a known
amount of bias, then we can simply adjust
each future forecast for the bias
If a forecast tends to under-estimate sales by 2
cars then we can just add two to all future
forecasts to eliminate the bias
Much worse is for forecasts to be imprecise
Precision
Precision = the distance the forecast tends
to be from the actual
For ex.: a forecasting technique tends to
over-estimate by 10 half of the time and
under-estimate by 10 the other half of the
time
On average the forecast is right (unbiased)
But the forecast technique is imprecise, since it is
always off by 10
Precision - 3 ways of measurement
1. Compute the average of the absolute values of all the error terms; this is called MAD (mean absolute deviation)
2. Compute the average of the square of each of the error terms; this is called MSE (mean squared error). This is appropriate when an absolute error of 2 is four times as significant as an error of 1
3. Compute the average of the absolute values of each relative error; this is called MAPE (mean absolute percentage error). This indicates by what percent the forecast tends, on average, to be off from the actual
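As an illustration, here is a short Python sketch of all five summary measures, applied to the All-period Average forecasts shown in the next slide; the printed values match the comparison table up to rounding.

```python
# Error summary measures for the All-period Average forecasts (Feb..Sep).
actual   = [23, 21, 20, 21, 19, 28, 32, 26]
forecast = [21.0, 22.0, 21.7, 21.3, 21.2, 20.8, 21.9, 23.1]

errors = [a - f for a, f in zip(actual, forecast)]        # signed errors
rel    = [(a - f) / a for a, f in zip(actual, forecast)]  # relative errors
n = len(errors)

ae   = sum(errors) / n                  # Average Error (absolute bias)
mpe  = sum(rel) / n                     # Mean Percentage Error (relative bias)
mad  = sum(abs(e) for e in errors) / n  # Mean Absolute Deviation
mse  = sum(e * e for e in errors) / n   # Mean Squared Error
mape = sum(abs(r) for r in rel) / n     # Mean Absolute Percentage Error
print(ae, mad, mse, mpe, mape)          # ~2.13, 3.4, 21.9, 6.4%, 12.9%
```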
Computation of Error Summary Measures
Month                               Jan  Feb   Mar   Apr   May   Jun   Jul    Aug    Sep
Actual Car Sales                    21   23    21    20    21    19    28     32     26
All-period Average forecast         -    21.0  22.0  21.7  21.3  21.2  20.8   21.9   23.1
Absolute Error (Actual - Forecast)  -    2.0   -1.0  -1.7  -0.3  -2.2  7.2    10.1   2.9
Absolute Value of Absolute Error    -    2.0   1.0   1.7   0.3   2.2   7.2    10.1   2.9
Squared Error                       -    4.0   1.0   2.8   0.1   4.8   51.4   102.9  8.3
Relative Error                      -    9%    -5%   -8%   -1%   -12%  26%    32%    11%
Absolute Value of Relative Error    -    9%    5%    8%    1%    12%   26%    32%    11%
Comparison of Forecasting Techniques
Statistic      All-Per Avg   Prior Per   3-m MA   ES α=0.3   ES α=0.2
AE             2.13          0.63        2.22     1.93       2.17
MAD            3.41          3.38        3.56     2.93       3.12
MSE            21.9          18.38       26.15    19.42      20.73
MPE            6.40%         1.25%       6.29%    5.82%      6.72%
MAPE           12.86%        13.28%      12.95%   10.94%     11.60%
Oct. Forecast  23.4          26          28.7     25.6       24.5
Which one do we choose?
At first glance we might be tempted to use the Prior Period, as it has the lowest AE and MPE
More important is the precision: the Prior Period has the worst MAPE
ES with α = 0.3 has the best absolute and relative precision
There is no one best forecasting technique
for all types of data
There are some characteristics of the data
that will influence our forecasting techniques
Multi period patterns
Seasonality
Trend and Cycle
Some standard models
Considering Seasonality
We expect that the more information we have about
past sales, the more accurately we can predict the
future
In our example it would be more difficult to come up
with an accurate forecast two months out
(November), or more
However, if we have more info about the past we
may improve our potential for forecasting further
By accounting for regular patterns in the data we
may improve our forecast one-period ahead as well
Seasonality
Digging into the archives we found sales
figures for the two years prior to the data
used above
It is reasonable to assume that the
automotive sales are seasonal with
increases and decreases occurring at about
the same time each year
We observe the seasonality by looking at the
graphs for monthly observations during the 3
years of data
[Chart "Sales": monthly observations (scale 0-35) across the three years of data, Jan of the first year through Jan of the fourth; the seasonal pattern repeats each year.]
There is a slight sales decline from January to May, a sharp increase about October, and a decline until the end of the year.
For any future year, it is reasonable to presume that August sales will be higher than April sales; the actual amount will be quantified by calculating seasonality factors
Seasonality Factor
These factors tell us what percentage any given month's sales will be relative to an average of all months
For ex.: to calculate a March seasonality factor, we observe that for 1992, March sales were 20.9
We will compare that value to the average sales for a year centered on March
Suppose we take the year starting in Oct. 1991 and running to Sept. 1992 (12 months, with an average of 23.3)
Seasonality Factor(March 92) = (Known March 92 Sales) / (Avg sales for the year surrounding March) = 20.9 / 23.3 = 0.896
Seasonality Factor
We interpret the 0.896 to mean that sales for March of 1992 were 89.6% of the average of the surrounding year
March is not really the centered month in that year; it is the 6th month out of 12
It would be more accurate to average the year beginning with Sept 91 with the year beginning with Oct 91 to get a true centered moving average
Computing Seasonality Factors
We will compute seasonality factors for each
observation, excepting the first 5 months and
the last 6 months, for which we do not have a
complete year surrounding the observation
We can average all seasonality factors
available for a particular month to obtain a
seasonality index for that month
For ex.: we average the seasonality factor of
March 91 with that of March 92 to come up
with a factor for March that can be used in
forecasting
Deseasonalize
When we have a seasonal index for each month, we can deseasonalize the prior sales figures by dividing the actual sales figures by the corresponding seasonal index
This removes the variation which we attribute to seasonality
We can then generate forecasts using the deseasonalized data, which we would expect not to be mis-forecast due to seasonality
These forecasts can then be re-seasonalized by multiplying them by the corresponding seasonality factors, which puts back in the effects of seasonality
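A hedged Python sketch of this cycle; the helper names are hypothetical, and the simple (non-centered) 12-month window stands in for the true centered moving average described above.

```python
def seasonality_factor(sales, t):
    # factor for month t: the observation divided by the average of the
    # 12-month window surrounding it (simple, non-centered version; the
    # slides refine this into a true centered moving average)
    window = sales[t - 6:t + 6]
    return sales[t] / (sum(window) / 12)

def deseasonalize(sales, seasonal_index):
    # divide each observation by its month's seasonal index
    # (assumes sales[0] is a January observation)
    return [s / seasonal_index[m % 12] for m, s in enumerate(sales)]

def reseasonalize(forecasts, seasonal_index, start_month):
    # multiply each forecast by the index of the month it refers to
    return [f * seasonal_index[(start_month + i) % 12]
            for i, f in enumerate(forecasts)]
```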
Considering Trend and Cycle
It was necessary to use the same deseasonalized forecast for October 92 and for forecasts 2 months ahead, 3 months ahead, and so on
The ES approach gives only one forecast; if we need to forecast more than one period ahead, a flat extrapolation into the future is made
We may have reasons to believe that over time the de-seasonalized figures are generally increasing or decreasing
We will explore the idea of a trend: a consistent, long-term movement
We will assume a trend that is a straight-line pattern up or down, but exponential or other curved trends are also possible
Trend and Cycle
The de-seasonalized data consists of the
combination of trend and cycle
Cycle is an up and down movement
associated with general business conditions
with an irregular period that is very hard to
predict
In many industries the business cycle ranges
from 2 to 10 years
We will focus on separating the trend from
the cycle
Trend
Regressing the de-seasonalized sales on the time index (1, 2, 3, …, 33 months) we obtain the results:

              Coefficients   Standard Error   t Stat    P-value    Lower 95%   Upper 95%
Intercept     27.2478        0.7973           34.17     3.58E-26   25.62       28.87
X Variable 1  -0.1617        0.0409           -3.95     0.000417   -0.245      -0.078
The -0.16 coefficient indicates that the de-seasonalized sales figures tend to decrease by about 0.16 cars per month
Trend and Cycle
[Line fit plot: de-seasonalized sales (Y, 0-35) against the time index (X Variable 1, 0-40), with the fitted trend line.]
The wandering of the data around the trend line reflects the cycle that is part of the data, plus some leftover unpredictable occurrences
For ex.: observation 30 (June 92) falls below the trend line
Trend
We might improve on the ES forecast by using the trend estimate: assume that the October forecast is 0.16 lower than the 23.75 ES forecast, that the November forecast is 2 times 0.16 lower than the 23.75 forecast, and so on
The trend can enhance the forecast for the following
months
One weakness of the linear regression approach is
that it assumes that the trend is constant over time
If we have reasons to believe that the trend is
changing, then we would prefer an approach that
will continuously change the trend estimate over
time
Holt's Model
Regular ES attempts to estimate the LEVEL of sales in the future
Holt's Model improves on this by computing both a LEVEL and a TREND
The TREND forecast reflects the currently expected period-to-period change in the sales level for the future
Given the LEVEL and TREND estimates made in a particular month, we can produce a k-month-ahead forecast as the current estimate of LEVEL plus k times the TREND
Holt's Model - finding the new LEVEL
All we need is to be clear about how the estimates of LEVEL and TREND are updated each period
Holt's idea was to smooth each estimate with ES; ex.: the Sept. actual sales can be used to update the LEVEL forecast according to the following equation:
LEVEL_Sept = α × SALES_Sept + (1 − α) × (LEVEL_Aug + TREND_Aug)
where α is a smoothing constant as in ES
Holt's Model - finding the new TREND
Now that we have an updated LEVEL forecast, we can use it to update the TREND forecast
We do not have a TREND actual value, so we use the difference between the level of Sept and the level of Aug
TREND will be:
TREND_Sept = β × (LEVEL_Sept − LEVEL_Aug) + (1 − β) × TREND_Aug
where β is another ES smoothing constant which determines the sensitivity of the TREND estimate to changes in TREND. Generally β is small (ex.: 0.2), because trend is a long-term effect; it should not change rapidly
Holt's Model - forecasts
We can use the September LEVEL and TREND estimates to forecast future sales as follows:
FORECAST_Oct = LEVEL_Sept + 1 × TREND_Sept
FORECAST_Nov = LEVEL_Sept + 2 × TREND_Sept
FORECAST_Dec = LEVEL_Sept + 3 × TREND_Sept
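Holt's two updates and the k-step forecast fit in a few lines of Python; the smoothing constants and starting LEVEL/TREND below are assumed illustrative values.

```python
# One Holt update followed by 1-, 2- and 3-step-ahead forecasts.
def holt_update(level, trend, sales, alpha=0.3, beta=0.2):
    new_level = alpha * sales + (1 - alpha) * (level + trend)
    new_trend = beta * (new_level - level) + (1 - beta) * trend
    return new_level, new_trend

def holt_forecast(level, trend, k):
    return level + k * trend   # k-month-ahead: LEVEL plus k times TREND

level, trend = holt_update(level=24.0, trend=0.5, sales=26)  # Sept actual
print([round(holt_forecast(level, trend, k), 2) for k in (1, 2, 3)])
```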
Holt's Model - properties
This model does not consider seasonality
It can be applied to de-seasonalized sales, and the resulting forecasts can then be re-seasonalized
It would be easier to have a model that considers both trend and seasonality
Winters Model
It is similar to Holt's Model: it contains updating estimates of both LEVEL and TREND
In addition, this model updates seasonality factors each period, again using ES
As with Holt, assume that we have estimates of LEVEL and TREND from Aug
Assume further that we have a seasonality factor which was last updated in September of the prior year
Winters Model - include de-seasonalized data
SEASON = seasonality factor
Observe that the de-seasonalized Sept sales equal SALES_Sept / SEASON_priorSept
The Sept sales can be used to update the LEVEL forecast with the equation:
LEVEL_Sept = α × (SALES_Sept / SEASON_priorSept) + (1 − α) × (LEVEL_Aug + TREND_Aug)
Winters Model - updating TREND and SEASON
The TREND estimate can be updated as in Holt:
TREND_Sept = β × (LEVEL_Sept − LEVEL_Aug) + (1 − β) × TREND_Aug
Note that LEVEL_Sept represents an updated de-seasonalized forecast for Sept sales, so SALES_Sept / LEVEL_Sept represents the implied seasonality factor
Therefore we can update the seasonality factor:
SEASON_Sept = γ × (SALES_Sept / LEVEL_Sept) + (1 − γ) × SEASON_priorSept
where γ is a smoothing constant between 0 and 1. It is usually higher than α and β, since each seasonality estimate is only updated once per year
Winters Model - forecasts
FORECAST_Oct = (LEVEL_Sept + 1 × TREND_Sept) × SEASON_priorOct
FORECAST_Nov = (LEVEL_Sept + 2 × TREND_Sept) × SEASON_priorNov
FORECAST_Dec = (LEVEL_Sept + 3 × TREND_Sept) × SEASON_priorDec
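The same pattern extends to the Winters model; a sketch of one monthly update and a k-step forecast, with assumed smoothing constants (γ written as gamma).

```python
# One Winters update: LEVEL and TREND as in Holt, plus the seasonal factor.
def winters_update(level, trend, season_prior, sales,
                   alpha=0.3, beta=0.2, gamma=0.4):
    new_level = alpha * (sales / season_prior) + (1 - alpha) * (level + trend)
    new_trend = beta * (new_level - level) + (1 - beta) * trend
    new_season = gamma * (sales / new_level) + (1 - gamma) * season_prior
    return new_level, new_trend, new_season

def winters_forecast(level, trend, season_for_target_month, k):
    # re-seasonalized k-step forecast
    return (level + k * trend) * season_for_target_month
```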
Other Advanced Techniques
Winters is a general method that will work quite well in a variety of settings. Still, other variations exist:
Trend can be exponential, quadratic, etc.
We assumed that the trend is additive (TREND is added to LEVEL); other models assume multiplicative trends
We may have a dramatic jump (a competitor suddenly goes out of business); there are models with adjusting alpha, beta, gamma
The methods we saw assumed a given period (12 months) for the seasonality. Other methods impose combinations of seasonal patterns with different lengths of seasonality (3, 5 months) and separate out the estimates of their effects
Implementation considerations
Choosing a technique
We should consider the reasonableness of the assumptions (if seasonality does not exist, we use Holt)
Deciding on aggregation
A car company may sell cars, trucks, vans, ... We may decide on forecasting the sales of the blue XYZ sports-car with options, which is a detailed level; this may have a lot of error
We could decide on forecasting the sales of cars in general (an aggregated level); then we have the problem of disaggregating
Implementation considerations
Determining initial model parameters
The models presented have low sensitivity to the starting parameters; after a few periods the parameters tend to stabilize
We can use any reasonable values to start
Using forecasts in decision making
Error exists
The probability distribution of the errors can be
used to calibrate the uncertainty of forecasts
Implementation considerations
Monitoring forecast accuracy
Even if history is relevant, future conditions may change and the forecast error may rise dramatically
A way to watch for this is the use of a tracking signal: a chart can be made to track the actual forecast error after each period
The forecaster may choose a threshold value of error which may be considered unacceptable
If the threshold is exceeded, then we re-evaluate the forecasting technique
Standard Econometric Time Series Analysis using EViews
Stationarity
Autocorrelation functions
AR, MA, ARMA processes
Where we are and what we need
We have seen how to analyze a time series
of monthly sales in an intuitive manner
We will try to extend the previous techniques
to an advanced approach to time series
forecasting
These techniques consider the time series as
a stochastic process and can provide a more
efficient approach on how to use the history
to come up with a forecast
Stochastic processes
A stochastic process is a collection of random variables ordered in time
We will use GDP as an example:
Y_t denotes a random variable (GDP at moment t)
Y_1, Y_2, Y_3, …, Y_87, Y_88 are random variables for moments 1, 2, 3, …, 87, 88 (the GDPs at all these moments)
Keep in mind that each of these Y's is a random variable
Stochastic variables and
Stochastic processes
[Diagram: one random variable for each of Day 1, Day 2, Day 3, …, Day T; we only see their realizations. Together, they form the stochastic process.]
GDP is a stochastic process
US GDP was $2872.8 billion in the first quarter of
1970
In theory, the GDP for that quarter could have been
any number, depending on the economic and
political climate prevailing
The figure $2872.8 billion is a particular realization of all such possibilities
Therefore we say that GDP is a stochastic process, and the actual values we observed from 1970-I until 1991-IV are a particular realization of that process (i.e. a sample)
Stationarity
What would be the time series that I would like best in order to make very good forecasts?
We need to assume some stability of the data before making any forecasting analysis
In econometric words, stability means independence of time (a series without seasonality) or stationarity (a series whose statistical properties are constant, stationary, from one observation to the next)
Stationarity - definition
Broadly speaking, a stochastic process is
said to be stationary if
its mean and variance are constant over time and
the value of the covariance between the two time
periods depends only on the distance or gap or
lag between two time periods and not the actual
time at which the covariance is computed (no
seasonality, no dependence on time).
Stationarity
For the realizations of the GDPs:
E(Y_t) = μ
Var(Y_t) = σ²
Cov(Y_t, Y_{t−k}) = γ_k
If the process is stationary, all these indicators are constant, i.e. time invariant
A stationary time series will tend to return to its mean (called mean
reversion) and fluctuations around this mean (measured by its variance) will
have a constant amplitude on average.
Stationarity
If a series is not stationary in the sense just defined, it is called a nonstationary time series (for ex. a series with a time-varying mean or a time-varying variance, or both)
Why is stationarity so important?
If a time series is not stationary, we can only study its behavior for the time period under consideration; each set of time series data will therefore be for a particular episode
As a consequence, it is not possible to easily generalize to other periods; history does not look like the future, so be careful with forecasting!
Stationarity (cont.)
If we look at some data, weak stationarity
means that the values fluctuate with some
constant variation around a constant level
[Two simulated series of 1000 observations: the stationary one fluctuates with constant variation (roughly ±4) around a constant level; the nonstationary one wanders between about −30 and 60.]
Stationarity (cont.)
For a stationary series:
Cov(Y_t, Y_{t−k}) = γ_k : the covariance between the values at moments t and t−k is the same for any pair of variables realized at the same distance k in time
Var(Y_t) = γ_0 : the variance is the same at every moment t
γ_k = γ_{−k} , i.e. Cov(Y_t, Y_{t−k}) = Cov(Y_t, Y_{t+k}) : the covariance with k lags before t is the same as the covariance with k lags after t
Quick test for stationarity
It is common to assume that a series of asset returns is weakly stationary
We can check for this if we have a sufficiently large number of observations: divide the data into subsamples and check for the consistency of the results!
If stationarity holds, then the mean and variance of the first subsample should equal (statistically) the mean and variance of the second subsample
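A rough Python version of this quick check; a formal comparison would use proper tests for equal means and variances, so treat this as an eyeball tool.

```python
import statistics

def subsample_check(series):
    # split the sample in two and compare means and variances
    half = len(series) // 2
    first, second = series[:half], series[half:]
    return ((statistics.mean(first), statistics.variance(first)),
            (statistics.mean(second), statistics.variance(second)))

print(subsample_check([21, 23, 21, 20, 21, 19, 28, 32, 26]))
```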
A special stationary process white noise
A special stochastic process is the purely random or white noise process
This process has zero mean and a constant variance σ², and is serially uncorrelated
So it is a stationary process (constant mean, constant variance, and a covariance of zero no matter what lag we use)
This is like the series of residuals that we use in regression analysis (the differences between the realized values of the dependent variable and the fitted values of the dependent)
Some classical nonstationary processes
To better understand the properties of stationary time series, we will look at some important nonstationary time series
The classic example is the random walk model (exchange rates and asset prices seem to follow a random walk; they are nonstationary)
We will see two types of random walks:
Random walks without drift
Random walks with drift
Random Walk without Drift
Suppose u_t is a white noise with mean 0 and variance σ²
The series Y_t is said to be a random walk if:
Y_t = Y_{t−1} + u_t
The value of Y at time t is equal to its value at time (t−1) plus a random shock
We can think of this equation as a regression of Y at time t on its value lagged one period
The beta coefficient is 1!
It is usually said that stock prices are essentially random and we cannot (on average) make profitable speculations (if one could predict tomorrow's price on the basis of today's price, we would all be millionaires)
Random Walk without Drift
The best forecast for tomorrow is just the price of today
Time-varying variance means nonstationarity: it can be shown that the variance increases linearly with time (Var(Y_t) = t·σ²), so we lose precision with each further step; the forecast is more and more foggy the further we go into the future
This evolution of the variance also means that Y_{t+k} does not have the same variance as Y_t, which means that the random walk without drift is not stationary
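A quick simulation makes the growing variance visible; Gaussian shocks with σ = 1 are an assumption for illustration.

```python
import random

T, n_paths = 100, 2000
paths = []
for _ in range(n_paths):
    y, path = 0.0, []
    for _ in range(T):
        y += random.gauss(0, 1)   # Y_t = Y_{t-1} + u_t
        path.append(y)
    paths.append(path)

for t in (1, 10, 100):
    values = [p[t - 1] for p in paths]
    var = sum(v * v for v in values) / n_paths   # mean is ~0
    print(t, round(var, 1))   # roughly 1, 10, 100: variance grows with t
```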
RW - persistence of shocks
Y_t is the sum of the initial Y_0 plus the sum of the random shocks: Y_t = Y_0 + u_1 + u_2 + … + u_t
Thus, the impact of a particular shock does not die away
If u_12 = 2 rather than 0 (its average), then all the values of the series following the 12th realization will be higher by 2 units; the effect of this shock never dies out!
Econometricians say the RW has infinite memory
RW - write the difference
Interestingly, if we write the RW process in first differences, ΔY_t = Y_t − Y_{t−1} = u_t, where delta (Δ) is the first difference operator, then the obtained process (the first difference process) is stationary (it is the white noise)
Random Walk with Drift (RWD)
The random walk with drift has an additional constant parameter: Y_t = δ + Y_{t−1} + u_t, where δ is the drift parameter
The name drift comes from the fact that if we write the preceding equation as ΔY_t = Y_t − Y_{t−1} = δ + u_t, it shows that Y_t drifts upward or downward, depending on δ being positive or negative
RWD
It can be shown that E(Y_t) = Y_0 + t·δ and Var(Y_t) = t·σ², which means that for the RWD both the mean and the variance are time variant (they change in time)
Therefore, the RWD is nonstationary too!
Stochastic and Deterministic Trends
The distinction between stationary and nonstationary
stochastic processes has a crucial bearing on
whether the trend (the slow long run evolution of the
series) is deterministic or stochastic
If a trend is completely predictable it is a
deterministic trend
If a trend is not predictable it is a stochastic trend
We can have 4 situations:
1. Pure Random Walk
2. Random Walk with Drift
3. Random Walk with Deterministic Trend
4. Random Walk with Drift and Deterministic Trend
Pure Random Walk
Start with Y_t = β1 + β2·t + β3·Y_{t−1} + u_t
If β1 = 0, β2 = 0, β3 = 1, we get the pure random walk Y_t = Y_{t−1} + u_t
Its first difference ΔY_t = u_t IS STATIONARY
We say that the series Y_t is a difference stationary process
Random Walk with Drift
Start with the same equation, Y_t = β1 + β2·t + β3·Y_{t−1} + u_t
If β1 ≠ 0, β2 = 0, β3 = 1, we get the random walk with drift Y_t = β1 + Y_{t−1} + u_t
The trend is called a stochastic trend
Nonstationarity can be eliminated by taking first differences of the time series: ΔY_t = β1 + u_t
Deterministic Trend
Start with the same equation
If β1 ≠ 0, β2 ≠ 0, β3 = 0, we get Y_t = β1 + β2·t + u_t
This is a trend stationary process. Even if the mean of Y_t depends on t, so it is changing, its variance is constant.
Subtracting the estimated mean (β1 + β2·t) from Y_t removes the trend; this procedure is called detrending
RW with Drift and Deterministic Trend
Start with the same equation
If β1 ≠ 0, β2 ≠ 0, β3 = 1, we get Y_t = β1 + β2·t + Y_{t−1} + u_t
This is a RW with drift and deterministic trend.
Deterministic vs. Stochastic Trend
The phenomenon of spurious regression
To see why stationary time series are so important, we look at two random walk models with no relation between them whatsoever
A regression of one on the other provides a significant coefficient; this is the spurious regression
If we regress differences in Y on differences in X, then we get a non-significant coefficient
This is due to the stochastic trends
Before running a regression we should check the stationarity of the variables used in the regression
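A simulation of the phenomenon, assuming numpy and statsmodels are available: regress one simulated random walk on another, independent one, in levels and then in first differences.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=500))   # random walk X
y = np.cumsum(rng.normal(size=500))   # independent random walk Y

levels = sm.OLS(y, sm.add_constant(x)).fit()
diffs = sm.OLS(np.diff(y), sm.add_constant(np.diff(x))).fit()
print(levels.pvalues[1])  # often "significant": the spurious regression
print(diffs.pvalues[1])   # typically insignificant, as it should be
```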
Tests of stationarity
In practice two questions:
How do we find out if a given time series is
stationary?
If we find that a given time series is not stationary, is
there a way that it can be made stationary?
Next points:
Graphical analysis
The correlogram test
Unit root test of stationarity
Transforming non-stationary time series - making
them stationary
Finding stationarity
Graphical analysis
Start with the time series graph
The GDP has been increasing, showing an upward trend and suggesting that the mean has been changing; perhaps the GDP is not stationary!
Autocorrelation function (ACF) and Correlogram
We could analyze the correlations (statistical relation) of Y_t with Y_{t−k} for many values of k
This will tell us how much the past is incorporated into the future (how much of the future can be explained by the past)
We analyze the white noise, which is stationary
We also look at the RW, which is nonstationary
We try to figure out if there is a pattern in the autocorrelations that could give us a criterion for identifying a stationary process
Correlation and Autocorrelation Function
The correlation coefficient between two random variables X and Y is:
ρ_{X,Y} = Cov(X, Y) / (σ_X · σ_Y) = E[(X − μ_X)(Y − μ_Y)] / (σ_X · σ_Y)
where μ_X and μ_Y are the real (population) means of X and Y
The sample estimator of the correlation coefficient replaces the population means with the sample means X̄ and Ȳ (and the standard deviations with their sample counterparts)
Autocorrelation Function
Assume a weakly stationary time series Y_t
The relation between Y_t and its lag Y_{t−k} is of interest; we are dealing with autocorrelation (called the lag-k autocorrelation of Y_t):
ρ_k = Cov(Y_t, Y_{t−k}) / sqrt(Var(Y_t) · Var(Y_{t−k})) = Cov(Y_t, Y_{t−k}) / Var(Y_t) = γ_k / γ_0
(the second equality uses Var(Y_t) = Var(Y_{t−k}), which holds under weak stationarity)
ρ_k is a function of k, with ρ_0 = 1 and |ρ_k| ≤ 1
The sample estimator is:
ρ̂_k = Σ_{t=k+1}^{T} (Y_t − Ȳ)(Y_{t−k} − Ȳ) / Σ_{t=1}^{T} (Y_t − Ȳ)² , for 0 ≤ k < T
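The sample estimator translates directly into code; a plain Python implementation of the formula above, checked on simulated white noise.

```python
import random

def acf(y, k):
    # lag-k sample autocorrelation of the series y
    T = len(y)
    ybar = sum(y) / T
    num = sum((y[t] - ybar) * (y[t - k] - ybar) for t in range(k, T))
    den = sum((v - ybar) ** 2 for v in y)
    return num / den

wn = [random.gauss(0, 1) for _ in range(500)]
print([round(acf(wn, k), 2) for k in (1, 2, 3)])  # near 0 for white noise
```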
Correlogram
Check the behavior of autocorrelations on a
graph
Look at their behavior for the white noise and
for the RW
What are the differences?
For the white noise, the autocorrelations are around 0
For the RW, the autocorrelation coefficients at various lags are very high for all the lags: the coefficient starts at a very high value and declines very slowly toward zero as the lag lengthens
GDP correlogram
The correlogram of the GDP time series
looks very much like the correlogram of the
RW
The correlation coefficient starts at a very
high value at lag 1 (0.969) and declines
slowly
Choice of Lag Length
This is an empirical matter
A rule of thumb is to compute the ACF up to one-third to one-quarter of the length of the time series
For our data we have 88 observations, so we look at 22-29 lags
There are also some statistical criteria (not our purpose here)
How do we test for lag significance?
Statistical result:
If Y_t is an iid sequence with E(Y_t²) finite, then the sample autocorrelation estimator ρ̂_k is asymptotically normal with mean zero and variance 1/T, for any fixed integer k
This result can be used for testing H_0: ρ_k = 0 against H_1: ρ_k ≠ 0, so it provides a test for the lags!
EViews gives p-values: if the p-value is lower than 0.05, then the lag is significantly different from 0 with 95% confidence
What we did until now
We used EViews to build the correlogram:
Took the GDP data from Excel and pasted it into the gdp series created in EViews
Selected the series gdp, clicked on View, then Correlogram
Noted that the correlogram for the GDP series looks very much like the correlogram of the RW
Unit Root Test
The starting point is the process Y_t = ρ·Y_{t−1} + u_t
We know that if ρ = 1, then the process is a RW without drift, which is nonstationary
What if we regress Y_t on its one-period lagged value Y_{t−1} and find out if the estimated ρ is statistically equal to 1 (i.e. a unit root)?
Equivalently, we can run Y_t − Y_{t−1} = (ρ − 1)·Y_{t−1} + u_t = δ·Y_{t−1} + u_t and see if δ = ρ − 1 is statistically 0
If δ = 0, then the process is nonstationary, but its first differences are stationary
Augmented Dickey Fuller test
In EViews we can test the following specifications:
ΔY_t = Y_t − Y_{t−1} = δ·Y_{t−1} + α·(Y_{t−1} − Y_{t−2}) + u_t
If δ is less than 0, then the series is stationary with 0 mean
ΔY_t = β1 + δ·Y_{t−1} + α·(Y_{t−1} − Y_{t−2}) + u_t
If δ is less than 0, then the series is stationary with nonzero mean
ΔY_t = β1 + β2·t + δ·Y_{t−1} + α·(Y_{t−1} − Y_{t−2}) + u_t
If δ is less than 0, then the series is stationary with nonzero mean around a deterministic trend
The lagged-difference term (Y_{t−1} − Y_{t−2}) is what makes the test "augmented"; its presence simply ensures that the regression provides good results
EViews provides the values of the coefficients in a DF equation and the p-values for the δ coefficient, which tell us about the stationarity of the time series
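Outside EViews, the same test is available in Python's statsmodels; this sketch runs the constant-plus-trend case on a simulated random walk (regression="c" gives the constant-only case).

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

series = np.cumsum(np.random.default_rng(1).normal(size=200))  # a RW, for illustration
stat, pvalue, *_ = adfuller(series, regression="ct")  # constant + trend case
print(pvalue)  # high p-value: cannot reject the unit root, so difference the series
```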
Transforming nonstationary time series
In order to avoid spurious regressions we need to make sure that the variables used in the regression are stationary
We use the ADF test to check for a unit root
If a series has a unit root, then we take the first difference (ex. GDP_t − GDP_{t−1}) and use this series in the regression
If the first difference series has a unit root too, then we can use the second difference (ex. [GDP_t − GDP_{t−1}] − [GDP_{t−1} − GDP_{t−2}]) and so on
EViews allows us to test up to the second difference
Trend-stationary process
This is a process that is stationary around the trend line but nonstationary overall
Hence, the simplest way to make such a series stationary is to regress the series on time; the residuals will be stationary
Run the regression Y_t = β1 + β2·t + u_t, where t is simply the series 1, 2, 3, …, T
The residual series û_t = Y_t − β̂1 − β̂2·t will be stationary; it is called the detrended time series
For forecasting purposes we only put the trend back into our forecast
Taking seasonality out
There may be time series that show seasonality and, because of that, are nonstationary
Looking at the earnings per share reported every quarter by Johnson and Johnson, the ACF shows strong serial correlations
Taking seasonality out
After taking the first difference we have a time series whose ACF is strong when the lag is a multiple of the periodicity, 4
We could take the seasonality out by simply taking another difference of the data: the value at t minus the value at t−4
Taking seasonality out
In our case we observe the seasonality for the first difference time series D(Y_t) = Y_t − Y_{t−1}
To take the seasonality out we take the 4th differences of the series D(Y_t):
D(Y_t) − D(Y_{t−4}) = Y_t − Y_{t−1} − Y_{t−4} + Y_{t−5}
This is called seasonal differencing
The seasonality may be the cause of the non-stationarity; so, after taking the seasonality out, the series may become stationary
Approaches to economic forecasting
1. Exponential smoothing methods
2. Single-equation regression models
3. Simultaneous-equation regression models
4. Autoregressive integrated moving average
models (ARIMA)
5. Vector autoregression
Stationary time series
If a time series is stationary, then we can
model it in many ways
We will use ARIMA or Box-Jenkins
methodology to analyze a stationary time
series
Autoregressive processes (AR)
Moving Average processes (MA)
Autoregressive and Moving Average processes
(ARMA)
Integrated Autoregressive and Moving Average
processes (ARIMA)
AR
If Y_t is a stationary time series (already tested with the ADF test) then it can look like:
Y_t − μ = α1·(Y_{t−1} − μ) + u_t
where μ is the mean of Y and u_t is a white noise
This is a first-order autoregressive process, or AR(1)
This model says that the forecast value of Y at time t is simply some proportion (= α1) of its value at time t−1, plus a random shock or disturbance at time t
The Y values are expressed as deviations around their mean value
AR (p)
We can also have this stationary process:
Y_t − μ = α1·(Y_{t−1} − μ) + α2·(Y_{t−2} − μ) + u_t
which is a second-order autoregressive process, or AR(2)
In general we can have AR(p) processes:
Y_t − μ = α1·(Y_{t−1} − μ) + α2·(Y_{t−2} − μ) + … + αp·(Y_{t−p} − μ) + u_t
p is higher when more lagged values have something to say about the future values, i.e. more values from the past have influence on the future values
Forecasting with AR
The one-step-ahead forecast for an AR(1) (in deviations from the mean):
Y_forecast1 = α1·Y_present + u_forecast1
Hence, the forecast error comes from the u_t
We saw that u_t is a white noise, so the error is associated with its variance σ² (how much it moves around the mean)
The 2-step-ahead forecast:
Y_forecast2 = α1·Y_forecast1 + u_forecast2 = α1·(α1·Y_present + u_forecast1) + u_forecast2
For the 2-step ahead, the error comes from α1·u_forecast1 + u_forecast2. If we compute the variance of this, we will see that it is (1 + α1²)·σ², which is slightly higher than the variance of the one-step-ahead forecast, but it does not increase linearly
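A tiny numeric sketch of these iterated forecasts; α1, the current deviation from the mean, and σ² are assumed illustrative values.

```python
# Iterated AR(1) forecasts in mean deviations: future shocks are set to
# their expected value 0, and the k-step error variance accumulates as
# sigma2 * (1 + a1^2 + ... + a1^(2(k-1))).
a1, y_present, sigma2 = 0.6, 2.5, 1.0

y = y_present
for k in range(1, 5):
    y = a1 * y   # E[u] = 0, so the shock terms drop out of the forecast
    err_var = sigma2 * sum(a1 ** (2 * j) for j in range(k))
    print(k, round(y, 3), round(err_var, 3))  # k=2 variance is (1 + a1**2) * sigma2
```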
MA
Another mechanism that may generate values for Y could be:
Y_t = μ + β1·u_{t−1} + u_t
where μ is a constant and u is simply a white noise
Here Y at time t is equal to a constant plus a moving average of the current and past error terms
Thus Y follows a first-order moving average, or MA(1)
Under the same logic as for AR(p), we may have a q-order moving average MA(q):
Y_t = μ + β1·u_{t−1} + β2·u_{t−2} + β3·u_{t−3} + … + βq·u_{t−q} + u_t
ARMA
An Autoregressive and Moving Average
process ARMA(1,1) looks like this:
Y_t = θ + α1·Y_{t−1} + β1·u_{t−1} + u_t
We can also have an ARMA(p,q) process,
where p is the number of autoregressive
terms and q is the number of moving average
terms
ARIMA
I comes from integrated: a series is said to be integrated of order 1 (2, …, d) if the series is nonstationary but its first (second, …, d-th) difference is stationary
So, we use the ADF test to see if the series is stationary and, if not, then we take the first difference
If the first difference is stationary, then we apply the ARMA model to model the series
ARIMA(p, d, q) means a series that is stationary at its d-th difference, with that stationary series modeled using p terms of AR and q terms of MA
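In Python's statsmodels the whole ARIMA(p, d, q) cycle is a single call; the order (1, 1, 1) and the simulated data are assumptions for illustration, not the GDP model of the slides.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

y = np.cumsum(np.random.default_rng(2).normal(size=200))  # toy nonstationary data
res = ARIMA(y, order=(1, 1, 1)).fit()  # d=1: model the first difference as ARMA(1,1)
print(res.summary())
print(res.forecast(steps=4))           # four periods ahead
```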
Box-Jenkins methodology
Step 1 Identification
Step 2 Estimation
Step 3 Diagnostic
Checking
Step 4 Forecasting
Step 1 - Identification
We use the autocorrelation function (ACF) and the partial autocorrelation function (PACF)
The PACF shows the significance of the last term added in each of the following regressions:
Y_t = α1·Y_{t−1} + u_t
Y_t = α1·Y_{t−1} + α2·Y_{t−2} + u_t
…
Y_t = α1·Y_{t−1} + α2·Y_{t−2} + … + αp·Y_{t−p} + u_t
General guidelines for patterns of ARMA processes
[Table of typical ACF/PACF patterns for AR, MA and ARMA processes.]
ACF and PACF for GDP series
The ACF declines very slowly: the ACFs up to 23 lags are statistically significantly different from 0
After the first lag the PACF drops dramatically, and all PACF lags are statistically insignificant
Much different is the correlogram for the first difference of the GDP:
The ACFs at lags 1, 8 and 12 seem statistically significant
Same for the PACF
How do we choose the correct ARMA pattern for the GDP time series?
How do correlograms look for the AR,
MA, ARMA?
We will look at AR(1), AR(2), MA(1), MA(2),
ARMA(1,1), ARMA(2,2) and so on
Each of these stochastic processes exhibits
typical patterns of ACF and PACF
If the time series of the first difference of the
GDP fits one of these patterns we can
identify the time series with that process
Of course we will have to apply diagnostic
tests to see if the ARMA model is reasonably
accurate
Identifying the GDP lag order
Autocorrelations decline up to lag 4
Except for lags 8 and 12, the rest of them are not statistically significantly different from 0
The PACFs are also significant at the same lags
We will choose an AR(12) for the first difference of GDP, but we do not need to put in all the terms up to lag 12; we only use the terms 1, 8 and 12:
Y_t = θ + α1·Y_{t−1} + α8·Y_{t−8} + α12·Y_{t−12} + u_t
Step 2: Estimation
Run the ARMA we identified - Results from Eviews
Variable Coefficient Std. Error t-Statistic Prob.
C 23.08936 2.980356 7.747181 0.0000
AR(1) 0.342768 0.098794 3.469531 0.0009
AR(8) -0.299466 0.101599 -2.947523 0.0043
AR(12) -0.264371 0.098582 -2.681742 0.0091
R-squared 0.293124 Mean dependent var 21.52933
Adjusted R-squared 0.263256 S.D. dependent var 36.55936
S.E. of regression 31.38030 Akaike info criterion 9.782096
Sum squared resid 69915.33 Schwarz criterion 9.905695
Log likelihood -362.8286 F-statistic 9.813965
Durbin-Watson stat 1.766317 Prob(F-statistic) 0.000017
Step 3: Diagnostic Checking
One simple diagnostic is to obtain the
residuals from the estimation
Compute the ACF and PACF of the residuals
See if the residuals have any significant
autocorrelation
If not, it means that the ARMA process that
we used took all the autocorrelations out, so
we are ready for forecasts
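The same diagnostic in Python uses the Ljung-Box Q statistic on the residuals (the analogue of EViews' Correlogram Q-statistics view); the toy model fitted here is an assumption for illustration.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

y = np.cumsum(np.random.default_rng(3).normal(size=200))
res = ARIMA(y, order=(1, 1, 1)).fit()
# high Ljung-Box p-values: the residuals look like white noise, so the
# ARMA terms have taken the autocorrelation out and we can forecast
print(acorr_ljungbox(res.resid, lags=[8, 12]))
```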
Step 4: Forecasting
We used data from 1970-I to 1991-IV
We want to forecast the GDP for the first four quarters of 1992
We analyzed the differences in the GDP levels, so, in order to find the levels of GDP in the 4 coming quarters, we will undo the first difference. The first forecast will be:
Y_{1992-I} − Y_{1991-IV} = θ + α1·(Y_{1991-IV} − Y_{1991-III}) + α8·(Y_{1989-IV} − Y_{1989-III}) + α12·(Y_{1988-IV} − Y_{1988-III}) + u_{1992-I}
Rewrite the model and fill in the numbers
Step by step forecasting:
1. Obtain the data (often many time series)
2. Check stationarity for all the series that we
have
3. Check stationarity of the first and second
difference in case the series of levels is not
stationary
4. Transform the data to obtain stationary time
series
5. Run regressions (if the analysis requires) using
the stationary data
Step by step forecasting
6. Do forecasts using the regressions for your analysis
7. If we do not have data for the explanatory variables, or the regression provided insignificant coefficients, or we have many time series to forecast, then we model the stationary time series obtained by transformation:
1. Identify the ARMA process at hand
2. Estimate the ARMA process
3. Diagnostics checking
4. Forecasting
Example:
Our job is to forecast sales for the following 4 quarters
The value of sales may depend much on:
the personal disposable income
the personal consumption expenditure
competitors' profits
What are the forecasts?
Steps:
Check stationarity of all the series in the analysis
EViews:
Type: series sales, series gdp, series income, series comp
Open each time series and paste the data from Excel
Click on View and choose Unit Root Test
Check Level (we look at the stationarity of the series itself) and check Trend and Intercept (to test for all the possible problems)
If Prob* is higher than 0.05, then the series is non-stationary (this happens most of the time)
If Prob* is lower than 0.05, then the series is stationary; we can use it as it is (for regressions or time series forecasting)
Steps:
Transform the non-stationary process(es) into stationary time series
Check for the stationarity of the first-difference time series: use Unit Root Test and check 1st Difference
If Prob* is higher than 0.05, then we check the 2nd Difference too
If Prob* is lower than 0.05, then the first difference is stationary, so we can use it for further analysis
If the 1st difference is not stationary, we could also look at the correlogram of the first difference
Use View, Correlogram, 1st Difference, and see if there are some possible seasonality patterns in the data (have a look at the fourth lag, as we have quarterly data)
Take the seasonality out by doing the 4th difference of the series of first differences
Steps:
With the stationary data we try to find an ARMA model to fit the data
Build the correlogram (View, Correlogram, 1st difference or 2nd difference, or simply the de-seasonalized data)
Find the ACFs and PACFs that are significant
Pick the ARMA model you want
Estimate the ARMA model
Click Quick (main menu), Estimate Equation, then write the name of the series you do the ARMA for (gdp, income, comp, or D(gdp), D(income), D(comp) for the first difference)
Steps:
Choose other ARMA specifications to see if you can find a better adjusted R²
Do the diagnostic checking for the ARMA you chose
In the equation window click on View, Residual Tests, Correlogram - Q statistics
See if the correlogram has significant ACFs or PACFs
If the residuals are significant at some lags, then we should choose another ARMA (redo the last analysis)
Steps:
If all the series are stationary, then do the regression
Quick, Estimate Equation, write the regression:
(ex. sales c gdp income comp
or sales = c(1) + c(2)*gdp + c(3)*income + c(4)*comp)
Check for the significance of the parameters
Save the equation: click Name (same window)
Do the forecasts for all the explanatory variables using their estimated ARMA coefficients, 4 quarters into the future
Use the forecasts of the explanatory variables in the estimated regression in order to build forecasts for the sales in the regression
Steps:
We could also use the ARMA procedure on the sales series itself, and come up with time series forecasts
Compute the averages of the forecasts from the regression and the forecasts from the time series to come up with a more calibrated forecast for the sales
Our example:
Sales does not have a unit root, hence Sales is stationary
Still, Sales seems to have seasonality (at the 4th lag)
After de-seasonalizing the data, we found an AR with lags 3 and 5 to be significant
The 1st difference of PDI is stationary, with no significant autocorrelation; it looks like a white noise
The 1st difference of the PCE is stationary; we tried to fit an AR with lags 2 and 3
The 1st difference of the Profits is stationary; we found significance at lags 1 and 5
Regression and forecasts
We run the regression of the deseasonalized Sales (being stationary) on the first differences of PCE, PDI and Profits
We found no significance! The regression will not be able to help us with the forecast
Thus, in order to do the forecasts, we only use the properties of the time series of sales itself
One alternative would be to use Holt's model or the Winters model
The better alternative is to use the estimated ARMA(3,5;0) to do the forecast
Your forecast for sales
The forecast for the first quarter of 1992 will be computed from:
Sales_{1992-I} − Sales_{1991-I} = −28530.49 − 0.444·(Sales_{1991-II} − Sales_{1990-II}) − 0.286·(Sales_{1990-IV} − Sales_{1989-IV})
The following forecasts will be computed in the same manner
If we want a forecast that combines many models, then we could simply average the forecasts computed using other models (Holt's or Winters) with the forecast coming from this ARMA estimation model
Econometrics and Risk
Management
Value-at-Risk
Monte Carlo simulations
Historical Simulations
Using time series analysis in risk management
Risk management is a huge industry
We need a way to measure risk: how likely it is for a company to lose money due to its exposure to some economic variable
Ex.: exposure to currency rate changes
In order to measure the risk we need to assign probabilities to all the possible events (what could possibly happen); this means knowing the distribution of the economic variable
To do this we look into the past, find a history of the random variable, and extract the distribution
Then we see how much we could lose in, say, 5% of the cases
Understanding VaR
There is an X% probability of suffering a $Y loss or more over the next Z hours/days.
Standard Normal Distribution
If the return R is a normal random variable with mean μ_R and standard deviation σ_R, it can be standardized as z = (R − μ_R) / σ_R, and conversely R = μ_R + z·σ_R; lower-tail probabilities are read from standard normal tables, linking a confidence level to a confidence factor z
[Slide graphics: the past evolution of R is used to estimate the distribution assumed for its future evolution; densities of the normal variable R and of the standard normal variable]
Example values on the slide: μ_R = 0.11%, σ_R = 1.76%
Example - Value-at-Risk (VaR)
There is a 1% probability to suffer a $100,000 loss or more over the next 24 hours.
The bad returns are at the 1% left-tail probability
(Assumption: normally distributed returns)
Risk management
For risk management purposes the
distribution of the future value of the
economic variable to which we are exposed
is very important
Knowledge of this distribution provides us
with a measurement of the risk we are
running
The indicator that shows us the value of the risk is called VaR (value at risk)
VaR is the value that we could lose if the 5% worst events happen
VaR computation
The computation of VaR sometimes needs a more complex technique: the simulation technique
If the distribution of the random variable is not normal, then we may not have a formula for VaR, but we may approximate the changes of the economic variable with a process
To compute the VaR we could simply build simulations of the process, i.e. a lot of possible tracks of the values of the variable
Each track is assumed to have the same probability of showing up (happening) as any other track
We then look at the values produced by the simulations and compute the VaR
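A minimal Monte Carlo VaR sketch in Python; the normal-returns process and the position size are assumptions, and the μ and σ reuse the example figures quoted earlier (0.11% and 1.76%).

```python
import random

position = 1_000_000          # current value of the exposure ($)
mu, sigma = 0.0011, 0.0176    # assumed daily mean and volatility of returns

# simulate many one-day "tracks", each equally likely, and sort the P&L
pnl = sorted(position * random.gauss(mu, sigma) for _ in range(100_000))
var_95 = -pnl[int(0.05 * len(pnl))]   # the loss not exceeded in 95% of tracks
print(f"1-day 95% VaR: ${var_95:,.0f}")
```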