Professional Documents
Culture Documents
BY
MIGUEL A. ARINO
IESE
UNIVERSIDAD DE NAVARRA
Avda. Pearson 21
08034 Barcelona
SPAIN
aarino@iese.es
1. Introduction.
Our purpose in this paper is to present a
methodology for forecasting univariate time series. This
methodology combines standard forecasting techniques with
wavelet methodology described in
this paper. The recently developed wavelet theory has proven to
be a useful tool in the analysis of
some problems in engineering and related fields. However, the
potential of this theory -which will
be outlined in the next section- for analyzing economic problems
has not been fully exploited yet.
This paper presents one of its many possible applications in this
field. As an example we will apply
the methodology to forecast car sales in the Spanish
market and compare the results with those
given by standard forecasting techniques. Roughly
speaking, using wavelets, we decompose a
time series (xt) in its trend (yt) and its seasonal component (zt). The way of decomposing the series
is explained in Arino and Vidakovic (1995) and is
outlined later in this paper. Then we apply
standard forecasting techniques to each component to obtain a
forecast of the original series (x t).
This paper is organized as follows: After this
introduction, a brief summary of wavelets is
presented together with the discrete wavelet
transform (DWT) and the application of wavelet
analysis to decompose a time series into its long-term trend and
its seasonal component. In Section
3 we apply the wavelet analysis described in the
previous section to the data set of monthly car
sales in the Spanish market. The seasonal component
and the long-term trend of the series are
found and each one of them is forecasted ultimately
providing a forecast of the total series. In the
last section we present our conclusions.
2. Wavelets.
In this section we first present wavelets briefly.
Then we review the discrete wavelet transform,
which is the wavelet counterpart to the discrete Fourier
transform. Finally we show the splitting of
a time series into cyclical components by using wavelet analysis.
As in Fourier analysis, there are
continuous and discrete versions of wavelet
analysis. Although for our purposes it is sufficient to
deal with the discrete version, we provide a general
summary of wavelets for decompose
continuous functions, as well. Since we will be
dealing with discrete data sets, our focus will be
on the discrete wavelet transform.
2.1. An introduction to wavelets.
The wavelet series representation of a
square-integrable function f is in some way, similar to the
Fourier series representation. The Fourier series representation
of a function takes as orthonormal
basis for its expansion the simple harmonics {einx}, while for the wavelet representation different
sets of orthonormal bases {jk(x)} satisfying some conditions are used.
f(x) = cn einx
n=-
where fjk are the wavelet coefficients of f with respect to the basis
{jk }, which can be found by
fjk = f(x) jk (x) dx.
-
The most simple example of wavelet basis is the Haar
basis generated by the mother wavelet
defined by
(x) =
1
-1
0
f(x) = yk
. . ,bN-1 ) defined by
N-1
bj = at e-i(2j/N)t
j = 0, . . . , N-1.
t=0
This sequence b is a linear transformation of a, which allows us to look at the data set in
the
frequency domain instead of the time domain.
b is the set of Fourier coefficients of the
discrete
Fourier transform of a. In a similar way, given a wavelet basis
( jk ), the discrete wavelet
transformation of a data vector x of dimension N=2n is another data vector d of the same
dimension as x which allows us to look at the data set x in local and frequency domain instead of
in the time domain. The vector d is the set of coefficients in the wavelet series representation
of x
with respect to the wavelet basis chosen. This
transformation is linear and orthogonal, and can be
described by a NxN orthogonal matrix W (details
about DWT can be found in Nason and
Silverman, 1994; Vidakovic and Muller, 1994; Strang, 1993; or
Arino and Vidakovic, 1995).
Since the elements of d are the wavelet coefficients of x with respect to the basis (jk ), d can be
written as
d = (c00, d00, d10, d11, d20, . . . , dn-1,2n-1-1 ),
where c00 is the coefficient of the scaling function . Apart from c00 there is one 0-level coefficient
d00, two 1-level coefficients d10 and d11, and in general there are 2j j-level coefficients dj0 , dj1 , .
. . and dj,2j-1 . The last level of coefficients is the (n-1)-level.
Finally we define the scalogram of d, which is the wavelet counterpart of the
periodogram in
Fourier analysis. If d is the vector of coefficients of the discrete wavelet
transformation of x (i.e. d
= W x), the energy of d at level j is defined as
2j-1
E(j) = djk 2
k=0
for j = 0, . . . , n-1
120000
100000
80000
60000
40000
20000
1/1993
1/1992
1/1991
1/1990
1/1989
1/1988
1/1987
1/1986
1/1985
1/1984
1/1983
1/1982
1/1981
1/1980
1/1979
1/1978
1/1977
1/1976
1/1975
1/1974
(14.15)
(1)
where operators Bk and k are defined by Bk (xt) = (xt-k) and k = 1 - Bk . In parenthesis are
presented the t-statistics of each coefficient. Monthly forecasts
according to model (1) together with
their standard error and actual sales are presented
in Table 1. The out of the sample root mean
square error (RMSE) is 16,963.9. Figure 2 shows a
graph of the sales for the period 1992-1994
and the forecasts for 1994 with the 95% confidence
interval.
Good references on Box-Jenkins models and standard
forecasting techniques are in Box and
Jenkins (1976), Pankratz (1991) and Granger and
Newbold (1986).
Jan-94
Feb-94
Mar-94
Apr-94
May-94
Jun-94
Jul-94
Aug-94
Sep-94
Oct-94
Nov-94
Dec-94
Forecasts
54260
61128
75276
67910
69004
74912
88028
45174
44534
59786
56174
65158
Stand. Error
7612.32
8087.02
8535.37
8961.31
9367.90
9757.57
10132.26
10493.58
10842.87
11181.25
11509.69
11829.01
120000
100000
80000
60000
40000
20000
Sales
Forecasts
Low-band
07-94
01-94
07-93
01-93
07-92
01-92
Upper-band
Figure 2. Actual car sales for 1992-1994 and forecasts for 1994
with the 95% confidence interval according to model
(1).
18
16
14
12
10
8
6
4
2
0
0
Levels
10
120000
100000
80000
60000
40000
20000
0
-20000
1-93
1-92
1-91
1-90
1-89
1-88
1-87
1-86
1-85
1-84
1-83
1-82
1-81
1-80
1-79
1-78
1-77
-40000
Figure 4. Series x and its long-term (z) and seasonal (y) components from January 1977 until December
1993.
3.2.1. Forecasting y.
y is a long-term trend time series from which
seasonality has been removed. Thus, it is expected
that no seasonal component will appear in the model
specified for y. In fact, the best ARIMA
model for y is an ARIMA(1,3,0):
11
(1 - 0.957 B) 3 zt = at.
(46.63)
(3)
The within the sample root mean square error in this model is
6.66. The coefficient of the model is,
although close, smaller than 1. Since the algorithm
that estimates the parameter of the model
converges we consider this model valid. Except for k
= 6, 12, 18 ..., the autocorrelations function
of the errors r(k) can be considered as 0. For k =
6, 12, 18 ..., r(k) are more problematic. Since
there is no reason to belive that a six-month
seasonality should be present in the series no further
steps will be taken by now to improve the model. An alternative
model will be explored in Section
4. In table 2 we present the forecasts of y for the twelve months of 1994 with their standard errors.
3.2.2. Forecasting z.
z is a time series which presents a seasonality of 12
months as expected. The best ARIMA model
which describes its behavior is an ARIMA(0,1,1)12:
12 zt = (1 - 0.701 B12) at
(13.46)
(4)
The within the sample root mean square error in this model is
7,311. The autocorrelation function
of the errors indicates that they can be considered
as white noise. Forecasts of z for the twelve
months of 1994 together with their standard errors are shown in
Table 2.
This series seems to describe quite well the typical seasonality
of car sales in Spain. The average of
the series is almost zero, indicating that
z captures seasonal deviations of the series
x from its
general trend: months of June and most of July
exhibit over-average car sales while August and
September exhibit under-average sales. Accordingly,
the average of forecasts of z for the twelve
months of 1994 is almost zero also. In addition,
other forecast techniques, like classical
decomposition of time series, offer similar results.
3.2.3. Forecasting x.
Forecasts of x for the twelve months of 1994 are the sum of
forecasts of y and z. They are shown
in Table 2 together with the standard errors and the actual
values of x. It can be seen that the out of
the sample RMSE of the Box-Jenkins forecast is
16,964 while for the wavelet forecast it is
12,895. Moreover, in each month, the standard error of the
wavelet forecast is smaller than that of
the Box-Jenkins forecast. Thus for this particular
example, the wavelet method performs better
than the Box-Jenkins model. In addition, the
forecasts of the Box-Jenkins model underestimates
12
Jan-94
Feb-94
Mar-94
Apr-94
May-94
Jun-94
Jul-94
Aug-94
Sep-94
Oct-94
Nov-94
Dec-94
y Fore.
61355
61154
61411
62194
63567
65590
68322
71819
76135
81320
87424
94494
z Stand. Err.
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
x Fore.
49861
56558
71279
65039
68034
76202
92423
53903
57884
79028
81975
98138
120000
100000
80000
60000
40000
20000
Sales
Forecasts
Low-band
7-94
1-94
7-93
1-93
7-92
1-92
Upper-band
13
4. Further considerations.
An alternative analysis of series y can be made to provide another global forecast for series x. The
ARIMA model for y would be an ARIMA(1,4,0)x(1,0,0)12:
(1 - 0.356 B) (1-0.8 B12) 4 yt = at.
(5.12)
(18.25)
(5)
Jan-94
Feb-94
Mar-94
Apr-94
May-94
Jun-94
Jul-94
Aug-94
Sep-94
Oct-94
Nov-94
Dec-94
y Fore.
61365
61195
61533
62482
64149
66637
70051
74493
80076
86934
95208
105045
z Stand. Err.
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
7310.95
14
x Fore.
49870
56599
71400
65326
68617
77250
94152
56577
61825
84642
89759
108688
References
[1]
[2]
Box, G.E.P. and Jenkins, G.M. (1976). Time Series Analysis: Forecasting and
Control,
rev. ed., San Francisco: Holden Day
15
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
16
Appendix 1
Monthly car sales in the Spanish Market
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
Jan
47760
43948
49761
49976
50636
59374
37905
39502
41873
48602
47643
46744
29330
59510
81012
86803
85485
67998
87950
43150
54977
Feb
47654
47214
42762
59221
50962
43428
45194
39137
43091
42331
39716
37549
41088
68650
81713
97215
84873
70581
87520
60892
64490
Mar
59778
46394
59650
69055
69456
56098
50833
40775
53729
55121
46422
54185
55732
84380
104756
109727
105810
77769
99161
77093
88819
Apr
50254
51488
58365
59554
72807
51679
47518
47581
44875
45002
50404
51643
52388
86095
75912
94817
91076
82803
94201
67525
72145
May
49308
43658
56900
64926
60188
67079
58254
49368
45669
53608
50643
43781
59408
80288
97868
100780
96564
84268
86357
69440
89907
Jun
54737
52136
52200
63592
58801
55960
44363
44997
46931
45945
41744
42720
67537
79878
102419
117883
93946
84648
96992
77285
98017
Jul
61003
61981
63506
65377
53716
54722
51821
48687
53236
49821
56412
66489
82716
110543
106741
113877
107865
102518
111324
90288
104812
5647.5
Aug
37653
40590
30258
37467
41416
43834
35478
30494
30694
36967
36192
33391
44394
59501
77493
75610
75211
60327
51450
49920
65899
Sep
40370
35968
43093
46702
48016
47005
45684
34648
33788
35288
30697
32927
45734
55764
69431
72586
51849
57351
61320
52187
65201
-5690.8
1058.2
Oct
44303
50693
53449
49938
51817
44955
59943
42910
45843
46017
43257
57521
71291
78690
86576
94609
77259
81918
74330
58109
71931
Nov
42969
48517
53330
50059
52226
59949
50357
39933
48932
44190
40266
49047
68449
78445
89089
91543
70702
71273
69783
57913
68761
Dec
39934
49601
56403
46992
43992
36569
46799
47684
47072
47544
38833
59055
70984
86520
96210
93923
66374
72607
88066
71659
93883
822.4 -5091.9
8288.8
Appendix 2
c00
146876.3
d00
-75299.6
d1,k
-14442.6 -129874.1
d2,k
-18870.0 -10021.9 -48167.9 80346.3
d3,k
10154.8 -7099.4
d4,k
3625.9 -12377.5 -1741.4
-25129.9 28434.8 -4927.9
7412.5
7797.3
4673.1
-1782.4
-3022.4
d 5,k
-60.9 9087.8 -5984.8 -14322.4 12416.4
-814.6 -28808.5 17825.1 -9644.1 -4861.6 6637.0
-403.7 -12686.0
6366.5
5320.9 -10420.7 12989.8 -4325.5 -2734.7 20635.3
410.0 -20288.5 25842.7 -2609.5 -18335.2 34033.1
-13653.5 -24782.1 25008.2 -16042.2 -13497.2 18680.0
17
d6,k
989.4
-158.6 4558.0 -2456.6 2954.8
-1614.3 -5807.0 1854.9 -10485.9 3823.8
7068.2
3126.6 -4296.9 6722.0
4550.2
9406.7
2159.7 19700.0 -6882.2 -2649.3
-4632.8 8919.0 -8424.7 2618.5 22040.9
d7,k
1054.2
5156.6
-962.5
7503.7
11923.3
-10851.5
4211.0
-4253.6
1506.8
-3255.6
3456.8
2678.0
-3597.1
7365.7
-8865.5
7457.9
-5393.2
12284.1
6530.7
22508.2
5651.1
5091.6
7034.0
-8824.0
6511.6
7441.3
4549.4
-10277.3
18327.3
-12797.5
-2632.7
18100.2
-1733.1
872.2
3915.2
212.5
7153.4
18086.2
-22925.1
-4943.7
13726.3
-9122.2
9288.9
-2462.0
8794.0
5861.5
25420.1
-19552.1
11139.1
-1639.1
Feb
50176.7
46470.8
49580.2
55593.7
54887.3
52350.7
48068.4
44730.9
43353.6
45857.2
45051.1
43647.1
70913.5
87863.1
93657.6
89511.4
79934.9
77866.4
74631.5
61021.1
61112.7
Mar
52546.5
46609.8
50322.4
55671.2
54728.5
51994.1
47748.8
44483.2
43449.1
46040.1
44775.1
43935.2
72740.4
88739.2
93771.0
88767.3
79357.6
78029.9
73499.5
60968.5
69146.9
Apr
53798.4
46779.0
51077.9
55708.5
54577.8
51632.2
47445.0
44252.2
43582.3
46185.0
44493.2
44288.4
74523.6
89525.1
93799.0
87978.8
78850.6
78187.5
72257.5
61112.5
56956.4
-562.0
14997.0
6159.4
-2559.2
3994.8
10558.8
-3274.8
9544.2
-19942.1
-647.6
10593.7
-5131.8
12105.6
1989.6
-1215.4
-184.6
12045.2
-23569.3
-12634.3
-2807.9
Sep
46095.7
47242.7
54216.7
55545.3
53807.1
49842.4
46083.1
43443.4
44668.1
46165.8
43353.3
47025.9
82276.8
92263.4
92533.0
83677.3
77490.8
78070.6
65309.5
62497.3
Oct
46300.0
47514.9
54662.9
55440.6
53569.5
49470.9
45807.7
43355.0
44919.6
46006.6
43266.6
47809.0
83566.5
92640.9
92048.0
82846.1
77461.4
77692.5
64001.1
63359.5
-13363.9 5524.6
4786.2 3157.3
-7098.7 9892.5
6154.7
-304.5
9515.3 6539.8
19461.4 -20655.3
-15154.1 1988.9
11656.4 -4032.1
-10267.1 13881.6
Apendix 3
a) Series y
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1987
1988
1989
1990
1991
1992
1993
1994
1995
Jan
48200.4
46610.7
48938.1
55470.0
55039.7
52688.1
48399.3
44988.9
43295.7
45644.5
45318.4
43434.6
69071.1
86908.8
93480.3
90218.3
80581.5
77714.1
75626.3
61337.0
57757.6
May
52688.7
46964.1
51807.1
55715.8
54440.2
51272.3
47156.7
44041.7
43750.7
46283.8
44210.4
44699.5
76240.5
90216.3
93727.2
87146.7
78417.0
78320.1
70932.5
61393.5
Jun
50512.7
47093.1
52494.1
55705.9
54308.0
50917.2
46882.2
43855.6
43950.8
46333.3
43941.7
45171.8
77878.8
90824.7
93556.7
86285.3
78061.5
78402.0
69548.9
61824.3
18
Jul
48297.1
47056.3
53115.6
55679.8
54170.9
50567.0
46617.9
43695.9
44177.0
46331.2
43700.8
45710.2
79429.8
91360.4
93291.8
85407.9
77786.9
78405.8
68133.4
62290.0
Aug
46551.3
47069.6
53687.6
55627.7
54009.1
50210.8
46353.7
43558.8
44419.0
46275.1
43500.2
46325.2
80895.2
91835.9
92946.2
84532.7
77596.6
78303.5
66708.3
62463.9
Nov
46519.7
47910.3
55013.0
55318.2
53300.0
49102.4
45529.7
43297.8
45168.4
45804.4
43246.5
48674.6
84763.8
92968.1
91494.2
82046.2
77497.9
77159.9
62853.5
64397.2
Dec
46729.1
48400.1
55281.1
55182.6
53003.5
48742.8
45254.8
43276.5
45411.4
45571.3
43301.3
49627.9
85876.8
93248.9
90882.4
81288.8
77587.3
76470.1
61946.3
62094.8
b) Series z
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
Jan
-440.4
-2662.7
822.9
-5494.0
-4403.8
6685.9
-10494.3
-5486.9
-1422.7
2957.5
2324.6
3309.4
-21339.2
-9561.1
-5896.8
-6677.3
-4733.3
-12583.5
10235.9
-32476.3
-7077.4
-7193.7
Feb
-2522.7
743.2
-6818.2
3627.3
-3925.3
-8922.7
-2874.4
-5593.9
-262.6
-3526.2
-5335.1
-6098.1
-10713.0
-2263.5
-6150.1
3557.4
-4638.4
-9354.0
9653.6
-13739.5
106.8
-3680.5
Mar
7231.4
-215.8
9327.6
13383.8
14727.5
4103.9
3084.2
-3708.2
10279.9
9080.9
1646.9
10249.7
2710.1
11639.6
16016.8
15956.0
17042.7
-1588.6
21131.1
3593.5
14307.7
2433.5
Apr
-3544.4
4709.0
7287.1
3845.5
18229.2
46.8
73.0
3328.8
1292.7
-1183.0
5910.8
7354.6
-1932.9
11571.4
-13613.1
1018.0
3097.2
3952.4
16013.5
-4732.5
6797.7
7258.1
May
-3380.8
-3306.1
5092.9
9210.2
5747.8
15806.7
11097.3
5326.3
1918.3
7324.2
6432.6
-918.5
3716.9
4047.5
7651.7
7052.8
9417.3
5851.0
8036.9
-1492.5
7610.4
Jun
4224.3
5042.9
-294.1
7886.1
4493.0
5042.8
-2519.2
1141.4
2980.2
-388.3
-2197.7
-2451.8
10402.7
1999.2
11594.3
24326.2
7660.7
6586.5
18590.0
7736.1
13087.8
19
Jul
12705.9
14924.7
10390.4
9697.2
-454.9
4155.0
5203.1
4991.1
9059.0
3489.8
12711.2
20778.8
24065.6
31113.2
15380.6
20585.2
22457.1
24731.1
32918.2
22154.6
25738.4
Aug
-8898.3
-6479.6
-23429.6
-18160.7
-12593.1
-6376.8
-10875.8
-13064.8
-13725.0
-9308.1
-7308.2
-12934.2
-15847.8
-21394.2
-14342.9
-17336.2
-9321.7
-17269.6
-26853.5
-16788.3
-17289.5
Sep
Oct
Nov
Dec
-5725.7 -1997.0 -3550.7 -6795.1
-11274.7 3178.1
606.7 1200.9
-11123.7 -1213.9 -1683.0 1121.9
-8843.3 -5502.6 -5259.2 -8190.6
-5791.1 -1752.5 -1074.0 -9011.5
-2837.4 -4515.9 10846.6 -12173.8
-399.1 14135.3 4827.3
1544.2
-8795.4
-445.0 -3364.8 4407.5
-10880.1
923.4 3763.6
1660.6
-10877.8
10.3 -1614.4 1972.7
-12656.3
-9.6 -2980.5 -4468.3
-14098.9 9712.0
372.4 9427.1
-16174.1 7653.8
3032.5
3750.5
-26512.8 -4876.5 -6318.8
643.2
-22832.4 -6064.9 -3879.1 2961.1
-19947.0 2561.0
48.8
3040.6
-31828.3 -5587.1 -11344.2 -14914.8
-20139.8 4456.6 -6224.9 -4980.3
-16750.6 -3362.5 -7376.9 11595.9
-13122.5 -5892.1 -4940.6 9712.7
-17962.8 -3573.3 -8223.4 3063.6