Population Association of America, Springer Demography

Modeling and Forecasting Populations by Time Series: The Swedish Case
Author(s): Joao L. M. Saboia

Source: Demography, Vol. 11, No. 3 (Aug., 1974), pp. 483-492
Published by: Springer on behalf of the Population Association of America
Stable URL: https://www.jstor.org/stable/2060440
DEMOGRAPHY, Volume 11, Number 3, August 1974
MODELING AND FORECASTING POPULATIONS BY TIME SERIES: THE SWEDISH CASE
Joao L. M. Saboia
Operations Research Center, University of California, Berkeley, California 94720
Abstract: Time series analysis techniques are used to model and to forecast populations. An autoregressive (AR) and a moving average (MA) model, which seem to fit the population of Sweden very well, are found. Forecasts are calculated using both models and are compared with the forecasts obtained by other methods. This comparison is very favorable for the time series models. Although our study is confined to the mid-year population of Sweden, there are good reasons to expect that the technique can be successfully applied to other population parameters.
INTRODUCTION

It is a common practice among demographers (e.g., Keyfitz and Flieger, 1971) to project populations assuming that the age-specific birth and death rates of a given period will persist unchanged over the succeeding periods and that the population is closed to migration. These projections serve as a baseline from which other predictions can be evaluated (Keyfitz, 1972). Keyfitz also suggests various techniques potentially useful for population forecasts, among which is the time series analysis technique of Box and Jenkins (1970).

In this paper we apply Box and Jenkins' results to population forecasts. It was decided to study Sweden due to the fact that the technique requires a large amount of data in order to be successful, and this country has population data available since the eighteenth century (Gille, 1949). Keyfitz and Flieger (1971) have the mid-year population for Sweden at five-year intervals since 1780. The data used in this paper are the same as in Keyfitz and Flieger (1971) with the addition of the data for 1970, which were obtained from the Demographic Yearbook (United Nations, 1971). We could have worked with a smaller interval for the population data, but using the data mentioned above facilitates the comparison of our results with the projections assuming constant age-specific birth and death rates, which are already calculated for five-year intervals in Keyfitz and Flieger (1971).

To make the paper more self-contained, a very brief summary of the time series analysis technique is provided in the Appendix.

The calculations in this paper were made with the help of an electronic computer and used more decimals than are shown here.

MODEL IDENTIFICATION

Column 3 of Table 1 shows the Swedish population from 1780 to 1970 at five-year intervals. The two models that will be derived below are based on the population up to 1960.

The general ARIMA (p, d, q) model is described by

$$\phi(B)\nabla^d z_t = \theta_0 + \theta(B)a_t, \tag{1}$$
TABLE 1.-Population, Projections and Forecasts for Sweden (At mid-year and in millions)

                      Population Projection,  Population Forecast,   Population Forecast,
                      Fixed Rates,            ARIMA(1,1,0)           ARIMA(0,2,1)
            Population No Migration
 t   Year   z_t       ẑ_{t-1}(1) ẑ_{t-2}(2)  ẑ_{t-1}(1) ẑ_{t-2}(2)  ẑ_{t-1}(1) ẑ_{t-2}(2)
 1   1780   2.104
 2   1785   2.147     2.197
 3   1790   2.161     2.193      2.292       2.238                  2.197
 4   1795   2.274     2.219      2.238       2.237      2.357       2.199      2.248
 5   1800   2.352     2.374      2.272       2.403      2.347       2.337      2.237
 6   1805   2.418     2.414      2.470       2.462      2.542       2.420      2.401
 7   1810   2.380     2.497      2.478       2.522      2.591       2.485      2.488
 8   1815   2.450     2.371      2.579       2.428      2.647       2.411      2.553
 9   1820   2.573     2.553      2.379       2.556      2.552       2.494      2.443
10   1825   2.749     2.692      2.657       2.708      2.682       2.644      2.539
11   1830   2.876     2.931      2.806       2.912      2.849       2.855      2.715
12   1835   3.004     2.980      3.113       3.013      3.069       2.989      2.962
13   1840   3.123     3.166      3.090       3.141      3.156       3.122      3.102
14   1845   3.296     3.267      3.341       3.255      3.284       3.241      3.240
15   1850   3.462     3.471      3.437       3.458      3.396       3.432      3.360
16   1855   3.625     3.662      3.668       3.620      3.614       3.608      3.569
17   1860   3.824     3.795      3.880       3.781      3.774       3.777      3.755
18   1865   4.092     4.094      3.971       4.000      3.935       3.992      3.929
19   1870   4.164     4.356      4.354       4.305      4.163       4.293      4.160
20   1875   4.362     4.363      4.617       4.271      4.489       4.321      4.495
21   1880   4.572     4.621      4.568       4.537      4.398       4.533      4.479
22   1885   4.664     4.856      4.895       4.754      4.701       4.756      4.705
23   1890   4.780     4.962      5.164       4.782      4.921       4.817      4.940
24   1895   4.896     5.053      5.277       4.911      4.914       4.920      4.970
25   1900   5.117     5.175      5.334       5.027      5.050       5.028      5.061
26   1905   5.278     5.401      5.476       5.305      5.166       5.279      5.160
27   1910   5.499     5.575      5.707       5.433      5.475       5.439      5.441
28   1915   5.696     5.813      5.902       5.687      5.586       5.680      5.601
29   1920   5.876     5.942      6.151       5.871      5.857       5.882      5.862
30   1925   6.045     6.088      6.218       6.041      6.034       6.060      6.069
31   1930   6.131     6.225      6.314       6.204      6.200       6.224      6.245
32   1935   6.242     6.232      6.414       6.246      6.360       6.278      6.403
33   1940   6.356     6.313      6.331       6.370      6.376       6.377      6.426
34   1945   6.636     6.490      6.367       6.486      6.508       6.484      6.512
35   1950   7.017     6.909      6.593       6.855      6.625       6.815      6.612
36   1955   7.262     7.225      7.128       7.291      7.043       7.264      6.995
37   1960   7.480     7.427      7.379       7.463      7.508       7.508      7.512
38   1965   7.734     7.636      7.568       7.666      7.640       7.716      7.755
39   1970   8.046     7.962      7.791       7.939      7.835       7.988      7.953

Sources: Population data (column 3) were obtained from Keyfitz and Flieger, 1971; United Nations, 1948-1971. Population projections (columns 4 and 5) were obtained from Keyfitz and Flieger, 1971. Population forecasts (columns 6 to 9) were calculated using Equations (22) to (25), respectively.
where

$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p; \tag{2}$$

$$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q; \tag{3}$$

$$\nabla z_t = z_t - z_{t-1}; \quad \nabla^d z_t = \nabla(\nabla^{d-1} z_t); \tag{4}$$

$$B z_t = z_{t-1}; \quad B^m z_t = z_{t-m}; \tag{5}$$

$z_t$ = population at time $t$;
$a_t$ = independent random variables normally distributed with mean zero and variance $\sigma_a^2$ (white noise);
$\theta_0$ = overall moving average constant.

Table 2 shows the estimated mean $\bar{z}$, variance $\hat{\sigma}_z^2$, and the first ten autocorrelations ($r_k$'s) and partial autocorrelations ($\hat{\phi}_{kk}$'s) for the original time series $z_t$, as well as for the first two differences $\nabla z_t$ and $\nabla^2 z_t$. (See Appendix for definition of $\bar{z}$, $\hat{\sigma}_z^2$, $r_k$ and $\hat{\phi}_{kk}$.)

TABLE 2.-Mean, Variance, Autocorrelations and Partial Autocorrelations for z_t, ∇z_t and ∇²z_t

              z_t        ∇z_t       ∇²z_t
Mean          4.29684    .14933     .00500
Variance      2.70079    .00634     .00581
r_1            .920       .517      -.271
r_2            .839       .249      -.134
r_3            .757       .101      -.129
r_4            .683       .077       .000
r_5            .613       .096       .017
r_6            .540       .047      -.086
r_7            .464       .133       .127
r_8            .384       .102      -.028
r_9            .305       .118       .042
r_10           .229       .065       .095
φ̂_11           .920       .517      -.271
φ̂_22          -.049      -.024      -.224
φ̂_33          -.045      -.025      -.267
φ̂_44           .000       .054      -.202
φ̂_55          -.021       .057      -.169
φ̂_66          -.057      -.044      -.281
φ̂_77          -.073       .156      -.111
φ̂_88          -.069      -.031      -.169
φ̂_99          -.050       .063      -.105
φ̂_10,10       -.043      -.031       .080

Sources: Mean, Variance, Autocorrelations, and Partial Autocorrelations were calculated according to Equations (A.13), (A.14), (A.18) and (A.19) of the Appendix.

As expected, it is found that in the original population time series $z_t$ the coefficients $r_k$ decay approximately linearly, which implies that the series is nonstationary (Box and Jenkins, 1970, pp. 174-175).

Using the technique suggested by Box and Jenkins (1970, Chapter 6), we can identify two possible time series models: an autoregressive process of order 1 for the first differences, ARIMA (1, 1, 0), and a moving average process of order 1 for the second differences, ARIMA (0, 2, 1). (See Appendix for more details of the identification procedure.)

THE AUTOREGRESSIVE MODEL ARIMA(1, 1, 0)

This model is defined by

$$(1 - \phi B)\tilde{w}_t = a_t, \tag{6}$$

where

$$\tilde{w}_t = w_t - \bar{w} = \nabla z_t - \bar{w}, \tag{7}$$

$$\bar{w} = \sum_{t=1}^{N-1} w_t / (N - 1), \tag{8}$$

and $N$ = number of data in the original series $z_t$.

The maximum likelihood estimate $\hat{\phi}$ was calculated using the procedure described in Box and Jenkins (1970, Appendix 7.5):

$$\hat{\phi} = .540. \tag{9}$$

From Table 2, the mean for the time series $w_t$ is given by

$$\bar{w} = .149. \tag{10}$$

From Box and Jenkins (1970, Table 6.6) we can estimate the standard error for $\bar{w}$. Its value is given by

$$\hat{\sigma}(\bar{w}) \approx .024. \tag{11}$$

Comparing (10) and (11) we see that $\bar{w}$ should be considered nonzero. Finally, the residual variance can be estimated from the sum of squared residuals as in Box and Jenkins (1970, Chapter 7). Thus we have

$$\hat{\sigma}_a^2 = .00442. \tag{12}$$

From (6), (7), (9) and (10) the ARIMA (1, 1, 0) model is given by

$$z_t = 1.540 z_{t-1} - .540 z_{t-2} + .069 + a_t. \tag{13}$$

THE MOVING AVERAGE MODEL ARIMA(0, 2, 1)

This model is defined by

$$\tilde{w}_t = (1 - \theta_1 B) a_t, \tag{14}$$
where

$$\tilde{w}_t = w_t - \bar{w} = \nabla^2 z_t - \bar{w}, \tag{15}$$

and

$$\bar{w} = \sum_{t=1}^{N-2} w_t / (N - 2). \tag{16}$$

The least squares estimate $\hat{\theta}_1$ was determined using the procedure of Box and Jenkins (1970, Section 7.2):

$$\hat{\theta}_1 = .663. \tag{17}$$

From Table 2, the mean for the time series $w_t$ is given by

$$\bar{w} = .005. \tag{18}$$

As mentioned before, the standard error for $\bar{w}$ can be estimated from Table 6.6 of Box and Jenkins (1970):

$$\hat{\sigma}(\bar{w}) \approx .009. \tag{19}$$

Comparing (18) and (19) we see that it is reasonable to assume $\bar{w} = 0$. In the same way as in the ARIMA (1, 1, 0) process, we can obtain an estimate for the residual variance:

$$\hat{\sigma}_a^2 = .00463. \tag{20}$$

Finally, from (14), (15) and (17) the ARIMA (0, 2, 1) model is given by

$$z_t = 2 z_{t-1} - z_{t-2} + a_t - .663 a_{t-1}. \tag{21}$$

Table 3 summarizes the two models ARIMA (1, 1, 0) and ARIMA (0, 2, 1).

FORECASTS

Box and Jenkins (1970) show that the minimum mean square error forecast $\hat{z}_t(l)$ at origin $t$ for lead time $l$ is the conditional expectation of $z_{t+l}$ given knowledge of all $z$'s up to time $t$ (see Appendix).

For the ARIMA (1, 1, 0) model the corresponding forecasts at time $t$ and $l = 1, 2$ are given by

$$\hat{z}_t(1) = 1.540 z_t - .540 z_{t-1} + .069 \tag{22}$$

and

$$\hat{z}_t(2) = 1.540 \hat{z}_t(1) - .540 z_t + .069, \tag{23}$$

respectively, and for the ARIMA (0, 2, 1) model

$$\hat{z}_t(1) = 2 z_t - z_{t-1} - .663 a_t \tag{24}$$

and

$$\hat{z}_t(2) = 2 \hat{z}_t(1) - z_t \tag{25}$$

respectively.

Notice that while we only need the $z_t$'s in order to calculate the forecasts for the ARIMA (1, 1, 0) model, we also need the $a_t$'s for the ARIMA (0, 2, 1) forecasts. Estimates for the $a_t$'s can be determined as being the expectation of the $a_t$'s conditional on $\hat{\theta}_1$ and the $z_t$'s (Box and Jenkins, 1970, Chapter 7).

Table 1 shows the Swedish population from 1780 to 1970 with forecasts five and ten years ahead ($l = 1, 2$) for the two models ARIMA (1, 1, 0) and ARIMA (0, 2, 1), as well as the corresponding projections of Keyfitz and Flieger (1971).

As mentioned in the Appendix, assuming that the $a_t$'s are normally distributed, it follows that, given the information up to time $t$, the conditional probability distribution of the future population $z_{t+l}$
TABLE 3.-Summary of Fitted Models

Type of Model    Fitted Model                                         Residual Variance
ARIMA(1,1,0)     z_t = 1.540 z_{t-1} - .540 z_{t-2} + .069 + a_t      .00442
ARIMA(0,2,1)     z_t = 2 z_{t-1} - z_{t-2} + a_t - .663 a_{t-1}       .00463

Sources: The ARIMA (1,1,0) model was obtained from Equations (12) and (13). The ARIMA (0,2,1) model was obtained from Equations (20) and (21).
is normal with mean $\hat{z}_t(l)$ and variance $V(l)$. Thus, confidence intervals can be easily obtained (see Box and Jenkins, 1970, pp. 135-137).

Table 4 shows the forecasts for 1965 and 1970, together with 50 and 95 percent limits, for the ARIMA (1, 1, 0) and ARIMA (0, 2, 1) models, based on the population of Sweden up to 1960. The order of magnitude of the standard deviations for both models is the same. It is interesting to notice that for the ARIMA (1, 1, 0) model $z_{38}$ and $z_{39}$ lie outside the 50 percent limits but inside the 95 percent limits. For the ARIMA (0, 2, 1) model, $z_{38}$ lies inside the 50 percent limits, and $z_{39}$ lies inside the 95 percent limits.

If the forecasts for the Swedish population for 1975 and 1980 are desired, the models can be adjusted, incorporating the populations of 1965 and 1970. In this case the ARIMA (1, 1, 0) model is given by

$$z_t = 1.604 z_{t-1} - .604 z_{t-2} + .062 + a_t \tag{26}$$

with an estimated residual variance of

$$\hat{\sigma}_a^2 = .00455. \tag{27}$$

And the adjusted ARIMA (0, 2, 1) model is given by

$$z_t = 2 z_{t-1} - z_{t-2} + a_t - .628 a_{t-1} \tag{28}$$

with an estimated residual variance of

$$\hat{\sigma}_a^2 = .00452. \tag{29}$$

With the adjusted models the forecasts for 1975 and 1980 can be calculated using equations similar to (22), (23), (24) and (25).

The resulting forecasts for 1975 and 1980 for both models, together with 50 percent and 95 percent limits, are summarized in Table 5.

COMPARISON BETWEEN TIME SERIES FORECASTS AND OTHER POPULATION PROJECTIONS

In this section we compare the results obtained in this paper with the projections assuming fixed age-specific birth and death rates and no migration (Keyfitz and Flieger, 1971), the United Nations projections (United Nations, 1966), and also with the results obtained from the logistic function (Pearl and Reed, 1920).

The logistic curve, $N(t) = a/[1 + b \exp(-ct)]$, was fitted to the population data of Sweden from 1780 to 1960. The method employed was the least squares of Keyfitz (1968, Section 9.2), and we obtained

$$N(t) = 11.988/[1 + 5.484 \exp(-.0118 t)]. \tag{30}$$
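As a rough illustration of how Equation (30) produces the logistic forecasts, the sketch below evaluates the fitted curve for a given calendar year. The paper does not state the time origin of $t$; the code assumes that $t$ is the number of calendar years elapsed since 1780, and the names `logistic_population` and `ORIGIN_YEAR` are illustrative only. Under that assumption, and with the rounded coefficients printed in (30), the values agree with the Logistic column of Table 6 to about two decimal places.

```python
import math

# Fitted logistic parameters as printed in Equation (30).
A, B_COEF, C = 11.988, 5.484, 0.0118

# ASSUMPTION (not stated in the paper): t counts calendar years since 1780,
# the first observation of the series.
ORIGIN_YEAR = 1780


def logistic_population(year: int) -> float:
    """Population in millions from N(t) = a / (1 + b * exp(-c * t))."""
    t = year - ORIGIN_YEAR
    return A / (1.0 + B_COEF * math.exp(-C * t))


if __name__ == "__main__":
    for year in (1965, 1970):
        print(year, round(logistic_population(year), 3))
```

The small remaining gap to the published figures is consistent with the paper's remark that its calculations used more decimals than are shown.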
TABLE 4.-Forecasts for the 1965 and 1970 Populations of Sweden with 50 Percent and 95 Percent Limits (In millions)

                                          Population Forecast
Model           Year   Population (z_t)   With 50 Percent Limit   With 95 Percent Limit
ARIMA(1,1,0)    1965   7.734              7.666 ± .045            7.666 ± .130
ARIMA(0,2,1)    1965   7.734              7.716 ± .046            7.716 ± .133
ARIMA(1,1,0)    1970   8.046              7.835 ± .082            7.835 ± .239
ARIMA(0,2,1)    1970   8.046              7.953 ± .077            7.953 ± .223

Sources: Population data (column 3) were obtained from United Nations, Demographic Yearbook, 1966, 1971. Population forecasts were obtained from Equations (22) to (25). Confidence limits were obtained from Equation (5.2.6) of Box and Jenkins, 1970.
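The limits in Table 4 can be reproduced, at least approximately, from the fitted parameters alone. The sketch below assumes the standard result that the forecast variance is $V(l) = \sigma_a^2 (1 + \psi_1^2 + \cdots + \psi_{l-1}^2)$, where the $\psi_j$ are the weights of the model written as an infinite moving average. For ARIMA (1, 1, 0) the weights satisfy $\psi_j = (1 + \phi)\psi_{j-1} - \phi\psi_{j-2}$, and for ARIMA (0, 2, 1) they are $\psi_j = 1 + j(1 - \theta_1)$. The function names are illustrative and do not come from the paper.

```python
import math
from statistics import NormalDist


def psi_weights_arima110(phi: float, n: int) -> list:
    """psi_0 .. psi_{n-1} for (1 - phi*B)(1 - B) z_t = a_t."""
    psi = [1.0]
    for j in range(1, n):
        prev2 = psi[j - 2] if j >= 2 else 0.0
        psi.append((1.0 + phi) * psi[j - 1] - phi * prev2)
    return psi


def psi_weights_arima021(theta: float, n: int) -> list:
    """psi_0 .. psi_{n-1} for (1 - B)^2 z_t = (1 - theta*B) a_t."""
    return [1.0 + j * (1.0 - theta) for j in range(n)]


def forecast_limit(sigma2: float, psi: list, lead: int, coverage: float) -> float:
    """Half-width of the forecast interval at lead time `lead`."""
    variance = sigma2 * sum(w * w for w in psi[:lead])
    quantile = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return quantile * math.sqrt(variance)


if __name__ == "__main__":
    ar = psi_weights_arima110(0.540, 2)   # phi from Equation (9)
    ma = psi_weights_arima021(0.663, 2)   # theta_1 from Equation (17)
    for name, sigma2, psi in (("ARIMA(1,1,0)", 0.00442, ar),
                              ("ARIMA(0,2,1)", 0.00463, ma)):
        for lead in (1, 2):
            print(name, lead,
                  round(forecast_limit(sigma2, psi, lead, 0.50), 3),
                  round(forecast_limit(sigma2, psi, lead, 0.95), 3))
```

With the residual variances of Equations (12) and (20), the rounded half-widths match the eight limits printed in Table 4.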
TABLE 5.-Forecasts for the 1975 and 1980 Population of Sweden with 50 Percent and 95 Percent Limits (In millions)

                       Population Forecast
Model           Year   With 50 Percent Limit   With 95 Percent Limit
ARIMA(1,1,0)    1975   8.296 ± .045            8.296 ± .132
ARIMA(0,2,1)    1975   8.316 ± .045            8.316 ± .132
ARIMA(1,1,0)    1980   8.510 ± .086            8.510 ± .250
ARIMA(0,2,1)    1980   8.586 ± .077            8.586 ± .224

Sources: Population forecasts and confidence limits are the same as in Table 4.
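The point forecasts in Table 5 follow from the adjusted models (26) and (28) by the same recursions as Equations (22) to (25). The sketch below is a minimal illustration with hypothetical function names. For ARIMA (0, 2, 1) the last residual $a_t$ is needed, and the paper does not print it; the value 0.0669 used here is an assumption, back-solved from the published 1975 forecast rather than estimated from the data, so only the step from the 1975 to the 1980 forecast is an independent check for that model.

```python
def forecast_arima110(z_t, z_prev, phi, const, steps):
    """Forecasts from z_t = (1 + phi) z_{t-1} - phi z_{t-2} + const + a_t."""
    history = [z_prev, z_t]
    for _ in range(steps):
        history.append((1.0 + phi) * history[-1] - phi * history[-2] + const)
    return history[2:]


def forecast_arima021(z_t, z_prev, theta, a_t, steps):
    """Forecasts from z_t = 2 z_{t-1} - z_{t-2} + a_t - theta a_{t-1}.

    `steps` must be at least 1; `a_t` is the residual at the forecast origin.
    """
    history = [z_t, 2.0 * z_t - z_prev - theta * a_t]
    for _ in range(steps - 1):
        history.append(2.0 * history[-1] - history[-2])
    return history[1:]


if __name__ == "__main__":
    z_1965, z_1970 = 7.734, 8.046  # Table 1, column 3
    print([round(v, 3) for v in
           forecast_arima110(z_1970, z_1965, 0.604, 0.062, 2)])
    # ASSUMPTION: residual back-solved from the printed forecast 8.316.
    implied_a_1970 = 0.0669
    print([round(v, 3) for v in
           forecast_arima021(z_1970, z_1965, 0.628, implied_a_1970, 2)])
```

For the ARIMA (1, 1, 0) model the rounded results are 8.296 and 8.510, matching Table 5 without any extra input.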
The sum of the squares (ss) of the departures of the fitted from the observed populations (in millions squared) was ss = .4676.

Table 6 shows the forecasts for the mid-year population of Sweden for 1965 and 1970, based on the population up to 1960, using various methods. As we see, all the procedures underestimated the Swedish population in 1965 and 1970. The best forecasts were obtained by the ARIMA (0, 2, 1) model, followed by the United Nations projections and the ARIMA (1, 1, 0) model. The results obtained assuming fixed age-specific birth and death rates and no migration, although inferior to the time series forecasts, are much better than the results obtained by the logistic function. Notice that the forecast for 1965 using the logistic curve is smaller than the Swedish population in 1960.

According to Keyfitz (1972), the quality of a prediction can be measured by the formula:

$$\text{Quality} = \frac{\text{Prediction} - \text{Projection at fixed rates}}{\text{Realization} - \text{Projection at fixed rates}} \tag{31}$$

Using Formula (31) we can construct Table 7.

If one of the time series models were to be chosen, it should be the ARIMA (0, 2, 1). The first reason is that it is more reasonable to have second differences ($\nabla^2 z_t$) for human populations stationary than first differences ($\nabla z_t$). The second is that the forecasts for 1965 and 1970 for the ARIMA (0, 2, 1) model are better than the corresponding forecasts for the ARIMA (1, 1, 0) model (see Tables 6 and 7).
TABLE 6.-Mid-Year Population and Forecasts for Sweden for 1965 and 1970 Based on the Population up to 1960 (In millions)

                       Population Forecast
       Mid-year                                    Fixed Rates
Year   Population   ARIMA(1,1,0)   ARIMA(0,2,1)   No Migration   U.N.    Logistic
1965   7.734        7.666          7.716          7.636          7.700   7.413
1970   8.046        7.835          7.953          7.791          7.920   7.579

Sources: Population data were obtained from United Nations, Demographic Yearbook, 1966, 1971. ARIMA(1,1,0) and ARIMA(0,2,1) forecasts were calculated according to Equations (22) to (25). Projections with fixed rates and no migration were obtained from Keyfitz and Flieger, 1971. U.N. projections were obtained from United Nations, 1966. Logistic projections were calculated using Equation (30).
TABLE 7.-Quality of Predictions as Measured by Formula (31)

                                      Constant Rates
Year   ARIMA(1,1,0)   ARIMA(0,2,1)   No Migration     U.N.   Logistic
1965   .31            .82            .0               .65    -2.28
1970   .17            .64            .0               .51    -.83

Source: The quality of the various predictions is measured by Formula (31).
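Table 7 can be recomputed directly from Table 6 with Formula (31). The sketch below is one way to do so; the dictionary layout and the function name `quality` are illustrative choices, not part of the paper. A value of 1 means the prediction hit the realized population, 0 means it did no better than the fixed-rate projection, and a negative value means it did worse.

```python
# Forecasts from Table 6 (millions): realized population, the fixed-rate
# baseline, and the competing predictions for each target year.
TABLE_6 = {
    1965: {"realization": 7.734, "fixed_rates": 7.636,
           "ARIMA(1,1,0)": 7.666, "ARIMA(0,2,1)": 7.716,
           "U.N.": 7.700, "Logistic": 7.413},
    1970: {"realization": 8.046, "fixed_rates": 7.791,
           "ARIMA(1,1,0)": 7.835, "ARIMA(0,2,1)": 7.953,
           "U.N.": 7.920, "Logistic": 7.579},
}


def quality(prediction: float, realization: float, baseline: float) -> float:
    """Formula (31); undefined when the baseline equals the realization."""
    return (prediction - baseline) / (realization - baseline)


if __name__ == "__main__":
    for year, row in TABLE_6.items():
        for method in ("ARIMA(1,1,0)", "ARIMA(0,2,1)", "U.N.", "Logistic"):
            q = quality(row[method], row["realization"], row["fixed_rates"])
            print(year, method, round(q, 2))
```

Rounded to two decimals, the eight values agree with Table 7.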
CONCLUSIONS

Although the results obtained here are encouraging, we should be aware of the limitation of the application of time series analysis techniques to population forecasting. This is due to the fact that in order to have good estimates for the parameters of the model we should use at least 50 observations of the time series (Box and Jenkins, 1970). Even Sweden, which is the country with the best demographic data, has not such a long demographic history if five-year intervals are to be taken. Even taking one-year intervals, most countries do not have population data for 50 consecutive years. Can we reduce this number and still find a useful model? This can only be answered by carefully applying the time series technique to other countries and by comparing the forecasts obtained with future populations.

Other demographic parameters, such as number of births and deaths (or birth and death rates), reproductive value, life expectancy, infant mortality, dependency rate, etc., might also be studied by similar time series models, so that future trends could be predicted. This was indeed done for U. S. births, and an ARIMA (0, 1, 2) model was found. Details of this model can be obtained from the author upon request.

Further research is needed to understand the meaning of the parameters of the time series population models. Also it would be interesting to relate such parameters to the classical population models.

The main advantage of the time series models over most of the other usual ways of projecting populations is that the time series forecasts can be given by normal distributions, and confidence intervals can be easily obtained. Thus, following the recommendation of Keyfitz (1972), we have a forecast given by a probability distribution and not simply by a number.

APPENDIX

Mathematically, a time series is a sequence of observations at equispaced intervals of time. Let $z_t$ be the observation at time $t$, where $t \in \{1, 2, \ldots\}$.

The backward shift operator $B$ is defined by $B z_t = z_{t-1}$, and in general $B^m z_t = z_{t-m}$. Another operator used is the backward difference operator $\nabla$, which is defined by $\nabla z_t = z_t - z_{t-1}$. In general $\nabla^d z_t = \nabla(\nabla^{d-1} z_t)$.

The Box-Jenkins models are based on the idea (Yule, 1927) that a time series in which successive values are highly dependent can be regarded as being generated from a series of independent shocks $a_t$. The shocks used here have a normal distribution with mean zero and variance $\sigma_a^2$. Such a sequence of random variables $a_t$ is called a white noise process.

Stationary Models

A stationary process $z_t$ is one in which the joint probability associated with $m$ observations $z_{t_1}, z_{t_2}, \ldots, z_{t_m}$ made at any set of times $t_1, t_2, \ldots, t_m$ is the same as that associated with $m$ observations $z_{t_1+k}, z_{t_2+k}, \ldots, z_{t_m+k}$ made at times $t_1 + k, t_2 + k, \ldots, t_m + k$.
Let $\mu$ be the mean about which the process varies, and let

$$\tilde{z}_t = z_t - \mu. \tag{A.1}$$

The mixed autoregressive-moving average model of order (p, q), ARMA (p, q), is defined by the relation

$$\tilde{z}_t = \phi_1 \tilde{z}_{t-1} + \cdots + \phi_p \tilde{z}_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q} \tag{A.2}$$

or

$$\phi(B)\tilde{z}_t = \theta(B) a_t, \tag{A.3}$$

where

$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \tag{A.4}$$

and

$$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q. \tag{A.5}$$

If $q = 0$, it becomes the autoregressive model of order $p$, AR(p), defined by

$$\phi(B)\tilde{z}_t = a_t. \tag{A.6}$$

And if $p = 0$, it becomes the moving average model of order $q$, MA(q), defined by

$$\tilde{z}_t = \theta(B) a_t. \tag{A.7}$$

Nonstationary Models

Many series encountered in the real life exhibit nonstationary behavior and do not vary about a fixed mean. This is the case of the population series we used in this paper. Such series may become stationary after taking differences in level. In general it is necessary to take the $d$'th difference of the time series in order to have a stationary process. And the autoregressive integrated moving average model of order (p, d, q), ARIMA (p, d, q), is defined by

$$\phi(B)\tilde{w}_t = \theta(B) a_t, \tag{A.8}$$

where

$$w_t = \nabla^d z_t, \tag{A.9}$$

$$\tilde{w}_t = w_t - \mu_w, \tag{A.10}$$

and $\mu_w$ = the mean about which the $d$'th difference of $z_t$ varies.

Model Identification and Estimation

For a stationary process let $p(z)$ be the probability distribution of $z_t$ at an arbitrary time $t$.

The mean of the process, which measures the level about which it fluctuates, is defined by

$$\mu = E(z_t) = \int_{-\infty}^{\infty} z\, p(z)\, dz. \tag{A.11}$$

And the variance, which measures its spread, is defined by

$$\sigma_z^2 = E[(z_t - \mu)^2] = \int_{-\infty}^{\infty} (z - \mu)^2 p(z)\, dz. \tag{A.12}$$

Having $N$ observations of a process, it is possible to estimate its mean and variance, respectively, by

$$\bar{z} = (1/N) \sum_{t=1}^{N} z_t \tag{A.13}$$

and

$$\hat{\sigma}_z^2 = (1/N) \sum_{t=1}^{N} (z_t - \bar{z})^2. \tag{A.14}$$

The autocovariance at lag $k$ is defined by

$$\gamma_k = \operatorname{cov}(z_t, z_{t+k}) = E[(z_t - \mu)(z_{t+k} - \mu)], \quad k = 0, 1, 2, \ldots \tag{A.15}$$

And the autocorrelation at lag $k$ is defined by

$$\rho_k = \gamma_k / \gamma_0. \tag{A.16}$$

An estimate for the autocovariance is

$$c_k = (1/N) \sum_{t=1}^{N-k} (z_t - \bar{z})(z_{t+k} - \bar{z}), \quad k = 0, 1, 2, \ldots, n, \tag{A.17}$$

where $n$ should not be larger than $N/4$.
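The estimates (A.13), (A.14) and (A.17), together with the differencing step (A.9), are simple enough to sketch in a few lines. The code below is an illustration only, with hypothetical function names; it uses the 37 observations for 1780 to 1960 from column 3 of Table 1 and forms the sample autocorrelation as the ratio $c_k/c_0$, the sample analogue of (A.16).

```python
# Mid-year population of Sweden (millions), 1780-1960 at five-year steps,
# copied from column 3 of Table 1.
SWEDEN = [
    2.104, 2.147, 2.161, 2.274, 2.352, 2.418, 2.380, 2.450, 2.573, 2.749,
    2.876, 3.004, 3.123, 3.296, 3.462, 3.625, 3.824, 4.092, 4.164, 4.362,
    4.572, 4.664, 4.780, 4.896, 5.117, 5.278, 5.499, 5.696, 5.876, 6.045,
    6.131, 6.242, 6.356, 6.636, 7.017, 7.262, 7.480,
]


def difference(series, d=1):
    """Apply the backward difference operator d times, as in (A.9)."""
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def sample_mean(series):
    """Estimate (A.13)."""
    return sum(series) / len(series)


def autocovariance(series, k):
    """Estimate (A.17): c_k with the 1/N divisor; c_0 is estimate (A.14)."""
    n = len(series)
    mean = sample_mean(series)
    return sum((series[t] - mean) * (series[t + k] - mean)
               for t in range(n - k)) / n


def autocorrelation(series, k):
    """Ratio c_k / c_0, the sample analogue of (A.16)."""
    return autocovariance(series, k) / autocovariance(series, 0)


if __name__ == "__main__":
    for d in (0, 1, 2):
        w = difference(SWEDEN, d)
        print(d, round(sample_mean(w), 5), round(autocovariance(w, 0), 5),
              [round(autocorrelation(w, k), 3) for k in range(1, 4)])
```

The three sample means come out as 4.29684, .14933 and .00500, the first row of Table 2; the remaining printed values can be compared with the rest of that table.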
An estimate for the $k$'th lag autocorrelation $\rho_k$ is given by

$$r_k = c_k / c_0. \tag{A.18}$$

The partial autocorrelation at lag $k$, $\phi_{kk}$, is the solution of the Yule-Walker equations

$$\begin{bmatrix} 1 & \rho_1 & \rho_2 & \cdots & \rho_{k-1} \\ \rho_1 & 1 & \rho_1 & \cdots & \rho_{k-2} \\ \vdots & \vdots & \vdots & & \vdots \\ \rho_{k-1} & \rho_{k-2} & \rho_{k-3} & \cdots & 1 \end{bmatrix} \begin{bmatrix} \phi_{k1} \\ \phi_{k2} \\ \vdots \\ \phi_{kk} \end{bmatrix} = \begin{bmatrix} \rho_1 \\ \rho_2 \\ \vdots \\ \rho_k \end{bmatrix}, \quad k = 1, 2, \ldots \tag{A.19}$$

By substituting estimates $r_k$ for the theoretical autocorrelations $\rho_k$ in the Yule-Walker equations, $\phi_{kk}$ can be estimated.

The autocorrelation function ($\rho_k$, $k = 1, 2, \ldots$) and the partial autocorrelation function ($\phi_{kk}$, $k = 1, 2, \ldots$) are the key elements in identifying the appropriate model(s). The general procedure for identifying a time series model is to estimate the autocorrelation and partial autocorrelation functions for the original time series $z_t$ and successive differences $\nabla^d z_t$, $d = 1, 2, \ldots$. By comparing the foregoing functions with the theoretical functions for the various models we should be able to identify one or more models. The theoretical autocorrelation and partial autocorrelation functions for the different time series models are studied in Chapter 3 of Box and Jenkins (1970) and are summarized in their Table 3.2. We should keep in mind, however, that the estimated autocorrelations can have rather large variances and can be highly autocorrelated with each other. Thus the estimated autocorrelations will differ somewhat from the theoretical autocorrelations. Box and Jenkins (1970) provide us with formulas to estimate the standard errors for the estimated autocorrelation and partial autocorrelation functions. They also present methods for estimating the residual variance $\sigma_a^2$ and the parameters of the polynomials $\phi(B)$ and $\theta(B)$.

Forecasts

Box and Jenkins (1970, Chapter 5) show that the minimum mean square error forecast $\hat{z}_t(l)$ at origin $t$ for lead time $l$ is the conditional expectation of $z_{t+l}$, given knowledge of all $z$'s up to time $t$. Also, on the assumption that the independent shocks $a_t$ have a normal distribution with mean zero and variance $\sigma_a^2$, it follows that, given the information up to time $t$, the conditional probability distribution of a future value $z_{t+l}$ is also normal, with mean $\hat{z}_t(l)$ and a variance $V(l)$ that can be easily estimated. Thus we can obtain confidence intervals for the forecasts.

ACKNOWLEDGMENTS

I want to express my gratitude to Professor Richard E. Barlow, who introduced me to the study of populations and to time series analysis, for the stimulus and many suggestions given during the development of this research. Also, I am indebted to the Editor and the referees for their constructive criticisms. My thanks are also extended to Barbara Brewer and Sandra Ham for typing the manuscript. This research has been partially supported by the Office of Naval Research under Contract N00014-69-A-0200-1036, CAPES, and COPPE-Federal University of Rio de Janeiro, Brazil, with the University of California. Reproduction in whole or in part is permitted for any purpose of the United States Government.
REFERENCES

Box, G. E. P., and G. M. Jenkins. 1970. Time Series Analysis: Forecasting and Control. San Francisco: Holden-Day.
Gille, H. 1949. The Demographic History of the Northern European Countries in the Eighteenth Century. Population Studies 3:3-65.
Keyfitz, Nathan. 1968. Introduction to the Mathematics of Population. Reading: Addison-Wesley.
Keyfitz, Nathan. 1972. On Future Population. Journal of the American Statistical Association 67:347-363.
Keyfitz, Nathan, and Wilhelm Flieger. 1971. Population: Facts and Methods of Demography. San Francisco: W. H. Freeman and Co.
Pearl, R., and L. J. Reed. 1920. On the Rate of Growth of the Population of the United States Since 1790 and Its Mathematical Representation. Proceedings of the National Academy of Sciences 6:275-288.
Sweden. Central Bureau of Statistics. 1955. Historical Statistics of Sweden, No. 1. Population: 1720-1950. Stockholm: Government Publishing House.
United Nations. 1948-1971. Demographic Yearbook, 1st to 23rd issues. New York: United Nations.
United Nations. 1966. World Population Prospects as Assessed in 1963. Population Studies No. 41, Series A/41. New York: United Nations.
Yule, G. U. 1927. On a Method of Investigating Periodicities in Disturbed Series, with Special Reference to Wolfer's Sunspot Numbers. Philosophical Transactions A226:267.
