
Time Series Analysis


G R North, Texas A&M University, TX, USA

Copyright 2003 Elsevier Science Ltd. All Rights Reserved.
Introduction

Study of the records of weather and climate over time is extremely valuable for practical as well as theoretical purposes. For example, the instantaneous temperature or humidity taken at hourly intervals constitutes a time series of measurements. Such data might be used for many purposes such as assembling a climatological summary for the site or for performing an analysis to identify the underlying physical processes responsible for certain interesting features of the data. In such an example, we might not be surprised to find a daily swing of the temperature and humidity as well as an annual oscillation. Were we to plot these hourly temperatures, a nearly repeating graph of period 24 hours would result (the diurnal cycle). Examination over longer time spans leads to the identification of an annual cycle in the data. By simple inspection, we would rather quickly decide that there is some underlying physical agent responsible for these near repetitions or periodic statistics in the data stream. This is an example of the most primitive form of time series analysis.

Many decades of experience have led environmental and statistical scientists to devise very sophisticated methods of studying records of time series. The most powerful innovation is the idea of a mathematical model of the time series. A mathematical model involving random components is a very convenient way of representing a time series of data. Such models always employ simplifying assumptions, but such techniques work in a surprisingly large number of applications, especially in the geosciences.

Before describing how such a time series can be constructed, we must introduce the idea of a random variable. A random variable can take on values drawn from a certain probability distribution. For example, the outcomes of flipping a coin are a random variable. Each flip constitutes a realization of the random variable: either heads (H) or tails (T). The probability distribution is 50-50 (probability 0.5 for H, 0.5 for T). More commonly encountered in geophysical and behavioral sciences is the case where the variable can take on a range of discrete or continuous values and its frequency distribution is the normal or bell-shaped curve distribution. An example is the distribution of heights of individuals. Drawing names from a hat and announcing the height of the individuals is equivalent to generating realizations of the height random variable. Most realizations of such a process yield values near the center of the bell-shaped curve, with large excursions from the population average being very rare.

The mean of a random variable is the arithmetic average of its values taken over many realizations (actually the limiting value derived from infinitely many). The variance is the average of the squares of the deviations from the mean. The deviations from the mean for two variables can be multiplied together and averaged over many realizations to form the covariance of the two random variables. As the name implies, the covariance is a measure of how two random variables covary. If the two variables are identical then their covariance is simply the variance. If the two variables are not statistically related to one another then the covariance is zero. The correlation between two random variables is the covariance divided by the square root of the product of the variances. Its magnitude is always less than or equal to unity. Anticorrelated variables have negative correlation (and covariance). In a time series we are often interested in how an entry covaries with or is related to past (or future) entries. In particular, we would like to know if it is correlated with immediately past entries, a property known as serial correlation.
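These estimators translate directly into a few lines of code. The following sketch (in Python, assuming the NumPy library; the variables and numbers are purely illustrative) estimates the mean, variance, covariance, and correlation of two random variables from a finite sample of realizations:

    import numpy as np

    rng = np.random.default_rng(0)

    # Two related random variables realized many times: y is built
    # from x plus independent noise, so the two covary.
    x = rng.normal(loc=170.0, scale=7.0, size=10_000)   # e.g., heights
    y = 0.5 * x + rng.normal(scale=5.0, size=10_000)

    mean_x = x.mean()                    # arithmetic average over realizations
    var_x = ((x - mean_x) ** 2).mean()   # average squared deviation from the mean
    cov_xy = ((x - x.mean()) * (y - y.mean())).mean()
    corr_xy = cov_xy / np.sqrt(var_x * y.var())

    print(mean_x, var_x, cov_xy, corr_xy)  # |corr_xy| can never exceed 1

With finitely many realizations these are only estimates of the underlying population quantities; they converge as the sample grows.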
White Noise

The simplest model of a randomly based time series is one in which each new time step leads to a statistically independent (no serial correlation) drawing (like the heights of individuals above), but each new entry is drawn from an identical distribution. Each data entry is statistically independent of the previous one. This model is called the statistically independent, identically distributed white-noise model. It is the most important model in time series analysis since almost all other statistical models are derived from it.

A few properties of the white-noise model (we hereafter assume that the identically distributed property is included) are worth mentioning. First, the mean of the entries of such a time series can be estimated by adding the values of a long series of entries and dividing by the number of entries used. In doing this we are actually finding an estimate of the underlying mean of the probability distribution of the individual entries. Another property of time series that is easily demonstrated with the white-noise model is that of an independent realization of a time series. Above we introduced the random variable.
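As a quick illustration (a Python sketch assuming NumPy; the seed and sizes are arbitrary), one can generate a white-noise series like the one in Figure 1 and verify both properties, the sample mean and the absence of serial correlation:

    import numpy as np

    rng = np.random.default_rng(1)

    # 100 independent draws from a normal distribution with
    # mean 2.00 and unit variance, as in Figure 1.
    y = rng.normal(loc=2.0, scale=1.0, size=100)

    print(y.mean())  # estimate of the underlying mean, near 2.0

    # Lag-one serial correlation: for white noise it should be near zero.
    d = y - y.mean()
    print((d[1:] * d[:-1]).mean() / d.var())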

Figure 1 An example of a realization of a white-noise time series with 100 entries. The mean of this time series is 2.00 and the entries are drawn from a normal distribution with unit variance. In this plot, as in subsequent ones, the entries are joined by a continuous line.

In time series we actually have a random function or string of random variables, with individual realizations being individual graphs of the function. The property that is so evident is that averaging along the time series is equivalent to fixing the time and averaging across realizations to find the mean. This property of ensemble averaging is a very powerful one, which enables many of the proofs and analyses of time series analysis. An example of a single realization of a white-noise process is shown in Figure 1.
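This equivalence of time averages and ensemble averages is easy to demonstrate numerically; in the sketch below (Python with NumPy; ensemble size and length are arbitrary), the average along one realization agrees closely with the average across realizations at a fixed time:

    import numpy as np

    rng = np.random.default_rng(2)

    # 500 independent realizations of a white-noise process, 100 steps each.
    ensemble = rng.normal(loc=2.0, scale=1.0, size=(500, 100))

    time_average = ensemble[0].mean()          # along a single realization
    ensemble_average = ensemble[:, 50].mean()  # across realizations, fixed time

    print(time_average, ensemble_average)      # both near the true mean, 2.0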
An example of a white-noise time series is the heights of students standing in a cafeteria queue. On the other hand, consider the heights of succeeding first sons in an ancestral sequence. There is a genetically determined correlation between the height of a father and that of his son. Such a time series exhibits a positive serial correlation which diminishes to zero after a few generations, a phenomenon known as regression to the mean.
Stationary Time Series

Consider next a time series which is not necessarily white noise. That is to say, an entry may not necessarily be statistically independent of its predecessor: there may be serial correlation. Nevertheless, the time series may not have any preferred origin. That is, as we look along the time series, statistically speaking, each time is equivalent to every other time. In such a time series the mean is independent of time and so is the variance. The covariance between an entry and one a certain number of steps, say n, earlier depends only on the temporal separation or lag, n. A time series having the above properties is called a stationary time series, and these are common in nature. (Strictly speaking this only defines a second-moment stationary time series, since nonstationary properties of the probability distribution may still be present. If the random variables are normally distributed, these mean and covariance stationarity properties suffice to determine the strong forms of stationarity.) Perhaps some examples of nonstationary time series will help clarify the concept. The diurnal or seasonal data mentioned above are examples of nonstationary time series since their means depend on local time of day or time of year. In addition, their variances will also have such a phase dependence; even their serial correlation structure may have a phase dependence: for example, the serial correlation between entries may be greater in winter than in summer. At first glance the sequence of heights of first sons across generations may seem like a stationary time series, but there is known to be a secular trend of increasing heights over generations, probably because of better nutrition.

Despite our ability to enumerate many time series that are nonstationary, the model of a stationary time series is very valuable in the geosciences. For example, annual averages of temperature at a location are likely to form a stationary time series, at least to a good approximation. The statistics of such a time series (mean, variance, serial correlation properties) make a good summary of the sequence and for many purposes may form an adequate substitute in practical applications. For example, an insurance company may want to know the likelihood of the temperature (or flood water level) exceeding a given threshold. The serial correlation structure is particularly important in drought, where sequences of dry years can be the most important indicator of consequences.
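For a stationary series the lag is all that matters, so the covariance structure can be summarized by a single function of the lag. A minimal estimator (a Python sketch assuming NumPy; the function name is ours):

    import numpy as np

    def sample_autocovariance(y, lag):
        """Estimate cov(Y[t], Y[t - lag]) for a stationary series."""
        d = y - y.mean()
        return d.var() if lag == 0 else (d[lag:] * d[:-lag]).mean()

Dividing by the lag-zero value converts this to the autocorrelation function, whose lag-one entry is the serial correlation used below.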

Autoregressive Processes

The most common type of time series encountered in the geosciences is the first-order autoregressive process (known as the AR1 process). In this process each new entry can be written mathematically as the sum of two terms, the first proportional to the previous entry, the second an additive white-noise term. Higher-order autoregressive processes (ARn) model the next entry as a sum of n + 1 terms, the first n of which are proportional to the previous n entries, along with the additive white-noise term.
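In symbols (the notation is chosen here for illustration and is not fixed by the article), a zero-mean AR1 process obeys

    Y_n = a\,Y_{n-1} + \varepsilon_n, \qquad |a| < 1, \qquad
    \operatorname{corr}(Y_n,\, Y_{n-k}) = a^{k},

where a is the lag-one serial correlation and \varepsilon_n is the additive white-noise term; a nonzero mean is handled by adding it to every entry.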
We concentrate here on the AR1 process because of its central importance. The parameters which describe the time series are its mean, its variance, and its so-called lag-one serial correlation. It is the job of the analyst to take the given data series and determine or fit the parameters to the data. That is, one wants to know the mean, variance, and lag-one serial correlation in the data. If the lag-one serial correlation turns out to vanish, then we infer that the series can be modeled by a white-noise time series (AR0). If the lag-one serial correlation is r then the lag-two is r², and so on. In the limit of very small time steps in the series this tends to an exponential falloff of serial correlation. The value of n for which the serial correlation falls to 1/e ≈ 0.3678... is known as the autocorrelation time. The autocorrelation time is a measure of the memory of the system. It is often said that the system forgets its past values after a few autocorrelation times (Figures 2 and 3).

Figure 2 An example of a realization of an autoregressive process of order one. In this example the present entry is 0.75 times the previous entry with an added normally distributed variable of variance 0.25. The mean of the time series is 3.00.

Figure 3 The autocorrelation function corresponding to the AR1 process depicted in Figure 2. The lag is treated as a continuous variable for this plot for clarity of the display. The autocorrelation time for this process is about 4.0.
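To make this concrete, the sketch below (Python with NumPy; the seed and fitting details are illustrative) simulates the process of Figure 2 and fits the lag-one serial correlation back out of the data:

    import numpy as np

    rng = np.random.default_rng(3)

    # AR1 as in Figure 2: each entry is 0.75 times the previous one
    # plus normal noise of variance 0.25; the series mean is 3.00.
    a, n_steps = 0.75, 100
    y = np.zeros(n_steps)
    for n in range(1, n_steps):
        y[n] = a * y[n - 1] + rng.normal(scale=np.sqrt(0.25))
    y += 3.0

    # Fit the lag-one serial correlation; it should come out near 0.75.
    d = y - y.mean()
    r = (d[1:] * d[:-1]).mean() / d.var()
    print(r, -1.0 / np.log(r))  # second number estimates the autocorrelation time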
Fourier Analysis of Time Series

If there are physical reasons to think that a time series of data is stationary, then Fourier analysis of the data can lead to a number of powerful techniques useful in applications. One begins the analysis by taking the finite-length segment of data in the sequence and estimating the Fourier coefficients for representing the data as a Fourier series on the segment. In this process one is representing the data in terms of the Fourier coefficients instead of the temporal entries. The two are equivalent ways of expressing the content of the data. Each Fourier coefficient is a component or amplitude of a certain sinusoidal waveform in the data stream. From the point of view of time series modeling, the Fourier coefficients are random variables, since from one realization of the process on the same segment to another the coefficients will differ. However, they will have certain statistical properties common across the ensemble of realizations. If the segment is sufficiently long and the series is stationary, it can be proven that the Fourier coefficients corresponding to different frequencies are uncorrelated. This permits us to perform an analysis of variance over the different frequency bands to examine how variance is distributed over frequencies. It is routine to plot a graph of the variance, sometimes known as power, as a function of frequency. This is known as spectral analysis.
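A crude version of such a spectral estimate takes only a few lines (a Python sketch assuming NumPy; the segment length is arbitrary, and serious work would use one of the refined estimators in the Further Reading):

    import numpy as np

    def periodogram(y):
        """Distribute the sample variance over frequency bands."""
        d = y - y.mean()                 # remove the mean first
        coeffs = np.fft.rfft(d)          # Fourier coefficients of the segment
        power = np.abs(coeffs) ** 2 / len(d)
        freqs = np.fft.rfftfreq(len(d))  # frequency in cycles per time step
        return freqs, power

    rng = np.random.default_rng(4)
    freqs, power = periodogram(rng.normal(size=4096))
    # For white noise the spectrum is flat on average, although the
    # individual band estimates scatter widely about that level.
    print(power[1:].mean(), power[1:].std())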
The most common example is the white-noise time series. The white-noise spectrum is flat; that is, every frequency band is allotted the same variance. Hence, one way of determining whether a certain time series is white noise is to perform the Fourier analysis and plot the spectrum (variance or power versus frequency). If the spectrum is flat, we can infer that the time series is white noise. Of course, if the time series segment is short, there will be problems in estimating the spectrum of the underlying process because of sampling error. Analysts have devised many useful techniques for statistical testing of the white-noise hypothesis. The term white-noise spectrum derives from optics, where it refers to white light, which has an electromagnetic energy intensity which is somewhat uniformly distributed across the visible part of the spectrum. By the same analogy, a red noise spectrum is one which has its energy more concentrated in the lower frequencies as opposed to being uniform. The AR1 spectrum is of the red noise type. The characteristic frequency where the spectrum begins to turn downwards is at about the inverse of the autocorrelation time (Figure 4).

Figure 4 A log-log plot of the spectrum associated with the AR1 process depicted in the previous two figures. Note that the log of the spectrum turns over at about a frequency of 2π divided by the autocorrelation time, which in this case is taken to be 4.00.
Applications of Time Series Analysis

Not only do time series analyses provide new insights into the underlying physical processes generating an empirical time series, but they are useful in a variety of practical applications.

A very common use of time series analysis is data smoothing. Often one wishes to smooth out the highly irregular short-term fluctuations in a time series to get a better view of longer-term trends or undulations. This can be accomplished by running a smoother over the time series. For example, one might take as the value at a certain time the arithmetic average over several future and past values. These past and future entries can be weighted in various ways to make the smoothing optimal for the particular application. This class of operations is known as moving-average smoothing.
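The equal-weight version is the simplest member of this class (a Python sketch assuming NumPy; window length and data are illustrative, and other weightings such as triangular or Gaussian are often preferred):

    import numpy as np

    def moving_average(y, half_width):
        """Equal-weight smoother over past and future values."""
        window = np.ones(2 * half_width + 1)
        return np.convolve(y, window / window.sum(), mode="valid")

    rng = np.random.default_rng(5)
    t = np.linspace(0.0, 6.0, 200)
    noisy = np.sin(t) + rng.normal(scale=0.5, size=t.size)
    smooth = moving_average(noisy, half_width=5)  # 11-point running mean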
Perhaps the most important application is forecasting. One might ask, given a segment of a data-derived time series, if it is possible to use this information to forecast future entries. The answer in principle is simple. In the case of a white-noise time series (or an empirical one which is indistinguishable from white noise) there can be no forecast skill, since each entry is statistically independent of the past entries. But in the case of an AR1 process there is correlation with past entries and this will permit some statistical estimate of future entries out to roughly one autocorrelation time. The estimate will not only provide a most probable value of the future entry, but some assessment of the uncertainty in the forecast, perhaps even a theoretical frequency distribution of values that can be expected.
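For the AR1 model both pieces, the most probable value and its uncertainty, have closed forms, as in this sketch (Python with NumPy; the fitting estimator matches the one used above and is illustrative rather than optimal):

    import numpy as np

    def ar1_forecast(y, steps):
        """Point forecast and spread for an AR1 series, `steps` ahead."""
        d = y - y.mean()
        a = (d[1:] * d[:-1]).mean() / d.var()  # fitted lag-one correlation
        point = y.mean() + d[-1] * a ** steps  # relaxes toward the mean
        spread = np.sqrt(d.var() * (1.0 - a ** (2 * steps)))
        return point, spread

Beyond a few autocorrelation times the point forecast is simply the mean and the spread approaches the full standard deviation of the series: the memory has run out.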
Interpolation is a second application of time series modeling. Suppose there are missing values in an empirical time series and for some reason one wishes to insert values that are statistically consistent with the rest of the entries. First one finds a model of the time series and then one can find the most probable entry with an associated theoretical frequency distribution. Depending on the application, one may wish to insert the most probable value or add to it a random number which is consistent in a statistical sense with the nearest neighbors.
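For a Gaussian AR1 model the conditional distribution of a single missing entry given its two neighbors can be written down directly; the sketch below (Python with NumPy; the function and parameter names are ours) returns its mean and spread:

    import numpy as np

    def ar1_interpolate(before, after, mean, a, noise_var):
        """Most probable value of a missing Gaussian AR1 entry,
        given its immediate neighbors, plus the spread around it."""
        point = mean + a * ((before - mean) + (after - mean)) / (1 + a ** 2)
        spread = np.sqrt(noise_var / (1 + a ** 2))
        return point, spread

One can insert the point value itself or, to preserve the variability of the series, the point value plus a normal random number with that spread.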
A third application is in the area of signal processing. One often finds a deterministic signal embedded in some kind of noise (white or colored). The object usually is to separate the noise from the signal and to estimate the amplitude of the signal. This is the problem in radio reception. We refer to the process as detection. By time series modeling and knowing some characteristics of the signal waveform one can find an optimal estimate of the signal strength. In electronics one might want to clean the noise away from the signal and amplify the residual, while in other applications one might want to find the amplitude of a periodic signal such as the diurnal or seasonal cycle. One of the most famous applications of this type in the geosciences is the detection of periodic signals in the record of changes in continental ice sheet volume. These excursions have been found to contain significant variance peaked at periods of 100 000, 43 000, and 20 000 years, and these just happen to coincide with the periods of the changes in the elliptical orbital parameters of planet Earth (eccentricity, obliquity, and precession of the equinoxes). Thus time series analysis was able to show conclusively that the ice ages are linked to the changes in the Earth's orbital elements.
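When the frequency of the sought-after signal is known in advance, as it is for diurnal, seasonal, or orbital cycles, its amplitude can be estimated by least squares against sine and cosine waveforms at that frequency. A hedged sketch (Python with NumPy; the frequency, amplitude, and series are invented for illustration):

    import numpy as np

    rng = np.random.default_rng(6)

    # A periodic signal of known frequency buried in white noise.
    t = np.arange(1000.0)
    freq = 1.0 / 24.0                    # e.g., one cycle per 24 steps
    series = 0.8 * np.sin(2 * np.pi * freq * t) + rng.normal(size=t.size)

    # Least-squares fit of sine and cosine components (detection).
    basis = np.column_stack([np.sin(2 * np.pi * freq * t),
                             np.cos(2 * np.pi * freq * t)])
    coeffs, *_ = np.linalg.lstsq(basis, series - series.mean(), rcond=None)
    print(np.hypot(coeffs[0], coeffs[1]))  # amplitude estimate, near 0.8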
See also

Data Analysis: Empirical Orthogonal Functions and Singular Vectors.

Further Reading

Bendat JS and Piersol AG (1986) Random Data: Analysis and Measurement Procedures. New York: Wiley.
Bloomfield P (1976) Fourier Analysis of Time Series: An Introduction. New York: Wiley.
Chatfield C (1992) The Analysis of Time Series: An Introduction. New York: Chapman & Hall.
Percival DB and Walden AT (1993) Spectral Analysis for Physical Applications. Cambridge: Cambridge University Press.
Wei WWS (1990) Time Series Analysis. Redwood City, CA: Addison-Wesley.
