
Stochastic Analysis, Modeling, and Simulation (SAMS)

Version 2000
USER'S MANUAL
J. D. Salas, N. Saada, C. H. Chung, W. L. Lane, and D. K. Frevert
October, 2000

Computing Hydrology Laboratory


Water Resources, Hydrologic and Environmental Sciences
Engineering Research Center
Fort Collins, Colorado
TECHNICAL REPORT No.10

Stochastic Analysis, Modeling, and Simulation (SAMS)
Version 2000 - User's Manual
by
Jose D. Salas [1], Nidhal Saada [2], and Chen-hua Chung [2]
Water Resources, Hydrologic and Environmental Sciences
Department of Civil Engineering, Colorado State University
Fort Collins, Colorado, U.S.A.

William L. Lane [3]
Consultant, Hydrology and Water Resources Engineering,
1091 Xenophon St., Golden, CO 80401-4218.
and

Donald K. Frevert [4]
U.S. Department of the Interior
Bureau of Reclamation
Denver, Colorado
U.S.A.

[1] Professor, Water Resources, Hydrologic and Environmental Sciences, Civil Engineering
Department, Colorado State University.
[2] Former graduate students, Water Resources, Hydrologic and Environmental Sciences, Civil
Engineering Department, Colorado State University.
[3] Consultant, Hydrology and Water Resources Engineering, 1091 Xenophon St., Golden, CO
80401-4218.
[4] Hydraulic Engineer, Water Resources Services, Technical Service Center, U.S. Bureau of
Reclamation, Denver, CO 80225.

TABLE OF CONTENTS
PREFACE
ACKNOWLEDGEMENTS
1. INTRODUCTION
2. DESCRIPTION OF SAMS
   2.1 General Overview
   2.2 Statistical Analysis of Data
   2.3 Fitting a Stochastic Model
   2.4 Generating Synthetic Series
3. DEFINITION OF STATISTICAL CHARACTERISTICS
   3.1 Basic Statistics
      3.1.1 Annual Data
      3.1.2 Seasonal Data
   3.2 Flood, Storage, and Drought Related Statistics
      3.2.1 Storage Related Statistics
      3.2.2 Drought Related Statistics
      3.2.3 Surplus Related Statistics
4. MATHEMATICAL MODELS
   4.1 Data Transformations and Standardization
   4.2 Univariate ARMA(p,q) Model
   4.3 Univariate GAR(1) Model
   4.4 Univariate PARMA(p,q) Model
   4.5 Multivariate MAR(p) Model
   4.6 Multivariate CARMA(p,q) Model
   4.7 Multivariate MPAR(p) Model
   4.8 Disaggregation Models
      4.8.1 General
      4.8.2 Model Formulations
   4.9 Model Testing
5. EXAMPLES
   5.1 Statistical Analysis of Data
   5.2 Stochastic Modeling and Generation of Data
      5.2.1 Univariate ARMA(p,q) Model
      5.2.2 Univariate GAR(1) Model
      5.2.3 Univariate PARMA(p,q) Model
      5.2.4 Multivariate MAR(p) Model
      5.2.5 Multivariate CARMA(p,q) Model
      5.2.6 Disaggregation Models
REFERENCES
APPENDIX A
APPENDIX B
APPENDIX C

PREFACE
Several computer packages have been developed since the 1970's for analyzing the stochastic
characteristics of time series in general and hydrologic and water resources time series in particular.
For instance, the LAST package was developed in 1977-1979 by the US Bureau of Reclamation
(USBR) in Denver, Colorado. Originally the package was designed to run on a mainframe
computer, but later it was modified for use on personal computers. While various additions and
modifications have been made to LAST over the past twenty years, the package has not kept pace
with either advances in time series modeling or advances in computer technology. These facts
prompted USBR to promote the initial development of SAMS, a computer software package that
deals with the Stochastic Analysis, Modeling, and Simulation of hydrologic time series, particularly
annual and seasonal streamflow series. It is written in C and Fortran and runs under modern
Windows operating systems such as WINDOWS NT and WINDOWS 98. This manual describes
the current version of SAMS denoted as SAMS 2000.
ACKNOWLEDGEMENTS
SAMS has been developed as a cooperative effort between USBR and Colorado State
University (CSU) under USBR Advanced Hydrologic Techniques Research Project through an
Interagency Personal Agreement with Professor Jose D. Salas as Principal Investigator. Drs. W.L.
Lane and D.K. Frevert provided additional expert guidance and supervision on behalf of USBR.
Several former CSU graduate students collaborated in various parts of this project, including M.W.
AbdelMohsen, who developed many of the Fortran codes, and M. Ghosh, who initiated the programming
in C language, followed by Mr. Bradley Jones, Nidhal M. Saada, and Chen-Hua Chung.
Acknowledgements are due to the funding agency and to the several students who collaborated in
this project.


STOCHASTIC ANALYSIS, MODELING, AND SIMULATION (SAMS 2000)
1. INTRODUCTION
Stochastic simulation of water resources time series in general and hydrologic time series
in particular has been widely used for several decades for various problems related to planning and
management of water resources systems. Typical examples are determining the capacity of a
reservoir, evaluating the reliability of a reservoir of a given capacity, evaluating the adequacy of
a water resources management strategy under various potential hydrologic scenarios, and evaluating
the performance of an irrigation system under uncertain irrigation water deliveries (Salas et al., 1980;
Loucks et al., 1981).
Stochastic simulation of hydrologic time series such as streamflow is typically based on
mathematical models. For this purpose a number of stochastic models have been suggested in
the literature (Salas, 1993; Hipel and McLeod, 1994). Using one type of model or another for a
particular case at hand depends on several factors such as the physical and statistical characteristics
of the process under consideration, data availability, the complexity of the system, and the overall
purpose of the simulation study. Given the historical record, one would like the model to reproduce
the historical statistics. This is why a standard step in streamflow simulation studies is to determine
the historical statistics. Once a model has been selected, the next step is to estimate the model
parameters, then to test whether the model represents the process under consideration reasonably
well, and finally to carry out the needed simulation study.
The advent of digital computers several decades ago led to the development of computer
software for mathematical and statistical computations of varying degrees of sophistication. For
instance, well known packages are IMSL, STATGRAPHICS, ITSM, MINITAB, SAS/ETS, SPSS,
and MATLAB. These packages can be very useful for standard time series analysis of hydrological
processes. However, despite the availability of such general purpose programs, specialized
software for simulation of hydrological time series such as streamflow has been attractive for
several reasons. One is the particular nature of hydrological processes in which periodic
properties are important in the mean, variance, covariance, and skewness. Another one is that some
hydrologic time series include complex characteristics such as long term dependence and memory.
Still another one is that many of the stochastic models useful in hydrology and water resources have
been developed specifically to fit the needs of water resources, for instance temporal and
spatial disaggregation models. Examples of specialized software for hydrologic time series
simulation are HEC-4 (U.S. Army Corps of Engineers, 1971), LAST (Lane and Frevert, 1990), and
SPIGOT (Grygier and Stedinger, 1990).
The LAST package was developed during 1977-1979 by the U. S. Bureau of Reclamation
(USBR). Originally, the package was designed to run on a mainframe computer (Lane, 1979) but
later it was modified for use on personal computers (Lane and Frevert, 1990). While various
additions and modifications have been made to LAST over the past 20 years, the package has not
kept pace with either advances in time series modeling or advances in computer technology. This
is especially true of the computer graphics. These facts prompted USBR to promote the initial
development of the SAMS package. The first version of SAMS (SAMS-96.1) was released in 1996.
Since then, corrections and modifications were made based on feedback received from the users.
In addition, new functions and capabilities have been implemented.
SAMS 2000 has the following capabilities and limitations:
1. It analyzes annual and seasonal data. For seasonal data the maximum number of seasons (time
intervals within a year) is 12.
2. It includes several types of transformation options to transform the original data into normal.
3. It includes a number of single site, multisite, and disaggregation stochastic models that have been
widely used in the literature.
4. It includes two major modeling schemes for modeling and generation of complex river network
systems.
5. Maximum number of stations is 40.
6. Maximum number of stations for a group (for purposes of multivariate disaggregation) is 10.
7. Maximum number of years for the input data file is 600.
8. The number of samples that can be generated is unlimited.
9. The number of years that can be generated is unlimited.
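The size limits listed above can be checked before a data set is prepared for SAMS. The following is a minimal sketch in Python, not part of SAMS itself; the function name check_sams_limits and the assumed array layout (years, seasons, stations) are illustrative assumptions, while the numerical limits are the ones quoted in the list.

```python
import numpy as np

# Limits quoted from the capability list above (SAMS 2000).
MAX_SEASONS = 12         # seasons (time intervals) per year
MAX_STATIONS = 40        # stations in one data set
MAX_GROUP_STATIONS = 10  # stations per multivariate disaggregation group
MAX_YEARS = 600          # years in the input data file


def check_sams_limits(flows, group_sizes=()):
    """Return a list of messages, one for every limit that is exceeded.

    `flows` is assumed to be shaped (years, seasons, stations). This layout
    is an illustrative assumption, not the SAMS input file format.
    """
    years, seasons, stations = np.shape(flows)
    problems = []
    if years > MAX_YEARS:
        problems.append(f"{years} years exceeds the limit of {MAX_YEARS}")
    if seasons > MAX_SEASONS:
        problems.append(f"{seasons} seasons exceeds the limit of {MAX_SEASONS}")
    if stations > MAX_STATIONS:
        problems.append(f"{stations} stations exceeds the limit of {MAX_STATIONS}")
    for size in group_sizes:
        if size > MAX_GROUP_STATIONS:
            problems.append(
                f"group of {size} stations exceeds the limit of {MAX_GROUP_STATIONS}"
            )
    return problems
```

An empty list means the data set fits within the stated limits; for instance, 50 years of monthly data at 3 stations handled as one group should pass.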
The purpose of this manual is to provide a detailed description of the current version of
SAMS developed for the stochastic simulation of hydrologic time series such as annual and monthly
streamflows.

2. DESCRIPTION OF SAMS
In section 2.1, a general description of SAMS is presented in which the different operations
undertaken by SAMS are briefly explained. Then, each operation is explained and illustrated
more thoroughly in subsequent sections.
2.1 General Overview
SAMS is a computer software package that deals with the stochastic analysis, modeling,
and simulation of hydrologic time series. It is written in C and Fortran and runs under modern
Windows operating systems such as WINDOWS NT and WINDOWS 98. The package consists of many
menu option windows which enable the user to choose between the different options that are
currently available. SAMS 2000 is a modified and expanded version of SAMS-96.1. It consists of
three primary application modules: 1) Statistical Analysis of Data, 2) Fitting a Stochastic Model
(includes parameter estimation and testing), and 3) Generating Synthetic Series. Figure 1 shows
the SAMS main menu. The user can select any of the main modules by clicking on the desired
option shown in this menu. Before running the applications, the user must select (open) a file
that contains the (historical) input data. This can be done by clicking on the "File Menu" option
shown on the top part of the main menu. This will take the user to another menu, as shown in
Fig.2. Then the user may Open A File (select a data file) and Display Current Data File
where the content of the opened file can be seen. Examples of seasonal and annual input files are
shown in Appendices A and B, respectively.

Fig. 1 SAMS main menu

Fig. 2 File menu

SAMS has the capability of analyzing single site and multisite annual and seasonal data
and the results of the analysis are presented in graphical or tabular forms or are written on output
files. The current version of SAMS can be applied to annual and seasonal data, such as quarterly
and monthly data.
The Statistical Analysis of Data module consists of data plotting, checking the normality
of the data, data transformation, and computing the statistical characteristics of the data. Plotting
the data may help to detect trends, shifts, outliers, or errors in the data. Probability plots are
included for verifying the normality of the data. The data can be transformed to normal by using
different transformation techniques. Currently, logarithmic, power, and Box-Cox transformations
are available. SAMS determines a number of statistical characteristics of the data. These include
basic statistics such as mean, standard deviation, skewness, serial correlations (for annual data),
season-to-season correlations (for seasonal data), annual and seasonal cross-correlations for
multisite data, and drought, surplus, and storage related statistics. These statistics are important
in investigating the stochastic characteristics of the data.
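As an illustration of the basic statistics named above for annual data, the following Python sketch computes the mean, standard deviation, coefficient of variation, skewness coefficient, and lag-k serial correlations of a single series. It uses common textbook estimators, which is an assumption: the estimators actually used by SAMS are defined in Section 3 and may differ in details such as bias corrections. The function name basic_statistics is hypothetical.

```python
import numpy as np


def basic_statistics(x, max_lag=2):
    """Basic statistics of an annual series using common textbook estimators.

    Assumption: these formulas are illustrative and are not taken from SAMS.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.mean()
    dev = x - mean
    sum_sq = np.sum(dev ** 2)
    std = np.sqrt(sum_sq / (n - 1))                      # sample standard deviation
    skew = (np.sum(dev ** 3) / n) / (sum_sq / n) ** 1.5  # moment-ratio skewness
    serial = {
        k: float(np.sum(dev[:-k] * dev[k:]) / sum_sq)    # lag-k serial correlation
        for k in range(1, max_lag + 1)
    }
    return {
        "mean": float(mean),
        "std": float(std),
        "cv": float(std / mean),
        "skewness": float(skew),
        "max": float(x.max()),
        "min": float(x.min()),
        "serial_correlation": serial,
    }
```

For multisite data the same function could be applied station by station, with cross-correlations computed separately.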
The second main application of SAMS, Fitting a Stochastic Model, includes parameter
estimation and model testing for alternative univariate and multivariate stochastic models. The
following models are included: (1) univariate ARMA(p,q) model, where p and q can vary from 1
to 10, (2) univariate GAR(1) model, (3) univariate periodic PARMA(p,q) model, (4) univariate
seasonal disaggregation, (5) multivariate autoregressive MAR(p) model, (6) contemporaneous
multivariate CARMA(p,q) model, where p and q can vary from 1 to 10, (7) multivariate periodic
MPAR(p) model, (8) multivariate annual (spatial) disaggregation model, and (9) multivariate
temporal disaggregation model. Two estimation methods are available, namely the method of
moments (MOM) and the least squares method (LS). MOM is available for most of the models
while LS is available only for univariate ARMA, PARMA, and CARMA models. For CARMA
models, both the method of moments (MOM) and the method of maximum likelihood (MLE) are
available for estimation of the variance-covariance (G) matrix. Regarding multivariate annual
(spatial) disaggregation models, parameter estimation is based on the Valencia-Schaake or
Mejia-Rousselle methods, while for annual to seasonal (temporal) disaggregation Lane's condensed
method is applied.
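To make the method of moments concrete, the sketch below fits the simplest member of the ARMA(p,q) family, an AR(1) model, to an annual series. It is only an illustration in Python under stated assumptions: the function name fit_ar1_moments is hypothetical, and the estimator shown (the AR coefficient set equal to the lag-1 serial correlation, and the noise variance equal to the series variance times one minus the squared coefficient) is the standard textbook MOM result for AR(1), not necessarily the exact computation performed by SAMS.

```python
import numpy as np


def fit_ar1_moments(x):
    """Method-of-moments fit of x_t - mean = phi * (x_{t-1} - mean) + e_t.

    Assumption: textbook AR(1) moment estimators, shown for illustration only.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    dev = x - mean
    variance = np.sum(dev ** 2) / x.size
    r1 = np.sum(dev[:-1] * dev[1:]) / np.sum(dev ** 2)  # lag-1 serial correlation
    phi = r1                                            # AR(1) coefficient
    noise_variance = variance * (1.0 - phi ** 2)        # variance of e_t
    return float(mean), float(phi), float(noise_variance)
```

Higher order models and the least squares method require more elaborate computations that are not attempted here.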
For stochastic simulation at several sites in a stream network system, a direct modeling
approach based on multivariate autoregressive and CARMA processes is available for annual data
and a multivariate periodic autoregressive process is available for seasonal data. In addition, two
schemes based on disaggregation principles are available. For this purpose, it is convenient to
divide the stations into key stations, substations, and subsequent stations. Generally the key stations
are the farthest downstream stations, substations are the next upstream stations, and subsequent
stations are the next further upstream stations. In the first scheme, the annual flows at the key
stations are added, creating an annual flow series at an artificial or index station. Subsequently, a
univariate ARMA(p,q) model is fitted to the annual flows of the index station. Then, a spatial
disaggregation model relating the annual flows of the index station to the annual flows of the key
stations is fitted. Further, a statistical disaggregation model relating the annual flows of the key
stations to those of the substations and another disaggregation model relating the annual flows of the
substations to those of the subsequent stations are fitted. In fact, this is a three-level (spatial)
disaggregation procedure. In the second scheme a multivariate AR(p) model is fitted to the annual
data of the key stations, then the remaining models relating the annual flows at the key stations,
substations, and subsequent stations are fitted in a similar manner as in the first scheme.
Furthermore, if the objective of the modeling exercise is to generate seasonal data by using
disaggregation approaches, then an additional temporal disaggregation model is fitted that relates
the annual flows of a group of stations with the corresponding seasonal flows.
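The first level of scheme 1 can be sketched as follows in Python. The code builds the index station by adding the key station flows and then computes moment estimates for a linear disaggregation model of the form Y = A X + B e, which is the general form associated with the Valencia-Schaake approach. The function name and array layout are assumptions, and the model formulation actually used in SAMS is the one given in Section 4.8, which may differ.

```python
import numpy as np


def fit_index_disaggregation(key_flows):
    """Index station and moment estimates for Y = A X + B e (illustrative only).

    key_flows: array shaped (years, n_key) of annual flows at the key stations.
    Returns the index series X, the vector A, and the matrix B B^T.
    """
    y = np.asarray(key_flows, dtype=float)
    n_years = y.shape[0]
    index = y.sum(axis=1)            # annual flows at the artificial index station
    y_dev = y - y.mean(axis=0)
    x_dev = index - index.mean()
    s_xx = x_dev @ x_dev / n_years   # variance of the index station
    s_yx = y_dev.T @ x_dev / n_years # covariances between key and index stations
    s_yy = y_dev.T @ y_dev / n_years # covariance matrix of the key stations
    a = s_yx / s_xx                  # A = S_yx S_xx^-1
    bbt = s_yy - np.outer(a, s_yx)   # B B^T = S_yy - A S_xy
    return index, a, bbt
```

Because the index flows are the exact sum of the key station flows, the elements of A add up to one and each row of B B^T adds up to zero, so B B^T is singular. Obtaining B therefore calls for a decomposition that tolerates a singular matrix (for instance one based on eigenvalues) rather than a plain Cholesky factorization.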
The third main application of SAMS is Generating Synthetic Series, i.e. simulating
synthetic data. Data generation is based on the models, approaches, and schemes as mentioned
above. The model parameters for data generation can be those which are estimated by SAMS or
they can be provided by the user. If provided by the user, the program prompts the user to insert the
model type and then the model parameters. The statistical characteristics of the generated data are
presented in graphical or tabular forms along with the historical statistics of the data that was used
in fitting the generating model. The generated data including the "generated" statistics can be
displayed graphically or in table form, and be printed and/or written on specified output files. As
a matter of clarification, we will summarize here the overall data generation procedure for
generating seasonal data based on scheme 2:
(a) a multivariate AR(p) model is used to generate annual flows at the key stations;
(b) a spatial disaggregation model is used to disaggregate the generated annual flows at the key
stations into annual flows at the substations;
(c) a spatial disaggregation model is used to disaggregate the generated annual flows at the
substations into annual flows at the subsequent stations;
(d) a temporal disaggregation model is used to disaggregate the annual flows at a group of stations
into the corresponding seasonal flows at those stations.
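The order of operations in steps (a) to (c) can be sketched in Python as a chain of two small functions. This is a simplified illustration under several assumptions: the multivariate AR model is restricted to order one, the data are treated as zero-mean (already transformed) values, the generation starts from zero without a warm-up period, and all parameter matrices are made-up numbers rather than SAMS estimates. Step (d), the temporal disaggregation, is not included.

```python
import numpy as np


def generate_ar1(phi, noise_factor, n_years, rng):
    """Step (a): zero-mean multivariate AR(1), z_t = phi z_{t-1} + noise_factor e_t."""
    n_sites = phi.shape[0]
    z = np.zeros((n_years, n_sites))
    previous = np.zeros(n_sites)
    for t in range(n_years):
        previous = phi @ previous + noise_factor @ rng.standard_normal(n_sites)
        z[t] = previous
    return z


def disaggregate(upper, a, b, rng):
    """Steps (b) and (c): linear disaggregation y_t = a x_t + b e_t for every year."""
    noise = rng.standard_normal((upper.shape[0], b.shape[1]))
    return upper @ a.T + noise @ b.T


# Hypothetical parameters: 2 key stations, 3 substations, 4 subsequent stations.
rng = np.random.default_rng(42)
phi = np.array([[0.5, 0.1], [0.0, 0.4]])
key = generate_ar1(phi, np.eye(2), n_years=100, rng=rng)         # step (a)
a_sub = np.array([[0.6, 0.0], [0.4, 0.0], [0.0, 1.0]])
sub = disaggregate(key, a_sub, 0.1 * np.eye(3), rng)              # step (b)
a_next = np.array([[0.5, 0.0, 0.0], [0.5, 0.0, 0.0],
                   [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
subsequent = disaggregate(sub, a_next, 0.1 * np.eye(4), rng)      # step (c)
```

Each level uses the output of the previous level as its input, which is the essential feature of the scheme.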
In modeling and data generation of complex water resources systems involving many
stations, despite the versatility of SAMS 2000, keeping track of the different options, components,
parameters, etc. involved can be a time consuming and confusing task. To help alleviate this
problem, a Status button (see Fig.3) can be activated. The user can review the current
transformation, modeling, and generation status and related information by clicking on the Status
button in any menu or window.
2.2 Statistical Analysis of Data
Figure 3 shows the statistical analysis data menu. By selecting the annual or seasonal button
the user can specify the type of data to be analyzed. Then, the following operations can be selected:
1. Plot time series data.
2. Check normality and transform time series.
3. Statistical characteristics of time series.
In the following sections, we will examine and illustrate each of these options.

Fig.3 Statistical analysis menu

Plot Time Series Data
Plotting of the data can help in detecting trends, shifts, outliers, and errors in the data.
SAMS can plot the data as curve, stick, and bar graphs. Figure 4 illustrates a time series plot for
annual data. The scale of the plot is determined based on the sample maximum and minimum as
shown in the control bar at the bottom, but the user can change it by keying in the desired graph
scale range. This enables the user to zoom in and out of the plot to examine the data and to check
the variability of the data graphically on screen. Note that if the station names or IDs are available
in the input data file, they will be shown on the plots or tables.

Fig.4 Plotting of annual time series

Check Normality and Transform Time Series


SAMS tests the normality of the data by plotting the data on normal probability paper and
by using the skewness test of normality. To examine the adequacy of the transformation, a
comparison of the theoretical generated distribution based on the transformation and its
historical sample counterpart is plotted as shown in Fig. 5 for annual data. For seasonal data, the
results of the seasonal skewness tests are presented in graphical and tabular formats. The critical
values of the test are also shown on the screen as guides for checking whether the data are within
the normal range. For example, if the sample skewness coefficient for a given season is less than
or equal to the critical value, the hypothesis of normality of the data cannot be rejected. On the
other hand, if the sample skewness coefficient is greater than the critical value, the hypothesis of
normality is rejected. In addition, for the specified season, the normal probability plot for the
transformed seasonal data and the comparison of the theoretical generated distribution and the
sample distribution for that season are also displayed.
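The decision rule of the skewness test described above can be sketched in Python as follows. The critical value used here is the large-sample normal approximation, in which the sample skewness of normal data has a variance of about 6/N; this is an assumption made for illustration, since SAMS displays its own critical values and tabulated values are generally preferred for short records. The function name skewness_test is hypothetical.

```python
import numpy as np


def skewness_test(x, z_crit=1.96):
    """Skewness test of normality with an approximate critical value.

    Assumption: critical value = z_crit * sqrt(6 / N), a large-sample
    approximation (z_crit = 1.96 corresponds to a two-sided 5% level).
    Returns (skewness, critical value, True if normality is not rejected).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    dev = x - x.mean()
    skew = (np.sum(dev ** 3) / n) / (np.sum(dev ** 2) / n) ** 1.5
    critical = z_crit * np.sqrt(6.0 / n)
    not_rejected = bool(abs(skew) <= critical)
    return float(skew), float(critical), not_rejected
```

For seasonal data the function would be applied to the values of each season separately, giving one test result per season.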

Fig.5 Annual data transformation result

If the data at hand is not normal, one can check whether it can be normalized by a certain
transformation function. This can be done by clicking on the "Transformations" button, after which
a menu with different types of transformations will appear. Fig. 6 shows the transformation menu
for seasonal data. The user can choose any type of transformation by simply clicking on the
corresponding button. Three types of transformations are available: logarithmic, power, and
Box-Cox transformations. The transformation can be done all at once for all seasons or on a season
by season basis. The user can choose any of the above transformations and accordingly key in the
transformation coefficients, then click the "Display" button to preview the transformation result.
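The three transformation types can be illustrated with the short Python functions below. The functional forms shown, with a shift coefficient a and an exponent b, are common textbook parameterizations and are an assumption here; the definitions that SAMS actually applies are given in Section 4.1 and should be consulted for the exact meaning of each coefficient.

```python
import numpy as np


def log_transform(x, a=0.0):
    """Logarithmic transformation, assumed form y = ln(x + a)."""
    return np.log(np.asarray(x, dtype=float) + a)


def power_transform(x, a=0.0, b=0.5):
    """Power transformation, assumed form y = (x + a)**b."""
    return (np.asarray(x, dtype=float) + a) ** b


def box_cox_transform(x, a=0.0, b=0.5):
    """Box-Cox transformation, assumed form y = ((x + a)**b - 1) / b.

    The limiting case b = 0 is taken as the logarithmic transformation.
    """
    shifted = np.asarray(x, dtype=float) + a
    if b == 0:
        return np.log(shifted)
    return (shifted ** b - 1.0) / b
```

Trying several coefficient values and checking the skewness or the probability plot of the result mirrors what the "Display" button allows the user to do interactively.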
Clicking on the "Accept Transformation" button will actually conduct the transformation for
the data of the current station and store the transformation type and coefficients in memory. From
this point, SAMS will recognize the transformed data as the default data and will process this data
instead of the original data. For clarification, suppose that the user has chosen to transform the
annual data for site 1 by a logarithmic transformation and accepted the transformation by clicking
on the "Accept Transformation" button. Suppose further that the user wants to model site 1 data
with an ARMA(p,q) model. Then, the ARMA model will be fitted to the transformed data and not
the original data. The question that can be raised here is: can the model be fitted to the original
data without having to start the whole process over again? The answer is yes. The original data can
be recovered by clicking again on the "Transformation" button, then choosing the "No
Transformation" button (shown at the bottom in Fig.6), and then in the next window (refer to
Figs. 5 and 7) using "Accept Transformation" to retrieve the original data.

Fig.6 Transformation menu for seasonal data


The save option (refer to Figs. 5 and 7) allows the user to save the transformation
parameters in a special file. Before clicking on save, remember to actually transform the data
by clicking on "Accept Transformation". Clicking on the "Save" button will prompt a file menu
and allow the user to select the file name (with an extension ".atr" or ".str" automatically attached
for annual and seasonal data, respectively) for storing the transformation parameters. This will
enable the user to access the transformation parameters at any other time. To understand this
convenient feature of SAMS, suppose that a user transformed the data and fitted the PARMA(1,1)
model to the data. Subsequently, the user wants to fit a different model to the transformed data.
Instead of doing the transformation process over again, the user can simply open the transformation
file which was saved previously. The user can access this file by clicking on the
"Transformation" button and then on the "Open File that Contains Transformation Parameters"
button. After the file has been opened, one must click on "Accept Transformation" to actually
transform the data. For multisite data, instead of clicking on "Accept Transformation" for each site,
the user can simply click (once) on "Transform all sites" to conduct the data transformation for all
sites. Figure 7 shows an example of seasonal transformation results. In the example the
logarithmic transformation has been used with varying values of the coefficient a.

Fig.7 Seasonal data transformation results

The steps that are usually involved in using the transformation window option presented in
Figs. 5 and 7 are summarized below:
1. To check normality of data and use transformation options:
   - Key in the proper site number.
   - Key in the season number (available for seasonal data only).
   - Click on "Transformation" button.
   - From the transformation menu (for instance see Fig.6 for seasonal data), select a
     transformation type.
   - Click "Display" on the next window (for instance see Figs. 5 and 7).
   - Key in the transformation coefficients (if necessary) and click "Display". See the
     results and try other coefficients as needed.
2. To actually transform the data by using the selected transformation type and coefficients:
   - Click on "Accept Transformation" button.
3. To save the selected transformation type and coefficients in a file:
   - Click on "Save" button (previously you must have clicked on "Accept Transformation").
4. To transform data by loading the previously saved transformation parameter file:
   - Click on "Transformation" button and choose "Open File that Contains
     Transformation Parameters" to open the transformation coefficients file.
   - Click on "Transform all sites".

It is suggested that if transformations are needed for both annual and seasonal data, the user
should conduct annual data transformation before conducting seasonal data transformation.
Statistical Characteristics of Time Series
A number of statistical characteristics can be calculated for the original and transformed data.
They are available in graphical and tabular formats and can be saved in an output file. These are
summarized below.
- For Annual Data:
   - Basic statistics such as mean, standard deviation, skewness coefficient, coefficient of
     variation, maximum, and minimum values.
   - Serial correlation coefficients.
   - Cross-correlation coefficients for multisite data.
   - Drought, surplus (flood), and storage related statistics.
Figure 8 shows the annual statistical characteristics menu.

Fig.8 Annual statistical characteristics menu

- For Seasonal Data:
   - Basic statistics such as seasonal means, standard deviations, skewness coefficients,
     coefficients of variation, maximum, and minimum values.
   - Season-to-season correlation coefficients.
   - Season-to-season cross-correlation coefficients for multisite data.
   - Drought, surplus (flood), and storage related statistics.
Figure 9 shows the seasonal statistical characteristics menu.


Fig.9 Seasonal statistical characteristics menu

The menus shown in Figs. 8 and 9 are used to select the desired statistics for annual and
seasonal data, respectively. By clicking on the "Save Statistics" button the calculated statistics
can be saved in files with extensions ".ast" and ".sst" for annual and seasonal time series,
respectively. Depending on the user's selection of the type of statistic, a window similar to the one
shown in Fig.10 will appear. The "Graph" button permits the user to view the statistics of the data
in graphical format. For example, Fig.10 shows the plot of the season-to-season correlation
coefficients for monthly data. The two dashed gray lines represent the 95% limits. If a correlation
coefficient lies between these two lines, it means that the correlation is not statistically
significant. The "List" button provides the same information in tabular format. The user must key
in the needed information such as the site number(s) and other pertinent data depending on the
window at hand and then click on the "Graph" or "List" buttons to view the results. For instance,
the stations indicated in Fig. 10 are stations 1, 2, 3, and 4 and the time lags for calculating the
season-to-season correlations are 1, 2, and 3. The season-to-season correlation results are shown
for up to 4 stations at a time. If more than 4 stations (sites) are specified, say 7, then after viewing
the results for the first 4 stations, clicking on the "Next" button will enable one to view the results
of the remaining 3 stations.

Fig.10 Window showing the season to season correlations of seasonal data
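The season-to-season correlations and their 95% limits can be illustrated with the Python sketch below for a single station. Two assumptions are made: the correlation for each season is computed with the ordinary product-moment estimator on the available pairs of values, and the limits are the common large-sample approximation of plus or minus 1.96 divided by the square root of the number of years. The definitions used by SAMS are given in Section 3 and may differ. The function names are hypothetical.

```python
import numpy as np


def season_to_season_correlation(flows, lag=1):
    """Correlation between each season and the season `lag` steps earlier.

    flows: array shaped (years, seasons) for one station (assumed layout).
    For the first seasons of a year the earlier season belongs to the
    previous year, so the first year contributes no pair for them.
    """
    x = np.asarray(flows, dtype=float)
    n_years, n_seasons = x.shape
    series = x.reshape(-1)                   # chronological sequence of values
    r = np.empty(n_seasons)
    for season in range(n_seasons):
        positions = np.arange(n_years) * n_seasons + season
        positions = positions[positions - lag >= 0]  # drop pairs before the record
        r[season] = np.corrcoef(series[positions], series[positions - lag])[0, 1]
    return r


def approximate_95_limits(n_years, z=1.96):
    """Approximate 95% limits (+/- value) for a zero correlation."""
    return z / np.sqrt(n_years)
```

A correlation whose absolute value is smaller than the returned limit would fall between the two dashed lines of the plot and would be regarded as not statistically significant.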
2.3 Fitting a Stochastic Model
The LAST package included several programs for the stochastic modeling of time series.
The basic procedure involved modeling and generating the annual time series using a multivariate
AR(1) or AR(2) model, then using a disaggregation model to disaggregate the generated annual
flows into their corresponding seasonal flows. In contrast, SAMS has two major modeling
strategies, namely direct and indirect modeling. Direct modeling means fitting a stationary model
(univariate ARMA or multivariate AR or CARMA) directly to the annual data or fitting a periodic
(seasonal) model (univariate PARMA or multivariate PAR) directly to the seasonal data of the
system at hand. Annual to seasonal disaggregation modeling, on the other hand, is an indirect
procedure since the modeling of seasonal data also involves modeling of the corresponding annual
data. Figure 11 displays the referred direct or indirect (using disaggregation) modeling procedures
under annual or seasonal categories. Regardless of whether the input data available is annual or
seasonal (for example monthly data), the user must select the annual button if the final objective
of the modeling exercise is to generate annual flows only. Otherwise, if the objective is to generate
monthly quantities then the seasonal button must be selected.
The following specific models are currently available in SAMS under each category:
1. For Annual Modeling:
- Univariate ARMA(p,q) model.
- Univariate GAR(1) model.
- Multivariate AR(p) model (MAR).
- Contemporaneous ARMA(p,q) model (CARMA).
- Multivariate annual (spatial) disaggregation.
2. For Seasonal Modeling:
- Univariate PARMA(p,q) model.
- Univariate seasonal disaggregation.
- Multivariate PAR(p) model (MPAR).
- Multivariate seasonal disaggregation.

Fig.11 Stochastic modeling menu


Figures 12 and 13 display the menus that can be used for selecting annual and seasonal
models, respectively. The user will need to click on the button corresponding to the desired model
and in turn a modeling menu will appear where the site number, the model order, etc. can be
specified. For example, Fig.14 shows a menu that can be used to fit a PARMA(p,q) model. Similar
menus are available for ARMA, GAR(1), MAR, CARMA, and MPAR models. The user needs to specify the
station(s) or site(s) number(s). If standardization of the data is desired, one must click on the
"Standardize Data" button. Generally, the modeling is performed with data from which the mean has
been subtracted. Standardization implies that not only will the mean be subtracted but the data
will also be scaled to have a standard deviation equal to one. For example, for the data of season
5, the mean for season 5 is subtracted from each data point, and then each data point for that
season is divided by the standard deviation of the 5th season. As a result, the mean and the
standard deviation of the standardized data of the 5th season become equal to zero and one,
respectively. Then, the order of the model to be fitted can be selected by clicking on the "Enter
model order" button. For instance, one must enter p and q for ARMA models. In the case of MAR
or MPAR models, the user needs to key in the order p only. Subsequently, the method of estimation
of the model parameters must be selected.

Fig.12 Annual stochastic modeling menu

Currently SAMS provides two methods of estimation namely the method of moments
(MOM) and the least squares (LS) method. MOM is available for the ARMA(p,q), GAR(1),
MAR(p), PARMA(p,1), and MPAR(p) models while LS is available for ARMA(p,q), CARMA(p,q),
and PARMA(p,q) models. The LS method requires initial parameter estimates (starting points).
These starting points can be selected by the user, or the MOM parameter estimates can be used as
the starting points. For cases where the MOM estimates are not available, such as for the
PARMA(p,q) model where q>1, the MOM parameter estimates of the closest model will be used instead.
For example, for the PARMA(3,3) model, the MOM estimates of the PARMA(3,1) model (including
zeros for the two remaining parameters) will be used as the starting points. For fitting CARMA(p,q)
models, the residual variance-covariance G matrix can be estimated using either the method of
moments (MOM) or the maximum likelihood estimation (MLE) method (Stedinger et al., 1985).

Fig.13 Seasonal stochastic modeling menu

The estimated model parameters can be saved in a file selected by the user. This can be done
by clicking on the "Save" button in the estimation of parameters window; a menu will then appear
in which the user can assign the file name, as shown in Fig.15. The file is written in a specific
format and it is recommended that the user not change or edit this file unless it is necessary.
Saving the parameters in a file is important since this file will be used by SAMS in the
generation of data, as described in the next sections.

Fig.14 SAMS modeling menu

Fig.15 SAMS model parameter window

After the model has been fitted and the estimated parameters have been saved, it is
recommended that the fitted model be tested to ensure that it is appropriate for the data at hand. In
general, this can be done by testing the residuals and comparing the model and historical properties
of the data. SAMS has the ability to perform such testing. Testing of the residuals is an important
part of the modeling process by which the modeler can test whether the fitted model is adequate.
In all the models available in the current version of SAMS except the GAR(1) model, the basic
assumptions about the residuals are that they are normal and independent. SAMS performs certain
statistical tests to check the validity of these assumptions. The hypothesis that the residuals are
normally distributed is tested based on the skewness test of normality. The results are presented in
terms of rejecting or not rejecting the hypothesis. In addition, the residuals are plotted on normal
probability paper in order to check graphically whether the residuals are normally distributed. For
testing the independence of the residuals, the Porte Manteau test of independence (Salas, et al, 1980)
is utilized. The correlogram of the residuals is also plotted to help the user in checking the
independence of the residuals. Figure 16 shows an example of results of both normality and
independence tests of the residuals.
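
The two residual checks described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used inside SAMS: the skewness check uses the large-sample approximation that the skewness of a normal sample is roughly normal with variance 6/N (tabulated limits are normally used for short records), and the portmanteau statistic is taken as Q = N times the sum of the squared residual autocorrelations, to be compared with a chi-square value with L - p - q degrees of freedom. The function names and the simulated residuals are hypothetical.

```python
import math
import random

def autocorr(x, k):
    """Lag-k sample autocorrelation r_k = m_k / m_0."""
    n = len(x)
    mean = sum(x) / n
    m0 = sum((v - mean) ** 2 for v in x) / n
    mk = sum((x[t + k] - mean) * (x[t] - mean) for t in range(n - k)) / n
    return mk / m0

def skewness(x):
    """Sample skewness coefficient (third central moment over s cubed)."""
    n = len(x)
    mean = sum(x) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return (sum((v - mean) ** 3 for v in x) / n) / s ** 3

def skewness_test(res, z=1.96):
    """Large-sample check: under normality g is roughly N(0, 6/N)."""
    g = skewness(res)
    bound = z * math.sqrt(6.0 / len(res))
    return g, bound, abs(g) <= bound   # True: normality is not rejected

def portmanteau_q(res, max_lag, p, q):
    """Return Q and the degrees of freedom (max_lag - p - q) of its chi-square reference."""
    n = len(res)
    q_stat = n * sum(autocorr(res, k) ** 2 for k in range(1, max_lag + 1))
    return q_stat, max_lag - p - q

# Hypothetical residuals standing in for those of a fitted ARMA(1,1) model
random.seed(1)
residuals = [random.gauss(0.0, 1.0) for _ in range(200)]
g, bound, normal_ok = skewness_test(residuals)
q_stat, dof = portmanteau_q(residuals, max_lag=20, p=1, q=1)
```

The value of Q is then compared with a tabulated chi-square quantile for the returned degrees of freedom; independence is not rejected when Q is smaller than that quantile.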

Fig.16 Testing the normality and the independence of the residuals


Once the model has been fitted to the data, its moments, e.g. the theoretical covariance
structure, can be calculated from the estimated parameters. Comparing the model and historical
covariance (correlation) structure is another method of testing. SAMS provides the user with the
ability to perform such comparisons. The user must click on the "Comparing Model and Historical
Correlations" button and then a window will appear in which the theoretical and historical
correlograms are presented in graphical or tabular format. Figure 17 is an example of a graphical
comparison of model and historical month-to-month correlations. Additional examination of the
model can be made regarding model parsimony. The so-called Akaike Information Criterion (AIC)
may be used for this purpose. SAMS uses the AIC for testing model parsimony when stationary ARMA
models are utilized.
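
As an illustration of how the AIC can be used to compare candidate ARMA orders, the sketch below assumes the commonly used form AIC(p,q) = N ln(sigma^2(e)) + 2(p+q), where N is the sample size and sigma^2(e) is the estimated noise variance; the exact expression used by SAMS is not given in this section, and the residual variances below are hypothetical.

```python
import math

def aic_arma(n, noise_var, p, q):
    """AIC(p,q) = N ln(sigma^2(e)) + 2 (p + q); the smallest value is preferred."""
    return n * math.log(noise_var) + 2 * (p + q)

# Hypothetical noise variances obtained after fitting three candidate models
candidates = {(1, 0): 0.52, (1, 1): 0.50, (2, 1): 0.499}
n_years = 60
scores = {order: aic_arma(n_years, var, *order) for order, var in candidates.items()}
best = min(scores, key=scores.get)
```

With these numbers the ARMA(1,1) model is selected: the small reduction in noise variance achieved by ARMA(2,1) does not offset the penalty for the additional parameter.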

Fig.17 Comparing the model and the historical correlograms

Fig.18 Seasonal disaggregation modeling menu

Fig.19 Spatial and temporal adjustment method menu

Figure 18 illustrates the seasonal disaggregation menu when scheme 1 is chosen under
multivariate seasonal disaggregation (refer to Fig.13). In disaggregation modeling, the user should
conduct the process step by step following the order of the menus. The steps that have been
completed will be marked successively with relevant text or double arrows to update the user. At
the end of disaggregation modeling, the user may click on "Definition of Spatial and Temporal
Adjustment" to define the "adjustment methods" (refer to Fig.19) and the corresponding system
structure (refer to Fig.20) for the stations (sites) that are subject to modeling. This is
necessary if adjustments are needed for the generated series. The system structure for adjustment
usually depends upon the order and positions of the stations relative to each other. This is
important when adjustments need to be made to the generated series based on spatial
disaggregation. Defining the system structure means specifying, for each main river system, the
sequence of stations (sites) that make up the river network. SAMS uses the concept of key stations
and subkey stations (substations and subsequent stations). A key station is the farthest
downstream station along a main stream. For instance, station 1 is a key station in the river
system shown in Fig.21. Likewise, 2 and 3 are also key stations. On the other hand, if station 1
did not exist (or were not used in the analysis), then stations 4 and 5 would become key
stations. Let us continue the explanation assuming that stations 1, 2, and 3 in Fig.21 are key
stations. Substations are the next upstream stations draining to a key station. For instance,
stations 4 and 5 are substations draining to key station 1. Likewise, stations 6 and 7 and
stations 8 and 9 are, respectively, substations for key stations 2 and 3. Subsequent stations are
the next upstream stations draining into a substation. For instance, stations 11 and 12 are
subsequent stations relative to substation 5 and station 10 is a subsequent station relative to
substation 4.
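
The key station, substation, and subsequent station definitions can be illustrated with a small data structure. The sketch below is hypothetical and is not the SAMS input format: it stores, for each station mentioned in the text for Fig.21, the next station downstream, and derives the role of each station from that mapping.

```python
# Next station downstream of each station, as stated in the text for Fig.21.
# None marks a station with no downstream station in the analysis.
downstream = {1: None, 2: None, 3: None,
              4: 1, 5: 1, 6: 2, 7: 2, 8: 3, 9: 3,
              10: 4, 11: 5, 12: 5}

def classify(network):
    """Label each station as a key station, a substation, or a subsequent station."""
    roles = {}
    for station, target in network.items():
        if target is None:
            roles[station] = "key"            # farthest downstream station
        elif network[target] is None:
            roles[station] = "substation"     # drains directly to a key station
        else:
            roles[station] = "subsequent"     # drains to a substation
    return roles

def drop_station(network, removed):
    """Remove a station; the stations draining to it become the farthest downstream ones."""
    return {s: (None if t == removed else t)
            for s, t in network.items() if s != removed}

roles = classify(downstream)
roles_without_1 = classify(drop_station(downstream, 1))
```

Removing station 1 reproduces the case discussed in the text: stations 4 and 5 become key stations, and station 10 in turn becomes a substation of station 4.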
On the other hand, for defining a "disaggregation configuration" SAMS uses the concept of
groups. As shown in Fig.22, a group consists of one or more key stations and their corresponding
substations. Groups must be defined in each disaggregation step. Each group contains a certain
number of stations to be modeled in a multivariate fashion or "jointly" in order to preserve their
cross-correlations. For instance, if a certain group has two key stations and three substations,
then the disaggregation process will preserve the cross-correlations between all the key stations
and the substations. On the other hand, if two separate groups are selected, then the
cross-correlations between the stations that belong to the same group will be preserved, but the
cross-correlations between stations belonging to different groups will not be preserved.

Fig.20 System structure input menu for key station and substations

The definition of a group is very important in the disaggregation process. For instance,
referring to Fig. 22, key stations 1 and 2 and substations 4, 5, 6, and 7 form one group in which the
flows of all these stations are modeled jointly in a multivariate framework, while key station 3 and
its substations 8 and 9 form another group. In this case, the cross-correlations between the stations
within each group will be preserved but the cross-correlations among stations in different groups
will not be preserved. For example, in the above configuration, the cross-correlations between
stations 1 and 3 will not be preserved but the cross-correlations between stations 1 and 2 will be
preserved. On the other hand, if all the stations are defined in a single group, then the
cross-correlations between all the stations will be preserved. In the final step of
disaggregation, a group may contain stations 4, 5, 10, 11, and 12. In the current version of SAMS,
the total combined number of stations in any defined group must not exceed 10 stations.

Fig.21 Schematic representation of a streamflow network

Fig.22 Disaggregation configuration input menu for key station and substations

After modeling the annual flows using the above configuration, the annual flows can be
disaggregated into seasonal flows. This is handled again by using the concept of groups as was
explained above. The user, for example, can choose stations 3, 8, 9, 17, 18, and 19 as one group.
In this case, the annual flows for these stations will be disaggregated into seasonal flows by a
multivariate disaggregation model so as to preserve the seasonal cross-correlations between all
the stations.
Currently, SAMS has two schemes for modeling the key stations. The first scheme, denoted
as scheme 1 (see the modeling menus of Figs.12 and 13), will aggregate the annual flows of the key
stations that belong to a certain group, model the aggregated flows with a univariate ARMA(p,q)
model, and then disaggregate the aggregated annual flows (spatially) back to each key station by
using the Valencia and Schaake or the Mejia and Rousselle disaggregation method. The second
scheme, denoted as scheme 2, will model the annual flows of the key stations belonging to a given
group by a multivariate MAR(p) model. Once the flows at key stations are modeled, the rest of the
procedure for generating annual flows at all substations and subsequent stations and then for
generating the seasonal flows at all stations is the same as in scheme 1 (as mentioned above).
Additional details about disaggregation modeling are shown in chapter 3, where a mathematical
description of the disaggregation methods is presented, and in chapter 4, where an example of
disaggregation modeling applied to real data is given.
2.4 Generating Synthetic Series
Data generation is an important subject in stochastic hydrology and has received a lot of
attention in hydrologic literature. Data generation is used by hydrologists for many purposes. These
include, for example, reservoir sizing, planning and management of an existing reservoir, and
reliability of a water resources system such as a water supply or irrigation system (Salas et al,1980).
Stochastic data generation can aid in making key management decisions especially in critical
situations such as extended drought periods (Frevert et al, 1989). The main philosophy behind
synthetic data generation is that synthetic samples are generated which preserve certain statistical
properties that exist in the natural hydrologic process (Lane and Frevert, 1990). As a result, each
generated sample and the historic sample are equally likely to occur in the future. The historic
sample is not more likely to occur than any of the generated samples (Lane and Frevert, 1990).
Generation of synthetic time series is based on the models, approaches and schemes
presented in section 2.3 of this manual. Once the model has been defined and the parameters have
been estimated, one can generate synthetic samples based on this model. SAMS allows the user to
generate synthetic data and eventually compare important statistical characteristics of the
historical and the generated data. Such comparison is important for checking whether the model
used in generation is adequate or not. If important historical and generated statistics are
comparable, then one can argue that the model is adequate. The generated data are stored in a
file. This allows the user to further analyze the generated data as needed. Furthermore, when data
generation is based on spatial or temporal disaggregation, one may like to make adjustments to the
generated data. This may be necessary in many cases to enforce that the disaggregated quantities
add up to the original total quantity. For example, spatial adjustments may be necessary if the
annual flow at a key station is exactly the sum of the annual flows at the corresponding
substations. Likewise, in the case of temporal disaggregation, one may like to ensure that the sum
of the monthly values adds up to the annual value. Various options of adjustments are included in
SAMS. Spatial and temporal adjustments are described further in Section 4.8.2.
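
To make the idea of a temporal adjustment concrete, the sketch below rescales twelve generated monthly values proportionally so that they add up to the generated annual value. Proportional scaling is only one simple possibility chosen here for illustration; the adjustment options actually implemented in SAMS are those described in Section 4.8.2 and may differ. The function name and the flow values are hypothetical.

```python
def proportional_adjust(parts, total):
    """Scale disaggregated values so that their sum equals the given total."""
    current = sum(parts)
    if current == 0:
        raise ValueError("cannot rescale values that sum to zero")
    return [v * total / current for v in parts]

# Hypothetical generated monthly flows (sum = 190) and generated annual flow
monthly = [12.0, 9.0, 7.5, 6.0, 8.0, 15.0, 30.0, 42.0, 25.0, 14.0, 10.0, 11.5]
annual = 200.0
adjusted = proportional_adjust(monthly, annual)
```

The same function could be applied to a spatial case, with the substation flows as the parts and the key station flow as the total.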
Figure 23 shows the data generation menu.
In this menu the user must specify necessary
information for the generation process. The type of
data to generate (either annual or seasonal) and the
type of modeling, which is either univariate (single
site) or multivariate (multisite) must be selected.
For example, if the user wants to generate annual
data at a single station by using an ARMA model,
then the option "Annual" and "Single site" must be
selected. On the other hand, to generate seasonal
data at several stations from a disaggregation model,
one must select "Seasonal" and "Multisite". In
addition, the data length (in years) and the number
of samples to be generated, and a seed number to
initiate the generation process need to be specified.
In this version of SAMS, both the number of samples and the length of data to be generated are
unlimited. The user should consider, however, the computer time it will take to generate many
samples or very long samples, especially if the generation is to be done for multisite seasonal
data.

Fig.23 SAMS generation menu

Furthermore, one of four options regarding the generation model, as shown in the dialog box
in Fig.23, must be chosen. One must select "Yes" if SAMS was used to fit the model from which
data are to be generated. On the other hand, if one would like to generate data using one of the
models available in SAMS, but the model was not fitted by SAMS, then the "No" option must be
selected. To illustrate this point further, let us assume that the user fitted an ARMA(1,1) model
by using an estimation method which is not available in the current version of SAMS or by using a
different package, but wants to generate data using SAMS. Then, the user should select either the
first or the second "No" option to generate the required data. Another difference between the "Yes"
and the "No" options is that after generating the data SAMS will compare the generated and
historical statistics only if the "Yes" option is selected. In the second "No" option the user will
open a (parameter) file which must contain the model parameters. This parameter file has to be in
a certain format to be recognized by SAMS. The format of this file must be exactly the same as the
format of the parameter file that SAMS generates after fitting a stochastic model as mentioned in
section 2.3. To make sure of this, the user may like to run SAMS to generate a parameter file using
the model desired, then edit the parameter file to insert the new parameter set. Again for
clarification, let us consider the ARMA(1,1) model where a method different from those available
in SAMS was used to estimate the parameters. SAMS can be used to fit an ARMA(1,1) model to the same
data using, say, MOM estimation. Then the MOM parameters can be saved in a file and the file can
be edited to replace the MOM parameters by the desired set of parameters. In this case, the user
needs to change the parameters $\phi_1$, $\theta_1$, and $\sigma^2(e)$ (refer to Section 4.2 for
details). One must be aware that this file must also contain the transformation parameters if a
transformation was used. Finally, SAMS will generate data from the referred model based on the
parameters contained in the edited file.
After providing all the information needed for data generation, the user can click on the "Ok"
button shown in Fig.23. A generation menu will appear on the screen which will allow the user to
open the file which contains the model parameters. For example, Fig.24 will appear if the options
to generate single site and seasonal data were chosen. By clicking on the "Open Model Parameters
File" button, a window will appear which will allow the user to select the file that contains the
model parameters as shown in Fig.24. After clicking on the "Generate and Save Data" button (also
shown in Fig.24) another menu will appear so that a file name (with the extension .gen
automatically attached) can be assigned to store the generated data. If the generation is based on
a disaggregation model, a menu as shown in Fig.19 will appear to remind the user about the
adjustment methods (which should have been read from the previously referred parameter file). One
can also make changes to the adjustment methods at this point. Next, if statistical analysis of
the generated data is desired, the "Statistical Analysis of Generated Data" button must be clicked
on and another menu box as in Fig.25 will appear which will enable one to view the results. For
example, the time series of the generated data will be shown by clicking on the "Plot Time Series"
button. In the case of analysis pertaining to drought, surplus, and storage related statistics,
SAMS will ask the user to input the desired threshold demand level, as shown in Fig.26. The default
demand level is the sample mean, but one can change it by keying in a fraction of the sample mean
or the actual desired demand level. The results of the statistical analysis of the generated data
can be saved into a file by clicking on the "Save Statistical Analysis" button. This will create a
file with the extension .gst automatically attached to store the results. Note that the referred
feature of the statistical comparison of the historical and generated data can also be used for
further testing and verifying whether the fitted model performs as desired.

Fig.24 Univariate seasonal generation menu

Fig.25 Seasonal statistical characteristics of generated data menu

Fig.26 Window regarding the demand level

Fig.27 Time series plots of the historical and generated annual flows

In estimating the generated statistics, the statistics of each generated sample are first
estimated, and then the means and standard deviations of those statistics are computed and
compared with their historical counterparts. The results are presented in graphical or tabular
formats. Figure 27 shows a comparison of the (observed) historical annual series and the generated
series for one sample. The user can change the station number, sample number, and the graph scale
as needed. For annual series, the comparisons of the historical and generated mean, standard
deviation, skewness coefficient, coefficient of variation, and sample maximum and minimum are
presented in tabular form. For seasonal series, the comparisons are presented in both graphical
and tabular formats as shown in Fig.28. The comparisons of correlations for annual and seasonal
data may be presented in graphical or tabular formats as shown in Fig.29 (for seasonal data). The
comparisons of drought, surplus, and storage related statistics include the longest drought,
maximum deficit, longest surplus, maximum surplus, storage capacity, rescaled range, and Hurst
coefficient. Before showing these results, a window as in Fig.26 will pop up again to allow the
user to change the demand level if needed. The results are presented in tabular format and box
plots as shown in Fig.30. The box plots reflect the ratios of the means, quartiles, maximums, and
minimums of those statistics calculated from the generated series to the observed historical
values. The scale of the box plot can be adjusted by the user based on the ratio ranges provided
in the dialog box.
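
The way generated statistics are summarized across samples can be sketched as follows. The code is an illustration only: the helper names are hypothetical, the "historical" record and the generated samples are simulated stand-ins, and the standard deviation uses the 1/N form.

```python
import math
import random

def mean(x):
    return sum(x) / len(x)

def std(x):
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

def summarize(samples, statistic):
    """Apply a statistic to every generated sample, then take the mean and the
    standard deviation of the resulting values across samples."""
    values = [statistic(s) for s in samples]
    return mean(values), std(values)

random.seed(7)
historical = [random.gauss(100.0, 20.0) for _ in range(50)]      # stand-in record
generated = [[random.gauss(100.0, 20.0) for _ in range(50)]      # 100 samples of 50 years
             for _ in range(100)]

gen_mean_of_std, gen_sd_of_std = summarize(generated, std)
ratio = gen_mean_of_std / std(historical)   # generated-to-historical ratio, as in the box plots
```

A ratio close to one suggests that the model reproduces the historical value of the statistic; the spread across samples indicates how much sampling variability to expect.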

Fig.28 Comparison between the historical and the generated monthly mean and standard deviations

Fig.29 Comparisons of the historical and generated seasonal cross-correlations

Fig.30 Comparison of drought, surplus, and storage related statistics

Finally, the Status button has been added in all window menus in order to keep track of
all major results and options selected throughout the analysis, modeling, and generation exercise.
For example, by clicking on the Status button under any menu or window, the user can review the
transformation methods and coefficients utilized for each site, the fitted model including
parameters and adjustment options, etc., and information related to the data generation, as shown
in Fig.31.

Fig.31 Example of update information regarding the transformation, modeling, and generation
steps. This view is shown by clicking on Status


3 DEFINITION OF STATISTICAL CHARACTERISTICS


A time series process can be characterized by a number of statistical properties such as the
mean, standard deviation, coefficient of variation, skewness coefficient, season-to-season
correlations, autocorrelations, cross-correlations, and storage and drought related statistics. These
statistics are defined for both annual and seasonal data as shown below.
3.1 Basic Statistics

3.1.1 Annual Data


The mean and the standard deviation of a time series $y_t$ are estimated by

$$\bar{y} = \frac{1}{N} \sum_{t=1}^{N} y_t \tag{3.1}$$

and

$$s = \sqrt{\frac{1}{N} \sum_{t=1}^{N} (y_t - \bar{y})^2} \tag{3.2}$$

respectively, where N is the sample size. The coefficient of variation is defined as
$cv = s / \bar{y}$. Likewise, the skewness coefficient is estimated by

$$g = \frac{\frac{1}{N} \sum_{t=1}^{N} (y_t - \bar{y})^3}{s^3} \tag{3.3}$$

The sample autocorrelation coefficients $r_k$ of a time series may be estimated by

$$r_k = \frac{m_k}{m_0} \tag{3.4}$$

where

$$m_k = \frac{1}{N} \sum_{t=1}^{N-k} (y_{t+k} - \bar{y})(y_t - \bar{y}) \tag{3.5}$$

and k = time lag. Likewise, for multisite series, the lag-k sample cross-correlations between site i
and site j, denoted by $r_k^{ij}$, may be estimated by

$$r_k^{ij} = \frac{m_k^{ij}}{(m_0^{ii} \, m_0^{jj})^{1/2}} \tag{3.6}$$

where

$$m_k^{ij} = \frac{1}{N} \sum_{t=1}^{N-k} (y_{t+k}^{(i)} - \bar{y}^{(i)})(y_t^{(j)} - \bar{y}^{(j)}) \tag{3.7}$$

in which $m_0^{ii}$ is the sample variance for site i.
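
The annual statistics of Eqs. (3.1) through (3.7) can be computed with a few lines of code. The sketch below is a plain transcription of those estimators (all with the 1/N divisor); the function names and the short data series are hypothetical.

```python
import math

def annual_stats(y):
    """Mean, standard deviation, coefficient of variation, and skewness."""
    n = len(y)
    mean = sum(y) / n                                        # Eq. (3.1)
    s = math.sqrt(sum((v - mean) ** 2 for v in y) / n)       # Eq. (3.2)
    g = (sum((v - mean) ** 3 for v in y) / n) / s ** 3       # Eq. (3.3)
    return {"mean": mean, "std": s, "cv": s / mean, "skew": g}

def cross_covariance(yi, yj, k):
    """Lag-k cross-covariance between two sites, Eq. (3.7)."""
    n = len(yi)
    mi, mj = sum(yi) / n, sum(yj) / n
    return sum((yi[t + k] - mi) * (yj[t] - mj) for t in range(n - k)) / n

def cross_correlation(yi, yj, k):
    """Lag-k cross-correlation, Eq. (3.6)."""
    return cross_covariance(yi, yj, k) / math.sqrt(
        cross_covariance(yi, yi, 0) * cross_covariance(yj, yj, 0))

def autocorrelation(y, k):
    """Lag-k autocorrelation, Eqs. (3.4) and (3.5), as a special case of Eq. (3.6)."""
    return cross_correlation(y, y, k)

flows = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical annual flows
stats = annual_stats(flows)
r1 = autocorrelation(flows, 1)
```

For this series the mean is 5 and the standard deviation is 2, so the coefficient of variation is 0.4.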


3.1.2 Seasonal data
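Seasonal hydrologic time series, such as monthly flows, are better characterized by seasonal
statistics. Let $y_{\nu,\tau}$ be the seasonal time series, where $\nu$ represents years and
$\tau$ seasons; $\nu = 1, ..., N$ with N = number of years, and $\tau = 1, ..., \omega$, with
$\omega$ = number of seasons. The mean and standard deviation for season $\tau$ can be estimated by

$$\bar{y}_\tau = \frac{1}{N} \sum_{\nu=1}^{N} y_{\nu,\tau} \tag{3.8}$$

and

$$s_\tau = \sqrt{\frac{1}{N} \sum_{\nu=1}^{N} (y_{\nu,\tau} - \bar{y}_\tau)^2} \tag{3.9}$$

respectively. The seasonal coefficient of variation is $cv_\tau = s_\tau / \bar{y}_\tau$.
Similarly, the seasonal skewness coefficient is estimated by

$$g_\tau = \frac{\frac{1}{N} \sum_{\nu=1}^{N} (y_{\nu,\tau} - \bar{y}_\tau)^3}{s_\tau^3} \tag{3.10}$$

The sample lag-k season-to-season correlation coefficient may be estimated by

$$r_{k,\tau} = \frac{m_{k,\tau}}{(m_{0,\tau} \, m_{0,\tau-k})^{1/2}} \tag{3.11}$$

where

$$m_{k,\tau} = \frac{1}{N} \sum_{\nu=1}^{N} (y_{\nu,\tau} - \bar{y}_\tau)(y_{\nu,\tau-k} - \bar{y}_{\tau-k}) \tag{3.12}$$

in which $m_{0,\tau}$ represents the sample variance for season $\tau$. Likewise, for multisite
series, the lag-k sample cross-correlations between site i and site j, for season $\tau$,
$r_{k,\tau}^{ij}$, may be estimated by

$$r_{k,\tau}^{ij} = \frac{m_{k,\tau}^{ij}}{(m_{0,\tau}^{ii} \, m_{0,\tau-k}^{jj})^{1/2}} \tag{3.13}$$

and

$$m_{k,\tau}^{ij} = \frac{1}{N} \sum_{\nu=1}^{N} (y_{\nu,\tau}^{(i)} - \bar{y}_\tau^{(i)})(y_{\nu,\tau-k}^{(j)} - \bar{y}_{\tau-k}^{(j)}) \tag{3.14}$$

in which $m_{0,\tau}^{ii}$ represents the sample variance for season $\tau$ and site i. Note that
in Eqs. (3.11) through (3.14), when $\tau - k < 1$, the terms $\nu = 1$, $y_{\nu,\tau-k}$,
$\bar{y}_{\tau-k}$, $m_{0,\tau-k}$, $y_{\nu,\tau-k}^{(j)}$, $\bar{y}_{\tau-k}^{(j)}$, and
$m_{0,\tau-k}^{jj}$ are replaced by $\nu = 2$, $y_{\nu-1,\omega+\tau-k}$,
$\bar{y}_{\omega+\tau-k}$, $m_{0,\omega+\tau-k}$, $y_{\nu-1,\omega+\tau-k}^{(j)}$,
$\bar{y}_{\omega+\tau-k}^{(j)}$, and $m_{0,\omega+\tau-k}^{jj}$, respectively.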
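
The season-to-season correlation, including the wrap-around to the previous year when the lagged season falls before the first season, can be sketched as follows. The code assumes zero-based indices (season 0 is the first season), a lag k smaller than the number of seasons, and data stored as one row per year; these conventions and the small data set are assumptions of the example.

```python
import math

def season_to_season_corr(y, tau, k):
    """Lag-k correlation between season tau and season tau - k, Eqs. (3.11)-(3.12).

    y[v][tau] holds year v and season tau (both zero-based); assumes 0 <= k < number of seasons.
    """
    n, omega = len(y), len(y[0])
    lagged, shift = tau - k, 0
    if lagged < 0:                 # lagged season belongs to the previous year
        lagged, shift = lagged + omega, 1
    mean_a = sum(row[tau] for row in y) / n
    mean_b = sum(row[lagged] for row in y) / n
    var_a = sum((row[tau] - mean_a) ** 2 for row in y) / n
    var_b = sum((row[lagged] - mean_b) ** 2 for row in y) / n
    # The sum starts at the second year when the lagged value comes from the previous year
    cov = sum((y[v][tau] - mean_a) * (y[v - shift][lagged] - mean_b)
              for v in range(shift, n)) / n
    return cov / math.sqrt(var_a * var_b)

# Hypothetical record: 4 years, 2 seasons; the second season is twice the first
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
r_within_year = season_to_season_corr(data, tau=1, k=1)   # season 2 vs season 1, same year
r_across_year = season_to_season_corr(data, tau=0, k=1)   # season 1 vs season 2 of previous year
```

In the wrapped case only N - 1 products are available, but the sum is still divided by N, as in the equations above.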

3.2 Storage, Drought, and Surplus Related Statistics

3.2.1 Storage Related Statistics


The storage-related statistics are particularly important in modeling time series for
simulation studies of reservoir systems. Such characteristics are generally functions of the
variance and autocovariance structure of a time series. Consider the time series $y_i$,
i = 1, ..., N and a subsample $y_1, ..., y_n$ with $n \le N$. Form the sequence of partial sums
$S_i$ as

$$S_i = S_{i-1} + (y_i - \bar{y}_n), \quad i = 1, ..., n \tag{3.15}$$

where $S_0 = 0$ and $\bar{y}_n$ is the sample mean of $y_1, ..., y_n$ which is determined by
Eq. (3.1). Then, the adjusted range $R_n^*$ and the rescaled adjusted range $R_n^{**}$ can be
calculated by

$$R_n^* = \max(S_0, S_1, ..., S_n) - \min(S_0, S_1, ..., S_n) \tag{3.16}$$

and

$$R_n^{**} = \frac{R_n^*}{s_n} \tag{3.17}$$

respectively, in which $s_n$ is the standard deviation of $y_1, ..., y_n$ which is determined by
Eq. (3.2). Likewise, the Hurst coefficient for a series is estimated by

$$K = \frac{\ln(R_n^{**})}{\ln(n/2)}, \quad n > 2 \tag{3.18}$$

The calculation of the storage capacity is based on the sequent peak algorithm (Loucks
et al., 1981) which is equivalent to the Rippl mass curve method. The algorithm, applied to the
time series $y_i$, i = 1, ..., N, may be described as follows. Based on $y_i$ and the demand
level d, a new sequence $S_i'$ can be determined as

$$S_i' = \begin{cases} S_{i-1}' + d - y_i & \text{if positive} \\ 0 & \text{otherwise} \end{cases} \tag{3.19}$$

where $S_0' = 0$. Then the storage capacity is obtained as

$$S_c = \max[S_1', ..., S_N'] \tag{3.20}$$

Note that the algorithms described in Eqs. (3.15) to (3.20) apply also to seasonal series. In
this case, the underlying seasonal series $y_{\nu,\tau}$ is simply denoted as $y_t$.
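
The storage-related statistics can be computed directly from Eqs. (3.15) through (3.20). The sketch below takes the whole series as the subsample (n = N); the function names and the example values are hypothetical.

```python
import math

def rescaled_adjusted_range(y):
    """Return the adjusted range R* and the rescaled adjusted range R**."""
    n = len(y)
    mean = sum(y) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in y) / n)
    partial = [0.0]                                   # S_0 = 0
    for v in y:
        partial.append(partial[-1] + (v - mean))      # Eq. (3.15)
    r_star = max(partial) - min(partial)              # Eq. (3.16)
    return r_star, r_star / s                         # Eq. (3.17)

def hurst_coefficient(y):
    """Hurst coefficient K of Eq. (3.18); requires more than two values."""
    _, r_rescaled = rescaled_adjusted_range(y)
    return math.log(r_rescaled) / math.log(len(y) / 2.0)

def storage_capacity(y, demand):
    """Sequent peak algorithm, Eqs. (3.19) and (3.20)."""
    deficit, capacity = 0.0, 0.0
    for v in y:
        deficit = max(0.0, deficit + demand - v)      # accumulated shortage, reset when refilled
        capacity = max(capacity, deficit)
    return capacity

r_star, r_rescaled = rescaled_adjusted_range([1.0, 2.0, 3.0, 4.0])
k_hurst = hurst_coefficient([1.0, 2.0, 3.0, 4.0])
capacity = storage_capacity([5.0, 3.0, 2.0, 6.0, 4.0], demand=4.0)
```

For the second series and a demand of 4, the accumulated shortage reaches 1 and then 3 before the reservoir starts to refill, so the required storage capacity is 3.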
3.2.2 Drought Related Statistics
The drought-related statistics are also important in modeling hydrologic time series. For
the series $y_i$, i = 1, ..., N, the demand level d may be defined as $d = \alpha \bar{y}$,
$0 < \alpha \le 1$ (for example, for $\alpha = 1$, $d = \bar{y}$). A deficit occurs when
$y_i < d$ consecutively during one or more years until $y_i > d$ again. Such a deficit can be
defined by its duration L, by its magnitude M, and by its intensity I = M/L. Assume that m
deficits occur in a given hydrologic sample; then the maximum deficit duration (longest drought
or maximum run-length) is given by

$$L^* = \max(L_1, ..., L_m) \tag{3.21}$$

and the maximum deficit magnitude (maximum run-sum) is defined by

$$M^* = \max(M_1, ..., M_m) \tag{3.22}$$

In SAMS, the longest drought duration and the maximum deficit magnitude are estimated for
both annual and seasonal series.
3.2.3 Surplus Related Statistics
For our purpose here, surplus related statistics are simply the opposite of drought related
statistics. Considering the same threshold level d, a surplus occurs when yi > d consecutively
until yi < d again. Then, assuming that m surpluses occur during a given time period N, the
maximum surplus period L* and maximum surplus magnitude M* may be determined also from
Eqs. (3.21) and (3.22).
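
The drought and surplus statistics can be obtained by scanning the series for runs below or above the demand level. In the sketch below the magnitude of a run is taken as the sum of the departures from the demand level over the run (the run-sum), which is an assumption of this example since the section does not spell out the formula for M; the function names and data are hypothetical.

```python
def runs(y, d, deficit=True):
    """Return a list of (duration, magnitude) for runs below d (or above d if deficit=False)."""
    found, length, total = [], 0, 0.0
    for v in y:
        gap = (d - v) if deficit else (v - d)
        if gap > 0:                       # the run continues
            length += 1
            total += gap
        elif length:                      # the run has just ended
            found.append((length, total))
            length, total = 0, 0.0
    if length:                            # a run still open at the end of the record
        found.append((length, total))
    return found

def longest_and_largest(run_list):
    """Maximum run-length L* and maximum run-sum M*, Eqs. (3.21) and (3.22)."""
    return (max((l for l, _ in run_list), default=0),
            max((m for _, m in run_list), default=0.0))

flows = [5.0, 3.0, 2.0, 6.0, 4.0, 3.0, 7.0]     # hypothetical series
demand = 4.5
droughts = runs(flows, demand)                   # deficits relative to the demand level
surpluses = runs(flows, demand, deficit=False)
longest_drought, max_deficit = longest_and_largest(droughts)
longest_surplus, max_surplus = longest_and_largest(surpluses)
```

Here two droughts of two periods each are found, with run-sums of 4.0 and 2.0, so the longest drought is 2 and the maximum deficit is 4.0.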
4 MATHEMATICAL MODELS
4.1 Data Transformations and Standardization
In cases where the normality tests indicate that the observed series are not normally
distributed, the data have to be transformed to normal before applying the models. To normalize
the data, the following transformations are available in SAMS:
- Logarithmic transformation

$$Y = \ln(X + a) \tag{4.1}$$

- Power transformation

$$Y = (X + a)^b \tag{4.2}$$

- Box-Cox transformation

$$Y = \frac{(X + a)^b - 1}{b}, \quad b \neq 0 \tag{4.3}$$

where Y is the normalized series, X is the original observed series, and a and b are transformation
coefficients. Note that the logarithmic transformation is simply the limiting form of the Box-Cox
transform as the coefficient b approaches zero. Also, the power transformation is a shifted and
scaled form of the Box-Cox transform. The variables Y and X can represent either annual or
seasonal data. For seasonal data a and b can be chosen to vary with the season. The normalized
data can then be standardized by subtracting the mean and dividing by the standard deviation
(standardization is actually an option in SAMS). For example, for seasonal series, the
standardization may be expressed as:

$$Z_{\nu,\tau} = \frac{Y_{\nu,\tau} - \bar{Y}_\tau}{S_\tau(Y)} \tag{4.4}$$

where $Z_{\nu,\tau}$ is the standardized series, and $\bar{Y}_\tau$ and $S_\tau(Y)$ are the mean
and the standard deviation of the transformed series for month $\tau$. Then, the stochastic models
can be fitted to the standardized series $Z_{\nu,\tau}$. For generating flows, the reverse
procedure is followed. After generating $Z_{\nu,\tau}$, then $Y_{\nu,\tau}$ can be obtained by

$$Y_{\nu,\tau} = \bar{Y}_\tau + S_\tau(Y) \, Z_{\nu,\tau} \tag{4.5}$$

and $X_{\nu,\tau}$ can be generated by applying the appropriate inverse transformation to the
$Y_{\nu,\tau}$ process. For example, if $X_{\nu,\tau}$ was transformed by a natural log
transformation, the process $X_{\nu,\tau}$ can be obtained from $Y_{\nu,\tau}$ by applying the
following inverse transformation:

$$X_{\nu,\tau} = \exp(Y_{\nu,\tau}) - a \tag{4.6}$$
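
The sequence of transformation, standardization, and back-transformation can be illustrated with a round trip on a small seasonal data set. The sketch below uses the logarithmic transformation of Eq. (4.1) with a single coefficient a for all seasons and the 1/N form of the seasonal standard deviation; the function names and data are hypothetical.

```python
import math

def log_transform(x, a):
    """Eq. (4.1) applied to a table x[v][tau] of seasonal values."""
    return [[math.log(v + a) for v in row] for row in x]

def standardize_by_season(y):
    """Eq. (4.4): subtract the seasonal mean and divide by the seasonal standard deviation."""
    n, omega = len(y), len(y[0])
    means = [sum(row[t] for row in y) / n for t in range(omega)]
    stds = [math.sqrt(sum((row[t] - means[t]) ** 2 for row in y) / n) for t in range(omega)]
    z = [[(row[t] - means[t]) / stds[t] for t in range(omega)] for row in y]
    return z, means, stds

def back_transform(z, means, stds, a):
    """Eq. (4.5) followed by the inverse logarithmic transformation of Eq. (4.6)."""
    return [[math.exp(means[t] + stds[t] * row[t]) - a for t in range(len(row))]
            for row in z]

observed = [[10.0, 20.0], [15.0, 35.0], [12.0, 28.0]]   # hypothetical: 3 years, 2 seasons
a = 1.0
z, means, stds = standardize_by_season(log_transform(observed, a))
recovered = back_transform(z, means, stds, a)
```

In a generation run the standardized values would come from the fitted model instead of the observed data, and the same back-transformation would return them to the original units.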

4.2 Univariate ARMA(p,q) Model


The ARMA(p,q) model may be expressed as:

$$\phi(B) \, Y_t = \theta(B) \, e_t \tag{4.7}$$

where $Y_t$ represents the streamflow process for year t, which is normally distributed with mean
zero and variance $\sigma^2(Y)$; $e_t$ is the uncorrelated noise term, which is also normally
distributed with mean zero and variance $\sigma^2(e)$; and $\phi(B)$ and $\theta(B)$ are
polynomials in B defined as

$$\phi(B) = 1 - \phi_1 B^1 - \phi_2 B^2 - \cdots - \phi_p B^p \tag{4.8a}$$

$$\theta(B) = 1 - \theta_1 B^1 - \theta_2 B^2 - \cdots - \theta_q B^q \tag{4.8b}$$

where $\phi_1, \phi_2, ..., \phi_p$ are the autoregressive parameters;
$\theta_1, \theta_2, ..., \theta_q$ are the moving average parameters; B is the backward shift
operator, i.e., $B^c Y_t = Y_{t-c}$; and p and q define the order of the ARMA model.
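The method of moments (MOM) may be used in parameter estimation of ARMA(p,q) models. For
example, the moment estimators for the ARMA(1,0), ARMA(1,1), and ARMA(2,1) models are shown
below:

- ARMA(1,0) model:

$$Y_t = \phi_1 Y_{t-1} + \varepsilon_t \qquad (4.9)$$

$$\hat{\phi}_1 = \frac{m_1}{s^2} \qquad (4.10)$$

$$\hat{\sigma}^2(\varepsilon) = (1 - \hat{\phi}_1^2)\, s^2 \qquad (4.11)$$

- ARMA(1,1) model:

$$Y_t = \phi_1 Y_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1} \qquad (4.12)$$

$$\hat{\phi}_1 = \frac{m_2}{m_1} \qquad (4.13)$$

$$\hat{\theta}_1 = \hat{\phi}_1 + \frac{s^2 - \hat{\phi}_1 m_1}{\hat{\phi}_1 s^2 - m_1} - \frac{1}{\hat{\theta}_1} \qquad (4.14)$$

$$\hat{\sigma}^2(\varepsilon) = \frac{\hat{\phi}_1 s^2 - m_1}{\hat{\theta}_1} \qquad (4.15)$$

in which $\hat{\theta}_1$ can be obtained by solving Eq. (4.14).

- ARMA(2,1) model:

$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \varepsilon_t - \theta_1 \varepsilon_{t-1} \qquad (4.16)$$

$$\hat{\phi}_1 = \frac{m_1 m_2 - s^2 m_3}{m_1^2 - s^2 m_2} \qquad (4.17)$$

$$\hat{\phi}_2 = \frac{m_1 m_3 - m_2^2}{m_1^2 - s^2 m_2} \qquad (4.18)$$

$$\hat{\theta}_1 = \hat{\phi}_1 + \frac{s^2 - \hat{\phi}_1 m_1 - \hat{\phi}_2 m_2}{\hat{\phi}_1 s^2 - m_1 + \hat{\phi}_2 m_1} - \frac{\hat{\phi}_1 s^2 - m_1 + \hat{\phi}_2 m_1}{(\hat{\phi}_1 s^2 - m_1 + \hat{\phi}_2 m_1)\,\hat{\theta}_1} \qquad (4.19)$$

$$\hat{\sigma}^2(\varepsilon) = \frac{\hat{\phi}_1 s^2 + \hat{\phi}_2 m_1 - m_1}{\hat{\theta}_1} \qquad (4.20)$$

where $s^2$ is the variance of $Y_t$ and $m_k$ is the estimate of the lag-k autocovariance of $Y_t$,
which is defined as $M_k = E[Y_t Y_{t-k}]$. In the foregoing models it is assumed that the mean has
been removed, i.e., $E(Y_t) = 0$. Note also that $s^2 = m_0$, and that the last term of Eq. (4.19)
reduces to $1/\hat{\theta}_1$.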
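As an illustration of the ARMA(1,1) moment estimators, the sketch below solves the implicit equation for the moving average parameter in closed form. Multiplying that equation through by the parameter turns it into a quadratic whose two roots are reciprocals of each other; choosing the root of smaller magnitude (the invertible one) is an assumption of this sketch, not something stated in the manual, and the function name is hypothetical.

```python
import math

def arma11_mom(s2, m1, m2):
    """Moment estimates (phi1, theta1, noise variance) for an ARMA(1,1) model.

    s2, m1, m2 are the sample variance and the lag-1 and lag-2 autocovariances.
    """
    phi = m2 / m1
    denom = phi * s2 - m1          # equals theta1 * sigma^2(e); zero for a pure AR(1)
    if denom == 0.0:
        raise ValueError("moments are consistent with theta1 = 0 (pure AR(1))")
    b = phi + (s2 - phi * m1) / denom
    disc = b * b - 4.0             # theta^2 - b*theta + 1 = 0
    if disc < 0.0:
        raise ValueError("no real solution for theta1")
    root = math.sqrt(disc)
    theta = min(((b - root) / 2.0, (b + root) / 2.0), key=abs)
    sigma2 = denom / theta
    return phi, theta, sigma2

# Theoretical moments of an ARMA(1,1) with phi1 = 0.6, theta1 = 0.3, sigma^2(e) = 1
phi, theta, sigma2 = arma11_mom(1.140625, 0.384375, 0.230625)
```

With exact population moments the call recovers the generating parameters; with sample moments the discriminant can be negative, in which case no real moment estimate exists.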
However, the Least Squares (LS) method is generally a more efficient parameter estimation
method. In this method, the parameters $\phi$'s and $\theta$'s are estimated by minimizing the sum
of squares of the residuals defined by

$$F = \sum_{t=1}^{N} \varepsilon_t^2 \qquad (4.21)$$

where $N$ is the number of years of data. For the ARMA(p,q) model, the residuals are defined as

$$\varepsilon_t = Y_t - \sum_{i=1}^{p} \phi_i Y_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} \qquad (4.22)$$

Once the $\phi$'s and $\theta$'s are determined, the noise variance $\sigma^2(\varepsilon)$ is
determined by $(1/N) \sum \varepsilon_t^2$. The minimization of the sum of squares of Eq. (4.21) may
be obtained by a numerical scheme. Powell's algorithm has been commonly employed for least squares
estimation of parameters of ARMA models. The Powell algorithm (Gill et al., 1981; Himmelblau,
1972) is an expanded version of the univariate gradient search, which is a useful optimization
technique that does not require derivatives. The moment estimates of ARMA(p,q) models may be
taken as the initial values in the search algorithm. Non-derivative optimization techniques
depend very much on the starting points when the objective function is not convex. In these
cases there is no guarantee that the solution found corresponds to the global minimum. The
solution may be improved by choosing a different starting point.
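The objective function that the search algorithm minimizes can be sketched as follows. The residual recursion is Eq. (4.22); setting the pre-sample values of the series and of the residuals to zero is an assumption of this sketch, and the function names are hypothetical. The optimizer itself (Powell's method or any other derivative-free search) is not shown.

```python
def arma_residuals(y, phi, theta):
    """Residuals of an ARMA(p,q) model, with pre-sample terms taken as zero."""
    p, q = len(phi), len(theta)
    e = []
    for t in range(len(y)):
        ar = sum(phi[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * e[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        e.append(y[t] - ar + ma)
    return e

def sum_of_squares(y, phi, theta):
    """Objective F of Eq. (4.21) for trial parameter values."""
    return sum(v * v for v in arma_residuals(y, phi, theta))

y = [1.0, 0.5, -0.2]                       # hypothetical mean-removed data
f = sum_of_squares(y, phi=[0.5], theta=[0.2])
noise_variance = f / len(y)                # (1/N) * sum of squared residuals
```

A search routine would call `sum_of_squares` repeatedly, starting from the moment estimates.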
To generate synthetic series from an ARMA model, Eq. (4.7) can be used. First, a standard
uncorrelated normal random variable $\xi_t$ is generated, then $\varepsilon_t$ is calculated as

$$\varepsilon_t = \sigma(\varepsilon)\, \xi_t \qquad (4.23)$$

To generate the correlated series $Y_t$, the warm-up procedure is followed. In this procedure,
values of $Y_t$ prior to $t = 1$ are assumed to be equal to the mean of the process (which is zero
in this case). Thus, $Y_1, Y_2, \ldots, Y_{N+L}$ can be generated using Eq. (4.7) by generating
$\varepsilon_{1-q}, \varepsilon_{2-q}, \varepsilon_{3-q}, \ldots$ from Eq. (4.23), where $N$ is the
required length to be generated and $L$ is the warm-up length required to remove the effect of the
initial assumptions of $Y_t$. $L$ is arbitrarily chosen as 50. The advantage of the warm-up
procedure is that it can be used for low order and high order stationary and periodic models,
while exact generation procedures available in the literature apply only for stationary ARMA
models or low order periodic models.
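A minimal sketch of the warm-up generation procedure is given below. The function name and arguments are hypothetical; discarding the first L generated values is how the warm-up is interpreted here.

```python
import random

def generate_arma(phi, theta, sigma_e, n, warmup=50, seed=None):
    """Generate n values of a zero-mean ARMA(p,q) series using a warm-up period."""
    rng = random.Random(seed)
    p, q = len(phi), len(theta)
    y = [0.0] * p                                            # Y before t = 1 set to the mean
    e = [sigma_e * rng.gauss(0.0, 1.0) for _ in range(q)]    # e_{1-q}, ..., e_0
    for _ in range(n + warmup):
        e_t = sigma_e * rng.gauss(0.0, 1.0)                  # Eq. (4.23)
        ar = sum(phi[i] * y[-1 - i] for i in range(p))
        ma = sum(theta[j] * e[-1 - j] for j in range(q))
        y.append(ar + e_t - ma)
        e.append(e_t)
    return y[-n:]                                            # drop initial and warm-up values

series = generate_arma(phi=[0.5], theta=[0.3], sigma_e=1.0, n=100, seed=1)
```

The generated values are in the transformed, mean-removed domain; the mean and any inverse transformation are applied afterwards.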
4.3 Univariate GAR(1) model
Gamma-autoregressive (GAR) models assume that the underlying series is dependent
with a gamma marginal distribution and the models do not require variable transformation.
SAMS provides modeling and data generation based on the GAR(1) model. The model
parameters are estimated based on a procedure suggested by Fernandez and Salas (1990).
The GAR(1) model can be expressed as (Lawrence and Lewis, 1981)

$$X_t = \phi X_{t-1} + \varepsilon_t \qquad (4.24)$$

where $X_t$ is a gamma variable defined at time $t$, $\phi$ is the autoregression coefficient, and
$\varepsilon_t$ is the independent noise term. $X_t$ is a three-parameter gamma distributed variable
with marginal density function given by:

$$f_X(x) = \frac{\alpha^{\beta} (x - \lambda)^{\beta - 1} \exp[-\alpha (x - \lambda)]}{\Gamma(\beta)} \qquad (4.25)$$

where $\lambda$, $\alpha$, and $\beta$ are the location, scale, and shape parameters, respectively.
Lawrence (1982) found that $\varepsilon_t$ can be obtained by the following scheme:

$$\varepsilon = \lambda (1 - \phi) + \eta \qquad (4.26)$$

where

$$\eta = \begin{cases} 0 & \text{if } M = 0 \\ \sum_{j=1}^{M} Y_j\, \phi^{U_j} & \text{if } M > 0 \end{cases} \qquad (4.27)$$

where $M$ is an integer random variable Poisson distributed with mean $-\beta \ln(\phi)$ and $U_j$,
$j = 1, 2, \ldots$ are independent identically distributed (iid) random variables with uniform (0,1)
distribution. Additionally, $Y_j$, $j = 1, 2, \ldots$ are iid random variables exponentially
distributed with mean $1/\alpha$.
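The noise scheme above can be sketched with the Python standard library alone. The Poisson sampler uses Knuth's multiplication method, which is a choice made for this sketch and is adequate only for small means; all function names are hypothetical, and the autoregression coefficient is assumed to lie strictly between 0 and 1.

```python
import math
import random

def poisson(rng, mean):
    """Knuth's multiplication method for a Poisson variate (small means only)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def gar1_noise(rng, lam, alpha, beta, phi):
    """One noise value: lam*(1 - phi) plus a Poisson sum of exponential terms."""
    m = poisson(rng, -beta * math.log(phi))
    # expovariate(alpha) has mean 1/alpha; each term is weighted by phi**U
    eta = sum(rng.expovariate(alpha) * phi ** rng.random() for _ in range(m))
    return lam * (1.0 - phi) + eta

def generate_gar1(lam, alpha, beta, phi, n, warmup=50, seed=None):
    """Generate n values of a GAR(1) series, starting from the process mean."""
    rng = random.Random(seed)
    x = lam + beta / alpha
    out = []
    for t in range(n + warmup):
        x = phi * x + gar1_noise(rng, lam, alpha, beta, phi)
        if t >= warmup:
            out.append(x)
    return out

flows = generate_gar1(lam=10.0, alpha=0.5, beta=2.0, phi=0.5, n=20000, seed=7)
```

Because every noise value is at least the location parameter times one minus the coefficient, the generated series never falls below the location parameter, which is what makes the model attractive for skewed flows without a transformation.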
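The stationary GAR(1) process of Eq. (4.24) has four parameters, namely $\lambda$, $\alpha$,
$\beta$, and $\phi$. It may be shown that the relationships between the model parameters and the
population moments of the underlying variable $X_t$ are:

$$\mu = \lambda + \frac{\beta}{\alpha} \qquad (4.28)$$

$$\sigma^2 = \frac{\beta}{\alpha^2} \qquad (4.29)$$

$$\gamma = \frac{2}{\sqrt{\beta}} \qquad (4.30)$$

$$\rho_1 = \phi \qquad (4.31)$$

where $\mu$, $\sigma^2$, $\gamma$, and $\rho_1$ are the mean, variance, skewness coefficient, and the
lag-one autocorrelation coefficient, respectively.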
Based on results given by Kendall (1968), Wallis and O'Connell (1972), and Matalas (1966), and
on extensive simulation experiments, Fernandez and Salas (1990) suggested the following
estimation procedure:

$$\hat{\rho}_1 = \frac{r_1 N + 1}{N - 4} \qquad (4.32)$$

$$\hat{\sigma}^2 = \frac{N - 1}{N - K}\, s^2 \qquad (4.33)$$

$$K = \frac{N (1 - \hat{\rho}_1^2) - 2 \hat{\rho}_1 (1 - \hat{\rho}_1^N)}{N (1 - \hat{\rho}_1)^2} \qquad (4.34)$$

in which $r_1$ is the lag-1 sample autocorrelation coefficient and $s^2$ is the sample variance. In
addition,

$$\hat{\gamma} = \frac{\hat{\gamma}_0}{1 - 3.12\, \hat{\rho}_1^{3.7} N^{-0.49}} \qquad (4.35)$$

where $\hat{\gamma}_0$ is the skewness coefficient suggested by Bobee and Robitaille (1975) as

$$\hat{\gamma}_0 = L\, g_1 \left[ A + B\, \frac{L^2}{N}\, g_1^2 \right] \qquad (4.36)$$

in which $g_1$ is the sample skewness coefficient and the constants $A$, $B$, and $L$ are given by

$$A = 1 + 6.51 N^{-1} + 20.2 N^{-2} \qquad (4.37)$$

$$B = 1.48 N^{-1} + 6.77 N^{-2} \qquad (4.38)$$

and

$$L = \frac{N - 2}{N - 1} \qquad (4.39)$$

respectively. Furthermore, the mean is estimated by the usual sample mean $\bar{x}$. Therefore,
substituting the population statistics $\mu$, $\sigma^2$, $\gamma$, and $\rho_1$ in Eqs. (4.28)
through (4.31) by the corresponding estimates $\bar{x}$, $\hat{\sigma}^2$, $\hat{\gamma}$, and
$\hat{\rho}_1$ as suggested above and solving the equations simultaneously gives the MOM estimates
of the GAR(1) model parameters. For more details, the interested reader is referred to Fernandez
and Salas (1990).
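The final step, solving the moment relationships for the four GAR(1) parameters, can be sketched as follows. The sketch assumes the standard moments of a three-parameter gamma variable with the scale parameter acting as a rate (mean equal to location plus shape over scale, variance equal to shape over scale squared, skewness equal to two over the square root of the shape), and a positive skewness. The bias-corrected skewness is taken as an input rather than computed, and the function names are hypothetical.

```python
import math

def corrected_rho1(r1, n):
    """Bias-corrected lag-1 autocorrelation, Eq. (4.32)."""
    return (r1 * n + 1.0) / (n - 4.0)

def corrected_variance(s2, rho1, n):
    """Bias-corrected variance, Eqs. (4.33) and (4.34)."""
    k = (n * (1.0 - rho1 ** 2) - 2.0 * rho1 * (1.0 - rho1 ** n)) / (n * (1.0 - rho1) ** 2)
    return (n - 1.0) / (n - k) * s2

def gar1_parameters(mean, variance, skew, rho1):
    """Solve the moment relationships for (location, scale, shape, phi)."""
    if skew <= 0.0:
        raise ValueError("this sketch assumes positive skewness")
    beta = 4.0 / skew ** 2                 # from skew = 2 / sqrt(beta)
    alpha = math.sqrt(beta / variance)     # from variance = beta / alpha**2
    lam = mean - beta / alpha              # from mean = lam + beta / alpha
    return lam, alpha, beta, rho1

lam, alpha, beta, phi = gar1_parameters(14.0, 8.0, math.sqrt(2.0), 0.5)
```

In practice the arguments would be the sample mean and the corrected variance, skewness, and lag-1 autocorrelation.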
4.4 Univariate PARMA(p,q) Model
Stationary ARMA models have been widely applied in stochastic hydrology to annual
time series where the mean, variance, and the correlation structure do not depend on time.
Seasonal statistics such as the mean and standard deviation may be reproduced by a stationary
ARMA model by means of standardizing the underlying seasonal series. However, this
procedure does not account for the season-to-season correlations that are generally exhibited by
hydrologic time series such as monthly streamflows. Thus, periodic ARMA (PARMA) models
have been suggested in the literature for this purpose.
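A PARMA(p,q) model may be expressed as (Salas, 1993):

$$\phi_\tau(B)\, Y_{\nu,\tau} = \theta_\tau(B)\, \varepsilon_{\nu,\tau} \qquad (4.40)$$

where $Y_{\nu,\tau}$ represents the streamflow process for year $\nu$ and season $\tau$, which has
mean zero and variance $\sigma_\tau^2(Y)$ and is normally distributed; $\varepsilon_{\nu,\tau}$ is
the uncorrelated noise term, which is normally distributed with mean zero and variance
$\sigma_\tau^2(\varepsilon)$; and $\phi_\tau(B)$ and $\theta_\tau(B)$ are periodic polynomials in
$B$ defined as

$$\phi_\tau(B) = 1 - \phi_{1,\tau} B - \phi_{2,\tau} B^2 - \cdots - \phi_{p,\tau} B^p \qquad (4.41a)$$

$$\theta_\tau(B) = 1 - \theta_{1,\tau} B - \theta_{2,\tau} B^2 - \cdots - \theta_{q,\tau} B^q \qquad (4.41b)$$

where $\phi_{1,\tau}, \ldots, \phi_{p,\tau}$ are the seasonal autoregressive parameters;
$\theta_{1,\tau}, \ldots, \theta_{q,\tau}$ are the seasonal moving average parameters; $B$ is the
backward shift operator, i.e., $B^c Y_{\nu,\tau} = Y_{\nu,\tau-c}$; and $p$ and $q$ define the order
of the PARMA model.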
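The method of moments (MOM) may be used in parameter estimation of low order PARMA(p,q)
models. In SAMS the MOM estimates are available for the PARMA(p,1) model. For example, the
moment estimators for the PARMA(1,1) and PARMA(2,1) models are shown below (Salas et al.,
1982):

- PARMA(1,1) model:

$$Y_{\nu,\tau} = \phi_{1,\tau} Y_{\nu,\tau-1} + \varepsilon_{\nu,\tau} - \theta_{1,\tau} \varepsilon_{\nu,\tau-1} \qquad (4.42)$$

$$\hat{\phi}_{1,\tau} = \frac{m_{2,\tau}}{m_{1,\tau-1}} \qquad (4.43)$$

$$\hat{\theta}_{1,\tau} = \hat{\phi}_{1,\tau} + \frac{s_\tau^2 - \hat{\phi}_{1,\tau} m_{1,\tau}}{\hat{\phi}_{1,\tau} s_{\tau-1}^2 - m_{1,\tau}} - \frac{\hat{\phi}_{1,\tau+1} s_\tau^2 - m_{1,\tau+1}}{(\hat{\phi}_{1,\tau} s_{\tau-1}^2 - m_{1,\tau})\, \hat{\theta}_{1,\tau+1}} \qquad (4.44)$$

$$\hat{\sigma}_\tau^2(\varepsilon) = \frac{\hat{\phi}_{1,\tau+1} s_\tau^2 - m_{1,\tau+1}}{\hat{\theta}_{1,\tau+1}} \qquad (4.45)$$

- PARMA(2,1) model:

$$Y_{\nu,\tau} = \phi_{1,\tau} Y_{\nu,\tau-1} + \phi_{2,\tau} Y_{\nu,\tau-2} + \varepsilon_{\nu,\tau} - \theta_{1,\tau} \varepsilon_{\nu,\tau-1} \qquad (4.46)$$

$$\hat{\phi}_{1,\tau} = \frac{m_{2,\tau} m_{1,\tau-2} - s_{\tau-2}^2 m_{3,\tau}}{m_{1,\tau-1} m_{1,\tau-2} - s_{\tau-2}^2 m_{2,\tau-1}} \qquad (4.47)$$

$$\hat{\phi}_{2,\tau} = \frac{m_{3,\tau} m_{1,\tau-1} - m_{2,\tau} m_{2,\tau-1}}{m_{1,\tau-1} m_{1,\tau-2} - s_{\tau-2}^2 m_{2,\tau-1}} \qquad (4.48)$$

$$\hat{\theta}_{1,\tau} = \hat{\phi}_{1,\tau} + \frac{s_\tau^2 - \hat{\phi}_{1,\tau} m_{1,\tau} - \hat{\phi}_{2,\tau} m_{2,\tau}}{\hat{\phi}_{1,\tau} s_{\tau-1}^2 - m_{1,\tau} + \hat{\phi}_{2,\tau} m_{1,\tau-1}} - \frac{\hat{\phi}_{1,\tau+1} s_\tau^2 - m_{1,\tau+1} + \hat{\phi}_{2,\tau+1} m_{1,\tau}}{(\hat{\phi}_{1,\tau} s_{\tau-1}^2 - m_{1,\tau} + \hat{\phi}_{2,\tau} m_{1,\tau-1})\, \hat{\theta}_{1,\tau+1}} \qquad (4.49)$$

$$\hat{\sigma}_\tau^2(\varepsilon) = \frac{\hat{\phi}_{1,\tau+1} s_\tau^2 + \hat{\phi}_{2,\tau+1} m_{1,\tau} - m_{1,\tau+1}}{\hat{\theta}_{1,\tau+1}} \qquad (4.50)$$

where $s_\tau^2$ is the seasonal variance and $m_{k,\tau}$ is the estimate of the lag-k
season-to-season covariance of $Y_{\nu,\tau}$, which is equal to

$$M_{k,\tau} = E[Y_{\nu,\tau}\, Y_{\nu,\tau-k}] \qquad (4.51)$$

because $E(Y_{\nu,\tau}) = 0$. Note also that $s_\tau^2 = m_{0,\tau}$.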

In a similar manner as for the ARMA(p,q) model, the Least Squares (LS) method can be used to
estimate the model parameters of PARMA(p,q) models. In this case, the parameters $\phi$'s and
$\theta$'s are estimated by minimizing the sum of squares of the residuals defined by

$$F = \sum_{\nu=1}^{N} \sum_{\tau=1}^{\omega} \varepsilon_{\nu,\tau}^2 \qquad (4.52)$$

where $\omega$ is the number of seasons and $N$ is the number of years of data. For the PARMA(p,q)
model, the residuals are defined as

$$\varepsilon_{\nu,\tau} = Y_{\nu,\tau} - \sum_{i=1}^{p} \phi_{i,\tau} Y_{\nu,\tau-i} + \sum_{j=1}^{q} \theta_{j,\tau} \varepsilon_{\nu,\tau-j} \qquad (4.53)$$

Once the $\phi$'s and $\theta$'s are determined, the seasonal noise variance
$\sigma_\tau^2(\varepsilon)$ can be estimated by $(1/N) \sum_{\nu=1}^{N} \varepsilon_{\nu,\tau}^2$.
Alternatively, the method of moments can be applied, but this latter option is still not
available in the current version of SAMS. In using Powell's algorithm for obtaining the least
squares estimates of the $\phi$'s and $\theta$'s, the moment estimates of low order PARMA(p,q)
models such as PARMA(p,1) may be taken as the initial values in the search algorithm.
Generation of data from PARMA(p,q) models is carried out in a similar manner as for ARMA(p,q)
models. The warm-up procedure can be used again to generate seasonal sequences of the
$Y_{\nu,\tau}$ process by assuming that values of $Y_{\nu,\tau}$ prior to season 1 of year 1 are
equal to zero and generating uncorrelated random sequences of $\varepsilon_{\nu,\tau}$ as needed,
in a similar manner as for the ARMA(p,q) model. The warm-up period is taken as 50 years.
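For the simplest case, a PARMA(1,1) model, seasonal generation with a 50-year warm-up might be sketched as below. The function name and argument layout (one parameter per season) are hypothetical, and drawing the single pre-sample noise value at random is an assumption of this sketch.

```python
import random

def generate_parma11(phi, theta, sigma_e, n_years, warmup_years=50, seed=None):
    """Generate n_years of seasonal values from a zero-mean PARMA(1,1) model.

    phi, theta, sigma_e are lists with one entry per season.
    Returns a list of years, each a list of seasonal values.
    """
    rng = random.Random(seed)
    seasons = len(phi)
    y_prev = 0.0                                     # value before season 1 of year 1
    e_prev = sigma_e[-1] * rng.gauss(0.0, 1.0)       # pre-sample noise (assumption)
    rows = []
    for year in range(n_years + warmup_years):
        row = []
        for tau in range(seasons):
            e = sigma_e[tau] * rng.gauss(0.0, 1.0)
            y = phi[tau] * y_prev + e - theta[tau] * e_prev
            row.append(y)
            y_prev, e_prev = y, e                    # carries over across year boundaries
        if year >= warmup_years:
            rows.append(row)
    return rows

sample = generate_parma11([0.6, 0.4, 0.5, 0.7], [0.2, 0.1, 0.3, 0.2],
                          [1.0, 0.8, 1.2, 0.9], n_years=30, seed=3)
```

The last season of one year feeds the first season of the next, which is what lets the model carry season-to-season correlation across years.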
4.5 Multivariate MAR(p) Model
The MAR(p) model can be expressed as

$$\Phi(B)\, Y_t = \varepsilon_t \qquad (4.54)$$

where $\Phi(B)$ is a square matrix of polynomials in $B$ which is defined as

$$\Phi(B) = I - \Phi_1 B - \Phi_2 B^2 - \cdots - \Phi_p B^p \qquad (4.55)$$

in which $I$ is an $(n \times n)$ identity matrix; $\Phi_j$, $j = 1, \ldots, p$, are $n \times n$
parameter matrices; $B^j$ is a scalar difference operator such that $B^j Z_t = Z_{t-j}$; $Y_t$ is an
$(n \times 1)$ column vector with elements $Y_t^{(i)}$, $i = 1, \ldots, n$; and $\varepsilon_t$ is an
$(n \times 1)$ vector of normally distributed noise terms with mean 0 and variance-covariance
matrix $G$. The noises $\varepsilon_t$ are independent in time but are dependent in space, and $n$
is the number of sites. Such spatially correlated noise can be modeled by

$$\varepsilon_t = B\, \xi_t \qquad (4.56)$$

where $\xi_t$ is an $(n \times 1)$ vector of standardized normal variables independent in both time
and space and $B$ is an $(n \times n)$ parameter matrix.
It can be shown that the moment equations of the MAR(p) model are given by

$$M_0 = \sum_{i=1}^{p} \Phi_i M_i^T + G \qquad (4.57)$$

$$M_k = \sum_{i=1}^{p} \Phi_i M_{k-i}, \quad k \ge 1 \qquad (4.58)$$

where $M_k$ is the lag-k cross covariance matrix of $Y_t$ defined as:

$$M_k = E[Y_t\, Y_{t-k}^T] \qquad (4.59)$$

in which the superscript $T$ indicates a matrix transpose and $E(Y_t) = 0$. In finding the MOM
estimates, Eq. (4.58) for $k = 1, \ldots, p$ is solved simultaneously for the parameter matrices
$\Phi_j$, $j = 1, \ldots, p$, by substituting in Eq. (4.58) the population covariance matrices
$M_k$, $k = 1, 2, \ldots, p$, by the sample covariance matrices $\hat{M}_k$, $k = 1, 2, \ldots, p$.
Then Eq. (4.57) is used to estimate the variance-covariance matrix of the residuals $G$. For
example, the moment estimators of the MAR(1) model are:

$$\hat{\Phi}_1 = \hat{M}_1 \hat{M}_0^{-1} \qquad (4.60)$$

$$\hat{G} = \hat{M}_0 - \hat{M}_1 \hat{M}_0^{-1} \hat{M}_1^T \qquad (4.61)$$

in which the superscript $-1$ indicates a matrix inverse.
After estimating $\Phi_j$, $j = 1, \ldots, p$, and $G$ as indicated above, $B$ of Eq. (4.56) can be
determined from

$$\hat{G} = B B^T \qquad (4.62)$$

The above matrix equation can have more than one solution. However, a unique solution can be
obtained by assuming that $B$ is a lower triangular matrix. This solution, however, requires
that $G$ be a positive definite matrix.
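The lower triangular solution is the Cholesky factor of the noise covariance matrix. A minimal pure-Python sketch follows; the function names are hypothetical, and a production implementation would normally rely on a linear algebra library instead.

```python
import math

def cholesky_lower(g):
    """Return the lower triangular B with B * B^T = G; G must be positive definite."""
    n = len(g)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = g[i][j] - sum(b[i][k] * b[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("G is not positive definite")
                b[i][j] = math.sqrt(s)
            else:
                b[i][j] = s / b[j][j]
    return b

def correlated_noise(b, xi):
    """Spatially correlated noise vector: B times a vector of standard normal values."""
    return [sum(b[i][k] * xi[k] for k in range(len(xi))) for i in range(len(b))]

B = cholesky_lower([[4.0, 2.0], [2.0, 5.0]])
eps = correlated_noise(B, [1.0, 1.0])
```

If the estimated covariance matrix is not positive definite, the factorization fails, which is the situation the positive definiteness requirement guards against.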

4.6 Multivariate CARMA(p,q) Model


When modeling multivariate hydrologic processes based on the full multivariate ARMA model,
problems often arise in parameter estimation. The CARMA (Contemporaneous Autoregressive
Moving Average) model was suggested as a simpler alternative to the full multivariate ARMA
model (Salas et al., 1980). In the CARMA model, both autoregressive and moving average
parameter matrices are assumed to be diagonal, such that a multivariate model can be decoupled
into component univariate models. Thus, the model parameters $\Phi$ and $\Theta$ do not need to be
estimated jointly; instead, they can be estimated independently for each single site by regular
univariate ARMA model estimation procedures. This allows the best univariate ARMA model to be
identified for each single station.
The CARMA(p,q) model can be expressed as

$$Z_t = \sum_{j=1}^{p} \Phi_j Z_{t-j} + \varepsilon_t - \sum_{j=1}^{q} \Theta_j \varepsilon_{t-j} \qquad (4.63)$$

where $Z_t$ is a multi-dimensional vector of the normalized and mean corrected observations at
time $t$, $\varepsilon_t$ is the multi-dimensional vector of noises (residuals) of the processes at
time $t$, $\Phi_j$ are the diagonal autoregressive parameter matrices, and $\Theta_j$ are the
diagonal moving average parameter matrices. Equation (4.63) can be decoupled into the model
components as

$$Z_t^{(i)} = \sum_{j=1}^{p} \phi_j^{(i)} Z_{t-j}^{(i)} + \varepsilon_t^{(i)} - \sum_{j=1}^{q} \theta_j^{(i)} \varepsilon_{t-j}^{(i)} \qquad (4.64)$$

Thus, Eq. (4.64) is the expression of a univariate ARMA(p,q) model for site $i$, such that the
parameters $\phi_j^{(i)}$ and $\theta_j^{(i)}$ can be estimated by the regular ARMA model
estimation methods.
The vector of residual (noise) terms $\varepsilon_t = [\varepsilon_t^{(1)}, \varepsilon_t^{(2)},
\ldots, \varepsilon_t^{(n)}]^T$ can be expressed as

$$\varepsilon_t = B\, \xi_t \qquad (4.65)$$

where the random vector $\xi_t$ is uncorrelated in time and space, i.e., $E(\xi_t \xi_t^T) = I$. It
may be shown that the variance-covariance matrix $G$ of the correlated series $\varepsilon_t$ is
equal to

$$G = E(\varepsilon_t \varepsilon_t^T) = B B^T \qquad (4.66)$$

Thus, a CARMA model implies that the cross-correlations between sites are carried through the
residuals.
Two methods are used for estimating the $G$ matrix:
1. The MLE estimate of $G$ is obtained by

$$\hat{G} = \frac{1}{N} \sum_{t} \hat{\varepsilon}_t \hat{\varepsilon}_t^T \qquad (4.67)$$

where $N$ is the number of time steps and $\hat{\varepsilon}_t$ are the residuals calculated from
each single site model by using the estimated parameters $\hat{\Phi}_j$ and $\hat{\Theta}_j$.
2. The moment (MOM) estimate of $G$ is computed from the moment estimator as a function of the
given parameters and the cross-covariances of the data, i.e.,

$$\hat{G} = f(\Phi_m, \Theta_r, M_k) \qquad (4.68)$$

where $M_k$ are the lag-k variance-covariance matrices of the processes $Z$, $m = 1, \ldots, p$;
$r = 1, \ldots, q$; and $k = 0, \ldots, \max(p, q) - 1$.

A moment estimator of the $G$ matrix for a general CARMA model is obtained as follows. By
multiplying both sides of Eq. (4.63) by $Z_t^T$ (the transpose of $Z_t$) one may obtain

$$Z_t Z_t^T = \Phi_1 Z_{t-1} Z_t^T + \cdots + \Phi_p Z_{t-p} Z_t^T + \varepsilon_t Z_t^T - \Theta_1 \varepsilon_{t-1} Z_t^T - \cdots - \Theta_q \varepsilon_{t-q} Z_t^T \qquad (4.69)$$

Because $E(Z_t Z_{t-k}^T) = M_k$ and $E(\varepsilon_t \varepsilon_t^T) = G$, the lag-0, lag-1, ...,
lag-p moment equations $M_0, M_1, \ldots, M_p$ can be obtained by taking expectations on both
sides of Eq. (4.69). Then, the $(i, j)$ elements of the moment matrices, $M_0^{ij}, M_1^{ij},
M_2^{ij}, \ldots, M_p^{ij}$, can be expressed as functions of $(\phi_1^i, \phi_1^j)$, $(\phi_2^i,
\phi_2^j)$, ..., $(\phi_p^i, \phi_p^j)$; $(\theta_1^i, \theta_1^j)$, $(\theta_2^i, \theta_2^j)$, ...,
$(\theta_q^i, \theta_q^j)$; and $G^{ij}$, which are the elements of the matrices $\Phi_1, \Phi_2,
\ldots, \Phi_p$; $\Theta_1, \Theta_2, \ldots, \Theta_q$; and $G$, respectively. Analogously, another
$p$ sets of equations for the $(j, i)$ elements $M_0^{ji}, M_1^{ji}, M_2^{ji}, \ldots, M_p^{ji}$ can
be obtained by switching the site indices, because of the symmetric structure of the CARMA model
moment matrices. Since $G^{ij} = G^{ji}$, and $M_0^{ij} = M_0^{ji}$ are estimated from the observed
processes, a system of $2p+1$ linear equations with $2p+1$ unknowns, namely $G^{ij}, M_1^{ij},
M_2^{ij}, \ldots, M_p^{ij}$, etc., is formed. By solving each system of linear equations indexed
$(i, j)$, the $G$ matrix estimate can be obtained.
To obtain G ij let
ij

k =1

l =1

K0 = 1 ki (l l )k l
and

k m

(4.70)
j

Kmij = mi + ki (l l )k ml
k = m+1 l =1

(4.71)

where, m = 1, ..., p and 0j = 1 . For instance, for a CARMA(3, q) model M 0ij , M1ij , M 2ij , M 3ij
can be expressed as
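$$M_0^{ij} = \phi_1^i M_1^{ji} + \phi_2^i M_2^{ji} + \phi_3^i M_3^{ji} + K_0^{ij} G^{ij} \qquad (4.72)$$

$$M_1^{ij} = \phi_1^i M_0^{ij} + \phi_2^i M_1^{ji} + \phi_3^i M_2^{ji} - K_1^{ij} G^{ij} \qquad (4.73)$$

$$M_2^{ij} = \phi_1^i M_1^{ij} + \phi_2^i M_0^{ij} + \phi_3^i M_1^{ji} - K_2^{ij} G^{ij} \qquad (4.74)$$

$$M_3^{ij} = \phi_1^i M_2^{ij} + \phi_2^i M_1^{ij} + \phi_3^i M_0^{ij} - K_3^{ij} G^{ij} \qquad (4.75)$$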

Thus, based on Eqs. (4.70) and (4.71) a system of $2p+1$ linear moment equations analogous to
Eqs. (4.72) to (4.75) can be written as

$$A X = B \qquad (4.76)$$

where $X = [G^{ij}, M_1^{ij}, M_1^{ji}, M_2^{ij}, M_2^{ji}, \ldots, M_p^{ij}, M_p^{ji}]^T$ is a
$(2p+1) \times 1$ vector of unknown variables, $A$ is a $(2p+1) \times (2p+1)$ square matrix of
coefficients, and $B$ is a $(2p+1) \times 1$ vector of known constants, which can be written as


K
0ij
K1
ij
K2

A = K ijp
ji
K1
K ji
2

ji
K p

1i

2i

0i

2i

i 1

3i

1i( p 1)

1i

ip 1

3i

ip +1

0i

ip 2

4i

ip + 2

j
p +1

j
p 1

j
p 2

j
p + 2

M ij

i 0 ij
i
1+( p 1)
1 M 0

i M ij
2i +( p 1)
2 0

i ij
ip +( p 1) , B = p M 0 ,

j
j M ij
1+( p 1)
1j 0ij
j

2 M 0
2 +( p 1)

j
pj M 0ij
p +( p 1)

ip

2i ( p 1)

ip ( p 1)

2 ( p 1)

j
p ( p 1)

1( p 1)
j

where, 0i = 1 and ki = 0 if k < 0 or k > p. Thus, X can directly solved by X = A-1B.


4.7 Multivariate MPAR(p) Model
The MPAR(p) model can be expressed as

$$\Phi_\tau(B)\, Y_{\nu,\tau} = \varepsilon_{\nu,\tau} \qquad (4.77)$$

where $\Phi_\tau(B)$ is a square diagonal matrix of periodic polynomials in $B$ which is defined as

$$\Phi_\tau(B) = I - \Phi_{1,\tau} B - \Phi_{2,\tau} B^2 - \cdots - \Phi_{p,\tau} B^p \qquad (4.78)$$

in which $I$ is an $(n \times n)$ identity matrix; $\Phi_{j,\tau}$, $j = 1, \ldots, p$, are
$n \times n$ diagonal parameter matrices for season $\tau$; $B^j$ is a scalar difference operator
such that $B^j Z_{\nu,\tau} = Z_{\nu,\tau-j}$; $Y_{\nu,\tau}$ is an $(n \times 1)$ column vector with
elements $Y_{\nu,\tau}^{(i)}$, $i = 1, \ldots, n$; and $\varepsilon_{\nu,\tau}$ is an $(n \times 1)$
vector of normally distributed noise terms with mean 0 and variance-covariance matrix $G_\tau$.
The noises $\varepsilon_{\nu,\tau}$ are independent in time but are dependent in space, and $n$ is
the number of sites. Such spatially correlated noise can be modeled by

$$\varepsilon_{\nu,\tau} = B_\tau\, \xi_{\nu,\tau} \qquad (4.79)$$

where $\xi_{\nu,\tau}$ is an $(n \times 1)$ vector of standardized normal variables independent in
both time and space and $B_\tau$ is an $(n \times n)$ parameter matrix.


The parameters of the MPAR(p) model are estimated by the MOM by substituting the sample
moments into the moment equations in a similar manner as for the MAR(p) model. The moment
equations of the MPAR(p) model may be shown to be:

$$M_{0,\tau} = \sum_{i=1}^{p} \Phi_{i,\tau} M_{i,\tau}^T + G_\tau \qquad (4.80)$$

$$M_{k,\tau} = \sum_{i=1}^{p} \Phi_{i,\tau} M_{k-i,\tau-i} \quad \text{for } k - i \ge 0 \text{ and } k \ge 1 \qquad (4.81a)$$

$$M_{k,\tau} = \sum_{i=1}^{p} \Phi_{i,\tau} M_{i-k,\tau-k}^T \quad \text{for } k - i < 0 \text{ and } k \ge 1 \qquad (4.81b)$$

where $M_{k,\tau}$ is the seasonal lag-k cross covariance matrix of $Y_{\nu,\tau}$ defined as:

$$M_{k,\tau} = E[Y_{\nu,\tau}\, Y_{\nu,\tau-k}^T] \qquad (4.82)$$

in which $E(Y_{\nu,\tau}) = 0$. In a similar manner as for the MAR(p) model, the MOM estimates can
be found by solving Eq. (4.81) for $k = 1, 2, \ldots, p$ simultaneously for the $\Phi$'s by
substituting the population covariance matrices $M_{k,\tau}$, $k = 1, \ldots, p$, by the
corresponding sample covariance matrices $\hat{M}_{k,\tau}$, $k = 1, \ldots, p$. Then Eq. (4.80) is
used to estimate the variance-covariance matrix of the residuals $G_\tau$.
After estimating $\Phi_{j,\tau}$, $j = 1, \ldots, p$, and $G_\tau$ as indicated above, $B_\tau$ can
be estimated from

$$G_\tau = B_\tau B_\tau^T \qquad (4.83)$$

As for the MAR(p) model, a solution for the above equation can be obtained by assuming that
$B_\tau$ is a lower triangular matrix. Note that $G_\tau$ must be positive definite.

4.8 Disaggregation Models

4.8.1 General
Disaggregation models of hydrologic time series are efficient techniques for cases where the
preservation of statistical characteristics at both annual and seasonal scales is essential
for the project under study. Valencia and Schaake (1973), and the later extension by Mejia and
Rousselle (1976), introduced the basic disaggregation model for temporal disaggregation of
annual flows into seasonal flows. However, the same model can also be used for spatial
disaggregation. For example, the sum of flows of several stations can be disaggregated into
flows at each of these stations, or the total flows at key stations can be disaggregated into
flows at substations which usually, but not necessarily, sum to form the flows of the key
stations. The Valencia and Schaake and the Mejia and Rousselle models require many parameters
to be estimated, especially for temporal disaggregation. For example, the Valencia and Schaake
model requires 156 parameters for the case of disaggregating annual flows into 12 seasons for
one station. The Mejia and Rousselle model requires 168 parameters. If the same disaggregation
is carried out for 3 sites, the two models require 1,404 and 1,512 parameters, respectively.
Lane (1979) introduced the condensed model for temporal disaggregation, which drastically
reduces the number of parameters required. For example, for the cases mentioned above, Lane's
model requires 36 parameters for the one site case and 324 parameters for the 3 site case.
In SAMS, Lane's model is used for temporal (seasonal) disaggregation. The Valencia and
Schaake and the Mejia and Rousselle models are used for spatial disaggregation and for
univariate seasonal disaggregation, where the annual flows for only one site are disaggregated
into seasonal flows for the same site.
In using disaggregation models for data generation, adjustments may be needed to ensure
additivity constraints. In spatial disaggregation, the generated flows at substations (or at
subsequent stations) should add to the total, or to a fraction (depending on the particular
case at hand), of the corresponding generated flow at a key station (or subkey station). In
temporal disaggregation, the generated seasonal values should add exactly to the generated
annual value. Three methods of adjustment based on Lane and Frevert (1990) are provided in
SAMS for these purposes. These methods are described in detail in the following sections.
4.8.2 Model Formulations
Valencia and Schaake Model
The model can be expressed as (Valencia and Schaake, 1973)

$$Y_t = A X_t + B\, \varepsilon_t \qquad (4.84)$$

in which $Y_t$ is an $(f \times 1)$ column vector with elements $Y_t^{(i)}$, $i = 1, \ldots, f$;
$X_t$ is an $(h \times 1)$ column vector with elements $X_t^{(i)}$, $i = 1, \ldots, h$, where $h$
and $f$ are appropriate matrix dimensions. For example, in the key station to substation
disaggregation $f$ and $h$ represent the number of key and substations, respectively.
$\varepsilon_t$ is an $(f \times 1)$ vector of normally distributed noise terms with mean 0 and the
identity matrix as its variance-covariance matrix. The noises $\varepsilon_t$ are independent in
both time and space. $A$ and $B$ are $(f \times h)$ and $(h \times h)$ parameter matrices. The
number of key stations $f$ in the above equations can be more than one, so the above model can be
used to disaggregate annual flows at several key stations to their corresponding flows at
substations in a multivariate form which would be able to preserve the inter (cross)
correlations among the stations.
The model parameter matrices $A$ and $B$ can be estimated by using the MOM as (Valencia and
Schaake, 1973):

$$A = M_0(YX)\, M_0^{-1}(X) \qquad (4.85)$$

$$B B^T = M_0(Y) - M_0(YX)\, M_0^{-1}(X)\, M_0(XY) \qquad (4.86)$$

where

$$M_k(X) = E[X_t X_{t-k}^T]$$

$$M_k(Y) = E[Y_t Y_{t-k}^T]$$

$$M_k(YX) = E[Y_t X_{t-k}^T]$$

$$M_k(XY) = E[X_t Y_{t-k}^T]$$

Equations (4.85) and (4.86) can be used to obtain estimates of $A$ and $B$ by substituting the
population moments $M_0(X)$, $M_0(Y)$, $M_0(XY)$, and $M_0(YX)$ by their corresponding sample
estimates.
Mejia and Rousselle Model
This model can be expressed as

$$Y_t = A X_t + B\, \varepsilon_t + C\, Y_{t-1} \qquad (4.87)$$

in which $Y_t$, $X_t$, $\varepsilon_t$, $A$, and $B$ are defined in the same way as for the Valencia
and Schaake model and $C$ is an additional $(h \times h)$ parameter matrix. As for the Valencia and
Schaake model, the number of key stations $f$ in the above equations can be more than one, so the
above model can be used to disaggregate annual flows at several key stations to their
corresponding flows at substations.
The model parameter matrices $A$, $B$, and $C$ can be estimated by using the MOM as:

$$A = \{[M_0(YX) - M_1(Y)\, M_0^{-1}(Y)\, M_1^T(XY)]\, [M_0(X) - M_1(XY)\, M_0^{-1}(Y)\, M_1^T(XY)]^{-1}\} \qquad (4.88)$$

$$C = [M_1(Y) - A\, M_1(XY)]\, M_0^{-1}(Y) \qquad (4.89)$$

$$B B^T = M_0(Y) - A\, M_0(XY) - C\, M_1^T(Y) \qquad (4.90)$$

Equations (4.88) through (4.90) can be used to obtain estimates of $A$, $B$, and $C$ by
substituting the moments $M_0(X)$, $M_0(Y)$, $M_0(XY)$, $M_0(YX)$, $M_1(X)$, $M_1(Y)$, $M_1(XY)$,
and $M_1(YX)$ by their corresponding sample estimates. Lane (1981) showed that some problems
exist if one uses the above equations to estimate the parameters. Specifically, the problem is
in using $M_1(XY)$. He showed that the generated moments are affected and some key moments are
not preserved. As a result, he suggested that, instead of using a sample estimate of $M_1(XY)$,
one should use the model (population) $M_1(XY)$ that would result from the model structure (for
further details, the reader is referred to Lane and Frevert, 1991). In the final analysis, the
suggested equation is

$$M_1^*(XY) = M_1(X)\, M_0^{-1}(X)\, M_0(XY) \qquad (4.91)$$

The value of $M_1^*(XY)$ calculated in Eq. (4.91) should be used in place of $M_1(XY)$ in Eqs.
(4.88) through (4.90) for estimating the model parameters. Lane suggested also that $M_1(Y)$
should be calculated as:

$$M_1^*(Y) = M_1(Y) + M_0(YX)\, M_0^{-1}(X)\, [M_1^*(XY) - M_1(XY)] \qquad (4.92)$$

The reader is referred to Lane and Frevert (1991) for more details about these adjustments.
Lane's Condensed Model
The model can be expressed as

$$Y_{\nu,\tau} = A_\tau X_\nu + B_\tau\, \varepsilon_{\nu,\tau} + C_\tau\, Y_{\nu,\tau-1} \qquad (4.93)$$

in which $Y_{\nu,\tau}$ is an $(n \times 1)$ column vector with elements $Y_{\nu,\tau}^{(i)}$,
$i = 1, \ldots, n$; $X_\nu$ is an $(n \times 1)$ column vector with elements $X_\nu^{(i)}$,
$i = 1, \ldots, n$; and $\varepsilon_{\nu,\tau}$ is an $(n \times 1)$ vector of normally distributed
noise terms with mean 0 and the identity matrix as its variance-covariance matrix. The noises
$\varepsilon_{\nu,\tau}$ are independent in time and space, and $n$ is the number of sites.
The model parameter matrices $A_\tau$, $B_\tau$, and $C_\tau$ can be estimated by using the MOM
as (Lane and Frevert, 1991):

$$A_\tau = \{[M_{0,\tau}(YX) - M_{1,\tau}(Y)\, M_{0,\tau-1}^{-1}(Y)\, M_{1,\tau}^T(XY)]\, [M_0(X) - M_{1,\tau}(XY)\, M_{0,\tau-1}^{-1}(Y)\, M_{1,\tau}^T(XY)]^{-1}\} \qquad (4.94)$$

$$C_\tau = [M_{1,\tau}(Y) - A_\tau M_{1,\tau}(XY)]\, M_{0,\tau-1}^{-1}(Y) \qquad (4.95)$$

$$B_\tau B_\tau^T = M_{0,\tau}(Y) - A_\tau M_{0,\tau}(XY) - C_\tau M_{1,\tau}^T(Y) \qquad (4.96)$$

where

$$M_k(X) = E[X_\nu X_{\nu-k}^T]$$

$$M_{k,\tau}(Y) = E[Y_{\nu,\tau} Y_{\nu,\tau-k}^T]$$

$$M_{k,\tau}(YX) = E[Y_{\nu,\tau} X_{\nu-k}^T]$$

$$M_{k,\tau}(XY) = E[X_\nu Y_{\nu,\tau-k}^T]$$

The MOM parameter matrices can be estimated by substituting the population moments by their
corresponding sample estimates and solving Eqs. (4.94) through (4.96) for the parameters.
In a similar manner as for the Mejia and Rousselle model, Lane (1981) suggested that the
following moments should be adjusted as follows:

$$M_{1,\tau}^*(XY) = M_1(X)\, M_0^{-1}(X)\, M_{0,\tau-1}(XY) \qquad (4.97)$$

$$M_{1,\tau}^*(Y) = M_{1,\tau}(Y) + M_{0,\tau}(YX)\, M_0^{-1}(X)\, [M_{1,\tau}^*(XY) - M_{1,\tau}(XY)] \qquad (4.98)$$

The above adjustments are needed only for the first season.
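Adjustment for spatial disaggregation
Three approaches are available for the adjustment of spatially disaggregated data. They are:

approach 1:

$$\hat{q}_t^{*(i)} = \hat{q}_t^{(i)} + \left[ \bar{r}\, \hat{q}_t - \sum_{j=1}^{n} \hat{q}_t^{(j)} \right] \frac{|\hat{q}_t^{(i)} - \hat{\mu}^{(i)}|}{\sum_{j=1}^{n} |\hat{q}_t^{(j)} - \hat{\mu}^{(j)}|} \qquad (4.99)$$

approach 2:

$$\hat{q}_t^{*(i)} = \frac{\hat{q}_t^{(i)}\, (\bar{r}\, \hat{q}_t)}{\sum_{j=1}^{n} \hat{q}_t^{(j)}} \qquad (4.100)$$

approach 3:

$$\hat{q}_t^{*(i)} = \hat{q}_t^{(i)} + \left( \bar{r}\, \hat{q}_t - \sum_{j=1}^{n} \hat{q}_t^{(j)} \right) \frac{\hat{\sigma}^{(i)}}{\sum_{j=1}^{n} \hat{\sigma}^{(j)}} \qquad (4.101)$$

where

$$\bar{r} = (1/N) \sum_{t=1}^{N} r_t \qquad (4.102)$$

$$r_t = \frac{\sum_{j=1}^{n} q_t^{(j)}}{q_t} \qquad (4.103)$$

and $N$ is the number of observations, $n$ is the number of substations (or subsequent stations),
$q_t$ is the t-th observed value at a key station (or substation), $q_t^{(j)}$ is the t-th observed
value at substation (or subsequent station) $j$, $\hat{q}_t$ is the generated value at the key
station (or substation), $\hat{q}_t^{(i)}$ is the generated value at substation $i$ (or subsequent
station), $\hat{q}_t^{*(i)}$ is the adjusted generated value at substation $i$ (or subsequent
station), $\hat{\mu}^{(i)}$ is the estimated mean of $\hat{q}_t^{(i)}$ for site $i$, and
$\hat{\sigma}^{(i)}$ is the estimated standard deviation of $\hat{q}_t^{(i)}$ for site $i$.

Adjustment for temporal disaggregation
Three approaches are also available for the adjustment of temporally disaggregated data. They
are:

approach 1:

$$\hat{q}_{\nu,\tau}^* = \hat{q}_{\nu,\tau} + \left( \hat{Q}_\nu - \sum_{t=1}^{\omega} \hat{q}_{\nu,t} \right) \frac{|\hat{q}_{\nu,\tau} - \hat{\mu}_\tau|}{\sum_{t=1}^{\omega} |\hat{q}_{\nu,t} - \hat{\mu}_t|} \qquad (4.104)$$

approach 2:

$$\hat{q}_{\nu,\tau}^* = \frac{\hat{q}_{\nu,\tau}\, \hat{Q}_\nu}{\sum_{t=1}^{\omega} \hat{q}_{\nu,t}} \qquad (4.105)$$

and approach 3:

$$\hat{q}_{\nu,\tau}^* = \hat{q}_{\nu,\tau} + \left( \hat{Q}_\nu - \sum_{t=1}^{\omega} \hat{q}_{\nu,t} \right) \frac{\hat{\sigma}_\tau^2}{\sum_{t=1}^{\omega} \hat{\sigma}_t^2} \qquad (4.106)$$

where $\omega$ is the number of seasons, $\hat{Q}_\nu$ is the generated annual value,
$\hat{q}_{\nu,\tau}$ is the generated seasonal value, $\hat{q}_{\nu,\tau}^*$ is the adjusted
generated seasonal value, $\hat{\mu}_\tau$ is the estimated mean of $\hat{q}_{\nu,\tau}$ for season
$\tau$, and $\hat{\sigma}_\tau$ is the estimated standard deviation of $\hat{q}_{\nu,\tau}$ for
season $\tau$.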
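Two of the temporal adjustments can be sketched as follows: the proportional adjustment (approach 2) and the adjustment that distributes the discrepancy in proportion to each season's absolute deviation from its mean (approach 1). The function names are hypothetical and the numbers are made up for illustration.

```python
def adjust_proportional(seasonal, annual):
    """Approach 2: scale the seasonal values so that they sum to the annual value."""
    total = sum(seasonal)
    return [q * annual / total for q in seasonal]

def adjust_by_deviation(seasonal, annual, means):
    """Approach 1: spread the discrepancy in proportion to |q - seasonal mean|."""
    deviations = [abs(q - m) for q, m in zip(seasonal, means)]
    gap = annual - sum(seasonal)
    total_dev = sum(deviations)
    return [q + gap * d / total_dev for q, d in zip(seasonal, deviations)]

generated = [10.0, 20.0, 30.0]        # generated seasonal values (sum = 60)
annual_value = 66.0                   # generated annual value
adj2 = adjust_proportional(generated, annual_value)
adj1 = adjust_by_deviation(generated, annual_value, means=[12.0, 20.0, 26.0])
```

Both results sum to the annual value. The proportional approach changes every season by the same percentage, while the deviation-based approach leaves a season that sits exactly at its mean unchanged. The spatial adjustments have the same structure, with the scaled key-station flow playing the role of the annual value.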

4.9 Model Testing


The fitted model must be tested to determine whether the model complies with the model
assumptions and whether the model is capable of reproducing the historical statistical properties
of the data at hand. Essentially the key assumptions of the models refer to the underlying
characteristics of the residuals such as normality and independence.
Testing the properties of the residuals
Testing the properties of the residuals generally involves testing the normality and the
independence of the residuals. First, the residuals are obtained from the specified models after
the parameters are estimated. For instance, in the case of the univariate PARMA model of Eq.
(4.40), the residuals are the numbers $\varepsilon_{1,1}, \varepsilon_{1,2}, \varepsilon_{1,3},
\ldots$ that are derived from the model. On the other hand, in the case of the MPAR model of Eq.
(4.77), the residuals are the sets of numbers $\varepsilon_{1,1}^{(i)}, \varepsilon_{1,2}^{(i)},
\varepsilon_{1,3}^{(i)}, \ldots$, $i = 1, \ldots, n$, each set $i$ corresponding to one site or
station. Testing the residual properties can be done in several ways depending on how the
residuals are arranged.
Several tests are available for testing the normality of the residuals. Common normality
tests include the skewness test, the chi-square goodness of fit test, the Kolmogorov-Smirnov test,
and the product moment correlation test (Salas et al, 1999). For periodic-stochastic models, the
normality tests should be applied on a month-by-month basis. Often though the tests are applied
considering the entire sample of residuals. In the case of multivariate models, the normality tests
should be applied for each set of data (site by site). In SAMS, the skewness test of normality
is applied on a month-by-month basis and on a site by site basis.
Likewise, several tests are available for testing the independence of the residuals. The
Portmanteau lack of fit test and the Anderson test (Salas et al, 1980) are commonly used for
testing independence in time when the residuals are derived from stationary stochastic models.
On the other hand, the cross-correlation t-test may be used for testing independence in time when
the residuals are derived from periodic-stochastic models such as those described in the previous
sections. The t-test is applied for the correlation between the residuals of two successive months,
i.e. twelve tests for monthly data. However, the Portmanteau or Anderson tests may also be
applied for testing the independence of residuals derived from periodic-stochastic models, based
on the autocorrelation of the entire residual series. In SAMS, the Portmanteau test of
independence is applied. For testing the independence between residuals of two different sites
(independence in space), the usual test is based on the cross-correlation t-test. This test
should also be applied for the cross-correlation between residuals of two sites on a
season-by-season basis (twelve tests for monthly data), although the test can be applied based on
the cross-correlation of the entire residual series for each pair of sites.
Testing ARMA model parsimony
For a fitted ARMA(p,q) model, SAMS tests its model parsimony using the Akaike Information
Criterion (AIC) (Salas et al., 1980). For comparing among competing ARMA(p,q) models, the
following equation is used:

$$\mathrm{AIC}(p,q) = N \ln(\hat{\sigma}^2(\varepsilon)) + 2(p + q) \qquad (4.107)$$

where $N$ is the sample size and $\hat{\sigma}^2(\varepsilon)$ is the maximum likelihood estimate of
the residual variance. Under this criterion the model which gives the minimum AIC is the one to
be selected. SAMS computes AICs for the fitted model and for the models of both one step higher
order and one step lower order for comparison. For instance, for a fitted ARMA(1,1) model, SAMS
will compute the AIC values for ARMA(1,1), ARMA(2,1), ARMA(1,2), ARMA(1,0), and ARMA(0,1)
models for comparison. In addition, to test the assumption of white noise, the AIC of the
ARMA(0,0) model is also computed.
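The comparison can be sketched as follows. The function names are hypothetical, and the residual variances in the example are made-up numbers standing in for the values that would come from fitting each candidate model.

```python
import math

def aic(n, residual_variance, p, q):
    """AIC of an ARMA(p,q) model, Eq. (4.107)."""
    return n * math.log(residual_variance) + 2 * (p + q)

def comparison_orders(p, q):
    """Fitted order, its one-step-higher and one-step-lower neighbors, and (0,0)."""
    candidates = [(p, q), (p + 1, q), (p, q + 1), (p - 1, q), (p, q - 1), (0, 0)]
    orders = []
    for order in candidates:
        if min(order) >= 0 and order not in orders:
            orders.append(order)
    return orders

# Hypothetical residual variances for each candidate model, N = 50 years
variances = {(1, 1): 0.80, (2, 1): 0.79, (1, 2): 0.795,
             (0, 1): 0.95, (1, 0): 0.85, (0, 0): 1.20}
scores = {order: aic(50, variances[order], *order) for order in comparison_orders(1, 1)}
best = min(scores, key=scores.get)
```

In this made-up case the small reduction in residual variance offered by the higher order models does not offset the penalty for the extra parameter, so the ARMA(1,1) model is retained.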
Testing the properties of the process
Testing the properties of the process generally means comparing the statistical properties
(statistics) of the process being modeled, for instance, the process $Y_{\nu,\tau}$ in Eq. (4.40),
with those of the historical sample. In general, one would like the model to be capable of
reproducing the necessary statistics that affect the variability of the data. Furthermore, the
model should be capable of reproducing certain statistics that are related to the intended use
of the model.
If $Y_{\nu,\tau}$ has been previously transformed from $X_{\nu,\tau}$, the original non-normal
process, then one must test, in addition to the statistical properties of $Y$, some of the
properties of $X$. Generally, the properties of $Y$ include the seasonal mean, seasonal variance,
seasonal skewness, and season-to-season correlations and cross-correlations (in the case of
multisite processes), and the properties of $X$ include the seasonal mean, variance, skewness,
correlations, and cross-correlations (for multisite systems). Furthermore, additional properties
of $X_{\nu,\tau}$ such as those related to low flows, high flows, droughts, and storage may be
included depending on the particular problem at hand.
In addition, it is often the case that not only the properties of the seasonal processes
$Y_{\nu,\tau}$ and $X_{\nu,\tau}$ must be tested but also the properties of the corresponding annual
processes $AY$ and $AX$. For example, this case arises when designing the storage capacity of
reservoir systems or when testing the performance of reservoir systems of given capacities, in
which one or more reservoirs are for over-year regulation. In such cases the annual properties
considered are usually the mean, variance, skewness, autocorrelations, cross-correlations (for
multisite systems), and more complex properties such as those related to droughts and storage.
The comparison of the statistical properties of the process being modeled versus the
historical properties may be done in two ways. Depending on the type of model, certain
properties of the Y process such as the mean(s), variance(s), and covariance(s) can be derived
from the model in closed form. If the method of moments is used for parameter estimation, the
mean(s), variance(s), and some of the covariances should be reproduced exactly; except for the
mean, however, that may not be the case for other estimation methods. Finding properties
of the Y process in closed form beyond the first two moments, for instance, drought-related
properties, is complex, and such properties are generally not available for most models.
Likewise, except for simple models, finding properties in closed form for the corresponding
annual process AY is not simple either. In such cases, the required statistical properties are
derived by data generation.
Data generation studies for comparing statistical properties of the underlying process Y
(and other derived processes such as AY, X and AX) are generally undertaken based on samples
of the same length as the historical record and on a number of samples large enough to give
sufficient precision for estimating the statistical properties of concern. While there
are some statistical rules that can be derived to determine the number of samples required, a
practical rule is to generate, say, 100 samples, which can give an idea of the distribution of the
statistic of interest, say $\theta$. In any case, the statistics $\theta(i)$, i = 1,...,100 are estimated from
the 100 samples and the mean $\bar{\theta}$ and variance $S^2(\theta)$ are determined. Then, the mean
deviation, $MD(\theta)$,

$$MD(\theta) = \bar{\theta} - \theta(H) \qquad (4.108)$$

and the relative root mean square deviation, $RRMSD(\theta)$,

$$RRMSD(\theta) = \frac{1}{\theta(H)} \sqrt{\frac{1}{100} \sum_{i=1}^{100} [\theta(i) - \theta(H)]^2} \qquad (4.109)$$

are obtained, in which $\theta(H)$ is the statistic derived from the historical sample (historical
statistic). The statistics $MD(\theta)$ and $RRMSD(\theta)$ are useful for comparing the
historical and model statistics derived by data generation. In addition, one can observe
where $\theta(H)$ falls relative to $\bar{\theta} - S(\theta)$ and $\bar{\theta} + S(\theta)$. Also, graphical comparisons
such as the Box-Cox diagrams can be useful.
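
As a minimal sketch of the comparison just described, the following Python function computes the mean deviation, the relative root mean square deviation, and the one-standard-deviation band around the mean of the generated statistics. The function name and the sample values are hypothetical, and the use of the n-1 divisor for S is an assumption, since the text does not specify it.

```python
import math
import statistics

def md_and_rrmsd(generated_stats, historical_stat):
    """Return MD, RRMSD, the band (mean - S, mean + S), and whether the
    historical statistic falls inside that band."""
    n = len(generated_stats)
    mean_theta = statistics.mean(generated_stats)
    s_theta = statistics.stdev(generated_stats)  # n-1 divisor (assumption)
    md = mean_theta - historical_stat
    # Root mean square deviation about the historical statistic,
    # made relative by dividing by the historical statistic.
    rmsd = math.sqrt(sum((t - historical_stat) ** 2 for t in generated_stats) / n)
    rrmsd = rmsd / historical_stat
    band = (mean_theta - s_theta, mean_theta + s_theta)
    inside = band[0] <= historical_stat <= band[1]
    return md, rrmsd, band, inside

# Hypothetical lag-1 correlations estimated from five generated samples,
# compared with a historical value of 0.2773.
theta = [0.21, 0.31, 0.26, 0.35, 0.22]
md, rrmsd, band, inside = md_and_rrmsd(theta, 0.2773)
print(md, rrmsd, band, inside)
```

In practice the list would hold one value per generated sample (for example 100 values), computed with the same estimator as the historical statistic.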


5 EXAMPLES
5.1 Statistical Analysis of Data
In this section, SAMS is used to analyze and model actual hydrologic data. The data
used are the monthly data of the Yakima basin, read from the file yakima.dat,
which can be obtained from the diskette accompanying this manual. The file contains data for
12 stations in the Yakima basin. Each station's record covers 48 years with 12 seasons per year.
As an illustration, a sample of the data file is shown in Appendix A. SAMS was used to analyze
the statistics of the seasonal and annual data. Some of the statistics calculated by SAMS are
shown below.
Annual Statistics
Site Number: 1      KEECHELUS_RESERVOIR

                              Historical
Mean                            242.9312
Standard Deviation               55.3134
Skewness Coefficient              0.3416
Coef. Variation                   0.2277
Maximum                         375.5001
Minimum                         151.7000

Correlation Structure
LAG
0                                 1.0000
1                                 0.2773
2                                -0.0591
3                                 0.0644
4                                 0.0104
5                                 0.0736
6                                -0.1389
7                                -0.1669
8                                -0.0322
9                                -0.1162
10                                0.0034

Lag-0 Cross Correlations
Sites
1 and 1 (KE & KE)                 1.0000
1 and 2 (KE & KA)                 0.9877
1 and 3 (KE & YA)                 0.7864
1 and 4 (KE & CL)                 0.9826
1 and 5 (KE & YA)                 0.9834
1 and 6 (KE & YA)                 0.9525
1 and 7 (KE & BU)                 0.9190
1 and 8 (KE & NA)                 0.8831
1 and 9 (KE & TI)                 0.8787
1 and 10 (KE & TI)                0.8698
1 and 11 (KE & NA)                0.8626
1 and 12 (KE & YA)                0.9243

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                   7.0000
Maximum Deficit                 344.2187
Longest Surplus                   6.0000
Maximum Surplus                 244.0125
Storage Capacity                576.3561
Rescaled Range                   10.4198
Hurst Coefficient                 0.7375

Site Number: 2      KACHESS_RESERVOIR
***********
                              Historical
Mean                            211.7479
Standard Deviation               52.4475
Skewness Coefficient              0.2010
Coef. Variation                   0.2477
Maximum                         324.6000
Minimum                         120.1000

Correlation Structure
LAG
0                                 1.0000
1                                 0.2790
2                                -0.0329
3                                 0.0957
4                                -0.0304
5                                 0.0323
6                                -0.1500
7                                -0.1782
8                                -0.0666
9                                -0.1703
10                               -0.0300

Lag-0 Cross Correlations
Sites
2 and 1 (KA & KE)                 0.9877
2 and 2 (KA & KA)                 1.0000
2 and 3 (KA & YA)                 0.7712
2 and 4 (KA & CL)                 0.9913
2 and 5 (KA & YA)                 0.9923
2 and 6 (KA & YA)                 0.9632
2 and 7 (KA & BU)                 0.9470
2 and 8 (KA & NA)                 0.9157
2 and 9 (KA & TI)                 0.9072
2 and 10 (KA & TI)                0.9017
2 and 11 (KA & NA)                0.9027
2 and 12 (KA & YA)                0.9425

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                   7.0000
Maximum Deficit                 310.3353
Longest Surplus                   6.0000
Maximum Surplus                 234.6083
Storage Capacity                503.2062
Rescaled Range                    9.5945
Hurst Coefficient                 0.7115
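
The storage-related values printed for sites 1 and 2 appear to be linked by two simple relations, which the short Python check below reproduces: the rescaled range equals the storage capacity divided by the standard deviation, and the Hurst coefficient equals ln(rescaled range)/ln(N/2) with N = 48 years. These relations are inferred from the printed numbers, not quoted from the SAMS output, and the function names are hypothetical.

```python
import math

def rescaled_range(storage_capacity, std_dev):
    """Range of cumulative departures divided by the standard deviation."""
    return storage_capacity / std_dev

def hurst_k(rescaled, n):
    """Hurst's K estimator: ln(R/S) / ln(N/2)."""
    return math.log(rescaled) / math.log(n / 2.0)

# Printed annual values: (storage capacity, std. deviation,
# rescaled range, Hurst coefficient), N = 48 years of record.
printed = {
    "KEECHELUS_RESERVOIR": (576.3561, 55.3134, 10.4198, 0.7375),
    "KACHESS_RESERVOIR": (503.2062, 52.4475, 9.5945, 0.7115),
}

for site, (capacity, std_dev, rr_printed, k_printed) in printed.items():
    rr = rescaled_range(capacity, std_dev)
    k = hurst_k(rr, 48)
    print(site, round(rr, 4), rr_printed, round(k, 4), k_printed)
```

The same relation also holds for the seasonal output of site 1 when N is taken as the 576 monthly values.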

Seasonal Statistics
Site Number: 1      KEECHELUS_RESERVOIR
***********
Historical
Season          Mean    Standard Deviation    Skewness Coefficient
1            21.6250               13.5856                  1.0570
2            22.5979               13.9981                  1.6400
3            17.8708               10.2554                  0.8679
4            14.1542                8.9925                  1.0953
5            15.5708                8.5916                  2.2601
6            26.8333                8.5001                  0.2109
7            47.4375               14.4123                  0.1997
8            38.1917               19.0200                  0.2420
9            14.9604               11.6909                  1.1964
10            4.7375                2.6210                  1.3112
11            5.4792                4.3821                  2.8219
12           13.4729                8.4761                  0.8688

Season to Season Correlations
Season         LAG 1       LAG 2
1             0.5775      0.3728
2             0.2969      0.4746
3             0.2198      0.1630
4             0.4555      0.0556
5             0.4143      0.2264
6             0.3211     -0.0199
7            -0.0872     -0.1219
8             0.5527     -0.3637
9             0.8343      0.3692
10            0.8618      0.7047
11            0.2814      0.2319
12            0.4562      0.1770

Lag-0 Season to Season Cross Correlations
Sites 1 and 2 (KE & KA)
Season
1             0.9853
2             0.9828
3             0.9793
4             0.9847
5             0.9924
6             0.9632
7             0.9788
8             0.9906
9             0.9888
10            0.8572
11            0.9504
12            0.9888

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                  11.0000
Maximum Deficit                 123.7427
Longest Surplus                   7.0000
Maximum Surplus                 163.8901
Storage Capacity                640.1103
Rescaled Range                   39.0407
Hurst Coefficient                 0.6471

5.2 Stochastic Modeling and Generation of Streamflow Data


SAMS was used to model the annual and monthly flows of site 1 of the Yakima basin (refer
to file yakima.dat). Both the annual and monthly data used in the following examples were
transformed using the logarithmic transformation; the transformation coefficients are shown in
Appendix C.
5.2.1 Univariate ARMA(p,q) Model
SAMS was used to model the annual flows of site 1 with an ARMA(1,1) model. The
method of moments (MOM) was used to estimate the model parameters. SAMS was also used to
generate 150 samples, each 48 years long, using the estimated parameters. The following is a
summary of the results of the model fitting and generation using the ARMA(1,1) model.

Results of fitting an ARMA(1,1) model to the transformed and standardized annual
flows of site 1:
Model:ARMA
Number_of_sites:                      1
Site(s)_ID:                           1
Data_Transformations:
Site_1:                               LOG
a-coef=                               49.000000
Data_Standardization:                 YES
Mean_of_the_process:                  5.658607
Standard_deviation_of_the_process:    0.189585
Model_order(p,q):                     1 1
phi_parameters: (Annual)
phi_1                                 -0.138036
theta_parameters: (Annual)
theta_1                               -0.494947
Variance_of_the_residuals: (Annual)
0.885071
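
The generation step for this model can be sketched as follows. The sketch is not the SAMS code and rests on assumptions that should be checked against Section 4: the standardized process is written as z(t) = phi z(t-1) + e(t) - theta e(t-1), the log transformation is taken as Y = ln(X + a), and the residuals are normal. Under that sign convention the printed residual variance is the one that gives the standardized process unit variance, which the first function verifies. All function names are hypothetical.

```python
import math
import random

# Values printed by SAMS for site 1 (annual, ARMA(1,1), MOM).
PHI, THETA = -0.138036, -0.494947
RES_VAR = 0.885071
MEAN_Y, STD_Y, A_COEF = 5.658607, 0.189585, 49.0

def implied_residual_variance(phi, theta):
    """Residual variance for which z(t) = phi z(t-1) + e(t) - theta e(t-1)
    has unit variance (assumed sign convention)."""
    return (1.0 - phi ** 2) / (1.0 + theta ** 2 - 2.0 * phi * theta)

def generate_annual_flows(n_years, rng, warmup=50):
    """Generate one sample of annual flows in the original units."""
    sigma = math.sqrt(RES_VAR)
    z_prev, e_prev = 0.0, 0.0
    flows = []
    for t in range(warmup + n_years):
        e = rng.gauss(0.0, sigma)
        z = PHI * z_prev + e - THETA * e_prev
        z_prev, e_prev = z, e
        if t >= warmup:                      # discard the warm-up period
            y = MEAN_Y + STD_Y * z           # undo the standardization
            flows.append(math.exp(y) - A_COEF)  # undo Y = ln(X + a)
    return flows

print(implied_residual_variance(PHI, THETA))  # close to the printed 0.885071

# 150 samples, each 48 years long, as in the example.
samples = [generate_annual_flows(48, random.Random(seed)) for seed in range(150)]
```

Statistics such as the mean, standard deviation, and lag correlations would then be computed for each sample and averaged over the 150 samples to obtain the "Generated" values.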

Results of statistical analysis of the data generated from the ARMA(1,1) model:

Model: Univariate ARMA, (Statistical Analysis of Generated Data)
Site Number: 1      KEECHELUS_RESERVOIR
***********
                              Historical    Generated
Mean                            242.9312     242.9985
Standard Deviation               55.3134      53.8040
Skewness Coefficient              0.3416       0.4131
Coef. Variation                   0.2277       0.2212
Maximum                         375.5001     385.1967
Minimum                         151.7000     138.7450

Correlation Structure
LAG
0                                 1.0000       1.0000
1                                 0.2773       0.2691
2                                -0.0591      -0.0625
3                                 0.0644      -0.0349
4                                 0.0104      -0.0237
5                                 0.0736      -0.0202
6                                -0.1389      -0.0310
7                                -0.1669      -0.0308
8                                -0.0322      -0.0448
9                                -0.1162      -0.0426
10                                0.0034      -0.0277

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                   7.0000       6.0267
Maximum Deficit                 344.2187     287.2662
Longest Surplus                   6.0000       5.2533
Maximum Surplus                 244.0125     311.2614
Storage Capacity                576.3561     488.3525
Rescaled Range                   10.4198       9.1089
Hurst Coefficient                 0.7375       0.6879

SAMS was also used to model the transformed and standardized annual flows of site 7
with an ARMA(2,2) model using the Approximate LS method. The modeling results for this
site are shown below:

Model:ARMA
Number_of_sites:                      1
Site(s)_ID:                           7
Data_Transformations:
Site_7:                               LOG
a-coef=                               450.000000
Data_Standardization:                 YES
Mean_of_the_process:                  6.488171
Standard_deviation_of_the_process:    0.081923
Model_order(p,q):                     2 2
phi_parameters: (Annual)
phi_1                                 0.316854
phi_2                                 -0.122860
theta_parameters: (Annual)
theta_1                               -0.002752
theta_2                               0.003944
Variance_of_the_residuals: (Annual)
0.918059

Using these estimated parameters, 150 samples, each 48 years long, were generated. The
results of the statistical analysis of the generated data are shown below:

Model: Univariate ARMA, (Statistical Analysis of Generated Data)
Site Number: 7      BUMPING_RESERVOIR
***********
                              Historical    Generated
Mean                            209.5250     209.2238
Standard Deviation               53.9224      53.1033
Skewness Coefficient              0.1097       0.1912
Coef. Variation                   0.2574       0.2541
Maximum                         316.4000     338.9804
Minimum                         112.1000      96.7291

Correlation Structure
LAG
0                                 1.0000       1.0000
1                                 0.2548       0.2532
2                                -0.0238      -0.0711
3                                 0.0770      -0.0782
4                                -0.0034      -0.0399
5                                 0.0430      -0.0203
6                                -0.1625      -0.0320
7                                -0.1544      -0.0294
8                                -0.1121      -0.0311
9                                -0.2085      -0.0229
10                               -0.0532      -0.0273

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                   4.0000       5.8067
Maximum Deficit                 255.5000     287.1717
Longest Surplus                   6.0000       5.4667
Maximum Surplus                 268.4500     295.3221
Storage Capacity                498.2249     461.1258
Rescaled Range                    9.2397       8.7080
Hurst Coefficient                 0.6996       0.6742

5.2.2 Univariate GAR(1) Model


A GAR(1) model was fitted to the annual data of site 1. Based on this model, the
skewness coefficient of the historical data can be preserved without data transformation. The
estimated parameters of the model are shown below:

Model:GAR
Number_of_sites:                        1
Site(s)_ID:                             1
Data_Transformations:
Site_1:                                 NONE
Data_Standardization:                   NO
Mean_of_the_process:                    242.931244
Standard_deviation_of_the_process:      55.313374
Skewness_coefficient_of_the_process:    0.341578
beta_parameters:                        25.621111
alpha_parameters:                       0.089647
lamda_parameters:                       -42.867931
ph_parameters:                          0.325271
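
As a rough consistency check on the printed parameters, the Python sketch below computes the moments of a three-parameter gamma distribution, assuming that beta is the shape, alpha the scale (rate), and lamda the location parameter. That reading is an assumption; it is supported by the fact that lamda + beta/alpha reproduces the printed mean. The function name is hypothetical.

```python
import math

# Parameters printed by SAMS for the GAR(1) model of site 1.
BETA, ALPHA, LAMBDA = 25.621111, 0.089647, -42.867931

def gamma_moments(shape, rate, location):
    """Mean, standard deviation and skewness of a three-parameter gamma
    distribution with the given shape, rate and location."""
    mean = location + shape / rate
    std = math.sqrt(shape) / rate
    skew = 2.0 / math.sqrt(shape)
    return mean, std, skew

mean, std, skew = gamma_moments(BETA, ALPHA, LAMBDA)
print(mean, std, skew)
```

The implied mean agrees with the printed mean of 242.931244. The implied standard deviation (about 56.46) and skewness (about 0.395) are somewhat larger than the printed sample values of 55.313374 and 0.341578. One possible reason is that the sample moments are adjusted before the parameters are computed, but the output shown here does not confirm this.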

Using these estimated parameters, 150 samples, each 48 years long, were generated. The
results of the statistical analysis of the generated data are shown below:

Model: Univariate GAR(1), (Statistical Analysis of Generated Data)
Site Number: 1      KEECHELUS_RESERVOIR
***********
                              Historical    Generated
Mean                            242.9312     241.6216
Standard Deviation               55.3134      55.1494
Skewness Coefficient              0.3416       0.3156
Coef. Variation                   0.2277       0.2282
Maximum                         375.5001     379.7300
Minimum                         151.7000     131.1173

Correlation Structure
LAG
0                                 1.0000       1.0000
1                                 0.2773       0.2614
2                                -0.0591       0.0640
3                                 0.0644       0.0105
4                                 0.0104      -0.0150
5                                 0.0736      -0.0288
6                                -0.1389      -0.0384
7                                -0.1669      -0.0416
8                                -0.0322      -0.0436
9                                -0.1162      -0.0421
10                                0.0034      -0.0410

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                   7.0000       6.4067
Maximum Deficit                 344.2187     332.5194
Longest Surplus                   6.0000       6.0267
Maximum Surplus                 244.0125     354.8967
Storage Capacity                576.3561     535.3652
Rescaled Range                   10.4198       9.6257
Hurst Coefficient                 0.7375       0.7045

5.2.3 Univariate PARMA(p,q) Model


A PARMA(1,1) model was fitted to the transformed and standardized monthly data of
site 1 of the Yakima basin using the MOM. Part of the modeling results obtained by SAMS is
shown below:
Model:PARMA
Number_of_seasons:        12
Number_of_sites:          1
Site(s)_ID:               1
Data_Transformations:
Site_1:                   LOG
a-coef= 8.0 3.5 1.7 -1.7 -7.3 40.0 120.0 80.0 -1.4 0.0 -1.1 2.5
Data_Standardization:     YES
Model_order(p,q):         1 1

parameters:
                  phi_1       theta_1    Variance_of_the_residuals
Season_1       0.799617      0.176386                     0.576008
Season_2       0.601515      0.325764                     0.802792
Season_3       0.562456      0.475817                     0.931586
Season_4       0.351970     -0.173212                     0.734578
Season_5       0.372740      0.031907                     0.877790
Season_6       0.083416     -0.256987                     0.897436
Season_7      -0.546931     -0.523320                     0.968819
Season_8       4.603543      4.142385                     0.133168
Season_9       0.845774     -0.414590                     0.168387
Season_10      0.706037      0.137241                     0.530972
Season_11      0.432385     -0.196226                     0.702498
Season_12      0.265219     -0.149505                     0.858247

The estimated parameters were used to generate 100 samples of seasonal (12 seasons)
data, each sample 48 years long. The results of the statistical analysis of the generated data
are shown below:

Model: Univariate PARMA, (Statistical Analysis of Generated Data)
Site Number: 1      KEECHELUS_RESERVOIR
***********
Season         Historical    Generated
Mean
1                 21.6250      21.4531
2                 22.5979      22.5754
3                 17.8708      17.8748
4                 14.1542      13.9850
5                 15.5708      15.4822
6                 26.8333      26.5404
7                 47.4375      47.5850
8                 38.1917      38.5255
9                 14.9604      15.4387
10                 4.7375       4.7413
11                 5.4792       5.4180
12                13.4729      13.2594

Standard Deviation
1                 13.5856      13.3797
2                 13.9981      13.4388
3                 10.2554      10.6862
4                  8.9925       9.4890
5                  8.5916       8.8690
6                  8.5001       8.3496
7                 14.4123      13.9888
8                 19.0200      18.9623
9                 11.6909      13.9993
10                 2.6210       2.5598
11                 4.3821       4.1501
12                 8.4761       8.5231

Skewness Coefficient
1                  1.0570       1.0899
2                  1.6400       1.2611
3                  0.8679       1.3163
4                  1.0953       1.8644
5                  2.2601       2.4466
6                  0.2109       0.3551
7                  0.1997       0.2544
8                  0.2420       0.4822
9                  1.1964       2.3082
10                 1.3112       1.3478
11                 2.8219       2.1814
12                 0.8688       1.2928

Season to Season Correlations
LAG 1
1                  0.5775       0.6249
2                  0.2969       0.4015
3                  0.2198       0.1513
4                  0.4555       0.4693
5                  0.4143       0.2756
6                  0.3211       0.2770
7                 -0.0872       0.0946
8                  0.5527       0.5754
9                  0.8343       0.8147
10                 0.8618       0.6320
11                 0.2814       0.4625
12                 0.4562       0.3269
LAG 2
1                  0.3728       0.2491
2                  0.4746       0.3682
3                  0.1630       0.2180
4                  0.0556       0.0162
5                  0.2264       0.1639
6                 -0.0199       0.0267
7                 -0.1219      -0.1336
8                 -0.3637      -0.3810
9                  0.3692       0.4268
10                 0.7047       0.6075
11                 0.2319       0.2310
12                 0.1770       0.1110

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
                              Historical    Generated
Longest Drought                  11.0000      10.7400
Maximum Deficit                 123.7427     131.8937
Longest Surplus                   7.0000       6.5900
Maximum Surplus                 163.8901     177.8746
Storage Capacity                640.1103     487.0978
Rescaled Range                   39.0407      29.0030
Hurst Coefficient                 0.6471       0.5907

5.2.4 Multivariate MAR(p) Model


SAMS was also used to model the transformed and standardized annual data of sites 3,
5, and 7 of the Yakima basin using the MAR(1) model. The modeling results are shown below:

Model:MAR
Number_of_sites:          3
Site(s)_ID:               3 5 7
Data_Transformations:
Site_3:                   LOG
a-coef=                   -205.000000
Site_5:                   LOG
a-coef=                   2000.000000
Site_7:                   LOG
a-coef=                   450.000000
Data_Standardization:     YES
Mean_of_the_process:
6.096067
8.147832
6.488171
Standard_deviation_of_the_process:
0.461667
0.103274
0.081923
Model_order(p,q):         1 0
phi_parameters: (Annual)
phi_1
   0.802852   -0.091863   -0.271925
   0.180350    0.241441   -0.103272
   0.127788    0.243420   -0.083069
Variance_of_the_residuals: (Annual)
   0.716938    0.736062    0.704988
   0.736062    0.900586    0.868521
   0.704988    0.868521    0.919150
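
A minimal Python sketch of how standardized multisite series could be generated from these parameters is given below. It assumes the model form Z(t) = phi_1 Z(t-1) + e(t), with e(t) normal with the printed residual covariance matrix, and it assumes the row and column arrangement of phi_1 shown in the sketch; the printed output does not make the orientation of that matrix explicit. The residuals are obtained from independent standard normal numbers through a Cholesky factor. The function names are hypothetical and this is not the SAMS code.

```python
import math
import random

# Parameters printed for sites 3, 5 and 7 (matrix orientation assumed).
PHI = [[0.802852, -0.091863, -0.271925],
       [0.180350,  0.241441, -0.103272],
       [0.127788,  0.243420, -0.083069]]
G = [[0.716938, 0.736062, 0.704988],
     [0.736062, 0.900586, 0.868521],
     [0.704988, 0.868521, 0.919150]]

def cholesky(matrix):
    """Lower triangular factor B such that B B^T equals the matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def generate_mar1(n_years, rng, warmup=50):
    """Generate standardized annual values for the three sites."""
    n = len(PHI)
    b = cholesky(G)
    z = [0.0] * n
    series = []
    for t in range(warmup + n_years):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlated residuals e = B xi, so that cov(e) = G.
        eps = [sum(b[i][k] * xi[k] for k in range(i + 1)) for i in range(n)]
        z = [sum(PHI[i][j] * z[j] for j in range(n)) + eps[i] for i in range(n)]
        if t >= warmup:
            series.append(z)
    return series

sample = generate_mar1(48, random.Random(1))
```

Each generated vector would then be rescaled with the printed means and standard deviations and back-transformed with the inverse of the log transformation for each site.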

These estimated parameters were used to generate 150 samples of annual data, each
48 years long, for the three sites. The results of the statistical analysis of the generated
data are shown below:

Model: Multivariate AR (MAR), (Statistical Analysis of Generated Data)
Site Number: 3      YAKIMA_RIVER_AT_EASTON_DIVERSION_DAM
***********
                              Historical    Generated
Mean                            699.3479     687.1061
Standard Deviation              246.3507     218.7747
Skewness Coefficient              1.8333       1.1163
Coef. Variation                   0.3523       0.3161
Maximum                        1726.4000    1400.4399
Minimum                         346.9000     367.8550

Correlation Structure
LAG
0                                 1.0000       1.0000
1                                 0.4976       0.3898
2                                 0.2140       0.1702
3                                 0.1931       0.0740
4                                 0.0206       0.0299
5                                 0.0005       0.0106
6                                -0.1358      -0.0216
7                                -0.1159      -0.0458
8                                -0.0234      -0.0382
9                                -0.0729      -0.0508
10                                0.0363      -0.0590

Lag-0 Cross Correlations
Sites
3 and 3 (YA & YA)                 1.0000       1.0000
3 and 5 (YA & YA)                 0.8040       0.8653
3 and 7 (YA & BU)                 0.7269       0.8142

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                   7.0000       8.8200
Maximum Deficit                1338.4355    1530.0778
Longest Surplus                   6.0000       6.0600
Maximum Surplus                2412.7124    1740.6711
Storage Capacity               2420.0576    2481.6851
Rescaled Range                    9.8236      11.1852
Hurst Coefficient                 0.7189       0.7527

Site Number: 5      YAKIMA_RIVER_AT_CLE_ELUM
***********
                              Historical    Generated
Mean                           1474.3375    1461.5977
Standard Deviation              358.9830     348.3850
Skewness Coefficient              0.2136       0.2240
Coef. Variation                   0.2435       0.2386
Maximum                        2345.5000    2300.8103
Minimum                         826.0001     732.6721

Correlation Structure
LAG
0                                 1.0000       1.0000
1                                 0.2872       0.2546
2                                -0.0224       0.0445
3                                 0.1007      -0.0004
4                                 0.0092      -0.0187
5                                 0.0426      -0.0137
6                                -0.1397      -0.0363
7                                -0.1650      -0.0478
8                                -0.0598      -0.0251
9                                -0.1297      -0.0347
10                               -0.0224      -0.0406

Lag-0 Cross Correlations
Sites
5 and 3 (YA & YA)                 0.8040       0.8653
5 and 5 (YA & YA)                 1.0000       1.0000
5 and 7 (YA & BU)                 0.9536       0.9563

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                   7.0000       6.2200
Maximum Deficit                2220.5625    2088.8184
Longest Surplus                   6.0000       5.6133
Maximum Surplus                1561.5746    2044.6892
Storage Capacity               3803.9871    3397.3022
Rescaled Range                   10.5966       9.7093
Hurst Coefficient                 0.7428       0.7070

Site Number: 7      BUMPING_RESERVOIR
***********
                              Historical    Generated
Mean                            209.5250     207.7169
Standard Deviation               53.9224      52.5678
Skewness Coefficient              0.1097       0.1658
Coef. Variation                   0.2574       0.2534
Maximum                         316.4000     332.2505
Minimum                         112.1000      95.6784

Correlation Structure
LAG
0                                 1.0000       1.0000
1                                 0.2548       0.2156
2                                -0.0238       0.0339
3                                 0.0770      -0.0114
4                                -0.0034      -0.0204
5                                 0.0430      -0.0103
6                                -0.1625      -0.0307
7                                -0.1544      -0.0409
8                                -0.1121      -0.0252
9                                -0.2085      -0.0357
10                               -0.0532      -0.0369

Lag-0 Cross Correlations
Sites
7 and 3 (BU & YA)                 0.7269       0.8142
7 and 5 (BU & YA)                 0.9536       0.9563
7 and 7 (BU & BU)                 1.0000       1.0000

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                   4.0000       6.0933
Maximum Deficit                 255.5000     303.3795
Longest Surplus                   6.0000       5.5733
Maximum Surplus                 268.4500     299.3968
Storage Capacity                498.2249     495.2468
Rescaled Range                    9.2397       9.3981
Hurst Coefficient                 0.6996       0.6966

5.2.5 Multivariate CARMA(p,q) Model


A CARMA(2,2) model was also fitted to sites 3, 5, and 7 of the Yakima basin. The
modeling results are shown below:

Model:CARMA
Number_of_sites:          3
Site(s)_ID:               3 5 7
Data_Transformations:
Site_3:                   LOG
a-coef=                   -205.000000
Site_5:                   LOG
a-coef=                   2000.000000
Site_7:                   LOG
a-coef=                   450.000000
Data_Standardization:     YES
Mean_of_the_process:
6.096067
8.147832
6.488171
Standard_deviation_of_the_process:
0.461667
0.103274
0.081923
Model_order(p,q):         2 2
phi_parameters: (Annual)
phi_1
   0.558511    0.000000    0.000000
   0.000000    0.397362    0.000000
   0.000000    0.000000    0.316854
phi_2
  -0.222751    0.000000    0.000000
   0.000000   -0.169891    0.000000
   0.000000    0.000000   -0.122860
theta_parameters: (Annual)
tht_1
   0.000000    0.000000    0.000000
   0.000000    0.000000    0.000000
   0.000000    0.000000   -0.002752
tht_2
  -0.000000    0.000000    0.000000
   0.000000   -0.000000    0.000000
   0.000000    0.000000    0.003944
Variance_of_the_residuals: (Annual)
   0.752099    0.716343    0.702230
   0.716343    0.859100    0.847136
   0.702230    0.847136    0.904820

These estimated parameters were used to generate 150 samples of annual data, each
48 years long, for the three sites. The results of the statistical analysis of the generated
data are shown below:

Model: Contemporaneous ARMA (CARMA), (Statistical Analysis of Generated Data)
Site Number: 3      YAKIMA_RIVER_AT_EASTON_DIVERSION_DAM
                              Historical    Generated
Mean                            699.3479     699.0977
Standard Deviation              246.3507     226.7137
Skewness Coefficient              1.8333       1.1653
Coef. Variation                   0.3523       0.3232
Maximum                        1726.4000    1436.5048
Minimum                         346.9000     374.4970

Correlation Structure
LAG
0                                 1.0000       1.0000
1                                 0.4976       0.3893
2                                 0.2140      -0.0032
3                                 0.1931      -0.0993
4                                 0.0206      -0.0777
5                                 0.0005      -0.0441
6                                -0.1358      -0.0301
7                                -0.1159      -0.0107
8                                -0.0234      -0.0129
9                                -0.0729      -0.0340
10                                0.0363      -0.0464

Lag-0 Cross Correlations
Sites
3 and 3 (YA & YA)                 1.0000       1.0000
3 and 5 (YA & YA)                 0.8040       0.8490
3 and 7 (YA & BU)                 0.7269       0.7923

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                   7.0000       7.7133
Maximum Deficit                1338.4355    1302.8521
Longest Surplus                   6.0000       5.2867
Maximum Surplus                2412.7124    1569.8881
Storage Capacity               2420.0576    2132.3733
Rescaled Range                    9.8236       9.3943
Hurst Coefficient                 0.7189       0.6974

Site Number: 5      YAKIMA_RIVER_AT_CLE_ELUM
***********
                              Historical    Generated
Mean                           1474.3375    1480.3129
Standard Deviation              358.9830     346.5107
Skewness Coefficient              0.2136       0.2796
Coef. Variation                   0.2435       0.2344
Maximum                        2345.5000    2343.4680
Minimum                         826.0001     772.9506

Correlation Structure
LAG
0                                 1.0000       1.0000
1                                 0.2872       0.2943
2                                -0.0224      -0.0618
3                                 0.1007      -0.0956
4                                 0.0092      -0.0564
5                                 0.0426      -0.0236
6                                -0.1397      -0.0129
7                                -0.1650      -0.0007
8                                -0.0598      -0.0195
9                                -0.1297      -0.0348
10                               -0.0224      -0.0441

Lag-0 Cross Correlations
Sites
5 and 3 (YA & YA)                 0.8040       0.8490
5 and 5 (YA & YA)                 1.0000       1.0000
5 and 7 (YA & BU)                 0.9536       0.9543

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                   7.0000       6.0867
Maximum Deficit                2220.5625    1927.5646
Longest Surplus                   6.0000       5.4867
Maximum Surplus                1561.5746    1992.9727
Storage Capacity               3803.9871    3076.5337
Rescaled Range                   10.5966       8.8885
Hurst Coefficient                 0.7428       0.6790

Site Number: 7      BUMPING_RESERVOIR
***********
                              Historical    Generated
Mean                            209.5250     210.4516
Standard Deviation               53.9224      52.2904
Skewness Coefficient              0.1097       0.2260
Coef. Variation                   0.2574       0.2489
Maximum                         316.4000     338.8398
Minimum                         112.1000     100.5427

Correlation Structure
LAG
0                                 1.0000       1.0000
1                                 0.2548       0.2406
2                                -0.0238      -0.0693
3                                 0.0770      -0.0752
4                                -0.0034      -0.0457
5                                 0.0430      -0.0238
6                                -0.1625      -0.0168
7                                -0.1544      -0.0106
8                                -0.1121      -0.0201
9                                -0.2085      -0.0345
10                               -0.0532      -0.0279

Lag-0 Cross Correlations
Sites
7 and 3 (BU & YA)                 0.7269       0.7923
7 and 5 (BU & YA)                 0.9536       0.9543
7 and 7 (BU & BU)                 1.0000       1.0000

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought                   4.0000       5.8000
Maximum Deficit                 255.5000     275.3951
Longest Surplus                   6.0000       5.2667
Maximum Surplus                 268.4500     284.3499
Storage Capacity                498.2249     453.1582
Rescaled Range                    9.2397       8.6618
Hurst Coefficient                 0.6996       0.6714

5.2.6 Disaggregation Models


A spatial-temporal disaggregation modeling and generation example using SAMS, based
on multivariate data of the Yakima basin, is demonstrated here. In this example both the annual
and monthly data being modeled are transformed using the logarithmic transformation. The
schematic representation of the station locations in the basin is shown in Fig.26. Stations 5 and
11 can be considered key stations. Stations 3, 4, 8, and 10 are substations, and stations 1, 2, 7,
and 9 are subsequent stations. Scheme 1 is used to model the key stations: the annual flows
of the key stations are added together to form one series of annual data, the index station.
The index station data are fitted with an ARMA(1,1) model, and then a disaggregation model
(either Valencia and Schaake or Mejia and Rousselle) is used to disaggregate the annual
flows of the index station into the annual flows at the key stations. The key station to substation
disaggregation is done using two groups. The first group contains key station 5 and
substations 3 and 4. The second group contains key station 11 and substations 8 and 10. The
substation to subsequent station disaggregation is also done using two groups. The first
group contains substations 3 and 4 and subsequent stations 1 and 2. The second group contains
substations 8 and 10 and subsequent stations 7 and 9. The modeling results for the annual and
monthly data are summarized below.

Fig.32 Schematic representation of the river network for the disaggregation example

Annual (spatial) disaggregation


Disaggregation Model: Valencia and Schaake
!Modeling of Key stations
Disaggregation scheme:1
Key stations Id : 5 and 11
Model of Index station: ARMA(1,1)
!Key station to substation disaggregation modeling
Number of groups: 2
Group #: 1
Keystations Id : 5
Substations Id : 3 and 4
Group #: 2
Keystations Id : 11
Substations Id : 8 and 10
!Substation to subsequent station disaggregation modeling
Number of groups: 2
Group #: 1
Substations Id : 3 and 4
Subsequent stations Id : 1 and 2

Group #: 2
Substations Id : 8 and 10
Subsequent stations Id : 7 and 9
Using the above configuration, the estimated model parameters are given below.
Modeling of Key stations
Basic_statistics_of_the_index_station:
Mean:                                 16.337555
Standard_deviation:                   0.196377
Model_order(p,q):                     1 1
phi_parameters: (Annual)
phi_1                                 0.095154
theta_parameters: (Annual)
theta_1                               -0.151970
Variance_of_the_residuals: (Annual)
0.036331

Disaggregation_of_index_to_key_stations:
A_matrix
   0.515312
   0.484688
B_matrix
   0.020397    0.000000
  -0.020397    0.000001
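
The A matrix of the index-to-key-station step can be reproduced from the printed standard deviations under two assumptions: the index series is the sum of the two transformed key-station series (which is consistent with the printed index mean, 16.337555, being the sum of the key-station means 8.147832 and 8.189722), and A is the moment estimate, that is, the covariance of each key station with the index divided by the variance of the index. The Python sketch below is only this arithmetic check; the function name is hypothetical.

```python
# Printed standard deviations of the index station and of key
# stations 5 and 11 (transformed annual data).
STD_INDEX = 0.196377
STD_KEY_5 = 0.103274
STD_KEY_11 = 0.097387

def a_coefficients(std_index, std_key_1, std_key_2):
    """Moment estimate of A for an index equal to the sum of two key
    stations: A_i = cov(Y_i, X) / var(X)."""
    var_x = std_index ** 2
    var_1, var_2 = std_key_1 ** 2, std_key_2 ** 2
    # var(Y1 + Y2) = var(Y1) + var(Y2) + 2 cov(Y1, Y2)
    cov_12 = (var_x - var_1 - var_2) / 2.0
    a1 = (var_1 + cov_12) / var_x   # cov(Y1, Y1 + Y2) / var(X)
    a2 = (var_2 + cov_12) / var_x   # cov(Y2, Y1 + Y2) / var(X)
    return a1, a2

a1, a2 = a_coefficients(STD_INDEX, STD_KEY_5, STD_KEY_11)
print(a1, a2, a1 + a2)
```

The two coefficients agree with the printed A_matrix (0.515312 and 0.484688) to about four decimals, and they add to one, so the disaggregated key-station values add up to the index value apart from the residual term.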

Disaggregation of Key stations to substations

Number_of_groups:                     2

group_#:                              1
Number_of_key_stations:               1
Key_stations_ID:                      5
Data_Transformations:
Station_5:                            LOG
a-coef=                               2000.000000
Basic_statistics_of_the_key_stations:
Mean_of_the_process:
8.147832
Standard_deviation_of_the_process:
0.103274
Number_of_sub_stations:               2
Sub_stations_ID:                      3 4
Data_Transformations:
Station_3:                            LOG
a-coef=                               -205.000000
Station_4:                            LOG
a-coef=                               1000.000000
Basic_statistics_of_the_sub_stations:
Mean_of_the_process:
6.096067
7.417926
Standard_deviation_of_the_process:
0.461667
0.094175
A_matrix
   3.933447
   0.904966
B_matrix
   0.219627    0.000000
  -0.002626    0.011408

group_#:                              2
Number_of_key_stations:               1
Key_stations_ID:                      11
Data_Transformations:
Station_11:                           LOG
a-coef=                               2406.000000
Basic_statistics_of_the_key_stations:
Mean_of_the_process:
8.189722
Standard_deviation_of_the_process:
0.097387
Number_of_sub_stations:               2
Sub_stations_ID:                      8 10
Data_Transformations:
Station_8:                            LOG
a-coef=                               2500.000000
Station_10:                           LOG
a-coef=                               100.000000
Basic_statistics_of_the_sub_stations:
Mean_of_the_process:
8.090804
6.195611
Standard_deviation_of_the_process:
0.072572
0.205165
A_matrix
   0.738420
   1.995138
B_matrix
   0.010106    0.000000
   0.052522    0.040097


Disaggregation of substations to subsequent stations

Number_of_groups:                     2

group_#:                              1
Number_of_sub_stations:               2
Sub_stations_ID:                      3 4
Data_Transformations:
Station_3:                            LOG
a-coef=                               -205.000000
Station_4:                            LOG
a-coef=                               1000.000000
Basic_statistics_of_the_sub_stations:
Mean_of_the_process:
6.096067
7.417926
Standard_deviation_of_the_process:
0.461667
0.094175
Number_of_subsequent_stations:        2
Subsequent_stations_ID:               1 2
Data_Transformations:
Station_1:                            LOG
a-coef=                               49.000000
Station_2:                            LOG
a-coef=                               210.000000
Basic_statistics_of_the_subsequent_stations:
Mean_of_the_process:
5.658607
6.036669
Standard_deviation_of_the_process:
0.189585
0.124544
A_matrix
   0.027025    1.867341
   0.005409    1.288824
B_matrix
   0.033196    0.000000
   0.008933    0.013417

group_#:                              2
Number_of_sub_stations:               2
Sub_stations_ID:                      8 10
Data_Transformations:
Station_8:                            LOG
a-coef=                               2500.000000
Station_10:                           LOG
a-coef=                               100.000000
Basic_statistics_of_the_sub_stations:
Mean_of_the_process:
8.090804
6.195611
Standard_deviation_of_the_process:
0.072572
0.205165
Number_of_subsequent_stations:        2
Subsequent_stations_ID:               7 9
Data_Transformations:
Station_7:                            LOG
a-coef=                               450.000000
Station_9:                            LOG
a-coef=                               40.000000
Basic_statistics_of_the_subsequent_stations:
Mean_of_the_process:
6.488171
5.980681
Standard_deviation_of_the_process:
0.081923
0.220482
A_matrix
   0.841615    0.093955
  -0.007637    1.071983
B_matrix
   0.017719    0.000000
   0.007666    0.020592

Seasonal disaggregation
For annual-monthly disaggregation modeling, the stations were divided into two groups.
The first group contains the stations 1, 2, 3, 4, and 5, while the second group contains stations
7, 8, 9, 10, and 11. Part of the annual-monthly disaggregation modeling results are shown below.
Disaggregation Model: Lane condensed Model
Number of groups: 2
Group #: 1
stations id :1, 2, 3, 4, and 5
Group #: 2
stations id : 7, 8, 9, 10, and 11
group #: 1
Season : 1
A matrix

0.100187
-0.790256
-1.087746
-0.736562
-0.812205

B matrix

0.272571
0.316740
0.313677
0.337333
0.342454

-5.591171
-5.794002
-6.127742
-10.566408
-7.984121

-0.175279
-0.220576
0.154692
-0.291538
-0.239491

2.833674
3.963215
3.974580
9.683748
5.462620

7.173420
8.640571
8.148174
9.577835
10.251024

0.000000
0.062122
0.037441
0.087106
0.064863

0.000000
0.000000
0.051319
0.001906
0.021514

0.000000
0.000000
0.000000
0.104401
0.041748

0.000000
0.000000
0.000000
0.000000
0.024983


C matrix

-0.553959
-0.669728
-0.739637
-0.610305
-0.730781

1.056880
1.411542
1.300074
1.144258
1.344898

0.162744
0.154738
0.319091
0.254001
0.245251

0.342263
0.321621
0.432287
0.507121
0.458153

-0.967182
-1.119348
-1.188065
-1.012993
-1.139128

22.856783
20.105253
23.010061
16.965302
24.837963

1.439505
1.168698
1.391239
0.467788
1.053099

-12.453279
-11.709853
-15.063147
-8.150460
-15.114661

0.000000
0.166081
-0.003222
0.054581
0.000507

0.000000
0.000000
0.186866
-0.011022
0.141137

0.000000
0.000000
0.000000
0.191075
0.098642

0.000000
0.000000
0.000000
0.000000
0.103991

-2.866502
-2.221176
-2.149964
-1.867283
-2.100979

-2.391560
-2.099235
-0.642270
-1.273152
-1.783531

0.463517
0.547366
0.927850
1.024580
0.725862

1.038808
1.114273
-0.697620
-0.114892
0.423544

16.572737
23.627403
20.584972
15.794100
22.826544

-3.392607
-2.921220
-1.793802
-2.042270
-3.548427

3.279173
1.744206
1.468768
1.905422
1.898677

-9.404758
-10.550832
-11.500857
-8.688657
-8.968649

0.000000
0.108808
0.072522
0.057508
0.099421

0.000000
0.000000
0.067637
0.059324
0.026707

0.000000
0.000000
0.000000
0.035298
0.014496

0.000000
0.000000
0.000000
0.000000
0.050552

-0.406830
-0.334373
-0.131294
-0.066460
-0.362385

-0.629304
-0.576146
-0.130068
-0.164665
-0.476519

0.654334
0.674303
0.182709
0.211029
0.584160

0.595607
0.592925
0.381739
0.244746
0.659288

-2.220287
-3.768532

1.720777
2.314499

1.226575
6.803477

group #: 1
Season : 5
A matrix

-7.485024
-6.270320
-8.476122
-8.248028
-9.367930

B matrix

0.812630
0.701064
0.697859
0.728257
0.736211

C matrix

4.226117
3.171489
2.938035
2.794332
3.306370

-6.186196
-5.100427
-1.557045
0.988259
-1.299358

group #: 2
Season : 1
A matrix

1.716307
-0.044698
-0.326232
-0.327360
-0.147128

B matrix

0.356707
0.429611
0.285874
0.248105
0.420281

C matrix

0.298232
0.272947
0.103876
0.106382
0.254331

group #: 2
Season : 5
A matrix

-3.034014
-10.645264

6.733072
11.803424


-3.917177
-5.067391
-8.766766

B matrix

0.490879
0.408971
0.435884
0.417643
0.338970

C matrix

2.416789
1.659479
1.618642
1.504343
1.396754

-0.284301
3.801157
9.480949

-0.828180
-2.485281
-3.613741

0.804703
2.491937
2.021952

6.347968
4.987008
6.793612

0.000000
0.285451
0.160045
0.209446
0.236938

0.000000
0.000000
0.127493
0.115327
-0.016704

0.000000
0.000000
0.000000
0.075774
-0.030911

0.000000
0.000000
0.000000
0.000000
0.049310

1.505506
1.830661
1.827437
1.679651
0.869733

-3.527168
-2.907587
-2.917736
-2.833496
-2.370874

2.026513
1.818996
2.342135
2.304887
1.566433

-2.032238
-2.139591
-2.449902
-2.352898
-1.248207

These estimated parameters were used to generate 100 samples of monthly data each of
48 years long for the 10 sites. Part of the statistical analysis results of the generated data is
shown below:
Model: Seasonal Disaggregation (Statistical Analysis of Generated Data)
Site Number: 2   KACHESS_RESERVOIR

Season    Historical    Generated
Mean
   1        16.8646       16.9702
   2        19.7521       20.0277
   3        16.1458       16.1352
   4        13.3875       13.4198
   5        15.2688       15.3021
   6        26.2375       26.3170
   7        44.6521       44.3891
   8        33.4583       33.1077
   9        11.4625       11.5481
  10         2.5000        2.5730
  11         3.0542        3.0691
  12         8.9646        9.0435
Standard Deviation
   1        12.1013       12.1722
   2        12.8655       12.9882
   3         9.2932        9.5560
   4         7.9937        8.6000
   5         8.5009        8.7934
   6         8.1718        7.9901
   7        14.4182       13.9078
   8        16.2124       16.0398
   9         9.5621       11.1432
  10         2.2897        2.8500
  11         3.0264        2.7782
  12         6.7700        7.4574
Skewness Coefficient
   1         1.1320        1.4399
   2         1.8967        1.5356
   3         0.7127        1.2846
   4         0.9671        1.7533
   5         2.2952        2.4070
   6         0.1600        0.2967
   7         0.3599        0.3659
   8         0.2885        0.5599
   9         1.1013        2.1439
  10         1.2974        2.2914
  11         3.1266        1.9573
  12         1.1720        1.8414

Season to Season Correlations


LAG
1
0.4416
2
0.5262
3
0.2913
4
0.3904
5
0.3056
6
0.1841
7
0.0670
8
9
10
11
12

1
0.6589
0.4100
0.3546
0.4388
0.4377
0.2811
0.0489
0.5925
0.8565
0.8978
0.3768
0.6031

0.6309
0.8436
0.6757
0.4301
0.4574

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean

                     Historical    Generated
Longest Drought         10.0000      10.7700
Maximum Deficit        112.4566     124.0420
Longest Surplus          8.0000       7.6300
Maximum Surplus        163.0347     181.7904
Storage Capacity       564.9718     593.0753
Rescaled Range          36.5715      37.6897
Hurst Coefficient        0.6356       0.6359

Site Number: 11   NACHES_R_BELOW_TIETON_R_NEAR_NACHES

Season    Historical    Generated
Mean
   1        56.0854       56.1426
   2        76.6458       77.0317
   3        64.6104       64.7072
   4        62.5875       63.0430
   5        77.3479       77.0965
   6       151.5771      152.9251
   7       280.5313      280.5401
   8       236.7167      236.6873
   9       103.7792      103.1596
  10        40.6542       40.6756
  11        28.0208       28.1190
  12        36.2292       35.8892

Standard Deviation
   1        40.6182       44.2509
   2        70.4072       64.4993
   3        42.6869       41.9235
   4        39.8475       41.8386
   5        48.1696       43.1282
   6        59.2644       58.0738
   7        96.9022       94.0675
   8       103.3385      100.5874
   9        57.9129       57.7420
  10        17.0232       16.5544
  11        10.6177       10.5495
  12        20.6300       19.6031
Skewness Coefficient
   1         0.9662        1.8455
   2         3.4510        2.1245
   3         2.3219        1.8524
   4         1.3324        1.8464
   5         2.7078        1.7876
   6         0.4732        0.5181
   7         0.5250        0.6177
   8         0.2663        0.4145
   9         0.8792        1.2173
  10         0.0272        0.1560
  11        -0.4915        0.1691
  12         1.3003        1.3090

Season to Season Correlations


LAG
1
0.4309
2
0.6573
3
0.5482
4
0.5406
5
0.5438
6
0.4710
7
0.3264
8
9
10
11
12

1
0.6809
0.5099
0.7709
0.5542
0.4644
0.3119
0.3550
0.5396
0.8317
0.8759
0.7917
0.5972

0.5680
0.8473
0.8521
0.7977
0.6452

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean

                     Historical    Generated
Longest Drought         10.0000      11.2400
Maximum Deficit        693.8212     767.2662
Longest Surplus          7.0000       7.7000
Maximum Surplus       1063.9717    1208.4659
Storage Capacity      3839.3513    3697.9873
Rescaled Range          38.2223      39.6617
Hurst Coefficient        0.6381       0.6499

Lag-0 Season to Season Cross Correlations


Sites 1 and 2 (KE & KA)
1
2
3
4
5
6
7
0.9761
8
0.9891
9
0.9578
10
0.6456
11
0.8028
12
0.9815
Sites
1
2
3
4
5
6
7
8
9
10

0.9853
0.9828
0.9793
0.9847
0.9924
0.9632
0.9788

0.9844
0.9780
0.9725
0.9738
0.9650
0.9615

0.9906
0.9888
0.8572
0.9504
0.9888

3 and

(YA & NA)


0.9068
0.8623
0.6949
0.8251
0.9108
0.7394
0.7722
0.8394
0.7933
0.4031

0.3507
0.1565
0.2006
0.1275
0.0830
0.0994
0.2210
0.3087
0.2841
0.1972


11
12
Sites

0.2735
0.8937
5 and 11

1
2
3
0.2284
4
0.1553
5
0.0695
6
0.0676
7
0.2773
8
0.3916
9
0.3967
10
11
12
Sites
1
0.9557
2
0.9086
3
0.9642
4
0.9656
5
0.9274
6
0.9773
7
0.9752
8
9
10
11
12

0.2272
0.2705

(YA & NA)


0.9392
0.8905
0.8189

0.3422
0.1526

0.9137
0.9296
0.9286
0.9512
0.9699
0.9462
0.6776
0.3608
0.9007
8 and 10

0.4131
0.3005
0.2319

(NA & TI)


0.9755
0.9867
0.9796
0.9847
0.9827
0.9781
0.9833
0.9897
0.9770
0.7619
0.5003
0.9741

0.9888
0.9533
0.7168
0.4811
0.9092


REFERENCES
Fernandez, B., and J. D. Salas, 1990, Gamma-Autoregressive Models for Stream-Flow
Simulation, ASCE Journal of Hydraulic Engineering, vol. 116, no. 11, pp. 1403-1414.
Frevert, D. K., M. S. Cowan, and W. L. Lane, 1989, Use of Stochastic Hydrology in Reservoir
Operation, J. Irrig. Drain. Eng., vol. 115, no. 3, pp. 334-343.
Gill, P. E., W. Murray, and M. H. Wright, 1981, Practical Optimization, Academic Press,
New York.
Grygier, J. C., and J. R. Stedinger, 1990, SPIGOT, A Synthetic Streamflow Generation Software
Package, technical description, version 2.5, School of Civil and Environmental
Engineering, Cornell University, Ithaca, N.Y.
Himmelblau, D. M., 1972, Applied Nonlinear Programming, McGraw-Hill, New York.
Hipel, K., and A. I. McLeod, 1994, Time Series Modeling of Water Resources and
Environmental Systems, Elsevier, Amsterdam, 1013 pp.
Kendall, M. G., 1963, The Advanced Theory of Statistics, vol. 3, 2nd Ed., Charles Griffin and Co.
Ltd., London, England.
Lane, W. L., 1979, Applied Stochastic Techniques (LAST Computer Package); User Manual,
Division of Planning Technical Services, U.S. Bureau of Reclamation, Denver, Colo.
Lane, W. L., 1981, Corrected Parameter Estimates for Disaggregation Schemes, Inter. Symp.
on Rainfall Runoff Modeling, Mississippi State University.
Lane, W. L., and D. K. Frevert, 1990, Applied Stochastic Techniques, personal computer version
5.2, user's manual, Bureau of Reclamation, U.S. Dep. of Interior, Denver, Colorado.
Loucks, D. P., J. R. Stedinger, and D. A. Haith, 1981, Water Resources Systems Planning and
Analysis, Prentice-Hall, Englewood Cliffs, N.J.
Lawrance, A. J., 1982, The innovation distribution of a gamma distributed autoregressive
process, Scandinavian J. Statistics, 9(4), 234-236.
Lawrance, A. J., and P. A. W. Lewis, 1981, A New Autoregressive Time Series Model in
Exponential Variables [NEAR(1)], Adv. Appl. Prob., 13(4), pp. 826-845.
Matalas, N. C., 1966, Time Series Analysis, Water Resour. Res., 3(4), pp. 817-829.
Mejia, J. M., and J. Rousselle, 1976, Disaggregation Models in Hydrology Revisited, Water
Resources Research, vol. 12, no. 2, pp. 185-186.
O'Connell, P. E., 1977, ARIMA Models in Synthetic Hydrology, in Mathematical Models for
Surface Water Hydrology, T. Ciriani, V. Maione, and J. Wallis, eds., Wiley & Sons,
N.Y., pp. 51-68.
Salas, J. D., 1993, Analysis and Modeling of Hydrologic Time Series, Handbook of Hydrology,
Chap. 19, pp. 19.1-19.72, edited by D. R. Maidment, McGraw-Hill, Inc., New York.
Salas, J. D., D. C. Boes, and R. A. Smith, 1982, Estimation of ARMA Models with Seasonal
Parameters, Water Resources Res., vol. 18, no. 4, pp. 1006-1010.
Salas, J. D., et al., 1999, Statistical Computer Techniques for Water Resources and
Environmental Engineering, forthcoming book.
Salas, J. D., J. W. Delleur, V. Yevjevich, and W. L. Lane, 1980, Applied Modeling of Hydrologic
Time Series, Water Resources Publications, Littleton, Colorado.
Stedinger, J. R., D. P. Lettenmaier, and R. M. Vogel, 1985, Multisite ARMA(1,1) and
Disaggregation Models for Annual Streamflow Generation, Water Resour. Res., 21(4),
pp. 497-509.
U.S. Army Corps of Engineers, 1971, HEC-4 Monthly Streamflow Simulation, Hydrologic
Engineering Center, Davis, Calif.
Valencia, D., and J. C. Schaake, Jr., 1973, Disaggregation Processes in Stochastic Hydrology,
Water Resources Research, vol. 9, no. 3, pp. 580-585.


APPENDIX A
This appendix contains a sample of the monthly input data file used in this manual, which
holds monthly flows for 12 stations in the Yakima basin. The data file name is
YAKIMA.DAT. For illustration, data for only two stations are printed below. Note that, except for
the first block, entitled "station", which contains the station names, all other items must be
included in the data file.

station
 1  KEECHELUS RESERVOIR
 2  KACHESS RESERVOIR
 3  YAKIMA RIVER AT EASTON DIVERSION DAM
 4  CLE ELUM RESERVOIR
 5  YAKIMA RIVER AT CLE ELUM
 6  YAKIMA RIVER AT UMTANUM
 7  BUMPING RESERVOIR
 8  NACHES RIVER AT NACHES-SELAH DIVERSION
 9  TIETON RESERVOIR
10  TIETON RIVER AT TIETON DIVERSION
11  NACHES R BELOW TIETON R NEAR NACHES
12  YAKIMA R AT SUNNYSIDE DIV(PARKER)

tot_num_stats 12
Years 48
Seasonal 12
Station 1
Station_id KEECHELUS_RESERVOIR
Duration 1926 1973
10.2
14.8
48.5
8.5
2.8
10.2
19.6
61.4
31.9
32.4
5.9
2.9
40.8
16.4
14
11.6
19.3
23.2
9.3
9.5
20.9
11.8
34.3
16
37.6
33.4

35
19.3
19
6.5
7.3
5.9
7.6
27.9
80
20.2
8.1
22.3
23.6
26.6
28.6
17.9
20.1
20.5
24.8
15.1
16.2
38.4
16.3
14.6
22.4
34.7

14.9
8.9
29.1
4.2
7.1
12.9
18.1
25.8
37
39.4
12.6
5.9
14.3
26.7
7.6
6.7
5.8
12.9
6.3
32.1
13.9
23.1
9.9
9.1
13.9
16.9

12.4
6.8
6.2
3.2
23.9
14
22.2
7.8
17.8
14.5
6.8
7.1
5.4
8.9
15.3
5.9
5.7
7.4
9.3
18.8
6.6
17.9
9.3
9.1
10.8
33.4

24.7
7.9
21.5
11.3
18.2
21.9
32.2
8.4
47.7
14.5
15.1
12
9.7
14.1
22.8
18.2
9.6
13.8
12
9.3
12.1
25.4
9.3
12.4
21.6
10.2

30.3
19.1
21
15.3
36.9
26.1
32.4
18.1
48.2
15
34.9
22.4
34.7
29.2
33.4
23.1
30.9
39.1
21.5
17
26.6
42
17.5
31.1
19.2
30.3

19.1
44.5
55.8
50.1
26.9
43.9
50.8
37.3
22.4
44
74.2
45.4
46.6
43.6
35.6
19.8
31.3
46.5
37.1
58.4
67
45.3
64.3
71.9
46.1
57.6


4.2
57.7
20.4
38.4
17
20.2
53
65
7.1
41.7
44.5
56.4
26.5
25.2
12.5
10
28.5
50.6
18.4
27.4
59.6
24.9
72.2
48.1
78.6
37.7

3.2
16.1
6
11.6
5.8
6.6
18.7
37
2.7
13.9
7.5
14
5.8
9.6
4.9
4.9
7.9
23
3.4
8.3
23.9
7.6
17
22.9
38.4
8.8

2.9
4.8
4.4
2
3.2
2.8
7.9
8.9
1.3
4.3
3.6
3.5
2.7
3.3
3.9
4
2.5
4.8
5.1
4.2
6.8
1.3
6.6
6.9
9
3.3

5.3
11.5
2
2.4
2.8
2.6
4.3
11.3
2.8
4.2
3.4
2.9
1.8
3.2
2
11
3
3.6
8.9
9.2
4.7
4.7
6.8
5.1
4.5
2.9

20.9
23
17.3
2.7
8.5
9.5
11.7
36.8
22.7
4.7
3.1
6.3
4
6.8
5.5
18.6
3.6
4.9
6
14.5
15.5
29.4
11.8
19.7
22.4
20.1

16.3
2.1
20.2
23.1
36.9
18.9
9.9
50
56.1
27.9
15.2
33
21.1
15.8
14.8
15.2
21.2
26
10.3
19.3
24.5
13

10.9
4.2
38.8
10.3
25.5
56.3
19.6
39.9
32.6
10.2
21.4
24.7
9.8
26.9
11.9
33.9
42.2
17
8.1
9.7
16.9
35

6.4
49.3
16.6
10.2
9.7
8.5
13.6
30.4
7.4
24.7
28.5
18.2
16
28.2
11.1
28.8
32.3
20.3
13.2
28.1
22
19.2

9.6
24.9
10.9
15.5
6.6
7.4
19.2
10.9
12.3
32
15.5
28.9
7.4
22.4
5.8
20.2
37
5.4
9.1
27.8
37.9
6.2

8.8
10.9
10.5
8.4
8.7
10.9
12.3
17.2
15.5
19.8
8.4
14.9
11
13
11.9
11.4
23.4
9.8
16.4
10.1
47.4
10.8

30.4
24.3
22
11.9
31.9
29.5
29.3
39.9
30
32.7
39
20.3
19.8
38.2
35.2
10.8
17.1
28.1
23.2
15.1
23.8
20.2

51.9
44.1
55.4
37.3
77.7
55.5
55
43.5
46.2
56.6
29.3
28.3
46.9
43.1
48.8
44.9
31.9
67.9
48.4
69
78.3
31.5

28
36.4
58
67.9
63
21.8
18.2
44.7
36.4
47.1
25.5
14.5
71.6
34.2
29
46.2
23.9
40.1
42.7
55.1
64.8
18.3

12.9
20.9
40.1
39.4
33.8
6.2
4.3
17.9
8.4
9.1
11.9
4.9
45
10.8
12.3
10.8
6.2
8.2
8.8
39.5
31.4
5.8

3.4
4.6
12.6
9.8
6.4
3.8
4.3
3.4
4.3
2.7
4.2
2.5
12.4
3.7
2
1.4
4.5
4.8
3.1
7.2
9.1
3.2

3.6
2.9
7.5
4.8
5
2.6
4.3
27.2
4.6
4.4
4.1
2
9
2.8
2
1.4
13.8
7
5.2
4.8
12.7
4.4

3.3
7.5
10.6
29.5
20
4.5
19.7
34
13.3
13.2
13.1
6.5
13.5
7.8
9.6
26.7
17.6
10.7
7.6
12.7
6.7
8.6

Station 2
Station_id KACHESS_RESERVOIR
Duration 1926 1973
6.6
27.8
13.5
11.7
12.3
19.2
8.5
7.9
39.2
25.3
28.5
5.8
6.3
5.7
4.3
3.8
1.4
6.5
3.7
19.7
8
4.8
8.6
12.5
12.7
7.5
11.8
25.6
48.1
27.3
24.5
8
27.6
75.9
35.2
18.8
28.8
19.1
39
13.9
3.5
5.6
10.3
6.8
1.5
14.2
5.1
7.4
29.2
21.4
13.6
5.9
11.1
19.5
22.8
9.2
9.3
21.2
6.5
12
7.3
13.3
6.4
4.6
15.7
18.5
5.8
5.9
16
18.1
12.5
8.4
5.2
17.7
5.1
7.7
6.6
11.4
27.4
16.6
16.1
13.6
14.4
6.6
10.1
34.6
21.8
16.6
28.3
14.4
10
10.1
13.9
14.3
6.7
11.4
31.3
23
13.6
9.7
28.4
31.9
17.7
31.8
14.7
10.4
5.9
8
1.2
4
38.1
24.6
12.7
32.6
17.2
11.7
17.9
9.6
7.4
14.1
36.8
23.5
10.6
7.3
13.4
49.3
9.1
7.3
6.4
16.2
11.1
18.2
41.6
37
27.4
10.7
52.3
29.8
8.3
10.8
22.2
8.4
20.8
29.1
11.1
17.3
27.4
16.7
28
21.5
14.9
24.7
13
9.3
16.3
7.5
11
23.2
22.8
21.7
9.8
8.2
9.4
5.1

23.3
7.1
20.9
11.9
19.1
21.8
34
7.4
46.9
15.7
15.9
11.3
10.3
15
20.2
17.2
9.1
13.5
11.1
10.5
11.4
24.1
9.2
11.4
19.8
10.7
7.9
11.8
9.8
8.4
9.3
10
11.4
16.8
15.4
20.2
8.2
14.7
11.7
15.3
9.9

29.2
22.4
22.1
16
34.5
24.3
34.4
20
47.6
16.8
38.1
21
33
30.1
30.3
23.2
27.9
41
20.9
16.8
28.8
33.7
18.5
32.5
18.5
31.8
29.3
23.5
22.9
12.3
35.5
29.1
26.6
35.5
30.2
32.2
37.2
17.1
21.1
35.6
31.3

18.3
45.8
54.3
44.7
24
40.5
48.3
38.9
24.4
45.6
72.2
44.9
46.3
39.9
34.4
17.4
27.3
44.7
29.6
47.8
67.9
38.8
60
73.6
48
54.5
44.8
41.6
55.6
37.5
78
53.3
54.4
40.2
37.3
50.4
28.7
26.5
44.4
38.2
43.3

8.3
52
21.5
32.6
15
17.8
41.1
57.9
8.2
35.8
40.1
49.1
26.8
20.5
12.2
8.2
20.3
43.3
15
22.3
44.9
18.8
62
42.6
67.5
31.9
24.6
33.2
50.7
60.4
51.2
19.1
18.2
39.9
31.4
42.3
27
13
62.1
30.2
24.3

1.1
14.2
4.4
7.9
5
5.2
14.2
28.4
2.1
12.1
7.3
12.4
2.6
7.8
1.8
1.6
5.4
16.2
2
3
16.9
5
14.7
18.4
29.9
6.1
8.8
18.4
34.1
29.6
24.3
3
3.6
14
6.4
8.6
9.6
2.5
33.5
7.6
8.2

0.4
3.6
0.5
0.3
0.3
1.5
2.6
6.9
0.8
1.7
3.3
4.1
0.4
1.5
0.9
2
1.6
1.2
0.7
0.3
2.1
1.4
3.5
4
5.9
0.4
1.7
4
8.9
5.6
3.5
0.7
0.3
3.8
2.4
1.7
2.8
0.6
9.4
3.2
0.4

3.5
8.9
0.2
0.2
1.1
1.2
1.5
8.5
1.4
2.5
2.7
1
0.6
1.7
0.4
7
1.6
0.5
3
4.8
0.9
3.1
2.4
2
2.2
2.7
1.9
2.8
4.2
3.1
3
0.7
3.9
18.7
2.1
2.3
2.5
1.6
5.3
2.1
1

14.4
18.4
10.9
1
4.5
5.8
6.5
27.8
15.6
2.2
1.5
2.3
2.2
4.1
4.3
11.9
2.1
2.6
3.8
8
10.7
21.8
6.8
12.9
15.7
14
1.1
5.2
6.4
20.5
13.2
3.4
12.8
28.8
8.5
9.1
8.9
3.5
8.5
4.1
6.4


9.9
18.1
21.5
6.1
13.2
15
9.1

28.1
35.7
16.4
6.3
9.2
11.8
28.5

25.3
31.6
20.3
10.7
25
20.3
17.8

18.4
32.7
5.5
7.8
26.2
30.9
5.2

11.3
21.8
9.5
13.3
10.7
46.9
9.8

9.3
14.2
28.4
21.6
15.6
21.2
16.3

39.4
27.5
62
42.8
66.1
72.3
26.9

40.9
20.6
37.2
39.7
48.9
59.8
15.6

8.8
4.3
6.4
7.7
32.2
29.1
3.8

0.5
2.9
1.4
0.3
6.1
6.9
1

1.2
6.7
3.6
2.9
3.3
5.8
2.3

20.7
12
5
4.1
6.8
4.6
4.9

Remarks:
1. Data values are in free format, but they must be separated by at least one space.
2. The entries following the item titles tot_num_stats, Years, Seasonal, Station, Station_id, and
Duration depend on the case at hand.
3. Each station name following the item title Station_id must be a single word. If the name has
more than one word, the words must be connected by underscores (_), as in
KEECHELUS_RESERVOIR in the sample file above.
4. The Station_id term is optional. Note that if a data file does not include the Station_id
term, the results in tables and graphs will not show the station identification.
5. The block of station names at the top of the sample data file above can be omitted.
Omitting it does not affect how SAMS reads the data.
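
The remarks above can be illustrated with a short reader for files laid out like YAKIMA.DAT. This is only a sketch under stated assumptions, not the routine SAMS itself uses: the function name parse_sams_text and the dictionary keys are hypothetical, the keyword spellings are taken from the sample file, and the values of each station are returned as one flat list because the ordering of values within a station block is not assumed here.

```python
def parse_sams_text(text):
    """Parse a SAMS-style input file given as one string (illustrative sketch)."""
    tokens = text.split()  # free format: any whitespace separates items
    if "tot_num_stats" not in tokens:
        raise ValueError("item title tot_num_stats not found")
    # Everything before tot_num_stats (the optional block of station names) is skipped.
    i = tokens.index("tot_num_stats")
    header = {"tot_num_stats": int(tokens[i + 1])}
    i += 2
    stations = []
    current = None
    while i < len(tokens):
        tok = tokens[i]
        if tok == "Years":
            header["years"] = int(tokens[i + 1])
            i += 2
        elif tok == "Seasonal":
            header["seasons"] = int(tokens[i + 1])
            i += 2
        elif tok == "Annual":
            header["seasons"] = 1  # assumption: annual data treated as one season per year
            i += 1
        elif tok == "Station":
            current = {"number": int(tokens[i + 1]), "id": None,
                       "duration": None, "values": []}
            stations.append(current)
            i += 2
        elif current is None:
            raise ValueError("unexpected item before the first Station: " + tok)
        elif tok == "Station_id":
            current["id"] = tokens[i + 1]  # optional, must be a single word
            i += 2
        elif tok == "Duration":
            current["duration"] = (int(tokens[i + 1]), int(tokens[i + 2]))
            i += 3
        else:
            current["values"].append(float(tok))
            i += 1
    return header, stations


sample = """
station
1 KEECHELUS RESERVOIR
2 KACHESS RESERVOIR
tot_num_stats 2
Years 2
Seasonal 3
Station 1
Station_id KEECHELUS_RESERVOIR
Duration 1926 1927
10.2 14.8 48.5
8.5 2.8 10.2
Station 2
Duration 1926 1927
6.6 27.8 13.5 11.7 12.3 19.2
"""
header, stations = parse_sams_text(sample)
```

A simple consistency check after reading is that each station holds header["years"] * header["seasons"] values; the second station in the sample omits Station_id, so its "id" entry stays empty.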


APPENDIX B
This appendix contains a sample of an annual input data file used by SAMS
corresponding to 12 stations of annual flows for the Yakima basin. Printed below for illustration
are data for only two stations. Note that the data can also be arranged as a single column for
each station.

tot_num_stats 12
Years 48
Annual
Station 1
Station_id KEECHELUS_RESERVOIR
Duration 1926 1973
183.1 234.4 251.2 156.2 160.4 176.6 278.5 345.7
321.6 248.8 219.7 201.1 215.9 213.6 186.1 151.7
168.2 250.3 162.1 223.8 273.8 271.8 275.3 266.9
324.5 289.3 185.5 232.1 303.2 268.1 325.2 225.9
209.7 359.0 267.1 280.4 216.1 198.7 283.5 246.9
194.4 251.7 271.1 245.3 196.1 298.4 375.5 176.2

Station 2
Station_id KACHESS_RESERVOIR
Duration 1926 1973
158.1 220.3 233.6 134.7 134.8 152.0 240.2 303.7
304.5 233.2 207.3 174.3 192.3 183.2 153.5 120.1
141.2 218.0 121.8 175.5 234.3 229.8 239.9 243.7
285.1 261.9 159.1 208.4 266.8 226.4 296.2 198.4
183.1 314.4 234.9 247.3 197.4 168.6 242.1 215.0
157.3 213.8 228.1 217.2 163.3 263.3 324.6 141.2


APPENDIX C
The logarithmic transformation coefficients for both annual and monthly flows for each
site of the example data file YAKIMA.DAT are given below. Refer to Eq. (4.1) for details.
Transformation coefficients for annual flows

Site    Coef. a
  1         49
  2        210
  3       -205
  4       1000
  5       2000
  6       2500
  7        450
  8       2500
  9         40
 10        100
 11       2406
 12       2500

Transformation coefficients for monthly flows


Month
Site

10

11

12

3.5

1.7

-1.7

-7.3

40

120

80

-1.4

-1.1

2.5

4.5

-2

-6.4

65

50

40

0.5

0.19

0.3

1.2

13

10

10

-7

-21

210

25

-3

-1.8

12

-4

-14.5

40

100

200

-4

2.7

15

16

10

-11

-47

187

366

300

-2.5

12

-3.8

15

14

12

-11

-62.5

105

380

290

15

100

27.5

2.27

1.15

2.7

-0.6

-2.83

14.6

18

90

-0.88

-0.15

-1.25

4.5

0.7

2.5

1.8

-11.1

112

133

340

9.4

55

86

-1.3

4.7

-1.4

-3.3

-2.7

-9.2

70

-10

90

-6

-6.5

-4.4

10

10.5

3.4

-4

-3.7

-10.5

44

-7

160

-7.5

-5.6

5.9

-3.5

11

1.7

-4.7

-4

-4.5

-14.6

138

101

350

18

296

98

1.2

12

-2

-30

-99

175

410

510

-20

-15

-35
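
As an illustration of how a coefficient a is applied, the sketch below assumes that Eq. (4.1) has the form Y = ln(X + a); the equation itself is not reproduced in this appendix, so that form, along with the function names log_transform and inverse_log_transform, is an assumption made for the example. The flows are the first three annual values of station 1 in Appendix B, and a = 49 is the annual coefficient listed for site 1.

```python
import math


def log_transform(flows, a):
    """Return ln(x + a) for each flow (assumed form of Eq. 4.1)."""
    shifted = [x + a for x in flows]
    if min(shifted) <= 0.0:
        # A negative coefficient is usable only if every flow exceeds -a.
        raise ValueError("x + a must be positive for every flow")
    return [math.log(s) for s in shifted]


def inverse_log_transform(values, a):
    """Map transformed values back to flows: x = exp(y) - a."""
    return [math.exp(y) - a for y in values]


annual_flows = [183.1, 234.4, 251.2]   # station 1, Appendix B
transformed = log_transform(annual_flows, 49.0)
recovered = inverse_log_transform(transformed, 49.0)
```

The check on x + a matters for the negative coefficients in the tables: a value such as -205 can only be used at a site whose flows all exceed 205.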
