
HYDROLOGICAL PROCESSES

Hydrol. Process. 14, 2157–2172 (2000)

Comparing neural network and autoregressive moving average techniques for the provision of continuous river flow forecasts in two contrasting catchments
Robert J. Abrahart¹ and Linda See²
¹School of Earth and Environmental Sciences, University of Greenwich, Medway Campus, Central Avenue, Chatham Maritime, Kent ME4 4TB, UK
²Centre for Computational Geography, University of Leeds, Leeds LS2 9JT, UK

Abstract:
The forecasting power of neural network (NN) and autoregressive moving average (ARMA) models is compared. Modelling experiments were based on a 3-year period of continuous river flow data for two contrasting catchments: the Upper River Wye and the River Ouse. Model performance was assessed using global and storm-specific quantitative evaluation procedures. The NN and ARMA solutions provided similar results, although naïve predictions yielded poorer estimates. The annual data were then grouped into a set of distinct hydrological event types using a self-organizing map, and two rising event clusters were modelled using the NN technique. These alternative investigations provided encouraging results. Copyright © 2000 John Wiley & Sons, Ltd.

KEY WORDS neural network model; ARMA model; hydrological forecasting

Received 2 March 1998; Accepted 10 July 1999

INTRODUCTION
Operational hydrological forecasting and water resource management require efficient tools to provide accurate estimates of future river level conditions and meet real world demand. Physical models, based on continuum mechanics, offer one possible forecasting method. Such tools, however, will often be too complex, or too demanding in terms of data and resources, for widespread practical application (Jakeman, 1993). Thus simpler approaches offered through 'conceptual modelling' or 'black-box' solutions are fast becoming attractive alternatives. One area of recent interest in this emerging field of hydrological research is the application of neural network (NN) modelling (e.g. Daniell, 1991; French et al., 1992; Kang et al., 1993; Karunanithi et al., 1994; Lorrai and Sechi, 1995; Smith and Eli, 1995; Cheng and Noguchi, 1996; Minns and Hall, 1996, 1997; Yang, 1997; Dawson and Wilby, 1998; Minns, 1998; Abrahart and See, 1999; Campolo et al., 1999). Neural network forecasting and prediction offers various benefits (Abrahart, 1999) and models can be developed to meet the four guiding principles of hydrological modelling: parsimony, modesty, accuracy and testability (Hillel, 1986). Neural solutions can also be developed with basin transfer capabilities (Minns, 1996), be derived from small data sets (Abrahart and White, in press), and could accommodate long-term changes or relationships that are not constant and evolve over time. These new technologies, however, require evaluation against conventional models and statistical tools, in order to determine their relative performance and to establish the circumstances in which their use is appropriate. Hsu et al. (1995), for example, compared NN models against a statistical ARMAX (autoregressive moving average with exogenous input) model and the lumped conceptual SAC-SMA (Sacramento soil moisture accounting) model using daily data for the Leaf River basin. The NN models generated better performance statistics.
This paper, in contrast, evaluates the numerical performance of feedforward NN models trained with the backpropagation algorithm against linear statistical autoregressive moving average (ARMA) models for the purpose of operational forecasting. It therefore embraces the ideas of Maier and Dandy (1996, 1997) and Fortin et al. (1997), who suggested that a comparison of these two approaches might prove useful. Naïve predictions, which use the current value as the prediction, were also included in this comparison as a bottom-line benchmark.
Modelling operations were implemented on two contrasting catchments: the Upper Wye (Central Wales) and the River Ouse (Yorkshire). This work was limited to an investigation of models developed on a restricted set of input variables, i.e. river flow data. Other variables, such as meteorological inputs or catchment characteristics, for instance the degree of urbanization, were excluded in order that the whole modelling procedure could be reproduced at a later date on alternative catchments where additional data might not be available. Models were developed on annual data for 1985, with 1984 and 1986 data being used for model validation purposes. The main method of assessment was based on an evaluation of several different global 'goodness-of-fit' statistics. Two alternative performance measures, more relevant to high flow situations and flood forecasting, were also investigated. The final part of this paper illustrates the potential for multi-model solutions based on the application of a self-organizing map (SOM) to the same data. This tool was used to perform a classification of the data series, which produced a set of distinct hydrological event types, two of which were then modelled on an individual basis, to determine whether more accurate NN solutions could be achieved.

NEURAL NETWORK BASICS


NN solutions offer an important alternative to traditional hydrological methods, for both data analysis and deterministic modelling. In conventional computing a model is expressed as a series of equations, which are then translated into 3GL code, such as C or Pascal, and run on a computer. NN solutions, in contrast, are trained to represent the implicit relationships and processes that are inherent within each data set. Individual networks are also able to represent different levels of generalization and can work with different types, and combinations, of input and output data, e.g. nominal data, fractal dimension and field measurements at different scales and resolutions. There are several different types of NN. The network that is of greatest interest at the moment is the feedforward multilayered perceptron (Figure 1). The basic structure is not complicated. It consists of a number of simple processing units (also known as processing elements, neurons or nodes), which are arranged in a number of different layers, and joined together to form a network. Data enter the network through the input units (left) and are then fed forward through successive layers to emerge from the output units (right). This is called a feedforward network because the flow of information is in one direction, going from input units to output units. The outer layer, where information is presented to the network, is called the input layer. The layer on the far side, where processed information is retrieved, is called the output layer. The layers between the two outer ones are called hidden layers (being hidden from direct contact with the external world). To avoid confusion, the recommended method for describing a network is based on the number of hidden layers. Figure 1, for example, is a one-hidden-layer network.

Figure 1. Basic configuration of a feedforward multilayered perceptron (Dawson and Wilby, 1998)
There are weights on each of the interconnections and it is these weights that are altered during the training process, to ensure that the inputs produce an output that is close to the desired value, with an appropriate 'training rule' being used to adjust the weights in accordance with the data that are presented to the network. The most popular mechanism for training a network is 'backpropagation of error' (Rumelhart et al., 1986; Tveter, 1996–8). Backpropagation for a multilayered network works as follows. The weighted input to each processing unit is

I_i = \sum_{j=1}^{n} w_{ij} x_j

The output from each processing element is a sigmoid function, the most common being

f(I) = \frac{1}{1 + e^{-I}}

with the derivative

f'(I) = \frac{\mathrm{d}f(I)}{\mathrm{d}I} = f(I)[1 - f(I)]

Weight updates are based on a variation of the generalized delta rule

\Delta w_{ij} = \beta E f(I) + \alpha \Delta w_{ij}^{\mathrm{previous}}

where \beta is the learning rate, E is the error, f(I) is the output from a processing unit in the previous layer (incoming transmission), \alpha is the momentum factor, and where 0.0 < \beta < 1.0 and 0.0 < \alpha < 1.0. Error for the output layer is desired output minus actual output

E_j^{\mathrm{output}} = y_j^{\mathrm{desired}} - y_j^{\mathrm{actual}}

whereas error for a hidden processing unit is derived from the error that has been passed back from each processing unit in the next forward layer. This error is weighted using the same connection weights that modified the forward output activation value, and the total error for a hidden unit is thus the weighted sum of the error contributions from each individual unit in the next forward layer. To ensure a stable mathematical solution, the total error for each unit is then multiplied by the derivative of the output activation, for that unit, in the forward pass

E_i^{\mathrm{hidden}} = \frac{\mathrm{d}f(I_i^{\mathrm{hidden}})}{\mathrm{d}I} \sum_{j=1}^{n} w_{ij} E_j^{\mathrm{output}}

which is an operation that is propagated backwards across the network. Following training, input data are then passed through the trained network in its non-training mode, where the presented data are transformed within the hidden layers to provide the modelling output values.
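To make the training cycle described above concrete, the following is a minimal NumPy sketch of one backpropagation update for a one-hidden-layer network with sigmoid units and a momentum term. It illustrates the equations only; it is not the SNNS routine used in this study, and the array shapes and parameter values are assumptions.

import numpy as np

def sigmoid(I):
    return 1.0 / (1.0 + np.exp(-I))

def train_step(x, y, W1, W2, prev_dW1, prev_dW2, beta=0.1, alpha=0.5):
    """One backpropagation update for a small feedforward network (illustrative only)."""
    # Forward pass: weighted inputs I = sum_j w_ij x_j, then sigmoid activations f(I)
    hidden = sigmoid(W1 @ x)
    output = sigmoid(W2 @ hidden)

    # Output-layer error: desired minus actual
    err_out = y - output

    # Delta terms use the sigmoid derivative f'(I) = f(I)[1 - f(I)]
    delta_out = err_out * output * (1.0 - output)
    # Hidden error: output deltas passed back through the same connection weights
    delta_hid = (W2.T @ delta_out) * hidden * (1.0 - hidden)

    # Generalized delta rule with momentum: dw = beta*E*f(I) + alpha*dw_previous
    dW2 = beta * np.outer(delta_out, hidden) + alpha * prev_dW2
    dW1 = beta * np.outer(delta_hid, x) + alpha * prev_dW1
    return W1 + dW1, W2 + dW2, dW1, dW2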

STUDY AREAS AND DATABASES


Two areas were chosen for this exercise: the Upper River Wye in Central Wales and the River Ouse in Yorkshire (Figure 2). The Upper Wye is an upland research catchment that has been used on several previous occasions for various hydrological modelling purposes (e.g. Bathurst, 1986; Quinn and Beven, 1993). The basin covers an area of some 10.55 km² and has a quick response. Hydrological data were available from the gauging station at Cefn Brwyn. The Ouse has a much larger catchment, which covers an area of 3286 km², and contains a mixed pattern of urban and rural land uses. Gauging stations are distributed throughout the catchment along each of its three main tributaries: the Nidd, Swale and Ure. Two gauging stations were chosen for this exercise: (i) Skelton, located just north of York on the River Ouse; and (ii) Kilgram, located further upstream on the River Ure. Skelton, with a downstream location, has a less flashy regime than either Kilgram or the Upper Wye and has flood types that are easier to predict.

Figure 2. Location map: (A) River Ouse catchment, (B) Upper River Wye catchment
Data were available on a 1-h time-step for the period 1984–86. The modelling data comprised: two seasonal factors (sin[CLOCK] and cos[CLOCK]); 6 h of previous flow data (FLOW t−6 to FLOW t−1); 6 h of differenced flow data (DIFF t−6 to DIFF t−1); and either FLOW, or DIFF, at time t (the value to be predicted). There were in total 14 input variables and one output variable. The CLOCK input variables, based on annual hour count, were intended to allow for variation in system output according to seasonal or annual influences, because an agricultural catchment might be expected to produce different responses in summer (drier) and winter (wetter). The selection of a 6 h historical record relates to other work where input saliency analysis has shown that this period contains the most influential variables for continuous NN river-level forecasting purposes (Abrahart et al., 1999). The variables were subjected to linear normalization between 0.1 (lowest value per variable per station) and 0.9 (highest value per variable per station). Three annual data sets were then created from each of the two main pattern files for each river gauging station. The results are reported in normalized flow units (nfu).
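As an illustration of this preparation step, the sketch below shows one way the 0.1–0.9 linear normalization, the adjacent-point DIFF series and the sin/cos CLOCK factors could be produced with NumPy. The variable names, example values and the 8760-hour year are assumptions, not the original pre-processing code.

import numpy as np

def normalize(v, lo=0.1, hi=0.9):
    """Linear rescaling of one variable to [0.1, 0.9] (per variable, per station)."""
    return lo + (hi - lo) * (v - v.min()) / (v.max() - v.min())

def clock_terms(hour_of_year, hours_in_year=8760):
    """Two seasonal factors based on an annual hour count."""
    angle = 2.0 * np.pi * hour_of_year / hours_in_year
    return np.sin(angle), np.cos(angle)

# Example: hourly flows for one station (hypothetical values)
flow = np.array([3.2, 3.1, 3.4, 5.9, 8.7, 7.2])
flow_nfu = normalize(flow)             # normalized flow units (nfu)
diff = np.diff(flow, prepend=flow[0])  # adjacent-point differences (DIFF series)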

EVALUATION MEASURES
One major problem in assessing NN solutions is the use of global statistics[ When these mechanisms are
used to model one!step!ahead predictions\ the solution will in most cases produce a high or near!perfect
{goodness!of!_t| statistic[ Such measures give no real indication of what the network is getting right or
wrong or where improvements could be made\ i[e[ for particular periods of time when the predictions were
poor[ NN solutions are designed to minimize global measures\ and a more appropriate metric that identi_es
real problems\ or between!network di}erences\ is perhaps now long overdue[ As there is no one de_nitive
evaluation test\ a multicriteria assessment was therefore carried out\ with _ve di}erent global evaluation
procedures and two di}erent storm!speci_c evaluation measures being analysed and reported in this paper[


Global evaluation measures


1. Mean absolute error (MAE)

\mathrm{MAE} = \frac{\sum_{i=1}^{N} |\mathrm{obs} - \mathrm{exp}|}{N}

2. Root mean squared error (RMSE)

\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (\mathrm{obs} - \mathrm{exp})^2}{N}}

3. Mean higher order error function (MS4E)

\mathrm{MS4E} = \frac{\sum_{i=1}^{N} (\mathrm{obs} - \mathrm{exp})^4}{N}

This statistic places emphasis on peak flow prediction (Blackie and Eeles, 1985).

4. Coefficient of efficiency (Nash and Sutcliffe, 1970; Diskin and Simon, 1977)

\%\mathrm{COE} = \left(1 - \frac{\sum_{i=1}^{N} (\mathrm{err}_i - \overline{\mathrm{err}})^2 / N}{\sum_{i=1}^{N} (x_i - \bar{x})^2 / N}\right) \times 100

5. Percentage of predictions grouped according to degree of error:

% correct predictions
% predictions 0 to 5% of observed
% predictions 5 to 10% of observed
% predictions 10 to 25% of observed
% predictions greater than ±25% of observed
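These global measures are straightforward to compute. A minimal sketch follows, assuming obs and pred are NumPy arrays of observed and predicted flows; the %COE line mirrors the equation above, and the error-band limits follow one reading of the listing above, so both are interpretations rather than the authors' exact code.

import numpy as np

def global_stats(obs, pred):
    """MAE, RMSE, MS4E, %COE and error-band percentages (illustrative, assumes obs > 0)."""
    err = obs - pred
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    ms4e = np.mean(err ** 4)  # emphasizes peak flow errors
    coe = (1.0 - np.mean((err - err.mean()) ** 2)
                 / np.mean((obs - obs.mean()) ** 2)) * 100.0
    rel = np.abs(err) / np.abs(obs) * 100.0  # relative error in percent
    bands = {
        'correct': np.mean(rel == 0) * 100,
        '0-5%':    np.mean((rel > 0) & (rel <= 5)) * 100,
        '5-10%':   np.mean((rel > 5) & (rel <= 10)) * 100,
        '10-25%':  np.mean((rel > 10) & (rel <= 25)) * 100,
        '>25%':    np.mean(rel > 25) * 100,
    }
    return mae, rmse, ms4e, coe, bands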

Event-specific evaluation measures

Global error statistics provide relevant information on overall performance but do not provide specific information about model performance at high levels of flow, which, in a flood forecasting context, is of critical importance. Two additional storm-specific evaluation procedures were therefore implemented: (i) average difference in peak prediction over all flood events, calculated using MAEpp and RMSEpp; and (ii) percentage of early, on-time, or late occurrences for the prediction of individual peaks. Both measures should, in combination with the global statistics, provide a better insight into the modelling performance of individual solutions for flood forecasting purposes.
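A hedged sketch of both storm-specific measures is given below, assuming the flood peaks have already been identified as index positions in the observed series. The peak-matching window used for the timing check is an illustrative simplification, not the procedure reported in the paper.

import numpy as np

def peak_errors(obs, pred, peak_idx):
    """MAEpp / RMSEpp over identified flood peaks."""
    diff = obs[peak_idx] - pred[peak_idx]
    return np.mean(np.abs(diff)), np.sqrt(np.mean(diff ** 2))

def peak_timing(obs, pred, peak_idx, window=6):
    """Percentage of predicted peaks that are early, on time or late."""
    timing = {'early': 0, 'on-time': 0, 'late': 0}
    for i in peak_idx:
        lo, hi = max(i - window, 0), min(i + window + 1, len(pred))
        j = lo + int(np.argmax(pred[lo:hi]))  # predicted peak nearest the observed peak
        timing['on-time' if j == i else ('early' if j < i else 'late')] += 1
    total = max(len(peak_idx), 1)
    return {k: 100.0 * v / total for k, v in timing.items()}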

ARMA MODELLING AND NAÏVE PREDICTION

The ARMA modelling (Box and Jenkins, 1976) was implemented using NPREDICT (Masters, 1995), from which appropriate statistical tools were created based on the following standard formulation

x_t = \phi_0 + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \theta_1 a_{t-1} + \theta_2 a_{t-2} + \cdots + a_t

where x_t is the predicted value, \phi_0 is a constant offset, \phi_i are weights associated with each previous observation, x_{t-i} are previous observations, \theta_i are weights associated with each previous shock, a_{t-i} are previous shocks or noise terms and a_t is the current shock.

From plots of the autocorrelation function, the mean of each time-series at each station was determined to be non-stationary. The original data were therefore detrended using single adjacent point differencing, which fulfilled the parametric requirements of this approach, and created an alternative set of modelling

patterns comprising six DIFF (t−1 to t−6) inputs and one DIFF (t) output. This formal obligation to work with DIFF data prohibited the use of FLOW input or output variables in these modified data sets and, likewise, excluded them from this part of the analysis. The ARMA solutions were then fitted to the differenced annual data for 1985, using an iterative approach to determine the optimal number of terms and weights, and tested with the differenced annual data for 1984 and 1986. The final models were ARMA[1,1] for Kilgram and the Upper Wye and ARMA[1,2] for Skelton. The [p,q] notation refers to the number of autoregressive and moving average terms that were included in each model.

Naïve prediction, or persistence, substitutes the last known figure as the current prediction and represents a good bottom-line benchmark against which other one-step-ahead predictions can be measured. Relevant evaluation measures were therefore calculated for each individual data set.
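The ARMA models in the study were built with NPREDICT; as a rough modern equivalent, the sketch below fits an ARMA(1,1) to a differenced series with the statsmodels library and also forms the naïve (persistence) benchmark. The placeholder data array and the library choice are assumptions, not part of the original analysis.

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical differenced hourly flow series standing in for the 1985 fitting data
rng = np.random.default_rng(0)
diff_1985 = np.diff(rng.random(8760))

# ARMA(p, q) is ARIMA with d = 0; ARMA[1,1] was selected for Kilgram and the Upper Wye
arma = ARIMA(diff_1985, order=(1, 0, 1)).fit()
one_step = arma.forecast(steps=1)  # one-step-ahead forecast of the next DIFF value

def naive_forecast(series):
    """Persistence benchmark: the prediction for time t is the value at t-1."""
    return series[:-1]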

NEURAL NETWORK MODELLING


The NN modelling was implemented using SNNS (SNNS Group, 1990–98). The selection of an appropriate initial architecture was problematic, because there is no single correct procedure to determine the optimum number of units or layers, although one or two 'rules of thumb' have been put forward (Sarle, 1998) and various automated growing, pruning and network breeding algorithms exist (e.g. Fahlman and Lebiere, 1990; SNNS Group, 1990–98; Braun and Ragg, 1997). The number of input and output units is fixed according to the number of variables in each training pattern. Selecting the optimal number of hidden units is, however, a different matter, which is often considered to be problem dependent. Intuition suggests that 'more is better' but this is not always the case. The number of hidden units and layers will control the power of the model to perform more complex modelling, with an associated trade-off between training time (i.e. number of epochs, in which one epoch represents one complete presentation of the training data) and model performance (i.e. validation error). The use of large hidden layers could also be counterproductive, because an excessive number of free parameters will encourage overfitting of the network solution to the training data, and so reduce the generalization capabilities of the final product. The other question that needs to be addressed concerns the number of hidden layers and the balance of hidden units between these layers. Theoretical results have shown that a one-hidden-layer feedforward network is capable of approximating any measurable function to any desired degree of accuracy, and that if errors occur, these will be due to inadequate learning or too few hidden units, or because the data contain an insufficient deterministic relationship (Hornik et al., 1989). There remains some debate about this theoretical justification, however, and there might well be some advantage in considering the use of two hidden layers, in order to provide an additional degree of representational power (Openshaw and Openshaw, 1997).

Trial and error, based on systematic studies using different architectures and different training procedures, is still the preferred choice of most users (e.g. Fischer and Gopal, 1994) and is the method of selection that was adopted in this work. SNNS was therefore used to construct 12 different one-hidden-layer and 12 different two-hidden-layer feedforward networks with a range of 6 to 72 hidden units in each set. These networks all had 14 input units and one output unit. The networks were trained to predict first FLOW and then DIFF values using the 1985 pattern sets for each individual station. The stopping condition for each run was set at 800 epochs and trained networks were saved at 100 epoch intervals. Training was undertaken with 'enhanced backpropagation' using decreasing levels of learning and momentum. The three relevant annual data sets were then passed through each saved network and a set of annual error statistics computed. The final results for the different architectures were all quite similar and no clear-cut overall trend could be observed between them. This suggests that the use of additional hidden units had little or no real impact on the end result and that most simple networks of modest size are able to provide an acceptable solution. Moreover, no substantial difference could be found between the one-hidden-layer and two-hidden-layer networks, which accords with other recent work in which the benefits of using a second hidden layer were considered marginal to the rainfall–runoff modelling problem (Minns and Hall, 1996).

As no optimal architecture emerged, two representative one-hidden-layer networks were chosen for more


extensive training, comprising 14:6:1 and 14:12:1 configurations. Both networks were trained to predict first FLOW and then DIFF values using the 1985 pattern sets for each individual station. The stopping condition for each run was increased to 4000 epochs and in each case there was again little difference between the final error statistics. The 14:6:1 solutions, which had the simpler architecture, were therefore selected for full numerical testing and analysis.
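Outside SNNS, a comparable trial-and-error search over one-hidden-layer sizes could be sketched with scikit-learn's MLPRegressor, as below. The hidden-unit range and training settings merely echo the figures reported above; they are illustrative assumptions, not the configuration used in the study.

import numpy as np
from sklearn.neural_network import MLPRegressor

def architecture_search(X_train, y_train, X_val, y_val):
    """Try a range of one-hidden-layer sizes and report validation RMSE for each."""
    results = {}
    for n_hidden in (6, 12, 24, 48, 72):
        net = MLPRegressor(hidden_layer_sizes=(n_hidden,),
                           activation='logistic',   # sigmoid units
                           solver='sgd', momentum=0.5,
                           learning_rate_init=0.1,
                           max_iter=800, random_state=1)
        net.fit(X_train, y_train)
        pred = net.predict(X_val)
        results[n_hidden] = float(np.sqrt(np.mean((y_val - pred) ** 2)))
    return results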

RESULTS
This section summarizes the main results, using an analysis of aggregated graphs, from which important relative differences between the stations and the models can be observed. Tables that contain a comprehensive set of numerical statistics can be found on the world-wide web in Abrahart and See (1998).

Global evaluation statistics


RMSE statistics for the training and validation data, which provide a general illustration of the overall results, are depicted in Figure 3. The MAE and MS4E measures showed similar results to RMSE, except for the effect of 15 or so large underpredictions in the validation data from the NN-FLOW model for the Upper Wye, which was accentuated in the MS4E results and minimized in the MAE. Two main trends can be observed with respect to annual data:

1. training data error statistics for the Upper Wye were a little higher than those for Kilgram, whereas those for Skelton were much lower;
2. validation data error statistics showed a more progressive differentiation, and were greatest for the Upper Wye, then Kilgram, and then Skelton.

This pattern is thought to be a direct reflection of the hydrological characteristics at each station, with steep rising limbs and spiked peaks, on the flashier catchments, being more difficult to learn, and producing an inferior result in proportion to the flashiness of the catchment when these modelling solutions were transferred to their validation data periods. In terms of relative performance between the different modelling solutions, the NN and ARMA forecasts were quite similar at each station, whereas naïve prediction often produced a much higher error.

Figure 3. RMSE calculated on annual data for the Upper Wye, Kilgram and Skelton

Figure 4. %COE calculated on annual data for the Upper Wye, Kilgram and Skelton (limited graphical range used to obtain maximum differentiation)
%COE statistics for each station are shown in Figure 4. This measure generated between-station differences with a relative pattern that was similar to the one produced using the other statistics. As this particular evaluation measure assesses a positive, as opposed to a negative, attribute, the vertical pattern is reversed. The training and validation data exhibited similar results, most of the reported efficiencies were quite high, and the actual differences were quite small. Kilgram and Skelton are, however, this time more similar and the Upper Wye more distinct. Efficiencies associated with the NN-DIFF predictions were also lower than those recorded for the other three solutions, and the magnitude of these differences is observed to increase in proportion to falling efficiencies associated with the poor modelling of quicker catchment responses.

Figure 5 depicts the spread of prediction errors and highlights the fact that similar levels of error were found to occur within the training and validation data sets across all models and locations. Most predictions were within 5% of the true value, with minor percentages occurring in the high error bands. Naïve prediction, in all cases, had the largest percentage of correct predictions. This reflects a large number of low flow situations in which there was no change over time. The NN and ARMA solutions produced a limited number of correct predictions, with the ARMA solutions doing somewhat better on the Upper Wye. The greatest errors at each station occurred in proportion to the flashiness of the river at each location, with the highest number of large errors being produced from naïve prediction. The ARMA solutions produced fewer exceptional errors for the Upper Wye, whereas the NN solutions produced fewer exceptional errors for Kilgram and Skelton.

Figure 5. Error in annual prediction for all three stations visualized according to percentage class groups
Event-specific evaluation statistics

Kilgram had the highest number of storm events for this period (1984:1985:1986), which was 74 (21:20:33). The Upper Wye had 69 (9:22:38) and Skelton had 58 (18:14:26). Error statistics associated with peak prediction are much higher than those computed on the complete hydrographic record, because this subset of the data contained a larger proportion of extreme responses, which are more difficult to estimate. RMSE for peak prediction is depicted in Figure 6; MAEpp showed similar results to RMSEpp. The overall pattern, in both cases, is also consistent with the results obtained from global testing with annual data sets: the Upper Wye had the highest levels of error, then Kilgram, and then Skelton. Two other patterns can also be observed with respect to peak prediction:

1. training data error statistics now show a more progressive differentiation between the stations, although the Upper Wye errors are much higher, in comparison with those for both Kilgram and Skelton;
2. validation data error statistics now show little or no progressive differentiation: large errors were computed for the Upper Wye, with much lower errors for Kilgram and Skelton, which are both of a similar magnitude.

These findings suggest that global measures are not good indicators of peak prediction, owing to the overwhelming presence of a large number of low flow situations, which are easier to predict. Shifting the focus to peak prediction has also highlighted significant variation in the forecasting power of the two flashier modelling solutions, and produced clear differentiation between the station on the Upper Wye (more rapid response) and that at Kilgram (less rapid response). Naïve prediction obtained better relative performance statistics with respect to the other solutions and in comparison with the differences produced from testing based on annual data sets. The NN and ARMA forecasts were still quite similar, although the ARMA forecasts for Kilgram generated higher error than the other three predictors, and the validation data with respect to the NN-FLOW model for the Upper Wye were once again a problematic outlier. Detailed statistical analysis, based on this extracted subset, has thus once again indicated that there is no substantial difference in forecasting potential between the two main modelling techniques, and that important aspects of observed behaviour had more to do with the problem of peak flow prediction than the estimation power of individual solutions.

Figure 6. RMSE calculated on peak prediction for the Upper Wye, Kilgram and Skelton
Figure 7 shows the percentage of peak predictions that were early, on-time or late. Naïve predictions were not included because these are always late. Skelton had the largest percentage of late predictions in the training data, followed by the Upper Wye, and then Kilgram. The training and validation data sets produced similar results for the Upper Wye and Kilgram, whereas the validation data for Skelton were the better of the two, and had a larger percentage of correct predictions. However, these improved statistical results for Skelton could be due to small number effects, because there were fewer storms in 1985 (training data) than in either 1984 or 1986 (validation data).

Figure 7. Error in peak prediction for all three stations visualized according to timing of event

IMPROVING THE NEURAL NETWORK SOLUTION: SOM-BASED MULTINETWORK MODELLING
Thus far, NN solutions have been compared with ARMA forecasters and naïve predictions, all trained and validated on annual data sets. The NN solutions and ARMA forecasters were observed to provide a similar level of performance, which was better than naïve prediction. The ARMA model building was, however, far more tedious than NN model building, requiring a substantial amount of hands-on iterative experimentation, with graphical analysis of residual error at each stage. Moreover, subjective decisions were required at various points, for instance on items such as which terms should be included within, or excluded from, the final equation. From a model building perspective the automated NN procedure is therefore simpler and much quicker to implement. However, it is also appropriate to consider alternative methods or approaches, through which improved performance could be achieved. There are at least two possibilities. The first method involves using additional inputs in the model building process and is applicable to both NN and ARMA forecasters, e.g. river level data from upstream stations, or other relevant hydrological or meteorological information. The second method involves the implementation of a multimodel approach, which has been shown to provide improved performance with respect to alternative forecasts at Skelton (See et al., 1997). The original river flow data are first grouped or clustered into distinct hydrological event types, where an event is taken to mean a short section of the hydrograph record, and each individual cluster is then modelled as an independent item within a set of such models. This type of modelling, however, cannot be performed with the ARMA technique, because the nature of this statistical method is such that it requires continuous time-series data as opposed to out-of-sequence event-related groupings.
Statistical methods could be used to perform this classification. The neural network alternative would be a self-organizing map (SOM) (Kohonen, 1995). This network algorithm is based on unsupervized classification, in which the processing units compete against each other to discover important relationships that exist within the data, with no prior knowledge. The traditional architecture contains two layers of processing units, a one-dimensional input layer and a two-dimensional competitive layer. The competitive layer, or feature map, is organized into a regular grid of processing units and each unit in the input layer is connected to each unit in the competitive layer. The feature map has connections between the competitive units and each competitive unit also has one or more additional weights, or reference vectors, which will be trained to represent the fundamental pattern associated with each class group. Training consists of random weight initialization, presenting a data pattern to the network and determining which unit has the closest match, then updating both the winning unit and those around it. This process is repeated over numerous epochs until a stopping condition is reached. The training rule is:

\Delta w_i = \beta (x_i - w_i^{\mathrm{old}})

where w_i is the weight on the ith reference vector, \beta is the learning rate, x_i is the transmission along the ith weighted reference vector, and where 0.0 < \beta < 1.0.
The winning unit and its neighbours will adapt their reference vectors to better fit the current pattern, in proportion to the strength of the learning coefficient, whereas other units are either inhibited or experience no learning whatsoever. Lateral interaction is introduced between neighbouring units within a certain distance using excitors; beyond this area, a processing unit either inhibits the response of other processing units, e.g. Mexican Hat Function (Caudill and Butler, 1992, p. 84), or does not influence them at all, e.g. Square Block Function (Openshaw, 1994, p. 64). The weight adjustment of the neighbouring units is instrumental in preserving the topological ordering of the input space. The neighbourhood for updating is then reduced, as is the learning coefficient, in two broad stages: a short initial training phase in which a feature map is trained to reflect the coarser and more general details, and a much longer fine-tuning stage in which the local details of the organization are refined. This process continues until the network has stabilized and the weight vectors associated with each unit define a multidimensional partitioning of the input data.
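A minimal NumPy sketch of this competitive training loop (random initialization, best-matching unit, square-block neighbourhood update with a shrinking radius and learning rate) is given below. It is illustrative only and far simpler than the SOM software actually used; the grid size, epoch count and decay schedule are assumptions.

import numpy as np

def train_som(data, grid=8, epochs=100, beta0=0.5, radius0=4.0, seed=0):
    """Train a grid x grid self-organizing map on the rows of `data` (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    weights = rng.random((grid, grid, n_features))  # random weight initialization
    coords = np.dstack(np.meshgrid(np.arange(grid), np.arange(grid), indexing='ij'))

    for epoch in range(epochs):
        frac = epoch / epochs
        beta = beta0 * (1.0 - frac)                 # decaying learning rate
        radius = max(radius0 * (1.0 - frac), 1.0)   # shrinking neighbourhood
        for x in rng.permutation(data):
            # Best-matching unit: closest reference vector to the presented pattern
            dist = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(dist), dist.shape)
            # Update winner and neighbours: dw_i = beta * (x_i - w_i_old)
            grid_dist = np.linalg.norm(coords - np.array(bmu), axis=2)
            mask = (grid_dist <= radius)[..., None]  # square-block neighbourhood
            weights += mask * beta * (x - weights)
    return weights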
The partitioning was implemented using SOM software (NNRC, 1998). Feature maps of 2×2, 4×4, 6×6 and 8×8 units were examined using various sets of data. The 14 input variables from the original network modelling operation produced final clusters differentiated according to season and not according to differing level behaviour. The use of adjacent-point differences in river flow levels also added little to the clustering process; hence the final input data chosen for the classification exercise were FLOWs t−1 to t−6. The best results were produced using an 8×8 SOM (64 clusters). This gave reasonable differentiation between event behaviours at high levels of flow. It also produced a large number of similar events at low levels of flow. Figure 8 illustrates some of the different types of event behaviour that were differentiated over the 6 h time period for Kilgram, using data for all 3 years. To facilitate a better presentation, the profiles have been forced through the origin, and all clusters with near-identical behaviour have been omitted from the plot. The three main types of hydrograph event can be seen in this diagram, comprising flat, rising and falling behaviours. Each of these items can be further partitioned into low, medium and high flow situations.

Figure 8. SOM classification of different event types at Kilgram

To examine the potential benefits of this data-splitting technique a dedicated 8×8 classifier was produced for each station. To ensure a sufficient number of cases in each cluster, for subsequent training purposes, individual clusters were created from the complete 3-year record. The two most prevalent rising events at each station were then identified for modelling purposes, from the complete set of events, as shown in Figure 8. Table I lists the total number of cases in each of these rising clusters. Six trained NN solutions (two cluster types for three stations) were then developed using identical network architectures and training procedures to those reported earlier, which enabled comparison. RMSE, MAE and %COE statistics were calculated on the network outputs. Corresponding ARMA forecasts and naïve predictions were then extracted for the relevant subsets and likewise assessed. These values are also listed in Abrahart and See (1998).

Table I. Number of cases in each rising event cluster

Station       Rising cluster 1    Rising cluster 2
Kilgram       180                 110
Skelton       75                  89
Upper Wye     94                  107
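The multinetwork step could be organized along the lines of the following sketch: each pattern is routed to its SOM cluster and a dedicated network is then fitted for each selected rising cluster. The helper names, the MLPRegressor settings and the assumption that the first six inputs are the FLOW t−1 to t−6 variables carry over from the earlier sketches and are illustrative only.

import numpy as np
from sklearn.neural_network import MLPRegressor

def assign_cluster(weights, x):
    """Index of the SOM unit whose reference vector best matches pattern x."""
    dist = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(np.argmin(dist), dist.shape)

def train_cluster_models(weights, X, y, selected_clusters):
    """Fit one dedicated one-hidden-layer network per selected (e.g. rising) cluster."""
    labels = np.array([assign_cluster(weights, x[:6]) for x in X])  # cluster on FLOW inputs
    models = {}
    for cluster in selected_clusters:
        mask = np.all(labels == cluster, axis=1)
        net = MLPRegressor(hidden_layer_sizes=(6,), activation='logistic',
                           solver='sgd', max_iter=800, random_state=1)
        models[cluster] = net.fit(X[mask], y[mask])
    return models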
RMSE statistics are shown in Figure 9 and %COE statistics in Figure 10. The error values and recorded efficiencies indicated potential improvement in NN performance, over and above that produced from the ARMA equations, on both rising event clusters. Enabling the modelling procedure to concentrate on a small, well-defined task, rather than the entire spectrum of global hydrograph behaviours, has therefore facilitated better approximation with respect to the rising limb of the hydrograph, although the actual statistics that were produced are in fact much poorer than their global counterparts.

Figure 9. RMSE calculated on first high-level rising event cluster at Kilgram

Figure 10. %COE calculated on first high-level rising event cluster at Kilgram

CONCLUSIONS
1. The NN solutions produced global error statistics that were similar to, and sometimes better than, those of a standard statistical time-series predictor using common data inputs for two contrasting catchments. The final decision on which technique is better suited to the modelling operation described in this paper must therefore rest on alternative factors.

2. The main distinction between the two techniques that have been evaluated was in the level of user input required and the speed of model building. The NN solutions were less demanding in terms of subjective testing and thus much faster to construct, which are important real world considerations.

3. Multinetwork modelling, using a separate solution for distinct individual hydrological event types, provided improved performance on two rising event clusters and appears to offer considerable scope and promise for future developments in applied operational forecasting.

ACKNOWLEDGEMENTS

SNNS (Stuttgart Neural Network Simulator) was developed in the Institute for Parallel and Distributed High Performance Systems at the University of Stuttgart. The SOM package was developed in the Laboratory of Computer and Information Science at the Helsinki University of Technology. Upper River Wye data were collected by the UK Institute of Hydrology. River Ouse data were provided by the UK Environment Agency.

REFERENCES

Abrahart RJ. 1999. Neurohydrology: implementation options and a research agenda. Area 31(2): 141–149.
Abrahart RJ, See L. 1998. Neural network vs. ARMA modelling: constructing benchmark case studies of river flow prediction. GeoComputation '98: Proceedings Third International Conference on GeoComputation, University of Bristol, 17–19 September. http://www.geog.port.ac.uk/geocomp/geo98/05/gc_05.htm
Abrahart RJ, See L. 1999. Fusing multi-model hydrological data. IJCNN '99: Proceedings International Joint Conference on Neural Networks, Washington DC, 10–16 July [CD-ROM].
Abrahart RJ, White S. In press. Modelling sediment transfer in Malawi: comparing backpropagation neural network solutions against a multiple linear regression benchmark using small data sets. Physics and Chemistry of the Earth.
Abrahart RJ, See L, Kneale PE. 1999. Applying saliency analysis to neural network rainfall-runoff modelling. GeoComputation '99: Proceedings Fourth International Conference on GeoComputation, Mary Washington College, Fredericksburg, Virginia, 25–28 July [CD-ROM].
Bathurst J. 1986. Sensitivity analysis of the Systeme Hydrologique Europeen for an upland catchment. Journal of Hydrology 87: 103–123.
Blackie JR, Eeles WO. 1985. Lumped catchment models. In Hydrological Forecasting, Anderson MG, Burt TP (eds). Wiley: Chichester; 311–345.
Box GEP, Jenkins GM. 1976. Time Series Analysis: Forecasting and Control. Holden-Day: Oakland, CA.
Braun H, Ragg T. 1997. ENZO: User Manual and Implementation Guide, Version 1.0. Institute for Logic, Complexity and Deduction Systems, University of Karlsruhe: Karlsruhe.
Campolo M, Andreussi P, Soldati A. 1999. River flood forecasting with a neural network model. Water Resources Research 35: 1191–1197.
Caudill M, Butler C. 1992. Understanding Neural Networks: Computer Explorations, Vol. 1, Basic Networks. MIT Press: Cambridge, MA.
Cheng X, Noguchi M. 1996. Rainfall-runoff modelling by neural network approach. Proceedings International Conference on Water Resources and Environment Research: Towards the 21st Century, Kyoto, Japan, 29–31 October 1996, 2: 143–150.
Daniell TM. 1991. Neural networks: applications in hydrology and water resources engineering. Proceedings, International Hydrology and Water Resources Symposium, Vol. 3. National Conference Publication 91/22, Institute of Engineering, Australia: Barton, ACT; 797–802.
Dawson CW, Wilby RE. 1998. An artificial neural network approach to rainfall-runoff modelling. Hydrological Sciences Journal 43: 47–66.
Diskin MH, Simon E. 1977. A procedure for the selection of objective functions for hydrological conceptual models. Journal of Hydrology 34: 129–149.
Fahlman SE, Lebiere C. 1990. The cascade-correlation learning architecture. In Advances in Neural Information Processing Systems, Vol. 2, Touretzky D (ed.). Morgan Kaufmann: San Mateo, CA; 524–532.
Fischer MM, Gopal S. 1994. Artificial neural networks: a new approach to modelling interregional telecommunication flows. Journal of Regional Science 34: 503–527.
Fortin V, Ouarda TBMJ, Bobee B. 1997. Comment on 'The use of artificial neural networks for the prediction of water quality parameters' by H. R. Maier and G. C. Dandy. Water Resources Research 33: 2423–2424.
French MN, Krajewski WF, Cuykendall RR. 1992. Rainfall forecasting in space and time using a neural network. Journal of Hydrology 137: 1–31.
Hillel D. 1986. Modeling in soil physics: a critical review. In Future Developments in Soil Science Research. Soil Science Society of America: Madison, WI; 35–42.
Hornik K, Stinchcombe M, White H. 1989. Multilayer feedforward networks are universal approximators. Neural Networks 2: 359–366.
Hsu K-L, Gupta HV, Sorooshian S. 1995. Artificial neural network modeling of the rainfall-runoff process. Water Resources Research 31: 2517–2530.
Jakeman AJ. 1993. How much complexity is warranted in a rainfall-runoff model? Water Resources Research 29: 2637–2649.
Kang KW, Park CY, Kim JH. 1993. Neural network and its application to rainfall-runoff forecasting. Korean Journal of Hydrosciences 4: 1–9.
Karunanithi N, Grenney WJ, Whitley D, Bovee K. 1994. Neural networks for river flow prediction. Journal of Computing in Civil Engineering 8: 201–220.
Kohonen T. 1995. Self-Organizing Maps. Springer-Verlag: Heidelberg.
Lorrai M, Sechi GM. 1995. Neural nets for modelling rainfall-runoff transformations. Water Resources Management 9: 299–313.
Maier HR, Dandy GC. 1996. The use of artificial neural networks for the prediction of water quality parameters. Water Resources Research 32(4): 1013–1022.
Maier HR, Dandy GC. 1997. Reply. Water Resources Research 33(10): 2425–2427.
Masters T. 1995. Neural, Novel and Hybrid Algorithms for Time Series Prediction. Wiley: New York.
Minns AW. 1996. Extended rainfall-runoff modelling using artificial neural networks. In Hydroinformatics '96: Proceedings 2nd International Conference on Hydroinformatics, Zurich, Switzerland, 9–13 September 1996, Vol. 1, Muller A (ed.). A. A. Balkema: Rotterdam; 207–213.
Minns AW. 1998. Modelling of 1-D pure advection processes using artificial neural networks. In Hydroinformatics '98: Proceedings Third International Conference on Hydroinformatics, Copenhagen, Denmark, 24–26 August, Vol. 2, Babovic V, Larsen CL (eds). A. A. Balkema: Rotterdam; 805–812.
Minns AW, Hall MJ. 1996. Artificial neural networks as rainfall-runoff models. Hydrological Sciences Journal 41: 399–417.
Minns AW, Hall MJ. 1997. Living with the ultimate black box: more on artificial neural networks. Proceedings Sixth National Hydrology Symposium, University of Salford, 15–18 September; 9.45–9.49.
Nash JE, Sutcliffe JV. 1970. River flow forecasting through conceptual models. Journal of Hydrology 10: 282–290.
NNRC. 1998. Neural Network Research Centre. http://www.cis.hut.fi/nnrc/nnrc-programs.html
Openshaw S. 1994. Neuroclassification of spatial data. In Neural Nets: Applications in Geography, Hewitson BC, Crane RG (eds). Kluwer Academic Publishers: Dordrecht; 53–70.
Openshaw S, Openshaw C. 1997. Artificial Intelligence in Geography. Wiley: Chichester.
Quinn PF, Beven KJ. 1993. Spatial and temporal predictions of soil moisture dynamics, runoff, variable source areas and evapotranspiration for Plynlimon, Mid-Wales. Hydrological Processes 7: 425–448.
Rumelhart DE, Hinton GE, Williams RJ. 1986. Learning internal representations by error propagation. In Parallel Distributed Processing: Explorations in the Microstructures of Cognition, Vol. 1, Rumelhart DE, McClelland JL (eds). MIT Press: Cambridge, MA; 318–362.
Sarle WS. 1998. FAQ document for Usenet newsgroup 'comp.ai.neural-nets'. ftp://ftp.sas.com/pub/neural/FAQ.html [December 1998].
See L, Corne S, Dougherty M, Openshaw S. 1997. Some initial experiments with neural network models of flood forecasting on the River Ouse. GeoComputation '97: Proceedings 2nd International Conference on GeoComputation, University of Otago, Dunedin, New Zealand, 26–29 August.
Smith J, Eli RN. 1995. Neural-network models of rainfall-runoff process. Journal of Water Resources Planning and Management 121: 499–509.
SNNS Group. 1990–98. Stuttgart Neural Network Simulator: User Manual, Version 4.1. http://www-ra.informatik.uni-tuebingen.de/SNNS/
Tveter D. 1996–8. Backpropagator's Review. http://www.mcs.com/~drt/bprefs.html
Yang R. 1997. Application of neural networks and genetic algorithms to modelling flood discharges and urban water quality. Unpublished PhD Thesis, Department of Geography, University of Manchester.

