
Information Sciences 121 (1999) 201–215

www.elsevier.com/locate/ins

Adaptive interaction and its application to neural networks q

Robert D. Brandt (a), Feng Lin (b,*)

(a) Intelligent Devices, Inc., 465 Whittier Ave., Glen Ellyn, IL 60137, USA
(b) Electrical and Computer Engineering, Wayne State University, 5050 Anthony Wayne Drive, Detroit, MI 48202, USA

Received 26 June 1998; received in revised form 2 December 1998; accepted 2 April 1999

Abstract
Adaptive interaction is a new approach to introducing adaptability into man-made systems. In this approach, a system is decomposed into interconnected subsystems that we call devices, and adaptation occurs in the interactions. More precisely, the interaction weights among these devices are adapted in order to minimize a given cost function. The adaptation algorithm developed is mathematically equivalent to a gradient descent algorithm but requires only local information in its implementation. One particular application of adaptive interaction that we study in this paper is in neural networks. By applying adaptive interaction, we can achieve essentially the same adaptation as the well-known back-propagation algorithm but without the need for a feedback network to propagate the errors, which has many advantages in practice. A simulation is provided to show the effectiveness of our approach. © 1999 Elsevier Science Inc. All rights reserved.

Keywords: Adaptive interaction; Neural network; Back-propagation

1. Introduction
Adaptation is one of the most important mechanisms in living organisms (or
natural systems) [1,2,12,26]. Take human beings for example. Suppose that the
q This research is supported in part by the National Science Foundation under grants ECS-9315344 and INT-9602485.
* Corresponding author. Fax: +1-313-577-1101.
E-mail address: flin@ece.eng.wayne.edu (F. Lin).
0020-0255/99/$ - see front matter © 1999 Elsevier Science Inc. All rights reserved.
PII: S0020-0255(99)00090-0

Detroit Pistons have just recruited a fresh new basketball star. In order for him to become the second Grant Hill, he must practice with his teammates and cooperate with other players by adapting his play. Another example is the Castros family, who performed a seven-person pyramid on a high wire at Detroit's Hart Plaza. They maintained the human pyramid while walking over a suspended cable. Needless to say, a great deal of adaptation must take place before this can be done. As a matter of fact, the brothers, sisters, and nephews had been practicing and adapting this act for more than seven years.
These two examples (and many more can be given if we wish) show that adaptation occurs naturally and constantly in natural systems. This, unfortunately, cannot be said for man-made systems. We are still waiting to see an airplane that can adapt like a bird on its own (or a car, or a train, for that matter). Only very few man-made systems have the built-in capability of adaptation.

We suspect that this lack of adaptability in man-made systems is due to the lack of understanding of adaptation mechanisms. This lack of understanding has resulted in some ``unnatural'' approaches to adaptation in man-made systems. Let us consider, for example, adaptive control systems [14-16], which are perhaps the most commonly used man-made adaptive systems. For an adaptive control system to work, we must first develop a model of the system to be controlled from physical laws; we must then identify the unknown parameters of the system by some elaborate identification scheme; and finally we will use some sophisticated synthesis methods to adjust the parameters of the controller so that it can adapt.
Obviously, natural systems do not adapt in this way. Grant Hill does not need to know the dynamics behind the trajectory of his basketball, nor does he need to estimate parameters of his teammates. As a matter of fact, he does not even have a model! However, he adapted and became successful. Similarly, the Castros family may not even know Newton's law of gravity, but they managed to perform the pyramid for seven years without a fall.
Therefore, we submit that our view of adaptation must be modified (i.e., adapted). Adaptation in man-made systems must be made more ``natural''. We must learn from the adaptations of natural systems. In fact, such attempts have been made in the past with success. One example is the use of (artificial) neural networks. Inspired by our own brain, neural networks incorporate adaptation mechanisms that make them work wonderfully in many engineering applications, especially when the model of a system is unknown or difficult to obtain [10,11,23-25,27,28].
A question of great interest is thus the following: Are such adaptation mechanisms unique to neural networks? That is, can devices other than neurons adapt in a similar manner that requires no precise modeling and identification? In other words, is the neuron a unique creation of evolution or merely a biological convenience? We will show that the answers to the above questions are no, yes, and no, respectively. As a matter of fact, we will show that any
devices that are interconnected and interacting can adapt by adjusting their interactions, much like neurons adjusting their synapses. This is true for dynamic or static systems, and for linear or nonlinear systems [7,20].

One feature of our approach of adaptive interaction is the decomposition of a complex adaptive system into subsystems, which we call devices, and their interactions via connections. We assume that adaptation occurs in the interactions. This is done without loss of generality, because the partition into devices and interactions is arbitrary and can be specified by the user. We model a device by a general (causal) mapping from its input to its output. Thus, we can handle linear and nonlinear systems in the same manner.
The result of our adaptation algorithm¹ is essentially equivalent to that of gradient descent. However, our algorithm is implemented locally. In other words, the adaptation of an interaction is based on information available locally (that is, in the devices that the interaction connects to and from). This is possible because our algorithm does not attempt to calculate the gradient directly, as the direct calculation may require global information. Rather, we infer the gradient from locally available information. As we will show, this localization is not only convenient but also has important implications for its application in neural networks.
We have successfully applied this approach of adaptive interaction to adaptive control and system identification [18,19]. These applications resulted in methods that are very different from the traditional methods. In particular, a self-tuning method for PID controllers based on adaptive interaction was developed in [18]. The advantages of this tuning method include: (1) It is very simple and can be easily implemented. (2) It requires virtually no knowledge of the plant. (3) It works for nonlinear as well as linear systems. (4) It is automatic and requires no human intervention. (5) It works on-line as well as off-line. (6) Stability is guaranteed after convergence. (7) The initial system can be stable or unstable.
In this paper, we will apply the approach of adaptive interaction to neural networks. Before our approach, the back-propagation algorithm was used to adapt the synapses (that is, the interactions) in a neural network. To use the back-propagation algorithm, a dedicated companion (feedback) network to propagate the error back is required. This may complicate implementations of the back-propagation algorithm, especially hardware implementations. On the other hand, using adaptive interaction, we can eliminate the need for such a feedback network, and hence significantly reduce the complexity of adaptation for complex neural networks. This is particularly important in VLSI implementations of neural networks [9,13,17,21,29]. The absence of the feedback

¹ Here we use the word ``algorithm'' in a generalized sense to mean a (mathematical) description or model for calculating and updating system parameters.

network means that adding trainability to a chip design does not involve additional wiring-layout complexity between neurons. A trainable neuron can be
designed as a standard unit without considering network topology. These
trainable neurons can then be connected in any way the designer wants. Obviously, this increases the potential for designing networks with dynamically
reconfigurable topologies.
Furthermore, our adaptation algorithm also has an important implication for the biological plausibility of similar adaptations occurring in biological neurons. Since the back-propagation algorithm was proposed in the 1980s, researchers
have speculated about whether an analogous adaptation mechanism might be
observed in biological neural systems [30]. The consensus among neuroscientists is that this is not likely [8]. The main reason for this belief is that the
requirement of a separate feedback network is unlikely to be met in a biological neural system. This is not to say that reciprocal connections are rare in
biological neural systems (in fact they are ubiquitous); but rather that it is
unlikely that a biological neural system could satisfy the strict requirement that there be a one-to-one correspondence between connections in the feedforward and feedback networks, and that corresponding connections in the two networks maintain identical weights even as they adapt. This seems even less
likely given the fact that in most biological neural systems a connection between two neurons is composed of many (sometimes hundreds of) synapses. With the
elimination of the feedback network, the problem of biological plausibility of
similar adaptation occurring in biological neurons may need to be reinvestigated.
2. Adaptive interaction
Adaptive interaction considers a complex system consisting of N subsystems which we call devices. Each device (indexed by n ∈ 𝒩 := {1, 2, ..., N}) has an integrable output signal y_n and an integrable input signal x_n. The dynamics of each device are described by a (generally nonlinear) causal² functional

$$F_n : X_n \to Y_n, \quad n \in \mathcal{N},$$

where X_n and Y_n are the input and output spaces, respectively. That is, the output y_n(t) of the nth device relates to its input x_n(t) by

$$y_n(t) = (F_n \circ x_n)(t) = F_n(x_n)(t), \quad n \in \mathcal{N},$$

where ∘ denotes composition.


² A functional F_n : X_n → Y_n is causal if y_n(t) depends only on the previous history of x_n, i.e., on {x_n(s) : s ≤ t}.


Fig. 1. A typical decomposition of a system for adaptive interaction.

We assume that the Fréchet derivative of F_n exists.³ We further assume that each device is a single-input single-output system.
An interaction between two devices consists of a (generally non-exclusive) functional dependence of the input of one device on the outputs of the others, and is mediated by an information-carrying connection denoted by c. The set of all connections is denoted by C. We assume that there is at most one connection from one device to another. Let pre(c) be the device whose output is conveyed by connection c and post(c) the device whose input depends on the signal conveyed by connection c. We denote the set of input connections of the nth device by I_n = {c ∈ C : post(c) = n} and the set of output connections by O_n = {c ∈ C : pre(c) = n}. A typical system is illustrated in Fig. 1. In the figure, for example, the set of input connections of Device 2 is I_2 = {c_1, c_3} and the set of output connections is O_2 = {c_4}. Also, c_1 connects Devices 1 and 2; therefore pre(c_1) = 1, post(c_1) = 2.
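This bookkeeping is easy to mirror in code. The sketch below is our own illustration: each connection c is stored as a pair (pre(c), post(c)), and the sets I_n and O_n are recovered from the table. Only the facts stated for Fig. 1 are used for Device 2; the endpoints of c_3 and c_4 that the text does not fix are assumed here for illustration.

```python
# Each connection c is stored as c -> (pre(c), post(c)).
# From Fig. 1 we know pre(c1) = 1, post(c1) = 2, I_2 = {c1, c3}, O_2 = {c4};
# the remaining endpoints below are assumptions for illustration only.
connections = {
    "c1": (1, 2),
    "c3": (3, 2),   # post(c3) = 2 is stated; pre(c3) = 3 is assumed
    "c4": (2, 4),   # pre(c4) = 2 is stated; post(c4) = 4 is assumed
}

def inputs(n, connections):
    """I_n = {c in C : post(c) = n} -- connections feeding device n."""
    return {c for c, (pre, post) in connections.items() if post == n}

def outputs(n, connections):
    """O_n = {c in C : pre(c) = n} -- connections leaving device n."""
    return {c for c, (pre, post) in connections.items() if pre == n}

print(inputs(2, connections))   # {'c1', 'c3'}
print(outputs(2, connections))  # {'c4'}
```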
For the purposes of this paper, we consider only linear interactions; that is, we assume that the input to a device is a linear combination of the outputs of other devices via the connections in I_n, plus possibly an external input signal u_n(t):

$$x_n(t) = u_n(t) + \sum_{c \in I_n} a_c\, y_{\mathrm{pre}(c)}(t), \quad n \in \mathcal{N},$$

where a_c are the connection weights.


With this linear interaction, the dynamics of the system are described by

$$y_n(t) = F_n\!\left[u_n(t) + \sum_{c \in I_n} a_c\, y_{\mathrm{pre}(c)}(t)\right], \quad n \in \mathcal{N}. \qquad (1)$$

To simplify the notation, in the rest of the paper we will omit, when appropriate, the explicit reference to the time t.
³ The Fréchet derivative [22], F'_n(x), of F_n(x) is defined as a functional such that

$$\lim_{\|\Delta\| \to 0} \frac{\|F_n(x + \Delta) - F_n(x) - F'_n(x) \circ \Delta\|}{\|\Delta\|} = 0.$$


The goal of our approach is to develop an algorithm to adapt the connection weights a_c so that some performance index E(y_1, ..., y_N) is minimized. The only assumption we make to ensure the correctness of our adaptation algorithm is that the following equation:

$$\dot a_c = \left( \sum_{s \in O_{\mathrm{post}(c)}} a_s \dot a_s\, \frac{\dfrac{dE}{dy_{\mathrm{post}(s)}} \circ F'_{\mathrm{post}(s)}(x_{\mathrm{post}(s)})}{\dfrac{dE}{dy_{\mathrm{post}(s)}} \circ F'_{\mathrm{post}(s)}(x_{\mathrm{post}(s)}) \circ y_{\mathrm{post}(c)}} - \gamma \frac{\partial E}{\partial y_{\mathrm{post}(c)}} \right) \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)}) \circ y_{\mathrm{pre}(c)}, \quad c \in C, \qquad (2)$$

has a unique solution for ȧ_c, c ∈ C. This requires the corresponding Jacobian determinant to be nonzero in the region of interest.
The adaptation algorithm is given in the following theorem.
Theorem 1. For the system with dynamics given by

$$y_n = F_n\!\left[u_n + \sum_{c \in I_n} a_c\, y_{\mathrm{pre}(c)}\right], \quad n \in \mathcal{N},$$

if the connection weights a_c are adapted according to

$$\dot a_c = \left( \sum_{s \in O_{\mathrm{post}(c)}} a_s \dot a_s\, \frac{\dfrac{dE}{dy_{\mathrm{post}(s)}} \circ F'_{\mathrm{post}(s)}(x_{\mathrm{post}(s)})}{\dfrac{dE}{dy_{\mathrm{post}(s)}} \circ F'_{\mathrm{post}(s)}(x_{\mathrm{post}(s)}) \circ y_{\mathrm{post}(c)}} - \gamma \frac{\partial E}{\partial y_{\mathrm{post}(c)}} \right) \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)}) \circ y_{\mathrm{pre}(c)}, \quad c \in C, \qquad (3)$$

then the performance index E will decrease monotonically with time. In fact, the following is always satisfied:

$$\dot a_c = -\gamma \frac{dE}{da_c}, \quad c \in C,$$

where γ > 0 is some adaptation coefficient.


Proof. Since by our assumption Eq. (2) has a unique solution, all we need to prove is that Eq. (3) satisfies Eq. (2). Because E is a functional of y_n, n ∈ 𝒩, we have, for any connection c ∈ C,

$$\frac{dE}{da_c} = \frac{dE}{dy_{\mathrm{post}(c)}} \circ \frac{dy_{\mathrm{post}(c)}}{dx_{\mathrm{post}(c)}} \circ \frac{dx_{\mathrm{post}(c)}}{da_c} = \frac{dE}{dy_{\mathrm{post}(c)}} \circ \frac{dy_{\mathrm{post}(c)}}{dx_{\mathrm{post}(c)}} \circ y_{\mathrm{pre}(c)} = \frac{dE}{dy_{\mathrm{post}(c)}} \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)}) \circ y_{\mathrm{pre}(c)}.$$

Also, for any n ∈ 𝒩,

$$\frac{dE}{dy_n} = \frac{\partial E}{\partial y_n} + \sum_{c \in O_n} \frac{dE}{dy_{\mathrm{post}(c)}} \circ \frac{dy_{\mathrm{post}(c)}}{dy_n} = \frac{\partial E}{\partial y_n} + \sum_{c \in O_n} a_c\, \frac{dE}{dy_{\mathrm{post}(c)}} \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)}).$$

Using these two equations, we have

$$\frac{dE}{dy_n} = \frac{\partial E}{\partial y_n} + \sum_{c \in O_n} a_c\, \frac{dE}{da_c}\, \frac{\dfrac{dE}{dy_{\mathrm{post}(c)}} \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)})}{\dfrac{dE}{dy_{\mathrm{post}(c)}} \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)}) \circ y_{\mathrm{pre}(c)}}.$$

Substituting dE/da_c = −ȧ_c/γ, we have

$$\frac{dE}{dy_n} = \frac{\partial E}{\partial y_n} - \frac{1}{\gamma} \sum_{c \in O_n} a_c \dot a_c\, \frac{\dfrac{dE}{dy_{\mathrm{post}(c)}} \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)})}{\dfrac{dE}{dy_{\mathrm{post}(c)}} \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)}) \circ y_{\mathrm{pre}(c)}}.$$

Therefore,

$$\dot a_c = -\gamma \frac{dE}{da_c} = -\gamma\, \frac{dE}{dy_{\mathrm{post}(c)}} \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)}) \circ y_{\mathrm{pre}(c)}$$

$$= -\gamma \left( \frac{\partial E}{\partial y_{\mathrm{post}(c)}} - \frac{1}{\gamma} \sum_{s \in O_{\mathrm{post}(c)}} a_s \dot a_s\, \frac{\dfrac{dE}{dy_{\mathrm{post}(s)}} \circ F'_{\mathrm{post}(s)}(x_{\mathrm{post}(s)})}{\dfrac{dE}{dy_{\mathrm{post}(s)}} \circ F'_{\mathrm{post}(s)}(x_{\mathrm{post}(s)}) \circ y_{\mathrm{pre}(s)}} \right) \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)}) \circ y_{\mathrm{pre}(c)}$$

$$= \left( \sum_{s \in O_{\mathrm{post}(c)}} a_s \dot a_s\, \frac{\dfrac{dE}{dy_{\mathrm{post}(s)}} \circ F'_{\mathrm{post}(s)}(x_{\mathrm{post}(s)})}{\dfrac{dE}{dy_{\mathrm{post}(s)}} \circ F'_{\mathrm{post}(s)}(x_{\mathrm{post}(s)}) \circ y_{\mathrm{pre}(s)}} - \gamma \frac{\partial E}{\partial y_{\mathrm{post}(c)}} \right) \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)}) \circ y_{\mathrm{pre}(c)}.$$

Finally, since y_pre(s) = y_post(c),

$$\dot a_c = \left( \sum_{s \in O_{\mathrm{post}(c)}} a_s \dot a_s\, \frac{\dfrac{dE}{dy_{\mathrm{post}(s)}} \circ F'_{\mathrm{post}(s)}(x_{\mathrm{post}(s)})}{\dfrac{dE}{dy_{\mathrm{post}(s)}} \circ F'_{\mathrm{post}(s)}(x_{\mathrm{post}(s)}) \circ y_{\mathrm{post}(c)}} - \gamma \frac{\partial E}{\partial y_{\mathrm{post}(c)}} \right) \circ F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)}) \circ y_{\mathrm{pre}(c)}.$$


This shows that Eq. (3) is the unique solution to Eq. (2).

If F_n and E are instantaneous functions, which is the case for neural networks, then the composition ∘ can be replaced by multiplication in the adaptation algorithm. In other words, the adaptation algorithm can be simplified to

$$\dot a_c = F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)})\, \frac{y_{\mathrm{pre}(c)}}{y_{\mathrm{post}(c)}} \sum_{s \in O_{\mathrm{post}(c)}} a_s \dot a_s - \gamma\, F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)})\, y_{\mathrm{pre}(c)}\, \frac{\partial E}{\partial y_{\mathrm{post}(c)}}. \qquad (4)$$

We have applied the above adaptation algorithm to adaptive control and system identification [18,19]. Simulations show that the results are excellent even if we approximate F'_post(c)(x_post(c)) by a constant. When we use this approximation, there is no need to know F_post(c)(x_post(c)); that is, a model of the device is not needed. As we indicated earlier, adaptation without a model has many advantages.
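The claim that the derivative term can be replaced by a constant is easy to check in a minimal single-device example. The sketch below is our own illustration, not from the paper: it adapts one weight of a single sigmoidal device with F'_post(c)(x_post(c)) approximated by the constant 1. Since the replacement preserves the sign of the true gradient, a small adaptation step still decreases the error.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Single device: y = sigmoid(a * u), error E = 0.5 * (y - f)^2.
# True gradient: dE/da = (y - f) * sigmoid'(a*u) * u.
# Model-free update: approximate sigmoid'(a*u) by the constant 1.
u, f = 1.0, 0.9          # input and desired output
a, gamma = 0.0, 0.03     # initial weight and adaptation coefficient

def error(a):
    return 0.5 * (sigmoid(a * u) - f) ** 2

E0 = error(a)
e = sigmoid(a * u) - f        # local error signal
a += -gamma * 1.0 * u * e     # derivative replaced by the constant 1.0
E1 = error(a)
print(E1 < E0)  # True: the descent direction is preserved
```

Because sigmoid'(x) > 0 everywhere, the constant approximation only rescales the step, never reverses it, which is why convergence is retained.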
3. Neural networks
Let us now apply the adaptation algorithm we developed to neural networks. The devices in the system are therefore taken to be neurons. We use the standard notation of neural networks and denote, for i, j ∈ 𝒩:

v_i — the output of neuron i;
h_i — the input of neuron i;
n_i — the external input of neuron i;
w_ij — the weight of the connection from neuron j to neuron i; w_ij = 0 if j is not connected to i.

We denote by g(x) the activation function of a neuron. For sigmoidal neurons,

$$g(x) = \sigma(x) = \frac{1}{1 + e^{-x}}.$$

Ignoring the dynamics, we can describe the neural network by⁴

$$v_i = g(h_i) = g\!\left(\sum_{j \in \mathcal{N}} w_{ij} v_j + n_i\right), \quad i \in \mathcal{N}.$$

For output neurons, denote by f_i the desired output of neuron i.

⁴ We assume that the equation has at least one fixed point which is a stable attractor.


Our goal is to minimize the following error:

$$E = \frac{1}{2} \sum_{i \in \mathcal{N}} e_i^2,$$

where

$$e_i = \begin{cases} v_i - f_i & \text{if } i \text{ is an output neuron}, \\ 0 & \text{otherwise}. \end{cases}$$

We can now apply our adaptation algorithm to the neural network, with the following substitutions:

$$a_c \to w_{ij}, \qquad F'_{\mathrm{post}(c)}(x_{\mathrm{post}(c)}) \to g'(h_i), \qquad y_{\mathrm{pre}(c)} \to v_j, \qquad y_{\mathrm{post}(c)} \to v_i, \qquad a_s \to w_{ki}, \qquad \frac{\partial E}{\partial y_{\mathrm{post}(c)}} \to \frac{\partial E}{\partial v_i} = e_i.$$

Therefore, the adaptation algorithm for neural networks is as follows:

$$\dot w_{ij} = g'(h_i)\, \frac{v_j}{v_i} \sum_{k \in \mathcal{N}} w_{ki} \dot w_{ki} - \gamma\, g'(h_i)\, v_j e_i.$$
The above algorithm is mathematically equivalent to the back-propagation
algorithm. However, it does not require a feedback network to propagate the
errors.
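The equivalence can be checked numerically on the simplest case. For an output neuron there are no outgoing connections, so the sum over k vanishes and the rule reduces to ẇ_ij = −γ g'(h_i) v_j e_i, which is exactly the gradient-descent (delta) rule. The sketch below is our own check, not from the paper: it compares this local update with a central finite-difference gradient of E = ½ e_i².

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# One sigmoidal output neuron: v = sigmoid(w1*x1 + w2*x2), E = 0.5*(v - f)^2.
w = [0.3, -0.2]
x = [0.7, 0.4]
f = 0.1
gamma = 1.0

def E(w):
    v = sigmoid(w[0] * x[0] + w[1] * x[1])
    return 0.5 * (v - f) ** 2

h = w[0] * x[0] + w[1] * x[1]
v = sigmoid(h)
e = v - f
gprime = v * (1.0 - v)          # sigmoid'(h) = v * (1 - v)

for j in range(2):
    w_dot = -gamma * gprime * x[j] * e   # local rule (sum term is empty)
    # central finite-difference gradient of E with respect to w[j]
    d = 1e-6
    wp = list(w); wp[j] += d
    wm = list(w); wm[j] -= d
    num_grad = (E(wp) - E(wm)) / (2 * d)
    assert abs(w_dot - (-gamma * num_grad)) < 1e-6
print("local rule matches -gamma * dE/dw")
```

For hidden neurons the sum term takes over the role of the propagated error, so the same agreement holds layer by layer.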
By eliminating the feedback network, our new algorithm allows a much simpler implementation than that of the back-propagation algorithm. Using our algorithm, an adaptation mechanism can be built into each neuron to make the neuron trainable. A trainable neuron can be built as a standard unit. These trainable neurons can then be interconnected arbitrarily, as a particular application requires, with minimal wiring. In this way, it is much easier to change the topology of a neural network, in other words, to reconfigure the network.
In the rest of the paper, we will assume g(x) = σ(x). Since

$$\sigma'(x) = \sigma(x)(1 - \sigma(x)),$$

we can rewrite the adaptation algorithm as


Fig. 2. A standard unit for trainable neuron using our adaptive algorithm.

$$\dot w_{ij} = \sigma'(h_i)\, \frac{v_j}{v_i} \sum_{k \in \mathcal{N}} w_{ki} \dot w_{ki} - \gamma\, \sigma'(h_i)\, v_j e_i = (1 - v_i)\, v_j \sum_{k \in \mathcal{N}} w_{ki} \dot w_{ki} - \gamma\, v_i (1 - v_i)\, v_j e_i = (1 - v_i)\, v_j \left[\left(\frac{1}{2}\sum_{k \in \mathcal{N}} w_{ki}^2\right)' - \gamma\, v_i e_i\right].$$

A standard unit of a trainable neuron that implements the above equation is shown in Fig. 2.
4. Application: Function approximation
Let us now apply the adaptation algorithm developed in Section 3 to function approximation. That is, our goal is to train a neural network to approximate a set of nonlinear functions

$$a_i = \psi_i(b_1, \ldots, b_l), \quad i = 1, \ldots, m,$$


over the domain D ⊂ R^l. We will denote the inputs and outputs as b = (b_1, ..., b_l) and a = (a_1, ..., a_m), and hence a = ψ(b).

We construct a network with l inputs and m outputs. The number of neurons and the topology of connections will determine the achievable accuracy of the approximation. In general, more neurons and connections in the network result in a more accurate approximation. To this end, we can define the topological capacity of a network to be the total number of connections. For the construction of a network having the largest topological capacity for a given number of neurons, we refer the reader to [5].
To train the network, we vary the input b over time according to

$$b = g(t).$$

The function g must be such that the trajectory b(t) = g(t) repeatedly visits all regions of D, more or less uniformly.
Denote the input neurons by 1, ..., l and the output neurons by N − m + 1, ..., N. Let

$$v_j = b_j, \quad j = 1, \ldots, l.$$

Then the outputs of the neural network are functions of b_j, j = 1, ..., l:

$$v_i = \phi_i(v_1, \ldots, v_l) = \phi_i(b_1, \ldots, b_l), \quad i = N - m + 1, \ldots, N.$$

Denote v = (v_{N−m+1}, ..., v_N), and hence v = φ(b).


Since we want φ to approximate ψ, we will adapt the synapse weights to minimize the error

$$E = \frac{1}{2} \sum_{i = N - m + 1}^{N} (a_i - v_i)^2.$$

To illustrate the effectiveness of such function approximation, we built a two-layer neural network, as shown in Fig. 3, using the standard units described in Section 3. We use this neural network to approximate the following nonlinear function:

$$a = b_1 + b_2 - 2 b_1 b_2.$$

Note that the XOR problem can be expressed by this function.
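Indeed, on the corners of the unit square this function reduces exactly to exclusive-or, which a few lines of code confirm:

```python
def target(b1, b2):
    # The generalized XOR function a = b1 + b2 - 2*b1*b2.
    return b1 + b2 - 2.0 * b1 * b2

# On the Boolean corners the function coincides with XOR.
for b1 in (0.0, 1.0):
    for b2 in (0.0, 1.0):
        assert target(b1, b2) == float(int(b1) ^ int(b2))

print(target(0.1, 0.9))  # 0.82: a smooth interpolation between the corners
```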
We performed a simulation on this network. In the simulation, we let b_1 and b_2 vary over the region D = [0.1, 0.9] × [0.1, 0.9] as follows:

$$b_1 = 0.5 + 0.4\cos(0.001\pi t), \qquad b_2 = 0.5 + 0.4\cos(0.002\pi t).$$


Fig. 3. A neural network to approximate generalized XOR function.

Fig. 4. Simulation results of the neural network in Fig. 3: the error decreases as the network adapts.


Fig. 5. Simulation results of the neural network in Fig. 3: six connection weights adapt according to the adaptive algorithm.

We take γ = 0.03 and select the initial values of the weights randomly from the interval [−2, 2]. The simulation results are shown in Figs. 4 and 5.

It is clear from the simulation that the neural network adapts nicely, as the error decreased significantly during the simulation. More simulation results can be found in [3,4,6], where the convergence rate, adaptation coefficient, and basin of attraction are studied in detail.
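A simulation of this kind can be reproduced in miniature. The sketch below is our own reconstruction under simplifying assumptions, not the paper's exact experiment: a 2-3-1 feedforward network with constant bias inputs, trained on the four XOR corners rather than the continuous sweep over D, with a unit-step Euler discretization of the weight dynamics. It applies the local rule layer by layer: the output neuron has no outgoing connections, so its ẇ is computed first and then reused, purely locally, by the hidden neurons.

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(1)
gamma = 0.5
# 2 inputs (+ constant bias input) -> 3 hidden sigmoid neurons -> 1 output.
W1 = [[random.uniform(-2, 2) for _ in range(3)] for _ in range(3)]
W2 = [random.uniform(-2, 2) for _ in range(4)]
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

def forward(b):
    x = [b[0], b[1], 1.0]                           # bias as a constant input
    hid = [sigmoid(sum(W1[i][j] * x[j] for j in range(3))) for i in range(3)]
    z = hid + [1.0]
    out = sigmoid(sum(W2[i] * z[i] for i in range(4)))
    return x, hid, z, out

def mse():
    return sum((forward(b)[3] - f) ** 2 for b, f in data) / len(data)

e_start = mse()
for _ in range(3000):
    for b, f in data:
        x, hid, z, out = forward(b)
        e = out - f
        # Output neuron first: its sum term vanishes, so the rule reduces to
        # the delta rule w_dot = -gamma * g'(h_o) * v_j * e_o.
        w2_dot = [-gamma * out * (1.0 - out) * z[i] * e for i in range(4)]
        # Hidden neuron i: e_i = 0 and its only outgoing weight is W2[i], so
        # w_dot = g'(h_i) * (v_j / v_i) * W2[i] * w2_dot[i]  -- purely local.
        for i in range(3):
            gp = hid[i] * (1.0 - hid[i])
            for j in range(3):
                W1[i][j] += gp * (x[j] / hid[i]) * W2[i] * w2_dot[i]
        for i in range(4):
            W2[i] += w2_dot[i]
e_end = mse()
print(e_start, "->", e_end)
```

No error is ever propagated backward through a companion network; each hidden weight update uses only quantities held by the two neurons the connection joins, plus the weight-change signal W2[i]*w2_dot[i] already present at the shared neuron.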
5. Conclusion
In this paper, a new approach to system adaptation was proposed. We view the adaptation of a system as being accomplished by adaptive interaction among subsystems (devices). We derived an adaptation algorithm for adapting the interactions that can be implemented based on local information only. Furthermore, an approximation of this algorithm does not require knowledge of the models of the devices. We applied this approach of adaptive interaction to neural networks. The adaptation algorithm obtained is mathematically equivalent to the well-known back-propagation algorithm but requires no feedback network to propagate the errors.

References
[1] W.R. Ashby, Design for a Brain, Wiley, New York, 1960.
[2] R.C. Bolles, M.D. Beecher (Eds.), Evolution and Learning, Lawrence Erlbaum, London, 1988.
[3] R.D. Brandt, F. Lin, Supervised learning in neural networks without explicit error back-propagation, in: Proceedings of the 32nd Annual Allerton Conference on Communication, Control and Computing, 1994, pp. 294-303.
[4] R.D. Brandt, F. Lin, Can supervised learning be achieved without explicit error back-propagation? in: Proceedings of the International Conference on Neural Networks, 1996a, pp. 300-305.
[5] R.D. Brandt, F. Lin, Optimal layering of neurons, in: 1996 IEEE International Symposium on Intelligent Control, 1996b, pp. 497-501.
[6] R.D. Brandt, F. Lin, Supervised learning in neural networks without feedback network, in: 1996 IEEE International Symposium on Intelligent Control, 1996c, pp. 86-90.
[7] R.D. Brandt, F. Lin, Theory of Adaptive Interaction, AFI Press (to appear).
[8] F. Crick, The recent excitement about neural networks, Nature 337 (1989) 129-132.
[9] B.K. Dolenko, H.C. Card, Tolerance to analog hardware of on-chip learning in backpropagation networks, IEEE Trans. Neural Networks 6 (5) (1995) 1045-1052.
[10] S. Haykin, Neural Networks: A Comprehensive Foundation, IEEE Press, New York, 1994.
[11] J. Hertz, A. Krogh, R.G. Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, Reading, MA, 1991.
[12] J.H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, 1992.
[13] P.W. Hollis, J.J. Paulos, A neural network learning algorithm tailored for VLSI implementation, IEEE Trans. Neural Networks 5 (5) (1994) 781-791.
[14] P.A. Ioannou, J. Sun, Robust Adaptive Control, Prentice-Hall, Englewood Cliffs, NJ, 1996.
[15] R. Isermann, K.-H. Lachmann, D. Matko, Adaptive Control Systems, Prentice-Hall, Englewood Cliffs, NJ, 1992.
[16] Y.D. Landau, Adaptive Control: The Model Reference Approach, Marcel Dekker, New York, 1979.
[17] J.A. Lansner, T. Lehmann, An analog CMOS chip set for neural networks with arbitrary topologies, IEEE Trans. Neural Networks 4 (3) (1993) 441-444.
[18] F. Lin, R.D. Brandt, G. Saikalis, Self-tuning of PID controllers by adaptive interaction, 1998 (preprint).
[19] F. Lin, R.D. Brandt, G. Saikalis, Parameter estimation using adaptive interaction, 1998 (preprint).
[20] F. Lin, R.D. Brandt, Adaptive interaction: A new approach to adaptation, 1998 (preprint).
[21] B. Linares-Barranco, E. Sanchez-Sinencio, A. Rodriguez-Vazquez, J.L. Huertas, A CMOS adaptive BAM with on-chip learning and weight refreshing, IEEE Trans. Neural Networks 4 (3) (1993) 445-455.
[22] D.G. Luenberger, Optimization by Vector Space Methods, Wiley, New York, 1968.
[23] D.B. Parker, Optimal algorithms for adaptive networks: second-order back-propagation, second-order direct propagation, and second-order Hebbian learning, in: Proceedings of the IEEE International Conference on Neural Networks, 1987, pp. 593-600.
[24] K.H. Pribram, Rethinking Neural Networks: Quantum Fields and Biological Data, Lawrence Erlbaum Associates, 1993.
[25] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning internal representations by error propagation, in: D.E. Rumelhart, J.L. McClelland (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1: Foundations, MIT Press, Cambridge, 1986, pp. 318-362.
[26] G.C. Williams, Adaptation and Natural Selection, Princeton University Press, Princeton, 1966.
[27] R.J. Williams, On the use of back-propagation in associative reinforcement learning, in: Proceedings of the IEEE International Conference on Neural Networks, 1988, pp. 263-270.
[28] R.J. Williams, Towards a theory of reinforcement-learning connectionist systems, Technical Report NU-CCS-88-3, Northeastern University, 1988.
[29] C.-Y. Wu, J.-F. Lan, CMOS current-mode neural associative memory design with on-chip learning, IEEE Trans. Neural Networks 7 (1) (1996) 167-177.
[30] D. Zipser, D.E. Rumelhart, Neurobiological significance of new learning models, in: E. Schwartz (Ed.), Computational Neuroscience, MIT Press, Cambridge, 1990, pp. 192-200.
