1. Introduction
The EKF has been applied extensively to the field of nonlinear estimation. General application areas may be divided
into state estimation and machine learning. We further divide machine learning into parameter estimation and dual
estimation. The frameworks for these areas are briefly reviewed next.
This work was sponsored by the NSF under grant IRI-9712346.
0-7803-5800-7/00/$10.00 ©2000 IEEE
State-estimation
The basic framework for the EKF involves estimation of the
state of a discrete-time nonlinear dynamic system,
x_{k+1} = F(x_k, v_k)   (1)
y_k = H(x_k, n_k)   (2)
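For concreteness, the process and observation models of Equations (1) and (2) can be simulated directly. The particular F and H below are arbitrary stand-ins chosen for illustration, not models from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def F(x, v):
    return 0.5 * x + np.sin(x) + v     # nonlinear process model (illustrative)

def H(x, n):
    return x**2 + n                    # nonlinear observation model (illustrative)

# Simulate the discrete-time system of Eqs. (1)-(2)
x = 0.1
states, observations = [], []
for k in range(100):
    v = rng.normal(scale=0.05)         # process noise v_k
    n = rng.normal(scale=0.1)          # observation noise n_k
    x = F(x, v)
    states.append(x)
    observations.append(H(x, n))
```

The estimation problem is then to recover the hidden sequence x_k from the noisy observations y_k alone.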
Parameter Estimation
The classic machine learning problem involves determining
a nonlinear mapping
y_k = G(x_k, w)   (3)

where x_k is the input and the nonlinear map G(·) is parameterized by the vector w. Training corresponds to estimating w, which can be done with the EKF by writing a new state-space representation,

w_{k+1} = w_k + u_k   (4)
y_k = G(x_k, w_k) + e_k   (5)
where the parameters w_k correspond to a stationary process with identity state transition matrix, driven by process
noise u_k (the choice of variance determines tracking performance). The output y_k corresponds to a nonlinear observation on w_k. The EKF can then be applied directly as an
efficient second-order technique for learning the parameters. In the linear case, the relationship between the Kalman
Filter (KF) and Recursive Least Squares (RLS) is given in
[3]. The use of the EKF for training neural networks has
been developed by Singhal and Wu [9] and Puskorius and
Feldkamp [8].
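As a concrete, hedged sketch of this recipe (the scalar map g, noise levels, and data below are invented for illustration, not from the cited works), an EKF can track a single parameter of a nonlinear map by treating it as a random-walk state with identity transition:

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = 1.5                       # parameter to be recovered (illustrative)
q, r = 1e-4, 1e-2                  # process / measurement noise variances (ad hoc)

w_hat, p = 0.0, 1.0                # initial parameter estimate and covariance
for _ in range(500):
    x = rng.uniform(-2.0, 2.0)                       # input sample
    y = np.tanh(w_true * x) + rng.normal(scale=np.sqrt(r))   # noisy observation

    # Time update: identity transition, so only the covariance grows (Eq. 4).
    p += q
    # Measurement update: linearize g(x, w) = tanh(w x) around the estimate.
    g = np.tanh(w_hat * x)
    h = x * (1.0 - g**2)           # dg/dw evaluated at w_hat
    k = p * h / (h * p * h + r)    # Kalman gain
    w_hat += k * (y - g)
    p *= (1.0 - k * h)
```

The choice of q controls how quickly the filter can track a drifting parameter, mirroring the remark above that the process-noise variance determines tracking performance.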
Dual Estimation

The dual estimation problem considers a dynamic system of the form

x_{k+1} = F(x_k, v_k, w)   (6)
y_k = H(x_k, n_k, w)   (7)

where both the system states x_k and the set of model parameters w for the dynamic system must be simultaneously estimated from only the observed noisy signal y_k. Approaches
to dual estimation are discussed in Section 4.2.

In the next section we explain the basic assumptions and
flaws of using the EKF. In Section 3, we introduce the
Unscented Kalman Filter (UKF) as a method to amend the
flaws in the EKF. Finally, in Section 4, we present results of
using the UKF for the different areas of nonlinear estimation.

2. The EKF and Its Flaws

Consider again the basic state-space framework of Equations (1) and (2). Given the noisy observation y_k, the recursive estimate of the state takes the form (see [6]),

x̂_k = (prediction of x_k) + K_k [y_k − (prediction of y_k)]   (8)

The optimal terms in this recursion are

x̂_k^− = E[F(x̂_{k−1}, v_{k−1})]   (9)
K_k = P_{x_k y_k} P_{ỹ_k ỹ_k}^{−1}   (10)
ŷ_k^− = E[H(x̂_k^−, n_k)]   (11)

The EKF approximates these expectations by linearizing the models, effectively using

x̂_k^− ≈ F(x̂_{k−1}, v̄)   (12)
K_k ≈ P̂_{x_k y_k} P̂_{ỹ_k ỹ_k}^{−1}   (13)
ŷ_k^− ≈ H(x̂_k^−, n̄)   (14)

where the covariances in (13) are determined by first-order linearization of F and H. The state distribution is thus propagated analytically through only a first-order approximation of the nonlinear system, which can introduce large errors in the true posterior mean and covariance.
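The danger of these linearized approximations can be seen in a small numerical sketch: for a nonlinear g, the first-order value g(x̄) can differ badly from the true mean E[g(x)]. The quadratic g below is an arbitrary illustration, not a model from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
g = lambda x: x**2                                  # simple nonlinearity

x = rng.normal(loc=0.0, scale=1.0, size=100_000)    # x ~ N(0, 1)
mc_mean = g(x).mean()    # sampling estimate of E[g(x)]; analytically this is 1.0
lin_mean = g(0.0)        # first-order (EKF-style) value g(x_bar) = 0.0
```

The sampling estimate sits near 1, while the linearized value is 0: the first-order approximation has missed the entire contribution of the input covariance to the posterior mean.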
3. The Unscented Kalman Filter

(Figure 1: Example of the UT for mean and covariance propagation. a) actual (sampling), b) first-order linearization (EKF), c) UT.)

The unscented transformation (UT) is a method for calculating the statistics of a random variable which undergoes
a nonlinear transformation [5]. Consider propagating a random variable x (dimension L) through a nonlinear function,
y = g(x). Assume x has mean x̄ and covariance P_x. To
calculate the statistics of y, we form a matrix of 2L + 1
sigma vectors X_i (with corresponding weights W_i), according to the following:

X_0 = x̄
X_i = x̄ + (√((L+λ) P_x))_i ,   i = 1, ..., L
X_i = x̄ − (√((L+λ) P_x))_{i−L} ,   i = L+1, ..., 2L
W_0^{(m)} = λ / (L+λ)
W_0^{(c)} = λ / (L+λ) + (1 − α² + β)
W_i^{(m)} = W_i^{(c)} = 1 / {2(L+λ)} ,   i = 1, ..., 2L   (15)

where λ = α²(L+κ) − L is a scaling parameter. α determines the spread of the sigma points around x̄, κ is a secondary scaling parameter, and β is used to incorporate prior knowledge of the distribution of x. (√((L+λ) P_x))_i is the i-th column of the matrix square root. These sigma vectors are propagated through the nonlinear function,

Y_i = g(X_i) ,   i = 0, ..., 2L   (16)
and the mean and covariance for y are approximated using a weighted sample mean and covariance of the posterior
sigma points,

ȳ ≈ Σ_{i=0}^{2L} W_i^{(m)} Y_i   (17)

P_y ≈ Σ_{i=0}^{2L} W_i^{(c)} {Y_i − ȳ}{Y_i − ȳ}^T   (18)
Note that this method differs substantially from general "sampling" methods (e.g., Monte-Carlo methods such as particle
filters [1]) which require orders of magnitude more sample
points in an attempt to propagate an accurate (possibly non-Gaussian) distribution of the state. The deceptively simple approach taken with the UT results in approximations
that are accurate to the third order for Gaussian inputs for
all nonlinearities. For non-Gaussian inputs, approximations
are accurate to at least the second order, with the accuracy
of third and higher order moments determined by the choice
of α and β (see [4] for a detailed discussion of the UT). A
simple example is shown in Figure 1 for a 2-dimensional
system: the left plot shows the true mean and covariance
propagation using Monte-Carlo sampling; the center plots show the results of a first-order linearization as used in the EKF; the right plots show the performance of the UT.
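A minimal sketch of the UT computation in Equations (15)-(18), assuming the standard scaled-sigma-point parameterization (the default parameter values here are illustrative choices, not prescriptions from the paper):

```python
import numpy as np

def unscented_transform(g, x_mean, P_x, alpha=1e-3, beta=2.0, kappa=0.0):
    """Propagate a mean/covariance through nonlinearity g via the UT."""
    L = x_mean.size
    lam = alpha**2 * (L + kappa) - L
    # Columns of S give the matrix square root of (L+lam) * P_x.
    S = np.linalg.cholesky((L + lam) * P_x)
    # Eq. (15): 2L+1 sigma vectors (rows) and their weights.
    sigmas = np.vstack([x_mean, x_mean + S.T, x_mean - S.T])
    Wm = np.full(2 * L + 1, 1.0 / (2.0 * (L + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (L + lam)
    Wc[0] = lam / (L + lam) + (1.0 - alpha**2 + beta)
    # Eq. (16): propagate each sigma point through the nonlinearity.
    Y = np.array([g(s) for s in sigmas])
    # Eqs. (17)-(18): weighted posterior mean and covariance.
    y_mean = Wm @ Y
    d = Y - y_mean
    P_y = (Wc[:, None] * d).T @ d
    return y_mean, P_y
```

A useful sanity check on such an implementation: for a linear map g(x) = A x, the UT recovers the exact mean A x̄ and covariance A P_x Aᵀ, since in that case every propagated moment is captured exactly by the sigma points.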
In order to illustrate the UKF for state-estimation, we provide a new application example corresponding to noisy time-series estimation.
In this example, the UKF is used to estimate an underlying clean time-series corrupted by additive Gaussian white
noise. The time-series used is the Mackey-Glass-30 chaotic series. White Gaussian noise was then added to the clean Mackey-Glass series to generate a noisy time-series y_k = x_k + n_k.
The corresponding state-space representation is given by:

x_k = f(x_{k−1}, ..., x_{k−M}, w) + v_k   (19)

y_k = [1 0 ... 0] · x_k + n_k   (20)

The UKF recursion for this model is as follows.

Initialize with:

x̂_0 = E[x_0]
P_0 = E[(x_0 − x̂_0)(x_0 − x̂_0)^T]
x̂_0^a = E[x^a] = [x̂_0^T 0 0]^T
P_0^a = E[(x_0^a − x̂_0^a)(x_0^a − x̂_0^a)^T] = diag(P_0, P_v, P_n)

For k ∈ {1, ..., ∞}:

Time update:

X_{k−1}^a = [x̂_{k−1}^a   x̂_{k−1}^a ± √((L+λ) P_{k−1}^a)]
X_{k|k−1}^x = F(X_{k−1}^x, X_{k−1}^v)
x̂_k^− = Σ_{i=0}^{2L} W_i^{(m)} X_{i,k|k−1}^x
P_k^− = Σ_{i=0}^{2L} W_i^{(c)} [X_{i,k|k−1}^x − x̂_k^−][X_{i,k|k−1}^x − x̂_k^−]^T
Y_{k|k−1} = H(X_{k|k−1}^x, X_{k−1}^n)
ŷ_k^− = Σ_{i=0}^{2L} W_i^{(m)} Y_{i,k|k−1}

Measurement update:

P_{ỹ_k ỹ_k} = Σ_{i=0}^{2L} W_i^{(c)} [Y_{i,k|k−1} − ŷ_k^−][Y_{i,k|k−1} − ŷ_k^−]^T
P_{x_k y_k} = Σ_{i=0}^{2L} W_i^{(c)} [X_{i,k|k−1} − x̂_k^−][Y_{i,k|k−1} − ŷ_k^−]^T
K_k = P_{x_k y_k} P_{ỹ_k ỹ_k}^{−1}
x̂_k = x̂_k^− + K_k (y_k − ŷ_k^−)
P_k = P_k^− − K_k P_{ỹ_k ỹ_k} K_k^T

(Figure: Estimation Error, EKF vs UKF on Mackey-Glass.)
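The recursion above can be sketched in code. The following is a minimal, hedged illustration using the additive-noise form of the UKF (a common variant; the table above uses the augmented-state form). The function names, sigma-point parameters, and the toy model in the test are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def sigma_points(x, P, lam):
    """Sigma vectors of Eq. (15) around mean x with covariance P."""
    L = x.size
    S = np.linalg.cholesky((L + lam) * P)   # columns form the matrix square root
    return np.vstack([x, x + S.T, x - S.T])

def ukf_step(f, h, x, P, Q, R, y, alpha=1e-3, beta=2.0, kappa=0.0):
    """One predict/update cycle of an additive-noise UKF."""
    L = x.size
    lam = alpha**2 * (L + kappa) - L
    Wm = np.full(2 * L + 1, 1.0 / (2.0 * (L + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (L + lam)
    Wc[0] = lam / (L + lam) + (1.0 - alpha**2 + beta)

    # Time update: propagate sigma points through the process model f.
    X = sigma_points(x, P, lam)
    Xp = np.array([f(s) for s in X])
    x_pred = Wm @ Xp
    dX = Xp - x_pred
    P_pred = (Wc[:, None] * dX).T @ dX + Q

    # Measurement update: redraw sigma points and push them through h.
    X2 = sigma_points(x_pred, P_pred, lam)
    Yp = np.array([h(s) for s in X2])
    y_pred = Wm @ Yp
    dY = Yp - y_pred
    P_yy = (Wc[:, None] * dY).T @ dY + R
    P_xy = (Wc[:, None] * (X2 - x_pred)).T @ dY
    K = P_xy @ np.linalg.inv(P_yy)
    x_new = x_pred + K @ (y - y_pred)
    P_new = P_pred - K @ P_yy @ K.T
    return x_new, P_new
```

A usage pattern is to iterate ukf_step over incoming measurements y_k, e.g. on a toy scalar model f(x) = sin(x) + 0.9x with direct observation h(x) = x; no Jacobians of f or h are ever required.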
measurement equation.
In the joint extended Kalman filter [7], the signal-state
and weight vectors are concatenated into a single joint state
vector, [x_k^T w_k^T]^T. Estimation is done recursively by writing the state-space equations for the joint state.
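A small sketch of this concatenation step (the dimensions and the map f below are hypothetical, chosen only to make the shapes concrete):

```python
import numpy as np

def F_joint(z, v, u, f, n_x):
    """Joint transition for dual estimation: the signal part evolves
    through f(x, w) + v, while the weights follow a random walk w + u."""
    x, w = z[:n_x], z[n_x:]
    return np.concatenate([f(x, w) + v, w + u])

x = np.array([0.2, -0.1])            # signal state x_k (hypothetical)
w = np.array([0.5, 1.0, -0.3])       # model parameters w (hypothetical)
z = np.concatenate([x, w])           # joint state [x^T  w^T]^T
```

An EKF run on z then estimates the signal and the weights simultaneously, at the cost of linearizing the coupled dynamics of the enlarged state.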
(Figure: learning curves (error vs. epoch) comparing joint EKF and joint UKF estimation.)
We present results on two time-series to provide a clear illustration of the use of the UKF over the EKF. The first
series is again the Mackey-Glass-30 chaotic series with additive noise (SNR ≈ 3dB). The second time-series (also
chaotic) comes from an autoregressive neural network with
random weights driven by Gaussian process noise and also corrupted by additive white Gaussian noise.

³µ is usually set to a small constant which can be related to the time constant for RLS weight decay [3]. For a data length of 1000, µ ≈ 1e-4 was used.
⁴The covariance of u_k is again adapted using the RLS weight-decay method.
(Figure 4: Comparison of learning curves.)
References