
Emirates Journal for Engineering Research, 10 (2), 63-68 (2005) (Regular Paper)

A STUDY OF PERFORMANCE SURFACES OF MA PARAMETER ESTIMATORS


N.A.S. ALWAN
Department of Electrical Engineering, College of Engineering, University of Baghdad. (Received September 2005 and accepted December 2005)

The parameter estimation of a moving-average (MA) signal from second-order statistics is considered a difficult nonlinear problem, the reliable solution of which is the center of much active research. The difficulty arises in the unknown-excitation case, which is the most common, because the error signal is then a nonlinear function of the model parameters. In this paper, performance surfaces representing the mean square error as a function of the model parameters of first- and second-order MA estimators are derived and plotted, and their properties investigated. A study of these properties provides insight into the most computationally convenient and the most reliable of the vast number of optimum and suboptimum methods that aim at minimizing the mean square error or the total squared error to obtain the model parameters.

1. INTRODUCTION
A special class of stationary random sequences is obtained by driving a linear time-invariant (LTI) system with white noise. In general, the LTI system has a system function that is rational, i.e. the ratio of two polynomials. The term pole-zero model is used to emphasize the system viewpoint, and the term autoregressive moving-average (ARMA) model refers to the resulting random sequence. Similarly, an all-zero model is also known as a moving-average (MA) model, and an all-pole model as an autoregressive (AR) model. Parametric models describe a system with a finite number of parameters. Model parameter estimation, or signal modeling, is central to spectral estimation, which finds many applications such as medical diagnosis, speech analysis, seismology and geophysics, radar and sonar, nondestructive fault detection, and evaluating the predictability of time series [1]. Estimation of model parameters is achieved using the principle of least squares. In this work, we focus on MA parameter estimation and derive expressions for the performance surfaces representing the mean square error as a function of the model parameters, with the aim of studying the performance surface properties. In Section 2, MA parameter estimation is reviewed. In Section 3, we derive a general expression for the performance surfaces of an MA estimator, giving examples of first-order and second-order estimators. Section 4 is a discussion of the findings of Section 3. Finally, Section 5 presents the conclusions of the paper.

2. MA PARAMETER ESTIMATION
A moving-average (MA) model AZ(Q) is an all-zero model of order Q driven by white noise, that is,

$$x(n) = w(n) + \sum_{k=1}^{Q} d_k\, w(n-k) \qquad (1)$$

where x(n) is the model output or the MA signal, the $\{d_k\}$ are the model parameters, and w(n) is the driving white noise of zero mean and variance $\sigma_w^2$. The essence of signal modeling or parameter estimation is the following: given finite-length data, which can be regarded as a sample sequence of the signal under consideration, we want to estimate the signal model parameters $\{d_k\}$ and $\sigma_w^2$ so as to satisfy a prescribed criterion.
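To make the model concrete, here is a minimal Python sketch (our illustration; the paper presents no code, and the parameter values are arbitrary) that synthesizes an MA(Q) signal according to (1):

```python
import numpy as np

def generate_ma(d, n, sigma_w=1.0, seed=0):
    """Simulate eq. (1): x(n) = w(n) + sum_{k=1}^{Q} d_k w(n-k)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0.0, sigma_w, n)        # driving white noise of variance sigma_w^2
    x = np.convolve(w, np.r_[1.0, d])[:n]  # filter w(n) through 1 + D(z)
    return x, w

# First-order case: Q = 1, d1 = 0.3 (the model used in Example 1 below)
x, w = generate_ma([0.3], 1000)
```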


Using vector notation, we can express (1) as

$$x(n) = w(n) + \mathbf{w}^T(n-1)\,\mathbf{d} \qquad (2)$$

where

$$\mathbf{w}(n) = [\,w(n)\;\, w(n-1)\;\, \ldots\;\, w(n-Q+1)\,]^T \quad \text{and} \quad \mathbf{d} = [\,d_1\;\, d_2\;\, \ldots\;\, d_Q\,]^T.$$


Known excitation: Let us assume for the moment that the excitation w(n) is known. Then we can predict x(n) from past values using the following linear predictor (Figure 1):

$$x'(n) = \mathbf{w}^T(n-1)\,\mathbf{d}' \qquad (3)$$

where

$$\mathbf{d}' = [\,d_1'\;\, d_2'\;\, \ldots\;\, d_Q'\,]^T$$

are the predictor parameters. The prediction error

$$e(n) = x(n) - x'(n) = x(n) - \mathbf{w}^T(n-1)\,\mathbf{d}' \qquad (4)$$

equals w(n) if $\mathbf{d}' = \mathbf{d}$.

This linear predictor (estimator) is shown in Figure 1, where

$$1 + D(z) = 1 + \sum_{k=1}^{Q} d_k z^{-k} \quad \text{and} \quad D'(z) = \sum_{k=1}^{Q} d_k' z^{-k}$$

are the system functions of the MA model and the estimator, respectively.

Figure 1: MA estimator with known excitation. [Block diagram: w(n) drives the MA model 1 + D(z) to produce the MA signal x(n); the estimator D'(z), driven by w(n), produces the prediction x'(n), and e(n) = x(n) - x'(n).]
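As a quick numerical check of (3) and (4) (again our illustration, not part of the paper), the following Python fragment predicts x(n) from the known excitation and verifies that the prediction error collapses to w(n) when $\mathbf{d}' = \mathbf{d}$:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=1000)                    # known excitation w(n)
x = np.convolve(w, [1.0, 0.3])[:w.size]      # MA(1) signal with d1 = 0.3, eq. (1)

def predict(w, d_hat):
    """x'(n) = w^T(n-1) d', eq. (3): weighted sum of past excitation samples."""
    return np.convolve(w, np.r_[0.0, d_hat])[:w.size]

e = x - predict(w, [0.3])     # prediction error, eq. (4)
assert np.allclose(e, w)      # with d' = d, the error equals the excitation
```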

Minimization of the mean square error

$$\xi_{ms}(\mathbf{d}') = E[e^2(n)] \qquad (5)$$

or the total squared error

$$\xi_{ts}(\mathbf{d}') = \sum_{n=0}^{N-1} e^2(n) \qquad (6)$$

leads to the well-known linear optimization problem with the following system of linear equations:

$$\mathbf{R}_w\,\mathbf{d}' = \mathbf{r}_w \qquad (7)$$

where

$$\mathbf{R}_w = E[\mathbf{w}(n-1)\,\mathbf{w}^T(n-1)] \qquad (8)$$

and

$$\mathbf{r}_w = E[\mathbf{w}(n-1)\,x(n)] \qquad (9)$$

for the expectation case [1]. It is worth noting that the estimation of all-pole models also leads to a linear optimization problem.

Unknown excitation: In most applications, the excitation w(n) is never known. However, we can obtain a good estimate of x(n) by replacing w(n) by e(n) in (3). This is a natural choice if the model used to obtain e(n) is reasonably accurate [1]. The prediction error is then given by

$$e(n) = x(n) - x'(n) = x(n) - \mathbf{w}'^T(n-1)\,\mathbf{d}' \qquad (10)$$

where

$$\mathbf{w}'(n-1) = [\,e(n-1)\;\, \ldots\;\, e(n-Q)\,]^T \qquad (11)$$

Then,

$$e(n) = -\sum_{k=1}^{Q} d_k'\, e(n-k) + x(n) \qquad (12)$$

The prediction error, as can be seen from equation (12), is obtained by exciting the inverse model with the signal x(n). Hence, the inverse model has to be stable. To satisfy this condition, we require that the estimated model be minimum-phase. This estimator is shown in Figure 2. The recursive computation in (12) makes e(n) a non-linear function of the model parameters. In the following section, we consider the implications of this non-linear relation through the derivation of performance surfaces.

Figure 2: MA estimator with unknown excitation. [Block diagram: x(n) excites the inverse model 1/(1 + D'(z)) to produce e(n).]
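The nonlinearity introduced by (12) can be seen directly in a short sketch (our illustration, using the Example 1 parameter value $d_1 = 0.3$): past errors feed back into the recursion, so e(n), and hence any sample estimate of $\xi_{ms}$ in (5), depends nonlinearly on $\mathbf{d}'$.

```python
import numpy as np

def prediction_error(x, d_hat):
    """e(n) = -sum_k d_k' e(n-k) + x(n), eq. (12): the inverse model 1/(1+D'(z))
    driven by x(n). Past errors feed back, so e(n) is nonlinear in d'."""
    q = len(d_hat)
    e = np.zeros_like(x)
    for n in range(len(x)):
        e[n] = x[n] - sum(d_hat[k] * e[n - 1 - k] for k in range(q) if n > k)
    return e

rng = np.random.default_rng(0)
w = rng.normal(size=5000)
x = np.convolve(w, [1.0, 0.3])[:w.size]   # MA(1) signal, d1 = 0.3

e = prediction_error(x, [0.3])            # d1' = d1: e(n) recovers w(n)
print(np.mean(e**2))                      # sample xi_ms, close to sigma_w^2 = 1
```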

3. PERFORMANCE SURFACES FOR MA ESTIMATORS WITH UNKNOWN EXCITATION


In this work, we choose the mean square error $\xi_{ms}$ given in (5) as the performance surface function. We now attempt to formulate this performance function in terms of the estimator system function $D'(z)$ and the signal power spectra. By noting that $\xi_{ms}$ is the autocorrelation of the error signal at n = 0, where n is the lag index, we begin our formulation by finding the discrete error power spectrum and then taking the inverse z-transform.


The discrete error power spectrum $\Phi_{ee}(z)$ is the z-transform of the autocorrelation function $\phi_{ee}(n)$ and is given by

$$\Phi_{ee}(z) = \sum_{n=-\infty}^{\infty} \phi_{ee}(n)\, z^{-n} = \sum_{n=-\infty}^{\infty} E[e(k)e(k+n)]\, z^{-n} \qquad (13)$$

By substituting $e(n) = x(n) - x'(n)$ and proceeding, it can easily be shown that

$$\Phi_{ee}(z) = \Phi_{xx}(z) + \Phi_{x'x'}(z) - D'(z)\Phi_{xe}(z) - D'(z^{-1})\Phi_{xe}(z^{-1}) \qquad (14)$$

Since $x'(n)$ is obtained by passing $e(n)$ through the estimator $D'(z)$, then

$$\Phi_{ee}(z) = \Phi_{xx}(z) + |D'(z)|^2\,\Phi_{ee}(z) - D'(z)\Phi_{xe}(z) - D'(z^{-1})\Phi_{xe}(z^{-1}) \qquad (15)$$

where $|D'(z)|^2$ denotes $D'(z)D'(z^{-1})$. Finally,

$$\Phi_{ee}(z) = \frac{\Phi_{xx}(z) - D'(z)\Phi_{xe}(z) - D'(z^{-1})\Phi_{xe}(z^{-1})}{1 - |D'(z)|^2} \qquad (16)$$

Now,

$$\Phi_{xe}(z) = \sum_{n=-\infty}^{\infty} \phi_{xe}(n)\, z^{-n} = \sum_{n=-\infty}^{\infty} E[x(k)e(k+n)]\, z^{-n} \qquad (17)$$

Making the same substitution for $e(n)$, we find that

$$\Phi_{xe}(z) = \Phi_{xx}(z) - \Phi_{xx'}(z) = \Phi_{xx}(z) - D'(z)\Phi_{xe}(z) \qquad (18)$$

Therefore,

$$\Phi_{xe}(z) = \frac{\Phi_{xx}(z)}{1 + D'(z)} \qquad (19)$$

Substituting (19) in (16) and simplifying, we obtain

$$\Phi_{ee}(z) = \frac{\Phi_{xx}(z)}{1 + |D'(z)|^2 + D'(z) + D'(z^{-1})} \qquad (20)$$

but

$$\Phi_{xx}(z) = \Phi_{ww}(z)\,|1 + D(z)|^2 \qquad (21)$$

Therefore,

$$\Phi_{ee}(z) = \frac{\Phi_{ww}(z)\,|1 + D(z)|^2}{1 + |D'(z)|^2 + D'(z) + D'(z^{-1})} \qquad (22)$$

Equation (22) gives the discrete power spectrum of the error signal in Figure 2 and is the basic equation from which we derive and plot performance surfaces. The inverse z-transform is found in the realm of complex-variable theory as

$$\phi_{ee}(n) = \frac{1}{2\pi j} \oint \Phi_{ee}(z)\, z^{\,n-1}\, dz \qquad (23)$$

Now

$$\xi_{ms}(\mathbf{d}') = \phi_{ee}(0) \qquad (24)$$

Performing the contour integration in (24) and plotting the result yields the performance surface, which is a plot of $\xi_{ms}$ as a function of the model (estimator) parameters. In the following, we work out two examples, for a first- and a second-order estimator.

Example 1: Consider the first-order MA signal model with $D(z) = 0.3z^{-1}$, where w(n) is zero-mean white noise with unit variance. Then $\Phi_{ww}(z) = 1$; of course, w(n) is unknown to the estimator. The choice $d_1 = 0.3$ is consistent with the requirement that the resultant model be minimum-phase. Here,

$$D'(z) = d_1' z^{-1}$$

Making the necessary substitutions in (22) and (24) results in

$$\Phi_{ee}(z) = \frac{1.09 + 0.3z + 0.3z^{-1}}{1 + d_1'^{\,2} + d_1' z + d_1' z^{-1}}$$

and

$$\xi_{ms}(d_1') = \frac{1}{2\pi j} \oint \frac{0.3z^2 + 1.09z + 0.3}{[\,d_1' z^2 + (1 + d_1'^{\,2})z + d_1'\,]\,z}\, dz \qquad (25)$$

Factoring the denominator yields

$$\xi_{ms}(d_1') = \frac{1}{2\pi j} \oint \frac{0.3z^2 + 1.09z + 0.3}{d_1'\,(z + d_1')\left(z + \frac{1}{d_1'}\right)z}\, dz$$

Since $|d_1'|$ must be less than unity, the pole at $-1/d_1'$ lies outside the unit circle. Thus the integral in (25) is the sum of the residues at $z = -d_1'$ and $z = 0$. This results in

$$\xi_{ms}(d_1') = \frac{0.6\,d_1' - 1.09}{d_1'^{\,2} - 1} \qquad (26)$$

This one-dimensional or univariable performance surface is plotted in Figure 3. $\xi_{ms}$ has a minimum at $d_1' = 0.3$, as expected. This minimum value is unity, which is also expected since when $d_1' = d_1$, $e(n) = w(n)$, and w(n) has unit variance. It is obvious that the shape of the surface is non-quadratic in $d_1'$. This is a direct result of e(n) being a non-linear function of the model parameters.

Example 2: Consider the second-order MA signal model with $D(z) = 0.5z^{-1} + 0.4z^{-2}$, where w(n) is zero-mean white noise with unit variance. The choice $d_1 = 0.5$ and $d_2 = 0.4$ ensures a minimum-phase model. Here,

$$D'(z) = d_1' z^{-1} + d_2' z^{-2}$$

Substituting in (22) and (24) yields

Figure 3: Univariable performance surface for Example 1 (mean square error versus the estimate of parameter $d_1'$).

Figure 7: Performance surface for Example 2 with highlights, azimuth = 180 and elevation = -15.

$$\xi_{ms}(\mathbf{d}') = \frac{1}{2\pi j} \oint \frac{0.4z^4 + 0.7z^3 + 1.41z^2 + 0.7z + 0.4}{[\,d_2' z^4 + d_1'(d_2' + 1)z^3 + (1 + d_1'^{\,2} + d_2'^{\,2})z^2 + d_1'(d_2' + 1)z + d_2'\,]\,z}\, dz \qquad (27)$$

Finding the poles of (27) and the corresponding residues in order to write $\xi_{ms}$ as a function of $d_1'$ and $d_2'$ is a problem that is almost mathematically intractable. Instead, a MATLAB program is used to compute the poles and residues for different entries of $d_1'$ and $d_2'$. Only the residues corresponding to poles inside the unit circle are then summed to find and plot $\xi_{ms}$. The results are plotted in Figure 4.

Figure 4: Performance surface for Example 2 with azimuth = 37.5 and elevation = 25.

The two-dimensional performance surface of Figure 4 is also obviously non-quadratic in $d_1'$ and $d_2'$. It has a global minimum at $d_1' = 0.5$ and $d_2' = 0.4$ with $\xi_{ms}(\min) = 1$, as expected. This becomes clearer as the plot is repeated in Figures 5 and 6 with zero elevation and rotated views. The figures demonstrate the presence of local minima which are very close in value to the global minimum, such that the scale of the figures does not show the difference. We also notice that the gradient around the global minimum remains small over a considerable area of the surface. Careful inspection of Figure 4 reveals a small peak to the left of the global minimum, indicating the inevitable presence of a neighboring local minimum. This is the local minimum that appears in Figures 5 and 6. Figure 7 is plotted using a MATLAB command that adds highlights to the surface from a light source with a specified direction. Azimuth and elevation values for this view are 180 and -15, respectively. The figure clearly demonstrates the presence of local minima.
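The pole/residue procedure just described can be sketched in a few lines; the fragment below is our Python rendition of the described MATLAB computation (function name, grid, and tolerance are our own choices). It assumes simple poles; values grow without bound where estimator poles approach the unit circle, which is where the surface genuinely diverges.

```python
import numpy as np

def xi_ms_example2(d1, d2):
    """xi_ms(d1', d2') for Example 2: sum the residues of the integrand of (27)
    at the poles lying inside the unit circle (simple poles assumed)."""
    num = np.array([0.4, 0.7, 1.41, 0.7, 0.4])           # numerator polynomial of (27)
    den = np.array([d2, d1 * (1 + d2), 1 + d1**2 + d2**2,
                    d1 * (1 + d2), d2, 0.0])             # denominator of (27), incl. the factor z
    poles = np.roots(den)
    dden = np.polyder(den)                               # residue at a simple pole p: num(p)/den'(p)
    inside = poles[np.abs(poles) < 1.0 - 1e-9]
    return sum(np.polyval(num, p) / np.polyval(dden, p) for p in inside).real

print(xi_ms_example2(0.5, 0.4))   # global minimum: 1.0, the excitation variance

# Grid for a surface plot like Figure 4; d2' = 0 is avoided because it makes
# the pole at the origin multiple
grid = np.linspace(-0.15, 0.85, 51)
surface = [[xi_ms_example2(a, b) for a in grid] for b in grid]
```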

Figure 5: Performance surface for Example 2 with azimuth = 0 and elevation = 0.

Figure 6: Performance surface for Example 2 with azimuth = 90 and elevation = 0.

4. DISCUSSION
Two main properties characterize the performance surfaces of the MA estimators considered in this paper: their non-quadratic nature and the presence of local minima. These properties have several implications and provide insight as to which iterative minimization method to use. The non-quadratic nature of the univariable performance surface of Figure 3 implies that gradient search methods [1,2] will follow paths of different lengths to the minimum depending on the chosen initial parameter value, even for first-order estimators. This is due to the fact that the gradient is considerably larger on one side of the minimum than on the other. We may start with an initial value resulting in very slow convergence. This becomes evident upon inspection of the gradient search equation

$$\mathbf{d}'_{i+1} = \mathbf{d}'_i + \mu(-\nabla_i) \qquad (28)$$

where $i$ is the step or iteration number and the gradient at $\mathbf{d}' = \mathbf{d}'_i$ is designated by $\nabla_i$. The parameter $\mu$ is a constant that governs stability and rate of convergence. The local minima that appear in the higher-order cases are another problem encountered when using gradient search methods, especially since the number of local minima is expected to increase with the MA order. These methods may fail to reach the global minimum due to the possibility of being trapped at a local minimum. For the above reasons, nonlinear least squares optimization techniques based on the method of Gauss-Newton [1,5,6] seem to offer a better solution. In these methods, descending along the performance surface is governed by

$$\mathbf{d}'_{i+1} = \mathbf{d}'_i - \mathbf{G}_i \nabla_i \qquad (29)$$

where the matrix $\mathbf{G}_i$ modifies the direction of the descent. For quadratic functions, choosing $\mathbf{G}_0 = (2\mathbf{R}_w')^{-1}$ gives $\mathbf{d}_1' = \mathbf{R}_w'^{-1}\,\mathbf{r}_w'$; that is, we find the unique minimum in one step [2]. This provides the motivation for modifying the direction of the gradient using the inverse of $\mathbf{R}_w'$ even for non-quadratic functions. The method thus reduces the convergence time for non-quadratic functions as well, although it still does not guarantee global convergence. Genetic search methods [4] are powerful optimization tools, but a global search takes a long computation time. A genetic algorithm (GA) is a search process based on the laws of natural selection and genetics. The population in a simple GA comprises a group of chromosomes from which candidates can be selected for the solution of a problem. Fitness values for all chromosomes are evaluated to select a particular group of parent chromosomes that generate offspring. The fitness of the offspring is also evaluated, and the fittest offspring replace the parents. The cycle continues until a termination criterion is reached. The best chromosome in the final population then becomes a highly evolved solution to the problem [4]. Genetic-type least-mean-square (LMS) adaptive algorithms can

converge to the global minimum, escaping local minima along the way, in a time that is comparable to that of the well-known LMS algorithm [2]. Such algorithms have been devised and implemented in [3] to solve IIR adaptive filtering problems characterized by the presence of prominent local minima. They can be used with high-order MA estimators as well. Adaptive MA estimators based on the LMS algorithm, which is an approximation to the gradient search technique, are very useful when the signal statistics are not known, but they are suboptimum due to their sensitivity to the value of the step size $\mu$. From Figure 6, we see that the local $\xi_{ms}(\min)$ is very close to the global $\xi_{ms}(\min)$, so that a sufficiently small value of $\mu$ is necessary for the detection of the global minimum. This is why exact least squares methods based on the minimization of the total squared error offer the exact solution and are optimum in this sense. In [7], the problem of MA parameter estimation from sample covariances has been formulated as a semi-definite program that can be solved in a time that is a polynomial function of the MA order. The method is based on a convex optimization formulation, and global convergence is guaranteed in polynomial time. Despite its suboptimality, the genetic-type LMS algorithm is by far the most computationally efficient of the above-mentioned algorithms used in MA parameter estimation. It has a computational burden only slightly greater than that of the pure LMS, which is the most computationally efficient adaptive algorithm [3]. Its usefulness also stems from its capability to operate in a tracking mode when the MA model coefficients are time-varying. The application of this algorithm to MA estimators is challenging since $\xi_{ms}(\min)$ is not zero but equal to the unknown excitation variance $\sigma_w^2$. Therefore, the minimum error threshold [3] needed in this algorithm cannot be anticipated beforehand. However, a logical solution is to modify the algorithm to find all minima, i.e., the search continues for a time during which the probability of finding all minima is acceptably large, which is feasible especially for moderate MA orders. This procedure is common with GAs, where the GA cycle is repeated until some termination criterion is reached, e.g. a predefined number of generations is produced. As explained previously in the light of the results obtained, an adequately small value of the step size $\mu$ must be chosen for such an application. The minimum with the lowest $\xi_{ms}$ is the global minimum. Once the latter is found, the algorithm is then implemented in the form stated in [3] to track time variations of the model coefficients. This provides incentive for future work.
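As a simple numerical illustration of the gradient search (28) on the closed-form univariable surface (26) (our sketch; the step size $\mu$ and the starting points are arbitrary choices), the iteration counts below differ markedly with the initial value, in line with the discussion above:

```python
def xi_ms(d):
    """Closed-form univariable performance surface of eq. (26), Example 1."""
    return (0.6 * d - 1.09) / (d**2 - 1.0)

def grad_xi_ms(d):
    """Derivative of (26) with respect to d1' (valid for |d1'| < 1)."""
    return (-0.6 * d**2 + 2.18 * d - 0.6) / (d**2 - 1.0) ** 2

def gradient_search(d0, mu=0.02, tol=1e-8, max_iter=100_000):
    """Iterate d'_{i+1} = d'_i - mu * grad, eq. (28); return minimizer and step count."""
    d = d0
    for i in range(max_iter):
        g = grad_xi_ms(d)
        if abs(g) < tol:
            return d, i
        d -= mu * g
    return d, max_iter

# The path length depends strongly on the starting point, because the gradient
# is much larger on one side of the minimum (the surface has poles at d1' = +/-1)
for d0 in (-0.4, 0.0, 0.9):
    d_min, steps = gradient_search(d0)
    print(f"start {d0:+.1f}: minimum {d_min:.4f}, xi_ms {xi_ms(d_min):.4f}, {steps} steps")
```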


5. CONCLUSIONS
A general equation for the discrete power spectrum of the error signal of an MA estimator with unknown excitation is provided. The performance surfaces are then derived and plotted. Examples of first- and second-order estimators are worked out, and the properties of their performance surfaces studied. In general, the surfaces are characterized by their non-quadratic nature and the existence of local minima. Existing search algorithms for descent along the performance surfaces are discussed in the light of this study of MA estimator surfaces. The genetic-type LMS algorithm previously used with adaptive IIR filtering seems to be the most appealing, especially when the MA coefficients are time-varying, thereby requiring adaptive estimation. Suggestions are presented concerning the modification of the aforementioned algorithm to suit the application of adaptive MA parameter estimation.

REFERENCES
1. Manolakis, D.G., Ingle, V.K. and Kogon, S.M., Statistical and Adaptive Signal Processing. McGraw-Hill, 2000.
2. Widrow, B. and Stearns, S.D., Adaptive Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1985.
3. Ng, S.C., Leung, S.H., Chung, C.Y., Luk, A. and Lau, W.H., The Genetic Search Approach: A New Learning Algorithm for Adaptive IIR Filtering. IEEE Signal Processing Magazine, vol. 13, no. 6, pp. 38-46, 1996.
4. Tang, K.S., Man, K.F., Kwong, S. and He, Q., Genetic Algorithms and their Applications. IEEE Signal Processing Magazine, vol. 13, no. 6, pp. 22-37, 1996.
5. Soderstrom, T. and Stoica, P., System Identification. Hemel Hempstead, U.K.: Prentice-Hall, 1989.
6. Stoica, P. and Moses, R., Introduction to Spectral Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1997.
7. Stoica, P., McKelvey, T. and Mari, J., MA Estimation in Polynomial Time. IEEE Transactions on Signal Processing, vol. 48, no. 7, pp. 1999-2012, 2000.
