
SYSTEM IDENTIFICATION USING MATLAB

This step-by-step tutorial guides you through an example of system identification (SID) using Matlab. A second order system with two inputs and one output is considered. A graphical representation of the system is given in Figure 1.

[Figure 1: block diagram. SWF passes through g11(s) and CFF passes through g12(s); the two branch outputs are summed to give CUA.]

Figure 1. The system for which SID will be performed

The transfer function g11(s) between SWF and CUA will be determined. The procedure to follow is:

1. Start Matlab with the System Identification Toolbox in its path. Matlab starts and displays a window with a >>. This >> is the Matlab prompt, where data and commands are entered.

2. Load the file ebbles6.mat by entering the following command at the Matlab prompt:
>> load ebbles6

The vectors SWF, CFF and CUA are now loaded into Matlab's memory. These vectors contain numerical data from field measurements. The names and sizes of the variables currently in Matlab's memory can be viewed by issuing the following command:
>> whos

3. The output-input pairs are CUA and SWF, and CUA and CFF. Make matrices z1 and z2 of the output-input pairs by issuing the following commands:
>> z1=[CUA SWF];
>> z2=[CUA CFF];

Do not forget the semicolon, otherwise Matlab displays the whole matrix.

University of Pretoria


4. Plot the output-input pairs:


>> idplot(z1)
>> idplot(z2)

Notice the region where CFF remains constant. This is the region in which we will try to determine the transfer function between SWF and CUA, as we want to isolate the influence of SWF on CUA.

5. Now take a look at the region where CFF is constant:
>> idplot(z2(1000:1900,:))

The above command tells Matlab to make a plot of rows 1000 to 1900 and all columns of z2. Take a look at SWF in the region where CFF is constant:
>> idplot(z1(1000:1900,:))

As can be seen, a large part of SWF is also constant. To obtain a transfer function, we want to isolate the region where SWF changes, since a constant SWF contains no information on the dynamics of the system. This changing region of SWF is the usable region. The usable region can be viewed by:
>> idplot(z1(1500:1900,:))
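The usable region above is chosen by eye from the plots. As a rough cross-check, the samples at which a signal changes can also be located numerically. The sketch below uses only base Matlab functions and a made-up signal; the names x, tol and changeIdx are illustrative and not part of the toolbox, and the threshold would need tuning for noisy field data.

```matlab
% Hypothetical stand-in for a measured input: flat, then a few steps, then flat.
x = [5*ones(20,1); 7*ones(10,1); 4*ones(10,1); 6*ones(15,1)];

tol = 1e-6;                                  % assumed threshold for "no change"
changeIdx = find(abs(diff(x)) > tol) + 1;    % samples that differ from the previous one

firstChange = changeIdx(1);                  % first sample after the initial flat stretch
lastChange  = changeIdx(end);                % last sample at which x changes

fprintf('Input changes between samples %d and %d\n', firstChange, lastChange);
```

For this made-up signal the changes fall between samples 21 and 41. A region selected this way should still be inspected with idplot before it is used.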

6. The usable region can now be split up into two regions.
i.) 2/3 of the region is used to fit the data.
ii.) 1/3 of the region is used to validate the model.

The reason for this approach is to validate the model using data that was not used to fit it. This is a much more stringent test of the model than using the same data for fitting and validation. We now create two new data regions by issuing the following commands:
>> zf=z1(1500:1750,:);
>> zv=z1(1751:1900,:);

zf is used for fitting of the model, and zv is used for validation of the model.


7. Now plot the new data to make sure that the regions are suitable for model fitting and validation.
>> idplot(zf)
>> idplot(zv)

8. The data will now be pre-treated. We will remove the offset of the input SWF and the output CUA, for both the fitting and validation regions, so that an offset parameter will not unnecessarily be fit.
>> zfd=dtrend(zf);
>> zvd=dtrend(zv);
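To make the offset removal concrete, the following sketch does the same thing by hand on a small made-up matrix. It assumes that dtrend, called with a single argument, subtracts the mean of each column; the data and variable names here are illustrative only.

```matlab
% Hypothetical two-column data (output, input) with different offsets.
z = [10 + [0.1; -0.2; 0.3; -0.2], 50 + [1; -1; 2; -2]];

colMeans = mean(z, 1);                        % one mean per column
zd = z - repmat(colMeans, size(z, 1), 1);     % subtract each column's own mean

disp(mean(zd, 1));   % both column means are now zero up to round-off
```

After this step both signals fluctuate around zero, so the model only has to describe how changes in the input produce changes in the output.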

9. The time delay between the input and the output is a critical factor that has to be considered when SID is performed. The time delay can best be determined by zooming in on the plot and then estimating from the plot how long it takes for the output to react to a large input change. Now zoom in on the pre-treated data:
>> idplot(zfd(1:50,:))

From the plot we can see that the time delay is approximately 5 sampling periods. Since the data was acquired using a sampling interval of 10 seconds, the time delay is approximately 50 seconds.

10. We are now ready to fit the data. We will use the ARX (Auto Regression with eXternal input) method. In difference equation format the ARX model is given as follows:
$$\underbrace{y(t) + a_1 y(t-T) + a_2 y(t-2T) + \cdots}_{\text{auto regression part}} = \underbrace{b_1 u(t-nT) + b_2 u(t-nT-T) + \cdots}_{\text{external input part}}$$

where T is the sampling interval, y(t) is the output, u(t) is the input, a_i and b_i are the model coefficients, and n is the number of sampling periods of time delay. In this tutorial we fit only a first order model, for which only the coefficients a1 and b1 are used. The ARX model in difference equation format is then

$$y(t) + a_1 y(t-T) = b_1 u(t-nT).$$


Now fit the data:
>> th=arx(zfd,[1 1 5]);
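As a rough picture of what a first order ARX fit computes, the sketch below estimates a1 and b1 by ordinary least squares on synthetic data, using base Matlab only. The "true" parameter values, the square-wave input and all variable names are made up for illustration, and the toolbox's arx command may use a different numerical procedure internally.

```matlab
% Hypothetical "true" parameters, used only to generate test data.
a1 = -0.9;  b1 = 0.5;  nk = 5;
N  = 200;

% Deterministic square-wave input so the example is reproducible.
u = double(mod(floor((0:N-1)'/20), 2) == 0);

% Simulate y(k) + a1*y(k-1) = b1*u(k-nk) with zero initial conditions.
y = zeros(N, 1);
for k = nk+1:N
    y(k) = -a1*y(k-1) + b1*u(k-nk);
end

% Least-squares fit: y(k) = [-y(k-1), u(k-nk)] * [a1; b1]
k     = (nk+1:N)';
Phi   = [-y(k-1), u(k-nk)];     % one row of regressors per sample
theta = Phi \ y(k);             % theta(1) estimates a1, theta(2) estimates b1

fprintf('a1 = %.4f, b1 = %.4f\n', theta(1), theta(2));
```

Because the synthetic data is noise-free, the fit recovers the generating values (a1 = -0.9000, b1 = 0.5000). With measured data the estimates depend on the noise and on how well the chosen delay matches the process.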


[1 1 5] specifies the model structure: one a coefficient (a1), one b coefficient (b1), and a time delay of n = 5 sampling periods. The results of the arx computation are stored in th.

11. Is our model good enough? The following tests are commonly used:

a. First we simulate the model with the data we used to fit the model, and then with the validation data.
>> compare(zfd,th,1);

The 1 in the above command tells Matlab to use one step ahead prediction. Nearly all models will fare well with this test since the model knows exactly what the current real output is, and needs only to predict the next output. A far more stringent test would be where the model knows only what the inputs are (pure simulation):
>> compare(zfd,th);

The most stringent test of the model occurs when it is validated on data that was not used to fit the model:
>> compare(zvd,th);
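The difference between one step ahead prediction and pure simulation can be illustrated with a few lines of base Matlab. This is only a sketch on synthetic data, not the toolbox's implementation of compare: the plant, the deliberately imperfect model (a1m, b1m) and the input are all made up.

```matlab
% Hypothetical plant and a deliberately imperfect model of it.
a1  = -0.90;  b1  = 0.5;  nk = 5;     % "real" system (illustration only)
a1m = -0.85;  b1m = 0.5;              % model with a small error in a1
N   = 200;
u   = double(mod(floor((0:N-1)'/20), 2) == 0);   % square-wave input

y     = zeros(N, 1);   % measured output
yPred = zeros(N, 1);   % one step ahead prediction
ySim  = zeros(N, 1);   % pure simulation
for k = nk+1:N
    y(k)     = -a1 *y(k-1)    + b1 *u(k-nk);
    yPred(k) = -a1m*y(k-1)    + b1m*u(k-nk);   % uses the measured y(k-1)
    ySim(k)  = -a1m*ySim(k-1) + b1m*u(k-nk);   % uses only its own past output
end

rmsPred = sqrt(mean((y - yPred).^2));
rmsSim  = sqrt(mean((y - ySim ).^2));
fprintf('RMS error: prediction %.4f, simulation %.4f\n', rmsPred, rmsSim);
```

The prediction is corrected by the measured output at every step, so its error stays small, while the simulation error accumulates. This is why the pure simulation test is the more demanding of the two.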

b. The error signal (residuals) contains important information on the shortcomings of the model, and is generated as in Figure 2:
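[Figure 2: block diagram. The same input drives the real plant (output y) and the model (output ŷ); the error is their difference, Error = y - ŷ.]

Figure 2. The error signal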

An auto-correlation of the error signal determines whether the error signal is white noise or not. If so, your model is an unbiased estimator, which means that the errors that the model makes have an equal chance of being positive or negative. The error signal as well as the auto-correlation are generated by
>> error=resid(zfd,th);

You can also plot the error signal to inspect it for outliers. Outliers are errors that are significantly different from average. To plot the error use:


>> plot(error);
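The whiteness check on the residuals can be approximated by hand. The sketch below computes a normalised auto-correlation and compares it with an approximate 99% band of 2.58/sqrt(N), the usual large-sample approximation for white noise. The residual sequence is a made-up stand-in, the names are illustrative, and the toolbox's resid command may compute its confidence limits differently.

```matlab
% Hypothetical residual sequence; in the tutorial this would be the output of resid.
% It is named e rather than error to avoid shadowing Matlab's built-in error function.
N = 500;
e = sin(0.7*(1:N)'.^2);        % deterministic, irregular stand-in for residuals
e = e - mean(e);

maxLag = 25;
r = zeros(maxLag+1, 1);
for lag = 0:maxLag
    r(lag+1) = sum(e(1:N-lag) .* e(1+lag:N)) / sum(e.^2);   % normalised auto-correlation
end

bound = 2.58 / sqrt(N);                     % approximate 99% band for white noise
nOutside = sum(abs(r(2:end)) > bound);      % lag 0 is always 1, so it is skipped
fprintf('%d of %d lags fall outside +/-%.3f\n', nOutside, maxLag, bound);
```

If many lags fall outside the band, the residuals still contain structure that the model has not captured.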

The resid command also generates a cross-correlation between the input u and the error signal. From this test one can determine whether the error signal and u are independent. If correlation is present at negative lags, it means that output feedback occurs: the current error influences the future input. If correlation is present at positive lags, it means that the time delay was estimated incorrectly.

If the auto-correlation of the error signal and the cross-correlation of the error signal and the input are within the dotted red lines, we are 99% sure that the error is white noise, and that the error and input are independent. If this is the case, the model is, statistically speaking, a very good model! (The dotted red lines indicate the 99% confidence intervals.)

12. To get the model in a usable form for simulation and controller design, we have to convert the th model to a Laplace transform model. The ARX difference equation with parameters a1 and b1 can be determined as follows.
>> [dend,numd]=th2arx(th)

Two vectors, dend and numd, are returned. Their formats are

dend = [1 a1]
numd = [0 0 0 0 0 b1]

numd contains a zero for each sampling period of time delay.

13. The ARX model can now be converted to a continuous Laplace transfer function as follows:
>> [numc,denc]=d2cm([0 numd(6)],dend,10,'zoh');

The 10 in the above command is the sampling period in seconds, and 'zoh' selects zero-order hold for the discrete-to-continuous transformation. numd(6) means that we pass only the 6th element of numd, i.e. b1. d2cm stands for discrete-to-continuous conversion with method; in our case the method is zero-order hold.
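For a first order model without delay, the zero-order hold conversion can be written out by hand, which helps when checking the result of d2cm. The sketch below assumes the discrete model b1/(z + a1) with 0 < -a1 < 1; the parameter values are made up and are not the results of this tutorial.

```matlab
% Hypothetical discrete first order ARX parameters (not the tutorial's results).
a1 = -0.9;  b1 = 0.05;  T = 10;      % T = sampling interval in seconds

p     = -a1;                 % discrete pole of b1/(z + a1)
alpha = -log(p) / T;         % continuous pole at s = -alpha (zero-order hold mapping)
K     = b1 / (1 + a1);       % steady-state gain, identical in both domains

numc = K * alpha;            % continuous numerator
denc = [1 alpha];            % continuous denominator: s + alpha

% Round trip: discretising K*alpha/(s + alpha) with a zero-order hold
% should reproduce the original a1 and b1.
a1Back = -exp(-alpha*T);
b1Back = K * (1 - exp(-alpha*T));
fprintf('numc = %.6f, denc = [1 %.6f]\n', numc, denc(2));
```

The time delay is left out of this conversion, just as in the d2cm call above where only b1 is passed, and enters the continuous model separately as a delay term.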

The results that you obtain for numc and denc should be


numc = [0.9326e-3]
denc = [1 0.0059]

The transfer function is then given by


$$g_{11}(s) = \frac{932.6 \times 10^{-6}\, e^{-50s}}{s + 0.0059} = \frac{0.1581\, e^{-50s}}{169s + 1}.$$
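The second form of the transfer function follows from dividing the numerator and denominator by 0.0059. A short check of that arithmetic, with K and tau as illustrative names for the steady-state gain and the time constant:

```matlab
numc = 0.9326e-3;          % continuous numerator from the tutorial
denc = [1 0.0059];         % continuous denominator from the tutorial

K   = numc / denc(2);      % steady-state gain
tau = 1 / denc(2);         % time constant in seconds

fprintf('K = %.4f, tau = %.1f s\n', K, tau);
```

This prints K = 0.1581 and tau = 169.5 s, which agrees with the gain of 0.1581 and the time constant of about 169 seconds in the transfer function above.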
