
LINEAR SYSTEM THEORY

Second Edition

WILSON J. RUGH
Department of Electrical and Computer Engineering
The Johns Hopkins University

PRENTICE HALL, Upper Saddle River, New Jersey 07458

Library of Congress Cataloging-in-Publication Data

Rugh, Wilson J.
Linear system theory / Wilson J. Rugh. -- 2nd ed.
p. cm. -- (Prentice-Hall information and system sciences series)
Includes bibliographical references and index.
ISBN: 0-13-441205-2
1. Control theory. 2. Linear systems. I. Title. II. Series.
QA402.3.R84 1996
003'.74--dc20
95-21164
CIP

Acquisitions editor: Tom Robbins
Production editor: Rose Kernan
Copy editor: Adrienne Rasmussen
Cover designer: Karen Salzbach
Donna Sullivan
Editorial assistant: Phyllis Morgan

© 1996 by Prentice-Hall, Inc.
Simon & Schuster/A Viacom Company
Upper Saddle River, NJ 07458

All rights reserved. No part of this book may be
reproduced, in any form or by any means,
without permission in writing from the publisher.

The author and publisher of this book have used their best efforts in preparing this book. These efforts include the
development, research, and testing of the theories and programs to determine their effectiveness. The author and
publisher make no warranty of any kind, expressed or implied, with regard to these programs or the documentation
contained in this book. The author and publisher shall not be liable in any event for incidental or consequential damages
in connection with, or arising out of, the furnishing, performance, or use of these programs.
Printed in the United States of America

10 9 8 7 6 5

32


Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

To Terry, David, and Karen

PRENTICE HALL INFORMATION AND SYSTEM SCIENCES SERIES


Thomas Kailath, Editor

ANDERSON & MOORE                        Optimal Control: Linear Quadratic Methods
ANDERSON & MOORE                        Optimal Filtering
ASTROM & WITTENMARK                     Computer-Controlled Systems: Theory and Design, 2/E
BASSEVILLE & NIKIFOROV                  Detection of Abrupt Changes: Theory & Application
BOYD & BARRATT                          Linear Controller Design: Limits of Performance
DICKINSON                               Systems: Analysis, Design and Computation
FRIEDLAND                               Advanced Control System Design
GARDNER                                 Statistical Spectral Analysis: A Nonprobabilistic Theory
GRAY & DAVISSON                         Random Processes: A Mathematical Approach for Engineers
GREEN & LIMEBEER                        Linear Robust Control
HAYKIN                                  Adaptive Filter Theory
HAYKIN                                  Blind Deconvolution
JAIN                                    Fundamentals of Digital Image Processing
JOHANSSON                               Modeling and System Identification
JOHNSON                                 Lectures on Adaptive Parameter Estimation
KAILATH                                 Linear Systems
KUNG                                    VLSI Array Processors
KUNG, WHITEHOUSE, & KAILATH, EDS.       VLSI and Modern Signal Processing
KWAKERNAAK & SIVAN                      Signals and Systems
LANDAU                                  System Identification and Control Design Using P.I.M. + Software
LJUNG                                   System Identification: Theory for the User
LJUNG & GLAD                            Modeling of Dynamic Systems
MACOVSKI                                Medical Imaging Systems
MOSCA                                   Stochastic and Predictive Adaptive Control
NARENDRA & ANNASWAMY                    Stable Adaptive Systems
RUGH                                    Linear System Theory
RUGH                                    Linear System Theory, Second Edition
SASTRY & BODSON                         Adaptive Control: Stability, Convergence, and Robustness
SOLIMAN & SRINATH                       Continuous and Discrete Signals and Systems
SOLO & KONG                             Adaptive Signal Processing Algorithms: Stability & Performance
SRINATH, RAJASEKARAN, & VISWANATHAN     Introduction to Statistical Signal Processing with Applications
VISWANADHAM & NARAHARI                  Performance Modeling of Automated Manufacturing Systems
WILLIAMS                                Designing Digital Filters

CONTENTS

PREFACE

CHAPTER DEPENDENCE CHART

1 MATHEMATICAL NOTATION AND REVIEW
    Vectors
    Matrices
    Quadratic Forms
    Matrix Calculus
    Convergence
    Laplace Transform
    z-Transform
    Exercises
    Notes

2 STATE EQUATION REPRESENTATION
    Examples
    Linearization
    State Equation Implementation
    Exercises
    Notes

3 STATE EQUATION SOLUTION
    Existence
    Uniqueness
    Complete Solution
    Additional Examples
    Exercises
    Notes

4 TRANSITION MATRIX PROPERTIES
    Two Special Cases
    General Properties
    State Variable Changes
    Exercises
    Notes

5 TWO IMPORTANT CASES
    Time-Invariant Case
    Periodic Case
    Additional Examples
    Exercises
    Notes

6 INTERNAL STABILITY
    Uniform Stability
    Uniform Exponential Stability
    Uniform Asymptotic Stability
    Lyapunov Transformations
    Additional Examples
    Exercises
    Notes

7 LYAPUNOV STABILITY CRITERIA
    Introduction
    Uniform Stability
    Uniform Exponential Stability
    Instability
    Time-Invariant Case
    Exercises
    Notes

8 ADDITIONAL STABILITY CRITERIA
    Eigenvalue Conditions
    Perturbation Results
    Slowly-Varying Systems
    Exercises
    Notes

9 CONTROLLABILITY AND OBSERVABILITY
    Controllability
    Observability
    Additional Examples
    Exercises
    Notes

10 REALIZABILITY
    Formulation
    Realizability
    Minimal Realization
    Special Cases
    Time-Invariant Case
    Additional Examples
    Exercises
    Notes

11 MINIMAL REALIZATION
    Assumptions
    Time-Varying Realizations
    Time-Invariant Realizations
    Realization from Markov Parameters
    Exercises
    Notes

12 INPUT-OUTPUT STABILITY
    Uniform Bounded-Input Bounded-Output Stability
    Relation to Uniform Exponential Stability
    Time-Invariant Case
    Exercises
    Notes

13 CONTROLLER AND OBSERVER FORMS
    Controllability
    Controller Form
    Observability
    Observer Form
    Exercises
    Notes

14 LINEAR FEEDBACK
    Effects of Feedback
    State Feedback Stabilization
    Eigenvalue Assignment
    Noninteracting Control
    Additional Examples
    Exercises
    Notes

15 STATE OBSERVATION
    Observers
    Output Feedback Stabilization
    Reduced-Dimension Observers
    Time-Invariant Case
    A Servomechanism Problem
    Exercises
    Notes

16 POLYNOMIAL FRACTION DESCRIPTION
    Right Polynomial Fractions
    Left Polynomial Fractions
    Column and Row Degrees
    Exercises
    Notes

17 POLYNOMIAL FRACTION APPLICATIONS
    Minimal Realization
    Poles and Zeros
    State Feedback
    Exercises
    Notes

18 GEOMETRIC THEORY
    Subspaces
    Invariant Subspaces
    Canonical Structure Theorem
    Controlled Invariant Subspaces
    Controllability Subspaces
    Stabilizability and Detectability
    Exercises
    Notes

19 APPLICATIONS OF GEOMETRIC THEORY
    Disturbance Decoupling
    Disturbance Decoupling with Eigenvalue Assignment
    Noninteracting Control
    Maximal Controlled Invariant Subspace Computation
    Exercises
    Notes

20 DISCRETE TIME: STATE EQUATIONS
    Examples
    Linearization
    State Equation Implementation
    State Equation Solution
    Transition Matrix Properties
    Additional Examples
    Exercises
    Notes

21 DISCRETE TIME: TWO IMPORTANT CASES
    Time-Invariant Case
    Periodic Case
    Exercises
    Notes

22 DISCRETE TIME: INTERNAL STABILITY
    Uniform Stability
    Uniform Exponential Stability
    Uniform Asymptotic Stability
    Additional Examples
    Exercises
    Notes

23 DISCRETE TIME: LYAPUNOV STABILITY CRITERIA
    Uniform Stability
    Uniform Exponential Stability
    Instability
    Time-Invariant Case
    Exercises
    Notes

24 DISCRETE TIME: ADDITIONAL STABILITY CRITERIA
    Eigenvalue Conditions
    Perturbation Results
    Slowly-Varying Systems
    Exercises
    Notes

25 DISCRETE TIME: REACHABILITY AND OBSERVABILITY
    Reachability
    Observability
    Additional Examples
    Exercises
    Notes

26 DISCRETE TIME: REALIZATION
    Realizability
    Transfer Function Realizability
    Minimal Realization
    Time-Invariant Case
    Realization from Markov Parameters
    Additional Examples
    Exercises
    Notes

27 DISCRETE TIME: INPUT-OUTPUT STABILITY
    Uniform Bounded-Input Bounded-Output Stability
    Relation to Uniform Exponential Stability
    Time-Invariant Case
    Exercises
    Notes

28 DISCRETE TIME: LINEAR FEEDBACK
    Effects of Feedback
    State Feedback Stabilization
    Eigenvalue Assignment
    Noninteracting Control
    Additional Examples
    Exercises
    Notes

29 DISCRETE TIME: STATE OBSERVATION
    Observers
    Output Feedback Stabilization
    Reduced-Dimension Observers
    Time-Invariant Case
    A Servomechanism Problem
    Exercises
    Notes

AUTHOR INDEX

SUBJECT INDEX

PREFACE

A course on linear system theory at the graduate level typically is a second course on
linear state equations for some students, a first course for a few, and somewhere between
for others. It is the course where students from a variety of backgrounds begin to acquire
the tools used in the research literature involving linear systems. This book is my notion
of what such a course should be. The core material is the theory of time-varying linear
systems, in both continuous- and discrete-time, with frequent specialization to the
time-invariant case. Additional material, included for flexibility in the curriculum, explores
refinements and extensions, many confined to time-invariant linear systems.
Motivation for presenting linear system theory in the time-varying context is at
least threefold. First, the development provides an excellent review of the time-invariant
case, both in the remarkable similarity of the theories and in the perspective afforded by
specialization. Second, much of the research literature in linear systems treats the
time-varying case, both for generality and because time-varying linear system theory plays an
important role in other areas, for example adaptive control and nonlinear systems.
Finally, of course, the theory is directly relevant when a physical system is described by
a linear state equation with time-varying coefficients.
Technical development of the material is careful, even rigorous, but not fancy.
The presentation is self-contained and proceeds step-by-step from a modest
mathematical base. To maximize clarity and render the theory as accessible as possible,
I minimize terminology, use default assumptions that avoid fussy technicalities, and
employ a clean, simple notation.

The prose style intentionally is lean to avoid beclouding the theory. For those
seeking elaboration and congenial discussion, a Notes section in each chapter indicates
further developments and additional topics. These notes are entry points to the literature
rather than balanced reviews of so many research efforts over the years. The
continuous-time and discrete-time notes are largely independent, and both should be
consulted for information on a specific topic.


Over 400 exercises are offered, ranging from drill problems to extensions of the
theory. Not all exercises have been duplicated across time domains, and this is an easy
source for more. All exercises in Chapter 1 are used in subsequent material. Aside from
Chapter 1, results of exercises are used infrequently in the presentation, at least in the
more elementary chapters. But linear system theory is not a spectator sport, and the
exercises are an important part of the book.
In this second edition there are a number of improvements to material in the first
edition, including more examples to illustrate in simple terms how the theory might be
applied and more drill exercises to complement the many proof exercises. Also there are
10 new chapters on the theory of discrete-time, time-varying linear systems. These new
chapters are independent of, and largely parallel to, treatment of the continuous-time,
time-varying case. Though the discrete-time setting often is more elementary in a
technical sense, the presentation occasionally recognizes that most readers first study
continuous-time systems.

Organization of the material is shown on the Chapter Dependence Chart. Depending
on background it might be preferable to review mathematical topics in
Chapter 1 as needed, rather than at the outset. There is flexibility in studying either the
discrete-time or continuous-time material alone, or treating both, in either order. The
additional possibility of caroming between the two time domains is not shown in order to
preserve Chart readability. In any case discussions of periodic systems, chapters on
Additional Stability Criteria, and various topics in minimal realization are optional.
Chapter 13, Controller and Observer Forms, is devoted to time-invariant linear
systems. The material is presented in the continuous-time setting, but can be entered
from a discrete-time preparation. Chapter 13 is necessary for the portions of chapters on
State Feedback and State Observation that treat eigenvalue assignment. The optional
topics for time-invariant linear systems in Chapters 16-19 also require Chapter 13, and
also are accessible with either preparation. These topics are the polynomial fraction
description, which exhibits the detailed structure of the transfer function representation
for multi-input, multi-output systems, and the geometric description of the fine structure
of linear state equations.

Acknowledgments
I wrote this book with more than a little help from my friends. Generations of graduate
students at Johns Hopkins offered gentle instruction. Colleagues down the hall, around
the continent, and across oceans provided numerous consultations. Names are unlisted
here, but registered in my memory. Thanks to all for encouragement and valuable
suggestions, and for pointing out obscurities and errors. Also I am grateful to the Johns
Hopkins University for an environment where I can freely direct my academic efforts,
and to the Air Force Office of Scientific Research for support of research compatible
with attention to theoretical foundations.
WJR
Baltimore, Maryland, USA

[Figure: Chapter Dependence Chart]


1 MATHEMATICAL NOTATION AND REVIEW

Throughout this book we use mathematical analysis, linear algebra, and matrix theory at
what might be called an advanced undergraduate level. For some topics a review might
be beneficial to the typical reader, and the best sources for such review are mathematics
texts. Here a quick listing of basic notions is provided to set notation and provide
reminders. In addition there are exercises that can be solved by reasonably
straightforward applications of these notions. Results of exercises in this chapter are
used in the sequel, and therefore the exercises should be perused, at least. With minor
exceptions all the mathematical tools in Chapters 2-15, 20-29 are self-contained
developments of material reviewed here. In Chapters 16-19 additional mathematical
background is introduced for local purposes.
Basic mathematical objects in linear system theory are $n \times 1$ or $1 \times n$ vectors and
$m \times n$ matrices with real entries, though on occasion complex entries arise. Typically
vectors are in lower-case italics, matrices are in upper-case italics, and scalars (real, or
sometimes complex) are represented by Greek letters. Usually the $i^{th}$-entry in a vector $x$
is denoted $x_i$, and the $i,j$-entry in a matrix $A$ is written $a_{ij}$ or $[A]_{ij}$. These notations are
not completely consistent, if for no other reason than scalars can be viewed as special
cases of vectors, and vectors can be viewed as special cases of matrices. Moreover,
notational conventions are abandoned when they collide with strong tradition.
With the usual definition of addition and scalar multiplication, the set of all $n \times 1$
vectors and, more generally, the set of all $m \times n$ matrices, can be viewed as vector spaces
over the real (or complex) field. In the real case the vector space of $n \times 1$ vectors is
written as $R^{n \times 1}$, or simply $R^n$, and a vector space of matrices is written as $R^{m \times n}$. The
default throughout is the real case; when matrices or vectors with complex entries
($i = \sqrt{-1}$) are at issue, special mention will be made. It is useful for some of the later
chapters to review the axioms for a field and a vector space, though for most of the book
technical developments are phrased in the language of matrix algebra.

Vectors
Two $n \times 1$ vectors $x$ and $y$ are called linearly independent if no nontrivial linear
combination of $x$ and $y$ gives the zero vector. This means that if $\alpha x + \beta y = 0$, then both
scalars $\alpha$ and $\beta$ are zero. Of course the definition extends to a linear combination of
any number of vectors. A set of $n$ linearly independent $n \times 1$ vectors forms a basis for
the vector space of all $n \times 1$ vectors. The set of all linear combinations of a specified set
of vectors is a vector space called the span of the set of vectors. For example
$\mathrm{span}\,\{x, y, z\}$ is a 3-dimensional subspace of $R^n$, if $x$, $y$, and $z$ are linearly
independent $n \times 1$ vectors.
Without exception we use the Euclidean norm for $n \times 1$ vectors, defined as
follows. Writing a vector and its transpose in the form

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad x^T = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}$$

let

$$\|x\| = \sqrt{x^T x} = \left[ \sum_{i=1}^{n} x_i^2 \right]^{1/2}$$

Elementary inequalities relating the Euclidean norm of a vector to the absolute values of
entries are (max of course is short for maximum)

$$\max_{1 \le i \le n} |x_i| \le \|x\| \le \sqrt{n}\, \max_{1 \le i \le n} |x_i|$$

As any norm must, the Euclidean norm has the following properties for arbitrary
vectors $x$ and $y$, and any scalar $\alpha$:

$$\|x\| \ge 0$$
$$\|x\| = 0 \ \text{if and only if} \ x = 0$$
$$\|\alpha x\| = |\alpha|\, \|x\|$$
$$\|x + y\| \le \|x\| + \|y\|$$

The last of these is called the triangle inequality. Also the Cauchy-Schwarz inequality
in terms of the Euclidean norm is

$$|x^T y| \le \|x\|\, \|y\|$$

If $x$ is complex, then the transpose of $x$ must be replaced by conjugate transpose, also
known as Hermitian transpose, and thus written $x^H$, throughout the above discussion.
Overbar denotes the complex conjugate, $\bar{x}$, when transpose is not desired. For scalar $x$
either is correctly construed as complex conjugate, and $|x|$ is the magnitude of $x$.
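
The norm facts above are easy to spot-check numerically. The following is a minimal Python sketch, not part of the original development; the vectors `x` and `y` and the helper names `norm` and `dot` are illustrative assumptions.

```python
import math

def norm(x):
    """Euclidean norm of a real vector given as a list of numbers."""
    return math.sqrt(sum(xi * xi for xi in x))

def dot(x, y):
    """Inner product x^T y of two real vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

# Illustrative data (assumed, not from the text).
x = [3.0, -4.0, 12.0]
y = [1.0, 2.0, -2.0]
n = len(x)
largest_entry = max(abs(xi) for xi in x)

# max |x_i| <= ||x|| <= sqrt(n) max |x_i|
assert largest_entry <= norm(x) <= math.sqrt(n) * largest_entry
# Triangle inequality: ||x + y|| <= ||x|| + ||y||
assert norm([a + b for a, b in zip(x, y)]) <= norm(x) + norm(y)
# Cauchy-Schwarz inequality: |x^T y| <= ||x|| ||y||
assert abs(dot(x, y)) <= norm(x) * norm(y)
```

A check of this kind on a few vectors does not prove the inequalities, but it catches transcription slips such as a missing $\sqrt{n}$ factor.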

Matrices
For matrices there are several standard concepts and special notations used in the sequel.
The $m \times n$ matrix with all entries zero is written as $0_{m \times n}$, or simply $0$ when dimensional
emphasis is not needed. For square matrices, $m = n$, the zero matrix sometimes is written
as $0_n$, while the identity matrix is written similarly as $I_n$ or $I$. We reserve the notation
$e_k$ for the $k^{th}$-column or $k^{th}$-row, depending on context, of the identity matrix.
The notions of addition and multiplication for conformable matrices are presumed
to be familiar. Of course the multiplication operation is more interesting, in part because
it is not commutative in general. That is, $AB$ and $BA$ are not always the same. If $A$ is
square, then for nonnegative integer $k$ the power $A^k$ is well defined, with $A^0 = I$. If there
is a positive $k$ such that $A^k = 0$, then $A$ is called nilpotent.
Similar to the vector case, the transpose of a matrix $A$ with entries $a_{ij}$ is the
matrix $A^T$ with $i,j$-entry given by $a_{ji}$. A useful fact is $(AB)^T = B^T A^T$.
For a square $n \times n$ matrix $A$, the trace is the sum of the diagonal entries, written

$$\mathrm{tr}\, A = \sum_{i=1}^{n} a_{ii}$$

If $B$ also is $n \times n$, then $\mathrm{tr}\,[AB] = \mathrm{tr}\,[BA]$.
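
A small sketch can make the product, transpose, trace, and nilpotency statements concrete. The matrices `A`, `B`, `N` and the helper functions below are assumptions chosen for illustration, written in plain Python with matrices as lists of rows.

```python
def matmul(A, B):
    """Product of conformable matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Matrix whose i,j-entry is the j,i-entry of A."""
    return [list(col) for col in zip(*A)]

def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

# Illustrative matrices (assumed, not from the text).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
N = [[0, 1], [0, 0]]          # nilpotent: N^2 = 0

AB, BA = matmul(A, B), matmul(B, A)
assert AB != BA                                   # multiplication is not commutative
assert trace(AB) == trace(BA)                     # yet the traces agree
assert transpose(AB) == matmul(transpose(B), transpose(A))
assert matmul(N, N) == [[0, 0], [0, 0]]
```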


A familiar scalar-valued function of a square matrix $A$ is the determinant. The
determinant of $A$ can be evaluated via the Laplace expansion described as follows. Let
$c_{ij}$ denote the cofactor corresponding to the entry $a_{ij}$. Recall that $c_{ij}$ is $(-1)^{i+j}$ times
the determinant of the $(n-1) \times (n-1)$ matrix that results when the $i^{th}$-row and
$j^{th}$-column of $A$ are deleted. Then for any fixed $i$, $1 \le i \le n$,

$$\det A = \sum_{j=1}^{n} a_{ij} c_{ij}$$

This is the expansion of the determinant along the $i^{th}$-row. A similar formula holds for
the expansion along a column. Aside from being a useful representation for the
determinant, recursive use of this expression provides a method for computing the
determinant of a matrix from the fact that the determinant of a scalar is simply the scalar
itself. Since this procedure expresses the determinant as a sum of products of entries of
the matrix, the determinant viewed as a function of the matrix entries is continuously
differentiable any number of times. Finally if $B$ also is $n \times n$, then

$$\det (AB) = \det A \, \det B = \det (BA)$$
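
The recursive use of the Laplace expansion can be sketched directly. The code below is one possible illustration, always expanding along the first row; the function names and test matrices are assumptions, and the approach is meant for small matrices only since its cost grows factorially with $n$.

```python
def minor(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by recursive Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]        # the determinant of a scalar is the scalar itself
    # With 0-based indices and row i = 0, the cofactor sign (-1)^(i+j) is (-1)^j.
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def matmul(A, B):
    """Product of conformable matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Illustrative integer matrices (assumed, not from the text).
A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
B = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]

# det(AB) = det A det B = det(BA), exact here because the entries are integers.
assert det(matmul(A, B)) == det(A) * det(B) == det(matmul(B, A))
```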

The matrix $A$ has an inverse, written $A^{-1}$, if and only if $\det A \ne 0$. One formula
for $A^{-1}$ that occurs often is based on the cofactors of $A$. The adjugate of $A$, written
$\mathrm{adj}\, A$, is the matrix with $i,j$-entry given by the cofactor $c_{ji}$. In other words, $\mathrm{adj}\, A$ is the
transpose of the matrix of cofactors. Then

$$A^{-1} = \frac{\mathrm{adj}\, A}{\det A}$$

a standard, collapsed way of writing the product of the scalar $1/(\det A)$ and the matrix
$\mathrm{adj}\, A$. The inverse of a product of square, invertible matrices is given by

$$(AB)^{-1} = B^{-1} A^{-1}$$
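
The adjugate formula for the inverse can be exercised with exact rational arithmetic. This is a hedged sketch using Python's `fractions.Fraction`; the helper names and the integer test matrices are assumptions, and cofactor-based inversion is shown for clarity rather than efficiency.

```python
from fractions import Fraction

def minor(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def adjugate(A):
    """Transpose of the matrix of cofactors: the i,j-entry is the cofactor c_ji."""
    n = len(A)
    if n == 1:
        return [[1]]
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]

def inverse(A):
    """Inverse as adj A / det A, exact for integer or rational entries."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is not invertible (det A = 0)")
    return [[Fraction(c, d) for c in row] for row in adjugate(A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Illustrative matrices (assumed, not from the text).
A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
B = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

assert matmul(A, inverse(A)) == identity
# Inverse of a product reverses the order of the factors.
assert inverse(matmul(A, B)) == matmul(inverse(B), inverse(A))
```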
If $A$ is $n \times n$ and $p$ is a nonzero $n \times 1$ vector such that for some scalar $\lambda$

$$Ap = \lambda p \tag{6}$$

then $p$ is an eigenvector corresponding to the eigenvalue $\lambda$. Of course $p$ must be
presumed nonzero, for if $p = 0$, then this equation is satisfied for any $\lambda$. Also any
nonzero scalar multiple of an eigenvector is another eigenvector. We must be a bit
careful here, because a real matrix can have complex eigenvalues and eigenvectors,
though the eigenvalues must occur in conjugate pairs, and conjugate corresponding
eigenvectors can be assumed. In other words if $Ap = \lambda p$, then $A \bar{p} = \bar{\lambda} \bar{p}$. These notions
can be refined by viewing (6) as the definition of a right eigenvector. Then it is natural
to define a left eigenvector for $A$ as a nonzero $1 \times n$ vector $q$ such that $qA = \lambda q$ for
some eigenvalue $\lambda$.
The $n$ eigenvalues of $A$ are precisely the $n$ roots of the characteristic polynomial
of $A$, given by $\det (sI_n - A)$. Since the roots of a polynomial are continuous functions of
the coefficients of the polynomial, the eigenvalues of a matrix are continuous functions
of the matrix entries. Recall that the product of the $n$ eigenvalues of $A$ gives $\det A$,
while the sum of the $n$ eigenvalues is $\mathrm{tr}\, A$.
The Cayley-Hamilton theorem states that if

$$\det (sI_n - A) = s^n + a_{n-1} s^{n-1} + \cdots + a_1 s + a_0$$

then

$$A^n + a_{n-1} A^{n-1} + \cdots + a_1 A + a_0 I_n = 0_n$$

Our main application of this result is to write $A^{n+k}$, for integer $k \ge 0$, as a linear
combination of $I, A, \ldots, A^{n-1}$.
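
For a $2 \times 2$ matrix the characteristic polynomial is $s^2 - (\mathrm{tr}\, A)\, s + \det A$, so the Cayley-Hamilton theorem and the reduction of higher powers can be checked by hand or with a few lines of code. The sketch below assumes an illustrative matrix `A` and helper names that are not from the text.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def combine(a, A, b, B):
    """Entry-by-entry linear combination a*A + b*B of equal-size matrices."""
    return [[a * A[i][j] + b * B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

# Illustrative 2 x 2 matrix (assumed, not from the text).
A = [[1, 2], [3, 4]]
I2 = [[1, 0], [0, 1]]

# det(sI - A) = s^2 + a1 s + a0 with a1 = -tr A and a0 = det A.
a1 = -(A[0][0] + A[1][1])
a0 = A[0][0] * A[1][1] - A[0][1] * A[1][0]

A2 = matmul(A, A)
# Cayley-Hamilton: A^2 + a1 A + a0 I = 0.
residual = combine(1, A2, 1, combine(a1, A, a0, I2))
assert residual == [[0, 0], [0, 0]]

# Consequently A^2 = -a1 A - a0 I, and multiplying by A once more gives
# A^3 = (a1^2 - a0) A + a1 a0 I, a linear combination of I and A.
A3_direct = matmul(A, A2)
A3_reduced = combine(a1 * a1 - a0, A, a1 * a0, I2)
assert A3_direct == A3_reduced
```

The same substitution, applied repeatedly, expresses any higher power of $A$ in terms of $I$ and $A$.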
A similarity transformation of the type $T^{-1}AT$, where $A$ and invertible $T$ are
$n \times n$, occurs frequently. It is a simple exercise to show that $T^{-1}AT$ and $A$ have the
same set of eigenvalues. If $A$ has distinct eigenvalues, and $T$ has as columns a
corresponding set of (linearly independent) eigenvectors for $A$, then $T^{-1}AT$ is a
diagonal matrix, with the eigenvalues of $A$ as the diagonal entries. Therefore this
computation can lead to a matrix with complex entries.

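1.1 Example  The characteristic polynomial of

$$A = \begin{bmatrix} 0 & -2 \\ 2 & -2 \end{bmatrix}$$

is

$$\det (\lambda I - A) = \det \begin{bmatrix} \lambda & 2 \\ -2 & \lambda + 2 \end{bmatrix} = \lambda^2 + 2\lambda + 4$$

Therefore $A$ has eigenvalues $\lambda_a = -1 + i\sqrt{3}$ and $\lambda_b = -1 - i\sqrt{3}$.
Setting up (6) to compute a right eigenvector $p^a$ corresponding to $\lambda_a$ gives the linear
equation

$$\begin{bmatrix} 0 & -2 \\ 2 & -2 \end{bmatrix} \begin{bmatrix} p_1^a \\ p_2^a \end{bmatrix} = (-1 + i\sqrt{3}) \begin{bmatrix} p_1^a \\ p_2^a \end{bmatrix}$$

One nonzero solution is

$$p^a = \begin{bmatrix} 2 \\ 1 - i\sqrt{3} \end{bmatrix} \tag{8}$$

A similar calculation gives an eigenvector corresponding to $\lambda_b$ that is simply the
complex conjugate of $p^a$. Then the invertible matrix

$$T = \begin{bmatrix} 2 & 2 \\ 1 - i\sqrt{3} & 1 + i\sqrt{3} \end{bmatrix}$$

yields the diagonal form

$$T^{-1}AT = \begin{bmatrix} -1 + i\sqrt{3} & 0 \\ 0 & -1 - i\sqrt{3} \end{bmatrix}$$

□□□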
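
Calculations of this type, for a real $2 \times 2$ matrix with complex eigenvalues, can be reproduced with complex arithmetic. In the sketch below the matrix `A` is an assumed illustrative choice, the eigenvalues come from the quadratic formula applied to $\lambda^2 - (\mathrm{tr}\, A)\lambda + \det A$, and the eigenvector formula assumes the 1,2-entry of `A` is nonzero. The helper names are hypothetical.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of lambda^2 - (tr A) lambda + det A via the quadratic formula."""
    tr = A[0][0] + A[1][1]
    dt = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4 * dt)
    return (tr + root) / 2, (tr - root) / 2

def right_eigenvector_2x2(A, lam):
    """Solve the first row of (A - lam I) p = 0; assumes A[0][1] != 0."""
    return [A[0][1], lam - A[0][0]]

def matvec(A, p):
    return [A[0][0] * p[0] + A[0][1] * p[1], A[1][0] * p[0] + A[1][1] * p[1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(T):
    d = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    return [[T[1][1] / d, -T[0][1] / d], [-T[1][0] / d, T[0][0] / d]]

# Illustrative real matrix with complex eigenvalues (assumed for this sketch).
A = [[0, -2], [2, -2]]
lam_a, lam_b = eigenvalues_2x2(A)
p_a = right_eigenvector_2x2(A, lam_a)
p_b = right_eigenvector_2x2(A, lam_b)

# A p = lambda p, and the second eigenpair is the complex conjugate of the first.
assert all(abs(l - lam_a * r) < 1e-12 for l, r in zip(matvec(A, p_a), p_a))
assert abs(lam_b - lam_a.conjugate()) < 1e-12

# Columns of T are the eigenvectors; T^{-1} A T is diagonal with complex entries.
T = [[p_a[0], p_b[0]], [p_a[1], p_b[1]]]
D = matmul(inverse_2x2(T), matmul(A, T))
assert abs(D[0][1]) < 1e-12 and abs(D[1][0]) < 1e-12
```

Since any nonzero scalar multiple of an eigenvector is again an eigenvector, the vectors produced here may differ from a hand calculation by a scalar factor.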
We often use the basic solvability conditions for a linear equation

$$Ax = b \tag{9}$$

where $A$ is a given $m \times n$ matrix, and $b$ is a given $m \times 1$ vector. The range space or
image of $A$ is the vector space (subspace of $R^m$) spanned by the columns of $A$. The null
space or kernel of $A$ is the vector space of all $n \times 1$ vectors $x$ such that $Ax = 0$. The
linear equation (9) has a solution if and only if $b$ is in the range space of $A$, or, more
subtly, if and only if $b^T y = 0$ for all $y$ in the null space of $A^T$. Of course if $m = n$ and
$A$ is invertible, then there is a unique solution for any given $b$; namely $x = A^{-1}b$. The
rank of an $m \times n$ matrix $A$ is equivalently the dimension of the range space of $A$ as a
vector subspace of $R^m$, the number of linearly independent column vectors in the matrix,
or the number of linearly independent row vectors. An important inequality involving an
$m \times n$ matrix $A$ and an $n \times p$ matrix $B$ is

$$\mathrm{rank}\, A + \mathrm{rank}\, B - n \le \mathrm{rank}\, (AB) \le \min \{ \mathrm{rank}\, A, \ \mathrm{rank}\, B \}$$

For many calculations it is convenient to make use of partitioned vectors and
matrices. Standard computations can be expressed in terms of operations on the
partitions, when the partitions are conformable. For example, with all partitions square
and of the same dimension,

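
$$\begin{bmatrix} A_1 & A_2 \\ 0 & A_4 \end{bmatrix} + \begin{bmatrix} B_1 & B_2 \\ B_3 & 0 \end{bmatrix} = \begin{bmatrix} A_1 + B_1 & A_2 + B_2 \\ B_3 & A_4 \end{bmatrix}$$

$$\begin{bmatrix} A_1 & A_2 \\ 0 & A_4 \end{bmatrix} \begin{bmatrix} B_1 & B_2 \\ B_3 & 0 \end{bmatrix} = \begin{bmatrix} A_1 B_1 + A_2 B_3 & A_1 B_2 \\ A_4 B_3 & 0 \end{bmatrix}$$

If $x$ is an $n \times 1$ vector and $A$ is an $m \times n$ matrix partitioned by rows,

$$Ax = \begin{bmatrix} A_1 \\ \vdots \\ A_m \end{bmatrix} x = \begin{bmatrix} A_1 x \\ \vdots \\ A_m x \end{bmatrix}$$

If $A$ is partitioned by columns, and $z$ is $m \times 1$,

$$z^T \begin{bmatrix} A_1 & \cdots & A_n \end{bmatrix} = \begin{bmatrix} z^T A_1 & \cdots & z^T A_n \end{bmatrix}$$

A useful feature of partitioned square matrices with square partitions as diagonal blocks
is

$$\det \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} = \det A_{11} \, \det A_{22}$$

When in doubt about a specific partitioned calculation, always pause and carefully check
a simple yet nontrivial example.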
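
In that spirit, here is one simple yet nontrivial check of the block-triangular determinant property, written as a Python sketch. The blocks `A11`, `A12`, `A22` and the helper names are illustrative assumptions; the determinant is computed by first-row Laplace expansion.

```python
def minor(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def block_upper_triangular(A11, A12, A22):
    """Assemble [[A11, A12], [0, A22]] from blocks stored as lists of rows."""
    zero_rows = [[0] * len(A11[0]) for _ in A22]
    top = [r1 + r2 for r1, r2 in zip(A11, A12)]
    bottom = [z + r for z, r in zip(zero_rows, A22)]
    return top + bottom

# Illustrative 2 x 2 blocks (assumed, not from the text).
A11 = [[1, 2], [3, 4]]
A12 = [[5, 6], [7, 8]]
A22 = [[2, 1], [1, 3]]

M = block_upper_triangular(A11, A12, A22)
# The off-diagonal block A12 does not affect the determinant.
assert det(M) == det(A11) * det(A22)
```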
The induced norm of an $m \times n$ matrix $A$ can be defined in terms of a constrained
maximization problem. Let

$$\|A\| = \max_{\|x\| = 1} \|Ax\| \tag{10}$$

where notation is somewhat abused. First, the same symbol is used for the induced norm
of a matrix as for the norm of a vector. Second, the norms appearing on the right side of
(10) are the Euclidean norms of the vectors $x$ and $Ax$, and $Ax$ is $m \times 1$ while $x$ is $n \times 1$.
We will use without proof the facts that the maximum indicated in (10) actually is
attained for some unity-norm $x$, and that this $x$ is real for real $A$. Alternately the norm
of $A$ induced by the Euclidean norm is equal to the (nonnegative) square root of the
largest eigenvalue of $A^T A$, or of $A A^T$. (A proof is invited in Exercise 1.11.) While
induced norms corresponding to other vector norms can be defined, only this so-called
spectral norm for matrices is used in the sequel.

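1.2 Example  If $\lambda_1$ and $\lambda_2$ are real numbers, then the spectral norm of

$$A = \begin{bmatrix} \lambda_1 & 1 \\ 0 & \lambda_2 \end{bmatrix}$$

is given by (10) as

$$\|A\| = \max_{\|x\| = 1} \sqrt{(\lambda_1 x_1 + x_2)^2 + \lambda_2^2 x_2^2}$$

To elude this constrained maximization problem, we compute $\|A\|$ by computing the
eigenvalues of $A^T A$. The characteristic polynomial of $A^T A$ is

$$\det (\lambda I - A^T A) = \det \begin{bmatrix} \lambda - \lambda_1^2 & -\lambda_1 \\ -\lambda_1 & \lambda - \lambda_2^2 - 1 \end{bmatrix} = \lambda^2 - (1 + \lambda_1^2 + \lambda_2^2)\lambda + \lambda_1^2 \lambda_2^2$$

The roots of this quadratic are given by

$$\lambda = \frac{1 + \lambda_1^2 + \lambda_2^2 \pm \sqrt{(1 + \lambda_1^2 + \lambda_2^2)^2 - 4\lambda_1^2 \lambda_2^2}}{2}$$

The radical can be rewritten so that its positivity is obvious. Then the largest root is
obtained by choosing the plus sign, and a little algebra gives

$$\|A\| = \frac{\sqrt{(\lambda_1 + \lambda_2)^2 + 1} + \sqrt{(\lambda_1 - \lambda_2)^2 + 1}}{2}$$

□□□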
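
The two characterizations of the spectral norm can be compared numerically. The sketch below assumes the upper-triangular $2 \times 2$ form with diagonal entries $\lambda_1$, $\lambda_2$ and a 1 in the upper right corner, and compares three quantities: the square root of the largest eigenvalue of $A^T A$, the closed-form expression $\tfrac{1}{2}\big[\sqrt{(\lambda_1+\lambda_2)^2+1} + \sqrt{(\lambda_1-\lambda_2)^2+1}\big]$, and a brute-force maximization over sampled unit vectors. Function names and the sample values are illustrative assumptions.

```python
import math

def spectral_norm_2x2(A):
    """Square root of the largest eigenvalue of A^T A for a real 2 x 2 matrix."""
    (a, b), (c, d) = A
    g11, g12, g22 = a * a + c * c, a * b + c * d, b * b + d * d   # entries of A^T A
    tr, dt = g11 + g22, g11 * g22 - g12 * g12
    return math.sqrt((tr + math.sqrt(max(tr * tr - 4 * dt, 0.0))) / 2)

def closed_form(l1, l2):
    """Closed-form norm for the assumed matrix [[l1, 1], [0, l2]]."""
    return (math.sqrt((l1 + l2) ** 2 + 1) + math.sqrt((l1 - l2) ** 2 + 1)) / 2

def sampled_norm(A, samples=20000):
    """Approximate max ||Ax|| over unit vectors x = (cos th, sin th)."""
    best = 0.0
    for k in range(samples):
        th = 2 * math.pi * k / samples
        x1, x2 = math.cos(th), math.sin(th)
        best = max(best, math.hypot(A[0][0] * x1 + A[0][1] * x2,
                                    A[1][0] * x1 + A[1][1] * x2))
    return best

l1, l2 = 2.0, -1.0               # illustrative values (assumed)
A = [[l1, 1.0], [0.0, l2]]

exact = spectral_norm_2x2(A)
assert abs(exact - closed_form(l1, l2)) < 1e-12
# Sampling can only underestimate the constrained maximum.
assert sampled_norm(A) <= exact + 1e-12
```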
The induced norm of an $m \times n$ matrix satisfies the axioms of a norm on $R^{m \times n}$, and
additional properties as well. In particular $\|A^T\| = \|A\|$, a neat instance of which is
that the induced norm $\|x^T\|$ of the $1 \times n$ matrix $x^T$ is the square root of the largest
eigenvalue of $x x^T$, or of $x^T x$. Choosing the more obvious of the two configurations
immediately gives $\|x^T\| = \|x\|$. Also $\|Ax\| \le \|A\|\, \|x\|$ for any $n \times 1$ vector $x$
(Exercise 1.6), and for conformable $A$ and $B$,

$$\|AB\| \le \|A\|\, \|B\|$$

(Exercise 1.7). If $A$ is $m \times n$, then inequalities relating $\|A\|$ to absolute values of the
entries of $A$ are

$$\max_{i,j} |a_{ij}| \le \|A\| \le \sqrt{mn}\, \max_{i,j} |a_{ij}|$$

When complex matrices are involved, all transposes in this discussion should be
replaced by Hermitian transposes, and absolute values by magnitudes.

Quadratic Forms
For a specified $n \times n$ matrix $Q$ and any $n \times 1$ vector $x$, both with real entries, the
product $x^T Q x$ is called a quadratic form in $x$. Without loss of generality $Q$ can be
taken as symmetric, $Q = Q^T$, in the study of quadratic forms. To verify this, multiply out
a typical case to show that

$$x^T (Q + Q^T) x = 2 x^T Q x \tag{13}$$

for all $x$. Thus the quadratic form is unchanged if $Q$ is replaced by the symmetric
matrix $(Q + Q^T)/2$. A symmetric matrix $Q$ is called positive semidefinite if $x^T Q x \ge 0$ for all
$x$. It is called positive definite if it is positive semidefinite, and if $x^T Q x = 0$ implies
$x = 0$. Negative definiteness and semidefiniteness are defined in terms of positive
definiteness and positive semidefiniteness of $-Q$. Often the short-hand notations $Q > 0$
and $Q \ge 0$ are used to denote positive definiteness, and positive semidefiniteness,
respectively. Of course $Q_a \ge Q_b$ simply means that $Q_a - Q_b$ is positive semidefinite.

All eigenvalues of a symmetric matrix must be real. It follows that positive
definiteness is equivalent to all eigenvalues positive, and positive semidefiniteness is
equivalent to all eigenvalues nonnegative. An important inequality for a symmetric
$n \times n$ matrix $Q$ is the Rayleigh-Ritz inequality, which states that for any real $n \times 1$
vector $x$,

$$\lambda_{min}\, x^T x \le x^T Q x \le \lambda_{max}\, x^T x \tag{14}$$

where $\lambda_{min}$ and $\lambda_{max}$ denote the smallest and largest eigenvalues of $Q$. See Exercise
1.10 for the spectral norm of $Q$. If we assume $Q \ge 0$, then $\|Q\| = \lambda_{max}$, and the trace is
bounded by

$$\|Q\| \le \mathrm{tr}\, Q \le n\, \|Q\|$$

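Tests for definiteness properties of symmetric matrices can be based on sign
properties of various submatrix determinants. These tests are difficult to state in a fashion
that is both precise and economical, and a careful prescription is worthwhile. Suppose
$Q$ is a real, symmetric, $n \times n$ matrix with entries $q_{ij}$. For integers $p = 1, \ldots, n$ and
$1 \le i_1 < i_2 < \cdots < i_p \le n$, the scalars

$$Q(i_1, i_2, \ldots, i_p) = \det \begin{bmatrix} q_{i_1 i_1} & \cdots & q_{i_1 i_p} \\ \vdots & & \vdots \\ q_{i_p i_1} & \cdots & q_{i_p i_p} \end{bmatrix} \tag{15}$$

are called principal minors of $Q$. The scalars $Q(1, 2, \ldots, p)$, $p = 1, \ldots, n$, which
simply are the determinants of the upper left $p \times p$ submatrices of $Q$,

$$Q(1) = q_{11}, \quad Q(1,2) = \det \begin{bmatrix} q_{11} & q_{12} \\ q_{21} & q_{22} \end{bmatrix}, \quad Q(1,2,3) = \det \begin{bmatrix} q_{11} & q_{12} & q_{13} \\ q_{21} & q_{22} & q_{23} \\ q_{31} & q_{32} & q_{33} \end{bmatrix}, \ \ldots$$

are called leading principal minors.

1.3 Theorem  The symmetric matrix $Q$ is positive definite if and only if

$$Q(1, 2, \ldots, p) > 0, \quad p = 1, 2, \ldots, n$$

It is negative definite if and only if

$$(-1)^p\, Q(1, 2, \ldots, p) > 0, \quad p = 1, 2, \ldots, n$$
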
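The test for semidefiniteness is much more complicated since all principal minors
are involved, not just the leading principal minors.

1.4 Theorem  The symmetric matrix $Q$ is positive semidefinite if and only if

$$Q(i_1, i_2, \ldots, i_p) \ge 0, \quad 1 \le i_1 < i_2 < \cdots < i_p \le n, \quad p = 1, 2, \ldots, n$$

It is negative semidefinite if and only if

$$(-1)^p\, Q(i_1, i_2, \ldots, i_p) \ge 0, \quad 1 \le i_1 < i_2 < \cdots < i_p \le n, \quad p = 1, 2, \ldots, n$$

1.5 Example  The symmetric matrix

$$Q = \begin{bmatrix} q_{11} & q_{12} \\ q_{12} & q_{22} \end{bmatrix}$$

is positive definite if and only if $q_{11} > 0$ and $q_{11} q_{22} - q_{12}^2 > 0$. It is positive
semidefinite if and only if $q_{11} \ge 0$, $q_{22} \ge 0$, and $q_{11} q_{22} - q_{12}^2 \ge 0$.

□□□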
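
The difference between the two tests, leading principal minors for definiteness versus all principal minors for semidefiniteness, is easy to see in code. The following Python sketch enumerates index sets with `itertools.combinations`; the function names and the three small test matrices are illustrative assumptions, and the input is assumed to be symmetric.

```python
from itertools import combinations

def minor(A, i, j):
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def principal_minor(Q, idx):
    """Determinant of the submatrix of Q with rows and columns in idx (0-based)."""
    return det([[Q[i][j] for j in idx] for i in idx])

def leading_principal_minors(Q):
    return [principal_minor(Q, tuple(range(p))) for p in range(1, len(Q) + 1)]

def all_principal_minors(Q):
    n = len(Q)
    return [principal_minor(Q, idx)
            for p in range(1, n + 1) for idx in combinations(range(n), p)]

def is_positive_definite(Q):
    """All leading principal minors positive (Q assumed symmetric)."""
    return all(m > 0 for m in leading_principal_minors(Q))

def is_positive_semidefinite(Q):
    """All principal minors nonnegative (Q assumed symmetric)."""
    return all(m >= 0 for m in all_principal_minors(Q))

Q1 = [[2, -1], [-1, 2]]      # positive definite
Q2 = [[1, 1], [1, 1]]        # positive semidefinite, not positive definite
Q3 = [[0, 0], [0, -1]]       # leading minors are 0 and 0, yet x^T Q3 x = -x_2^2

assert is_positive_definite(Q1)
assert is_positive_semidefinite(Q2) and not is_positive_definite(Q2)
# Nonnegative leading minors alone would wrongly accept Q3.
assert all(m >= 0 for m in leading_principal_minors(Q3))
assert not is_positive_semidefinite(Q3)
```

The matrix `Q3` shows why the semidefiniteness test cannot be restricted to leading principal minors: the principal minor formed from the second row and column alone is negative.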

If $Q$ has complex entries but is Hermitian, that is $Q = Q^H$ where again $H$ denotes
Hermitian (conjugate) transpose, then a quadratic form is defined as $x^H Q x$. This is a real
quantity, and the various definitions and definiteness tests above apply.


Matrix Calculus
Often the vectors and matrices in these chapters have entries that are functions of time.
With only one or two exceptions, the entries are at least continuous functions, and often
they are continuously differentiable. For convenience of discussion here, assume the
latter. Standard notation is used for various intervals of time, for example, $t \in [t_0, t_1)$
means $t_0 \le t < t_1$. To avoid silliness we assume always that the right endpoint of an
interval is greater than the left endpoint. If no interval is specified, the default is
$(-\infty, \infty)$.
The sophisticated mathematical view is to treat matrices whose entries are
functions of time as matrix-valued functions of a real variable. For example, an $n \times 1$
$x(t)$ would denote a function with domain a time interval, and range $R^n$. However this
framework is not needed for our purposes, and actually can be confusing because of
conventional interpretations of matrix concepts and calculations in linear system theory.
In mathematics a norm, for example $\|x(t)\|$, always denotes a real number.
However this 'function space' viewpoint is less useful for our purposes than interpreting
$\|x(t)\|$ 'pointwise in time.' That is, $\|x(t)\|$ is viewed as the real-valued function of $t$
that gives the Euclidean norm of the vector $x(t)$ at each value of $t$. Namely,

$$\|x(t)\| = \sqrt{x^T(t)\, x(t)}$$

Also we say that an $n \times n$ matrix function $A(t)$ is invertible for all $t$ if for every value
of $t$ the inverse matrix $A^{-1}(t)$ exists. This is completely different from invertibility of
the mapping $A(t)$ with domain $R$ and range $R^{n \times n}$, even when $n = 1$. Other algebraic
constructs are handled in a similar pointwise-in-time fashion. For example at each $t$ the
matrix function $A(t)$ has eigenvalues $\lambda_1(t), \ldots, \lambda_n(t)$, and an induced norm $\|A(t)\|$,
all of which are viewed as scalar functions of time. If $Q(t)$ is a symmetric $n \times n$ matrix
at each $t$, then $Q(t) > 0$ means that at every value of $t$ the matrix is positive definite.
Sometimes this viewpoint is said to treat matrices 'parameterized' by $t$ rather than
'matrix functions' of $t$. However we retain the latter terminology.
Confusion also can arise in the rules of 'matrix calculus.' In general matrix
calculations are set up to be consistent with scalar calculus in the following sense. If the
matrix expression is written out in scalar terms, the usual scalar calculations performed,
and the result repacked into matrix form, then we should get the same result as is given

by the rules of matrix calculus. This principle leads to the conclusion that differentiation
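and integration of matrices should be defined entry-by-entry. Thus the i,j-entries of

$$ \frac{d}{dt} A(t) , \qquad \int_0^t A(\sigma)\, d\sigma $$

are, respectively,

$$ \frac{d}{dt} a_{ij}(t) , \qquad \int_0^t a_{ij}(\sigma)\, d\sigma $$

Using these facts it is easy to verify that the product rule holds for differentiation of
matrices. That is, with overdot denoting differentiation with respect to time,

$$ \frac{d}{dt} [A(t)B(t)] = \dot{A}(t)B(t) + A(t)\dot{B}(t) $$

The fundamental theorem of calculus applies in the case of matrix functions,

$$ \frac{d}{dt} \int_0^t A(\sigma)\, d\sigma = A(t) $$

and also the Leibniz rule:

$$ \frac{d}{dt} \int_{f(t)}^{g(t)} A(t, \sigma)\, d\sigma = A(t, g(t))\,\dot{g}(t) - A(t, f(t))\,\dot{f}(t) + \int_{f(t)}^{g(t)} \frac{\partial}{\partial t} A(t, \sigma)\, d\sigma $$
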

However we must be careful about the generalization of certain familiar
calculations from the scalar case, particularly those having the appearance of a chain
rule. For example if A(t) is square the product rule gives

$$ \frac{d}{dt} A^2(t) = \dot{A}(t)A(t) + A(t)\dot{A}(t) $$

This is not in general the same thing as $2A(t)\dot{A}(t)$, since A(t) and its derivative need not
commute. (The diligent might want to figure out why the chain rule does not apply.) Of
course in any suspicious case the way to verify a matrix-calculus rule is to write out the
scalar form, compute, and repack.
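
The 'write out, compute, and repack' check can be tried numerically. The following is only a sketch under assumptions: the 2 × 2 matrix A(t) is a hypothetical example chosen so that A(t) and its derivative do not commute, and the helper names are illustrative. It compares a finite-difference derivative of $A^2(t)$ with the product-rule expression and with the naive expression $2A(t)\dot{A}(t)$.

```python
# Compare d/dt A^2(t) with the product-rule form and the naive chain-rule form.

def matmul(X, Y):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def A(t):
    # Hypothetical example matrix (an assumption, not from the text).
    return [[1.0, t], [t * t, 0.0]]

def A_dot(t):
    # Entry-by-entry derivative of A(t).
    return [[0.0, 1.0], [2.0 * t, 0.0]]

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

t, h = 1.0, 1e-6
# Central-difference approximation of d/dt [A(t)A(t)], computed entry by entry.
A2_plus = matmul(A(t + h), A(t + h))
A2_minus = matmul(A(t - h), A(t - h))
numeric = scale(1.0 / (2.0 * h), add(A2_plus, scale(-1.0, A2_minus)))

product_rule = add(matmul(A_dot(t), A(t)), matmul(A(t), A_dot(t)))
naive_chain = scale(2.0, matmul(A(t), A_dot(t)))

err_product = max_abs_diff(numeric, product_rule)  # small
err_chain = max_abs_diff(numeric, naive_chain)     # not small
print(err_product, err_chain)
```

For this particular A(t) the product-rule expression agrees with the numerical derivative to within finite-difference error, while the naive expression differs in every entry.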

In view of the interpretations of norm and integration, a particularly useful
inequality for an n × 1 vector function x(t) follows from the triangle inequality applied
to approximating sums for the integral:

$$ \left\| \int_{t_0}^{t} x(\sigma)\, d\sigma \right\| \le \left| \int_{t_0}^{t} \|x(\sigma)\|\, d\sigma \right| $$

Often we apply this when $t \ge t_0$, in which case the absolute value signs on the right side
can be erased.
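
As a small numerical illustration of this inequality, the sketch below uses a hypothetical vector function x(t) = (cos t, sin t) on [0, π] and midpoint-rule approximating sums. The function, the interval and the step count are all assumptions made for the example.

```python
import math

def x(t):
    # Hypothetical 2 x 1 vector function chosen for illustration.
    return (math.cos(t), math.sin(t))

def norm(v):
    """Euclidean norm of a vector given as a tuple or list."""
    return math.sqrt(sum(c * c for c in v))

t0, t1, N = 0.0, math.pi, 10000
h = (t1 - t0) / N
mid = [t0 + (k + 0.5) * h for k in range(N)]  # midpoint-rule nodes

# Left side: integrate x entry by entry, then take the norm of the result.
integral_x = [h * sum(x(s)[i] for s in mid) for i in range(2)]
lhs = norm(integral_x)

# Right side: integrate the pointwise-in-time norm of x.
rhs = h * sum(norm(x(s)) for s in mid)

print(lhs, rhs)
```

Here the integral of x is approximately (0, 2), so the left side is near 2, while the right side is near π because the norm of x(t) is identically 1.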

Convergence
Familiarity with basic notions of convergence for sequences or series of real numbers is

assumed at the outset. A brief review of some more general notions is provided here,
though it is appropriate to note that the only explicit use of this material is in discussing
existence and uniqueness of solutions to linear state equations.
An infinite sequence of n × 1 vectors is written as $\{x_k\}_{k=0}^{\infty}$, where the subscript
notation in this context denotes different vectors rather than entries of a vector. A vector
$\bar{x}$ is called the limit of the sequence if for any given $\varepsilon > 0$ there exists a positive integer,
written $K(\varepsilon)$ to indicate that the integer depends on $\varepsilon$, such that

$$ \|\bar{x} - x_k\| < \varepsilon , \quad k > K(\varepsilon) \qquad (19) $$

If such a limit exists, the sequence is said to converge to $\bar{x}$, written $\lim_{k \to \infty} x_k = \bar{x}$.
Notice that the use of the norm converts the question of convergence for a sequence of
vectors $\{x_k\}_{k=0}^{\infty}$ to a vector $\bar{x}$ into a question of convergence of the sequence of scalars
$\|\bar{x} - x_k\|$ to zero.

More often we are interested in sequences of vector functions of time, denoted
$\{x_k(t)\}_{k=0}^{\infty}$, and defined on some interval, say $[t_0, t_1]$. Such a sequence is said to
converge (pointwise) on the interval if there exists a vector function $\bar{x}(t)$ such that for
every $t_a \in [t_0, t_1]$ the sequence of vectors $\{x_k(t_a)\}_{k=0}^{\infty}$ converges to the vector $\bar{x}(t_a)$. In
this case, given an $\varepsilon$, the K can depend on both $\varepsilon$ and $t_a$. The sequence of functions
converges uniformly on $[t_0, t_1]$ if there exists a function $\bar{x}(t)$ such that given $\varepsilon > 0$
there exists a positive integer $K(\varepsilon)$ such that for every $t_a$ in the interval,

$$ \|\bar{x}(t_a) - x_k(t_a)\| < \varepsilon , \quad k > K(\varepsilon) $$

The distinction is that, given $\varepsilon > 0$, the same $K(\varepsilon)$ can be used for any value of $t_a$ to
show convergence of the vector sequence $\{x_k(t_a)\}_{k=0}^{\infty}$.


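For an infinite series of vector functions, written

$$ \sum_{j=0}^{\infty} x_j(t) \qquad (20) $$

with each $x_j(t)$ defined on $[t_0, t_1]$, convergence is defined in terms of the sequence of
partial sums

$$ s_k(t) = \sum_{j=0}^{k} x_j(t) $$

The series converges (pointwise) to the function $\bar{x}(t)$ if for each $t_a \in [t_0, t_1]$,

$$ \lim_{k \to \infty} \|\bar{x}(t_a) - s_k(t_a)\| = 0 $$

The series (20) is said to converge uniformly to $\bar{x}(t)$ on $[t_0, t_1]$ if the sequence of partial
sums converges uniformly to $\bar{x}(t)$ on $[t_0, t_1]$. Namely, given an $\varepsilon > 0$ there must exist a
positive integer $K(\varepsilon)$ such that for every $t \in [t_0, t_1]$,

$$ \left\| \bar{x}(t) - \sum_{j=0}^{k} x_j(t) \right\| < \varepsilon , \quad k > K(\varepsilon) $$
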

While the infinite series used in this book converge pointwise for $t \in (-\infty, \infty)$,
our emphasis is on showing uniform convergence on arbitrary but finite intervals of the
form $[t_0, t_1]$. This permits the use of special properties of uniformly convergent series
with regard to continuity and differentiation.
1.6 Theorem If (20) is an infinite series of continuous vector functions on $[t_0, t_1]$ that
converges uniformly to $\bar{x}(t)$ on $[t_0, t_1]$, then $\bar{x}(t)$ is continuous for $t \in [t_0, t_1]$.


It is an inconvenient fact that term-by-term differentiation of a uniformly
convergent series of functions does not always yield the derivative of the sum. Another
uniform convergence analysis is required.
1.7 Theorem Suppose (20) is an infinite series of continuously-differentiable functions
on $[t_0, t_1]$ that converges uniformly to $\bar{x}(t)$ on $[t_0, t_1]$. If the series

$$ \sum_{j=0}^{\infty} \frac{d}{dt} x_j(t) $$

converges uniformly on $[t_0, t_1]$, it converges to $d\bar{x}(t)/dt$.

The infinite series (20) is said to converge absolutely if the series of real functions

$$ \sum_{j=0}^{\infty} \|x_j(t)\| $$

converges on the interval. The key property of an absolutely convergent series is that
terms in the series can be reordered without changing the fact of convergence.

The specific convergence test we apply in developing solutions of linear state
equations is the Weierstrass M-Test, which can be stated as follows.
1.8 Theorem If the infinite series of positive real numbers

$$ \sum_{j=0}^{\infty} c_j \qquad (22) $$

converges, and if $\|x_j(t)\| \le c_j$ for all $t \in [t_0, t_1]$ and every j, then the series (20)
converges uniformly and absolutely on $[t_0, t_1]$.

For the special case of power series in t, a basic fact is that if a power series with
vector coefficients,

$$ \sum_{j=0}^{\infty} f_j t^j $$

converges on an interval, it converges uniformly and absolutely on that interval. A
vector function f(t) is called analytic on a time interval if for every point $t_a$ in the
interval, it can be represented by the power series

$$ f(t) = \sum_{j=0}^{\infty} \frac{1}{j!} \frac{d^j f}{dt^j}(t_a)\, (t - t_a)^j \qquad (23) $$

that converges on some subinterval containing $t_a$. That is, f(t) is analytic on an interval
if it has a convergent Taylor series representation at each point in the interval. Thus
f(t) is analytic at $t_a$ if and only if it has derivatives of any order at $t_a$, and these
derivatives satisfy a certain growth condition. (Sometimes the term real analytic is used
to distinguish analytic functions of a real variable from analytic functions of a complex
variable. Except for Laplace and z-transforms, functions of a complex variable do not
arise in the sequel, and we use the simpler terminology.)

Similar definitions of convergence properties for sequences and series of m × n
matrix functions of time can be made using the induced norm for matrices. It is not
difficult to show that these matrix or vector convergence notions are equivalent to
applying the corresponding notion to the scalar sequence formed by each particular entry
of the matrix or vector sequence.
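
The Weierstrass M-Test of Theorem 1.8 can be illustrated numerically. The sketch below rests on assumptions: the series has terms $x_j(t) = (t^j/j!)\,v$ for a fixed hypothetical vector v, the interval is [-T, T] with T = 2, and the bounds are $c_j = \|v\|\,T^j/j!$. The code compares the worst-case error of the partial sums over a grid with the tail of the series of bounds.

```python
import math

T = 2.0                   # interval is [-T, T]; an assumed value
v = (3.0, 4.0)            # fixed vector; x_j(t) = (t**j / j!) * v
v_norm = math.hypot(*v)   # Euclidean norm of v

def c(j):
    # M-test bound: ||x_j(t)|| <= c_j for every t in [-T, T].
    return v_norm * T ** j / math.factorial(j)

def sup_error(k, grid_points=401):
    """Largest ||e^t v - s_k(t)|| over a uniform grid on [-T, T]."""
    worst = 0.0
    for i in range(grid_points):
        t = -T + 2.0 * T * i / (grid_points - 1)
        partial = sum(t ** j / math.factorial(j) for j in range(k + 1))
        worst = max(worst, abs(math.exp(t) - partial) * v_norm)
    return worst

def tail_bound(k, terms=60):
    # Numerical stand-in for the tail c_{k+1} + c_{k+2} + ...
    return sum(c(j) for j in range(k + 1, k + 1 + terms))

for k in (2, 5, 10):
    print(k, sup_error(k), tail_bound(k))
```

The same value of k works for every t on the grid, which is the uniformity the test delivers. The limit function here is $e^t v$.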

Laplace Transform
Aside from the well-known unit impulse $\delta(t)$, which has Laplace transform 1, we use the
Laplace transform only for functions that are sums of terms of the form $t^k e^{\lambda t}$, $t \ge 0$,
where $\lambda$ is a complex constant and k is a nonnegative integer. Therefore only the most
basic features are reviewed. If F(t) is an m × n matrix of such functions defined for
$t \in [0, \infty)$, the Laplace transform is defined as the m × n matrix function of the complex
variable s given by

$$ F(s) = \int_0^{\infty} F(t)\, e^{-st}\, dt \qquad (24) $$


Often this operation is written in the format F(s) = L[F(t)]. (For much of the book,
Laplace transforms are represented in Helvetica font to distinguish, yet connect, the
corresponding time function in Italic font.)
Because of the exponential nature of each entry of F(t), there is always a half-plane
of convergence of the form $\mathrm{Re}[s] > \sigma$ for the integral in (24). Also easy
calculations show that each entry of F(s) is a strictly proper rational function, that is, a ratio
of two polynomials in s where the degree of the denominator polynomial is strictly
greater than the degree of the numerator polynomial. A convenient method of
computing the matrix F(t) from such a transform F(s) is entry-by-entry partial fraction
expansion and table-lookup.

Our material requires only a few properties of the Laplace transform. These
include linearity, and the derivative and integral relations

$$ L[\dot{F}(t)] = s\,L[F(t)] - F(0) $$

$$ L\left[ \int_0^t F(\sigma)\, d\sigma \right] = \frac{1}{s}\, L[F(t)] $$

Recall that in certain applications to linear systems, usually involving unit-impulse
inputs, the evaluation of F(t) in the derivative property should be interpreted as an
evaluation at $t = 0^-$. The convolution property

$$ L\left[ \int_0^t F(t - \sigma)\, G(\sigma)\, d\sigma \right] = L[F(t)]\, L[G(t)] \qquad (25) $$


is very important. Finally the initial value theorem and final value theorem state that if
the indicated limits exist, then (regarding s as real and positive)

$$ \lim_{t \to 0} F(t) = \lim_{s \to \infty} sF(s) $$

$$ \lim_{t \to \infty} F(t) = \lim_{s \to 0} sF(s) $$

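
A quick numerical sanity check of the initial and final value theorems is sketched below. It assumes a hypothetical scalar function $f(t) = 1 - e^{-2t}$, whose transform $1/s - 1/(s+2)$ follows from linearity and the standard pairs for 1 and $e^{\lambda t}$. The limits are approximated by evaluating at very small and very large arguments.

```python
import math

def f(t):
    # Hypothetical time function chosen for illustration.
    return 1.0 - math.exp(-2.0 * t)

def F(s):
    # Transform of f by linearity: 1 <-> 1/s and e^{lambda t} <-> 1/(s - lambda).
    return 1.0 / s - 1.0 / (s + 2.0)

# Initial value theorem: t -> 0 on one side, s -> infinity (s real, positive) on the other.
initial_time = f(1e-9)
initial_freq = 1e9 * F(1e9)

# Final value theorem: t -> infinity on one side, s -> 0 on the other.
final_time = f(50.0)
final_freq = 1e-9 * F(1e-9)

print(initial_time, initial_freq)
print(final_time, final_freq)
```

Both pairs of numbers agree closely: the initial values are near 0 and the final values are near 1.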

Often we manipulate matrix Laplace transforms, where each entry is a rational
function of s, and standard matrix operations apply in a natural way. In particular
suppose F(s) is square, and det F(s) is a nonzero rational function. (This determinant
calculation of course involves nothing more than sums of products of rational functions,
and this must yield a rational-function result.) Then the adjugate-over-determinant
provides a representation for the matrix inverse $F^{-1}(s)$, and shows that this inverse has
entries that are rational functions of s. Other algebraic issues are not this simple, but
fortunately we have little need to go beyond the basics. It is useful to note that if F(s) is
a square matrix with polynomial entries, and det F(s) is a nonzero polynomial, then
$F^{-1}(s)$ is not always a matrix of polynomials, but is always a matrix of rational
functions. (Because a polynomial can be viewed as a rational function with unity
denominator, the wording here is delicate.)

1.9 Example

For the Laplace transform


S

s+2
1

the determinant is given by

$$ \det F(s) = \frac{a(7s + 9)}{(s - 1)(s + 3)^2} $$

If a = 0 the inverse of F(s) does not exist. But for $a \ne 0$ the determinant is a nonzero
rational function, and a straightforward calculation gives

(sl)(s+3)2

'

F (s)

a(7s +9)

a(s+2)

s-i-2

a
sl

An astute observer might note that strict-properness properties of the rational entries of
F(s) do not carry over to entries of $F^{-1}(s)$. This is a troublesome issue that we address
when it arises in a particular context.


The Laplace transforms we use in the sequel are shown in Table 1.10, at the end of
the next section. These are presented in terms of a possibly complex constant $\lambda$, and
some effort might be required to combine conjugate terms to obtain a real representation
in a particular calculation. Much longer transform tables that include various real
functions are readily available. But for our purposes Table 1.10 provides sufficient data,
and conversions to real forms are not difficult.

z-Transform
The z-transform is used to represent sequences in much the same way as the Laplace
transform is used for functions. A brief review suffices because we apply the z-transform
only for vector or matrix sequences whose entries are scalar sequences that are
sums of terms of the form $k^r \lambda^k$, k = 0, 1, 2, ..., or shifted versions of such sequences.
Here $\lambda$ is a complex constant, and r is a fixed, nonnegative integer. Included in this
form (for $r = \lambda = 0$) is the familiar, scalar unit pulse sequence defined by

$$ \delta(k) = \begin{cases} 1 , & k = 0 \\ 0 , & \text{otherwise} \end{cases} \qquad (26) $$


In the treatment of discrete-time signals, where subscripts are needed for other
purposes, the notation for sequences is changed from subscript-index form (as in (19)) to
argument-index form (as in (26)). That is, we write x(k) instead of $x_k$.
If F(k) is an r × q matrix sequence defined for $k \ge 0$, the z-transform of F(k) is
an r × q matrix function of a complex variable z defined by the power series

$$ F(z) = \sum_{k=0}^{\infty} F(k)\, z^{-k} \qquad (27) $$

We use Helvetica font for z-transforms, and often adopt the operational notation
F(z) = Z[F(k)].
For the class of sums-of-exponential sequences that we permit as entries of F(k),
it can be shown that the infinite series (27) converges for a region of z of the form
$|z| > \rho > 0$. Again because of the special class of sequences considered, standard but
intricate summation formulas show that all z-transforms we encounter are such that each
entry of F(z) is a proper rational function, that is, a ratio of polynomials in z with the degree
of the numerator polynomial no greater than the degree of the denominator polynomial.
For our purposes, partial fraction expansion and table-lookup provide a method for
computing F(k) from F(z). This inverse z-transform operation is sometimes denoted by
$F(k) = Z^{-1}[F(z)]$.
Properties of the z-transform used in the sequel include uniqueness, linearity, and
the shift properties

$$ Z[F(k-1)] = z^{-1}\, Z[F(k)] $$

$$ Z[F(k+1)] = z\, Z[F(k)] - z\, F(0) $$

Because we use the z-transform only for sequences defined for $k \ge 0$, the right shift
(delay) F(k-1) is the sequence

$$ 0,\ F(0),\ F(1),\ F(2),\ \ldots $$

while the left shift F(k+1) is the sequence

$$ F(1),\ F(2),\ F(3),\ \ldots $$


The convolution property plays an important role: With F(k) as above, and H(k) a
q × l matrix sequence defined for $k \ge 0$,

$$ Z\left[ \sum_{j=0}^{k} F(k-j)\, H(j) \right] = Z[F(k)]\, Z[H(k)] \qquad (28) $$

Also the initial value theorem and final value theorem appear in the sequel. These state
that if the indicated limits exist, then (regarding z as real and greater than 1)

$$ F(0) = \lim_{z \to \infty} F(z) $$

$$ \lim_{k \to \infty} F(k) = \lim_{z \to 1} (z - 1)\, F(z) $$

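
The convolution property (28) can be checked numerically in the scalar case. The sketch below makes several assumptions: the sequences $f(k) = 0.5^k$ and $h(k) = (-0.25)^k$ are hypothetical examples, the power series (27) is truncated after a fixed number of terms, and z = 2 is chosen inside the region of convergence.

```python
def f(k):
    return 0.5 ** k            # hypothetical sequence, transform z/(z - 0.5)

def h(k):
    return (-0.25) ** k        # hypothetical sequence, transform z/(z + 0.25)

def conv(k):
    """Convolution sum of f and h at index k."""
    return sum(f(k - j) * h(j) for j in range(k + 1))

def z_transform(seq, z, terms=120):
    # Truncated version of the power series (27); adequate when |z| is
    # comfortably larger than the growth rate of the sequence.
    return sum(seq(k) * z ** (-k) for k in range(terms))

z = 2.0
lhs = z_transform(conv, z)                      # transform of the convolution
rhs = z_transform(f, z) * z_transform(h, z)     # product of the transforms
closed = (z / (z - 0.5)) * (z / (z + 0.25))     # product of the closed forms
print(lhs, rhs, closed)
```

All three numbers agree to within rounding and truncation error.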

Exactly as in the Laplace-transform case, we have occasion to compute the inverse
of a square-matrix z-transform F(z) with rational entries. If det F(z) is a nonzero
rational function, $F^{-1}(z)$ can be represented by the adjugate-over-determinant formula.
Thus the inverse also has entries that are rational functions of z. Notice that if F(z) is a
square matrix with polynomial entries, and det F(z) is a nonzero polynomial, then
$F^{-1}(z)$ is a matrix with entries that are in general rational functions of z.
The z-transforms needed for our treatment of discrete-time linear systems are
shown in Table 1.10, side-by-side with Laplace transforms. In this table $\lambda$ is a complex
constant, and the binomial coefficient is defined in terms of factorials by

$$ \binom{k}{r-1} = \begin{cases} \dfrac{k!}{(r-1)!\,(k-r+1)!} , & k \ge r-1 \\ 0 , & k < r-1 \end{cases} $$

As an extreme example, for $\lambda = 0$ Table 1.10 provides the inverse z-transform

$$ Z^{-1}\left[ \frac{z}{z^r} \right] = \delta(k - r + 1) $$

which of course is a unit pulse sequence delayed by r-1 units.
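
The binomial-coefficient transform pair and its extreme case can be checked numerically. The following is a sketch under assumptions: it takes the pair as $\binom{k}{r-1}\lambda^{k-r+1}$ with transform $z/(z-\lambda)^r$, uses hypothetical values of $\lambda$, r and z, and truncates the power series (27).

```python
import math

def f(k, lam, r):
    # Sequence binomial(k, r-1) * lam**(k-r+1), zero for k < r-1.
    if k < r - 1:
        return 0.0
    return math.comb(k, r - 1) * lam ** (k - r + 1)

def z_transform(seq, z, terms=200):
    # Truncated version of the power series (27).
    return sum(seq(k) * z ** (-k) for k in range(terms))

lam, r, z = 0.5, 3, 2.0                 # assumed values with |z| > |lam|
series = z_transform(lambda k: f(k, lam, r), z)
closed = z / (z - lam) ** r
print(series, closed)

# Extreme case lam = 0: a unit pulse delayed by r - 1 units.
pulse = [f(k, 0.0, r) for k in range(5)]
print(pulse)
```

With r = 3 the delayed pulse is nonzero only at k = 2. This relies on Python evaluating `0.0 ** 0` as 1.0.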

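f(t), t >= 0                              F(s)                       f(k), k >= 0                              F(z)

$\delta(t)$                               $1$                        $\delta(k)$                               $1$
$1$                                       $\frac{1}{s}$              $1$                                       $\frac{z}{z-1}$
$\frac{t^{q-1}}{(q-1)!}$                  $\frac{1}{s^q}$            $\binom{k}{r-1}$                          $\frac{z}{(z-1)^r}$
$e^{\lambda t}$                           $\frac{1}{s-\lambda}$      $\lambda^k$                               $\frac{z}{z-\lambda}$
$\frac{t^{q-1}}{(q-1)!}\,e^{\lambda t}$   $\frac{1}{(s-\lambda)^q}$  $\binom{k}{r-1}\,\lambda^{k-r+1}$         $\frac{z}{(z-\lambda)^r}$

Table 1.10 A short list of Laplace and z transforms.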

EXERCISES
Exercise 1.1 (a) Under what condition on n × n matrices A and B does the binomial expansion
hold for $(A + B)^k$, where k is a positive integer?
(b) If the n × n matrix function A(t) is invertible for every t, show how to express $A^{-1}(t)$ in terms
of $A^k(t)$, k = 0, 1, ..., n-1. Under an appropriate additional assumption show that if
$\|A(t)\| \le \alpha < \infty$ for all t, then there exists a finite constant $\beta$ such that $\|A^{-1}(t)\| \le \beta$
for all t.

Exercise 1.2 If the n × n matrix A has eigenvalues $\lambda_1, \ldots, \lambda_n$, what are the eigenvalues of
(a) $A^k$, where k is a positive integer,
(b) $A^{-1}$, assuming the inverse exists,
(c) $A^T$,
(d) $A^H$,
(e) $\alpha A$, where $\alpha$ is a real number,
(f) $A^T A$? (Careful!)
Exercise 1.3 (a) Prove a necessary and sufficient condition for nilpotence in terms of eigenvalues.
(b) Show that the eigenvalues of a symmetric matrix are real.
(c) Prove that the eigenvalues of an upper-triangular matrix are the diagonal entries.
Exercise 1.4

Compute the spectral norm of

(a)

00

(b)

31
1

'

(c)

1i
0

1+i

Exercise 1.5 Given a constant $\alpha > 1$, show how to define a 2 × 2 matrix A such that the
eigenvalues of A are both $1/\alpha$, and $\|A\| \ge \alpha$.

Exercise 1.6 For an m × n matrix A, prove from the definition in (10) that the spectral norm is
given by

$$ \|A\| = \max_{x \ne 0} \frac{\|Ax\|}{\|x\|} $$

Conclude that for any n × 1 vector x,

$$ \|Ax\| \le \|A\|\, \|x\| $$

Exercise 1.7 Using the conclusion in Exercise 1.6, prove that for conformable matrices A and B,

$$ \|AB\| \le \|A\|\, \|B\| $$

If A is invertible, show that

$$ \|A^{-1}\| \ge \frac{1}{\|A\|} $$

Exercise 1.8 For a partitioned matrix

$$ A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} $$

show that $\|A_{ij}\| \le \|A\|$ for i, j = 1, 2. If only one submatrix is nonzero, show that $\|A\|$ equals
the norm of the nonzero submatrix.

Exercise 1.9 If A is an n × n matrix, show that for all n × 1 vectors x

$$ |x^T A x| \le \|A\|\, \|x\|^2 , \qquad x^T A x \ge -\|A\|\, \|x\|^2 $$

Show that for any eigenvalue $\lambda$ of A,

$$ |\lambda| \le \|A\| $$

(In words, the spectral radius of A is no larger than the spectral norm of A.)

Exercise 1.10 If Q is a symmetric n × n matrix, prove that the spectral norm is given by

$$ \|Q\| = \max_{\|x\| = 1} |x^T Q x| = \max_{1 \le i \le n} |\lambda_i| $$

where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of Q.

Exercise 1.11 Show that the spectral norm of an m × n matrix A is given by

$$ \|A\| = \left[ \max_{\|x\| = 1} x^T A^T A x \right]^{1/2} $$

Conclude from the Rayleigh-Ritz inequality that $\|A\|$ is given by the nonnegative square root of
the largest eigenvalue of $A^T A$.

Exercise 1.12 If A is an invertible n × n matrix, prove that

$$ \|A^{-1}\| \le \frac{\|A\|^{n-1}}{|\det A|} $$

Hint: Work with the symmetric matrix $A^T A$ and Exercise 1.11.

Exercise 1.13 Show that the spectral norm of an m × n matrix A is given by

$$ \|A\| = \max_{\|x\| = \|y\| = 1} |y^T A x| $$

Exercise 1.14 If A(t) is a continuous, n × n matrix function of t, show that its eigenvalues
$\lambda_1(t), \ldots, \lambda_n(t)$ and the spectral norm $\|A(t)\|$ are continuous functions of t. Show by example
that continuous differentiability of A(t) does not imply continuous differentiability of the
eigenvalues or the spectral norm. Hint: The composition of continuous functions is a continuous
function.
Exercise 1.15 If Q is an n × n symmetric matrix and $\varepsilon_1$, $\varepsilon_2$ are such that

$$ 0 < \varepsilon_1 I \le Q \le \varepsilon_2 I $$

show that

$$ \frac{1}{\varepsilon_2} I \le Q^{-1} \le \frac{1}{\varepsilon_1} I $$

Exercise 1.16 Suppose W(t) is an n × n time-dependent matrix such that $W(t) - \varepsilon I$ is symmetric
and positive semidefinite for all t, where $\varepsilon > 0$. Show there exists a $\gamma > 0$ such that $\det W(t) \ge \gamma$ for
all t.

Exercise 1.17 If A(t) is a continuously-differentiable n × n matrix function that is invertible at
each t, show that

$$ \frac{d}{dt} A^{-1}(t) = -A^{-1}(t)\, \dot{A}(t)\, A^{-1}(t) $$

Exercise 1.18 If x(t) is an n × 1 differentiable function of t, and $\|x(t)\|$ also is a differentiable
function of t, prove that

$$ \left| \frac{d}{dt} \|x(t)\| \right| \le \left\| \frac{d}{dt} x(t) \right\| $$

for all t. Show necessity of the assumption that $\|x(t)\|$ is differentiable by considering the scalar
case x(t) = t.
Exercise 1.19 Suppose that F(t) is m × n and such that there is no finite constant $\alpha$ for which

$$ \int_0^t \|F(\sigma)\|\, d\sigma \le \alpha , \quad t \ge 0 $$

Show that there is at least one entry of F(t), say $f_{ij}(t)$, that has the same property. That is, there is
no finite $\beta$ for which

$$ \int_0^t |f_{ij}(\sigma)|\, d\sigma \le \beta , \quad t \ge 0 $$

If F(k) is an m × n matrix sequence, show that a similar property holds for

$$ \sum_{j=0}^{k} \|F(j)\| , \quad k \ge 0 $$

Exercise 1.20 Suppose A(t) is an n × n matrix function that is invertible for each t. Show that if
there is a finite constant $\alpha$ such that $\|A^{-1}(t)\| \le \alpha$ for all t, then there is a positive constant $\beta$ such
that $|\det A(t)| \ge \beta$ for all t.
Exercise 1.21 Suppose Q (t) is n x ii, symmetric, and positive semidefinite for all t. If t,,

and

show that
5

IIQ(o)II

Hint. Use Exercise 1.10.

NOTES
Note 1.1

Standard references for matrix analysis are

F.R. Gantmacher, Theory of Matrices, (two volumes), Chelsea Publishing, New York, 1959
R.A. Horn, C.R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 1985

G. Strang, Linear Algebra and Its Applications, Third Edition, Harcourt Brace Jovanovich, San
Diego, 1988

All three go well beyond what we need. In particular the second reference contains an extensive
treatment of induced norms. The compact reviews of linear algebra and matrix algebra in texts on
linear systems also are valuable. For example consult the appropriate sections in the books
R.W. Brockett, Finite Dimensional Linear Systems, John Wiley, New York, 1970

D.F. Delchamps, State Space and input-Output Linear Systems, Springer-Verlag, New York, 1988
T. Kailath, Linear Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1980
L.A. Zadeh, C.A. Desoer, Linear System Theory, McGraw-Hill, New York, 1963

Note 1.2 Matrix theory and linear algebra provide effective computational tools in addition to a
mathematical language for linear system theory. Several commercial packages are available that
provide convenient computational environments. A basic reference for matrix computation is

G.H. Golub, C.F. Van Loan, Matrix Computations, Second Edition, Johns Hopkins University
Press, Baltimore, 1989

Numerical aspects of the theory of time-invariant linear systems are covered in

P.H. Petkov, N.N. Christov, M.M. Konstantinov, Computational Methods for Linear Control
Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1991

Note 1.3 Various induced norms for matrices can be defined corresponding to various vector
norms. For a specific purpose there may be one induced norm that is most suitable, but from a
theoretical perspective any choice will do in most circumstances. For economy we use the spectral
norm, ignoring all others.


A fundamental construct related to the spectral norm, but not explicitly used in this book, is the
following. The nonnegative square roots of the eigenvalues of $A^T A$ are called the singular values
of A. (The spectral norm of A is then the largest singular value of A.) The singular value
decomposition of A is based on the existence of orthogonal matrices U and V ($U^{-1} = U^T$ and
$V^{-1} = V^T$) such that $U^T A V$ displays the singular values of A on the quasi-diagonal, with all other
entries zero. Singular values and the corresponding decomposition have theoretical implications
in linear system theory and are central to numerical computation. See the citations in Note 1.2,
the paper

V.C. Klema, A.J. Laub, "The singular value decomposition: its computation and some
applications," IEEE Transactions on Automatic Control, Vol. 25, No. 2, pp. 164-176, 1980

or Chapter 19 of
R.A. DeCarlo, Linear Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1989
Note 1.4 The growth condition that an infinitely-differentiable function of a real variable must
satisfy to be an analytic function is proved in Section 15.7 of

W. Fulks, Advanced Calculus, Third Edition, John Wiley, New York, 1978

Basic material on convergence and uniform convergence of series of functions is treated in this
text, and of course many, many others.
Note 1.5 Linear-algebraic notions associated to a time-dependent matrix, for example range
space and rank structure, can be delicate to work out and can depend on smoothness assumptions
on the time-dependence. For examples related to linear system theory, see
L. Weiss, P.L. Falb, "Dolezal's theorem, linear algebra with continuously parametrized elements,
and time-varying systems," Mathematical Systems Theory, Vol. 3, No. 1, pp. 67-75, 1969

L.M. Silverman, R.S. Bucy, "Generalizations of a theorem of Dolezal," Mathematical Systems
Theory, Vol. 4, No. 4, pp. 334-339, 1970

2
STATE EQUATION
REPRESENTATION

The basic representation for linear systems is the linear state equation, customarily
written in the standard form

$$ \dot{x}(t) = A(t)x(t) + B(t)u(t) $$
$$ y(t) = C(t)x(t) + D(t)u(t) \qquad (1) $$

where the overdot denotes differentiation with respect to time t. The n × 1 vector
function of time x(t) is called the state vector, and its components, $x_1(t), \ldots, x_n(t)$, are
the state variables. The input signal is the m × 1 function u(t), and y(t) is the p × 1
output signal. We assume throughout that $p, m \le n$, a sensible formulation in terms of
independence considerations on the components of the vector input and output signals.

Default assumptions on the coefficient matrices in (1) are that the entries of
A(t) (n × n), B(t) (n × m), C(t) (p × n), and D(t) (p × m) are continuous, real-valued
functions defined for all $t \in (-\infty, \infty)$. Standard terminology is that (1) is time invariant
if these coefficient matrices are constant. The linear state equation is called time varying
if any entry of any coefficient matrix varies with time.
Mathematical hypotheses weaker than continuity can be adopted as the default
setting. The resulting theory changes little, except in sophistication of the mathematics
that must be employed. Our continuity assumption is intended to balance engineering
generality against simplicity of the required mathematical tools. Also there are isolated
instances when complex-valued coefficient matrices arise, namely when certain special
forms for state equations obtained by a change of state variables are considered. Such
exceptions to the assumption of real coefficients are noted locally.
The input signal u(t) is assumed to be defined for all $t \in (-\infty, \infty)$ and piecewise
continuous. Piecewise continuity is adopted so that for a few technical arguments in the
sequel an input signal can be pieced together on subintervals of time, leaving jump
discontinuities at the boundaries of adjacent subintervals. Aside from these
constructions, and occasional mention of impulse (generalized function) inputs, the input
signal can be regarded as a continuous function of time.


Typically in engineering problems there is a fixed initial time $t_0$, and properties of
the solution x(t) of a linear state equation for given initial state $x(t_0) = x_0$ and input
signal u(t), specified for $t \in [t_0, \infty)$, are of interest for $t \ge t_0$. However from a
mathematical viewpoint there are occasions when solutions 'backward in time' are of
interest, and this is the reason that the interval of definition of the input signal and
coefficient matrices in the state equation is $(-\infty, \infty)$. That is, the solution x(t) for
$t < t_0$, as well as $t \ge t_0$, is mathematically valid. Of course if the state equation is
defined and of interest only in a smaller interval, say $t \in [0, \infty)$, the domain of definition
of the coefficient matrices can be extended to $(-\infty, \infty)$ simply by setting, for example,
A(t) = A(0) for t < 0, and our default set-up is attained.

The fundamental theoretical issues for the class of linear state equations just
introduced are the existence and uniqueness of solutions. Consideration of these issues is
postponed to Chapter 3, while we provide motivation for the state equation
representation. In fact linear state equations of the form (1) can arise in many ways.
Sometimes a time-varying linear state equation results directly from a physical model of
interest. Indeed the classical linear differential equations from mathematical
physics can be placed in state-equation form. Also a time-varying linear state equation
arises as the linearization of a nonlinear state equation about a particular solution of
interest. Of course the advantage of describing physical systems in the standard format
(1) is that system properties can be characterized in terms of properties of the coefficient
matrices. Thus the study of (1) can bring out the common features of diverse physical
settings.

Examples
We begin with a collection of simple examples that illustrate the genesis of time-varying
linear state equations. Relying also on previous exposure to linear systems, the universal
should emerge from the particular.

2.1 Example Suppose a rocket ascends from the surface of the Earth propelled by a
thrust force due to an ejection of mass. As shown in Figure 2.2, let h(t) be the altitude
of the rocket at time t, and v(t) be the (vertical) velocity at time t, both with initial
values zero at t = 0. Also, let m(t) be the mass of the rocket at time t. Acceleration
due to gravity is denoted by the constant g, and the thrust force is the product $v_e u_0$,
where $v_e$ is the assumed-constant relative exhaust velocity, and $u_0$ is the assumed-constant
rate of change of mass. Note $v_e < 0$ since the exhaust velocity direction is
opposite v(t), and $u_0 < 0$ since the mass of the rocket decreases.
Because of the time-variable mass of the rocket, the equations of motion must be
based on consideration of both the rocket mass and the expelled mass. Attention to basic
physics (see Note 2.1) leads to the force equation

$$ m(t)\,\dot{v}(t) = -m(t)\,g + v_e u_0 $$


Vertical velocity is the rate of change of altitude, so an additional differential equation
describing the system is

$$ \dot{h}(t) = v(t) $$

2.2 Figure A rocket ascends, with altitude h(t) and velocity v(t).

Finally the rocket mass variation is given by $\dot{m}(t) = u_0$, which gives, by integration,

$$ m(t) = m_0 + u_0 t $$

where $m_0$ is the initial mass of the rocket. Let $x_1(t) = h(t)$ and $x_2(t) = v(t)$ be the state
variables, and suppose altitude also is the output. A linear state equation description that
is valid until the mass supply is exhausted is

$$ \dot{x}(t) = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ -g + \dfrac{v_e u_0}{m_0 + u_0 t} \end{bmatrix} , \quad x(0) = 0 $$

$$ y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(t) $$

Here the input signal has a fixed form, so the input term is written as a forcing function.
This should be viewed as a time-invariant linear state equation with a time-varying
forcing function, not a time-varying linear state equation. We return to this system in
Example 2.6, and consider a variable rate of mass expulsion.
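
The rocket state equation of Example 2.1 can be explored numerically. The sketch below is an illustration under assumptions: the parameter values are hypothetical, the integration method (classical fourth-order Runge-Kutta) is a choice made here rather than anything prescribed by the text, and the final time is chosen so that the mass supply is not exhausted.

```python
import math

# Assumed illustrative values in SI units; they do not come from the text.
g = 9.81        # gravitational acceleration
v_e = -2500.0   # relative exhaust velocity (negative, opposite to v)
u_0 = -10.0     # rate of change of mass (negative, mass decreases)
m_0 = 1000.0    # initial mass

def x_dot(t, x):
    """Right side of the rocket state equation; x = (altitude, velocity)."""
    return (x[1], -g + v_e * u_0 / (m_0 + u_0 * t))

def rk4(t_end, steps):
    """Integrate from x(0) = 0 with the classical fourth-order Runge-Kutta method."""
    t, x = 0.0, (0.0, 0.0)
    dt = t_end / steps
    for _ in range(steps):
        k1 = x_dot(t, x)
        k2 = x_dot(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(x, k1)))
        k3 = x_dot(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(x, k2)))
        k4 = x_dot(t + dt, tuple(a + dt * b for a, b in zip(x, k3)))
        x = tuple(a + dt / 6 * (p + 2 * q + 2 * r + s)
                  for a, p, q, r, s in zip(x, k1, k2, k3, k4))
        t += dt
    return x

t_end = 60.0  # mass at this time is m_0 + u_0 * t_end = 400 > 0
altitude, velocity = rk4(t_end, 6000)

# Closed-form velocity from integrating the forcing function directly.
velocity_exact = -g * t_end + v_e * math.log((m_0 + u_0 * t_end) / m_0)
print(altitude, velocity, velocity_exact)
```

Because the forcing term depends only on t, the velocity can be integrated in closed form, which gives a convenient check on the simulation.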

2.3 Example Time-varying versions of the basic linear circuit elements can be devised
in simple ways. A time-varying resistor exhibits the voltage/current characteristic

$$ v(t) = r(t)\, i(t) $$

where r(t) is a fixed time function. For example if r(t) is a sinusoid, then this is the
basis for some modulation schemes in communication systems. A time-varying
capacitor exhibits a time-varying charge/voltage characteristic, q(t) = c(t)v(t). Here
c(t) is a fixed time function describing, for example, the variation in plate spacing of a
parallel-plate capacitor. Since current is the instantaneous rate of change of charge, the
voltage/current relationship for a time-varying capacitor has the form

$$ i(t) = c(t)\, \frac{dv(t)}{dt} + \frac{dc(t)}{dt}\, v(t) \qquad (4) $$

Similarly a time-varying inductor exhibits a time-varying flux/current characteristic, and
this leads to the voltage/current relation

$$ v(t) = l(t)\, \frac{di(t)}{dt} + \frac{dl(t)}{dt}\, i(t) $$

2.4 Figure A series connection of time-varying circuit elements.

Consider the series circuit shown in Figure 2.4, which includes one of each of
these circuit elements, with a voltage source providing the input signal u(t). Suppose
the output signal y(t) is the voltage across the resistor. Following a standard
prescription, we choose as state variables the voltage $x_1(t)$ across the capacitor and the
current $x_2(t)$ through the inductor (which also is the current through the entire series
circuit). Then Kirchhoff's voltage law for the circuit gives

$$ l(t)\,\dot{x}_2(t) = -x_1(t) - [\,r(t) + \dot{l}(t)\,]\,x_2(t) + u(t) $$

Another equation describing the circuit (a trivial application of Kirchhoff's current law)
is (4), which in the present context is written in the form

$$ c(t)\,\dot{x}_1(t) = -\dot{c}(t)\,x_1(t) + x_2(t) $$

The output equation is

$$ y(t) = r(t)\,x_2(t) $$

This yields a linear state equation description of the circuit with coefficients

$$ A(t) = \begin{bmatrix} \dfrac{-\dot{c}(t)}{c(t)} & \dfrac{1}{c(t)} \\[2mm] \dfrac{-1}{l(t)} & \dfrac{-r(t) - \dot{l}(t)}{l(t)} \end{bmatrix} , \quad B(t) = \begin{bmatrix} 0 \\[1mm] \dfrac{1}{l(t)} \end{bmatrix} , \quad C(t) = \begin{bmatrix} 0 & r(t) \end{bmatrix} $$

2.5 Example Consider an $n^{th}$-order linear differential equation in the dependent

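variable y(t), with forcing function $b_0(t)u(t)$,

$$ \frac{d^n y(t)}{dt^n} + a_{n-1}(t)\, \frac{d^{n-1} y(t)}{dt^{n-1}} + \cdots + a_0(t)\, y(t) = b_0(t)\, u(t) $$

defined for $t \ge t_0$, with initial conditions

$$ y(t_0) ,\ \frac{dy}{dt}(t_0) ,\ \ldots ,\ \frac{d^{n-1} y}{dt^{n-1}}(t_0) $$
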
A simple device can be used to recast this differential equation into the form of a linear
state equation with input u(t) and output y(t). Though it seems an arbitrary choice, it
is convenient to define state variables (entries in the state vector) by

$$ x_1(t) = y(t) , \quad x_2(t) = \frac{dy(t)}{dt} , \quad \ldots , \quad x_n(t) = \frac{d^{n-1} y(t)}{dt^{n-1}} $$

That is, the output and its first n-1 derivatives are defined as state variables. Then

$$ \dot{x}_1(t) = x_2(t) , \quad \dot{x}_2(t) = x_3(t) , \quad \ldots , \quad \dot{x}_{n-1}(t) = x_n(t) $$

and, according to the differential equation,

$$ \dot{x}_n(t) = -a_0(t)\,x_1(t) - a_1(t)\,x_2(t) - \cdots - a_{n-1}(t)\,x_n(t) + b_0(t)\,u(t) $$


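Writing these equations in vector-matrix form, with the obvious definition of the state
vector x(t), gives a time-varying linear state equation,

$$
\dot{x}(t) = \begin{bmatrix}
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 \\
-a_0(t) & -a_1(t) & \cdots & -a_{n-1}(t)
\end{bmatrix} x(t) + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ b_0(t) \end{bmatrix} u(t) \tag{8}
$$

The output equation can be written as

$$ y(t) = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} x(t) $$

and the initial conditions on the output and its derivatives form the initial state

$$
x(t_0) = \begin{bmatrix} y(t_0) \\[1mm] \dfrac{dy}{dt}(t_0) \\ \vdots \\ \dfrac{d^{n-1}y}{dt^{n-1}}(t_0) \end{bmatrix}
$$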
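The recasting in Example 2.5 is mechanical enough to sketch in code. The following is a minimal illustration, not taken from the text: the function name `companion_form` and the third-order coefficients are hypothetical, and the coefficient values are understood as samples of a_0(t), ..., a_{n-1}(t) and b_0(t) at one fixed time t.

```python
def companion_form(a, b0):
    """Coefficient matrices of the state equation for
    y^(n) + a[n-1]*y^(n-1) + ... + a[0]*y = b0*u, with x_k = y^(k-1).

    `a` holds the values a_0(t), ..., a_{n-1}(t) at one fixed time t."""
    n = len(a)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1.0            # x_i' = x_{i+1}
    A[n - 1] = [-ak for ak in a]     # last row comes from the differential equation
    B = [0.0] * (n - 1) + [float(b0)]
    C = [1.0] + [0.0] * (n - 1)      # y = x_1
    return A, B, C

# Hypothetical third-order example: y''' + 4y'' + 3y' + 2y = 5u
A, B, C = companion_form([2.0, 3.0, 4.0], 5.0)
for row in A:
    print(row)
```

For time-varying coefficients the same function would be called at each time of interest, since only the last row of A(t) and the last entry of B(t) change with t.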

Linearization
A linear state equation (1) is useful as an approximation to a nonlinear state equation in
the following sense. Consider

$$ \dot{x}(t) = f(x(t), u(t), t), \qquad x(t_0) = x_0 \tag{9} $$

where the state x(t) is an n × 1 vector, and u(t) is an m × 1 vector input. Written in
scalar terms, the $i^{th}$-component equation has the form

$$ \dot{x}_i(t) = f_i(x_1(t), \ldots, x_n(t);\ u_1(t), \ldots, u_m(t);\ t), \qquad x_i(t_0) = x_{0i} $$

for i = 1, ..., n. Suppose (9) has been solved for a particular input signal called the
nominal input $\tilde{u}(t)$ and a particular initial state called the nominal initial state $\tilde{x}_0$ to
obtain a nominal solution, often called a nominal trajectory, $\tilde{x}(t)$. Of interest is the
behavior of the nonlinear state equation for an input and initial state that are 'close' to
the nominal values. That is, consider $u(t) = \tilde{u}(t) + u_\delta(t)$ and $x_0 = \tilde{x}_0 + x_{0\delta}$, where
$\|x_{0\delta}\|$ and $\|u_\delta(t)\|$ are appropriately small for $t \ge t_0$. We assume that the
corresponding solution remains close to $\tilde{x}(t)$ at each t, and write $x(t) = \tilde{x}(t) + x_\delta(t)$. Of
course this is not always the case, though we will not pursue further an analysis of the
assumption. In terms of the nonlinear state equation description, these notations are
related according to

$$
\frac{d}{dt}\tilde{x}(t) + \frac{d}{dt}x_\delta(t) = f(\tilde{x}(t) + x_\delta(t),\ \tilde{u}(t) + u_\delta(t),\ t), \qquad
\tilde{x}(t_0) + x_\delta(t_0) = \tilde{x}_0 + x_{0\delta} \tag{10}
$$

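Assuming derivatives exist, we can expand the right side using Taylor series about
$\tilde{x}(t)$ and $\tilde{u}(t)$, and then retain only the terms through first order. This should provide a
reasonable approximation since $u_\delta(t)$ and $x_\delta(t)$ are assumed to be small for all t. Note
that the expansion describes the behavior of the function f(x, u, t) with respect to
arguments x and u; there is no expansion in terms of the third argument t. For the $i^{th}$
component, retaining terms through first order, and momentarily dropping most
t-arguments for simplicity, we can write

$$
\begin{aligned}
f_i(\tilde{x} + x_\delta, \tilde{u} + u_\delta, t) \approx f_i(\tilde{x}, \tilde{u}, t)
 &+ \frac{\partial f_i}{\partial x_1}(\tilde{x}, \tilde{u}, t)\,x_{\delta 1} + \cdots + \frac{\partial f_i}{\partial x_n}(\tilde{x}, \tilde{u}, t)\,x_{\delta n} \\
 &+ \frac{\partial f_i}{\partial u_1}(\tilde{x}, \tilde{u}, t)\,u_{\delta 1} + \cdots + \frac{\partial f_i}{\partial u_m}(\tilde{x}, \tilde{u}, t)\,u_{\delta m}
\end{aligned} \tag{11}
$$

Performing this expansion for i = 1, ..., n and arranging into vector-matrix form gives

$$
\frac{d}{dt}\tilde{x}(t) + \frac{d}{dt}x_\delta(t) \approx f(\tilde{x}(t), \tilde{u}(t), t)
 + \frac{\partial f}{\partial x}(\tilde{x}(t), \tilde{u}(t), t)\,x_\delta(t)
 + \frac{\partial f}{\partial u}(\tilde{x}(t), \tilde{u}(t), t)\,u_\delta(t)
$$

where the notation $\partial f / \partial x$ denotes the Jacobian, a matrix with i,j-entry $\partial f_i / \partial x_j$. Since

$$ \frac{d}{dt}\tilde{x}(t) = f(\tilde{x}(t), \tilde{u}(t), t), \qquad \tilde{x}(t_0) = \tilde{x}_0 $$

the relation between $x_\delta(t)$ and $u_\delta(t)$ is approximately described by a time-varying
linear state equation of the form

$$ \dot{x}_\delta(t) = A(t)x_\delta(t) + B(t)u_\delta(t), \qquad x_\delta(t_0) = x_0 - \tilde{x}_0 $$

where A(t) and B(t) are the matrices of partial derivatives evaluated using the nominal
trajectory data, namely

$$ A(t) = \frac{\partial f}{\partial x}(\tilde{x}(t), \tilde{u}(t), t), \qquad B(t) = \frac{\partial f}{\partial u}(\tilde{x}(t), \tilde{u}(t), t) $$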

If there is a nonlinear output equation,

$$ y(t) = h(x(t), u(t), t) $$

the function h(x, u, t) can be expanded about $x = \tilde{x}(t)$ and $u = \tilde{u}(t)$ in a similar
fashion to give, after dropping higher-order terms, the approximate description

$$ y_\delta(t) = C(t)x_\delta(t) + D(t)u_\delta(t) $$

Here the deviation output is $y_\delta(t) = y(t) - \tilde{y}(t)$, where $\tilde{y}(t) = h(\tilde{x}(t), \tilde{u}(t), t)$, and

$$ C(t) = \frac{\partial h}{\partial x}(\tilde{x}(t), \tilde{u}(t), t), \qquad D(t) = \frac{\partial h}{\partial u}(\tilde{x}(t), \tilde{u}(t), t) $$

In this development a nominal solution of interest is assumed to exist for all $t \ge t_0$,
and it must be known before the computation of the linearization can be carried out.
Determining an appropriate nominal solution often is a difficult problem, though
physical insight can be helpful.
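When partial derivatives are tedious to compute by hand, the matrices A(t) and B(t) can be approximated numerically. The sketch below is only an illustration under stated assumptions: the helper names `jacobian` and `linearize` are hypothetical, central differences stand in for exact partial derivatives, and the pendulum-like right side is an invented example that does not appear in the text.

```python
import math

def jacobian(fun, z, eps=1e-6):
    """Central-difference Jacobian of fun (list -> list) at the point z."""
    m = len(fun(z))
    J = [[0.0] * len(z) for _ in range(m)]
    for j in range(len(z)):
        zp, zm = list(z), list(z)
        zp[j] += eps
        zm[j] -= eps
        fp, fm = fun(zp), fun(zm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * eps)
    return J

def linearize(f, x_nom, u_nom, t):
    """Approximate A(t), B(t): partial derivatives of f(x, u, t) at the nominal data."""
    A = jacobian(lambda x: f(x, u_nom, t), x_nom)
    B = jacobian(lambda u: f(x_nom, u, t), u_nom)
    return A, B

# Hypothetical example (not from the text): a pendulum-like state equation
def f(x, u, t):
    return [x[1], -math.sin(x[0]) + u[0]]

# Constant nominal solution x = 0 for the nominal input u = 0
A, B = linearize(f, [0.0, 0.0], [0.0], 0.0)
print(A)
print(B)
```

For a time-varying nominal trajectory the call would be repeated along the trajectory, which is exactly why the nominal solution must be known before the linearization can be computed.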

2.6 Example Consider the behavior of the rocket in Example 2.1 when the rate of mass
expulsion can be varied with time: $u(t) = \dot{m}(t)$, in place of a constant u0. The velocity
and altitude considerations remain the same, leading to

$$
\begin{aligned}
\dot{h}(t) &= v(t) \\
\dot{v}(t) &= -g + \frac{v_e\,u(t)}{m(t)}
\end{aligned}
$$

In addition the rocket mass m(t) is described by

$$ \dot{m}(t) = u(t) $$

Therefore m(t) is regarded as another state variable, with u(t) as the input signal. Setting

$$ x_1(t) = h(t), \qquad x_2(t) = v(t), \qquad x_3(t) = m(t) $$

yields

$$
\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \\ \dot{x}_3(t) \end{bmatrix}
 = \begin{bmatrix} x_2(t) \\ -g + v_e\,u(t)/x_3(t) \\ u(t) \end{bmatrix}, \qquad
y(t) = x_1(t) \tag{13}
$$

This is a nonlinear state equation description of the system, and we consider
linearization about a nominal trajectory corresponding to the constant nominal input
$\tilde{u}(t) = u_0 < 0$. The nominal trajectory is not difficult to compute by integrating in turn
the differential equations for x3(t), x2(t), and x1(t). This calculation, equivalent to
solving the linear state equation (3) in Example 2.1, gives

$$
\begin{aligned}
\tilde{x}_1(t) &= -\frac{g\,t^2}{2} + \frac{v_e m_0}{u_0}\left[\left(1 + \frac{u_0 t}{m_0}\right)\ln\left(1 + \frac{u_0 t}{m_0}\right) - \frac{u_0 t}{m_0}\right] \\
\tilde{x}_2(t) &= -g\,t + v_e \ln\left(1 + \frac{u_0 t}{m_0}\right) \\
\tilde{x}_3(t) &= m_0 + u_0 t
\end{aligned} \tag{14}
$$

Again, these expressions are valid until the available mass is exhausted.
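To compute the linearized state equation about the nominal trajectory, the partial
derivatives needed are

$$
\frac{\partial f(x, u)}{\partial x} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & -v_e u / x_3^2 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
\frac{\partial f(x, u)}{\partial u} = \begin{bmatrix} 0 \\ v_e / x_3 \\ 1 \end{bmatrix}
$$

Evaluating these derivatives at the nominal data, the linearized state equation in terms of
the deviation variables $x_\delta(t) = x(t) - \tilde{x}(t)$ and $u_\delta(t) = u(t) - u_0$ is

$$
\dot{x}_\delta(t) = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & \dfrac{-v_e u_0}{(m_0 + u_0 t)^2} \\ 0 & 0 & 0 \end{bmatrix} x_\delta(t)
 + \begin{bmatrix} 0 \\ \dfrac{v_e}{m_0 + u_0 t} \\ 1 \end{bmatrix} u_\delta(t)
$$

(Here $u_\delta(t)$ can be positive or negative, representing deviations from the negative
constant value u0.) The initial conditions for the deviation state variables are given by

$$ x_\delta(0) = x(0) - \begin{bmatrix} 0 \\ 0 \\ m_0 \end{bmatrix} $$

Of course the nominal output is simply $\tilde{y}(t) = \tilde{x}_1(t)$, and the linearized output equation is

$$ y_\delta(t) = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} x_\delta(t) $$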
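The nominal trajectory for the rocket can be cross-checked numerically. The sketch below is an illustration only: the numerical values of g, v_e, u0 and m0 are hypothetical, the nominal initial state is assumed to be zero altitude, zero velocity and mass m0 at t = 0, and the closed-form expressions in `nominal` are the result of integrating the three state equations in turn under those assumptions.

```python
import math

g, ve, u0, m0 = 9.8, -2500.0, -5.0, 1000.0   # hypothetical values, ve < 0 and u0 < 0

def f(x):
    """Right side of the rocket state equation with the constant input u0."""
    return [x[1], -g + ve * u0 / x[2], u0]

def rk4(x, h, steps):
    """Classical fourth-order Runge-Kutta integration of x' = f(x)."""
    for _ in range(steps):
        k1 = f(x)
        k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
        k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
        k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6.0
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x

def nominal(t):
    """Closed-form nominal trajectory for zero initial altitude and velocity."""
    s = 1.0 + u0 * t / m0
    x1 = -0.5 * g * t**2 + (ve * m0 / u0) * (s * math.log(s) - u0 * t / m0)
    x2 = -g * t + ve * math.log(s)
    x3 = m0 + u0 * t
    return [x1, x2, x3]

numeric = rk4([0.0, 0.0, m0], 0.01, 1000)   # integrate from t = 0 to t = 10
closed = nominal(10.0)
print(numeric)
print(closed)
```

Agreement between the two printed vectors is a quick sanity check on the signs in the closed-form expressions, which are easy to get wrong because both v_e and u0 are negative.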

2.7 Example An Earth satellite of unit mass can be modeled as a point mass moving in
a plane while attracted to the origin of the plane by an inverse square law force. It is
convenient to choose polar coordinates, with r(t) the radius from the origin to the mass,
and θ(t) the angle from an appropriate axis. Assuming the satellite can apply force
u1(t) in the radial direction and u2(t) in the tangential direction, as shown in Figure
2.8, the equations of motion have the form

$$
\begin{aligned}
\ddot{r}(t) &= r(t)\,\dot{\theta}^2(t) - \frac{\beta}{r^2(t)} + u_1(t) \\
\ddot{\theta}(t) &= -\frac{2\,\dot{\theta}(t)\,\dot{r}(t)}{r(t)} + \frac{u_2(t)}{r(t)}
\end{aligned}
$$

where β is a constant. When the thrust forces are identically zero, solutions can be
ellipses, parabolas, or hyperbolas, describing orbital motion in the first instance, and
escape trajectories of the satellite in the others. The simplest orbit is a circle, where r(t)
and $\dot{\theta}(t)$ are constant. Specifically it is easy to verify that for the nominal input
$\tilde{u}_1(t) = \tilde{u}_2(t) = 0$, $t \ge 0$, and nominal initial conditions

$$ r(0) = r_0, \qquad \dot{r}(0) = 0, \qquad \theta(0) = \theta_0, \qquad \dot{\theta}(0) = \omega_0 $$

where $\omega_0 = (\beta / r_0^3)^{1/2}$, the nominal solution is

$$ \tilde{r}(t) = r_0, \qquad \tilde{\theta}(t) = \omega_0 t + \theta_0 $$

Figure 2.8 A unit point mass in gravitational orbit.

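To construct a state equation representation, let

$$ x_1(t) = r(t), \qquad x_2(t) = \dot{r}(t), \qquad x_3(t) = \theta(t), \qquad x_4(t) = \dot{\theta}(t) $$

so that the equations of motion are described by

$$
\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \\ \dot{x}_3(t) \\ \dot{x}_4(t) \end{bmatrix}
 = \begin{bmatrix} x_2(t) \\[1mm] x_1(t)\,x_4^2(t) - \dfrac{\beta}{x_1^2(t)} + u_1(t) \\[2mm] x_4(t) \\[1mm] -\dfrac{2\,x_2(t)\,x_4(t)}{x_1(t)} + \dfrac{u_2(t)}{x_1(t)} \end{bmatrix}
$$

The nominal data is then

$$
\tilde{x}(t) = \begin{bmatrix} r_0 \\ 0 \\ \omega_0 t + \theta_0 \\ \omega_0 \end{bmatrix}, \qquad
\tilde{u}(t) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad
\tilde{x}(0) = \begin{bmatrix} r_0 \\ 0 \\ \theta_0 \\ \omega_0 \end{bmatrix}
$$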

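With the deviation variables

$$ x_\delta(t) = x(t) - \tilde{x}(t), \qquad u_\delta(t) = u(t) $$

the corresponding linearized state equation is computed to be

$$
\dot{x}_\delta(t) = \begin{bmatrix}
0 & 1 & 0 & 0 \\
3\omega_0^2 & 0 & 0 & 2 r_0 \omega_0 \\
0 & 0 & 0 & 1 \\
0 & -2\omega_0 / r_0 & 0 & 0
\end{bmatrix} x_\delta(t) + \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1/r_0 \end{bmatrix} u_\delta(t)
$$

Of course the outputs are given by

$$
\begin{bmatrix} r_\delta(t) \\ \theta_\delta(t) \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} x_\delta(t)
$$

where $r_\delta(t) = r(t) - r_0$, and $\theta_\delta(t) = \theta(t) - \omega_0 t - \theta_0$. For a circular orbit the linearized
state equation about the time-varying nominal solution is a time-invariant linear state
equation, an unusual occurrence. If a nominal trajectory corresponding to an elliptical
orbit is considered, a linearized state equation with periodic coefficients is obtained.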
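The time-invariance of the linearization about the circular orbit can be checked directly. In the sketch below, which is illustrative only, the orbit data r0, w0 and theta0 are hypothetical numbers, `f` is the satellite state equation with x = [r, r', theta, theta'], and `dfdx` is a hand-computed Jacobian with respect to x that is then evaluated along the nominal trajectory at two different times.

```python
r0, w0, theta0 = 2.0, 1.5, 0.3          # hypothetical orbit data
beta = w0**2 * r0**3                    # chosen so that w0 = (beta / r0**3) ** 0.5

def f(x, u):
    """Right side of the satellite state equation, x = [r, r', theta, theta']."""
    x1, x2, x3, x4 = x
    return [x2,
            x1 * x4**2 - beta / x1**2 + u[0],
            x4,
            -2.0 * x2 * x4 / x1 + u[1] / x1]

def dfdx(x, u):
    """Hand-computed Jacobian of f with respect to x."""
    x1, x2, x3, x4 = x
    return [[0.0, 1.0, 0.0, 0.0],
            [x4**2 + 2.0 * beta / x1**3, 0.0, 0.0, 2.0 * x1 * x4],
            [0.0, 0.0, 0.0, 1.0],
            [2.0 * x2 * x4 / x1**2 - u[1] / x1**2, -2.0 * x4 / x1, 0.0, -2.0 * x2 / x1]]

def nominal(t):
    """Circular-orbit nominal trajectory."""
    return [r0, 0.0, w0 * t + theta0, w0]

# The circular orbit satisfies the state equation: f equals d/dt of nominal(t),
# which is [0, 0, w0, 0]
print(f(nominal(1.0), [0.0, 0.0]))

# A(t) evaluated along the nominal trajectory is the same at every t
A0 = dfdx(nominal(0.0), [0.0, 0.0])
A5 = dfdx(nominal(5.0), [0.0, 0.0])
print(A0 == A5)
for row in A0:
    print(row)
```

The Jacobian does not involve x3 at all, and x3 is the only nominal state component that varies with time, which is the structural reason the coefficients come out constant.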

In a fashion closely related to linearization, time-varying linear state equations
provide descriptions of the parameter sensitivity of solutions of nonlinear state
equations. As a simple illustration consider an unforced nonlinear state equation of
dimension n, including a scalar parameter α that enters both the right side of the state
equation and the initial state. Any solution of the state equation also depends on the
parameter, so we adopt the notation

$$ \frac{\partial}{\partial t}x(t, \alpha) = f(x(t, \alpha), \alpha), \qquad x(0, \alpha) = x_0(\alpha) \tag{19} $$

Suppose that the function f(x, α) is continuously differentiable in both x and α, and
that a solution x(t, α0), t ≥ 0, exists for a nominal value α0 of the parameter. Then a
standard result in the theory of differential equations is that a solution x(t, α) exists and
is continuously differentiable in both t and α, for α close to α0. The issue of interest is
the effect of changes in α on such solutions.
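We can differentiate (19) with respect to α and write

$$
\frac{\partial}{\partial \alpha}\frac{\partial}{\partial t}x(t, \alpha) = \frac{\partial f}{\partial x}(x(t, \alpha), \alpha)\,\frac{\partial}{\partial \alpha}x(t, \alpha) + \frac{\partial f}{\partial \alpha}(x(t, \alpha), \alpha), \qquad
\frac{\partial}{\partial \alpha}x(0, \alpha) = \frac{dx_0}{d\alpha}(\alpha) \tag{20}
$$

To simplify notation denote derivatives with respect to α, evaluated at α0, by

$$ z(t) = \frac{\partial x}{\partial \alpha}(t, \alpha_0), \qquad g(t) = \frac{\partial f}{\partial \alpha}(x(t, \alpha_0), \alpha_0) $$

and let

$$ A(t) = \frac{\partial f}{\partial x}(x(t, \alpha_0), \alpha_0) $$

Then since

$$ \frac{\partial}{\partial \alpha}\frac{\partial}{\partial t}x(t, \alpha) = \frac{\partial}{\partial t}\frac{\partial}{\partial \alpha}x(t, \alpha) $$

we can write (20) for α = α0 as

$$ \dot{z}(t) = A(t)z(t) + g(t), \qquad z(0) = \frac{dx_0}{d\alpha}(\alpha_0) $$

The solution z(t) of this forced linear state equation describes the dependence of the
solution of (19) on the parameter α, at least for |α − α0| small. If in a particular
instance ||z(t)|| remains small for t ≥ 0, then the solution of the nonlinear state
equation is relatively insensitive to changes in α near α0.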
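A scalar case makes the sensitivity equation concrete. The following worked example is an illustration that does not appear in the text: it assumes f(x, α) = −αx and an initial state x0(α) = 1 that does not depend on α, so that z(0) = 0.

```latex
\begin{aligned}
\frac{\partial}{\partial t}x(t,\alpha) &= -\alpha\,x(t,\alpha), \quad x(0,\alpha) = 1
  &&\Longrightarrow\quad x(t,\alpha) = e^{-\alpha t} \\
A(t) &= \frac{\partial f}{\partial x} = -\alpha_0, \qquad
g(t) = \frac{\partial f}{\partial \alpha} = -x(t,\alpha_0) = -e^{-\alpha_0 t} \\
\dot z(t) &= -\alpha_0\,z(t) - e^{-\alpha_0 t}, \quad z(0) = 0
  &&\Longrightarrow\quad z(t) = -t\,e^{-\alpha_0 t}
\end{aligned}
```

The result agrees with differentiating the explicit solution, since $\partial e^{-\alpha t} / \partial \alpha = -t\,e^{-\alpha t}$. For $\alpha_0 > 0$ the sensitivity satisfies $|z(t)| \le 1/(\alpha_0 e)$ for all t ≥ 0, so in this instance the solution is relatively insensitive to small changes in α.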

Figure 2.9 The elements of a state variable diagram.

State Equation Implementation


In a reversal of the discussion so far, we briefly note that a linear state equation can be
implemented directly in electronic hardware. One implementation is based on electronic
devices called operational amplifiers that can be arranged to produce on electrical
signals the three underlying operations in a linear state equation.
The first operation is the (signed) sum of scalar functions of time, diagramed in
Figure 2.9(a). The second is integration, which conveniently represents the relationship
between a scalar function of time, its derivative, and an initial value. This is shown in
Figure 2.9(b). The third operation is multiplication of a scalar signal by a time-varying
coefficient, as represented in Figure 2.9(c). The basic building blocks shown in Figure
2.9 can be connected together as prescribed by a given linear state equation. The
resulting diagram, called a state variable diagram, is very close to a hardware layout for
electronic implementation. From a theoretical perspective such a diagram sometimes
reveals structural features of the linear state equation that are not apparent from the
coefficient matrices.

2.10 Example The linear state equation (8) in Example 2.5 can be represented by the
state variable diagram shown in Figure 2.11.

Figure 2.11 A state variable diagram for Example 2.5.

EXERCISES
Exercise 2.1 Rewrite the $n^{th}$-order linear differential equation

$$ y^{(n)}(t) + a_{n-1}(t)\,y^{(n-1)}(t) + \cdots + a_0(t)\,y(t) = b_0(t)\,u(t) + b_1(t)\,u^{(1)}(t) $$

as a dimension-n linear state equation,

$$
\begin{aligned}
\dot{x}(t) &= A(t)x(t) + B(t)u(t) \\
y(t) &= C(t)x(t) + D(t)u(t)
\end{aligned}
$$

Hint: Let $x_n(t) = y^{(n-1)}(t) - b_1(t)u(t)$.

Exercise 2.2 Define state variables such that the $n^{th}$-order differential equation

$$ y^{(n)}(t) + a_{n-1}t^{-1}y^{(n-1)}(t) + a_{n-2}t^{-2}y^{(n-2)}(t) + \cdots + a_1 t^{-n+1}y^{(1)}(t) + a_0 t^{-n}y(t) = 0 $$

can be written as a linear state equation

$$ \dot{x}(t) = t^{-1}Ax(t) $$

where A is a constant n × n matrix.

Exercise 2.3 For the differential equation

$$ \ddot{y}(t) + (4/3)\,y^3(t) = -(1/3)\,u(t) $$

use a simple trigonometry identity to help find a nominal solution corresponding to
$\tilde{u}(t) = \sin(3t)$, $y(0) = 0$, $\dot{y}(0) = 1$. Determine a linearized state equation that describes the behavior about this
nominal.

Exercise 2.4 Linearize the nonlinear state equation

x2(t) = u(t)x1(t)
about the nominal trajectory arising from

(0) =

1,

and

= 0 for all I 0.

Exercise 2.5 For the nonlinear state equation

$$
\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix}
 = \begin{bmatrix} x_2(t) - 2x_1(t)x_2(t) \\ -x_1(t) + x_1^2(t) + x_2^2(t) + u(t) \end{bmatrix}
$$

with constant nominal input $\tilde{u}(t) = 0$, compute the possible constant nominal solutions, often
called equilibrium states, and the corresponding linearized state equations.

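Exercise 2.6 The Euler equations for the angular velocities of a rigid body are

$$
\begin{aligned}
I_1\dot{\omega}_1(t) &= (I_2 - I_3)\,\omega_2(t)\,\omega_3(t) + u_1(t) \\
I_2\dot{\omega}_2(t) &= (I_3 - I_1)\,\omega_1(t)\,\omega_3(t) + u_2(t) \\
I_3\dot{\omega}_3(t) &= (I_1 - I_2)\,\omega_1(t)\,\omega_2(t) + u_3(t)
\end{aligned}
$$

Here ω1(t), ω2(t), and ω3(t) are the angular velocities in a body-fixed coordinate system
coinciding with the principal axes; u1(t), u2(t), and u3(t) are the applied torques; and I1, I2, and
I3 are the principal moments of inertia. For I1 = I2, a symmetrical body, linearize the equations
about the nominal solution

$$
\tilde{\omega}_1(t) = \sin\left[\frac{(I - I_3)\,\omega_0}{I}\,t\right], \qquad
\tilde{\omega}_2(t) = \cos\left[\frac{(I - I_3)\,\omega_0}{I}\,t\right], \qquad
\tilde{\omega}_3(t) = \omega_0
$$

where I = I1 = I2.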

Exercise 2.7 Consider a single-input, single-output, time-invariant linear state equation

$$
\begin{aligned}
\dot{x}(t) &= Ax(t) + bu(t), \qquad x(0) = x_0 \\
y(t) &= cx(t)
\end{aligned}
$$

If the nominal input is a nonzero constant, $\tilde{u}(t) = \tilde{u}$, under what conditions does there exist a
constant nominal solution $\tilde{x}(t) = x_0$ for some $x_0$? (The condition is more subtle than assuming A is
invertible.) Under what conditions is the corresponding nominal output zero? Under what
conditions do there exist constant nominal solutions that satisfy $\tilde{y} = \tilde{u}$ for all $\tilde{u}$?

Exercise 2.8 A time-invariant linear state equation

$$
\begin{aligned}
\dot{x}(t) &= Ax(t) + Bu(t) \\
y(t) &= Cx(t)
\end{aligned}
$$

with p = m is said to have identity dc-gain if for any given m × 1 vector $\tilde{u}$ there exists an n × 1
vector $\tilde{x}$ such that

$$ A\tilde{x} + B\tilde{u} = 0, \qquad C\tilde{x} = \tilde{u} $$

That is, given any constant input there is a constant nominal solution with output identical to
input. Under the assumption that

$$ \begin{bmatrix} A & B \\ C & 0 \end{bmatrix} $$

is invertible, show that
(a) if an m × n matrix K is such that (A + BK) is invertible, then $C(A + BK)^{-1}B$ is invertible,
(b) if K is such that (A + BK) is invertible, then there exists an m × m matrix N such that the state
equation

$$
\begin{aligned}
\dot{x}(t) &= (A + BK)x(t) + BNu(t) \\
y(t) &= Cx(t)
\end{aligned}
$$

has identity dc-gain.

Exercise 2.9 Repeat Exercise 2.8 (b), omitting the assumption that (A + BK) is invertible.

Exercise 2.10 Consider a so-called bilinear state equation

$$
\begin{aligned}
\dot{x}(t) &= Ax(t) + Dx(t)u(t) + bu(t), \qquad x(0) = x_0 \\
y(t) &= cx(t)
\end{aligned}
$$

where A, D are n × n, b is n × 1, c is 1 × n, and all are constant matrices. Under what condition
does this state equation have a constant nominal solution for a constant nominal input $u(t) = \tilde{u}$? If
A is invertible, show that there exists a constant nominal solution if $|\tilde{u}|$ is 'sufficiently small.'
What is the linearized state equation about such a nominal solution?

Exercise 2.11

For the nonlinear state equation

k(r) =

x,(t) + u(t)
x1(t) 2x,(t)
x10)u(t) 2x,(t)u(t)

v(t) =x1(t)
show that for every constant nominal input $\tilde{u}(t) = \tilde{u}$, t ≥ 0, there exists a constant nominal
trajectory $\tilde{x}(t) = \tilde{x}$, t ≥ 0. What is the nominal output in terms of $\tilde{u}$? Explain. Linearize the state
equation about an arbitrary constant nominal. If $\tilde{u} = 0$ and $x_\delta(0) = 0$, what is the response $y_\delta(t)$ of
the linearized state equation for any $u_\delta(t)$? (Solution of the linear state equation is not needed.)

Exercise 2.12

Consider the nonlinear state equation


11(t)

x(t) =
y(r)

u(I)x1(t) x1(t)
x,(t) 2x1(t)
2x3(t)

with nominal initial state


0
3
2

and constant nominal input $\tilde{u}(t) = 1$. Show that the nominal output is $\tilde{y}(t) = 1$. Linearize the state
equation about the nominal solution. Is there anything unusual about this example?

Exercise 2.13 For the nonlinear state equation

x1(t) + u(t)
2x,(t) +

i(t) =
3x3(t)

(t) 4x (t)x2(t) +

(t)

)'(t)
determine the constant nominal solution corresponding to any given constant nominal input
$u(t) = \tilde{u}$. Linearize the state equation about such a nominal. Show that if $\tilde{u} = 0$, then $y_\delta(t)$ is
zero regardless of $u_\delta(t)$.

Exercise 2.14 For the time-invariant linear state equation

$$ \dot{x}(t) = Ax(t) + Bu(t), \qquad x(0) = x_0 $$

suppose A is invertible and u(t) is continuously differentiable. Let

$$ q(t) = -A^{-1}Bu(t) $$

and derive a state equation description for $z(t) = x(t) - q(t)$. Interpret this description in terms of
deviation from an 'instantaneous constant nominal.'

NOTES
Note 2.1 Developing an appropriate mathematical model for a physical system often is difficult,
and always it is the most important step in system analysis and design. The examples offered here
are not intended to substantiate this claim; they serve only to motivate. Most engineering models
begin with elementary physics. Since the laws of physics presumably do not change with time, the
appearance of a time-varying differential equation is because of special circumstances in the
physical system, or because of a particular formulation. The electrical circuit with time-varying
elements in Example 2.2 is a case of the former, and the linearized state equation for the rocket in
Example 2.6 is a case of the latter. Specifically in Example 2.6, where the rocket thrust is time
variable, a time-invariant nonlinear state equation is obtained with m(t) as a state variable. This
leads to a linear time-varying state equation as an approximation via linearization about a
constant-thrust nominal trajectory. Introductory details on the physics of variable-mass systems,
including the ubiquitous rocket example, can be found in many elementary physics books, for
example

R. Resnick, D. Halliday, Physics, Part I, Third Edition, John Wiley, New York, 1977

J.P. McKelvey, H. Grotch, Physics for Science and Engineering, Harper & Row, New York, 1978

Elementary physical properties of time-varying electrical circuit elements are discussed in

L.O. Chua, C.A. Desoer, E.S. Kuh, Linear and Nonlinear Circuits, McGraw-Hill, New York, 1987

The dynamics of central-force motion, such as a satellite in a gravitational field, are treated in
several books on mechanics. See, for example,

B.H. Karnopp, Introduction to Dynamics, Addison-Wesley, Reading, Massachusetts, 1974


Elliptical nominal trajectories for Example 2.7 are much more complicated than the circular case.

Note 2.2 For the mathematically inclined, precise axiomatic formulations of 'system' and 'state'
are available in the literature. Starting from these axioms the linear state equation description
must be unpacked from complicated definitions. See for example

L.A. Zadeh, C.A. Desoer, Linear System Theory, McGraw-Hill, New York, 1963

E.D. Sontag, Mathematical Control Theory, Springer-Verlag, New York, 1990


Note 2.3 The direct transmission term D(t)u(t) in the standard linear state equation causes a
dilemma. It should be included on grounds that a theory of linear systems ought to encompass
'identity systems,' where D(t) = I, C(t) is zero, and A(t) and B(t) are anything, or nothing. Also
it should be included because physical systems with nonzero D(t) do arise. In many topics, for
example stability and realization, the direct transmission term is a side issue in the theoretical
development and causes no problem. But in other topics, feedback and the polynomial fraction
description are examples, a direct transmission complicates the situation. The decision in this
book is to simplify matters by often invoking a zero-D(t) assumption.

Note 2.4 Several more-general types of linear state equations can be studied. A linear state
equation where $\dot{x}(t)$ on the left side is multiplied by an n × n matrix that is singular for at least
some values of t is called a singular state equation or descriptor state equation. To pursue this
topic consult

F.L. Lewis, "A survey of linear singular systems," Circuits, Systems, and Signal Processing, Vol.
5, pp. 3-36, 1986

or

L. Dai, Singular Control Systems, Lecture Notes on Control and Information Sciences, Vol. 118,
Springer-Verlag, Berlin, 1989

Linear state equations that include derivatives of the input signal on the right side are discussed
from an advanced viewpoint in
M. Fliess, "Some basic structural properties of generalized linear systems," Systems & Control
Letters, Vol. 15, No. 5, pp. 391-396, 1990

Finally the notion of specifying inputs and outputs can be abandoned completely, and a system
can be viewed as a relationship among exogenous time signals. See the papers

J.C. Willems, "From time series to linear systems," Automatica, Vol. 22, pp. 561-580 (Part I),
pp. 675-694 (Part II), 1986

J.C. Willems, "Paradigms and puzzles in the theory of dynamical systems," IEEE Transactions on
Automatic Control, Vol. 36, No. 3, pp. 259-294, 1991

for an introduction to this behavioral approach to system theory.

Note 2.5 Our informal treatment of linearization of nonlinear state equations provides only a
glimpse of the topic. More advanced considerations can be found in the book by Sontag cited in
Note 2.2, and in

C.A. Desoer, M. Vidyasagar, Feedback Systems: Input-Output Properties, Academic Press, New
York, 1975

Note 2.6 The use of state variable diagrams to represent special structural features of linear state
equations is typical in earlier references, in part because of the legacy of analog computers. See
Section 4.9 of the book by Zadeh and Desoer cited in Note 2.2. Also consult Section 2.1 of
T. Kailath, Linear Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1980

where the idea of using integrators to represent a differential equation is attributed to Lord Kelvin.

Note 2.7 Can linear system theory contribute to the social, political, or biological sciences? A
harsh assessment is entertainingly delivered in
D.J. Berlinski, On Systems Analysis, MIT Press, Cambridge, 1976

Those contemplating grand applications of linear system theory might ponder Berlinski's
deconstruction.

3
STATE EQUATION SOLUTION

The basic questions of existence and uniqueness of solutions are first addressed for linear
state equations unencumbered by inputs and outputs. That is, we consider

$$ \dot{x}(t) = A(t)x(t), \qquad x(t_0) = x_0 \tag{1} $$

where the initial time $t_0$ and initial state $x_0$ are given. The n × n matrix function A(t) is
assumed to be continuous and defined for all t. By definition a solution is a
continuously-differentiable, n × 1 function x(t) that satisfies (1) for all t, though at the
outset only solutions for $t \ge t_0$ are considered. Among other things this avoids
absolute-value signs in certain inequalities, as mentioned in Chapter 1. A general
contraction mapping approach that applies to both linear and nonlinear state equations is
typical in mathematics references dealing with existence of solutions, however a more
specialized method is used here. One reason is simplicity, but more importantly the
calculations provide a good warm-up for developments in the sequel.
An alternative is simply to guess a solution to (1), and verify the guess by
substitution into the state equation. This is unscientific, though perhaps reasonable for
the very special case of constant A (t) and n = 1. (What is your guess?) But the form of
the solution of (1) in general is too intricate to be guessed without guidance, and our
development provides this guidance, and more. Requisite mathematical tools are the
notions of convergence reviewed in Chapter 1.
After the basic existence question is answered, we show that for a given $t_0$ and $x_0$
there is precisely one solution of (1). Then linear state equations with nonzero input
signals are considered, and the important result is that, under our default hypotheses,
there exists a unique solution for any specified initial time, initial state, and input signal.
We conclude the chapter with a review of standard terminology associated with
properties of state equation solutions.

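Existence
Given $t_0$, $x_0$, and an arbitrary time T > 0, we will construct a sequence of n × 1 vector
functions $\{x_k(t)\}_{k=0}^{\infty}$, defined on the interval $[t_0, t_0+T]$, that can be interpreted as a
sequence of 'approximate' solutions of (1). Then we prove that the sequence converges
uniformly and absolutely on $[t_0, t_0+T]$, and that the limit function is continuously
differentiable and satisfies (1). This settles existence of a solution of (1) with specified
$t_0$ and $x_0$, and also leads to a representation for solutions.
The sequence of approximating functions on $[t_0, t_0+T]$ is defined in an iterative
fashion by

$$
\begin{aligned}
x_0(t) &= x_0 \\
x_1(t) &= x_0 + \int_{t_0}^{t} A(\sigma_1)\,x_0(\sigma_1)\,d\sigma_1 \\
x_2(t) &= x_0 + \int_{t_0}^{t} A(\sigma_1)\,x_1(\sigma_1)\,d\sigma_1 \\
&\ \,\vdots \\
x_k(t) &= x_0 + \int_{t_0}^{t} A(\sigma_1)\,x_{k-1}(\sigma_1)\,d\sigma_1
\end{aligned} \tag{2}
$$

(Of course the subscripts in (2) denote different n × 1 functions, not entries in a vector.)
This iterative prescription can be compiled, by back substitution, to write $x_k(t)$ as a sum
of terms involving iterated integrals of A(t),

$$
x_k(t) = x_0 + \int_{t_0}^{t} A(\sigma_1)x_0\,d\sigma_1 + \int_{t_0}^{t} A(\sigma_1)\int_{t_0}^{\sigma_1} A(\sigma_2)x_0\,d\sigma_2\,d\sigma_1
 + \cdots + \int_{t_0}^{t} A(\sigma_1)\int_{t_0}^{\sigma_1} A(\sigma_2)\cdots\int_{t_0}^{\sigma_{k-1}} A(\sigma_k)x_0\,d\sigma_k\cdots d\sigma_1 \tag{3}
$$

For the convergence analysis it is more convenient to write each vector function in (2) as
a 'telescoping' sum:

$$ x_k(t) = x_0(t) + \sum_{j=0}^{k-1}\,[\,x_{j+1}(t) - x_j(t)\,], \qquad k = 1, 2, \ldots \tag{4} $$

Then the sequence of partial sums of the infinite series of n × 1 vector functions

$$ x_0(t) + \sum_{j=0}^{\infty}\,[\,x_{j+1}(t) - x_j(t)\,] \tag{5} $$

is precisely the sequence $\{x_k(t)\}_{k=0}^{\infty}$. Therefore convergence properties of the infinite
series (5) are equivalent to convergence properties of the sequence, and the advantage is
that a straightforward convergence argument applies to the series.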

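Let

$$ \alpha = \max_{t_0 \le t \le t_0+T} \|A(t)\|, \qquad \beta = \int_{t_0}^{t_0+T} \|A(\sigma_1)x_0\|\,d\sigma_1 $$

where α and β are guaranteed to be finite since A(t) is continuous and the time
interval is finite. Then, addressing the terms in (5),

$$
\|x_1(t) - x_0(t)\| = \Big\|\int_{t_0}^{t} A(\sigma)x_0\,d\sigma\Big\| \le \int_{t_0}^{t} \|A(\sigma)x_0\|\,d\sigma \le \beta, \qquad t \in [t_0, t_0+T]
$$

Next,

$$
\begin{aligned}
\|x_2(t) - x_1(t)\| &= \Big\|\int_{t_0}^{t} A(\sigma_1)x_1(\sigma_1) - A(\sigma_1)x_0(\sigma_1)\,d\sigma_1\Big\| \\
 &\le \int_{t_0}^{t} \|A(\sigma_1)\|\,\|x_1(\sigma_1) - x_0(\sigma_1)\|\,d\sigma_1 \\
 &\le \int_{t_0}^{t} \alpha\beta\,d\sigma_1 = \beta\alpha\,(t - t_0), \qquad t \in [t_0, t_0+T]
\end{aligned}
$$

It is easy to show that in general

$$
\begin{aligned}
\|x_{j+1}(t) - x_j(t)\| &= \Big\|\int_{t_0}^{t} A(\sigma_1)x_j(\sigma_1) - A(\sigma_1)x_{j-1}(\sigma_1)\,d\sigma_1\Big\| \\
 &\le \int_{t_0}^{t} \|A(\sigma_1)\|\,\|x_j(\sigma_1) - x_{j-1}(\sigma_1)\|\,d\sigma_1 \\
 &\le \beta\,\frac{\alpha^j (t - t_0)^j}{j!}, \qquad t \in [t_0, t_0+T], \quad j = 0, 1, \ldots
\end{aligned} \tag{7}
$$

These bounds are all we need to apply the Weierstrass M-Test reviewed in Theorem 1.8.
The terms in the infinite series (5) are bounded for $t \in [t_0, t_0+T]$ according to

$$ \|x_0(t)\| = \|x_0\|, \qquad \|x_{j+1}(t) - x_j(t)\| \le \beta\,\frac{\alpha^j T^j}{j!}, \quad j = 0, 1, \ldots $$

and the series of bounds

$$ \|x_0\| + \sum_{j=0}^{\infty} \beta\,\frac{\alpha^j T^j}{j!} $$

converges to $\|x_0\| + \beta e^{\alpha T}$. Therefore the infinite series (5) converges uniformly and
absolutely on the interval $[t_0, t_0+T]$. Since each term in the series is continuous on the
interval, the limit function, denoted x(t), is continuous on the interval by Theorem 1.6.
Again these properties carry over to the sequence $\{x_k(t)\}_{k=0}^{\infty}$ whose terms are the partial
sums of the series (5).
From (3), letting $k \to \infty$, the limit of the sequence (2) can be written as the infinite
series expression

$$
x(t) = x_0 + \int_{t_0}^{t} A(\sigma_1)x_0\,d\sigma_1 + \int_{t_0}^{t} A(\sigma_1)\int_{t_0}^{\sigma_1} A(\sigma_2)x_0\,d\sigma_2\,d\sigma_1
 + \cdots + \int_{t_0}^{t} A(\sigma_1)\int_{t_0}^{\sigma_1} A(\sigma_2)\cdots\int_{t_0}^{\sigma_{k-1}} A(\sigma_k)x_0\,d\sigma_k\cdots d\sigma_1 + \cdots \tag{8}
$$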

The last step is to show that this limit x(t) is continuously differentiable, and that it
satisfies the linear state equation (1). Evaluating (8) at $t = t_0$ yields $x(t_0) = x_0$. Next,
term-by-term differentiation of the series on the right side of (8) gives

$$
0 + A(t)x_0 + A(t)\int_{t_0}^{t} A(\sigma_2)x_0\,d\sigma_2
 + \cdots + A(t)\int_{t_0}^{t} A(\sigma_2)\cdots\int_{t_0}^{\sigma_{k-1}} A(\sigma_k)x_0\,d\sigma_k\cdots d\sigma_2 + \cdots \tag{9}
$$

Apart from the leading zero, the terms of this series are the terms of the series for
A(t)x(t) (compare the right side of (8) with (9)), and uniform convergence of (9) on
$[t_0, t_0+T]$ follows. Thus by Theorem 1.7 this term-by-term differentiation yields the derivative of
x(t), and the derivative is A(t)x(t). Because solutions are required by definition to be
continuously differentiable, we explicitly note that terms in the series (9) are continuous.
Therefore by Theorem 1.6 the derivative of x(t) is continuous, and we have shown that,
indeed, (8) is a solution of (1).


This same development works for $t \in [t_0 - T, t_0]$, though absolute values must be
used in various inequality strings.
It is convenient to rewrite the n × 1 vector series in (8) by factoring $x_0$ out the right
side of each term to obtain

$$
x(t) = \Big[\,I + \int_{t_0}^{t} A(\sigma_1)\,d\sigma_1 + \int_{t_0}^{t} A(\sigma_1)\int_{t_0}^{\sigma_1} A(\sigma_2)\,d\sigma_2\,d\sigma_1
 + \cdots + \int_{t_0}^{t} A(\sigma_1)\int_{t_0}^{\sigma_1} A(\sigma_2)\cdots\int_{t_0}^{\sigma_{k-1}} A(\sigma_k)\,d\sigma_k\cdots d\sigma_1 + \cdots\Big]\,x_0 \tag{10}
$$

Denoting the n × n matrix series on the right side by $\Phi(t, t_0)$, the solution just
constructed can be written in terms of this transition matrix as

$$ x(t) = \Phi(t, t_0)\,x_0 \tag{11} $$

Since for any $x_0$ the n × 1 vector series $\Phi(t, t_0)x_0$ in (8) converges absolutely and
uniformly for $t \in [t_0 - T, t_0 + T]$, where T > 0 is arbitrary, it follows that the n × n
matrix series $\Phi(t, t_0)$ converges absolutely and uniformly on the same interval. Simply
choose $x_0 = e_j$, the $j^{th}$-column of $I_n$, to prove the convergence properties of the $j^{th}$-column
of $\Phi(t, t_0)$.
column of
It is convenient for some purposes to view the transition matrix as a function of
two variables, written as $\Phi(t, \tau)$, defined by the Peano-Baker series

$$
\Phi(t, \tau) = I + \int_{\tau}^{t} A(\sigma_1)\,d\sigma_1 + \int_{\tau}^{t} A(\sigma_1)\int_{\tau}^{\sigma_1} A(\sigma_2)\,d\sigma_2\,d\sigma_1
 + \int_{\tau}^{t} A(\sigma_1)\int_{\tau}^{\sigma_1} A(\sigma_2)\int_{\tau}^{\sigma_2} A(\sigma_3)\,d\sigma_3\,d\sigma_2\,d\sigma_1 + \cdots \tag{12}
$$

Though we have established convergence properties for fixed τ, it takes a little more
work to show the series (12) converges uniformly and absolutely for $t, \tau \in [-T, T]$,
where T > 0 is arbitrary. See Exercise 3.13.
By slightly modifying the analysis, it can be shown that the various series
considered above converge for any value of t in the whole interval $(-\infty, \infty)$. The
restriction to finite (though arbitrary) intervals is made to acquire the property of
uniform convergence, which implies convenient rules for application of differential and
integral calculus.
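Partial sums of the Peano-Baker series can be approximated numerically by iterating $\Phi_{k+1}(s) = I + \int_{\tau}^{s} A(\sigma)\Phi_k(\sigma)\,d\sigma$ on a time grid, starting from $\Phi_0 = I$. The sketch below is a rough illustration rather than a recommended algorithm: the function names, the trapezoidal rule, the grid size, and the 2 × 2 test matrix A(t) are all assumptions made for the example.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def peano_baker(A, tau, t, terms=6, n=200):
    """Approximate Phi(t, tau) by `terms` steps of
    Phi_{k+1}(s) = I + integral from tau to s of A(sigma) Phi_k(sigma) d(sigma),
    with the integral evaluated by the trapezoidal rule on n subintervals."""
    dim = len(A(tau))
    h = (t - tau) / n
    grid = [tau + i * h for i in range(n + 1)]
    eye = [[float(i == j) for j in range(dim)] for i in range(dim)]
    phi = [eye] * (n + 1)                      # zeroth partial sum: identity
    for _ in range(terms):
        integrand = [matmul(A(s), p) for s, p in zip(grid, phi)]
        acc = [[0.0] * dim for _ in range(dim)]
        new_phi = [eye]
        for i in range(1, n + 1):
            acc = [[acc[r][c] + 0.5 * h * (integrand[i - 1][r][c] + integrand[i][r][c])
                    for c in range(dim)] for r in range(dim)]
            new_phi.append([[eye[r][c] + acc[r][c] for c in range(dim)]
                            for r in range(dim)])
        phi = new_phi
    return phi[-1]

# Hypothetical test matrix: A(t) = [[0, t], [0, 0]], for which products
# A(s1) A(s2) vanish and Phi(t, tau) = [[1, (t^2 - tau^2)/2], [0, 1]]
A = lambda s: [[0.0, s], [0.0, 0.0]]
Phi = peano_baker(A, 0.0, 2.0)
print(Phi)
```

Each pass through the outer loop adds one more iterated integral to the partial sum, which mirrors the back-substitution argument used to obtain the series.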

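3.1 Example For a scalar, time-invariant linear state equation, where we write
A(t) = a, the approximating sequence (2) generates

$$
\begin{aligned}
x_0(t) &= x_0 \\
x_1(t) &= x_0 + a\,x_0\,\frac{(t - t_0)}{1!} \\
x_2(t) &= x_0 + a\,x_0\,\frac{(t - t_0)}{1!} + a^2 x_0\,\frac{(t - t_0)^2}{2!}
\end{aligned}
$$

and so on. The general term in the sequence is

$$ x_k(t) = \Big[\,1 + a\,\frac{(t - t_0)}{1!} + \cdots + a^k\,\frac{(t - t_0)^k}{k!}\,\Big]\,x_0 $$

and the limit of the sequence is the presumably familiar solution

$$ x(t) = e^{a(t - t_0)}\,x_0 $$

Thus the transition matrix in this case is simply a scalar exponential.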
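The scalar iteration can be carried out exactly on polynomial coefficients, which makes the convergence to the exponential easy to watch. This is a small sketch with hypothetical function names; it assumes A(t) = a is a scalar constant and represents each x_k(t) as a polynomial in s = t − t0.

```python
import math

def picard_scalar(a, x0, k):
    """Coefficients c[j] of x_k(t) = sum_j c[j] * (t - t0)**j produced by k
    steps of the iteration (2) when A(t) = a is a scalar constant."""
    coeffs = [x0]                                   # x_0(t) = x0
    for _ in range(k):
        # x0 plus the integral of a * x_{k-1}, integrated term by term
        coeffs = [x0] + [a * c / (j + 1) for j, c in enumerate(coeffs)]
    return coeffs

def evaluate(coeffs, s):
    """Value of the polynomial at s = t - t0."""
    return sum(c * s**j for j, c in enumerate(coeffs))

# Compare successive approximations at t - t0 = 1 with the exponential, for a = -1
for k in (1, 2, 5, 10, 20):
    print(k, evaluate(picard_scalar(-1.0, 1.0, k), 1.0), math.exp(-1.0))
```

The coefficient list after k steps is exactly the degree-k Taylor polynomial of the exponential scaled by x0, matching the general term displayed in the example.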

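Uniqueness
We next verify that the solution (11) for the linear state equation (1) with specified $t_0$
and $x_0$ is the only solution. The Gronwall-Bellman inequality is the main tool.
Generalizations of this inequality are presented in the Exercises for use in the sequel.

3.2 Lemma Suppose that $\phi(t)$ and $v(t)$ are continuous functions defined for $t \ge t_0$
with $v(t) \ge 0$ for $t \ge t_0$, and suppose $\psi$ is a constant. Then the implicit inequality

$$ \phi(t) \le \psi + \int_{t_0}^{t} v(\sigma)\phi(\sigma)\,d\sigma, \qquad t \ge t_0 \tag{14} $$

implies the explicit inequality

$$ \phi(t) \le \psi\,e^{\int_{t_0}^{t} v(\sigma)\,d\sigma}, \qquad t \ge t_0 \tag{15} $$

Proof Write the right side of (14) as

$$ r(t) = \psi + \int_{t_0}^{t} v(\sigma)\phi(\sigma)\,d\sigma $$

to simplify notation. Then

$$ \dot{r}(t) = v(t)\phi(t) $$

and (14) implies, since v(t) is nonnegative,

$$ \dot{r}(t) = v(t)\phi(t) \le v(t)r(t) \tag{16} $$

Multiply both sides of (16) by the positive function

$$ e^{-\int_{t_0}^{t} v(\sigma)\,d\sigma} $$

to obtain

$$ \frac{d}{dt}\Big[\,r(t)\,e^{-\int_{t_0}^{t} v(\sigma)\,d\sigma}\,\Big] \le 0, \qquad t \ge t_0 $$

Integrating both sides from $t_0$ to any $t \ge t_0$ gives

$$ r(t)\,e^{-\int_{t_0}^{t} v(\sigma)\,d\sigma} - \psi \le 0, \qquad t \ge t_0 $$

and this yields (15).

□□□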
A proof that there is only one solution of the linear state equation (1) can be
accomplished by showing that any two solutions necessarily are identical. Given $t_0$ and
$x_0$, suppose $x_a(t)$ and $x_b(t)$ both are (continuously differentiable) solutions of (1) for
$t \ge t_0$. Then

$$ z(t) = x_a(t) - x_b(t) $$

satisfies

$$ \dot{z}(t) = A(t)z(t), \qquad z(t_0) = 0 \tag{17} $$

and the objective is to show that (17) implies z(t) = 0 for all $t \ge t_0$. (Zero clearly is a
solution of (17), but we need to show that it is the only solution in order to elude a
vicious circle.)
Integrating both sides of (17) from $t_0$ to any $t \ge t_0$ and taking the norms of both
sides of the result yields the inequality

$$ \|z(t)\| \le \int_{t_0}^{t} \|A(\sigma)\|\,\|z(\sigma)\|\,d\sigma $$

Applying Lemma 3.2 (with ψ = 0) to this inequality gives immediately that ||z(t)|| = 0
for all $t \ge t_0$.
On using a similar demonstration for $t < t_0$, uniqueness of solutions for all t is
established. Then the development can be summarized as a result that even the jaded
must admit is remarkable, in view of the possible complicated nature of the entries of
A(t).
3.3 Theorem For any $t_0$ and $x_0$ the linear state equation (1), with A(t) continuous,
has the unique, continuously-differentiable solution

$$ x(t) = \Phi(t, t_0)\,x_0 $$

The transition matrix $\Phi(t, \tau)$ is given by the Peano-Baker series (12) that converges
absolutely and uniformly for $t, \tau \in [-T, T]$, where T > 0 is arbitrary.

3.4 Example The properties of existence and uniqueness of solutions defined for all t
in an arbitrary interval quickly evaporate when nonlinear state equations are considered.
Easy substitution verifies that the scalar state equation

$$ \dot{x}(t) = 3x^{2/3}(t), \qquad x(0) = 0 $$

has two distinct solutions, $x(t) = t^3$ and $x(t) = 0$, both defined for all t. The scalar state
equation

$$ \dot{x}(t) = 1 + x^2(t), \qquad x(0) = 0 $$

has the solution x(t) = tan t, but only on the time interval $t \in (-\pi/2, \pi/2)$. Specifically
this solution is undefined at $t = \pm\pi/2$, and no continuously-differentiable function
satisfies the state equation on any larger interval. Thus we see that Theorem 3.3 is an
important foundation for a reasoned theory, and not simply mathematical decoration.

□□□
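Both claims in Example 3.4 can be confirmed by substitution at a few sample times. The short check below is illustrative only; the sample times are arbitrary, and `rhs` uses the real cube root so that x^(2/3) is well defined for negative x.

```python
import math

def rhs(x):
    """Right side 3 * x**(2/3), using the real cube root for any sign of x."""
    return 3.0 * (abs(x) ** (1.0 / 3.0)) ** 2

# Candidate solutions of x' = 3 x^(2/3), x(0) = 0, paired with their derivatives
candidates = [(lambda t: t**3, lambda t: 3.0 * t**2),
              (lambda t: 0.0, lambda t: 0.0)]

worst = 0.0
for x, xdot in candidates:
    for t in [-2.0, -0.5, 0.0, 0.5, 2.0]:
        worst = max(worst, abs(xdot(t) - rhs(x(t))))
print(worst)   # largest residual over both candidates, zero up to rounding

# x(t) = tan t satisfies x' = 1 + x^2 and grows without bound as t nears pi/2
for t in [1.0, 1.5, 1.57]:
    print(t, math.tan(t))
```

Both candidates also satisfy x(0) = 0, so the initial value problem has at least two solutions, while the second state equation has a solution that escapes in finite time.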


The Peano-Baker series is a basic theoretical tool for ascertaining properties of
solutions of linear state equations. We concede that computation of solutions via the
Peano-Baker series is a frightening prospect, though calm calculation is profitable in the
simplest cases.

3.5 Example

For

A(t)=

(20)

the Peano-Baker series (12) is

t)

J
+

It is straightforward to verify that all terms in the series beyond the second are zero, and
thus
c1(t,

[I

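3.6 Example For a diagonal $A(t)$ the Peano-Baker series (12) simplifies greatly. Each
term of the series is a diagonal matrix, and therefore $\Phi(t, \tau)$ is diagonal. The $k^{th}$
diagonal entry of $\Phi(t, \tau)$ has the form

$$ \phi_{kk}(t, \tau) = 1 + \int_\tau^t a_{kk}(\sigma_1)\, d\sigma_1 + \int_\tau^t a_{kk}(\sigma_1) \int_\tau^{\sigma_1} a_{kk}(\sigma_2)\, d\sigma_2\, d\sigma_1 + \cdots $$

where $a_{kk}(t)$ is the $k^{th}$ diagonal entry of $A(t)$. This expression can be simplified by
proving that

$$ \int_\tau^t a_{kk}(\sigma_1) \int_\tau^{\sigma_1} a_{kk}(\sigma_2) \cdots \int_\tau^{\sigma_j} a_{kk}(\sigma_{j+1})\, d\sigma_{j+1} \cdots d\sigma_1 = \frac{1}{(j+1)!} \left[ \int_\tau^t a_{kk}(\sigma)\, d\sigma \right]^{j+1} $$

To verify this identity note that for any fixed value of $\tau$ the two sides agree at $t = \tau$, and
the derivatives of the two sides with respect to $t$ (Leibniz rule on the left, chain rule on
the right) are identical. Therefore

$$ \phi_{kk}(t, \tau) = \exp\left[ \int_\tau^t a_{kk}(\sigma)\, d\sigma \right] \tag{22} $$

and $\Phi(t, \tau)$ can be written explicitly in terms of the diagonal entries in $A(t)$.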
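As an illustration of the scalar case, the partial sums of the Peano-Baker series can be accumulated numerically and compared with the exponential of the integral. This is a sketch under stated assumptions: the coefficient $a(t) = \cos t$, the grid size, the number of terms, and the function name `peano_baker_scalar` are all hypothetical choices, and the integrals are approximated by the trapezoidal rule.

```python
import math

def peano_baker_scalar(a, tau, t, terms=12, steps=2000):
    # Partial sum of the scalar Peano-Baker series on a uniform grid over [tau, t].
    h = (t - tau) / steps
    grid = [tau + i * h for i in range(steps + 1)]
    a_vals = [a(s) for s in grid]
    term = [1.0] * (steps + 1)      # zeroth term, sampled on the whole grid
    total = 1.0                     # running sum of the series at the final time t
    for _ in range(terms):
        # Next term at each grid point s: integral from tau to s of a(sigma)*term(sigma).
        nxt = [0.0]
        for i in range(1, steps + 1):
            area = 0.5 * h * (a_vals[i - 1] * term[i - 1] + a_vals[i] * term[i])
            nxt.append(nxt[-1] + area)
        term = nxt
        total += term[-1]
    return total

series_value = peano_baker_scalar(math.cos, 0.0, 2.0)
closed_form = math.exp(math.sin(2.0) - math.sin(0.0))   # exp of the integral of cos
print(series_value, closed_form)
```

The two printed numbers agree to within the quadrature error, which is the numerical counterpart of the identity verified in the example.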

Complete Solution
The standard approach to considering existence and uniqueness of solutions of

$$ \dot{x}(t) = A(t)x(t) + B(t)u(t), \quad x(t_0) = x_0 \tag{23} $$

with given $x_0$ and continuous $u(t)$, involves using properties of the transition matrix
that are discussed in Chapter 4. However the guess-and-verify approach sometimes is
successful, so in Exercise 3.1 the reader is invited to verify by direct differentiation that a
solution of (23) is

$$ x(t) = \Phi(t, t_0)x_0 + \int_{t_0}^t \Phi(t, \sigma)B(\sigma)u(\sigma)\, d\sigma \tag{24} $$

A little thought shows that this solution is unique since the difference $z(t)$ between any
two solutions of (23) must satisfy (17). Thus $z(t)$ must be identically zero.
Taking account of an output equation,

$$ y(t) = C(t)x(t) + D(t)u(t) \tag{25} $$

(24) leads to

$$ y(t) = C(t)\Phi(t, t_0)x_0 + \int_{t_0}^t C(t)\Phi(t, \sigma)B(\sigma)u(\sigma)\, d\sigma + D(t)u(t) \tag{26} $$

Under the assumptions of continuous input signal and continuous state-equation


coefficients, x (t) in (24) is continuously differentiable, while y (t) in (26) is continuous.

If the assumption on the input signal is relaxed to piecewise continuity, then x (t) is
continuous (an exception to our default of continuously-differentiable solutions) and
y (t) is piecewise continuous (continuous if D (t) is zero).
The solution formulas for both $x(t)$ and $y(t)$ comprise two independent
components. The first depends only on the initial state, while the second depends only
on the input signal. Adopting an entrenched converse terminology, we call the response
component due to the initial state the zero-input response, and the component due to the
input signal the zero-state response. Then the complete solution of the linear state
equation is the sum of the zero-input and zero-state responses.
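The decomposition into zero-input and zero-state responses can be illustrated for a scalar state equation. The sketch below assumes the hypothetical coefficient $a(t) = -2t$ with $B(t) = 1$ and input $u(t) = \sin t$, so that the transition scalar is $\exp[-(t^2 - \sigma^2)]$. It evaluates the two components of the solution formula by quadrature and compares their sum with a direct Runge-Kutta integration of the state equation. All function names are ours.

```python
import math

def phi(t, sigma):
    # Transition scalar for a(t) = -2t: exp of the integral of a from sigma to t.
    return math.exp(-(t * t - sigma * sigma))

def zero_input(t, t0, x0):
    return phi(t, t0) * x0

def zero_state(t, t0, u, steps=2000):
    # Trapezoidal approximation of the integral of phi(t, s) * u(s), with B(s) = 1.
    h = (t - t0) / steps
    vals = [phi(t, t0 + i * h) * u(t0 + i * h) for i in range(steps + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def rk4(t0, x0, t, u, steps=2000):
    # Direct numerical integration of dx/dt = -2 t x + u(t), for comparison.
    f = lambda s, x: -2.0 * s * x + u(s)
    h = (t - t0) / steps
    s, x = t0, x0
    for _ in range(steps):
        k1 = f(s, x)
        k2 = f(s + h / 2, x + h * k1 / 2)
        k3 = f(s + h / 2, x + h * k2 / 2)
        k4 = f(s + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return x

formula = zero_input(1.5, 0.0, 1.0) + zero_state(1.5, 0.0, math.sin)
direct = rk4(0.0, 1.0, 1.5, math.sin)
print(formula, direct)
```

Setting the initial state to zero isolates the zero-state response, and setting the input to zero isolates the zero-input response, so each component can be checked separately in the same way.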

The complete solution can be used in conjunction with the general solution of
unforced scalar state equations embedded in Example 3.6 to divide and conquer the
transition matrix computation in some higher-dimensional cases.

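3.7 Example To compute the transition matrix for

$$ A(t) = \begin{bmatrix} 1 & 0 \\ 1 & a(t) \end{bmatrix} $$

write the corresponding pair of scalar equations

$$ \dot{x}_1(t) = x_1(t), \quad x_1(t_0) = x_{01} $$
$$ \dot{x}_2(t) = a(t)x_2(t) + x_1(t), \quad x_2(t_0) = x_{02} $$

From Example 3.1 we have

$$ x_1(t) = e^{t - t_0} x_{01} $$

Then the second scalar equation can be written as a forced scalar state equation
($B(t)u(t) = e^{t - t_0} x_{01}$)

$$ \dot{x}_2(t) = a(t)x_2(t) + e^{t - t_0} x_{01}, \quad x_2(t_0) = x_{02} $$

The transition matrix for scalar $a(t)$ is computed in Example 3.6, and applying (24)
gives

$$ x_2(t) = \exp\left[ \int_{t_0}^t a(\sigma)\, d\sigma \right] x_{02} + \int_{t_0}^t \exp\left[ \int_\sigma^t a(\tau)\, d\tau \right] e^{\sigma - t_0} x_{01}\, d\sigma $$

Repacking into matrix notation yields

$$ x(t) = \begin{bmatrix} e^{t - t_0} & 0 \\ \int_{t_0}^t \exp\left[ \sigma - t_0 + \int_\sigma^t a(\tau)\, d\tau \right] d\sigma & \exp\left[ \int_{t_0}^t a(\sigma)\, d\sigma \right] \end{bmatrix} x_0 $$

from which we immediately ascertain $\Phi(t, t_0)$.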

DOD
We close with a few observations on the response properties of the standard linear
state equation that are based on the complete solution formulas (24) and (26).
Computing the zero-input solution $x(t)$ for the initial state $x_0 = e_i$, the $i^{th}$ column of $I$,
at the initial time $t_0$ yields the $i^{th}$ column of $\Phi(t, t_0)$. Repeating this for the obvious set
of $n$ initial states provides the whole matrix function of $t$, $\Phi(t, t_0)$. However if $t_0$
changes, then the computation in general must be repeated. This can be contrasted with
the possibly familiar case of constant $A$, where knowledge of the transition matrix for
any one value of $t_0$ completely determines $\Phi(t, t_0)$ for any other value of $t_0$. (See
Chapter 5.)
Assuming a scalar input for simplicity, the zero-state response for the output with
unit impulse input $u(t) = \delta(t - t_0)$ is, from (26),

$$ y(t) = C(t)\Phi(t, t_0)B(t_0) + D(t)\delta(t - t_0) \tag{27} $$

(We assume that all the effect of the impulse is included under the integral sign in (26).
Alternatively we assume that the initial time is $t_0^-$, and the impulse occurs at time $t_0$.)
Unfortunately the zero-state response to a single impulse occurring at $t_0$ in general
provides quite limited information about the response to other inputs. Specifically it is
clear from (26) that the zero-state response involves the dependence of the transition
matrix on its second argument. Again this can be contrasted with the time-invariant

case, where the zero-state response to a single impulse characterizes the zero-state
response to all input signals. (Chapter 5, again.)
Finally we review terminology introduced in Chapter 2 from the viewpoint of the
complete solution. The state equation (23), (25) is called linear because the right sides
of both (23) and (25) are linear in the variables x (t) and u (t). Also the solution

components in $x(t)$ and $y(t)$ exhibit a linearity property in the following way. The
zero-state response is linear in the input signal $u(t)$, and the zero-input response is linear
in the initial state $x_0$. A linear state equation exhibits causal input-output behavior
because the response $y(t)$ at any $t_a \geq t_0$ does not depend on input values for $t > t_a$.
Recall that the response 'waveshape' depends on the initial time in general. More
precisely let $y_0(t)$, $t \geq t_0$, be the output signal corresponding to the initial state
$x(t_0) = x_0$ and input $u(t)$. For a new initial time $t_a > t_0$, let $y_a(t)$, $t \geq t_a$, be the output
signal corresponding to the same initial state $x(t_a) = x_0$ and the shifted input $u(t - t_a)$.
Then $y_0(t - t_a)$ and $y_a(t)$ in general are not identical. This again is in contrast to the
time-invariant case.
Additional Examples
We illustrate aspects of the complete solution formula for linear state equations by
revisiting two examples from Chapter 2.

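3.8 Example In Example 2.7 a linearized state equation is computed that describes
deviations of a satellite from a nominal circular orbit with radius and angle given by

$$ \tilde{r}(t) = r_0, \quad \tilde{\theta}(t) = \omega_0 t + \theta_0 \tag{28} $$

Assuming that $r_0 = 1$, and that the input (thrust) forces are zero ($u_\delta(t) = 0$), the
linearized state equation is

$$ \dot{x}_\delta(t) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 3\omega_0^2 & 0 & 0 & 2\omega_0 \\ 0 & 0 & 0 & 1 \\ 0 & -2\omega_0 & 0 & 0 \end{bmatrix} x_\delta(t), \quad y_\delta(t) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} x_\delta(t) \tag{29} $$

Suppose there is a disturbance that results in a small change in the distance of the
satellite from Earth. This can be interpreted as an initial deviation from the circular orbit,
and since the first state variable is the radius of the orbit we thus assume the initial state

$$ x_\delta(0) = \begin{bmatrix} \epsilon & 0 & 0 & 0 \end{bmatrix}^T $$

Here $\epsilon$ is a constant, presumably with $|\epsilon|$ small.
Because the zero-input solution for (29) has the form

$$ y_\delta(t) = C\,\Phi(t, 0)\,x_\delta(0) $$

the first step in describing the impact of this disturbance is to compute the transition
matrix. Methods for doing this are discussed in the sequel, though for the present
purpose we provide the result:

$$ \Phi(t, 0) = \begin{bmatrix} 4 - 3\cos\omega_0 t & \dfrac{\sin\omega_0 t}{\omega_0} & 0 & \dfrac{2(1 - \cos\omega_0 t)}{\omega_0} \\ 3\omega_0\sin\omega_0 t & \cos\omega_0 t & 0 & 2\sin\omega_0 t \\ -6\omega_0 t + 6\sin\omega_0 t & \dfrac{-2(1 - \cos\omega_0 t)}{\omega_0} & 1 & \dfrac{-3\omega_0 t + 4\sin\omega_0 t}{\omega_0} \\ -6\omega_0 + 6\omega_0\cos\omega_0 t & -2\sin\omega_0 t & 0 & -3 + 4\cos\omega_0 t \end{bmatrix} $$

Then the deviations in radius and angle are obtained by straightforward matrix
multiplication as

$$ y_\delta(t) = \begin{bmatrix} \epsilon(4 - 3\cos\omega_0 t) \\ 6\epsilon(-\omega_0 t + \sin\omega_0 t) \end{bmatrix} \tag{30} $$

Taking account of the nominal values in (28) gives the following approximate
expressions for the radius and angle of the disturbed orbit:

$$ r(t) \approx 1 + \epsilon(4 - 3\cos\omega_0 t) $$
$$ \theta(t) \approx \theta_0 + (1 - 6\epsilon)\omega_0 t + 6\epsilon\sin\omega_0 t \tag{31} $$

Thus, for example, we expect a radial disturbance with $\epsilon > 0$ to result in an oscillatory
variation (increase) in the radius of the orbit, with an oscillatory variation (decrease) in
angular velocity.
Worthy of note is the fact that while $\theta(t)$ in (31) is unbounded in the mathematical
sense there is no corresponding physical calamity. This illustrates the fact that physical
interpretations of mathematical properties must be handled with care, particularly in
Chapter 6 where stability properties are discussed.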
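3.9 Example In Example 2.1 the linear state equation

$$ \dot{x}(t) = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ -g + \dfrac{v_e u_0}{m_0 + u_0 t} \end{bmatrix}, \quad x(0) = 0 \tag{32} $$

describes the altitude $x_1(t)$ and velocity $x_2(t)$ of an ascending rocket driven by
constant thrust. Here $g$ is the acceleration due to gravity, and $v_e < 0$, $u_0 < 0$, and
$m_0 > 0$ are other constants. Assuming that the mass supply is exhausted at time $t_e > 0$,
and $v_e u_0 > g m_0$, so we get off the ground, the flight variables can be computed for
$t \in [0, t_e]$ as follows. A calculation similar to that in Example 3.5, but even simpler,
provides the transition matrix

$$ \Phi(t, \tau) = \begin{bmatrix} 1 & t - \tau \\ 0 & 1 \end{bmatrix} $$

Since $x(0) = 0$, the zero-state solution formula gives

$$ x(t) = \int_0^t \begin{bmatrix} 1 & t - \sigma \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ -g + \dfrac{v_e u_0}{m_0 + u_0 \sigma} \end{bmatrix} d\sigma $$

Evaluation of this integral, which is essentially the same calculation as one in Example
2.6, yields

$$ x(t) = \begin{bmatrix} -gt^2/2 - v_e t + \dfrac{v_e m_0}{u_0}\,(1 + u_0 t/m_0) \ln(1 + u_0 t/m_0) \\ -gt + v_e \ln(1 + u_0 t/m_0) \end{bmatrix}, \quad t \in [0, t_e] \tag{33} $$

At time $t = t_e$ the thrust becomes zero. Of course the rocket does not immediately stop,
but the change in forces acting on the rocket motivates restarting the calculation. The
altitude and velocity for $t \geq t_e$ are described by

$$ \dot{x}(t) = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ -g \end{bmatrix}, \quad t \geq t_e \tag{34} $$

The initial state for this second portion of the flight is precisely the terminal state
of the first portion. Denoting the remaining mass of the rocket by $m_e = m_0 + u_0 t_e$,
so that $1 + u_0 t_e/m_0 = m_e/m_0$, (33) gives

$$ x(t_e) = \begin{bmatrix} -g t_e^2/2 - v_e t_e + \dfrac{v_e m_e}{u_0} \ln\dfrac{m_e}{m_0} \\ -g t_e + v_e \ln\dfrac{m_e}{m_0} \end{bmatrix} $$

Therefore the complete solution formula yields a description of the altitude and velocity
for the second portion of the flight as

$$ x(t) = \begin{bmatrix} 1 & t - t_e \\ 0 & 1 \end{bmatrix} x(t_e) + \int_{t_e}^t \begin{bmatrix} 1 & t - \sigma \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ -g \end{bmatrix} d\sigma = \begin{bmatrix} -gt^2/2 - v_e t_e + v_e\,(t + m_0/u_0) \ln\dfrac{m_e}{m_0} \\ -gt + v_e \ln\dfrac{m_e}{m_0} \end{bmatrix}, \quad t \geq t_e $$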

This expression is valid until the unpleasant moment when the altitude again reaches
zero. The important point is that the solution computation can be segmented in time,
with the terminal state of any segment providing the initial state for the next segment.
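The segment-by-segment computation can be sketched numerically. In the following, the parameter values ($g$, $v_e$, $u_0$, $m_0$, $t_e$) are hypothetical and chosen only so that $v_e u_0 > g m_0$, and the function names are ours. The powered and coasting phases are integrated in turn by a Runge-Kutta method, with the terminal state of the first segment passed as the initial state of the second, and the result is compared with closed-form expressions obtained from the solution formulas.

```python
import math

# Hypothetical parameter values, chosen so that v_e * u0 > g * m0.
g, v_e, u0, m0, t_e = 9.8, -2000.0, -10.0, 1000.0, 50.0

def powered(t):
    return -g + v_e * u0 / (m0 + u0 * t)      # acceleration while thrust is on

def coasting(t):
    return -g                                  # acceleration after burnout

def rk4(acc, t0, state, t1, steps=5000):
    # Integrate x1' = x2, x2' = acc(t) from t0 to t1, starting at state = (x1, x2).
    h = (t1 - t0) / steps
    t, (x1, x2) = t0, state
    f = lambda s, pos, vel: (vel, acc(s))
    for _ in range(steps):
        k1 = f(t, x1, x2)
        k2 = f(t + h / 2, x1 + h * k1[0] / 2, x2 + h * k1[1] / 2)
        k3 = f(t + h / 2, x1 + h * k2[0] / 2, x2 + h * k2[1] / 2)
        k4 = f(t + h, x1 + h * k3[0], x2 + h * k3[1])
        x1 += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        x2 += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        t += h
    return x1, x2

def thrust_phase(t):
    # Closed-form altitude and velocity for 0 <= t <= t_e.
    log_term = math.log(1.0 + u0 * t / m0)
    x1 = -g * t * t / 2 - v_e * t + (v_e * m0 / u0) * (1 + u0 * t / m0) * log_term
    return x1, -g * t + v_e * log_term

def coast_phase(t):
    # Closed form for t >= t_e, restarted from the terminal state of the first segment.
    x1e, x2e = thrust_phase(t_e)
    dt = t - t_e
    return x1e + dt * x2e - g * dt * dt / 2, x2e - g * dt

burnout = rk4(powered, 0.0, (0.0, 0.0), t_e)   # first segment
later = rk4(coasting, t_e, burnout, 80.0)      # second segment starts at burnout state
print(burnout, thrust_phase(t_e))
print(later, coast_phase(80.0))
```

Integrating across the burnout time in one pass would ask the numerical method to step over a discontinuity in the forcing term, which is one practical reason for restarting at $t_e$.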


EXERCISES
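Exercise 3.1 By direct differentiation show that

$$ x(t) = \Phi(t, t_0)x_0 + \int_{t_0}^t \Phi(t, \sigma)B(\sigma)u(\sigma)\, d\sigma $$

is a solution of

$$ \dot{x}(t) = A(t)x(t) + B(t)u(t), \quad x(t_0) = x_0 $$

Exercise 3.2 Use term-by-term differentiation of the Peano-Baker series to prove that

$$ \frac{\partial}{\partial \tau} \Phi(t, \tau) = -\Phi(t, \tau)A(\tau) $$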

Exercise 3.3 By summing the Peano-Baker series, compute D(t, 0) for

3.4 Compute cb(t, 0) for

A(r)=
Exercise 3.5

;]

Compute an explicit expression for the solution of

1+1

x(f), x(0)=x,,
1+t

Show that the solution goes to zero as r

oo, regardless

of the initial state.

Exercise 3.6 Compute an explicit expression for the solution of

lt2

x(t),

An integral table or symbolic mathematics software will help.) Show that the solution does not

to zero as 1 oo if x01 0. By comparing this result with Exercise 3.5, conclude that
ransposition of A (r) is not as harmless as might be hoped.
Exercise 3.7 Show that the inequality

+ Jv(a)4(a)da,
v(t) are real, continuous functions with v(t) 0 for all I

I,, implies


i
This also is called the Gronwall-Bellman inequality in the literature. Hint: Let

r(t)
and

work with 1(t) r (l)r(t)

do

= 5

Exercise 3.8

Using the inequality in Exercise 3.7, show that with the additional assumption that
is continuously differentiable.

do, t t,,

+5
implies
'r(G)da

+ 5e
Exercise 3.9

constant and

41(O)dc5,

I I,,

Prove the following variation on the inequality in Exercise 3.7. Suppose v is a


w(t), and v (1) are continuous functions with r (I) 0 for all! t0. Then
t

implies
J

(t) we"
Exercise 3.10

Jw(o)c0

do, tt,,

Devise an alternate uniqueness proof for linear state equations as follows. Show

that if
= A (I)z(t)

;(t,,) = 0

then there is a continuous scalar function asuch


(t) that
d

.,

IIz(t)II a(t)IIz(t)II

Then use an argument similar to one in the proof of Lemma 3.2 to conclude that :(t) = 0 for all

Exercise 3.11 Consider the 'integro-di iferential state equation'


= A (t)x(t)

+ 5 E(!,o)x(o) do + B(t)u (i')

E(t,o), and B (i) are n x ,z,


respectively. Given x0, , and a continuous
where A (I),

x(10) =

x ,z, and a x
continuous matrix functions,
x I input signal u(t) defined for t I,,, show that

there is at most one (continuously differentiable) solution. Hint: Consider the equivalent integral
equation and rewrite the double-integral term.
Exercise 3.12

For the linear state equation

k(t) =A(t)x(t) , .v(t,,) =x,,


show that
11.4

IIx(t) II <

Exercise 3.13 Use an estimate of


II

JA(a1)JA(a,)
j4+IT
t

A(a1)da1"dty1i!

and the definition of uniform convergence of a series to show that the Peano-Baker series
converges uniformly to cD(t. c) fort, t e [ T. T], where T > 0 is arbitrary. Hint:

(k 4-f)!

Ic!

Exercise 3.14 For a continuous n x n matrix function A (t), establish existence of an n x n,


continuously-differentiable solution X(r) to the matrix differential equation

X(t) = A (t)X(t) ,

= X,,

by constructing a suitable sequence of approximate solutions, and showing uniform and absolute
convergence on finite intervals of the form [t,,T, t,,T1.

Exercise 3.15 Consider a linear state equation with specified forcing function and specified
two-point boundary conditions
= A (t)x(t)

+ f(t) ,

+ Hjx(tj) =

I,

Here
and H1 are it x n matrices, I, is an n x I vector, and tj> t0. Under what hypotheses does
there exist a solution .v (t) of the state equation that satisfies the boundary conditions? Under what
hypotheses does there exist a unique solution satisfying the boundary conditions? Supposing a
solution exists, outline a strategy for computing it under the assumption that you can compute the
transition matrix for A (1).
Exercise 3.16 Adopt for this exercise a general input-output (zero-state response) notation for a
system: $y(t) = H[u(t)]$. We call such a system linear if $H[u_a(t) + u_b(t)] = H[u_a(t)] + H[u_b(t)]$
for all input signals $u_a(t)$ and $u_b(t)$, and $H[\alpha u(t)] = \alpha H[u(t)]$ for all real numbers $\alpha$ and all
inputs $u(t)$. Show that the first condition implies the second for all rational numbers $\alpha$. Does the
second condition imply the first for any important classes of input signals?

NOTES
Note 3.1 In this chapter we are retracing particular aspects of the classical mathematics of
ordinary differential equations. Any academic library contains several shelf-feet of reference
material. To see the depth and breadth of the subject, consult for instance

P. Hartman, Ordinary Differential Equations, Second Edition, Birkhauser, Boston, 1982

The following two books treat the subject at a less-advanced level, and they are oriented toward
engineering. The first is more introductory than the second.

R.K. Miller, A.N. Michel, Ordinary Differential Equations, Academic Press, New York, 1982
D.L. Lukes, Differential Equations: Classical to Controlled, Academic Press, New York, 1982

Note 3.2 The default continuity assumptions on linear state equations, adopted to keep
technical detail simple, can be weakened without changing the form of the theory. (However
some proofs must be changed.) For example the entries of $A(t)$ might be only piecewise
continuous because of switching in the physical system being modeled. In this situation our
requirement of continuous-differentiability on solutions is too restrictive, and a continuous $x(t)$
can satisfy the state equation everywhere except for isolated values of $t$. The books by Hartman
and Lukes cited in Note 3.1 treat more general formulations. On the other hand one can weaken
the hypotheses too much, so that important features are lost. The scalar linear state equation

.v(0)=0
is such that

x(t) =
a solution for every real number a, a highly nonunique solution indeed.

Note 3.3 The transition matrix for $A(t)$ can be defined without explicitly involving the
Peano-Baker series. This is done by considering the solution of the linear state equation for $n$ linearly
independent initial states. Arranging the $n$ solutions as the columns of an $n \times n$ matrix $X(t)$, called
a fundamental matrix, it can be shown that $\Phi(t, t_0) = X(t)X^{-1}(t_0)$. See, for example, the book by
Miller and Michel cited in Note 3.1, or

L.A. Zadeh, C.A. Desoer, Linear System Theory, McGraw-Hill, New York, 1963

Use of the Peano-Baker series to define the transition matrix and develop solution properties was
emphasized for the system theory community in

R.W. Brockett, Finite Dimensional Linear Systems, John Wiley, New York, 1970

Note 3.4 Suppose for constants a,

0 the continuous, nonnegative function

satisfies

t E [ta, a'j]
Then the inequality
,

is established

a' e [, tf]

(by a technique very different from the proof of Lemma 3.2) in

T.H. Gronwall, "Note on the derivatives with respect to a parameter of the solutions of a system
of differential equations," Annals of Mathematics, Vol. 20, pp. 292 296, 1919

The inequality in Lemma 3.2, with additional assumptions of nonnegativity of q(t),


appears as the "fundamental lemma" in Chapter 2 of
R. Bellman, Stability Theory of Differential Equations, McGraw-Hill, New York, 1953

and

0,

and appears in earlier publications of Bellman. At least one prior source for the inequality is

W.T. Reid, "Properties of the solutions of an infinite system of ordinary linear differential
equations of the first order with auxiliary boundary conditions," Transactions of the American
Mathematical Society, Vol. 32, pp. 284-318, 1930

Attribution aside, applications in system theory of these inequalities, and their extensions in the
Exercises, abound.

Note 3.5

Exercise 3.15 introduces the notion of boundary-value problems in differential
equations, an important topic that we do not pursue. For both basic theory and numerical
approaches, consult

U.M. Ascher, R.M.M. Mattheij, R.D. Russell, Numerical Solution of Boundary Value Problems for
Ordinary Differential Equations, Prentice-Hall, Englewood Cliffs, New Jersey, 1988
Note 3.6 Our focus in the next two chapters is on developing theoretical properties of transition
matrices. These properties aside there are many commercial simulation packages containing
effective, efficient numerical algorithms for solving linear state equations. Via the prosaic device
of computing solutions for various initial states, say $e_1, \ldots, e_n$, any of these packages can provide
a numerical solution for the transition matrix as a function of one argument. Of course the
solution of a linear state equation with specified initial state and specified input signal can be
calculated and displayed by these simulation packages, often at the click of a mouse in a
colorful window environment.

4
TRANSITION MATRIX
PROPERTIES

Properties of linear state equations rest on properties of transition matrices, and the
complicated form of the Peano-Baker series

$$ \Phi(t, \tau) = I + \int_\tau^t A(\sigma_1)\, d\sigma_1 + \int_\tau^t A(\sigma_1) \int_\tau^{\sigma_1} A(\sigma_2)\, d\sigma_2\, d\sigma_1 + \cdots $$

tends to mask marvelous features that can be gleaned from careful study. After pointing
out two important special cases, general properties of $\Phi(t, \tau)$ (holding for any
continuous matrix function $A(t)$) are developed in this chapter. Further properties in the
special cases of constant and periodic $A(t)$ are discussed in Chapter 5.

Two Special Cases


Before developing a list of properties, it might help to connect the general form of the
transition matrix to a simpler, perhaps-familiar case. If $A(t) = A$, a constant matrix,
then a typical term in the Peano-Baker series becomes

$$ \int_\tau^t A \int_\tau^{\sigma_1} A \cdots \int_\tau^{\sigma_{k-1}} A\, d\sigma_k \cdots d\sigma_1 = A^k \int_\tau^t \int_\tau^{\sigma_1} \cdots \int_\tau^{\sigma_{k-1}} 1\, d\sigma_k \cdots d\sigma_1 = \frac{A^k (t - \tau)^k}{k!} $$

With this observation our first property inherits a convergence proof from the treatment
of Peano-Baker series in Chapter 3. However, to emphasize the importance of the
time-invariant case, we specialize the general convergence analysis and present the proof
again.

4.1 Property If $A(t) = A$, an $n \times n$ constant matrix, then the transition matrix is

$$ \Phi(t, \tau) = e^{A(t - \tau)} \tag{1} $$

where the matrix exponential is defined by the power series

$$ e^{At} = \sum_{k=0}^{\infty} \frac{1}{k!} A^k t^k \tag{2} $$

that converges uniformly and absolutely on $[-T, T]$, where $T > 0$ is arbitrary.

Proof On any time interval $[-T, T]$, the matrix functions in the series (2) are
bounded according to

$$ \left\| \frac{1}{k!} A^k t^k \right\| \leq \frac{\|A\|^k T^k}{k!}, \quad k = 0, 1, \ldots $$

Since the bounding series of real numbers converges,

$$ \sum_{k=0}^{\infty} \frac{\|A\|^k T^k}{k!} = e^{\|A\| T} $$

we have from the Weierstrass M-test that the series in (2) converges uniformly and
absolutely on $[-T, T]$.

DOD
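The defining power series (2) can be summed directly for small examples. The sketch below is illustrative only: the helper names and the truncation at 30 terms are our assumptions, and the test matrix is chosen because its exponential is the familiar rotation matrix with entries $\cos t$ and $\pm\sin t$.

```python
import math

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(A, t, terms=30):
    # Partial sum of the power series: sum over k of (A t)^k / k!.
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]   # k = 0 term
    term = [row[:] for row in result]
    At = [[a * t for a in row] for row in A]
    for k in range(1, terms):
        # Build (A t)^k / k! from the previous term to avoid large factorials.
        term = [[x / k for x in row] for row in mat_mul(term, At)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

A = [[0.0, 1.0], [-1.0, 0.0]]
Phi = expm_series(A, 2.0)
print(Phi)
print([[math.cos(2.0), math.sin(2.0)], [-math.sin(2.0), math.cos(2.0)]])
```

Summing the series term by term is adequate for a small, well-scaled matrix like this one, though it is not a recommended general-purpose numerical method.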
Because of the convergence properties of the defining power series (2), the matrix
exponential $e^{At}$ is analytic on any finite time interval. Thus the zero-input solution of a
time-invariant linear state equation is analytic on any finite time interval.
Properties of the transition matrix in the general case will suggest that $\Phi(t, \tau)$ is
as close to being an exponential, without actually being an exponential, as could be
hoped. A formula for $\Phi(t, \tau)$ that involves another special class of $A(t)$-matrices
supports this prediction, and provides a generalization of the diagonal case considered in
Example 3.6.

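4.2 Property If for every $t$ and $\tau$,

$$ A(t) \int_\tau^t A(\sigma)\, d\sigma = \int_\tau^t A(\sigma)\, d\sigma\, A(t) \tag{3} $$

then

$$ \Phi(t, \tau) = \exp\left[ \int_\tau^t A(\sigma)\, d\sigma \right] = \sum_{k=0}^{\infty} \frac{1}{k!} \left[ \int_\tau^t A(\sigma)\, d\sigma \right]^k \tag{4} $$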

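Proof Our strategy, motivated by Example 3.6, is to show that the commutativity
condition (3) implies, for any nonnegative integer $j$,

$$ \int_\tau^t A(\sigma)\, \frac{1}{j!} \left[ \int_\tau^\sigma A(\zeta)\, d\zeta \right]^j d\sigma = \frac{1}{(j+1)!} \left[ \int_\tau^t A(\sigma)\, d\sigma \right]^{j+1} \tag{5} $$

Then using this identity repeatedly on a general term of the Peano-Baker series (from the
right, for $j = 1, 2, \ldots$) gives

$$ \int_\tau^t A(\sigma_1) \int_\tau^{\sigma_1} A(\sigma_2) \cdots \int_\tau^{\sigma_{k-1}} A(\sigma_k)\, d\sigma_k \cdots d\sigma_1 = \int_\tau^t A(\sigma_1) \cdots \int_\tau^{\sigma_{k-3}} A(\sigma_{k-2})\, \frac{1}{2!} \left[ \int_\tau^{\sigma_{k-2}} A(\sigma)\, d\sigma \right]^2 d\sigma_{k-2} \cdots d\sigma_1 $$

and so on, yielding

$$ \frac{1}{k!} \left[ \int_\tau^t A(\sigma)\, d\sigma \right]^k $$

Of course this is the corresponding general term of the exponential series in (4).
To show (5), first note that it holds at $t = \tau$, for any fixed value of $\tau$. Before
continuing, we emphasize again that the tempting chain rule calculation generally is not
valid for matrix calculus. However the product rule and Leibniz rule for differentiation
are valid, and differentiating the left side of (5) with respect to $t$ gives

$$ \frac{d}{dt} \int_\tau^t A(\sigma)\, \frac{1}{j!} \left[ \int_\tau^\sigma A(\zeta)\, d\zeta \right]^j d\sigma = A(t)\, \frac{1}{j!} \left[ \int_\tau^t A(\sigma)\, d\sigma \right]^j $$

Differentiating the right side of (5) gives

$$ \frac{d}{dt}\, \frac{1}{(j+1)!} \left[ \int_\tau^t A(\sigma)\, d\sigma \right]^{j+1} = \frac{1}{(j+1)!} \left\{ A(t) \left[ \int_\tau^t A(\sigma)\, d\sigma \right]^j + \int_\tau^t A(\sigma)\, d\sigma\, A(t) \left[ \int_\tau^t A(\sigma)\, d\sigma \right]^{j-1} + \cdots + \left[ \int_\tau^t A(\sigma)\, d\sigma \right]^j A(t) \right\} = A(t)\, \frac{1}{j!} \left[ \int_\tau^t A(\sigma)\, d\sigma \right]^j $$

where, in the last step, (3) has been used repeatedly to rewrite each of the $j + 1$ terms in
the same form. Therefore we have that the left and right sides of (5) are continuously
differentiable, have identical derivatives for all $t$, and agree at $t = \tau$. Thus the left and
right sides of (5) are identical functions of $t$ for any value of $\tau$, and the proof is complete.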

DOD
For $n = 1$, where every $A(t)$ commutes with its integral, the 'transition scalar'

$$ \Phi(t, \tau) = \exp\left[ \int_\tau^t A(\sigma)\, d\sigma \right] $$

often appears in elementary mathematics courses as an integrating factor in solving
linear differential equations. We first encountered this exponential in the proof of
Lemma 3.2, and then again in Example 3.6.
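4.3 Example For

$$ A(t) = \begin{bmatrix} a(t) & a(t) \\ 0 & 0 \end{bmatrix} $$

where $a(t)$ is a continuous scalar function, it is easy to check that the commutativity
condition (3) is satisfied. Since

$$ \int_\tau^t A(\sigma)\, d\sigma = \begin{bmatrix} \int_\tau^t a(\sigma)\, d\sigma & \int_\tau^t a(\sigma)\, d\sigma \\ 0 & 0 \end{bmatrix} $$

the exponential series (4) is not difficult to sum, giving

$$ \Phi(t, \tau) = I + \sum_{k=1}^{\infty} \frac{1}{k!} \left[ \int_\tau^t a(\sigma)\, d\sigma \right]^k \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} \exp\left[ \int_\tau^t a(\sigma)\, d\sigma \right] & \exp\left[ \int_\tau^t a(\sigma)\, d\sigma \right] - 1 \\ 0 & 1 \end{bmatrix} $$

If $a(t)$ is a constant, say $a(t) = a$, then

$$ \Phi(t, \tau) = \begin{bmatrix} e^{a(t - \tau)} & e^{a(t - \tau)} - 1 \\ 0 & 1 \end{bmatrix} $$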

General Properties
While vector linear differential equations (linear state equations) have been the sole
topic so far, it proves useful to also consider matrix differential equations. That is, given
$A(t)$, an $n \times n$ continuous matrix function, we consider

$$ \dot{X}(t) = A(t)X(t), \quad X(t_0) = X_0 \tag{9} $$

where $X(t)$ is an $n \times n$ matrix function. Of course (9) can be viewed column-by-column,
yielding a set of $n$ linear state equations. But a direct matrix representation of
the solution is of interest. So with the observation that the column-by-column
interpretation yields existence and uniqueness of solutions via Theorem 3.3, the
following property is straightforward to verify by differentiation, and provides a useful
characterization of the transition matrix.

4.4 Property The linear $n \times n$ matrix differential equation

$$ \frac{d}{dt} X(t) = A(t)X(t), \quad X(t_0) = I $$

has the unique, continuously-differentiable solution

$$ X(t) = \Phi_A(t, t_0) $$

When the initial condition matrix is not the identity, but $X_0$ as in (9), then the
easily verified, unique solution is $X(t) = \Phi_A(t, t_0)X_0$.
Property 4.4 as well as the solution of the linear state equation

$$ \dot{x}(t) = A(t)x(t), \quad x(t_0) = x_0 \tag{11} $$

focus on the behavior of the transition matrix $\Phi(t, \tau)$ as a function of its first argument.
It is not difficult to pose a differential equation whose solution displays the behavior of
$\Phi(t, \tau)$ with respect to the second argument.

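4.5 Property The linear $n \times n$ matrix differential equation

$$ \frac{d}{dt} Z(t) = -A^T(t)Z(t), \quad Z(t_0) = I $$

has the unique, continuously-differentiable solution

$$ Z(t) = \Phi_A^T(t_0, t) $$

Verification of this property is left as an exercise, with the note that Exercise 3.2
provides the key to differentiating $Z(t)$. The associated $n \times 1$ linear state equation

$$ \dot{z}(t) = -A^T(t)z(t), \quad z(t_0) = z_0 $$

is called the adjoint state equation for the linear state equation (11). Obviously the
unique solution of the adjoint state equation is

$$ z(t) = \Phi_{-A^T}(t, t_0)z_0 = \Phi_A^T(t_0, t)z_0 $$

4.6 Example For

$$ A(t) = \begin{bmatrix} 1 & \cos t \\ 0 & 0 \end{bmatrix} $$

Property 4.2 does not apply. Writing out the first four terms of the Peano-Baker series
gives

$$ \Phi_A(t, 0) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} t & \sin t \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} t^2/2 & 1 - \cos t \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} t^3/3! & t - \sin t \\ 0 & 0 \end{bmatrix} + \cdots $$

where $\tau = 0$ has been assumed for simplicity. It is dangerous to guess the sum of this
series, particularly the 1,2-entry, but Property 4.4 provides the relation

$$ \frac{d}{dt} \Phi_A(t, 0) = A(t)\Phi_A(t, 0), \quad \Phi_A(0, 0) = I $$

that aids intelligent conjecture. Indeed,

$$ \Phi_A(t, 0) = \begin{bmatrix} e^t & (e^t + \sin t - \cos t)/2 \\ 0 & 1 \end{bmatrix} $$

This is not quite enough to provide $\Phi_A(t, \tau)$ as an explicit function of $\tau$, and therefore
Property 4.5 cannot be used to obtain for free the transition matrix for

$$ -A^T(t) = \begin{bmatrix} -1 & 0 \\ -\cos t & 0 \end{bmatrix} $$

However writing out the first few terms of the relevant Peano-Baker series and guessing
with the aid of Property 4.4 yields

$$ \Phi_{-A^T}(t, 0) = \begin{bmatrix} e^{-t} & 0 \\ -1/2 + e^{-t}(\cos t - \sin t)/2 & 1 \end{bmatrix} $$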

Property 4.4 leads directly to a clever proof of the following composition property.
(Attempting a brute-force proof using the Peano-Baker series is not recommended.)

4.7 Property For every $t$, $\tau$, and $\sigma$, the transition matrix for $A(t)$ satisfies

$$ \Phi(t, \tau) = \Phi(t, \sigma)\,\Phi(\sigma, \tau) \tag{16} $$

Proof Choosing arbitrary but fixed values of $\tau$ and $\sigma$, let $R(t) = \Phi(t, \sigma)\Phi(\sigma, \tau)$.
Then for all $t$,

$$ \frac{d}{dt} R(t) = A(t)\Phi(t, \sigma)\Phi(\sigma, \tau) = A(t)R(t) $$

and, of course,

$$ \frac{d}{dt} \Phi(t, \tau) = A(t)\Phi(t, \tau) $$

Also the 'initial conditions' at $t = \sigma$ are the same for both $R(t)$ and $\Phi(t, \tau)$, since
$R(\sigma) = \Phi(\sigma, \sigma)\Phi(\sigma, \tau) = \Phi(\sigma, \tau)$. Then by the uniqueness of solutions to linear matrix
differential equations, we have $R(t) = \Phi(t, \tau)$, for all $t$. Since this argument works for
every value of $\tau$ and $\sigma$, the proof is complete.

and

DOD
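The composition property lends itself to a numerical check. The sketch below assumes the coefficient matrix $A(t)$ with rows $[1, \cos t]$ and $[0, 0]$ and computes transition matrices by Runge-Kutta integration of the matrix differential equation $\dot{X}(t) = A(t)X(t)$, $X(\tau) = I$. The function names, step counts, and sample times are our own choices. The closed-form 1,2-entry used for comparison, $e^{t-\tau}(\cos\tau - \sin\tau)/2 + (\sin t - \cos t)/2$, follows from solving the first scalar equation with the constant second state as a forcing term.

```python
import math

def A(t):
    return [[1.0, math.cos(t)], [0.0, 0.0]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combo(X, c, K):
    # Entry-by-entry X + c*K.
    return [[X[i][j] + c * K[i][j] for j in range(2)] for i in range(2)]

def transition(t, tau, steps=2000):
    # RK4 on dX/dt = A(t) X with X(tau) = I; a negative step handles t < tau.
    h = (t - tau) / steps
    X, s = [[1.0, 0.0], [0.0, 1.0]], tau
    for _ in range(steps):
        k1 = mat_mul(A(s), X)
        k2 = mat_mul(A(s + h / 2), combo(X, h / 2, k1))
        k3 = mat_mul(A(s + h / 2), combo(X, h / 2, k2))
        k4 = mat_mul(A(s + h), combo(X, h, k3))
        for K, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            X = combo(X, w * h / 6, K)
        s += h
    return X

t, tau, sigma = 2.0, 0.5, -1.0
lhs = transition(t, tau)
rhs = mat_mul(transition(t, sigma), transition(sigma, tau))
gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
closed_12 = (math.exp(t - tau) * (math.cos(tau) - math.sin(tau)) / 2
             + (math.sin(t) - math.cos(t)) / 2)
print(gap, lhs[0][1], closed_12)
```

Note that the intermediate time $\sigma = -1$ lies outside the interval between $\tau$ and $t$, so the check also exercises integration backward in time.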
The approach in this proof is a useful extension of the approach in the proof of
Property 4.2. That is, to prove that two continuously-differentiable functions are
identical show that they agree at one point, that they satisfy the same linear differential
equation, and then invoke uniqueness of solutions.
Property 4.7 can be interpreted in terms of a composition rule for solutions of the
corresponding linear state equation (11), a notion encountered in Example 3.9. In (16)
let $\tau = t_0$, $\sigma = t_1$, and $t = t_2 > t_1$. Then, as shown in Figure 4.8, the composition
property implies that the solution of (11) at time $t_2$ can be represented as

$$ x(t_2) = \Phi(t_2, t_0)x(t_0) $$

or as

$$ x(t_2) = \Phi(t_2, t_1)x(t_1) $$

where

$$ x(t_1) = \Phi(t_1, t_0)x(t_0) $$

This interpretation also applies when, for instance, $t_1 < t_0$ by following trajectories
backward in time.

4.8 Figure An illustration of the composition property.

The composition property can be applied to establish invertibility of transition


matrices, but the next property and its proof are of surpassing elegance in this regard.
(Recall the definition of the trace of a matrix in Chapter 1.)
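4.9 Property For every $t$ and $\tau$ the transition matrix for $A(t)$ satisfies

$$ \det \Phi(t, \tau) = \exp\left[ \int_\tau^t \mathrm{tr}\,[A(\sigma)]\, d\sigma \right] \tag{17} $$

Proof The key to the proof is to show that for any fixed $\tau$ the scalar function
$\det \Phi(t, \tau)$ satisfies the scalar differential equation

$$ \frac{d}{dt} \det \Phi(t, \tau) = \mathrm{tr}\,[A(t)] \det \Phi(t, \tau), \quad \det \Phi(\tau, \tau) = 1 \tag{18} $$

Then (17) follows from Property 4.2, that is, from the solution of the scalar differential
equation (18).
To proceed with differentiation of $\det \Phi(t, \tau)$, where $\tau$ is fixed, we use the chain
rule with the following notation. Let $C_{ij}(t, \tau)$ be the cofactor of the entry $\phi_{ij}(t, \tau)$ of
$\Phi(t, \tau)$, and denote the $i,j$-entry of the transpose of the cofactor matrix $C(t, \tau)$ by
$c_{ij}(t, \tau)$. (That is, $c_{ij} = C_{ji}$.) Recognizing that the determinant is a differentiable
function of matrix entries, in particular it is a sum of products of entries, the chain rule
gives

$$ \frac{d}{dt} \det \Phi(t, \tau) = \sum_{i=1}^{n} \sum_{j=1}^{n} \left[ \frac{\partial}{\partial \phi_{ij}} \det \Phi(t, \tau) \right] \dot{\phi}_{ij}(t, \tau) \tag{19} $$

For any $j = 1, \ldots, n$, computation of the Laplace expansion of the determinant along
the $j^{th}$ column gives

$$ \det \Phi(t, \tau) = \sum_{i=1}^{n} C_{ij}(t, \tau)\, \phi_{ij}(t, \tau) $$

so that

$$ \frac{\partial}{\partial \phi_{ij}} \det \Phi(t, \tau) = C_{ij}(t, \tau) = c_{ji}(t, \tau) $$

Therefore

$$ \frac{d}{dt} \det \Phi(t, \tau) = \sum_{j=1}^{n} \sum_{i=1}^{n} c_{ji}(t, \tau)\, \dot{\phi}_{ij}(t, \tau) $$

The double summation on the right side can be rewritten to obtain

$$ \frac{d}{dt} \det \Phi(t, \tau) = \mathrm{tr}\left[ C^T(t, \tau)\, \dot{\Phi}(t, \tau) \right] = \mathrm{tr}\left[ C^T(t, \tau) A(t) \Phi(t, \tau) \right] = \mathrm{tr}\left[ \Phi(t, \tau) C^T(t, \tau) A(t) \right] $$

(The last step uses the fact that the trace of a product of square matrices is independent
of the ordering of the product.) Now the identity

$$ I \det \Phi(t, \tau) = \Phi(t, \tau)\, C^T(t, \tau) $$

which is a consequence of the Laplace expansion of the determinant, gives

$$ \frac{d}{dt} \det \Phi(t, \tau) = \mathrm{tr}\,[A(t)] \det \Phi(t, \tau) $$

Since, trivially, $\det \Phi(\tau, \tau) = 1$, the proof is complete.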

4.10 Property The transition matrix for $A(t)$ is invertible for every $t$ and $\tau$, and

$$ \Phi^{-1}(t, \tau) = \Phi(\tau, t) \tag{21} $$

Proof Invertibility follows from Property 4.9, since $A(t)$ is continuous and thus the
exponent in (17) is finite for any finite $t$ and $\tau$. The formula for the inverse follows from
Property 4.7 by taking $t = \tau$ in (16).
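Both the determinant formula (17) and the inverse formula (21) can be checked numerically. In this sketch the coefficient matrix, with rows $[0, 1]$ and $[-1, -t]$, is a hypothetical example chosen so that $\mathrm{tr}\,[A(t)] = -t$ and the exponent in (17) is $-(t^2 - \tau^2)/2$. Transition matrices are again obtained by Runge-Kutta integration of the matrix differential equation, and all names are ours.

```python
import math

def A(t):
    # Hypothetical time-varying coefficient matrix with trace -t.
    return [[0.0, 1.0], [-1.0, -t]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combo(X, c, K):
    # Entry-by-entry X + c*K.
    return [[X[i][j] + c * K[i][j] for j in range(2)] for i in range(2)]

def transition(t, tau, steps=2000):
    # RK4 on dX/dt = A(t) X with X(tau) = I.
    h = (t - tau) / steps
    X, s = [[1.0, 0.0], [0.0, 1.0]], tau
    for _ in range(steps):
        k1 = mat_mul(A(s), X)
        k2 = mat_mul(A(s + h / 2), combo(X, h / 2, k1))
        k3 = mat_mul(A(s + h / 2), combo(X, h / 2, k2))
        k4 = mat_mul(A(s + h), combo(X, h, k3))
        for K, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            X = combo(X, w * h / 6, K)
        s += h
    return X

t, tau = 2.0, 0.5
Phi = transition(t, tau)
det_numeric = Phi[0][0] * Phi[1][1] - Phi[0][1] * Phi[1][0]
det_formula = math.exp(-(t * t - tau * tau) / 2)   # exp of the integral of the trace
product = mat_mul(Phi, transition(tau, t))         # should be close to the identity
print(det_numeric, det_formula)
print(product)
```

The determinant check needs only the trace of $A(t)$, not the transition matrix itself, which is why (17) gives invertibility without computing $\Phi(t, \tau)$.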
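4.11 Example These last few properties provide the steps needed to compute the
transition matrices in Example 4.6 as functions of two arguments. Beginning with

$$ A(t) = \begin{bmatrix} 1 & \cos t \\ 0 & 0 \end{bmatrix}, \quad \Phi_A(t, 0) = \begin{bmatrix} e^t & (e^t + \sin t - \cos t)/2 \\ 0 & 1 \end{bmatrix} \tag{22} $$

From Property 4.7,

$$ \Phi_A(t, \tau) = \Phi_A(t, 0)\,\Phi_A(0, \tau) $$

and then Property 4.10 gives, after computing the inverse of $\Phi_A(\tau, 0)$,

$$ \Phi_A(t, \tau) = \Phi_A(t, 0)\,\Phi_A^{-1}(\tau, 0) = \begin{bmatrix} e^{t - \tau} & e^{t - \tau}(\cos\tau - \sin\tau)/2 + (\sin t - \cos t)/2 \\ 0 & 1 \end{bmatrix} \tag{23} $$

Alternatively we can obtain $\Phi_A(0, \tau)$ from Example 4.6 as

$$ \Phi_A(0, \tau) = \left[ \Phi_{-A^T}(\tau, 0) \right]^T $$

Similarly $\Phi_{-A^T}(t, \tau)$ can be computed directly from $\Phi_A(t, \tau)$ via Property 4.5.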

State Variable Changes


Often changes of state variables are of interest, and to stay within the class of linear state
equations, only linear, time-dependent variable changes are considered. That is, for

$$ \dot{x}(t) = A(t)x(t), \quad x(t_0) = x_0 \tag{24} $$

suppose a new state vector is defined by

$$ z(t) = P^{-1}(t)x(t) $$

where the $n \times n$ matrix $P(t)$ is invertible and continuously differentiable at each $t$.
(Both assumptions are used explicitly in the following.) To find the state equation in
terms of $z(t)$, write $x(t) = P(t)z(t)$ and differentiate to obtain

$$ \dot{x}(t) = \dot{P}(t)z(t) + P(t)\dot{z}(t) $$

Also $A(t)x(t) = A(t)P(t)z(t)$, so substituting into the original state equation leads to

$$ \dot{z}(t) = \left[ P^{-1}(t)A(t)P(t) - P^{-1}(t)\dot{P}(t) \right] z(t), \quad z(t_0) = P^{-1}(t_0)x_0 \tag{25} $$

This little calculation, and the juxtaposition of the linear state equations (24) and (25) in
Figure 4.12, should motivate the relation between the respective transition matrices.

4.12 Figure State variable change produces an equivalent linear state equation.

4.13 Property Suppose $P(t)$ is a continuously-differentiable, $n \times n$ matrix function
such that $P^{-1}(t)$ exists for every value of $t$. Then the transition matrix for

$$ F(t) = P^{-1}(t)A(t)P(t) - P^{-1}(t)\dot{P}(t) \tag{26} $$

is given by

$$ \Phi_F(t, \tau) = P^{-1}(t)\,\Phi_A(t, \tau)\,P(\tau) \tag{27} $$

Proof First note that $F(t)$ in (26) is continuous, so the default assumptions are
maintained. Then, for arbitrary but fixed $\tau$, let

$$ X(t) = P^{-1}(t)\,\Phi_A(t, \tau)\,P(\tau) $$

Clearly $X(\tau) = I$, and differentiating with the aid of Exercise 1.17 gives

$$ \dot{X}(t) = -P^{-1}(t)\dot{P}(t)P^{-1}(t)\Phi_A(t, \tau)P(\tau) + P^{-1}(t)A(t)\Phi_A(t, \tau)P(\tau) $$
$$ = \left[ P^{-1}(t)A(t)P(t) - P^{-1}(t)\dot{P}(t) \right] P^{-1}(t)\Phi_A(t, \tau)P(\tau) $$
$$ = F(t)X(t) $$

Since this is valid for any $\tau$, by the characterization of transition matrices provided in
Property 4.4 the proof is complete.

4.14 Example A state variable change can be used to derive the solution 'guessed' in
Chapter 3 for a linear state equation with nonzero input. Beginning with

$$ \dot{x}(t) = A(t)x(t) + B(t)u(t), \quad x(t_0) = x_0 \tag{28} $$

let

$$ z(t) = \Phi^{-1}(t, t_0)\,x(t) $$

where it is clear that $P(t) = \Phi(t, t_0)$ satisfies all the hypotheses required for a state
variable change. Substituting into (28) yields

$$ A(t)\Phi(t, t_0)z(t) + \Phi(t, t_0)\dot{z}(t) = A(t)\Phi(t, t_0)z(t) + B(t)u(t), \quad z(t_0) = x_0 $$

or

$$ \dot{z}(t) = \Phi^{-1}(t, t_0)B(t)u(t), \quad z(t_0) = x_0 \tag{29} $$

Both sides can be integrated from $t_0$ to $t$ to obtain

$$ z(t) - x_0 = \int_{t_0}^t \Phi^{-1}(\sigma, t_0)B(\sigma)u(\sigma)\, d\sigma $$

Replacing $z(t)$ by $\Phi^{-1}(t, t_0)x(t)$ and rearranging using properties of the transition matrix
gives

$$ x(t) = \Phi(t, t_0)x_0 + \int_{t_0}^t \Phi(t, \sigma)B(\sigma)u(\sigma)\, d\sigma $$

Of course if there is an output equation

$$ y(t) = C(t)x(t) + D(t)u(t) $$

then we obtain immediately the complete solution formula for the output signal:

$$ y(t) = C(t)\Phi(t, t_0)x_0 + \int_{t_0}^t C(t)\Phi(t, \sigma)B(\sigma)u(\sigma)\, d\sigma + D(t)u(t) \tag{30} $$

This variable change argument can be viewed as an 'integrating factor' approach,
as so often used in the scalar case. An expression equivalent to (28) is

$$ \Phi^{-1}(t, t_0)\left[ \dot{x}(t) - A(t)x(t) \right] = \Phi^{-1}(t, t_0)B(t)u(t), \quad x(t_0) = x_0 $$

and this simply is another form of (29).


EXERCISES
Exercise 4.1  For what $A(t)$ is

$$\Phi(t, \tau) = \begin{bmatrix} \cos(t - \tau) & \sin(t - \tau) \\ -\sin(t - \tau) & \cos(t - \tau) \end{bmatrix}$$

Can this transition matrix be expressed as a matrix exponential?

Exercise 4.2  If the $n \times n$ matrix function $X(t)$ is a solution of the matrix differential equation

$$\dot X(t) = A(t)X(t), \quad X(t_0) = X_0$$

show that
(a) if $X_0$ is invertible, then $X(t)$ is invertible for all $t$,
(b) if $X_0$ is invertible, then for any $t$ and $\tau$ the transition matrix for $A(t)$ is given by

$$\Phi(t, \tau) = X(t)X^{-1}(\tau)$$

Exercise 4.3  If $x(t)$ and $z(t)$ are the respective solutions of a linear state equation and its adjoint state equation, with initial conditions $x(t_0) = x_0$ and $z(t_0) = z_0$, derive a formula for $z^T(t)x(t)$.

Exercise 4.4  Compute the adjoint of the $n^{th}$-order scalar differential equation

$$y^{(n)}(t) + a_{n-1}(t)y^{(n-1)}(t) + \cdots + a_0(t)y(t) = 0$$

by converting the adjoint of the corresponding linear state equation back into an $n^{th}$-order scalar differential equation.
Exercise 4.5 For the time-invariant linear state equation

show that given an x,, there exists a constant a such that

det [x(t) Ax(t)

A"_lx(t)]

Exercise 4.6 For the ii x n matrix differential equation

X(t) =X(t)A(t) , X(t0) =X0

the (unique) solution in terms of an appropriate transition matrix. Use this to determine a
complete solution formula for the n x n matrix differential equation
express

X(r) = A 1(r)X(t) +
Exercise 4.7

F(t)

Show that
JA(a)da

X(r)=e

is a solution of the n x n matrix equation


X(t) =A(t)X(t)
if F is a constant matrix that satisfies

X(t0) =


A(t)
(This can be useful


F =0, k = 1,2,

if F has many zero entries.)

Exercise 4.8  For a continuous $n \times n$ matrix $A(t)$, prove that

for all $t$ and $\tau$ if and only if

$$A(t)A(\tau) = A(\tau)A(t)$$

for all $t$ and $\tau$.
Exercise 4.9 Compute 1(t, 0) for

A(t)=
where a (1) is a continuous scalar function. Hint: Recognize the subsequences of even powers and
odd powers.

Exercise 4.10

Show that the time-varying linear state equation

=A(t)x(t)
can be transformed to a time-invariant linear state equation by a state variable change if and only
if the transition matrix for A (t) can be written in the form
0) =

where R is an ii x n constant matrix, and T(z) is ii x n and invertible at each 1.


Exercise 4.11 Suppose A (t) is n x n and continuously differentiable. Prove that the transition
matrix for A (r) can be written as

cl(t, 0) =
where A and A, are constant n x ii matrices, if and only if

A(t)=AA(t)Ao)A

A(0)=A1 +A2

Exercise 4.12 Suppose A and A, are constant n x n matrices and that A (t) satisfies

A(t)=AA(t)A(t)A

A(0)=A1 +A2

Show that the linear state equation i-(t) = A(t)x(t) can be transformed to
variable change.

Exercise 4.13

Show that if A (i') is partitioned as


A Ct)

where A1 (t) and

22(t) are square, then

A11(t) A12(i)
0
A22(t)

= A2:(t) by

a state


t) D12(I, t)

C)

where

t), j = 1, 2

t) =
Can you find an expression for

Exercise 4.14

r)

in terms of

t)? Hint: Use Exercise

Using Exercise 4.13. prove that

F(t) = e"
is

(t, t) and

given by

B do

0), the upper-right partition of the transition matrix for

AB

00

Exercise 4.15 Compute the transition matrix for


1
l
0
0 sint
0
cost
0

A(i)=
Hint. Apply the result of Exercise 4.13.

Exercise 4.16 Compute cb(t, 0) for


o

What

are

the pointwise-in-time eigenvalues of A (t)? For every initial state .v0, are solutions of

=A(t)x(t)

x(0) =x(,

bounded for t 0?

Exercise4.17 Show that the linear state equations

x(t)

are related by a change of state variables.

Exercise 4.18 For A and F constant, n x n matrices, show that the transition matrix for the linear
state equation
=e


ti,) = e Afe(A +Fl(:_ro)eAro

Exercise 4.19

For the linear state equation

i(t)

= A (t)x ()

.v (0) =

with A (a') continuously differentiable, suppose F is a constant, invertible, n x

ii

matrix such that

A(t) +A2(t)=FA(t)
Show that the solution of the state equation is given by

.v(t) = [I + F

I)A (0)] x0

Hint: Consider 1(t).

Exercise 4.20 Show that the transition matrix forA

(a')

+ A,(t) can be written

as

t) = cDA(I,

0)

Exercise 4.21 Given a continuous pi x n matrix A (t) and a constant n x n matrix F, show how to
define a state variable change that transforms the linear state equation

=A(t)x(i)
into

=Fz(t)
Exercise 4.22 For the linear state equation

i(t) =A(t)x(i) + B(t)u(t), x()


y(t) = C(t)x(i) + D(r)u(t)
If z(t,,) =
suppose state variables are changed according to :(t) =
directly from the complete solution formula that for any zi(t) the response y (1)

show
of

the two state

equations is identical.

Exercise 4.23

Suppose the transition matrix for A (a')

t) = c1?(t, 1)?
Exercise 4.24 For

A(t)=
suppose a> 0 is such that la(t)I a2 for all a'. Show that

IkD(t, r)II

for all I

and t.

is

cD%(t,

t). For what matrix F (a')

is


<a for all t,

Exercise 4.25 If there exists a constant a such that HA


matrix for A (1) can be written as

prove that the transition

cD(t+a, a)=eMa)t + R(t, a), f, a>0


A,(a) is an 'average.'
0+I

A,(a)=+ J A(c)dt
and R (t, a) satisfies

t,a>0
NOTES
Note 4.1  The exponential nature of the transition matrix when $A(t)$ commutes with its integral, Property 4.2, is discussed in greater generality and detail in Chapter 7 of

D.L. Lukes, Differential Equations: Classical to Controlled, Academic Press, New York, 1982

Changes of state variable yielding a new state equation that satisfies the commutativity condition are considered in

J.J. Zhu, C.D. Johnson, "New results in the reduction of linear time-varying dynamical systems," SIAM Journal on Control and Optimization, Vol. 27, No. 3, pp. 476-494, 1989

and a method for computing the resulting exponential is discussed in

J.J. Zhu, C.H. Morales, "On linear ordinary differential equations with functionally commutative coefficient matrices," Linear Algebra and Its Applications, Vol. 170, pp. 81-105, 1992

Note 4.2  A power series representation for the transition matrix is derived in

W.B. Blair, "Series solution to the general linear time varying system," IEEE Transactions on Automatic Control, Vol. 16, No. 2, pp. 210-211, 1971

Note 4.3  Higher-order $n \times n$ matrix differential equations also can be considered. See, for example,

T.M. Apostol, "Explicit formulas for solutions of the second-order matrix differential equation $Y''(t) = AY(t)$," American Mathematical Monthly, Vol. 82, No. 2, pp. 159-162, 1975

Note 4.4  The notion of an adjoint state equation can be connected to the concept of the adjoint of a linear map on an inner product space. Exercise 4.3 indicates this connection, on viewing $z^T x$ as an inner product on $R^n$. For further discussion of the linear-system aspects of adjoints, see Section 9.3 of

T. Kailath, Linear Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1980

5
TWO IMPORTANT CASES

Two classes of transition matrices are addressed in further detail in this chapter. The first is the case of constant $A(t)$, and the second is where $A(t)$ is a periodic matrix function of time. Special properties of the corresponding transition matrices are developed, and implications are drawn for the response characteristics of the associated linear state equations.

Time-Invariant Case
When $A(t) = A$, a constant $n \times n$ matrix, the transition matrix is the matrix exponential

$$\Phi(t, \tau) = e^{A(t - \tau)} = \sum_{k=0}^{\infty} \frac{1}{k!} A^k (t - \tau)^k \tag{1}$$

We first list properties of matrix exponentials that are specializations of general transition matrix properties in Chapter 4, and then introduce some that are not. Since only the difference of arguments $(t - \tau)$ appears in (1), one variable can be discarded with no loss of generality. Therefore in the matrix exponential case we work with

$$\Phi(t, 0) = e^{At}$$

As noted in Chapter 4, this is an analytic function of $t$ on any finite time interval.

The following properties are easy specializations of the properties in Chapter 4.

5.1 Property  The $n \times n$ matrix differential equation

$$\dot X(t) = AX(t), \quad X(0) = I$$

has the unique solution

$$X(t) = e^{At}$$

5.2 Property  The $n \times n$ matrix differential equation

$$\dot Z(t) = -A^T Z(t), \quad Z(0) = I$$

has the unique solution

$$Z(t) = e^{-A^T t}$$

We leave the generalization of these first two properties to arbitrary initial conditions as mild exercises.

5.3 Property  For every $t$ and $\tau$,

$$e^{A(t + \tau)} = e^{At}e^{A\tau}$$

5.4 Property  For every $t$, recalling the definition of the trace of a matrix,

$$\det e^{At} = e^{\mathrm{tr}[A]\,t}$$

5.5 Property  The matrix exponential is invertible for every $t$ (regardless of $A$), and

$$\left[e^{At}\right]^{-1} = e^{-At}$$

5.6 Property  If $P$ is an invertible, constant $n \times n$ matrix, then for every $t$

$$e^{P^{-1}APt} = P^{-1}e^{At}P$$

Several additional properties of matrix exponentials do not devolve from general properties of transition matrices, but depend on specific features of the power series defining the matrix exponential. A few of the most important are developed in detail, with others left to the Exercises.

5.7 Property  If $A$ and $F$ are $n \times n$ matrices, then

$$e^{At}e^{Ft} = e^{(A + F)t} \tag{4}$$

for every $t$ if and only if $AF = FA$.

Proof  Assuming $AF = FA$, first note that

$$e^{(A + F)t}\Big|_{t=0} = e^{At}e^{Ft}\Big|_{t=0} = I$$

Since $F$ commutes also with positive powers of $A$, and thus commutes with the terms in the power series for $e^{At}$,

$$\frac{d}{dt}\left[e^{At}e^{Ft}\right] = Ae^{At}e^{Ft} + e^{At}Fe^{Ft} = (A + F)e^{At}e^{Ft}$$

Clearly $e^{(A + F)t}$ satisfies the same linear matrix differential equation, and by uniqueness of solutions we have (4).

Conversely if (4) holds for every $t$, then differentiating both sides twice gives

$$A^2e^{At}e^{Ft} + 2Ae^{At}Fe^{Ft} + e^{At}F^2e^{Ft} = (A + F)^2e^{(A + F)t}$$

and evaluating at $t = 0$ yields

$$A^2 + 2AF + F^2 = (A + F)^2 = A^2 + AF + FA + F^2$$

Subtracting $A^2 + AF + F^2$ from both sides shows that $A$ and $F$ commute.
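The commutativity condition in Property 5.7 can be probed numerically. The following is a minimal sketch, assuming a truncated power series is an adequate stand-in for the matrix exponential on small, well-scaled matrices; the helper names (`expm`, `product_gap`) and the sample matrices are illustrative choices, not part of the text.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def expm(A, t=1.0, terms=40):
    """Truncated power series  sum_{k < terms} (A t)^k / k!  (an approximation)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term_k = term_{k-1} * (A t / k), so term_k = (A t)^k / k!
        term = mat_mul(term, [[a * t / k for a in row] for row in A])
        result = mat_add(result, term)
    return result

def product_gap(A, F, t=1.0):
    """Largest entry of |e^{At} e^{Ft} - e^{(A+F)t}|."""
    lhs = mat_mul(expm(A, t), expm(F, t))
    rhs = expm(mat_add(A, F), t)
    return max(abs(x - y) for rx, ry in zip(lhs, rhs) for x, y in zip(rx, ry))

A = [[0.0, 1.0], [0.0, 0.0]]
F_commuting = [[2.0, 3.0], [0.0, 2.0]]      # equals 2I + 3A, so it commutes with A
F_noncommuting = [[0.0, 0.0], [1.0, 0.0]]   # AF differs from FA

gap_commuting = product_gap(A, F_commuting)
gap_noncommuting = product_gap(A, F_noncommuting)
```

With these sample matrices the gap for the commuting pair should sit at rounding level, while the noncommuting pair should leave a clearly visible difference.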

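5.8 Property  There exist analytic scalar functions $\alpha_0(t), \ldots, \alpha_{n-1}(t)$ such that

$$e^{At} = \sum_{k=0}^{n-1} \alpha_k(t)A^k \tag{5}$$

Proof  Using Property 5.1, the matrix differential equation characterizing the matrix exponential, we can establish (5) by showing that there exist scalar analytic functions $\alpha_0(t), \ldots, \alpha_{n-1}(t)$ such that

$$\sum_{k=0}^{n-1} \dot\alpha_k(t)A^k = \sum_{k=0}^{n-1} \alpha_k(t)A^{k+1}, \quad \sum_{k=0}^{n-1} \alpha_k(0)A^k = I \tag{6}$$

The Cayley-Hamilton theorem implies

$$A^n = -a_0I - a_1A - \cdots - a_{n-1}A^{n-1}$$

where $a_0, \ldots, a_{n-1}$ are the coefficients in the characteristic polynomial of $A$. Then (6) can be written solely in terms of $I, A, \ldots, A^{n-1}$ as

$$\begin{aligned}
\sum_{k=0}^{n-1} \dot\alpha_k(t)A^k &= \sum_{k=0}^{n-2} \alpha_k(t)A^{k+1} - \sum_{k=0}^{n-1} a_k\alpha_{n-1}(t)A^k \\
&= -a_0\alpha_{n-1}(t)I + \sum_{k=1}^{n-1} [\alpha_{k-1}(t) - a_k\alpha_{n-1}(t)]A^k, \quad \sum_{k=0}^{n-1} \alpha_k(0)A^k = I
\end{aligned} \tag{7}$$

The astute observation to be made is that (7) can be solved by considering the coefficient equation for each power of $A$ separately. Equating coefficients of like powers of $A$ yields the time-invariant linear state equation

$$\begin{bmatrix} \dot\alpha_0(t) \\ \dot\alpha_1(t) \\ \vdots \\ \dot\alpha_{n-1}(t) \end{bmatrix} =
\begin{bmatrix} 0 & \cdots & 0 & -a_0 \\ 1 & \cdots & 0 & -a_1 \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & 1 & -a_{n-1} \end{bmatrix}
\begin{bmatrix} \alpha_0(t) \\ \alpha_1(t) \\ \vdots \\ \alpha_{n-1}(t) \end{bmatrix}, \quad
\begin{bmatrix} \alpha_0(0) \\ \alpha_1(0) \\ \vdots \\ \alpha_{n-1}(0) \end{bmatrix} =
\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

Thus existence of an analytic solution to this linear state equation shows existence of functions $\alpha_0(t), \ldots, \alpha_{n-1}(t)$ that satisfy (7), and hence (6).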

The Laplace transform can be used to develop a more-or-less explicit form for the matrix exponential that provides more insight than the power series definition. We need only deal with Laplace transforms that are rational functions of $s$, that is, ratios of polynomials in $s$. Recall the terminology that a rational function is proper if the degree of the numerator polynomial is no greater than the degree of the denominator polynomial, and strictly proper if the numerator polynomial degree is strictly less than the denominator polynomial degree.

Taking the Laplace transform of both sides of the $n \times n$ matrix differential equation

$$\dot X(t) = AX(t), \quad X(0) = I$$

gives, after rearrangement,

$$X(s) = (sI - A)^{-1}$$

Thus, by uniqueness properties of Laplace transforms, and uniqueness of solutions of linear matrix differential equations, the Laplace transform of $e^{At}$ is $(sI - A)^{-1}$. This is an $n \times n$ matrix of strictly-proper rational functions of $s$, as is clear from counting polynomial-entry degrees in the formula

$$(sI - A)^{-1} = \frac{\mathrm{adj}\,(sI - A)}{\det\,(sI - A)}$$

Specifically $\det\,(sI - A)$ is a degree-$n$ polynomial in $s$, while each entry of $\mathrm{adj}\,(sI - A)$ is a polynomial of degree at most $n - 1$. Now suppose

$$\det\,(sI - A) = (s - \lambda_1)^{\sigma_1} \cdots (s - \lambda_m)^{\sigma_m}$$

where $\lambda_1, \ldots, \lambda_m$ are the distinct eigenvalues of $A$, with corresponding multiplicities $\sigma_1, \ldots, \sigma_m \geq 1$. Then the partial fraction expansion of each entry in $(sI - A)^{-1}$ gives

$$(sI - A)^{-1} = \sum_{k=1}^{m} \sum_{j=1}^{\sigma_k} W_{kj} \frac{1}{(s - \lambda_k)^j}$$

where each $W_{kj}$ is an $n \times n$ matrix of partial fraction expansion coefficients. That is, each entry of $W_{kj}$ is the coefficient of $1/(s - \lambda_k)^j$ in the expansion of the corresponding entry in the matrix $(sI - A)^{-1}$. (The matrix $W_{kj}$ is complex if the associated eigenvalue $\lambda_k$ is complex.) In fact, using a formula for partial fraction expansion coefficients, $W_{kj}$ can be written as

$$W_{kj} = \frac{1}{(\sigma_k - j)!} \frac{d^{\sigma_k - j}}{ds^{\sigma_k - j}} \left[ (s - \lambda_k)^{\sigma_k}(sI - A)^{-1} \right] \Big|_{s = \lambda_k}$$

Taking the inverse Laplace transform, using Table 1.10, gives an explicit form for the matrix exponential:

$$e^{At} = \sum_{k=1}^{m} \sum_{j=1}^{\sigma_k} W_{kj} \frac{t^{j-1}}{(j-1)!} e^{\lambda_k t} \tag{10}$$

Of course if some eigenvalues are complex, conjugate terms on the right side of (10) can be combined to give a real representation.
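5.9 Example  For the harmonic oscillator, where

$$A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$

a simple calculation gives

$$(sI - A)^{-1} = \begin{bmatrix} s & -1 \\ 1 & s \end{bmatrix}^{-1} = \frac{1}{s^2 + 1} \begin{bmatrix} s & 1 \\ -1 & s \end{bmatrix}$$

Partial fraction expansion and the Laplace transforms in Table 1.10 can be used, if memory fails, to obtain

$$e^{At} = \begin{bmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{bmatrix}$$

□□□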
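As a quick sanity check on the harmonic oscillator result, the power series definition can be summed directly and compared with the closed form. This is a minimal sketch; the function name `oscillator_exp_series`, the truncation length, and the sample time are illustrative assumptions.

```python
import math

def oscillator_exp_series(t, terms=60):
    """Sum the series for e^{At} with A = [[0, 1], [-1, 0]], truncated after `terms` terms."""
    A = [[0.0, 1.0], [-1.0, 0.0]]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term <- term * A * (t / k), giving (A t)^k / k!
        term = [[sum(term[i][m] * A[m][j] for m in range(2)) * t / k
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

t_sample = 0.7
series_value = oscillator_exp_series(t_sample)
closed_form = [[math.cos(t_sample), math.sin(t_sample)],
               [-math.sin(t_sample), math.cos(t_sample)]]
max_error = max(abs(series_value[i][j] - closed_form[i][j])
                for i in range(2) for j in range(2))
```

The two matrices should agree to rounding accuracy for moderate values of `t_sample`; for large times a truncated series is a poor numerical method and is used here only for illustration.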
The Jordan form for a matrix is not used in any essential way in this book. But it
may be familiar, and in conjunction with Property 5.6 it leads to another explicit form for

the matrix exponential in terms of eigenvalues. We outline the development as an


example of manipulations related to matrix exponentials. The Jordan form also is useful
in constructing examples and counterexamples for various conjectures since it is only a
state variable change away from a general A in a time-invariant linear state equation.
This utility is somewhat diminished by the fact that in the complex-eigenvalue case the

variable change is complex, and thus coefficient matrices in the new state equation
typically are complex. A remedy for such unpleasantness is the 'real Jordan form'
mentioned in Note 5.3.

5.10 Example  For a real $n \times n$ matrix $A$ there exists an invertible $n \times n$ matrix $P$, not necessarily real, such that $J = P^{-1}AP$ has the following structure. The matrix $J$ is block diagonal, with the $k^{th}$ diagonal block in the form

$$J_k = \begin{bmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & \lambda \end{bmatrix}$$

where $\lambda$ is an eigenvalue of $A$. There is at least one block for each eigenvalue of $A$, but the patterns of diagonal blocks that can arise for eigenvalues with high multiplicities are not of interest here. We need only know that the $n$ eigenvalues of $A$ are displayed on the diagonal of $J$. Of course, as reviewed in Chapter 1, if $A$ has distinct eigenvalues, then $P$ can be constructed from eigenvectors of $A$ and $J$ is diagonal. In general $J$ (and $P$) are complex when $A$ has complex eigenvalues. In any case Property 5.6 gives

$$e^{At} = Pe^{Jt}P^{-1} \tag{11}$$

and the structure of the right side is not difficult to describe.

Using the power series definition, we can show that the exponential of the block diagonal matrix $J$ also is block diagonal, with the blocks given by $e^{J_k t}$. Writing $J_k = \lambda I + N_k$, where $N_k$ has all zero entries except for 1's above the diagonal, and noting that $\lambda I$ commutes with $N_k$, Property 5.7 yields

$$e^{J_k t} = e^{\lambda I t}e^{N_k t} = e^{\lambda t}e^{N_k t} \tag{12}$$

Finally, since $N_k$ is nilpotent, calculation of the finite power series for $e^{N_k t}$ shows that $e^{N_k t}$ is upper triangular, with nonzero entries given by

$$\left[e^{N_k t}\right]_{ij} = \frac{t^{j-i}}{(j-i)!}, \quad i \leq j \tag{13}$$

Thus (11), (12), and (13) prescribe a general form for $e^{At}$ in terms of the eigenvalues of $A$. (Again notice how simple the distinct eigenvalue case is.)

As a specific illustration the Jordan-form matrix

$$J = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$

has one $3 \times 3$ block corresponding to a multiplicity-3 eigenvalue at zero, and two scalar blocks corresponding to a multiplicity-2 unity eigenvalue. Thus (12) and (13) give

$$e^{Jt} = \begin{bmatrix} 1 & t & t^2/2 & 0 & 0 \\ 0 & 1 & t & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & e^t & 0 \\ 0 & 0 & 0 & 0 & e^t \end{bmatrix}$$

□□□
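The block-by-block recipe in this example can be compared against a direct series computation. The sketch below is only an illustration: it assumes the entry formula $e^{\lambda t}\,t^{j-i}/(j-i)!$ for a single Jordan block, assembles the blocks of the specific $5 \times 5$ illustration, and uses a truncated power series as the reference. The function names are hypothetical.

```python
import math

def jordan_block_exp(lam, size, t):
    """Exponential of a size-by-size Jordan block with eigenvalue lam, entry by entry."""
    return [[math.exp(lam * t) * t ** (j - i) / math.factorial(j - i) if j >= i else 0.0
             for j in range(size)] for i in range(size)]

def block_diag(blocks):
    """Assemble square blocks into one block diagonal matrix."""
    n = sum(len(b) for b in blocks)
    M = [[0.0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                M[offset + i][offset + j] = value
        offset += len(b)
    return M

def series_exp(M, t, terms=60):
    """Truncated power series for e^{Mt}, used here only as a reference value."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[sum(term[i][m] * M[m][j] for m in range(n)) * t / k
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# One 3x3 block at eigenvalue 0 and two scalar blocks at eigenvalue 1.
J = block_diag([[[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]], [[1.0]], [[1.0]]])
t_sample = 1.5
by_blocks = block_diag([jordan_block_exp(0.0, 3, t_sample),
                        jordan_block_exp(1.0, 1, t_sample),
                        jordan_block_exp(1.0, 1, t_sample)])
by_series = series_exp(J, t_sample)
max_error = max(abs(by_blocks[i][j] - by_series[i][j])
                for i in range(5) for j in range(5))
```

Working block by block avoids summing a series at all for the nilpotent part, which is the practical appeal of the Jordan-form description.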
Special features of the transition matrix when $A(t)$ is constant naturally imply special properties of the response of a time-invariant linear state equation

$$\begin{aligned}
\dot x(t) &= Ax(t) + Bu(t), \quad x(t_0) = x_0 \\
y(t) &= Cx(t) + Du(t)
\end{aligned} \tag{14}$$

The complete solution formula in Chapter 3 becomes

$$y(t) = Ce^{A(t - t_0)}x_0 + \int_{t_0}^{t} Ce^{A(t - \sigma)}Bu(\sigma)\,d\sigma + Du(t), \quad t \geq t_0$$

This exhibits the zero-state and zero-input response components for time-invariant linear state equations, and in particular shows that the integral term in the zero-state response is a convolution. If $t_0 = 0$ the complete solution is

$$y(t) = Ce^{At}x_0 + \int_{0}^{t} Ce^{A(t - \sigma)}Bu(\sigma)\,d\sigma + Du(t), \quad t \geq 0 \tag{15}$$

A change of integration variable from $\sigma$ to $\tau = t - \sigma$ in the convolution integral gives

$$y(t) = Ce^{At}x_0 + \int_{0}^{t} Ce^{A\tau}Bu(t - \tau)\,d\tau + Du(t), \quad t \geq 0$$

Replacing every $t$ in (15) by $t - t_0$ shows that if the initial time is $t_0 \neq 0$, then the complete response to the initial state $x(t_0) = x_0$ and input $u_0(t) = u(t - t_0)$ is $y_0(t) = y(t - t_0)$. In words, time shifting the input and initial time implies a corresponding time shift in the output signal. Therefore we can assume $t_0 = 0$ without loss of generality for a time-invariant linear state equation.

Assuming a scalar input for simplicity, consider the zero-state response to a unit impulse $u(t) = \delta(t)$. (Recall that it is important for consistency reasons to interpret the initial time as $t = 0$ whenever an impulsive input signal is considered.) From (15) this unit impulse response is

$$y(t) = Ce^{At}B + D\delta(t)$$

Thus it follows from (15) that for an ordinary input signal the zero-state response is given by a convolution of the input signal with the unit-impulse response. In other words, in the single-input case, the unit-impulse response determines the zero-state response to any continuous input signal. It is not hard to show that in the multi-input case $m$ impulse responses are required.

The Laplace transform is often used to represent the response of the linear time-invariant state equation (14). Using the convolution property of the transform, and the Laplace transform of the matrix exponential, (15) gives

$$Y(s) = C(sI - A)^{-1}x_0 + [C(sI - A)^{-1}B + D]\,U(s) \tag{16}$$

This formula also can be obtained by writing the state equation (14) in terms of Laplace transforms and solving for $Y(s)$. (Again, the initial time should be interpreted as $t = 0$ for this calculation if impulsive inputs are permitted.)

It is easy to see, from (16) and (8), that if $U(s)$ is a proper rational function, then $Y(s)$ also is a proper rational function. Finally recall that the relation between $Y(s)$ and $U(s)$ under the assumption of zero initial state is called the transfer function. Namely the transfer function of a time-invariant linear state equation is the $p \times m$ matrix of rational functions

$$G(s) = C(sI - A)^{-1}B + D$$

Because of the presence of $D$, the entries of $G(s)$ in general are proper rational functions, but not strictly proper.

Periodic Case
The second special case we consider involves a restricted but important class of matrix functions of time. A continuous $n \times n$ matrix function $A(t)$ is called T-periodic if there exists a positive constant $T$ such that

$$A(t + T) = A(t) \tag{17}$$

for all $t$. (It is standard practice to assume that the period $T$ is the least value for which (17) holds.) The basic result for this special case involves a particular representation for the transition matrix. This Floquet decomposition then can be used to investigate properties of T-periodic linear state equations.

5.11 Property  The transition matrix for a T-periodic $A(t)$ can be written in the form

$$\Phi(t, \tau) = P(t)\,e^{R(t - \tau)}P^{-1}(\tau) \tag{18}$$

where $R$ is a constant (possibly complex) $n \times n$ matrix, and $P(t)$ is a continuously differentiable, T-periodic, $n \times n$ matrix function that is invertible at each $t$.

Proof  Define the $n \times n$ matrix $R$ by setting

$$e^{RT} = \Phi(T, 0) \tag{19}$$

(A nontrivial step involves computing the natural logarithm of the invertible matrix $\Phi(T, 0)$, and a complex $R$ can result. See Exercise 5.18 for further development, and Note 5.3 for citations.) Also define $P(t)$ by setting

$$P(t) = \Phi(t, 0)\,e^{-Rt} \tag{20}$$

Obviously $P(t)$ is continuously differentiable and invertible at each $t$, and it is easy to show that these definitions give the claimed decomposition. Indeed

$$\Phi(t, 0) = P(t)\,e^{Rt}$$

implies

$$\Phi(0, \tau) = \Phi^{-1}(\tau, 0) = e^{-R\tau}P^{-1}(\tau)$$

so that, as claimed,

$$\Phi(t, \tau) = \Phi(t, 0)\Phi(0, \tau) = P(t)\,e^{R(t - \tau)}P^{-1}(\tau)$$

It remains to show that $P(t)$ defined by (20) is T-periodic. From (20),

$$P(t + T) = \Phi(t + T, 0)\,e^{-R(t + T)} = \Phi(t + T, T)\,\Phi(T, 0)\,e^{-RT}e^{-Rt}$$

and since $\Phi(T, 0)\,e^{-RT} = I$,

$$P(t + T) = \Phi(t + T, T)\,e^{-Rt} \tag{22}$$

Now we note that $\Phi(t + T, T)$ satisfies the matrix differential equation

$$\frac{d}{dt}\Phi(t + T, T) = A(t + T)\,\Phi(t + T, T) = A(t)\,\Phi(t + T, T), \quad \Phi(t + T, T)\Big|_{t=0} = I$$

Therefore, by uniqueness of solutions, $\Phi(t + T, T) = \Phi(t, 0)$. Then (22) can be written as

$$P(t + T) = \Phi(t, 0)\,e^{-Rt} = P(t)$$

to conclude the proof.

□□□
Because of the unmotivated definitions of R and P(t), the proof of Property 5.11
resembles theft more than honest work. However there is one case where the constant
matrix R in (18) has a simple interpretation, and is easy to compute. From Property 4.2

we conclude that if the T-periodic A (t) commutes with its integral, then R is the
average value of A (t) over one period.

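5.12 Example  At the end of Example 4.6, in a different notation, the transition matrix for

$$A(t) = \begin{bmatrix} -1 & 0 \\ -\cos t & 0 \end{bmatrix}$$

is given as

$$\Phi(t, 0) = \begin{bmatrix} e^{-t} & 0 \\ -1/2 + e^{-t}(\cos t - \sin t)/2 & 1 \end{bmatrix} \tag{23}$$

This result can be deconstructed to illustrate Property 5.11. Clearly $T = 2\pi$, and

$$e^{R2\pi} = \Phi(2\pi, 0) = \begin{bmatrix} e^{-2\pi} & 0 \\ -1/2 + e^{-2\pi}/2 & 1 \end{bmatrix}$$

It is not difficult to verify that

$$R = \begin{bmatrix} -1 & 0 \\ -1/2 & 0 \end{bmatrix}$$

by computing $e^{Rt}$, and evaluating the result at $t = 2\pi$. Then

$$e^{-Rt} = \begin{bmatrix} e^{t} & 0 \\ -1/2 + e^{t}/2 & 1 \end{bmatrix}$$

and, from (20) and (23),

$$P(t) = \begin{bmatrix} 1 & 0 \\ -1/2 + (\cos t - \sin t)/2 & 1 \end{bmatrix}$$

Thus the Floquet decomposition for $\Phi(t, 0)$ is

$$\Phi(t, 0) = \begin{bmatrix} 1 & 0 \\ -1/2 + (\cos t - \sin t)/2 & 1 \end{bmatrix} \begin{bmatrix} e^{-t} & 0 \\ -1/2 + e^{-t}/2 & 1 \end{bmatrix} \tag{24}$$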
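A Floquet decomposition of this kind can be checked numerically in three ways: the candidate $\Phi(t, 0)$ should satisfy $\dot\Phi = A(t)\Phi$, it should agree with $e^{RT}$ at $t = T$, and $P(t) = \Phi(t, 0)e^{-Rt}$ should repeat with period $T$. The sketch below does this under explicit assumptions: the matrices $A(t)$, $\Phi(t, 0)$ and $R$ coded here are one reading of the example, taken as $A(t) = [[-1, 0], [-\cos t, 0]]$ and $R = [[-1, 0], [-1/2, 0]]$, and all function names are illustrative.

```python
import math

def A(t):
    """Assumed coefficient matrix for this check."""
    return [[-1.0, 0.0], [-math.cos(t), 0.0]]

def phi(t):
    """Candidate transition matrix Phi(t, 0) for the assumed A(t)."""
    return [[math.exp(-t), 0.0],
            [-0.5 + math.exp(-t) * (math.cos(t) - math.sin(t)) / 2.0, 1.0]]

def exp_R(t):
    """Closed form of e^{Rt} for the assumed R = [[-1, 0], [-1/2, 0]]."""
    return [[math.exp(-t), 0.0],
            [-0.5 + math.exp(-t) / 2.0, 1.0]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

def P(t):
    """Periodic factor P(t) = Phi(t, 0) e^{-Rt}."""
    return mat_mul(phi(t), exp_R(-t))

def ode_residual(t, h=1e-5):
    """Central-difference estimate of |d/dt Phi(t,0) - A(t) Phi(t,0)|."""
    forward, backward = phi(t + h), phi(t - h)
    derivative = [[(forward[i][j] - backward[i][j]) / (2.0 * h) for j in range(2)]
                  for i in range(2)]
    return max_abs_diff(derivative, mat_mul(A(t), phi(t)))

T = 2.0 * math.pi
sample_times = (0.0, 0.4, 1.3, 2.9)
monodromy_gap = max_abs_diff(phi(T), exp_R(T))
periodicity_gap = max(max_abs_diff(P(t + T), P(t)) for t in sample_times)
worst_residual = max(ode_residual(t) for t in sample_times)
```

All three quantities should be small; a noticeable value in any of them would indicate that the assumed matrices do not form a consistent decomposition.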

The representation in Property 5.11 for the transition matrix implies that if $R$ is known and $P(t)$ is known for $t \in [t_0, t_0 + T)$, then $\Phi(t, t_0)$ can be computed for arbitrary values of $t$. Also the growth properties of $\Phi(t, t_0)$, and thus of solutions of the linear state equation

$$\dot x(t) = A(t)x(t), \quad x(t_0) = x_0 \tag{25}$$

with T-periodic $A(t)$, depend on the eigenvalues of the constant matrix $e^{RT} = \Phi(T, 0)$. To see this, note that for any positive integer $k$ repeated application of the composition property (Property 4.7) leads to

$$\begin{aligned}
x(t + kT) &= \Phi(t + kT, t_0)\,x_0 \\
&= \Phi(t + kT, t + (k-1)T)\,\Phi(t + (k-1)T, t + (k-2)T) \cdots \Phi(t + T, t)\,\Phi(t, t_0)\,x_0 \\
&= P(t + kT)\,e^{RT}P^{-1}(t + (k-1)T)\,P(t + (k-1)T)\,e^{RT}P^{-1}(t + (k-2)T) \cdots P(t + T)\,e^{RT}P^{-1}(t)\,x(t) \\
&= P(t + kT)\,[e^{RT}]^k\,P^{-1}(t)\,x(t) = P(t)\,[e^{RT}]^k\,P^{-1}(t)\,x(t)
\end{aligned}$$

If, for example, the eigenvalues of $e^{RT}$ all have magnitude strictly less than unity, then $[e^{RT}]^k \to 0$ as $k \to \infty$, as a Jordan-form argument shows. (Write the Jordan form of $e^{RT}$ as the sum of a diagonal matrix and a nilpotent matrix, as in Example 5.10. Then, using commutativity, apply the binomial expansion to the $k^{th}$ power of this sum to see that each entry of the result is zero, or approaches zero as $k \to \infty$.) Thus for any $t$, $x(t + kT) \to 0$ as $k \to \infty$. That is, $x(t) \to 0$ as $t \to \infty$ for every $x_0$. Similarly when at least one eigenvalue has magnitude greater than unity there are initial states for which $x(t)$ grows without bound as $t \to \infty$.

If $e^{RT}$ has at least one unity eigenvalue, the existence of nonzero T-periodic solutions to (25) for appropriate initial states is established in the following development. We prove the converse also. Note that this is one setting where the solution for $t < t_0$ as well as for $t \geq t_0$ is considered, as dictated by the definition of periodicity: $x(t + T) = x(t)$ for all $t$.

5.13 Theorem  Suppose $A(t)$ is T-periodic. Given any $t_0$ there exists a nonzero initial state $x_0$ such that the solution of

$$\dot x(t) = A(t)x(t), \quad x(t_0) = x_0 \tag{26}$$

is T-periodic if and only if at least one eigenvalue of $e^{RT} = \Phi(T, 0)$ is unity.

Proof  Suppose that at least one eigenvalue of $e^{RT}$ is unity, and let $z_0$ be a corresponding eigenvector. Then $z_0$ is real and nonzero, and it is easy to verify that for any $t_0$

$$z(t) = e^{R(t - t_0)}z_0 \tag{27}$$

is T-periodic. (Simply compute $z(t + T)$ from (27).) Invoking the Floquet description for $\Phi(t, t_0)$ and letting $x_0 = P(t_0)z_0$ yields the (nonzero) solution of (26):

$$x(t) = \Phi(t, t_0)\,x_0 = P(t)\,e^{R(t - t_0)}P^{-1}(t_0)\,x_0 = P(t)z(t)$$

This solution clearly is T-periodic, since both $P(t)$ and $z(t)$ are T-periodic.

Now suppose that given $t_0$ there is a nonzero initial state $x_0$ such that the solution $x(t)$ of (26) is T-periodic. Using the Floquet description,

$$x(t) = P(t)\,e^{R(t - t_0)}P^{-1}(t_0)\,x_0$$

and

$$x(t + T) = P(t + T)\,e^{R(t + T - t_0)}P^{-1}(t_0)\,x_0 = P(t)\,e^{R(t + T - t_0)}P^{-1}(t_0)\,x_0$$

Since $x(t) = x(t + T)$ for all $t$, these representations imply

$$e^{RT}P^{-1}(t_0)\,x_0 = P^{-1}(t_0)\,x_0 \tag{28}$$

But $P^{-1}(t_0)x_0 \neq 0$, so (28) exhibits $P^{-1}(t_0)x_0$ as an eigenvector of $e^{RT}$ corresponding to a unity eigenvalue.

Theorem 5.13 can be restated in terms of the matrix $R$ rather than $e^{RT}$, since $e^{RT}$ has a unity eigenvalue if and only if $R$ has an eigenvalue that is an integer multiple of the purely imaginary number $2\pi i/T$. To prove this, if $(k2\pi i/T)$ is an eigenvalue of $R$ with eigenvector $z$, then $(RT)^j z = R^j z\,T^j = (k2\pi i)^j z$. Thus, from the power series for the matrix exponential, $e^{RT}z = e^{k2\pi i}z = z$, and this shows that $e^{RT}$ has a unity eigenvalue. The converse argument involves transformation of $R$ to Jordan form.

Now consider the case of a linear state equation where both $A(t)$ and $B(t)$ are T-periodic, and where the inputs of interest also are T-periodic. For simplicity such a state equation is written as

$$\dot x(t) = A(t)x(t) + f(t) \tag{29}$$

We assume that both $A(t)$ and $f(t)$ are T-periodic, and $A(t)$ is continuous, as usual. However to accommodate a technical argument in the proof of Theorem 5.15 we permit $f(t)$ to be piecewise continuous.

5.14 Lemma  A solution $x(t)$ of the T-periodic state equation (29) is T-periodic if and only if $x(t_0 + T) = x_0$.

Proof  Of course if $x(t)$ is T-periodic, then $x(t_0 + T) = x(t_0)$. Conversely suppose $x_0$ is such that the corresponding solution of (29) satisfies $x(t_0 + T) = x_0$. Letting $z(t) = x(t + T) - x(t)$, it follows that $z(t_0) = 0$, and

$$\dot z(t) = [A(t + T)x(t + T) + f(t + T)] - [A(t)x(t) + f(t)] = A(t)z(t)$$

But uniqueness of solutions implies $z(t) = 0$ for all $t$, that is, $x(t)$ is T-periodic.

Using this lemma the next result provides conditions for the existence of T-periodic solutions for every T-periodic $f(t)$. (A refinement dealing with a single, specified T-periodic $f(t)$ is suggested in Exercise 5.22.)

5.15 Theorem  Suppose $A(t)$ is T-periodic. Then for every $t_0$ and every T-periodic $f(t)$ there exists an $x_0$ such that the solution of

$$\dot x(t) = A(t)x(t) + f(t), \quad x(t_0) = x_0 \tag{30}$$

is T-periodic if and only if there does not exist $z_0 \neq 0$ and $t_0$ for which

$$\dot z(t) = A(t)z(t), \quad z(t_0) = z_0 \tag{31}$$

has a T-periodic solution.

Proof  For any $x_0$, $t_0$, and T-periodic $f(t)$, the solution of (30) is

$$x(t) = \Phi(t, t_0)\,x_0 + \int_{t_0}^{t} \Phi(t, \sigma)f(\sigma)\,d\sigma$$

By Lemma 5.14, $x(t)$ is T-periodic if and only if

$$[I - \Phi(t_0 + T, t_0)]\,x_0 = \int_{t_0}^{t_0 + T} \Phi(t_0 + T, \sigma)f(\sigma)\,d\sigma \tag{32}$$

Therefore, by Theorem 5.13, it must be shown that this algebraic equation has a solution for $x_0$ given any $t_0$ and any T-periodic $f(t)$ if and only if $e^{RT}$ has no unity eigenvalues.

First suppose $e^{RT} = \Phi(T, 0)$ has no unity eigenvalues, that is,

$$\det\,[I - \Phi(T, 0)] \neq 0 \tag{33}$$

By invertibility of transition matrices, (33) is equivalent to the condition

$$\begin{aligned}
0 &\neq \det\,\{\Phi(t_0 + T, T)\,[I - \Phi(T, 0)]\,\Phi(0, t_0)\} \\
&= \det\,[\Phi(t_0 + T, T)\,\Phi(0, t_0) - \Phi(t_0 + T, t_0)]
\end{aligned}$$

Since $\Phi(t_0 + T, T) = \Phi(t_0, 0)$, as shown in the proof of Property 5.11, we conclude that (33) is equivalent to invertibility of $[I - \Phi(t_0 + T, t_0)]$ for any $t_0$. Thus (32) has a solution $x_0$ for any $t_0$ and any T-periodic $f(t)$.

Now suppose that (32) has a solution for every $t_0$ and every T-periodic $f(t)$. Given $t_0$, corresponding to any $n \times 1$ vector $f_0$, define a particular T-periodic, piecewise-continuous $f(t)$ by setting

$$f(t) = \Phi(t, t_0 + T)\,f_0, \quad t \in [t_0, t_0 + T) \tag{34}$$

and extending this definition to all $t$ by repeating. For such a piecewise-continuous, T-periodic $f(t)$,

$$\int_{t_0}^{t_0 + T} \Phi(t_0 + T, \sigma)f(\sigma)\,d\sigma = \int_{t_0}^{t_0 + T} f_0\,d\sigma = Tf_0$$

and (32) becomes

$$[I - \Phi(t_0 + T, t_0)]\,x_0 = Tf_0 \tag{35}$$

For every $f(t)$ of the type constructed above, that is for every $f_0$, (35) has a solution for $x_0$ by assumption. Therefore

$$\det\,[I - \Phi(t_0 + T, t_0)] \neq 0$$

and, again, this is equivalent to (33). Thus no eigenvalue of $e^{RT}$ is unity.

□□□

We now apply this general result to a situation that might be familiar. The sufficiency portion of Theorem 5.15 immediately applies to the case $f(t) = B(t)u(t)$, though necessity requires the notion of controllability discussed in Chapter 9 (to avoid certain difficulties, a trivial instance of which is the case of zero $B(t)$). Of course a time-invariant linear state equation is T-periodic for any value of $T > 0$.

5.16 Corollary  For the time-invariant linear state equation

$$\dot x(t) = Ax(t) + Bu(t), \quad x(0) = x_0 \tag{36}$$

suppose $A$ has no eigenvalue with zero real part. Then for every T-periodic input $u(t)$ there exists an $x_0$ such that the corresponding solution is T-periodic.

In particular it is worthwhile to contemplate this corollary in the single-input case where $A$ has negative-real-part eigenvalues, and the input signal is $u(t) = \sin \omega t$. By Corollary 5.16 there exists an initial state such that the complete response $x(t)$ is periodic with $T = 2\pi/\omega$. And it is clear from the Laplace transform representation of the solution that for any initial state the response $x(t)$ approaches periodicity as $t \to \infty$. Perhaps surprisingly, if $A$ has (some, or all) eigenvalues with positive real part, but none with zero real part, then there still exists a periodic solution for some initial state. Evidently the unbounded terms in the zero-input response component are canceled by unbounded terms in the zero-state response.

Additional Examples
Consideration of physical situations leading to time-invariant or T-periodic linear state

equations might provide a welcome digression from theoretical developments.

5.17 Example  Various properties of time-invariant linear systems are illustrated in the sequel by connections of simple cylindrical water buckets, some of which have a supply pipe, and some of which have an orifice at the bottom. We assume that the cross-sectional area of a bucket is $c$ cm², the depth of water in the bucket at time $t$ is $x(t)$ cm, and the inflow is $u(t)$ cm³/sec. Also it is assumed that the outflow through an orifice, denoted $y(t)$ cm³/sec, is described by

where $q$ is a positive constant. Since the rate-of-change of volume of water in the bucket is

$$c\,\dot x(t) = u(t) - y(t)$$

t(t)=
(37)


Two complications are apparent: This is a nonlinear state equation, and our
formulation requires that all variables be nonnegative. Both matters are rectified by

considering a linearized state equation about a constant nominal solution.


Suppose the nominal inflow is a constant,
=
> 0. Thus a corresponding
nominal constant depth is

X=

the nominal outflow (necessarily equal to the inflow) is j;(,) =


this nominal solution gives the linear state equation
and

Linearizing about

x8(t) +

=
where

r=

and the deviation variables have the obvious definitions. In this

formulation the deviation variables can take either positive or negative values,

corresponding to original-variable values above or below the specified nominal values.


Of course this is true within limits, depending on the nominal values, and we assume
always that the buckets are operated within these limits. Various other assumptions
relating to the proper interpretation of the linearized state equation, all quite obvious, are
not explicitly mentioned in the sequel. For example, the buckets must be large enough so
that floods are avoided over the range of operation of the flows and depths.

Figure 5.18 A linear water bucket.

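Finally, to simplify notation, we drop the subscript $\delta$ in the sequel to write the linearized water bucket, shown in Figure 5.18, as

$$\begin{aligned}
\dot x(t) &= -\frac{1}{rc}\,x(t) + \frac{1}{c}\,u(t) \\
y(t) &= \frac{1}{r}\,x(t)
\end{aligned} \tag{38}$$

A simple calculation gives the bucket transfer function as

$$G(s) = \frac{1/(rc)}{s + 1/(rc)}$$

More interesting are connections of two or more linear buckets. A series connection is shown in Figure 5.19, and the corresponding linearized state equation, easily derived from the basic bucket principles discussed above, is

$$\begin{aligned}
\dot x(t) &= \begin{bmatrix} -1/(r_1c_1) & 0 \\ 1/(r_1c_2) & -1/(r_2c_2) \end{bmatrix} x(t) + \begin{bmatrix} 1/c_1 \\ 0 \end{bmatrix} u(t) \\
y(t) &= \begin{bmatrix} 0 & 1/r_2 \end{bmatrix} x(t)
\end{aligned}$$

Computation of the transfer function of the series bucket is rather simple, due to the triangular $A$, giving

$$\begin{aligned}
G_s(s) &= \begin{bmatrix} 0 & 1/r_2 \end{bmatrix} \begin{bmatrix} s + 1/(r_1c_1) & 0 \\ -1/(r_1c_2) & s + 1/(r_2c_2) \end{bmatrix}^{-1} \begin{bmatrix} 1/c_1 \\ 0 \end{bmatrix} \\
&= \frac{1/(r_1r_2c_1c_2)}{[s + 1/(r_1c_1)][s + 1/(r_2c_2)]}
\end{aligned} \tag{39}$$

More cleverly, it can be recognized from the beginning that $G_s(s)$ is simply the product of two single-bucket transfer functions.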
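The product structure of the series-bucket transfer function can be spot-checked at a single complex frequency. The sketch below is illustrative only: it assumes the series state equation has $A = [[-1/(r_1c_1), 0], [1/(r_1c_2), -1/(r_2c_2)]]$, $B = [1/c_1, 0]^T$ and $C = [0, 1/r_2]$, and the parameter values and test point are arbitrary hypothetical choices.

```python
def transfer_2x2(A, B, C, s):
    """Evaluate C (sI - A)^{-1} B for a two-state, single-input, single-output system."""
    m11, m12 = s - A[0][0], -A[0][1]
    m21, m22 = -A[1][0], s - A[1][1]
    det = m11 * m22 - m12 * m21
    # Explicit inverse of the 2x2 matrix sI - A.
    inv = [[m22 / det, -m12 / det], [-m21 / det, m11 / det]]
    v = [inv[0][0] * B[0] + inv[0][1] * B[1],
         inv[1][0] * B[0] + inv[1][1] * B[1]]
    return C[0] * v[0] + C[1] * v[1]

def single_bucket(r, c, s):
    """Single linear bucket transfer function (1/(rc)) / (s + 1/(rc))."""
    return (1.0 / (r * c)) / (s + 1.0 / (r * c))

# Hypothetical bucket parameters and an arbitrary test frequency.
r1, c1, r2, c2 = 2.0, 3.0, 5.0, 7.0
A = [[-1.0 / (r1 * c1), 0.0], [1.0 / (r1 * c2), -1.0 / (r2 * c2)]]
B = [1.0 / c1, 0.0]
C = [0.0, 1.0 / r2]
s = 1.0 + 2.0j

series_value = transfer_2x2(A, B, C, s)
product_value = single_bucket(r1, c1, s) * single_bucket(r2, c2, s)
mismatch = abs(series_value - product_value)
```

Agreement at one point is not a proof, but a mismatch would immediately flag an error in the assumed state equation.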

Figure 5.19 A series connection of two linear buckets.

A slightly more subtle system is what we call a parallel bucket connection, shown in Figure 5.20. Assuming that the flow through the orifice connecting the two buckets is proportional to the difference in water depths in the two buckets, the linearized state equation description is

$$\begin{aligned}
\dot x(t) &= \begin{bmatrix} -1/(r_1c_1) & 1/(r_1c_1) \\ 1/(r_1c_2) & -1/(r_1c_2) - 1/(r_2c_2) \end{bmatrix} x(t) + \begin{bmatrix} 1/c_1 \\ 0 \end{bmatrix} u(t) \\
y(t) &= \begin{bmatrix} 0 & 1/r_2 \end{bmatrix} x(t)
\end{aligned} \tag{40}$$

Computing the transfer function for this system is left as a small exercise, with no apparent short-cuts.

Figure 5.20 A parallel connection of linear buckets.

5.21 Example A variant of the familiar pendulum is shown in Figure 5.22, where the
rod has unit length, m is the mass of the bob, and $x_1(t)$ is the angle of the pendulum
from the vertical. We make the usual assumptions that the rod is rigid with zero mass,
and the pivot is frictionless. Ignoring for a moment the indicated pivot displacement,
w(t), the equations of motion lead to the nonlinear state equation

$$\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix} = \begin{bmatrix} x_2(t) \\ -g \sin x_1(t) \end{bmatrix} \qquad (41)$$

where g is the acceleration due to gravity. Next assume that the pivot point is subject to
a vertical motion w(t). This induces an acceleration $\ddot{w}(t)$ that can be interpreted as
modifying the acceleration due to gravity. Thus we obtain

$$\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix} = \begin{bmatrix} x_2(t) \\ -[\,g + \ddot{w}(t)\,] \sin x_1(t) \end{bmatrix}$$

Figure 5.22 A pendulum with pivot displacement w(t).

A natural constant nominal solution corresponds to zero values for w(t), $x_1(t)$,
and $x_2(t)$. Then an easy exercise in linearization leads to the linear state equation

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -[\,g + \ddot{w}(t)\,] & 0 \end{bmatrix} x(t) \qquad (42)$$

This is a suitable approximation for small absolute values of angle $x_1(t)$, angular
velocity $x_2(t)$, and pivot displacement w(t).


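Now suppose the pivot displacement has the form

$$w(t) = \frac{a}{\omega^2} \cos \omega t$$

where a and $\omega$ are constants. For simplicity we further suppose the pendulum is on
another planet, where g = 1. This yields the T-periodic linear state equation

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -1 + a \cos \omega t & 0 \end{bmatrix} x(t) \qquad (43)$$

with $T = 2\pi/\omega$.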

Though simple in form, this periodic state equation seems to elude useful
solution. The obvious exception is the case a = 0, where the oscillatory
solution in Example 5.9 is obtained. In particular the initial conditions $x_1(0) = 1$,
$x_2(0) = 0$ yield $x_1(t) = \cos t$, an oscillation with period $2\pi$.
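Consider next what happens when the parameter a is nonzero. Our approach is to
compute $e^{RT} = \Phi(T, 0)$, and assess the asymptotic behavior of the pendulum from the
eigenvalues of this 2 x 2 matrix. With $\omega = 4$ and a = 1, (43) has period $T = \pi/2$, and
we numerically solve (43) for two initial states to obtain the corresponding values of
$x(\pi/2)$ shown:

$$x(0) = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \;\Rightarrow\; x(\pi/2) = \begin{bmatrix} 0.0328 \\ -1.2026 \end{bmatrix}, \qquad x(0) = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \;\Rightarrow\; x(\pi/2) = \begin{bmatrix} 0.8306 \\ 0.0236 \end{bmatrix} \qquad (44)$$

Therefore

$$e^{R\pi/2} = \Phi(\pi/2, 0) = \begin{bmatrix} 0.0328 & 0.8306 \\ -1.2026 & 0.0236 \end{bmatrix}$$

and another numerical calculation gives the eigenvalues $0.0282 \pm i\,0.9994$. In this
case, following the analysis below (25), we see that the pivot displacement causes the
oscillation to slowly die out, since the magnitude of both eigenvalues is 0.9998.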
Next suppose $\omega = 2$ and a = 1, so that (43) is $\pi$-periodic. Repeating the
numerical solution as in (44) yields

$$e^{R\pi} = \Phi(\pi, 0) = \begin{bmatrix} -1.3061 & -0.8276 \\ -0.8526 & -1.3054 \end{bmatrix}$$

The eigenvalues now are -0.4657 and -2.1458, from which we conclude that the
oscillation grows without bound. What happens in this case, when the displacement
frequency is twice the natural frequency of the unaccelerated pendulum, can be
interpreted in a familiar way. The pendulum is raised twice each complete cycle of its
oscillation, doing work against the centrifugal force, and lowered twice each cycle when
the centrifugal force is small. This results in an increase in energy, producing an
increased amplitude of oscillation. The effect is rapidly learned by a child on a swing.
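
The numerical procedure used in this example can be sketched as follows. This is only an illustration, assuming the coefficient entry $-1 + a\cos\omega t$ for (43) with g = 1; the fixed-step Runge-Kutta integrator, the step count, and the function names are choices made here, not the method used to produce the numbers in the text. Since A(t) has zero trace, the determinant of the computed $\Phi(T, 0)$ should be very close to 1, so an accurate integration may return eigenvalue magnitudes nearer to 1 than the rounded values quoted for $\omega = 4$.

```python
import cmath
import math


def monodromy(a, omega, steps=4000):
    """Approximate Phi(T, 0) for (43), T = 2*pi/omega, by fixed-step RK4."""
    T = 2.0 * math.pi / omega
    h = T / steps

    def f(t, x):
        # x = (angle, angular velocity); assumed right side of (43) with g = 1
        return (x[1], (-1.0 + a * math.cos(omega * t)) * x[0])

    columns = []
    for x in ((1.0, 0.0), (0.0, 1.0)):       # the two unit initial states
        t = 0.0
        for _ in range(steps):
            k1 = f(t, x)
            k2 = f(t + h / 2, (x[0] + h / 2 * k1[0], x[1] + h / 2 * k1[1]))
            k3 = f(t + h / 2, (x[0] + h / 2 * k2[0], x[1] + h / 2 * k2[1]))
            k4 = f(t + h, (x[0] + h * k3[0], x[1] + h * k3[1]))
            x = (x[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
                 x[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
            t += h
        columns.append(x)
    # columns[j] is x(T) for the j-th unit initial state, i.e. column j of Phi(T, 0)
    return [[columns[0][0], columns[1][0]],
            [columns[0][1], columns[1][1]]]


def eigenvalues_2x2(M):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0


for omega in (4.0, 2.0):
    phi = monodromy(1.0, omega)
    print(omega, phi, [abs(lam) for lam in eigenvalues_2x2(phi)])
```

Eigenvalue magnitudes at or below 1 correspond to bounded motion, while a magnitude above 1, as for $\omega = 2$, signals growth.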

EXERCISES
Exercise 5.1 For a constant n x n matrix A, show that the transition matrix for the transpose of A
is the transpose of the transition matrix for A. Is this true for nonconstant A(t)? Is it true for the
case where A(t) commutes with its integral?

Exercise 5.2 Compute

for
0

(a) A

_2]

(1,) A

-l

= [

(c)

-2

Exercise 5.3 Compute eM for

A=
by two different methods.

Exercise 5.4 Compute c1(r, 0) for

A(t)=
Hint: One efficient way is to use the result of Exercise 5.3.

Exercise 5.5 The transfer function of the series bucket system in Figure 5.19 with all parameter
values unity is

$$G_s(s) = \frac{1}{(s+1)^2}$$

Can you find parameter values for the parallel bucket system of Figure 5.20 such that its transfer
function is the same?
Exercise 5.6 Compute state equation representations and voltage transfer functions $Y_a(s)/U_a(s)$
and $Y_b(s)/U_b(s)$ for the two electrical circuits shown. Then connect the circuits in cascade
($u_b(t) = y_a(t)$) and compute a linear state equation representation and transfer function
$Y_b(s)/U_a(s)$. Comment on the results in light of algebraic manipulation of transfer functions
involved in representing interconnections of linear time-invariant systems.

[Figure: the two electrical circuits, with output voltages $y_a(t)$ and $y_b(t)$.]

Liercise 5.7 If A is a constant n x n matrix, show that


do =

additional conditions on A yield

Exercise 5.8

Suppose the n x n matrix A (t) can be written in the form

A(t)

fr(t)

are

continuous, scalar functions, and A

are constant

that satisfy

=A1A,, i,j =

,...,r

that the transition matrix for A (t) can be written as


A

(I)(t,

1(,)

Sf0d0

ArJfr(cl)dO

=e

Lw this result to compute C1(t, 0) for


coswt

sinwf

sin cot cos cot

Etercise 5.9 For the time-invariant, n-dimensional, single-input nonlinear state equation
= Ax(t) +

Dx(t)u(t) +

(I), x(0) =

tho* that under appropriate additional hypotheses a solution is


DJu(t)Jt

x(t) =
Exercise 5.11)

bu(o) do

If A and F are n x n constant matrices, show that


e" = fe (_0)FC(A+F)a do

ii

xn

Chapter 5

94
Exercise 5.11

If A and F are n x n constant matrices, show that


_SeAa[e(A+FXt_o)F

Exercise 5.12

Two Important Cases

Suppose A has eigenvalues X1

and let

P0=1,P1=AX1!,P2=(AA.2!)(AA.11)
= (A

... (A

A.1!)

(t) such that

Show how to define scalar analytic functions


flI

eAt

A =0

Exercise 5.13

Suppose A is n x n, and

=?

det (si A)

+ a0

Verify the formula


adj (si A) =

. .

+a1)! +

(s

and use it to show that there exist strictly-proper rational functions of s such that

=&0(s)I +&1(s)A +
Exercise 5.14 Compute cD(t, 0) for the T-penodic state equation with
2+cos2t

A(t)=

3+cos2t

Compute P (t) and R for the Floquet decomposition of the transition matrix.

Exercise 5.15 Consider the linear state equation

.v(t)=Ax(t)

+f(t),

where all eigenvalues of A have negative real parts, and 1(1) is continuous and T-periodic. Show
that

x(t)=
is

e'tf(a)da

a T-periodic solution corresponding to


=

(a) da

Show that a solution corresponding to a different x0 converges to this periodic solution as t , oo

Exercise 5.16 Show that a linear state equation with T-periodic A(t) can be transformed to a
time-invariant linear state equation by a T-periodic variable change.

Exercise 5.17 Suppose that A (1) is T-periodic and


(1) can be written in the form

is fixed. Show that the transition matrix for

Exercises

95

cb(,, t0) = Q(t,


where S is a (possibly complex) constant matrix (depending on ti,), and Q (t, t0) is continuous and
vcrtible at each t, and satisfies

Q(t + T, t,,)=Q(t, ti,), Q(t0,10)=!


Exercise 5.18 Suppose M is an n x n invertible matrix with distinct eigenvalues. Show that there
exists a possibly complex, n x n matrix R such that

$$e^{R} = M$$
Exercise

5.19

Prove that a T-periodic linear state equation


= A O)x(t)

unbounded solutions if

Exercise 5.20

Suppose A (t) is
matrix for AO) can be written as

x n, real,

continuous, and T-periodic. Show that the transition

(b(t, 0) =

S is a constant, real, n x n matrix, and Q (t) is n x n, real, continuous, and 2T-periodic.


Hint: It is a mathematical fact that if M is real and invertible, then there is a real S such that
where

= M2.
Exercise 5.21

For the time-invariant linear state equation

i(t) =Ax(t) + Bu(t)


y(t)
all eigenvalues

a(r) =

of A

Cx(t)

negative real parts, and consider the input signal

have

x I and w> 0. In terms of the transfer function, derive an explicit


expression for the periodic signal that y (t) approaches as + 00, regardless of initial state. (This
sin wa', where

u0

is

is called the steady-state frequency response at frequency co.)


5.22 For a T-periodic state equation with a specified T-periodic input, establish the
following refinement of Theorem 5.15. There exists an x0 such that the solution of
Exercise

.i(r) =A(r)x(r)
us

+f(t) ,

x(t0)

T-periodic if and only if f (t) is such that


5
z

(t) of the adjoint state equation


= __AT(t)z(t)

Exercise 5.23

z()

Consider the pendulum with horizontal pivot displacement shown below.
Assuming g = 1, as in Example 5.21, write a linearized state equation description about the
natural zero nominal. If w(t) = sin t, does there exist a periodic solution? If not, what do you
expect the asymptotic behavior of solutions to be? Hint: Use the result of Exercise 5.22, or
compute the complete solution.

Exercise 5.24

Determine values of w for which there exists an .v,, such that the resulting solution

of
=

.v(0) =x0
+

is periodic. Hint: Use the result of Exercise 5.22.

NOTES
Note 5.1 In Property 5.7 necessity of the commutativity condition on A and F fails if equality of
exponentials is postulated at a single value of t. Specifically there are non-commuting matrices A
and F such that $e^{A}e^{F} = e^{A+F}$. For further details see

D.S. Bernstein, "Commuting matrix exponentials," Problem 88-1, SIAM Review, Vol. 31, No. 1,
p. 125, 1989

and the solution and references that follow the problem statement.

Note 5.2 Further information about the functions $\alpha_k(t)$ in Property 5.8, including differential
equations they individually satisfy, and linear independence properties, is provided in

M. Vidyasagar, "A characterization of $e^{At}$ and a constructive proof of the controllability
condition," IEEE Transactions on Automatic Control, Vol. 16, No. 4, pp. 370-371, 1971

Note 5.3 The Jordan form is treated in almost every book on matrices. The real version of the
Jordan form (when A has complex eigenvalues) is less ubiquitous. See Section 3.4 of

R.A. Horn, C.R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, England,
1985

The natural logarithm of a matrix in the general case is a more complex issue than in the special
case considered in Exercise 5.18. A Jordan-form argument is given in Section 3.4 of
R.K. Miller, A.N. Michel, Ordinary Differential Equations, Academic Press, New York, 1982
A more advanced treatment, including a proof of the fact quoted in Exercise 5.20, can be found in
Section 8.1 of

D.L. Lukes, Differential Equations: Classical to Controlled, Academic Press, New York, 1982

Note 5.4 Differential equations with periodic coefficients have a long history in mathematical
physics, and associated phenomena such as parametric pumping are of technological interest.
Brief and less-brief treatments, respectively, can be found in

J.A. Richards, Analysis of Periodically Time-Varying Systems, Springer-Verlag, New York, 1983
M. Farkas, Periodic Motions, Springer-Verlag, New York, 1994

These books introduce standard terminology ignored in our discussion. For example in Property
5.11 the eigenvalues of R are called characteristic exponents, and the eigenvalues of eRT are called
multipliers. Also both books treat the classical Hill equation,

j(t) + [a +
where a (t) is

T-periodic. The special case in Example 5.21 is known as the Mathieu equation.

of periodicity and boundedness of solutions are surprisingly complicated for these


differential equations.

Note 5.5 Periodicity properties of solutions of linear state equations when A(t) and f(t) have
symmetry properties (even or odd) in addition to being periodic are discussed in

R.J. Mulholland, "Time symmetry and periodic solutions of the state equations," IEEE
Transactions on Automatic Control, Vol. 16, No. 4, pp. 367-368, 1971

Note 5.6 Extension of the Laplace transform representation to time-varying linear systems has
long been an appealing notion. Early work by L.A. Zadeh is reviewed in Section 8.17 of

W. Kaplan, Operational Methods for Linear Systems, Addison-Wesley, Reading, Massachusetts,
1962

See also Chapters 9 and 10 of

H. D'Angelo, Linear Time-Varying Systems, Allyn and Bacon, Boston, 1970

and, for more recent developments,

E.W. Kamen, "Poles and zeros of linear time varying systems," Linear Algebra and Its
Applications, Vol. 98, pp. 263-289, 1988

Note 5.7 We have not exhausted known properties of transition matricesa believable claim we
with two examples. Suppose
q

A(r)=
k=I

where A
Aq are constant n x it matrices, a1 (t)
aq(t) are scalar functions, and of course
Then there exist scalar functions f1(t)
fq(t) such that
q

cD(t, 0) =

least for t in a small neighborhood of t = 0. A discussion of this property, with references to the
mathematics literature, is in

Ri. Mulholland, "Exponential representation for linear systems," IEEE Transactions on


.4utomatic Control, Vol. 16, No. I, pp.97 98, 1971

The second example is a formula that might be familiar from the scalar case:

$$e^{A} = \lim_{n \to \infty} (I + A/n)^{n}$$
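
The limit formula is easy to try numerically. The sketch below is illustrative only: it takes n to be a power of two so that the n-th power can be formed by repeated squaring, and it compares against a 2 x 2 example whose exponential is known in closed form. The helper names and the choice of n are assumptions made here.

```python
import math


def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exp_by_limit(A, squarings=20):
    """Approximate e^A by (I + A/n)^n with n = 2**squarings."""
    n = 2 ** squarings
    size = len(A)
    M = [[(1.0 if i == j else 0.0) + A[i][j] / n for j in range(size)]
         for i in range(size)]
    for _ in range(squarings):      # each squaring doubles the exponent
        M = matmul(M, M)
    return M


A = [[0.0, 1.0], [-1.0, 0.0]]       # e^A is a rotation through 1 radian
approx = exp_by_limit(A)
exact = [[math.cos(1.0), math.sin(1.0)], [-math.sin(1.0), math.cos(1.0)]]
err = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(err)
```

The error decreases only like 1/n, which is one reason this formula is of more theoretical than computational interest.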

Note 5.8 Numerical computation of the matrix exponential $e^{At}$ can be approached in many ways,
each with attendant weaknesses. A survey of about 20 methods is in

C. Moler, C. Van Loan, "Nineteen dubious ways to compute the exponential of a matrix," SIAM
Review, Vol. 20, No. 4, pp. 801-836, 1978

Note 5.9 Our water bucket systems are light-hearted examples of the compartmental models
widely applied in the biological and social sciences. For a broad introduction, consult
K. Godfrey, Compartmental Models and Their Application, Academic Press, London, 1983

The issue of nonnegative signals, which we side-stepped by linearizing about positive nominal
values, frequently arises. So-called positive linear systems are such that all coefficients and signals
must have nonnegative entries. A basic introduction is provided in
D.G. Luenberger, Introduction to Dynamic Systems, John Wiley, New York, 1979
and more can be found in
A. Berman, M. Neumann, R.J. Stern, Nonnegative Matrices in Dynamic Systems, John Wiley, New
York, 1989

6
INTERNAL STABILITY

Internal stability deals with boundedness properties and asymptotic behavior (as $t \to \infty$)
of solutions of the zero-input linear state equation

$$\dot{x}(t) = A(t)x(t), \quad x(t_0) = x_0 \qquad (1)$$

While bounds on solutions might be of interest for fixed $t_0$ and $x_0$, or for various initial
states at a fixed $t_0$, we focus on boundedness properties that hold regardless of the choice
of $t_0$ or $x_0$. In a similar fashion the concept we adopt relative to asymptotically-zero
solutions is independent of the choice of initial time. The reason is that these 'uniform
in $t_0$' concepts are most appropriate in relation to input-output stability properties of
linear state equations developed in Chapter 12.
It is natural to begin by characterizing stability of the linear state equation (1) in
terms of bounds on the transition matrix $\Phi(t, \tau)$ for A(t). This leads to a well-known
eigenvalue condition when A(t) is constant, but does not provide a generally useful
stability test for time-varying examples because of the difficulty of computing $\Phi(t, \tau)$.
Stability criteria for the time-varying case are addressed further in Chapters 7 and 8.

Uniform Stability
The first stability notion involves boundedness of solutions of (1). Because solutions are
linear in the initial state, it is convenient to express the bound as a linear function of the
norm of the initial state.

6.1 Definition The linear state equation (1) is called uniformly stable if there exists a
finite positive constant $\gamma$ such that for any $t_0$ and $x_0$ the corresponding solution satisfies

$$\|x(t)\| \le \gamma \|x_0\|, \quad t \ge t_0 \qquad (2)$$

Evaluation of (2) at $t = t_0$ shows that the constant $\gamma$ must satisfy $\gamma \ge 1$. The
adjective uniform in the definition refers precisely to the fact that $\gamma$ must not depend on
the choice of initial time, as illustrated in Figure 6.2. A 'nonuniform' stability concept
can be defined by permitting $\gamma$ to depend on the initial time, but this is not considered
here except to show that there is a difference via a standard example.

6.2 Figure Uniform stability implies the $\gamma$-bound is independent of $t_0$.

6.3 Example The scalar linear state equation

$$\dot{x}(t) = (4t \sin t - 2t)\,x(t), \quad x(t_0) = x_0$$

has the readily verifiable solution

$$x(t) = \exp\left(4 \sin t - 4t \cos t - t^2 - 4 \sin t_0 + 4t_0 \cos t_0 + t_0^2\right) x_0 \qquad (3)$$

It is easy to show that for fixed $t_0$ there is a $\gamma$ such that (3) is bounded by $\gamma |x_0|$ for all
$t \ge t_0$, since the $(-t^2)$ term dominates the exponent as t increases. However the state
equation is not uniformly stable. With fixed initial state $x_0$ consider a sequence of initial
times $t_0 = 2k\pi$, where k = 0, 1, ..., and the values of the respective solutions at times
$\pi$ units later:

$$x(2k\pi + \pi) = \exp\left[(4k + 1)\pi(4 - \pi)\right] x_0$$

Clearly there is no bound on the exponential factor that is independent of k. In other
words, a candidate bound must be ever larger as k, and the corresponding initial time,
increases.

□□□
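
The growth with k in Example 6.3 can be tabulated directly from the solution formula. A minimal sketch, assuming the exponent of (3) as reconstructed from integrating $4t \sin t - 2t$; the function name is illustrative.

```python
import math


def exponent(t, t0):
    """ln(x(t)/x0) for x' = (4 t sin t - 2 t) x with x(t0) = x0."""
    return (4 * math.sin(t) - 4 * t * math.cos(t) - t * t
            - 4 * math.sin(t0) + 4 * t0 * math.cos(t0) + t0 * t0)


rows = []
for k in range(5):
    t0 = 2 * k * math.pi
    growth = exponent(t0 + math.pi, t0)              # ln of x(t0 + pi) / x0
    closed_form = (4 * k + 1) * math.pi * (4 - math.pi)
    rows.append((k, growth, closed_form))
    print(k, growth, closed_form)
```

Each row shows the logarithm of the amplification over an interval of length $\pi$; it increases linearly in k, so no single $\gamma$ works for all initial times.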
We emphasize again that Definition 6.1 is stated in a form specific to linear state
equations. Equivalence to a more general definition of uniform stability that is used also
in the nonlinear case is the subject of Exercise 6.1.
The basic characterization of uniform stability is readily discernible from
Definition 6.1, though the proof requires a bit of finesse.

6.4 Theorem The linear state equation (1) is uniformly stable if and only if there exists
a finite positive constant $\gamma$ such that

$$\|\Phi(t, \tau)\| \le \gamma \qquad (4)$$

for all t, $\tau$ such that $t \ge \tau$.

Proof First suppose that such a $\gamma$ exists. Then for any $t_0$ and $x_0$ the solution of (1)
satisfies

$$\|x(t)\| = \|\Phi(t, t_0)x_0\| \le \|\Phi(t, t_0)\|\,\|x_0\| \le \gamma \|x_0\|, \quad t \ge t_0$$

and uniform stability is established.
For the reverse implication suppose that the state equation (1) is uniformly stable.
Then there is a finite $\gamma$ such that, for any $t_0$ and $x_0$, solutions satisfy

$$\|x(t)\| \le \gamma \|x_0\|, \quad t \ge t_0$$

Given any $t_0$ and $t_a \ge t_0$, let $x_a$ be such that

$$\|x_a\| = 1, \quad \|\Phi(t_a, t_0)x_a\| = \|\Phi(t_a, t_0)\|$$

(Such an $x_a$ exists by definition of the induced norm.) Then the initial state $x(t_0) = x_a$
yields a solution of (1) that at time $t_a$ satisfies

$$\|x(t_a)\| = \|\Phi(t_a, t_0)x_a\| = \|\Phi(t_a, t_0)\|\,\|x_a\| \le \gamma \|x_a\| \qquad (5)$$

Since $\|x_a\| = 1$, this shows that $\|\Phi(t_a, t_0)\| \le \gamma$. Because such an $x_a$ can be selected
for any $t_0$ and $t_a \ge t_0$, the proof is complete.

Uniform Exponential Stability


Next we consider a stability property for (1) that addresses both boundedness and
asymptotic behavior of solutions. It implies uniform stability, and imposes an additional
requirement that all solutions approach zero exponentially as $t \to \infty$.

6.5 Definition The linear state equation (1) is called uniformly exponentially stable if
there exist finite positive constants $\gamma$, $\lambda$ such that for any $t_0$ and $x_0$ the corresponding
solution satisfies

$$\|x(t)\| \le \gamma e^{-\lambda(t - t_0)} \|x_0\|, \quad t \ge t_0 \qquad (6)$$

Again $\gamma$ is no less than unity, and the adjective uniform refers to the fact that $\gamma$ and $\lambda$
are independent of $t_0$. This is illustrated in Figure 6.6. The property of uniform
exponential stability can be expressed in terms of an exponential bound on the transition
matrix. The proof is similar to that of Theorem 6.4, and so is left as Exercise 6.14.

6.6 Figure A decaying-exponential bound independent of $t_0$.

6.7 Theorem The linear state equation (1) is uniformly exponentially stable if and only
if there exist finite positive constants $\gamma$ and $\lambda$ such that

$$\|\Phi(t, \tau)\| \le \gamma e^{-\lambda(t - \tau)} \qquad (7)$$

for all t, $\tau$ such that $t \ge \tau$.

Uniform stability and uniform exponential stability are the only internal stability
concepts used in the sequel. Uniform exponential stability is the most important of the
two, and another theoretical characterization of uniform exponential stability for the
bounded-coefficient case will prove useful.

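6.8 Theorem Suppose there exists a finite positive constant $\alpha$ such that $\|A(t)\| \le \alpha$
for all t. Then the linear state equation (1) is uniformly exponentially stable if and only
if there exists a finite positive constant $\beta$ such that

$$\int_{\tau}^{t} \|\Phi(t, \sigma)\|\,d\sigma \le \beta \qquad (8)$$

for all t, $\tau$ such that $t \ge \tau$.

Proof If the state equation is uniformly exponentially stable, then by Theorem 6.7
there exist finite $\gamma$, $\lambda > 0$ such that

$$\|\Phi(t, \sigma)\| \le \gamma e^{-\lambda(t - \sigma)}$$

for all t, $\sigma$ such that $t \ge \sigma$. Then

$$\int_{\tau}^{t} \|\Phi(t, \sigma)\|\,d\sigma \le \int_{\tau}^{t} \gamma e^{-\lambda(t - \sigma)}\,d\sigma = \frac{\gamma}{\lambda}\left(1 - e^{-\lambda(t - \tau)}\right) \le \frac{\gamma}{\lambda}$$

for all t, $\tau$ such that $t \ge \tau$. Thus (8) is established with $\beta = \gamma/\lambda$.
Conversely suppose (8) holds. Basic calculus and the result of Exercise 3.2 permit
the representation

$$\Phi(t, \tau) = I - \int_{\tau}^{t} \frac{\partial}{\partial\sigma}\Phi(t, \sigma)\,d\sigma = I + \int_{\tau}^{t} \Phi(t, \sigma)A(\sigma)\,d\sigma$$

and thus

$$\|\Phi(t, \tau)\| \le 1 + \alpha \int_{\tau}^{t} \|\Phi(t, \sigma)\|\,d\sigma \le 1 + \alpha\beta \qquad (9)$$

for all t, $\tau$ such that $t \ge \tau$. In completing this proof the composition property of the
transition matrix is crucial. So long as $t \ge \tau$ we can write, cleverly,

$$\|\Phi(t, \tau)\|\,(t - \tau) = \int_{\tau}^{t} \|\Phi(t, \tau)\|\,d\sigma \le \int_{\tau}^{t} \|\Phi(t, \sigma)\|\,\|\Phi(\sigma, \tau)\|\,d\sigma \le \beta(1 + \alpha\beta)$$

Therefore letting $T = 2\beta(1 + \alpha\beta)$ and $t = \tau + T$ gives

$$\|\Phi(\tau + T, \tau)\| \le 1/2 \qquad (10)$$

for all $\tau$. Applying (9) and (10), the following inequalities on time intervals of the form
$[\tau + kT, \tau + (k+1)T)$, where $\tau$ is arbitrary, are transparent:

$$\|\Phi(t, \tau)\| \le 1 + \alpha\beta, \quad t \in [\tau, \tau + T)$$

$$\|\Phi(t, \tau)\| = \|\Phi(t, \tau + T)\Phi(\tau + T, \tau)\| \le \|\Phi(t, \tau + T)\|\,\|\Phi(\tau + T, \tau)\| \le \frac{1 + \alpha\beta}{2}, \quad t \in [\tau + T, \tau + 2T)$$

$$\|\Phi(t, \tau)\| = \|\Phi(t, \tau + 2T)\Phi(\tau + 2T, \tau + T)\Phi(\tau + T, \tau)\| \le \|\Phi(t, \tau + 2T)\|\,\|\Phi(\tau + 2T, \tau + T)\|\,\|\Phi(\tau + T, \tau)\| \le \frac{1 + \alpha\beta}{2^2}, \quad t \in [\tau + 2T, \tau + 3T)$$

Continuing in this fashion shows that, for any value of $\tau$,

$$\|\Phi(t, \tau)\| \le \frac{1 + \alpha\beta}{2^k}, \quad t \in [\tau + kT, \tau + (k+1)T) \qquad (11)$$

Finally choose $\lambda = -(1/T)\ln(1/2)$ and $\gamma = 2(1 + \alpha\beta)$. Figure 6.9 presents a plot of the
corresponding decaying exponential and the bound (11), from which it is clear that

$$\|\Phi(t, \tau)\| \le \gamma e^{-\lambda(t - \tau)}$$

for all t, $\tau$ such that $t \ge \tau$. Uniform exponential stability thus is a consequence of
Theorem 6.7.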
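
The constants built in this proof can be computed for a concrete case. The sketch below assumes the choices $T = 2\beta(1+\alpha\beta)$, $\lambda = (\ln 2)/T$ and $\gamma = 2(1+\alpha\beta)$, and applies them to the scalar equation $\dot{x}(t) = -x(t)$, for which $\alpha = 1$ and $\beta = 1$ are valid; the function name and the sampling grid are illustrative.

```python
import math


def decay_constants(alpha, beta):
    """Constants T, lambda, gamma as constructed in the proof of Theorem 6.8."""
    T = 2.0 * beta * (1.0 + alpha * beta)
    lam = math.log(2.0) / T               # equals -(1/T) ln(1/2)
    gamma = 2.0 * (1.0 + alpha * beta)
    return T, lam, gamma


# Scalar illustration: x' = -x, so |A| = 1, Phi(t, tau) = exp(-(t - tau)),
# and the integral in (8) equals 1 - exp(-(t - tau)) <= 1.
alpha, beta = 1.0, 1.0
T, lam, gamma = decay_constants(alpha, beta)

worst = 0.0
for i in range(0, 4001):
    s = 0.01 * i                           # elapsed time t - tau
    k = int(s // T)                        # s lies in [kT, (k+1)T)
    staircase = (1.0 + alpha * beta) / 2 ** k
    envelope = gamma * math.exp(-lam * s)
    assert math.exp(-s) <= staircase <= envelope
    worst = max(worst, staircase / envelope)
print(T, lam, gamma, worst)
```

The resulting exponential bound is far from tight for this example (the true decay rate is 1, while $\lambda = (\ln 2)/4$), which reflects that the proof aims at existence of a bound rather than a sharp estimate.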

6.9 Figure Bounds constructed in the proof of Theorem 6.8.

An alternate form for the uniform exponential stability condition in Theorem 6.8 is

$$\int_{-\infty}^{t} \|\Phi(t, \sigma)\|\,d\sigma \le \beta$$

for all t. For time-invariant linear state equations, where $\Phi(t, \sigma) = e^{A(t - \sigma)}$, an
integration-variable change, in either form of the condition, shows that uniform
exponential stability is equivalent to finiteness of

$$\int_{0}^{\infty} \|e^{At}\|\,dt \qquad (12)$$

The adjective 'uniform' is superfluous in the time-invariant case, and we will drop it in
clear contexts. Though exponential stability usually is called asymptotic stability when
discussing time-invariant linear state equations, we retain the term exponential stability.

Combining an explicit representation for $e^{At}$ presented in Chapter 5 with the
finiteness condition on (12) yields a better-known characterization of exponential
stability.

6.10 Theorem A linear state equation (1) with constant A(t) = A is exponentially
stable if and only if all eigenvalues of A have negative real parts.

Proof Suppose the eigenvalue condition holds. Then writing $e^{At}$ in the explicit
form in Chapter 5, where $\lambda_1, \ldots, \lambda_m$ are the distinct eigenvalues of A, gives

$$\int_{0}^{\infty} \|e^{At}\|\,dt = \int_{0}^{\infty} \Big\| \sum_{k=1}^{m} \sum_{j=1}^{\sigma_k} W_{kj} \frac{t^{j-1}}{(j-1)!} e^{\lambda_k t} \Big\|\,dt \le \sum_{k=1}^{m} \sum_{j=1}^{\sigma_k} \|W_{kj}\| \int_{0}^{\infty} \frac{t^{j-1}}{(j-1)!} \left|e^{\lambda_k t}\right|\,dt$$

Since $|e^{\lambda_k t}| = e^{\mathrm{Re}[\lambda_k] t}$, the bounds from Exercise 6.10, or an exercise in integration by
parts, shows that the right side is finite, and exponential stability follows.
If the negative-real-part eigenvalue condition on A fails, then appropriate
selection of an eigenvector of A as an initial state can be used to show that the linear
state equation is not exponentially stable. Suppose first that a real eigenvalue $\lambda$ is
nonnegative, and let p be an associated eigenvector. Then the power series
representation for the matrix exponential easily shows that

$$e^{At}p = e^{\lambda t}p$$

For the initial state $x_0 = p$, it is clear that the corresponding solution of (1), $x(t) = e^{\lambda t}p$,
does not go to zero as $t \to \infty$. Thus the state equation is not exponentially stable.
Now suppose that $\lambda = \sigma + i\omega$ is a complex eigenvalue of A with $\sigma \ge 0$. Again let
p be an eigenvector associated with $\lambda$, written

$$p = \mathrm{Re}[p] + i\,\mathrm{Im}[p]$$

Then

$$\|e^{At}p\| = |e^{\lambda t}|\,\|p\| = e^{\sigma t}\|p\| \ge \|p\|, \quad t \ge 0$$

and thus

$$e^{At}p = e^{At}\mathrm{Re}[p] + i\,e^{At}\mathrm{Im}[p]$$

does not approach zero as $t \to \infty$. Therefore at least one of the real initial states
$x_0 = \mathrm{Re}[p]$ or $x_0 = \mathrm{Im}[p]$ yields a solution that does not approach zero as $t \to \infty$.

This proof, with a bit of elaboration, shows also that

$$\lim_{t \to \infty} e^{At} = 0$$

is a necessary and sufficient condition for uniform exponential stability in the time-invariant
case. The corresponding statement is not true for time-varying linear state equations.
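
For a 2 x 2 constant coefficient matrix the eigenvalue test of Theorem 6.10 reduces to a few lines. This is a minimal sketch with hypothetical example matrices chosen here for illustration; for larger matrices a numerical eigenvalue routine would be used instead of the quadratic formula.

```python
import cmath


def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0


def is_exponentially_stable(A):
    """Eigenvalue test: every eigenvalue has negative real part."""
    return all(lam.real < 0.0 for lam in eigenvalues_2x2(A))


examples = {
    "damped oscillator": [[0.0, 1.0], [-2.0, -3.0]],     # eigenvalues -1, -2
    "undamped oscillator": [[0.0, 1.0], [-1.0, 0.0]],    # eigenvalues +i, -i
    "zero eigenvalue": [[-1.0, 0.0], [0.0, 0.0]],        # eigenvalues -1, 0
}
for name, A in examples.items():
    print(name, eigenvalues_2x2(A), is_exponentially_stable(A))
```

The second and third matrices have bounded solutions but fail the test, since an eigenvalue on the imaginary axis rules out exponential decay.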

6.11 Example Consider a scalar linear state equation (1) with

$$A(t) = \frac{-2t}{t^2 + 1}$$

A quick computation gives

$$\Phi(t, t_0) = \frac{t_0^2 + 1}{t^2 + 1}$$

and it is obvious that $\lim_{t \to \infty} \Phi(t, t_0) = 0$ for any $t_0$. However the state equation is not
uniformly exponentially stable, for suppose there exist positive $\gamma$ and $\lambda$ such that

$$|\Phi(t, \tau)| = \frac{\tau^2 + 1}{t^2 + 1} \le \gamma e^{-\lambda(t - \tau)}$$

for all t, $\tau$ such that $t \ge \tau$. Taking $\tau = 0$, this inequality implies

$$1 \le (t^2 + 1)\,\gamma e^{-\lambda t}, \quad t \ge 0$$

but L'Hospital's rule easily proves that the right side goes to zero as $t \to \infty$. This
contradiction shows that the condition for uniform exponential stability cannot be
satisfied.

Uniform Asymptotic Stability

Example 6.11 raises the interesting puzzle of what might be needed in addition to
$\lim_{t \to \infty} \Phi(t, t_0) = 0$ for uniform exponential stability in the time-varying case. The
answer turns out to be a uniformity condition, and perhaps the best way to explore this
issue is to start afresh with another stability definition.

6.12 Definition The linear state equation (1) is called uniformly asymptotically stable
if it is uniformly stable, and if given any positive constant $\delta$ there exists a positive T
such that for any $t_0$ and $x_0$ the corresponding solution satisfies

$$\|x(t)\| \le \delta \|x_0\|, \quad t \ge t_0 + T \qquad (15)$$

Note that the elapsed time T until the solution satisfies the bound (15) must be
independent of the initial time. (It is easy to verify that the state equation in Example
6.11 does not have this feature.) Some of the same tools used in proving Theorem 6.8
can be used to show that this 'elapsed-time uniformity' is the key to uniform exponential
stability.
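
The parenthetical remark about Example 6.11 can be checked with a short computation. The sketch below assumes the transition matrix $\Phi(t, t_0) = (t_0^2 + 1)/(t^2 + 1)$ of that example and restricts attention to $t_0 \ge 0$; the function names are illustrative.

```python
import math


def phi(t, t0):
    """Scalar transition matrix for A(t) = -2t / (t^2 + 1)."""
    return (t0 * t0 + 1.0) / (t * t + 1.0)


def elapsed_time(t0, delta):
    """Smallest T >= 0 with phi(t0 + T, t0) <= delta, for t0 >= 0, 0 < delta <= 1."""
    # Solve (t0 + T)^2 + 1 = (t0^2 + 1) / delta for T.
    return math.sqrt((t0 * t0 + 1.0) / delta - 1.0) - t0


delta = 0.5
for t0 in (0.0, 10.0, 100.0, 1000.0):
    T = elapsed_time(t0, delta)
    print(t0, T, phi(t0 + T, t0))
```

The required elapsed time grows roughly in proportion to $t_0$, so no single T serves all initial times.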

6.13 Theorem The linear state equation (1) is uniformly asymptotically stable if and
only if it is uniformly exponentially stable.

Proof Suppose that the state equation is uniformly exponentially stable, that is,
there exist finite, positive $\gamma$ and $\lambda$ such that $\|\Phi(t, \tau)\| \le \gamma e^{-\lambda(t - \tau)}$ whenever $t \ge \tau$. Then
the state equation clearly is uniformly stable. To show it is uniformly asymptotically
stable, for a given $\delta > 0$ pick T such that $e^{-\lambda T} \le \delta/\gamma$. Then for any $t_0$ and $x_0$, and
$t \ge t_0 + T$,

$$\|x(t)\| = \|\Phi(t, t_0)x_0\| \le \gamma e^{-\lambda(t - t_0)} \|x_0\| \le \gamma e^{-\lambda T} \|x_0\| \le \delta \|x_0\|, \quad t \ge t_0 + T$$

This demonstrates uniform asymptotic stability.
Conversely suppose the state equation is uniformly asymptotically stable.
Uniform stability is implied by definition, so there exists a positive $\gamma$ such that

$$\|\Phi(t, \tau)\| \le \gamma \qquad (16)$$

for all t, $\tau$ such that $t \ge \tau$. Select $\delta = 1/2$, and by Definition 6.12 let T be such that (15)
is satisfied. Then given a $t_0$, let $x_a$ be such that $\|x_a\| = 1$, and

$$\|\Phi(t_0 + T, t_0)x_a\| = \|\Phi(t_0 + T, t_0)\|$$

With the initial state $x(t_0) = x_a$, the solution of (1) satisfies

$$\|x(t_0 + T)\| = \|\Phi(t_0 + T, t_0)x_a\| = \|\Phi(t_0 + T, t_0)\|\,\|x_a\| \le (1/2)\|x_a\|$$

from which

$$\|\Phi(t_0 + T, t_0)\| \le 1/2 \qquad (17)$$

Of course such an $x_a$ exists for any given $t_0$, so the argument compels (17) for any $t_0$.
Now uniform exponential stability is implied by (16) and (17), exactly as in the proof of
Theorem 6.8.

Lyapunov Transformations
The stability concepts under discussion are properties of a particular linear state equation

that presumably represents a system of interest in terms of physically meaningful


variables. A basic question involves preservation of stability properties under a state
variable change. Since time-varying variable changes are permitted, simple scalar
examples can be generated to show that, for example, uniform stability can be created or
destroyed by variable change. To circumvent this difficulty we must limit attention to a
particular class of state variable changes.
6.14 Definition An n x n matrix P(t) that is continuously differentiable and invertible
at each t is called a Lyapunov transformation if there exist finite positive constants $\rho$
and $\eta$ such that for all t,

$$\|P(t)\| \le \rho, \quad |\det P(t)| \ge \eta \qquad (18)$$

A condition equivalent to (18) is existence of a finite positive constant $\rho$ such that
for all t,

$$\|P(t)\| \le \rho, \quad \|P^{-1}(t)\| \le \rho$$

Exercise 1.12 shows that the lower bound on $|\det P(t)|$ implies an upper bound on
$\|P^{-1}(t)\|$, and Exercise 1.20 provides the converse.
Reflecting on the effect of a state variable change on the transition matrix, a
detailed proof that Lyapunov transformations preserve stability properties is perhaps
belaboring the evident.

6.15 Theorem Suppose the n x n matrix P(t) is a Lyapunov transformation. Then the
linear state equation (1) is uniformly stable (respectively, uniformly exponentially
stable) if and only if the state equation

$$\dot{z}(t) = \left[P^{-1}(t)A(t)P(t) - P^{-1}(t)\dot{P}(t)\right] z(t) \qquad (19)$$

is uniformly stable (respectively, uniformly exponentially stable).

Proof The linear state equations (1) and (19) are related by the variable change
$z(t) = P^{-1}(t)x(t)$, as shown in Chapter 4, and we note that the properties required of a
Lyapunov transformation subsume those required of a variable change. Thus the relation
between the two transition matrices is

$$\Phi_z(t, \tau) = P^{-1}(t)\Phi_x(t, \tau)P(\tau)$$

Now suppose (1) is uniformly stable. Then there exists $\gamma$ such that $\|\Phi_x(t, \tau)\| \le \gamma$ for all
t, $\tau$ such that $t \ge \tau$, and, from (18) and Exercise 1.12,

$$\|\Phi_z(t, \tau)\| = \|P^{-1}(t)\Phi_x(t, \tau)P(\tau)\| \le \|P^{-1}(t)\|\,\|\Phi_x(t, \tau)\|\,\|P(\tau)\| \le \gamma\rho^n/\eta \qquad (20)$$

for all t, $\tau$ such that $t \ge \tau$. This shows that (19) is uniformly stable. An obviously
similar argument applied to

$$\Phi_x(t, \tau) = P(t)\Phi_z(t, \tau)P^{-1}(\tau)$$

shows that if (19) is uniformly stable, then (1) is uniformly stable. The corresponding
demonstrations for uniform exponential stability are similar.

The Floquet decomposition for T-periodic state equations, Property 5.11, provides
a general illustration. Since P(t) is the product of a transition matrix and a matrix
exponential, it is continuously differentiable with respect to t. Since P(t) is invertible,
by continuity arguments there exist $\rho$, $\eta > 0$ such that (18) holds for all t in any
interval of length T. By periodicity these bounds then hold for all t, and it follows that
P(t) is a Lyapunov transformation. It is easy to verify that $z(t) = P^{-1}(t)x(t)$ yields the
time-invariant linear state equation

$$\dot{z}(t) = Rz(t)$$

By this connection stability properties of the original T-periodic state equation are
equivalent to stability properties of a time-invariant linear state equation (though, it must
be noted, the time-invariant state equation in general is complex).
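6.16 Example Revisiting Example 5.12, the stability properties of

$$\dot{x}(t) = \begin{bmatrix} -1 & 0 \\ -\cos t & 0 \end{bmatrix} x(t) \qquad (21)$$

are equivalent to the stability properties of

$$\dot{z}(t) = \begin{bmatrix} -1 & 0 \\ -1/2 & 0 \end{bmatrix} z(t)$$

From the computation

$$e^{Rt} = \begin{bmatrix} e^{-t} & 0 \\ -1/2 + e^{-t}/2 & 1 \end{bmatrix} \qquad (22)$$

in Example 5.12, or from the solution of Exercise 6.12, it follows that (21) is uniformly
stable, but not uniformly exponentially stable.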

Additional Examples
6.17 Example The linearized state equation for the series bucket system in Example
5.17, or a series of any number of buckets, is exponentially stable. This intuitive
conclusion is mathematically justified by the fact that the diagonal entries of a triangular
A-matrix are the eigenvalues of A. These entries have the form $-1/(r_kc_k)$, and thus are
negative for positive constants $r_k$ and $c_k$. (We typically leave it understood that every
bucket has area and an outlet, that is, each $c_k$ and $r_k$ is positive.)
Exponential stability for the parallel bucket system in Example 5.17, or a parallel
connection of any number of buckets, is less transparent mathematically, though equally
plausible so long as each bucket has an outlet path to the floor.

6.18 Example We can use bucket systems to illustrate the difference between uniform
stability and exponential stability, though some care is required. For example the system
shown in Figure 6.19, with all parameters unity, leads to

$$\dot{x}(t) = \begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u(t)$$
$$y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(t) \qquad (23)$$

Figure 6.19 A disconnected bucket system.

This is a valid linearized model under our standing assumptions, for any specified
constant inflow $\tilde{u}(t) = \tilde{u} > 0$ and any specified constant depth $\tilde{x}_2(t) = \tilde{x}_2 > 0$.
Furthermore an easy calculation gives

$$e^{A(t - \tau)} = \begin{bmatrix} e^{-(t - \tau)} & 0 \\ 0 & 1 \end{bmatrix}$$

Thus uniform stability follows from Theorem 6.4, with $\gamma = 1$, but it is clear that
exponential stability does not hold.
The care required can be explained by attempting another example. For the bucket
system in Figure 6.20 we might too quickly write the linear state equation description

$$\dot{x}(t) = \begin{bmatrix} -1 & 0 \\ 1 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u(t)$$
$$y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(t) \qquad (24)$$

and conclude from

$$e^{A(t - \tau)} = \begin{bmatrix} e^{-(t - \tau)} & 0 \\ 1 - e^{-(t - \tau)} & 1 \end{bmatrix}$$

that the bucket system is uniformly stable but not exponentially stable. This is a correct
conclusion about the state equation (24). But the bucket formulation is flawed since the
system of Figure 6.20 cannot arise as a linearization about a constant nominal solution
with positive inflow. Specifically, there cannot be a constant nominal with $\tilde{u} > 0$.

Figure 6.20 A problematic bucket system.

6.21 Example The transition matrix for the linearized satellite state equation is shown
in Example 3.8. Clearly this state equation is unstable, with unbounded solutions.
However we emphasize again that the physical implication is not necessarily disastrous.

EXERCISES
Exercise 6.1 Show that uniform stability of the linear state equation

$$\dot{x}(t) = A(t)x(t), \quad x(t_0) = x_0$$

is equivalent to the following property. Given any positive constant $\varepsilon$ there exists a positive
constant $\delta$ such that, regardless of $t_0$, if $\|x_0\| \le \delta$, then the corresponding solution satisfies
$\|x(t)\| \le \varepsilon$, $t \ge t_0$.

Exercise 6.2

For what ranges of the real parameter a are the following scalar linear state

equations uniformly stable? Uniformly exponentially stable?


(a)

x(t) = at x(t) ,

ae
f

(b) x(t) =
e

+1

x(t)


Exercise 6.3 Determine if the linear state equation

[a(t)

x(t)
]

is uniformly exponentially stable for a (1) =

(ii)

(/) 0

(iii)

(iv)

-t

t <0

to

Exercise 6.4 Is the linear state equation

e'

c_I

x(t)

uniformly stable?

Exercise 6.5 Show that (perhaps despite initial impressions) the linear state equation
-3,

x(t)

=
is not

uniformly exponentially stable.

Exercise 6.6 Suppose there exists a finite constant $\alpha$ such that $\|A(t)\| \le \alpha$ for all $t$. Prove that
given a finite $\delta > 0$ there exists a finite $\gamma > 0$ such that

$$\|\Phi(t, \tau)\| \le \gamma$$

for all $t$, $\tau$ such that $|t - \tau| \le \delta$.

Exercise 6.7 If $A(t) = -A^T(t)$, show that the linear state equation

$$\dot{x}(t) = A(t)x(t)$$

is uniformly stable. Show also that $P(t) = \Phi(t, 0)$ is a Lyapunov transformation.


Exercise 6.8 Show that the linear state equation $\dot{x}(t) = A(t)x(t)$ is uniformly exponentially
stable if and only if the linear state equation $\dot{z}(t) = A^T(-t)z(t)$ is uniformly exponentially stable.
Hint: See Exercise 4.23.

Exercise 6.9 Suppose that $\Phi(t, \tau)$ is the transition matrix for $[A(t) - A^T(t)]/2$, and let
$P(t) = \Phi(t, 0)$. For the state equation $\dot{x}(t) = A(t)x(t)$, suppose the variable change
$z(t) = P^{-1}(t)x(t)$ is used to obtain $\dot{z}(t) = F(t)z(t)$. Compute a simple expression for $F(t)$, and
show that $F(t)$ is symmetric. Combine this with Exercise 6.7 to show that for stability
purposes only state equations with a symmetric coefficient matrix need be considered.

Exercise 6.10 If X is complex with

by

<0, show how to define a constant

such that

a decaying exponential, and show in particular that for any

nonnegative integer k,
J

t'ie"hdt

+1

Exercise 6.11 Consider the time-invariant linear state equation

$$\dot{x}(t) = FAx(t)$$

where $F$ is symmetric and positive definite, and $A$ is such that $A + A^T$ is negative definite. By
directly addressing the eigenvalues of $FA$, show that this state equation is exponentially stable.

Exercise 6.12 For a time-invariant linear state equation

$$\dot{x}(t) = Ax(t)$$

use techniques from the proof of Theorem 6.10 to derive a necessary condition and a sufficient
condition for uniform stability in terms of the eigenvalues of $A$. Illustrate the gap in your
conditions by examples with $n = 2$.

Exercise 6.13 Suppose the linear state equation $\dot{x}(t) = A(t)x(t)$ is uniformly stable. Then given
$x_0$ and $t_0$, show that the solution of

$$\dot{x}(t) = A(t)x(t) + f(t), \qquad x(t_0) = x_0$$

is bounded if there exists a finite constant $\eta$ such that

$$\int_{t_0}^{\infty} \|f(\sigma)\| \, d\sigma \le \eta$$

Give a simple example to show that if $f(t)$ is constant, then unbounded solutions can occur.

Exercise 6.14 Prove Theorem 6.7.

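Exercise 6.15 Show that the linear state equation $\dot{x}(t) = A(t)x(t)$ with $T$-periodic $A(t)$ is
uniformly exponentially stable if and only if $\lim_{t \to \infty} \Phi(t, t_0) = 0$ for every $t_0$.

Exercise 6.16 Suppose there exist finite constant $\alpha$ such that $\|A(t)\| \le \alpha$ for all $t$, and finite $\gamma$
such that

$$\int_{\tau}^{t} \|\Phi(t, \sigma)\|^2 \, d\sigma \le \gamma$$

for all $t$, $\tau$ with $t \ge \tau$. Show there exists a finite constant $\beta$ such that

$$\int_{\tau}^{t} \|\Phi(t, \sigma)\| \, d\sigma \le \beta$$

for all $t$, $\tau$ such that $t \ge \tau$.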

Exercise 6.17 Suppose there exists a finite constant $\alpha$ such that $\|A(t)\| \le \alpha$ for all $t$. Prove that
the linear state equation

$$\dot{x}(t) = A(t)x(t)$$

is uniformly exponentially stable if and only if there exists a finite constant $\beta$ such that

$$\int_{\tau}^{t} \|\Phi(\sigma, \tau)\| \, d\sigma \le \beta$$

for all $t$, $\tau$ such that $t \ge \tau$.

Exercise 6.18 Show that there exists a Lyapunov transformation $P(t)$ such that the linear state
equation $\dot{x}(t) = A(t)x(t)$ is transformed to $\dot{z}(t) = 0$ by the state variable change $z(t) = P^{-1}(t)x(t)$
if and only if there exists a finite constant $\gamma$ such that

$$\|\Phi(t, \tau)\| \le \gamma$$

for all $t$ and $\tau$.

NOTES
Note 6.1 There is a huge literature on stability theory for ordinary differential equations. The
terminology is not completely standard, and careful attention to definitions is important when
consulting different sources. For example we define uniform stability in a form specific to the
linear case. Stability definitions in the more general context of nonlinear state equations are cast
in terms of stability of an equilibrium state. Since zero always is an equilibrium state for a
zero-input linear state equation, this aspect can be suppressed. Also stability definitions for nonlinear
state equations are local in nature: bounds and asymptotic properties of solutions for initial states
sufficiently close to an equilibrium. In the linear case this restriction is superfluous. Books that
provide a broader look at the subjects we cover include

R. Bellman, Stability Theory of Differential Equations, McGraw-Hill, New York, 1953
W.A. Coppel, Stability and Asymptotic Behavior of Differential Equations, Heath, Boston, 1965
J.L. Willems, Stability Theory of Dynamical Systems, John Wiley, New York, 1970
C.J. Harris, J.F. Miles, Stability of Linear Systems, Academic Press, New York, 1980

Note 6.2 Tabular tests on the coefficients of a polynomial that are necessary and sufficient for
negative-real-part roots were developed in the late 1800's. The modern version is usually
called the Routh criterion or the Routh-Hurwitz criterion, and can be found in any elementary
control systems text. A detailed review is presented in Chapter 3 of

S. Barnett, Polynomials and Linear Control Systems, Marcel Dekker, New York, 1983

See also Chapter 7 of

W. Kaplan, Operational Methods for Linear Systems, Addison-Wesley, Reading, Massachusetts, 1962

More recently there has been extensive work on robust stability of time-invariant linear systems,
where the characteristic-polynomial coefficients are not precisely known. Consult

B.R. Barmish, New Tools for Robustness of Linear Systems, Macmillan, New York, 1994.

Note 6.3 Typically the definition of Lyapunov transformation includes a bound on $\|\dot{P}(t)\|$ for
all $t$. This additional condition preserves boundedness of $A(t)$ under state variable change, but is
not needed for preservation of stability properties. Thus the condition is missing from Definition
6.14.

7
LYAPUNOV STABILITY
CRITERIA

The origin of Lyapunov's so-called direct method for stability assessment is the notion
that total energy of an unforced, dissipative mechanical system decreases as the state of
the system evolves in time. Therefore the state vector approaches a constant value
corresponding to zero energy as time increases. Phrased more generally, stability
properties involve the growth properties of solutions of the state equation, and these
properties can be measured by a suitable (energy-like) scalar function of the state vector.
The problem is to find a suitable scalar function.

Introduction
To illustrate the basic idea we consider conditions that imply all solutions of the linear
state equation

$$\dot{x}(t) = A(t)x(t), \qquad x(t_0) = x_0 \qquad (1)$$

are such that $\|x(t)\|^2$ monotonically decreases as $t \to \infty$. For any solution $x(t)$ of (1),
the derivative of the scalar function

$$\|x(t)\|^2 = x^T(t)x(t) \qquad (2)$$

with respect to $t$ can be written as

$$\frac{d}{dt}\|x(t)\|^2 = \dot{x}^T(t)x(t) + x^T(t)\dot{x}(t) = x^T(t)\left[A^T(t) + A(t)\right]x(t) \qquad (3)$$

In this computation $\dot{x}(t)$ is replaced by $A(t)x(t)$ precisely because $x(t)$ is a solution of
(1). Suppose that the quadratic form on the right side of (3) is negative definite, that is,
suppose the matrix $A^T(t) + A(t)$ is negative definite at each $t$. Then, as shown in Figure
7.1, $\|x(t)\|^2$ decreases as $t$ increases. Further we can show that if this negative
definiteness does not asymptotically vanish, that is, if there is a constant $\nu > 0$ such that
$A^T(t) + A(t) \le -\nu I$ for all $t$, then $\|x(t)\|^2$ goes to zero as $t \to \infty$. Notice that the
transition matrix for $A(t)$ is not needed in this calculation, and growth properties of the
scalar function (2) depend on sign-definiteness properties of the quadratic form in (3).
Admittedly this calculation results in a restrictive sufficient condition (negative
definiteness of $A^T(t) + A(t)$) for a type of asymptotic stability. However more general
scalar functions than (2) can be considered.
7.1 Figure If $A^T(t) + A(t) < 0$ at each $t$, the solution norm decreases for $t \ge t_0$.
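
The monotone decrease of the solution norm can be observed numerically. The sketch below is an illustration under stated assumptions, not an example from the text: the coefficient matrix A(t) = [[-1, t], [-t, -1]] is a hypothetical choice for which A(t) + A^T(t) = -2I at every t, and a fixed-step Runge-Kutta integrator stands in for the exact solution. For this choice the squared norm should equal exp(-2(t - t0)) times its initial value.

```python
import math

def a_matrix(t):
    # Hypothetical example: A(t) + A(t)^T = -2 I for every t.
    return [[-1.0, t], [-t, -1.0]]

def f(t, x):
    """Right side of xdot = A(t) x."""
    a = a_matrix(t)
    return [a[0][0] * x[0] + a[0][1] * x[1],
            a[1][0] * x[0] + a[1][1] * x[1]]

def rk4_step(t, x, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(t, x)
    k2 = f(t + h / 2, [x[i] + h / 2 * k1[i] for i in range(2)])
    k3 = f(t + h / 2, [x[i] + h / 2 * k2[i] for i in range(2)])
    k4 = f(t + h, [x[i] + h * k3[i] for i in range(2)])
    return [x[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
            for i in range(2)]

def squared_norms(x0, t0, t1, steps):
    """Samples of ||x(t)||^2 along the numerical solution on [t0, t1]."""
    h = (t1 - t0) / steps
    t, x = t0, list(x0)
    out = [x[0] ** 2 + x[1] ** 2]
    for _ in range(steps):
        x = rk4_step(t, x, h)
        t += h
        out.append(x[0] ** 2 + x[1] ** 2)
    return out

v = squared_norms([1.0, 2.0], 0.0, 3.0, 3000)
print(v[0], v[-1], 5.0 * math.exp(-6.0))
```

Note that the transition matrix for A(t) is never computed; only the sign-definiteness of A(t) + A^T(t) is used to predict the behavior.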

Formalization of the above discussion involves somewhat intricate definitions of
time-dependent quadratic forms that are useful as scalar functions of the state vector of
(1) for stability purposes. Such quadratic forms are called quadratic Lyapunov
functions. They can be written as $x^TQ(t)x$, where $Q(t)$ is assumed to be symmetric and
continuously differentiable for all $t$. If $x(t)$ is any solution of (1) for $t \ge t_0$, then we are
interested in the behavior of the real quantity $x^T(t)Q(t)x(t)$ for $t \ge t_0$. This behavior
can be assessed by computing the time derivative using the product rule, and replacing
$\dot{x}(t)$ by $A(t)x(t)$ to obtain

$$\frac{d}{dt}\left[x^T(t)Q(t)x(t)\right] = x^T(t)\left[A^T(t)Q(t) + Q(t)A(t) + \dot{Q}(t)\right]x(t) \qquad (4)$$

To analyze stability properties, various bounds are required on quadratic
Lyapunov functions and on the quadratic forms (4) that arise as their derivatives along
solutions of (1). These bounds can be expressed in alternative ways. For example the
condition that there exists a positive constant $\eta$ such that

$$Q(t) \ge \eta I$$

for all $t$ is equivalent by definition to existence of a positive $\eta$ such that

$$x^TQ(t)x \ge \eta \|x\|^2$$

for all $t$ and all $n \times 1$ vectors $x$. Yet another way to write this is to require existence of a
symmetric, positive-definite constant matrix $M$ such that

$$x^TQ(t)x \ge x^TMx$$

for all $t$ and all $n \times 1$ vectors $x$. The choice is largely a matter of taste, and the most
economical form is adopted here.

Uniform Stability
We begin with a sufficient condition for uniform stability. The presentation style
throughout is to list requirements on $Q(t)$ so that the corresponding quadratic form can
be used to prove the desired stability property.

7.2 Theorem The linear state equation (1) is uniformly stable if there exists an $n \times n$
matrix $Q(t)$ that for all $t$ is symmetric, continuously differentiable, and such that

$$\eta I \le Q(t) \le \rho I \qquad (5)$$

$$A^T(t)Q(t) + Q(t)A(t) + \dot{Q}(t) \le 0 \qquad (6)$$

where $\eta$ and $\rho$ are finite positive constants.

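Proof Given any $t_0$ and $x_0$, the corresponding solution $x(t)$ of (1) is such that,
from (4) and (6),

$$x^T(t)Q(t)x(t) - x_0^TQ(t_0)x_0 = \int_{t_0}^{t} \frac{d}{d\sigma}\left[x^T(\sigma)Q(\sigma)x(\sigma)\right] d\sigma \le 0, \qquad t \ge t_0$$

Using the inequalities in (5) we obtain

$$x^T(t)Q(t)x(t) \le x_0^TQ(t_0)x_0 \le \rho \|x_0\|^2, \qquad t \ge t_0$$

and then

$$\eta \|x(t)\|^2 \le \rho \|x_0\|^2, \qquad t \ge t_0$$

Therefore

$$\|x(t)\| \le \sqrt{\rho/\eta}\, \|x_0\|, \qquad t \ge t_0 \qquad (7)$$

Since (7) holds for any $x_0$ and $t_0$, the state equation (1) is uniformly stable by definition.

□□□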
Typically it is profitable to use a quadratic Lyapunov function to obtain stability
conditions for a family of linear state equations, rather than a particular instance.

7.3 Example Consider the linear state equation

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -1 & -a(t) \end{bmatrix} x(t) \qquad (8)$$

where $a(t)$ is a continuous function defined for all $t$. Choose $Q(t) = I$, so that
$x^T(t)Q(t)x(t) = x^T(t)x(t) = \|x(t)\|^2$, as suggested at the beginning of this chapter.
Then (5) is satisfied by $\eta = \rho = 1$, and

$$A^T(t)Q(t) + Q(t)A(t) + \dot{Q}(t) = A^T(t) + A(t) = \begin{bmatrix} 0 & 0 \\ 0 & -2a(t) \end{bmatrix}$$

If $a(t) \ge 0$ for all $t$, then the hypotheses in Theorem 7.2 are satisfied. Therefore we
have proved (8) is uniformly stable if $a(t)$ is continuous and nonnegative for all $t$.
Perhaps it should be emphasized that a more sophisticated choice of $Q(t)$ could yield
uniform stability under weaker conditions on $a(t)$.

Uniform Exponential Stability


For uniform exponential stability Theorem 7.2 does not suffice: the choice $Q(t) = I$
proves that (8) with zero $a(t)$ is uniformly stable, but Example 5.9 shows this case is not
exponentially stable. The strengthening of conditions in the following result appears
slight at first glance, but this is deceptive. For example the strengthened conditions fail
to hold in Example 7.3, with $Q(t) = I$, for any choice of $a(t)$.

7.4 Theorem The linear state equation (1) is uniformly exponentially stable if there
exists an $n \times n$ matrix function $Q(t)$ that for all $t$ is symmetric, continuously
differentiable, and such that

$$\eta I \le Q(t) \le \rho I \qquad (9)$$

$$A^T(t)Q(t) + Q(t)A(t) + \dot{Q}(t) \le -\nu I \qquad (10)$$

where $\eta$, $\rho$ and $\nu$ are finite positive constants.

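Proof For any $t_0$, $x_0$, and corresponding solution $x(t)$ of the state equation, the
inequality (10) gives

$$\frac{d}{dt}\left[x^T(t)Q(t)x(t)\right] \le -\nu \|x(t)\|^2, \qquad t \ge t_0$$

Also from (9),

$$x^T(t)Q(t)x(t) \le \rho \|x(t)\|^2, \qquad t \ge t_0$$

so that

$$-\|x(t)\|^2 \le -\frac{1}{\rho}\, x^T(t)Q(t)x(t), \qquad t \ge t_0$$

Therefore

$$\frac{d}{dt}\left[x^T(t)Q(t)x(t)\right] \le -\frac{\nu}{\rho}\, x^T(t)Q(t)x(t), \qquad t \ge t_0$$

and this implies, after multiplication by the appropriate exponential integrating factor,
and integrating from $t_0$ to $t$,

$$x^T(t)Q(t)x(t) \le e^{-(\nu/\rho)(t - t_0)}\, x_0^TQ(t_0)x_0, \qquad t \ge t_0$$

Summoning (9) again,

$$\|x(t)\|^2 \le \frac{1}{\eta}\, x^T(t)Q(t)x(t) \le \frac{1}{\eta}\, e^{-(\nu/\rho)(t - t_0)}\, x_0^TQ(t_0)x_0, \qquad t \ge t_0$$

which in turn gives

$$\|x(t)\|^2 \le \frac{\rho}{\eta}\, e^{-(\nu/\rho)(t - t_0)}\, \|x_0\|^2, \qquad t \ge t_0 \qquad (12)$$

Noting that (12) holds for any $x_0$ and $t_0$, and taking the positive square root of both
sides, uniform exponential stability is established.

□□□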

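7.5 Example For the linear state equation

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -a(t) & -1 \end{bmatrix} x(t) \qquad (13)$$

we choose

$$Q(t) = \begin{bmatrix} 1 + 2a(t) & 1 \\ 1 & 2 \end{bmatrix}$$

and pursue conditions on $a(t)$ that guarantee uniform exponential stability via Theorem
7.4. A basic technical condition is that $a(t)$ be continuously differentiable, so that $Q(t)$
is continuously differentiable. For

$$Q(t) - \eta I = \begin{bmatrix} 1 + 2a(t) - \eta & 1 \\ 1 & 2 - \eta \end{bmatrix}$$

the positive-semidefiniteness conditions are (see Example 1.5)

$$1 + 2a(t) - \eta \ge 0, \qquad 2 - \eta \ge 0, \qquad \left[1 + 2a(t) - \eta\right](2 - \eta) - 1 \ge 0$$

Thus if $\eta$ is a small positive number and $a(t) \ge \eta$ for all $t$, then $Q(t) - \eta I \ge 0$ for all
$t$. That is, $Q(t) \ge \eta I$ for all $t$. In a similar way we consider $\rho I - Q(t)$, and conclude that
if $\rho$ is a large positive number and $a(t) \le (\rho - 2)/2$ for all $t$, then $Q(t) \le \rho I$.
Further calculation gives

$$A^T(t)Q(t) + Q(t)A(t) + \dot{Q}(t) + \nu I = \begin{bmatrix} 2\dot{a}(t) - 2a(t) + \nu & 0 \\ 0 & \nu - 2 \end{bmatrix}$$

If $\dot{a}(t) \le a(t) - \nu/2$ for all $t$, where $\nu$ is a small positive constant, then the last condition
in Theorem 7.4 is satisfied.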

In summarizing the results of an analysis of this type, it is not uncommon to
sacrifice some generality for simplicity in the conditions. However sacrifice is not
necessary in this example, and we can state the following, simple sufficient condition.
The linear state equation (13) is uniformly exponentially stable if, for all $t$, $a(t)$ is
continuously differentiable and there exists a (small) positive constant $\alpha$ such that

$$\alpha \le a(t) \le 1/\alpha$$

$$\dot{a}(t) \le a(t) - \alpha$$
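
A numerical spot check of the hypotheses of Theorem 7.4 can make an analysis of this type less error prone. The sketch below rests on assumptions introduced here: the state equation is taken to be of the form A(t) = [[0, 1], [-a(t), -1]] with Q(t) = [[1 + 2a(t), 1], [1, 2]] (one reading consistent with the conditions stated in the example), and the coefficient a(t) = 2 + sin t is a hypothetical choice that satisfies the summary condition with, for instance, alpha = 0.3. The code estimates eta, rho and nu on a time grid.

```python
import math

def a(t):
    """Hypothetical coefficient chosen for illustration."""
    return 2.0 + math.sin(t)

def a_dot(t):
    return math.cos(t)

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(p):
    return [[p[j][i] for j in range(2)] for i in range(2)]

def sym_eigs(m):
    """(min, max) eigenvalues of a symmetric 2x2 matrix."""
    mean = 0.5 * (m[0][0] + m[1][1])
    rad = math.hypot(0.5 * (m[0][0] - m[1][1]), m[0][1])
    return mean - rad, mean + rad

def lyapunov_data(t):
    A = [[0.0, 1.0], [-a(t), -1.0]]                  # assumed form of (13)
    Q = [[1.0 + 2.0 * a(t), 1.0], [1.0, 2.0]]        # assumed Q(t)
    Q_dot = [[2.0 * a_dot(t), 0.0], [0.0, 0.0]]
    AtQ = matmul(transpose(A), Q)
    QA = matmul(Q, A)
    S = [[AtQ[i][j] + QA[i][j] + Q_dot[i][j] for j in range(2)]
         for i in range(2)]
    return sym_eigs(Q), sym_eigs(S)

grid = [0.01 * k for k in range(2001)]               # t in [0, 20]
data = [lyapunov_data(t) for t in grid]
eta = min(q[0] for q, _ in data)     # lower bound on Q(t) over the grid
rho = max(q[1] for q, _ in data)     # upper bound on Q(t) over the grid
nu = -max(s[1] for _, s in data)     # decay margin in the derivative form
print(eta, rho, nu)
```

A grid check of this kind is evidence rather than proof, since the theorem requires the bounds to hold for all t; here periodicity of the chosen a(t) makes the grid representative.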

For $n = 2$ and constant $Q(t) = Q$, Theorem 7.4 admits a simple pictorial
representation. The condition (9) implies that $Q$ is positive definite, and therefore the
level curves of the real-valued function $x^TQx$ are ellipses in the $(x_1, x_2)$-plane. The
condition (10) implies that for any solution $x(t)$ of the state equation the value of
$x^T(t)Qx(t)$ is decreasing as $t$ increases. Thus a plot of the solution $x(t)$ on the
$(x_1, x_2)$-plane crosses smaller-value level curves as $t$ increases, as shown in Figure 7.6.
Under the same assumptions, a similar pictorial interpretation can be given for Theorem
7.2. Note that if $Q(t)$ is not constant, the level curves vary with $t$ and the picture is
much less informative.

7.6 Figure A solution $x(t)$ in relation to level curves for $x^TQx$.

Just in case it appears that stability of linear state equations is reasonably intuitive,
consider again the state equation (8) in Example 7.3 with a view to establishing uniform
exponential stability. A first guess is that the state equation is uniformly exponentially
stable if $a(t)$ is continuous and positive for all $t$, though suspicions might arise if
$a(t) \to 0$ as $t \to \infty$. These suspicions would be well founded, but what is more
surprising is that there are other obstructions to uniform exponential stability.

7.7 Example A particular linear state equation of the form considered in Example 7.3
is

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -1 & -(2 + e^{t}) \end{bmatrix} x(t) \qquad (16)$$

Here $a(t) \ge 2$ for all $t$, and we have uniform stability, but the state equation is not
uniformly exponentially stable. To see this, verify that a solution is

$$x(t) = \begin{bmatrix} 1 + e^{-t} \\ -e^{-t} \end{bmatrix}$$

Clearly this solution does not approach zero as $t \to \infty$.

□□□
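
The verification requested in Example 7.7 can also be done numerically. This is a small sketch, assuming the coefficient a(t) = 2 + exp(t) and the candidate solution x(t) = (1 + exp(-t), -exp(-t)), which is the reading of the garbled display that satisfies the differential equation. The function names are illustrative only. The code evaluates the residual of the state equation at sample times and the norm of the candidate solution.

```python
import math

def a(t):
    """Coefficient in the (2, 2) entry, taken as a(t) = 2 + e^t."""
    return 2.0 + math.exp(t)

def x(t):
    """Candidate solution."""
    return (1.0 + math.exp(-t), -math.exp(-t))

def x_dot(t):
    """Exact derivative of the candidate solution."""
    return (-math.exp(-t), math.exp(-t))

def residual(t):
    """Largest entry of |xdot - A(t) x| with A(t) = [[0, 1], [-1, -a(t)]]."""
    x1, x2 = x(t)
    d1, d2 = x_dot(t)
    r1 = d1 - x2
    r2 = d2 - (-x1 - a(t) * x2)
    return max(abs(r1), abs(r2))

times = [0.0, 0.5, 1.0, 5.0, 10.0]
worst = max(residual(t) for t in times)
norms = [math.hypot(*x(t)) for t in times]
print(worst, norms)
```

The residual is zero up to rounding, and the norm never drops below 1, so no exponentially decaying bound can hold for this solution.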
The stability criteria provided by the preceding theorems are sufficient conditions
that depend on skill in selecting an appropriate $Q(t)$. It is comforting to show that there
indeed exists a suitable Q (t) for a large class of uniformly exponentially stable linear
state equations. The dark side is that it can be roughly as hard to compute Q (t) as it is
to compute the transition matrix for A (t).

7.8 Theorem Suppose that the linear state equation (1) is uniformly exponentially
stable, and there exists a finite constant $\alpha$ such that $\|A(t)\| \le \alpha$ for all $t$. Then

$$Q(t) = \int_{t}^{\infty} \Phi^T(\sigma, t)\Phi(\sigma, t) \, d\sigma \qquad (17)$$

satisfies all the hypotheses of Theorem 7.4.

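Proof First we show that the integral converges for each $t$, so that $Q(t)$ is well
defined. Since the state equation is uniformly exponentially stable, there exist positive $\gamma$
and $\lambda$ such that

$$\|\Phi(t, t_0)\| \le \gamma e^{-\lambda(t - t_0)}$$

for all $t$, $t_0$ such that $t \ge t_0$. Thus

$$\left\| \int_{t}^{\infty} \Phi^T(\sigma, t)\Phi(\sigma, t) \, d\sigma \right\| \le \int_{t}^{\infty} \|\Phi^T(\sigma, t)\| \, \|\Phi(\sigma, t)\| \, d\sigma \le \int_{t}^{\infty} \gamma^2 e^{-2\lambda(\sigma - t)} \, d\sigma = \gamma^2/(2\lambda)$$

for all $t$. This calculation also defines $\rho$ in (9). Since $Q(t)$ clearly is symmetric and
continuously differentiable at each $t$, it remains only to show that there exist $\eta$, $\nu > 0$ as
in (9) and (10). To obtain $\nu$, differentiation of (17) gives

$$\dot{Q}(t) = -I + \int_{t}^{\infty} \left[ -A^T(t)\Phi^T(\sigma, t)\Phi(\sigma, t) - \Phi^T(\sigma, t)\Phi(\sigma, t)A(t) \right] d\sigma = -I - A^T(t)Q(t) - Q(t)A(t) \qquad (18)$$

That is

$$A^T(t)Q(t) + Q(t)A(t) + \dot{Q}(t) = -I$$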

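and clearly a valid choice for $\nu$ in (10) is $\nu = 1$. Finally it must be shown that there
exists a positive $\eta$ such that $Q(t) \ge \eta I$ for all $t$, and for this we set up an adroit
maneuver. A differentiation followed by application of Exercise 1.9 gives, for any $x$
and $t$,

$$\frac{d}{d\sigma}\left[x^T\Phi^T(\sigma, t)\Phi(\sigma, t)x\right] = x^T\Phi^T(\sigma, t)\left[A^T(\sigma) + A(\sigma)\right]\Phi(\sigma, t)x$$

$$\ge -\|A^T(\sigma) + A(\sigma)\| \, x^T\Phi^T(\sigma, t)\Phi(\sigma, t)x \ge -2\alpha \, x^T\Phi^T(\sigma, t)\Phi(\sigma, t)x$$

Using the fact that $\Phi(\sigma, t)$ approaches zero exponentially as $\sigma \to \infty$, we integrate both
sides to obtain

$$\int_{t}^{\infty} \frac{d}{d\sigma}\left[x^T\Phi^T(\sigma, t)\Phi(\sigma, t)x\right] d\sigma \ge -2\alpha \int_{t}^{\infty} x^T\Phi^T(\sigma, t)\Phi(\sigma, t)x \, d\sigma = -2\alpha \, x^TQ(t)x$$

Evaluating the integral gives

$$-x^Tx \ge -2\alpha \, x^TQ(t)x \qquad (19)$$

for all $t$. Thus with the choice $\eta = 1/(2\alpha)$ all hypotheses of Theorem 7.4 are satisfied.

□□□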
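
A scalar, time-invariant case makes the construction in Theorem 7.8 concrete. The following worked computation is an illustrative sketch only: the choice $A(t) = -a$ with a constant $a > 0$ is an assumption introduced here, not an example from the text.

```latex
\begin{align*}
\dot{x}(t) &= -a\,x(t), \qquad \Phi(\sigma, t) = e^{-a(\sigma - t)}, \qquad \|A(t)\| = a , \\
Q(t) &= \int_{t}^{\infty} e^{-2a(\sigma - t)}\, d\sigma = \frac{1}{2a} , \\
A^T(t)Q(t) + Q(t)A(t) + \dot{Q}(t) &= -\frac{a}{2a} - \frac{a}{2a} + 0 = -1 .
\end{align*}
```

In this case the bounds in the proof hold with equality: with $\gamma = 1$, $\lambda = a$ and $\alpha = a$, the upper bound $\gamma^2/(2\lambda)$ and the lower bound $1/(2\alpha)$ both equal $Q(t) = 1/(2a)$, and $\nu = 1$.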
Exercise 7.18 shows that in fact there is a large family of matrices Q(t) that can
be used to prove uniform exponential stability under the hypotheses of Theorem 7.4.

Instability
Quadratic Lyapunov functions also can be used to develop instability criteria of various

types. One example is the following result that, except for one value of t, does not
involve a sign-definiteness assumption on Q (t).

7.9 Theorem Suppose there exists an $n \times n$ matrix function $Q(t)$ that for all $t$ is
symmetric, continuously differentiable, and such that

$$\|Q(t)\| \le \rho \qquad (20)$$

$$A^T(t)Q(t) + Q(t)A(t) + \dot{Q}(t) \le -\nu I \qquad (21)$$

where $\rho$ and $\nu$ are finite positive constants. Also suppose there exists a $t_a$ such that
$Q(t_a)$ is not positive semidefinite. Then the linear state equation (1) is not uniformly
stable.

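Proof Suppose $x(t)$ is the solution of (1) with $t_0 = t_a$ and $x_0 = x_a$ such that
$x_a^TQ(t_a)x_a < 0$. Then, from (21),

$$x^T(t)Q(t)x(t) - x_0^TQ(t_0)x_0 = \int_{t_0}^{t} \frac{d}{d\sigma}\left[x^T(\sigma)Q(\sigma)x(\sigma)\right] d\sigma \le -\nu \int_{t_0}^{t} x^T(\sigma)x(\sigma) \, d\sigma \le 0, \qquad t \ge t_0$$

One consequence of this inequality, (20), and the choice of $x_0$ and $t_0$, is

$$-\rho \|x(t)\|^2 \le x^T(t)Q(t)x(t) \le x_0^TQ(t_0)x_0 < 0, \qquad t \ge t_0 \qquad (22)$$

and a further consequence is that

$$\nu \int_{t_0}^{t} x^T(\sigma)x(\sigma) \, d\sigma \le x_0^TQ(t_0)x_0 - x^T(t)Q(t)x(t) \le |x^T(t)Q(t)x(t)| + |x_0^TQ(t_0)x_0| \le 2\,|x^T(t)Q(t)x(t)|, \qquad t \ge t_0 \qquad (23)$$

Using (20) and (23) gives

$$\int_{t_0}^{t} x^T(\sigma)x(\sigma) \, d\sigma \le \frac{2\rho}{\nu} \|x(t)\|^2, \qquad t \ge t_0 \qquad (24)$$

The state equation can be shown to be not uniformly stable by proving that $x(t)$ is
unbounded. This we do by a contradiction argument. Suppose that there exists a finite $\gamma$
such that $\|x(t)\| \le \gamma$ for all $t \ge t_0$. Then (24) gives

$$\int_{t_0}^{t} x^T(\sigma)x(\sigma) \, d\sigma \le \frac{2\rho\gamma^2}{\nu}, \qquad t \ge t_0$$

and the integrand, which is a continuously-differentiable scalar function, must go to zero
as $t \to \infty$. Therefore $x(t)$ must also go to zero, and this implies that (22) is violated for
sufficiently large $t$. The contradiction proves that $x(t)$ cannot be a bounded solution.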

7.10 Example

Consider a linear state equation with

A(t)=

-a2(t)]

The choice

Q(t)=

[al(t)

(25)

?]

gives

Q(t)A(t)

AT(t)Q(t) + Q(t)

a1(t)
0

Suppose that 1(t) is continuously differentiable, and there exists a finite constant p
such that a1 (t) I p for all t. Further suppose there exists t0 such that
(ta) < 0, and
a positive constant v such that, for all t,
I

a2(t)v/2
Then it is easy to check that all assumptions of Theorem 7.9 are satisfied, so that under
these conditions on $a_1(t)$ and $a_2(t)$ the state equation is not uniformly stable. The
unkind might view this result as disappointing, since the obvious special case of constant
$A$ is not captured by the conditions on $a_1(t)$ and $a_2(t)$.

Time-Invariant Case

In the time-invariant case quadratic Lyapunov functions with constant $Q$ can be used to
connect Theorem 7.4 with the familiar eigenvalue condition for exponential stability. If
$Q$ is symmetric and positive definite, then (9) is satisfied automatically. However,
rather than specifying such a $Q$ and checking to see if a positive $\nu$ exists such that (10)
is satisfied, the approach can be reversed. Choose a positive definite matrix $M$, for
example $M = \nu I$, where $\nu > 0$. If there exists a symmetric, positive-definite $Q$ such that

$$QA + A^TQ = -M \qquad (26)$$

then all the hypotheses of Theorem 7.4 are satisfied. Therefore the associated linear state
equation

$$\dot{x}(t) = Ax(t), \qquad x(0) = x_0$$

is exponentially stable, and from Theorem 6.10 we conclude that all eigenvalues of $A$
have negative real parts. Conversely the eigenvalues of $A$ enter the existence question
for solutions of the Lyapunov equation (26).

7.11 Theorem Given an $n \times n$ matrix $A$, if $M$ and $Q$ are symmetric, positive-definite,
$n \times n$ matrices satisfying (26), then all eigenvalues of $A$ have negative real parts.
Conversely if all eigenvalues of $A$ have negative real parts, then for each symmetric
$n \times n$ matrix $M$ there exists a unique solution of (26) given by

$$Q = \int_{0}^{\infty} e^{A^Tt} M e^{At} \, dt \qquad (27)$$

Furthermore if $M$ is positive definite, then $Q$ is positive definite.

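Proof As remarked above, the first statement follows from Theorem 6.10. For the
converse, if all eigenvalues of $A$ have negative real parts, it is obvious that the integral
in (27) converges, so $Q$ is well defined. To show that $Q$ is a solution of (26), we
calculate

$$A^TQ + QA = \int_{0}^{\infty} A^T e^{A^Tt} M e^{At} \, dt + \int_{0}^{\infty} e^{A^Tt} M e^{At} A \, dt = \int_{0}^{\infty} \frac{d}{dt}\left[e^{A^Tt} M e^{At}\right] dt = \left. e^{A^Tt} M e^{At} \right|_{0}^{\infty} = -M$$

To prove this solution is unique, suppose $Q_a$ also is a solution. Then

$$(Q_a - Q)A + A^T(Q_a - Q) = 0 \qquad (28)$$

But this implies

$$e^{A^Tt}(Q_a - Q)Ae^{At} + e^{A^Tt}A^T(Q_a - Q)e^{At} = 0, \qquad t \ge 0$$

from which

$$\frac{d}{dt}\left[e^{A^Tt}(Q_a - Q)e^{At}\right] = 0, \qquad t \ge 0$$

Integrating both sides from $0$ to $\infty$ gives

$$0 = \left. e^{A^Tt}(Q_a - Q)e^{At} \right|_{0}^{\infty} = -(Q_a - Q)$$

That is, $Q_a = Q$.
Now suppose that $M$ is positive definite. Clearly $Q$ is symmetric. To show it is
positive definite simply note that for a nonzero $n \times 1$ vector $x$,

$$x^TQx = \int_{0}^{\infty} x^T e^{A^Tt} M e^{At} x \, dt > 0 \qquad (29)$$

since the integrand is a positive scalar function. (In detail, $e^{At}x \ne 0$ for $t \ge 0$, so
positive definiteness of $M$ shows that the integrand is positive for all $t \ge 0$.)

□□□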
Connections between the negative-real-part eigenvalue condition on A and the
Lyapunov equation (26) can be established under weaker assumptions on M. See
Exercise 7.14 and Note 7.2. Also (26) has solutions under weaker hypotheses on A,
though these results are not pursued.
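
For small examples the Lyapunov equation (26), taken in the form QA + A^TQ = -M, is just a linear algebraic system in the entries of Q. The sketch below is a hedged illustration for n = 2: the matrix A = [[0, 1], [-2, -3]] (eigenvalues -1 and -2) and M = I are hypothetical choices, and the helper names are not from the text. The three independent entries of the symmetric Q are found by exact rational elimination, and positive definiteness is checked through the leading principal minors.

```python
from fractions import Fraction

def solve_linear(mat, rhs):
    """Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(mat, rhs)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def lyapunov_2x2(a, m):
    """Symmetric solution Q of A^T Q + Q A = -M for 2x2 A and symmetric M."""
    (a11, a12), (a21, a22) = a
    (m11, m12), (_, m22) = m
    # Rows: the (1,1), (1,2) and (2,2) entries of A^T Q + Q A,
    # written in terms of the unknowns (q11, q12, q22).
    coeffs = [[2 * a11, 2 * a21, 0],
              [a12, a11 + a22, a21],
              [0, 2 * a12, 2 * a22]]
    q11, q12, q22 = solve_linear(coeffs, [-m11, -m12, -m22])
    return [[q11, q12], [q12, q22]]

A = [[0, 1], [-2, -3]]      # hypothetical example, eigenvalues -1 and -2
M = [[1, 0], [0, 1]]
Q = lyapunov_2x2(A, M)
det_q = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
print(Q, det_q)
```

Here Q has positive leading principal minors, in agreement with the last statement of Theorem 7.11. If A had an eigenvalue pair summing to zero the linear system would be singular and the pivot search would fail, which reflects the role of the eigenvalues of A in the existence question.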

EXERCISES
Exercise 7.1 For a linear state equation where $A(t) = -A^T(t)$, find a $Q(t)$ that demonstrates
uniform stability. Is there such a state equation for which you can find a $Q(t)$ that demonstrates
uniform exponential stability?

Exercise 7.2 State and prove a Lyapunov instability theorem that guarantees every nonzero
initial state yields an unbounded solution.
Exercise 7.3 Consider the time-invariant linear state equation

$$\dot{x}(t) = FAx(t)$$

where $F$ is an $n \times n$ symmetric, positive-definite matrix. If the $n \times n$ matrix $A$ is such that $A + A^T$ is
negative definite, use a clever $Q$ to show that the state equation is exponentially stable.

Exercise 7.4 For the time-invariant linear state equation

use Theorem 7.11 to derive a necessary and sufficient condition on


when a0 = 1.
Exercise 7.5 Using

Q(r)=Q=

1/2J

for exponential stability


find the weakest conditions on a (a') such that

-21
can

be shown to be uniformly stable.

Exercise 7.6 For a linear state equation with

A(t)=

-2]

consider the choice

Q(t)=

[aCt) 0]

Find the least restrictive conditions on a(t) so that uniform exponential stability can be
concluded. Does there exist an a (1) satisfying the conditions?

Exercise 7.7 For a linear state equation with

A(t)=
use

a1 (t) a2(t)

the choice

10
0
to

a1(f)

determine conditions on a (t) and a2(t) such that the state equation is uniformly stable.

Exercise 7.8 For a linear state equation with

A(t)-

_a2(t)]

use

Q(t)=

[ai(t)

?]

determine conditions on a 1(t) and a2(t) such that the state equation is uniformly stable. Do
there exist coefficients
(a') and a2(t) such that this Q (a') demonstrates uniform exponential
stability?
to

Exercise 7.9 For a linear state equation with

A(t)
use

-a(t)]


2a(t)+l

a(t)+1
a(t)
to derive sufficient conditions for uniform exponential stability.

Exercise 7.10 For a linear state equation with

A(t)=

[0,
a(t)}

use

Q(t)=

a (t)

to determine conditions on a (t) such that the state equation is uniformly stable.

Exercise 7.11 Show that all eigenvalues of the matrix $A$ have real parts less than $-\mu < 0$ if and
only if for every symmetric, positive-definite $M$ there exists a unique, symmetric, positive-definite
$Q$ such that

$$A^TQ + QA + 2\mu Q = -M$$
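Exercise 7.12 Suppose that for given constant $n \times n$ matrices $A$ and $M$ there exists a constant,
$n \times n$ matrix $Q$ that satisfies

$$A^TQ + QA = -M$$

Show that for all $t \ge 0$,

$$Q = e^{A^Tt} Q e^{At} + \int_{0}^{t} e^{A^T\sigma} M e^{A\sigma} \, d\sigma$$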

Exercise 7.13 For a given constant, $n \times n$ matrix $A$, suppose $M$ and $Q$ are symmetric, positive
definite, $n \times n$ matrices such that

$$QA + A^TQ = -M$$

Using the (in general complex) eigenvectors of $A$ in a clever way, show that all eigenvalues of $A$
have negative real parts.

Exercise 7.14 Suppose $Q$ and $M$ are symmetric, positive-semidefinite, $n \times n$ matrices satisfying

$$QA + A^TQ = -M$$

where $A$ is a given $n \times n$ matrix. Suppose also that for any $n \times 1$ (complex) vector $z$,

$$z^H e^{A^Tt} M e^{At} z = 0, \qquad t \ge 0$$

implies

$$\lim_{t \to \infty} e^{At} z = 0$$

Show that all eigenvalues of $A$ have negative real parts. Hint: Use contradiction, working with an
offending eigenvalue and corresponding eigenvector.

Exercise 7.15 Develop a sufficient condition for existence of a unique solution and an explicit
solution formula for the linear equation

$$FQ + QA = -M$$

where $F$, $A$, and $M$ are specified, constant $n \times n$ matrices.

Exercise 7.16 Suppose the ,z x n matrix A has negative-real-part eigenvalues and M is an n x ii,
symmetric, positive-definite matrix. Prove that if Q satisfies
QA + ATQ = M
then
max

Ot<oo

Hint: At any t 0 use a particular ti x I vector . and the Rayleigh-Ritz inequality for
S

Exercise 7.17 Suppose that all eigenvalues of A have real parts less than ji < 0. Show that for
any e satisfying 0 <e < pt,
+l.tE) e

t0

where Q is the unique solution of

ATQ QA + 2(uie)Q = I

Hint: Use Theorem 7.11 to conclude

di

= 5 et
Then show that for any n x I vector x and any I 0,

dcy 2(IIA II

Exercise 7.18 State and prove a generalized version of Theorem 7.8 using

$$Q(t) = \int_{t}^{\infty} \Phi^T(\sigma, t) P(\sigma) \Phi(\sigma, t) \, d\sigma$$

under appropriate assumptions on the $n \times n$ matrix $P(\sigma)$.

Exercise 7.19 For the linear state equation with


'I

tO

A(t)=
l
o

use a diagonal Q (r)

to

'

t<0

prove uniform exponential stability. On the other hand, show that

= AT(t)x(,) is unstable. (This continues a topic raised in Exercises 3.5 and 3.6.)
Exercise 7.20 Given the linear state equation $\dot{x}(t) = A(t)x(t)$, suppose there exists a real
function $v(t, x)$ that is continuous with respect to $t$ and $x$, and that satisfies the following
conditions.
(a) There exist continuous, strictly increasing real functions $\alpha(\cdot)$ and $\beta(\cdot)$ such that
$\alpha(0) = \beta(0) = 0$, and

$$\alpha(\|x\|) \le v(t, x) \le \beta(\|x\|)$$

for all $t$ and all $x$.
(b) If $x(t)$ is any solution of the state equation, then the time function $v(t, x(t))$ is nonincreasing.
Prove that the state equation is uniformly stable. (This shows that attention need not be restricted
to quadratic Lyapunov functions, and smoothness assumptions can be weakened.) Hint: Use the
characterization of uniform stability in Exercise 6.1.

Exercise 7.21 If the state equation $\dot{x}(t) = A(t)x(t)$ is uniformly stable, prove that there exists a
function $v(t, x)$ that has the properties listed in Exercise 7.20. Hint: Writing the solution of the
state equation with $x(t_0) = x_0$ as $x(t; x_0, t_0)$, let

$$v(t, x) = \sup_{\sigma \ge 0} \|x(t + \sigma; x, t)\|$$

where supremum denotes the least upper bound.

NOTES
Note 7.1 The Lyapunov method is a powerful tool in the setting of nonlinear state equations as
well. Scalar energy-like functions of the state more general than quadratic forms are used, and

this requires general definitions of concepts such as positive definiteness. Standard, early
references are

R.E. Kalman, J.E. Bertram, "Control system analysis and design via the "Second Method" of
Lyapunov, Part I; Continuous-time systems," Transactions of the ASME, Series D: Journal of
Basic Engineering, Vol. 82, pp. 371 393, 1960
W. Hahn, Stability of Motion, Springer-Verlag, New York, 1967

The subject also is treated in many introductory texts in nonlinear systems. For example,
H.K. Khalil, Nonlinear Systems, Macmillan, New York, 1992

M. Vidyasagar, Nonlinear Systems Analysis, Second Edition, Prentice Hall, Englewood Cliffs,
New Jersey, 1993


Note 7.2 The conditions

$$0 < \eta I \le Q(t) \le \rho I$$

$$A^T(t)Q(t) + Q(t)A(t) + \dot{Q}(t) \le -\nu I < 0$$

for uniform exponential stability can be weakened in various ways. Some of the more general
criteria involve concepts such as controllability and observability that are discussed in Chapter 9.
Early results can be found in

B.D.O. Anderson, J.B. Moore, "New results in linear system stability," SIAM Journal on Control,
Vol. 7, No. 3, pp. 398-414, 1969

B.D.O. Anderson, "Exponential stability of linear equations arising in adaptive identification,"
IEEE Transactions on Automatic Control, Vol. 22, No. 1, pp. 83-88, 1977

Further weakening of the conditions can be made by replacing controllability/observability
hypotheses by stabilizability/detectability hypotheses. See

R. Ravi, A.M. Pascoal, P.P. Khargonekar, "Normalized coprime factorizations and the graph
metric for linear time-varying systems," Systems & Control Letters, Vol. 18, No. 6, pp. 455-465,
1992

In the time-invariant case see Exercise 9.9 for a sample result that involves controllability and
observability. Exercise 7.14 indicates the weaker hypotheses that can be used.

8
ADDITIONAL STABILITY
CRITERIA

In addition to the Lyapunov stability criteria in Chapter 7, other types of stability
conditions often are useful. Typically these are sufficient conditions that are proved by
application of the Lyapunov stability theorems, or the Gronwall-Bellman inequality
(Lemma 3.2 or Exercise 3.7), though sometimes either technique can be used, and
sometimes both are used in the same proof.

Eigenvalue Conditions
At first it might be thought that the pointwise-in-time eigenvalues of $A(t)$ could be used
to characterize internal stability properties of a linear state equation

$$\dot{x}(t) = A(t)x(t), \qquad x(t_0) = x_0 \qquad (1)$$

but this is not generally true. One example is provided by Exercise 4.16, and in case the
unboundedness of $A(t)$ in that example is suspected as the difficulty, we exhibit a
well-known example with bounded $A(t)$.
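8.1 Example For the linear state equation (1) with

$$A(t) = \begin{bmatrix} -1 + \alpha \cos^2 t & 1 - \alpha \sin t \cos t \\ -1 - \alpha \sin t \cos t & -1 + \alpha \sin^2 t \end{bmatrix} \qquad (2)$$

where $\alpha$ is a positive constant, the pointwise eigenvalues are constants, given by

$$\lambda(t) = \frac{\alpha - 2 \pm \sqrt{\alpha^2 - 4}}{2}$$

It is not difficult to verify that

$$\Phi(t, 0) = \begin{bmatrix} e^{(\alpha - 1)t} \cos t & e^{-t} \sin t \\ -e^{(\alpha - 1)t} \sin t & e^{-t} \cos t \end{bmatrix}$$

Thus while the pointwise eigenvalues of $A(t)$ have negative real parts if $0 < \alpha < 2$, the
state equation has unbounded solutions if $\alpha > 1$.

□□□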
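
The two claims in Example 8.1 can be checked side by side. The sketch below assumes the coefficient matrix A(t) = [[-1 + α cos²t, 1 - α sin t cos t], [-1 - α sin t cos t, -1 + α sin²t]] and the first column of the transition matrix, x(t) = exp((α - 1)t)(cos t, -sin t), which is the reading of the garbled displays that satisfies the state equation. The value α = 1.5 and the function names are illustrative choices. The code computes the pointwise eigenvalues, the residual of the state equation along x(t), and the growth of the solution norm.

```python
import cmath
import math

ALPHA = 1.5   # illustrative value with 1 < alpha < 2

def a_matrix(t, alpha=ALPHA):
    s, c = math.sin(t), math.cos(t)
    return [[-1 + alpha * c * c, 1 - alpha * s * c],
            [-1 - alpha * s * c, -1 + alpha * s * s]]

def pointwise_eigs(t, alpha=ALPHA):
    """Eigenvalues of the frozen-time matrix A(t) from trace and determinant."""
    (a11, a12), (a21, a22) = a_matrix(t, alpha)
    tr = a11 + a22
    det = a11 * a22 - a12 * a21
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

def x(t, alpha=ALPHA):
    """Candidate solution: first column of Phi(t, 0)."""
    g = math.exp((alpha - 1) * t)
    return [g * math.cos(t), -g * math.sin(t)]

def x_dot(t, alpha=ALPHA):
    g = math.exp((alpha - 1) * t)
    return [g * ((alpha - 1) * math.cos(t) - math.sin(t)),
            g * (-(alpha - 1) * math.sin(t) - math.cos(t))]

def residual(t, alpha=ALPHA):
    """Largest entry of |xdot - A(t) x| along the candidate solution."""
    a, xt, xd = a_matrix(t, alpha), x(t, alpha), x_dot(t, alpha)
    return max(abs(xd[i] - (a[i][0] * xt[0] + a[i][1] * xt[1]))
               for i in range(2))

times = [0.0, 0.7, 2.0, 5.0, 10.0]
max_real_part = max(max(e.real for e in pointwise_eigs(t)) for t in times)
# Residual scaled by the solution size, so large t is judged fairly.
worst_residual = max(residual(t) / math.exp((ALPHA - 1) * t) for t in times)
growth = math.hypot(*x(10.0)) / math.hypot(*x(0.0))
print(max_real_part, worst_residual, growth)
```

With α = 1.5 the frozen-time eigenvalues have real part -0.25 at every t, yet the solution norm grows like exp(0.5 t).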
Despite such examples the eigenvalue idea is not completely daft. At the end of
this chapter we show, via a rather complicated Lyapunov argument, that for slowly
time-varying linear state equations uniform exponential stability is implied by
negative-real-part eigenvalues of $A(t)$. Before that a number of simpler eigenvalue conditions
(on $A(t) + A^T(t)$, not $A(t)$) and perturbation results are discussed, the first of which is a
straightforward application of the Rayleigh-Ritz inequality reviewed in Chapter 1.

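8.2 Theorem  For the linear state equation (1), denote the largest and smallest
pointwise eigenvalues of $A(t) + A^T(t)$ by $\lambda_{\max}(t)$ and $\lambda_{\min}(t)$. Then for any $x_0$ and $t_0$
the solution of (1) satisfies

$$\|x_0\|\, e^{\frac{1}{2}\int_{t_0}^{t} \lambda_{\min}(\sigma)\,d\sigma} \le \|x(t)\| \le \|x_0\|\, e^{\frac{1}{2}\int_{t_0}^{t} \lambda_{\max}(\sigma)\,d\sigma}, \quad t \ge t_0 \tag{3}$$

Proof  First note that since the eigenvalues of a matrix are continuous functions of
the entries of the matrix, and the entries of $A(t) + A^T(t)$ are continuous functions of t,
the pointwise eigenvalues $\lambda_{\min}(t)$ and $\lambda_{\max}(t)$ are continuous functions of t. Thus the
integrals in (3) are well defined. Suppose x(t) is a solution of the state equation
corresponding to a given $t_0$ and nonzero $x_0$. Using

$$\frac{d}{dt}\|x(t)\|^2 = \frac{d}{dt}\left[x^T(t)x(t)\right] = x^T(t)\left[A^T(t) + A(t)\right]x(t)$$

the Rayleigh-Ritz inequality gives

$$\lambda_{\min}(t)\,\|x(t)\|^2 \le \frac{d}{dt}\|x(t)\|^2 \le \lambda_{\max}(t)\,\|x(t)\|^2, \quad t \ge t_0$$

Dividing through by $\|x(t)\|^2$, which is positive at each t, and integrating from $t_0$ to
any $t \ge t_0$ yields

$$\int_{t_0}^{t}\lambda_{\min}(\sigma)\,d\sigma \le \ln\|x(t)\|^2 - \ln\|x_0\|^2 \le \int_{t_0}^{t}\lambda_{\max}(\sigma)\,d\sigma, \quad t \ge t_0$$

Exponentiation followed by taking the nonnegative square root gives (3).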
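
The two-sided bound (3) is easy to test numerically. The sketch below is an illustration only: the matrix `A(t)`, the interval, and the step count are hypothetical choices, the solution is computed with a classical Runge-Kutta step, and the integrals of the extreme eigenvalues are accumulated with the trapezoid rule.

```python
import numpy as np

def A(t):
    # Hypothetical time-varying coefficient matrix chosen for illustration.
    return np.array([[-1.0 + 0.5 * np.cos(t), 2.0],
                     [-2.0, -1.0 - 0.5 * np.cos(t)]])

def rk4_step(t, x, h):
    """One classical Runge-Kutta step for x' = A(t) x."""
    k1 = A(t) @ x
    k2 = A(t + h / 2) @ (x + h / 2 * k1)
    k3 = A(t + h / 2) @ (x + h / 2 * k2)
    k4 = A(t + h) @ (x + h * k3)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def check_bounds(x0, t0=0.0, tf=5.0, steps=5000):
    """Return True if ||x(t)|| stays inside the bounds (3) at every grid point."""
    h = (tf - t0) / steps
    t, x = t0, np.array(x0, dtype=float)
    norm0 = np.linalg.norm(x)
    int_min = int_max = 0.0  # running integrals of lambda_min and lambda_max
    ok = True
    for _ in range(steps):
        lam_a = np.linalg.eigvalsh(A(t) + A(t).T)          # ascending order
        lam_b = np.linalg.eigvalsh(A(t + h) + A(t + h).T)
        int_min += h / 2 * (lam_a[0] + lam_b[0])           # trapezoid rule
        int_max += h / 2 * (lam_a[-1] + lam_b[-1])
        x = rk4_step(t, x, h)
        t += h
        lower = norm0 * np.exp(0.5 * int_min)
        upper = norm0 * np.exp(0.5 * int_max)
        nx = np.linalg.norm(x)
        # Small relative slack absorbs discretization error.
        ok = ok and (lower * (1 - 1e-6) <= nx <= upper * (1 + 1e-6))
    return bool(ok)
```

A call such as `check_bounds([1.0, 1.0])` exercises both inequalities along the computed trajectory.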

□□□
Theorem 8.2 leads to easy proofs of some simple stability criteria based on the
eigenvalues of $A(t) + A^T(t)$.

8.3 Corollary  The linear state equation (1) is uniformly stable if there exists a finite
constant $\gamma$ such that the largest pointwise eigenvalue of $A(t) + A^T(t)$ satisfies

$$\int_{\tau}^{t}\lambda_{\max}(\sigma)\,d\sigma \le \gamma \tag{4}$$

for all t, $\tau$ such that $t \ge \tau$.

8.4 Corollary  The linear state equation (1) is uniformly exponentially stable if there
exist finite, positive constants $\gamma$ and $\lambda$ such that the largest pointwise eigenvalue of
$A(t) + A^T(t)$ satisfies

$$\int_{\tau}^{t}\lambda_{\max}(\sigma)\,d\sigma \le -\lambda(t - \tau) + \gamma \tag{5}$$

for all t, $\tau$ such that $t \ge \tau$.

These criteria are quite conservative in the sense that many uniformly stable, or
uniformly exponentially stable, linear state equations do not satisfy the respective
conditions (4) and (5).
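
A small worked case may make the conservatism concrete. The constant matrix below is a hypothetical choice, not taken from the text: it is exponentially stable, yet it fails both (4) and (5).

```latex
% Hypothetical constant example: A is stable but A + A^T is indefinite.
A = \begin{bmatrix} -1 & 4 \\ 0 & -1 \end{bmatrix}, \qquad
e^{At} = e^{-t}\begin{bmatrix} 1 & 4t \\ 0 & 1 \end{bmatrix} \to 0
\quad\text{as } t \to \infty
% The symmetric part has eigenvalues -2 \pm 4:
A + A^T = \begin{bmatrix} -2 & 4 \\ 4 & -2 \end{bmatrix}, \qquad
\lambda_{\max} = 2, \quad \lambda_{\min} = -6
% Hence the integral in (4) and (5) grows without bound:
\int_{\tau}^{t} \lambda_{\max}\, d\sigma = 2\,(t - \tau)
```

Since $2(t - \tau)$ is unbounded in $t - \tau$, no finite $\gamma$ satisfies (4), and (5) fails as well, although every solution decays exponentially.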

Perturbation Results
Another approach is to consider state equations that are close, in some sense, to a state
equation that has a particular stability property. While explicit, tight bounds sometimes
are of interest, the focus here is on simple calculations that establish the desired
property. We discuss an additive perturbation F(t) to an A(t) for which stability
properties are presumed known, and require that F(t) be small in a suitable way.

8.5 Theorem  Suppose the linear state equation (1) is uniformly stable. Then the linear
state equation

$$\dot{z}(t) = [A(t) + F(t)]z(t) \tag{6}$$

is uniformly stable if there exists a finite constant $\beta$ such that for all $\tau$

$$\int_{\tau}^{\infty}\|F(\sigma)\|\,d\sigma \le \beta \tag{7}$$

Proof  For any $t_0$ and $z_0$ the solution of (6) satisfies

$$z(t) = \Phi_A(t, t_0)z_0 + \int_{t_0}^{t}\Phi_A(t, \sigma)F(\sigma)z(\sigma)\,d\sigma$$

where, of course, $\Phi_A(t, \tau)$ denotes the transition matrix for A(t). By uniform stability
of (1) there exists a constant $\gamma$ such that $\|\Phi_A(t, \tau)\| \le \gamma$ for all t, $\tau$ such that $t \ge \tau$.
Therefore, taking norms,

$$\|z(t)\| \le \gamma\|z_0\| + \int_{t_0}^{t}\gamma\|F(\sigma)\|\,\|z(\sigma)\|\,d\sigma, \quad t \ge t_0$$

Applying the Gronwall-Bellman inequality (Lemma 3.2) gives

$$\|z(t)\| \le \gamma\|z_0\|\, e^{\int_{t_0}^{t}\gamma\|F(\sigma)\|\,d\sigma}, \quad t \ge t_0$$

Then the bound (7) yields

$$\|z(t)\| \le \gamma e^{\gamma\beta}\|z_0\|, \quad t \ge t_0$$

and uniform stability of (6) is established since this same bound can be obtained for any
value of $t_0$.
□□□
8.6 Theorem  Suppose the linear state equation (1) is uniformly exponentially stable
and there exists a finite constant $\alpha$ such that $\|A(t)\| \le \alpha$ for all t. Then there exists a
positive constant $\beta$ such that the linear state equation

$$\dot{z}(t) = [A(t) + F(t)]z(t) \tag{8}$$

is uniformly exponentially stable if $\|F(t)\| \le \beta$ for all t.

Proof  Since (1) is uniformly exponentially stable and A(t) is bounded, by Theorem 7.8

$$Q(t) = \int_{t}^{\infty}\Phi_A^T(\sigma, t)\Phi_A(\sigma, t)\,d\sigma \tag{9}$$

is such that all the hypotheses of Theorem 7.4 are satisfied for (1). Next we show that
Q(t) also satisfies all the hypotheses of Theorem 7.4 for the perturbed linear state
equation (8). A quick check of the required properties reveals that it only remains to
show existence of a positive constant $\nu$ such that, for all t,

$$[A(t) + F(t)]^TQ(t) + Q(t)[A(t) + F(t)] + \dot{Q}(t) \le -\nu I$$

By calculation of $\dot{Q}(t)$ from (9), this condition can be rewritten as

$$F^T(t)Q(t) + Q(t)F(t) \le (1 - \nu)I \tag{10}$$

for all t. Denoting the bound on $\|Q(t)\|$ by $\rho$ and choosing $\beta = 1/(4\rho)$ gives

$$\|F^T(t)Q(t) + Q(t)F(t)\| \le 2\|F(t)\|\,\|Q(t)\| \le 1/2$$

for all t, and thus (10) is satisfied with $\nu = 1/2$.
□□□
The different types of perturbations that preserve the different stability properties
in Theorems 8.5 and 8.6 are significant. For example the scalar state equation with A(t)
zero is uniformly stable, though a perturbation $F(t) = \beta$, for any positive constant $\beta$, no
matter how small, clearly yields unbounded solutions. See also Exercise 8.6 and Note 8.3.
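
The scalar case just mentioned can be written out in a few lines. The second perturbation, $F(t) = \beta e^{-t}$, is a hypothetical contrast added here for illustration.

```latex
% Constant perturbation of the scalar equation with A(t) = 0:
\dot{z}(t) = \beta z(t), \qquad z(t) = e^{\beta (t - t_0)} z_0,
\qquad \int_{\tau}^{\infty} |\beta| \, d\sigma = \infty
% so the integrability condition of Theorem 8.5 fails and z(t) is unbounded.
%
% Integrable perturbation F(t) = \beta e^{-t} (illustrative choice):
\dot{z}(t) = \beta e^{-t} z(t), \qquad
z(t) = \exp\!\left( \beta \left[ e^{-t_0} - e^{-t} \right] \right) z_0
% For t_0 \ge 0 this gives |z(t)| \le e^{\beta} |z_0|, consistent with
% \int_{\tau}^{\infty} \beta e^{-\sigma} d\sigma = \beta e^{-\tau} \le \beta
% for \tau \ge 0.
```

The first perturbation is small pointwise but not integrable, and uniform stability is lost; the second has finite integral and the solution stays bounded.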

Slowly-Varying Systems
Now a basic result involving an eigenvalue condition for uniform exponential stability
of linear state equations with slowly-varying A(t) is presented. The proof offered here
makes use of the Kronecker product of matrices, which is defined as follows. If B is an
$n_B \times m_B$ matrix with entries $b_{ij}$, and C is an $n_C \times m_C$ matrix, then the Kronecker product
$B \otimes C$ is given by

$$B \otimes C = \begin{bmatrix} b_{11}C & \cdots & b_{1m_B}C \\ \vdots & & \vdots \\ b_{n_B1}C & \cdots & b_{n_Bm_B}C \end{bmatrix} \tag{11}$$

Obviously $B \otimes C$ is an $n_Bn_C \times m_Bm_C$ matrix, and any two matrices are conformable with
respect to this product. Less clear is the fact that the Kronecker product has many
interesting properties. However the only properties we need involve expressions of the
form $I \otimes B + B \otimes I$, where both B and the identity are $n \times n$ matrices. It is not difficult to
show that the $n^2$ eigenvalues of $I \otimes B + B \otimes I$ are simply the $n^2$ sums $\lambda_i + \lambda_j$,
$i, j = 1, \ldots, n$, where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of B. Indeed this is transparent
in the case of diagonal B. And writing $I \otimes B$ as a sum of n partitioned matrices, each
with one B on the block diagonal, it follows from Exercise 1.8 that $\|I \otimes B\| \le n\|B\|$.
For $B \otimes I$ a similar argument using an elementary spectral-norm bound from Chapter 1
gives $\|B \otimes I\| \le n^2\|B\|$. (Tighter bounds can be derived using additional properties of
the Kronecker product.)
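
The eigenvalue property of the Kronecker sum can be confirmed numerically. This is a sketch using NumPy's `np.kron`; the random test matrix, the helper `same_multiset`, and the tolerance are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
B = rng.standard_normal((n, n))      # arbitrary test matrix (illustrative)
I = np.eye(n)

K = np.kron(I, B) + np.kron(B, I)    # n^2 x n^2 Kronecker sum

lam = np.linalg.eigvals(B)
pair_sums = np.array([li + lj for li in lam for lj in lam])
kron_eigs = np.linalg.eigvals(K)

def same_multiset(a, b, tol=1e-6):
    """Greedy matching of two complex multisets up to a tolerance."""
    b = list(b)
    for z in a:
        j = int(np.argmin([abs(z - w) for w in b]))
        if abs(z - b[j]) > tol:
            return False
        b.pop(j)
    return len(b) == 0

def spec_norm(M):
    """Spectral norm (largest singular value)."""
    return np.linalg.norm(M, 2)

eigs_match = same_multiset(pair_sums, kron_eigs)
```

In this sketch `eigs_match` reports whether the computed spectrum of the Kronecker sum agrees with the pairwise sums, and `spec_norm` can be used to compare the norms of the two Kronecker terms with the norm of B.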

8.7 Theorem  Suppose for the linear state equation (1) with A(t) continuously
differentiable there exist finite positive constants $\alpha$, $\mu$ such that, for all t, $\|A(t)\| \le \alpha$
and every pointwise eigenvalue of A(t) satisfies $\mathrm{Re}[\lambda(t)] \le -\mu$. Then there exists a
positive constant $\beta$ such that if the time-derivative of A(t) satisfies $\|\dot{A}(t)\| \le \beta$ for all
t, the state equation is uniformly exponentially stable.

Proof  For each t let the $n \times n$ matrix Q(t) be the solution of

$$A^T(t)Q(t) + Q(t)A(t) = -I \tag{12}$$

Existence, uniqueness, and positive definiteness of Q(t) for each t is guaranteed by
Theorem 7.11, and furthermore

$$Q(t) = \int_{0}^{\infty} e^{A^T(t)\sigma}e^{A(t)\sigma}\,d\sigma \tag{13}$$

The strategy of the proof is to show that this Q(t) satisfies the hypotheses of Theorem
7.4, and thereby conclude uniform exponential stability of (1).

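First we use the Kronecker product to show boundedness of Q(t). Let $e_i$ denote
the $i^{th}$ column of I, and $Q_i(t)$ denote the $i^{th}$ column of Q(t). Then define the $n^2 \times 1$
vectors (using a standard notation)

$$\mathrm{vec}[I] = \begin{bmatrix} e_1 \\ \vdots \\ e_n \end{bmatrix}, \qquad \mathrm{vec}[Q(t)] = \begin{bmatrix} Q_1(t) \\ \vdots \\ Q_n(t) \end{bmatrix}$$

The following manipulations show how to write the $n \times n$ matrix equation (12) as an
$n^2 \times 1$ vector equation.

The $j^{th}$ column of Q(t)A(t) in terms of the $j^{th}$ column $A_j(t)$ of A(t) is

$$Q(t)A_j(t) = \sum_{i=1}^{n} a_{ij}(t)Q_i(t) = \begin{bmatrix} a_{1j}(t)I & \cdots & a_{nj}(t)I \end{bmatrix}\mathrm{vec}[Q(t)] = [A_j^T(t) \otimes I]\,\mathrm{vec}[Q(t)]$$

Stacking these columns gives

$$\begin{bmatrix} A_1^T(t) \otimes I \\ \vdots \\ A_n^T(t) \otimes I \end{bmatrix}\mathrm{vec}[Q(t)] = [A^T(t) \otimes I]\,\mathrm{vec}[Q(t)]$$

Similar stacking of columns of $A^T(t)Q(t)$ gives $[I \otimes A^T(t)]\,\mathrm{vec}[Q(t)]$, and thus (12) is
equivalent to

$$[A^T(t) \otimes I + I \otimes A^T(t)]\,\mathrm{vec}[Q(t)] = -\mathrm{vec}[I] \tag{14}$$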
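
The vectorized form of the Lyapunov equation gives a direct, if inefficient, way to compute Q at a frozen time. The sketch below assumes column stacking for vec and a right-hand side of minus the identity, and uses a hypothetical constant matrix with eigenvalues -1 and -2.

```python
import numpy as np

def lyap_via_kron(A):
    """Solve A^T Q + Q A = -I through the vec/Kronecker form of the equation."""
    n = A.shape[0]
    I = np.eye(n)
    M = np.kron(A.T, I) + np.kron(I, A.T)
    # order="F" stacks columns, matching the definition of vec[.]
    vecQ = np.linalg.solve(M, -I.reshape(-1, order="F"))
    return vecQ.reshape((n, n), order="F")

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])   # hypothetical matrix, eigenvalues -1 and -2
Q = lyap_via_kron(A)
residual = A.T @ Q + Q @ A + np.eye(2)
```

For production use a dedicated Lyapunov solver is usually preferable, since the Kronecker system has $n^2$ unknowns; the point here is only to mirror the manipulation in the proof.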


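Now we prove that vec[Q(t)] is bounded, and thus show that there exists a finite
$\rho$ such that $Q(t) \le \rho I$ for all t by the easily verified matrix-vector norm property
$\|Q(t)\| \le n\|\mathrm{vec}[Q(t)]\|$. If $\lambda_1(t), \ldots, \lambda_n(t)$ are the pointwise eigenvalues of A(t),
then the $n^2$ pointwise eigenvalues of $[A^T(t) \otimes I + I \otimes A^T(t)]$ are

$$\lambda_{i,j}(t) = \lambda_i(t) + \lambda_j(t), \quad i, j = 1, \ldots, n$$

Then $\mathrm{Re}[\lambda_{i,j}(t)] \le -2\mu$ for all t, from which

$$\left|\det[A^T(t) \otimes I + I \otimes A^T(t)]\right| = \left|\prod_{i,j=1}^{n}\lambda_{i,j}(t)\right| \ge (2\mu)^{n^2}$$

for all t. Therefore $A^T(t) \otimes I + I \otimes A^T(t)$ is invertible at each t. Since A(t) is bounded,
$A^T(t) \otimes I + I \otimes A^T(t)$ is bounded, and hence the inverse

$$[A^T(t) \otimes I + I \otimes A^T(t)]^{-1}$$

is bounded for all t by Exercise 1.12. The right side of (14) is constant, and therefore
we conclude that vec[Q(t)] is bounded.
Clearly Q(t) is symmetric and continuously differentiable, and next we show that
there exists a $\nu > 0$ such that

$$A^T(t)Q(t) + Q(t)A(t) + \dot{Q}(t) \le -\nu I \tag{15}$$

for all t. Using (12) this requirement can be rewritten as

$$\dot{Q}(t) \le (1 - \nu)I$$

Differentiation of (12) with respect to t yields

$$A^T(t)\dot{Q}(t) + \dot{Q}(t)A(t) = -\dot{A}^T(t)Q(t) - Q(t)\dot{A}(t)$$

At each t this Lyapunov equation has a unique solution

$$\dot{Q}(t) = \int_{0}^{\infty} e^{A^T(t)\sigma}\left[\dot{A}^T(t)Q(t) + Q(t)\dot{A}(t)\right]e^{A(t)\sigma}\,d\sigma$$

again since the eigenvalues of A(t) have negative real parts at each t. To derive a
bound on $\|\dot{Q}(t)\|$, we use the boundedness of $\|Q(t)\|$. For any $n \times 1$ vector x and any t,

$$\left|x^Te^{A^T(t)\sigma}\left[\dot{A}^T(t)Q(t) + Q(t)\dot{A}(t)\right]e^{A(t)\sigma}x\right| \le \|\dot{A}^T(t)Q(t) + Q(t)\dot{A}(t)\|\; x^Te^{A^T(t)\sigma}e^{A(t)\sigma}x$$

Thus

$$|x^T\dot{Q}(t)x| = \left|\int_{0}^{\infty}x^Te^{A^T(t)\sigma}\left[\dot{A}^T(t)Q(t) + Q(t)\dot{A}(t)\right]e^{A(t)\sigma}x\,d\sigma\right| \le 2\|\dot{A}(t)\|\,\|Q(t)\|\,x^TQ(t)x \tag{17}$$

Maximizing the right side over unity norm x, Exercise 1.10 gives, for all x such that
$\|x\| = 1$,

$$|x^T\dot{Q}(t)x| \le 2\|\dot{A}(t)\|\,\|Q(t)\|^2$$

This yields, on maximization of the left side of (17) over unity norm x,

$$\|\dot{Q}(t)\| \le 2\|\dot{A}(t)\|\,\|Q(t)\|^2$$

for all t. Using the bound on $\|Q(t)\|$, the bound $\beta$ on $\|\dot{A}(t)\|$ can be chosen so that,
for example, $\|\dot{Q}(t)\| \le 1/2$. Then the choice $\nu = 1/2$ can be made for (15).
It only remains to show that there exists a positive $\eta$ such that $Q(t) \ge \eta I$ for all t,
and this involves a maneuver similar to one in the proof of Theorem 7.8. For any t and
any $n \times 1$ vector x,

$$\frac{d}{d\sigma}\left[x^Te^{A^T(t)\sigma}e^{A(t)\sigma}x\right] = x^Te^{A^T(t)\sigma}\left[A^T(t) + A(t)\right]e^{A(t)\sigma}x \ge -2\alpha\,x^Te^{A^T(t)\sigma}e^{A(t)\sigma}x \tag{18}$$

Therefore, since $e^{A(t)\sigma}$ goes to zero exponentially as $\sigma \to \infty$,

$$-x^Tx = \int_{0}^{\infty}\frac{d}{d\sigma}\left[x^Te^{A^T(t)\sigma}e^{A(t)\sigma}x\right]d\sigma \ge -2\alpha\,x^TQ(t)x \tag{19}$$

That is,

$$Q(t) \ge \frac{1}{2\alpha}I$$

for any t, and the proof is complete.
□□□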

EXERCISES
Exercise 8.1  Derive a necessary and sufficient condition for uniform exponential stability of a
scalar linear state equation.

Exercise 8.2  Show that the linear state equation $\dot{x}(t) = A(t)x(t)$ is not uniformly stable if for
some $t_0$

$$\lim_{t \to \infty}\int_{t_0}^{t}\mathrm{tr}[A(\sigma)]\,d\sigma = \infty$$

Exercise 8.3  Theorem 8.2 implies that the linear time-invariant state equation

$$\dot{x}(t) = Ax(t)$$

is exponentially stable if all eigenvalues of $A + A^T$ are negative. Does the converse hold?

Exercise 8.4  Is it true that all solutions y(t) of the $n^{th}$-order linear differential equation
+

approach zero as t

00 if for some

(t)yt"

(t) =

there is a positive constant a such that


urn
t *

Exercise 8.5  For the time-invariant linear state equation

$$\dot{x}(t) = (A + F)x(t)$$

suppose constants $\alpha$ and K are such that

$$\|e^{At}\| \le Ke^{\alpha t}, \quad t \ge 0$$

Show that

$$\|e^{(A+F)t}\| \le Ke^{(\alpha + K\|F\|)t}, \quad t \ge 0$$

Exercise 8.6  Suppose that the linear state equation

$$\dot{x}(t) = A(t)x(t)$$

is uniformly exponentially stable. Prove that if there exists a finite constant $\beta$ such that

$$\int_{\tau}^{\infty}\|F(\sigma)\|\,d\sigma \le \beta$$

for all $\tau$, then the state equation

$$\dot{x}(t) = [A(t) + F(t)]x(t)$$

is uniformly exponentially stable.

Exercise 8.7  Suppose the linear state equation

$$\dot{x}(t) = [A + F(t)]x(t), \quad x(t_0) = x_0$$

is such that the constant matrix A has negative-real-part eigenvalues and the continuous matrix
function F(t) satisfies

$$\lim_{t \to \infty}\|F(t)\| = 0$$

Prove that given any $t_0$ and $x_0$, the resulting solution satisfies

$$\lim_{t \to \infty}x(t) = 0$$


Exercise 8.8  For an $n \times n$ matrix function A(t), suppose there exist positive constants $\alpha$, $\mu$
such that, for all t, $\|A(t)\| \le \alpha$ and the pointwise eigenvalues of A(t) satisfy $\mathrm{Re}[\lambda(t)] \le -\mu$. If Q(t) is
the unique positive definite solution of

$$A^T(t)Q(t) + Q(t)A(t) = -I$$

show that the linear state equation

$$\dot{x}(t) = \left[A(t) - \tfrac{1}{2}Q^{-1}(t)\dot{Q}(t)\right]x(t)$$

is uniformly exponentially stable.

Exercise 8.9  Extend Exercise 8.8 to a proof of Theorem 8.7 by using the Gronwall-Bellman
inequality to prove that if A(t) is continuously differentiable and $\|\dot{A}(t)\| \le \beta$ for all t, with $\beta$
sufficiently small, then uniform exponential stability of the linear state equation

$$\dot{z}(t) = A(t)z(t)$$

is implied by uniform exponential stability of the state equation

$$\dot{x}(t) = \left[A(t) - \tfrac{1}{2}Q^{-1}(t)\dot{Q}(t)\right]x(t)$$


Exercise 8.10 Suppose A(s) satisfies the hypotheses of Theorem 8.7. Let
F(s) = A(s) + (i.i12)! ,

Q(t) = 5

eFT(


and let p be such that Q (t) p1, as in the proof of Theorem 8.7. Show that for any value oft,
lie %(I)t

+ j.i)p e

r 0

Hint. See the hint for Exercise 7.17.


Exercise 8.11  Consider the single-input, n-dimensional, nonlinear state equation

$$\dot{x}(t) = A(u(t))x(t) + b(u(t)), \quad x(0) = x_0$$

where the entries of $A(\cdot)$ and $b(\cdot)$ are twice-continuously-differentiable functions of the input.
Suppose that for each constant $u_0$ satisfying $-\infty < u_{\min} \le u_0 \le u_{\max} < \infty$ the eigenvalues of $A(u_0)$
have negative real parts. For a continuously-differentiable input signal u(t) that satisfies
$u_{\min} \le u(t) \le u_{\max}$ and $|\dot{u}(t)| \le \delta$ for all $t \ge 0$, let

$$q(t) = -A^{-1}(u(t))\,b(u(t))$$

Show that if $\delta$ is sufficiently small and $\|x_0 - q(0)\|$ is small, then $\|x(t) - q(t)\|$ remains small for
all $t \ge 0$.
Exercise 8.12  Consider the nonlinear state equation

$$\dot{x}(t) = [A + F(t)]x(t) + g(t, x(t)), \quad x(t_0) = x_0$$

where A is a constant $n \times n$ matrix with negative-real-part eigenvalues, F(t) is a continuous $n \times n$
matrix function that satisfies $\|F(t)\| \le \beta$ for all t, and g(t, x) is a continuous function that satisfies
$\|g(t, x)\| \le \delta\|x\|$ for all t, x. Suppose x(t) is a continuously differentiable solution defined for
all $t \ge t_0$. Show that if $\beta$ and $\delta$ are sufficiently small, then there exist finite positive constants $\gamma$, $\lambda$
such that

$$\|x(t)\| \le \gamma e^{-\lambda(t - t_0)}\|x_0\|$$

for all $t \ge t_0$.

NOTES
Note 8.1

Example 8.1 is from

L. Markus, H. Yamabe, "Global stability criteria for differential systems," Osaka Mathematical
Journal, Vol. 12, pp. 305-317, 1960
An example of a uniformly exponentially stable linear state equation where A(t) has a pointwise
eigenvalue with positive real part for all t, but is slowly varying, is provided in
R.A. Skoog, G.Y. Lau, "Instability of slowly varying systems," IEEE Transactions on Automatic
Control, Vol. 17, No. 1, pp. 86-92, 1972

A survey of results on uniform exponential stability under the hypothesis that pointwise
eigenvalues of the slowly-varying A (t) have negative real parts is in

A. Ilchmann, D.H. Owens, D. Pratzel-Wolters, "Sufficient conditions for stability of linear time-varying systems," Systems & Control Letters, Vol. 9, pp. 157-163, 1987

An influential paper not cited in this reference is

C.A. Desoer, "Slowly varying system $\dot{x} = A(t)x$," IEEE Transactions on Automatic Control, Vol.
14, pp. 780-781, 1969

Recent work has produced stability results for slowly-varying linear state equations where
eigenvalues can have positive real parts, so long as they have negative real parts 'on average.' See

V. Solo, "On the stability of slowly time-varying linear systems," Mathematics of Control,
Signals, and Systems, to appear, 1995

A sufficient condition for exponential decay of solutions in the case where A (t) commutes with its
integral is that the matrix function

be bounded and have negative-real-part eigenvalues for all t

This is proved in Section 7.7 of

D.L. Lukes, Differential Equations: Classical to Controlled, Academic Press, New York, 1982

Note 8.2

Tighter bounds of the type given in Theorem 8.2 can be derived by using the matrix

measure. This concept is developed and applied to the treatment of stability in

W.A. Coppel, Stability and Asymptotic Behavior of Differential Equations, Heath, Boston, 1965

Note 8.3 Finite-integral perturbations of the type in Theorem 8.5 can induce unbounded
solutions when the unperturbed state equation has bounded solutions that approach zero
asymptotically. An example is given in Section 2.5 of
R. Bellman, Stability Theory of Differential Equations, McGraw-Hill, New York, 1953

Also in Section 1.14 state variable changes to a time-variable diagonal form are considered. This
approach is used to develop perturbation results for linear state equations of the form

i(t) = [A + F(t) ]x(t)


For additional results using a diagonal form for A (t), consult

M.Y. Wu, "Stability of linear time-varying systems," International Journal of System Sciences,
Vol. 15, pp. 137-150, 1984

More-advanced perturbation results are provided in

D. Hinrichsen, A.J. Pritchard, "Robust exponential stability of time-varying linear systems,"
International Journal of Robust and Nonlinear Control, Vol. 3, No. 1, pp. 63-83, 1993
Note 8.4 Extensive information on the Kronecker product is available in

R.A. Horn, C.R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge,
England, 1991

Note 8.5 Averaging techniques provide stability criteria for rapidly-varying periodic linear state
equations. An entry into this literature is

R. Bellman, J. Bentsman, S.M. Meerkov, "Stability of fast periodic systems," IEEE Transactions
on Automatic Control, Vol. 30, No. 3, pp. 289-291, 1985

9
CONTROLLABILITY
AND OBSERVABILITY

The fundamental concepts of controllability and observability for an m-input, p-output,
n-dimensional linear state equation

$$\dot{x}(t) = A(t)x(t) + B(t)u(t)$$
$$y(t) = C(t)x(t) + D(t)u(t) \tag{1}$$

are introduced in this chapter. Controllability involves the influence of the input signal
on the state vector, and does not involve the output equation. Observability deals with
the influence of the state vector on the output signal, and does not involve the effect of a
known input signal. In addition to their operational definitions in terms of driving the
state with the input, and ascertaining the state from the output, these concepts play
fundamental roles in the basic structure of linear state equations. The latter aspects are
addressed in Chapter 10, and, using stronger notions of controllability and observability,
in Chapter 11. For the time-invariant case further developments occur in Chapter 13 and
Chapter 18.

Controllability
For a time-varying linear state equation, the connection of the input signal to the state
variables can change with time. Therefore the concept of controllability is tied to a
specific, finite time interval denoted $[t_0, t_f]$ with, of course, $t_f > t_0$.

9.1 Definition  The linear state equation (1) is called controllable on $[t_0, t_f]$ if given
any initial state $x(t_0) = x_0$ there exists a continuous input signal u(t) such that the
corresponding solution of (1) satisfies $x(t_f) = 0$.

The continuity requirement on the input signal is consonant with our default
technical setting, though typically much smoother input signals can be used to drive the
state of a controllable linear state equation to zero. Notice also that Definition 9.1
implies nothing about the response of (1) for $t > t_f$. In particular there is no requirement
that the state remain at 0 for $t > t_f$. However the definition reflects the notion that the
input signal can independently influence each state variable on the specified time
interval.

As we develop criteria for controllability, the observant will notice that


contradiction proofs, or proofs of the contrapositive, often are used. Such proofs
sometimes are criticized on the grounds that they are unenlightening. In any case the
contradiction proofs are relatively simple, and they do explain why a claim must be true.

9.2 Theorem  The linear state equation (1) is controllable on $[t_0, t_f]$ if and only if the
$n \times n$ matrix

$$W(t_0, t_f) = \int_{t_0}^{t_f}\Phi(t_0, t)B(t)B^T(t)\Phi^T(t_0, t)\,dt \tag{2}$$

is invertible.

Proof  Suppose $W(t_0, t_f)$ is invertible. Then given an $n \times 1$ vector $x_0$ choose

$$u(t) = -B^T(t)\Phi^T(t_0, t)W^{-1}(t_0, t_f)x_0, \quad t \in [t_0, t_f] \tag{3}$$

and let the obviously-immaterial input signal values outside the specified interval be any
continuous extension. (This choice is completely unmotivated in the present context,
though it is natural from a more-general viewpoint mentioned in Note 9.2.) The input
signal (3) is continuous on the interval, and the corresponding solution of (1) with
$x(t_0) = x_0$ can be written as

$$x(t_f) = \Phi(t_f, t_0)x_0 + \int_{t_0}^{t_f}\Phi(t_f, \sigma)B(\sigma)u(\sigma)\,d\sigma$$
$$= \Phi(t_f, t_0)x_0 - \int_{t_0}^{t_f}\Phi(t_f, \sigma)B(\sigma)B^T(\sigma)\Phi^T(t_0, \sigma)\,d\sigma\;W^{-1}(t_0, t_f)x_0$$

Using the composition property of the transition matrix gives

$$x(t_f) = \Phi(t_f, t_0)x_0 - \Phi(t_f, t_0)\int_{t_0}^{t_f}\Phi(t_0, \sigma)B(\sigma)B^T(\sigma)\Phi^T(t_0, \sigma)\,d\sigma\;W^{-1}(t_0, t_f)x_0 = 0$$

Thus the state equation is controllable on $[t_0, t_f]$.

To show the reverse implication, suppose that the linear state equation (1) is
controllable on $[t_0, t_f]$ and that $W(t_0, t_f)$ is not invertible. On obtaining a contradiction
we conclude that $W(t_0, t_f)$ must be invertible. Since $W(t_0, t_f)$ is not invertible there
exists a nonzero $n \times 1$ vector $x_a$ such that

$$0 = x_a^TW(t_0, t_f)x_a = \int_{t_0}^{t_f}x_a^T\Phi(t_0, t)B(t)B^T(t)\Phi^T(t_0, t)x_a\,dt \tag{4}$$

Because the integrand in this expression is the nonnegative, continuous function
$\|x_a^T\Phi(t_0, t)B(t)\|^2$, it follows that

$$x_a^T\Phi(t_0, t)B(t) = 0, \quad t \in [t_0, t_f] \tag{5}$$

Since the state equation is controllable on $[t_0, t_f]$, choosing $x_0 = x_a$ there exists a
continuous input u(t) such that

$$0 = \Phi(t_f, t_0)x_a + \int_{t_0}^{t_f}\Phi(t_f, \sigma)B(\sigma)u(\sigma)\,d\sigma$$

or

$$x_a = -\int_{t_0}^{t_f}\Phi(t_0, \sigma)B(\sigma)u(\sigma)\,d\sigma$$

Multiplying through by $x_a^T$ and using (5) gives

$$x_a^Tx_a = -\int_{t_0}^{t_f}x_a^T\Phi(t_0, \sigma)B(\sigma)u(\sigma)\,d\sigma = 0 \tag{6}$$

and this contradicts $x_a \ne 0$.
□□□
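
The constructive half of the proof can be tried on a small case. The following sketch uses a hypothetical double integrator on the interval [0, 1]: it approximates the Gramian (2) by the trapezoid rule, forms the input (3), and integrates the state equation with a classical Runge-Kutta scheme. All names and numerical values are illustrative assumptions.

```python
import numpy as np

# Double integrator (hypothetical example): x1' = x2, x2' = u.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
t0, tf = 0.0, 1.0

def Phi(t, tau):
    """Transition matrix e^{A (t - tau)}; A is nilpotent, so the series stops."""
    return np.eye(2) + A * (t - tau)

def gramian(t0, tf, N=2000):
    """Trapezoid-rule approximation of the controllability Gramian W(t0, tf)."""
    ts = np.linspace(t0, tf, N + 1)
    vals = np.array([Phi(t0, t) @ B @ B.T @ Phi(t0, t).T for t in ts])
    h = (tf - t0) / N
    return h * (vals[0] / 2 + vals[1:-1].sum(axis=0) + vals[-1] / 2)

W = gramian(t0, tf)
x0 = np.array([1.0, -2.0])
Winv_x0 = np.linalg.solve(W, x0)

def u(t):
    """Steering input u(t) = -B^T Phi^T(t0, t) W^{-1} x0."""
    return -(B.T @ Phi(t0, t).T @ Winv_x0)

def simulate(x_init, N=1000):
    """Classical RK4 integration of x' = A x + B u(t) from t0 to tf."""
    h = (tf - t0) / N
    f = lambda t, x: A @ x + B @ u(t)
    t, x = t0, x_init.copy()
    for _ in range(N):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h / 2 * k1)
        k3 = f(t + h / 2, x + h / 2 * k2)
        k4 = f(t + h, x + h * k3)
        x = x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

x_tf = simulate(x0)   # should be close to the zero vector
```

The residual in `x_tf` reflects only quadrature and rounding error, since the input is built to cancel the free response exactly.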
The controllability Gramian $W(t_0, t_f)$ has many properties, some of which are
explored in the Exercises. For every $t_f > t_0$ it is symmetric and positive semidefinite. Thus
the linear state equation (1) is controllable on $[t_0, t_f]$ if and only if $W(t_0, t_f)$ is positive
definite. If the state equation is not controllable on $[t_0, t_f]$, it might become so if $t_f$ is
increased. And controllability can be lost if $t_f$ is lowered. Analogous observations can
be made in regard to changing $t_0$.
Computing $W(t_0, t_f)$ from the definition (2) is not a happy prospect. Indeed
$W(t_0, t_f)$ usually is computed by numerically solving a matrix differential equation
satisfied by $W(t, t_f)$ that is the subject of Exercise 9.4. However if we assume
smoothness properties stronger than continuity for the coefficient matrices, the Gramian
condition in Theorem 9.2 leads to a sufficient condition that is easier to check. Key to
the proof is the fact that $W(t_0, t_f)$ fails to be invertible if and only if (5) holds for some
$x_a \ne 0$. Since (5) corresponds to a type of linear dependence condition on the rows of
$\Phi(t_0, t)B(t)$, controllability criteria have roots in concepts of linear independence of
vector functions of time. However this viewpoint is not emphasized here.

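9.3 Definition  Corresponding to the linear state equation (1), and subject to existence
and continuity of the indicated derivatives, define a sequence of $n \times m$ matrix functions
by

$$K_0(t) = B(t)$$
$$K_j(t) = -A(t)K_{j-1}(t) + \dot{K}_{j-1}(t), \quad j = 1, 2, \ldots$$

An easy induction proof shows that for all t, $\sigma$,

$$\frac{\partial^j}{\partial\sigma^j}\left[\Phi(t, \sigma)B(\sigma)\right] = \Phi(t, \sigma)K_j(\sigma), \quad j = 0, 1, \ldots \tag{7}$$

Specifically the claim obviously holds for j = 0. With J a nonnegative integer, suppose
that

$$\frac{\partial^J}{\partial\sigma^J}\left[\Phi(t, \sigma)B(\sigma)\right] = \Phi(t, \sigma)K_J(\sigma)$$

Then, using this inductive hypothesis,

$$\frac{\partial^{J+1}}{\partial\sigma^{J+1}}\left[\Phi(t, \sigma)B(\sigma)\right] = \frac{\partial}{\partial\sigma}\left[\Phi(t, \sigma)K_J(\sigma)\right]$$
$$= -\Phi(t, \sigma)A(\sigma)K_J(\sigma) + \Phi(t, \sigma)\dot{K}_J(\sigma)$$
$$= \Phi(t, \sigma)K_{J+1}(\sigma)$$

Therefore the argument is complete.

Evaluation of (7) at $\sigma = t$ gives a simple interpretation of the matrices in
Definition 9.3:

$$K_j(t) = \frac{\partial^j}{\partial\sigma^j}\left[\Phi(t, \sigma)B(\sigma)\right]\Big|_{\sigma = t}, \quad j = 0, 1, \ldots \tag{8}$$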

9.4 Theorem  Suppose q is a positive integer such that, for $t \in [t_0, t_f]$, B(t) is q-times
continuously differentiable, and A(t) is (q-1)-times continuously differentiable. Then
the linear state equation (1) is controllable on $[t_0, t_f]$ if for some $t_c \in [t_0, t_f]$

$$\mathrm{rank}\begin{bmatrix} K_0(t_c) & K_1(t_c) & \cdots & K_q(t_c) \end{bmatrix} = n \tag{9}$$

Proof  Suppose for some $t_c \in [t_0, t_f]$ the rank condition holds. To set up a
contradiction argument suppose that the state equation is not controllable on $[t_0, t_f]$.
Then $W(t_0, t_f)$ is not invertible and, as in the proof of Theorem 9.2, there exists a
nonzero $n \times 1$ vector $x_a$ such that

$$x_a^T\Phi(t_0, t)B(t) = 0, \quad t \in [t_0, t_f] \tag{10}$$

Letting $x_b$ be the nonzero vector $x_b = \Phi^T(t_0, t_c)x_a$, we have from (10) that

$$x_b^T\Phi(t_c, t)B(t) = 0, \quad t \in [t_0, t_f]$$

In particular this gives, at $t = t_c$, $x_b^TK_0(t_c) = 0$. Next, differentiating (10) with respect to
t gives

$$x_b^T\Phi(t_c, t)K_1(t) = 0, \quad t \in [t_0, t_f]$$

from which $x_b^TK_1(t_c) = 0$. Continuing this process gives, in general,

$$x_b^T\frac{d^j}{dt^j}\left[\Phi(t_c, t)B(t)\right]\Big|_{t = t_c} = x_b^TK_j(t_c) = 0, \quad j = 0, 1, \ldots, q$$

Therefore

$$x_b^T\begin{bmatrix} K_0(t_c) & K_1(t_c) & \cdots & K_q(t_c) \end{bmatrix} = 0$$

and this contradicts the linear independence of the n rows implied by the rank condition
in (9). Thus the state equation is controllable on $[t_0, t_f]$.
□□□
Reflecting on Theorem 9.4 we see that if the rank condition (9) holds for some q
and some $t_c$, then the linear state equation is controllable on any interval $[t_0, t_f]$
containing $t_c$ (assuming of course that $t_f > t_0$, and the continuous-differentiability
hypotheses hold). Such a strong conclusion partly explains why (9) is only a sufficient
condition for controllability on a specified interval.
For a time-invariant linear state equation,

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t) + Du(t) \tag{11}$$

the most familiar test for controllability can be motivated from Theorem 9.4 by noting
that

$$K_j(t) = (-1)^jA^jB, \quad j = 0, 1, \ldots$$

However to obtain a necessary as well as sufficient condition we base the proof on
Theorem 9.2.

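9.5 Theorem  The time-invariant linear state equation (11) is controllable on $[t_0, t_f]$ if
and only if the $n \times nm$ controllability matrix satisfies

$$\mathrm{rank}\begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = n \tag{12}$$

Proof  We prove that the rank condition (12) fails if and only if the controllability
Gramian

$$W(t_0, t_f) = \int_{t_0}^{t_f}e^{A(t_0 - t)}BB^Te^{A^T(t_0 - t)}\,dt$$

is not invertible. If the rank condition fails, then there exists a nonzero $n \times 1$ vector $x_a$
such that

$$x_a^TA^kB = 0, \quad k = 0, \ldots, n-1$$

This implies, using the matrix-exponential representation in Property 5.8,

$$x_a^TW(t_0, t_f) = \int_{t_0}^{t_f}\left[\sum_{k=0}^{n-1}\alpha_k(t_0 - t)\,x_a^TA^kB\right]B^Te^{A^T(t_0 - t)}\,dt = 0 \tag{13}$$

and thus $W(t_0, t_f)$ is not invertible.

Conversely if the controllability Gramian is not invertible, then there exists a
nonzero $x_a$ such that

$$x_a^TW(t_0, t_f)x_a = 0$$

This implies, exactly as in the proof of Theorem 9.2,

$$x_a^Te^{A(t_0 - t)}B = 0, \quad t \in [t_0, t_f]$$

At $t = t_0$ we obtain $x_a^TB = 0$, and differentiating k times and evaluating the result at
$t = t_0$ gives

$$(-1)^kx_a^TA^kB = 0, \quad k = 0, \ldots, n-1 \tag{14}$$

Therefore

$$x_a^T\begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = 0$$

which proves that the rank condition (12) fails.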
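
The rank condition of Theorem 9.5 is straightforward to evaluate numerically. The sketch below is illustrative: the helper names `ctrb` and `is_controllable` and the two test systems are assumptions made here, and `np.linalg.matrix_rank` uses a default tolerance that may need adjusting for badly scaled matrices.

```python
import numpy as np

def ctrb(A, B):
    """Controllability matrix [B, AB, ..., A^(n-1) B]."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    """True when the controllability matrix has full row rank n."""
    return bool(np.linalg.matrix_rank(ctrb(A, B)) == A.shape[0])

# Hypothetical examples.
A1 = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator
B1 = np.array([[0.0], [1.0]])
A2 = np.diag([1.0, 2.0])                  # second mode not reached by the input
B2 = np.array([[1.0], [0.0]])

results = (is_controllable(A1, B1), is_controllable(A2, B2))
```

The first system is controllable and the second is not, since the input never enters the second state equation of the diagonal system.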
9.6 Example  Consider the linear state equation

$$\dot{x}(t) = \begin{bmatrix} a_1 & 0 \\ 0 & a_2 \end{bmatrix}x(t) + \begin{bmatrix} b_1(t) \\ b_2(t) \end{bmatrix}u(t) \tag{15}$$

where the constants $a_1$ and $a_2$ are not equal. For constant values $b_1(t) = b_1$,
$b_2(t) = b_2$, we can call on Theorem 9.5 to show that the state equation is controllable if
and only if both $b_1$ and $b_2$ are nonzero. However for the nonzero, time-varying
coefficients

$$b_1(t) = e^{a_1t}, \quad b_2(t) = e^{a_2t}$$

another straightforward calculation shows that

$$W(t_0, t_f) = (t_f - t_0)\begin{bmatrix} e^{2a_1t_0} & e^{(a_1 + a_2)t_0} \\ e^{(a_1 + a_2)t_0} & e^{2a_2t_0} \end{bmatrix}$$

Since $\det W(t_0, t_f) = 0$ the time-varying linear state equation is not controllable on any
interval $[t_0, t_f]$. Clearly pointwise-in-time interpretations of the controllability property
can be misleading.
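
The singular Gramian of Example 9.6 can be reproduced numerically. This sketch assumes the diagonal form of A and the exponential input coefficients discussed in the example; the numerical values of `a1`, `a2`, `t0`, `tf` are hypothetical.

```python
import numpy as np

a1, a2 = -1.0, 2.0          # hypothetical unequal constants
t0, tf = 0.5, 3.0

def integrand(t):
    """Phi(t0, t) B(t) B^T(t) Phi^T(t0, t) for the diagonal example."""
    Phi = np.diag([np.exp(a1 * (t0 - t)), np.exp(a2 * (t0 - t))])
    b = np.array([[np.exp(a1 * t)], [np.exp(a2 * t)]])
    v = Phi @ b                 # equals [e^{a1 t0}, e^{a2 t0}]^T for every t
    return v @ v.T

ts = np.linspace(t0, tf, 1001)
vals = np.array([integrand(t) for t in ts])
h = ts[1] - ts[0]
W = h * (vals[0] / 2 + vals[1:-1].sum(axis=0) + vals[-1] / 2)   # trapezoid rule

v0 = np.array([np.exp(a1 * t0), np.exp(a2 * t0)])
W_closed = (tf - t0) * np.outer(v0, v0)

# An explicit tolerance guards against rounding noise in the tiny singular value.
rank_W = np.linalg.matrix_rank(W, tol=1e-9)
```

The product $\Phi(t_0, t)B(t)$ does not depend on t here, which is why the Gramian is a rank-one matrix scaled by the interval length.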

□□□
Since the rank condition (12) is independent of $t_0$ and $t_f$, the controllability
property for (11) is independent of the particular interval $[t_0, t_f]$. Thus for time-invariant
linear state equations the term controllable is used without reference to a time interval.

Observability
The second concept of interest for (1) involves the effect of the state vector on the output
of the linear state equation. It is simplest to consider the case of zero input, and this does
not entail loss of generality since the concept is unchanged in the presence of a known
input signal. Specifically the zero-state response due to a known input signal can be
computed, and subtracted from the complete response, leaving the zero-input response.
Therefore we consider the unforced state equation

$$\dot{x}(t) = A(t)x(t), \quad x(t_0) = x_0$$
$$y(t) = C(t)x(t) \tag{16}$$

9.7 Definition  The linear state equation (16) is called observable on $[t_0, t_f]$ if any
initial state $x(t_0) = x_0$ is uniquely determined by the corresponding response y(t) for
$t \in [t_0, t_f]$.

Again the definition is tied to a specific, finite time interval, and ignores the
response for $t > t_f$. The intent is to capture the notion that the output signal is
independently influenced by each state variable.
The basic characterization of observability is similar in form to the controllability
case, though the proof is a bit simpler.

9.8 Theorem  The linear state equation (16) is observable on $[t_0, t_f]$ if and only if the
$n \times n$ matrix

$$M(t_0, t_f) = \int_{t_0}^{t_f}\Phi^T(t, t_0)C^T(t)C(t)\Phi(t, t_0)\,dt \tag{17}$$

is invertible.

Proof  Multiplying the solution expression

$$y(t) = C(t)\Phi(t, t_0)x_0$$

on both sides by $\Phi^T(t, t_0)C^T(t)$ and integrating yields

$$\int_{t_0}^{t_f}\Phi^T(t, t_0)C^T(t)y(t)\,dt = M(t_0, t_f)x_0 \tag{18}$$

The left side is determined by y(t), $t \in [t_0, t_f]$, and therefore (18) represents a linear
algebraic equation for $x_0$. If $M(t_0, t_f)$ is invertible, then $x_0$ is uniquely determined. On
the other hand, if $M(t_0, t_f)$ is not invertible, then there exists a nonzero $n \times 1$ vector $x_a$
such that $M(t_0, t_f)x_a = 0$. This implies $x_a^TM(t_0, t_f)x_a = 0$ and, just as in the proof of
Theorem 9.2, it follows that

$$C(t)\Phi(t, t_0)x_a = 0, \quad t \in [t_0, t_f]$$

Thus $x(t_0) = x_0 + x_a$ yields the same zero-input response for (16) on $[t_0, t_f]$ as
$x(t_0) = x_0$, and the state equation fails to be observable on $[t_0, t_f]$.
□□□
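
The proof suggests a direct recipe for recovering the initial state from output data. The sketch below applies it to a hypothetical harmonic oscillator with a position measurement, using the same trapezoid weights for the Gramian and for the weighted output integral; every name and value is an illustrative assumption.

```python
import numpy as np

def Phi(t):
    """e^{At} for the harmonic oscillator A = [[0, 1], [-1, 0]] (illustrative)."""
    return np.array([[np.cos(t), np.sin(t)],
                     [-np.sin(t), np.cos(t)]])

C = np.array([[1.0, 0.0]])            # only the first state variable is measured
t0, tf, N = 0.0, 2.0, 400
ts = np.linspace(t0, tf, N + 1)
w = np.full(N + 1, (tf - t0) / N)     # trapezoid weights
w[0] *= 0.5
w[-1] *= 0.5

x0_true = np.array([0.7, -1.3])
y = [C @ Phi(t - t0) @ x0_true for t in ts]    # sampled zero-input response

# Observability Gramian and the weighted output integral on the same grid.
M = sum(wk * Phi(t - t0).T @ C.T @ C @ Phi(t - t0) for wk, t in zip(w, ts))
rhs = sum(wk * Phi(t - t0).T @ C.T @ yk for wk, t, yk in zip(w, ts, y))

x0_est = np.linalg.solve(M, rhs)      # solve the linear algebraic equation for x0
```

Because both sides use the same quadrature, the recovered `x0_est` matches `x0_true` up to rounding; with noisy or coarsely sampled outputs the conditioning of M would matter.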

The proof of Theorem 9.8 shows that for an observable linear state equation the
initial state is uniquely determined by a linear algebraic equation, thus clarifying a vague
aspect of Definition 9.7. Of course this algebraic equation is beset by the interrelated
difficulties of computing the transition matrix and computing M (t0, t1).
The observability Gramian $M(t_0, t_f)$, just as the controllability Gramian $W(t_0, t_f)$,
has several interesting properties. It is symmetric and positive semidefinite, and positive
definite if and only if the state equation is observable on $[t_0, t_f]$. Also $M(t_0, t_f)$ can be
computed by numerically solving certain matrix differential equations. See the Exercises
for profitable activities that avow the dual nature of controllability and observability.
More convenient criteria for observability are available, much as in the
controllability case. First we state a sufficient condition for observability under
strengthened smoothness hypotheses on the linear state equation coefficients, and then a
standard necessary and sufficient condition for time-invariant linear state equations.

9.9 Definition  Corresponding to the linear state equation (16), and subject to existence
and continuity of the indicated derivatives, define $p \times n$ matrix functions by

$$L_0(t) = C(t)$$
$$L_j(t) = L_{j-1}(t)A(t) + \dot{L}_{j-1}(t), \quad j = 1, 2, \ldots$$

It is easy to show by induction that

$$L_j(t) = \frac{\partial^j}{\partial t^j}\left[C(t)\Phi(t, \sigma)\right]\Big|_{\sigma = t}, \quad j = 0, 1, \ldots \tag{20}$$

9.10 Theorem  Suppose q is a positive integer such that, for $t \in [t_0, t_f]$, C(t) is q-times
continuously differentiable, and A(t) is (q-1)-times continuously differentiable.
Then the linear state equation (16) is observable on $[t_0, t_f]$ if for some $t_a \in [t_0, t_f]$,

$$\mathrm{rank}\begin{bmatrix} L_0(t_a) \\ L_1(t_a) \\ \vdots \\ L_q(t_a) \end{bmatrix} = n \tag{21}$$

Similar to the situation in Theorem 9.4, if q and $t_a$ are such that (21) holds, then
the linear state equation is observable on any interval $[t_0, t_f]$ containing $t_a$.

9.11 Theorem  If A(t) = A and C(t) = C in (16), then the time-invariant linear state
equation is observable on $[t_0, t_f]$ if and only if the $np \times n$ observability matrix satisfies

$$\mathrm{rank}\begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} = n \tag{22}$$

The concept of observability for time-invariant linear state equations is
independent of the particular (nonzero) time interval. Thus we simplify terminology and
use the simple adjective observable for time-invariant state equations. Also comparing
(12) and (22) we see that

$$\dot{x}(t) = Ax(t) + Bu(t)$$

is controllable if and only if

$$\dot{z}(t) = A^Tz(t)$$
$$y(t) = B^Tz(t) \tag{23}$$

is observable. This permits quick translation of algebraic consequences of
controllability for time-invariant linear state equations into corresponding results for
observability. (Try it on, for example, Exercises 9.7-9.)
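
The duality between the two rank tests can be seen directly in code. In this sketch the helper names and the third-order companion-form example are assumptions for illustration; the observability matrix of the dual pair is just the transpose of the controllability matrix of the original pair.

```python
import numpy as np

def ctrb(A, B):
    """Controllability matrix [B, AB, ..., A^(n-1) B]."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])

def obsv(A, C):
    """Observability matrix with block rows C, CA, ..., C A^(n-1)."""
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

# Hypothetical third-order example in companion form.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-6.0, -11.0, -6.0]])
B = np.array([[0.0], [0.0], [1.0]])

dual_obsv = obsv(A.T, B.T)            # observability matrix of the dual equation
rank_ctrb = np.linalg.matrix_rank(ctrb(A, B))
rank_dual = np.linalg.matrix_rank(dual_obsv)
```

Since transposition preserves rank, controllability of the pair (A, B) and observability of the pair (A^T, B^T) always agree.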

Additional Examples
In particular physical systems the controllability and observability properties of a
describing state equation might be completely obvious from the system structure, less
obvious but reasonable upon reflection, or quite unclear. We consider examples of each
situation.
9.12 Example

The perhaps strange though feasible bucket system in Figure 9.13, with
all parameters unity, is introduced in Example 6.18. It is physically apparent that u(t)
cannot affect x2(t), and in this intuitive sense controllability is impossible.

Figure 9.13 A disconnected bucket system.


Indeed it is easy to compute the linearized state equation description


(t)

)'(t)

(t)

0]x(t)

[1

it is not controllable. On the other hand consider the bucket system in Figure 9.14,
again with all parameters unity. The failure of controllability is not quite so
obvious, though some thought reveals that $x_1(t)$ and $x_3(t)$ cannot be independently
influenced by the input signal. Indeed the linearized state equation
l

i(t) =
%'(t)= [0

x(t) +

zi(t)

0}x(t)

(24)

the controllability matrix


0

[B AB A2B] =

3 Il
I

(25)

has rank two.

Figure 9.14 A parallel bucket system.

The linearized state equation for the system shown in Figure 9.15 is controllable. We
leave confirmation to the hydrologically inclined.

Figure 9.15 A controllable parallel bucket system.

9.16 Example In Example 2.7 a linearized state equation for a satellite in circular orbit
is introduced. Assuming zero thrust forces on the satellite, the description is

0

00

0
0


0
0

(26)

where the first output is radial distance, and the second output is angle. Treating these two outputs separately, first suppose that only measurements of radial distance,

$$y_1(t) = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} x(t)$$

are available on a specified time interval. The observability matrix in this case is


c
cA
cA2
cA3

000

0
0

(27)

which has rank three. Therefore radial distance measurement does not suffice to compute the complete orbit state. On the other hand measurement of angle,

$$y_2(t) = \begin{bmatrix} 0 & 0 & 1 & 0 \end{bmatrix} x(t)$$

does suffice, as is readily verified.
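The two rank claims can be checked numerically. The sketch below assumes NumPy and, importantly, assumes the commonly used linearized circular-orbit model with states (radial deviation, radial rate, angle deviation, angular-rate deviation); this coefficient matrix is an assumption standing in for Example 2.7, which is not reproduced in this section. The values `w0 = r0 = 1` and the helper name `obsv` are likewise illustrative.

```python
import numpy as np

def obsv(A, C):
    """Observability matrix [C; CA; ...; C A^(n-1)]."""
    n = A.shape[0]
    rows = [C]
    for _ in range(n - 1):
        rows.append(rows[-1] @ A)
    return np.vstack(rows)

# Assumed nominal angular rate and orbit radius (illustrative values).
w0, r0 = 1.0, 1.0

# Assumed linearized orbit dynamics with zero thrust.
A = np.array([
    [0.0,       1.0,          0.0, 0.0],
    [3 * w0**2, 0.0,          0.0, 2 * r0 * w0],
    [0.0,       0.0,          0.0, 1.0],
    [0.0,       -2 * w0 / r0, 0.0, 0.0],
])

c_radial = np.array([[1.0, 0.0, 0.0, 0.0]])  # radial distance only
c_angle = np.array([[0.0, 0.0, 1.0, 0.0]])   # angle only

rank_radial = np.linalg.matrix_rank(obsv(A, c_radial))
rank_angle = np.linalg.matrix_rank(obsv(A, c_angle))

print(rank_radial, rank_angle)
```

Under these assumptions the radial-distance observability matrix has a zero third column, which is why the angle cannot be recovered from radial measurements alone.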

EXERCISES
Exercise 9.1

For what values of the parameter a is the time-invariant linear state equation

.(t)=

lal

x(t) +

nO)

y(t)= {0

0]

controllable? Observable?

Exercise 9.2

Consider the linear state equation


+

[hI(t)j()

Is this state equation controllable on [0, 1] for b(t) = b, an arbitrary constant? Is it controllable on [0, 1] for every continuous function b(t)?

Exercise 9.3 Consider a controllable, time-invariant linear state equation with two different p × 1 outputs:

$$\dot{x}(t) = Ax(t) + Bu(t), \quad x(0) = 0$$
$$y_a(t) = C_a x(t)$$
$$y_b(t) = C_b x(t)$$

Show that if the impulse response of the two outputs is identical, then C_a = C_b.

Exercise 9.4 Show that the controllability Gramian satisfies the matrix differential equation

$$\frac{d}{dt} W(t, t_f) = A(t)W(t, t_f) + W(t, t_f)A^T(t) - B(t)B^T(t), \quad W(t_f, t_f) = 0$$

Also prove that the inverse of the controllability Gramian satisfies

$$\frac{d}{dt} W^{-1}(t, t_f) = -A^T(t)W^{-1}(t, t_f) - W^{-1}(t, t_f)A(t) + W^{-1}(t, t_f)B(t)B^T(t)W^{-1}(t, t_f)$$

for values of t such that the inverse exists, of course. Finally, show that

$$W(t_0, t_f) = W(t_0, t) + \Phi(t_0, t)W(t, t_f)\Phi^T(t_0, t)$$

Exercise 9.5 Establish properties of the observability Gramian M(t_0, t_f) corresponding to the properties of W(t_0, t_f) in Exercise 9.4.

Exercise 9.6 For the linear state equation

$$\dot{x}(t) = A(t)x(t) + B(t)u(t)$$

with associated controllability Gramian W(t_0, t_f), show that the transition matrix for

$$\begin{bmatrix} A(t) & B(t)B^T(t) \\ 0 & -A^T(t) \end{bmatrix}$$

is given by

$$\begin{bmatrix} \Phi_A(t, \tau) & \Phi_A(t, \tau)W(\tau, t) \\ 0 & \Phi_A^T(\tau, t) \end{bmatrix}$$

Exercise 9.7 If β is a real constant, show that the time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$

is controllable if and only if

$$\dot{z}(t) = (A - \beta I)z(t) + Bu(t)$$

is controllable.

Exercise 9.8 Suppose that the time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$

is controllable and A has negative-real-part eigenvalues. Show that there exists a symmetric, positive-definite Q such that

$$AQ + QA^T = -BB^T$$

Exercise 9.9 Suppose the time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$

is controllable and there exists a symmetric, positive-definite Q such that

$$AQ + QA^T = -BB^T$$

Show that all eigenvalues of A have negative real parts. Hint: Use the (in general complex) left eigenvectors of A in a clever way.

Exercise 9.10 The linear state equation

$$\dot{x}(t) = A(t)x(t) + B(t)u(t)$$
$$y(t) = C(t)x(t)$$

is called output controllable on [t_0, t_f] if for any given x(t_0) = x_0 there exists a continuous input signal u(t) such that the corresponding solution satisfies y(t_f) = 0. Assuming rank C(t_f) = p, show that a necessary and sufficient condition for output controllability on [t_0, t_f] is invertibility of the p × p matrix

$$\int_{t_0}^{t_f} C(t_f)\Phi(t_f, t)B(t)B^T(t)\Phi^T(t_f, t)C^T(t_f)\,dt$$

Explain the role of the rank assumption on C(t_f). For the special case m = p = 1, express the condition in terms of the zero-state response of the state equation to impulse inputs.
Exercise 9.11 For a time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t)$$

with rank C = p, continue Exercise 9.10 by deriving a necessary and sufficient condition for output controllability similar to the condition in Theorem 9.5. If m = p = 1, characterize an output controllable state equation in terms of its impulse response and its transfer function.

Exercise 9.12 It is interesting that continuity of C(t) is crucial to the basic Gramian condition for observability. Show this by considering observability on [0, 1] for the scalar linear state equation with zero A(t) and

$$C(t) = \begin{cases} 1, & t = 0 \\ 0, & t > 0 \end{cases}$$

Is continuity of B(t) crucial in controllability?

Exercise 9.13 Show that the time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$

is controllable if and only if

$$\dot{z}(t) = Az(t) + BB^T v(t)$$

is controllable.

Exercise 9.14 Suppose the single-input, single-output, n-dimensional, time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + bu(t)$$
$$y(t) = cx(t)$$

is controllable and observable. Show that A and bc do not commute if n ≥ 2.

Exercise 9.15 The linear state equation

$$\dot{x}(t) = A(t)x(t) + B(t)u(t), \quad x(t_0) = x_0$$

is called reachable on [t_0, t_f] if for x_0 = 0 and any given n × 1 vector x_f there exists a continuous input signal u(t) such that the corresponding solution satisfies x(t_f) = x_f. Show that the state equation is reachable on [t_0, t_f] if and only if the n × n reachability Gramian

$$W_R(t_0, t_f) = \int_{t_0}^{t_f} \Phi(t_f, t)B(t)B^T(t)\Phi^T(t_f, t)\,dt$$

is invertible. Show also that the state equation is reachable on [t_0, t_f] if and only if it is controllable on [t_0, t_f].

Exercise 9.16 Based on Exercise 9.15, define a natural concept of output reachability for a
time-varying linear state equation. Develop a basic Gramian criterion for output reachability in
the style of Exercise 9.10.
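Exercise 9.17 For the single-input, single-output state equation

$$\dot{x}(t) = A(t)x(t) + b(t)u(t)$$
$$y(t) = c(t)x(t)$$

suppose that

$$M(t) = \begin{bmatrix} L_0(t) \\ L_1(t) \\ \vdots \\ L_{n-1}(t) \end{bmatrix}$$

is invertible for all t. Show that y(t) satisfies a linear nth-order differential equation of the form

$$y^{(n)}(t) - \sum_{j=0}^{n-1} \alpha_j(t)\,y^{(j)}(t) = \sum_{j=0}^{n-1} \beta_j(t)\,u^{(j)}(t)$$

where

$$\begin{bmatrix} \alpha_0(t) & \cdots & \alpha_{n-1}(t) \end{bmatrix} = L_n(t)M^{-1}(t)$$

(A recursive formula for the β coefficients can be derived through a messy calculation.)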

NOTES
Note 9.1 As indicated in Exercise 9.15, the term 'reachability' usually is associated with the ability to drive the state vector from zero to any desired state in finite time. In the setting of continuous-time linear state equations, this property is equivalent to the property of controllability, and the two terms sometimes are used interchangeably. However under certain types of uniformity conditions that are imposed in later chapters the equivalence is not preserved. Also for discrete-time linear state equations the corresponding concepts of controllability and reachability are not equivalent. Similar remarks apply to observability and the concept of 'reconstructibility,' defined roughly as follows. A linear state equation is reconstructible on [t_0, t_f] if x(t_f) can be determined from a knowledge of y(t) for t ∈ [t_0, t_f]. This issue arises in the discussion of observers in Chapter 15.
Note 9.2 The concepts of controllability and observability introduced here can be refined to consider controllability of a particular state to the origin in finite time, or determination of a particular initial state from finite-time output observation. See for example the treatment in

R.W. Brockett, Finite Dimensional Linear Systems, John Wiley, New York, 1970

For time-invariant linear state equations, we pursue this refinement in Chapter 18 in the course of developing a geometric theory. A treatment of controllability and observability that emphasizes the role of linear independence of time functions is in

C.T. Chen, Linear System Theory and Design, Holt, Rinehart and Winston, New York, 1984

In many references a more sophisticated mathematical viewpoint is adopted for these topics. For controllability, the solution formula for a linear state equation shows that a state transfer from x(t_0) = x_0 to x(t_f) = 0 is described by a linear map taking m × 1 input signals into n × 1 vectors. Setting up a suitable Hilbert space as the input space and equipping R^n with the usual inner product, basic linear operator theory involving adjoint operators and so on can be applied to the problem. Incidentally this formulation provides an interpretation of the mystery input signal in the proof of Theorem 9.2 as a minimum-energy input that accomplishes the transfer from x_0 to zero.

Note 9.3 State transfers in a controllable time-invariant linear state equation can be accomplished with input signals that are polynomials in t of reasonable degree. Consult

A. Ailon, L. Baratchart, J. Grimm, G. Langholz, "On polynomial controllability with polynomial state for linear constant systems," IEEE Transactions on Automatic Control, Vol. 31, No. 2, pp. 155-156, 1986

D. Aeyels, "Controllability of linear time-invariant systems," International Journal of Control, Vol. 46, No. 6, pp. 2027-2034, 1987

Note 9.4 For a linear state equation where A(t) and B(t) are analytic, Theorem 9.4 can be restated as a necessary and sufficient condition at any point t_a ∈ [t_0, t_f]. That is, an analytic linear state equation is controllable on the interval if and only if for some nonnegative integer j,

$$\operatorname{rank} \begin{bmatrix} K_0(t_a) & K_1(t_a) & \cdots & K_j(t_a) \end{bmatrix} = n$$

The proof of necessity requires two technical facts related to analyticity, neither obvious. First, an analytic function that is not identically zero can be zero only at isolated points. The second is that Φ(t, τ) is analytic since A(t) is analytic. In particular it is not true that a uniformly convergent series of analytic functions converges to an analytic function. Therefore the proof of analyticity of Φ(t, τ) must be specific to properties of analytic differential equations. See Section 3.5 and Appendix C of

E.D. Sontag, Mathematical Control Theory, Springer-Verlag, New York, 1990

Note 9.5 Controllability is a point-to-point concept, in which the connecting trajectory is immaterial. The property of making the state follow a preassigned trajectory over a specified time interval is called functional reproducibility or path controllability. Consult

K.A. Grasse, "Sufficient conditions for the functional reproducibility of time-varying, input-output systems," SIAM Journal on Control and Optimization, Vol. 26, No. 1, pp. 230-249, 1988

See also the references on the closely related notion of linear system inversion in Note 12.3.

Note 9.6 For T-periodic linear state equations, controllability on any nonempty time interval is equivalent to controllability on [0, nT], where n is the dimension of the state equation. This is established in

P. Brunovsky, "Controllability and linear closed-loop controls in linear periodic systems," Journal of Differential Equations, Vol. 6, pp. 296-313, 1969

Attempts to reduce this interval and alternate definitions of controllability in the periodic case are discussed in

S. Bittanti, P. Colaneri, G. Guardabassi, "H-controllability and observability of linear periodic systems," SIAM Journal on Control and Optimization, Vol. 22, No. 6, pp. 889-893, 1984

H. Kano, T. Nishimura, "Controllability, stabilizability, and matrix Riccati equations for periodic systems," IEEE Transactions on Automatic Control, Vol. 30, No. 11, pp. 1129-1131, 1985

Note 9.7 Controllability and observability properties of time-varying singular state equations (See Note 2.4) are addressed in

S.L. Campbell, N.K. Nichols, W.J. Terrell, "Duality, observability, and controllability for linear time-varying descriptor systems," Circuits, Systems, and Signal Processing, Vol. 10, No. 4, pp. 455-470, 1991
Note 9.8 Additional aspects of controllability and observability, some of which arise in Chapter 11, are discussed in

L.M. Silverman, H.E. Meadows, "Controllability and observability in time-variable linear systems," SIAM Journal on Control and Optimization, Vol. 5, No. 1, pp. 64-73, 1967

We examine important additional criteria for controllability and observability in the time-invariant case in Chapter 13.

10
REALIZABILITY

In this chapter we begin to address questions related to the input-output (zero-state) behavior of the standard linear state equation

$$\dot{x}(t) = A(t)x(t) + B(t)u(t)$$
$$y(t) = C(t)x(t) + D(t)u(t) \qquad (1)$$

With zero initial state assumed, the output signal y(t) corresponding to a given input signal u(t) is described by

$$y(t) = \int_{t_0}^{t} G(t, \sigma)u(\sigma)\,d\sigma + D(t)u(t), \quad t \ge t_0 \qquad (2)$$

where

$$G(t, \sigma) = C(t)\Phi(t, \sigma)B(\sigma)$$

Of course given the state equation (1), in principle G(t, σ) can be computed so that the input-output behavior is known according to (2). Our interest here is in the reversal of this computation, and in particular we want to establish conditions on a specified G(t, σ) that guarantee existence of a corresponding linear state equation. Aside from a certain theoretical symmetry, general motivation for our interest is provided by problems of implementing linear input/output behavior. Linear state equations can be constructed in hardware, as discussed in Chapter 1, or programmed in software for numerical solution.
Some terminology mentioned in Chapter 3 that goes with (2) bears repeating. The input-output behavior is causal since, for any t_a ≥ t_0, the output value y(t_a) does not depend on values of the input at times greater than t_a. Also the input-output behavior is linear since the response to a (constant-coefficient) linear combination of input signals αu_a(t) + βu_b(t) is αy_a(t) + βy_b(t), in the obvious notation. (In particular the response to the zero input is y(t) = 0 for all t.) Thus we are interested in linear state equation representations for causal, linear input-output behavior described in the form (2).

Formulation
While the realizability question involves existence of a linear state equation (1) corresponding to a given G(t, σ) and D(t), it is obvious that D(t) plays an unessential role. Therefore we assume henceforth that D(t) = 0, for all t, to simplify matters.
When there exists one linear state equation corresponding to a specified G(t, σ), there exist many, since a change of state variables leaves G(t, σ) unaffected. Also there exist linear state equations of different dimensions that yield a specified G(t, σ). In particular new state variables that are disconnected from the input, the output, or both, can be added to a state equation without changing the corresponding input-output behavior.
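10.1 Example If the linear state equation (1) corresponds to a given input-output behavior, then a state equation of the form

$$\begin{bmatrix} \dot{x}(t) \\ \dot{z}(t) \end{bmatrix} = \begin{bmatrix} A(t) & 0 \\ 0 & F(t) \end{bmatrix} \begin{bmatrix} x(t) \\ z(t) \end{bmatrix} + \begin{bmatrix} B(t) \\ 0 \end{bmatrix} u(t)$$
$$y(t) = \begin{bmatrix} C(t) & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ z(t) \end{bmatrix} \qquad (3)$$

yields the same input-output behavior. This is clear from Figure 10.2, or, since the transition matrix for (3) is block diagonal, from the easy calculation

$$\begin{bmatrix} C(t) & 0 \end{bmatrix} \begin{bmatrix} \Phi_A(t, \sigma) & 0 \\ 0 & \Phi_F(t, \sigma) \end{bmatrix} \begin{bmatrix} B(\sigma) \\ 0 \end{bmatrix} = C(t)\Phi_A(t, \sigma)B(\sigma)$$

□□□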
Example 10.1 shows that if a linear state equation of dimension n has the input-output behavior specified by G(t, σ), then for any positive integer k there are state equations of dimension n + k that also have input-output behavior described by G(t, σ). Thus our main theoretical interest is to consider least-dimension linear state equations corresponding to a specified G(t, σ). But this is in accord with prosaic considerations: a least-dimension linear state equation is in some sense a simplest linear state equation yielding input-output behavior characterized by G(t, σ).
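A time-invariant instance of Example 10.1 can be checked numerically. This is a minimal sketch, assuming NumPy; the matrices `A`, `B`, `C`, the extra dynamics `F`, the test frequency `s`, and the helper name `tf_at` are all hypothetical choices for illustration. It uses the transfer function C(sI - A)^{-1}B as a stand-in for the weighting pattern, and shows that appending a disconnected state leaves it unchanged.

```python
import numpy as np

def tf_at(A, B, C, s):
    """Evaluate C (sI - A)^{-1} B at one complex frequency s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B)

# Hypothetical dimension-2 state equation.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Extra state z with its own dynamics F, disconnected from input and output.
F = np.array([[-5.0]])
A_aug = np.block([[A, np.zeros((2, 1))], [np.zeros((1, 2)), F]])
B_aug = np.vstack([B, np.zeros((1, 1))])
C_aug = np.hstack([C, np.zeros((1, 1))])

s = 1.0 + 2.0j  # arbitrary test point away from the eigenvalues
g = tf_at(A, B, C, s)
g_aug = tf_at(A_aug, B_aug, C_aug, s)

print(np.allclose(g, g_aug))
```

Here the dimension-3 state equation reproduces the input-output behavior of the dimension-2 one, which is the reason least-dimension realizations are the objects of interest.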

There is a more vexing technical issue that should be addressed at the outset. Since the response computation in (2) involves values of G(t, σ) only for t ≥ σ, it seems most natural to assume that the input-output behavior is specified by G(t, σ) only for arguments satisfying t ≥ σ. With this restriction on arguments G(t, σ) often is called an impulse response, for reasons that should be evident. However if G(t, σ) arises from a linear state equation such as (1), then as a mathematical object G(t, σ) is defined for all t, σ. And of course its values for σ > t might not be completely determined by its values for t ≥ σ. Delicate matters arise here. Some involve mathematical technicalities such as smoothness assumptions on G(t, σ), and on the coefficient matrices in the state equations. Others involve subtleties in the mathematical representation of causality. A simple resolution is to insist that linear input-output behavior be specified by a p × m matrix function G(t, σ) defined and, for compatibility with our default assumptions, continuous for all t, σ. Such a G(t, σ) is called a weighting pattern.

Figure 10.2 Structure of the linear state equation (3).

A hint of the difficulties that arise in the realization problem when G(t, σ) is specified only for t ≥ σ is provided by considering Exercise 10.7 in light of Theorem 10.6. For strong hypotheses that avert trouble with the impulse response, see the further consideration of the realization problem in Chapter 11. Finally notice that for a time-invariant linear state equation the distinction between the weighting pattern and impulse response is immaterial since values of Ce^{A(t-σ)}B for t ≥ σ completely determine the values for t < σ. Namely for t < σ the exponential e^{A(t-σ)} is the inverse of e^{A(σ-t)}.

Realizability
Terminology that aids discussion of the realizability problem can be formalized as
follows.

10.3 Definition A linear state equation of dimension n

$$\dot{x}(t) = A(t)x(t) + B(t)u(t)$$
$$y(t) = C(t)x(t) \qquad (4)$$

is called a realization of the weighting pattern G(t, σ) if, for all t and σ,

$$G(t, \sigma) = C(t)\Phi(t, \sigma)B(\sigma)$$

If a realization (4) exists, then the weighting pattern is called realizable, and if no realization of dimension less than n exists, then (4) is called a minimal realization.

10.4 Theorem The weighting pattern G(t, σ) is realizable if and only if there exist a p × n matrix function H(t) and an n × m matrix function F(t), both continuous for all t, such that

$$G(t, \sigma) = H(t)F(\sigma) \qquad (6)$$

for all t and σ.

Proof Suppose there exist continuous matrix functions F(t) and H(t) such that (6) is satisfied. Then the linear state equation (with continuous coefficient matrices)

$$\dot{x}(t) = F(t)u(t)$$
$$y(t) = H(t)x(t) \qquad (7)$$

is a realization of G(t, σ) since the transition matrix for zero is the identity.
Conversely suppose that G(t, σ) is realizable. We can assume that the linear state equation (4) is one realization. Then using the composition property of the transition matrix we write

$$G(t, \sigma) = C(t)\Phi(t, \sigma)B(\sigma) = C(t)\Phi(t, 0)\Phi(0, \sigma)B(\sigma)$$

and by defining H(t) = C(t)Φ(t, 0) and F(t) = Φ(0, t)B(t) the proof is complete.

□□□
While Theorem 10.4 provides the basic realizability criterion for weighting patterns, often it is not very useful because determining if G(t, σ) can be factored in the requisite way can be difficult. In addition a simple example shows that the realization (7) can be displeasing compared to alternatives.

10.5 Example For the weighting pattern

$$G(t, \sigma) = e^{-(t-\sigma)}$$

an obvious factorization gives a dimension-one realization corresponding to (7) as

$$\dot{x}(t) = e^{t}u(t)$$
$$y(t) = e^{-t}x(t)$$

While this linear state equation has an unbounded coefficient and clearly is not uniformly exponentially stable, neither of these ills is shared by the dimension-one realization

$$\dot{x}(t) = -x(t) + u(t)$$
$$y(t) = x(t)$$
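The agreement of the two realizations in this example can be checked on a grid of arguments. The following is a minimal sketch, assuming NumPy and assuming the weighting pattern e^{-(t-σ)}; the function names and the grid are hypothetical. The factored form uses H(t) = e^{-t} and F(σ) = e^{σ} with zero A(t), and the time-invariant form uses c e^{a(t-σ)} b with a = -1 and b = c = 1.

```python
import numpy as np

def weighting_pattern(t, sigma):
    """Assumed weighting pattern G(t, sigma) = exp(-(t - sigma))."""
    return np.exp(-(t - sigma))

def factored_realization(t, sigma):
    """H(t) F(sigma) for the zero-A(t) realization: H = e^{-t}, F = e^{sigma}."""
    return np.exp(-t) * np.exp(sigma)

def time_invariant_realization(t, sigma):
    """c e^{a (t - sigma)} b with a = -1, b = c = 1."""
    a, b, c = -1.0, 1.0, 1.0
    return c * np.exp(a * (t - sigma)) * b

# Grid covering both t >= sigma and t < sigma, since a weighting pattern
# is defined for all pairs of arguments.
ts = np.linspace(-2.0, 2.0, 9)
T, S = np.meshgrid(ts, ts, indexing="ij")

same_factored = np.allclose(weighting_pattern(T, S), factored_realization(T, S))
same_invariant = np.allclose(weighting_pattern(T, S), time_invariant_realization(T, S))
print(same_factored, same_invariant)

# The coefficient F(t) = e^{t} of the factored realization grows without bound.
print(np.exp(50.0))
```

Both realizations have dimension one and the same weighting pattern, yet only the second has bounded coefficients.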

Minimal Realization
We now consider the problem of characterizing minimal realizations of a realizable weighting pattern. It is convenient to make use of some simple observations mentioned in earlier chapters, but perhaps not emphasized. The first is that properties of controllability on [t_0, t_f] and observability on [t_0, t_f] are not influenced by a change of state variables. Second, if (4) is an n-dimensional realization of a given weighting pattern, then the linear state equation obtained by changing variables according to z(t) = P^{-1}(t)x(t) also is an n-dimensional realization of the same weighting pattern. In particular it is easy to verify that P(t) = Φ_A(t, t_0) satisfies

$$P^{-1}(t)A(t)P(t) - P^{-1}(t)\dot{P}(t) = 0$$

for all t, so the linear state equation in the new state z(t) defined via this variable change has the economical form

$$\dot{z}(t) = P^{-1}(t)B(t)u(t)$$
$$y(t) = C(t)P(t)z(t)$$

Therefore we often postulate realizations with zero A(t) for simplicity, and without loss of generality.
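As a worked illustration of this variable change, consider the special case of constant coefficient matrices A, B, C with t_0 = 0 (an assumption made here only to obtain closed forms). Then Φ_A(t, 0) = e^{At}, and the zero-A(t) realization and its weighting pattern are:

```latex
\begin{align*}
P(t) &= e^{At}, \qquad P^{-1}(t) = e^{-At} \\
\dot{z}(t) &= e^{-At}B\,u(t) \\
y(t) &= Ce^{At}z(t) \\
\big[Ce^{At}\big]\big[e^{-A\sigma}B\big] &= Ce^{A(t-\sigma)}B = C\,\Phi_A(t,\sigma)\,B
\end{align*}
```

The last line confirms that the weighting pattern is unchanged, at the cost of time-varying coefficients in the new realization.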
It is not surprising, in view of Example 10.1, that controllability and observability play a role in characterizing minimality. However it might be a surprise that these concepts tell the whole story.

10.6 Theorem Suppose the linear state equation (4) is a realization of the weighting pattern G(t, σ). Then (4) is a minimal realization of G(t, σ) if and only if for some t_0 and t_f > t_0 it is both controllable and observable on [t_0, t_f].

Proof Sufficiency is proved via the contrapositive, by supposing that an n-dimensional realization (4) is not minimal. Without loss of generality it can be assumed that A(t) = 0 for all t. Then there is a lower-dimension realization of G(t, σ), and again it can be assumed to have the form

$$\dot{z}(t) = F(t)u(t)$$
$$y(t) = H(t)z(t)$$

where the dimension of z(t) is n_z < n. Writing the weighting pattern in terms of both realizations gives

$$C(t)B(\sigma) = H(t)F(\sigma)$$

for all t and σ. This implies

$$C^T(t)C(t)B(\sigma)B^T(\sigma) = C^T(t)H(t)F(\sigma)B^T(\sigma)$$

for all t, σ. For any t_0 and any t_f > t_0 we can integrate this expression with respect to t, and then with respect to σ, to obtain

$$M(t_0, t_f)W(t_0, t_f) = \int_{t_0}^{t_f} C^T(t)H(t)\,dt \int_{t_0}^{t_f} F(\sigma)B^T(\sigma)\,d\sigma \qquad (10)$$

Since the right side is the product of an n × n_z matrix and an n_z × n matrix, it cannot be full rank, and thus (10) shows that M(t_0, t_f) and W(t_0, t_f) cannot both be invertible. Furthermore this argument holds regardless of t_0 and t_f > t_0, so that the state equation (4), with A(t) zero, cannot be both controllable and observable on any interval. Therefore sufficiency of the controllability/observability condition is established.

For the converse suppose (4) is a minimal realization of the weighting pattern G(t, σ), again with A(t) = 0 for all t. To prove that there exist t_0 and t_f > t_0 such that

$$W(t_0, t_f) = \int_{t_0}^{t_f} B(t)B^T(t)\,dt$$

and

$$M(t_0, t_f) = \int_{t_0}^{t_f} C^T(t)C(t)\,dt$$

are invertible, the following strategy is employed. First we show that if either W(t_0, t_f) or M(t_0, t_f) is singular for all t_0 and t_f with t_f > t_0, then minimality is contradicted. This gives existence of intervals [t_0^a, t_f^a] and [t_0^b, t_f^b] such that W(t_0^a, t_f^a) and M(t_0^b, t_f^b) both are invertible. Then taking t_0 = min[t_0^a, t_0^b] and t_f = max[t_f^a, t_f^b], the positive-definiteness properties of controllability and observability Gramians imply that both W(t_0, t_f) and M(t_0, t_f) are invertible.
Embarking on this program, suppose that for every interval [t_0, t_f] the matrix W(t_0, t_f) is not invertible. Then given t_0 and t_f there exists a nonzero n × 1 vector x, in general depending on t_0 and t_f, such that

$$0 = x^T W(t_0, t_f)x = \int_{t_0}^{t_f} x^T B(t)B^T(t)x\,dt$$

This gives x^T B(t) = 0 for t ∈ [t_0, t_f]. Next an analysis argument is used to prove that there exists at least one such x that is independent of t_0 and t_f.
By the remarks above, there is for each positive integer k an n × 1 vector x_k satisfying

$$\|x_k\| = 1; \quad x_k^T B(t) = 0, \quad t \in [-k, k]$$

In this way we define a bounded (by unity) sequence of n × 1 vectors {x_k}, k = 1, 2, ..., and it follows that there exists a convergent subsequence {x_{k_j}}, j = 1, 2, .... Denote the limit as

$$x_0 = \lim_{j \to \infty} x_{k_j}$$

To conclude that x_0^T B(t) = 0 for all t, suppose we are given any time t_a. Then there exists a positive integer J_a such that t_a ∈ [-k_j, k_j] for all j ≥ J_a. Therefore x_{k_j}^T B(t_a) = 0 for all j ≥ J_a, which implies, passing to the limit, x_0^T B(t_a) = 0.

Now let P^{-1} be a constant, invertible, n × n matrix with bottom row x_0^T. Using P^{-1} as a change of state variables gives another minimal realization of the weighting pattern, with coefficient matrices

$$P^{-1}B(t) = \begin{bmatrix} B_1(t) \\ 0_{1 \times m} \end{bmatrix}, \qquad C(t)P = \begin{bmatrix} C_1(t) & C_2(t) \end{bmatrix}$$

where B_1(t) is (n-1) × m, and C_1(t) is p × (n-1). Then an easy calculation gives

$$G(t, \sigma) = C_1(t)B_1(\sigma)$$

so that the linear state equation

$$\dot{z}(t) = B_1(t)u(t)$$
$$y(t) = C_1(t)z(t) \qquad (12)$$

is a realization for G(t, σ) of dimension n - 1. This contradicts minimality of the original, dimension-n realization, so there must be at least one t_0^a and one t_f^a > t_0^a such that W(t_0^a, t_f^a) is invertible.
A similar argument shows that there exists at least one t_0^b and one t_f^b > t_0^b such that M(t_0^b, t_f^b) is invertible. Finally taking t_0 = min[t_0^a, t_0^b] and t_f = max[t_f^a, t_f^b] shows that the minimal realization (4) is both controllable and observable on [t_0, t_f].

□□□
Exercise 10.9 shows, in a somewhat indirect fashion, that all minimal realizations
of a given weighting pattern are related by an invertible change of state variables. (In
the time-invariant setting, this result is proved in Theorem 10.14 by explicit construction
of the state variable change.) The important implication is that minimal realizations of a
weighting pattern are unique in a meaningful sense. However it should be emphasized
that, for time-varying realizations, properties of interest may not be shared by different
minimal realizations. Example 10.5 provides a specific illustration.
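For time-invariant realizations the controllability and observability conditions reduce to rank tests, so the criterion of Theorem 10.6 can be sketched numerically. The following is a minimal sketch, assuming NumPy; the matrices and the helper names `ctrb`, `obsv`, and `passes_rank_tests` are hypothetical. It contrasts a dimension-2 realization with a dimension-3 realization of the same input-output behavior built as in Example 10.1.

```python
import numpy as np

def ctrb(A, B):
    """Controllability matrix [B, AB, ..., A^(n-1) B]."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def obsv(A, C):
    """Observability matrix [C; CA; ...; C A^(n-1)]."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

def passes_rank_tests(A, B, C):
    """True when both the controllability and observability matrices have rank n."""
    n = A.shape[0]
    controllable = np.linalg.matrix_rank(ctrb(A, B)) == n
    observable = np.linalg.matrix_rank(obsv(A, C)) == n
    return bool(controllable and observable)

# Hypothetical dimension-2 realization.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Dimension-3 realization with one disconnected state appended.
A_aug = np.block([[A, np.zeros((2, 1))], [np.zeros((1, 2)), np.array([[-5.0]])]])
B_aug = np.vstack([B, np.zeros((1, 1))])
C_aug = np.hstack([C, np.zeros((1, 1))])

minimal_2 = passes_rank_tests(A, B, C)
minimal_3 = passes_rank_tests(A_aug, B_aug, C_aug)
print(minimal_2, minimal_3)
```

The appended state can be neither influenced by the input nor seen at the output, so the larger realization fails both rank tests and is not minimal.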

Special Cases
Another issue in realization theory is characterizing realizability of a weighting pattern

given in the general time-varying form in terms of special classes of linear state
equations. The cases of periodic and time-invariant linear state equations are addressed
here. Of course by a T-periodic linear state equation we mean a state equation of the
form (4) where A (t), B (t), and C (t) all are periodic with the same period T.

10.7 Theorem A weighting pattern G(t, σ) is realizable by a periodic linear state equation if and only if it is realizable and there exists a finite positive constant T such that

$$G(t + T, \sigma + T) = G(t, \sigma) \qquad (13)$$

for all t and σ. If these conditions hold, then there exists a minimal realization of G(t, σ) that is periodic.

Proof If G(t, σ) has a periodic realization with period T, then obviously G(t, σ) is realizable. Furthermore in terms of the realization we can write

$$G(t, \sigma) = C(t)\Phi_A(t, \sigma)B(\sigma)$$

and

$$G(t + T, \sigma + T) = C(t + T)\Phi_A(t + T, \sigma + T)B(\sigma + T)$$

In the proof of Property 5.11 it is shown that Φ_A(t + T, σ + T) = Φ_A(t, σ) for T-periodic A(t), so (13) follows easily.
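Conversely suppose that G(t, σ) is realizable and (13) holds. We assume that

$$\dot{x}(t) = B(t)u(t)$$
$$y(t) = C(t)x(t)$$

is a minimal realization of G(t, σ) with dimension n. Then

$$G(t, \sigma) = C(t)B(\sigma) \qquad (14)$$

and there exist finite times t_0 and t_f > t_0 such that

$$W(t_0, t_f) = \int_{t_0}^{t_f} B(\sigma)B^T(\sigma)\,d\sigma, \qquad M(t_0, t_f) = \int_{t_0}^{t_f} C^T(t)C(t)\,dt$$

both are invertible. (Be careful in this proof not to confuse the transpose T and the constant T in (13).) Let

$$\hat{W}(t_0, t_f) = \int_{t_0}^{t_f} B(\sigma - T)B^T(\sigma)\,d\sigma, \qquad \hat{M}(t_0, t_f) = \int_{t_0}^{t_f} C^T(\sigma)C(\sigma + T)\,d\sigma$$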

Then replacing σ by σ - T in (13), and writing the result in terms of (14), leads to

$$C(t + T)B(\sigma) = C(t)B(\sigma - T) \qquad (15)$$

for all t and σ. Postmultiplying this expression by B^T(σ) and integrating with respect to σ from t_0 to t_f gives

$$C(t + T) = C(t)\hat{W}(t_0, t_f)W^{-1}(t_0, t_f) \qquad (16)$$

for all t. Similarly, premultiplying (15) by C^T(t) and integrating with respect to t yields

$$B(\sigma - T) = M^{-1}(t_0, t_f)\hat{M}(t_0, t_f)B(\sigma) \qquad (17)$$

for all σ. Substituting (16) and (17) back into (15), premultiplying and postmultiplying by C^T(t) and B^T(σ) respectively, and integrating with respect to both t and σ gives

$$\hat{W}(t_0, t_f)W^{-1}(t_0, t_f) = M^{-1}(t_0, t_f)\hat{M}(t_0, t_f) \qquad (18)$$

We denote by P the real n × n matrix in (18), and establish invertibility of P by a simple contradiction argument as follows. If P is not invertible, there exists a nonzero n × 1 vector x such that x^T P = 0. Then (17) gives

$$x^T B(\sigma - T) = 0$$

for all σ. This implies

$$x^T \int_{t_0}^{t_f + T} B(\sigma - T)B^T(\sigma - T)\,d\sigma\; x = 0$$

and a change of integration variable shows that x^T W(t_0 - T, t_f)x = 0. But then x^T W(t_0, t_f)x = 0, which contradicts invertibility of W(t_0, t_f).

Finally we use the mathematical fact (see Exercise 5.20) that there exists a real n × n matrix A such that

$$P^2 = e^{A2T}$$

Letting

$$H(t) = C(t)e^{-At}, \qquad F(t) = e^{At}B(t)$$

it is easy to see from (14) that the state equation

$$\dot{z}(t) = Az(t) + F(t)u(t)$$
$$y(t) = H(t)z(t) \qquad (20)$$

is a realization of G(t, σ). Furthermore, using (16),

$$H(t + 2T) = C(t + 2T)e^{-A(t + 2T)} = C(t)P^2 e^{-A2T}e^{-At} = C(t)e^{-At} = H(t)$$

A similar demonstration for F(t), using (17), shows that (20) is a 2T-periodic realization for G(t, σ). Also, since (20) has dimension n, it is a minimal realization.

□□□

Next we consider the characterization of weighting patterns that admit a time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t) \qquad (21)$$

as a realization.

10.8 Theorem A weighting pattern G(t, σ) is realizable by a time-invariant linear state equation (21) if and only if G(t, σ) is realizable, continuously differentiable with respect to both t and σ, and

$$G(t, \sigma) = G(t - \sigma, 0) \qquad (22)$$

for all t and σ. If these conditions hold, then there exists a minimal realization of G(t, σ) that is time invariant.

Proof If the weighting pattern has a time-invariant realization (21), then obviously it is realizable. Furthermore we can write

$$G(t, \sigma) = Ce^{A(t - \sigma)}B = Ce^{At}e^{-A\sigma}B$$

and continuous differentiability is clear, while verification of (22) is straightforward.
For the converse suppose the weighting pattern is realizable, continuously differentiable in both t and σ, and satisfies (22). Then G(t, σ) has a minimal realization. Invoking a change of variables, assume that

$$\dot{x}(t) = B(t)u(t)$$
$$y(t) = C(t)x(t) \qquad (23)$$

is an n-dimensional minimal realization, where both C(t) and B(t) are continuously differentiable. Also from Theorem 10.6 there exist t_0 and t_f > t_0 such that

$$W(t_0, t_f) = \int_{t_0}^{t_f} B(t)B^T(t)\,dt, \qquad M(t_0, t_f) = \int_{t_0}^{t_f} C^T(t)C(t)\,dt$$

both are invertible. These Gramians are deployed as follows to replace (23) by a time-invariant realization of the same dimension.

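From (22), and the continuous-differentiability hypothesis,

$$\frac{\partial}{\partial t} G(t, \sigma) = -\frac{\partial}{\partial \sigma} G(t, \sigma)$$

for all t and σ. Writing this in terms of the minimal realization (23) and postmultiplying by B^T(σ) yields

$$0 = \Big[\frac{d}{dt}C(t)\Big]B(\sigma)B^T(\sigma) + C(t)\Big[\frac{d}{d\sigma}B(\sigma)\Big]B^T(\sigma)$$

for all t, σ. Integrating both sides with respect to σ from t_0 to t_f gives

$$0 = \Big[\frac{d}{dt}C(t)\Big]W(t_0, t_f) + C(t)\int_{t_0}^{t_f} \Big[\frac{d}{d\sigma}B(\sigma)\Big]B^T(\sigma)\,d\sigma \qquad (24)$$

Now define a constant n × n matrix A by

$$A = -\int_{t_0}^{t_f} \Big[\frac{d}{d\sigma}B(\sigma)\Big]B^T(\sigma)\,d\sigma\; W^{-1}(t_0, t_f)$$

Then (24) can be rewritten as

$$\dot{C}(t) = C(t)A$$

and this matrix differential equation has the unique solution

$$C(t) = C(0)e^{At}$$

Therefore

$$G(t, \sigma) = C(t)B(\sigma) = C(t - \sigma)B(0) = C(0)e^{A(t - \sigma)}B(0)$$

and the time-invariant linear state equation

$$\dot{z}(t) = Az(t) + B(0)u(t)$$
$$y(t) = C(0)z(t) \qquad (25)$$

is a realization of G(t, σ). Furthermore (25) has dimension n, and thus is a minimal realization.

□□□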
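As a worked illustration of the construction in this proof, consider the scalar weighting pattern e^{-(t-σ)} with the zero-A(t) realization B(t) = e^{t}, C(t) = e^{-t}. These choices are assumptions made here only to obtain a closed-form check; any t_0 < t_f may be used.

```latex
\begin{align*}
W(t_0,t_f) &= \int_{t_0}^{t_f} e^{2\sigma}\,d\sigma, \qquad
\int_{t_0}^{t_f}\Big[\tfrac{d}{d\sigma}B(\sigma)\Big]B^T(\sigma)\,d\sigma
 = \int_{t_0}^{t_f} e^{2\sigma}\,d\sigma \\
A &= -\int_{t_0}^{t_f}\Big[\tfrac{d}{d\sigma}B(\sigma)\Big]B^T(\sigma)\,d\sigma\;W^{-1}(t_0,t_f) = -1 \\
C(t) &= C(0)e^{At} = e^{-t}, \qquad B(0) = 1, \quad C(0) = 1 \\
\dot{z}(t) &= -z(t) + u(t), \qquad y(t) = z(t)
\end{align*}
```

The resulting time-invariant realization has weighting pattern e^{-(t-σ)}, and A does not depend on the choice of interval, as the proof requires.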
In the context of time-invariant linear state equations, the weighting pattern (or impulse response) normally would be specified as a function of a single variable, say, G(t). In this situation we can set G_a(t, σ) = G(t - σ). Then (22) is satisfied automatically, and Theorem 10.4 can be applied to G_a(t, σ). However more explicit realizability results can be obtained for the time-invariant case.
10.9 Example

The weighting pattern

G(t, a) =
is realizable by Theorem 10.4, though the condition (22) for time-invariant realizability
clearly fails. For the weighting pattern

G(t, a) =
(22) is easy to verify:

G(ta, 0) =

= G(t, a)


However it takes a bit of thought even in this simple case to see that by Theorem 10.4

the weighting pattern is not realizable. (Remark 10.12 gives the answer more easily.)

Time-Invariant Case
Realizability and minimality issues are somewhat more direct in the time-invariant case. While realizability conditions on an impulse response G(t) are addressed further in Chapter 11, here we reconstitute the basic realizability criterion in Theorem 10.8 in terms of the transfer function G(s), the Laplace transform of G(t). Then Theorem 10.6 is replayed, with a simpler proof, to characterize minimality in terms of controllability and observability. Finally we show explicitly that all minimal realizations of a given transfer function (or impulse response) are related by a change of state variables.
In place of the time-domain description of input-output behavior

$$y(t) = \int_0^t G(t - \tau)u(\tau)\,d\tau$$

consider the input-output relation written in the form

$$Y(s) = G(s)U(s) \qquad (26)$$

Of course

$$G(s) = \int_0^\infty G(t)e^{-st}\,dt$$

and, similarly, Y(s) and U(s) are the Laplace transforms of the output and input signals. Now the question of realizability is: Given a p × m transfer function G(s), when does there exist a time-invariant linear state equation of the form (21) such that

$$C(sI - A)^{-1}B = G(s) \qquad (27)$$

Recall from Chapter 5 that a rational function is strictly proper if the degree of the numerator polynomial is strictly less than the degree of the denominator polynomial.

10.10 Theorem The transfer function G(s) admits a time-invariant realization (21) if and only if each entry of G(s) is a strictly-proper rational function of s.

Proof If G(s) has a time-invariant realization (21), then (27) holds. As argued in Chapter 5, each entry of (sI - A)^{-1} is a strictly-proper rational function. Linear combinations of strictly-proper rational functions are strictly-proper rational functions, so G(s) in (27) has entries that are strictly-proper rational functions.
Now suppose that each entry, G_ij(s), is a strictly-proper rational function. We can assume that the denominator polynomial of each G_ij(s) is monic, that is, the coefficient of the highest power of s is unity. Let

$$d(s) = s^r + d_{r-1}s^{r-1} + \cdots + d_0$$

be the (monic) least common multiple of these denominator polynomials. Then

Chapter 10

Realizability

d(s)G(s) can be written as a polynomial in s with coefficients that are p x m constant


matrices:

d(s)G(s) =Nr_iSr_I

+ N1s + N0

(28)

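From this data we will show that the $mr$-dimensional linear state equation specified by the partitioned coefficient matrices

$$A = \begin{bmatrix} 0_m & I_m & \cdots & 0_m \\ \vdots & \vdots & \ddots & \vdots \\ 0_m & 0_m & \cdots & I_m \\ -d_0 I_m & -d_1 I_m & \cdots & -d_{r-1} I_m \end{bmatrix}, \quad B = \begin{bmatrix} 0_m \\ \vdots \\ 0_m \\ I_m \end{bmatrix}, \quad C = \begin{bmatrix} N_0 & N_1 & \cdots & N_{r-1} \end{bmatrix}$$

is a realization of $G(s)$. Let

$$Z(s) = (sI - A)^{-1}B \tag{29}$$

and partition the $mr \times m$ matrix $Z(s)$ into $r$ blocks $Z_1(s), \ldots, Z_r(s)$, each $m \times m$. Multiplying (29) by $(sI - A)$ and writing the result in terms of submatrices gives the set of relations

$$Z_{i+1}(s) = sZ_i(s), \quad i = 1, \ldots, r-1 \tag{30}$$

and

$$sZ_r(s) + d_0 Z_1(s) + d_1 Z_2(s) + \cdots + d_{r-1}Z_r(s) = I_m \tag{31}$$

Using (30) to rewrite (31) in terms of $Z_1(s)$ gives

$$Z_1(s) = \frac{1}{d(s)}\, I_m$$

Therefore, from (30) again,

$$Z(s) = \frac{1}{d(s)} \begin{bmatrix} I_m \\ sI_m \\ \vdots \\ s^{r-1}I_m \end{bmatrix}$$

Finally multiplying through by $C$ yields

$$C(sI - A)^{-1}B = \frac{1}{d(s)}\left[ N_0 + N_1 s + \cdots + N_{r-1}s^{r-1} \right] = G(s)$$

□□□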

The realization for $G(s)$ provided in this proof usually is far from minimal, though it is easy to show that it always is controllable. Construction of minimal realizations in both the time-varying and time-invariant cases is discussed further in Chapter 11.
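The block-companion construction in the proof of Theorem 10.10 can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the helper names `block_companion` and `check` and the sample transfer function $G(s) = [\,1/(s+1) \;\; 1/(s+2)\,]$ are hypothetical and do not come from the text. The check mirrors the proof: it forms $Z(s_0) = d(s_0)^{-1}[\,I_m;\; s_0 I_m;\; \ldots\,]$ at a test point $s_0$ and confirms $(s_0 I - A)Z = B$ and $CZ = G(s_0)$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def block_companion(d, N, m):
    """Build (A, B, C) from d = [d_0..d_{r-1}] (monic d(s)) and N = [N_0..N_{r-1}], each p x m."""
    r, n, p = len(d), m * len(d), len(N[0])
    A = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n - m):                      # identity blocks above the last block row
        A[i][i + m] = Fraction(1)
    for k in range(r):                          # last block row: -d_k I_m
        for j in range(m):
            A[n - m + j][k * m + j] = -Fraction(d[k])
    B = [[Fraction(int(i - (n - m) == j)) for j in range(m)] for i in range(n)]
    C = [[Fraction(N[k][i][j]) for k in range(r) for j in range(m)] for i in range(p)]
    return A, B, C

def check(d, N, m, s0):
    """Return ((s0 I - A) Z == B, C Z == G(s0)) with Z built as in the proof."""
    A, B, C = block_companion(d, N, m)
    r, n, s0 = len(d), m * len(d), Fraction(s0)
    ds = s0 ** r + sum(Fraction(d[k]) * s0 ** k for k in range(r))
    Z = [[s0 ** (i // m) / ds if i % m == j else Fraction(0) for j in range(m)]
         for i in range(n)]
    sI_A = [[(s0 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    G = [[sum(Fraction(N[k][i][j]) * s0 ** k for k in range(r)) / ds
          for j in range(m)] for i in range(len(N[0]))]
    return matmul(sI_A, Z) == B, matmul(C, Z) == G

# Hypothetical example: d(s) = (s+1)(s+2) = s^2 + 3s + 2, d(s)G(s) = [s+2, s+1].
print(check([2, 3], [[[2, 1]], [[1, 1]]], m=2, s0=1))
```

The resulting state equation has dimension $mr = 4$ for this example, although a two-dimensional realization exists, which illustrates the remark that the construction usually is far from minimal.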

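10.11 Example For $m = p = 1$ the calculation in the proof of Theorem 10.10 simplifies to yield, in our customary notation, the result that the transfer function of the linear state equation

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & \cdots & -a_{n-1} \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} u(t)$$

$$y(t) = \begin{bmatrix} c_0 & c_1 & \cdots & c_{n-1} \end{bmatrix} x(t) \tag{32}$$

is given by

$$G(s) = \frac{c_{n-1}s^{n-1} + \cdots + c_1 s + c_0}{s^n + a_{n-1}s^{n-1} + \cdots + a_1 s + a_0} \tag{33}$$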

Thus the realization (32) can be written down by inspection of the numerator and
denominator coefficients of the strictly-proper rational transfer function in (33). An easy
drill in contradiction proofs shows that the linear state equation (32) is a minimal

realization of the transfer function (33) if and only if the numerator and denominator
polynomials in (33) have no roots in common. Arriving at the analogous result in the
multi-input, multi-output case takes additional work that is carried out in Chapters 16
and 17.
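The coprimeness criterion for (32) can be checked numerically. The sketch below is an illustration rather than part of the text: the function names `rank`, `companion`, and `observability_rank` and the two sample transfer functions are assumptions. Because the form (32) is always controllable, it is minimal exactly when its observability matrix has rank $n$, so the code only computes that rank, using exact rational arithmetic.

```python
from fractions import Fraction

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    rk = 0
    for col in range(len(M[0])):
        piv = next((r for r in range(rk, len(M)) if M[r][col] != 0), None)
        if piv is None:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        for r in range(rk + 1, len(M)):
            f = M[r][col] / M[rk][col]
            M[r] = [a - f * b for a, b in zip(M[r], M[rk])]
        rk += 1
    return rk

def companion(a, c):
    """Coefficients of (32) from a = [a_0..a_{n-1}] and c = [c_0..c_{n-1}]."""
    n = len(a)
    A = [[Fraction(int(j == i + 1)) for j in range(n)] for i in range(n - 1)]
    A.append([-Fraction(x) for x in a])
    b = [Fraction(0)] * (n - 1) + [Fraction(1)]
    return A, b, [Fraction(x) for x in c]

def observability_rank(a, c):
    """Rank of the matrix with rows c, cA, ..., cA^(n-1)."""
    A, _, row = companion(a, c)
    n, rows = len(a), []
    for _ in range(n):
        rows.append(row)
        row = [sum(row[k] * A[k][j] for k in range(n)) for j in range(n)]
    return rank(rows)

# Denominator s^2 + 3s + 2 = (s+1)(s+2) in both hypothetical cases.
print(observability_rank([2, 3], [1, 1]))   # numerator s + 1 shares the root -1
print(observability_rank([2, 3], [3, 1]))   # numerator s + 3 has no common root
```

In the first case the rank is 1, so (32) is not minimal; in the second the rank is 2, so (32) is minimal, in agreement with the common-root test.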

10.12 Remark Using partial fraction expansion, Theorem 10.10 yields a realizability condition on the weighting pattern $G(t)$ of a time-invariant system. Namely $G(t)$ is realizable if and only if it can be written as a finite sum of the form

$$G(t) = \sum_{k=1} \sum_{j=1} G_{kj}\, t^{\,j-1} e^{\lambda_k t}$$

with the following conjugacy constraint. If $\lambda_q$ is complex, then for some $r$, $\lambda_r = \bar{\lambda}_q$, and the corresponding $p \times m$ coefficient matrices satisfy $G_{rj} = \bar{G}_{qj}$, $j = 1, \ldots, l$. While this condition characterizes realizability in a very literal way, it is less useful for technical purposes than the so-called Markov-parameter criterion in Chapter 11.

□□□

Proof of the following characterization of minimality follows the strategy of the proof of Theorem 10.6, but perhaps bears repeating in this simpler setting. The finicky are asked to forgive mild notational collisions caused by yet another traditional use of the symbol G.

10.13 Theorem Suppose the time-invariant linear state equation (21) is a realization of the transfer function $G(s)$. Then (21) is a minimal realization of $G(s)$ if and only if it is both controllable and observable.

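Proof Suppose (21) is an $n$-dimensional realization of $G(s)$ that is not minimal. Then there is a realization of $G(s)$, say

$$\dot{z}(t) = Fz(t) + Gu(t)$$
$$y(t) = Hz(t) \tag{34}$$

of dimension $n_z < n$. Therefore

$$Ce^{At}B = He^{Ft}G, \quad t \ge 0$$

and repeated differentiation with respect to $t$, followed by evaluation at $t = 0$, gives

$$CA^kB = HF^kG, \quad k = 0, 1, \ldots$$

Arranging this data, for $k = 0, \ldots, 2n-2$, in matrix form yields

$$\begin{bmatrix} CB & CAB & \cdots & CA^{n-1}B \\ \vdots & \vdots & & \vdots \\ CA^{n-1}B & CA^{n}B & \cdots & CA^{2n-2}B \end{bmatrix} = \begin{bmatrix} HG & HFG & \cdots & HF^{n-1}G \\ \vdots & \vdots & & \vdots \\ HF^{n-1}G & HF^{n}G & \cdots & HF^{2n-2}G \end{bmatrix}$$

that is,

$$\begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = \begin{bmatrix} H \\ HF \\ \vdots \\ HF^{n-1} \end{bmatrix} \begin{bmatrix} G & FG & \cdots & F^{n-1}G \end{bmatrix} \tag{35}$$

Since the right side is the product of an $(np) \times n_z$ matrix and an $n_z \times (nm)$ matrix, the rank of the product is no greater than $n_z$. But $n_z < n$, and we conclude that the realization (21) cannot be both controllable and observable. Thus, by the contrapositive, a controllable and observable realization is minimal.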
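Now suppose (21) is a (dimension-$n$) minimal realization of $G(s)$ but that it is not controllable. Then there exists an $n \times 1$ vector $q \ne 0$ such that

$$q^T \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = 0$$

Indeed $q^T A^k B = 0$ for all $k \ge 0$ by the Cayley-Hamilton theorem. Let $P^{-1}$ be an invertible $n \times n$ matrix with bottom row $q^T$, and let $z(t) = P^{-1}x(t)$ to obtain the linear state equation

$$\dot{z}(t) = \hat{A}z(t) + \hat{B}u(t)$$
$$y(t) = \hat{C}z(t) \tag{36}$$

which also is a dimension-$n$, minimal realization of $G(s)$. The coefficient matrices in (36) can be partitioned as

$$\hat{A} = P^{-1}AP = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ \hat{A}_{21} & \hat{A}_{22} \end{bmatrix}, \quad \hat{B} = P^{-1}B = \begin{bmatrix} \hat{B}_1 \\ 0 \end{bmatrix}, \quad \hat{C} = CP = \begin{bmatrix} \hat{C}_1 & \hat{C}_2 \end{bmatrix}$$

where $\hat{A}_{11}$ is $(n-1) \times (n-1)$, $\hat{B}_1$ is $(n-1) \times m$, and $\hat{C}_1$ is $p \times (n-1)$. In terms of these partitions we know by construction of $P$ that $\hat{A}\hat{B} = P^{-1}AB$ has the form

$$\hat{A}\hat{B} = \begin{bmatrix} \hat{A}_{11}\hat{B}_1 \\ \hat{A}_{21}\hat{B}_1 \end{bmatrix} = \begin{bmatrix} \hat{A}_{11}\hat{B}_1 \\ 0 \end{bmatrix}$$

Furthermore, since the bottom row of $P^{-1}A^kB$ is zero for all $k \ge 0$,

$$\hat{A}^k\hat{B} = \begin{bmatrix} \hat{A}_{11}^k\hat{B}_1 \\ 0 \end{bmatrix}, \quad k \ge 0 \tag{37}$$

Then $\hat{A}_{11}$, $\hat{B}_1$, and $\hat{C}_1$ define an $(n-1)$-dimensional realization of $G(s)$, since

$$\hat{C}e^{\hat{A}t}\hat{B} = \sum_{k=0}^{\infty} \hat{C}\hat{A}^k\hat{B}\,\frac{t^k}{k!} = \sum_{k=0}^{\infty} \begin{bmatrix} \hat{C}_1 & \hat{C}_2 \end{bmatrix} \begin{bmatrix} \hat{A}_{11}^k\hat{B}_1 \\ 0 \end{bmatrix} \frac{t^k}{k!} = \hat{C}_1 e^{\hat{A}_{11}t}\hat{B}_1$$

Of course this contradicts the original minimality assumption. A similar argument leads to a similar contradiction if we assume the minimal realization (21) is not observable. Therefore a minimal realization is both controllable and observable.

□□□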

Next we show that a minimal time-invariant realization of a specified transfer function, or weighting pattern, is unique up to a change of state variables, and provide a formula for the variable change that relates any two minimal realizations.

10.14 Theorem Suppose the time-invariant, $n$-dimensional linear state equations (21) and (34) both are minimal realizations of a specified transfer function. Then there exists a unique, invertible $n \times n$ matrix $P$ such that

$$F = P^{-1}AP, \quad G = P^{-1}B, \quad H = CP$$

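Proof To unclutter construction of the claimed $P$, let

$$C_a = \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix}, \quad C_f = \begin{bmatrix} G & FG & \cdots & F^{n-1}G \end{bmatrix}$$

$$O_a = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix}, \quad O_f = \begin{bmatrix} H \\ HF \\ \vdots \\ HF^{n-1} \end{bmatrix} \tag{38}$$

By hypothesis,

$$Ce^{At}B = He^{Ft}G$$

for all $t$. In particular, at $t = 0$, $CB = HG$. Differentiating repeatedly with respect to $t$, and evaluating at $t = 0$, gives

$$CA^kB = HF^kG, \quad k = 0, 1, \ldots \tag{39}$$

These equalities can be arranged in partitioned form to yield

$$O_aC_a = O_fC_f \tag{40}$$

Since a variable change $P$ that relates the two linear state equations is such that

$$C_f = P^{-1}C_a, \quad O_f = O_aP$$

it is natural to construct the $P$ of interest from these controllability and observability matrices. If $m = p = 1$, then $C_a$, $C_f$, $O_a$, and $O_f$ all are invertible $n \times n$ matrices and definition of $P$ is reasonably transparent. The general case is fussy.

By hypothesis the matrices in (38) all have (full) rank $n$, so a simple contradiction argument shows that the $n \times n$ matrices

$$C_aC_a^T, \quad C_fC_f^T, \quad O_a^TO_a, \quad O_f^TO_f$$

all are positive definite, hence invertible. Then the $n \times n$ matrices

$$P_c = C_aC_f^T(C_fC_f^T)^{-1}, \quad P_o = (O_f^TO_f)^{-1}O_f^TO_a$$

are such that, applying (40),

$$P_oP_c = (O_f^TO_f)^{-1}O_f^TO_aC_aC_f^T(C_fC_f^T)^{-1} = (O_f^TO_f)^{-1}O_f^TO_fC_fC_f^T(C_fC_f^T)^{-1} = I$$

Therefore we can set $P = P_c$ and $P^{-1} = P_o$. Applying (40) again gives

$$P^{-1}C_a = (O_f^TO_f)^{-1}O_f^TO_aC_a = (O_f^TO_f)^{-1}O_f^TO_fC_f = C_f \tag{41}$$

$$O_aP = O_aC_aC_f^T(C_fC_f^T)^{-1} = O_fC_fC_f^T(C_fC_f^T)^{-1} = O_f \tag{42}$$

Extracting the first $m$ columns from (41) and the first $p$ rows from (42) gives

$$P^{-1}B = G, \quad CP = H$$

Finally another arrangement of the data in (39) yields, in place of (40),

$$O_aAC_a = O_fFC_f$$

from which

$$P^{-1}AP = (O_f^TO_f)^{-1}O_f^TO_a\,A\,C_aC_f^T(C_fC_f^T)^{-1} = (O_f^TO_f)^{-1}O_f^TO_f\,F\,C_fC_f^T(C_fC_f^T)^{-1} = F \tag{43}$$

Thus we have exhibited an invertible state variable change relating the two minimal realizations. Uniqueness of the variable change follows by noting that if $\hat{P}$ is another such variable change, then

$$HF^k = C\hat{P}\,(\hat{P}^{-1}A\hat{P})^k = CA^k\hat{P}, \quad k = 0, 1, \ldots$$

and thus

$$O_a\hat{P} = O_f$$

This gives, in conjunction with (42),

$$O_a(P - \hat{P}) = 0 \tag{44}$$

and since $O_a$ has full rank $n$, $P = \hat{P}$.

□□□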
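The formula $P = C_aC_f^T(C_fC_f^T)^{-1}$ from the proof of Theorem 10.14 can be tried on a small case. The following standard-library Python sketch is an illustration only: the helper names and the two sample realizations (the second obtained from the first by a known variable change) are assumptions, not data from the text. Here $C_a$ and $C_f$ denote the controllability matrices of the realizations $(A, B, C)$ and $(F, G, H)$.

```python
from fractions import Fraction

def mul(X, Y):
    """Matrix product of lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(r) for r in zip(*X)]

def inv(X):
    """Gauss-Jordan inverse over the rationals (assumes X is invertible)."""
    n = len(X)
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(X)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def ctrb(A, B):
    """Controllability matrix [B AB ... A^(n-1)B]."""
    blocks, X = [], B
    for _ in range(len(A)):
        blocks.append(X)
        X = mul(A, X)
    return [sum((blk[i] for blk in blocks), []) for i in range(len(A))]

def state_change(A, B, F, G):
    """P = Ca Cf^T (Cf Cf^T)^(-1), as in the proof of Theorem 10.14."""
    Ca, Cf = ctrb(A, B), ctrb(F, G)
    return mul(mul(Ca, transpose(Cf)), inv(mul(Cf, transpose(Cf))))

# Hypothetical pair of minimal realizations of 1/(s^2 + 3s + 2).
A, B, C = [[0, 1], [-2, -3]], [[0], [1]], [[1, 0]]
F, G, H = [[2, 6], [-2, -5]], [[-1], [1]], [[1, 1]]
P = state_change(A, B, F, G)
print(P == [[1, 1], [0, 1]])
print(mul(inv(P), mul(A, P)) == F, mul(inv(P), B) == G, mul(C, P) == H)
```

Both lines print `True` for this pair, which was generated with the variable change $P = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$, so the formula recovers it.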

Additional Examples
Transparent examples of nonminimal physical systems include the disconnected bucket

system considered in Examples 6.18 and 9.12. This system is immediately recognizable
as a particular instance of Example 10.1, and it is clear how to obtain a minimal bucket
realization. Simply discard the disconnected bucket. We next focus on examples where
interaction of physical structure with the concept of a minimal state equation is more
subtle.

10.15 Example

The unity-parameter bucket system in Figure 10.16 is neither

controllable nor observable. As mentioned in Example 9.12, these conclusions might be


intuitive, and they are mathematically precise in terms of the linearized state equation
l

.r(t)=

y(f)= [0

0]x(t)

x(t) +

u(t)

(45)

Figure 10.16 A parallel three-bucket system.

Therefore (45) is not a minimal realization of its transfer function, and indeed a transfer-function calculation yields (in three different forms)

$$G(s) = \frac{(s+1)^2}{s^3 + 5s^2 + 5s + 1} = \frac{s+1}{s^2 + 4s + 1} = \frac{s+1}{(s + 0.27)(s + 3.73)} \tag{46}$$

Evidently minimal realizations of $G(s)$ have dimension two. And of course any number of two-dimensional linear state equations have this transfer function. If we want to describe two-bucket systems that realize (46), matters are less simple. Series two-bucket realizations do not exist, as can be seen from the general form of the transfer function given in Example 5.17. However a parallel two-bucket system of the form shown in Figure 10.17 can have the transfer function in (46). We draw this conclusion from a calculation of the transfer function for the system in Figure 10.17,

__L_.
r1c1

5+
2

r2c2+r1c2+r1c1
r1r2c1c2

5+ r1r2c1c2

and comparison to (46). The point is that by focusing on a particular type of physical
realization we must contend with state-equation realizations of constrained forms, and
the theory of (unconstrained) minimal realizations might not apply. See Note 10.6.
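The dimension count in Example 10.15 can be checked by comparing Markov parameters $cA^kb$. The sketch below rests on an assumption: the three-bucket coefficients `A3`, `b3`, `c3` are a guess chosen to be consistent with the transfer function (46), since the matrix entries of the linearized state equation are not legible here. The two-dimensional comparison system is simply the companion form for $(s+1)/(s^2+4s+1)$, and the function name `markov` is hypothetical.

```python
def markov(A, b, c, count):
    """Return [c b, c A b, ..., c A^(count-1) b] for a single-input, single-output system."""
    out, x = [], list(b)
    for _ in range(count):
        out.append(sum(ci * xi for ci, xi in zip(c, x)))
        x = [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]
    return out

# ASSUMED three-bucket coefficients (input to and output from the middle bucket).
A3 = [[-1, 1, 0], [1, -3, 1], [0, 1, -1]]
b3 = [0, 1, 0]
c3 = [0, 1, 0]

# Companion-form realization of (s + 1)/(s^2 + 4s + 1), dimension two.
A2 = [[0, 1], [-1, -4]]
b2 = [0, 1]
c2 = [1, 1]

m3 = markov(A3, b3, c3, 6)
m2 = markov(A2, b2, c2, 6)
print(m3 == m2, m3[:4])
```

Under these assumptions the first six Markov parameters coincide (they begin 1, -3, 11, -41), which is what equality of the two transfer functions requires.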

Figure 10.17 A parallel two-bucket system.

10.18 Example For the electrical circuit in Figure 10.19, with the indicated currents and voltages as input, output, and state variables, the state-equation description is

$$\dot{x}(t) = \begin{bmatrix} -1/(rc) & 0 \\ 0 & -r/l \end{bmatrix} x(t) + \begin{bmatrix} 1/(rc) \\ 1/l \end{bmatrix} u(t)$$

$$y(t) = \begin{bmatrix} -1/r & 1 \end{bmatrix} x(t) + (1/r)\,u(t) \tag{48}$$

Figure 10.19 An electrical circuit.

The transfer function, which is the driving-point admittance of the circuit, is

$$G(s) = \frac{(r^2c - l)\,s}{r^2cl\,s^2 + (rl + r^3c)\,s + r^2} + \frac{1}{r} \tag{49}$$

If the parameter values are such that $r^2c = l$, then $G(s) = 1/r$. In this case (48) clearly is not minimal, and it is easy to check that (48) is neither controllable nor observable. Indeed when $r^2c = l$ the circuit shown in Figure 10.19 is simply an over-built version of the circuit shown in Figure 10.20, at least as far as driving-point admittance is concerned.

Figure 10.20 An extremely simple electrical circuit.
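The collapse of the admittance to $1/r$ can be illustrated numerically. The sketch below is hedged: the coefficient values in `circuit` are an assumed reading of the state equation (48) for a series $rc$ branch in parallel with a series $rl$ branch, and the function names are hypothetical. Because the assumed $A$ is diagonal, the transfer function is evaluated directly as $D + \sum_i C_iB_i/(s - A_{ii})$.

```python
from fractions import Fraction

def circuit(r, c, l):
    """ASSUMED coefficients of (48): diagonal A, plus B, C and the direct term D = 1/r."""
    r, c, l = Fraction(r), Fraction(c), Fraction(l)
    A = [-1 / (r * c), -r / l]          # diagonal entries of A
    B = [1 / (r * c), 1 / l]
    C = [-1 / r, Fraction(1)]
    return A, B, C, 1 / r

def transfer(r, c, l, s):
    """G(s) = D + sum_i C_i B_i / (s - A_ii), valid because A is diagonal."""
    A, B, C, D = circuit(r, c, l)
    return D + sum(C[i] * B[i] / (Fraction(s) - A[i]) for i in range(2))

print(transfer(2, 1, 4, 3))   # r^2 c = l: admittance reduces to 1/r
print(transfer(1, 1, 2, 1))   # r^2 c != l: the dynamic term survives
```

With $r = 2$, $c = 1$, $l = 4$ the two partial-fraction terms cancel and the value is $1/2$ at every test point; with $r = c = 1$, $l = 2$ the value at $s = 1$ is $5/6$, which agrees with (49).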

EXERCISES
Exercise 10.1

For what values of the parameter a is the following state equation minimal?

.v(t)=

102

3 0 x(t) +
Ocxl

y(f) = [1

l]x(t)

u(t)

Exercise 10.2 Show that the time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t)$$

with $p = m$ is minimal if and only if

$$\dot{z}(t) = (A + BC)z(t) + Bu(t)$$
$$y(t) = Cz(t)$$

is minimal.

Exercise 10.3 For

(51)2
provide

time-invariant realizations that are controllable and observable, controllable but not

observable, observable but not controllable, and neither controllable nor observable.

Exercise 10.4 If F is n x n and

is n x a, show that

G(t,a)
has a

time-invariant realization if and only if

j=0,l,2,...
Exercise 10.5 Prove that the weighting pattern of the linear state equation

i(t) =Ax(t) + eFtBu(t)


y(t)
admits a time-invariant realization if AF = FA. Under this condition give one such realization.

Exercise 10.6 For a time-invariant realization

i(t) =Ax(t) + Bu(t)


y(t) = Cx(t)
where P(t) =
Show that the coefficients
consider the variable change z(t) =
of the new realization are bounded matrix functions, arid that a symmetry property is obtained.
Exercise 10.7 Consider a two-dimensional linear state equation with zero A (t) and

h(t)=

[b(t)J

c(t)=

II

where

b1(t)=

sint , t c

sint , t e [0, 27t]


0, otherwise

0,

[ 2ic, 0]

otheni'ise

Prove that this state equation is a minimal realization of its weighting pattern. What is the impulse
for t
What is the dimension of a minimal
response of the state equation, that is, G (t,
realization of this impulse response?

Exercise 10.8 Given a weighting pattern $G(t, \sigma) = H(t)F(\sigma)$, where $H(t)$ is $p \times n$ and $F(\sigma)$ is $n \times m$, and a constant $n \times n$ matrix $A$, show how to find a realization of the form

$$\dot{x}(t) = Ax(t) + B(t)u(t)$$
$$y(t) = C(t)x(t)$$

Exercise 10.9 Suppose the linear state equations

$$\dot{x}(t) = B(t)u(t)$$
$$y(t) = C(t)x(t)$$

and

$$\dot{z}(t) = F(t)u(t)$$
$$y(t) = H(t)z(t)$$

both are minimal realizations of the weighting pattern $G(t, \sigma)$. Show that there exists a constant invertible matrix $P$ such that $z(t) = Px(t)$. Conclude that any two minimal realizations of a given weighting pattern are related by a (time-varying) state variable change.

Exercise 10.10 Show that the weighting pattern $G(t, \sigma)$ admits a time-invariant realization if and only if $G(t, \sigma)$ is realizable, continuously differentiable with respect to both $t$ and $\sigma$, and

$$G(t + \tau, \sigma + \tau) = G(t, \sigma)$$

for all $t$, $\sigma$, and $\tau$.


Exercise 10.11 Using techniques from the proof of Theorem 10.8, prove that the only differentiable solutions of the $n \times n$ matrix functional equation

$$X(t + \sigma) = X(t)X(\sigma), \quad X(0) = I$$

are matrix exponentials.

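Exercise 10.12 Suppose the $p \times m$ transfer function $G(s)$ has the partial fraction expansion

$$G(s) = \sum_{i=1}^{r} \frac{G_i}{s - \lambda_i}$$

where $\lambda_1, \ldots, \lambda_r$ are real and distinct, and $G_1, \ldots, G_r$ are $p \times m$ matrices. Show that a minimal realization of $G(s)$ has dimension

$$n = \operatorname{rank} G_1 + \cdots + \operatorname{rank} G_r$$

Hint: Write $G_i = C_iB_i$ and consider the corresponding diagonal-$A$ realization of $G(s)$.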

Exercise 10.13 Given any continuous, $n \times n$ matrix function $A(t)$, do there exist continuous $n \times 1$ and $1 \times n$ vector functions $b(t)$ and $c(t)$ such that

$$\dot{x}(t) = A(t)x(t) + b(t)u(t)$$
$$y(t) = c(t)x(t)$$

is minimal? Repeat the question for constant $A$, $b$, and $c$.

NOTES

Note 10.1 In setting up the realizability question, we have circumvented fundamental issues involving the generality of the input-output representation

$$y(t) = \int_{t_0}^{t} G(t, \sigma)\,u(\sigma)\,d\sigma$$

This can be defended on grounds that the integral representation suffices to describe the input-output behaviors that can be generated by a linear state equation, but leaves open the question of more general linear input-output behavior. Also the definitions of concepts such as causality and time invariance for general linear input-output maps have been avoided. These matters call for a more sophisticated mathematical viewpoint, and they are considered in

I.W. Sandberg, "Linear maps and impulse responses," IEEE Transactions on Circuits and Systems, Vol. 35, No. 2, pp. 201-206, 1988

I.W. Sandberg, "Integral representations for linear maps," IEEE Transactions on Circuits and Systems, Vol. 35, No. 5, pp. 536-544, 1988

Note 10.2 An important result we do not discuss in this chapter is the canonical structure
theorem. Roughly this states that for a given linear state equation there exists a change of state
variables that displays the new state equation in terms of four component state equations. These
are, respectively, controllable and observable, controllable but not observable, observable but not
controllable, and neither controllable nor observable. Furthermore the weighting pattern of the
original state equation is identical to the weighting pattern of the controllable and observable part
of the new state equation. Aside from structural insight, to compute a minimal realization we can
start with any convenient realization, perform a state-variable change to display the controllable
and observable part, and discard the other parts. This circle of ideas is discussed for the time-varying case in several papers, some dating from the heady period of setting foundations:

R.E. Kalman, "Mathematical description of linear dynamical systems," SIAM Journal on Control and Optimization, Vol. 1, No. 2, pp. 152-192, 1963

R.E. Kalman, "On the computation of the reachable/observable canonical form," SIAM Journal on Control and Optimization, Vol. 20, No. 2, pp. 258-260, 1982

D.C. Youla, "The synthesis of linear dynamical systems from prescribed weighting patterns," SIAM Journal on Applied Mathematics, Vol. 14, No. 3, pp. 527-549, 1966

L. Weiss, "On the structure theory of linear differential systems," SIAM Journal on Control and Optimization, Vol. 6, No. 4, pp. 659-680, 1968

P. D'Alessandro, A. Isidori, A. Ruberti, "A new approach to the theory of canonical decomposition of linear dynamical systems," SIAM Journal on Control and Optimization, Vol. 11, No. 1, pp. 148-158, 1973

We treat the time-invariant canonical structure theorem by geometric methods in Chapter 18. There are many other sources; consult an original paper

E.G. Gilbert, "Controllability and observability in multivariable control systems," SIAM Journal on Control and Optimization, Vol. 1, No. 2, pp. 128-152, 1963

or the detailed textbook exposition, with variations, in Section 17 of

D.F. Delchamps, State Space and Input-Output Linear Systems, Springer-Verlag, New York, 1988

For a computational approach see

D.L. Boley, "Computing the Kalman decomposition: An optimal method," IEEE Transactions on Automatic Control, Vol. 29, No. 11, pp. 51-53, 1984 (Correction: Vol. 36, No. 11, p. 1341, 1991)

Finally some results in Chapter 13, including Exercise 13.14, are related to the canonical structure of time-invariant linear state equations.

Note 10.3 Subtleties regarding formulation of the realization question in terms of impulse responses versus formulation in terms of weighting patterns are discussed in Section 10.13 of

R.E. Kalman, P.L. Falb, M.A. Arbib, Topics in Mathematical System Theory, McGraw-Hill, New York, 1969

Note 10.4 An approach to the difficult problem of checking the realizability criterion in Theorem 10.4 is presented in

C. Bruni, A. Isidori, A. Ruberti, "A method of factorization of the impulse-response matrix," IEEE Transactions on Automatic Control, Vol. 13, No. 6, pp. 739-741, 1968

The hypotheses and constructions in this paper are related to those in Chapter 11.

Note 10.5 Further details and developments related to Exercise 10.11 can be found in

D. Kalman, A. Unger, "Combinatorial and functional identities in one-parameter matrices," American Mathematical Monthly, Vol. 94, No. 1, pp. 21-35, 1987

Note 10.6 Realizability also can be addressed in terms of linear state equations satisfying
constraints corresponding to particular types of physical systems. For example we might be
interested in realizability of a weighting pattern by a linear state equation that describes an
electrical circuit, or a compartmental (bucket) system, or that has nonnegative coefficients. Such
constraints can introduce significant complications. Many texts on circuit theory address this
issue, and for the other two examples we cite

H. Maeda, S. Kodama, F. Kajiya, "Compartmental system analysis: Realization of a class of linear systems with constraints," IEEE Transactions on Circuits and Systems, Vol. 24, No. 1, pp. 8-14, 1977

Y. Ohta, H. Maeda, S. Kodama, "Reachability, observability, and realizability of continuous-time positive systems," SIAM Journal on Control and Optimization, Vol. 22, No. 2, pp. 171-180, 1984

11
MINIMAL REALIZATION

We further examine the realization question introduced in Chapter 10, with two goals in mind. The first is to suitably strengthen the setting so that results can be obtained for realization of an impulse response rather than a weighting pattern. This is important because the impulse response in principle can be determined from input-output behavior of a physical system. The second goal is to obtain solutions of the minimal realization problem that are more constructive than those discussed in Chapter 10.

Assumptions

One adjustment we make to obtain a coherent minimal realization theory for impulse response representations is that the technical defaults are strengthened. It is assumed that a given $p \times m$ impulse response $G(t, \sigma)$, defined for all $t, \sigma$ with $t \ge \sigma$, is such that any derivatives that appear in the development are continuous for all $t, \sigma$ with $t \ge \sigma$. Similarly for the linear state equations considered in this chapter,

$$\dot{x}(t) = A(t)x(t) + B(t)u(t)$$
$$y(t) = C(t)x(t) \tag{1}$$

we assume $A(t)$, $B(t)$, and $C(t)$ are such that all derivatives that appear are continuous for all $t$. Imposing smoothness hypotheses in this way circumvents tedious counts and distracting lists of differentiability requirements.
Another adjustment is that strengthened forms of controllability and observability are used to characterize minimality of realizations. Recall from Definition 9.3 the $n \times m$ matrix functions

$$K_0(t) = B(t)$$
$$K_j(t) = -A(t)K_{j-1}(t) + \dot{K}_{j-1}(t), \quad j = 1, 2, \ldots \tag{2}$$

and for convenience let

$$W_k(t) = \begin{bmatrix} K_0(t) & K_1(t) & \cdots & K_{k-1}(t) \end{bmatrix}, \quad k = 1, 2, \ldots \tag{3}$$

Similarly from Definition 9.9 recall the $p \times n$ matrix functions

$$L_0(t) = C(t)$$
$$L_j(t) = L_{j-1}(t)A(t) + \dot{L}_{j-1}(t), \quad j = 1, 2, \ldots \tag{4}$$

and let

$$M_k(t) = \begin{bmatrix} L_0(t) \\ L_1(t) \\ \vdots \\ L_{k-1}(t) \end{bmatrix}, \quad k = 1, 2, \ldots \tag{5}$$

We define new types of controllability and observability for (1) in terms of the matrices $W_n(t)$ and $M_n(t)$, where of course $n$ is the dimension of the linear state equation (1). Unfortunately the terminology is not standard, though some justification for our choice can be found in Exercises 11.1 and 11.2.

11.1 Definition The linear state equation (1) is called instantaneously controllable if rank $W_n(t) = n$ for every $t$, and instantaneously observable if rank $M_n(t) = n$ for every $t$.

If (1) is a realization of a given impulse response $G(t, \sigma)$, that is,

$$G(t, \sigma) = C(t)\Phi(t, \sigma)B(\sigma), \quad t \ge \sigma$$

then a straightforward calculation shows that

$$\frac{\partial^i}{\partial t^i}\frac{\partial^j}{\partial \sigma^j}\, G(t, \sigma) = L_i(t)\Phi(t, \sigma)K_j(\sigma), \quad i, j = 0, 1, \ldots \tag{6}$$

for all $t, \sigma$ with $t \ge \sigma$. This motivates the appearance of the instantaneous controllability and instantaneous observability matrices, $W_n(t)$ and $M_n(t)$, in the realization problem, and leads directly to a sufficient condition for minimality of a realization.
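As a small worked illustration of the instantaneous controllability test (not from the text; the coefficient choices are assumptions made only for this sketch), take $n = 2$, $m = 1$ with $A(t) = 0$ and $B(t) = [\,1 \;\; t\,]^T$, and apply the recursion $K_0 = B$, $K_j = -AK_{j-1} + \dot{K}_{j-1}$:

```latex
\[
K_0(t) = \begin{bmatrix} 1 \\ t \end{bmatrix}, \qquad
K_1(t) = -A(t)K_0(t) + \dot{K}_0(t) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad
W_2(t) = \begin{bmatrix} K_0(t) & K_1(t) \end{bmatrix}
       = \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix}
\]
```

Since $\det W_2(t) = 1$ for every $t$, rank $W_2(t) = 2$ for every $t$, so this hypothetical state equation would be instantaneously controllable in the sense of the definition.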
11.2 Theorem Suppose the linear state equation (1) is a realization of the impulse response $G(t, \sigma)$. Then (1) is a minimal realization of $G(t, \sigma)$ if it is instantaneously controllable and instantaneously observable.

Proof Suppose $G(t, \sigma)$ has a dimension-$n$ realization (1) that is instantaneously controllable and instantaneously observable, but is not minimal. Then we can assume that there is an $(n-1)$-dimensional realization

$$\dot{z}(t) = \hat{A}(t)z(t) + \hat{B}(t)u(t)$$
$$y(t) = \hat{C}(t)z(t) \tag{7}$$

and write

$$G(t, \sigma) = C(t)\Phi_A(t, \sigma)B(\sigma) = \hat{C}(t)\Phi_{\hat{A}}(t, \sigma)\hat{B}(\sigma)$$

for all $t, \sigma$ with $t \ge \sigma$. Differentiating repeatedly with respect to both $t$ and $\sigma$ as in (6), evaluating at $\sigma = t$, and arranging the resulting identities in matrix form gives, using the obvious notation for instantaneous controllability and instantaneous observability matrices for (7),

$$M_n(t)W_n(t) = \hat{M}_n(t)\hat{W}_n(t)$$

Since $\hat{M}_n(t)$ has $n-1$ columns and $\hat{W}_n(t)$ has $n-1$ rows, this equality shows that rank $[M_n(t)W_n(t)] \le n-1$ for all $t$, which contradicts the hypotheses of instantaneous controllability and instantaneous observability of (1).

□□□
With slight modification the basic realizability criterion for weighting patterns, Theorem 10.4, applies to impulse responses. That is, an impulse response $G(t, \sigma)$ is realizable if and only if there exist continuous matrix functions $H(t)$ and $F(t)$ such that

$$G(t, \sigma) = H(t)F(\sigma)$$

for all $t, \sigma$ with $t \ge \sigma$. However we will develop alternative realizability tests that lead to more effective methods for computing minimal realizations.

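Time-Varying Realizations

The algebraic structure of the realization problem as well as connections to instantaneous controllability and instantaneous observability are captured in terms of properties of a certain matrix function defined from the impulse response. Given positive integers $i, j$, define an $(ip) \times (jm)$ behavior matrix corresponding to $G(t, \sigma)$ with $r, q$ block entry given by

$$\frac{\partial^{r-1}}{\partial t^{r-1}}\frac{\partial^{q-1}}{\partial \sigma^{q-1}}\, G(t, \sigma)$$

for all $t, \sigma$ such that $t \ge \sigma$. That is, in outline form,

$$\Gamma_{ij}(t, \sigma) = \begin{bmatrix} G(t, \sigma) & \dfrac{\partial}{\partial \sigma} G(t, \sigma) & \cdots & \dfrac{\partial^{j-1}}{\partial \sigma^{j-1}} G(t, \sigma) \\ \dfrac{\partial}{\partial t} G(t, \sigma) & \dfrac{\partial^2}{\partial t\,\partial \sigma} G(t, \sigma) & \cdots & \dfrac{\partial^{j}}{\partial t\,\partial \sigma^{j-1}} G(t, \sigma) \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial^{i-1}}{\partial t^{i-1}} G(t, \sigma) & \dfrac{\partial^{i}}{\partial t^{i-1}\partial \sigma} G(t, \sigma) & \cdots & \dfrac{\partial^{i+j-2}}{\partial t^{i-1}\partial \sigma^{j-1}} G(t, \sigma) \end{bmatrix} \tag{8}$$

We use a behavior matrix of suitable dimension to develop a realizability test and a construction for a minimal realization that involve submatrices of $\Gamma_{ij}(t, \sigma)$.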

A few observations might be helpful in digesting proofs involving behavior matrices. A submatrix, unlike a partition, need not be formed from adjacent rows and columns. For example one submatrix of a $3 \times 3$ matrix $A$ is

$$\begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix}$$

Matrix-algebra concepts associated with $\Gamma_{ij}(t, \sigma)$ in the sequel are applied pointwise in $t$ and $\sigma$ (with $t \ge \sigma$). For example linear independence of rows of $\Gamma_{ij}(t, \sigma)$ involves linear combinations of the rows using coefficients that are scalar functions of $t$ and $\sigma$. To visualize the structure of behavior matrices, it is useful to write (8) in more detail on a large sheet of paper, and use a sharp pencil to sketch various relationships developed in the proofs.
11.3 Theorem Suppose for the impulse response $G(t, \sigma)$ there exist positive integers $l, k, n$ such that $l, k \le n$ and

$$\operatorname{rank} \Gamma_{lk}(t, \sigma) = \operatorname{rank} \Gamma_{l+1,k+1}(t, \sigma) = n \tag{9}$$

for all $t, \sigma$ with $t \ge \sigma$. Also suppose there is a fixed $n \times n$ submatrix of $\Gamma_{lk}(t, \sigma)$ that is invertible for all $t, \sigma$ with $t \ge \sigma$. Then $G(t, \sigma)$ is realizable and has a minimal realization of dimension $n$.

Proof Assume (9) holds and $F(t, \sigma)$ is an $n \times n$ submatrix of $\Gamma_{lk}(t, \sigma)$ that is invertible for all $t, \sigma$ with $t \ge \sigma$. Let $F_c(t, \sigma)$ be the $p \times n$ matrix comprising those columns of $\Gamma_{1k}(t, \sigma)$ that correspond to columns of $F(t, \sigma)$, and let

$$C_c(t, \sigma) = F_c(t, \sigma)F^{-1}(t, \sigma) \tag{10}$$

That is, the coefficients in the $i^{th}$-row of $C_c(t, \sigma)$ specify the linear combination of rows of $F(t, \sigma)$ that gives the $i^{th}$-row of $F_c(t, \sigma)$. Similarly let $F_r(t, \sigma)$ be the $n \times m$ matrix formed from those rows of $\Gamma_{l1}(t, \sigma)$ that correspond to rows of $F(t, \sigma)$, and let

$$B_r(t, \sigma) = F^{-1}(t, \sigma)F_r(t, \sigma) \tag{11}$$

The $j^{th}$-column of $B_r(t, \sigma)$ specifies the linear combination of columns of $F(t, \sigma)$ that gives the $j^{th}$-column of $F_r(t, \sigma)$. Then we claim

$$G(t, \sigma) = C_c(t, \sigma)F(t, \sigma)B_r(t, \sigma) \tag{12}$$

for all $t, \sigma$ with $t \ge \sigma$. This relationship holds because, by (9), any row (column) of $\Gamma_{lk}(t, \sigma)$ can be represented as a linear combination of those rows (columns) of $\Gamma_{lk}(t, \sigma)$ that correspond to rows (columns) of $F(t, \sigma)$. (Again, the linear combinations resulting from the rank property (9) have scalar coefficients that are functions of $t$ and $\sigma$, defined for $t \ge \sigma$.)

In particular consider the single-input, single-output case. If $m = p = 1$, then $l = k = n$, $F(t, \sigma) = \Gamma_{nn}(t, \sigma)$, and $F_c(t, \sigma)$ is just the first row of $\Gamma_{nn}(t, \sigma)$. Therefore $C_c(t, \sigma) = e_1^T$, the first row of $I_n$. Similarly $B_r(t, \sigma) = e_1$, and (12) turns out to be the obvious

$$G(t, \sigma) = F_{11}(t, \sigma) = e_1^T F(t, \sigma)\, e_1$$

(Throughout this proof consideration of the $m = p = 1$ case is a good way to gain understanding of the admittedly-complicated general situation.)

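The next step is to show that $C_c(t, \sigma)$ is independent of $\sigma$. From (10), $F_c(t, \sigma) = C_c(t, \sigma)F(t, \sigma)$, and therefore

$$\frac{\partial}{\partial \sigma} F_c(t, \sigma) = \left[ \frac{\partial}{\partial \sigma} C_c(t, \sigma) \right] F(t, \sigma) + C_c(t, \sigma)\, \frac{\partial}{\partial \sigma} F(t, \sigma) \tag{13}$$

In $\Gamma_{l+1,k+1}(t, \sigma)$ each column of $(\partial F/\partial \sigma)(t, \sigma)$ occurs $m$ columns to the right of the corresponding column of $F(t, \sigma)$, and the same holds for the relative locations of columns of $(\partial F_c/\partial \sigma)(t, \sigma)$ and $F_c(t, \sigma)$. By the rank property in (9), the linear combination of the $j^{th}$-column entries of $(\partial F/\partial \sigma)(t, \sigma)$ specified by the $i^{th}$-row of $C_c(t, \sigma)$ gives precisely the entry that occurs $m$ columns to the right of the $i,j$-entry of $F_c(t, \sigma)$. Of course this is the $i,j$-entry of $(\partial F_c/\partial \sigma)(t, \sigma)$. Therefore

$$\frac{\partial}{\partial \sigma} F_c(t, \sigma) = C_c(t, \sigma)\, \frac{\partial}{\partial \sigma} F(t, \sigma) \tag{14}$$

Comparing (13) and (14), and using the invertibility of $F(t, \sigma)$, gives

$$\frac{\partial}{\partial \sigma} C_c(t, \sigma) = 0$$

for all $t, \sigma$ with $t \ge \sigma$. A similar argument can be used to show that $B_r(t, \sigma)$ in (11) is independent of $t$. Then with some abuse of notation we let

$$C_c(t) = F_c(t, t)F^{-1}(t, t), \quad B_r(\sigma) = F^{-1}(\sigma, \sigma)F_r(\sigma, \sigma)$$

and write (12) as

$$G(t, \sigma) = C_c(t)F(t, \sigma)B_r(\sigma) \tag{15}$$

for all $t, \sigma$ with $t \ge \sigma$.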


The remainder of the proof involves reworking the factorization of the impulse response in (15) into a factorization of the type provided by a state equation realization. To this end the notation

$$F_s(t, \sigma) = \frac{\partial}{\partial t} F(t, \sigma)$$

is temporarily convenient. Clearly $F_s(t, \sigma)$ is an $n \times n$ submatrix of $\Gamma_{l+1,k+1}(t, \sigma)$, and each entry of $F_s(t, \sigma)$ occurs exactly $p$ rows below the corresponding entry of $F(t, \sigma)$. Therefore the rank condition (9) implies that each row of $F_s(t, \sigma)$ can be written as a linear combination of the rows of $F(t, \sigma)$. That is, collecting these linear combination coefficients into an $n \times n$ matrix $A(t, \sigma)$,

$$F_s(t, \sigma) = A(t, \sigma)F(t, \sigma) \tag{16}$$

Also each entry of $(\partial F/\partial \sigma)(t, \sigma)$ as a submatrix of $\Gamma_{l+1,k+1}(t, \sigma)$ occurs $m$ columns to the right of the corresponding entry of $F(t, \sigma)$. But then the rank condition and the interchange of differentiation order permitted by the differentiability hypotheses give

$$\frac{\partial}{\partial \sigma} F_s(t, \sigma) = \frac{\partial}{\partial t}\frac{\partial}{\partial \sigma} F(t, \sigma) = A(t, \sigma)\, \frac{\partial}{\partial \sigma} F(t, \sigma) \tag{17}$$

This can be used as follows to show that $A(t, \sigma)$ is independent of $\sigma$. Differentiating (16) with respect to $\sigma$ gives

$$\left[ \frac{\partial}{\partial \sigma} A(t, \sigma) \right] F(t, \sigma) + A(t, \sigma)\, \frac{\partial}{\partial \sigma} F(t, \sigma) = \frac{\partial}{\partial \sigma} F_s(t, \sigma) \tag{18}$$

From (18) and (17), using the invertibility of $F(t, \sigma)$,

$$\frac{\partial}{\partial \sigma} A(t, \sigma) = 0$$

for all $t, \sigma$ with $t \ge \sigma$. Thus $A(t, \sigma)$ depends only on $t$, and replacing the variable $\sigma$ in (16) by a parameter $\tau$ (chosen in various, convenient ways in the sequel) we write

$$A(t) = F_s(t, \tau)F^{-1}(t, \tau)$$

Furthermore the transition matrix corresponding to $A(t)$ is given by

$$\Phi_A(t, \sigma) = F(t, \tau)F^{-1}(\sigma, \tau)$$

as is easily shown by verifying the relevant matrix differential equation with identity initial condition at $t = \sigma$. Again $\tau$ is a parameter that can be assigned any value.

To continue we similarly show that $F^{-1}(t, \tau)F(t, \sigma)$ is not a function of $t$ since

$$\frac{\partial}{\partial t}\left[ F^{-1}(t, \tau)F(t, \sigma) \right] = -F^{-1}(t, \tau)\left[ \frac{\partial}{\partial t} F(t, \tau) \right] F^{-1}(t, \tau)F(t, \sigma) + F^{-1}(t, \tau)\, \frac{\partial}{\partial t} F(t, \sigma)$$

$$= -F^{-1}(t, \tau)A(t)F(t, \sigma) + F^{-1}(t, \tau)A(t)F(t, \sigma) = 0$$

In particular this gives

$$F^{-1}(t, \tau)F(t, \sigma) = F^{-1}(\sigma, \tau)F(\sigma, \sigma)$$

that is,

$$F(t, \sigma) = F(t, \tau)F^{-1}(\sigma, \tau)F(\sigma, \sigma)$$

This means that the factorization (15) can be written as

$$G(t, \sigma) = C_c(t)F(t, \tau)F^{-1}(\sigma, \tau)F(\sigma, \sigma)B_r(\sigma)$$

$$= \left[ F_c(t, t)F^{-1}(t, t) \right] F(t, \tau)F^{-1}(\sigma, \tau)\, F_r(\sigma, \sigma) \tag{19}$$

for all $t, \sigma$ with $t \ge \sigma$. Now it is clear that an $n$-dimensional realization of $G(t, \sigma)$ is specified by

$$A(t) = F_s(t, t)F^{-1}(t, t)$$
$$B(t) = F_r(t, t)$$
$$C(t) = F_c(t, t)F^{-1}(t, t) \tag{20}$$

Finally since $l, k \le n$, $\Gamma_{nn}(t, \sigma)$ has rank at least $n$ for all $t, \sigma$ such that $t \ge \sigma$. Therefore $\Gamma_{nn}(t, t)$ has rank at least $n$ for all $t$. Evaluating (6) at $\sigma = t$ and forming $\Gamma_{nn}(t, t)$ gives $\Gamma_{nn}(t, t) = M_n(t)W_n(t)$, so that the realization we have constructed is instantaneously controllable and instantaneously observable, hence minimal.

□□□

Another minimal realization of $G(t, \sigma)$ can be written from the factorization in (19), namely

$$\dot{z}(t) = F^{-1}(t, \tau)F_r(t, t)\,u(t)$$
$$y(t) = F_c(t, t)F^{-1}(t, t)F(t, \tau)\,z(t) \tag{21}$$

(with $\tau$ a parameter). However it is easily shown that the realization specified by (20), unlike (21), has the desirable property that the coefficient matrices turn out to be constant if $G(t, \sigma)$ admits a time-invariant realization.
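11.4 Example Given the impulse response

$$G(t, \sigma) = e^{-t}\sin(t - \sigma)$$

the realization procedure in the proof of Theorem 11.3 begins with rank calculations. These show that, for all $t, \sigma$ with $t \ge \sigma$,

$$\Gamma_{22}(t, \sigma) = \begin{bmatrix} e^{-t}\sin(t - \sigma) & -e^{-t}\cos(t - \sigma) \\ e^{-t}[\cos(t - \sigma) - \sin(t - \sigma)] & e^{-t}[\cos(t - \sigma) + \sin(t - \sigma)] \end{bmatrix}$$

has rank 2, while $\det \Gamma_{33}(t, \sigma) = 0$. Thus the rank condition (9) is satisfied with $l = k = n = 2$, and we can take $F(t, \sigma) = \Gamma_{22}(t, \sigma)$. Then

$$F(t, t) = \begin{bmatrix} 0 & -e^{-t} \\ e^{-t} & e^{-t} \end{bmatrix}$$

Straightforward differentiation of $F(t, \sigma)$ with respect to $t$ leads to

$$F_s(t, t) = \begin{bmatrix} e^{-t} & e^{-t} \\ -2e^{-t} & 0 \end{bmatrix} \tag{22}$$

Finally since $F_c(t, t)$ is the first row of $\Gamma_{22}(t, t)$, and $F_r(t, t)$ is the first column, the minimal realization specified by (20) is

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -2 & -2 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ e^{-t} \end{bmatrix} u(t)$$

$$y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(t)$$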
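The realization in Example 11.4 can be sanity-checked numerically. The sketch below makes two assumptions that should be kept in mind: the impulse response is read as $G(t, \sigma) = e^{-t}\sin(t - \sigma)$, and the coefficients are taken as $A = \begin{bmatrix} 0 & 1 \\ -2 & -2 \end{bmatrix}$, $B(t) = [\,0 \;\; e^{-t}\,]^T$, $C = [\,1 \;\; 0\,]$. Since $G(t, \sigma) = C\,\Phi(t, \sigma)B(\sigma)$, the code integrates $\dot{x} = Ax$ from $\sigma$ to $t$ with $x(\sigma) = B(\sigma)$ using a classical Runge-Kutta step (the function name and step count are illustrative) and compares the first state component with the formula.

```python
import math

def simulated_impulse_response(t, sigma, steps=2000):
    """Integrate x' = A x on [sigma, t] from x(sigma) = B(sigma); return y = C x(t)."""
    A = [[0.0, 1.0], [-2.0, -2.0]]             # assumed A from the example
    x = [0.0, math.exp(-sigma)]                # assumed B(sigma) = [0, e^{-sigma}]^T
    h = (t - sigma) / steps

    def f(v):
        return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

    for _ in range(steps):                     # classical fourth-order Runge-Kutta
        k1 = f(x)
        k2 = f([x[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = f([x[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = f([x[i] + h * k3[i] for i in range(2)])
        x = [x[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(2)]
    return x[0]                                # C = [1 0] picks the first component

t, sigma = 2.0, 0.5
exact = math.exp(-t) * math.sin(t - sigma)
print(abs(simulated_impulse_response(t, sigma) - exact) < 1e-9)
```

Agreement at a few $(t, \sigma)$ pairs does not prove the factorization, but it is a quick guard against sign or transcription errors in the coefficient matrices.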

Time-Invariant Realizations

We now pursue the specialization and strengthening of Theorem 11.3 for the time-invariant case. A slight modification of Theorem 10.8 to fit the present setting gives that a realizable impulse response has a time-invariant realization if it can be written as $G(t - \sigma)$. For the remainder of this chapter we simply replace the difference $t - \sigma$ by $t$, and work with $G(t)$ for convenience. Of course $G(t)$ is defined for all $t \ge 0$, and there is no loss of generality in the time-invariant case in assuming that $G(t)$ is analytic. (Specifically a function of the form $Ce^{At}B$ is analytic, and thus a realizable impulse response must have this property.) Therefore $G(t)$ can be differentiated any number of times, and it is convenient to redefine the behavior matrices corresponding to $G(t)$ as

$$\Gamma_{ij}(t) = \begin{bmatrix} G(t) & \dfrac{d}{dt} G(t) & \cdots & \dfrac{d^{j-1}}{dt^{j-1}} G(t) \\ \dfrac{d}{dt} G(t) & \dfrac{d^2}{dt^2} G(t) & \cdots & \dfrac{d^{j}}{dt^{j}} G(t) \\ \vdots & \vdots & & \vdots \\ \dfrac{d^{i-1}}{dt^{i-1}} G(t) & \dfrac{d^{i}}{dt^{i}} G(t) & \cdots & \dfrac{d^{i+j-2}}{dt^{i+j-2}} G(t) \end{bmatrix} \tag{23}$$

where $i, j$ are positive integers and $t \ge 0$. This differs from the definition of $\Gamma_{ij}(t, \sigma)$ in (8) in the sign of alternate block columns, though rank properties are unaffected. As a corresponding change, involving only signs of block columns in the instantaneous controllability matrix defined in (3), we will work with the customary controllability and observability matrices in the time-invariant case. Namely these matrices for the state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t) \tag{24}$$

are given in the current notation by

$$W_n = \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix}, \quad M_n = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} \tag{25}$$

Theorem 11.3, a sufficient condition for realizability, can be restated as a necessary and sufficient condition in the time-invariant case. The proof is strategically similar, employing linear-algebraic arguments applied pointwise in $t$.

11.5 Theorem The analytic impulse response $G(t)$ admits a time-invariant realization (24) if and only if there exist positive integers $l, k, n$ with $l, k \le n$ such that

$$ \operatorname{rank} \Gamma_{lk}(t) = \operatorname{rank} \Gamma_{l+1,k+1}(t) = n, \quad t \ge 0 \tag{26} $$

and there is a fixed $n \times n$ submatrix of $\Gamma_{lk}(t)$ that is invertible for all $t \ge 0$. If these conditions hold, then the dimension of a minimal realization of $G(t)$ is $n$.

Proof Suppose (26) holds and $F(t)$ is an $n \times n$ submatrix of $\Gamma_{lk}(t)$ that is invertible for all $t \ge 0$. Let $F_c(t)$ be the $p \times n$ matrix comprising those columns of $\Gamma_{1k}(t)$ that correspond to columns of $F(t)$, and let $F_r(t)$ be the $n \times m$ matrix of rows of $\Gamma_{l1}(t)$ that correspond to rows of $F(t)$. Then

$$ C_c(t) = F_c(t)F^{-1}(t), \qquad B_r(t) = F^{-1}(t)F_r(t) $$

yields the preliminary factorization

$$ G(t) = C_c(t)F(t)B_r(t), \quad t \ge 0 \tag{27} $$

exactly as in the proof of Theorem 11.3.
Next we show that $C_c(t)$ is a constant matrix by considering

$$ \dot{C}_c(t) = \dot{F}_c(t)F^{-1}(t) - F_c(t)F^{-1}(t)\dot{F}(t)F^{-1}(t) \tag{28} $$

In $\Gamma_{l+1,k+1}(t)$ each entry of $\dot{F}(t)$ occurs $m$ columns to the right of the corresponding entry of $F(t)$. By the rank property (26) the linear combination of $j^{th}$-column entries of $\dot{F}(t)$ specified by the $i^{th}$-row of $C_c(t)$ gives the entry that occurs $m$ columns to the right of the $i,j$-entry of $F_c(t)$. This is precisely the $i,j$-entry of $\dot{F}_c(t)$, and so (28) shows that $\dot{C}_c(t) = 0$, $t \ge 0$. A similar argument shows that $\dot{B}_r(t) = 0$, $t \ge 0$. Therefore, with a familiar abuse of notation, we write these constant matrices as

$$ C_c = F_c(0)F^{-1}(0), \qquad B_r = F^{-1}(0)F_r(0) \tag{29} $$

Then (27) becomes

$$ G(t) = C_c F(t) B_r, \quad t \ge 0 \tag{30} $$

The remainder of the proof involves further manipulations to obtain a factorization corresponding to a time-invariant realization of $G(t)$; that is, a three-part factorization with a matrix exponential in the middle. Preserving notation in the proof of Theorem 11.3, consider the submatrix $F_s(t) = \dot{F}(t)$ of $\Gamma_{l+1,k}(t)$. By (26) the rows of $F_s(t)$ must be expressible as a linear combination of the rows of $F(t)$ (with $t$-dependent scalar coefficients). That is, there is an $n \times n$ matrix $A(t)$ such that

$$ F_s(t) = A(t)F(t) \tag{31} $$

However we can show that $A(t)$ is a constant matrix. From (31),

$$ \dot{F}_s(t) = \dot{A}(t)F(t) + A(t)\dot{F}(t) \tag{32} $$

It is not difficult to check that $\dot{F}_s(t)$ is a submatrix of $\Gamma_{l+1,k+1}(t)$, and the rank condition gives

$$ \dot{F}_s(t) = A(t)\dot{F}(t) \tag{33} $$

Therefore from (32), (33), and the invertibility of $F(t)$, we conclude $\dot{A}(t) = 0$, $t \ge 0$. We simply write $A$ for $A(t)$, and use, from (31),

$$ A = F_s(0)F^{-1}(0) $$

Also from (31),

$$ F(t) = e^{At}F(0), \quad t \ge 0 \tag{34} $$

Putting together (29), (30), and (34) gives the factorization

$$ G(t) = F_c(0)F^{-1}(0)\,e^{At}\,F_r(0) $$

from which we obtain an $n$-dimensional realization of the form (24) with coefficients

$$ A = F_s(0)F^{-1}(0), \qquad B = F_r(0), \qquad C = F_c(0)F^{-1}(0) \tag{35} $$

Of course these coefficients are defined in terms of submatrices of $\Gamma_{l+1,k}(0)$, and bear a close resemblance to those specified by (20).
Extending the notation for controllability and observability matrices in (25), it is easy to verify that

$$ \Gamma_{lk}(t) = M_l\,e^{At}\,W_k, \quad l, k = 1, 2, \ldots \tag{36} $$

and since

$$ n = \operatorname{rank} \Gamma_{lk}(0) \le \operatorname{rank} \Gamma_{nn}(0) = \operatorname{rank} M_n W_n $$

the realization specified by (35) is controllable and observable. Therefore by Theorem 10.6 or by independent contradiction argument as in the proof of Theorem 11.2, we conclude that the realization specified by (35) is minimal.

For the converse argument suppose (24) is a minimal realization of $G(t)$. Then (36) and the Cayley-Hamilton theorem immediately imply that the rank condition (26) holds. Also there must exist invertible $n \times n$ submatrices $F_o$ composed of linearly independent rows of $M_n$, and $F_r$ composed of linearly independent columns of $W_n$. Consequently

$$ F(t) = F_o\,e^{At}F_r $$

is a fixed $n \times n$ submatrix of $\Gamma_{nn}(t)$ that has rank $n$ for $t \ge 0$.

11.6 Example

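Consider the impulse response

$$ G(t) = \begin{bmatrix} 2e^{-t} & \alpha(e^{t} - e^{-t}) \\ e^{-t} & e^{-t} \end{bmatrix} $$

where $\alpha$ is a real parameter, inserted for illustration. Then $\Gamma_{11}(t) = G(t)$, and

$$ \Gamma_{22}(t) = e^{-t} \begin{bmatrix} 2 & \alpha(e^{2t}-1) & -2 & \alpha(e^{2t}+1) \\ 1 & 1 & -1 & -1 \\ -2 & \alpha(e^{2t}+1) & 2 & \alpha(e^{2t}-1) \\ -1 & -1 & 1 & 1 \end{bmatrix} $$

For $\alpha = 0$,

$$ \operatorname{rank} \Gamma_{11}(t) = \operatorname{rank} \Gamma_{22}(t) = 2, \quad t \ge 0 $$

so a minimal realization of $G(t)$ has dimension two. We can choose

$$ F(t) = \Gamma_{11}(t) = e^{-t} \begin{bmatrix} 2 & 0 \\ 1 & 1 \end{bmatrix} $$

Then $F_c(t) = F_r(t) = F(t)$ and $F_s(t) = \dot{F}(t) = -F(t)$, and the prescription in (35) gives the minimal realization ($\alpha = 0$)

$$ \dot{x}(t) = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} x(t) + \begin{bmatrix} 2 & 0 \\ 1 & 1 \end{bmatrix} u(t) \tag{37} $$

$$ y(t) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} x(t) \tag{38} $$

For the parameter value $\alpha = -2$, it is left as an exercise to show that minimal realizations again have dimension two. If $\alpha \ne 0, -2$, then matters are more interesting. Straightforward calculations verify

$$ \operatorname{rank} \Gamma_{22}(t) = \operatorname{rank} \Gamma_{33}(t) = 3, \quad t \ge 0 $$

The upper left $3 \times 3$ submatrix of $\Gamma_{22}(t)$ is not invertible, but selecting columns 1, 2, and 4 of the first three rows of $\Gamma_{22}(t)$ gives the invertible (for all $t \ge 0$) matrix

$$ F(t) = e^{-t} \begin{bmatrix} 2 & \alpha(e^{2t}-1) & \alpha(e^{2t}+1) \\ 1 & 1 & -1 \\ -2 & \alpha(e^{2t}+1) & \alpha(e^{2t}-1) \end{bmatrix} \tag{39} $$

This specifies a minimal realization as follows. From $F(t)$ we get

$$ F_s(0) = \dot{F}(0) = \begin{bmatrix} -2 & 2\alpha & 0 \\ -1 & -1 & 1 \\ 2 & 0 & 2\alpha \end{bmatrix} $$

and, from $F(0)$,

$$ F^{-1}(0) = \frac{1}{4\alpha(\alpha+2)} \begin{bmatrix} 2\alpha & 4\alpha^2 & -2\alpha \\ 2 & 4\alpha & 2\alpha+2 \\ 2\alpha+2 & -4\alpha & 2 \end{bmatrix} $$

Columns 1, 2 and 4 of $\Gamma_{12}(0)$ give

$$ F_c(0) = \begin{bmatrix} 2 & 0 & 2\alpha \\ 1 & 1 & -1 \end{bmatrix} $$

and the first three rows of $\Gamma_{21}(0)$ provide

$$ F_r(0) = \begin{bmatrix} 2 & 0 \\ 1 & 1 \\ -2 & 2\alpha \end{bmatrix} $$

Then a minimal realization is specified by ($\alpha \ne 0, -2$)

$$ A = F_s(0)F^{-1}(0) = \begin{bmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \quad B = F_r(0) = \begin{bmatrix} 2 & 0 \\ 1 & 1 \\ -2 & 2\alpha \end{bmatrix}, \quad C = F_c(0)F^{-1}(0) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} $$

The skeptical observer might want to compute $Ce^{At}B$ to verify this realization, and check controllability and observability to confirm minimality.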
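The check suggested for Example 11.6 can be sketched numerically. The code below assumes the example's data read $G(t) = \begin{bmatrix} 2e^{-t} & \alpha(e^{t}-e^{-t}) \\ e^{-t} & e^{-t} \end{bmatrix}$ with the three-dimensional coefficients $A$, $B$, $C$ obtained for $\alpha \ne 0, -2$; the value $\alpha = 1$ and all helper names are illustrative choices, not taken from the text.

```python
import math
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def mat_exp(A, t, terms=40):
    """Truncated Taylor series for exp(A t)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    step = [[a * t for a in row] for row in A]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, step)]
        result = [[r + s for r, s in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def rank(M):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

alpha = 1  # illustrative value; any alpha other than 0 and -2 should behave the same way
A = [[0, 0, 1], [0, -1, 0], [1, 0, 0]]
B = [[2, 0], [1, 1], [-2, 2 * alpha]]
C = [[1, 0, 0], [0, 1, 0]]

def G(t):
    return [[2 * math.exp(-t), alpha * (math.exp(t) - math.exp(-t))],
            [math.exp(-t), math.exp(-t)]]

def realized(t):
    return mat_mul(mat_mul(C, mat_exp(A, t)), B)

worst_gap = max(abs(realized(t)[i][j] - G(t)[i][j])
                for t in (0.0, 0.5, 1.0, 2.0) for i in range(2) for j in range(2))

AB = mat_mul(A, B)
AAB = mat_mul(A, AB)
W3 = [b + ab + aab for b, ab, aab in zip(B, AB, AAB)]  # [B  AB  A^2 B]
CA = mat_mul(C, A)
M3 = C + CA + mat_mul(CA, A)                           # rows C, CA, CA^2
print(worst_gap, rank(W3), rank(M3))
```

Full rank of both `W3` and `M3` corresponds to controllability and observability of the three-dimensional state equation, which is the minimality test the text refers to.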


Realization from Markov Parameters


There is an alternate formulation of the realization problem in the time-invariant case that often is used in place of Theorem 11.5. Again we restrict attention to impulse responses that are analytic for $t \ge 0$, since otherwise $G(t)$ is not realizable by a time-invariant linear state equation. Then the realization question can be cast in terms of coefficients in the power series expansion of $G(t)$ about $t = 0$. The sequence of $p \times m$ matrices

$$ G_0, G_1, G_2, \ldots \tag{41} $$

where

$$ G_i = \frac{d^{i}}{dt^{i}} G(t) \Big|_{t=0}, \quad i = 0, 1, \ldots $$

is called the Markov parameter sequence corresponding to the impulse response $G(t)$. Clearly if $G(t)$ has a realization (24), that is, $G(t) = Ce^{At}B$, then the Markov parameter sequence can be represented in the form

$$ G_i = CA^{i}B, \quad i = 0, 1, \ldots \tag{42} $$

This shows that the minimal realization problem in the time-invariant case can be viewed as the matrix-algebra problem of computing a minimal-dimension factorization of the form (42) for a specified Markov parameter sequence.
The Markov parameter sequence also can be determined from a given transfer function representation $G(s)$. Since $G(s)$ is the Laplace transform of $G(t)$, the initial value theorem gives, assuming the indicated limits exist,

$$ G_0 = \lim_{s \to \infty} sG(s) $$

$$ G_1 = \lim_{s \to \infty} s[sG(s) - G_0] $$

$$ G_2 = \lim_{s \to \infty} s[s^2G(s) - sG_0 - G_1] $$

and so on. Alternatively if $G(s)$ is a matrix of strictly-proper rational functions, as by Theorem 10.10 it must be if it is realizable, then this limit calculation can be implemented by polynomial division. For each entry of $G(s)$, dividing the denominator polynomial into the numerator polynomial produces a power series in $s^{-1}$. Arranging these power series in matrix form, the Markov parameter sequence appears as the sequence of matrix coefficients in the expression

$$ G(s) = G_0 s^{-1} + G_1 s^{-2} + G_2 s^{-3} + \cdots $$
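The polynomial-division route to the Markov parameters can be sketched for a single scalar entry of $G(s)$. Matching coefficients of powers of $s$ in $\text{den}(s)\,(g_0 s^{-1} + g_1 s^{-2} + \cdots) = \text{num}(s)$ gives a simple recursion. The function name `markov_parameters` and the coefficient-list convention (descending powers of $s$) are assumptions made for this illustration.

```python
from fractions import Fraction

def markov_parameters(num, den, count):
    """First `count` coefficients g_0, g_1, ... of num(s)/den(s) = g_0/s + g_1/s^2 + ...

    num and den are coefficient lists in descending powers of s, and the
    rational function is assumed strictly proper (deg num < deg den).
    """
    n = len(den) - 1
    if len(num) > n:
        raise ValueError("transfer function must be strictly proper")
    b = [Fraction(0)] * (n - len(num)) + [Fraction(c) for c in num]
    a = [Fraction(c) for c in den]
    g = []
    for i in range(count):
        # Coefficient of s^(n-1-i): a_0 g_i + a_1 g_(i-1) + ... = b_i (zero once i >= n).
        acc = b[i] if i < n else Fraction(0)
        for k in range(1, min(i, n) + 1):
            acc -= a[k] * g[i - k]
        g.append(acc / a[0])
    return g

# Illustrative entries: (4s^2 + 7s + 3)/(s^3 + 4s^2 + 5s + 2) and 1/(s + 1).
first = markov_parameters([4, 7, 3], [1, 4, 5, 2], 6)
second = markov_parameters([1], [1, 1], 4)
print(first, second)
```

Exact rational arithmetic is used so that later rank tests on Hankel matrices built from these numbers are not clouded by rounding.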

The time-invariant realization problem specified by a Markov parameter sequence leads to consideration of the behavior matrix in (23) evaluated at $t = 0$. In this setup $\Gamma_{ij}(0)$ often is called a block Hankel matrix corresponding to $G(t)$, or $G(s)$, and is written as

$$ \Gamma_{ij} = \begin{bmatrix} G_0 & G_1 & \cdots & G_{j-1} \\ G_1 & G_2 & \cdots & G_j \\ \vdots & \vdots & & \vdots \\ G_{i-1} & G_i & \cdots & G_{i+j-2} \end{bmatrix} \tag{43} $$

By repacking the data in (42) it is easy to verify that the controllability and observability matrices for a realization of a Markov parameter sequence are related to the block Hankel matrices by

$$ \Gamma_{ij} = M_i W_j, \quad i, j = 1, 2, \ldots \tag{44} $$

In addition the pattern of entries in (43), as $i$ and/or $j$ increase indefinitely, captures essential algebraic features of the realization problem, and leads to a realizability criterion and a method for computing minimal realizations.

11.7 Theorem The analytic impulse response $G(t)$ admits a time-invariant realization (24) if and only if there exist positive integers $l, k, n$ with $l, k \le n$ such that

$$ \operatorname{rank} \Gamma_{lk} = \operatorname{rank} \Gamma_{l+1,k+j} = n, \quad j = 1, 2, \ldots \tag{45} $$

If this rank condition holds, then the dimension of a minimal realization of $G(t)$ is $n$.
Proof Assuming $l$, $k$, and $n$ are such that the rank condition (45) holds, we will compute a minimal realization for $G(t)$ of dimension $n$ by a method roughly similar to preceding proofs. Again a large sketch of a block Hankel matrix is a useful scratch pad in deciphering the construction.
Let $H_k$ denote the $n \times km$ submatrix formed from the first $n$ linearly independent rows of $\Gamma_{lk}$, equivalently, the first $n$ linearly independent rows of $\Gamma_{l+1,k}$. Also let $H_k^s$ be another $n \times km$ submatrix defined as follows. The $i^{th}$-row of $H_k^s$ is the row of $\Gamma_{l+1,k}$ residing $p$ rows below the row of $\Gamma_{l+1,k}$ that is the $i^{th}$-row of $H_k$. A realization of $G(t)$ can be constructed in terms of these submatrices. Let
(a) $F$ be the invertible $n \times n$ matrix formed from the first $n$ linearly independent columns of $H_k$,
(b) $F_s$ be the $n \times n$ matrix occupying the same column positions in $H_k^s$ as does $F$ in $H_k$,
(c) $F_c$ be the $p \times n$ matrix occupying the same column positions in $\Gamma_{1k}$ as does $F$ in $H_k$,
(d) $F_r$ be the $n \times m$ matrix occupying the first $m$ columns of $H_k$.
Then consider the coefficient matrices defined by

$$ A = F_s F^{-1}, \qquad B = F_r, \qquad C = F_c F^{-1} \tag{46} $$

Since $F_s = AF$, entries in the $i^{th}$-row of $A$ give the linear combination of rows of $F$ that results in the $i^{th}$-row of $F_s$. Therefore the $i^{th}$-row of $A$ also gives the linear combination of rows of $H_k$ that yields the $i^{th}$-row of $H_k^s$, that is, $H_k^s = AH_k$.
In fact a more general relationship holds. Let $H_j$ be the extension or restriction of $H_k$ in $\Gamma_{lj}$, $j = 1, 2, \ldots$, prescribed as follows. Each row of $H_k$, which is a row of $\Gamma_{lk}$, either is truncated (if $j < k$) or extended (if $j > k$) to match the corresponding row of $\Gamma_{lj}$. Similarly define $H_j^s$ as the extension or restriction of $H_k^s$ in $\Gamma_{l+1,j}$. Then (45) implies

$$ H_j^s = AH_j, \quad j = 1, 2, \ldots \tag{47} $$

Also

$$ H_j = \begin{bmatrix} F_r & H_{j-1}^s \end{bmatrix}, \quad j = 2, 3, \ldots \tag{48} $$

For example $H_1$ and $H_2$ are formed by the rows in

$$ \begin{bmatrix} G_0 \\ G_1 \\ \vdots \\ G_{l-1} \end{bmatrix}, \qquad \begin{bmatrix} G_0 & G_1 \\ G_1 & G_2 \\ \vdots & \vdots \\ G_{l-1} & G_l \end{bmatrix} $$

respectively, that correspond to the first $n$ linearly independent rows in $\Gamma_{lk}$. But then $H_1^s$ can be described as the rows of $H_2$ with the first $m$ entries deleted, and from the definition of $F_r$ it is immediate that $H_2 = [F_r \;\; H_1^s]$.
Using (47) and (48) gives

$$ H_2 = \begin{bmatrix} F_r & AH_1 \end{bmatrix} = \begin{bmatrix} F_r & AF_r \end{bmatrix} \tag{49} $$

and, continuing,

$$ H_j = \begin{bmatrix} F_r & AF_r & \cdots & A^{j-1}F_r \end{bmatrix} = \begin{bmatrix} B & AB & \cdots & A^{j-1}B \end{bmatrix}, \quad j = 1, 2, \ldots $$

From (46) the $i^{th}$-row of $C$ specifies the linear combination of rows of $F$ that gives the $i^{th}$-row of $F_c$. But then the $i^{th}$-row of $C$ specifies the linear combination of rows of $H_j$ that gives the $i^{th}$-row of $\Gamma_{1j}$. Since every row of $\Gamma_{lj}$ can be written as a linear combination of rows of $H_j$, it follows that

$$ \begin{bmatrix} G_0 & G_1 & \cdots & G_{j-1} \end{bmatrix} = CH_j = \begin{bmatrix} CB & CAB & \cdots & CA^{j-1}B \end{bmatrix}, \quad j = 1, 2, \ldots $$

Therefore

$$ G_j = CA^{j}B, \quad j = 0, 1, \ldots \tag{50} $$

and this shows that (46) specifies an $n$-dimensional realization for $G(t)$. Furthermore it is clear from a simple contradiction argument involving the rank condition (45), and (44), that this realization is minimal.

To prove the necessity portion of the theorem, suppose that $G(t)$ has a time-invariant realization. Then from (44) and the Cayley-Hamilton theorem there must exist integers $l, k, n$, with $l, k \le n$, such that the rank condition (45) holds.
□□□


It should be emphasized that the rank test (45) involves an infinite sequence of
matrices, and this sequence cannot be truncated. We offer an extreme example.

11.8 Example

The Markov parameter sequence for the impulse response

$$ G(t) = e^{t} - \frac{t^{100}}{100!} \tag{51} $$

has 1's in the first 100 places. Yielding to temptation and pretending that (45) holds for $l = k = n = 1$ would lead to a one-dimensional realization for $G(t)$, a dramatically incorrect result. Since the transfer function corresponding to (51) is

$$ G(s) = \frac{1}{s-1} - \frac{1}{s^{101}} = \frac{s^{101} - s + 1}{s^{101}(s-1)} $$

the observations in Example 10.11 lead to the conclusion that a minimal realization has dimension $n = 102$.
As further illustration of these matters, consider the Markov parameter sequence for $G(t) = \exp(-t^2)$:

$$ G_k = \begin{cases} \dfrac{(-1)^{k/2}\,k!}{(k/2)!}, & k \text{ even} \\ 0, & k \text{ odd} \end{cases} $$

for $k = 0, 1, \ldots$. Pretending we don't know from Example 10.9 (or Remark 10.12) that this second $G(t)$ is not realizable, determination of realizability via rank properties of the corresponding Hankel matrix

$$ \begin{bmatrix} 1 & 0 & -2 & 0 & 12 & \cdots \\ 0 & -2 & 0 & 12 & 0 & \cdots \\ -2 & 0 & 12 & 0 & -120 & \cdots \\ 0 & 12 & 0 & -120 & 0 & \cdots \\ 12 & 0 & -120 & 0 & 1680 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \end{bmatrix} $$

clearly is a precarious endeavor.

□□□
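To see why a finite rank test on the Hankel matrix for $\exp(-t^2)$ is inconclusive, the following sketch computes exact ranks of the leading $j \times j$ Hankel blocks built from $G_k = (-1)^{k/2} k!/(k/2)!$ for even $k$ and $G_k = 0$ for odd $k$. The function names are illustrative, and the range of sizes checked is an arbitrary choice.

```python
from fractions import Fraction
from math import factorial

def markov_exp_minus_t_squared(k):
    """k-th derivative of exp(-t^2) at t = 0."""
    if k % 2:
        return 0
    return (-1) ** (k // 2) * factorial(k) // factorial(k // 2)

def hankel(size):
    """Leading size x size scalar Hankel matrix, with entry (i, j) equal to G_(i+j)."""
    return [[markov_exp_minus_t_squared(i + j) for j in range(size)] for i in range(size)]

def rank(M):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

ranks = [rank(hankel(j)) for j in range(1, 6)]
print(ranks)
```

For the sizes checked the rank keeps pace with the matrix size, so no finite truncation of this kind suggests a candidate dimension $n$.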
Suppose we know a priori that a given impulse response or transfer function has a realization of dimension no larger than some fixed number. Then the rank test (45) on an infinite number of block Hankel matrices can be truncated appropriately, and construction of a minimal realization can proceed. Specifically if there exists a realization of dimension $n$, then from (44), and the Cayley-Hamilton theorem applied to $M_i$ and $W_j$,

$$ \operatorname{rank} \Gamma_{nn} = \operatorname{rank} \Gamma_{n+i,n+j} \le n, \quad i, j = 1, 2, \ldots \tag{52} $$

Therefore (45) need only be checked for $l, k < n$ and $k + j \le n$. Further discussion of this issue is left to Note 11.3, except for an illustration.

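11.9 Example

For the two-input, single-output transfer function

$$ G(s) = \begin{bmatrix} \dfrac{4s^2 + 7s + 3}{s^3 + 4s^2 + 5s + 2} & \dfrac{1}{s+1} \end{bmatrix} \tag{53} $$

a dimension-4 realization can be constructed by applying the prescription in Example 10.11 for each single-input, single-output component. This gives the realization

$$ \dot{x}(t) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -2 & -5 & -4 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} x(t) + \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} u(t) $$

$$ y(t) = \begin{bmatrix} 3 & 7 & 4 & 1 \end{bmatrix} x(t) $$

To check minimality and, if needed, construct a minimal realization, the first step is to divide each transfer function to obtain the corresponding Markov parameter sequence,

$$ G_0 = [4 \;\; 1], \quad G_1 = [-9 \;\; -1], \quad G_2 = [19 \;\; 1], \quad G_3 = [-39 \;\; -1], \quad G_4 = [79 \;\; 1], \quad G_5 = [-159 \;\; -1], \; \ldots $$

Beginning application of the rank test,

$$ \operatorname{rank} \Gamma_{22} = \operatorname{rank} \begin{bmatrix} 4 & 1 & -9 & -1 \\ -9 & -1 & 19 & 1 \end{bmatrix} = 2, \qquad \operatorname{rank} \Gamma_{32} = \operatorname{rank} \begin{bmatrix} 4 & 1 & -9 & -1 \\ -9 & -1 & 19 & 1 \\ 19 & 1 & -39 & -1 \end{bmatrix} = 2 \tag{54} $$

and continuing we find

$$ \operatorname{rank} \Gamma_{44} = 2 $$

Thus by (52) the rank condition in (45) holds with $l = k = n = 2$, and the dimension of minimal realizations of $G(s)$ is two. Construction of a minimal realization can proceed on the basis of $\Gamma_{22}$ and $\Gamma_{32}$ in (54). The various submatrices

$$ H_2 = \begin{bmatrix} 4 & 1 & -9 & -1 \\ -9 & -1 & 19 & 1 \end{bmatrix}, \qquad H_2^s = \begin{bmatrix} -9 & -1 & 19 & 1 \\ 19 & 1 & -39 & -1 \end{bmatrix} $$

$$ F = \begin{bmatrix} 4 & 1 \\ -9 & -1 \end{bmatrix}, \qquad F_s = \begin{bmatrix} -9 & -1 \\ 19 & 1 \end{bmatrix}, \qquad F_r = F, \qquad F_c = \begin{bmatrix} 4 & 1 \end{bmatrix} $$

yield via (46) the minimal-realization coefficients

$$ A = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}, \qquad B = \begin{bmatrix} 4 & 1 \\ -9 & -1 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 0 \end{bmatrix} $$

The dimension reduction from 4 to 2 can be partly understood by writing the transfer function (53) in factored form as

$$ G(s) = \begin{bmatrix} \dfrac{(4s+3)(s+1)}{(s+2)(s+1)^2} & \dfrac{1}{s+1} \end{bmatrix} \tag{55} $$

Canceling the common factor in the first entry and applying the approach from Example 10.11 yields a realization of dimension 3. The remaining dimension reduction to minimality is more subtle.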
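The construction in the proof of Theorem 11.7 can be sketched in code: select the first $n$ independent rows of $\Gamma_{lk}$, the rows $p$ below them, the first $n$ independent columns, and form $A = F_sF^{-1}$, $B = F_r$, $C = F_cF^{-1}$. The sketch assumes the integers $l$, $k$, $n$ are already known to satisfy the rank condition (45); the function names and the nested-list data layout are choices made for illustration. It is exercised on the Markov parameters $[4\;1], [-9\;-1], [19\;1], [-39\;-1]$.

```python
from fractions import Fraction

def mat_mul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def independent_indices(vectors, count):
    """Indices of the first `count` linearly independent vectors (exact arithmetic)."""
    chosen, basis = [], []
    for idx, v in enumerate(vectors):
        w = [Fraction(x) for x in v]
        for pivot, b in basis:          # eliminate against earlier pivots
            if w[pivot] != 0:
                f = w[pivot] / b[pivot]
                w = [x - f * y for x, y in zip(w, b)]
        pivot = next((i for i, x in enumerate(w) if x != 0), None)
        if pivot is not None:
            basis.append((pivot, w))
            chosen.append(idx)
            if len(chosen) == count:
                return chosen
    raise ValueError("fewer independent vectors than requested")

def inverse(F):
    """Gauss-Jordan inverse of a square invertible matrix over the rationals."""
    n = len(F)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(F)]
    for c in range(n):
        p = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [x - f * y for x, y in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

def realize(markov, l, k, n):
    """markov: list of p x m matrices G_0, G_1, ... containing at least l + k terms."""
    p, m = len(markov[0]), len(markov[0][0])
    big = [[markov[i + j][r][c] for j in range(k) for c in range(m)]
           for i in range(l + 1) for r in range(p)]           # Gamma_{l+1,k}
    row_idx = independent_indices(big[:l * p], n)             # rows forming H_k
    Hk = [big[i] for i in row_idx]
    Hs = [big[i + p] for i in row_idx]                        # rows p below: H_k^s
    col_idx = independent_indices(list(zip(*Hk)), n)
    F = [[row[c] for c in col_idx] for row in Hk]
    Fs = [[row[c] for c in col_idx] for row in Hs]
    Fc = [[row[c] for c in col_idx] for row in big[:p]]       # same columns in Gamma_{1k}
    Fr = [row[:m] for row in Hk]
    Finv = inverse(F)
    return mat_mul(Fs, Finv), Fr, mat_mul(Fc, Finv)

markov = [[[4, 1]], [[-9, -1]], [[19, 1]], [[-39, -1]]]
A, B, C = realize(markov, l=2, k=2, n=2)
print(A, B, C)
```

The routine does not itself verify (45); with an unsuitable choice of $l$, $k$, $n$ it may still return matrices, but they need not reproduce the Markov parameters beyond those used in the construction.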

EXERCISES

Exercise 11.1 If the single-input linear state equation

$$ \dot{x}(t) = A(t)x(t) + b(t)u(t) $$

is instantaneously controllable, show that at any time $t_o$ an 'instantaneous' state transfer from any $x(t_o)$ to the zero state can be made using an input of the form

u(t) =
where

is the unit impulse,

is the unit doublet, and so on. Hint: Recall the sifting

property
5

Exercise 11.2 If the linear state equation

$$ \dot{x}(t) = A(t)x(t), \qquad y(t) = C(t)x(t) $$

is instantaneously observable, show that at any time $t_o$ the state $x(t_o)$ can be determined 'instantaneously' from a knowledge of the values of the output and its first $n - 1$ derivatives at $t_o$.

Exercise 11.3 Show that instantaneous controllability and instantaneous observability are preserved under an invertible time-varying variable change (that has sufficiently many continuous derivatives).

Exercise 11.4 Is the linear state equation


x(t)

v(t) =


I Jx(t)

a minimal realization of its impulse response? If not, construct such a minimal realization.
Exercise 11.5

Show that

k(t)= [
v(t)

t3
=

x(t)

is a minimal realization of its impulse response, yet the hypotheses of Theorem 11.3 are not
satisfied.

Exercise 11.6

Construct a minimal realization for the impulse response

G(t) = ze'
using Theorem 11.5.
Exercise 11.7

Construct a minimal realization for the impulse response

ta

G(t,a)=l

Exercise 11.8 For an $n$-dimensional, time-varying linear state equation and any positive integers $i, j$, show that (under suitable differentiability hypotheses)

$$ \operatorname{rank} \Gamma_{ij}(t, \sigma) \le n $$

for all $t, \sigma$ such that $t \ge \sigma$.


Exercise 11.9 Show that two instantaneously controllable and instantaneously observable realizations of a scalar impulse response are related by a change of state variables, and give a formula for the variable change. Hint: See the proof of Theorem 10.14.
Exercise 11.10

Show that the rank condition (45) implies


= n ; i,

rank

j = 1, 2,

Exercise 11.11 Compute a minimal realization corresponding to the Markov parameter sequence given by the Fibonacci sequence

0, 1, 1, 2, 3, 5, 8, 13, ...

Hint: $f(k+2) = f(k+1) + f(k)$.


Exercise 11.12 Compute a minimal realization corresponding to the Markov parameter sequence

1, 1, 1, 1, 1, 1, 1, 1, ...

Then compute a minimal realization corresponding to the 'truncated' sequence

1, 1, 1, 0, 0, 0, 0, ...


Exercise 11.13 For a scalar transfer function $G(s)$, suppose the infinite block Hankel matrix has rank $n$. Show that the first $n$ columns are linearly independent, and that a minimal realization is given by
G,,

G,

G,_1

G,,1

G1

G0

G,,
,

.c=[io

B=

G,,, 2

G,,

0]

NOTES
Note 11.1 Our treatment of realization theory is based on

L.M. Silverman, "Representation and realization of time-variable linear systems," Technical Report No. 94, Department of Electrical Engineering, Columbia University, New York, 1966

L.M. Silverman, "Realization of linear dynamical systems," IEEE Transactions on Automatic Control, Vol. 16, No. 6, pp. 554-567, 1971

It can be shown that realization theory in the time-varying case can be founded on the single-variable matrix obtained by evaluating $\Gamma_{ij}(t, \sigma)$ at $\sigma = t$. Furthermore the assumption of a fixed invertible submatrix $F(t, \sigma)$ can be dropped. Using a more sophisticated algebraic framework, these extensions are discussed in

E.W. Kamen, "New results in realization theory for linear time-varying analytic systems," IEEE Transactions on Automatic Control, Vol. 24, No. 6, pp. 866-877, 1979

For the time-invariant case a different realization algorithm based on the block Hankel matrix is in

B.L. Ho, R.E. Kalman, "Effective construction of linear state variable models from input-output functions," Regelungstechnik, Vol. 14, pp. 545-548, 1966

Note 11.2 A special type of exponentially-stable realization where the controllability and observability Gramians are equal and diagonal is called a balanced realization, and is introduced for the time-invariant case in

B.C. Moore, "Principal component analysis in linear systems: Controllability, observability, and model reduction," IEEE Transactions on Automatic Control, Vol. 26, No. 1, pp. 17-32, 1981

For time-varying systems see

S. Shokoohi, L.M. Silverman, P.M. Van Dooren, "Linear time-variable systems: balancing and model reduction," IEEE Transactions on Automatic Control, Vol. 28, No. 8, pp. 810-822, 1983

E. Verriest, T. Kailath, "On generalized balanced realizations," IEEE Transactions on Automatic Control, Vol. 28, No. 8, pp. 833-844, 1983

Recent work on a mathematically-sophisticated approach to avoiding the stability restriction is reported in

U. Helmke, "Balanced realizations for linear systems: A variational approach," SIAM Journal on Control and Optimization, Vol. 31, No. 1, pp. 1-15, 1993

Note 11.3 In the time-invariant case the problem of realization from a finite number of Markov parameters is known as partial realization. Subtle issues arise in this problem, and these are studied in, for example,

R.E. Kalman, P.L. Falb, M.A. Arbib, Topics in Mathematical System Theory, McGraw-Hill, New York, 1969

R.E. Kalman, "On minimal partial realizations of a linear input/output map," in Aspects of Network and System Theory, R.E. Kalman and N. DeClaris, editors, Holt, Rinehart and Winston, New York, 1971

Note 11.4 The time-invariant realization problem can be based on information about the input-output behavior other than the Markov parameters. Realization based on the time-moments of the impulse response is discussed in

C. Bruni, A. Isidori, A. Ruberti, "A method of realization based on moments of the impulse-response matrix," IEEE Transactions on Automatic Control, Vol. 14, No. 2, pp. 203-204, 1969

The realization problem also can be formulated as an interpolation problem based on evaluations of the transfer function. Recent, in-depth studies can be found in the papers

A.C. Antoulas, B.D.O. Anderson, "On the scalar rational interpolation problem," IMA Journal of Mathematical Control and Information, Vol. 3, pp. 61-88, 1986

B.D.O. Anderson, A.C. Antoulas, "Rational interpolation and state-variable realizations," Linear Algebra and its Applications, Vol. 137/138, pp. 479-509, 1990

One motivation for the interpolation formulation is that certain types of transfer function evaluations in principle can be determined from input-output measurements on an unknown linear system. These include evaluations at $s = i\omega$ determined from steady-state response to a sinusoid of frequency $\omega$, as discovered in Exercise 5.21, and evaluations at real, positive values of $s$ as suggested in Exercise 12.12. Finally the realization problem can be based on arrangements of the Markov parameters other than the block Hankel matrix. See

A.A.H. Damen, P.M.J. Van den Hof, A.K. Hajdasinski, "Approximate realization based upon an alternative to the Hankel matrix: the Page matrix," Systems & Control Letters, Vol. 2, No. 4, pp. 202-208, 1982

12
INPUT-OUTPUT STABILITY

In this chapter we address stability properties appropriate to the input-output behavior (zero-state response) of the linear state equation

$$ \dot{x}(t) = A(t)x(t) + B(t)u(t), \qquad y(t) = C(t)x(t) \tag{1} $$

That is, the initial state is set to zero, and attention is focused on boundedness of the response to bounded inputs. There is no $D(t)u(t)$ term in (1) because a bounded $D(t)$ does not affect the treatment, while an unbounded $D(t)$ provides an unbounded response to an appropriate constant input. Of course the input-output behavior of (1) is specified by the impulse response

$$ G(t, \sigma) = C(t)\Phi(t, \sigma)B(\sigma), \quad t \ge \sigma $$

and stability results are characterized in terms of boundedness properties of $\|G(t, \sigma)\|$. (Notice in particular that the weighting pattern is not employed.) For the time-invariant case, input-output stability also is characterized in terms of the transfer function of the linear state equation.

Uniform Bounded-Input Bounded-Output Stability


Bounded-input, bounded-output stability is most simply discussed in terms of the largest value (over time) of the norm of the input signal, $\|u(t)\|$, in comparison to the largest value of the corresponding response norm $\|y(t)\|$. More precisely we use the standard notion of supremum. For example

$$ \nu = \sup_{t \ge t_o} \|u(t)\| $$

is defined as the smallest constant such that $\|u(t)\| \le \nu$ for $t \ge t_o$. If no such bound exists, we write

$$ \sup_{t \ge t_o} \|u(t)\| = \infty $$

The basic notion is that the zero-state response should exhibit finite 'gain' in terms of the input and output suprema.

12.1 Definition The linear state equation (1) is called uniformly bounded-input, bounded-output stable if there exists a finite constant $\eta$ such that for any $t_o$ and any input signal $u(t)$ the corresponding zero-state response satisfies

$$ \sup_{t \ge t_o} \|y(t)\| \le \eta \sup_{t \ge t_o} \|u(t)\| \tag{3} $$

The adjective 'uniform' does double duty in this definition. It emphasizes the fact that the same $\eta$ works for all values of $t_o$, and that the same $\eta$ works for all input signals. An equivalent definition based on the pointwise norms of $u(t)$ and $y(t)$ is explored in Exercise 12.1. See Note 12.1 for discussion of related points, some quite subtle.

12.2 Theorem The linear state equation (1) is uniformly bounded-input, bounded-output stable if and only if there exists a finite constant $\rho$ such that for all $t, \tau$ with $t \ge \tau$,

$$ \int_{\tau}^{t} \|G(t, \sigma)\| \, d\sigma \le \rho \tag{4} $$

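Proof Assume first that such a $\rho$ exists. Then for any $t_o$ and any input defined for $t \ge t_o$, the corresponding zero-state response of (1) satisfies

$$ \|y(t)\| = \Big\| \int_{t_o}^{t} C(t)\Phi(t, \sigma)B(\sigma)u(\sigma) \, d\sigma \Big\| \le \int_{t_o}^{t} \|G(t, \sigma)\| \, \|u(\sigma)\| \, d\sigma, \quad t \ge t_o $$

Replacing $\|u(\sigma)\|$ by its supremum over $\sigma \ge t_o$, and using (4),

$$ \|y(t)\| \le \int_{t_o}^{t} \|G(t, \sigma)\| \, d\sigma \, \sup_{t \ge t_o} \|u(t)\| \le \rho \sup_{t \ge t_o} \|u(t)\|, \quad t \ge t_o $$

Therefore, taking the supremum of the left side over $t \ge t_o$, (3) holds with $\eta = \rho$, and the state equation is uniformly bounded-input, bounded-output stable.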

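Suppose now that (1) is uniformly bounded-input, bounded-output stable. Then there exists a constant $\eta$ so that, in particular, the zero-state response for any $t_o$, and any input signal such that

$$ \sup_{t \ge t_o} \|u(t)\| \le 1 $$

satisfies

$$ \sup_{t \ge t_o} \|y(t)\| \le \eta $$

To set up a contradiction argument, suppose no finite $\rho$ exists that satisfies (4). In other words for any given constant $\rho$ there exist $\tau_\rho$ and $t_\rho > \tau_\rho$ such that

$$ \int_{\tau_\rho}^{t_\rho} \|G(t_\rho, \sigma)\| \, d\sigma > \rho $$

By Exercise 1.19 this implies, taking $\rho$ sufficiently large, that there exist $\tau_\eta$, $t_\eta > \tau_\eta$, and indices $i, j$ such that the $i,j$-entry of the impulse response satisfies

$$ \int_{\tau_\eta}^{t_\eta} |G_{ij}(t_\eta, \sigma)| \, d\sigma > \eta \tag{5} $$

With $t_o = \tau_\eta$ consider the $m \times 1$ input signal $u(t)$ defined for $t \ge t_o$ as follows. Set $u(t) = 0$ for $t > t_\eta$, and for $t \in [t_o, t_\eta]$ set every component of $u(t)$ to zero except for the $j^{th}$-component given by (the piecewise-continuous signal)

$$ u_j(t) = \begin{cases} 1, & G_{ij}(t_\eta, t) > 0 \\ 0, & G_{ij}(t_\eta, t) = 0 \\ -1, & G_{ij}(t_\eta, t) < 0 \end{cases} \qquad t \in [t_o, t_\eta] $$

This input signal satisfies $\|u(t)\| \le 1$, for all $t \ge t_o$, but the $i^{th}$-component of the corresponding zero-state response satisfies, by (5),

$$ y_i(t_\eta) = \int_{t_o}^{t_\eta} G_{ij}(t_\eta, \sigma)u_j(\sigma) \, d\sigma = \int_{t_o}^{t_\eta} |G_{ij}(t_\eta, \sigma)| \, d\sigma > \eta $$

Since $\|y(t_\eta)\| \ge |y_i(t_\eta)|$, a contradiction is obtained that completes the proof.

□□□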
An alternate expression for the condition in Theorem 12.2 is that there exist a finite $\rho$ such that for all $t$

$$ \int_{-\infty}^{t} \|G(t, \sigma)\| \, d\sigma \le \rho $$

For a time-invariant linear state equation, $G(t, \sigma) = G(t - \sigma)$, and the impulse response customarily is written as $G(t)$, $t \ge 0$. Then a change of integration variable shows that a necessary and sufficient condition for uniform bounded-input, bounded-output stability for a time-invariant state equation is finiteness of the integral

$$ \int_{0}^{\infty} \|G(t)\| \, dt $$

Relation to Uniform Exponential Stability


We now turn to establishing connections between uniform bounded-input, bounded-output stability and the property of uniform exponential stability of the zero-input response. This is not a trivial pursuit, as a simple example indicates.

12.3 Example

The time-invariant linear state equation

i(t) =
y(t)=

x(t) +
[1

1]x(t)

is not uniformly exponentially stable, since the eigenvalues of $A$ are $1, -1$. However the impulse response is given by $G(t) = e^{-t}$, and therefore the state equation is uniformly bounded-input, bounded-output stable.

□□□
In the time-invariant setting of this example, a description of the key difficulty is that scalar exponentials appearing in $e^{At}$ might be missing from $G(t)$. Again controllability and observability are involved, since we are considering the relation between input-output (zero-state) and internal (zero-input) stability concepts.
In one direction the connection between input-output and internal stability is easy to establish, and a division of labor proves convenient.

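12.4 Lemma Suppose the linear state equation (1) is uniformly exponentially stable, and there exist finite constants $\beta$ and $\mu$ such that for all $t$

$$ \|B(t)\| \le \beta, \qquad \|C(t)\| \le \mu \tag{8} $$

Then the state equation also is uniformly bounded-input, bounded-output stable.

Proof Using the transition matrix bound implied by uniform exponential stability, say $\|\Phi(t, \sigma)\| \le \gamma e^{-\lambda(t - \sigma)}$ for $t \ge \sigma$ with finite positive constants $\gamma$ and $\lambda$,

$$ \int_{\tau}^{t} \|G(t, \sigma)\| \, d\sigma \le \int_{\tau}^{t} \|C(t)\| \, \|\Phi(t, \sigma)\| \, \|B(\sigma)\| \, d\sigma \le \mu\beta \int_{\tau}^{t} \gamma e^{-\lambda(t - \sigma)} \, d\sigma \le \frac{\mu\beta\gamma}{\lambda} $$

for all $t, \tau$ with $t \ge \tau$. Therefore the state equation is uniformly bounded-input, bounded-output stable by Theorem 12.2.

□□□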
That coefficient bounds as in (8) are needed to obtain the implication in Lemma 12.4 should be clear. However the simple proof might suggest that uniform exponential stability is a needlessly strong condition for uniform bounded-input, bounded-output stability. To dispel this notion we consider a variation of Example 6.11.

12.5 Example The scalar linear state equation with bounded coefficients

$$ \dot{x}(t) = \frac{-2t}{t^2 + 1} x(t) + u(t), \quad x(t_o) = x_o $$
$$ y(t) = x(t) \tag{9} $$

is not uniformly exponentially stable, as shown in Example 6.11. Since

$$ \Phi(t, t_o) = \frac{t_o^2 + 1}{t^2 + 1} $$

it is easy to check that the state equation is uniformly stable, and that the zero-input response goes to zero for all initial states. However with $t_o = 0$ and the bounded input $u(t) = 1$ for $t \ge 0$, the zero-state response is unbounded:

$$ y(t) = \int_{0}^{t} \frac{\sigma^2 + 1}{t^2 + 1} \, d\sigma = \frac{t^3/3 + t}{t^2 + 1} $$
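The unbounded growth in Example 12.5 can be checked numerically. The sketch below assumes the transition matrix is $\Phi(t, \sigma) = (\sigma^2 + 1)/(t^2 + 1)$, so that the zero-state response to $u(t) = 1$ from $t_o = 0$ is $\int_0^t (\sigma^2+1)/(t^2+1)\,d\sigma$, and compares a composite Simpson approximation with the closed form $(t^3/3 + t)/(t^2 + 1)$. Function names and the step count are illustrative.

```python
def zero_state_response(t, steps=1000):
    """Composite Simpson approximation of y(t) for the unit input, starting at t_o = 0."""
    if t == 0:
        return 0.0
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even number of intervals
    h = t / steps

    def integrand(sigma):
        # Phi(t, sigma) * B * u(sigma), with B = 1 and u = 1
        return (sigma * sigma + 1.0) / (t * t + 1.0)

    total = integrand(0.0) + integrand(t)
    total += 4.0 * sum(integrand((2 * i - 1) * h) for i in range(1, steps // 2 + 1))
    total += 2.0 * sum(integrand(2 * i * h) for i in range(1, steps // 2))
    return total * h / 3.0

def closed_form(t):
    return (t ** 3 / 3.0 + t) / (t * t + 1.0)

samples = {t: zero_state_response(t) for t in (1.0, 10.0, 100.0, 1000.0)}
print(samples)
```

For large $t$ the response behaves like $t/3$, so a bounded input produces an unbounded output even though every zero-input response decays to zero.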


In developing implications of uniform bounded-input, bounded-output stability for uniform exponential stability, we need to strengthen the usual controllability and observability properties. Specifically it will be assumed that these properties are uniform in time in a special way. For simplicity, admittedly a commodity in short supply for the next few pages, the development is subdivided into two parts. First we deal with linear state equations where the output is precisely the state vector ($C(t)$ is the $n \times n$ identity). In this instance the natural terminology is uniform bounded-input, bounded-state stability.
Recall from Chapter 9 the controllability Gramian

$$ W(t_o, t_f) = \int_{t_o}^{t_f} \Phi(t_o, t)B(t)B^T(t)\Phi^T(t_o, t) \, dt $$

12.6 Theorem Suppose for the linear state equation

$$ \dot{x}(t) = A(t)x(t) + B(t)u(t), \qquad y(t) = x(t) $$

there exist finite positive constants $\alpha$, $\beta$, $\varepsilon$, and $\delta$ such that for all $t$

$$ \|A(t)\| \le \alpha, \qquad \|B(t)\| \le \beta, \qquad \varepsilon I \le W(t - \delta, t) \tag{10} $$

Then the state equation is uniformly bounded-input, bounded-state stable if and only if it is uniformly exponentially stable.

Proof One direction of proof is supplied by Lemma 12.4, so assume the linear state equation (1) is uniformly bounded-input, bounded-state stable. Applying Theorem 12.2 with $C(t) = I$, there exists a finite constant $\rho$ such that

$$ \int_{\tau}^{t} \|\Phi(t, \sigma)B(\sigma)\| \, d\sigma \le \rho $$

for all $t, \tau$ such that $t \ge \tau$. We next show that this implies existence of a finite constant $\psi$ such that

$$ \int_{\tau}^{t} \|\Phi(t, \sigma)\| \, d\sigma \le \psi $$

for all $t, \tau$ such that $t \ge \tau$, and thus conclude uniform exponential stability by Theorem 6.8.

We need to use some elementary facts from earlier exercises. First, since A (t) is
bounded, corresponding to the constant 6 in (10) there exists a finite constant K such that

Ikb(t,a)IIK,
(See

Exercise 6.6.) Second, the lower bound on the controllability Gramian in (10)

together with Exercise 1.15 gives


t) <

for all t, and therefore

t)II 1/c
for all t. In particular these bounds show that
a)II IIBT(y)II

for all a, y satisfying a6?J 6. Therefore writing


a)W'(a& a)

a) = b(t,
=

a) dy

y)

we obtain, since

implies IayI
IIcD(t,

a-)II

dy

kD(t,

Then

Ikb(t, a-)jI d(a6)

[5

d?] d(a-)

IkD(t,

(14)

The proof can be completed by showing that the right side of (14) is bounded for all t, t
such that t t.
In the inside integral on the right side of(14), change the integration variable from
y to = y a + and then interchange the order of integration to write the right side of

(14) as
3

$11 4(t,

13K

d(a)

In the inside integral in this expression, change the integration variable from

to

to obtain

.!L

(15)

IIcD(t,

C)

Since

we can use (11) and (12) with the composition property to bound the

inside integral in (15) as

d IkD(t, t

dt

Therefore (14) becomes


f

&

a6)II d(a8)

13K

f3K p

r such that t t, so uniform exponential stability of the linear state


equation with C (t) = I follows from Theorem 6.8.

This holds for all t,


To address the general case, where $C(t)$ is not an identity matrix, recall that the observability Gramian for the state equation (1) is defined by

$$ M(t_o, t_f) = \int_{t_o}^{t_f} \Phi^T(t, t_o)C^T(t)C(t)\Phi(t, t_o) \, dt $$

12.7 Theorem Suppose that for the linear state equation (1) there exist finite positive constants $\alpha$, $\beta$, $\mu$, $\varepsilon_1$, $\delta_1$, $\varepsilon_2$, and $\delta_2$ such that

$$ \|A(t)\| \le \alpha, \quad \|B(t)\| \le \beta, \quad \|C(t)\| \le \mu, \quad \varepsilon_1 I \le W(t - \delta_1, t), \quad \varepsilon_2 I \le M(t, t + \delta_2) \tag{17} $$

for all $t$. Then the state equation is uniformly bounded-input, bounded-output stable if and only if it is uniformly exponentially stable.

Proof Again uniform exponential stability implies uniform bounded-input,
bounded-output stability by Lemma 12.4. So suppose that (1) is uniformly bounded-input,
bounded-output stable, and $\eta$ is such that the zero-state response satisfies

$$\sup_{t \ge t_0} \|y(t)\| \le \eta \sup_{t \ge t_0} \|u(t)\| \tag{18}$$

for all inputs $u(t)$. We will show that the associated state equation with $C(t) = I$,
namely,

$$\dot{x}(t) = A(t) x(t) + B(t) u(t) , \qquad y_a(t) = x(t) \tag{19}$$

also is uniformly bounded-input, bounded-state stable. To set up a contradiction
argument, assume the negation. Then for the positive constant $\eta \sqrt{\delta_2 / \varepsilon_2}$ there exists a $t_0$,
$t_a > t_0$, and bounded input signal $u_b(t)$ such that

$$\|y_a(t_a)\| = \|x(t_a)\| > \eta \sqrt{\delta_2 / \varepsilon_2} \, \sup_{t \ge t_0} \|u_b(t)\| \tag{20}$$

Furthermore we can assume that $u_b(t)$ satisfies $u_b(t) = 0$ for $t > t_a$. Applying this input
to (1), keeping the same initial time $t_0$, the zero-state response satisfies

$$\delta_2 \sup_{t \ge t_0} \|y(t)\|^2 \ge \delta_2 \sup_{t_a \le t \le t_a + \delta_2} \|y(t)\|^2 \ge \int_{t_a}^{t_a + \delta_2} \|y(t)\|^2 \, dt$$

$$= \int_{t_a}^{t_a + \delta_2} x^T(t_a) \Phi^T(t, t_a) C^T(t) C(t) \Phi(t, t_a) x(t_a) \, dt = x^T(t_a) M(t_a, t_a + \delta_2) x(t_a)$$

Invoking the hypothesis on the observability Gramian, and then (20),

$$\delta_2 \sup_{t \ge t_0} \|y(t)\|^2 \ge \varepsilon_2 \|x(t_a)\|^2 > \eta^2 \delta_2 \left( \sup_{t \ge t_0} \|u_b(t)\| \right)^2$$

Using elementary properties of the supremum, including

$$\left( \sup_{t \ge t_0} \|y(t)\| \right)^2 = \sup_{t \ge t_0} \|y(t)\|^2$$

yields

$$\sup_{t \ge t_0} \|y(t)\| > \eta \sup_{t \ge t_0} \|u_b(t)\| \tag{21}$$

Thus we have shown that the bounded input $u_b(t)$ is such that the bound (18) for
uniform bounded-input, bounded-output stability of (1) is violated. This contradiction
implies (19) is uniformly bounded-input, bounded-state stable. Then by Theorem 12.6
the state equation (19) is uniformly exponentially stable, and hence (1) also is uniformly
exponentially stable.

Time-Invariant Case
Complicated and seemingly contrived manipulations in the proofs of Theorem 12.6 and
Theorem 12.7 motivate separate consideration of the time-invariant case. In the
time-invariant setting, simpler characterizations of stability properties, and of controllability
and observability, yield more straightforward proofs. For the linear state equation

$$\dot{x}(t) = A x(t) + B u(t) , \qquad y(t) = C x(t) \tag{22}$$

the main task in proving an analog of Theorem 12.7 is to show that controllability,
observability, and finiteness of

$$\int_{0}^{\infty} \|C e^{At} B\| \, dt \tag{23}$$

imply finiteness of

$$\int_{0}^{\infty} \|e^{At}\| \, dt$$

12.8 Theorem Suppose the time-invariant linear state equation (22) is controllable and
observable. Then the state equation is uniformly bounded-input, bounded-output stable
if and only if it is exponentially stable.

Proof Clearly exponential stability implies uniform bounded-input, bounded-output
stability since

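$$\int_{0}^{\infty} \|C e^{At} B\| \, dt \le \|C\| \, \|B\| \int_{0}^{\infty} \|e^{At}\| \, dt$$

Conversely suppose (22) is uniformly bounded-input, bounded-output stable. Then (23) is
finite, and this implies

$$\lim_{t \to \infty} C e^{At} B = 0 \tag{24}$$

Using a representation for the matrix exponential from Chapter 5, we can write the
impulse response in the form

$$C e^{At} B = \sum_{k=1}^{l} \sum_{j=1}^{\sigma_k} G_{kj} \frac{t^{j-1}}{(j-1)!} e^{\lambda_k t} \tag{25}$$

where $\lambda_1, \ldots, \lambda_l$ are the distinct eigenvalues of $A$, and the $G_{kj}$ are $p \times m$ constant
matrices. Then

$$\frac{d}{dt} C e^{At} B = \sum_{k=1}^{l} \left( G_{k1} \lambda_k e^{\lambda_k t} + \sum_{j=2}^{\sigma_k} G_{kj} \left[ \lambda_k \frac{t^{j-1}}{(j-1)!} + \frac{t^{j-2}}{(j-2)!} \right] e^{\lambda_k t} \right)$$

If we suppose that this function does not go to zero, then from a comparison with (25) we
arrive at a contradiction with (24). Therefore

$$\lim_{t \to \infty} \left( \frac{d}{dt} C e^{At} B \right) = 0$$

That is,

$$\lim_{t \to \infty} C A e^{At} B = \lim_{t \to \infty} C e^{At} A B = 0$$

This reasoning can be repeated to show that any time derivative of the impulse response
goes to zero as $t \to \infty$. Explicitly,

$$\lim_{t \to \infty} C A^i e^{At} A^j B = 0 ; \quad i, j = 0, 1, \ldots$$

This data implies

$$\lim_{t \to \infty} \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} e^{At} \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = 0 \tag{26}$$

Using the controllability and observability hypotheses, select $n$ linearly independent
columns of the controllability matrix to form an invertible matrix $W_a$, and $n$ linearly
independent rows of the observability matrix to form an invertible $M_a$. Then, from (26),

$$\lim_{t \to \infty} M_a e^{At} W_a = 0$$

Therefore

$$\lim_{t \to \infty} e^{At} = 0$$

and exponential stability follows from arguments in the proof of Theorem 6.10.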

□□□
For some purposes it is useful to express the condition for uniform bounded-input,
bounded-output stability of (22) in terms of the transfer function $G(s) = C(sI - A)^{-1}B$.
We use the familiar terminology that a pole of $G(s)$ is a (complex, in general) value of
$s$, say $s_0$, such that for some $i$, $j$, $|G_{ij}(s_0)| = \infty$.

If each entry of $G(s)$ has negative-real-part poles, then a partial-fraction-expansion
computation, as discussed in Remark 10.12, shows that each entry of $G(t)$
has a 'sum of $t$-multiplied exponentials' form, with negative-real-part exponents.
Therefore

$$\int_{0}^{\infty} \|G(t)\| \, dt \tag{27}$$

is finite, and any realization of $G(s)$ is uniformly bounded-input, bounded-output stable.
On the other hand if (27) is finite, then the exponential terms in any entry of $G(t)$ must
have negative real parts. (Write a general entry in terms of distinct exponentials, and use
a contradiction argument.) But then every entry of $G(s)$ has negative-real-part poles.
Supplying this reasoning with a little more specificity proves a standard result.

12.9 Theorem The time-invariant linear state equation (22) is uniformly bounded-input,
bounded-output stable if and only if all poles of the transfer function
$G(s) = C(sI - A)^{-1}B$ have negative real parts.

For the time-invariant linear state equation (22), the relation between input-output
stability and internal stability depends on whether all distinct eigenvalues of $A$ appear as
poles of $G(s) = C(sI - A)^{-1}B$. (Review Example 12.3 from a transfer-function
perspective.) Controllability and observability guarantee that this is the case.
(Unfortunately, eigenvalues of $A$ sometimes are called 'poles of $A$,' a loose terminology
that at best obscures delicate distinctions.)
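The gap between eigenvalues of $A$ and poles of $G(s)$ can be seen numerically. The following is a minimal sketch using NumPy, built on a hypothetical two-state system chosen for illustration (it is not one of the text's examples). The zero eigenvalue is driven by the input but never reaches the output, so the impulse response is absolutely integrable even though the state equation is not exponentially stable. The helper names `impulse_response`, `ctrb`, and `obsv` are assumptions of this sketch, and the matrix exponential is computed by eigendecomposition, which presumes a diagonalizable $A$.

```python
import numpy as np

# Hypothetical system (an assumption for illustration): a stable mode that is
# driven and measured, plus a zero-eigenvalue mode that never reaches the output.
A = np.array([[-1.0, 0.0],
              [0.0, 0.0]])
B = np.array([[1.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

def impulse_response(A, B, C, t):
    """Evaluate C e^{At} B, assuming A is diagonalizable."""
    lam, V = np.linalg.eig(A)
    expAt = (V * np.exp(lam * t)) @ np.linalg.inv(V)  # V diag(e^{lam t}) V^{-1}
    return (C @ expAt @ B).real

def ctrb(A, B):
    """Controllability matrix [B AB ... A^{n-1}B]."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])

def obsv(A, C):
    """Observability matrix [C; CA; ...; CA^{n-1}]."""
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

eigenvalues = np.linalg.eigvals(A)
rank_ctrb = np.linalg.matrix_rank(ctrb(A, B))
rank_obsv = np.linalg.matrix_rank(obsv(A, C))

# Crude left-rectangle approximation of the integral of |C e^{At} B| over [0, 30].
ts = np.linspace(0.0, 30.0, 3001)
h = np.array([abs(impulse_response(A, B, C, t)[0, 0]) for t in ts])
integral = float(np.sum(h[:-1] * np.diff(ts)))

print("eigenvalues of A:", eigenvalues)
print("rank of controllability, observability matrices:", rank_ctrb, rank_obsv)
print("approximate integral of |C e^{At} B|:", integral)
```

Here the observability matrix loses rank, which is consistent with the zero eigenvalue failing to appear as a pole of the transfer function.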

12.10 Example The linearized state equation for the bucket system with unity
parameter values shown in Figure 12.11, and considered also in Examples 6.18 and 9.12,
is not exponentially stable. However the transfer function is
(28)

and the system is uniformly bounded-input, bounded-output stable. In this case it is


physically obvious that the zero eigenvalue corresponding to the disconnected bucket
does not appear as a pole of the transfer function.


Figure 12.11 A disconnected bucket system.

EXERCISES
Exercise 12.1 Show that the linear state equation

$$\dot{x}(t) = A(t) x(t) + B(t) u(t) , \qquad y(t) = C(t) x(t)$$

is uniformly bounded-input, bounded-output stable if and only if given any finite constant $\delta$ there
exists a finite constant $\varepsilon$ such that the following property holds regardless of $t_0$. If the input signal
satisfies

$$\|u(t)\| \le \delta , \quad t \ge t_0$$

then the corresponding zero-state response satisfies

$$\|y(t)\| \le \varepsilon , \quad t \ge t_0$$

(Note that $\varepsilon$ depends only on $\delta$, not on the particular input signal, nor on $t_0$.)

Exercise 12.2 Is the state equation below uniformly bounded-input, bounded-output stable? Is it
uniformly exponentially stable?

1/2
0

x(r)=

0
0

001

y(t) = [0

x(f)

ii(r)

1 ]x(i)

Exercise 12.3 For what values of the parameter a is the state equation below uniformly
exponentially stable? Uniformly bounded-input, bounded-output stable?
x(t)

Oa
=

.v(t)

0
+

11(1)
1

y(f)= [I 0]x(t)
Exercise 12.4 Determine whether the state equation given below is uniformly exponentially
stable, and whether it is uniformly bounded-input, bounded-output stable.
+
01

)'(t)

[1

0]x(t)

[et]

u(t)

Exercise 12.5 For the scalar linear state equation
=
show that for any $\delta > 0$, $W(t - \delta, t) > 0$ for all $t$. Do there exist positive constants $\varepsilon$ and $\delta$ such that
$W(t - \delta, t) > \varepsilon$ for all $t$?

Exercise 12.6 Find a linear state equation that satisfies all the hypotheses of Theorem 12.7
except for existence of $\varepsilon_2$ and $\delta_2$, and is uniformly exponentially stable but not uniformly
bounded-input, bounded-output stable.

Exercise 12.7 Devise a linear state equation that is uniformly stable, but not uniformly
bounded-input, bounded-output stable. Can you give simple conditions on B(t) and C(t) under
which the positive implication holds?

Exercise 12.8 Show that a time-invariant linear state equation is controllable if and only if there
exist positive constants $\delta$ and $\varepsilon$ such that for all $t$

$$\varepsilon I \le W(t - \delta, t)$$

Find a time-varying linear state equation that does not satisfy this condition, but is controllable on
$[t - \delta, t]$ for all $t$ and some positive constant $\delta$.

Exercise 12.9 Give a counterexample to the following claim. If the input signal to a uniformly
bounded-input, bounded-output stable, time-varying linear state equation goes to zero as $t \to \infty$, then the
corresponding zero-state response also goes to zero as $t \to \infty$. What about the time-invariant
case?

Exercise 12.10 With the obvious definition of uniform bounded-input, bounded-state stable, give
proofs or counterexamples to the following claims.

(a) A linear state equation that is uniformly bounded-input, bounded-state stable also is uniformly
bounded-input, bounded-output stable.

(b) A linear state equation that is uniformly bounded-input, bounded-output stable also is
uniformly bounded-input, bounded-state stable.


Exercise 12.11 Suppose the linear state equation

$$\dot{x}(t) = A(t) x(t)$$

with $A(t)$ bounded, satisfies the following total stability property. Given $\varepsilon > 0$ there exist
$\delta_1, \delta_2 > 0$ such that if $\|z_0\| < \delta_1$ and the continuous function $g(z, t)$ satisfies $\|g(z, t)\| < \delta_2$
for all $z$ and $t$, then the solution of

$$\dot{z}(t) = A(t) z(t) + g(z(t), t) , \quad z(t_0) = z_0$$

satisfies

$$\|z(t)\| < \varepsilon , \quad t \ge t_0$$

for any $t_0$. Show that the state equation $\dot{x}(t) = A(t) x(t)$ is uniformly exponentially stable. Hint:
Use Exercise 12.1.

Exercise 12.12 Consider a uniformly bounded-input, bounded-output stable, single-input,
time-invariant linear state equation with transfer function $G(s)$. If $\lambda$ and $\mu$ are positive constants, show
that the zero-state response $y(t)$ to

$$u(t) = e^{-\lambda t} , \quad t \ge 0$$

satisfies

$$\int_{0}^{\infty} y(t) e^{-\mu t} \, dt = \frac{1}{\lambda + \mu} G(\mu)$$

Under what conditions can such a relationship hold if the state equation is not uniformly
bounded-input, bounded-output stable?
Exercise 12.13 Show that the single-input, single-output, linear state equations

$$\dot{x}(t) = A x(t) + b u(t) , \qquad y(t) = c x(t) + u(t)$$

and

$$\dot{x}(t) = (A - bc) x(t) + b u(t) , \qquad y(t) = -c x(t) + u(t)$$

are inverses for each other in the sense that the product of their transfer functions is unity. If the
first state equation is uniformly bounded-input, bounded-output stable, what is implied about
input-output stability of the second?

Exercise 12.14 For the linear state equation

$$\dot{x}(t) = A x(t) + B u(t) , \quad x(0) = x_0 , \qquad y(t) = C x(t)$$

suppose $m = p$ and $CB$ is invertible. Let $P = I - B(CB)^{-1}C$ and consider the state equation

$$\dot{z}(t) = A P z(t) + A B (CB)^{-1} v(t) , \quad z(0) = x_0$$

$$w(t) = -(CB)^{-1} C A P z(t) - (CB)^{-1} C A B (CB)^{-1} v(t) + (CB)^{-1} \dot{v}(t)$$

Show that if $v(t) = y(t)$ for $t \ge 0$, then $w(t) = u(t)$ for $t \ge 0$. That is, show that the second state
equation is an inverse for the first. If the first state equation is uniformly bounded-input,
bounded-output stable, what is implied about input-output stability of the second? If the first is
exponentially stable, what is implied about internal stability of the second?

NOTES
Note 12.1 By introduction of suprema in Definition 12.1 we surreptitiously employ a
function-space norm, rather than our customary pointwise-in-time norm. See Exercise 12.1 for an
equivalent definition in terms of pointwise norms. A more economical definition is that a linear
state equation is bounded-input, bounded-output stable if a bounded input yields a bounded
zero-state response. More precisely given a $t_0$ and $u(t)$ satisfying $\|u(t)\| \le \delta$ for $t \ge t_0$, where $\delta$ is a
finite positive constant, there is a finite positive constant $\varepsilon$ such that the corresponding zero-state
response satisfies $\|y(t)\| \le \varepsilon$ for $t \ge t_0$. Obviously the requisite $\varepsilon$ depends on $\delta$, but also $\varepsilon$ can
depend on $t_0$ or on the particular input signal $u(t)$. Compare this to Exercise 12.1, where $\varepsilon$
depends only on $\delta$. Perhaps surprisingly, bounded-input, bounded-output stability is equivalent to
Definition 12.1, though the proof is difficult. See the papers:

C.A. Desoer, A.J. Thomasian, "A note on zero-state stability of linear systems," Proceedings of
the First Allerton Conference on Circuit and System Theory, University of Illinois, Urbana,
Illinois, 1963

D.C. Youla, "On the stability of linear systems," IEEE Transactions on Circuits and Systems, Vol.
10, No. 2, pp. 276-279, 1963

By this equivalence Theorem 12.2 is valid for the superficially weaker property of bounded-input,
bounded-output stability, though again the proof is less simple.
Note 12.2 The proof of Theorem 12.7 is based on

L.M. Silverman, B.D.O. Anderson, "Controllability, observability, and stability of linear
systems," SIAM Journal on Control and Optimization, Vol. 6, No. 1, pp. 121-130, 1968

This paper contains a number of related results and citations to earlier literature. See also

B.D.O. Anderson, J.B. Moore, "New results in linear system stability," SIAM Journal on Control
and Optimization, Vol. 7, No. 3, pp. 398-414, 1969

A proof of the equivalence of internal and input-output stability under weaker hypotheses, called
stabilizability and detectability, for time-varying linear state equations is given in

R. Ravi, P.P. Khargonekar, "Exponential and input-output stability are equivalent for linear
time-varying systems," Sadhana, Vol. 18, Part 1, pp. 31-37, 1993

Note 12.3 Exercises 12.13 and 12.14 are examples of inverse system calculations, a notion that is
connected to several aspects of linear system theory. A general treatment for time-varying linear
state equations is in

L.M. Silverman, "Inversion of multivariable linear systems," IEEE Transactions on Automatic
Control, Vol. 14, No. 3, pp. 270-276, 1969

Further developments and a more general formulation for the time-invariant case can be found in

L.M. Silverman, H.J. Payne, "Input-output structure of linear systems with application to the
decoupling problem," SIAM Journal on Control and Optimization, Vol. 9, No. 2, pp. 199-233,
1971

P.J. Moylan, "Stable inversion of linear systems," IEEE Transactions on Automatic Control, Vol.
22, No. 1, pp. 74-78, 1977

E. Soroka, U. Shaked, "On the geometry of the inverse system," IEEE Transactions on Automatic
Control, Vol. 31, No. 8, pp. 751-754, 1986

These papers presume a linear state equation with fixed initial state. A somewhat different
formulation is discussed in

H.L. Weinert, "On the inversion of linear systems," IEEE Transactions on Automatic Control,
Vol. 29, No. 10, pp. 956-958, 1984

13
CONTROLLER AND OBSERVER
FORMS

In this chapter we focus on further developments for time-invariant linear state
equations. Some of these results rest on special techniques for the time-invariant case,
for example the Laplace transform. Others simply are not available for time-varying
systems, or are so complicated, or require such restrictive hypotheses that potential
utility is unclear.
The material is presented for continuous-time state equations. For discrete time
the treatment is essentially the same, differing mainly in controllability/reachability
terminology, and the use of the z-transform variable $z$ in place of $s$. Thus translation to
discrete time is a matter of adding a few notes in the margin.
Even in the time-invariant case, multi-input, multi-output linear state equations
have a remarkably complicated algebraic structure. One approach to coping with this
complexity is to apply a state variable change yielding a special form for the state
equation that displays the structure. We adopt this approach and consider variable
changes related to the controllability and observability structure of time-invariant linear
state equations. Additional criteria for controllability and observability are obtained in
the course of this development. A second approach, adopting an abstract geometric
viewpoint that subordinates algebraic detail to a larger view, is explored in Chapter 18.
The standard notation

$$\dot{x}(t) = A x(t) + B u(t) , \qquad y(t) = C x(t) \tag{1}$$

is continued for an $n$-dimensional, time-invariant, linear state equation with $m$ inputs
and $p$ outputs. Recall that if two such state equations are related by a (constant) state
variable change, then the $n \times nm$ controllability matrices for the two state equations have
the same rank. Also the two $np \times n$ observability matrices have the same rank.


Controllability
We begin by showing that there is a state variable change for (1) that displays the
'controllable part' of the state equation. This result is of interest in itself, and it is used to
develop new criteria for controllability.

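13.1 Theorem Suppose the controllability matrix for the linear state equation (1)
satisfies

$$\operatorname{rank} \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = q \tag{2}$$

where $0 < q < n$. Then there exists an invertible $n \times n$ matrix $P$ such that

$$P^{-1} A P = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0_{(n-q) \times q} & \hat{A}_{22} \end{bmatrix} , \qquad P^{-1} B = \begin{bmatrix} \hat{B}_{11} \\ 0_{(n-q) \times m} \end{bmatrix} \tag{3}$$

where $\hat{A}_{11}$ is $q \times q$, $\hat{B}_{11}$ is $q \times m$, and

$$\operatorname{rank} \begin{bmatrix} \hat{B}_{11} & \hat{A}_{11} \hat{B}_{11} & \cdots & \hat{A}_{11}^{q-1} \hat{B}_{11} \end{bmatrix} = q$$

Proof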

The state variable change matrix $P$ is constructed as follows. Select $q$
linearly independent columns, $p_1, \ldots, p_q$, from the controllability matrix for (1), that is,
pick a basis for the range space of the controllability matrix. Then let $p_{q+1}, \ldots, p_n$ be
additional $n \times 1$ vectors such that

$$P = \begin{bmatrix} p_1 & \cdots & p_q & p_{q+1} & \cdots & p_n \end{bmatrix}$$

is invertible. Define $G = P^{-1}B$, equivalently, $PG = B$. The $j^{th}$ column of $B$ is given by
postmultiplication of $P$ by the $j^{th}$ column of $G$, in other words, by a linear combination
of columns of $P$ with coefficients given by the $j^{th}$ column of $G$. Since the $j^{th}$ column
of $B$ can be written as a linear combination of $p_1, \ldots, p_q$, and the columns of $P$ are
linearly independent, the last $n - q$ entries of the $j^{th}$ column of $G$ must be zero. This
argument applies for $j = 1, \ldots, m$, and therefore $G = P^{-1}B$ has the claimed form.
Now let $F = P^{-1}AP$ so that

$$PF = \begin{bmatrix} Ap_1 & Ap_2 & \cdots & Ap_n \end{bmatrix}$$

Since each column of $A^k B$, $k \ge 0$, can be written as a linear combination of $p_1, \ldots, p_q$,
the column vectors $Ap_1, \ldots, Ap_q$ can be written as linear combinations of $p_1, \ldots, p_q$.
Thus an argument similar to the argument for $G$ gives that the first $q$ columns of $F$
must have zeros as the last $n - q$ entries. Therefore $F$ has the claimed form. To
complete the proof multiply the rank-$q$ controllability matrix by the invertible matrix
$P^{-1}$ to obtain

$$P^{-1} \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = \begin{bmatrix} P^{-1}B & P^{-1}AB & \cdots & P^{-1}A^{n-1}B \end{bmatrix} = \begin{bmatrix} G & FG & \cdots & F^{n-1}G \end{bmatrix} = \begin{bmatrix} \hat{B}_{11} & \hat{A}_{11}\hat{B}_{11} & \cdots & \hat{A}_{11}^{n-1}\hat{B}_{11} \\ 0 & 0 & \cdots & 0 \end{bmatrix} \tag{5}$$

The rank is preserved at each step in (5), and applying again the Cayley-Hamilton
theorem shows that

$$\operatorname{rank} \begin{bmatrix} \hat{B}_{11} & \hat{A}_{11}\hat{B}_{11} & \cdots & \hat{A}_{11}^{q-1}\hat{B}_{11} \end{bmatrix} = q$$

□□□
An interpretation of this result is shown in Figure 13.2. Writing the variable
change as

$$\begin{bmatrix} z_c(t) \\ z_{nc}(t) \end{bmatrix} = P^{-1} x(t)$$

where the partition $z_c(t)$ is $q \times 1$, yields a linear state equation that can be written in the
decomposed form

$$\dot{z}_c(t) = \hat{A}_{11} z_c(t) + \hat{A}_{12} z_{nc}(t) + \hat{B}_{11} u(t)$$

$$\dot{z}_{nc}(t) = \hat{A}_{22} z_{nc}(t)$$

Clearly $z_{nc}(t)$ is not influenced by the input signal. Thus the second component state
equation is not controllable, while by (6) the first component is controllable.

13.2 Figure A state equation decomposition related to controllability.
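The decomposition of Theorem 13.1 can be explored numerically. The sketch below is one possible construction, not the construction in the proof: instead of selecting columns of the controllability matrix, it takes an orthonormal basis for the range space from a singular value decomposition and completes it with the remaining left singular vectors. The function names, the rank tolerance, and the three-state example are all assumptions made for illustration.

```python
import numpy as np

def ctrb(A, B):
    """Controllability matrix [B AB ... A^{n-1}B]."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])

def controllable_decomposition(A, B, tol=1e-10):
    """Return P, F = P^{-1}AP, G = P^{-1}B and q = rank of the controllability matrix.

    The first q columns of P span the range of the controllability matrix;
    the remaining columns complete them to an orthogonal (hence invertible) P.
    """
    Wc = ctrb(A, B)
    U, s, _ = np.linalg.svd(Wc)          # U is n x n
    q = int(np.sum(s > tol * max(1.0, s[0])))
    P = U
    F = P.T @ A @ P                      # P^{-1} = P^T for orthogonal P
    G = P.T @ B
    return P, F, G, q

# Hypothetical example: the third state variable is decoupled from the input.
A = np.array([[0.0, 1.0, 0.0],
              [-2.0, -3.0, 1.0],
              [0.0, 0.0, -4.0]])
B = np.array([[0.0],
              [1.0],
              [0.0]])

P, F, G, q = controllable_decomposition(A, B)
print("q =", q)
print("lower-left block of F:\n", np.round(F[q:, :q], 12))
print("lower block of G:\n", np.round(G[q:], 12))
```

The lower-left block of `F` and the lower block of `G` come out as zero up to rounding, matching the block structure in (3), and the leading $q \times q$ pair is itself controllable.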

The character of the decomposition aside, Theorem 13.1 is an important technical
device in the proof of a different characterization of controllability.

13.3 Theorem The linear state equation (1) is controllable if and only if for every
complex scalar $\lambda$ the only complex $n \times 1$ vector $p$ that satisfies

$$p^T A = \lambda p^T , \qquad p^T B = 0 \tag{7}$$

is $p = 0$.

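Proof The strategy is to show that (7) can be satisfied for some $\lambda$ and some $p \ne 0$ if
and only if the state equation is not controllable. If there exists a nonzero, complex,
$n \times 1$ vector $p$ and a complex scalar $\lambda$ such that (7) is satisfied, then

$$p^T \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = \begin{bmatrix} p^T B & p^T A B & \cdots & p^T A^{n-1} B \end{bmatrix} = \begin{bmatrix} p^T B & \lambda p^T B & \cdots & \lambda^{n-1} p^T B \end{bmatrix} = 0$$

Therefore the $n$ rows of the controllability matrix are linearly dependent, and thus the
state equation is not controllable.
On the other hand suppose the linear state equation (1) is not controllable. Then by
Theorem 13.1 there exists an invertible $P$ such that (3) holds, where $0 < q < n$. Let

$$p^T = \begin{bmatrix} 0_{1 \times q} & p_q^T \end{bmatrix} P^{-1}$$

where $p_q$ is a left eigenvector for $\hat{A}_{22}$. That is, for some complex scalar $\lambda$,

$$p_q^T \hat{A}_{22} = \lambda p_q^T , \quad p_q \ne 0$$

Then $p \ne 0$, and

$$p^T B = \begin{bmatrix} 0 & p_q^T \end{bmatrix} \begin{bmatrix} \hat{B}_{11} \\ 0 \end{bmatrix} = 0$$

$$p^T A = \begin{bmatrix} 0 & p_q^T \end{bmatrix} \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix} P^{-1} = \begin{bmatrix} 0 & \lambda p_q^T \end{bmatrix} P^{-1} = \lambda p^T$$

This completes the proof.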


□□□
A solution $\lambda$, $p$ of (7) with $p \ne 0$ must be an eigenvalue and left eigenvector for
$A$. Thus a quick paraphrase of the condition in Theorem 13.3 is: "there is no left
eigenvector of $A$ that is orthogonal to the columns of $B$." Phrasing aside, the result can
be used to obtain another controllability criterion that appears as a rank condition.
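The paraphrase above suggests a direct numerical check. The sketch below computes left eigenvectors of $A$ as eigenvectors of $A^T$ and reports those that are orthogonal to the columns of $B$. It is only a rough illustration: the function name and tolerance are assumptions, and when $A$ has repeated eigenvalues the particular eigenvectors returned by `numpy.linalg.eig` need not include an offending linear combination, so the check is reliable only for distinct eigenvalues.

```python
import numpy as np

def left_eigenvectors_orthogonal_to_B(A, B, tol=1e-9):
    """List (eigenvalue, p) pairs with p^T A = lambda p^T and p^T B = 0.

    Assumes distinct eigenvalues; a nonempty result signals loss of controllability.
    """
    lam, V = np.linalg.eig(A.T)          # columns of V are left eigenvectors of A
    hits = []
    for k in range(len(lam)):
        p = V[:, k]
        if np.linalg.norm(p @ B) < tol:  # p^T B = 0 up to rounding
            hits.append((lam[k], p))
    return hits

# Hypothetical example: the mode at -4 is not reached by the input.
A = np.array([[0.0, 1.0, 0.0],
              [-2.0, -3.0, 1.0],
              [0.0, 0.0, -4.0]])
B = np.array([[0.0],
              [1.0],
              [0.0]])

hits = left_eigenvectors_orthogonal_to_B(A, B)
for value, vector in hits:
    print("uncontrollable eigenvalue:", value, "left eigenvector:", vector)
```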
13.4 Theorem The linear state equation (1) is controllable if and only if

$$\operatorname{rank} \begin{bmatrix} sI - A & B \end{bmatrix} = n \tag{8}$$

for every complex scalar $s$.

Proof Again we show equivalence of the negation of the claim and the negation of
the condition. By Theorem 13.3 the state equation is not controllable if and only if there
is a nonzero, complex, $n \times 1$ vector $p$ and complex scalar $\lambda$ such that (7) holds. That is,
if and only if

$$p^T \begin{bmatrix} \lambda I - A & B \end{bmatrix} = 0$$

But this condition is equivalent to

$$\operatorname{rank} \begin{bmatrix} \lambda I - A & B \end{bmatrix} < n$$

that is, equivalent to the negation of the condition in (8).

□□□
Observe from the proof that the rank test in (8) need only be applied for those
values of $s$ that are eigenvalues of $A$. However in many instances it is just as easy to
argue the rank condition for all complex scalars, thereby avoiding the chore of
computing eigenvalues.
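As a numerical illustration of this remark, the following sketch applies the rank test (8) only at the computed eigenvalues of $A$. The function name `rank_test_controllable`, the tolerance, and the two example systems are assumptions introduced for illustration; with finite-precision eigenvalues the rank decision depends on the tolerance, so the result should be treated as indicative rather than conclusive.

```python
import numpy as np

def rank_test_controllable(A, B, tol=1e-9):
    """Check rank [sI - A, B] = n at each computed eigenvalue s of A."""
    n = A.shape[0]
    for s in np.linalg.eigvals(A):
        test_matrix = np.hstack([s * np.eye(n) - A, B])
        if np.linalg.matrix_rank(test_matrix, tol=tol) < n:
            return False
    return True

# Hypothetical controllable pair (companion form).
A1 = np.array([[0.0, 1.0],
               [-2.0, -3.0]])
B1 = np.array([[0.0],
               [1.0]])

# Hypothetical pair with an eigenvalue at -4 that the input cannot influence.
A2 = np.array([[0.0, 1.0, 0.0],
               [-2.0, -3.0, 1.0],
               [0.0, 0.0, -4.0]])
B2 = np.array([[0.0],
               [1.0],
               [0.0]])

print(rank_test_controllable(A1, B1))
print(rank_test_controllable(A2, B2))
```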

Controller Form
A special form for a controllable linear state equation (1) that can be obtained by a
change of state variables is discussed next. The derivation of this form is intricate, but
the result is important in revealing the structure of multi-input, multi-output, linear state
equations. The special form is used in our treatments of eigenvalue placement by linear
state feedback, and in Chapter 17 where the minimal realization problem is revisited for
time-invariant systems.
To avoid fussy and uninteresting complications, we assume that

$$\operatorname{rank} B = m \tag{9}$$

in addition to controllability. Of course if rank $B < m$, then the input components do not
independently affect the state vector, and the state equation can be recast with a
lower-dimensional input. For notational convenience the $k^{th}$ column of $B$ is written as $B_k$.
Then the controllability matrix for the state equation (1) can be displayed in
column-partitioned form as

$$\begin{bmatrix} B_1 & \cdots & B_m & AB_1 & \cdots & AB_m & \cdots & A^{n-1}B_1 & \cdots & A^{n-1}B_m \end{bmatrix} \tag{10}$$

To begin construction of the desired variable change, we search the columns of
(10) from left to right to select a set of $n$ linearly independent columns. This search is
made easier by the following fact. If $A^q B_r$ is linearly dependent on columns to its left in
(10), namely, the columns in

$$B , AB , \ldots , A^{q-1}B ; \quad A^q B_1 , \ldots , A^q B_{r-1}$$

then $A^{q+1} B_r$ is linearly dependent on the columns in

$$AB , A^2 B , \ldots , A^q B ; \quad A^{q+1} B_1 , \ldots , A^{q+1} B_{r-1}$$

That is, $A^{q+1} B_r$ is linearly dependent on columns to its left in (10). This means that, in
the left-to-right search of (10), once a dependent column involving a product of a power
of $A$ and the column $B_r$ is found, all columns that are products of higher powers of $A$
and $B_r$ can be ignored.

13.5 Definition For $j = 1, \ldots, m$, the $j^{th}$ controllability index $\rho_j$ for the controllable
linear state equation (1) is the least integer such that column vector $A^{\rho_j} B_j$ is linearly
dependent on column vectors occurring to the left of it in the controllability matrix (10).

The columns to the left of $A^{\rho_j} B_j$ in (10) can be listed as

$$B_1 , AB_1 , \ldots , A^{\rho_j - 1} B_1 , \ldots , B_m , AB_m , \ldots , A^{\rho_j - 1} B_m ; \quad A^{\rho_j} B_1 , \ldots , A^{\rho_j} B_{j-1} \tag{11}$$

where, compared to (10), a different arrangement of columns is adopted to display the
columns defining the controllability index $\rho_j$. For use in the sequel it is convenient to
express $A^{\rho_j} B_j$ as a linear combination of only the linearly independent columns in (11).
From the discussion above,

$$B_1 , AB_1 , \ldots , A^{\rho_1 - 1} B_1 , \ldots , B_m , AB_m , \ldots , A^{\rho_m - 1} B_m \tag{12}$$

is a linearly independent set of columns in (10). This is the linearly independent set
obtained from a complete left-to-right search. Therefore any column to the left of the
semicolon in (11) and not included in (12) is linearly dependent. Thus $A^{\rho_j} B_j$ can be
written as a linear combination of linearly independent columns to its left in (10):

$$A^{\rho_j} B_j = \sum_{r=1}^{m} \sum_{q=1}^{\min[\rho_j, \rho_r]} \alpha_{jrq} A^{q-1} B_r + \sum_{\substack{r=1 \\ \rho_j < \rho_r}}^{j-1} \beta_{jr} A^{\rho_j} B_r \tag{13}$$

Additional facts to remember about this setup are that $\rho_1, \ldots, \rho_m \ge 1$ by (9), and
$\rho_1 + \cdots + \rho_m = n$ by the assumption that (1) is controllable. Also it is easy to show
that the controllability indices for (1) remain the same under a change of state variables
(Exercise 13.10).
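The left-to-right search behind Definition 13.5 translates fairly directly into a short computation. The sketch below is a floating-point illustration under stated assumptions: linear dependence is decided by a numerical rank with an arbitrary tolerance, the function name `controllability_indices` is hypothetical, and the example pair is chosen only so that the search can be followed by hand.

```python
import numpy as np

def controllability_indices(A, B, tol=1e-9):
    """Left-to-right search of [B1..Bm, AB1..ABm, ...] for the indices rho_1..rho_m.

    Once A^k B_j is found dependent on columns to its left, rho_j = k and
    higher powers of A applied to B_j are skipped.
    """
    n, m = B.shape
    selected = np.zeros((n, 0))      # independent columns found so far
    indices = [None] * m
    for k in range(n + 1):
        Ak = np.linalg.matrix_power(A, k)
        for j in range(m):
            if indices[j] is not None:
                continue
            candidate = np.hstack([selected, Ak @ B[:, [j]]])
            if np.linalg.matrix_rank(candidate, tol=tol) > selected.shape[1]:
                selected = candidate  # column is independent, keep it
            else:
                indices[j] = k        # first dependent power for input j
    return indices

# Hypothetical controllable pair with n = 3, m = 2.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0]])
B = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])

rho = controllability_indices(A, B)
print("controllability indices:", rho, "sum:", sum(rho))
```

For a controllable pair with rank $B = m$ the indices returned should each be at least one and should sum to $n$, as noted above.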
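Now consider the invertible $n \times n$ matrix defined column-wise by

$$M^{-1} = \begin{bmatrix} B_1 & AB_1 & \cdots & A^{\rho_1 - 1} B_1 & \cdots & B_m & AB_m & \cdots & A^{\rho_m - 1} B_m \end{bmatrix}$$

and partition the inverse matrix by rows as

$$M = \begin{bmatrix} M_1 \\ M_2 \\ \vdots \\ M_n \end{bmatrix}$$

The change of state variables we use is constructed from rows $\rho_1$, $\rho_1 + \rho_2$, $\ldots$,
$\rho_1 + \cdots + \rho_m = n$ of $M$ by setting

$$P = \begin{bmatrix} P_1 \\ \vdots \\ P_m \end{bmatrix} , \qquad P_i = \begin{bmatrix} M_{\rho_1 + \cdots + \rho_i} \\ M_{\rho_1 + \cdots + \rho_i} A \\ \vdots \\ M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - 1} \end{bmatrix} , \quad i = 1, \ldots, m \tag{14}$$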

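13.6 Lemma The $n \times n$ matrix $P$ in (14) is invertible.

Proof Suppose there is a linear combination of the rows of $P$ that yields zero,

$$\sum_{i=1}^{m} \sum_{q=1}^{\rho_i} \gamma_{i,q} M_{\rho_1 + \cdots + \rho_i} A^{q-1} = 0 \tag{15}$$

Then the scalar coefficients in this linear combination can be shown to be zero as
follows. From $M M^{-1} = I$, in particular rows $\rho_1$, $\rho_1 + \rho_2$, $\ldots$, $\rho_1 + \cdots + \rho_m = n$ of this
identity, we have, for $i = 1, \ldots, m$,

$$M_{\rho_1 + \cdots + \rho_i} \begin{bmatrix} B_1 & AB_1 & \cdots & A^{\rho_1 - 1} B_1 & \cdots & B_m & AB_m & \cdots & A^{\rho_m - 1} B_m \end{bmatrix} = \begin{bmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \end{bmatrix}$$

with the unity entry in position $\rho_1 + \cdots + \rho_i$. This can be rewritten as the set of identities

$$M_{\rho_1 + \cdots + \rho_i} A^{q-1} B_j = \begin{cases} 0 , & j \ne i , \; q = 1, \ldots, \rho_j \\ 0 , & j = i , \; q = 1, \ldots, \rho_i - 1 \\ 1 , & j = i , \; q = \rho_i \end{cases} \tag{16}$$

Now suppose the columns $B_{j_1}, \ldots, B_{j_s}$ of $B$ correspond to the largest
controllability-index value $\rho_{j_1} = \cdots = \rho_{j_s}$. Multiplying the linear combination in (15)
on the right by any one of these columns, say $B_{j_r}$, gives

$$\sum_{i=1}^{m} \sum_{q=1}^{\rho_i} \gamma_{i,q} M_{\rho_1 + \cdots + \rho_i} A^{q-1} B_{j_r} = 0 \tag{17}$$

The highest power of $A$ in this expression is $\rho_i - 1 \le \rho_{j_r} - 1$. Therefore, using (16), the
only nonzero coefficient of a $\gamma$ on the left side of (17) corresponds to indices
$i = j_r$, $q = \rho_{j_r}$, and this gives

$$\gamma_{j_r, \rho_{j_r}} = 0 \tag{18}$$

Of course this argument shows that (18) holds for $r = 1, \ldots, s$. Now repeat the
calculation with the columns of $B$ corresponding to the next-largest controllability
index, and so on. At the end of this process it will have been shown that

$$\gamma_{i, \rho_i} = 0 , \quad i = 1, \ldots, m$$

Therefore the linear combination in (15) can be written as

$$\sum_{i=1}^{m} \sum_{q=1}^{\rho_i - 1} \gamma_{i,q} M_{\rho_1 + \cdots + \rho_i} A^{q-1} = 0 \tag{19}$$

where of course the values of $i$ for which $\rho_i = 1$ are neglected.
Again working with $B_{j_r}$, a column of $B$ corresponding to the largest
controllability-index value, multiply (19) on the right by $A B_{j_r}$ to obtain

$$\sum_{i=1}^{m} \sum_{q=1}^{\rho_i - 1} \gamma_{i,q} M_{\rho_1 + \cdots + \rho_i} A^{q} B_{j_r} = 0 \tag{20}$$

From (16) the only nonzero $\gamma$-coefficient on the left side of (20) is the one with indices
$i = j_r$, $q = \rho_{j_r} - 1$, and therefore

$$\gamma_{j_r, \rho_{j_r} - 1} = 0 \tag{21}$$

Again (21) holds for $r = 1, \ldots, s$. Proceeding with the columns of $B$ corresponding to
the next largest controllability index, and so on, gives

$$\gamma_{i, \rho_i - 1} = 0 , \quad i = 1, \ldots, m$$

That is, the $q = \rho_i - 1$ term in the linear combination (20) can be removed, and we
proceed by multiplying by $A^2 B_{j_r}$ and repeating the argument. Clearly this leads to the
conclusion that all the $\gamma$-scalars in the linear combination in (15) are zero. Thus the $n$
rows of $P$ are linearly independent, and $P$ is invertible. (To appreciate the importance
of proceeding in decreasing order of controllability-index values, consider Exercise
13.6.)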

□□□
To ease description of the special form obtained by changing state variables via $P$,
we introduce a special notation.

13.7 Definition Given a set of $k$ positive integers $\alpha_1, \ldots, \alpha_k$, with $\alpha_1 + \cdots + \alpha_k = n$,
the corresponding integrator coefficient matrices are defined by

$$A_o = \text{block diagonal} \left\{ \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \end{bmatrix}_{(\alpha_i \times \alpha_i)} , \; i = 1, \ldots, k \right\} , \qquad B_o = \text{block diagonal} \left\{ \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}_{(\alpha_i \times 1)} , \; i = 1, \ldots, k \right\} \tag{22}$$

The dimensional subscripts in (22) emphasize the diagonal-block sizes, while
overall $A_o$ is $n \times n$, and $B_o$ is $n \times k$. The terminology in this definition is descriptive in
that the $n$-dimensional state equation specified by (22) represents $k$ parallel chains of
integrators, with $\alpha_i$ integrators in the $i^{th}$ chain, as shown in Figure 13.8. Moreover (22)
provides a useful notation for our special form for controllable state equations. Namely
the core of the special form is the set of integrator chains specified by the controllability
indices.
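The block structure in Definition 13.7 is easy to generate programmatically. The following is a small sketch that assumes the blocks are as described above (ones on the superdiagonal of each diagonal block of the first matrix, and a single one at the bottom of each column block of the second); the function name `integrator_coefficient_matrices` is introduced here only for illustration.

```python
import numpy as np

def integrator_coefficient_matrices(alphas):
    """Build the n x n and n x k integrator coefficient matrices for chain lengths alphas."""
    n, k = sum(alphas), len(alphas)
    Ao = np.zeros((n, n))
    Bo = np.zeros((n, k))
    start = 0
    for i, a in enumerate(alphas):
        for r in range(a - 1):
            Ao[start + r, start + r + 1] = 1.0   # superdiagonal of the i-th block
        Bo[start + a - 1, i] = 1.0               # input enters the end of the i-th chain
        start += a
    return Ao, Bo

# Two chains with 4 and 2 integrators, so n = 6 and k = 2.
Ao, Bo = integrator_coefficient_matrices([4, 2])
print(Ao)
print(Bo)
```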

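13.8 Figure State variable diagram for the integrator-coefficient state equation.

For convenience of definition we invert our customary notation for state variable
change. That is, setting $z(t) = P x(t)$ the resulting coefficient matrices are $PAP^{-1}$, $PB$,
and $CP^{-1}$.

13.9 Theorem Suppose the time-invariant linear state equation (1) satisfies
rank $B = m$, and is controllable with controllability indices $\rho_1, \ldots, \rho_m$. Then the
change of state variables $z(t) = P x(t)$, with $P$ as in (14), yields the controller form state
equation

$$\dot{z}(t) = \left( A_o + B_o U P^{-1} \right) z(t) + B_o R u(t) , \qquad y(t) = C P^{-1} z(t) \tag{23}$$

where $A_o$ and $B_o$ are the integrator coefficient matrices corresponding to $\rho_1, \ldots, \rho_m$,
and where the $m \times n$ coefficient matrix $U$ and the $m \times m$ invertible coefficient matrix $R$
are given by

$$U = \begin{bmatrix} M_{\rho_1} A^{\rho_1} \\ M_{\rho_1 + \rho_2} A^{\rho_2} \\ \vdots \\ M_n A^{\rho_m} \end{bmatrix} , \qquad R = \begin{bmatrix} M_{\rho_1} A^{\rho_1 - 1} B \\ M_{\rho_1 + \rho_2} A^{\rho_2 - 1} B \\ \vdots \\ M_n A^{\rho_m - 1} B \end{bmatrix} \tag{24}$$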

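Proof The relation

$$P A P^{-1} = A_o + B_o U P^{-1}$$

can be verified by easy inspection after multiplying on the right by $P$ and writing out
terms using the special forms of $P$, $A_o$, and $B_o$. For example the $i^{th}$-block of $\rho_i$ rows in
the resulting expression is

$$\begin{bmatrix} M_{\rho_1 + \cdots + \rho_i} A \\ M_{\rho_1 + \cdots + \rho_i} A^2 \\ \vdots \\ M_{\rho_1 + \cdots + \rho_i} A^{\rho_i} \end{bmatrix} = \begin{bmatrix} M_{\rho_1 + \cdots + \rho_i} A \\ \vdots \\ M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - 1} \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ M_{\rho_1 + \cdots + \rho_i} A^{\rho_i} \end{bmatrix}$$

Unfortunately it takes more work to verify

$$P B = B_o R \tag{25}$$

However invertibility of $R$ will be clear once this is established, since $P$ is invertible
and rank $B_o$ = rank $B = m$. Writing (25) in terms of the special forms of $P$, $B_o$, and $R$
gives, for the $i^{th}$-block of $\rho_i$ rows,

$$\begin{bmatrix} M_{\rho_1 + \cdots + \rho_i} B \\ M_{\rho_1 + \cdots + \rho_i} A B \\ \vdots \\ M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - 1} B \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - 1} B \end{bmatrix}$$

Therefore we must show that

$$M_{\rho_1 + \cdots + \rho_i} A^{q-1} B_j = 0 , \quad q = 1, \ldots, \rho_i - 1 \tag{26}$$

for $i, j = 1, \ldots, m$. First note that if $i = j$, or if $i \ne j$ and $\rho_i \le \rho_j + 1$, then (26) follows
directly from (16). So suppose $i \ne j$ and $\rho_i = \rho_j + \kappa$, where $\kappa \ge 2$. Then we need to
prove that

$$M_{\rho_1 + \cdots + \rho_i} A^{q-1} B_j = 0 , \quad q = 1, \ldots, \rho_i - 1 = \rho_j + \kappa - 1$$

Again using (16), it remains only to show

$$M_{\rho_1 + \cdots + \rho_i} A^{q-1} B_j = 0 , \quad q = \rho_j + 1, \ldots, \rho_j + \kappa - 1 \tag{27}$$

To set up an induction proof it is convenient to write (27) as

$$M_{\rho_1 + \cdots + \rho_i} A^{\rho_j + k} B_j = 0 , \quad k = 0, \ldots, \kappa - 2 \tag{28}$$

where, again, $\kappa \ge 2$. To establish (28) for $k = 0$, we use (13), which is repeated here for
convenience:

$$A^{\rho_j} B_j = \sum_{r=1}^{m} \sum_{q=1}^{\min[\rho_j, \rho_r]} \alpha_{jrq} A^{q-1} B_r + \sum_{\substack{r=1 \\ \rho_j < \rho_r}}^{j-1} \beta_{jr} A^{\rho_j} B_r \tag{13}$$

Replacing $\rho_j$ by $\rho_i - \kappa$ on the right side, and multiplying through by $M_{\rho_1 + \cdots + \rho_i}$
gives

$$M_{\rho_1 + \cdots + \rho_i} A^{\rho_j} B_j = \sum_{r=1}^{m} \sum_{q=1}^{\min[\rho_i - \kappa, \rho_r]} \alpha_{jrq} M_{\rho_1 + \cdots + \rho_i} A^{q-1} B_r + \sum_{\substack{r=1 \\ \rho_i - \kappa < \rho_r}}^{j-1} \beta_{jr} M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - \kappa} B_r \tag{29}$$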

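In the first expression on the right side, all summands can be shown to be zero (ignoring
the scalar coefficients). For $r = i$ the summands are those corresponding to

$$M_{\rho_1 + \cdots + \rho_i} B_i , \; \ldots , \; M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - \kappa - 1} B_i$$

and these terms are zero by (16) and the fact that $\kappa \ge 2$. For $r \ne i$ the summands are
those corresponding to

$$M_{\rho_1 + \cdots + \rho_i} B_r , \; \ldots , \; M_{\rho_1 + \cdots + \rho_i} A^{\min[\rho_i - \kappa, \rho_r] - 1} B_r$$

and again these are zero by (16). For the second expression on the right side of (29), the
$r = i$ term, if present (that is, if $i < j$), corresponds to

$$M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - \kappa} B_i$$

Again this is zero by (16) and $\kappa \ge 2$. Any term with $r \ne i$ that is present has the form

$$M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - \kappa} B_r$$

and since $\rho_i - \kappa < \rho_r$ this term is zero by (16). Thus (28) has been established for
$k = 0$.
Now assume that (28) holds for $k = 0, \ldots, K$, where $K < \kappa - 2$. Then for $k = K + 1$,
we multiply (13) by $M_{\rho_1 + \cdots + \rho_i} A^{K+1}$ and replace $\rho_j$ by $\rho_i - \kappa$ on the right side, to obtain

$$M_{\rho_1 + \cdots + \rho_i} A^{\rho_j + K + 1} B_j = \sum_{r=1}^{m} \sum_{q=1}^{\min[\rho_i - \kappa, \rho_r]} \alpha_{jrq} M_{\rho_1 + \cdots + \rho_i} A^{q + K} B_r + \sum_{\substack{r=1 \\ \rho_i - \kappa < \rho_r}}^{j-1} \beta_{jr} M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - \kappa + K + 1} B_r \tag{30}$$

In the first expression on the right side of (30), the summands for $r = i$ correspond to

$$M_{\rho_1 + \cdots + \rho_i} A^{K+1} B_i , \; \ldots , \; M_{\rho_1 + \cdots + \rho_i} A^{K + \rho_i - \kappa} B_i$$

Since $K + \rho_i - \kappa < \kappa - 2 + \rho_i - \kappa = \rho_i - 2$, these terms are zero by (16). The summands for
$r \ne i$ involve

$$M_{\rho_1 + \cdots + \rho_i} A^{K+1} B_r , \; \ldots , \; M_{\rho_1 + \cdots + \rho_i} A^{K + \min[\rho_i - \kappa, \rho_r]} B_r \tag{31}$$

But no power of $A$ in (31) is greater than $\rho_r + K$, so by the inductive hypothesis all
terms in (31) are zero.

Finally, for the second expression on the right side of (30), the $r = i$ term, if
present, is

$$M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - \kappa + K + 1} B_i$$

Since $\kappa > K + 2$, this term is zero by (16). For $r \ne i$ the power of $A$ present in the
summand is $K + 1 + \rho_i - \kappa < K + 1 + \rho_r$, that is, $K + 1 + \rho_i - \kappa \le K + \rho_r$. Therefore the
inductive hypothesis gives that such a term is zero since $r \ne i$. In summary this
induction establishes (27), and thus completes the proof.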

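□□□
Additional investigation of the matrix $R$ in (23) yields a further simplification of
the controller form.

13.10 Proposition Under the hypotheses of Theorem 13.9, the invertible $m \times m$ matrix
$R$ defined in (24) is an upper-triangular matrix with unity diagonal entries.

Proof The $(i, j)$-entry of $R$ is $M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - 1} B_j$, and for $i = j$ this is unity by the
identities in (16). For entries below the diagonal, it must be shown that

$$M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - 1} B_j = 0 , \quad i > j \tag{32}$$

To do this the identities in (26), established in the proof of Theorem 13.9, are used.
Specifically (26) can be written as

$$M_{\rho_1 + \cdots + \rho_i} A^{q} B_j = 0 ; \quad i, j = 1, \ldots, m ; \quad q = 0, \ldots, \rho_i - 2 \tag{33}$$

To begin an induction proof, fix $j = 1$ and suppose $i > 1$. If $\rho_i \le \rho_1$, then (32) follows
from (16). So suppose $\rho_i = \rho_1 + \kappa$, where $\kappa \ge 1$. Then (13) gives, after multiplying
through by $M_{\rho_1 + \cdots + \rho_i} A^{\kappa - 1}$,

$$M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - 1} B_1 = M_{\rho_1 + \cdots + \rho_i} A^{\kappa - 1} A^{\rho_1} B_1 = \sum_{r=1}^{m} \sum_{q=1}^{\min[\rho_1, \rho_r]} \alpha_{1rq} M_{\rho_1 + \cdots + \rho_i} A^{q + \kappa - 2} B_r$$

Since the highest power of $A$ among the summands is no greater than $\rho_1 + \kappa - 2 = \rho_i - 2$,
all the summands are zero by (33).
Now suppose (32) has been established for $j = 1, \ldots, J$. To show the case
$j = J + 1$, first note that if $i \ge J + 2$ and $\rho_i \le \rho_{J+1}$, then (32) is zero by (16). So suppose
$i \ge J + 2$ and $\rho_i = \rho_{J+1} + \kappa$, where $\kappa \ge 1$. Using (13) again gives

$$M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - 1} B_{J+1} = M_{\rho_1 + \cdots + \rho_i} A^{\kappa - 1} A^{\rho_{J+1}} B_{J+1}$$

$$= \sum_{r=1}^{m} \sum_{q=1}^{\min[\rho_{J+1}, \rho_r]} \alpha_{J+1, r, q} M_{\rho_1 + \cdots + \rho_i} A^{q + \kappa - 2} B_r + \sum_{\substack{r=1 \\ \rho_{J+1} < \rho_r}}^{J} \beta_{J+1, r} M_{\rho_1 + \cdots + \rho_i} A^{\rho_{J+1} + \kappa - 1} B_r$$

In the first expression on the right side, the highest power of $A$ is no greater than
$\rho_{J+1} + \kappa - 2 = \rho_i - 2$. Therefore (33) can be used to show that the first expression is zero.
For the second expression on the right side, any term that appears has the form (ignoring
the scalar coefficient)

$$M_{\rho_1 + \cdots + \rho_i} A^{\rho_{J+1} + \kappa - 1} B_r = M_{\rho_1 + \cdots + \rho_i} A^{\rho_i - 1} B_r , \quad r \le J$$

and these terms are zero by the inductive hypothesis. Therefore the proof is complete.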

While the special structure of the controller form state equation in (23) is not immediately transparent, it emerges on contemplating a few specific cases. It also becomes obvious that the special form of R revealed in Proposition 13.10 plays an important role in the structure of $B_oR$.

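13.11 Example For the case $n = 6$, $m = 2$, $\rho_1 = 4$, and $\rho_2 = 2$, (23) takes the form

$$\dot{z}(t) = \begin{bmatrix} 0&1&0&0&0&0 \\ 0&0&1&0&0&0 \\ 0&0&0&1&0&0 \\ \times&\times&\times&\times&\times&\times \\ 0&0&0&0&0&1 \\ \times&\times&\times&\times&\times&\times \end{bmatrix} z(t) + \begin{bmatrix} 0&0 \\ 0&0 \\ 0&0 \\ 1&\times \\ 0&0 \\ 0&1 \end{bmatrix} u(t)$$

$$y(t) = CP^{-1}z(t) \tag{34}$$

where "$\times$" denotes entries that are not necessarily either zero or one. (The output equation has no special structure, and simply is repeated from (23).)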

The controller form for a linear state equation is useful in the sequel for addressing
the multi-input, multi-output minimal realization problem, and the capabilities of linear
state feedback. Of course controller form when $m = 1$ and $\rho_1 = n$ is familiar from Example
2.5 and Example 10.11.

Observability
Next we address concepts related to observability and develop alternate criteria and a

special form for observable state equations. Proofs are left as errant exercises since they
are so similar to corresponding proofs in the controllability case.

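13.12 Theorem Suppose the observability matrix for the linear state equation (1) satisfies

$$\operatorname{rank} \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} = l$$

where $0 < l < n$. Then there exists an invertible $n \times n$ matrix Q such that

$$Q^{-1}AQ = \begin{bmatrix} \hat{A}_{11} & 0 \\ \hat{A}_{21} & \hat{A}_{22} \end{bmatrix}, \qquad CQ = \begin{bmatrix} \hat{C}_{11} & 0 \end{bmatrix} \tag{35}$$

where $\hat{A}_{11}$ is $l \times l$, $\hat{C}_{11}$ is $p \times l$, and

$$\operatorname{rank} \begin{bmatrix} \hat{C}_{11} \\ \hat{C}_{11}\hat{A}_{11} \\ \vdots \\ \hat{C}_{11}\hat{A}_{11}^{\,l-1} \end{bmatrix} = l$$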

The state variable change in Theorem 13.12 is constructed by choosing $n - l$ vectors in the nullspace of the observability matrix, and preceding them by $l$ vectors that yield a set of $n$ linearly independent vectors. The linear state equation resulting from $z(t) = Q^{-1}x(t)$ can be written as

$$\dot{z}_o(t) = \hat{A}_{11} z_o(t) + \hat{B}_{11} u(t)$$
$$\dot{z}_{no}(t) = \hat{A}_{21} z_o(t) + \hat{A}_{22} z_{no}(t) + \hat{B}_{21} u(t)$$
$$y(t) = \hat{C}_{11} z_o(t)$$

and is shown in Figure 13.13.
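The construction of Q can be tried numerically. The following is a minimal sketch, not the book's procedure: it assumes constant real A and C, and it picks an orthonormal basis from a singular value decomposition of the observability matrix, which is only one of many valid choices of Q. The function names and the example matrices are hypothetical.

```python
import numpy as np

def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(n-1) by rows."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

def unobservable_decomposition(A, C, tol=1e-10):
    """Return (Q, l): the last n - l columns of Q span the nullspace of the
    observability matrix, and the first l columns complete a basis."""
    O = observability_matrix(A, C)
    _, sv, Vh = np.linalg.svd(O)           # full SVD: Vh is n x n
    l = int(np.sum(sv > tol * sv[0]))      # numerical rank of O
    Q = Vh.T                               # orthogonal, so Q^{-1} = Q^T
    return Q, l

# Hypothetical example: the third state variable never reaches the output.
A = np.diag([1.0, 2.0, 3.0])
C = np.array([[1.0, 1.0, 0.0]])

Q, l = unobservable_decomposition(A, C)
A_hat = Q.T @ A @ Q   # expected block form [[A11, 0], [A21, A22]]
C_hat = C @ Q         # expected block form [C11, 0]
```

With an orthogonal Q the inverse is just the transpose, which keeps the sketch short; the zero blocks in (35) show up as entries at rounding-error level rather than exact zeros.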

13.13 Figure Observable and unobservable subsystems displayed by (35).

13.14 Theorem The linear state equation (1) is observable if and only if for every complex scalar $\lambda$ the only complex $n \times 1$ vector p that satisfies

$$Ap = \lambda p, \qquad Cp = 0$$

is $p = 0$.

A more compact locution for Theorem 13.14 is "observability is equivalent to


nonexistence of a right eigenvector of A that is orthogonal to the rows of C."
13.15 Theorem The linear state equation (1) is observable if and only if

$$\operatorname{rank} \begin{bmatrix} C \\ sI - A \end{bmatrix} = n \tag{36}$$

for every complex scalar s.

Exactly as in the corresponding controllability test, the rank condition in (36) need
be applied only for those values of s that are eigenvalues of A.
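The rank test (36), applied only at the eigenvalues of A, translates directly into a few lines of NumPy. This is a sketch under the assumption of constant real or complex matrices; the function name `is_observable_pbh` and the tolerance are illustrative choices, and a floating-point rank decision is only as reliable as that tolerance.

```python
import numpy as np

def is_observable_pbh(A, C, tol=1e-9):
    """Check rank [C; sI - A] = n at every eigenvalue s of A."""
    n = A.shape[0]
    for s in np.linalg.eigvals(A):
        M = np.vstack([C, s * np.eye(n) - A])
        if np.linalg.matrix_rank(M, tol=tol) < n:
            return False
    return True

# Hypothetical example with distinct eigenvalues 1 and 2.
A = np.diag([1.0, 2.0])
observable_case = is_observable_pbh(A, np.array([[1.0, 1.0]]))
unobservable_case = is_observable_pbh(A, np.array([[1.0, 0.0]]))
```

In the second case the eigenvector associated with the eigenvalue 2 is annihilated by C, which is exactly the situation excluded by Theorem 13.14.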

Observer Form
To develop a special form for linear state equations that is related to the concept of
observability, we assume (1) is observable, and that rank $C = p$. Then the observability
matrix for (1) can be written in row-partitioned form, where the $i^{th}$-block of p rows is

$$\begin{bmatrix} C_1A^{i-1} \\ \vdots \\ C_pA^{i-1} \end{bmatrix}$$

and $C_j$ denotes the $j^{th}$-row of C.

13.16 Definition For $j = 1, \ldots, p$, the $j^{th}$ observability index $\eta_j$ for the observable linear state equation (1) is the least integer such that row vector $C_jA^{\eta_j}$ is linearly dependent on vectors occurring above it in the observability matrix.

Specifically for each j, $\eta_j$ is the least integer for which there exist scalars $\alpha_{jrq}$ and $\beta_{jr}$ such that

$$C_jA^{\eta_j} = \sum_{r=1}^{p} \sum_{q=1}^{\min[\eta_j,\,\eta_r]} \alpha_{jrq}\, C_rA^{q-1} + \sum_{\substack{r=1 \\ \eta_j < \eta_r}}^{j-1} \beta_{jr}\, C_rA^{\eta_j} \tag{37}$$

As in the controllability case, our formulation is such that $\eta_1, \ldots, \eta_p \geq 1$, and
$\eta_1 + \cdots + \eta_p = n$. Also it can be shown that the observability indices are unaffected by
a change of state variables.
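The top-to-bottom search that defines the observability indices can be sketched in code. The version below is an illustrative implementation under stated assumptions, not taken from the text: it assumes constant A and C with rank C = p and an observable pair, and it uses a numerical rank test with a hypothetical tolerance. It relies on the fact that once $C_jA^k$ is dependent on the rows above it, so is every higher power for that j.

```python
import numpy as np

def observability_indices(A, C, tol=1e-10):
    """Scan rows C_1,...,C_p, C_1 A,...,C_p A, ... and count, for each j,
    the rows C_j A^k that are independent of all rows above them."""
    n, p = A.shape[0], C.shape[0]
    kept = np.zeros((0, n))          # independent rows found so far
    eta = [0] * p
    active = [True] * p              # False once C_j A^k became dependent
    rows = C.astype(float).copy()    # holds C A^k for the current k
    for _ in range(n):
        for j in range(p):
            if not active[j]:
                continue
            candidate = np.vstack([kept, rows[j]])
            if np.linalg.matrix_rank(candidate, tol=tol) > kept.shape[0]:
                kept = candidate
                eta[j] += 1
            else:
                active[j] = False
        rows = rows @ A
    return eta

# Hypothetical example with n = 3, p = 2.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
C = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
eta = observability_indices(A, C)
```

For this example the rows kept are $C_1$, $C_2$, and $C_1A$, so the indices sum to n as stated above.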
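Consider the invertible $n \times n$ matrix $N^{-1}$ defined in row-partitioned form with the $i^{th}$-block containing the $\eta_i$ rows

$$\begin{bmatrix} C_i \\ C_iA \\ \vdots \\ C_iA^{\eta_i - 1} \end{bmatrix}$$

Partition the inverse of $N^{-1}$ by columns as

$$N = \begin{bmatrix} N_1 & N_2 & \cdots & N_n \end{bmatrix}$$

Then the change of state variables of interest is specified by

$$Q = \begin{bmatrix} N_{\eta_1} & AN_{\eta_1} & \cdots & A^{\eta_1 - 1}N_{\eta_1} & \cdots & N_{\eta_1 + \cdots + \eta_p} & \cdots & A^{\eta_p - 1}N_{\eta_1 + \cdots + \eta_p} \end{bmatrix} \tag{38}$$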

On verification that Q is invertible, a computation much in the style of the proof of


Lemma 13.6, the main result can be stated as follows.

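13.17 Theorem Suppose the time-invariant linear state equation (1) satisfies rank $C = p$, and is observable with observability indices $\eta_1, \ldots, \eta_p$. Then the change of state variables $z(t) = Q^{-1}x(t)$, with Q as in (38), yields the observer form state equation

$$\dot{z}(t) = \left[ A_o^T + Q^{-1}VB_o^T \right] z(t) + Q^{-1}B\,u(t)$$
$$y(t) = SB_o^T z(t) \tag{39}$$

where $A_o$ and $B_o$ are the integrator coefficient matrices corresponding to $\eta_1, \ldots, \eta_p$, and where the $n \times p$ coefficient matrix V and the $p \times p$ invertible coefficient matrix S are given by

$$V = \begin{bmatrix} A^{\eta_1}N_{\eta_1} & \cdots & A^{\eta_p}N_{\eta_1 + \cdots + \eta_p} \end{bmatrix}, \qquad S = \begin{bmatrix} CA^{\eta_1 - 1}N_{\eta_1} & \cdots & CA^{\eta_p - 1}N_{\eta_1 + \cdots + \eta_p} \end{bmatrix} \tag{40}$$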

13.18 Proposition Under the hypotheses of Theorem 13.17, the invertible $p \times p$ matrix
S defined in (40) is lower triangular with unity diagonal entries.

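13.19 Example The special structure of an observer form state equation becomes apparent in specific cases. With $n = 7$, $p = 3$, $\eta_1 = \eta_2 = 3$, and $\eta_3 = 1$, (39) takes the form

$$\dot{z}(t) = \begin{bmatrix} 0&0&\times&0&0&\times&\times \\ 1&0&\times&0&0&\times&\times \\ 0&1&\times&0&0&\times&\times \\ 0&0&\times&0&0&\times&\times \\ 0&0&\times&1&0&\times&\times \\ 0&0&\times&0&1&\times&\times \\ 0&0&\times&0&0&\times&\times \end{bmatrix} z(t) + Q^{-1}B\,u(t)$$

$$y(t) = \begin{bmatrix} 0&0&1&0&0&0&0 \\ 0&0&\times&0&0&1&0 \\ 0&0&\times&0&0&\times&1 \end{bmatrix} z(t)$$

where $\times$ denotes entries that are not necessarily zero or one. Note that a unity observability index renders nonspecial a corresponding portion of the structure.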

EXERCISES
Exercise 13.1 Show that a single-input linear state equation of dimension $n = 2$,

$$\dot{x}(t) = Ax(t) + bu(t)$$

is controllable for every nonzero vector b if and only if the eigenvalues of A are complex. (For the
hearty a more strenuous exercise is to show that a single-input linear state equation of dimension
$n > 1$ is controllable for every nonzero b if and only if $n = 2$ and the eigenvalues of A are
complex.)

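Exercise 13.2 Consider the n-dimensional linear state equation

$$\dot{x}(t) = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} x(t) + \begin{bmatrix} B_{11} \\ 0 \end{bmatrix} u(t)$$

where $A_{11}$ is $q \times q$ and $B_{11}$ is $q \times m$ with rank q. Prove that this state equation is controllable if and only if the $(n - q)$-dimensional linear state equation

$$\dot{z}(t) = A_{22}z(t) + A_{21}v(t)$$

is controllable.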

Exercise 13.3

Suppose the linear state equations

= A,,x,,(t) + B,,u(t)

y(t) = C,,.v0(t)
and

= A,,xb(t) B,,u(f)

Show that if

are controllable, with p0 =

si

rank

A,, B,,

c,,

+ p,,

for each s that is an eigenvalue of A,,, then

x(t) =

B,,C,, Ab

x(i)

u(t)

is controllable. What does the last state equation represent?

Exercise 13.4 Show that if the time-invariant linear state equation

k(t) =Ax(t)

+ Bu(t)

y(t) = Cx(t) + Du(t)


with in p is controllable, and
rank

AB

CD

then the state equation

z(t)

A0 :(t)
0

B
+

u(t)

is controllable. Also prove the converse.

Exercise 13.5

Consider a Jordan form state equation

k(z) =Jx(i)

Bu(t)

in the case where I has a single eigenvalue of multiplicity ii. That is, I is block diagonal and each

block has the form

00
00
with the same A. Determine conditions on B that are necessary and sufficient for controllability.
Does your answer lead to a controllability criterion for general Jordan form state equations?


Exercise 13.6 In the proof of Lemma 13.6, show why it is important to proceed in order of
decreasing controllability indices by considering the case $n = 3$, $m = 2$, $\rho_1 = 2$ and $\rho_2 = 1$. Write
out the proof twice: first beginning with $B_1$ and then beginning with $B_2$.
Exercise 13.7 Determine the form of the matrix R in Theorem 13.10 for the case p = I.
p3 = 2. In particular which entries above the diagonal are nonzero?

Exercise 13.8

I Pt <P2
Exercise 13.9

= 3.

Prove that if the controllability indices for a linear state equation satisfy
p,,1. then the matrix R in Theorem 13.10 is the identity matrix.
By considering the example
0

00
00
1010

00

2200

010
001

1/200

show that in general the controllability indices cannot be placed in nondecreasing order by
relabeling input components.

Exercise 13.10 If P is an invertible $n \times n$ matrix and G is an invertible $m \times m$ matrix, show that the controllability indices for

$$\dot{x}(t) = Ax(t) + Bu(t)$$

(with rank $B = m$) are identical to the controllability indices for

$$\dot{z}(t) = P^{-1}APz(t) + P^{-1}Bu(t)$$

and are the same, up to reordering, as the controllability indices for

$$\dot{x}(t) = Ax(t) + BGu(t)$$

Hint: Write, for example,

$$\begin{bmatrix} BG & ABG \end{bmatrix} = \begin{bmatrix} B & AB \end{bmatrix} \begin{bmatrix} G & 0 \\ 0 & G \end{bmatrix}$$

and show that the number of linearly dependent columns in $A^kB$ that arise in the left-to-right search of $[\,B \;\; AB \;\; \cdots \;\; A^{n-1}B\,]$ is the same as the number of linearly dependent columns in $A^kBG$ that arise in the left-to-right search of $[\,BG \;\; ABG \;\; \cdots \;\; A^{n-1}BG\,]$.
Exercise 13.11 Suppose the linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$

is controllable. If K is $m \times n$, prove that

$$\dot{z}(t) = (A + BK)z(t) + Bv(t)$$

is controllable. Repeat the problem for the time-varying case, where the original state equation is
assumed to be controllable on $[t_0, t_f]$. Hint: While an explicit argument can be used in the time-invariant case, apparently a clever, indirect argument is required in the time-varying case.
Exercise 13.12 Use controller form to show the following. If the m-input linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$

is controllable (and rank $B = m$), then there exists an $m \times n$ matrix K and an $m \times 1$ vector b such
that the single-input linear state equation

$$\dot{x}(t) = (A + BK)x(t) + Bb\,u(t)$$

is controllable. Give an example to show that this cannot be accomplished in general with the
choice $K = 0$. Hint: Review Example 10.11.
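Exercise 13.13 For a linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t)$$

define the controllability index $\rho$ as the least nonnegative integer such that

$$\operatorname{rank} \begin{bmatrix} B & AB & \cdots & A^{\rho-1}B \end{bmatrix} = \operatorname{rank} \begin{bmatrix} B & AB & \cdots & A^{\rho}B \end{bmatrix}$$

Prove that

(a) for any $k \geq \rho$, $\operatorname{rank} [\,B \;\; AB \;\; \cdots \;\; A^{\rho-1}B\,] = \operatorname{rank} [\,B \;\; AB \;\; \cdots \;\; A^{k}B\,]$,

(b) if rank $B = r > 0$, then $1 \leq \rho \leq n - r + 1$,

(c) the controllability index is invariant under invertible state variable changes.

State the corresponding results for the corresponding notion of an observability index for the state equation.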

Exercise 13.14

Continuing Exercise 13.13. show that if

[BAB

rank

=s

then there is an invertible n x n matrix P such that

P'AP

A11

o A,

A21

A22 A23

P'B =

a,
0

where the s-dimensional state equation

y(r) = C112(t)
is controllable, observable, and has the same input-output behavior as the original n-dimensional
linear state equation.
Exercise 13.15 Prove that the linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$

is controllable if and only if the only $n \times n$ matrix X that satisfies

$$XA = AX, \qquad XB = 0$$

is $X = 0$. Hint: Employ right and left eigenvectors of A.
Exercise 13.16 Show that the time-invariant, single-input, single-output linear state equation

$$\dot{x}(t) = Ax(t) + bu(t)$$
$$y(t) = cx(t) + du(t)$$

is controllable and observable if and only if the matrices A and

$$\begin{bmatrix} A & b \\ c & d \end{bmatrix}$$

have no eigenvalue in common.


Exercise 13.17 Show that the discrete-time, time-invariant linear state equation

$$x(k+1) = Ax(k) + Bu(k)$$

is reachable and exponentially stable if and only if the continuous-time, time-invariant linear state
equation

$$\dot{x}(t) = (A - I)(A + I)^{-1}x(t) + (A + I)^{-1}Bu(t)$$

is controllable and exponentially stable. (Obviously this is intended for readers covering both time domains.)

NOTES
Note 13.1 The state-variable changes yielding the block triangular forms in Theorem 13.1 and
Theorem 13.12 can be combined (in a nonobvious way) into a variable change that displays a
linear state equation in terms of 4 component state equations that are, respectively, controllable

and observable, controllable but not observable, observable but not controllable, and neither
controllable nor observable. References for this canonical structure theorem are cited in Note
10.2, and the result is proved by geometric methods in Chapter 18.

Note 13.2 The eigenvector test for controllability in Theorem 13.3 is attributed to W. Hahn in

R.E. Kalman, "Lectures on controllability and observability," Centro Internazionale Matematico
Estivo Seminar Notes, Bologna, Italy, 1968

The rank and eigenvector tests for controllability and observability are sometimes called "PBH
tests" because original sources include

V.M. Popov, Hyperstability of Control Systems, Springer-Verlag, Berlin, 1973 (translation of a
1966 version in Rumanian)

V. Belevitch, Classical Network Theory, Holden-Day, San Francisco, 1968

M.L.J. Hautus, "Controllability and observability conditions for linear autonomous systems,"
Proceedings of the Koninklijke Akademie van Wetenschappen, Serie A, Vol. 72, pp. 443-448,
1969

Note 13.3 Controller form is based on

D.G. Luenberger, "Canonical forms for linear multivariable systems," IEEE Transactions on
Automatic Control, Vol. 12, pp. 290-293, 1967

Our different notation is intended to facilitate explicit, detailed derivation. (In most sources on the
subject, phrases such as "tedious but straightforward calculations show" appear, perhaps for
humanitarian reasons.) When $m = 1$ the transformation to controller form is unique, but in
general it is not. That is, there are P's other than the one we construct that yield controller form,
with different x's. Also, possibly some x's in a particular case, say Example 13.11, are guaranteed
to be zero, depending on inequalities among the controllability indices and the specific vectors
that appear in the linear-dependence relation (13). Thus, in technical terms, controller form is not
a canonical form for controllable linear state equations (unless $m = p = 1$). Extensive discussion
of these issues, including the precise mathematical meaning of canonical form, can be found in
Chapter 6 of

T. Kailath, Linear Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1980

See also

V.M. Popov, "Invariant description of linear, time-invariant controllable systems," SIAM Journal
on Control and Optimization, Vol. 10, No. 2, pp. 252-264, 1972

Of course similar remarks apply to observer form.

Note 13.4 Controller and observer forms are convenient, elementary theoretical tools for
exploring the algebraic structure of linear state equations and linear feedback problems, and we
apply them several times in the sequel. However, dispensing with any technical gloss, the
numerical properties of such forms can be miserable, even in single-input or single-output cases.
Consult

C. Kenney, A.J. Laub, "Controllability and stability radii for companion form systems,"
Mathematics of Control, Signals, and Systems, Vol. 1, No. 3, pp. 239-256, 1988

Note 13.5 Standard forms analogous to controller and observer forms are available for time-varying linear state equations. The basic assumptions involve strong types of controllability and
observability, much like the instantaneous controllability and instantaneous observability of
Chapter 11. For a start consider the papers

L.M. Silverman, "Transformation of time-variable systems to canonical (phase-variable) form,"
IEEE Transactions on Automatic Control, Vol. 11, pp. 300-303, 1966

R.S. Bucy, "Canonical forms for multivariable systems," IEEE Transactions on Automatic
Control, Vol. 13, No. 5, pp. 567-569, 1968

K. Ramar, B. Ramaswami, "Transformation of time-variable multi-input systems to a canonical
form," IEEE Transactions on Automatic Control, Vol. 16, No. 4, pp. 371-374, 1971

A. Ilchmann, "Time-varying linear systems and invariants of system equivalence," International
Journal of Control, Vol. 42, No. 4, pp. 759-790, 1985

14
LINEAR FEEDBACK

The theory of linear systems provides the basis for linear control theory. In this chapter
we introduce concepts and results of linear control theory for time-varying linear state
equations. In addition the controller form in Chapter 13 is applied to prove the
celebrated eigenvalue assignment capability of linear feedback in the time-invariant
case.

Linear control theory involves modification of the behavior of a given m-input, p-output, n-dimensional linear state equation

$$\dot{x}(t) = A(t)x(t) + B(t)u(t)$$
$$y(t) = C(t)x(t) \tag{1}$$

in this context often called the plant or open-loop state equation, by applying linear
feedback. As shown in Figure 14.1, linear state feedback replaces the plant input $u(t)$
by an expression of the form

$$u(t) = K(t)x(t) + N(t)r(t) \tag{2}$$

where $r(t)$ is the new name for the $m \times 1$ input signal. Convenient default assumptions
are that the $m \times n$ matrix function $K(t)$ and the $m \times m$ matrix function $N(t)$ are defined
and continuous for all t. Substituting (2) into (1) gives a new linear state equation,
called the closed-loop state equation, described by

$$\dot{x}(t) = [A(t) + B(t)K(t)]x(t) + B(t)N(t)r(t)$$
$$y(t) = C(t)x(t) \tag{3}$$

Similarly linear output feedback takes the form

$$u(t) = L(t)y(t) + N(t)r(t) \tag{4}$$

where again coefficients are assumed to be defined and continuous for all t. Output
feedback, clearly a special case of state feedback, is diagramed in Figure 14.2. The
resulting closed-loop state equation is described by

$$\dot{x}(t) = [A(t) + B(t)L(t)C(t)]x(t) + B(t)N(t)r(t)$$
$$y(t) = C(t)x(t) \tag{5}$$

14.1 Figure Structure of linear state feedback.
One important (if obvious) feature of either type of linear feedback is that the
closed-loop state equation remains a linear state equation. If the coefficient matrices in
(2) or (4) are constant, then the feedback is called time invariant. In any case the
feedback is called static because at any t the value of $u(t)$ depends only on the values
of $r(t)$ and $x(t)$ or $y(t)$ at that same time. Dynamic feedback where $u(t)$ is the output
of a linear state equation with inputs $r(t)$ and $x(t)$ or $y(t)$ is considered in Chapter 15.
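For constant coefficients, the substitutions that lead to the closed-loop state equations amount to two matrix products. The sketch below is illustrative only: the function names and the numerical plant are hypothetical, and it simply makes concrete that output feedback with gain L is state feedback with gain K = LC.

```python
import numpy as np

def state_feedback(A, B, K, N):
    """Closed-loop coefficients for u = K x + N r: returns (A + B K, B N)."""
    return A + B @ K, B @ N

def output_feedback(A, B, C, L, N):
    """Closed-loop coefficients for u = L y + N r, i.e. state feedback with K = L C."""
    return state_feedback(A, B, L @ C, N)

# Hypothetical constant-coefficient plant, chosen only for illustration.
A = np.array([[0.0, 1.0], [2.0, -1.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
L = np.array([[-3.0]])
N = np.array([[1.0]])

A_cl, B_cl = output_feedback(A, B, C, L, N)
```

Here `A_cl` equals A + BLC and `B_cl` equals BN; the output equation is unchanged by either kind of static feedback.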

14.2 Figure

Structure of linear output feedback.

Effects of Feedback
We begin the discussion by considering the relationship between the closed-loop state
equation and the plant. This is the initial step in describing what can be achieved by

feedback. The available answers turn out to be disappointingly complicated for the
general case in that a convenient, explicit relationship is not obtained. However matters
are more encouraging in the time-invariant case, particularly when Laplace transform
representations are used.

Several places in the course of the development we encounter the inverse of a
matrix of the form $I - F(s)$, where $F(s)$ is a matrix of strictly-proper rational functions.
To justify invertibility note that $\det[I - F(s)]$ is a rational function of s, and it must be
a nonzero rational function since $\|F(s)\| \to 0$ as $|s| \to \infty$. Therefore $[I - F(s)]^{-1}$
exists for all but a finite number of values of s, and it is a matrix of rational functions.
(This argument applies also to the familiar case of $(sI - A)^{-1} = (1/s)(I - A/s)^{-1}$,
though a more explicit reasoning is used in Chapter 5.)
First the effect of state feedback on the transition matrix is considered.

14.3 Theorem If $\Phi_A(t, \tau)$ is the transition matrix for the open-loop state equation (1) and $\Phi_{A+BK}(t, \tau)$ is the transition matrix for the closed-loop state equation (3) resulting from state feedback (2), then

$$\Phi_{A+BK}(t, \tau) = \Phi_A(t, \tau) + \int_\tau^t \Phi_A(t, \sigma)B(\sigma)K(\sigma)\Phi_{A+BK}(\sigma, \tau)\, d\sigma \tag{6}$$

If the open-loop state equation and state feedback both are time-invariant, then the
Laplace transform of the closed-loop matrix exponential can be expressed in terms of the
Laplace transform of the open-loop matrix exponential as

$$(sI - A - BK)^{-1} = \left[ I - (sI - A)^{-1}BK \right]^{-1}(sI - A)^{-1} \tag{7}$$

Proof To verify (6), suppose $\tau$ is arbitrary but fixed. Then evaluation of the right
side of (6) at $t = \tau$ yields the identity matrix. Furthermore differentiation of the right
side of (6) with respect to t yields

$$\frac{d}{dt}\left[ \Phi_A(t, \tau) + \int_\tau^t \Phi_A(t, \sigma)B(\sigma)K(\sigma)\Phi_{A+BK}(\sigma, \tau)\, d\sigma \right]$$
$$= A(t)\Phi_A(t, \tau) + \Phi_A(t, t)B(t)K(t)\Phi_{A+BK}(t, \tau) + \int_\tau^t A(t)\Phi_A(t, \sigma)B(\sigma)K(\sigma)\Phi_{A+BK}(\sigma, \tau)\, d\sigma$$
$$= A(t)\left[ \Phi_A(t, \tau) + \int_\tau^t \Phi_A(t, \sigma)B(\sigma)K(\sigma)\Phi_{A+BK}(\sigma, \tau)\, d\sigma \right] + B(t)K(t)\Phi_{A+BK}(t, \tau)$$

Therefore the right side of (6) satisfies the matrix differential equation that uniquely
characterizes $\Phi_{A+BK}(t, \tau)$, and this argument applies for any value of $\tau$.

For a time-invariant linear state equation, rewriting (6) in terms of matrix
exponentials, with $\tau = 0$, gives

$$e^{(A+BK)t} = e^{At} + \int_0^t e^{A(t-\sigma)}BK\,e^{(A+BK)\sigma}\, d\sigma$$

Taking Laplace transforms, using in particular the convolution property, yields

$$(sI - A - BK)^{-1} = (sI - A)^{-1} + (sI - A)^{-1}BK(sI - A - BK)^{-1} \tag{8}$$

an expression that easily rearranges to (7).
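The resolvent identities (7) and (8) are easy to spot-check numerically. The following sketch is only a sanity check under stated assumptions: A, B, and K are random constant matrices from a seeded generator, and the complex test point s is a hypothetical value chosen far from the eigenvalues so that every inverse exists.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
K = rng.standard_normal((m, n))
s = 100.0 + 3.0j          # test point well away from the spectra of A and A + BK

I = np.eye(n)
open_resolvent = np.linalg.inv(s * I - A)             # (sI - A)^{-1}
closed_resolvent = np.linalg.inv(s * I - A - B @ K)   # (sI - A - BK)^{-1}

# Right side of (7): [I - (sI - A)^{-1} B K]^{-1} (sI - A)^{-1}
rhs7 = np.linalg.inv(I - open_resolvent @ B @ K) @ open_resolvent
# Right side of (8): (sI - A)^{-1} + (sI - A)^{-1} B K (sI - A - BK)^{-1}
rhs8 = open_resolvent + open_resolvent @ B @ K @ closed_resolvent

err7 = np.max(np.abs(closed_resolvent - rhs7))
err8 = np.max(np.abs(closed_resolvent - rhs8))
```

Both discrepancies should be at rounding-error level; agreement at one point is of course not a proof, only a check that the signs and the order of the factors have been transcribed consistently.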


A result similar to Theorem 14.3 holds for static linear output feedback upon
replacing K (t) by L (t)C (t). For output feedback a relation between the input-output
representations for the plant and closed-loop state equation also can be obtained. Again
the relation is implicit, in general, though convenient formulas can be derived in the
time-invariant case. (It is left as an exercise to show for state feedback that (6) and (7)
yield only cumbersome expressions involving the open-loop and closed-loop weighting
patterns or transfer functions.)

14.4 Theorem If $G(t, \tau)$ is the weighting pattern of the open-loop state equation (1) and $\hat{G}(t, \tau)$ is the weighting pattern of the closed-loop state equation (5) resulting from static output feedback (4), then

$$\hat{G}(t, \tau) = G(t, \tau)N(\tau) + \int_\tau^t G(t, \sigma)L(\sigma)\hat{G}(\sigma, \tau)\, d\sigma \tag{9}$$

If the open-loop state equation and output feedback are time invariant, then the transfer
function of the closed-loop state equation can be expressed in terms of the transfer
function of the open-loop state equation by

$$\hat{G}(s) = \left[ I - G(s)L \right]^{-1}G(s)N \tag{10}$$

Proof In (6), we can replace $K(\sigma)$ by $L(\sigma)C(\sigma)$ to reflect output feedback. Then
premultiplying by $C(t)$ and postmultiplying by $B(\tau)N(\tau)$ gives (9). Specializing (9) to
the time-invariant case, with $\tau = 0$, the Laplace transform of the resulting impulse-response relation gives

$$\hat{G}(s) = G(s)N + G(s)L\hat{G}(s)$$

From this (10) follows easily.

□□□

An alternate expression for $\hat{G}(s)$ in (10) can be derived from the time-invariant
version of the diagram in Figure 14.2. Using Laplace transforms we write

$$[I - LG(s)]U(s) = NR(s)$$
$$Y(s) = G(s)U(s)$$

This gives

$$\hat{G}(s) = G(s)\left[ I - LG(s) \right]^{-1}N \tag{11}$$

Of course in the single-input, single-output case, both (10) and (11) collapse to

$$\hat{G}(s) = \frac{G(s)}{1 - G(s)L}\,N$$

In a different notation, with different sign conventions for feedback, this is a familiar
formula in elementary control systems.
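The two closed-loop transfer function expressions (10) and (11) can be compared against a direct evaluation of the closed-loop state equation (5). This is a numerical sketch under assumptions: all coefficient matrices are random constants, the helper name `transfer` is hypothetical, and the evaluation point s is chosen far from any pole.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p = 4, 2, 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
L = rng.standard_normal((m, p))
N = rng.standard_normal((m, m))
s = 100.0 + 3.0j   # hypothetical test point away from open- and closed-loop poles

def transfer(A, B, C, s):
    """Evaluate C (sI - A)^{-1} B at the complex point s."""
    return C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B)

G = transfer(A, B, C, s)                          # open-loop transfer function
G_closed = transfer(A + B @ L @ C, B @ N, C, s)   # directly from (5)

via_10 = np.linalg.solve(np.eye(p) - G @ L, G @ N)    # [I - G L]^{-1} G N
via_11 = G @ np.linalg.solve(np.eye(m) - L @ G, N)    # G [I - L G]^{-1} N

err10 = np.max(np.abs(G_closed - via_10))
err11 = np.max(np.abs(G_closed - via_11))
```

Note that (10) inverts a p x p matrix while (11) inverts an m x m matrix, so in a computation one would normally pick whichever is smaller.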

State Feedback Stabilization

One of the first specific objectives that arises in considering the capabilities of feedback
involves stabilization of a given plant. The basic problem is that of choosing a state
feedback gain $K(t)$ such that the resulting closed-loop state equation is uniformly
exponentially stable. (In addressing uniform exponential stability, the input gain $N(t)$
plays no role. However if we consider any $N(t)$ that is bounded, then boundedness
assumptions on the plant coefficient matrices $B(t)$ and $C(t)$ yield uniform bounded-input, bounded-output stability, as discussed in Chapter 12.) Despite the complicated,
implicit relation between the open- and closed-loop transition matrices, it turns out that
an explicitly-defined (though difficult to compute) state feedback that accomplishes
stabilization is available, under suitably strong hypotheses.

Actually somewhat more than uniform exponential stability can be achieved, and
for this purpose we slightly refine Definition 6.5 on uniform exponential stability by
attaching a lower bound on the decay rate.

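14.5 Definition The linear state equation (1) is called uniformly exponentially stable with
rate $\lambda$, where $\lambda$ is a positive constant, if there exists a constant $\gamma$ such that for any
$t_0$ and $x_0$ the corresponding solution of (1) satisfies

$$\|x(t)\| \leq \gamma e^{-\lambda(t - t_0)}\|x_0\|, \quad t \geq t_0$$

14.6 Lemma The linear state equation (1) is uniformly exponentially stable with rate
$\lambda + \alpha$, where $\lambda$ and $\alpha$ are positive constants, if the linear state equation

$$\dot{z}(t) = [A(t) + \alpha I]z(t) \tag{12}$$

is uniformly exponentially stable with rate $\lambda$.

Proof It is easy to show by differentiation that $x(t)$ satisfies

$$\dot{x}(t) = A(t)x(t), \quad x(t_0) = x_0$$

if and only if $z(t) = e^{\alpha(t - t_0)}x(t)$ satisfies

$$\dot{z}(t) = [A(t) + \alpha I]z(t), \quad z(t_0) = x_0$$

Now assume there is a $\gamma$ such that for any $x_0$ and $t_0$ the resulting solution of (12) satisfies

$$\|z(t)\| \leq \gamma e^{-\lambda(t - t_0)}\|x_0\|, \quad t \geq t_0$$

Then, substituting for $z(t)$,

$$\|e^{\alpha(t - t_0)}x(t)\| = e^{\alpha(t - t_0)}\|x(t)\| \leq \gamma e^{-\lambda(t - t_0)}\|x_0\|, \quad t \geq t_0$$

Multiplying through by $e^{-\alpha(t - t_0)}$ immediately implies that (1) is uniformly exponentially stable with rate $\lambda + \alpha$.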

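The following stabilization result relies on a strengthened form of controllability
for the state equation (1). Recalling from Chapter 9 the controllability Gramian

$$W(t_0, t_f) = \int_{t_0}^{t_f} \Phi(t_0, \sigma)B(\sigma)B^T(\sigma)\Phi^T(t_0, \sigma)\, d\sigma \tag{13}$$

we use also the related notation

$$W_\alpha(t_0, t_f) = \int_{t_0}^{t_f} 2e^{4\alpha(t_0 - \sigma)}\Phi(t_0, \sigma)B(\sigma)B^T(\sigma)\Phi^T(t_0, \sigma)\, d\sigma \tag{14}$$

for $\alpha > 0$.

14.7 Theorem For the linear state equation (1), suppose there exist positive constants
$\delta$, $\varepsilon_1$, and $\varepsilon_2$ such that

$$\varepsilon_1 I \leq W(t, t + \delta) \leq \varepsilon_2 I \tag{15}$$

for all t. Then given a positive constant $\alpha$ the state feedback gain

$$K(t) = -B^T(t)W_\alpha^{-1}(t, t + \delta) \tag{16}$$

is such that the resulting closed-loop state equation is uniformly exponentially stable
with rate $\alpha$.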

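Proof Comparing the quadratic forms $x^TW_\alpha(t, t + \delta)x$ and $x^TW(t, t + \delta)x$, using
the definitions (13) and (14), yields

$$2e^{-4\alpha\delta}\,x^TW(t, t + \delta)x \leq x^TW_\alpha(t, t + \delta)x \leq 2x^TW(t, t + \delta)x$$

for all t. Therefore (15) implies

$$2\varepsilon_1 e^{-4\alpha\delta} I \leq W_\alpha(t, t + \delta) \leq 2\varepsilon_2 I \tag{17}$$

for all t, and in particular existence of the inverse in (16) is obvious. Next we show that
the linear state equation

$$\dot{z}(t) = \left[ A(t) - B(t)B^T(t)W_\alpha^{-1}(t, t + \delta) + \alpha I \right] z(t) \tag{18}$$

is uniformly exponentially stable by applying Theorem 7.4 with the choice

$$Q(t) = W_\alpha^{-1}(t, t + \delta) \tag{19}$$

Obviously $Q(t)$ is symmetric and continuously differentiable. From (17),

$$\frac{1}{2\varepsilon_2} I \leq Q(t) \leq \frac{e^{4\alpha\delta}}{2\varepsilon_1} I \tag{20}$$

for all t. Therefore it remains only to show that there is a positive constant $\nu$ such that

$$\left[ A(t) - B(t)B^T(t)Q(t) + \alpha I \right]^T Q(t) + Q(t)\left[ A(t) - B(t)B^T(t)Q(t) + \alpha I \right] + \dot{Q}(t) \leq -\nu I \tag{21}$$

for all t. Using the formula for derivative of an inverse,

$$\dot{Q}(t) = -Q(t)\left[ \frac{d}{dt}W_\alpha(t, t + \delta) \right] Q(t)$$
$$= -Q(t)\left[ 2e^{-4\alpha\delta}\Phi(t, t + \delta)B(t + \delta)B^T(t + \delta)\Phi^T(t, t + \delta) - 2B(t)B^T(t) + 4\alpha Q^{-1}(t) + A(t)Q^{-1}(t) + Q^{-1}(t)A^T(t) \right] Q(t)$$

Substituting this expression into (21) shows that the left side of (21) is bounded above
(in the matrix sign-definite sense) by $-2\alpha Q(t)$. Using (20) then gives that an appropriate
choice for $\nu$ is $\alpha/\varepsilon_2$. Thus uniform exponential stability of (18) (at some positive rate)
is established. Invoking Lemma 14.6 completes the proof.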
For a time-invariant linear state equation,

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t) \tag{22}$$

it is not difficult to specialize Theorem 14.7 to obtain a time-varying linear state
feedback gain that stabilizes. However a profitable alternative is available by applying
algebraic results related to constant-Q Lyapunov functions that are the bases for some
exercises in earlier chapters. Furthermore this alternative directly yields a constant
state-feedback gain. For blithe spirits who have not worked exercises cited in the proof,
another argument is outlined in Exercise 14.5.
14.8 Theorem Suppose the time-invariant linear state equation (22) is controllable, and let

$$\alpha_m = \|A\|$$

Then for any $\alpha > \alpha_m$ the constant state feedback gain

$$K = -B^TQ^{-1} \tag{23}$$

where Q is the positive definite solution of

$$(A + \alpha I)Q + Q(A + \alpha I)^T = BB^T \tag{24}$$

is such that the resulting closed-loop state equation is exponentially stable with rate $\alpha$.

Proof Suppose $\alpha > \alpha_m$ is fixed. We first show that the state equation

$$\dot{z}(t) = -(A + \alpha I)z(t) + Bv(t) \tag{25}$$

is exponentially stable. But this follows from Theorem 7.4 with the choice $Q(t) = I$.
Indeed the easy calculation

$$-(A + \alpha I)^TQ - Q(A + \alpha I) = -2\alpha I - A - A^T \leq -2\alpha I + 2\alpha_m I$$

shows that an appropriate choice for $\nu$ is $2(\alpha - \alpha_m)$.

Therefore, using Exercise 9.7 to conclude that (25) also is controllable, Exercise
9.8 gives that there exists a symmetric, positive-definite Q such that (24) is satisfied.
Then $(A + \alpha I - BB^TQ^{-1})$ satisfies

$$(A + \alpha I - BB^TQ^{-1})Q + Q(A + \alpha I - BB^TQ^{-1})^T = (A + \alpha I)Q + Q(A + \alpha I)^T - 2BB^T = -BB^T$$

By Exercise 13.11 the linear state equation

$$\dot{z}(t) = (A + \alpha I - BB^TQ^{-1})z(t) + Bv(t) \tag{26}$$

is controllable also, and thus by Exercise 9.9 we have that (26) is exponentially stable.
Finally Lemma 14.6 gives that the state equation

$$\dot{x}(t) = (A - BB^TQ^{-1})x(t)$$

is exponentially stable with rate $\alpha$, and of course this is the closed-loop state equation
resulting from the state feedback gain (23).
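The gain in Theorem 14.8 can be computed for a small example by solving (24) as a linear system in the entries of Q. The sketch below makes several assumptions that are not in the text: the induced 2-norm is used for $\|A\|$, equation (24) is vectorized with Kronecker products (a dedicated Lyapunov solver would be preferable for larger n), and the function name `stabilizing_gain` and the example plant are hypothetical.

```python
import numpy as np

def stabilizing_gain(A, B, alpha):
    """Solve (A + alpha I) Q + Q (A + alpha I)^T = B B^T, return K = -B^T Q^{-1} and Q."""
    n = A.shape[0]
    M = A + alpha * np.eye(n)
    # Column-stacking identities: vec(M Q) = (I kron M) vec(Q), vec(Q M^T) = (M kron I) vec(Q)
    lhs = np.kron(np.eye(n), M) + np.kron(M, np.eye(n))
    q = np.linalg.solve(lhs, (B @ B.T).reshape(-1, order="F"))
    Q = q.reshape((n, n), order="F")
    Q = 0.5 * (Q + Q.T)              # remove rounding asymmetry
    K = -B.T @ np.linalg.inv(Q)
    return K, Q

# Hypothetical unstable, controllable plant with ||A|| = 1 (eigenvalues +1 and -1).
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
alpha = 1.5                          # any alpha > ||A|| is allowed by the theorem

K, Q = stabilizing_gain(A, B, alpha)
closed_loop_eigs = np.linalg.eigvals(A + B @ K)
```

For this example a hand calculation gives K = [-9 -6], so the closed-loop characteristic polynomial is $\lambda^2 + 6\lambda + 8$ with eigenvalues -2 and -4, both to the left of $-\alpha$.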

Eigenvalue Assignment
Stabilization in the time-invariant case can be developed in several directions to further

show what can be accomplished by state feedback. Summoning controller form from
Chapter 13, we quickly provide one famous result as an illustration. Given a set of
desired eigenvalues, the objective is to compute a constant state feedback gain K such
that the closed-loop state equation

$$\dot{x}(t) = (A + BK)x(t) \tag{27}$$

has precisely these eigenvalues. Of course in almost all situations eigenvalues are
specified to have negative real parts for exponential stability. The capability of assigning
specific values for the real parts directly influences the rate of decay of the zero-input

response component, and assigning imaginary parts influences the frequencies of


oscillation that occur.

Because of the minor, fussy issue that eigenvalues of a real-coefficient state


equation must occur in complex-conjugate pairs, it is convenient to specify, instead of
eigenvalues, a real-coefficient, degree-n characteristic polynomial for (27).

14.9 Theorem Suppose the time-invariant linear state equation (22) is controllable and
rank $B = m$. Given any monic degree-n polynomial $p(\lambda)$ there is a constant state
feedback gain K such that $\det(\lambda I - A - BK) = p(\lambda)$.

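Proof First suppose that the controllability indices of (22) are $\rho_1, \ldots, \rho_m$, and the
state variable change to controller form described in Theorem 13.9 has been applied.
Then the controller-form coefficient matrices are

$$PAP^{-1} = A_o + B_oUP^{-1}, \qquad PB = B_oR$$

and given $p(\lambda) = \lambda^n + p_{n-1}\lambda^{n-1} + \cdots + p_0$ a feedback gain $K_{CF}$ for the new state
equation can be computed as follows. Clearly

$$PAP^{-1} + PBK_{CF} = A_o + B_oUP^{-1} + B_oRK_{CF} = A_o + B_o\left( UP^{-1} + RK_{CF} \right) \tag{28}$$

Reviewing the form of the integrator coefficient matrices $A_o$ and $B_o$, the $i^{th}$-row of
$UP^{-1} + RK_{CF}$ becomes row $\rho_1 + \cdots + \rho_i$ of $PAP^{-1} + PBK_{CF}$. With this observation
there are several ways to proceed. One is to set

$$K_{CF} = -R^{-1}UP^{-1} + R^{-1}\begin{bmatrix} e_{\rho_1 + 1} \\ e_{\rho_1 + \rho_2 + 1} \\ \vdots \\ e_{\rho_1 + \cdots + \rho_{m-1} + 1} \\ -p_0 \;\; -p_1 \;\; \cdots \;\; -p_{n-1} \end{bmatrix}$$

where $e_j$ denotes the $j^{th}$-row of the $n \times n$ identity matrix. Then from (28),

$$PAP^{-1} + PBK_{CF} = A_o + B_o\begin{bmatrix} e_{\rho_1 + 1} \\ e_{\rho_1 + \rho_2 + 1} \\ \vdots \\ e_{\rho_1 + \cdots + \rho_{m-1} + 1} \\ -p_0 \;\; -p_1 \;\; \cdots \;\; -p_{n-1} \end{bmatrix} = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -p_0 & -p_1 & \cdots & -p_{n-1} \end{bmatrix}$$

Either by straightforward calculation or review of Example 10.11 it can be shown that
$PAP^{-1} + PBK_{CF}$ has the desired characteristic polynomial. Of course the characteristic
polynomial of $A + BK_{CF}P$ is the same as the characteristic polynomial of

$$P(A + BK_{CF}P)P^{-1} = PAP^{-1} + PBK_{CF}$$

Therefore the choice

$$K = K_{CF}P \tag{29}$$

is such that the characteristic polynomial of $A + BK$ is $p(\lambda)$.

□□□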
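For the single-input case the same conclusion can be reached with a shorter computation. The sketch below does not implement the controller-form construction of the proof; it swaps in Ackermann's formula, adapted to the sign convention $u = Kx$ used here, and it assumes $m = 1$ and a controllable pair. The function name and the example are hypothetical.

```python
import numpy as np

def assign_single_input(A, b, coeffs):
    """Return K (1 x n) with det(lambda I - A - b K) = p(lambda), where
    p(lambda) = lambda^n + coeffs[n-1] lambda^(n-1) + ... + coeffs[0]."""
    n = A.shape[0]
    # Controllability matrix [b, Ab, ..., A^(n-1) b]
    ctrb = np.hstack([np.linalg.matrix_power(A, k) @ b for k in range(n)])
    # p(A) = A^n + p_{n-1} A^(n-1) + ... + p_0 I
    p_of_A = np.linalg.matrix_power(A, n)
    for k, pk in enumerate(coeffs):
        p_of_A = p_of_A + pk * np.linalg.matrix_power(A, k)
    # Last row of ctrb^{-1}, obtained by solving ctrb^T x = e_n
    last_row = np.linalg.solve(ctrb.T, np.eye(n)[:, -1])
    return -(last_row @ p_of_A).reshape(1, n)

# Hypothetical double-integrator plant and desired p(lambda) = lambda^2 + 3 lambda + 2.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
b = np.array([[0.0], [1.0]])
K = assign_single_input(A, b, [2.0, 3.0])
char_poly = np.poly(A + b @ K)   # coefficients, highest power first
```

For this example the gain is K = [-2 -3], so the closed-loop matrix is already in the companion form displayed in the proof. As Note 13.4 warns for companion forms, computations of this kind can be numerically delicate for larger n.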
The input gain N(t) has not participated in stabilization or eigenvalue placement,
obviously because these objectives pertain to the zero-input response of the closed-loop
state equation. The gain N (t) becomes important when zero-state response behavior is
an issue. One illustration is provided by Exercise 2.8, and another occurs in the next
section.

Noninteracting Control
The stabilization and eigenvalue placement problems employ linear state feedback to
change the dynamical behavior of a given plant: asymptotic character of the zero-input
response, overall speed of response, and so on. Another capability of feedback is that
structural features of the zero-state response of the closed-loop state equation can be
changed. As an illustration we consider a plant of the form (1) with the additional
assumption that $p = m$, and discuss the problem of noninteracting control. This problem
involves using linear state feedback to achieve two input-output objectives on a specified
time interval $[t_0, t_f]$. First the closed-loop state equation (3) should be such that for $i \neq j$
the $j^{th}$-input component $r_j(t)$ has no effect on the $i^{th}$-output component $y_i(t)$ for all
$t \in [t_0, t_f]$. The second objective, imposed in part to avoid a trivial solution where all
output components are uninfluenced by any input component, is that the closed-loop
state equation should be output controllable in the sense of Exercise 9.10.
It is clear from the problem statement that the zero-input response plays no role in
noninteracting control, so we assume for simplicity that $x(t_0) = 0$. Then the first
objective is equivalent to the requirement that the closed-loop impulse response

$$\hat{G}(t, \sigma) = C(t)\Phi_{A+BK}(t, \sigma)B(\sigma)N(\sigma)$$

be a diagonal matrix for all t and $\sigma$ such that $t_f \geq t \geq \sigma \geq t_0$. A closed-loop state
equation with this property can be viewed from an input-output perspective as a
collection of m independent, single-input, single-output linear systems. This simplifies
the output controllability objective, because from Exercise 9.10 output controllability is
achieved if each diagonal entry of $\hat{G}(t, \sigma)$ is not identically zero for $t_f \geq t \geq \sigma \geq t_0$.
(This condition also is necessary for output controllability if rank $C(t_f) = m$.)
To further simplify analysis the input-output representation can be deconstructed
to exhibit each output component. Let $C_1(t), \ldots, C_m(t)$ denote the rows of the
$m \times n$ matrix $C(t)$. Then the $i$th row of $G(t, \sigma)$ can be written as

$$ G_i(t, \sigma) = C_i(t)\,\Phi_{A+BK}(t, \sigma)\,B(\sigma)N(\sigma) \tag{30} $$

and the $i$th output component is described by

$$ y_i(t) = \int_{t_0}^{t} G_i(t, \sigma)\, r(\sigma)\, d\sigma $$

In this format the objective of noninteracting control is that the rows of $G(t, \sigma)$
have the form

$$ G_i(t, \sigma) = g_i(t, \sigma)\, e_i, \quad i = 1, \ldots, m \tag{31} $$

for $t_f \ge t \ge \sigma \ge t_0$, where each scalar function $g_i(t, \sigma)$ is not
identically zero, and $e_i$ denotes the $i$th row of $I_m$.

Solvability of the noninteracting control problem involves smoothness
assumptions stronger than our default continuity. To unclutter the development we
proceed as in Chapters 9 and 11, and simply assume every derivative that appears is
endowed with existence and continuity. After digesting the proofs, the fastidious will
find it satisfyingly easy to summarize the continuous-differentiability requirements.
An existence condition for solution of the noninteracting control problem can be
phrased in terms of the matrix functions $L_0(t), L_1(t), \ldots$ introduced in the context
of observability in Definition 9.9. However a somewhat different notation is both
convenient and traditional. Define a linear operator that maps $1 \times n$ time functions,
for example $C_i(t)$, into $1 \times n$ time functions according to

$$ L_A[C_i](t) = C_i(t)A(t) + \dot{C}_i(t) \tag{32} $$

In this notation a superscript denotes composition of linear operators,

$$ L_A^{j}[C_i](t) = L_A\big[ L_A^{j-1}[C_i] \big](t), \quad j = 1, 2, \ldots $$

and, by definition,

$$ L_A^{0}[C_i](t) = C_i(t) $$

An analogous notation is used in relation to the closed-loop linear state equation:

$$ L_{A+BK}[C_i](t) = C_i(t)[A(t) + B(t)K(t)] + \dot{C}_i(t) $$

It is easy to prove by induction that

$$ \frac{\partial^j}{\partial t^j}\big[ C_i(t)\Phi_{A+BK}(t, \sigma) \big] = L_{A+BK}^{j}[C_i](t)\,\Phi_{A+BK}(t, \sigma), \quad j = 0, 1, \ldots \tag{33} $$

an expression that on evaluation at $\sigma = t$ and translation of notation recalls
equation (20) of Chapter 9. Going further, (30) and (33) give

$$ \frac{\partial^j}{\partial t^j} G_i(t, \sigma) = L_{A+BK}^{j}[C_i](t)\,\Phi_{A+BK}(t, \sigma)\,B(\sigma)N(\sigma), \quad j = 0, 1, \ldots \tag{34} $$

A basic structural concept for the linear state equation (1) can be introduced in
terms of this notation. The underlying calculation is repeated differentiation of the
$i$th component of the zero-state response of (1) until the input $u(t)$ appears with a
coefficient that is not identically zero. For example

$$ \dot{y}_i(t) = \dot{C}_i(t)x(t) + C_i(t)\dot{x}(t) = [\dot{C}_i(t) + C_i(t)A(t)]x(t) + C_i(t)B(t)u(t) $$

In continuing this calculation the coefficient of $u(t)$ in the $(j+1)$th derivative of
$y_i(t)$ is

$$ L_A^{j}[C_i](t)B(t) $$

at least up to and including the derivative where the coefficient of the input is nonzero.
The number of output derivatives until the input appears with nonzero coefficient is of
main interest, and a key assumption is that this number not change with time.
14.10 Definition The linear state equation (1) is said to have constant relative degree
$\kappa_1, \ldots, \kappa_m$ on $[t_0, t_f]$ if $\kappa_1, \ldots, \kappa_m$ are finite positive
integers such that

$$ L_A^{j}[C_i](t)B(t) = 0, \quad t \in [t_0, t_f], \quad j = 0, \ldots, \kappa_i - 2 $$
$$ L_A^{\kappa_i - 1}[C_i](t)B(t) \neq 0, \quad t \in [t_0, t_f] \tag{35} $$

for $i = 1, \ldots, m$.

We emphasize that the same constant $\kappa_i$ must be such that the relations (35) hold
at every $t$ in the interval. Straightforward application of the definition, left as a small
exercise, provides a useful identity relating open-loop and closed-loop operators.

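14.11 Lemma Suppose the linear state equation (1) has constant relative degree
$\kappa_1, \ldots, \kappa_m$ on $[t_0, t_f]$. Then for any state feedback gain $K(t)$, and
$i = 1, \ldots, m$,

$$ L_{A+BK}^{j}[C_i](t) = L_A^{j}[C_i](t), \quad j = 0, \ldots, \kappa_i - 1, \quad t \in [t_0, t_f] \tag{36} $$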

Existence conditions for solution of the noninteracting control problem on a
specified time interval $[t_0, t_f]$ rely on intricate but elementary calculations. A slight
complication is that $N(t)$ could fail to be invertible (even zero) on subintervals of
$[t_0, t_f]$, so that the closed-loop state equation ignores portions of the reference input
yet is output controllable on $[t_0, t_f]$. We circumvent this impracticality by considering
only the case where $N(t)$ is invertible at each $t \in [t_0, t_f]$. In a similar vein note
that the following existence condition cannot be satisfied unless

$$ \operatorname{rank} B(t) = \operatorname{rank} C(t) = m, \quad t \in [t_0, t_f] $$

14.12 Theorem Suppose the linear state equation (1) with $p = m$, and suitable
differentiability assumptions, has constant relative degree $\kappa_1, \ldots, \kappa_m$ on
$[t_0, t_f]$. Then there exist feedback gains $K(t)$ and $N(t)$ that achieve noninteracting
control on $[t_0, t_f]$, with $N(t)$ invertible at each $t \in [t_0, t_f]$, if and only if the
$m \times m$ matrix

$$ \Delta(t) = \begin{bmatrix} L_A^{\kappa_1 - 1}[C_1](t)B(t) \\ \vdots \\ L_A^{\kappa_m - 1}[C_m](t)B(t) \end{bmatrix} \tag{37} $$

is invertible at each $t \in [t_0, t_f]$.


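Proof To streamline the presentation we compute for a general value of index $i$,
$i = 1, \ldots, m$, and neglect repetitive display of the argument range
$t_f \ge t \ge \sigma \ge t_0$. The first step is to develop via basic calculus a
representation for $G_i(t, \sigma)$ in terms of its own derivatives. This permits
characterizing the objective of noninteracting control in terms of the operators
$L_{A+BK}^{j}[C_i]$ by (34).
For any $\sigma$ the $1 \times m$ matrix function $G_i(t, \sigma)$ can be written as

$$ G_i(t, \sigma) = G_i(\sigma, \sigma) + \int_{\sigma}^{t} \frac{\partial}{\partial \sigma_1} G_i(\sigma_1, \sigma)\, d\sigma_1 \tag{38} $$

Similarly we can write

$$ \frac{\partial}{\partial \sigma_1} G_i(\sigma_1, \sigma) = \frac{\partial}{\partial \sigma_1} G_i(\sigma_1, \sigma)\Big|_{\sigma_1 = \sigma} + \int_{\sigma}^{\sigma_1} \frac{\partial^2}{\partial \sigma_2^2} G_i(\sigma_2, \sigma)\, d\sigma_2 $$

and substitute into (38) to obtain

$$ G_i(t, \sigma) = G_i(\sigma, \sigma) + \frac{\partial}{\partial \sigma_1} G_i(\sigma_1, \sigma)\Big|_{\sigma_1 = \sigma} (t - \sigma) + \int_{\sigma}^{t}\!\int_{\sigma}^{\sigma_1} \frac{\partial^2}{\partial \sigma_2^2} G_i(\sigma_2, \sigma)\, d\sigma_2\, d\sigma_1 \tag{39} $$

Next write

$$ \frac{\partial^2}{\partial \sigma_2^2} G_i(\sigma_2, \sigma) = \frac{\partial^2}{\partial \sigma_2^2} G_i(\sigma_2, \sigma)\Big|_{\sigma_2 = \sigma} + \int_{\sigma}^{\sigma_2} \frac{\partial^3}{\partial \sigma_3^3} G_i(\sigma_3, \sigma)\, d\sigma_3 $$

and substitute into (39). Repeating this process $\kappa_i - 1$ times yields the
representation

$$ G_i(t, \sigma) = G_i(\sigma, \sigma) + \frac{\partial}{\partial \sigma_1} G_i(\sigma_1, \sigma)\Big|_{\sigma_1 = \sigma} (t - \sigma) + \cdots + \frac{\partial^{\kappa_i - 1}}{\partial \sigma_{\kappa_i - 1}^{\kappa_i - 1}} G_i(\sigma_{\kappa_i - 1}, \sigma)\Big|_{\sigma_{\kappa_i - 1} = \sigma} \frac{(t - \sigma)^{\kappa_i - 1}}{(\kappa_i - 1)!} $$
$$ \qquad + \int_{\sigma}^{t}\!\int_{\sigma}^{\sigma_1}\!\!\cdots\!\int_{\sigma}^{\sigma_{\kappa_i - 1}} \frac{\partial^{\kappa_i}}{\partial \sigma_{\kappa_i}^{\kappa_i}} G_i(\sigma_{\kappa_i}, \sigma)\, d\sigma_{\kappa_i} \cdots d\sigma_1 $$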

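Using (34) gives

$$ G_i(t, \sigma) = C_i(\sigma)B(\sigma)N(\sigma) + L_{A+BK}[C_i](\sigma)B(\sigma)N(\sigma)\,(t - \sigma) + \cdots + L_{A+BK}^{\kappa_i - 1}[C_i](\sigma)B(\sigma)N(\sigma)\, \frac{(t - \sigma)^{\kappa_i - 1}}{(\kappa_i - 1)!} $$
$$ \qquad + \int_{\sigma}^{t}\!\int_{\sigma}^{\sigma_1}\!\!\cdots\!\int_{\sigma}^{\sigma_{\kappa_i - 1}} L_{A+BK}^{\kappa_i}[C_i](\sigma_{\kappa_i})\,\Phi_{A+BK}(\sigma_{\kappa_i}, \sigma)\,B(\sigma)N(\sigma)\, d\sigma_{\kappa_i} \cdots d\sigma_1 $$

Then from (35) and (36) we obtain

$$ G_i(t, \sigma) = L_A^{\kappa_i - 1}[C_i](\sigma)B(\sigma)N(\sigma)\, \frac{(t - \sigma)^{\kappa_i - 1}}{(\kappa_i - 1)!} $$
$$ \qquad + \int_{\sigma}^{t}\!\int_{\sigma}^{\sigma_1}\!\!\cdots\!\int_{\sigma}^{\sigma_{\kappa_i - 1}} L_{A+BK}^{\kappa_i}[C_i](\sigma_{\kappa_i})\,\Phi_{A+BK}(\sigma_{\kappa_i}, \sigma)\,B(\sigma)N(\sigma)\, d\sigma_{\kappa_i} \cdots d\sigma_1 \tag{40} $$

In terms of this representation for the rows of the impulse response, noninteracting
control is achieved if and only if for each $i$ there exist a pair of scalar functions
$g_i(\sigma)$ and $f_i(\sigma_{\kappa_i}, \sigma)$, not both identically zero, such that

$$ L_A^{\kappa_i - 1}[C_i](\sigma)B(\sigma)N(\sigma) = g_i(\sigma)\, e_i \tag{41} $$

and

$$ L_{A+BK}^{\kappa_i}[C_i](\sigma_{\kappa_i})\,\Phi_{A+BK}(\sigma_{\kappa_i}, \sigma)\,B(\sigma)N(\sigma) = f_i(\sigma_{\kappa_i}, \sigma)\, e_i \tag{42} $$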

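For the sufficiency portion of the proof we need to choose gains $K(t)$ and $N(t)$ to
satisfy (41) and (42) for $i = 1, \ldots, m$. Surprisingly clever choices can be made. The
assumed invertibility of $\Delta(t)$ at each $t$ permits the gain selection

$$ N(t) = \Delta^{-1}(t) \tag{43} $$

Then

$$ L_A^{\kappa_i - 1}[C_i](\sigma)B(\sigma)N(\sigma) = e_i\, \Delta(\sigma)\Delta^{-1}(\sigma) = e_i $$

and (41) is satisfied with $g_i(\sigma) = 1$. To address (42), write

$$ L_{A+BK}^{\kappa_i}[C_i](t) = L_{A+BK}\big[ L_A^{\kappa_i - 1}[C_i] \big](t) = L_A^{\kappa_i - 1}[C_i](t)\,[A(t) + B(t)K(t)] + \frac{d}{dt} L_A^{\kappa_i - 1}[C_i](t) \tag{44} $$

Choosing the gain

$$ K(t) = -\Delta^{-1}(t)\,[\,\Omega(t)A(t) + \dot{\Omega}(t)\,] \tag{45} $$

where

$$ \Omega(t) = \begin{bmatrix} L_A^{\kappa_1 - 1}[C_1](t) \\ \vdots \\ L_A^{\kappa_m - 1}[C_m](t) \end{bmatrix} $$

and substituting into (44) gives

$$ L_{A+BK}^{\kappa_i}[C_i](t) = L_A^{\kappa_i - 1}[C_i](t)A(t) - L_A^{\kappa_i - 1}[C_i](t)B(t)\Delta^{-1}(t)\,[\,\Omega(t)A(t) + \dot{\Omega}(t)\,] + \frac{d}{dt} L_A^{\kappa_i - 1}[C_i](t) $$
$$ \qquad = L_A^{\kappa_i - 1}[C_i](t)A(t) - e_i\,[\,\Omega(t)A(t) + \dot{\Omega}(t)\,] + \frac{d}{dt} L_A^{\kappa_i - 1}[C_i](t) = 0 $$

Therefore (42) is satisfied with $f_i(\sigma_{\kappa_i}, \sigma)$ identically zero. Since the
feedback gains (43) and (45) are independent of the index $i$, noninteracting control is
achieved for the corresponding closed-loop state equation.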
To prove necessity of the invertibility condition on $\Delta(t)$, suppose $K(t)$ and $N(t)$
achieve noninteracting control, with $N(t)$ invertible at each $t$. Then (41) is satisfied,
in particular. From the definition of relative degree and the invertibility of
$N(\sigma)$, we have

$$ g_i(\sigma) \neq 0, \quad \sigma \in [t_0, t_f] $$

This argument applies for $i = 1, \ldots, m$, and the collection of identities represented
by (41) can be written as

$$ \Delta(\sigma)N(\sigma) = \operatorname{diagonal}\{ g_1(\sigma), \ldots, g_m(\sigma) \} $$

It follows that $\Delta(\sigma)$ is invertible at each $\sigma \in [t_0, t_f]$.

□□□
Specialization of Theorem 14.12 to the time-invariant case is almost immediate
from the observability lineage of $L_A^{j}[C_i](t)$. The notion of constant relative degree
deflates to existence of finite positive integers $\kappa_1, \ldots, \kappa_m$ such that

$$ C_i A^{j} B = 0, \quad j = 0, \ldots, \kappa_i - 2 $$
$$ C_i A^{\kappa_i - 1} B \neq 0 \tag{46} $$

for $i = 1, \ldots, m$. It remains only to work out the specialized proof to verify that the
time interval is immaterial, and that constant gains can be used (Exercise 14.13).

14.13 Corollary Suppose the time-invariant linear state equation (22) with $p = m$ has
relative degree $\kappa_1, \ldots, \kappa_m$. Then there exist constant feedback gains $K$ and
invertible $N$ that achieve noninteracting control if and only if the $m \times m$ matrix

$$ \Delta = \begin{bmatrix} C_1 A^{\kappa_1 - 1} B \\ \vdots \\ C_m A^{\kappa_m - 1} B \end{bmatrix} \tag{47} $$

is invertible.
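
The time-invariant recipe can be tried numerically. The sketch below is an illustration under stated assumptions, not the book's program: the three-state plant is hypothetical, the helper names (`matmul`, `inverse`, `decoupling_gains`, `closed_loop_markov`) are invented here, and the constant gains $N = \Delta^{-1}$ and $K = -\Delta^{-1}\Omega$, with rows $C_i A^{\kappa_i}$ in $\Omega$, are the time-invariant reading of (43) and (45). Exact rational arithmetic is used so that the diagonal structure of $C(A+BK)^{j}BN$ can be checked without rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Exact Gauss-Jordan inverse; raises ValueError if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if p is None:
            raise ValueError("Delta is not invertible")
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def decoupling_gains(A, B, C):
    """Return (kappas, K, N) with N = Delta^{-1} and K = -Delta^{-1} Omega."""
    n = len(A)
    kappas, delta, omega = [], [], []
    for Ci in C:
        row = [list(Ci)]                      # holds C_i A^(kappa-1) as a 1 x n matrix
        for kappa in range(1, n + 1):
            rb = matmul(row, B)[0]
            if any(v != 0 for v in rb):       # first nonzero C_i A^(kappa-1) B
                break
            row = matmul(row, A)
        else:
            raise ValueError("output row has no finite relative degree")
        kappas.append(kappa)
        delta.append(rb)                      # row i of Delta in (47)
        omega.append(matmul(row, A)[0])       # C_i A^kappa_i
    N = inverse(delta)
    K = [[-v for v in r] for r in matmul(N, omega)]
    return kappas, K, N

def closed_loop_markov(A, B, C, K, N, count):
    """Markov parameters C (A+BK)^j B N for j = 0, ..., count-1."""
    BK = matmul(B, K)
    Acl = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, BK)]
    BN = matmul(B, N)
    out, CA = [], [list(r) for r in C]
    for _ in range(count):
        out.append(matmul(CA, BN))
        CA = matmul(CA, Acl)
    return out

# Hypothetical plant, chosen only for illustration.
A = [[0, 1, 0], [2, 0, 1], [1, 1, 1]]
B = [[0, 0], [1, 1], [1, 0]]
C = [[1, 0, 0], [0, 0, 1]]
kappas, K, N = decoupling_gains(A, B, C)
markov = closed_loop_markov(A, B, C, K, N, 4)
```

For this plant the relative degrees come out as 2 and 1, and every closed-loop Markov parameter is diagonal, which is the time-invariant counterpart of a diagonal impulse response.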

14.14 Example

For the plant

0100

x(t) =

b(t)

x(t) +

1101

u(t)
1

y(t)=
simple calculations give

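$$ L_A^{0}[C_1](t)B(t) = \begin{bmatrix} 0 & 0 \end{bmatrix}, \qquad L_A[C_1](t)B(t) = \begin{bmatrix} 1 & 1 \end{bmatrix}, \qquad L_A^{0}[C_2](t)B(t) = \begin{bmatrix} b(t) & 0 \end{bmatrix} $$

If $[t_0, t_f]$ is an interval such that $b(t) \neq 0$ for $t \in [t_0, t_f]$, then the plant has
constant relative degree $\kappa_1 = 2$, $\kappa_2 = 1$ on $[t_0, t_f]$. Furthermore

$$ \Delta(t) = \begin{bmatrix} 1 & 1 \\ b(t) & 0 \end{bmatrix} $$

is invertible for $t \in [t_0, t_f]$. The gains in (43) and (45) yield the state feedback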

u(t)=
and the

?]x(t)

[?

resulting noninteracting closed-loop state equation is

1201
g g

2202
y(t)=

10
x(t) +

10

1/b(t)]r(t)

(48)

Additional Examples
We return to examples in Chapter 2 to illustrate the capabilities of feedback in
modifying the dynamical behavior of an open-loop state equation. Other features of
feedback, particularly and notably in regard to robustness properties of systems, are left
to the study of linear control theory.

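14.15 Example The linear state equation

$$ \dot{x}(t) = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -a_0(t) & -a_1(t) & \cdots & -a_{n-1}(t) \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ b_0(t) \end{bmatrix} u(t) $$
$$ y(t) = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} x(t) \tag{49} $$

is developed in Example 2.5 as a representation for a system described by an $n$th-order
linear differential equation. Given any degree-$n$ polynomial

$$ p(\lambda) = \lambda^{n} + p_{n-1}\lambda^{n-1} + \cdots + p_1 \lambda + p_0 $$

and assuming $b_0(t) \neq 0$ for all $t$, the state feedback

$$ u(t) = \frac{1}{b_0(t)} \begin{bmatrix} a_0(t) - p_0 & a_1(t) - p_1 & \cdots & a_{n-1}(t) - p_{n-1} \end{bmatrix} x(t) + \frac{1}{b_0(t)}\, r(t) $$

yields the closed-loop state equation

$$ \dot{x}(t) = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -p_0 & -p_1 & \cdots & -p_{n-1} \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} r(t) $$
$$ y(t) = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} x(t) \tag{50} $$

Thus we have obtained a time-invariant closed-loop state equation, and a straightforward
calculation shows that its characteristic polynomial is $p(\lambda)$. This illustrates
attributes of the special form of (49) in the time-varying case, and when specialized to
the time-invariant setting it illustrates the simple single-input case underlying our
general proof of eigenvalue assignment. Also the conversion of (49) to time invariance
further demonstrates the tremendous capability of state feedback.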
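
As a quick numerical check of this cancellation, the following sketch evaluates $A(t) + B(t)K(t)$ at a few sample times for a hypothetical third-order instance of the companion form. The coefficient functions `a`, `b0`, the target coefficients `p`, and the helper name `closed_loop_matrix` are all assumptions made for illustration; the gain is the one described in the example, with the factor $1/b_0(t)$ as read here.

```python
import math

def closed_loop_matrix(a, b0, p, t):
    """A(t) + B(t)K(t) for a companion-form plant, evaluated at time t.

    a  : list of callables a_0(t), ..., a_{n-1}(t)
    b0 : callable b_0(t), assumed nonzero
    p  : list of target coefficients p_0, ..., p_{n-1}
    """
    n = len(a)
    a_t = [ai(t) for ai in a]
    b_t = b0(t)
    if b_t == 0:
        raise ValueError("b0(t) must be nonzero")
    # Feedback gain K(t) = (1/b0(t)) [a_0(t) - p_0, ..., a_{n-1}(t) - p_{n-1}]
    k_t = [(a_t[j] - p[j]) / b_t for j in range(n)]
    rows = [[1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n - 1)]
    # Last row of A(t) is -a_j(t); B(t)K(t) adds b0(t) * k_j(t) to it.
    rows.append([-a_t[j] + b_t * k_t[j] for j in range(n)])
    return rows

# Hypothetical time-varying coefficients and target polynomial
# p(lambda) = lambda^3 + 6 lambda^2 + 11 lambda + 6.
a = [math.sin, lambda t: t * t, lambda t: math.exp(-t)]
b0 = lambda t: 2.0 + math.cos(t)
p = [6.0, 11.0, 6.0]
samples = [closed_loop_matrix(a, b0, p, t) for t in (0.0, 0.7, 3.0)]
```

Up to rounding, every sampled matrix has last row $[-6, -11, -6]$, so the closed-loop coefficient matrix does not depend on $t$ even though the plant does.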

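14.16 Example The linearization of an orbiting satellite about a circular orbit of radius
$r_0$ and angular velocity $\omega_0$ is described in Example 2.7, leading to

$$ \dot{x}(t) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 3\omega_0^2 & 0 & 0 & 2 r_0 \omega_0 \\ 0 & 0 & 0 & 1 \\ 0 & -2\omega_0 / r_0 & 0 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1/r_0 \end{bmatrix} u(t) $$
$$ y(t) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} x(t) \tag{51} $$

The output components are deviations in radius and angle of the orbit. The inputs are
radial and tangential force on the satellite produced by internal means. An easy
calculation shows that the eigenvalues of this state equation are $0, 0, \pm i\omega_0$.
Thus small deviations in radial distance or angle of the satellite, represented by nonzero
initial states, perpetuate, and the satellite never returns to the nominal, circular orbit.
This is illustrated in Example 3.8.
Since (51) is controllable, forces can be generated on the satellite that depend on
the state in such a way that deviations are damped out. Mathematically this corresponds
to choosing a state feedback of the form

$$ u(t) = Kx(t) = \begin{bmatrix} k_{11} & k_{12} & k_{13} & k_{14} \\ k_{21} & k_{22} & k_{23} & k_{24} \end{bmatrix} x(t) $$

The corresponding closed-loop state equation is

$$ \dot{x}(t) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 3\omega_0^2 + k_{11} & k_{12} & k_{13} & 2 r_0 \omega_0 + k_{14} \\ 0 & 0 & 0 & 1 \\ k_{21}/r_0 & (-2\omega_0 + k_{22})/r_0 & k_{23}/r_0 & k_{24}/r_0 \end{bmatrix} x(t) $$

There are several strategies for choosing the feedback gain $K$ to obtain an
exponentially-stable closed-loop state equation, and indeed to place the eigenvalues at
desired locations. One approach is to first set

$$ k_{13} = 0, \quad k_{14} = -2 r_0 \omega_0, \quad k_{21} = 0, \quad k_{22} = 2\omega_0 $$

Then

$$ \dot{x}(t) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 3\omega_0^2 + k_{11} & k_{12} & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & k_{23}/r_0 & k_{24}/r_0 \end{bmatrix} x(t) \tag{52} $$

and the closed-loop characteristic polynomial has the simple form

$$ \det(\lambda I - A - BK) = \big[ \lambda^2 - k_{12}\lambda - (3\omega_0^2 + k_{11}) \big] \big[ \lambda^2 - (k_{24}/r_0)\lambda - k_{23}/r_0 \big] $$

Clearly the remaining gains can be chosen to place the roots of these two quadratic
factors as desired.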
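
The gain strategy above can be checked with exact arithmetic. This is a minimal sketch that assumes the standard satellite linearization (state ordered as radial deviation, radial rate, angular deviation, angular rate) for the coefficient matrices; the numerical values of $r_0$ and $\omega_0$, the target quadratics, and the helper names (`satellite`, `block_gain`, `char_poly`) are illustrative choices. The characteristic polynomial is computed with the Faddeev-LeVerrier recursion, a technique swapped in here and not taken from the text.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def char_poly(A):
    """Coefficients of det(lambda*I - A), highest power first (Faddeev-LeVerrier)."""
    n = len(A)
    M = [[Fraction(0)] * n for _ in range(n)]
    c = Fraction(1)
    coeffs = [c]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
        AM = matmul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs.append(c)
    return coeffs

def satellite(r0, w0):
    """Assumed linearized satellite model: returns (A, B)."""
    A = [[0, 1, 0, 0],
         [3 * w0 ** 2, 0, 0, 2 * r0 * w0],
         [0, 0, 0, 1],
         [0, -2 * w0 / r0, 0, 0]]
    B = [[0, 0], [1, 0], [0, 0], [0, 1 / r0]]
    return A, B

def block_gain(r0, w0, radial, angular):
    """Gain K that decouples the two 2x2 blocks and assigns each quadratic.

    radial = (a1, a0) targets lambda^2 + a1*lambda + a0 for the radial block,
    angular = (b1, b0) targets lambda^2 + b1*lambda + b0 for the angular block.
    """
    a1, a0 = radial
    b1, b0 = angular
    return [[-3 * w0 ** 2 - a0, -a1, 0, -2 * r0 * w0],
            [0, 2 * w0, -r0 * b0, -r0 * b1]]

r0, w0 = Fraction(2), Fraction(3)          # illustrative values only
A, B = satellite(r0, w0)
K = block_gain(r0, w0, radial=(3, 2), angular=(7, 12))
BK = matmul(B, K)
A_cl = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, BK)]
open_loop = char_poly(A)
closed_loop = char_poly(A_cl)
```

With these numbers the open-loop polynomial is $\lambda^4 + \omega_0^2\lambda^2$, and the closed-loop polynomial is the product $(\lambda^2 + 3\lambda + 2)(\lambda^2 + 7\lambda + 12)$, so the four eigenvalues sit at $-1, -2, -3, -4$.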

EXERCISES
Exercise 14.1 Consider the time-invariant linear state equation

$$ \dot{x}(t) = Ax(t) + Bu(t) $$

and suppose the $n \times n$ matrix $F$ has the characteristic polynomial
$\det(\lambda I - F) = p(\lambda)$. If the $m \times n$ matrix $R$ and the invertible,
$n \times n$ matrix $Q$ are such that

$$ AQ - QF = BR $$

show how to choose an $m \times n$ matrix $K$ such that $A + BK$ has characteristic
polynomial $p(\lambda)$. Why is controllability not involved?

Exercise 14.2 Establish the following version of Theorem 14.7. If the time-invariant linear state
equation

$$ \dot{x}(t) = Ax(t) + Bu(t) $$

is controllable, then for any $t_f > 0$ the time-invariant state feedback

$$ u(t) = -B^{T} \left[ \int_{0}^{t_f} e^{-A\tau} B B^{T} e^{-A^{T}\tau}\, d\tau \right]^{-1} x(t) $$

yields an exponentially stable closed-loop state equation. Hint: Consider

$$ (A + BK)Q + Q(A + BK)^{T} $$

where

$$ Q = \int_{0}^{t_f} e^{-At} B B^{T} e^{-A^{T}t}\, dt $$

and proceed as in Exercise 9.9.

Exercise 14.3 Suppose that the time-invariant linear state equation

$$ \dot{x}(t) = Ax(t) + Bu(t) $$

is controllable and $A + A^{T} < 0$. Show that the state feedback

$$ u(t) = -B^{T}x(t) $$

yields a closed-loop state equation that is exponentially stable. Hint: One approach is to directly
consider an arbitrary eigenvalue-eigenvector pair for $A - BB^{T}$.


Exercise 14.4 Given the time-invariant linear state equation

$$ \dot{x}(t) = Ax(t) + Bu(t) $$
$$ y(t) = Cx(t) $$

with time-invariant state feedback

$$ u(t) = Kx(t) + Nr(t) $$

show that the transfer function of the resulting closed-loop state equation can be written in terms
of the open-loop transfer function as

$$ C(sI - A)^{-1}B\,\big[ I - K(sI - A)^{-1}B \big]^{-1} N $$

(This shows that the input-output behavior of the closed-loop state equation can be obtained by
use of a precompensator instead of feedback.) Hint: An easily-verified, useful identity for an
$n \times m$ matrix $P$ and an $m \times n$ matrix $Q$ is

$$ P(I_m - QP)^{-1} = (I_n - PQ)^{-1}P $$

where the indicated inverses are assumed to exist.

Exercise 14.5 Provide a proof of Theorem 14.8 via these steps:

(a) Consider the quadratic form $x^{H}Ax + x^{H}A^{T}x$ for $x$ a unity-norm eigenvector of $A$, and
show that $-(A^{T} + \alpha I)$ has negative-real-part eigenvalues.

(b) Use Theorem 7.10 to write the unique solution of (24), and show by contradiction that the
controllability hypothesis implies $Q > 0$.

(c) For the linear state equation (26), substitute for $BB^{T}$ from (24) and conclude (26) is
exponentially stable.

(d) Apply Lemma 14.6 to complete the proof.

Exercise 14.6 Use Exercise 13.12 to give an alternate proof of Theorem 14.9.

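Exercise 14.7 For a controllable, single-input linear state equation

$$ \dot{x}(t) = Ax(t) + bu(t) $$

suppose a degree-$n$ monic polynomial $p(\lambda)$ is given. Show that the state feedback gain

$$ k = -\begin{bmatrix} 0 & \cdots & 0 & 1 \end{bmatrix} \begin{bmatrix} b & Ab & \cdots & A^{n-1}b \end{bmatrix}^{-1} p(A) $$

is such that $\det(\lambda I - A - bk) = p(\lambda)$. Hint: First show for the controller-form
case (Example 10.11) that

$$ k = -\begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} p(A) $$

and

$$ \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} = \begin{bmatrix} 0 & \cdots & 0 & 1 \end{bmatrix} \begin{bmatrix} b & Ab & \cdots & A^{n-1}b \end{bmatrix}^{-1} $$

Exercise 14.8 For the time-invariant linear state equation

$$ \dot{x}(t) = Ax(t) + Bu(t) $$

show that there exists a time-invariant state feedback

$$ u(t) = Kx(t) $$

such that the closed-loop state equation is exponentially stable if and only if

$$ \operatorname{rank} \begin{bmatrix} \lambda I - A & B \end{bmatrix} = n $$

for each $\lambda$ that is a nonnegative-real-part eigenvalue of $A$. (The property in question is
called stabilizability.)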
Exercise 14.9 Prove that the controllability indices and observability indices in Definition 13.5
and Definition 13.16, respectively, for the time-invariant linear state equation

$$ \dot{x}(t) = (A + BLC)x(t) + Bu(t) $$
$$ y(t) = Cx(t) $$

are independent of the choice of $m \times p$ output feedback gain $L$.

Exercise 14.10 Prove that the time-invariant linear state equation

$$ \dot{x}(t) = Ax(t) + Bu(t) $$
$$ y(t) = Cx(t) $$

cannot be made exponentially stable by output feedback

$$ u(t) = Ly(t) $$

if $CB = 0$ and $\operatorname{tr}[A] > 0$.

Exercise 14.11

Determine if the noninteracting control problem for the plant

v(t)=

01000
00111
I

000

t'

00000

x(t) +

00000

00
00
0 0

10
01

n(t)

II 00001
0jx(t)
[o

can be solved on a suitable time interval. If so, compute a state feedback that solves the problem.

Exercise 14.12 Suppose a time-invariant linear state equation with $p = m$ is described by the
transfer function $G(s)$. Interpret the relative degree $\kappa_1, \ldots, \kappa_m$ in terms of
simple features of $G(s)$.

Exercise 14.13 Write out a detailed proof of Corollary 14.13, including formulas for constant
gains that achieve noninteracting control.

Exercise 14.14 Compute the transfer function of the closed-loop linear state equation resulting
from the sufficiency proof of Theorem 14.12. Hint: This is not an unreasonable request.

Exercise 14.15 For a single-input, single-output plant

$$ \dot{x}(t) = A(t)x(t) + B(t)u(t) $$
$$ y(t) = C(t)x(t) $$

derive a necessary and sufficient condition for existence of state feedback

$$ u(t) = K(t)x(t) + N(t)r(t) $$

with $N(t)$ never zero such that the closed-loop weighting pattern admits a time-invariant
realization. (List any additional assumptions you require.)

Exercise 14.16 Changing notation from Definition 9.3, corresponding to the linear state equation

$$ \dot{x}(t) = A(t)x(t) + B(t)u(t) $$

let

$$ K_A[B](t) = -A(t)B(t) + \dot{B}(t) $$

Show that the notion of constant relative degree in Definition 14.10 can be defined in terms of this
linear operator. Then prove that Theorem 14.12 remains true if $\Delta(t)$ in (37) is replaced by
[B Rt)

C1

[B ](t)

Hint: Show first that for j, k


[B 1(t) =

0,

(-1

[B J(t) +

E (-1

'

[B 1(t)]

NOTES
Note 14.1 Our treatment of the effects of feedback follows Section 19 of

R.W. Brockett, Finite Dimensional Linear Systems, John Wiley, New York, 1970

The representation of state feedback in terms of open-loop and closed-loop transfer functions is
pursued further in Chapter 16 using the polynomial fraction description for transfer functions.

Note 14.2 Results on stabilization of time-varying linear state equations by state feedback using
methods of optimal control are given in

R.E. Kalman, "Contributions to the theory of optimal control," Boletin de la Sociedad
Matematica Mexicana, Vol. 5, pp. 102-119, 1960

See also

M. Ikeda, H. Maeda, S. Kodama, "Stabilization of linear systems," SIAM Journal on Control and
Optimization, Vol. 10, No. 4, pp. 716-729, 1972

The proof of the stabilization result in Theorem 14.7 is based on

V.H.L. Cheng, "A direct way to stabilize continuous-time and discrete-time linear time-varying
systems," IEEE Transactions on Automatic Control, Vol. 24, No. 4, pp. 641-643, 1979

For the time-invariant case, Theorem 14.8 is attributed to R.W. Bass and the result of Exercise
14.2 is due to D.L. Kleinman. Many additional aspects of stabilization are known, though only
two are mentioned here. For slowly-time-varying linear state equations, stabilization results based
on Theorem 8.7 are discussed in

E.W. Kamen, P.P. Khargonekar, A. Tannenbaum, "Control of slowly-varying linear systems,"
IEEE Transactions on Automatic Control, Vol. 34, No. 12, pp. 1283-1285, 1989

It is shown in

M.A. Rotea, P.P. Khargonekar, "Stabilizability of linear time-varying and uncertain linear
systems," IEEE Transactions on Automatic Control, Vol. 33, No. 9, pp. 884-887, 1988

that if uniform exponential stability can be achieved by dynamic state feedback of the form

$$ \dot{z}(t) = F(t)z(t) + G(t)x(t) $$
$$ u(t) = H(t)z(t) + E(t)x(t) $$

then uniform exponential stability can be achieved by static state feedback of the form (2).
However when other objectives are considered, for example noninteracting control with
exponential stability in the time-invariant setting, dynamic state feedback offers more capability
than static state feedback. See Note 19.4.

Note 14.3 Eigenvalue assignability for controllable, time-invariant, single-input linear state
equations is clear from the single-input controller form, and has been understood since about
1960. The feedback gain formula in Exercise 14.7 is due to J. Ackermann, and other formulas are
available. See Section 3.2 of

T. Kailath, Linear Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1980

For multi-input state equations the eigenvalue assignment result in Theorem 14.9 is proved in

W.M. Wonham, "On pole assignment in multi-input controllable linear systems," IEEE
Transactions on Automatic Control, Vol. 12, No. 6, pp. 660-665, 1967

The approach suggested in Exercise 14.6 is due to M. Heymann. This "reduction to single-input"
approach can be developed without recourse to changes of variables. See the treatment in Chapter
20 of

R.A. DeCarlo, Linear Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1989
Note 14.4 In contrast to the single-input case, a state feedback gain $K$ that assigns a specified
set of eigenvalues for a multi-input plant is not unique. One way of using the resulting flexibility
involves assigning closed-loop eigenvectors as well as eigenvalues. Consult

B.C. Moore, "On the flexibility offered by state feedback in multivariable systems beyond closed
loop eigenvalue assignment," IEEE Transactions on Automatic Control, Vol. 21, No. 5, pp. 689-692,
1976

and

G. Klein, B.C. Moore, "Eigenvalue-generalized eigenvector assignment with state feedback,"
IEEE Transactions on Automatic Control, Vol. 22, No. 1, pp. 140-141, 1977

Another characterization of the flexibility involves the invariant factors of $A + BK$ and is due to
H.H. Rosenbrock. See the treatment in

B.W. Dickinson, "On the fundamental theorem of linear state feedback," IEEE Transactions on
Automatic Control, Vol. 19, No. 5, pp. 577-579, 1974
Note 14.5 Eigenvalue assignment capabilities of static output feedback is a famously difficult
topic. Early contributions include

H. Kimura, "Pole assignment by gain output feedback," IEEE Transactions on Automatic
Control, Vol. 20, No. 4, pp. 509-516, 1975

E.J. Davison, S.H. Wang, "On pole assignment in linear multivariable systems using output
feedback," IEEE Transactions on Automatic Control, Vol. 20, No. 4, pp. 516-518, 1975

Recent studies that make use of the geometric theory in Chapter 18 are

C. Champetier, J.F. Magni, "On eigenstructure assignment by gain output feedback," SIAM
Journal on Control and Optimization, Vol. 29, No. 4, pp. 848-865, 1991

J.F. Magni, C. Champetier, "A geometric framework for pole assignment algorithms," IEEE
Transactions on Automatic Control, Vol. 36, No. 9, pp. 1105-1111, 1991

A survey paper focusing on methods of algebraic geometry is

C.I. Byrnes, "Pole assignment by output feedback," in Three Decades of Mathematical System
Theory, H. Nijmeijer, J.M. Schumacher, editors, Springer-Verlag Lecture Notes in Control and
Information Sciences, No. 135, pp. 31-78, Berlin, 1989
Note 14.6 For a time-invariant linear state equation in controller form,
= (Ar, +

the linear state feedback

u(t) =

R*(t)

gives a closed-loop state equation described by the integrator coefficient matrices,


=

+ B,,r(i)

In other words, for a controllable linear state equation there is a state variable change and state
feedback yielding a closed-loop state equation with structure that depends only on the
controllability indices. This is called Brunovsky form after

P. Brunovsky, "A classification of linear controllable systems," Kybernetika, Vol. 6, pp. 173-188,
1970

If an output is specified, the additional operations of output variable change and output injection
(see Exercise 15.9) permit simultaneous attainment of a special structure for C that has the form of
A treatment using the geometric tools of Chapters 18 and 19 can be found in

A.S. Morse, "Structural invariants of linear multivariable systems," SIAM Journal on Control
and Optimization, Vol. 11, No. 3, pp. 446-465, 1973
Note 14.7 The noninteracting control problem also is called the decoupling problem. For
time-invariant linear state equations, the existence condition in Corollary 14.13 appears in

P.L. Falb, W.A. Wolovich, "Decoupling in the design and synthesis of multivariable control
systems," IEEE Transactions on Automatic Control, Vol. 12, No. 6, pp. 651-659, 1967

For time-varying linear state equations, the existence condition is discussed in

W.A. Porter, "Decoupling of and inverses for time-varying linear systems," IEEE Transactions
on Automatic Control, Vol. 14, No. 4, pp. 378-380, 1969

with additional work reported in

E. Freund, "Design of time-variable multivariable systems by decoupling and by the inverse,"
IEEE Transactions on Automatic Control, Vol. 16, No. 2, pp. 183-185, 1971

W.J. Rugh, "On the decoupling of linear time-variable systems," Proceedings of the
Conference on Information Sciences and Systems, Princeton University, Princeton, New Jersey,
pp. 490-494, 1971

Output controllability, used to impose nontrivial input-output behavior on each noninteracting
closed-loop subsystem, is discussed in

E. Kreindler, P.E. Sarachik, "On the concepts of controllability and observability of linear
systems," IEEE Transactions on Automatic Control, Vol. 9, pp. 129-136, 1964 (Correction: Vol.
10, No. 1, p. 118, 1965)

However the definition used is slightly different from the definition in Exercise 9.10. Details
aside, we leave noninteracting control at an embryonic stage. Endearing magic occurs in the
proof of Theorem 14.12 (see Exercise 14.14), yet many questions remain. For example
characterizing the class of state feedback gains that yield noninteraction is crucial in assessing the
possibility of achieving desirable input-output behavior, for example stability if the time interval
is infinite. Further developments are left to the literature of control theory, some of which is cited
in Chapter 19 where a more general noninteracting control problem for time-invariant linear state
equations is reconstituted in a geometric setting.

15
STATE OBSERVATION

An important application of the notion of state feedback in linear system theory occurs
in the theory of state observation via observers. Observers in turn play an important role
in control problems involving output feedback.
In rough terms state observation involves using current and past values of the plant
input and output signals to generate an estimate of the (assumed unknown) current state.
Of course as the current time $t$ gets larger there is more information available, and a
better estimate is expected. A more precise formulation is based on an idealized
objective. Given a linear state equation

$$ \dot{x}(t) = A(t)x(t) + B(t)u(t), \quad x(t_0) = x_0 $$
$$ y(t) = C(t)x(t) \tag{1} $$

with the initial state $x_0$ unknown, the goal is to generate an $n \times 1$ vector function
$\hat{x}(t)$ that is an estimate of $x(t)$ in the sense

$$ \lim_{t \to \infty} [\, x(t) - \hat{x}(t) \,] = 0 \tag{2} $$

It is assumed that the procedure for producing $\hat{x}(t_a)$ at any $t_a \ge t_0$ can make use
of the values of $u(t)$ and $y(t)$ for $t \in [t_0, t_a]$, as well as knowledge of the
coefficient matrices in (1).
If (1) is observable on $[t_0, t_b]$, then an immediate suggestion for obtaining a state
estimate is to first compute the initial state from knowledge of $u(t)$ and $y(t)$ for
$t \in [t_0, t_b]$. Then solve (1) for $t \ge t_0$, yielding an estimate that is exact at any
$t \ge t_0$, though not current. That is, the estimate is delayed because of the wait until
$t_b$, the time required to compute $x_0$, and then the time to compute the current state
from this information. In any case observability plays an important role in the state
observation problem. How feedback enters the problem is less clear, for it depends on the
specific idea of using a particular state equation to generate a state estimate.
Observers
The standard approach to state observation, motivated partly on grounds of hindsight, is
to generate an asymptotic estimate of the state of (1) by using another linear state
equation that accepts as inputs the input and output signals, $u(t)$ and $y(t)$, in (1). As
diagramed in Figure 15.1, consider the problem of choosing an $n$-dimensional linear
state equation of the form

$$ \dot{\hat{x}}(t) = F(t)\hat{x}(t) + G(t)u(t) + H(t)y(t), \quad \hat{x}(t_0) = \hat{x}_0 \tag{3} $$

with the property that (2) holds for any initial states $x_0$ and $\hat{x}_0$. A natural
requirement to impose is that if $\hat{x}_0 = x_0$, then $\hat{x}(t) = x(t)$ for all
$t \ge t_0$. Forming a state equation for $x(t) - \hat{x}(t)$ shows that this fidelity is
attained if coefficients of (3) are chosen as

$$ F(t) = A(t) - H(t)C(t) $$
$$ G(t) = B(t) $$

Then (3) can be written in the form

$$ \dot{\hat{x}}(t) = A(t)\hat{x}(t) + B(t)u(t) + H(t)[\, y(t) - \hat{y}(t) \,], \quad \hat{x}(t_0) = \hat{x}_0 $$
$$ \hat{y}(t) = C(t)\hat{x}(t) \tag{4} $$

where for convenience we have defined an output estimate $\hat{y}(t)$. The only remaining
coefficient to specify is the $n \times p$ matrix function $H(t)$, and this final step is best
motivated by considering the error in the state estimate. (We also need to set the
observer initial state, and without knowledge of $x_0$ we usually put $\hat{x}_0 = 0$.)

15.1 Figure Observer structure for generating a state estimate.

From (1) and (4) the estimate error

$$ e(t) = x(t) - \hat{x}(t) $$

satisfies the linear state equation

$$ \dot{e}(t) = [\, A(t) - H(t)C(t) \,]\, e(t), \quad e(t_0) = x_0 - \hat{x}_0 \tag{5} $$

Therefore (2) is satisfied if $H(t)$ can be chosen so that (5) is uniformly exponentially
stable. Such a selection of $H(t)$ completely specifies the linear state equation (4) that
generates the estimate, and (4) then is called an observer for the given plant. Of course
uniform exponential stability of (5) is stronger than necessary for satisfaction of (2), but
we choose to retain uniform exponential stability for reasons that will be clear when
output-feedback stabilization is considered.
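
A small simulation makes the role of the error equation concrete. The sketch below is an illustration only: the plant is a hypothetical double integrator with a position measurement, the constant gain `H` is picked by hand so that $A - HC$ has eigenvalues $-1$ and $-2$, the input is an arbitrary sinusoid, and integration is plain forward Euler. The name `simulate_error` is invented for this example.

```python
import math

def simulate_error(x0, xhat0, H, t_end=10.0, dt=1e-3):
    """Forward-Euler simulation of plant and observer; returns x - xhat at t_end."""
    x, xh, t = list(x0), list(xhat0), 0.0
    for _ in range(int(round(t_end / dt))):
        u = math.sin(t)                # known input, fed to both plant and observer
        innov = x[0] - xh[0]           # y(t) - yhat(t), with C = [1 0]
        dx = [x[1], u]                 # plant: A = [[0, 1], [0, 0]], B = [0, 1]^T
        dxh = [xh[1] + H[0] * innov,   # observer: plant copy plus H * (y - yhat)
               u + H[1] * innov]
        x = [xi + dt * di for xi, di in zip(x, dx)]
        xh = [xi + dt * di for xi, di in zip(xh, dxh)]
        t += dt
    return [xi - xhi for xi, xhi in zip(x, xh)]

# A - HC = [[-3, 1], [-2, 0]], characteristic polynomial (s + 1)(s + 2).
H = [3.0, 2.0]
final_error = simulate_error(x0=[1.0, -1.0], xhat0=[0.0, 0.0], H=H)

# With H = 0 the observer is an open-loop copy of the plant and the error does not decay.
open_loop_error = simulate_error(x0=[1.0, -1.0], xhat0=[0.0, 0.0], H=[0.0, 0.0])
```

The plant state itself grows under this input, yet the estimate error with the stabilizing gain shrinks roughly like $e^{-t}$, independent of $u(t)$. That is the content of the error equation: the input cancels, and only $A - HC$ matters.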
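The problem of choosing an observer gain $H(t)$ to stabilize (5) obviously bears a
resemblance to the problem of choosing a stabilizing state feedback gain $K(t)$ in
Chapter 14. But the explicit connection is more elusive than might be expected. Recall
that for the plant (1) the observability Gramian is given by

$$ M(t_0, t_f) = \int_{t_0}^{t_f} \Phi^{T}(\tau, t_0)\, C^{T}(\tau) C(\tau)\, \Phi(\tau, t_0)\, d\tau $$

where $\Phi(t, \tau)$ is the transition matrix for $A(t)$. Mimicking the setup of Theorem 14.7
on state feedback stabilization, let

$$ M_{\alpha}(t_0, t_f) = \int_{t_0}^{t_f} 2 e^{4\alpha(\tau - t_f)}\, \Phi^{T}(\tau, t_0)\, C^{T}(\tau) C(\tau)\, \Phi(\tau, t_0)\, d\tau $$

15.2 Theorem Suppose for the linear state equation (1) there exist positive constants
$\delta$, $\varepsilon_1$, and $\varepsilon_2$ such that

$$ \varepsilon_1 I \le \Phi^{T}(t - \delta, t)\, M(t - \delta, t)\, \Phi(t - \delta, t) \le \varepsilon_2 I \tag{6} $$

for all $t$. Then given a positive constant $\alpha$ the observer gain

$$ H(t) = \big[ \Phi^{T}(t - \delta, t)\, M_{\alpha}(t - \delta, t)\, \Phi(t - \delta, t) \big]^{-1} C^{T}(t) \tag{7} $$

is such that the resulting observer-error state equation (5) is uniformly exponentially
stable with rate $\alpha$.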

Proof Given $\alpha > 0$, first note that from (6),

$$ 2\varepsilon_1 e^{-4\alpha\delta} I \le \Phi^{T}(t - \delta, t)\, M_{\alpha}(t - \delta, t)\, \Phi(t - \delta, t) \le 2\varepsilon_2 I $$

for all $t$, so that existence of the inverse in (7) is clear. To show that (7) yields an error
state equation (5) that is uniformly exponentially stable with rate $\alpha$, we will show that
the gain

$$ -H^{T}(-t) = -C(-t)\big[ \Phi^{T}(-t - \delta, -t)\, M_{\alpha}(-t - \delta, -t)\, \Phi(-t - \delta, -t) \big]^{-1} $$

renders the linear state equation

$$ \dot{f}(t) = \big\{ A^{T}(-t) + C^{T}(-t)[\, -H^{T}(-t) \,] \big\} f(t) \tag{8} $$

uniformly exponentially stable with rate $\alpha$. That this suffices follows easily from the
relation between the transition matrices associated to (5) and (8), namely the identity
$\Phi_e(t, \tau) = \Phi_f^{T}(-\tau, -t)$ established in Exercise 4.23. For if

$$ \| \Phi_f(t, \tau) \| \le \gamma e^{-\alpha(t - \tau)} $$

for all $t, \tau$ with $t \ge \tau$, then

$$ \| \Phi_e(t, \tau) \| = \| \Phi_f^{T}(-\tau, -t) \| = \| \Phi_f(-\tau, -t) \| \le \gamma e^{-\alpha(t - \tau)} $$

for all $t, \tau$ with $t \ge \tau$. The beauty of this approach is that selection of
$-H^{T}(-t)$ to render (8) uniformly exponentially stable with rate $\alpha$ is precisely the
state-feedback stabilization problem solved in Theorem 14.7. All that remains is to
complete the notation conversion so that (7) can be verified.

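Writing $\hat{A}(t) = A^{T}(-t)$ and $\hat{B}(t) = C^{T}(-t)$ to minimize confusion, consider the
linear state equation

$$ \dot{z}(t) = \hat{A}(t) z(t) + \hat{B}(t) \hat{u}(t) \tag{9} $$

Denoting the transition matrix for $\hat{A}(t)$ by $\hat{\Phi}(t, \tau)$, the controllability
Gramian for (9) is given by

$$ \hat{W}(t_0, t_f) = \int_{t_0}^{t_f} \hat{\Phi}(t_0, \sigma)\, \hat{B}(\sigma) \hat{B}^{T}(\sigma)\, \hat{\Phi}^{T}(t_0, \sigma)\, d\sigma $$

This expression can be used to evaluate $\hat{W}(-t, -t + \delta)$, and then changing the
integration variable to $\tau = -\sigma$ gives

$$ \hat{W}(-t, -t + \delta) = \int_{t - \delta}^{t} \Phi^{T}(\tau, t)\, C^{T}(\tau) C(\tau)\, \Phi(\tau, t)\, d\tau = \Phi^{T}(t - \delta, t)\, M(t - \delta, t)\, \Phi(t - \delta, t) $$

Therefore (6) implies, since $t$ can be replaced by $-t$ in that inequality,

$$ \varepsilon_1 I \le \hat{W}(t, t + \delta) \le \varepsilon_2 I $$

for all $t$. That is, the controllability Gramian for (9) satisfies the requisite condition for
application of Theorem 14.7. Letting

$$ \hat{W}_{\alpha}(t_0, t_f) = \int_{t_0}^{t_f} 2 e^{4\alpha(t_0 - \sigma)}\, \hat{\Phi}(t_0, \sigma)\, \hat{B}(\sigma) \hat{B}^{T}(\sigma)\, \hat{\Phi}^{T}(t_0, \sigma)\, d\sigma \tag{10} $$

we need to check that

$$ \hat{W}_{\alpha}(-t, -t + \delta) = \Phi^{T}(t - \delta, t)\, M_{\alpha}(t - \delta, t)\, \Phi(t - \delta, t) \tag{11} $$

For then

$$ -H^{T}(-t) = -\hat{B}^{T}(t)\, \hat{W}_{\alpha}^{-1}(t, t + \delta) $$

renders (9), and hence (8), uniformly exponentially stable with rate $\alpha$, and this gain
corresponds to $H(t)$ given in (7).
The verification of (11) proceeds as in our previous calculation of
$\hat{W}(-t, -t + \delta)$. From (10),

$$ \hat{W}_{\alpha}(-t, -t + \delta) = \int_{-t}^{-t + \delta} 2 e^{4\alpha(-t - \sigma)}\, \hat{\Phi}(-t, \sigma)\, \hat{B}(\sigma) \hat{B}^{T}(\sigma)\, \hat{\Phi}^{T}(-t, \sigma)\, d\sigma $$
$$ \qquad = \Phi^{T}(t - \delta, t) \left[ \int_{t - \delta}^{t} 2 e^{4\alpha(\tau - t)}\, \Phi^{T}(\tau, t - \delta)\, C^{T}(\tau) C(\tau)\, \Phi(\tau, t - \delta)\, d\tau \right] \Phi(t - \delta, t) $$

and this is readily recognized as (11).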

Output Feedback Stabilization


An important application of state observation arises in the context of linear feedback

when not all the state variables are available, or measured, so that the choice of state
feedback gain is restricted to have certain columns zero. This situation can be illustrated
in terms of the stabilization problem for (I) when stability cannot be achieved by static
output feedback. First we demonstrate that this predicament can arise, and then a
general remedy is developed that involves dynamic output feedback.
15.3 Example  The unstable, time-invariant linear state equation

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t)$$
$$y(t) = \begin{bmatrix} 0 & 1 \end{bmatrix} x(t)$$

with static linear output feedback

$$u(t) = Ly(t)$$

yields the closed-loop state equation

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ 1 & L \end{bmatrix} x(t)$$

The closed-loop characteristic polynomial is $\lambda^2 - L\lambda - 1$. Since the product of roots is $-1$ for every choice of $L$, the closed-loop state equation is not exponentially stable for any value of $L$. This limitation is not due to a failure of controllability or observability, but is a consequence of the unavailability of $x_1(t)$ for use in feedback. Indeed state feedback, involving both $x_1(t)$ and $x_2(t)$, can be used to arbitrarily assign eigenvalues.
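The claim of Example 15.3 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `closed_loop_eigs` and `worst_real_part` are hypothetical and not from the text. It forms the closed-loop matrix A + BLC for several gains L and shows that the eigenvalue product stays at -1, so one eigenvalue always has positive real part.

```python
import numpy as np

# Plant of Example 15.3; static output feedback u = L*y gives A + B*L*C.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[0.0, 1.0]])

def closed_loop_eigs(L):
    """Eigenvalues of A + B*L*C for a scalar output-feedback gain L."""
    return np.linalg.eigvals(A + L * (B @ C))

def worst_real_part(L):
    """Largest real part among the closed-loop eigenvalues."""
    return float(np.max(closed_loop_eigs(L).real))

if __name__ == "__main__":
    for L in (-100.0, -10.0, -1.0, 0.0, 1.0, 10.0):
        eigs = closed_loop_eigs(L)
        print(f"L = {L:7.1f}  max Re = {worst_real_part(L):.4f}  "
              f"product = {np.prod(eigs).real:+.4f}")
```

Making L very negative pushes one eigenvalue far into the left half-plane, but the other approaches zero from the right and never crosses.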
A natural intuition is to generate an estimate of the plant state, and then stabilize
by feedback of the estimated state. This notion can be implemented using an observer
with linear feedback of the state estimate, which leads to linear dynamic output feedback

$$\dot{\hat{x}}(t) = A(t)\hat{x}(t) + B(t)u(t) + H(t)[\,y(t) - C(t)\hat{x}(t)\,]$$
$$u(t) = K(t)\hat{x}(t) + N(t)r(t) \tag{12}$$


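The overall closed-loop system, shown in Figure 15.4, can be written as a partitioned 2n-dimension linear state equation,

$$\begin{bmatrix} \dot{x}(t) \\ \dot{\hat{x}}(t) \end{bmatrix} = \begin{bmatrix} A(t) & B(t)K(t) \\ H(t)C(t) & A(t) - H(t)C(t) + B(t)K(t) \end{bmatrix} \begin{bmatrix} x(t) \\ \hat{x}(t) \end{bmatrix} + \begin{bmatrix} B(t)N(t) \\ B(t)N(t) \end{bmatrix} r(t)$$
$$y(t) = \begin{bmatrix} C(t) & 0_{p \times n} \end{bmatrix} \begin{bmatrix} x(t) \\ \hat{x}(t) \end{bmatrix} \tag{13}$$

15.4 Figure  Observer-based dynamic output feedback.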

The problem is to choose the feedback gain K(t), now applied to the state estimate, and the observer gain H(t) to achieve uniform exponential stability of the zero-input response of (13). (Again the gain N(t) plays no role in internal stabilization.)

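15.5 Theorem  Suppose for the linear state equation (1) there exist positive constants $\delta$, $\varepsilon_1$, $\varepsilon_2$, $\beta_1$, $\beta_2$ such that

$$\varepsilon_1 I \le W(t,t+\delta) \le \varepsilon_2 I$$
$$\varepsilon_1 I \le \Phi^T(t-\delta,t)M(t-\delta,t)\Phi(t-\delta,t) \le \varepsilon_2 I$$

for all $t$, and

$$\int_\tau^t \|B(\sigma)\|^2\,d\sigma \le \beta_1 + \beta_2(t-\tau)$$

for all $t$, $\tau$ with $t \ge \tau$. Then given $\alpha > 0$, for any $\eta > 0$ the feedback and observer gains

$$K(t) = -B^T(t)W_{\alpha+\eta}^{-1}(t,t+\delta)$$
$$H(t) = \left[\Phi^T(t-\delta,t)M_{\alpha+\eta}(t-\delta,t)\Phi(t-\delta,t)\right]^{-1}C^T(t) \tag{14}$$

are such that the closed-loop state equation (13) is uniformly exponentially stable with rate $\alpha$.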

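Proof  In considering uniform exponential stability for (13), r(t) can be set to zero. We first apply the state variable change (using suggestive notation)

$$\begin{bmatrix} x(t) \\ e(t) \end{bmatrix} = \begin{bmatrix} I_n & 0_n \\ I_n & -I_n \end{bmatrix} \begin{bmatrix} x(t) \\ \hat{x}(t) \end{bmatrix} \tag{15}$$

This is a Lyapunov transformation, and (13) is uniformly exponentially stable with rate $\alpha$ if and only if the state equation in the new state variables,

$$\begin{bmatrix} \dot{x}(t) \\ \dot{e}(t) \end{bmatrix} = \begin{bmatrix} A(t)+B(t)K(t) & -B(t)K(t) \\ 0_n & A(t)-H(t)C(t) \end{bmatrix} \begin{bmatrix} x(t) \\ e(t) \end{bmatrix} \tag{16}$$

is uniformly exponentially stable with rate $\alpha$. Let $\Phi(t,\tau)$ denote the transition matrix corresponding to (16), and let $\Phi_a(t,\tau)$ and $\Phi_e(t,\tau)$ denote the n x n transition matrices for $A(t)+B(t)K(t)$ and $A(t)-H(t)C(t)$, respectively. Then from Exercise 4.13, or by easy verification,

$$\Phi(t,\tau) = \begin{bmatrix} \Phi_a(t,\tau) & -\displaystyle\int_\tau^t \Phi_a(t,\sigma)B(\sigma)K(\sigma)\Phi_e(\sigma,\tau)\,d\sigma \\ 0_n & \Phi_e(t,\tau) \end{bmatrix}$$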

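Writing $\Phi(t,\tau)$ as a sum of three matrices, each with one nonzero partition, the triangle inequality and Exercise 1.8 provide the inequality

$$\|\Phi(t,\tau)\| \le \|\Phi_a(t,\tau)\| + \|\Phi_e(t,\tau)\| + \left\| \int_\tau^t \Phi_a(t,\sigma)B(\sigma)K(\sigma)\Phi_e(\sigma,\tau)\,d\sigma \right\| \tag{17}$$

Now given $\alpha > 0$ and any (presumably small) $\eta > 0$, the feedback and observer gains in (14) are such that there is a constant $\gamma$ for which

$$\|\Phi_a(t,\tau)\|,\ \|\Phi_e(t,\tau)\| \le \gamma e^{-(\alpha+\eta)(t-\tau)}$$

for all $t$, $\tau$ with $t \ge \tau$. (Theorems 14.7 and 15.2.) Then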

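$$\left\| \int_\tau^t \Phi_a(t,\sigma)B(\sigma)K(\sigma)\Phi_e(\sigma,\tau)\,d\sigma \right\| \le \gamma^2 e^{-(\alpha+\eta)(t-\tau)} \int_\tau^t \|B(\sigma)\|\,\|K(\sigma)\|\,d\sigma$$

Using an inequality established in the proof of Theorem 14.7,

$$\|K(\sigma)\| \le \|B^T(\sigma)\|\,\|W_{\alpha+\eta}^{-1}(\sigma,\sigma+\delta)\| \le \frac{e^{4(\alpha+\eta)\delta}}{2\varepsilon_1}\,\|B(\sigma)\|$$

Thus for all $t$, $\tau$ with $t \ge \tau$,

$$\left\| \int_\tau^t \Phi_a(t,\sigma)B(\sigma)K(\sigma)\Phi_e(\sigma,\tau)\,d\sigma \right\| \le \frac{\gamma^2 e^{4(\alpha+\eta)\delta}}{2\varepsilon_1}\, e^{-(\alpha+\eta)(t-\tau)} \int_\tau^t \|B(\sigma)\|^2\,d\sigma \le \frac{\gamma^2 e^{4(\alpha+\eta)\delta}}{2\varepsilon_1}\, e^{-(\alpha+\eta)(t-\tau)}\,[\,\beta_1 + \beta_2(t-\tau)\,] \tag{18}$$

Using the elementary bound (see Exercise 6.10)

$$t e^{-\eta t} \le \frac{1}{\eta e}, \quad t \ge 0$$

in (18) gives, for (17),

$$\|\Phi(t,\tau)\| \le \left[ 2\gamma + \frac{\gamma^2 e^{4(\alpha+\eta)\delta}}{2\varepsilon_1}\left(\beta_1 + \frac{\beta_2}{\eta e}\right) \right] e^{-\alpha(t-\tau)}$$

for all $t$, $\tau$ with $t \ge \tau$, and the proof is complete.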

Reduced-Dimension Observers
The discussion of state observers so far has ignored information about the state of the plant that is provided directly by the plant output signal. For example if output components are state components (each row of C(t) has a single unity entry), why estimate what is available? We should be able to make use of output information, and construct an observer only for states that are not directly known from the output.
Assuming the linear state equation (1) is such that C(t) is continuously differentiable, and rank C(t) = p at every t, a state variable change can be employed that leads to the development of a reduced-dimension observer that has dimension n - p. Let

$$P^{-1}(t) = \begin{bmatrix} C(t) \\ P_b(t) \end{bmatrix} \tag{19}$$

where $P_b(t)$ is an (n - p) x n matrix that is arbitrary, subject to the requirements that P(t) indeed is invertible at each t and continuously differentiable. Then letting $z(t) = P^{-1}(t)x(t)$ the state equation in the new state variables can be written in the partitioned form

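$$\begin{bmatrix} \dot{z}_a(t) \\ \dot{z}_b(t) \end{bmatrix} = \begin{bmatrix} F_{11}(t) & F_{12}(t) \\ F_{21}(t) & F_{22}(t) \end{bmatrix} \begin{bmatrix} z_a(t) \\ z_b(t) \end{bmatrix} + \begin{bmatrix} G_1(t) \\ G_2(t) \end{bmatrix} u(t), \quad \begin{bmatrix} z_a(t_o) \\ z_b(t_o) \end{bmatrix} = P^{-1}(t_o)x_o$$
$$y(t) = \begin{bmatrix} I_p & 0_{p \times (n-p)} \end{bmatrix} \begin{bmatrix} z_a(t) \\ z_b(t) \end{bmatrix} \tag{20}$$

Here $F_{11}(t)$ is p x p, $G_1(t)$ is p x m, $z_a(t)$ is p x 1, and the remaining partitions have corresponding dimensions. Obviously $z_a(t) = y(t)$, and the following argument shows how to obtain the asymptotic estimate of $z_b(t)$ needed to obtain an asymptotic estimate of x(t).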
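Suppose for a moment that we have computed an (n - p)-dimensional observer for $z_b(t)$ that has the form, slightly different from the full-dimension case,

$$\dot{z}_c(t) = F(t)z_c(t) + G_a(t)u(t) + G_b(t)z_a(t)$$
$$\hat{z}_b(t) = z_c(t) + H(t)z_a(t) \tag{21}$$

(Default continuity hypotheses are in effect, though it turns out that we need H(t) to be continuously differentiable.) That is, for known u(t), but regardless of the initial values $z_b(t_o)$, $z_c(t_o)$, $z_a(t_o)$ and the resulting $z_a(t)$ from (20), the solutions of (20) and (21) are such that

$$\lim_{t \to \infty} [\,z_b(t) - \hat{z}_b(t)\,] = 0$$

Then an asymptotic estimate for the state vector in (20), the first p components of which are perfect estimates, can be written in the form

$$\begin{bmatrix} z_a(t) \\ \hat{z}_b(t) \end{bmatrix} = \begin{bmatrix} I_p & 0_{p \times (n-p)} \\ H(t) & I_{n-p} \end{bmatrix} \begin{bmatrix} y(t) \\ z_c(t) \end{bmatrix}$$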

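Adopting this variable-change setup, we examine the problem of computing an (n - p)-dimensional observer of the form (21) for an n-dimensional state equation in the special form (20). Of course the focus in this problem is on the (n - p) x 1 error signal

$$e_b(t) = z_b(t) - \hat{z}_b(t)$$

that satisfies the error state equation

$$\begin{aligned}
\dot{e}_b(t) &= \dot{z}_b(t) - \dot{\hat{z}}_b(t) \\
&= \dot{z}_b(t) - \dot{z}_c(t) - \dot{H}(t)z_a(t) - H(t)\dot{z}_a(t) \\
&= F_{21}(t)z_a(t) + F_{22}(t)z_b(t) + G_2(t)u(t) - F(t)z_c(t) - G_a(t)u(t) - G_b(t)z_a(t) \\
&\quad - \dot{H}(t)z_a(t) - H(t)F_{11}(t)z_a(t) - H(t)F_{12}(t)z_b(t) - H(t)G_1(t)u(t)
\end{aligned}$$

Using (21) to substitute for $z_c(t)$, and rearranging, gives

$$\begin{aligned}
\dot{e}_b(t) &= F(t)e_b(t) + [\,F_{22}(t) - H(t)F_{12}(t) - F(t)\,]\,z_b(t) \\
&\quad + [\,F_{21}(t) + F(t)H(t) - G_b(t) - H(t)F_{11}(t) - \dot{H}(t)\,]\,z_a(t) \\
&\quad + [\,G_2(t) - G_a(t) - H(t)G_1(t)\,]\,u(t), \qquad e_b(t_o) = z_b(t_o) - \hat{z}_b(t_o)
\end{aligned}$$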

Again a reasonable requirement on the observer is that, regardless of u(t), $z_a(t_o)$, and the resulting $z_a(t)$, the lucky occurrence $\hat{z}_b(t_o) = z_b(t_o)$ should yield $e_b(t) = 0$ for all $t \ge t_o$. This objective is attained by making the coefficient choices

$$F(t) = F_{22}(t) - H(t)F_{12}(t)$$
$$G_b(t) = F_{21}(t) + F(t)H(t) - H(t)F_{11}(t) - \dot{H}(t)$$
$$G_a(t) = G_2(t) - H(t)G_1(t) \tag{22}$$

with the resulting (n - p) x 1 error state equation

$$\dot{e}_b(t) = [\,F_{22}(t) - H(t)F_{12}(t)\,]\,e_b(t), \quad e_b(t_o) = z_b(t_o) - \hat{z}_b(t_o) \tag{23}$$

To complete the specification of the reduced-dimension observer in (21), we consider conditions under which a continuously-differentiable, (n - p) x p gain H(t) can be chosen to yield uniform exponential stability at any desired rate for (23). These conditions are supplied by Theorem 15.2, where A(t) and C(t) are interpreted as $F_{22}(t)$ and $F_{12}(t)$, respectively, and the associated transition matrix and observability Gramian are correspondingly adjusted. In terms of the original state vector in (1), the estimate for z(t) leads to an asymptotic estimate for x(t) via

$$\hat{x}(t) = P(t) \begin{bmatrix} I_p & 0_{p \times (n-p)} \\ H(t) & I_{n-p} \end{bmatrix} \begin{bmatrix} y(t) \\ z_c(t) \end{bmatrix} \tag{24}$$

Then the n x 1 estimate error $e(t) = x(t) - \hat{x}(t)$ is given by

$$e(t) = P(t)[\,z(t) - \hat{z}(t)\,] = P(t) \begin{bmatrix} 0_{p \times 1} \\ e_b(t) \end{bmatrix}$$

Therefore if (23) is uniformly exponentially stable with a given rate and P(t) is bounded, then $\|e(t)\|$ decays exponentially with the same rate.
Statement of a summary theorem is left to the interested reader, with a reminder that the assumptions on C(t) used in (19) must be recalled, boundedness of P(t) is required, and the continuous differentiability of H(t) must be checked. Collecting the hypotheses for a summary statement makes obvious an unsatisfying aspect of our treatment of reduced-dimension observers: Delicate hypotheses are required both on the new-variable state equation (20) and on the original state equation (1). However this situation can be neatly rectified in the time-invariant case, where tools are available to express all assumptions in terms of the original state equation.

Time-Invariant Case
When specialized to the case of a time-invariant linear state equation,

$$\dot{x}(t) = Ax(t) + Bu(t), \quad x(0) = x_o$$
$$y(t) = Cx(t) \tag{25}$$

the full-dimension state observation problem can be connected to the state feedback stabilization problem in a much simpler fashion than in the proof of Theorem 15.2. The form of the observer is, from (4),

$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + H[\,y(t) - \hat{y}(t)\,], \quad \hat{x}(0) = \hat{x}_o$$
$$\hat{y}(t) = C\hat{x}(t) \tag{26}$$

and the corresponding error state equation is

$$\dot{e}(t) = (A - HC)e(t), \quad e(0) = x_o - \hat{x}_o$$

Now the problems of choosing H so that this error equation is exponentially stable with prescribed rate, or so that A - HC has a prescribed characteristic polynomial, can be recast in a form familiar from Chapter 14. Let

$$\hat{A} = A^T, \quad \hat{B} = C^T, \quad \hat{K} = -H^T$$

Then the characteristic polynomial of A - HC is identical to the characteristic polynomial of

$$(A - HC)^T = \hat{A} + \hat{B}\hat{K}$$

Also observability of (25) is equivalent to the controllability assumption needed to apply either Theorem 14.8 on stabilization or Theorem 14.9 on eigenvalue assignment. Alternatively observer form in Chapter 13 can be used to prove that if rank C = p and (25) is observable, then H can be chosen to obtain any desired characteristic polynomial for the observer error state equation in (26). (See Exercise 15.5.)
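The duality between observer design and state feedback can be sketched in code. The example below is an illustration only: it swaps in Ackermann's formula, which is not the method of Theorem 14.9 in the text, and it handles only the single-output case. The function names `ackermann_gain` and `observer_gain` are hypothetical. The observer gain is obtained by designing a state feedback gain for the pair (A^T, C^T) and transposing with a sign change.

```python
import numpy as np

def ackermann_gain(A, B, desired_coeffs):
    """Gain K such that A + B K (the book's sign convention) has the monic
    characteristic polynomial with coefficients `desired_coeffs`
    (highest power first). Single-input case only."""
    n = A.shape[0]
    # Controllability matrix [B, AB, ..., A^(n-1) B]
    ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
    # Evaluate the desired polynomial at A by Horner's rule
    pA = np.zeros((n, n))
    for c in desired_coeffs:
        pA = pA @ A + c * np.eye(n)
    last_row = np.zeros((1, n))
    last_row[0, -1] = 1.0
    return -last_row @ np.linalg.solve(ctrb, pA)

def observer_gain(A, C, desired_coeffs):
    """Observer gain H with A - H C having the desired characteristic
    polynomial, computed through A_hat = A^T, B_hat = C^T, K_hat = -H^T."""
    K_hat = ackermann_gain(A.T, C.T, desired_coeffs)
    return -K_hat.T

# Plant of Example 15.3, with both error eigenvalues requested at -5
A = np.array([[0.0, 1.0], [1.0, 0.0]])
C = np.array([[0.0, 1.0]])
H = observer_gain(A, C, [1.0, 10.0, 25.0])

if __name__ == "__main__":
    print("H =", H.ravel())
    print("char. poly of A - HC:", np.poly(A - H @ C))
```

For larger or multi-output problems a numerically robust pole-placement routine would be a better choice than this formula, which is used here only because it is short.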
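Specialization of Theorem 15.5 on output feedback stabilization to the time-invariant case can be described in terms of eigenvalue assignment. Time-invariant linear feedback of the estimated state yields a 2n-dimension closed-loop state equation that follows directly from (13):

$$\begin{bmatrix} \dot{x}(t) \\ \dot{\hat{x}}(t) \end{bmatrix} = \begin{bmatrix} A & BK \\ HC & A - HC + BK \end{bmatrix} \begin{bmatrix} x(t) \\ \hat{x}(t) \end{bmatrix} + \begin{bmatrix} BN \\ BN \end{bmatrix} r(t)$$
$$y(t) = \begin{bmatrix} C & 0_{p \times n} \end{bmatrix} \begin{bmatrix} x(t) \\ \hat{x}(t) \end{bmatrix} \tag{27}$$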

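The state variable change (15) shows that the characteristic polynomial for (27) is precisely the same as the characteristic polynomial for the linear state equation

$$\begin{bmatrix} \dot{x}(t) \\ \dot{e}(t) \end{bmatrix} = \begin{bmatrix} A + BK & -BK \\ 0_n & A - HC \end{bmatrix} \begin{bmatrix} x(t) \\ e(t) \end{bmatrix} + \begin{bmatrix} BN \\ 0 \end{bmatrix} r(t)$$
$$y(t) = \begin{bmatrix} C & 0_{p \times n} \end{bmatrix} \begin{bmatrix} x(t) \\ e(t) \end{bmatrix} \tag{28}$$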

Taking advantage of block triangular structure, the characteristic polynomial is

$$\det(\lambda I - A - BK)\,\det(\lambda I - A + HC)$$

By this calculation we have uncovered a remarkable eigenvalue separation property. The 2n eigenvalues of the closed-loop state equation (27) are given by the n eigenvalues of the observer and the n eigenvalues that would be obtained by linear state feedback (instead of linear estimated-state feedback). Of course if (25) is controllable and observable, then K and H can be chosen such that the characteristic polynomial for (27) is any specified monic, degree-2n polynomial.
Another property of the closed-loop state equation that is equally remarkable concerns input-output behavior. The transfer function for (27) is identical to the transfer function for (28), and a quick calculation, again making use of the block-triangular structure in (28), shows that this transfer function is

$$G(s) = C(sI - A - BK)^{-1}BN$$

That is, linear estimated-state feedback leads to the same input-output (zero-state) behavior as does linear state feedback.
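The "quick calculation" can be written out as follows. This is a sketch of one way to carry it out, using only the standard formula for the inverse of a block upper-triangular matrix:

```latex
\begin{aligned}
G(s) &= \begin{bmatrix} C & 0 \end{bmatrix}
\begin{bmatrix} sI - A - BK & BK \\ 0 & sI - A + HC \end{bmatrix}^{-1}
\begin{bmatrix} BN \\ 0 \end{bmatrix} \\[4pt]
&= \begin{bmatrix} C & 0 \end{bmatrix}
\begin{bmatrix} (sI - A - BK)^{-1} & -(sI - A - BK)^{-1} BK\,(sI - A + HC)^{-1} \\ 0 & (sI - A + HC)^{-1} \end{bmatrix}
\begin{bmatrix} BN \\ 0 \end{bmatrix} \\[4pt]
&= C\,(sI - A - BK)^{-1} BN
\end{aligned}
```

The off-diagonal block of the inverse, which carries the observer-error dynamics, is multiplied by the zero partition of the input matrix, so it never reaches the output.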

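15.6 Example  For the controllable and observable linear state equation encountered in Example 15.3,

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t)$$
$$y(t) = \begin{bmatrix} 0 & 1 \end{bmatrix} x(t)$$

the full-dimension observer (26) has the form

$$\dot{\hat{x}}(t) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \hat{x}(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t) + \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} [\,y(t) - \hat{y}(t)\,]$$
$$\hat{y}(t) = \begin{bmatrix} 0 & 1 \end{bmatrix} \hat{x}(t) \tag{29}$$

The resulting estimate-error equation is

$$\dot{e}(t) = \begin{bmatrix} 0 & 1 - h_1 \\ 1 & -h_2 \end{bmatrix} e(t)$$

By setting $h_1 = 26$, $h_2 = 10$, to place both eigenvalues at $-5$, we obtain exponential stability of the error equation. Then the observer becomes

$$\dot{\hat{x}}(t) = \begin{bmatrix} 0 & -25 \\ 1 & -10 \end{bmatrix} \hat{x}(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t) + \begin{bmatrix} 26 \\ 10 \end{bmatrix} y(t)$$

With the goal of achieving closed-loop exponential stability, consider estimated-state feedback of the form

$$u(t) = K\hat{x}(t) + r(t)$$

where r(t) is the scalar reference input signal. Choosing $K = [\,k_1 \;\; k_2\,]$ to place both eigenvalues of

$$A + BK = \begin{bmatrix} 0 & 1 \\ 1 + k_1 & k_2 \end{bmatrix}$$

at $-1$ leads to $K = [\,-2 \;\; -2\,]$. Then substituting into the plant and observer state equations we obtain the closed-loop description

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 & 0 \\ -2 & -2 \end{bmatrix} \hat{x}(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} r(t)$$
$$\dot{\hat{x}}(t) = \begin{bmatrix} 0 & -25 \\ -1 & -12 \end{bmatrix} \hat{x}(t) + \begin{bmatrix} 26 \\ 10 \end{bmatrix} y(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} r(t)$$
$$y(t) = \begin{bmatrix} 0 & 1 \end{bmatrix} x(t) \tag{30}$$

This can be rewritten in the form (27) as the 4-dimensional linear state equation

$$\begin{bmatrix} \dot{x}(t) \\ \dot{\hat{x}}(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & -2 & -2 \\ 0 & 26 & 0 & -25 \\ 0 & 10 & -1 & -12 \end{bmatrix} \begin{bmatrix} x(t) \\ \hat{x}(t) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} r(t)$$
$$y(t) = \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ \hat{x}(t) \end{bmatrix} \tag{31}$$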

Familiar calculations verify that (31) has two eigenvalues at $-1$ and two eigenvalues at $-5$. Thus exponential stability, which cannot be attained by static output feedback, is achieved by dynamic output feedback. Furthermore the closed-loop eigenvalues comprise those eigenvalues contributed by the observer-error state equation, and those relocated by the state feedback gain as if the observer were not present. Finally the transfer function for (31) is calculated as

$$\begin{aligned}
G(s) &= \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} s & -1 & 0 & 0 \\ -1 & s & 2 & 2 \\ 0 & -26 & s & 25 \\ 0 & -10 & 1 & s+12 \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} \\
&= \frac{s^3 + 10s^2 + 25s}{s^4 + 12s^3 + 46s^2 + 60s + 25} = \frac{s(s+5)^2}{(s+1)^2(s+5)^2} = \frac{s}{(s+1)^2}
\end{aligned}$$

Note that the observer-error eigenvalues do not appear as poles of the closed-loop transfer function.
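The numbers in Example 15.6 can be reproduced with a few lines of NumPy. This is a sketch under the gains stated in the example (H = [26 10]^T, K = [-2 -2], N = 1); the helper `transfer` is a hypothetical name that evaluates C(sI - A)^{-1}B at a single complex frequency.

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[0.0, 1.0]])
H = np.array([[26.0], [10.0]])   # observer gain from the example
K = np.array([[-2.0, -2.0]])     # estimated-state feedback gain
N = np.array([[1.0]])

# Closed-loop matrices in the (x, xhat) coordinates of (27)
A_cl = np.block([[A, B @ K], [H @ C, A - H @ C + B @ K]])
B_cl = np.vstack([B @ N, B @ N])
C_cl = np.hstack([C, np.zeros((1, 2))])

def transfer(Amat, Bmat, Cmat, s):
    """Evaluate the scalar transfer function Cmat (sI - Amat)^{-1} Bmat at s."""
    n = Amat.shape[0]
    return (Cmat @ np.linalg.solve(s * np.eye(n) - Amat, Bmat))[0, 0]

# Characteristic polynomial of (31) and the product predicted by separation
char_poly = np.poly(A_cl)
sep_poly = np.polymul(np.poly(A + B @ K), np.poly(A - H @ C))

if __name__ == "__main__":
    print("closed-loop char. poly:", np.round(char_poly, 6))
    print("separation product:    ", np.round(sep_poly, 6))
    s = 1.0 + 2.0j
    print("G(s) from (31):        ", transfer(A_cl, B_cl, C_cl, s))
    print("C (sI-A-BK)^-1 B N:    ", transfer(A + B @ K, B @ N, C, s))
```

Comparing characteristic polynomials rather than raw eigenvalues avoids the numerical sensitivity of repeated eigenvalues.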

Specialization of the treatment of reduced-dimension observers to the time-invariant case also proceeds in a straightforward fashion. We assume rank C = p, and choose $P_b(t)$ in (19) to be constant. Then every time-varying coefficient matrix in (20) becomes a constant matrix. This yields a dimension-(n - p) observer described by the state equation

$$\dot{z}_c(t) = (F_{22} - HF_{12})z_c(t) + (G_2 - HG_1)u(t) + (F_{21} + F_{22}H - HF_{11} - HF_{12}H)z_a(t)$$
$$\hat{z}_b(t) = z_c(t) + Hz_a(t) \tag{32}$$

typically with the initial condition $z_c(0) = 0$. The error equation for the estimate of $z_b(t)$ is given by

$$\dot{e}_b(t) = (F_{22} - HF_{12})e_b(t), \quad e_b(0) = z_b(0) - \hat{z}_b(0) \tag{33}$$

For the reduced-dimension observer in (32), we next show that the (n - p) x p gain matrix H can be chosen to yield any desired characteristic polynomial for (33). (The observability criterion in Theorem 13.14 is applied in this proof. An alternate proof based on the observability-matrix rank condition is given in Theorem 29.7.)

15.7 Theorem  Suppose the time-invariant linear state equation (25) is observable and rank C = p. Given any degree-(n - p) monic polynomial $q(\lambda)$ there is a gain H such that the reduced-dimension observer defined by (32) has an error state equation (33) with characteristic polynomial $q(\lambda)$.

Proof  We need to show H can be chosen such that

$$\det(\lambda I - F_{22} + HF_{12}) = q(\lambda)$$

From our discussion of time-invariant observers, this follows upon proving that the observability hypothesis on (25) implies that the (n - p)-dimensional state equation

$$\dot{z}_d(t) = F_{22}z_d(t)$$
$$v(t) = F_{12}z_d(t) \tag{34}$$

is observable. Supposing the contrary, a contradiction is obtained as follows. If (34) is not observable, then by Theorem 13.14 there exists a nonzero (n - p) x 1 vector $l$ and a scalar $\eta$ such that

$$F_{22}\,l = \eta\,l, \quad F_{12}\,l = 0$$

This implies, using the coefficients of (20) (time-invariant case),

$$\begin{bmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{bmatrix} \begin{bmatrix} 0_{p \times 1} \\ l \end{bmatrix} = \begin{bmatrix} F_{12}\,l \\ F_{22}\,l \end{bmatrix} = \eta \begin{bmatrix} 0_{p \times 1} \\ l \end{bmatrix}$$

and, of course,

$$\begin{bmatrix} I_p & 0_{p \times (n-p)} \end{bmatrix} \begin{bmatrix} 0_{p \times 1} \\ l \end{bmatrix} = 0$$

Therefore another application of Theorem 13.14 shows that the linear state equation (20) (time-invariant case) is not observable. But (20) is related to (25) by a state variable change, and thus a contradiction with the observability hypothesis for (25) is obtained.

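15.8 Example  To compute a reduced-dimension observer for the linear state equation in Example 15.6,

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t)$$
$$y(t) = \begin{bmatrix} 0 & 1 \end{bmatrix} x(t) \tag{35}$$

we begin with a state variable change (19) to obtain the special form of C-matrix in (20). Letting

$$P = P^{-1} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

gives

$$\begin{bmatrix} \dot{z}_a(t) \\ \dot{z}_b(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} z_a(t) \\ z_b(t) \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u(t), \quad z(0) = P^{-1}x(0)$$
$$y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} z_a(t) \\ z_b(t) \end{bmatrix}$$

The reduced-dimension observer in (32) becomes the scalar state equation

$$\dot{z}_c(t) = -Hz_c(t) - Hu(t) + (1 - H^2)y(t)$$
$$\hat{z}_b(t) = z_c(t) + Hy(t) \tag{36}$$

For $H = 5$ we obtain an observer for $z_b(t)$ with error equation

$$\dot{e}_b(t) = -5e_b(t)$$

From (32) the observer can be written as

$$\dot{z}_c(t) = -5z_c(t) - 5u(t) - 24y(t)$$
$$\hat{x}(t) = \begin{bmatrix} z_c(t) + 5y(t) \\ y(t) \end{bmatrix}$$

so that $\hat{x}_1(t)$ is an asymptotic estimate of $x_1(t)$, while $\hat{x}_2(t) = y(t)$ provides $x_2(t)$ exactly.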
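The coefficient formulas in (32) and the error decay in Example 15.8 can be checked with a short script. This is a sketch, not the author's procedure: the function names are hypothetical, the input u(t) = sin t and the initial state are arbitrary choices, and a forward-Euler loop stands in for exact integration. For this linear setup the Euler recursion reproduces the error relation exactly, with the error shrinking by the factor (1 - 5h) per step.

```python
import numpy as np

def reduced_observer_coeffs(F, G, H, p):
    """Coefficient matrices of the observer (32) for
    z' = F z + G u, y = [I_p 0] z, with (n-p) x p gain H."""
    F11, F12 = F[:p, :p], F[:p, p:]
    F21, F22 = F[p:, :p], F[p:, p:]
    G1, G2 = G[:p, :], G[p:, :]
    Fo = F22 - H @ F12                          # multiplies z_c
    Ga = G2 - H @ G1                            # multiplies u
    Gb = F21 + F22 @ H - H @ F11 - H @ F12 @ H  # multiplies z_a = y
    return Fo, Ga, Gb

# Plant (35) and the variable change P = P^{-1} of the example
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[0.0, 1.0]])
P = np.array([[0.0, 1.0], [1.0, 0.0]])
F = np.linalg.inv(P) @ A @ P
G = np.linalg.inv(P) @ B
H = np.array([[5.0]])
Fo, Ga, Gb = reduced_observer_coeffs(F, G, H, p=1)

def simulate_estimate_error(h=1e-3, steps=2000):
    """Forward-Euler run of plant and observer; returns |x1 - x1_hat| per step."""
    x = np.array([[1.0], [0.0]])   # true initial state (arbitrary choice)
    zc = np.zeros((1, 1))          # observer initial condition z_c(0) = 0
    errors = []
    for k in range(steps + 1):
        y = C @ x
        x1_hat = zc + H @ y
        errors.append(abs(x[0, 0] - x1_hat[0, 0]))
        u = np.array([[np.sin(k * h)]])
        x_next = x + h * (A @ x + B @ u)
        zc = zc + h * (Fo @ zc + Ga @ u + Gb @ y)
        x = x_next
    return np.array(errors)

if __name__ == "__main__":
    print("F, Ga, Gb =", Fo.item(), Ga.item(), Gb.item())
    err = simulate_estimate_error()
    print("initial error:", err[0], " error at t = 2:", err[-1])
```

The plant itself is unstable, so x(t) grows during the run, yet the estimate error still decays because the error equation (33) does not depend on x(t) or u(t).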

Servomechanism Problem

As another illustration of state observation and estimated-state feedback, we consider a time-invariant plant affected by disturbances and pose multiple objectives for the closed-loop state equation. Specifically consider a plant of the form

$$\dot{x}(t) = Ax(t) + Bu(t) + Ew(t), \quad x(0) = x_o$$
$$y(t) = Cx(t) + Fw(t) \tag{37}$$

We assume that w(t) is a q x 1 disturbance signal that is unavailable for use in feedback, and for simplicity we assume p = m. Using output feedback the objectives for the closed-loop state equation are that the output signal should track any constant reference input with asymptotically-zero error in the face of unknown constant disturbance signals, and that the coefficients of the characteristic polynomial should be arbitrarily assignable. This type of problem often is called a servomechanism problem.

The basic idea in addressing this problem is to use an observer to generate asymptotic estimates of both the plant state and the constant disturbance. As in earlier observer constructions, it is not apparent at the outset how to do this, but writing the plant (37) together with the constant disturbance w(t) in the form of an 'augmented' plant provides the key. Namely we describe constant disturbance signals by the 'exogenous' linear state equation $\dot{w}(t) = 0$, with unknown w(0), to write

$$\begin{bmatrix} \dot{x}(t) \\ \dot{w}(t) \end{bmatrix} = \begin{bmatrix} A & E \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ w(t) \end{bmatrix} + \begin{bmatrix} B \\ 0 \end{bmatrix} u(t)$$
$$y(t) = \begin{bmatrix} C & F \end{bmatrix} \begin{bmatrix} x(t) \\ w(t) \end{bmatrix} \tag{38}$$

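Then the observer structure in (26) can be applied to this (n + q)-dimensional linear state equation. With the observer gain partitioned appropriately, the resulting observer state equation is

$$\begin{bmatrix} \dot{\hat{x}}(t) \\ \dot{\hat{w}}(t) \end{bmatrix} = \begin{bmatrix} A & E \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \hat{x}(t) \\ \hat{w}(t) \end{bmatrix} + \begin{bmatrix} B \\ 0 \end{bmatrix} u(t) + \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} [\,y(t) - \hat{y}(t)\,]$$
$$\hat{y}(t) = \begin{bmatrix} C & F \end{bmatrix} \begin{bmatrix} \hat{x}(t) \\ \hat{w}(t) \end{bmatrix} \tag{39}$$

Since

$$\begin{bmatrix} A & E \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \begin{bmatrix} C & F \end{bmatrix} = \begin{bmatrix} A - H_1C & E - H_1F \\ -H_2C & -H_2F \end{bmatrix}$$

the error equation, in the obvious notation, is

$$\begin{bmatrix} \dot{e}_x(t) \\ \dot{e}_w(t) \end{bmatrix} = \begin{bmatrix} A - H_1C & E - H_1F \\ -H_2C & -H_2F \end{bmatrix} \begin{bmatrix} e_x(t) \\ e_w(t) \end{bmatrix} \tag{40}$$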

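However, rather than separately consider this error equation, and feedback of the augmented-state estimate to the input of the augmented plant (38), we can simplify matters by directly analyzing the closed-loop state equation with w(t) treated again as a disturbance.
Consider linear feedback of the form

$$u(t) = K_1\hat{x}(t) + K_2\hat{w}(t) + Nr(t) \tag{41}$$

The corresponding closed-loop state equation can be written as

$$\begin{bmatrix} \dot{x}(t) \\ \dot{\hat{x}}(t) \\ \dot{\hat{w}}(t) \end{bmatrix} = \begin{bmatrix} A & BK_1 & BK_2 \\ H_1C & A + BK_1 - H_1C & E + BK_2 - H_1F \\ H_2C & -H_2C & -H_2F \end{bmatrix} \begin{bmatrix} x(t) \\ \hat{x}(t) \\ \hat{w}(t) \end{bmatrix} + \begin{bmatrix} BN \\ BN \\ 0 \end{bmatrix} r(t) + \begin{bmatrix} E \\ H_1F \\ H_2F \end{bmatrix} w(t)$$
$$y(t) = \begin{bmatrix} C & 0 & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ \hat{x}(t) \\ \hat{w}(t) \end{bmatrix} + Fw(t) \tag{42}$$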

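It is convenient to use the state-estimate error variable and change the sign of the disturbance estimate to simplify the analysis of this complicated linear state equation. With the state variable change

$$\begin{bmatrix} x(t) \\ e_x(t) \\ -\hat{w}(t) \end{bmatrix} = \begin{bmatrix} I_n & 0_n & 0_{n \times q} \\ I_n & -I_n & 0_{n \times q} \\ 0_{q \times n} & 0_{q \times n} & -I_q \end{bmatrix} \begin{bmatrix} x(t) \\ \hat{x}(t) \\ \hat{w}(t) \end{bmatrix}$$

the closed-loop state equation becomes

$$\begin{bmatrix} \dot{x}(t) \\ \dot{e}_x(t) \\ -\dot{\hat{w}}(t) \end{bmatrix} = \begin{bmatrix} A + BK_1 & -BK_1 & -BK_2 \\ 0 & A - H_1C & E - H_1F \\ 0 & -H_2C & -H_2F \end{bmatrix} \begin{bmatrix} x(t) \\ e_x(t) \\ -\hat{w}(t) \end{bmatrix} + \begin{bmatrix} BN \\ 0 \\ 0 \end{bmatrix} r(t) + \begin{bmatrix} E \\ E - H_1F \\ -H_2F \end{bmatrix} w(t)$$
$$y(t) = \begin{bmatrix} C & 0 & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ e_x(t) \\ -\hat{w}(t) \end{bmatrix} + Fw(t) \tag{43}$$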

The characteristic polynomial of (43) is identical to the characteristic polynomial of (42). Because of the block-triangular structure of (43), it is clear that the closed-loop characteristic polynomial coefficients depend only on the choice of gains $K_1$, $H_1$, and $H_2$. Furthermore comparison of (40) and (43) shows that a separation of the eigenvalues of the augmented-state-estimate error and the eigenvalues of $A + BK_1$ has occurred.
Assuming for the moment that (43) is exponentially stable, we can address the choice of gains N and $K_2$ to achieve the input-output objectives of asymptotic tracking and disturbance rejection. A careful partitioned multiplication verifies that

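$$\left( sI - \begin{bmatrix} A + BK_1 & -BK_1 & -BK_2 \\ 0 & A - H_1C & E - H_1F \\ 0 & -H_2C & -H_2F \end{bmatrix} \right)^{-1} = \begin{bmatrix} (sI - A - BK_1)^{-1} & -(sI - A - BK_1)^{-1} \begin{bmatrix} BK_1 & BK_2 \end{bmatrix} Z^{-1}(s) \\ 0 & Z^{-1}(s) \end{bmatrix}$$

where

$$Z(s) = \begin{bmatrix} sI - A + H_1C & -E + H_1F \\ H_2C & sI + H_2F \end{bmatrix}$$

and another gives

$$\begin{aligned}
Y(s) &= C(sI - A - BK_1)^{-1}BN\,R(s) + C(sI - A - BK_1)^{-1}E\,W(s) \\
&\quad - \begin{bmatrix} C(sI - A - BK_1)^{-1}BK_1 & C(sI - A - BK_1)^{-1}BK_2 \end{bmatrix} Z^{-1}(s) \begin{bmatrix} E - H_1F \\ -H_2F \end{bmatrix} W(s) + FW(s)
\end{aligned} \tag{44}$$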

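Constant reference and disturbance inputs correspond to

$$R(s) = r_o\,\frac{1}{s}, \quad W(s) = w_o\,\frac{1}{s}$$

and the only terms in (44) that contribute to the asymptotic value of y(t) are those partial-fraction-expansion terms for Y(s) corresponding to denominator roots at s = 0. Computing the coefficients of such terms using

$$\begin{bmatrix} -A + H_1C & -E + H_1F \\ H_2C & H_2F \end{bmatrix}^{-1} \begin{bmatrix} E - H_1F \\ -H_2F \end{bmatrix} = \begin{bmatrix} 0 \\ -I_q \end{bmatrix}$$

gives

$$\lim_{t \to \infty} y(t) = -C(A + BK_1)^{-1}BN\,r_o + \left[\, -C(A + BK_1)^{-1}E - C(A + BK_1)^{-1}BK_2 + F \,\right] w_o \tag{45}$$

Alternatively the final-value theorem for Laplace transforms can be used to obtain the same result.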

At this point we are prepared to establish the eigenvalue assignment property using (42), and the tracking and disturbance rejection property using (45). Indeed these properties follow from previous results, so a short proof completes our treatment.

15.9 Theorem  Suppose the plant (37) is controllable for E = 0, the augmented plant (38) is observable, and the (n + m) x (n + m) matrix

$$\begin{bmatrix} A & B \\ C & 0 \end{bmatrix} \tag{46}$$

is invertible. Then linear dynamic output feedback of the form (41), (39) has the following properties. The gains $K_1$, $H_1$, and $H_2$ can be chosen such that the closed-loop state equation (42) is exponentially stable with any desired characteristic polynomial coefficients. Furthermore the gains

$$N = -\left[\,C(A + BK_1)^{-1}B\,\right]^{-1}$$
$$K_2 = NC(A + BK_1)^{-1}E - NF \tag{47}$$

are such that for any constant reference input $r(t) = r_o$ and constant disturbance $w(t) = w_o$ the response of the closed-loop state equation satisfies

$$\lim_{t \to \infty} y(t) = r_o \tag{48}$$

Proof  By the observability assumption on the augmented plant in conjunction with (40), and the plant controllability assumption in conjunction with $A + BK_1$, we know from Theorem 14.9 and remarks in the preceding section that $K_1$, $H_1$, and $H_2$ can be chosen to achieve any specified degree-(2n + q) characteristic polynomial for (43), and thus for (42). Then Exercise 2.8 can be applied to conclude, under the invertibility condition on (46), that $C(A + BK_1)^{-1}B$ is invertible. Therefore the gains N and $K_2$ in (47) are well defined, and substituting (47) into (45) a straightforward calculation gives (48).
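The gain formulas (47) can be exercised on a small case. The plant data below are hypothetical (chosen so that the hypotheses of Theorem 15.9 hold with n = 2, m = p = q = 1), the gains K1, H1, H2 were picked by hand to place eigenvalues at -1 and -2, and the function names are illustrative only. The sketch builds the closed-loop equation (42) and evaluates its constant steady state.

```python
import numpy as np

# Hypothetical plant: disturbance enters through the input channel.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
E = np.array([[0.0], [1.0]])
F = np.array([[0.0]])
K1 = np.array([[-2.0, -2.0]])        # A + B K1 has both eigenvalues at -1
H1 = np.array([[6.0], [13.0]])       # with H2, error eigenvalues at -2
H2 = np.array([[8.0]])

def servo_gains(A, B, C, E, F, K1):
    """Gains N and K2 from (47)."""
    M = np.linalg.inv(A + B @ K1)
    N = -np.linalg.inv(C @ M @ B)
    K2 = N @ C @ M @ E - N @ F
    return N, K2

def closed_loop(A, B, C, E, F, K1, K2, N, H1, H2):
    """Coefficient matrices of (42) in the coordinates (x, xhat, what)."""
    n, q, p = A.shape[0], E.shape[1], C.shape[0]
    A_cl = np.block([
        [A,      B @ K1,              B @ K2],
        [H1 @ C, A + B @ K1 - H1 @ C, E + B @ K2 - H1 @ F],
        [H2 @ C, -H2 @ C,             -H2 @ F],
    ])
    B_r = np.vstack([B @ N, B @ N, np.zeros((q, N.shape[1]))])
    B_w = np.vstack([E, H1 @ F, H2 @ F])
    C_cl = np.hstack([C, np.zeros((p, n)), np.zeros((p, q))])
    return A_cl, B_r, B_w, C_cl

def steady_state_output(A_cl, B_r, B_w, C_cl, F, r0, w0):
    """Limit of y(t) for constant r0, w0, assuming A_cl is exponentially stable."""
    x_ss = -np.linalg.solve(A_cl, B_r @ r0 + B_w @ w0)
    return C_cl @ x_ss + F @ w0

N, K2 = servo_gains(A, B, C, E, F, K1)
A_cl, B_r, B_w, C_cl = closed_loop(A, B, C, E, F, K1, K2, N, H1, H2)

if __name__ == "__main__":
    print("N =", N.item(), " K2 =", K2.item())
    print("closed-loop eigenvalues:", np.linalg.eigvals(A_cl))
    r0, w0 = np.array([[1.0]]), np.array([[0.5]])
    print("steady-state output:", steady_state_output(A_cl, B_r, B_w, C_cl, F, r0, w0))
```

In this particular case K2 = -1, which simply cancels the estimated disturbance at the plant input; for other choices of E and F the formula in (47) produces a less obvious gain.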

EXERCISES
Exercise 15.1

For the plant

x(t)

01
=

y(t) = [1 1] x(t)
compute a 2-dimensional observer such that the error decays exponentially with rate $\lambda = 10$. Then compute a reduced-dimension observer for the same error-rate requirement.

Exercise 15.2  Suppose the time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t)$$

is controllable and observable, and rank B = m. Given an (n - m) x (n - m) matrix F and an n x p matrix H, consider dynamic output feedback

$$\dot{z}(t) = Fz(t) + Gv(t)$$
$$v(t) = y(t) + CLz(t)$$
$$u(t) = Mz(t) + Nv(t)$$

where the matrices G, L, M, and N satisfy

$$AL - BM = LF$$
$$LG + BN = -H$$

Show that the 2n - m eigenvalues of the closed-loop state equation are given by the eigenvalues of F and the eigenvalues of A - HC. Hint: Consider the variable change

$$\begin{bmatrix} w(t) \\ z(t) \end{bmatrix} = \begin{bmatrix} I & L \\ 0 & I \end{bmatrix} \begin{bmatrix} x(t) \\ z(t) \end{bmatrix}$$

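Exercise 15.3  For the linear state equation

$$\dot{x}(t) = A(t)x(t)$$
$$y(t) = C(t)x(t)$$

show that if there exist positive constants $\gamma$, $\delta$, $\varepsilon_1$, and $\varepsilon_2$ such that

$$\|A(t)\| \le \gamma, \quad \varepsilon_1 I \le M(t-\delta,t) \le \varepsilon_2 I$$

for all t, then there exist positive constants $\varepsilon_3$ and $\varepsilon_4$ such that

$$\varepsilon_3 I \le \Phi^T(t-\delta,t)M(t-\delta,t)\Phi(t-\delta,t) \le \varepsilon_4 I$$

for all t. Hint: See Exercise 6.6.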

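Exercise 15.4  For the linear state equation

$$\dot{x}(t) = A(t)x(t) + B(t)u(t)$$

prove that if there exist positive constants $\gamma$, $\delta$, and $\varepsilon$ such that

$$\|A(t)\| \le \gamma, \quad W(t,t+\delta) \le \varepsilon I$$

for all t, then there exist positive constants $\beta_1$ and $\beta_2$ such that

$$\int_\tau^t \|B(\sigma)\|^2\,d\sigma \le \beta_1 + \beta_2(t-\tau)$$

for all t, $\tau$ with $t \ge \tau$. Hint: Write

$$\int \|B(\sigma)\|^2\,d\sigma = \int \|\Phi(\sigma,\tau)\Phi(\tau,\sigma)B(\sigma)B^T(\sigma)\Phi^T(\tau,\sigma)\Phi^T(\sigma,\tau)\|\,d\sigma$$

bound this via Exercise 6.6 and Exercise 1.21, and add up the bounds over subintervals of $[\tau, t]$ of length $\delta$.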

Exercise 15.5  Suppose the time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t)$$

is observable with rank C = p. Using a variable change to observer form (Chapter 13), show how to compute an observer gain H such that the characteristic polynomial $\det(\lambda I - A + HC)$ has a specified set of coefficients.

Exercise 15.6  Suppose the time-invariant linear state equation

$$\dot{z}(t) = Az(t) + Bu(t)$$
$$y(t) = \begin{bmatrix} I_p & 0_{p \times (n-p)} \end{bmatrix} z(t)$$

is controllable and observable. Consider dynamic output feedback of the form

$$u(t) = K\hat{z}(t) + Nr(t)$$

where $\hat{z}(t)$ is an asymptotic state estimate generated via the reduced-dimension observer specified by (32). Characterize the eigenvalues of the closed-loop state equation. What is the closed-loop transfer function?

Exercise 15.7  For the time-varying linear state equation (1), suppose the (n - p) x n matrix function $P_b(t)$ and the uniformly exponentially stable, (n - p)-dimensional state equation

$$\dot{z}(t) = F(t)z(t) + G_a(t)u(t) + G_b(t)y(t)$$

satisfy the following additional conditions for all t:

$$\operatorname{rank} \begin{bmatrix} C(t) \\ P_b(t) \end{bmatrix} = n$$
$$\dot{P}_b(t) = F(t)P_b(t) - P_b(t)A(t) + G_b(t)C(t)$$
$$G_a(t) = P_b(t)B(t)$$

Show that the (n - p) x 1 error vector $e_b(t) = z(t) - P_b(t)x(t)$ satisfies

$$\dot{e}_b(t) = F(t)e_b(t)$$

Writing

$$\begin{bmatrix} C(t) \\ P_b(t) \end{bmatrix}^{-1} = \begin{bmatrix} H(t) & J(t) \end{bmatrix}$$

where H(t) is n x p, show that, under an appropriate additional hypothesis,

$$\hat{x}(t) = H(t)y(t) + J(t)z(t)$$

provides an asymptotic estimate for x(t).

Exercise 15.8  Apply Exercise 15.7 to a linear state equation of the form (20), selecting, with some abuse of notation,

$$P_b(t) = \begin{bmatrix} -H(t) & I_{n-p} \end{bmatrix}$$

Compare the resulting reduced-dimension observer with (21).

Exercise 15.9  For the time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t)$$

show there exists an n x p matrix H such that

$$\dot{x}(t) = (A + HC)x(t) + Bu(t)$$
$$y(t) = Cx(t)$$

is exponentially stable if and only if

$$\operatorname{rank} \begin{bmatrix} C \\ \lambda I - A \end{bmatrix} = n$$

for each $\lambda$ that is a nonnegative-real-part eigenvalue of A. (The property in question is called detectability, and the term output injection sometimes is used to describe how the second state equation is obtained from the first.)

Exercise 15.10  Consider a time-invariant plant described by

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = C_1x(t) + D_1u(t)$$

Suppose the vector r(t) is a reference input signal, and

$$v(t) = C_2x(t) + D_{21}r(t) + D_{22}u(t)$$

is a vector signal available for feedback. For the time-invariant, $n_c$-dimensional dynamic feedback

$$\dot{z}(t) = Fz(t) + Gv(t)$$
$$u(t) = Hz(t) + Jv(t)$$

compute, under appropriate assumptions, the coefficient matrices $\hat{A}$, $\hat{B}$, $\hat{C}$, and $\hat{D}$ for the $(n + n_c)$-dimensional closed-loop state equation.
Exercise 15.11  Continuing Exercise 15.10, suppose $D_{22} = 0$ (for simplicity), $D_1$ has full column rank, $D_{21}$ has full row rank, and the dynamic feedback state equation is controllable and observable. Define matrices $B_a$ and $C_{2a}$ by setting $B = B_aD_1$ and $C_2 = D_{21}C_{2a}$. For the closed-loop state equation, use the controllability and observability criteria in Chapter 13 to show:
is an eigenvalue
A
(a) If the complex number

of A.
(b) If the complex number A.,, is such that
rank

then

A.,,

C
X01A

is an eigenvalue of A B,,C1.

NOTES
Note 15.1

Observer theory dates from the paper

D.G. Luenberger, "Observing the state of a linear system," IEEE Transactions on Military Electronics, Vol. 8, pp. 74-80, 1964

and an elementary review of early work is given in

D.G. Luenberger, "An introduction to observers," IEEE Transactions on Automatic Control, Vol. 16, No. 6, pp. 596-602, 1971

Our discussion of reduced-dimension observers in the time-varying case is based on the treatments

J. O'Reilly, M.M. Newmann, "Minimal-order observer-estimators for continuous-time linear systems," International Journal of Control, Vol. 22, No. 4, pp. 573-590, 1975

Y.O. Yuksel, J.J. Bongiorno, "Observers for linear multivariable systems with applications," IEEE Transactions on Automatic Control, Vol. 16, No. 6, pp. 603-613, 1971

In the latter reference the choice of H(t) to stabilize the error-estimate equation involves a time-varying coordinate change to a special observer form. The issue of choosing the observer initial state is examined in

C.D. Johnson, "Optimal initial conditions for full-order observers," International Journal of Control, Vol. 48, No. 3, pp. 857-864, 1988


Note 15.2  Related to observability is the property of reconstructibility. Loosely speaking, an unforced linear state equation is reconstructible on $[t_o, t_f]$ if $x(t_f)$ can be determined from y(t) for $t \in [t_o, t_f]$. This property is characterized by invertibility of the reconstructibility Gramian

$$N(t_o,t_f) = \int_{t_o}^{t_f} \Phi^T(\tau,t_f)C^T(\tau)C(\tau)\Phi(\tau,t_f)\,d\tau$$

The relation between this and the observability Gramian is

$$N(t_o,t_f) = \Phi^T(t_o,t_f)M(t_o,t_f)\Phi(t_o,t_f)$$

and thus the 'observability' hypotheses of Theorem 15.2 and Theorem 15.5 can be replaced by the more compact expression

$$\varepsilon_1 I \le N(t-\delta,t) \le \varepsilon_2 I$$

Reconstructibility is discussed in Chapter 2 of

R.E. Kalman, P.L. Falb, M.A. Arbib, Topics in Mathematical System Theory, McGraw-Hill, New York, 1969

and Chapter 1 of

J. O'Reilly, Observers for Linear Systems, Academic Press, London, 1983

a book that includes many references to the literature on observers.
Note 15.3 The proof of output feedback stabilization in Theorem 15.5 is from

M. Ikeda, H. Maeda, S. Kodama, "Estimation and feedback in linear time-varying systems: a deterministic theory," SIAM Journal on Control and Optimization, Vol. 13, No. 2, pp. 304-327, 1975

This paper contains an extensive taxonomy of concepts related to state estimation, stabilization, and even 'instabilization.' An approach to output feedback stabilization via linear optimal control theory is in the paper by Yuksel and Bongiorno cited in Note 15.1.

Note 15.4 The problem of state observation is closely related to the problem of statistical
estimation of the state based on output signals corrupted by noise, and the well-known Kalman
filter. A gentle introduction is given in

B.D.O. Anderson, J.B. Moore, Optimal Control: Linear Quadratic Methods, Prentice Hall, Englewood Cliffs, New Jersey, 1990

This problem also can be addressed in the context of observers with noisy output measurements in both the full- and reduced-dimension frameworks. Consult the monograph by O'Reilly cited in Note 15.2. On the other hand the Kalman filtering problem is reinterpreted as a deterministic optimization problem in Section 7.7 of

E.D. Sontag, Mathematical Control Theory, Springer-Verlag, New York, 1990
Note 15.5

The design of a state observer for a linear system driven by unknown input signals also can be considered. For approaches to full-dimension and reduced-dimension observers, and references to earlier treatments, see

F. Yang, R.W. Wilde, "Observers for linear systems with unknown inputs," IEEE Transactions on Automatic Control, Vol. 33, No. 7, pp. 677-681, 1988

M. Hou, P.C. Muller, "Design of observers for linear systems with unknown inputs," IEEE Transactions on Automatic Control, Vol. 37, No. 6, pp. 871-874, 1992

Note 15.6  The construction of an observer that provides asymptotically-zero error depends crucially on choosing observer coefficients in terms of plant coefficients. This is easily recognized in the process of deriving the observer error state equation (5). The behavior of the observer error when observer coefficients are mismatched with plant coefficients, and remedies for this situation, are subjects in robust observer theory. Consult

J.C. Doyle, G. Stein, "Robustness with observers," IEEE Transactions on Automatic Control, Vol. 24, No. 4, pp. 607-611, 1979

S.P. Bhattacharyya, "The structure of robust observers," IEEE Transactions on Automatic Control, Vol. 21, No. 4, pp. 581-588, 1976

K. Furuta, S. Hara, S. Mori, "A class of systems with the same observer," IEEE Transactions on Automatic Control, Vol. 21, No. 4, pp. 572-576, 1976

Note 15.7  The servomechanism problem treated in Theorem 15.9 is based on

H.W. Smith, E.J. Davison, "Design of industrial regulators: integral feedback and feedforward control," Proceedings of the IEE, Vol. 119, pp. 1210-1216, 1972

The device of assuming disturbance signals are generated by a known exogenous system with unknown initial state is extremely powerful. Significant extensions and generalizations, using many approaches, can be found in the control theory literature. Perhaps a good starting point is

C.A. Desoer, Y.T. Wang, "Linear time-invariant robust servomechanism problem: A self-contained exposition," in Control and Dynamic Systems, C.T. Leondes, ed., Vol. 16, pp. 81-129, 1980

16
POLYNOMIAL FRACTION
DESCRIPTION

The polynomial fraction description is a mathematically efficacious representation for a matrix of rational functions. Applied to the transfer function of a multi-input, multi-output linear state equation, polynomial fraction descriptions can reveal structural features that, for example, permit natural generalization of minimal realization considerations noted for single-input, single-output state equations in Example 10.11. This and other applications are considered in Chapter 17, following development of the basic properties of polynomial fraction descriptions here.

We assume throughout a continuous-time setting, with G(s) a p x m matrix of strictly-proper rational functions of s. Then, from Theorem 10.10, G(s) is realizable by a time-invariant linear state equation with D = 0. Re-interpretation for discrete time requires nothing more than replacement of every Laplace-transform s by a z-transform z. (Helvetica-font notation for transforms is not used, since no conflicting time-domain symbols arise.)

Right Polynomial Fractions


Matrices of real-coefficient polynomials in s, equivalently polynomials in s with

coefficients that are real matrices, provide the mathematical foundation for the new
transfer function representation.
16.1 Definition A p × r polynomial matrix P(s) is a matrix with entries that are
real-coefficient polynomials in s. A square (p = r) polynomial matrix P(s) is called
nonsingular if det P(s) is a nonzero polynomial, and unimodular if det P(s) is a
nonzero real number.

The determinant of a square polynomial matrix is a polynomial (a sum of products
of the polynomial entries). Thus an alternative characterization is that a square
polynomial matrix P(s) is nonsingular if and only if $\det P(s_0) \neq 0$ for all but a finite
number of complex numbers $s_0$. And P(s) is unimodular if and only if $\det P(s_0) \neq 0$
for all complex numbers $s_0$.

The adjugate-over-determinant formula shows that if P(s) is square and
nonsingular, then $P^{-1}(s)$ exists and (each entry) is a rational function of s. Also
$P^{-1}(s)$ is a polynomial matrix if P(s) is unimodular. (Sometimes a polynomial is
viewed as a rational function with unity denominator.) From the reciprocal-determinant
relationship between a matrix and its inverse, $P^{-1}(s)$ is unimodular if P(s) is
unimodular. Conversely if P(s) and $P^{-1}(s)$ both are polynomial matrices, then both
are unimodular.
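
The distinction between nonsingular and unimodular polynomial matrices can be checked mechanically for small cases. The following is a minimal sketch, not part of the original text: it assumes polynomials are stored as coefficient lists (lowest power first), and the helper names `trim`, `add`, `mul`, `det`, and `classify` are illustrative only.

```python
from fractions import Fraction

# Assumption: a polynomial is a list of coefficients, lowest power first,
# so [1, 2] means 1 + 2s and [] is the zero polynomial.

def trim(p):
    """Convert to exact fractions and drop trailing zero coefficients."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    p, q = trim(p), trim(q)
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return trim(M[0][0])
    total = []
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        term = mul(entry, det(minor))
        if j % 2:
            term = [-c for c in term]
        total = add(total, term)
    return total

def classify(M):
    """Return 'singular', 'nonsingular', or 'unimodular' per Definition 16.1."""
    d = det(M)
    if not d:
        return "singular"
    return "unimodular" if len(d) == 1 else "nonsingular"

s = [0, 1]
print(classify([[[1], s], [[], [1]]]))            # [1 s; 0 1]      -> unimodular
print(classify([[[1, 1], []], [[], [2, 1]]]))     # diag(s+1, s+2)  -> nonsingular
print(classify([[s, s], [[1], [1]]]))             # [s s; 1 1]      -> singular
```

Laplace expansion is used only because it mirrors the "sum of products of the polynomial entries" remark; it is not an efficient choice for large matrices.
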
16.2 Definition A right polynomial fraction description for the p × m strictly-proper
rational transfer function G(s) is an expression of the form

$$ G(s) = N(s) D^{-1}(s) \tag{1} $$

where N(s) is a p × m polynomial matrix and D(s) is an m × m nonsingular polynomial
matrix. A left polynomial fraction description for G(s) is an expression

$$ G(s) = D_L^{-1}(s) N_L(s) \tag{2} $$

where $N_L(s)$ is a p × m polynomial matrix and $D_L(s)$ is a p × p nonsingular polynomial
matrix. The degree of a right polynomial fraction description is the degree of the
polynomial det D(s). Similarly the degree of a left polynomial fraction is the degree of
$\det D_L(s)$.
Of course this definition is familiar if m = p = 1. In the multi-input, multi-output
case, a simple device can be used to exhibit so-called elementary polynomial fractions
for G(s). Suppose d(s) is a least common multiple of the denominator polynomials of
entries of G(s). (In fact, any common multiple of the denominators can be used.) Then

$$ N_d(s) = d(s) G(s) $$

is a p × m polynomial matrix, and we can write either a right or left polynomial fraction
description:

$$ G(s) = N_d(s) \, [\, d(s) I_m \,]^{-1} = [\, d(s) I_p \,]^{-1} N_d(s) \tag{3} $$

The degrees of the two descriptions are different in general, and it should not be
surprising that lower-degree polynomial fraction descriptions typically can be found if
some effort is invested.
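
As an illustration of the elementary fractions (a worked instance chosen here, not taken from the original text), consider a transfer function with p = 1 and m = 2:

```latex
G(s) = \begin{bmatrix} \dfrac{1}{s+1} & \dfrac{1}{s+2} \end{bmatrix},
\qquad d(s) = (s+1)(s+2),
\qquad N_d(s) = d(s)\,G(s) = \begin{bmatrix} s+2 & s+1 \end{bmatrix}

% Elementary right description: degree = \deg\det[d(s) I_2] = \deg d^2(s) = 4
G(s) = N_d(s)\,[\,d(s) I_2\,]^{-1}

% Elementary left description: degree = \deg\det[d(s) I_1] = \deg d(s) = 2
G(s) = [\,d(s)\,]^{-1} N_d(s)

% A lower-degree right description (degree 2):
G(s) = \begin{bmatrix} 1 & 1 \end{bmatrix}
       \begin{bmatrix} s+1 & 0 \\ 0 & s+2 \end{bmatrix}^{-1}
```

The right and left elementary descriptions have degrees 4 and 2 here, and the last line shows that a degree-2 right description also exists.
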

In the single-input, single-output case, the issue of common factors in the scalar
numerator and denominator polynomials of G (s) arises at this point. The utility of the
polynomial fraction representation begins to emerge from the corresponding concept in
the matrix case.

16.3 Definition An r × r polynomial matrix R(s) is called a right divisor of the p × r
polynomial matrix P(s) if there exists a p × r polynomial matrix $\tilde{P}(s)$ such that

$$ P(s) = \tilde{P}(s) R(s) $$


If a right divisor R(s) is nonsingular, then $P(s)R^{-1}(s)$ is a p × r polynomial
matrix. Also if P(s) is square and nonsingular, then every right divisor of P(s) is
nonsingular.

To become accustomed to these notions, it helps to reflect on the case of scalar


polynomials. There a right divisor is simply a factor of the polynomial. For polynomial
matrices the situation is roughly similar.
16.4 Example For the polynomial matrix

$$ P(s) = \begin{bmatrix} (s+1)^2 (s+2) \\ (s+1)(s+2)(s+3) \end{bmatrix} \tag{4} $$

right divisors include the 1 × 1 polynomial matrices

$$ R_a(s) = 1, \quad R_b(s) = s+1, \quad R_c(s) = s+2, \quad R_d(s) = (s+1)(s+2) $$

In this simple case each right divisor is a common factor of the two scalar polynomials in
P(s), and Rd(s) is a greatest-degree common factor of the scalar polynomials. For the
slightly less simple

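$$ P(s) = \begin{bmatrix} (s+1)^2 (s+2) & (s+3)(s+5) \\ 0 & (s+4)(s+5) \end{bmatrix} $$

two right divisors are

$$ \begin{bmatrix} s+1 & 0 \\ 0 & s+5 \end{bmatrix}, \qquad
   \begin{bmatrix} (s+1)^2 & 0 \\ 0 & s+5 \end{bmatrix} $$

□□□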
Next we consider a matrix-polynomial extension of the concept of a common

factor of two scalar polynomials. Since one of the polynomial matrices always is square

in our application to transfer function representation, attention is restricted to that


situation.

16.5 Definition Suppose P(s) is a p × r polynomial matrix and Q(s) is a q × r
polynomial matrix. If the r × r polynomial matrix R(s) is a right divisor of both, then
R(s) is called a common right divisor of P(s) and Q(s). We call R(s) a greatest
common right divisor of P(s) and Q(s) if it is a common right divisor, and if any other
common right divisor of P(s) and Q(s) is a right divisor of R(s). If all common right
divisors of P(s) and Q(s) are unimodular, then P(s) and Q(s) are called right coprime.
For polynomial fraction descriptions of a transfer function, one of the polynomial

matrices always is nonsingular, so only nonsingular common right divisors occur.


Suppose G (s) is given by the right polynomial fraction description

$$ G(s) = N(s) D^{-1}(s) $$

and that R(s) is a common right divisor of N(s) and D(s). Then

$$ \tilde{N}(s) = N(s) R^{-1}(s), \qquad \tilde{D}(s) = D(s) R^{-1}(s) \tag{5} $$

are polynomial matrices, and they provide another right polynomial fraction description
for G(s) since

$$ \tilde{N}(s) \tilde{D}^{-1}(s) = N(s) R^{-1}(s) R(s) D^{-1}(s) = G(s) $$

The degree of this new polynomial fraction description is no greater than the degree of
the original since

$$ \deg\,[\det D(s)] = \deg\,[\det \tilde{D}(s)] + \deg\,[\det R(s)] $$

Of course the largest degree reduction occurs if R(s) is a greatest common right divisor,
and no reduction occurs if N(s) and D(s) are right coprime. This discussion indicates
that extracting common right divisors of a right polynomial fraction is a generalization
of the process of canceling common factors in a scalar rational function.
Computation of greatest common right divisors can be based on capabilities of
elementary row operations on a polynomial matrix, operations similar to elementary
row operations on a matrix of real numbers. To set up this approach we present a
preliminary result.

16.6 Theorem Suppose P(s) is a p × r polynomial matrix and Q(s) is an r × r
polynomial matrix. If a unimodular (p + r) × (p + r) polynomial matrix U(s) and an
r × r polynomial matrix R(s) are such that

$$ U(s) \begin{bmatrix} Q(s) \\ P(s) \end{bmatrix} = \begin{bmatrix} R(s) \\ 0 \end{bmatrix} \tag{6} $$

then R(s) is a greatest common right divisor of P(s) and Q(s).

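Proof Partition U(s) in the form

$$ U(s) = \begin{bmatrix} U_{11}(s) & U_{12}(s) \\ U_{21}(s) & U_{22}(s) \end{bmatrix} \tag{7} $$

where $U_{11}(s)$ is r × r, and $U_{22}(s)$ is p × p. Then the polynomial matrix $U^{-1}(s)$ can be
partitioned similarly as

$$ U^{-1}(s) = \begin{bmatrix} U_{11}^{-}(s) & U_{12}^{-}(s) \\ U_{21}^{-}(s) & U_{22}^{-}(s) \end{bmatrix} $$

Using this notation to rewrite (6) gives

$$ \begin{bmatrix} Q(s) \\ P(s) \end{bmatrix}
   = \begin{bmatrix} U_{11}^{-}(s) & U_{12}^{-}(s) \\ U_{21}^{-}(s) & U_{22}^{-}(s) \end{bmatrix}
     \begin{bmatrix} R(s) \\ 0 \end{bmatrix} $$

That is,

$$ Q(s) = U_{11}^{-}(s) R(s), \qquad P(s) = U_{21}^{-}(s) R(s) $$

Therefore R(s) is a common right divisor of P(s) and Q(s). But, from (6) and (7),

$$ R(s) = U_{11}(s) Q(s) + U_{12}(s) P(s) \tag{8} $$

so that if $R_a(s)$ is another common right divisor of P(s) and Q(s), say

$$ Q(s) = \tilde{Q}_a(s) R_a(s), \qquad P(s) = \tilde{P}_a(s) R_a(s) $$

then (8) gives

$$ R(s) = \left[\, U_{11}(s) \tilde{Q}_a(s) + U_{12}(s) \tilde{P}_a(s) \,\right] R_a(s) $$

This shows $R_a(s)$ also is a right divisor of R(s), and thus R(s) is a greatest common
right divisor of P(s) and Q(s).

□□□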
To calculate greatest common right divisors using Theorem 16.6, we consider
three types of elementary row operations on a polynomial matrix. First is the interchange
of two rows, and second is the multiplication of a row by a nonzero real number. The
third is to add to any row a polynomial multiple of another row. Each of these
elementary row operations can be represented by premultiplication by a unimodular
matrix, as is easily seen by filling in the following argument.
Interchange of rows i and j ≠ i corresponds to premultiplying by a matrix $E_a$ that
has a very simple form. The diagonal entries are unity, except that $[E_a]_{ii} = [E_a]_{jj} = 0$,
and the off-diagonal entries are zero, except that $[E_a]_{ij} = [E_a]_{ji} = 1$. Multiplication of the
$i^{th}$-row by a real number α ≠ 0 corresponds to premultiplication by a matrix $E_b$ that is
diagonal with all diagonal entries unity, except $[E_b]_{ii} = \alpha$. Finally adding to row i a
polynomial p(s) times row j, j ≠ i, corresponds to premultiplication by a matrix $E_c(s)$
that has unity diagonal entries, with off-diagonal entries zero, except $[E_c(s)]_{ij} = p(s)$.
It is straightforward to show that the determinants of matrices of the form $E_a$, $E_b$,
and $E_c(s)$ described above are nonzero real numbers. That is, these matrices are
unimodular. Also it is easy to show that the inverse of any of these matrices corresponds
to another elementary row operation. The diligent might prove that multiplication of a
row by a polynomial is not an elementary row operation in the sense of multiplication by
a unimodular matrix, thereby burying a frequent misconception.
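
For concreteness, the 2 × 2 versions of the three elementary matrices can be written out as follows. This is an illustrative instance only (rows i = 1, j = 2, with α and p(s) arbitrary), not a display from the original text.

```latex
E_a = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad
E_b = \begin{bmatrix} \alpha & 0 \\ 0 & 1 \end{bmatrix} \ (\alpha \neq 0), \qquad
E_c(s) = \begin{bmatrix} 1 & p(s) \\ 0 & 1 \end{bmatrix}

\det E_a = -1, \qquad \det E_b = \alpha, \qquad \det E_c(s) = 1

E_a^{-1} = E_a, \qquad
E_b^{-1} = \begin{bmatrix} 1/\alpha & 0 \\ 0 & 1 \end{bmatrix}, \qquad
E_c^{-1}(s) = \begin{bmatrix} 1 & -p(s) \\ 0 & 1 \end{bmatrix}

% By contrast, multiplying row 1 by the polynomial s is premultiplication by
\begin{bmatrix} s & 0 \\ 0 & 1 \end{bmatrix}, \qquad \text{with determinant } s,
% which is not a nonzero real number, so this matrix is not unimodular.
```

Each inverse is again of one of the three elementary types, while the last matrix has determinant s and so fails the unimodularity requirement.
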
It should be clear that a sequence of elementary row operations can be represented
as premultiplication by a sequence of these elementary unimodular matrices, and thus as
a single unimodular premultiplication. We also want to show the converse, that
premultiplication by any unimodular matrix can be represented by a sequence of
elementary row operations. Then Theorem 16.6 provides a method based on elementary
row operations for computing a greatest common right divisor R(s) via (6).
That any unimodular matrix can be written as a product of matrices of the form $E_a$,
$E_b$, and $E_c(s)$ derives easily from a special form for polynomial matrices. We present
this special form for the particular case where the polynomial matrix contains a
full-dimension nonsingular partition. This suffices for our application to polynomial fraction
descriptions, and also avoids some fussy but trivial issues such as how to handle
identical columns, or all-zero columns. Recall the terminology that a scalar polynomial
is called monic if the coefficient of the highest power of s is unity, that the degree of a
polynomial is the highest power of s with nonzero coefficient, and that the degree of the
zero polynomial is, by convention, $-\infty$.

16.7 Theorem Suppose P(s) is a p × r polynomial matrix and Q(s) is an r × r,
nonsingular polynomial matrix. Then elementary row operations can be used to
transform

$$ M(s) = \begin{bmatrix} Q(s) \\ P(s) \end{bmatrix} $$

into row Hermite form described as follows. For k = 1, ..., r, all entries of the
$k^{th}$-column below the k,k-entry are zero, and the k,k-entry is nonzero and monic with higher
degree than every entry above it in column k. (If the k,k-entry is unity, then all entries
above it are zero.)
Proof Row Hermite form can be computed by an algorithm that is similar to the row
reduction process for constant matrices.

Step (i): In the first column of M(s) use row interchange to bring to the first row a
lowest-degree entry among nonzero first-column entries. (By nonsingularity of Q(s),
there is a nonzero first-column entry.)

Step (ii): Multiply the first row by a real number so that the first column entry is monic.

Step (iii): For each entry $m_{i1}(s)$ below the first row in the first column, use polynomial
division to write

$$ m_{i1}(s) = q_i(s)\, m_{11}(s) + r_{i1}(s), \qquad i = 2, \ldots, p + r $$

where each remainder is such that $\deg r_{i1}(s) < \deg m_{11}(s)$. (If $m_{i1}(s) = 0$, that is
$\deg m_{i1}(s) = -\infty$, we set $q_i(s) = r_{i1}(s) = 0$. If $\deg m_{i1}(s) = 0$, then by Step (i)
$\deg m_{11}(s) = 0$. Therefore $\deg q_i(s) = 0$ and $\deg r_{i1}(s) = -\infty$, that is, $r_{i1}(s) = 0$.)

Step (iv): For i = 2, ..., p + r, add to the $i^{th}$-row the product of $-q_i(s)$ and the first
row. The resulting entries in the first column, below the first row, are
$r_{21}(s), \ldots, r_{p+r,1}(s)$, all of which have degrees less than $\deg m_{11}(s)$.

Step (v): Repeat steps (i) through (iv) until all entries of the first column are zero except
the first entry. Since the degrees of the entries below the first entry are lowered by at
least one in each iteration, a finite number of operations is required.

Proceed to the second column of M(s) and repeat the above steps while ignoring
the first row. This results in a monic, nonzero entry $m_{22}(s)$, with all entries below it zero.
If $m_{12}(s)$ does not have lower degree than $m_{22}(s)$, then polynomial division of $m_{12}(s)$
by $m_{22}(s)$ as in Step (iii) and an elementary row operation as in Step (iv) replaces
$m_{12}(s)$ by a polynomial of degree less than $\deg m_{22}(s)$. Next repeat the process for the
third column of M(s), while ignoring the first two rows. Continuing yields the claimed
form on exhausting the columns of M(s).

□□□
To complete the connection between unimodular matrices and elementary row
operations, suppose in Theorem 16.7 that p = 0, and Q(s) is unimodular. Of course the
resulting row Hermite form is upper triangular. The diagonal entries must be unity, for a
diagonal entry of positive degree would yield a determinant of positive degree,
contradicting unimodularity. But then entries above the diagonal must have degree $-\infty$.
Thus row Hermite form for a unimodular matrix is the identity matrix. In other words
for a unimodular polynomial matrix U(s) there is a sequence of elementary row
operations, say $E_a, E_b, E_c(s), \ldots, E_b$, such that

$$ \left[\, E_a E_b E_c(s) \cdots E_b \,\right] U(s) = I \tag{11} $$

This obviously gives U(s) as the sequence of elementary row operations on the identity
specified by

$$ U(s) = \left[\, E_b^{-1} \cdots E_c^{-1}(s) E_b^{-1} E_a^{-1} \,\right] I $$

and premultiplication of a matrix by U(s) thus corresponds to application of a sequence
of elementary row operations. Therefore Theorem 16.6 can be restated, for the case of
nonsingular Q(s), in terms of elementary row operations rather than premultiplication
by a unimodular U(s). If reduction to row Hermite form is used in implementing (6),
then the greatest common right divisor R(s) will be an upper-triangular polynomial
matrix. Furthermore if P(s) and Q(s) are right coprime, then Theorem 16.7 shows that
there is a unimodular U(s) such that (6) is satisfied for $R(s) = I_r$.
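16.8 Example For

$$ Q(s) = \begin{bmatrix} s^2+s+1 & s+1 \\ s^2-3 & 2s-2 \end{bmatrix}, \qquad
   P(s) = \begin{bmatrix} s+2 & 1 \end{bmatrix} $$

calculation of a greatest common right divisor via Theorem 16.6 is a sequence of
elementary row operations. (Each arrow represents one type of operation and should be
easy to decipher.)

$$ \begin{aligned}
M(s) = \begin{bmatrix} Q(s) \\ P(s) \end{bmatrix}
&= \begin{bmatrix} s^2+s+1 & s+1 \\ s^2-3 & 2s-2 \\ s+2 & 1 \end{bmatrix}
\to \begin{bmatrix} s+2 & 1 \\ s^2-3 & 2s-2 \\ s^2+s+1 & s+1 \end{bmatrix}
= \begin{bmatrix} s+2 & 1 \\ (s-2)(s+2)+1 & 2s-2 \\ (s-1)(s+2)+3 & s+1 \end{bmatrix} \\
&\to \begin{bmatrix} s+2 & 1 \\ 1 & s \\ 3 & 2 \end{bmatrix}
\to \begin{bmatrix} 1 & s \\ s+2 & 1 \\ 3 & 2 \end{bmatrix}
\to \begin{bmatrix} 1 & s \\ 0 & -s^2-2s+1 \\ 0 & -3s+2 \end{bmatrix}
\to \begin{bmatrix} 1 & s \\ 0 & -3s+2 \\ 0 & -s^2-2s+1 \end{bmatrix} \\
&\to \begin{bmatrix} 1 & s \\ 0 & s-2/3 \\ 0 & -s^2-2s+1 \end{bmatrix}
\to \begin{bmatrix} 1 & 2/3 \\ 0 & s-2/3 \\ 0 & -7/9 \end{bmatrix}
\to \begin{bmatrix} 1 & 2/3 \\ 0 & -7/9 \\ 0 & s-2/3 \end{bmatrix}
\to \begin{bmatrix} 1 & 2/3 \\ 0 & 1 \\ 0 & s-2/3 \end{bmatrix}
\to \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}
\end{aligned} $$

This calculation shows that a greatest common right divisor is the identity, and P(s) and
Q(s) are right coprime.

□□□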
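
The reduction to row Hermite form can also be carried out by machine. The sketch below is one possible rendering of Steps (i) through (v) in exact rational arithmetic, applied to the Q(s) and P(s) of Example 16.8. The coefficient-list representation (lowest power first) and the function names `trim`, `deg`, `sub_scaled`, `quotient`, and `row_hermite` are assumptions made for illustration, and the order of row operations need not match a hand calculation.

```python
from fractions import Fraction

def trim(p):
    """Exact coefficients, lowest power first; the zero polynomial is []."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def deg(p):
    return len(p) - 1  # convention for this sketch only: zero polynomial -> -1

def sub_scaled(p, q, r):
    """Return the polynomial p - q*r."""
    out = list(p) + [Fraction(0)] * max(0, len(q) + len(r) - 1 - len(p))
    for i, a in enumerate(q):
        for j, b in enumerate(r):
            out[i + j] -= a * b
    return trim(out)

def quotient(a, b):
    """Quotient of the polynomial division of a by a nonzero b."""
    a, q = list(a), [Fraction(0)] * max(0, len(a) - len(b) + 1)
    while len(a) >= len(b):
        k = len(a) - len(b)
        q[k] = a[-1] / b[-1]
        a = sub_scaled(a, [Fraction(0)] * k + [q[k]], b)
    return trim(q)

def row_hermite(M):
    """Row Hermite form of M (list of rows of polynomials) by elementary row operations."""
    M = [[trim(e) for e in row] for row in M]
    rows, cols = len(M), len(M[0])
    for k in range(min(rows, cols)):
        while True:
            # Step (i): bring a lowest-degree nonzero entry of column k to row k.
            cand = [i for i in range(k, rows) if M[i][k]]
            if not cand:
                raise ValueError("no pivot: the square block is singular")
            piv = min(cand, key=lambda i: deg(M[i][k]))
            M[k], M[piv] = M[piv], M[k]
            # Step (ii): scale row k so the pivot is monic.
            lead = M[k][k][-1]
            M[k] = [[c / lead for c in e] for e in M[k]]
            # Steps (iii)-(iv): subtract q_i(s) times row k from each lower row.
            for i in range(k + 1, rows):
                q = quotient(M[i][k], M[k][k])
                M[i] = [sub_scaled(M[i][j], q, M[k][j]) for j in range(cols)]
            # Step (v): repeat until everything below the pivot is zero.
            if not any(M[i][k] for i in range(k + 1, rows)):
                break
        # Entries above the pivot are reduced to lower degree than the pivot.
        for i in range(k):
            q = quotient(M[i][k], M[k][k])
            M[i] = [sub_scaled(M[i][j], q, M[k][j]) for j in range(cols)]
    return M

# Example 16.8: M(s) = [Q(s); P(s)], coefficients listed lowest power first.
M = [[[1, 1, 1], [1, 1]],     # s^2+s+1,  s+1
     [[-3, 0, 1], [-2, 2]],   # s^2-3,    2s-2
     [[2, 1], [1]]]           # s+2,      1
R = row_hermite(M)
print(R[:2] == [[[1], []], [[], [1]]])  # top 2 x 2 block is the identity
```

The top r × r block of the result is the greatest common right divisor in upper-triangular form; here it is the identity, in agreement with the conclusion that P(s) and Q(s) are right coprime.
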

Two different characterizations of right coprimeness are used in the sequel. One is
in the form of a polynomial matrix equation, while the other involves rank properties of
a complex matrix obtained by evaluation of a polynomial matrix at complex values of s.

16.9 Theorem For a p × r polynomial matrix P(s) and a nonsingular r × r polynomial
matrix Q(s), the following statements are equivalent.

(i) The polynomial matrices P(s) and Q(s) are right coprime.

(ii) There exist an r × p polynomial matrix X(s) and an r × r polynomial matrix Y(s)
satisfying the so-called Bezout identity

$$ X(s) P(s) + Y(s) Q(s) = I_r \tag{12} $$

(iii) For every complex number $s_0$,

$$ \operatorname{rank} \begin{bmatrix} Q(s_0) \\ P(s_0) \end{bmatrix} = r \tag{13} $$

Proof Beginning a demonstration that each claim implies the next, first we show
that (i) implies (ii). If P(s) and Q(s) are right coprime, then reduction to row Hermite
form as in (6) yields polynomial matrices $U_{11}(s)$ and $U_{12}(s)$ such that

$$ U_{11}(s) Q(s) + U_{12}(s) P(s) = I_r $$

and this has the form of (12).

To prove that (ii) implies (iii), write the condition (12) in the matrix form

$$ \begin{bmatrix} Y(s) & X(s) \end{bmatrix} \begin{bmatrix} Q(s) \\ P(s) \end{bmatrix} = I_r $$

If $s_0$ is a complex number for which

$$ \operatorname{rank} \begin{bmatrix} Q(s_0) \\ P(s_0) \end{bmatrix} < r $$

then we have a rank contradiction.

To show (iii) implies (i), suppose that (13) holds and R(s) is a common right
divisor of P(s) and Q(s). Then for some p × r polynomial matrix $\tilde{P}(s)$ and some r × r
polynomial matrix $\tilde{Q}(s)$,

$$ \begin{bmatrix} Q(s) \\ P(s) \end{bmatrix} = \begin{bmatrix} \tilde{Q}(s) \\ \tilde{P}(s) \end{bmatrix} R(s) \tag{14} $$

If det R(s) is a polynomial of degree at least one and $s_0$ is a root of this polynomial,
then $R(s_0)$ is a complex matrix of less than full rank. Thus we obtain the contradiction

$$ \operatorname{rank} \begin{bmatrix} Q(s_0) \\ P(s_0) \end{bmatrix} \le \operatorname{rank} R(s_0) < r $$

Therefore det R(s) is a nonzero constant, that is, R(s) is unimodular. This proves that
P(s) and Q(s) are right coprime.

□□□
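
A small instance, chosen here for illustration and not taken from the original text, shows how the rank condition (iii) detects a common right divisor that is not unimodular.

```latex
Q(s) = \begin{bmatrix} s+1 & 0 \\ 0 & s+2 \end{bmatrix}, \qquad
P(s) = \begin{bmatrix} s+1 & 1 \end{bmatrix}

% Evaluate at the root s_0 = -1 of \det Q(s):
\operatorname{rank} \begin{bmatrix} Q(-1) \\ P(-1) \end{bmatrix}
 = \operatorname{rank} \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} = 1 < 2

% so P(s) and Q(s) are not right coprime. Indeed, with
R(s) = \begin{bmatrix} s+1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \det R(s) = s+1,

Q(s) = \begin{bmatrix} 1 & 0 \\ 0 & s+2 \end{bmatrix} R(s), \qquad
P(s) = \begin{bmatrix} 1 & 1 \end{bmatrix} R(s)

% The reduced pair has full rank for every s_0, since rows 1 and 3 of
\begin{bmatrix} 1 & 0 \\ 0 & s_0+2 \\ 1 & 1 \end{bmatrix}
% are linearly independent regardless of s_0.
```

Only the roots of det Q(s) need to be tested, because the matrix in (13) has rank r wherever $Q(s_0)$ is invertible.
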
A right polynomial fraction description with N(s) and D(s) right coprime is
called simply a coprime right polynomial fraction description. The next result shows
that in an important sense all coprime right polynomial fraction descriptions of a given
transfer function are equivalent. In particular they all have the same degree.
16.10 Theorem For any two coprime right polynomial fraction descriptions of a
strictly-proper rational transfer function,

$$ G(s) = N(s) D^{-1}(s) = N_a(s) D_a^{-1}(s) $$

there exists a unimodular polynomial matrix U(s) such that

$$ N(s) = N_a(s) U(s), \qquad D(s) = D_a(s) U(s) $$

Proof By Theorem 16.9 there exist polynomial matrices X(s), Y(s), A(s), and B(s)
such that

$$ X(s) N_a(s) + Y(s) D_a(s) = I_m \tag{15} $$

and

$$ A(s) N(s) + B(s) D(s) = I_m \tag{16} $$

Since $N(s) D^{-1}(s) = N_a(s) D_a^{-1}(s)$, we have $N_a(s) = N(s) D^{-1}(s) D_a(s)$. Substituting this
into (15) gives

$$ X(s) N(s) + Y(s) D(s) = D_a^{-1}(s) D(s) $$

A similar calculation using $N(s) = N_a(s) D_a^{-1}(s) D(s)$ in (16) gives

$$ A(s) N_a(s) + B(s) D_a(s) = D^{-1}(s) D_a(s) $$

The left sides show that both $D_a^{-1}(s) D(s)$ and $D^{-1}(s) D_a(s)$ are polynomial matrices,
and since they are inverses of each other both must be unimodular. Let

$$ U(s) = D_a^{-1}(s) D(s) $$

Then

$$ N(s) = N_a(s) U(s), \qquad D(s) = D_a(s) U(s) $$

and the proof is complete.

Left Polynomial Fractions


Before going further we pause to consider left polynomial fraction descriptions and their

relation to right polynomial fraction descriptions of the same transfer function. This
means repeating much of the right-handed development, and proofs of the results are left
as unlisted exercises.
16.11 Definition A q × q polynomial matrix L(s) is called a left divisor of the q × p
polynomial matrix P(s) if there exists a q × p polynomial matrix $\tilde{P}(s)$ such that

$$ P(s) = L(s) \tilde{P}(s) $$
16.12 Definition If P (s) is a q x p polynomial matrix and Q (s) is a q x q polynomial
matrix, then a q x q polynomial matrix L (s) is called a common left divisor of P (s) and
Q (s) if L (s) is a left divisor of both P (s) and Q (s). We call L (s) a greatest common
left divisor of P (s) and Q (s) if it is a common left divisor, and if any other common

left divisor of P (s) and Q (s) is a left divisor of L (s). If all common left divisors of
P (s) and Q (s) are unimodular, then P (s) and Q (s) are called left coprime.
16.13 Example Revisiting Example 16.4 from the other side exhibits the different look
of right- and left-handed calculations. For

$$ P(s) = \begin{bmatrix} (s+1)^2 (s+2) \\ (s+1)(s+2)(s+3) \end{bmatrix} $$

one left divisor is

$$ L(s) = \begin{bmatrix} (s+1)^2 (s+2) & 0 \\ 0 & (s+1)(s+2)(s+3) \end{bmatrix} $$

where the corresponding 2 × 1 polynomial matrix $\tilde{P}(s)$ has unity entries. In this simple
case it should be clear how to write down many other left divisors.
16.14 Theorem Suppose P(s) is a q × p polynomial matrix and Q(s) is a q × q
polynomial matrix. If a (q + p) × (q + p) unimodular polynomial matrix U(s) and a
q × q polynomial matrix L(s) are such that

$$ \begin{bmatrix} Q(s) & P(s) \end{bmatrix} U(s) = \begin{bmatrix} L(s) & 0 \end{bmatrix} \tag{19} $$

then L(s) is a greatest common left divisor of P(s) and Q(s).

Three types of elementary column operations can be represented by post-multiplication
by a unimodular matrix. The first is interchange of two columns, and the
second is multiplication of any column by a nonzero real number. The third elementary
column operation is addition to any column of a polynomial multiple of another column.
It is easy to check that a sequence of these elementary column operations can be
represented by post-multiplication by a unimodular matrix. That post-multiplication by
any unimodular matrix can be represented by an appropriate sequence of elementary
column operations is a consequence of another special form, introduced below for the
class of polynomial matrices of interest.

16.15 Theorem Suppose P(s) is a q × p polynomial matrix and Q(s) is a q × q
nonsingular polynomial matrix. Then elementary column operations can be used to
transform

$$ M(s) = \begin{bmatrix} Q(s) & P(s) \end{bmatrix} $$

into a column Hermite form described as follows. For k = 1, ..., q, all entries of the
$k^{th}$-row to the right of the k,k-entry are zero, and the k,k-entry is monic with higher
degree than any entry to its left. (If the k,k-entry is unity, all entries to its left are zero.)

Theorem 16.14 and Theorem 16.15 together provide a method for computing
greatest common left divisors using elementary column operations to obtain column
Hermite form. The polynomial matrix L (s) in (19) will be lower-triangular.
16.16 Theorem For a q × p polynomial matrix P(s) and a nonsingular q × q
polynomial matrix Q(s), the following statements are equivalent.

(i) The polynomial matrices P(s) and Q(s) are left coprime.

(ii) There exist a p × q polynomial matrix X(s) and a q × q polynomial matrix Y(s)
such that

$$ P(s) X(s) + Q(s) Y(s) = I_q \tag{20} $$

(iii) For every complex number $s_0$,

$$ \operatorname{rank} \begin{bmatrix} Q(s_0) & P(s_0) \end{bmatrix} = q \tag{21} $$

Naturally a left polynomial fraction description composed of left coprime


polynomial matrices is called a coprime left polynomial fraction description.

16.17 Theorem For any two coprime left polynomial fraction descriptions of a
strictly-proper rational transfer function,

$$ G(s) = D^{-1}(s) N(s) = D_a^{-1}(s) N_a(s) $$

there exists a unimodular polynomial matrix U(s) such that

$$ N(s) = U(s) N_a(s), \qquad D(s) = U(s) D_a(s) $$


Suppose that we begin with the elementary right polynomial fraction description
and the elementary left polynomial fraction description in (3) for a given strictly-proper

rational transfer function G (s). Then appropriate greatest common divisors can be
extracted to obtain a coprime right polynomial fraction description, and a coprime left
polynomial fraction description for G (s). We now show that these two coprime
polynomial fraction descriptions have the same degree. An economical demonstration
relies on a particular polynomial-matrix inversion formula.

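16.18 Lemma Suppose that $V_{11}(s)$ is an m × m nonsingular polynomial matrix and

$$ V(s) = \begin{bmatrix} V_{11}(s) & V_{12}(s) \\ V_{21}(s) & V_{22}(s) \end{bmatrix} \tag{22} $$

is an (m + p) × (m + p) nonsingular polynomial matrix. Then defining the matrix of
rational functions $V_a(s) = V_{22}(s) - V_{21}(s) V_{11}^{-1}(s) V_{12}(s)$,

(i) $\det V(s) = \det[V_{11}(s)] \cdot \det[V_a(s)]$,

(ii) $\det V_a(s)$ is a nonzero rational function,

(iii) the inverse of V(s) is

$$ V^{-1}(s) = \begin{bmatrix}
V_{11}^{-1}(s) + V_{11}^{-1}(s) V_{12}(s) V_a^{-1}(s) V_{21}(s) V_{11}^{-1}(s) & -V_{11}^{-1}(s) V_{12}(s) V_a^{-1}(s) \\
-V_a^{-1}(s) V_{21}(s) V_{11}^{-1}(s) & V_a^{-1}(s)
\end{bmatrix} $$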

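Proof A partitioned calculation verifies

$$ \begin{bmatrix} I_m & 0 \\ -V_{21}(s) V_{11}^{-1}(s) & I_p \end{bmatrix} V(s)
   = \begin{bmatrix} V_{11}(s) & V_{12}(s) \\ 0 & V_a(s) \end{bmatrix} \tag{23} $$

Using the obvious determinant identity for block-triangular matrices, in particular

$$ \det \begin{bmatrix} I_m & 0 \\ -V_{21}(s) V_{11}^{-1}(s) & I_p \end{bmatrix} = 1 $$

gives

$$ \det V(s) = \det[V_{11}(s)] \cdot \det[V_a(s)] $$

Since V(s) and $V_{11}(s)$ are nonsingular polynomial matrices, this proves that $\det V_a(s)$
is a nonzero rational function, that is, $V_a^{-1}(s)$ exists. To establish (iii), multiply (23) on
the left by

$$ \begin{bmatrix} V_{11}^{-1}(s) & -V_{11}^{-1}(s) V_{12}(s) V_a^{-1}(s) \\ 0 & V_a^{-1}(s) \end{bmatrix} $$

to obtain

$$ \begin{bmatrix}
V_{11}^{-1}(s) + V_{11}^{-1}(s) V_{12}(s) V_a^{-1}(s) V_{21}(s) V_{11}^{-1}(s) & -V_{11}^{-1}(s) V_{12}(s) V_a^{-1}(s) \\
-V_a^{-1}(s) V_{21}(s) V_{11}^{-1}(s) & V_a^{-1}(s)
\end{bmatrix} V(s)
= \begin{bmatrix} I_m & 0 \\ 0 & I_p \end{bmatrix} $$

and the proof is complete.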

16.19 Theorem Suppose that a strictly-proper rational transfer function is represented
by a coprime right polynomial fraction and a coprime left polynomial fraction,

$$ G(s) = N(s) D^{-1}(s) = D_L^{-1}(s) N_L(s) \tag{24} $$

Then there exists a nonzero constant α such that $\det D(s) = \alpha \det D_L(s)$.
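Proof By right-coprimeness of N(s) and D(s) there exists an (m + p) × (m + p)
unimodular polynomial matrix

$$ U(s) = \begin{bmatrix} U_{11}(s) & U_{12}(s) \\ U_{21}(s) & U_{22}(s) \end{bmatrix} $$

such that

$$ \begin{bmatrix} U_{11}(s) & U_{12}(s) \\ U_{21}(s) & U_{22}(s) \end{bmatrix}
   \begin{bmatrix} D(s) \\ N(s) \end{bmatrix} = \begin{bmatrix} I_m \\ 0 \end{bmatrix} \tag{25} $$

For notational convenience let

$$ \begin{bmatrix} U_{11}(s) & U_{12}(s) \\ U_{21}(s) & U_{22}(s) \end{bmatrix}^{-1}
   = \begin{bmatrix} V_{11}(s) & V_{12}(s) \\ V_{21}(s) & V_{22}(s) \end{bmatrix} $$

Each $V_{ij}(s)$ is a polynomial matrix, and in particular (25) gives

$$ V_{11}(s) = D(s), \qquad V_{21}(s) = N(s) $$

Therefore $V_{11}(s)$ is nonsingular, and calling on Lemma 16.18 we have that

$$ U_{22}(s) = \left[\, V_{22}(s) - V_{21}(s) V_{11}^{-1}(s) V_{12}(s) \,\right]^{-1} $$

which of course is a polynomial matrix, is nonsingular. Furthermore writing

$$ \begin{bmatrix} U_{11}(s) & U_{12}(s) \\ U_{21}(s) & U_{22}(s) \end{bmatrix}
   \begin{bmatrix} V_{11}(s) & V_{12}(s) \\ V_{21}(s) & V_{22}(s) \end{bmatrix}
   = \begin{bmatrix} I_m & 0 \\ 0 & I_p \end{bmatrix} $$

gives, in the 2,2-block,

$$ U_{21}(s) V_{12}(s) + U_{22}(s) V_{22}(s) = I_p $$

By Theorem 16.16 this implies that $U_{21}(s)$ and $U_{22}(s)$ are left coprime. Also, from the
2,1-block,

$$ U_{21}(s) V_{11}(s) + U_{22}(s) V_{21}(s) = U_{21}(s) D(s) + U_{22}(s) N(s) = 0 \tag{26} $$

Thus we can write, from (26),

$$ G(s) = N(s) D^{-1}(s) = -U_{22}^{-1}(s) U_{21}(s) \tag{27} $$

This is a coprime left polynomial fraction description for G(s). Again using Lemma
16.18, and the unimodularity of V(s), there exists a nonzero constant α such that

$$ \begin{aligned}
\det \begin{bmatrix} V_{11}(s) & V_{12}(s) \\ V_{21}(s) & V_{22}(s) \end{bmatrix}
&= \det[V_{11}(s)] \cdot \det\left[\, V_{22}(s) - V_{21}(s) V_{11}^{-1}(s) V_{12}(s) \,\right] \\
&= \det[D(s)] \cdot \det[U_{22}^{-1}(s)] \\
&= \frac{\det D(s)}{\det U_{22}(s)} = \frac{1}{\alpha}
\end{aligned} $$

Therefore, for the coprime left polynomial fraction description in (27), we have
$\det U_{22}(s) = \alpha \det D(s)$. Finally, using the unimodular relation between coprime left
polynomial fractions in Theorem 16.17, such a determinant formula, with possibly a
different nonzero constant, must hold for any coprime left polynomial fraction
description for G(s).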

Column and Row Degrees


There is an additional technical consideration that complicates the representation of a
strictly-proper rational transfer function by polynomial fraction descriptions. First we
introduce terminology for matrix polynomials that is related to the notion of the degree
of a scalar polynomial. Recall again conventions that the degree of a nonzero constant is
zero, and the degree of the polynomial 0 is $-\infty$.

16.20 Definition For a p × r polynomial matrix P(s), the degree of the highest-degree
polynomial in the $j^{th}$-column of P(s), written $c_j[P]$, is called the $j^{th}$-column degree of
P(s). The column degree coefficient matrix for P(s), written $P_{hc}$, is the real p × r
matrix with i,j-entry given by the coefficient of $s^{c_j[P]}$ in the i,j-entry of P(s). If P(s) is
square and nonsingular, then it is called column reduced if

$$ \deg\,[\det P(s)] = c_1[P] + \cdots + c_p[P] \tag{28} $$

If P(s) is square, then the Laplace expansion of the determinant about columns
shows that the degree of det P(s) cannot be greater than $c_1[P] + \cdots + c_p[P]$. But it
can be less.

The issue that requires attention involves the column degrees of D(s) in a right
polynomial fraction description for a strictly-proper rational transfer function. It is clear
in the m = p = 1 case that this column degree plays an important role in realization
considerations, for example. The same is true in the multi-input, multi-output case, and
the complication is that column degrees of D(s) can be artificially high, and they can
change in the process of post-multiplication by a unimodular matrix. Therefore two
coprime right polynomial fraction descriptions for G(s), as in Theorem 16.10, can be
such that D(s) and $D_a(s)$ have different column degrees, even though the degrees of the
polynomials det D(s) and $\det D_a(s)$ are the same.
16.21

Example

The coprime right polynomial fraction description for

71_I_]

(29)

G(s)= [
specified by

N(s)=

[1

s1

21, D(s)=

sl

is such that c1 [D] = 1 and c2[D] = 1. Choosing the unimodular matrix

U(s)=
another coprime right polynomial fraction description for G (s) is given by
Na(s)

=N(s)U(s)= [2s22s+3

Da(S)D(S)U(S)
Clearly Ci[Da] = 3 and

DOD

21

+1 s+1
S2

= 1,though detDa(s) = detD(s).

The first step in investigating this situation is to characterize column-reduced
polynomial matrices in a way that does not involve computing a determinant. Using
Definition 16.20 it is convenient to write a p × p polynomial matrix P(s) in the form

$$ P(s) = P_{hc} \cdot \operatorname{diagonal}\left\{ s^{c_1[P]}, \ldots, s^{c_p[P]} \right\} + P_l(s) \tag{30} $$

where $P_l(s)$ is a p × p polynomial matrix in which each entry of the $j^{th}$-column has
degree strictly less than $c_j[P]$. (We use this notation only when P(s) is nonsingular, so
that $c_1[P], \ldots, c_p[P] \ge 0$.)

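16.22 Theorem If P(s) is a p × p nonsingular polynomial matrix, then P(s) is
column reduced if and only if $P_{hc}$ is invertible.

Proof We can write, using the representation (30),

$$ \begin{aligned}
s^{-c_1[P] - \cdots - c_p[P]} \cdot \det P(s)
&= \det\left[\, P(s) \cdot \operatorname{diagonal}\left\{ s^{-c_1[P]}, \ldots, s^{-c_p[P]} \right\} \right] \\
&= \det\left[\, P_{hc} + P_l(s) \cdot \operatorname{diagonal}\left\{ s^{-c_1[P]}, \ldots, s^{-c_p[P]} \right\} \right] \\
&= \det\left[\, P_{hc} + \tilde{P}(s^{-1}) \,\right]
\end{aligned} $$

where $\tilde{P}(s^{-1})$ is a matrix with entries that are polynomials in $s^{-1}$ that have no constant
terms, that is, no $(s^{-1})^0$ terms. A key fact in the remaining argument is that, viewing s as
real and positive, letting $s \to \infty$ yields $\tilde{P}(s^{-1}) \to 0$. Also the determinant of a matrix is
a continuous function of the matrix entries, so limit and determinant can be
interchanged. In particular we can write

$$ \begin{aligned}
\lim_{s \to \infty} \left[\, s^{-c_1[P] - \cdots - c_p[P]} \cdot \det P(s) \,\right]
&= \lim_{s \to \infty} \det\left[\, P_{hc} + \tilde{P}(s^{-1}) \,\right] \\
&= \det\left\{ \lim_{s \to \infty} \left[\, P_{hc} + \tilde{P}(s^{-1}) \,\right] \right\} \\
&= \det P_{hc}
\end{aligned} \tag{31} $$

Using (28) the left side of (31) is a nonzero constant if and only if P(s) is column
reduced, and thus the proof is complete.

□□□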
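
The test of Theorem 16.22 involves only constant-matrix arithmetic once the column degrees are known. The following sketch is an illustration under stated assumptions: polynomials are coefficient lists (lowest power first), the matrix is square with no all-zero column, and the function names `deg`, `column_degrees`, `high_column_coeff`, `det_const`, and `is_column_reduced` are hypothetical. The two sample matrices are chosen here and do not come from the original text.

```python
from fractions import Fraction

def deg(p):
    """Degree of a coefficient list (lowest power first); None for the zero polynomial."""
    nz = [i for i, c in enumerate(p) if c != 0]
    return nz[-1] if nz else None

def column_degrees(P):
    """Column degrees c_j[P]; assumes no column of P is identically zero."""
    cols = range(len(P[0]))
    return [max(d for d in (deg(row[j]) for row in P) if d is not None) for j in cols]

def high_column_coeff(P):
    """Column degree coefficient matrix: coefficient of s^(c_j) in each i,j-entry."""
    c = column_degrees(P)
    return [[Fraction(row[j][c[j]]) if len(row[j]) > c[j] else Fraction(0)
             for j in range(len(c))] for row in P]

def det_const(A):
    """Determinant of a square constant matrix by Laplace expansion."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det_const([r[:j] + r[j + 1:] for r in A[1:]])
               for j in range(len(A)))

def is_column_reduced(P):
    """Theorem 16.22: column reduced if and only if P_hc is invertible."""
    return det_const(high_column_coeff(P)) != 0

# D1(s) = [s^2  s+1; s  1]: det = -s, column degrees 2 and 1, P_hc = [1 1; 0 0].
D1 = [[[0, 0, 1], [1, 1]], [[0, 1], [1]]]
# D2(s) = [1  s+1; 1  1]: det = -s, column degrees 0 and 1, P_hc = [1 1; 1 0].
D2 = [[[1], [1, 1]], [[1], [1]]]
print(column_degrees(D1), is_column_reduced(D1))  # [2, 1] False
print(column_degrees(D2), is_column_reduced(D2))  # [0, 1] True
```

Both sample matrices have determinant −s, so the sum of column degrees equals the determinant degree only for the second, which is consistent with the invertibility test.
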
Consider a coprime right polynomial fraction description $N(s) D^{-1}(s)$, where
D(s) is not column reduced. We next show that elementary column operations on D(s)
(post-multiplication by a unimodular matrix U(s)) can be used to reduce individual
column degrees, and thus compute a new coprime right polynomial fraction description

$$ \tilde{N}(s) = N(s) U(s), \qquad \tilde{D}(s) = D(s) U(s) \tag{32} $$

where $\tilde{D}(s)$ is column reduced. Of course U(s) need not be constructed explicitly;
simply perform the same sequence of elementary column operations on N(s) as on
D(s) to obtain $\tilde{N}(s)$ along with $\tilde{D}(s)$.
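To describe the required calculations, suppose the column degrees of the m × m
polynomial matrix D(s) satisfy $c_1[D] \ge c_2[D] \ge \cdots \ge c_m[D]$, as can be achieved by
column interchanges. Using the notation

$$ D(s) = D_{hc} \cdot \operatorname{diagonal}\left\{ s^{c_1[D]}, \ldots, s^{c_m[D]} \right\} + D_l(s) $$

there exists a nonzero m × 1 vector z such that $D_{hc} z = 0$, since D(s) is not column
reduced. Suppose that the first nonzero entry in z is $z_k$, and define a corresponding
polynomial vector by

$$ z(s) = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ z_k \\ z_{k+1}\, s^{c_k[D] - c_{k+1}[D]} \\ \vdots \\ z_m\, s^{c_k[D] - c_m[D]} \end{bmatrix} \tag{33} $$

Then

$$ \begin{aligned}
D(s) z(s) &= D_{hc} \cdot \operatorname{diagonal}\left\{ s^{c_1[D]}, \ldots, s^{c_m[D]} \right\} z(s) + D_l(s) z(s) \\
&= D_{hc}\, z\, s^{c_k[D]} + D_l(s) z(s) \\
&= D_l(s) z(s)
\end{aligned} $$

and all entries of $D_l(s) z(s)$ have degree no greater than $c_k[D] - 1$. Choosing the
unimodular matrix

$$ U(s) = \begin{bmatrix} e_1 & \cdots & e_{k-1} & z(s) & e_{k+1} & \cdots & e_m \end{bmatrix} $$

where $e_i$ denotes the $i^{th}$-column of $I_m$, it follows that $\tilde{D}(s) = D(s) U(s)$ has column
degrees satisfying

$$ c_k[\tilde{D}] < c_k[D]; \qquad c_j[\tilde{D}] = c_j[D], \quad j = 1, \ldots, k-1, k+1, \ldots, m $$

If $\tilde{D}(s)$ is not column reduced, then the process is repeated, beginning with the
reordering of columns to obtain nonincreasing column degrees. A finite number of such
repetitions builds a unimodular U(s) such that $\tilde{D}(s)$ in (32) is column reduced.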
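
The procedure can be traced on a small matrix. The following worked instance is chosen here for illustration and is not taken from the original text; two repetitions are needed.

```latex
D(s) = \begin{bmatrix} s^2 & s+1 \\ s & 1 \end{bmatrix}, \quad \det D(s) = -s, \quad
c_1[D] = 2,\ c_2[D] = 1, \quad
D_{hc} = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}

% First repetition: D_{hc} z = 0 for z = [1 \ -1]^T, so k = 1 and
z(s) = \begin{bmatrix} 1 \\ -s \end{bmatrix}, \quad
U_1(s) = \begin{bmatrix} 1 & 0 \\ -s & 1 \end{bmatrix}, \quad
D(s) U_1(s) = \begin{bmatrix} -s & s+1 \\ 0 & 1 \end{bmatrix}

% Column degrees are now 1, 1 (sum 2 > 1), and the column degree coefficient
% matrix [-1 \ 1; 0 \ 0] is still singular.
% Second repetition: z = [1 \ 1]^T, k = 1, c_1 - c_2 = 0, so
z(s) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad
U_2(s) = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \quad
\tilde{D}(s) = D(s) U_1(s) U_2(s) = \begin{bmatrix} 1 & s+1 \\ 1 & 1 \end{bmatrix}

% Now c_1[\tilde{D}] + c_2[\tilde{D}] = 0 + 1 = \deg \det \tilde{D}(s), and
\tilde{D}_{hc} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \text{ is invertible.}
```

Each repetition lowers the first column degree by at least one while leaving det D(s) unchanged up to a nonzero constant, which is why the process terminates.
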

Another aspect of the column degree issue involves determining when a given
N(s) and D(s) are such that $N(s) D^{-1}(s)$ is a strictly-proper rational transfer function.
The relative column degrees of N(s) and D(s) play important roles, but not as simply
as the single-input, single-output case suggests.

16.23 Example

Suppose a right polynomial fraction description is specified by

1], D(s)= s3+i

N(s)= [s 2

s+1

Then

$$ c_1[N] = 2, \quad c_2[N] = 0, \quad c_1[D] = 3, \quad c_2[D] = 1 $$

and the column degrees of N(s) are less than the respective column degrees of D(s).
However an easy calculation shows that $N(s) D^{-1}(s)$ is not a matrix of strictly-proper
rational functions. This phenomenon is related again to the fact that $D_{hc}$ is
not invertible.

16.24 Theorem If the polynomial fraction description $N(s) D^{-1}(s)$ is a strictly-proper
rational function, then $c_j[N] < c_j[D]$, j = 1, ..., m. If D(s) is column reduced and
$c_j[N] < c_j[D]$, j = 1, ..., m, then $N(s) D^{-1}(s)$ is a strictly-proper rational function.

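Proof Suppose $G(s) = N(s)D^{-1}(s)$ is strictly proper. Then N(s) = G(s)D(s), and
in particular

$$ N_{ij}(s) = \sum_{k=1}^{m} G_{ik}(s) D_{kj}(s), \quad i = 1, \ldots, p, \; j = 1, \ldots, m \tag{34} $$

Then for any fixed value of j,

$$ N_{ij}(s)\, s^{-c_j[D]} = \sum_{k=1}^{m} G_{ik}(s) D_{kj}(s)\, s^{-c_j[D]}, \quad i = 1, \ldots, p $$

As we let (real) $s \to \infty$ each strictly-proper rational function $G_{ik}(s)$ approaches 0,
and each $D_{kj}(s)\, s^{-c_j[D]}$ approaches a finite constant, possibly zero. In any case this gives

$$ \lim_{s \to \infty} N_{ij}(s)\, s^{-c_j[D]} = 0, \quad i = 1, \ldots, p $$

Therefore $\deg N_{ij}(s) < c_j[D]$, $i = 1, \ldots, p$, which implies $c_j[N] < c_j[D]$.
Now suppose that D(s) is column reduced, and $c_j[N] < c_j[D]$, $j = 1, \ldots, m$. We
can write

$$ N(s)D^{-1}(s) = \left[ N(s) \cdot \mathrm{diagonal}\{ s^{-c_1[D]}, \ldots, s^{-c_m[D]} \} \right] \left[ D(s) \cdot \mathrm{diagonal}\{ s^{-c_1[D]}, \ldots, s^{-c_m[D]} \} \right]^{-1} \tag{35} $$

and since $c_j[N] < c_j[D]$, $j = 1, \ldots, m$,

$$ \lim_{s \to \infty} \left[ N(s) \cdot \mathrm{diagonal}\{ s^{-c_1[D]}, \ldots, s^{-c_m[D]} \} \right] = 0 $$

The adjugate-over-determinant formula implies that each entry in the inverse of a matrix
is a continuous function of the entries of the matrix. Thus limit can be interchanged with
matrix inversion,

$$ \lim_{s \to \infty} \left[ D(s) \cdot \mathrm{diagonal}\{ s^{-c_1[D]}, \ldots, s^{-c_m[D]} \} \right]^{-1} = \left[ \lim_{s \to \infty} \left( D(s) \cdot \mathrm{diagonal}\{ s^{-c_1[D]}, \ldots, s^{-c_m[D]} \} \right) \right]^{-1} $$

Writing D(s) in the form (30), the limit yields $D_{hc}^{-1}$. Then, from (35),

$$ \lim_{s \to \infty} N(s)D^{-1}(s) = 0 \cdot D_{hc}^{-1} = 0 $$

which implies strict properness.

□□□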
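
The two halves of Theorem 16.24 can be checked on small cases. The following is a minimal sketch, assuming polynomials are stored as coefficient lists (lowest power first) and restricting the properness test to a 1 x 2 N(s) and a 2 x 2 D(s). The matrices `N`, `D`, and `D2` are hypothetical illustrations, not the matrices of Example 16.23, and all function names are ours.

```python
# Polynomials are coefficient lists, lowest power first: [1, 0, 0, 1] is 1 + s^3.
def deg(p):
    """Degree of a polynomial; the zero polynomial is given degree -1 here."""
    for k in range(len(p) - 1, -1, -1):
        if p[k] != 0:
            return k
    return -1

def padd(a, b):
    n = max(len(a), len(b))
    return [(a[k] if k < len(a) else 0) + (b[k] if k < len(b) else 0) for k in range(n)]

def pneg(a):
    return [-x for x in a]

def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def column_degrees(P):
    """c_j[P]: highest degree among the entries of column j."""
    return [max(deg(P[i][j]) for i in range(len(P))) for j in range(len(P[0]))]

def highest_column_coeffs(P):
    """P_hc: entry (i, j) is the coefficient of s^(c_j[P]) in P_ij(s)."""
    c = column_degrees(P)
    return [[P[i][j][c[j]] if deg(P[i][j]) == c[j] else 0
             for j in range(len(P[0]))] for i in range(len(P))]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def strictly_proper(N_row, D):
    """For 1x2 N(s) and 2x2 D(s): is every entry of N(s) D^{-1}(s) strictly proper?"""
    (d11, d12), (d21, d22) = D
    n1, n2 = N_row
    det = padd(pmul(d11, d22), pneg(pmul(d12, d21)))
    num1 = padd(pmul(n1, d22), pneg(pmul(n2, d21)))   # first entry of N adj(D)
    num2 = padd(pmul(n2, d11), pneg(pmul(n1, d12)))   # second entry of N adj(D)
    return all(deg(num) < deg(det) for num in (num1, num2))

# Hypothetical data chosen for illustration only.
N = [[[0, 0, 1], [1]]]                    # N(s)  = [ s^2    1 ]
D = [[[1, 0, 0, 1], [0, 1]],              # D(s)  = [ s^3+1  s ]
     [[0, 0, 1], [1]]]                    #         [ s^2    1 ]   (not column reduced)
D2 = [[[1, 0, 0, 1], [0]],                # D2(s) = [ s^3+1  0 ]
      [[0, 0, 1], [0, 1]]]                #         [ s^2    s ]   (column reduced)

print(column_degrees(N), column_degrees(D))      # column degrees of N and D
print(det2(highest_column_coeffs(D)))            # zero: D_hc is singular
print(strictly_proper(N[0], D), strictly_proper(N[0], D2))
```

With `D`, the column degrees of N(s) are smaller than those of D(s), yet the fraction fails to be strictly proper because the highest column degree coefficient matrix is singular. With the column-reduced `D2`, the same degree inequalities do give a strictly-proper result, as the second part of the theorem asserts.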

It remains to give the corresponding development for left polynomial fraction


descriptions, though details are omitted.

16.25 Definition For a q x p polynomial matrix P(s), the degree of the highest-degree
polynomial in the $i^{th}$ row of P(s), written $r_i[P]$, is called the $i^{th}$ row degree of P(s).
The row degree coefficient matrix of P(s), written $P_{hr}$, is the real q x p matrix with i,j-entry
given by the coefficient of $s^{r_i[P]}$ in $P_{ij}(s)$. If P(s) is square and nonsingular, then
it is called row reduced if

$$ \deg\,[\det P(s)] = r_1[P] + \cdots + r_p[P] \tag{36} $$

16.26 Theorem If P(s) is a p x p nonsingular polynomial matrix, then P(s) is row
reduced if and only if $P_{hr}$ is invertible.

16.27 Theorem If the polynomial fraction description $D^{-1}(s)N(s)$ is a strictly-proper
rational function, then $r_i[N] < r_i[D]$, $i = 1, \ldots, p$. If D(s) is row reduced and
$r_i[N] < r_i[D]$, $i = 1, \ldots, p$, then $D^{-1}(s)N(s)$ is a strictly-proper rational function.

Finally, if $G(s) = D^{-1}(s)N(s)$ is a polynomial fraction description and D(s) is
not row reduced, then a unimodular matrix U(s) can be computed such that
$D_b(s) = U(s)D(s)$ is row reduced. Letting $N_b(s) = U(s)N(s)$, the left polynomial
fraction description

$$ D_b^{-1}(s)N_b(s) = [U(s)D(s)]^{-1}\, U(s)N(s) = G(s) \tag{37} $$

has the same degree as the original.


Because of machinery developed in this chapter, a polynomial fraction description
for a strictly-proper rational transfer function G(s) can be assumed as either a coprime
right polynomial fraction description with column-reduced D(s), or a coprime left
polynomial fraction description with row-reduced $D_L(s)$. In either case the degree of the
polynomial fraction description is the same, and is given by the sum of the column
degrees or, respectively, the sum of the row degrees.

EXERCISES
Exercise 16.1 Determine if the following pair of polynomial matrices is right coprime. If not,
compute a greatest common right divisor.

s-

Q(s)=

(s+l)(s+3)
s+3

Determine if the following pair of polynomial matrices is right coprime. If not,


compute a greatest common right divisor.
Exercise 16.2

P(s)=
Exercise 16.3

Q(s)=

s(s+l)2s

(s+l)2(s+2)2

(5+2)2

Show that the right polynomial fraction description

G(s)
is coprime if and only if there exist unimodular matrices U(s) and V(s) such that

]V(s)=

U(s)

If N (s)D '(s) is right coprime and

'(s) is another right polynomial fraction description


for G (s), show that there is a polynomial matrix R (s) such that

Da(s)
Ne(s)

D(s)
R(s)
N(s)

Suppose that D - (s)N (s) and D' (s)N(, (s) are coprime left polynomial fraction
descriptions for the same strictly-proper transfer function. Using Theorem 16.16, prove that
D(s)D'(s) is unimodular.
Exercise 16.4

Exercise 16.5 Suppose DZ'(s)NL(s) =


and both are coprime polynomial fraction
descriptions. Show that there exist U11(s) and U12(s) such that
U11(s) U12(s)
NL(s)
is

unimodular and

DL(s)


U11(s) U12(s)
NL(s) DL(s)
Exercise 16.6


D(s)

N(s)

For
0

D(s)=

0 s2+1
52+1

compute a unimodular U(s) such

that D(s)U(s)

is

column reduced.

Exercise 16.7 Suppose the inverse of the unimodular matrix


+
is

written as

Q(s) =
and p.

2. Prove that if

+
are

and

+ Qo

invertible, then

is

unimodular by

exhibiting R1 and R0 such that

[Pus
Exercise

16.8

Obtain a coprime, column-reduced right polynomial fraction description for

G(s)=

s s+2
i

Exercise 16.9

=R1s + R11

s+i

52+2 (s+l)2

s+l

An m x m matrix V(s) of proper rational functions is called biproper if $V^{-1}(s)$
exists and is a matrix of proper rational functions. Show that V(s) is biproper if and only if it can
be written as $V(s) = P(s)Q^{-1}(s)$, where P(s) and Q(s) are nonsingular, column-reduced
polynomial matrices with $c_i[P] = c_i[Q]$, $i = 1, \ldots, m$.

Exercise 16.10 Suppose N(s)D'(s) and N(s)D'(s) both are coprime right polynomial
fraction descriptions for a strictly-proper, rational transfer function G(s). Suppose also that D(s)
and D(s) both are column reduced with column degrees that satisfy the ordering
= c1[D], j = 1
m. (This shows that these column
c1 c2 <
c,,,, Show that
degrees are determined by the transfer function, not by a particular (coprime, column-reduced)
right polynomial fraction description.) Hint: AssumeJ is the least index for which cj[D I
and express the unimodular relation between D(s) and D(s) column-wise. Using linear
independence of the columns of
must be zero.

and

Dir, conclude that a submatrix of the unimodular matrix

NOTES
Note 16.1 A standard text and reference for polynomial fraction descriptions is

T. Kailath, Linear Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1980

At the beginning of Section 6.3 several citations to the mathematical theory of polynomial
matrices are provided. See also

S. Barnett, Polynomials and Linear Control Systems, Marcel Dekker, New York, 1983

A.I.G. Vardulakis, Linear Multivariable Control, John Wiley, Chichester, 1991

Note 16.2 The polynomial fraction description emerges from the time-domain description of
input-output differential equations of the form

$$ L(p)y(t) = M(p)u(t) $$

This is an older notation where p represents the differential operator d/dt, and L(p) and
M(p) are polynomial matrices in p. Early work based on this representation, much of it dealing
with state-equation realization issues, includes

E. Polak, "An algorithm for reducing a linear, time-invariant differential system to state
form," IEEE Transactions on Automatic Control, Vol. 11, No. 3, pp. 577-579, 1966

W.A. Wolovich, Linear Multivariable Systems, Applied Mathematical Sciences, Vol. 11,
Springer-Verlag, New York, 1974

For more recent developments consult the book by Vardulakis cited in Note 16.1, and

H. Blomberg, R. Ylinen, Algebraic Theory for Multivariable Linear Systems, Mathematics in
Science and Engineering, Vol. 166, Academic Press, London, 1983

Note 16.3 If P(s) is a p x p polynomial matrix, it can be shown that there exist unimodular
matrices U(s) and V(s) such that

$$ U(s)P(s)V(s) = \mathrm{diagonal}\{ \lambda_1(s), \ldots, \lambda_p(s) \} $$

where $\lambda_1(s), \ldots, \lambda_p(s)$ are monic polynomials with the property that $\lambda_k(s)$ divides
$\lambda_{k+1}(s)$. A similar result holds in the nonsquare case, with the polynomials $\lambda_k(s)$ on the
quasi-diagonal. This is called the Smith form for polynomial matrices. The polynomial fraction
description can be developed using this form, and the related Smith-McMillan form for rational
matrices, instead of Hermite forms. See Section 22 of

D.F. Delchamps, State Space and Input-Output Linear Systems, Springer-Verlag, New York, 1988

Note 16.4 Polynomial fraction descriptions are developed for time-varying linear systems in

A. Ilchmann, I. Nurnberger, W. Schmale, "Time-varying polynomial matrix systems,"
International Journal of Control, Vol. 40, No. 2, pp. 329-362, 1984

and, for the discrete-time case, in

P.P. Khargonekar, K.R. Poolla, "On polynomial matrix-fraction representations for linear
time-varying systems," Linear Algebra and Its Applications, Vol. 80, pp. 1-37, 1986

Note 16.5 In addition to polynomial fraction descriptions, rational fraction descriptions have
proved very useful in control theory. For an introduction to this different type of coprime
factorization, see

M. Vidyasagar, Control System Synthesis: A Factorization Approach, MIT Press, Cambridge,
Massachusetts, 1985

17
POLYNOMIAL FRACTION
APPLICATIONS

In this chapter we apply polynomial fraction descriptions for a transfer function in three
ways. First computation of a minimal realization from a polynomial fraction description
is considered, as well as the reverse computation of a polynomial fraction description for
a given linear state equation. Then the notions of poles and zeros of a transfer function
are defined in terms of polynomial fraction descriptions, and these concepts are
characterized in terms of response properties. Finally linear state feedback is treated
from the viewpoint of polynomial fraction descriptions for the open-loop and closed-loop
transfer functions.

Minimal Realization
We assume that a p x m strictly-proper rational transfer function is specified by a
coprime right polynomial fraction description

$$ G(s) = N(s)D^{-1}(s) \tag{1} $$

with D(s) column reduced. Then the column degrees of N(s) and D(s) satisfy
$c_j[N] < c_j[D]$, $j = 1, \ldots, m$. Some simplification occurs if one uninteresting case is
ruled out. If $c_j[D] = 0$ for some j, then by Theorem 16.24 G(s) is strictly proper if and
only if all entries of the $j^{th}$ column of N(s) are zero, that is, $c_j[N] = -\infty$. Therefore a
standing assumption in this chapter is that $c_1[D], \ldots, c_m[D] \geq 1$, which turns out to be
compatible with assuming rank B = m for a linear state equation. Recall that the
degree of the polynomial fraction description (1) is $c_1[D] + \cdots + c_m[D]$, since D(s) is
column reduced.
From Chapter 10 we know there exists a minimal realization for G(s),

$$ \dot{x}(t) = Ax(t) + Bu(t) $$
$$ y(t) = Cx(t) \tag{2} $$

In exploring the connection between a transfer function and its minimal realizations, an
additional bit of terminology is convenient.

17.1 Definition Suppose $N(s)D^{-1}(s)$ is a coprime right polynomial fraction
description for the p x m, strictly-proper, rational transfer function G(s). Then the
degree of this polynomial fraction description is called the McMillan degree of G(s).

The first objective is to show that the McMillan degree of G (s) is precisely the
dimension of minimal realizations of G (s). Our roundabout strategy is to prove that
minimal realizations cannot have dimension less than the McMillan degree, and then
compute a realization of dimension equal to the McMillan degree. This forces the
conclusion that the computed realization is a minimal realization.

17.2 Lemma The dimension of any realization of a strictly-proper rational transfer
function G(s) is at least the McMillan degree of G(s).

Proof Suppose that the linear state equation (2) is a dimension-n minimal
realization for the p x m transfer function G(s). Then (2) is both controllable and
observable, and

$$ G(s) = C(sI - A)^{-1}B $$

Define an n x m strictly-proper transfer function H(s) by the left polynomial fraction
description

$$ H(s) = D_L^{-1}(s)N_L(s) = (sI - A)^{-1}B \tag{3} $$

Clearly this left polynomial fraction description has degree n. Since the state equation
(2) is controllable, Theorem 13.4 gives

$$ \mathrm{rank}\,[\, D_L(s_o) \;\; N_L(s_o) \,] = \mathrm{rank}\,[\, s_oI - A \;\; B \,] = n $$

for every complex $s_o$. Thus by Theorem 16.16 the left polynomial fraction description
(3) is coprime. Now suppose $N_a(s)D_a^{-1}(s)$ is a coprime right polynomial fraction
description for H(s). Then this right polynomial fraction description also has degree n,
and

$$ G(s) = [CN_a(s)]D_a^{-1}(s) $$

is a degree-n right polynomial fraction description for G(s), though not necessarily
coprime. Therefore the McMillan degree of G(s) is no greater than n, the dimension of
a minimal realization of G(s).

□□□


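For notational assistance in the construction of a minimal realization, recall the
integrator coefficient matrices corresponding to a set of k positive integers, $\alpha_1, \ldots, \alpha_k$,
with $\alpha_1 + \cdots + \alpha_k = n$. From Definition 13.7 these matrices are

$$ A_o = \text{block diagonal} \left\{ \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \end{bmatrix}_{(\alpha_i \times \alpha_i)}, \; i = 1, \ldots, k \right\} $$

$$ B_o = \text{block diagonal} \left\{ \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}_{(\alpha_i \times 1)}, \; i = 1, \ldots, k \right\} $$

Define the corresponding integrator polynomial matrices by

$$ \Psi(s) = \text{block diagonal} \left\{ \begin{bmatrix} 1 \\ s \\ \vdots \\ s^{\alpha_i - 1} \end{bmatrix}, \; i = 1, \ldots, k \right\} $$

$$ \Delta(s) = \mathrm{diagonal}\{ s^{\alpha_1}, \ldots, s^{\alpha_k} \} $$

The terminology couldn't be more appropriate, as we now demonstrate.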

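17.3 Lemma The integrator polynomial matrices provide a right polynomial fraction
description for the corresponding integrator state equation. That is,

$$ (sI - A_o)^{-1}B_o = \Psi(s)\Delta^{-1}(s) \tag{5} $$

Proof To verify (5), first multiply on the left by $(sI - A_o)$ and on the right by $\Delta(s)$
to obtain

$$ B_o\Delta(s) = s\Psi(s) - A_o\Psi(s) \tag{6} $$

This expression is easy to check in a column-by-column fashion using the structure of
the various matrices. For example the first column of (6) is the obvious

$$ \begin{bmatrix} 0 \\ \vdots \\ 0 \\ s^{\alpha_1} \\ 0 \\ \vdots \\ 0 \end{bmatrix} = s \begin{bmatrix} 1 \\ s \\ \vdots \\ s^{\alpha_1 - 1} \\ 0 \\ \vdots \\ 0 \end{bmatrix} - \begin{bmatrix} s \\ \vdots \\ s^{\alpha_1 - 1} \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} $$

Proceeding similarly through the remaining columns in (6) yields the proof.

□□□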
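
The column-by-column check of (6) can also be carried out numerically. The following is a minimal sketch, assuming the integrator polynomial matrices are evaluated at a fixed number s rather than kept symbolic; the function names are ours and not from the text.

```python
def integrator_matrices(alphas, s):
    """Return A_o, B_o, Psi(s), Delta(s) for the integers alphas, evaluated at the number s."""
    n, k = sum(alphas), len(alphas)
    A = [[0] * n for _ in range(n)]
    B = [[0] * k for _ in range(n)]
    Psi = [[0] * k for _ in range(n)]
    Delta = [[0] * k for _ in range(k)]
    row = 0
    for j, a in enumerate(alphas):
        for i in range(a):
            if i < a - 1:
                A[row + i][row + i + 1] = 1    # ones on the superdiagonal of block j
            Psi[row + i][j] = s ** i            # block j of Psi is [1, s, ..., s^(a-1)]
        B[row + a - 1][j] = 1                   # single one in the last row of block j
        Delta[j][j] = s ** a
        row += a
    return A, B, Psi, Delta

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def identity_6_holds(alphas, s):
    """Check B_o Delta(s) == s Psi(s) - A_o Psi(s) at the given value of s."""
    A, B, Psi, Delta = integrator_matrices(alphas, s)
    lhs = matmul(B, Delta)
    APsi = matmul(A, Psi)
    rhs = [[s * Psi[i][j] - APsi[i][j] for j in range(len(alphas))]
           for i in range(sum(alphas))]
    return lhs == rhs

print(identity_6_holds([2, 3], 3))
print(identity_6_holds([1, 1, 4], -2))
```

Because both sides of (6) are polynomial in s, agreement at more sample points than the largest index is enough to confirm the identity for a given set of integers.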
Completing our minimal realization strategy now reduces to comparing a special

representation for the polynomial fraction description and a special structure for a
dimension-n state equation.

17.4 Theorem Suppose that a strictly-proper rational transfer function is described by
a coprime right polynomial fraction description (1), where D(s) is column reduced with
column degrees $c_1[D], \ldots, c_m[D] \geq 1$. Then the McMillan degree of G(s) is given by
$n = c_1[D] + \cdots + c_m[D]$, and minimal realizations of G(s) have dimension n.
Furthermore, writing

$$ N(s) = N_l\Psi(s) $$
$$ D(s) = D_{hc}\Delta(s) + D_l\Psi(s) \tag{7} $$

where $\Psi(s)$ and $\Delta(s)$ are the integrator polynomial matrices corresponding to
$c_1[D], \ldots, c_m[D]$, a minimal realization for G(s) is

$$ \dot{x}(t) = (A_o - B_oD_{hc}^{-1}D_l)\,x(t) + B_oD_{hc}^{-1}\,u(t) $$
$$ y(t) = N_l\,x(t) \tag{8} $$

where $A_o$ and $B_o$ are the integrator coefficient matrices corresponding to
$c_1[D], \ldots, c_m[D]$.

Proof First we verify that (8) is a realization for G(s). It is straightforward to write
down the representation in (7), where $N_l$ and $D_l$ are constant matrices that select for
appropriate polynomial entries of N(s) and $D_l(s)$. Then solving for $\Delta(s)$ in (7) and
substituting into (6) gives

$$ B_oD_{hc}^{-1}D(s) = s\Psi(s) - A_o\Psi(s) + B_oD_{hc}^{-1}D_l\Psi(s) $$
$$ \qquad\qquad = (sI - A_o + B_oD_{hc}^{-1}D_l)\,\Psi(s) $$

This implies

$$ \Psi(s)D^{-1}(s) = (sI - A_o + B_oD_{hc}^{-1}D_l)^{-1}B_oD_{hc}^{-1} \tag{9} $$

from which the transfer function for (8) is

$$ N_l\,(sI - A_o + B_oD_{hc}^{-1}D_l)^{-1}B_oD_{hc}^{-1} = N_l\Psi(s)D^{-1}(s) = N(s)D^{-1}(s) $$

Thus (8) is a realization of G(s) with dimension $c_1[D] + \cdots + c_m[D]$, which is the
McMillan degree of G(s). Then by invoking Lemma 17.2 we conclude that the
McMillan degree of G(s) is the dimension of minimal realizations of G(s).

□□□
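
The construction in Theorem 17.4 is mechanical enough to sketch in code. The following is a minimal illustration, assuming exact rational arithmetic and hypothetical coefficient matrices `N_l`, `D_hc`, `D_l` with column degrees 2 and 1. It builds the state equation (8) and compares its transfer function with $N(s)D^{-1}(s)$ at sample values of s. It does not test coprimeness, so it confirms only that (8) is a realization, not that it is minimal.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matsub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inverse(M):
    """Gauss-Jordan inverse over the rationals (assumes M is invertible)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def integrator_coefficients(degs):
    n, m = sum(degs), len(degs)
    A_o = [[0] * n for _ in range(n)]
    B_o = [[0] * m for _ in range(n)]
    row = 0
    for j, c in enumerate(degs):
        for i in range(c - 1):
            A_o[row + i][row + i + 1] = 1
        B_o[row + c - 1][j] = 1
        row += c
    return A_o, B_o

def integrator_polynomials(degs, s):
    n, m = sum(degs), len(degs)
    Psi = [[0] * m for _ in range(n)]
    Delta = [[0] * m for _ in range(m)]
    row = 0
    for j, c in enumerate(degs):
        for i in range(c):
            Psi[row + i][j] = s ** i
        Delta[j][j] = s ** c
        row += c
    return Psi, Delta

def realization(N_l, D_hc, D_l, degs):
    """State equation (8): A = A_o - B_o D_hc^{-1} D_l, B = B_o D_hc^{-1}, C = N_l."""
    A_o, B_o = integrator_coefficients(degs)
    B = matmul(B_o, inverse(D_hc))
    A = matsub(A_o, matmul(B, D_l))
    return A, B, N_l

def G_state(A, B, C, s):
    n = len(A)
    sI_minus_A = [[(s if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    return matmul(C, matmul(inverse(sI_minus_A), B))

def G_fraction(N_l, D_hc, D_l, degs, s):
    Psi, Delta = integrator_polynomials(degs, s)
    N = matmul(N_l, Psi)
    D = matadd(matmul(D_hc, Delta), matmul(D_l, Psi))
    return matmul(N, inverse(D))

# Hypothetical data: D(s) = [[s^2+3s+2, s], [1, s+4]], N(s) = [[1, 1], [s, 0]].
degs = [2, 1]
D_hc = [[1, 1], [0, 1]]
D_l = [[2, 3, 0], [1, 0, 4]]
N_l = [[1, 0, 1], [0, 1, 0]]
A, B, C = realization(N_l, D_hc, D_l, degs)
s = Fraction(2)
print(G_state(A, B, C, s) == G_fraction(N_l, D_hc, D_l, degs, s))
```

Exact fractions are used so that the comparison is an equality test rather than a floating-point tolerance check.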
In the minimal realization (8), note that if $D_{hc}$ is upper triangular with unity
diagonal entries, then the realization is in the controller form discussed in Chapter 13.
(Upper triangular structure for $D_{hc}$ can be obtained by elementary column operations on
the original polynomial fraction description.) If (8) is in controller form, then the
controllability indices are precisely $\rho_1 = c_1[D], \ldots, \rho_m = c_m[D]$. Summoning Theorem
10.14 and Exercise 13.10, we see that all minimal realizations of $N(s)D^{-1}(s)$ have the
same controllability indices up to reordering. Then Exercise 16.10 shows that all
minimal realizations of a strictly-proper rational transfer function G(s) have the same
controllability indices up to reordering.
Calculations similar to those in the proof of Theorem 17.4 can be used to display a
right polynomial fraction description for a given linear state equation.
17.5 Theorem Suppose the linear state equation (2) is controllable with controllability
indices $\rho_1, \ldots, \rho_m \geq 1$. Then the transfer function for (2) is given by the right
polynomial fraction description

$$ C(sI - A)^{-1}B = N(s)D^{-1}(s) $$

where

$$ N(s) = CP^{-1}\Psi(s) $$
$$ D(s) = R^{-1}\Delta(s) - R^{-1}UP^{-1}\Psi(s) \tag{10} $$

and D(s) is column reduced. Here $\Psi(s)$ and $\Delta(s)$ are the integrator polynomial
matrices corresponding to $\rho_1, \ldots, \rho_m$, P is the controller-form variable change, and U
and R are the coefficient matrices defined in Theorem 13.9. If the state equation (2) also
is observable, then $N(s)D^{-1}(s)$ is coprime with degree n.

Proof By Theorem 13.9 we can write

$$ PAP^{-1} = A_o + B_oUP^{-1}, \quad PB = B_oR $$

where $A_o$ and $B_o$ are the integrator coefficient matrices corresponding to $\rho_1, \ldots, \rho_m$.
Let $\Delta(s)$ and $\Psi(s)$ be the corresponding integrator polynomial matrices. Using (10) to
substitute for $\Delta(s)$ in (6) gives

$$ B_oRD(s) + B_oUP^{-1}\Psi(s) = s\Psi(s) - A_o\Psi(s) $$

Rearranging this expression yields

$$ \Psi(s)D^{-1}(s) = (sI - A_o - B_oUP^{-1})^{-1}B_oR $$

and therefore

$$ N(s)D^{-1}(s) = CP^{-1}(sI - A_o - B_oUP^{-1})^{-1}B_oR = C(sI - A)^{-1}B $$

This calculation verifies that the polynomial fraction description defined by (10)
represents the transfer function of the linear state equation (2). Also, D(s) in (10) is
column reduced because $D_{hc} = R^{-1}$. Since the degree of the polynomial fraction
description is n, if the state equation also is observable, hence a minimal realization of
its transfer function, then n is the McMillan degree of the polynomial fraction
description (10).

□□□
For left polynomial fraction descriptions, the strategy for right fraction
descriptions applies since the McMillan degree of G(s) also is the degree of any
coprime left polynomial fraction description for G(s). The only details that remain in
proving a left-handed version of Theorem 17.4 involve construction of a minimal
realization. But this construction is not difficult to deduce from a summary statement.
17.6 Theorem Suppose that a strictly-proper rational transfer function is described by
a coprime left polynomial fraction description $D^{-1}(s)N(s)$, where D(s) is row reduced
with row degrees $r_1[D], \ldots, r_p[D] \geq 1$. Then the McMillan degree of G(s) is given by
$n = r_1[D] + \cdots + r_p[D]$, and minimal realizations of G(s) have dimension n.
Furthermore, writing

$$ N(s) = \Psi^T(s)N_l $$
$$ D(s) = \Delta(s)D_{hr} + \Psi^T(s)D_l $$

where $\Psi(s)$ and $\Delta(s)$ are the integrator polynomial matrices corresponding to
$r_1[D], \ldots, r_p[D]$, a minimal realization for G(s) is

$$ \dot{x}(t) = (A_o^T - D_lD_{hr}^{-1}B_o^T)\,x(t) + N_l\,u(t) $$
$$ y(t) = D_{hr}^{-1}B_o^T\,x(t) $$

where $A_o$ and $B_o$ are the integrator coefficient matrices corresponding to
$r_1[D], \ldots, r_p[D]$.

Analogous to the discussion following Theorem 17.4, in the setting of Theorem
17.6 the observability indices of minimal realizations of $D^{-1}(s)N(s)$ are the same, up to
reordering, as the row degrees of D(s).


For the record we state a left-handed version of Theorem 17.5, leaving the proof to
Exercise 17.3.

17.7 Theorem Suppose the linear state equation (2) is observable with observability
indices $\eta_1, \ldots, \eta_p \geq 1$. Then the transfer function for (2) is given by the left
polynomial fraction description

$$ C(sI - A)^{-1}B = D^{-1}(s)N(s) $$

where

$$ N(s) = \Psi^T(s)Q^{-1}B $$
$$ D(s) = \Delta(s)S^{-1} - \Psi^T(s)Q^{-1}VS^{-1} $$

and D(s) is row reduced. Here $\Psi(s)$ and $\Delta(s)$ are the integrator polynomial matrices
corresponding to $\eta_1, \ldots, \eta_p$, Q is the observer-form variable change, and V and S are
the coefficient matrices defined in Theorem 13.17. If the state equation (2) also is
controllable, then $D^{-1}(s)N(s)$ is coprime with degree n.

Poles and Zeros


The connections between a coprime polynomial fraction description for a strictly-proper

rational transfer function G (s) and minimal realizations of G (s) can be used to define
notions of poles and zeros of G (s) that generalize the familiar notions for scalar transfer
functions. In addition we characterize these concepts in terms of response properties of a
minimal realization of G (s). (For readers pursuing discrete time, some translation of
these results is required.)
Given coprime polynomial fraction descriptions

$$ G(s) = N(s)D^{-1}(s) = D_L^{-1}(s)N_L(s) $$

it follows from Theorem 16.19 that the polynomials det D(s) and det $D_L(s)$ have the
same roots. Furthermore from Theorem 16.10 it is clear that these roots are the same for
every coprime polynomial fraction description. This permits introduction of terminology in
terms of either a right or left polynomial fraction description, though we adhere to a
societal bias and use right.

17.8 Definition Suppose G(s) is a strictly-proper rational transfer function. A
complex number $s_o$ is called a pole of G(s) if $\det D(s_o) = 0$, where $N(s)D^{-1}(s)$ is a
coprime right polynomial fraction description for G(s). The multiplicity of a pole $s_o$ is
the multiplicity of $s_o$ as a root of the polynomial det D(s).

This terminology is compatible with customary usage in the m = p = 1 case, and it
agrees with the definition used in Chapter 12. Specifically if $s_o$ is a pole of G(s), then
some entry $G_{ij}(s)$ is such that $|G_{ij}(s_o)| = \infty$. Conversely if some entry of G(s) has
infinite magnitude when evaluated at the complex number $s_o$, then $s_o$ is a pole of G(s).
(Detailed reasoning that substantiates these claims is left to Exercise 17.9.) Also
Theorem 12.9 stands in this terminology: A linear state equation with transfer function
G(s) is uniformly bounded-input, bounded-output stable if and only if all poles of G(s)
have negative real parts, that is, all roots of det D(s) have negative real parts.
The relation between eigenvalues of A in the linear state equation (2) and poles of
the corresponding transfer function

$$ G(s) = C(sI - A)^{-1}B $$

is a crucial feature in some of our arguments. Writing G(s) in terms of a coprime right
polynomial fraction description gives

$$ \frac{N(s)\,\mathrm{adj}\,D(s)}{\det D(s)} = \frac{C\,[\mathrm{adj}(sI - A)]\,B}{\det(sI - A)} \tag{15} $$

Using Lemma 17.2, (15) reveals that if $s_o$ is a pole of G(s) with multiplicity $\sigma_o$, then
$s_o$ is an eigenvalue of A with multiplicity at least $\sigma_o$. But simple single-input,
single-output examples confirm that multiplicities can be different, and in particular an
eigenvalue of A might not be a pole of G(s). The remedy for this displeasing situation
is to assume (2) is controllable and observable. Then (15) shows that, since the
denominator polynomials are identical up to a constant multiplier, the set of poles of
G(s) is identical to the set of eigenvalues of a minimal realization of G(s).
This discussion leads to an interpretation of a pole of a transfer function in terms
of zero-input response properties of a minimal realization of the transfer function.
This discussion leads to an interpretation of a pole of a transfer function in terms
of zero-input response properties of a minimal realization of the transfer function.

17.9 Theorem Suppose the linear state equation (2) is controllable and observable.
Then the complex number $s_o$ is a pole of

$$ G(s) = C(sI - A)^{-1}B $$

if and only if there exists a complex n x 1 vector $x_o$ and a nonzero complex p x 1 vector $y_o$
such that

$$ Ce^{At}x_o = y_o e^{s_o t}, \quad t \geq 0 \tag{16} $$

Proof If $s_o$ is a pole of G(s), then $s_o$ is an eigenvalue of A. With $x_o$ an eigenvector
of A corresponding to the eigenvalue $s_o$, we have

$$ e^{At}x_o = e^{s_o t}x_o $$

This easily gives (16), where $y_o = Cx_o$ is nonzero by the observability of (2) and the
corresponding eigenvector criterion in Theorem 13.14.
On the other hand if (16) holds, then taking Laplace transforms gives

$$ C(sI - A)^{-1}x_o = y_o\,(s - s_o)^{-1} $$

or

$$ (s - s_o)\,C\,\mathrm{adj}(sI - A)\,x_o = y_o \det(sI - A) $$

Evaluating this expression at $s = s_o$ shows that, since $y_o \neq 0$, $\det(s_oI - A) = 0$. Therefore
$s_o$ is an eigenvalue of A and, by minimality of the state equation, a pole of G(s).

□□□


Of course if $s_o$ is a real pole of G(s), then (16) directly gives a corresponding
zero-input response property of minimal realizations of G(s). If $s_o$ is complex, then the
real initial state $x_o + \bar{x}_o$ gives an easily-computed real response that can be written as a
product of an exponential with exponent $(\mathrm{Re}\,[s_o])\,t$ and a sinusoid with frequency
$\mathrm{Im}\,[s_o]$.
The concept of a zero of a transfer function is more delicate. For a scalar transfer
function G(s) with coprime numerator and denominator polynomials, a zero is a
complex number $s_o$ such that $G(s_o) = 0$. Evaluations of a scalar G(s) at particular
complex numbers can result in a zero or nonzero complex value, or can be undefined (at
a pole). These possibilities multiply for multi-input, multi-output systems, where a
corresponding notion of a zero is a complex $s_o$ where the matrix $G(s_o)$ 'loses rank.'
To carefully define the concept of a zero, the underlying assumption we make is
that rank G(s) = min [m, p] for almost all complex values of s. (By 'almost all' we
mean 'all but a finite number.') In particular at poles of G(s) at least one entry of G(s)
is ill-defined, and so poles are among those values of s ignored when checking rank.
(Another phrasing of this assumption is that G(s) is assumed to have rank min [m, p]
over the field of rational functions, a more sophisticated terminology that we do not
further employ.) Now consider coprime polynomial fraction descriptions

$$ G(s) = N(s)D^{-1}(s) = D_L^{-1}(s)N_L(s) $$

for G(s). Since both D(s) and $D_L(s)$ are nonsingular polynomial matrices, assuming
rank G(s) = min [m, p] for almost all complex values of s is equivalent to assuming
rank N(s) = min [m, p] for almost all complex values of s, and also equivalent to
assuming rank $N_L(s)$ = min [m, p] for almost all complex values of s. The agreeable
feature of polynomial fraction descriptions is that N(s) and $N_L(s)$ are well-defined for
all values of s. Either right or left polynomial fractions can be adopted as the basis for
defining transfer-function zeros.

17.10 Definition Suppose G(s) is a strictly-proper rational transfer function with
rank G(s) = min [m, p] for almost all complex numbers s. A complex number $s_o$ is
called a transmission zero of G(s) if rank $N(s_o)$ < min [m, p], where $N(s)D^{-1}(s)$ is
any coprime right polynomial fraction description for G(s).

This reduces to the customary definition in the single-input, single-output case.
But a look at multi-input, multi-output examples reveals subtleties in the concept of
transmission zero.

17.11 Example Consider the transfer function with coprime right polynomial fraction
description

$$ G(s) = \begin{bmatrix} \dfrac{s+2}{(s+1)^2} & 0 \\ 0 & \dfrac{s+1}{(s+2)^2} \end{bmatrix} = \begin{bmatrix} s+2 & 0 \\ 0 & s+1 \end{bmatrix} \begin{bmatrix} (s+1)^2 & 0 \\ 0 & (s+2)^2 \end{bmatrix}^{-1} \tag{19} $$

This transfer function has multiplicity-two poles at s = -1 and s = -2, and transmission
zeros at s = -1 and s = -2. Thus a multi-input, multi-output transfer function can have
coincident poles and transmission zeros, something that cannot happen in the m =
p = 1 case according to a careful reading of Definition 17.10.

17.12 Example The transfer function with coprime left polynomial fraction
description

s+l
(s+3)2

G(s)=

0
0

s+2
(s+4)

s+2

s+l

s+l

s+2

(20)

s+2 s+l

has no transmission zeros, even though various entries of G(s), viewed as single-input,
single-output transfer functions, have transmission zeros at s = -1 or s = -2.

DOD
Another complication arises as we develop a characterization of transmission
zeros in terms of identically-zero response of a minimal realization of G(s) to a
particular initial state and particular input signal. Namely with $m \geq 2$ there can exist a
nonzero m x 1 vector U(s) of strictly-proper rational functions such that G(s)U(s) = 0.
In this situation multiplying all the denominators in U(s) by the same nonzero
polynomial in s generates whole families of inputs for which the zero-state response is
identically zero. This inconvenience always occurs when m > p, a case that is left to
Exercise 17.5. Here we add an assumption that forces $m \leq p$.
The basic idea is to devise an input U(s) such that the zero-state response
component contains exponential terms due solely to poles of the transfer function, and
such that these exponential terms can be canceled by terms in the zero-input response
component.

17.13 Theorem Suppose the linear state equation (2) is controllable and observable,
and

$$ G(s) = C(sI - A)^{-1}B \tag{21} $$

has rank m for almost all complex numbers s. If the complex number $s_o$ is not a pole
of G(s), then it is a transmission zero of G(s) if and only if there is a nonzero, complex
m x 1 vector $u_o$ and a complex n x 1 vector $x_o$ such that

$$ Ce^{At}x_o + \int_0^t Ce^{A(t-\sigma)}Bu_o e^{s_o\sigma}\,d\sigma = 0, \quad t \geq 0 \tag{22} $$

Proof Suppose $N(s)D^{-1}(s)$ is a coprime right polynomial fraction description for
(21). If $s_o$ is not a pole of G(s), then $D(s_o)$ is invertible and $s_o$ is not an eigenvalue of
A. If $x_o$ and $u_o$ are such that (22) holds, then the Laplace transform of (22), multiplied
through by $(s - s_o)$, gives

$$ (s - s_o)\,C(sI - A)^{-1}x_o + N(s)D^{-1}(s)u_o = 0 $$

Evaluating this expression at $s = s_o$ yields

$$ N(s_o)D^{-1}(s_o)u_o = 0 $$

and this implies that rank $N(s_o)$ < m. That is, $s_o$ is a transmission zero of G(s).
On the other hand suppose $s_o$ is not a pole of G(s). Using the easily verified
identity

$$ (s_oI - A)^{-1}(s - s_o)^{-1} = (sI - A)^{-1}(s_oI - A)^{-1} + (sI - A)^{-1}(s - s_o)^{-1} \tag{23} $$

we can write, for any m x 1 complex vector $u_o$ and corresponding n x 1 complex vector
$x_o = (s_oI - A)^{-1}Bu_o$, the Laplace transform expression

$$ \mathcal{L}\left[ Ce^{At}x_o + \int_0^t Ce^{A(t-\sigma)}Bu_o e^{s_o\sigma}\,d\sigma \right] = C(sI - A)^{-1}x_o + C(sI - A)^{-1}Bu_o\,(s - s_o)^{-1} $$
$$ \qquad = C\left[ (sI - A)^{-1}(s_oI - A)^{-1} + (sI - A)^{-1}(s - s_o)^{-1} \right] Bu_o $$
$$ \qquad = C(s_oI - A)^{-1}Bu_o\,(s - s_o)^{-1} $$
$$ \qquad = N(s_o)D^{-1}(s_o)u_o\,(s - s_o)^{-1} $$

Taking the inverse Laplace transform gives, for the particular choice of $x_o$ above,

$$ Ce^{At}x_o + \int_0^t Ce^{A(t-\sigma)}Bu_o e^{s_o\sigma}\,d\sigma = N(s_o)D^{-1}(s_o)u_o\,e^{s_o t}, \quad t \geq 0 \tag{24} $$

Clearly the m x 1 vector $u_o$ can be chosen so that this expression is zero for $t \geq 0$ if
rank $N(s_o)$ < m, that is, if $s_o$ is a transmission zero of G(s).
Of course if a transmission zero $s_o$ is real and not a pole, then we can take $u_o$
real, and the corresponding $x_o = (s_oI - A)^{-1}Bu_o$ is real. Then (22) shows that the
complete response for $x(0) = x_o$ and $u(t) = u_o e^{s_o t}$ is identically zero. If $s_o$ is a
complex transmission zero, then specification of a real input and real initial state that
provides identically-zero response is left as a mild exercise.
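
The initial state used in the proof can be computed directly for a small case. The following is a minimal sketch, assuming a hypothetical single-input, single-output example $G(s) = (s+1)/(s^2+5s+6)$ realized in controller form, with real transmission zero $s_o = -1$ and $u_o = 1$. It computes $x_o = (s_oI - A)^{-1}Bu_o$ and checks two facts: the trajectory $x(t) = x_o e^{s_o t}$ satisfies the state equation for the input $u_o e^{s_o t}$, and the output amplitude $Cx_o$ is zero.

```python
from fractions import Fraction

# Hypothetical controller-form realization of G(s) = (s + 1) / (s^2 + 5 s + 6).
A = [[0, 1], [-6, -5]]
B = [0, 1]
C = [1, 1]
s_o = Fraction(-1)      # transmission zero of G(s), not a pole
u_o = Fraction(1)

def solve2(M, b):
    """Solve the 2x2 linear system M x = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - b[0] * M[1][0]) / det]

# x_o = (s_o I - A)^{-1} B u_o
M = [[s_o - A[0][0], -A[0][1]],
     [-A[1][0], s_o - A[1][1]]]
x_o = solve2(M, [B[0] * u_o, B[1] * u_o])

# With x(t) = x_o e^{s_o t} and u(t) = u_o e^{s_o t}, the state equation
# reduces to s_o x_o = A x_o + B u_o, and the output is (C x_o) e^{s_o t}.
lhs = [s_o * x for x in x_o]
rhs = [A[i][0] * x_o[0] + A[i][1] * x_o[1] + B[i] * u_o for i in range(2)]
y_amplitude = C[0] * x_o[0] + C[1] * x_o[1]

print(x_o, lhs == rhs, y_amplitude)
```

Since the state trajectory is a pure exponential and $Cx_o = G(s_o)u_o = 0$, the complete response is identically zero, in agreement with (22).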


State Feedback
Properties of linear state feedback

$$ u(t) = Kx(t) + Mr(t) $$

applied to a linear state equation (2) are discussed in Chapter 14 (in a slightly different
notation). As noted following Theorem 14.3, a direct approach to relating the closed-loop
and plant transfer functions is unpromising in the case of state feedback. However
polynomial fraction descriptions and an adroit formulation lead to a way around the
difficulty.

We assume that a strictly-proper rational transfer function for the plant is given as
a coprime right polynomial fraction $G(s) = N(s)D^{-1}(s)$ with D(s) column reduced.
To represent linear state feedback, it is convenient to write the input-output description

$$ Y(s) = N(s)D^{-1}(s)U(s) \tag{25} $$

as a pair of equations with polynomial matrix coefficients,

$$ D(s)\xi(s) = U(s) $$
$$ Y(s) = N(s)\xi(s) \tag{26} $$

The m x 1 vector $\xi(s)$ is called the pseudo-state of the plant. This terminology can be
motivated by considering a minimal realization of the form (8) for G(s). From (9) we
write

$$ \Psi(s)\xi(s) = \Psi(s)D^{-1}(s)U(s) = (sI - A_o + B_oD_{hc}^{-1}D_l)^{-1}B_oD_{hc}^{-1}U(s) $$

or

$$ s\Psi(s)\xi(s) = (A_o - B_oD_{hc}^{-1}D_l)\,\Psi(s)\xi(s) + B_oD_{hc}^{-1}U(s) \tag{27} $$

Defining the n x 1 vector x(t) as the inverse Laplace transform

$$ x(t) = \mathcal{L}^{-1}[\Psi(s)\xi(s)] $$

we see that (27) is the Laplace transform representation of the linear state equation (8)
with zero initial state. Beyond motivation for terminology, this development shows that
linear state feedback for a linear state equation corresponds to feedback of $\Psi(s)\xi(s)$ in
the associated pseudo-state representation (26).
Now, as illustrated in Figure 17.14, consider linear state feedback for (26)
represented by

$$ U(s) = K\Psi(s)\xi(s) + MR(s) \tag{28} $$

where K and M are real matrices of dimensions m x n and m x m, respectively. We
assume that M is invertible. To develop a polynomial fraction description for the
resulting closed-loop transfer function, substitute (28) into (26) to obtain

$$ [D(s) - K\Psi(s)]\,\xi(s) = MR(s) $$
$$ Y(s) = N(s)\xi(s) $$

17.14 Figure Transfer function diagram for state feedback.

Nonsingularity of the polynomial matrix $D(s) - K\Psi(s)$ is assured, since its column
degree coefficient matrix is the same as the assumed-invertible column degree coefficient
matrix for D(s). Therefore we can write

$$ \xi(s) = [D(s) - K\Psi(s)]^{-1}MR(s) $$
$$ Y(s) = N(s)\xi(s) \tag{29} $$

Since M is invertible (29) gives a right polynomial fraction description for the closed-loop
transfer function

$$ N(s)\hat{D}^{-1}(s) = N(s)\,[M^{-1}D(s) - M^{-1}K\Psi(s)]^{-1} \tag{30} $$

This description is not necessarily coprime, though $\hat{D}(s)$ is column reduced.

Calm reflection on (30) reveals that choices of K and invertible M provide
complete freedom to specify the coefficients of $\hat{D}(s)$. In detail, suppose

$$ D(s) = D_{hc}\Delta(s) + D_l\Psi(s) $$

and suppose the desired $\hat{D}(s)$ is

$$ \hat{D}(s) = \hat{D}_{hc}\Delta(s) + \hat{D}_l\Psi(s) $$

Then the feedback gains

$$ M = D_{hc}\hat{D}_{hc}^{-1}, \quad K = -M\hat{D}_l + D_l $$

accomplish the task. Although the choices of K and M do not directly affect N(s),
there is an indirect effect in that (30) might not be coprime. This occurs in a more
obvious fashion in the single-input, single-output case when linear state feedback places
a root of the denominator polynomial coincident with a root of the numerator
polynomial.
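
The gain formulas can be exercised on a scalar case. The following is a minimal sketch for the single-input situation, where $D_{hc}$ and $\hat{D}_{hc}$ are nonzero numbers and $D_l$, $\hat{D}_l$ are rows of coefficients multiplying $\Psi(s) = [1, s, \ldots, s^{n-1}]^T$. The function names and the numerical data are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def feedback_gains(D_hc, D_l, Dhat_hc, Dhat_l):
    """Single-input case: M = D_hc / Dhat_hc and K = -M * Dhat_l + D_l."""
    M = Fraction(D_hc) / Fraction(Dhat_hc)
    K = [-M * dh + d for dh, d in zip(Dhat_l, D_l)]
    return M, K

def closed_loop_denominator(D_hc, D_l, M, K):
    """Coefficients (lowest power first) of M^{-1} D(s) - M^{-1} K Psi(s)."""
    lower = [(d - k) / M for d, k in zip(D_l, K)]
    return lower + [Fraction(D_hc) / M]

# Plant denominator D(s) = 2 s^2 + 6 s + 4, desired denominator s^2 + 5 s + 6.
M, K = feedback_gains(2, [4, 6], 1, [6, 5])
print(M, K)
print(closed_loop_denominator(2, [4, 6], M, K))
```

The returned closed-loop coefficients reproduce the desired polynomial, which is the sense in which K and M give complete freedom over the coefficients of the closed-loop denominator. In the multi-input case the same formulas apply with matrix inverses in place of the scalar divisions.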

EXERCISES
Exercise 17.1 If $G(s) = D^{-1}(s)N(s)$ is coprime and D(s) is row reduced, show how to use the
right polynomial fraction description

$$ G^T(s) = N^T(s)\,[D^T(s)]^{-1} $$

and controller form to compute a minimal realization for G(s).

Exercise 17.2 Suppose the linear state equation

$$ \dot{x}(t) = Ax(t) + Bu(t) $$
$$ y(t) = Cx(t) $$

is controllable and observable, and

$$ C(sI - A)^{-1}B = N(s)D^{-1}(s) $$

is a coprime polynomial fraction description with D(s) column reduced. Given any p x n matrix
$C_a$, show that there exists a polynomial matrix $N_a(s)$ such that

$$ C_a(sI - A)^{-1}B = N_a(s)D^{-1}(s) $$

Conversely show that if $N_a(s)$ is a p x m polynomial matrix such that $N_a(s)D^{-1}(s)$ is strictly
proper, then there exists a $C_a$ such that this relation holds.

Exercise 17.3 Write out a detailed proof of Theorem 17.7.


Exercise 17.4 Suppose the linear state equation

=Ax(i)

Bu(t)

y(t) = Cx(t)
is controllable and observable with m = p. Use the product

sIAB
c o

C(s!AY'

to give a characterization of transmission zeros of C(si


the matrix

A)'B that are not also poles in terms of

siA B

-c

Exercise 17.5 Suppose the linear state equation

$$ \dot{x}(t) = Ax(t) + Bu(t) $$
$$ y(t) = Cx(t) $$

with p < m is controllable and observable, and

$$ G(s) = C(sI - A)^{-1}B $$

has rank p for almost all complex values of s. Suppose the complex number $s_o$ is not a pole of
G(s). Prove that $s_o$ is a transmission zero of G(s) if and only if there is a nonzero complex 1 x p
vector h with the property that for any complex m x 1 vector $u_o$ there is a complex n x 1 vector $x_o$
such that

$$ hCe^{At}x_o + \int_0^t hCe^{A(t-\sigma)}Bu_o e^{s_o\sigma}\,d\sigma = 0, \quad t \geq 0 $$

Phrase this result as a characterization of transmission zeros in terms of a complete-response
property, and contrast the result with Theorem 17.13.

Exercise 17.6 Given a strictly-proper transfer function $G(s)$, let $\eta(s)$ be the greatest common divisor of the numerators of all the entries of $G(s)$. The roots of the polynomial $\eta(s)$ are called the blocking zeros of $G(s)$. Show that every blocking zero of $G(s)$ is a transmission zero. Show that the converse holds if either $m = 1$ or $p = 1$, but not otherwise.

Exercise 17.7 Compute the transmission zeros of the transfer function

G (s)
where

2.

s+l

sI
s+l

-l

(s+4)2

is a real parameter.

Exercise 17.8 Consider a linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t)$$

where both $B$ and $C$ are square and invertible. What are the poles and transmission zeros of

$$G(s) = C(sI - A)^{-1}B$$


Exercise 17.9 Prove in detail that $s_o$ is a pole of $G(s)$ in the sense of Definition 17.8 if and only if some entry $G_{ij}(s)$ of $G(s)$ satisfies $|G_{ij}(s_o)| = \infty$.

Exercise 17.10

For a plant described by the right polynomial fraction

$$Y(s) = N(s)D^{-1}(s)U(s)$$

with dynamic output feedback described by the left polynomial fraction

$$U(s) = D_c^{-1}(s)N_c(s)Y(s) + MR(s)$$

show that the closed-loop transfer function can be written as

$$Y(s) = N(s)\,[D_c(s)D(s) - N_c(s)N(s)]^{-1}D_c(s)MR(s)$$

What natural assumption on the plant and feedback guarantees nonsingularity of the polynomial matrix $D_c(s)D(s) - N_c(s)N(s)$?

NOTES
Note 17.1 Constructions for various forms of minimal realizations from polynomial fraction descriptions are given in Chapter 6 of

T. Kailath, Linear Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1980

Also discussed are special forms for the polynomial fraction description that imply additional properties of particular minimal realizations. A method for computing coprime left and right polynomial fraction descriptions for a given linear state equation is presented in

C.H. Fang, "A new approach for calculating doubly-coprime matrix fraction descriptions," IEEE Transactions on Automatic Control, Vol. 37, No. 1, pp. 138-141, 1992
Note 17.2 Transmission zeros of a linear state equation can be characterized in terms of rank properties of the system matrix

$$\begin{bmatrix} sI - A & B \\ -C & 0 \end{bmatrix}$$

thereby avoiding the transfer function. An alternative is to characterize transmission zeros in terms of the Smith-McMillan form for the transfer function. Original sources for various approaches include

H.H. Rosenbrock, State Space and Multivariable Theory, Wiley Interscience, New York, 1970

C.A. Desoer, J.D. Schulman, "Zeros and poles of matrix transfer functions and their dynamical interpretation," IEEE Transactions on Circuits and Systems, Vol. 21, No. 1, pp. 3-8, 1974

See also the survey

C.B. Schrader, M.K. Sain, "Research in system zeros: A survey," International Journal of Control, Vol. 50, No. 4, pp. 1407-1433, 1989
Note 17.3 Efforts have been made to extend the concepts of poles and zeros to the time-varying case. This requires more sophisticated algebraic constructs, as indicated by the reference

E.W. Kamen, "Poles and zeros of linear time-varying systems," Linear Algebra and Its Applications, Vol. 98, pp. 263-289, 1988

or extension of the geometric theory discussed in Chapters 18 and 19, as in

O.M. Grasselli, S. Longhi, "Zeros and poles of linear periodic multivariable discrete-time systems," Circuits, Systems, and Signal Processing, Vol. 7, No. 3, pp. 361-380, 1988
Note 17.4 The standard observer, estimated-state-feedback approach to output feedback is treated in terms of polynomial fractions in

B.D.O. Anderson, V.V. Kucera, "Matrix fraction construction of linear compensators," IEEE Transactions on Automatic Control, Vol. 30, No. 11, pp. 1112-1114, 1985

and, for reduced-dimension observers in the discrete-time case,

P. Hippe, "Design of observer-based compensators in the frequency domain: The discrete-time case," International Journal of Control, Vol. 54, No. 3, pp. 705-727, 1991

Further material regarding applications of polynomial fractions in linear control theory can be found in the books by Wolovich and Vardulakis cited in Note 16.2, and in

F.M. Callier, C.A. Desoer, Multivariable Feedback Systems, Springer-Verlag, New York, 1982

C.T. Chen, Linear System Theory and Design, Holt, Rinehart, and Winston, New York, 1984

T. Kaczorek, Linear Control Systems, John Wiley, New York; Vol. 1, 1992; Vol. 2, 1993

The last reference includes the case of descriptor (singular) linear state equations.

18
GEOMETRIC THEORY

We begin with the study of subspace constructions that can be used to characterize the fine structure of a time-invariant linear state equation. After a brief review of relevant linear-algebraic notions, subspaces related to the concepts of controllability, observability, and stability are introduced. Then these definitions are extended to a closed-loop state equation resulting from state feedback. The presentation is in terms of continuous time, with adjustments for discrete time mentioned in Note 18.8.

Definitions of the subspaces of interest are offered in a coordinate-free manner, that is, the definitions do not presuppose any choice of basis for the ambient vector space. However implications of the definitions are most clearly exhibited in terms of particular basis choices. Therefore the significance of various constructions often is interpreted in terms of the structure of a linear state equation after a state-variable change corresponding to a particular change in basis. Additional subspace properties and related algorithms are developed in Chapter 19 in the course of addressing sample problems in linear control theory.

Subspaces
The geometric theory rests on fundamentals of vector spaces rather than the matrix algebra emphasized in other chapters. Therefore a review of the axioms for finite-dimensional linear vector spaces, and the properties of such spaces, is recommended. Basic notions such as the span of a set of vectors and a basis for a vector space are used freely. However we pause to recapitulate concepts related to subspaces of a vector space.

The vector spaces of interest can be viewed as $R^k$, for appropriate dimension $k$, though a more abstract notation is convenient and traditional. Suppose $\mathcal{V}$ and $\mathcal{W}$ are vector subspaces of a vector space $\mathcal{X}$ over the real field $R$. In this chapter the symbol '=' often means subspace equality, for example $\mathcal{V} = \mathcal{W}$. The symbol '$\subset$' denotes subspace inclusion, for example $\mathcal{V} \subset \mathcal{W}$, where this is not interpreted as strict inclusion. Thus $\mathcal{V} = \mathcal{W}$ is equivalent to the pair of inclusions $\mathcal{V} \subset \mathcal{W}$ and $\mathcal{W} \subset \mathcal{V}$. The usual method for proving that subspaces are identical is to show both inclusions. Also the symbol '0' means the zero vector, zero scalar, or the subspace 0, as indicated by context.
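Various other subspaces of $\mathcal{X}$ arise from subspaces $\mathcal{V}$ and $\mathcal{W}$. The intersection of $\mathcal{V}$ and $\mathcal{W}$ is defined by

$$\mathcal{V} \cap \mathcal{W} = \{\, v \mid v \in \mathcal{V};\ v \in \mathcal{W} \,\}$$

and the sum of subspaces is

$$\mathcal{V} + \mathcal{W} = \{\, v + w \mid v \in \mathcal{V};\ w \in \mathcal{W} \,\} \tag{1}$$

It is not difficult to verify that these indeed are subspaces. If $\mathcal{V} + \mathcal{W} = \mathcal{X}$ and $\mathcal{V} \cap \mathcal{W} = 0$, then we write the direct sum $\mathcal{X} = \mathcal{V} \oplus \mathcal{W}$. These basic operations extend to any finite number of subspaces in a natural way.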
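Linear maps on vector spaces evoke additional subspaces. If $\mathcal{Y}$ is another vector space over $R$ and $A$ is a linear map, $A: \mathcal{X} \to \mathcal{Y}$, then the kernel or null space of $A$ is

$$\text{Ker}[A] = \{\, x \mid x \in \mathcal{X};\ Ax = 0 \,\}$$

and the image or range space of $A$ is

$$\text{Im}[A] = \{\, Ax \mid x \in \mathcal{X} \,\}$$

Confirmation that these are subspaces is straightforward, though it should be emphasized that $\text{Ker}[A] \subset \mathcal{X}$, while $\text{Im}[A] \subset \mathcal{Y}$. Finally if $\mathcal{V} \subset \mathcal{X}$ and $\mathcal{Z} \subset \mathcal{Y}$, then the image of $\mathcal{V}$ under $A$ is the subspace of $\mathcal{Y}$ given by

$$A\mathcal{V} = \{\, Av \mid v \in \mathcal{V} \,\}$$

Of course $\text{Im}[A]$ is the same subspace as the image of $\mathcal{X}$ under $A$. The inverse image of $\mathcal{Z}$ with respect to $A$ is the subspace of $\mathcal{X}$:

$$A^{-1}\mathcal{Z} = \{\, x \mid x \in \mathcal{X};\ Ax \in \mathcal{Z} \,\}$$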

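These notations should be used carefully. Although $A(\mathcal{V} + \mathcal{W}) = A\mathcal{V} + A\mathcal{W}$, note that $(A_1 + A_2)\mathcal{V}$ typically is not the same subspace as $A_1\mathcal{V} + A_2\mathcal{V}$. However

$$(A_1 + A_2)\mathcal{V} \subset A_1\mathcal{V} + A_2\mathcal{V}$$

and

$$A_1\mathcal{V} + (A_1 + A_2)\mathcal{V} = A_1\mathcal{V} + A_2\mathcal{V} \tag{2}$$

Also the notation $A^{-1}\mathcal{Z}$ does not mean that $A^{-1}$ is applied to anything, or even that $A$ is an invertible linear map.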


On choosing bases for $\mathcal{X}$ and $\mathcal{Y}$, the map $A$ is represented by a real matrix that is also denoted by $A$ with confidence that the chance of confusion is slight.
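The subspace operations above can be checked numerically once bases are chosen. The following is a minimal sketch, not a prescribed method: it assumes a subspace is represented by a list of spanning vectors, and the helper names (`rank`, `dim_sum`, `dim_intersection`, `image`) are hypothetical. Dimensions are computed exactly with rational arithmetic, using $\dim(\mathcal{V} \cap \mathcal{W}) = \dim \mathcal{V} + \dim \mathcal{W} - \dim(\mathcal{V} + \mathcal{W})$.

```python
from fractions import Fraction

def rank(rows):
    """Exact rank of a matrix given as a list of rows (Gaussian elimination)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                f = m[i][c] / m[rk][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def dim_sum(V, W):
    """dim(V + W): the span of all spanning vectors taken together."""
    return rank(V + W)

def dim_intersection(V, W):
    """dim(V ∩ W) from the dimension formula for subspaces."""
    return rank(V) + rank(W) - rank(V + W)

def image(A, V):
    """Spanning vectors of the image A V (A is a list of rows)."""
    return [[sum(a * x for a, x in zip(row, v)) for row in A] for v in V]

# Two planes in R^3 that share the e2 axis.
e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
V, W = [e1, e2], [e2, e3]
print(dim_sum(V, W), dim_intersection(V, W))

# (A1 + A2)V can be strictly smaller than A1 V + A2 V: take A2 = -A1.
A1 = [[1, 0], [0, 1]]
A2 = [[-1, 0], [0, -1]]
A12 = [[0, 0], [0, 0]]          # A1 + A2
X = [[1, 0], [0, 1]]            # V is all of R^2
dim_lhs = rank(image(A12, X))
dim_rhs = rank(image(A1, X) + image(A2, X))
# Dimension check consistent with (2): A1 V + (A1 + A2)V versus A1 V + A2 V.
dim_eq2 = rank(image(A1, X) + image(A12, X))
```

In this example the sum of the two planes is all of $R^3$ and the intersection is one-dimensional, while $(A_1 + A_2)\mathcal{V} = 0$ even though $A_1\mathcal{V} + A_2\mathcal{V}$ is two-dimensional.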

Invariant Subspaces
Throughout this chapter we deal with concepts associated with the $m$-input, $p$-output, $n$-dimensional, time-invariant linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t), \quad x(0) = x_o$$
$$y(t) = Cx(t) \tag{3}$$

The coefficient matrices presume basis choices for the state, input, and output spaces, namely $R^n$, $R^m$, and $R^p$. However, adhering to tradition in the geometric theory, we adopt a more abstract view and write the state space $R^n$ as $\mathcal{X}$, the input space $R^m$ as $\mathcal{U}$, and the output space $R^p$ as $\mathcal{Y}$. Then the coefficient matrices in (3) are viewed as representing linear maps according to

$$A: \mathcal{X} \to \mathcal{X}, \quad B: \mathcal{U} \to \mathcal{X}, \quad C: \mathcal{X} \to \mathcal{Y}$$

State variable changes in (3) yielding $P^{-1}AP$, $P^{-1}B$, and $CP$ usually are discussed in the language of basis changes in the state space $\mathcal{X}$. The subspace $\text{Im}[B] \subset \mathcal{X}$ occurs frequently and is given the special symbol $\mathcal{B} = \text{Im}[B]$. Various additional subspaces are generated in our discussion, and the dependence on the specific coefficient matrices in (3) is routinely suppressed to simplify the notation and language.
The foundation for the development should be familiar from linear algebra.

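18.1 Definition A subspace $\mathcal{V} \subset \mathcal{X}$ is called an invariant subspace for $A: \mathcal{X} \to \mathcal{X}$ if $A\mathcal{V} \subset \mathcal{V}$.

18.2 Example The subspaces $0$, $\mathcal{X}$, $\text{Ker}[A]$, and $\text{Im}[A]$ of $\mathcal{X}$ all are invariant subspaces for $A$. If $\mathcal{V}$ is an invariant subspace for $A$, then so is $A^k\mathcal{V}$ for any nonnegative integer $k$. Other subspaces associated with (3) such as $\mathcal{B}$ and $\text{Ker}[C]$ are not invariant subspaces for $A$ in general.

□□□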
An important reason invariant subspaces are of interest for linear state equations can be explained in terms of the zero-input solution for (3). Suppose $\mathcal{V}$ is an invariant subspace for $A$. Then from the representation for the matrix exponential in Property 5.8,

$$e^{At}\mathcal{V} = \Big[ \sum_{k=0}^{n-1} \alpha_k(t) A^k \Big] \mathcal{V} \subset \sum_{k=0}^{n-1} \alpha_k(t) A^k \mathcal{V} \subset \mathcal{V} \tag{4}$$

for any value of $t \ge 0$. Therefore if $x_o \in \mathcal{V}$, then the zero-input solution of (3) satisfies $x(t) \in \mathcal{V}$ for all $t \ge 0$. (Notice that the calculation in (4) involves sums of matrices in the first term on the right side, then sums of subspaces in the second. This kind of mixing occurs frequently, though usually without comment.) Conversely a simple contradiction argument shows that if a subspace $\mathcal{V}$ is endowed with the property that $x_o \in \mathcal{V}$ implies the zero-input solution of (3) satisfies $x(t) \in \mathcal{V}$ for all $t \ge 0$, then $\mathcal{V}$ is an invariant subspace for $A$.

Bringing the input signal into play, we consider first a special subspace and associated standard notation. (Superficial differences in terminology for the discrete-time case begin to appear with the following definition.)
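18.3 Definition The subspace of $\mathcal{X}$ given by

$$\langle A \mid \mathcal{B} \rangle = \mathcal{B} + A\mathcal{B} + \cdots + A^{n-1}\mathcal{B} \tag{5}$$

is called the controllable subspace for the linear state equation (3).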

The Cayley-Hamilton theorem immediately implies that $\langle A \mid \mathcal{B} \rangle$ is an invariant subspace for $A$. Also it is easy to show that $\langle A \mid \mathcal{B} \rangle$ is the smallest subspace of $\mathcal{X}$ that contains $\mathcal{B}$ and is invariant under $A$. That is, every subspace that contains $\mathcal{B}$ and is invariant under $A$ contains $\langle A \mid \mathcal{B} \rangle$. Finally we note that the computation of $\langle A \mid \mathcal{B} \rangle$, more specifically the computation of a basis for the subspace, involves selecting linearly independent columns from the set of matrices $B, AB, \ldots, A^{n-1}B$.
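As an illustration of this computation, the sketch below forms the controllability matrix $[B \;\; AB \;\cdots\; A^{n-1}B]$ for a small diagonal example, reads off $\dim \langle A \mid \mathcal{B} \rangle$ as its rank, and checks $A$-invariance by confirming that appending the columns of $A[B \;\cdots\; A^{n-1}B]$ does not raise the rank. The example matrices and the helper names (`rank`, `matmul`, `hstack`, `ctrb`) are assumptions made for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Exact rank of a matrix given as a list of rows (Gaussian elimination)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                f = m[i][c] / m[rk][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def hstack(*Ms):
    """Place matrices side by side (they must share the row count)."""
    return [sum((M[i] for M in Ms), []) for i in range(len(Ms[0]))]

def ctrb(A, B):
    """Controllability matrix [B, AB, ..., A^(n-1) B]."""
    blocks, M = [], B
    for _ in range(len(A)):
        blocks.append(M)
        M = matmul(A, M)
    return hstack(*blocks)

# Diagonal example: the input reaches the first two modes only.
A = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
B = [[1], [1], [0]]

Q = ctrb(A, B)
dim_controllable = rank(Q)                  # dim <A|B>
invariant = rank(hstack(Q, matmul(A, Q))) == dim_controllable
print(dim_controllable, invariant)
```

Here $\langle A \mid \mathcal{B} \rangle$ is the span of $e_1$ and $e_2$, so the rank is 2, and the invariance test succeeds as the Cayley-Hamilton argument predicts.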
An important property of $\langle A \mid \mathcal{B} \rangle$ relates to the solution of (3) with nonzero input signal. By invariance, $x_o \in \langle A \mid \mathcal{B} \rangle$ implies

$$e^{At}x_o \in \langle A \mid \mathcal{B} \rangle, \quad t \ge 0$$

If $u(t)$ is a continuous input signal (for consistency with our default assumptions), then

$$\int_0^t e^{A(t-\sigma)}Bu(\sigma)\, d\sigma = \sum_{k=0}^{n-1} A^kB \int_0^t \alpha_k(t-\sigma)u(\sigma)\, d\sigma$$

The integral term on the right side provides, for each $t \ge 0$, an $m \times 1$ vector that describes the $k$th summand as a linear combination of columns of $A^kB$. The immediate conclusion is that if $x_o \in \langle A \mid \mathcal{B} \rangle$, then for any continuous input signal the corresponding solution of (3) satisfies $x(t) \in \langle A \mid \mathcal{B} \rangle$ for all $t \ge 0$. But to justify the terminology in Definition 18.3, we need to refine the notion of controllability introduced in Chapter 9.

18.4 Definition A vector $x_o \in \mathcal{X}$ is called a controllable state for (3) if for $x(0) = x_o$ there is a finite time $t_a > 0$ and a continuous input signal $u_a(t)$ such that the corresponding solution of (3) satisfies $x(t_a) = 0$.

Recalling the controllability Gramian, in the present context written as

$$W(0, t) = \int_0^t e^{-A\sigma}BB^Te^{-A^T\sigma}\, d\sigma \tag{6}$$

we first establish a preliminary result.

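18.5 Lemma For any $t_a > 0$,

$$\langle A \mid \mathcal{B} \rangle = \text{Im}[W(0, t_a)]$$

Proof Fixing $t_a > 0$, for any $n \times 1$ vector $x_o$,

$$W(0, t_a)x_o = \int_0^{t_a} e^{-A\sigma}BB^Te^{-A^T\sigma}x_o\, d\sigma = \sum_{k=0}^{n-1} A^kB \int_0^{t_a} \alpha_k(-\sigma)B^Te^{-A^T\sigma}x_o\, d\sigma$$

Since each column of $A^kB$ is in $\langle A \mid \mathcal{B} \rangle$ and the $k$th summand above is a linear combination of columns of $A^kB$, $W(0, t_a)x_o \in \langle A \mid \mathcal{B} \rangle$. This gives

$$\text{Im}[W(0, t_a)] \subset \langle A \mid \mathcal{B} \rangle$$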


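To establish the reverse containment, we use the proof of Theorem 13.1 to define a convenient basis. Clearly $\langle A \mid \mathcal{B} \rangle$ is the range space of the controllability matrix

$$\begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix}$$

for the linear state equation (3). Define an invertible $n \times n$ matrix $P$ column-wise by choosing a basis for $\langle A \mid \mathcal{B} \rangle$ and extending to a basis for $\mathcal{X}$. Then changing state variables according to $z(t) = P^{-1}x(t)$ leads to a new linear state equation in $z(t)$ with the coefficient matrices

$$P^{-1}AP = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix}, \quad P^{-1}B = \begin{bmatrix} \hat{B}_{11} \\ 0 \end{bmatrix}$$

These expressions can be used to write $W(0, t_a)$ in (6) as

$$W(0, t_a) = P \int_0^{t_a} \exp\!\left( -\begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix} \sigma \right) \begin{bmatrix} \hat{B}_{11} \\ 0 \end{bmatrix} \begin{bmatrix} \hat{B}_{11}^T & 0 \end{bmatrix} \exp\!\left( -\begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix}^T \sigma \right) d\sigma \; P^T = P \begin{bmatrix} \hat{W}_1(0, t_a) & 0 \\ 0 & 0 \end{bmatrix} P^T$$

where

$$\hat{W}_1(0, t_a) = \int_0^{t_a} e^{-\hat{A}_{11}\sigma}\hat{B}_{11}\hat{B}_{11}^Te^{-\hat{A}_{11}^T\sigma}\, d\sigma$$

is an invertible matrix. This representation shows that $\text{Im}[W(0, t_a)]$ contains any vector of the form

$$P \begin{bmatrix} z_a \\ 0 \end{bmatrix} \tag{8}$$

for setting

$$x = (P^T)^{-1} \begin{bmatrix} \hat{W}_1^{-1}(0, t_a)z_a \\ 0 \end{bmatrix}$$

we obtain

$$W(0, t_a)x = P \begin{bmatrix} \hat{W}_1(0, t_a) & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \hat{W}_1^{-1}(0, t_a)z_a \\ 0 \end{bmatrix} = P \begin{bmatrix} z_a \\ 0 \end{bmatrix}$$

Since

$$A^kB = P \begin{bmatrix} \hat{A}_{11}^k\hat{B}_{11} \\ 0 \end{bmatrix}, \quad k = 0, 1, \ldots$$

has the form (8), it follows that $\langle A \mid \mathcal{B} \rangle \subset \text{Im}[W(0, t_a)]$.

□□□

Lemma 18.5 provides the tool needed to show that $\langle A \mid \mathcal{B} \rangle$ is exactly the set of controllable states.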

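18.6 Theorem A vector $x_o \in \mathcal{X}$ is a controllable state for the linear state equation (3) if and only if $x_o \in \langle A \mid \mathcal{B} \rangle$.

Proof Fix $t_a > 0$. If $x_o \in \langle A \mid \mathcal{B} \rangle$, then Lemma 18.5 implies that there exists a vector $z \in \mathcal{X}$ such that $x_o = W(0, t_a)z$. Setting

$$u_a(t) = -B^Te^{-A^Tt}z$$

the solution of (3) with $x(0) = x_o$ is, when evaluated at $t = t_a$,

$$x(t_a) = e^{At_a}x_o - \int_0^{t_a} e^{A(t_a - \sigma)}BB^Te^{-A^T\sigma}z\, d\sigma = e^{At_a}\,[\,x_o - W(0, t_a)z\,] = 0 \tag{9}$$

Conversely if $x_o$ is a controllable state, then there exist a finite time $t_a > 0$ and continuous input $u_a(t)$ such that

$$0 = e^{At_a}x_o + \int_0^{t_a} e^{A(t_a - \sigma)}Bu_a(\sigma)\, d\sigma$$

Therefore

$$x_o = -\int_0^{t_a} e^{-A\sigma}Bu_a(\sigma)\, d\sigma = -\sum_{k=0}^{n-1} A^kB \int_0^{t_a} \alpha_k(-\sigma)u_a(\sigma)\, d\sigma$$

and this implies $x_o \in \langle A \mid \mathcal{B} \rangle$.

□□□

The proof of Theorem 18.6 shows that a linear state equation is controllable in the sense of Definition 9.1 if and only if every state is a controllable state. (The fact that $t_a$ can be fixed independent of the initial state is crucial; the diligent should supply reasoning.) Of course this can be stated in geometric language.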

18.7 Corollary The linear state equation (3) is controllable if and only if $\langle A \mid \mathcal{B} \rangle = \mathcal{X}$.

It can be shown that $\langle A \mid \mathcal{B} \rangle$ also is precisely the set of states that can be reached from the zero initial state in finite time using a continuous input signal. Such a characterization of $\langle A \mid \mathcal{B} \rangle$ as the set of reachable states is pursued in Exercise 18.8.
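Using the state variable change in the proof of Lemma 18.5, (3) can be written in terms of $z(t) = P^{-1}x(t)$ as a partitioned linear state equation

$$\begin{bmatrix} \dot{z}_c(t) \\ \dot{z}_{nc}(t) \end{bmatrix} = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix} \begin{bmatrix} z_c(t) \\ z_{nc}(t) \end{bmatrix} + \begin{bmatrix} \hat{B}_{11} \\ 0 \end{bmatrix} u(t)$$
$$y(t) = CPz(t) \tag{12}$$

Assuming $\dim \langle A \mid \mathcal{B} \rangle = q < n$, the submatrix $\hat{A}_{11}$ is $q \times q$, while $\hat{B}_{11}$ is $q \times m$. The component of the state equation (12) that describes $z_c(t)$,

$$\dot{z}_c(t) = \hat{A}_{11}z_c(t) + \hat{A}_{12}z_{nc}(t) + \hat{B}_{11}u(t)$$

is controllable. That is,

$$\text{rank} \begin{bmatrix} \hat{B}_{11} & \hat{A}_{11}\hat{B}_{11} & \cdots & \hat{A}_{11}^{q-1}\hat{B}_{11} \end{bmatrix} = q$$

(The extra term $\hat{A}_{12}z_{nc}(t)$, known from $\dot{z}_{nc}(t) = \hat{A}_{22}z_{nc}(t)$, does not change the ability to drive an initial state $z_c(0)$ to the origin in finite time.) Obviously the component of the state equation (12) describing $z_{nc}(t)$, namely

$$\dot{z}_{nc}(t) = \hat{A}_{22}z_{nc}(t)$$

is not controllable. The structure of (12) is exhibited in Figure 18.8.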

18.8 Figure

Decomposition of the state equation (12).

Coordinate changes of this type are used to display the structure of linear state
equations relative to other invariant subspaces, and formal terminology is convenient.

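18.9 Definition Suppose $\mathcal{V} \subset \mathcal{X}$ is a dimension-$\nu$ invariant subspace for $A: \mathcal{X} \to \mathcal{X}$. A basis $p_1, \ldots, p_n$ for $\mathcal{X}$ such that $p_1, \ldots, p_\nu$ span $\mathcal{V}$ is said to be adapted to the subspace $\mathcal{V}$.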


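In general, for the linear state equation (3), suppose $\mathcal{V}$ is a dimension-$\nu$ invariant subspace for $A$, not necessarily containing $\mathcal{B}$. Suppose also that columns of the $n \times n$ matrix $P$ form a basis for $\mathcal{X}$ adapted to $\mathcal{V}$. Then the state variable change $z(t) = P^{-1}x(t)$ yields

$$\begin{bmatrix} \dot{z}_a(t) \\ \dot{z}_b(t) \end{bmatrix} = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix} \begin{bmatrix} z_a(t) \\ z_b(t) \end{bmatrix} + \begin{bmatrix} \hat{B}_{11} \\ \hat{B}_{21} \end{bmatrix} u(t)$$
$$y(t) = CPz(t) \tag{13}$$

In terms of the basis $p_1, \ldots, p_n$ for $\mathcal{X}$, an $n \times 1$ vector $z \in \mathcal{X}$ satisfies $z \in \mathcal{V}$ if and only if it has the form

$$z = \begin{bmatrix} z_a \\ 0 \end{bmatrix}$$

The action of $A$ on $\mathcal{V}$ is described in the new basis by the partition $\hat{A}_{11}$ since

$$\begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix} \begin{bmatrix} z_a \\ 0 \end{bmatrix} = \begin{bmatrix} \hat{A}_{11}z_a \\ 0 \end{bmatrix}$$

Clearly $\hat{A}_{11}$ inherits features from $A$, for example eigenvalues. These features can be interpreted as properties of the partitioned linear state equation (13) as follows.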
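The linear state equation (13) can be written as two component state equations

$$\dot{z}_a(t) = \hat{A}_{11}z_a(t) + \hat{A}_{12}z_b(t) + \hat{B}_{11}u(t)$$
$$\dot{z}_b(t) = \hat{A}_{22}z_b(t) + \hat{B}_{21}u(t) \tag{14}$$

the first of which we specifically call the component state equation corresponding to $\mathcal{V}$. Exponential stability of (13) (equivalent to exponential stability of (3)) is equivalent to exponential stability of both state equations in (14). Also an easy exercise shows that controllability of (13) (equivalent to controllability of (3)) implies

$$\text{rank} \begin{bmatrix} \hat{B}_{21} & \hat{A}_{22}\hat{B}_{21} & \cdots & \hat{A}_{22}^{\,n-\nu-1}\hat{B}_{21} \end{bmatrix} = n - \nu$$

However simple examples show that controllability of (13) does not imply that

$$\begin{bmatrix} \hat{B}_{11} & \hat{A}_{11}\hat{B}_{11} & \cdots & \hat{A}_{11}^{\,\nu-1}\hat{B}_{11} \end{bmatrix}$$

has rank $\nu$. In case this is puzzling in relation to the special case where $\mathcal{V} = \langle A \mid \mathcal{B} \rangle$ in (12), note that if (12) is controllable, then $z_{nc}(t)$ is vacuous.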
Often geometric features of a linear state equation are discussed in a way that leaves understood the variable change. As with subspaces, the various properties we consider (controllability, observability, stability, and eigenvalue assignment) are uninfluenced by state variable change. At times it is convenient to address these properties in a particular set of coordinates, but other times it is convenient to leave the variable change unmentioned.

The geometric treatment of observability for the linear state equation (3) will not be pursued in such detail. The basic definition starts from a converse notion, and just as in Chapter 9 we consider only the zero-input response.
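18.10 Definition The subspace $\mathcal{N} \subset \mathcal{X}$ given by

$$\mathcal{N} = \bigcap_{k=0}^{n-1} \text{Ker}[CA^k]$$

is called the unobservable subspace for (3).

Another way of writing the unobservable subspace for (3) involves a slight extension of our inverse-image notation:

$$\mathcal{N} = \text{Ker}[C] \cap A^{-1}\text{Ker}[C] \cap \cdots \cap A^{-(n-1)}\text{Ker}[C]$$

It is easy to verify that $\mathcal{N}$ is an invariant subspace for $A$, and it is the largest subspace contained in $\text{Ker}[C]$ that is invariant under $A$. Also $\mathcal{N}$ is the null space of the observability matrix

$$\begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} \tag{15}$$

By showing that, for any $t_a > 0$,

$$\mathcal{N} = \text{Ker}[M(0, t_a)]$$

where

$$M(0, t) = \int_0^t e^{A^T\sigma}C^TCe^{A\sigma}\, d\sigma \tag{16}$$

is the observability Gramian for (3), the following results derive from an omitted linear-algebra argument.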

18.11 Theorem Suppose the linear state equation (3) with zero input and unknown initial state $x_o$ yields the output signal $y(t)$. Then for any $t_a > 0$, $x_o$ can be determined up to an additive $n \times 1$ vector in $\mathcal{N}$ from knowledge of $y(t)$ for $t \in [0, t_a]$.

18.12 Corollary The linear state equation (3) is observable if and only if $\mathcal{N} = 0$.
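A small numerical check of these statements: since the unobservable subspace is the null space of the observability matrix (15), its dimension is $n$ minus the rank of that matrix, and observability corresponds to dimension zero. The sketch below uses exact arithmetic on an assumed diagonal example; the helper names (`rank`, `matmul`, `obsv`, `dim_unobservable`) are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Exact rank of a matrix given as a list of rows (Gaussian elimination)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                f = m[i][c] / m[rk][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def obsv(A, C):
    """Observability matrix with block rows C, CA, ..., CA^(n-1)."""
    rows, M = [], C
    for _ in range(len(A)):
        rows = rows + M
        M = matmul(M, A)
    return rows

def dim_unobservable(A, C):
    """Dimension of the null space of the observability matrix."""
    return len(A) - rank(obsv(A, C))

A = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
C_partial = [[1, 0, 1]]   # the output never sees the second state
C_full = [[1, 1, 1]]      # the output sees every mode

print(dim_unobservable(A, C_partial))  # unobservable subspace is span{e2}
print(dim_unobservable(A, C_full))     # zero-dimensional: observable
```

With `C_partial` the unobservable subspace is one-dimensional, so the initial state is determined only up to a multiple of $e_2$; with `C_full` the subspace is $0$ and the state equation is observable.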

Finally we note that a state variable change with the columns of $P$ adapted to $\mathcal{N}$ transforms (3) to a state equation (13) with $CP$ in the partitioned form $[\,0 \;\; \hat{C}_{12}\,]$.
Additional invariant subspaces of importance are related to the internal stability properties of (3). Suppose that the characteristic polynomial of $A$ is factored into a product of polynomials

$$\det(\lambda I - A) = p^-(\lambda)\,p^+(\lambda)$$

where all roots of $p^-(\lambda)$ have negative real parts, and all roots of $p^+(\lambda)$ have nonnegative real parts. Each polynomial has real coefficients, and we denote the respective polynomial degrees by $n^-$ and $n^+$.

18.13 Definition The subspace of $\mathcal{X}$ given by

$$\mathcal{X}^- = \text{Ker}[p^-(A)]$$

is called the stable subspace for the linear state equation (3), and

$$\mathcal{X}^+ = \text{Ker}[p^+(A)]$$

is called the unstable subspace for (3).

Obviously $\mathcal{X}^-$ and $\mathcal{X}^+$ are subspaces of $\mathcal{X}$. Also both are invariant subspaces for $A$; the key to proving this is that $Ap(A) = p(A)A$ for any polynomial $p(\lambda)$. The stability terminology is justified by a fundamental decomposition property.

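18.14 Theorem The stable and unstable subspaces for the linear state equation (3) provide the direct sum decomposition

$$\mathcal{X} = \mathcal{X}^- \oplus \mathcal{X}^+ \tag{17}$$

Furthermore in a basis adapted to $\mathcal{X}^-$ and $\mathcal{X}^+$ the component state equation corresponding to $\mathcal{X}^-$ is exponentially stable, while all eigenvalues of the component state equation corresponding to $\mathcal{X}^+$ have nonnegative real parts.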

Proof Since the polynomials $p^-(\lambda)$ and $p^+(\lambda)$ are coprime (have no roots in common), there exist polynomials $q_1(\lambda)$ and $q_2(\lambda)$ such that

$$p^-(\lambda)q_1(\lambda) + p^+(\lambda)q_2(\lambda) = 1$$

(This standard result from algebra is a special case of Theorem 16.9. The polynomials $q_1(\lambda)$ and $q_2(\lambda)$ can be computed by elementary row operations as described in Theorem 16.6.) The operations of multiplication and addition that constitute a polynomial $p(\lambda)$ remain valid when $\lambda$ is replaced by the square matrix $A$. Therefore equality of polynomials, say $p(\lambda) = q(\lambda)$, implies equality of the matrices obtained by replacing $\lambda$ by $A$, namely $p(A) = q(A)$. By this argument we conclude

$$p^-(A)q_1(A) + p^+(A)q_2(A) = I \tag{18}$$

For any vector $z \in \mathcal{X}$, multiplying (18) on the right by $z$ shows that we can write

$$z = z^+ + z^-$$

where

$$z^+ = p^-(A)q_1(A)z, \quad z^- = p^+(A)q_2(A)z$$

The superscript notation $z^-$ and $z^+$ is suggestive, and indeed the Cayley-Hamilton theorem gives

$$p^-(A)z^- = p^-(A)p^+(A)q_2(A)z = 0, \quad p^+(A)z^+ = p^+(A)p^-(A)q_1(A)z = 0$$

That is, $z^- \in \mathcal{X}^-$ and $z^+ \in \mathcal{X}^+$, and thus $\mathcal{X} = \mathcal{X}^- + \mathcal{X}^+$. To show that $\mathcal{X}^- \cap \mathcal{X}^+ = 0$, we note that if $z \in \mathcal{X}^- \cap \mathcal{X}^+$, then

$$p^-(A)z = 0, \quad p^+(A)z = 0$$

Using (18), and commutativity of polynomials in $A$, gives

$$z = p^-(A)q_1(A)z + p^+(A)q_2(A)z = q_1(A)p^-(A)z + q_2(A)p^+(A)z = 0$$

Therefore (17) is verified.

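Now suppose the columns of $P$ form a basis for $\mathcal{X}$ adapted to $\mathcal{X}^-$ and $\mathcal{X}^+$. Then the first $n^-$ columns of $P$ form a basis for $\mathcal{X}^-$, the remaining $n^+$ columns form a basis for $\mathcal{X}^+$, and the state variable change $z(t) = P^{-1}x(t)$ yields the partitioned linear state equation

$$\begin{bmatrix} \dot{z}_a(t) \\ \dot{z}_b(t) \end{bmatrix} = \begin{bmatrix} \hat{A}_{11} & 0 \\ 0 & \hat{A}_{22} \end{bmatrix} \begin{bmatrix} z_a(t) \\ z_b(t) \end{bmatrix} + \begin{bmatrix} \hat{B}_{11} \\ \hat{B}_{21} \end{bmatrix} u(t)$$
$$y(t) = CPz(t) \tag{20}$$

Since the characteristic polynomials of the component state equations corresponding to $\mathcal{X}^-$ and $\mathcal{X}^+$ are, respectively,

$$\det(\lambda I - \hat{A}_{11}) = p^-(\lambda), \quad \det(\lambda I - \hat{A}_{22}) = p^+(\lambda)$$

the eigenvalue claims are obvious.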
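The decomposition can be confirmed on a small example. The sketch below assumes the factorization of the characteristic polynomial is supplied by hand (here $p^-(\lambda) = \lambda + 1$ and $p^+(\lambda) = \lambda - 2$ for an illustrative $2 \times 2$ matrix), evaluates the factors at $A$, and checks that the two kernels have dimensions adding to $n$ with zero intersection. The helper names (`rank`, `matmul`, `poly_at`) are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Exact rank of a matrix given as a list of rows (Gaussian elimination)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                f = m[i][c] / m[rk][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def poly_at(coeffs, A):
    """Evaluate a polynomial (coefficients, highest power first) at matrix A."""
    n = len(A)
    R = [[0] * n for _ in range(n)]
    for c in coeffs:                     # Horner's rule: R <- R A + c I
        R = matmul(R, A)
        R = [[R[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return R

# Assumed example: det(lambda I - A) = (lambda + 1)(lambda - 2).
A = [[-1, 1], [0, 2]]
p_minus = [1, 1]     # lambda + 1, root in the open left half-plane
p_plus = [1, -2]     # lambda - 2, root with nonnegative real part

Pm, Pp = poly_at(p_minus, A), poly_at(p_plus, A)
n = len(A)
dim_stable = n - rank(Pm)            # dim Ker[p^-(A)]
dim_unstable = n - rank(Pp)          # dim Ker[p^+(A)]
dim_common = n - rank(Pm + Pp)       # vectors annihilated by both factors
cayley_hamilton = matmul(Pm, Pp)     # p^-(A) p^+(A) should vanish
print(dim_stable, dim_unstable, dim_common)
```

Stacking the rows of $p^-(A)$ and $p^+(A)$ gives a matrix whose null space is $\mathcal{X}^- \cap \mathcal{X}^+$, so a full-rank stack confirms the intersection is $0$.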

18.15 Example
check. Let X=

As usual a diagonal-form state equation provides a helpful sanity


e4, and

with the standard basis e

1000
i(t)=

consider the state equation

x(t)+

u(t)

0004
y(t)= L

l]x(t)

Then the controllable subspace <A I B> is spanned by e1, e4, the unobservable
subspace
is spanned by e1, the stable subspace X is spanned by e3, e4, and the
e2. Verifying these answers both from basic
unstable subspace
is spanned by
intuition and from definitions of the subspaces is highly recommended.

Canonical Structure Theorem


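To illustrate the utility of invariant subspace constructions, we consider a conceptually important decomposition of a linear state equation (3) that is defined in terms of $\langle A \mid \mathcal{B} \rangle$ and $\mathcal{N}$. This is the canonical structure theorem cited in Note 10.2 and Note 26.5. Despite its name the result is difficult to precisely state in economical theorem form, and so we adopt a less structured presentation that starts at the geometric beginning.

Given (3), with associated controllable and unobservable subspaces $\langle A \mid \mathcal{B} \rangle$ and $\mathcal{N}$, the first step is to make use of Exercise 18.3 to note that $\langle A \mid \mathcal{B} \rangle \cap \mathcal{N}$ also is an invariant subspace for $A$. Next consider a change of state variables $z(t) = P^{-1}x(t)$, where $P$ is defined as follows. Let columns $p_1, \ldots, p_q$ be a basis for $\langle A \mid \mathcal{B} \rangle \cap \mathcal{N}$. Then suppose $p_1, \ldots, p_q, p_{q+1}, \ldots, p_r$ is a basis for $\langle A \mid \mathcal{B} \rangle$, and let $p_1, \ldots, p_q, p_{r+1}, \ldots, p_s$ be a basis for $\mathcal{N}$. Finally we extend to a basis $p_1, \ldots, p_n$ for $\mathcal{X}$. (Of course any of the subsets of column vectors could be empty, and corresponding partitions below would be absent (zero dimensional).) By keeping track of the invariant subspaces $\langle A \mid \mathcal{B} \rangle \cap \mathcal{N}$, $\langle A \mid \mathcal{B} \rangle$, $\mathcal{N}$, and $\mathcal{X}$, the coefficients of the linear state equation in terms of $z(t)$ have the partitioned form

$$\hat{A} = P^{-1}AP = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} & \hat{A}_{13} & \hat{A}_{14} \\ 0 & \hat{A}_{22} & 0 & \hat{A}_{24} \\ 0 & 0 & \hat{A}_{33} & \hat{A}_{34} \\ 0 & 0 & 0 & \hat{A}_{44} \end{bmatrix}, \quad \hat{B} = P^{-1}B = \begin{bmatrix} \hat{B}_{11} \\ \hat{B}_{21} \\ 0 \\ 0 \end{bmatrix}$$
$$\hat{C} = CP = \begin{bmatrix} 0 & \hat{C}_{12} & 0 & \hat{C}_{14} \end{bmatrix} \tag{22}$$

Perhaps this partitioning is easier to understand by first considering only that $P$ is a basis for $\mathcal{X}$ adapted to $\langle A \mid \mathcal{B} \rangle$. This implies the four 0-partitions in the lower-left corner of $\hat{A}$, and the two 0-partitions in $\hat{B}$. Then imposing the $A$-invariance of $\langle A \mid \mathcal{B} \rangle \cap \mathcal{N}$ and $\mathcal{N}$ explains the additional 0-partitions in $\hat{A}$, while the 0-partitions in $\hat{C}$ arise from $\mathcal{N} \subset \text{Ker}[C]$.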
Each of the four component state equations associated to (22) inherits particular controllability and observability properties from the corresponding invariant subspaces. We describe these properties with suggestive notation and free rearrangement of terms, recalling again that the introduction of known signals into a state equation does not change the properties of controllability or observability for the state equation.

The first component state equation

$$\dot{z}_a(t) = \hat{A}_{11}z_a(t) + \hat{B}_{11}u(t) + \hat{A}_{12}z_b(t) + \hat{A}_{13}z_c(t) + \hat{A}_{14}z_d(t)$$
$$y(t) = 0\,z_a(t) + \hat{C}_{12}z_b(t) + \hat{C}_{14}z_d(t)$$

is controllable, but not observable. The second component

$$\dot{z}_b(t) = \hat{A}_{22}z_b(t) + \hat{B}_{21}u(t) + \hat{A}_{24}z_d(t)$$
$$y(t) = \hat{C}_{12}z_b(t) + \hat{C}_{14}z_d(t)$$

is both controllable and observable. The component

$$\dot{z}_c(t) = \hat{A}_{33}z_c(t) + 0\,u(t) + \hat{A}_{34}z_d(t)$$
$$y(t) = 0\,z_c(t) + \hat{C}_{12}z_b(t) + \hat{C}_{14}z_d(t)$$

is neither controllable nor observable. The remaining component

$$\dot{z}_d(t) = \hat{A}_{44}z_d(t) + 0\,u(t)$$
$$y(t) = \hat{C}_{14}z_d(t) + \hat{C}_{12}z_b(t)$$

is observable, but not controllable.

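Often this decomposition is interpreted in a different fashion, where the connecting signals are de-emphasized. We say that

$$\dot{z}_b(t) = \hat{A}_{22}z_b(t) + \hat{B}_{21}u(t)$$
$$y(t) = \hat{C}_{12}z_b(t) \tag{23}$$

is the controllable and observable subsystem, while

$$\dot{z}_c(t) = \hat{A}_{33}z_c(t)$$

is the uncontrollable and unobservable subsystem. Then

$$\begin{bmatrix} \dot{z}_a(t) \\ \dot{z}_b(t) \end{bmatrix} = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix} \begin{bmatrix} z_a(t) \\ z_b(t) \end{bmatrix} + \begin{bmatrix} \hat{B}_{11} \\ \hat{B}_{21} \end{bmatrix} u(t) \tag{24}$$

is called the controllable subsystem, and the observable subsystem is

$$\begin{bmatrix} \dot{z}_b(t) \\ \dot{z}_d(t) \end{bmatrix} = \begin{bmatrix} \hat{A}_{22} & \hat{A}_{24} \\ 0 & \hat{A}_{44} \end{bmatrix} \begin{bmatrix} z_b(t) \\ z_d(t) \end{bmatrix} + \begin{bmatrix} \hat{B}_{21} \\ 0 \end{bmatrix} u(t)$$
$$y(t) = \begin{bmatrix} \hat{C}_{12} & \hat{C}_{14} \end{bmatrix} \begin{bmatrix} z_b(t) \\ z_d(t) \end{bmatrix} \tag{25}$$

This terminology leads to a view of (22) as an interconnection of the four subsystems.

It is important to be careful in interpreting and discussing this 'theorem.' One common misconception is that the decomposition is an immediate consequence of sequential application of the controllability decomposition in Theorem 13.1 and the observability decomposition in Theorem 13.12. Also it is easy to mangle the structure of the coefficients in (22) if one or more of the partitions is zero-dimensional.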
Delicate aspects aside, the canonical structure theorem immediately connects to realization theory. A straightforward calculation shows that the transfer function of (3), which is the same as the transfer function for (22), is

$$Y(s) = \hat{C}_{12}(sI - \hat{A}_{22})^{-1}\hat{B}_{21}U(s) \tag{26}$$

That is, all subsystems except the controllable and observable subsystem (23) are irrelevant to the input-output behavior (zero-state response) of (3). Put another way, in a minimal state equation only the subsystem (23) is present.

Controlled Invariant Subspaces


Linear state feedback can be used to modify the invariant subspaces for a given linear state equation. This leads to the formulation of feedback control problems in terms of specified invariant subspaces for the closed-loop state equation. However we begin by showing that the controllable subspace for (3) cannot be modified by state feedback. Then the effect of feedback on other types of invariant subspaces is considered.

In a departure from the notation of Chapter 14, but consonant with the geometric literature, we write linear state feedback as

$$u(t) = Fx(t) + Gv(t) \tag{27}$$

where $F$ is $m \times n$, $G$ is $m \times m$, and $v(t)$ represents the $m \times 1$ reference input. The resulting closed-loop state equation is

$$\dot{x}(t) = (A + BF)x(t) + BGv(t)$$
$$y(t) = Cx(t) \tag{28}$$

In Exercise 13.11 the objective is to show that for $G = I$ the closed-loop state equation is controllable if the open-loop state equation is controllable, regardless of $F$. We generalize this by showing that the set of controllable states does not change under such state feedback. The result holds also for any $G$ that is invertible, since invertibility of $G$ guarantees $\mathcal{B} = \text{Im}[BG]$.
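18.16 Theorem For any $F$,

$$\langle A + BF \mid \mathcal{B} \rangle = \langle A \mid \mathcal{B} \rangle \tag{29}$$

Proof For any $F$ and any subspace $\mathcal{W}$ we can write, similar to (2),

$$\mathcal{B} + (A + BF)\mathcal{W} = \mathcal{B} + A\mathcal{W}$$

This immediately provides the first step of an induction proof:

$$\mathcal{B} + (A + BF)\mathcal{B} = \mathcal{B} + A\mathcal{B}$$

Now assume $K$ is a positive integer such that

$$\mathcal{B} + (A + BF)\mathcal{B} + \cdots + (A + BF)^K\mathcal{B} = \mathcal{B} + A\mathcal{B} + \cdots + A^K\mathcal{B}$$

Then

$$\mathcal{B} + (A + BF)\mathcal{B} + \cdots + (A + BF)^{K+1}\mathcal{B} = \mathcal{B} + (A + BF)\big[\, \mathcal{B} + \cdots + (A + BF)^K\mathcal{B} \,\big]$$
$$= \mathcal{B} + (A + BF)\big[\, \mathcal{B} + A\mathcal{B} + \cdots + A^K\mathcal{B} \,\big]$$
$$= \mathcal{B} + A\mathcal{B} + \cdots + A^{K+1}\mathcal{B}$$

This induction argument proves (29).

□□□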
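Theorem 18.16 is easy to probe numerically: the controllability matrices of $(A, B)$ and $(A + BF, B)$ should have the same column space for every $F$. The sketch below checks this for one assumed example and one arbitrary gain by comparing the two ranks with the rank of the matrices placed side by side. The example data and helper names (`rank`, `matmul`, `hstack`, `ctrb`) are assumptions for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Exact rank of a matrix given as a list of rows (Gaussian elimination)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                f = m[i][c] / m[rk][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def hstack(*Ms):
    """Place matrices side by side (they must share the row count)."""
    return [sum((M[i] for M in Ms), []) for i in range(len(Ms[0]))]

def ctrb(A, B):
    """Controllability matrix [B, AB, ..., A^(n-1) B]."""
    blocks, M = [], B
    for _ in range(len(A)):
        blocks.append(M)
        M = matmul(A, M)
    return hstack(*blocks)

A = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
B = [[1], [1], [0]]
F = [[5, -1, 2]]                      # arbitrary 1 x 3 feedback gain

BF = matmul(B, F)
A_cl = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, BF)]

Q_open, Q_closed = ctrb(A, B), ctrb(A_cl, B)
r_open, r_closed = rank(Q_open), rank(Q_closed)
r_joint = rank(hstack(Q_open, Q_closed))
# Equal subspaces: neither matrix adds directions to the other.
same_subspace = (r_open == r_closed == r_joint)
print(r_open, r_closed, r_joint, same_subspace)
```

Equality of the three ranks is equivalent to equality of the two column spaces, which is the statement (29) for this particular $F$.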
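Consider again the linear state equation (3) written, after state variable change, in the form (12). Applying the partitioned state feedback

$$u(t) = \begin{bmatrix} \hat{F}_{11} & \hat{F}_{12} \end{bmatrix} \begin{bmatrix} z_c(t) \\ z_{nc}(t) \end{bmatrix} + v(t)$$

to (12) yields the closed-loop state equation

$$\begin{bmatrix} \dot{z}_c(t) \\ \dot{z}_{nc}(t) \end{bmatrix} = \begin{bmatrix} \hat{A}_{11} + \hat{B}_{11}\hat{F}_{11} & \hat{A}_{12} + \hat{B}_{11}\hat{F}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix} \begin{bmatrix} z_c(t) \\ z_{nc}(t) \end{bmatrix} + \begin{bmatrix} \hat{B}_{11} \\ 0 \end{bmatrix} v(t)$$
$$y(t) = CPz(t) \tag{30}$$

From the discussion following (12), it is clear that $\hat{F}_{11}$ can be chosen so that $\hat{A}_{11} + \hat{B}_{11}\hat{F}_{11}$ has any desired eigenvalues. It is also important to note that regardless of $F$ the eigenvalues of $\hat{A}_{22}$ in (30) remain fixed. That is, there is a factor of the characteristic polynomial for (30) that cannot be changed by state feedback.

Basic terminology used to discuss additional invariant subspaces for the closed-loop state equation is introduced next.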

18.17 Definition A subspace $\mathcal{V} \subset \mathcal{X}$ is called a controlled invariant subspace for the linear state equation (3) if there exists an $m \times n$ matrix $F$ such that $\mathcal{V}$ is an invariant subspace for $(A + BF)$. Such an $F$ is called a friend of $\mathcal{V}$.

The subspaces $0$, $\langle A \mid \mathcal{B} \rangle$, and $\mathcal{X}$ all are controlled invariant subspaces for (3), and typically there are many more. Motivation for considering such subspaces can be provided by again considering properties achievable by state feedback.
18.18 Example Suppose $\mathcal{V}$ is a controlled invariant subspace for (3), with $\mathcal{V} \subset \text{Ker}[C]$. Using a friend $F$ of $\mathcal{V}$ to define the linear state feedback

$$u(t) = Fx(t)$$

yields

$$\dot{x}(t) = (A + BF)x(t), \quad x(0) = x_o$$
$$y(t) = Cx(t)$$

This closed-loop state equation has the property that $x_o \in \mathcal{V}$ implies $y(t) = 0$ for all $t \ge 0$. Therefore the state feedback is such that $\mathcal{V}$ is contained in the unobservable subspace $\mathcal{N}$ for the closed-loop state equation.

□□□
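There is a fundamental characterization of controlled invariant subspaces that conveniently removes explicit involvement of $F$.

18.19 Theorem A subspace $\mathcal{V} \subset \mathcal{X}$ is a controlled invariant subspace for (3) if and only if

$$A\mathcal{V} \subset \mathcal{V} + \mathcal{B} \tag{31}$$

Proof If $\mathcal{V}$ is a controlled invariant subspace for (3), then there is a friend $F$ of $\mathcal{V}$ such that $(A + BF)\mathcal{V} \subset \mathcal{V}$. Thus

$$A\mathcal{V} = (A + BF - BF)\mathcal{V} \subset (A + BF)\mathcal{V} + BF\mathcal{V} \subset \mathcal{V} + \mathcal{B}$$

Now suppose $\mathcal{V} \subset \mathcal{X}$, and (31) holds. The following procedure constructs a friend of $\mathcal{V}$ to demonstrate that $\mathcal{V}$ is a controlled invariant subspace. With $\nu$ denoting the dimension of $\mathcal{V}$, let $n \times 1$ vectors $v_1, \ldots, v_n$ be a basis for $\mathcal{X}$ adapted to $\mathcal{V}$. By hypothesis there exist $n \times 1$ vectors $w_1, \ldots, w_\nu \in \mathcal{V}$ and $m \times 1$ vectors $u_1, \ldots, u_\nu \in \mathcal{U}$ such that

$$Av_k = w_k - Bu_k, \quad k = 1, \ldots, \nu$$

Now let $u_{\nu+1}, \ldots, u_n$ be arbitrary $m \times 1$ vectors, all zero if simplicity is desired, and let

$$F = \begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix} \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix}^{-1} \tag{32}$$

Then for $k = 1, \ldots, \nu$, with $e_k$ the $k$th column of $I_n$,

$$(A + BF)v_k = Av_k + BFv_k = Av_k + B \begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix} e_k = Av_k + Bu_k = w_k \in \mathcal{V}$$

Since any $v \in \mathcal{V}$ can be expressed as a linear combination of $v_1, \ldots, v_\nu$, we have that $\mathcal{V}$ is an invariant subspace for $(A + BF)$.

□□□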
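The condition (31) translates directly into a rank test when $\mathcal{V}$ is given by a matrix $V$ whose columns span it: $A\mathcal{V} \subset \mathcal{V} + \mathcal{B}$ holds exactly when appending the columns of $AV$ to $[V \;\; B]$ leaves the rank unchanged. The sketch below applies this test and also checks a candidate friend. The example matrices and the helper names (`rank`, `matmul`, `hstack`, `is_controlled_invariant`, `is_friend`) are assumptions for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Exact rank of a matrix given as a list of rows (Gaussian elimination)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                f = m[i][c] / m[rk][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def hstack(*Ms):
    """Place matrices side by side (they must share the row count)."""
    return [sum((M[i] for M in Ms), []) for i in range(len(Ms[0]))]

def is_controlled_invariant(A, B, V):
    """Test A V subset of V + Im[B], with the columns of V spanning the subspace."""
    return rank(hstack(V, B, matmul(A, V))) == rank(hstack(V, B))

def is_friend(A, B, F, V):
    """Test (A + BF) V subset of V."""
    BF = matmul(B, F)
    A_cl = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, BF)]
    return rank(hstack(V, matmul(A_cl, V))) == rank(V)

A = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]
B = [[0], [0], [1]]
V1 = [[1], [0], [0]]     # span{e1}: A e1 = e1 + 3 e3 lies in V1 + Im[B]
V2 = [[0], [1], [0]]     # span{e2}: A e2 has a component along e1

print(is_controlled_invariant(A, B, V1))        # condition (31) holds
print(is_controlled_invariant(A, B, V2))        # condition (31) fails
print(is_friend(A, B, [[-3, 0, 0]], V1))        # feedback cancels the e3 part
print(is_friend(A, B, [[0, 0, 0]], V1))         # V1 is not A-invariant itself
```

The gain $F = [-3 \;\; 0 \;\; 0]$ follows the construction in the proof: $Av_1 = w_1 - Bu_1$ with $w_1 = e_1$ and $u_1 = -3$, and the remaining columns of $[u_1 \cdots u_n]$ set to zero.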
If $\mathcal{V}$ is a controlled invariant subspace, then by definition there exists at least one friend of $\mathcal{V}$. More generally it is useful to characterize all friends of $\mathcal{V}$.

18.20 Theorem Suppose the $m \times n$ matrix $F^a$ is a friend of the controlled invariant subspace $\mathcal{V} \subset \mathcal{X}$. Then the $m \times n$ matrix $F^b$ is a friend of $\mathcal{V}$ if and only if

$$(F^a - F^b)\mathcal{V} \subset B^{-1}\mathcal{V} \tag{33}$$

Proof If $F^a$ and $F^b$ both are friends of $\mathcal{V}$, then for any $v \in \mathcal{V}$ there exist $v_a, v_b \in \mathcal{V}$ such that

$$(A + BF^a)v = v_a$$
$$(A + BF^b)v = v_b$$

Subtracting the second expression from the first gives

$$B(F^a - F^b)v = v_a - v_b$$

and since $v_a - v_b \in \mathcal{V}$ this calculation shows that (33) holds.

On the other hand if $F^a$ is a friend of $\mathcal{V}$ and (33) holds, then given any $v_a \in \mathcal{V}$ there is a $v_b \in \mathcal{V}$ such that

$$B(F^a - F^b)v_a = (BF^a - BF^b)v_a = v_b$$

Therefore

$$(A + BF^a)v_a - (A + BF^b)v_a = v_b$$

Since $F^a$ is a friend of $\mathcal{V}$ there exists a $v_c \in \mathcal{V}$ such that $(A + BF^a)v_a = v_c$. This gives

$$(A + BF^b)v_a = v_c - v_b \in \mathcal{V} \tag{34}$$

which shows that $F^b$ also is a friend of $\mathcal{V}$.

□□□
Notice that this proof is carried out in terms of arbitrary vectors in $\mathcal{V}$ rather than in terms of the subspace $\mathcal{V}$ as a whole. One reason is that $(F^a - F^b)\mathcal{V}$ does not obey seductive algebraic manipulations. Namely $(F^a - F^b)\mathcal{V}$ is not necessarily the same subspace as $F^a\mathcal{V} - F^b\mathcal{V}$, nor is it the same as $(F^a + F^b)\mathcal{V}$.

Controllability Subspaces
In examining capabilities of linear state feedback with regard to stability or eigenvalue assignment, it is a displeasing fact that some controlled invariant subspaces are too large. Of course $\langle A \mid \mathcal{B} \rangle$ is a controlled invariant subspace for (3), and eigenvalue assignability for the component of the closed-loop state equation corresponding to $\langle A \mid \mathcal{B} \rangle$ is guaranteed. But the whole state space $\mathcal{X}$ also is a controlled invariant subspace for (3), and if (3) is not controllable, then eigenvalue assignment for the closed-loop state equation on $\mathcal{X}$ is not possible. We address this issue by first defining a special type of controlled invariant subspace of $\mathcal{X}$ and then relating this subspace to the eigenvalue-assignment property.
18.21 Definition A subspace $\mathcal{R} \subset \mathcal{X}$ is called a controllability subspace for the linear state equation (3) if there exists an $m \times n$ matrix $F$ and an $m \times m$ matrix $G$ such that

$$\mathcal{R} = \langle A + BF \mid \mathrm{Im}[BG] \rangle \tag{35}$$

The differences in terminology are subtle: A controllability subspace for (3) is the controllable subspace for a corresponding closed-loop state equation

$$\dot{x}(t) = (A + BF)x(t) + BGv(t)$$

for some choice of $F$ and $G$. It should be clear that a controllability subspace for (3) is a controlled invariant subspace for (3). Also, since $\mathrm{Im}[BG] \subset \mathcal{B}$ for any choice of $G$,

$$\langle A + BF \mid \mathrm{Im}[BG] \rangle \subset \langle A + BF \mid \mathcal{B} \rangle = \langle A \mid \mathcal{B} \rangle$$

for any $G$. That is, every controllability subspace for (3) is a subspace of the controllable subspace for (3). In the single-input case the only controllability subspaces are 0 and the controllable subspace $\langle A \mid \mathcal{B} \rangle$, depending on whether the scalar $G$ is nonzero. However for multi-input state equations controllability subspaces are richer geometric concepts. As a simple example, in addition to the role of $F$, the gain $G$ is not necessarily invertible and can be used to isolate components of the input signal.

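18.22 Example For the linear state equation

$$\dot{x}(t) = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 0 \\ 0 & 4 & 5 \end{bmatrix} x(t) + \begin{bmatrix} 0 & 1 \\ 3 & 0 \\ 2 & 0 \end{bmatrix} u(t)$$

a quick calculation shows that the controllable subspace is $\mathcal{X} = R^3$. To show that

$$\mathrm{span}\{e_1\} = \mathrm{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right\}$$

is a controllability subspace, let

$$F = \begin{bmatrix} 0 & -2 & 0 \\ 0 & -2 & 0 \end{bmatrix}, \quad G = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$$

Then the closed-loop state equation is

$$\dot{x}(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 5 \end{bmatrix} x(t) + \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} v(t)$$

Since $\mathrm{Im}[BG] = \mathrm{span}\{e_1\}$ and $A + BF$ is diagonal, it is easy to verify that $\mathcal{R} = \mathrm{span}\{e_1\}$ satisfies (35).

□□□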
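The claims in Example 18.22 are easy to check numerically. The sketch below assumes the gains read $F = [0\ {-2}\ 0;\ 0\ {-2}\ 0]$ and $G = [0\ 0;\ 1\ 0]$, which are the values consistent with the diagonal closed-loop matrix and with $\mathrm{Im}[BG] = \mathrm{span}\{e_1\}$; the helper name `reach` is hypothetical.

```python
import numpy as np

A = np.array([[1., 2., 0.], [0., 3., 0.], [0., 4., 5.]])
B = np.array([[0., 1.], [3., 0.], [2., 0.]])
F = np.array([[0., -2., 0.], [0., -2., 0.]])   # assumed reading of the gain F
G = np.array([[0., 0.], [1., 0.]])             # assumed reading of the gain G

def reach(A, B):
    """Columns [B, AB, ..., A^(n-1) B]; their span is <A | Im[B]>."""
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

rank_open = np.linalg.matrix_rank(reach(A, B))   # dimension of <A | Im[B]>
Acl, Bcl = A + B @ F, B @ G                      # closed-loop coefficient matrices
R = reach(Acl, Bcl)                              # spans <A + BF | Im[BG]>
rank_R = np.linalg.matrix_rank(R)
```

With these values `rank_open` is 3, `Acl` is diagonal, and every column of `R` is a multiple of $e_1$, so the subspace in (35) is $\mathrm{span}\{e_1\}$.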

Often it is convenient for theoretical purposes to remove explicit involvement of


the matrix G in the definition of controllability subspaces. However this does leave an

implicit characterization that must be unraveled when explicitly computing state


feedback gains.

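18.23 Theorem A subspace $\mathcal{R} \subset \mathcal{X}$ is a controllability subspace for (3) if and only if there exists an $m \times n$ matrix $F$ such that

$$\mathcal{R} = \langle A + BF \mid \mathcal{B} \cap \mathcal{R} \rangle \tag{36}$$

Proof Suppose $F$ is such that (36) holds. Let the $n \times 1$ vectors $p_1, \ldots, p_q$, $q \leq m$, be a basis for $\mathcal{B} \cap \mathcal{R} \subset \mathcal{X}$. Then for some linearly independent set of $m \times 1$ vectors $u_1, \ldots, u_q \in \mathcal{U}$ we can write $p_1 = Bu_1, \ldots, p_q = Bu_q$. Next complete this set to a basis $u_1, \ldots, u_m$ for $\mathcal{U}$, and let

$$G = [\,u_1 \; \cdots \; u_q \;\; 0_{m \times (m-q)}\,][\,u_1 \; \cdots \; u_m\,]^{-1}$$

Then

$$BGu_k = \begin{cases} p_k, & k = 1, \ldots, q \\ 0, & k = q+1, \ldots, m \end{cases}$$

Therefore $\mathrm{Im}[BG] = \mathcal{B} \cap \mathcal{R}$, that is

$$\mathcal{R} = \langle A + BF \mid \mathrm{Im}[BG] \rangle \tag{37}$$

and $\mathcal{R}$ is a controllability subspace for (3).

Conversely if $\mathcal{R}$ is a controllability subspace for (3), then there exist matrices $F$ and $G$ such that (37) holds. From the basic definitions,

$$\mathrm{Im}[BG] \subset \mathcal{B}, \quad \mathrm{Im}[BG] \subset \mathcal{R}$$

and so $\mathrm{Im}[BG] \subset \mathcal{B} \cap \mathcal{R}$. Therefore $\mathcal{R} \subset \langle A + BF \mid \mathcal{B} \cap \mathcal{R} \rangle$. Also $\mathcal{R}$ is an invariant subspace for $(A + BF)$, so $(A + BF)(\mathcal{B} \cap \mathcal{R}) \subset \mathcal{R}$. Thus

$$\langle A + BF \mid \mathcal{B} \cap \mathcal{R} \rangle \subset \mathcal{R}$$

and we have established (36).

□□□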
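As mentioned earlier a controllability subspace $\mathcal{R}$ for (3) also is a controlled invariant subspace for (3), and thus must have friends. We next show that any such friend can be used to characterize $\mathcal{R}$ as a controllability subspace.

18.24 Theorem Suppose $\mathcal{R} \subset \mathcal{X}$ is a controllability subspace for (3). If $F$ is such that $(A + BF)\mathcal{R} \subset \mathcal{R}$, then

$$\mathcal{R} = \langle A + BF \mid \mathcal{B} \cap \mathcal{R} \rangle \tag{38}$$

Proof If $\mathcal{R}$ is a controllability subspace, then there exists an $m \times n$ matrix $F^a$ such that

$$\mathcal{R} = \langle A + BF^a \mid \mathcal{B} \cap \mathcal{R} \rangle$$

Now suppose $F^b$ is a friend of $\mathcal{R}$, that is, $(A + BF^b)\mathcal{R} \subset \mathcal{R}$. Let

$$\mathcal{R}_b = \langle A + BF^b \mid \mathcal{B} \cap \mathcal{R} \rangle$$

Clearly $\mathcal{R}_b \subset \mathcal{R}$, and we next show the reverse containment. To set up an induction argument, first note that $\mathcal{B} \cap \mathcal{R} \subset \mathcal{R}_b$. Assuming that for a positive integer $K$,

$$(A + BF^a)^K(\mathcal{B} \cap \mathcal{R}) \subset \mathcal{R}_b$$

we can write

$$\begin{aligned} (A + BF^a)^{K+1}(\mathcal{B} \cap \mathcal{R}) &= (A + BF^a)\left[(A + BF^a)^K(\mathcal{B} \cap \mathcal{R})\right] \\ &\subset (A + BF^a)\mathcal{R}_b \\ &= [A + BF^b + B(F^a - F^b)]\mathcal{R}_b \\ &\subset (A + BF^b)\mathcal{R}_b + [B(F^a - F^b)]\mathcal{R}_b \end{aligned} \tag{39}$$

By definition $(A + BF^b)\mathcal{R}_b \subset \mathcal{R}_b$. Also $[B(F^a - F^b)]\mathcal{R}_b \subset \mathcal{B}$, and since $\mathcal{R}_b \subset \mathcal{R}$,

$$[B(F^a - F^b)]\mathcal{R}_b \subset [B(F^a - F^b)]\mathcal{R}$$

By Theorem 18.20, $[B(F^a - F^b)]\mathcal{R} \subset \mathcal{R}$. Therefore

$$[B(F^a - F^b)]\mathcal{R}_b \subset \mathcal{B} \cap \mathcal{R} \subset \mathcal{R}_b$$

and the right side of (39) is contained in $\mathcal{R}_b$. This completes an induction proof for

$$(A + BF^a)^k(\mathcal{B} \cap \mathcal{R}) \subset \mathcal{R}_b, \quad k = 0, 1, \ldots$$

and thus $\mathcal{R} \subset \mathcal{R}_b$.

□□□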
The last two results provide a method for checking if a controlled invariant subspace $\mathcal{V}$ is a controllability subspace: Pick any friend $F$ of the controlled invariant subspace $\mathcal{V}$ and confront the condition

$$\mathcal{V} = \langle A + BF \mid \mathcal{B} \cap \mathcal{V} \rangle \tag{40}$$

If this holds, then $\mathcal{V}$ is a controllability subspace for (3) by Theorem 18.23. If the condition (40) fails, then Theorem 18.24 implies that $\mathcal{V}$ is not a controllability subspace.
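The test in (40) can be sketched numerically as follows. This is only an illustration under stated assumptions: subspaces are represented by matrices whose columns span them, the function names are hypothetical, a friend `F` of the subspace is supplied by the caller, and the data reuse Example 18.22 with the gain assumed there to read $F = [0\ {-2}\ 0;\ 0\ {-2}\ 0]$.

```python
import numpy as np

def orth(M, tol=1e-10):
    """Orthonormal basis for the span of the columns of M."""
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, s > tol]

def intersect(V, W, tol=1e-10):
    """Orthonormal basis for span(V) ∩ span(W); V and W have orthonormal columns."""
    I = np.eye(V.shape[0])
    # x lies in both subspaces exactly when both projection residuals vanish.
    M = np.vstack([I - V @ V.T, I - W @ W.T])
    _, s, Vt = np.linalg.svd(M)
    return Vt[int(np.sum(s > tol)):].T

def is_controllability_subspace(A, B, F, V):
    """Check (40): span(V) == <A + BF | Im[B] ∩ span(V)>, with F a friend of span(V)."""
    Vo = orth(V)
    S = intersect(orth(B), Vo)                 # basis for Im[B] ∩ span(V)
    if S.shape[1] == 0:
        return Vo.shape[1] == 0
    Acl = A + B @ F
    blocks = [S]
    for _ in range(A.shape[0] - 1):
        blocks.append(Acl @ blocks[-1])
    R = orth(np.hstack(blocks))
    # The generated subspace sits inside span(V) when F is a friend,
    # so equal dimensions imply equality.
    return R.shape[1] == Vo.shape[1]

A = np.array([[1., 2., 0.], [0., 3., 0.], [0., 4., 5.]])
B = np.array([[0., 1.], [3., 0.], [2., 0.]])
F = np.array([[0., -2., 0.], [0., -2., 0.]])   # A + BF = diag(1, -3, 5)
e1 = np.array([[1.], [0.], [0.]])
e3 = np.array([[0.], [0.], [1.]])
check_e1 = is_controllability_subspace(A, B, F, e1)
check_e3 = is_controllability_subspace(A, B, F, e3)
```

Since $A + BF$ is diagonal, $F$ is a friend of both $\mathrm{span}\{e_1\}$ and $\mathrm{span}\{e_3\}$. The first passes (40); the second is controlled invariant but meets $\mathcal{B}$ only in 0, so it fails.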

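18.25 Example Suppose $\mathcal{R}$ is a controllability subspace for (3), and suppose $F$ is any friend of $\mathcal{R}$. Then (38) holds, and we can choose a basis for $\mathcal{X}$ as follows. Select $G$ such that

$$\mathrm{Im}[BG] = \mathcal{B} \cap \mathcal{R}$$

Then let $p_1, \ldots, p_q$, $q \leq m$, be a basis for $\mathcal{B} \cap \mathcal{R}$. First extend to a basis $p_1, \ldots, p_\rho$, $\rho \leq n$, for $\mathcal{R}$, and further extend to a basis $p_1, \ldots, p_n$ for $\mathcal{X}$. The corresponding state variable change $z(t) = P^{-1}x(t)$ applied to the closed-loop state equation

$$\dot{x}(t) = (A + BF)x(t) + BGv(t)$$

gives

$$\begin{bmatrix} \dot{z}_r(t) \\ \dot{z}_{nr}(t) \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} z_r(t) \\ z_{nr}(t) \end{bmatrix} + \begin{bmatrix} B_{11} \\ 0 \end{bmatrix} v(t) \tag{42}$$

The $\rho \times m$ matrix $B_{11}$ has the further structure

$$B_{11} = \begin{bmatrix} \hat{B}_{11} \\ 0_{(\rho - q) \times m} \end{bmatrix}$$

with $\hat{B}_{11}$ of dimension $q \times m$.

□□□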
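Finally, returning to the original motivation, we show the relation of controllability subspaces to the eigenvalue assignment issue.

18.26 Theorem Suppose $\mathcal{R} \subset \mathcal{X}$ is a controllability subspace for (3) of dimension $\rho \geq 1$. Then given any degree-$\rho$, real-coefficient polynomial $p(\lambda)$ there exists a state feedback

$$u(t) = Fx(t) + Gv(t)$$

with $F$ a friend of $\mathcal{R}$ such that in a basis adapted to $\mathcal{R}$ the component of the closed-loop state equation corresponding to $\mathcal{R}$ has characteristic polynomial $p(\lambda)$.

Proof To construct a feedback with the desired property, first select $G$ such that

$$\mathrm{Im}[BG] = \mathcal{B} \cap \mathcal{R}$$

by following the construction in the proof of Theorem 18.23. The choice of $F$ is more complicated, and begins with selection of a friend $F^a$ of $\mathcal{R}$ so that

$$\mathcal{R} = \langle A + BF^a \mid \mathrm{Im}[BG] \rangle$$

Choosing a basis adapted to $\mathcal{R}$, the corresponding variable change $z(t) = P^{-1}x(t)$ is such that the state equation

$$\dot{x}(t) = (A + BF^a)x(t) + BGv(t)$$

can be rewritten in partitioned form as

$$\begin{bmatrix} \dot{z}_r(t) \\ \dot{z}_{nr}(t) \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} z_r(t) \\ z_{nr}(t) \end{bmatrix} + \begin{bmatrix} B_{11} \\ 0 \end{bmatrix} v(t)$$

The component of this state equation corresponding to $\mathcal{R}$, namely

$$\dot{z}_r(t) = A_{11}z_r(t) + A_{12}z_{nr}(t) + B_{11}v(t)$$

is controllable, and thus there is a matrix $F_{11}^{ab}$ such that

$$\det(\lambda I - A_{11} - B_{11}F_{11}^{ab}) = p(\lambda) \tag{43}$$

Now we verify that

$$F = F^a + G[\,F_{11}^{ab} \;\; 0\,]P^{-1} \tag{44}$$

is a friend of $\mathcal{R}$ that provides the desired characteristic polynomial for the component of the closed-loop state equation corresponding to $\mathcal{R}$. Note that $x \in \mathcal{R}$ if and only if $x$ has the form

$$x = P\begin{bmatrix} z_r \\ 0 \end{bmatrix}$$

Since $F^a$ is a friend of $\mathcal{R}$ and

$$F - F^a = G[\,F_{11}^{ab} \;\; 0\,]P^{-1}$$

we can write, for any $x \in \mathcal{R}$,

$$B(F - F^a)x = BG[\,F_{11}^{ab} \;\; 0\,]\begin{bmatrix} z_r \\ 0 \end{bmatrix} = BGF_{11}^{ab}z_r \in \mathrm{Im}[BG] \subset \mathcal{R}$$

Therefore $B(F - F^a)\mathcal{R} \subset \mathcal{R}$, that is,

$$(F - F^a)\mathcal{R} \subset B^{-1}\mathcal{R}$$

and $F$ is a friend of $\mathcal{R}$ by Theorem 18.20. To complete the proof compute

$$P^{-1}(A + BF)P = P^{-1}\left(A + BF^a + BG[\,F_{11}^{ab} \;\; 0\,]P^{-1}\right)P = \begin{bmatrix} A_{11} + B_{11}F_{11}^{ab} & A_{12} \\ 0 & A_{22} \end{bmatrix}$$

and from (43) the characteristic polynomial of the component corresponding to $\mathcal{R}$ is $p(\lambda)$.

□□□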
Our main application of this result is in addressing eigenvalue assignability while preserving invariance of a specified subspace for the closed-loop state equation. To motivate we offer the following refinement of the discussion below Definition 18.9. If (13) results from a state variable change adapted to a controllability subspace $\mathcal{R}$, then controllability of (13) implies controllability of both component state equations in (14). More generally suppose for an uncontrollable state equation that $\mathcal{V}$ is a controlled invariant subspace, and $\mathcal{R}$ is a controllability subspace contained in $\mathcal{V}$. Then eigenvalues can be assigned for the component of the closed-loop state equation corresponding to $\mathcal{R}$ using a friend of $\mathcal{V}$. This is treated in detail in Chapter 19.

Stabilizability and Detectability


Stability properties of a closed-loop state equation also are of fundamental importance,
and the geometric approach to this issue involves the stable and unstable subspaces of
the open-loop state equation, and a concept briefly introduced in Exercise 14.8.
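18.27 Definition The linear state equation (3) is called stabilizable if there exists a state feedback gain $F$ such that the closed-loop state equation

$$\dot{x}(t) = (A + BF)x(t) \tag{45}$$

is exponentially stable.

18.28 Theorem The linear state equation (3) is stabilizable if and only if

$$\mathcal{X}^+ \subset \langle A \mid \mathcal{B} \rangle \tag{46}$$

Proof Changing state variables using a basis adapted to $\langle A \mid \mathcal{B} \rangle$ yields

$$\begin{bmatrix} \dot{z}_c(t) \\ \dot{z}_{nc}(t) \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} z_c(t) \\ z_{nc}(t) \end{bmatrix} + \begin{bmatrix} B_{11} \\ 0 \end{bmatrix} u(t)$$

In terms of this basis, if $\mathcal{X}^+ \subset \langle A \mid \mathcal{B} \rangle$ then all eigenvalues of $A_{22}$ have negative real parts. Therefore (3) is stabilizable since the component state equation corresponding to $\langle A \mid \mathcal{B} \rangle$ is controllable.

On the other hand suppose that (3) is not stabilizable. Then $A_{22}$ has at least one eigenvalue with nonnegative real part, and thus $\mathcal{X}^+$ is not contained in $\langle A \mid \mathcal{B} \rangle$.

□□□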
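For numerical work the containment (46) is usually checked through an equivalent eigenvalue rank test rather than by computing $\mathcal{X}^+$ directly. The sketch below uses that swapped-in test, which is a standard equivalent of stabilizability and not the geometric condition itself: the rank of $[\lambda I - A \;\; B]$ must be $n$ at every eigenvalue $\lambda$ of $A$ with nonnegative real part. The function name and tolerance are assumptions.

```python
import numpy as np

def is_stabilizable(A, B, tol=1e-9):
    """Eigenvalue rank test: rank [lam*I - A, B] = n whenever Re(lam) >= 0."""
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if lam.real >= -tol:
            M = np.hstack([lam * np.eye(n) - A, B])
            if np.linalg.matrix_rank(M, tol=tol) < n:
                return False
    return True

# Hypothetical data: one unstable mode (eigenvalue 1), one stable mode (eigenvalue -2).
A = np.diag([1., -2.])
B_good = np.array([[1.], [0.]])   # input reaches the unstable mode
B_bad = np.array([[0.], [1.]])    # input reaches only the stable mode
ok = is_stabilizable(A, B_good)
not_ok = is_stabilizable(A, B_bad)
```

In the first case the uncontrollable part is stable, which matches the "stability of uncontrollable states" reading of the property; in the second the unstable eigenvector lies outside the controllable subspace.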
An alternate statement of Theorem 18.28 sometimes is more convenient.

18.29 Corollary The linear state equation (3) is stabilizable if and only if

$$\mathcal{X}^- + \langle A \mid \mathcal{B} \rangle = \mathcal{X} \tag{47}$$

Stabilizability obviously is a weaker property than controllability, though stabilizability has intuitive interpretations as 'controllability on the infinite interval $0 \leq t < \infty$,' or 'stability of uncontrollable states.' Further geometric treatment of issues involving stabilization is based on another special type of controlled invariant subspace called a stabilizability subspace. This is not pursued further, except to suggest references in Note 18.5.

There is a similar weakening of the concept of observability that is of interest. Motivation stems from the observer theory in Chapter 15, with eigenvalue assignment in the error state equation replaced by exponential stability of the error state equation.

18.30 Definition The linear state equation (3) is called detectable if there exists an $n \times p$ matrix $H$ such that

$$\dot{x}(t) = (A + HC)x(t)$$

is exponentially stable.

The issue here is one of 'stability of unobservable states.' Proof of the following detectability criterion is left as an exercise, though Exercise 15.9 supplies an underlying calculation.

18.31 Theorem The linear state equation (3) is detectable if and only if

$$\mathcal{X}^+ \cap \mathcal{N} = 0$$

As an illustration we can interpret these properties in terms of the coordinate choice underlying the canonical structure theorem. Consideration of the various subsystems gives that the state equation described by (22) is stabilizable if and only if $A_{33}$ and $A_{44}$ have negative-real-part eigenvalues, and detectable if and only if $A_{11}$ and $A_{33}$ have negative-real-part eigenvalues.

EXERCISES
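Exercise 18.1 Suppose $\mathcal{X}$ is a vector space, $\mathcal{V}, \mathcal{W} \subset \mathcal{X}$ are subspaces, and $A: \mathcal{X} \to \mathcal{X}$. Give proofs or counterexamples for the following claims.
(a) $\mathcal{V} \subset \mathcal{W}$ implies $A\mathcal{V} \subset A\mathcal{W}$
(b) $A^{-1}\mathcal{V} \subset \mathcal{W}$ implies $\mathcal{V} \subset A\mathcal{W}$
(c) $\mathcal{V} \subset \mathcal{W}$ implies $A^{-1}\mathcal{V} \subset A^{-1}\mathcal{W}$
(d) $\mathcal{V} \subset A\mathcal{W}$ implies $A^{-1}\mathcal{V} \subset \mathcal{W}$

Exercise 18.2 Suppose $\mathcal{X}$ is a vector space, $\mathcal{V}, \mathcal{W} \subset \mathcal{X}$ are subspaces, and $A: \mathcal{X} \to \mathcal{X}$. Show that
(b) $A^{-1}(A\mathcal{V}) = \mathcal{V} + \mathrm{Ker}[A]$
(c) $A\mathcal{V} \subset \mathcal{W}$ if and only if $\mathcal{V} \subset A^{-1}\mathcal{W}$

Exercise 18.3 If $\mathcal{V}, \mathcal{W} \subset \mathcal{X}$ are subspaces that are invariant for $A: \mathcal{X} \to \mathcal{X}$, give proofs or counterexamples to the following claims.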


'J4) is an invariant subspace for A
(a)

(b) A (d)

W) is an invariant subspace for A

+ W is an invariant subspace for A

(c)
'lb

'W is an invariant subspace for A

Exercise 18.4

If

W0,

Hint: Don't be tricked.

c Xare subspaces, show that


+

If

c '1/, show that

+ W,,)nV= Wa

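$\mathcal{W} \subset \mathcal{X}$ are subspaces. Show that there exists an $F$ such that

$$(A + BF)\mathcal{V} \subset \mathcal{V}, \quad (A + BF)\mathcal{W} \subset \mathcal{W}$$

if and only if

$$A\mathcal{V} \subset \mathcal{V} + \mathcal{B}, \quad A\mathcal{W} \subset \mathcal{W} + \mathcal{B}$$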

Exercise 18.6

If

prove that

<A
<A IC>
prove that there exists an ni x in matrix G such that

<AIlm[BGI> = <A IC>


Exercise 18.7 For the linear state equation in Example 18.15, describe the following subspaces in terms of the standard basis for $\mathcal{X} = R^4$:
(a) all controllability subspaces,
(b) examples of controlled invariant subspaces,
(c) examples of subspaces that are not controlled invariant subspaces.
Repeat (b) and (c) for stabilizability subspaces as defined in Note 18.5.
Exercise 18.8 Show that $\langle A \mid \mathcal{B} \rangle$ is precisely the set of states that can be reached from the zero initial state in finite time with a continuous input signal.

Exercise 18.9 Prove that the linear state equation

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t)$$

with rank $C = p$ is output controllable in the sense of Exercise 9.10 if and only if

$$C\langle A \mid \mathcal{B} \rangle = \mathcal{Y}$$

Exercise 18.10 Show that the closed-loop state equation

$$\dot{x}(t) = (A + BF)x(t)$$
$$y(t) = Cx(t)$$

is observable for all gain matrices $F$ if and only if the only controlled invariant subspace contained in $\mathrm{Ker}[C]$ for the open-loop state equation is 0.
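Exercise 18.11 Suppose $\mathcal{R}$ is a controllability subspace for

$$\dot{x}(t) = Ax(t) + Bu(t)$$

and, in terms of the columns of $B$,

$$\mathcal{B} \cap \mathcal{R} = \mathrm{Im}[B_1] + \cdots + \mathrm{Im}[B_q]$$

Suppose the columns of the $n \times n$ matrix $P$ form a basis for $\mathcal{X}$ that is adapted to the nested set of subspaces

$$\mathcal{B} \cap \mathcal{R} \subset \mathcal{R} \subset \langle A \mid \mathcal{B} \rangle \subset \mathcal{X}$$

Using the state variable change $z(t) = P^{-1}x(t)$, what structural features does the resulting state equation have? (Note that there is no state feedback involved in this question.)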
Exercise 18.12 Suppose cR' is a subspace and z (t) is a continuously differentiable, ii x 1
for all t 0. Show that
E
function of time that satisfies z (r) c
for all t 0.
Exercise 18.13 Consider a linear state equation

i(r) =Ax(t) + Bu(t)


A - 'fBfor all
suppose z(t) is a continuously-differentiable n x I function satisfying z(t) E
0. Show that there exists a continuous input signal such that with x (0) = z(0) the solution of
the state equation is x (t) = z (t) for t 0. Hint: Use Exercise 18.12.
and

NOTES
Note 18.1 Though often viewed by beginners as the system theory from another galaxy, the geometric approach arose on Earth in the late 1960's in independent work reported in the papers

G. Basile, G. Marro, "Controlled and conditioned invariant subspaces in linear system theory," Journal of Optimization Theory and Applications, Vol. 3, No. 5, pp. 306-315, 1969

W.M. Wonham, A.S. Morse, "Decoupling and pole assignment in linear multivariable systems: A geometric approach," SIAM Journal on Control and Optimization, Vol. 8, No. 1, pp. 1-18, 1970

In the latter paper controlled invariant subspaces are called (A, B)-invariant subspaces, a term that has fallen somewhat out of favor in recent years. In the first paper a dual notion is presented that recalls Definition 18.30: A subspace $\mathcal{V} \subset \mathcal{X}$ is called a conditioned invariant subspace for the usual linear state equation if there exists an $n \times p$ matrix $H$ such that

$$(A + HC)\mathcal{V} \subset \mathcal{V}$$

This construct provides the basis for a geometric development of state observers and other notions related to dynamic compensators. See also

W.M. Wonham, "Dynamic observers - geometric theory," IEEE Transactions on Automatic Control, Vol. 15, No. 2, pp. 258-259, 1970

Note 18.2 For further study of the geometric theory, consult

W.M. Wonham, Linear Multivariable Control: A Geometric Approach, Third Edition, Springer-Verlag, New York, 1985

G. Basile, G. Marro, Controlled and Conditioned Invariants in Linear System Theory, Prentice Hall, Englewood Cliffs, New Jersey, 1992

These books make use of algebraic concepts at a more advanced level than our introductory treatment. For example dual spaces, factor spaces, and lattices appear in further developments. More than this, the purist prefers to keep the proofs coordinate free, rather than adopt a particularly convenient basis as we have so often done. Satisfying this preference requires more sophisticated proof technique in many instances.

Note 18.3 From a Laplace-transform viewpoint, the various subspaces introduced in this chapter can be characterized in terms of rational solutions to polynomial equations. Thus the geometric theory makes contact with polynomial fraction descriptions. As a start, consult

M.L.J. Hautus, "(A, B)-invariant and stabilizability subspaces, a frequency domain description," Automatica, Vol. 16, pp. 703-707, 1980

Note 18.4 Eigenvalue assignment properties of nested collections of controlled invariant subspaces are discussed in

J.M. Schumacher, "A complement on pole placement," IEEE Transactions on Automatic Control, Vol. 25, No. 2, pp. 281-282, 1980

Eigenvalue assignment using friends of a specified controlled invariant subspace $\mathcal{V}$ will be an important issue in Chapter 19, and it might not be surprising that the largest controllability subspace contained in $\mathcal{V}$ plays a major role. Geometric interpretations of various concepts of system zeros, including transmission zeros discussed in Chapter 17, are presented in

H. Aling, J.M. Schumacher, "A nine-fold canonical decomposition for linear systems," International Journal of Control, Vol. 39, No. 4, pp. 779-805, 1984

This leads to a geometry-based refinement of the canonical structure theorem.

Note 18.5 A subspace $\mathcal{S} \subset \mathcal{X}$ is called a stabilizability subspace for (3) if $\mathcal{S}$ is a controlled invariant subspace for (3) and there is a friend $F$ of $\mathcal{S}$ such that the component of

$$\dot{x}(t) = (A + BF)x(t)$$

corresponding to $\mathcal{S}$ is exponentially stable. Characterizations of stabilizability subspaces and applications to control problems are discussed in the paper by Hautus cited in Note 18.3. In Lemma 3.2 of

J.M. Schumacher, "Regulator synthesis using (C, A, B)-pairs," IEEE Transactions on Automatic Control, Vol. 27, No. 6, pp. 1211-1221, 1982

a characterization of stabilizable subspaces, there called inner stabilizable subspaces, is given that is a geometric cousin of the rank condition in Exercise 14.8.
Note 18.6 An approximation notion related to invariant subspaces is introduced in the papers

J.C. Willems, "Almost invariant subspaces: An approach to high-gain feedback design - Part I: Almost controlled invariant subspaces," IEEE Transactions on Automatic Control, Vol. 26, No. 1, pp. 235-252, 1981; "Part II: Almost conditionally invariant subspaces," IEEE Transactions on Automatic Control, Vol. 27, No. 5, pp. 1071-1085, 1982

Loosely speaking, for an initial state in an almost controlled invariant subspace there are input signals such that the state trajectory remains as close as desired to that subspace. This so-called almost geometric theory can be applied to many of the same control problems as the basic geometric theory, including the problems addressed in Chapter 19. Consult

R. Marino, W. Respondek, A.J. Van der Schaft, "Direct approach to almost disturbance and almost input-output decoupling," International Journal of Control, Vol. 48, No. 1, pp. 353-383, 1986

Note 18.7 Extensions of geometric notions to time-varying linear state equations are available. See for example

A. Ilchmann, "Time-varying linear control systems: A geometric approach," IMA Journal of Mathematical Control and Information, Vol. 6, pp. 411-440, 1989

Note 18.8 For a discrete-time linear state equation

$$x(k+1) = Ax(k) + Bu(k)$$
$$y(k) = Cx(k)$$

the mathematical construction of the invariant subspaces $\langle A \mid \mathcal{B} \rangle$ and $\mathcal{N}$ is unchanged from the continuous-time case. However the interpretation of $\langle A \mid \mathcal{B} \rangle$ must be phrased in terms of reachability and reachable states, because of the peculiar nature of controllability in discrete time. Of course in defining the stable and unstable subspaces, $\mathcal{X}^-$ and $\mathcal{X}^+$, we assume all roots of $p^-(\lambda)$ have magnitude less than unity and all roots of $p^+(\lambda)$ have magnitude unity or greater.

These simple adjustments propagate through the treatment with nothing more than recurring terminological awkwardness. In discrete time should the controllable subspace be called the reachable subspace, and the controllability subspace the reachability subspace? The concerned are invited to relax rather than fret over such issues.

19
APPLICATIONS OF
GEOMETRIC THEORY

In this chapter we apply the geometric theory for a time-invariant linear state equation,
often called the plant or open-loop state equation in the context of feedback,

$$\begin{aligned} \dot{x}(t) &= Ax(t) + Bu(t) \\ y(t) &= Cx(t) \end{aligned} \tag{1}$$

to linear control problems involving rejection of unknown disturbance signals, and


isolation of specified entries of the vector output signal from specified input-signal
entries. In both problems the control objective can be phrased in terms of invariant
subspaces for the closed-loop state equation. Thus the geometric theory is a natural tool.

New features of the subspaces introduced in Chapter 18 are required by the


development. These include notions of maximal controlled-invariant and controllability
subspaces contained in a specified subspace, and methods for their calculation.

Disturbance Decoupling
A disturbance input can be added to (1) to obtain the linear state equation

$$\begin{aligned} \dot{x}(t) &= Ax(t) + Bu(t) + Ew(t) \\ y(t) &= Cx(t) \end{aligned} \tag{2}$$

We suppose $w(t)$ is a $q \times 1$ signal that is unknown, but continuous in keeping with the usual default, and $E$ is an $n \times q$ coefficient matrix that describes the way the disturbance enters the plant. All other dimensions, assumptions, and notations from Chapter 18 are preserved. Of course the various geometric constructs are unchanged by adding the disturbance input. That is, invariant subspaces for $A$ and controlled invariant subspaces with regard to the plant input $u(t)$ are the same for (2) as for (1).

The control objective is to choose time-invariant linear state feedback

$$u(t) = Fx(t) + Gv(t)$$

so that, regardless of the reference input $v(t)$ and initial state $x_0$, the output signal of the closed-loop state equation

$$\begin{aligned} \dot{x}(t) &= (A + BF)x(t) + BGv(t) + Ew(t), \quad x(0) = x_0 \\ y(t) &= Cx(t) \end{aligned} \tag{3}$$

is uninfluenced by $w(t)$. Of course the component of $y(t)$ due to $w(t)$ is independent of the initial state, so we assume $x_0 = 0$. Then, representing the solution of (3) in terms of Laplace transforms, a compact way of posing the problem is to require that $F$ be chosen so that the transfer function from disturbance signal to output signal is zero:

$$C(sI - A - BF)^{-1}E = 0 \tag{4}$$

When this condition is satisfied the closed-loop state equation is said to be disturbance decoupled. Note that no stability requirement is imposed on the closed-loop state equation, a deficiency addressed in the sequel.

The choice of reference-input gain $G$ plays no role in disturbance decoupling. Furthermore, using Exercise 5.13 to rewrite the matrix inverse in (4), it is clear that the objective is attained precisely when $F$ is such that

$$\langle A + BF \mid \mathrm{Im}[E] \rangle \subset \mathrm{Ker}[C]$$

In words, the disturbance decoupling problem is solvable if and only if there exists a state feedback gain $F$ such that the smallest $(A + BF)$-invariant subspace containing $\mathrm{Im}[E]$ is a subspace of $\mathrm{Ker}[C]$. This can be rephrased in terms of the plant as follows. The disturbance decoupling problem is solvable if and only if there exists a controlled invariant subspace $\mathcal{V} \subset \mathrm{Ker}[C]$ for (2) with the property that $\mathrm{Im}[E] \subset \mathcal{V}$. To turn this statement into a checkable necessary and sufficient condition for solvability of the disturbance decoupling problem, we proceed to develop a notion of the largest controlled invariant subspace for (1) that is contained in a specified subspace of $\mathcal{X}$, in this instance the subspace $\mathrm{Ker}[C]$.
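For a given gain $F$, the containment of the smallest $(A + BF)$-invariant subspace containing $\mathrm{Im}[E]$ in $\mathrm{Ker}[C]$ reduces to finitely many matrix products. The sketch below checks $C(A + BF)^kE = 0$ for $k = 0, \ldots, n-1$; the function name, the tolerance, and the three-state data are hypothetical and chosen only for illustration.

```python
import numpy as np

def is_disturbance_decoupled(A, B, C, E, F, tol=1e-9):
    """Check C (A+BF)^k E = 0 for k = 0..n-1, i.e. <A+BF | Im[E]> inside Ker[C]."""
    Acl = A + B @ F
    block = E
    for _ in range(A.shape[0]):
        if np.linalg.norm(C @ block) > tol:
            return False
        block = Acl @ block
    return True

# Hypothetical plant: chain of three integrators with output x1 + x2.
A = np.array([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])
B = np.array([[0.], [0.], [1.]])
C = np.array([[1., 1., 0.]])
E = np.array([[1.], [-1.], [1.]])        # disturbance enters along (1, -1, 1)
F_good = np.array([[0., 0., -1.]])       # makes (A + BF) E = -E
F_zero = np.zeros((1, 3))
decoupled = is_disturbance_decoupled(A, B, C, E, F_good)
open_loop = is_disturbance_decoupled(A, B, C, E, F_zero)
```

With `F_good` the span of $E$ is invariant for $A + BF$ and lies in $\mathrm{Ker}[C]$, so the disturbance never reaches the output; without feedback the term $CA^2E$ is nonzero.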
Suppose $\mathcal{K} \subset \mathcal{X}$ is a subspace. By definition a maximal controlled invariant subspace contained in $\mathcal{K}$ for (1) contains every other controlled invariant subspace contained in $\mathcal{K}$ for (1). The first task is to show existence of such a maximal controlled invariant subspace, denoted by $\mathcal{V}^*$. (The dependence on $\mathcal{K}$ is left understood.) Then the relevance of $\mathcal{V}^*$ to the disturbance decoupling problem is shown, and the computation of $\mathcal{V}^*$ is addressed.

19.1 Theorem Suppose $\mathcal{K} \subset \mathcal{X}$ is a subspace. Then there exists a unique maximal controlled invariant subspace $\mathcal{V}^*$ contained in $\mathcal{K}$ for (1).

Proof The key to the proof is to show that a sum of controlled invariant subspaces contained in $\mathcal{K}$ also is a controlled invariant subspace contained in $\mathcal{K}$. First note that there is at least one controlled invariant subspace contained in $\mathcal{K}$, namely the subspace 0, so our argument is not vacuous. If $\mathcal{V}_a$ and $\mathcal{V}_b$ are any two controlled invariant subspaces contained in $\mathcal{K}$, then

$$\mathcal{V}_a + \mathcal{V}_b \subset \mathcal{K}$$

Also

$$A\mathcal{V}_a \subset \mathcal{V}_a + \mathcal{B}, \quad A\mathcal{V}_b \subset \mathcal{V}_b + \mathcal{B}$$

and

$$A(\mathcal{V}_a + \mathcal{V}_b) = A\mathcal{V}_a + A\mathcal{V}_b \subset \mathcal{V}_a + \mathcal{V}_b + \mathcal{B}$$

That is, by Theorem 18.19, $\mathcal{V}_a + \mathcal{V}_b$ is a controlled invariant subspace contained in $\mathcal{K}$. Forming the sum of all controlled invariant subspaces contained in $\mathcal{K}$, and using the finite dimensionality of $\mathcal{K}$, a simple argument shows that there is a controlled invariant subspace contained in $\mathcal{K}$ of largest dimension, say $\mathcal{V}^*$. To show $\mathcal{V}^*$ is maximal, if $\mathcal{V} \subset \mathcal{K}$ is another controlled invariant subspace for (1), then so is $\mathcal{V} + \mathcal{V}^*$. But then

$$\dim \mathcal{V}^* \leq \dim(\mathcal{V} + \mathcal{V}^*) \leq \dim \mathcal{V}^*$$

and this inequality shows that $\mathcal{V} \subset \mathcal{V}^*$. Therefore $\mathcal{V}^*$ is a maximal controlled invariant subspace contained in $\mathcal{K}$. To show uniqueness simply argue that two maximal controlled invariant subspaces contained in $\mathcal{K}$ for (1) must contain each other, and thus they must be identical.

Returning to the disturbance decoupling problem, the basic solvability condition is straightforward to establish in terms of $\mathcal{V}^*$.

19.2 Theorem There exists a state feedback gain $F$ that solves the disturbance decoupling problem for the plant (2) if and only if

$$\mathrm{Im}[E] \subset \mathcal{V}^* \tag{5}$$

where $\mathcal{V}^*$ is the maximal controlled invariant subspace contained in $\mathrm{Ker}[C]$ for (2).

Proof If (5) holds, then choosing any friend $F$ of $\mathcal{V}^*$ we have, since $\mathcal{V}^*$ is an invariant subspace for $A + BF$,

$$\int_0^t e^{(A + BF)(t - \sigma)}Ew(\sigma)\, d\sigma \in \mathcal{V}^*, \quad t \geq 0$$

for any disturbance signal. Since $\mathcal{V}^* \subset \mathrm{Ker}[C]$,

$$C\int_0^t e^{(A + BF)(t - \sigma)}Ew(\sigma)\, d\sigma = 0, \quad t \geq 0$$

again for any disturbance signal, and taking the Laplace transform gives (4).

Conversely if (4) holds, then

$$Ce^{(A + BF)t}E = 0, \quad t \geq 0 \tag{6}$$

and therefore

$$CE = C(A + BF)E = \cdots = C(A + BF)^{n-1}E = 0$$

This implies that $\langle A + BF \mid \mathrm{Im}[E] \rangle$, an invariant subspace for $A + BF$, is contained in $\mathrm{Ker}[C]$. Since $\mathcal{V}^*$ is the maximal controlled invariant subspace contained in $\mathrm{Ker}[C]$, we have

$$\mathrm{Im}[E] \subset \langle A + BF \mid \mathrm{Im}[E] \rangle \subset \mathcal{V}^*$$

□□□
Application of the solvability condition in (5) requires computation of the maximal controlled invariant subspace $\mathcal{V}^*$ contained in a specified subspace $\mathcal{K}$. This is addressed in two steps: first a conceptual algorithm is established, and then, at the end of the chapter, a matrix algorithm that implements the conceptual algorithm is presented. Roughly speaking the conceptual algorithm generates a nested set of decreasing-dimension subspaces, beginning with $\mathcal{K}$, that yields $\mathcal{V}^*$ in a finite number of steps. Then the matrix algorithm provides a method for calculating bases for these subspaces.

Once the computation of $\mathcal{V}^*$ is settled, the first part of the proof of Theorem 19.2 shows that any friend of $\mathcal{V}^*$ specifies a state feedback that achieves disturbance decoupling. The construction of such a friend is easily lifted from the proof of Theorem 18.19. Let $v_1, \ldots, v_n$ be a basis for $\mathcal{X}$ adapted to $\mathcal{V}^*$, so that $v_1, \ldots, v_\nu$ is a basis for $\mathcal{V}^*$. Since $A\mathcal{V}^* \subset \mathcal{V}^* + \mathcal{B}$, for $k = 1, \ldots, \nu$ we can solve for $w_k \in \mathcal{V}^*$ and $u_k \in \mathcal{U}$, the input space, such that $Av_k = w_k - Bu_k$. Then with arbitrary $m \times 1$ vectors $u_{\nu+1}, \ldots, u_n$, set

$$F = [\,u_1 \; \cdots \; u_n\,][\,v_1 \; \cdots \; v_n\,]^{-1}$$

If $\mathcal{V}$ is any controlled invariant subspace with $\mathrm{Im}[E] \subset \mathcal{V} \subset \mathrm{Ker}[C]$, then the first part of the proof of Theorem 19.2 also shows that any friend of $\mathcal{V}$ achieves disturbance decoupling. Furthermore the construction of a friend of $\mathcal{V}$ proceeds as above.

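19.3 Theorem For a subspace $\mathcal{K} \subset \mathcal{X}$, define a sequence of subspaces of $\mathcal{K}$ by

$$\mathcal{V}^0 = \mathcal{K}$$
$$\mathcal{V}^k = \mathcal{K} \cap A^{-1}(\mathcal{V}^{k-1} + \mathcal{B}), \quad k = 1, 2, \ldots \tag{7}$$

Then $\mathcal{V}^n$ is the maximal controlled invariant subspace contained in $\mathcal{K}$ for (1), that is,

$$\mathcal{V}^n = \mathcal{V}^*$$

Proof First we show by induction that $\mathcal{V}^k \subset \mathcal{V}^{k-1}$, $k = 1, 2, \ldots$. Obviously $\mathcal{V}^1 \subset \mathcal{V}^0$. Supposing that $K \geq 2$ is such that $\mathcal{V}^{K-1} \subset \mathcal{V}^{K-2}$,

$$\mathcal{V}^K = \mathcal{K} \cap A^{-1}(\mathcal{V}^{K-1} + \mathcal{B}) \subset \mathcal{K} \cap A^{-1}(\mathcal{V}^{K-2} + \mathcal{B}) = \mathcal{V}^{K-1}$$

and the induction is complete.

It follows that $\dim \mathcal{V}^k \leq \dim \mathcal{V}^{k-1}$, $k = 1, 2, \ldots$. Furthermore if $\mathcal{V}^{k+1} = \mathcal{V}^k$ for some value of $k$, then $\mathcal{V}^{k+j} = \mathcal{V}^k$ for all $j = 1, 2, \ldots$. Therefore at each iteration the dimension of the generated subspace must decrease or the algorithm effectively terminates. Since $\dim \mathcal{V}^0 \leq n$, the dimension can decrease for at most $n$ iterations, and thus $\mathcal{V}^{n+j} = \mathcal{V}^n$ for $j = 1, 2, \ldots$. Now

$$\mathcal{V}^n = \mathcal{V}^{n+1} = \mathcal{K} \cap A^{-1}(\mathcal{V}^n + \mathcal{B})$$

and this implies $\mathcal{V}^n \subset A^{-1}(\mathcal{V}^n + \mathcal{B})$, that is, $A\mathcal{V}^n \subset \mathcal{V}^n + \mathcal{B}$, and $\mathcal{V}^n \subset \mathcal{K}$. Therefore $\mathcal{V}^n$ is a controlled invariant subspace contained in $\mathcal{K}$.

Finally, to show that $\mathcal{V}^n$ is maximal, suppose $\mathcal{V}$ is any controlled invariant subspace contained in $\mathcal{K}$. By definition $\mathcal{V} \subset \mathcal{V}^0$, and if we assume $\mathcal{V} \subset \mathcal{V}^K$, then an induction argument can be completed as follows. By Theorem 18.19,

$$A\mathcal{V} \subset \mathcal{V} + \mathcal{B} \subset \mathcal{V}^K + \mathcal{B}$$

that is,

$$\mathcal{V} \subset A^{-1}(\mathcal{V}^K + \mathcal{B})$$

Therefore

$$\mathcal{V} \subset \mathcal{K} \cap A^{-1}(\mathcal{V}^K + \mathcal{B}) = \mathcal{V}^{K+1}$$

This induction proves that $\mathcal{V} \subset \mathcal{V}^k$ for all $k = 0, 1, \ldots$, and thus $\mathcal{V} \subset \mathcal{V}^n$. Therefore $\mathcal{V}^n = \mathcal{V}^*$, the maximal controlled invariant subspace contained in $\mathcal{K}$.

□□□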
The algorithm in (7) can be sharpened in a couple of respects. It is obvious from the proof that $\mathcal{V}^*$ is obtained in at most $n$ steps; the $n$ is chosen here only for simplicity of notation. Also, because of the containment relationship of the iterates, the general step of the algorithm can be recast as

$$\mathcal{V}^k = \mathcal{V}^{k-1} \cap A^{-1}(\mathcal{V}^{k-1} + \mathcal{B}) \tag{10}$$
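The conceptual algorithm, in the recast form (10), can be sketched with orthonormal bases and singular value decompositions. This is an illustrative sketch only, not the matrix algorithm referred to in the text: the function names, the tolerance, and the example data are assumptions, and the numerical rank decisions are only as reliable as the tolerance.

```python
import numpy as np

def orth(M, tol=1e-10):
    """Orthonormal basis for the span of the columns of M."""
    if M.shape[1] == 0:
        return M
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, s > tol]

def null(M, tol=1e-10):
    """Orthonormal basis for Ker[M]."""
    _, s, Vt = np.linalg.svd(M)
    return Vt[int(np.sum(s > tol)):].T

def max_controlled_invariant(A, B, K):
    """Iterate (10) starting from V^0 = span(K); returns a basis for V*."""
    n = A.shape[0]
    I = np.eye(n)
    V = orth(K)
    for _ in range(n):
        S = orth(np.hstack([V, B]))            # V^{k-1} + Im[B]
        # x is in V^{k-1} and A x is in S exactly when both residuals vanish.
        V_next = null(np.vstack([I - V @ V.T, (I - S @ S.T) @ A]))
        if V_next.shape[1] == V.shape[1]:      # dimension stalled: V* reached
            break
        V = V_next
    return V

# Hypothetical plant: chain of three integrators with output x1 + x2.
A = np.array([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])
B = np.array([[0.], [0.], [1.]])
C = np.array([[1., 1., 0.]])
V_star = max_controlled_invariant(A, B, null(C))

# Solvability condition (5) for a disturbance entering along (1, -1, 1).
E = np.array([[1.], [-1.], [1.]])
solvable = np.linalg.norm(E - V_star @ (V_star.T @ E)) < 1e-9
```

For this data the iteration stops with a one-dimensional subspace spanned by $(1, -1, 1)$, and the disturbance direction chosen here lies in it, so condition (5) of Theorem 19.2 holds.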

19.4 Example For the linear state equation (2), suppose $\mathcal{V}^*$ is the maximal controlled invariant subspace contained in $\mathrm{Ker}[C]$, with the dimension of $\mathcal{V}^*$ denoted $\nu$, and $\mathrm{Im}[E] \subset \mathcal{V}^*$. Then for any friend $F^a$ of $\mathcal{V}^*$ consider the corresponding state feedback for (3):

$$u(t) = F^a x(t) + v(t)$$

The closed-loop state equation, after a state variable change $z(t) = P^{-1}x(t)$ where the columns of $P$ comprise a basis for $\mathcal{X}$ adapted to $\mathcal{V}^*$, can be written as

$$\begin{aligned} \begin{bmatrix} \dot{z}_a(t) \\ \dot{z}_b(t) \end{bmatrix} &= \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} z_a(t) \\ z_b(t) \end{bmatrix} + \begin{bmatrix} B_{11} \\ B_{21} \end{bmatrix} v(t) + \begin{bmatrix} E_{11} \\ 0_{(n-\nu) \times q} \end{bmatrix} w(t) \\ y(t) &= [\,0 \;\; C_{12}\,] \begin{bmatrix} z_a(t) \\ z_b(t) \end{bmatrix} \end{aligned} \tag{11}$$

From the form of the coefficient matrices, and especially from the diagram in Figure 19.5, it is clear that (11) is disturbance decoupled. And it is straightforward to verify (in terms of the state variable $z(t)$) that

$$F^a P + [\,0 \;\; F_{12}^{ab}\,]$$

is a friend of $\mathcal{V}^*$ for any $m \times (n - \nu)$ matrix $F_{12}^{ab}$. This suggests that there is flexibility to achieve goals for the closed-loop state equation in addition to disturbance decoupling. Moreover if $\mathcal{V} \subset \mathcal{V}^*$ is a smaller-dimension controlled invariant subspace also contained in $\mathrm{Ker}[C]$ with $\mathrm{Im}[E] \subset \mathcal{V}$, then this analysis can be repeated for $\mathcal{V}$. Greater flexibility is obtained since the size of $F_{12}^{ab}$ will be larger.

19.5 Figure Structure of the disturbance-decoupled state equation (11).

Disturbance Decoupling with Eigenvalue Assignment


Disturbance decoupling alone is a limited objective, and next we consider the problem of simultaneously achieving eigenvalue assignment for the closed-loop state equation. (The intermediate problem of disturbance decoupling with exponential stability is discussed in Note 19.1.) The proof of Theorem 19.2 shows that if $\mathcal{V}$ is a controlled invariant subspace such that $\mathrm{Im}[E] \subset \mathcal{V} \subset \mathrm{Ker}[C]$, then any friend of $\mathcal{V}$ can be used to achieve disturbance decoupling. Thus we need to consider eigenvalue assignment for the closed-loop state equation using friends of $\mathcal{V}$ as feedback gains. Not surprisingly, in view of Theorem 18.26, this involves certain controllability subspaces for the plant. A solvability condition can be given in terms of a maximal controllability subspace, and therefore we first consider the existence and conceptual computation of maximal controllability subspaces. Fortunately good use can be made of the computation for maximal controlled invariant subspaces. The star notation for maximality is continued for controllability subspaces.

Disturbance Decoupling with Eigenvalue Assignment

363

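19.6 Theorem Suppose $\mathcal{K} \subset X$ is a subspace, $\mathcal{V}^*$ is the maximal controlled invariant subspace contained in $\mathcal{K}$ for (1), and F is a friend of $\mathcal{V}^*$. Then

$$Q^* = \langle A + BF \mid \mathcal{B} \cap \mathcal{V}^* \rangle \tag{12}$$

is the unique maximal controllability subspace contained in $\mathcal{K}$ for (1).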

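Proof As in the proof of Theorem 18.23, compute an m x m matrix G such that

$$\mathrm{Im}\,[BG] = \mathcal{B} \cap \mathcal{V}^*$$

With F the assumed friend of $\mathcal{V}^*$, let

$$Q = \langle A + BF \mid \mathrm{Im}\,[BG] \rangle \tag{13}$$

Clearly Q is a controllability subspace, $Q \subset \mathcal{V}^* \subset \mathcal{K}$, and by definition F also is a friend of Q. We next show that if $F^b$ is any other friend of $\mathcal{V}^*$, then $F^b$ is a friend of Q. That is,

$$\langle A + BF^b \mid \mathcal{B} \cap \mathcal{V}^* \rangle = Q \tag{14}$$

Induction is used to show the left side is contained in the right side. Of course $\mathcal{B} \cap \mathcal{V}^* \subset Q$, and if $(A + BF^b)^K (\mathcal{B} \cap \mathcal{V}^*) \subset Q$, then

$$(A + BF^b)^{K+1} (\mathcal{B} \cap \mathcal{V}^*) = (A + BF^b)\left[ (A + BF^b)^K (\mathcal{B} \cap \mathcal{V}^*) \right] \subset (A + BF^b)\,Q \subset (A + BF)\,Q + B(F^b - F)\,Q \tag{15}$$

Since F is a friend of Q, $(A + BF)Q \subset Q$. To show $B(F^b - F)Q \subset Q$, note that Theorem 18.20 implies $B(F^b - F)\mathcal{V}^* \subset \mathcal{V}^*$ since both F and $F^b$ are friends of $\mathcal{V}^*$. Obviously $B(F^b - F)\mathcal{V}^* \subset \mathcal{B}$, so we have

$$B(F^b - F)\mathcal{V}^* \subset \mathcal{B} \cap \mathcal{V}^* \subset Q$$

Therefore, using $Q \subset \mathcal{V}^*$,

$$B(F^b - F)\,Q \subset Q$$

and (15) gives

$$(A + BF^b)^{K+1} (\mathcal{B} \cap \mathcal{V}^*) \subset Q$$

This completes the induction proof that

$$\langle A + BF^b \mid \mathcal{B} \cap \mathcal{V}^* \rangle \subset Q$$

The reverse inclusion is obtained by an exactly analogous induction argument. Thus (14) is verified, and any friend of $\mathcal{V}^*$ is a friend of Q. (In particular this guarantees that (12) is well defined; any friend F of $\mathcal{V}^*$ can be used.)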

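To show Q is maximal, suppose $Q_a$ is any other controllability subspace contained in $\mathcal{K}$ for (1). Then by Theorem 18.23 there exists an $F^a$ such that

$$Q_a = \langle A + BF^a \mid \mathcal{B} \cap Q_a \rangle$$

Furthermore since $Q_a$ also is a controlled invariant subspace contained in $\mathcal{K}$ for (1), $Q_a \subset \mathcal{V}^*$. To prove that $Q_a \subset Q$ involves finding a common friend of these two controllability subspaces, but by the first part of the proof we need only compute a common friend $F^c$ for $Q_a$ and $\mathcal{V}^*$.

Select a basis $p_1, \ldots, p_n$ for X such that $p_1, \ldots, p_\rho$ is a basis for $Q_a$ and $p_1, \ldots, p_\nu$ is a basis for $\mathcal{V}^*$. Then the property $A\mathcal{V}^* \subset \mathcal{V}^* + \mathcal{B}$ implies in particular that there exist $v_{\rho+1}, \ldots, v_\nu \in \mathcal{V}^*$ and $u_{\rho+1}, \ldots, u_\nu \in U$ such that

$$A p_j + B u_j = v_j , \quad j = \rho+1, \ldots, \nu$$

Choosing

$$F^c = \begin{bmatrix} F^a p_1 & \cdots & F^a p_\rho & u_{\rho+1} & \cdots & u_\nu & 0_{m \times (n-\nu)} \end{bmatrix} \begin{bmatrix} p_1 & p_2 & \cdots & p_n \end{bmatrix}^{-1}$$

it follows that

$$(A + BF^c)\,p_j = (A + BF^a)\,p_j \in Q_a , \quad j = 1, \ldots, \rho$$
$$(A + BF^c)\,p_j = v_j \in \mathcal{V}^* , \quad j = \rho+1, \ldots, \nu$$
$$F^c p_j = 0 , \quad j = \nu+1, \ldots, n$$

This shows $F^c$ is a friend of both $Q_a$ and $\mathcal{V}^*$. Since $F^c$ is a friend of $\mathcal{V}^*$, and hence of Q, from $Q_a \subset \mathcal{V}^*$ we have

$$Q_a = \langle A + BF^c \mid \mathcal{B} \cap Q_a \rangle \subset \langle A + BF^c \mid \mathcal{B} \cap \mathcal{V}^* \rangle = Q$$

Therefore Q in (13) is a maximal controllability subspace contained in $\mathcal{K}$ for (1). Finally uniqueness is obvious since any two such subspaces must contain each other.

□□□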
The conceptual computation of Q* suggested by Theorem 19.6 involves first computing $\mathcal{V}^*$. Then, as discussed in Chapter 18, a friend F of $\mathcal{V}^*$ can be computed, from which it is straightforward to compute $Q^* = \langle A + BF \mid \mathcal{B} \cap \mathcal{V}^* \rangle$. In addition the proof of Theorem 19.6 provides a theoretical result that deserves display.

19.7 Corollary With the subspaces $\mathcal{V}^*$ and Q* of X as in Theorem 19.6, if F is a friend of $\mathcal{V}^*$, then F is a friend of Q*.
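The conceptual computation described before Corollary 19.7 can be sketched numerically. The following is a minimal illustration, not a robust implementation: it assumes a basis matrix for the maximal controlled invariant subspace and a friend F are already available, and the helper names (`range_basis`, `null_space`, `intersect`, `reachable_subspace`, `max_controllability_subspace`) as well as the numerical data are hypothetical choices made only for this sketch. Subspaces are represented by matrices whose columns form a basis, and rank decisions use a fixed absolute tolerance.

```python
import numpy as np

def range_basis(M, tol=1e-10):
    """Orthonormal basis for Im[M], computed from the SVD."""
    if M.shape[1] == 0:
        return M
    u, s, _ = np.linalg.svd(M, full_matrices=False)
    return u[:, s > tol]

def null_space(M, tol=1e-10):
    """Orthonormal basis for Ker[M], computed from the SVD."""
    _, s, vt = np.linalg.svd(M)
    return vt[int(np.sum(s > tol)):].T

def intersect(U, W, tol=1e-10):
    """Basis for Im[U] intersected with Im[W], from solutions of U a = W b."""
    U, W = range_basis(U, tol), range_basis(W, tol)
    N = null_space(np.hstack([U, -W]), tol)
    return range_basis(U @ N[:U.shape[1], :], tol)

def reachable_subspace(Acl, S, tol=1e-10):
    """Basis for <Acl | Im[S]> = Im[S] + Acl Im[S] + ... + Acl^(n-1) Im[S]."""
    blocks, X = [], S
    for _ in range(Acl.shape[0]):
        blocks.append(X)
        X = Acl @ X
    return range_basis(np.hstack(blocks), tol)

def max_controllability_subspace(A, B, V_star, F):
    """<A + BF | Im[B] intersected with V*>; F is assumed to be a friend of V*."""
    return reachable_subspace(A + B @ F, intersect(B, V_star))

# Illustrative data (an assumption of this sketch).
A = np.array([[1., 0., 0.], [2., 3., 4.], [0., 0., 5.]])
B = np.array([[0., 1.], [0., 0.], [2., 0.]])
V_star = np.array([[1., 0.], [0., 1.], [0., 0.]])   # span{e1, e2}
F = np.zeros((2, 3))                                 # A maps span{e1, e2} into itself
Q_star = max_controllability_subspace(A, B, V_star, F)
print(Q_star.shape[1])   # dimension of the computed subspace
```

By Theorem 19.6 the result does not depend on which friend of the maximal controlled invariant subspace is supplied, so F = 0 is used here only because it is the simplest choice for this data.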

19.8 Example It is interesting to explore the structure that can be induced in a closed-loop state equation via these geometric constructions. Suppose that $\mathcal{V}$ is a controlled invariant subspace for the state equation (1) and Q* is the maximal controllability subspace contained in $\mathcal{V}$. Supposing that $F^a$ is a friend of $\mathcal{V}$, Corollary 19.7 gives that $F^a$ is a friend of Q* via the device of viewing $\mathcal{V}$ as the maximal controlled invariant subspace contained in $\mathcal{V}$ for (1). Furthermore suppose $q = \dim \mathcal{B} \cap Q^*$, and let $G = [\,G_1 \;\; G_2\,]$ be an invertible m x m matrix with m x q partition $G_1$ such that

$$\mathrm{Im}\,[BG_1] = \mathcal{B} \cap Q^*$$

Now for the closed-loop state equation

$$\dot{x}(t) = (A + BF^a)\,x(t) + BG\,v(t) \tag{17}$$

consider a change of state variables using a basis adapted to the nested set of subspaces $\mathcal{B} \cap Q^*$, Q*, and $\mathcal{V}$. Specifically let $p_1, \ldots, p_q$ be a basis for $\mathcal{B} \cap Q^*$, $p_1, \ldots, p_\rho$ be a basis for Q*, $p_1, \ldots, p_\nu$ be a basis for $\mathcal{V}$, and $p_1, \ldots, p_n$ be a basis for X, with $0 < q < \rho < \nu < n$ to avoid vacuity. Then with

$$z(t) = P^{-1} x(t) , \quad P = \begin{bmatrix} p_1 & \cdots & p_n \end{bmatrix}$$

the closed-loop state equation (17) can be written in the partitioned form

$$\dot{z}(t) = \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ 0 & A_{22} & A_{23} \\ 0 & 0 & A_{33} \end{bmatrix} z(t) + \begin{bmatrix} B_{11} & B_{12} \\ 0 & B_{22} \\ 0 & B_{32} \end{bmatrix} v(t) \tag{18}$$

Here $A_{11}$ is $\rho \times \rho$, $B_{11}$ is $\rho \times q$, $B_{12}$ is $\rho \times (m-q)$, $A_{22}$ is $(\nu-\rho) \times (\nu-\rho)$, $B_{22}$ is $(\nu-\rho) \times (m-q)$, $A_{33}$ is $(n-\nu) \times (n-\nu)$, and $B_{32}$ is $(n-\nu) \times (m-q)$.

Consider next the state feedback gain

$$F = F^a + F^b P^{-1}$$

where $F^b$ has the partitioned form

$$F^b = \begin{bmatrix} F_{11}^b & 0 & 0 \\ 0 & 0 & F_{23}^b \end{bmatrix}$$

The resulting closed-loop state equation

$$\dot{x}(t) = (A + BF)\,x(t) + BG\,v(t)$$

after the same state variable change is given by

$$\dot{z}(t) = \begin{bmatrix} A_{11} + B_{11}F_{11}^b & A_{12} & A_{13} + B_{12}F_{23}^b \\ 0 & A_{22} & A_{23} + B_{22}F_{23}^b \\ 0 & 0 & A_{33} + B_{32}F_{23}^b \end{bmatrix} z(t) + \begin{bmatrix} B_{11} & B_{12} \\ 0 & B_{22} \\ 0 & B_{32} \end{bmatrix} v(t)$$

In this set of coordinates it is apparent that F is a friend of $\mathcal{V}$ and a friend of Q*. The characteristic polynomial of the closed-loop state equation is

$$\det(\lambda I - A_{11} - B_{11}F_{11}^b)\; \det(\lambda I - A_{22})\; \det(\lambda I - A_{33} - B_{32}F_{23}^b)$$

and under a controllability hypothesis $F_{11}^b$ and $F_{23}^b$ can be chosen to obtain desired coefficients for the associated polynomial factors. However the characteristic polynomial of $A_{22}$ remains fixed. Of course we have used a special choice of $F^b$ to arrive at this conclusion. In particular the zero blocks in the bottom block row of $F^b$ preserve the block-upper-triangular structure of $P^{-1}(A + BF)P$, thus displaying the eigenvalues of A + BF. The zero blocks in the top row of $F^b$ are not critical; entries there do not affect eigenvalues. Using a more abstract analysis it can be shown that the characteristic polynomial of $A_{22}$ remains fixed for every friend F of $\mathcal{V}$.
With this friendly machinery established, we are ready to prove a basic solvability condition for the disturbance decoupling problem with eigenvalue assignment. The particular choice of basis in Example 19.8 provides the key to an elementary treatment, though in more detail than is needed. Moreover the conditions we present as sufficient conditions can be shown to be both necessary and sufficient. In the notation of Example 19.8, necessity requires a proof that the eigenvalues of $A_{22}$ in (18) are fixed for every friend of $\mathcal{V}$.

19.9 Lemma Suppose the plant (1) is controllable, $\mathcal{V}$ is a $\nu$-dimensional controlled invariant subspace, $\nu \geq 1$, and Q* is the maximal controllability subspace contained in $\mathcal{V}$. If $Q^* = \mathcal{V}$, then for any degree-$\nu$ polynomial $p_\nu(\lambda)$ and any degree-$(n-\nu)$ polynomial $p_{n-\nu}(\lambda)$ there exists a friend F of $\mathcal{V}$ such that

$$\det(\lambda I - A - BF) = p_\nu(\lambda)\, p_{n-\nu}(\lambda)$$

Proof Given $p_\nu(\lambda)$ and $p_{n-\nu}(\lambda)$, first select a friend $F^a$ of $\mathcal{V} = Q^*$ so that the state feedback

$$u(t) = F^a x(t) + v(t)$$

applied to (1) yields, by Theorem 18.26, the characteristic polynomial $p_\nu(\lambda)$ for the component of the closed-loop state equation corresponding to Q*. Applying a state variable change $z(t) = P^{-1}x(t)$, where the columns of P form a basis for X adapted to $\mathcal{V} = Q^*$, gives the closed-loop state equation in partitioned form,

$$\dot{z}(t) = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} z(t) + \begin{bmatrix} B_{11} \\ B_{21} \end{bmatrix} v(t) \tag{20}$$

where $\det(\lambda I - A_{11}) = p_\nu(\lambda)$. Now consider, in place of $F^a$, a feedback gain of the form

$$F = F^a + \begin{bmatrix} 0 & F_{12}^b \end{bmatrix} P^{-1}$$

This new feedback gain is easily shown to be a friend of $\mathcal{V} = Q^*$ that gives the closed-loop state equation, in terms of the state variable z(t),

$$\dot{z}(t) = \begin{bmatrix} A_{11} & A_{12} + B_{11}F_{12}^b \\ 0 & A_{22} + B_{21}F_{12}^b \end{bmatrix} z(t) + \begin{bmatrix} B_{11} \\ B_{21} \end{bmatrix} v(t)$$

The characteristic polynomial of this closed-loop state equation is

$$p_\nu(\lambda)\, \det(\lambda I - A_{22} - B_{21}F_{12}^b)$$

By hypothesis the plant is controllable, and therefore the second component state equation in (20) is controllable. Thus $F_{12}^b$ can be chosen to obtain the characteristic polynomial factor

$$\det(\lambda I - A_{22} - B_{21}F_{12}^b) = p_{n-\nu}(\lambda)$$

□□□

The reason for the factored characteristic polynomial in Lemma 19.9, and the next result, is subtle. But the issue should become apparent on considering an example where n = 2, $\nu$ = 1, and the specified characteristic polynomial is $\lambda^2 + 1$.
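One way to see the issue in the case n = 2, $\nu$ = 1 is numerical. In coordinates adapted to a one-dimensional invariant subspace, any real closed-loop matrix is upper triangular, so its characteristic polynomial is a product of two real degree-one factors, while $\lambda^2 + 1$ has no real roots. The sketch below is only an illustration; the entries `a`, `b`, `d` are arbitrary values chosen for this example.

```python
import numpy as np

# A real 2 x 2 matrix that leaves span{e1} invariant is upper triangular.
a, b, d = 0.3, -1.2, 2.0             # arbitrary illustrative entries
A_cl = np.array([[a, b], [0.0, d]])

char_poly = np.poly(A_cl)            # coefficients of (lambda - a)(lambda - d)
target_roots = np.roots([1.0, 0.0, 1.0])   # roots of lambda^2 + 1

print(char_poly)
print(target_roots)
```

Whatever real values are used for the triangular entries, the two eigenvalues are real, so the specified polynomial cannot be matched unless it is given in factored form with real factors of the appropriate degrees.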

19.10 Theorem Suppose the plant (2) is controllable, and Q* of dimension $\rho \geq 1$ is the maximal controllability subspace contained in Ker [C]. Given any degree-$\rho$ polynomial $p_\rho(\lambda)$ and any degree-$(n-\rho)$ polynomial $p_{n-\rho}(\lambda)$, there exists a state feedback gain F such that the closed-loop state equation (3) is disturbance decoupled and has characteristic polynomial $p_\rho(\lambda)\,p_{n-\rho}(\lambda)$ if

$$\mathrm{Im}\,[E] \subset Q^* \tag{21}$$

Proof Viewing $\mathcal{V} = Q^*$ as a controlled invariant subspace contained in Ker [C], since Im [E] $\subset \mathcal{V}$ the first part of the proof of Theorem 19.2 shows that for any state feedback gain F that is a friend of $\mathcal{V}$ the closed-loop state equation is disturbance decoupled. Then Lemma 19.9 gives that a friend of $\mathcal{V}$ can be selected such that the characteristic polynomial of the disturbance-decoupled, closed-loop state equation is $p_\rho(\lambda)\,p_{n-\rho}(\lambda)$.

Noninteracting Control
The noninteracting control problem is treated in Chapter 14 for time-varying linear state equations with p = m, and then specialized to the time-invariant case. Here we reformulate the time-invariant problem in a geometric setting and assume $p \geq m$ so that the objective in general involves scalar input components and blocks of output components. It is convenient to adjust notation by partitioning the output matrix C to write the plant in the form

$$\dot{x}(t) = Ax(t) + Bu(t)$$
$$y_j(t) = C_j x(t) , \quad j = 1, \ldots, m \tag{22}$$

where $C_j$ is a $p_j \times n$ matrix, and $p_1 + \cdots + p_m = p$. With $G_i$ denoting the $i^{th}$ column of the m x m matrix G, linear state feedback can be written as

$$u(t) = Fx(t) + \sum_{i=1}^{m} G_i v_i(t)$$

The resulting closed-loop state equation is

$$\dot{x}(t) = (A + BF)\,x(t) + \sum_{i=1}^{m} BG_i v_i(t)$$
$$y_j(t) = C_j x(t) , \quad j = 1, \ldots, m \tag{23}$$

a notation that focuses attention on the scalar components of the input signal and the $p_j \times 1$ vector partitions of the output signal.

The objectives for the closed-loop state equation involve only input-output behavior, and so zero initial state is assumed. The first objective is that for $i \neq j$ the $j^{th}$ output partition $y_j(t)$ should be uninfluenced by the $i^{th}$ input $v_i(t)$. In terms of the component closed-loop transfer functions,

$$\frac{Y_j(s)}{V_i(s)} = C_j (sI - A - BF)^{-1} BG_i , \quad i, j = 1, \ldots, m$$

the first objective is, simply, $C_j (sI - A - BF)^{-1} BG_i = 0$ for $i \neq j$. The second objective is that the closed-loop state equation be output controllable in the sense of Exercise 9.10. This imposes the requirement that the $j^{th}$ output block is influenced by the $j^{th}$ input. For example, from the solution of Exercise 9.11, if $p_1 = \cdots = p_m = 1$, then the output controllability requirement is that each scalar transfer function $C_j (sI - A - BF)^{-1} BG_j$ be a nonzero rational function of s.

It is straightforward to translate these requirements into geometric terms. For any F and G the controllable subspace of the closed-loop state equation corresponding to the $i^{th}$ input is $\langle A + BF \mid \mathrm{Im}\,[BG_i] \rangle$. Thus the first requirement can be satisfied if and only if there exist feedback gains F and G such that

$$\langle A + BF \mid \mathrm{Im}\,[BG_i] \rangle \subset \mathrm{Ker}\,[C_j] , \quad i \neq j$$

Stated another way, if and only if there exist F and G such that

$$\langle A + BF \mid \mathrm{Im}\,[BG_i] \rangle \subset \mathcal{K}_i , \quad i = 1, \ldots, m$$

where

$$\mathcal{K}_i = \bigcap_{j \neq i} \mathrm{Ker}\,[C_j] , \quad i = 1, \ldots, m \tag{24}$$

Also, by Exercise 18.9, the output controllability requirement can be written as

$$C_i \langle A + BF \mid \mathrm{Im}\,[BG_i] \rangle = \mathcal{Y}_i , \quad i = 1, \ldots, m$$

where $\mathcal{Y}_i = \mathrm{Im}\,[C_i]$.

These two objectives comprise the noninteracting control problem. We can combine the objectives and rephrase the problem in terms of controllability subspaces characterized as in Theorem 18.23, so that G is implicit. This focuses attention on geometric aspects: The noninteracting control problem is solvable if and only if there exist an m x n matrix F and controllability subspaces $Q_1, \ldots, Q_m$ such that

$$Q_i = \langle A + BF \mid \mathcal{B} \cap Q_i \rangle$$
$$Q_i \subset \mathcal{K}_i$$
$$C_i Q_i = \mathcal{Y}_i \tag{25}$$

for i = 1, ..., m. The key issue is existence of a single F that is a friend of all the controllability subspaces $Q_1, \ldots, Q_m$. Controllability subspaces that have a common friend are called compatible, and this terminology is applied also to controlled invariant subspaces that have friends in common.

Conditions for solvability of the noninteracting control problem can be presented either in terms of maximal controlled invariant subspaces or maximal controllability subspaces. Because an input gain G is involved, we use controllability subspaces for congeniality with basic definitions of the subspaces. To rule out trivially unsolvable problems, and thus obtain a compact condition that is necessary as well as sufficient, familiar assumptions are adopted. (See Exercise 19.12.) These assumptions have the added benefit of harmony with existence of a state feedback with invertible G that solves the noninteracting control problem, a desirable feature in typical situations.
19.11 Theorem Suppose the plant (22) is controllable with rank B = m and rank C = p. Then there exist feedback gains F and invertible G that solve the noninteracting control problem if and only if

$$\mathcal{B} = \mathcal{B} \cap Q_1^* + \cdots + \mathcal{B} \cap Q_m^* \tag{26}$$

where, for i = 1, ..., m, $Q_i^*$ is the maximal controllability subspace contained in $\mathcal{K}_i$ for (22).

Proof To show (26) is a necessary condition, suppose F and invertible G are such that the closed-loop state equation (23) satisfies the objectives of the noninteracting control problem. Then the controllability subspace

$$Q_i = \mathrm{Im}\,[BG_i] + (A + BF)\,\mathrm{Im}\,[BG_i] + \cdots + (A + BF)^{n-1}\,\mathrm{Im}\,[BG_i]$$

satisfies $Q_i \subset \mathcal{K}_i$ and, of course, $Q_i \subset Q_i^*$, i = 1, ..., m. Therefore $\mathrm{Im}\,[BG_i] \subset Q_i^*$, and since $\mathrm{Im}\,[BG_i] \subset \mathcal{B}$,

$$\mathrm{Im}\,[BG_i] \subset \mathcal{B} \cap Q_i^* , \quad i = 1, \ldots, m$$

Using the invertibility of G,

$$\mathcal{B} = \mathrm{Im}\,[BG_1] + \cdots + \mathrm{Im}\,[BG_m] \subset \mathcal{B} \cap Q_1^* + \cdots + \mathcal{B} \cap Q_m^* \tag{27}$$

Since the reverse inclusion is obvious, we have established (26).

It is a much more intricate task to prove that (26) is a sufficient condition for solvability of the noninteracting control problem. For convenience we divide the proof and state two lemmas. The first presents a refinement of (26), and the second proves compatibility of a certain set of controlled invariant subspaces as an intermediate step in proving compatibility of $Q_1^*, \ldots, Q_m^*$.

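19.12 Lemma Under the hypotheses of Theorem 19.11, if (26) holds, then

$$Q_1^* + \cdots + Q_m^* = X \tag{28}$$

$$\dim \mathcal{B} \cap Q_j^* = 1 , \quad j = 1, \ldots, m \tag{29}$$

$$\mathcal{B} = \mathcal{B} \cap Q_1^* \oplus \cdots \oplus \mathcal{B} \cap Q_m^* \tag{30}$$

Proof Since a sum of controlled invariant subspaces is a controlled invariant subspace,

$$\sum_{i=1}^{m} Q_i^*$$

is a controlled invariant subspace that, by (26), contains $\mathcal{B}$. But $\langle A \mid \mathcal{B} \rangle$ is the minimal controlled invariant subspace that contains $\mathcal{B}$, and the controllability hypothesis and Corollary 18.7 therefore give (28).

Next we show that $\mathcal{B} \cap Q_1^*$ has dimension one. Let

$$\gamma_1 = \dim \mathcal{B} \cap Q_1^* , \qquad \gamma_i = \dim \sum_{j=1}^{i} \mathcal{B} \cap Q_j^* \; - \; \dim \sum_{j=1}^{i-1} \mathcal{B} \cap Q_j^* , \quad i = 2, \ldots, m \tag{31}$$

These obviously are nonnegative integers, and the following contradiction argument proves that $\gamma_1, \ldots, \gamma_m \geq 1$. If $\gamma_i = 0$ for some value of i, then

$$\mathcal{B} \cap Q_i^* \subset \sum_{j \neq i} \mathcal{B} \cap Q_j^* \tag{32}$$

Setting

$$\hat{Q}_i = \sum_{j \neq i} Q_j^*$$

together with (26) gives that $\mathcal{B} \subset \hat{Q}_i$. Thus $\hat{Q}_i$ is a controlled invariant subspace that contains $\mathcal{B}$, and, summoning Corollary 18.7 again, $\hat{Q}_i = X$. By the definition of $Q_1^*, \ldots, Q_m^*$, $\hat{Q}_i \subset \mathrm{Ker}\,[C_i]$, which implies $\mathrm{Ker}\,[C_i] = X$, and this contradicts the assumption rank C = p.

Having established that $\gamma_1, \ldots, \gamma_m \geq 1$, we further observe, from (26) and (31), that

$$\gamma_1 + \cdots + \gamma_m = \dim \mathcal{B} = m$$

An immediate consequence is

$$\gamma_1 = \cdots = \gamma_m = 1$$

Of course this shows $\dim \mathcal{B} \cap Q_1^* = 1$. To establish (29) for any other value of j, simply reverse the roles of $\mathcal{B} \cap Q_1^*$ and $\mathcal{B} \cap Q_j^*$ in the definition of integers $\gamma_1, \ldots, \gamma_m$ and apply the same argument. Finally (30) holds as a consequence of (26), (29), and $\dim \mathcal{B} = m$.

□□□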

Under the hypotheses of Theorem 19.11, suppose (26) holds. Let


denote the maximal controlled invariant subspace contained in !7(,, i = 1
Then
the subspaces defined by
19.13 Lemma

DI

i=1,...,m

(33)

1=I

are

compatible controlled invariant subspaces.

Proof The calculation

In

j=l
in

+ fB)

proves

that 'Vi,..., %,

are

controlled invariant subspaces. Using (26), and the fact that

i=l,...,m
By (29) we can chooseii x 1 vectors B1

(34)

B,,, such that

Then, from (34),

i=1
and, calling on Theorem 18.19, there exist 1 x n matrices

F,,, such that

i=1

From this data a common friend F for


,... i',1 be a basis for X. Since Im [B,] c
,

can be constructed. Let


there exist m x 1 vectors u is.. . ,

such that

,fl

BilL =

k=I

ii

1=I

Let
= [u

(35)

I
.

. .

so that
BFi'L = B

ii,,

11?

k =

F?

j= I

can be written as a linear combination of i'1 ,...,v,,,

Since any vector in

UI

(A + BF)

+ BF, +

= (A

j=I
In

(A +

Q.*

+
1=1
In

j=

=q', 1=1

(36)

are compatible with common

Therefore the controlled invariant subspaces

friend F given by (35).

DOD
Returning to the sufficiency proof for Theorem 19.11, we now show that (26)

implies existence of F and invertible G such that


(25). The major effort involves proving that

satisfy the
Q,,,* are

conditions in
compatible. To this end we use Lemma 19.13 and show that F in (35) satisfies

(A +
Then

i=

in

In
it follows from Corollary 19.7 that F is a common friend of
implies compatibility of

other words we show that compatibility of


Let
III

i=1

ni

j*i
Since each

is

an invariant subspace for (A + BF), it is easy to show that

(37)


I=1
rn, a
also are invariant subspaces for (A + BF). We next prove that
=
step that brings us close to the end.
Then, from the definition
From the definition of
in (33),
c for all i
of
c
i=I
m. To show the reverse containment, matters are
in (37),
written out in detail. From (33) and (37)
In

,fl

1=1

Jl A=1
Since
In

174*

n Ker[C,] , k = 1,..., m

it follows that
In

In

jI

In

Ker [C,]

(38)

1=1

Noting that Ker [C1] is common to each intersection in the sum of intersections
In

In

Ker[C,]
kI

k;j
we can apply the first part of Exercise 18.4 (after easy generalization to sums of more
than two intersections) to obtain
ni

Ifl

III

pn

n Ker [C,]

n Ker [C,] c Ker [Gd] n

This gives, from (38),


In

In

n (Ker
j1

c
Therefore

I=1

Ker [C,])
k=I

i= I

rn

in, by maximality of each

rn.

(39)

and this implies

i',...,
With the argument above we have compatibility of $\mathcal{V}_1^*, \ldots, \mathcal{V}_m^*$, and hence compatibility of $Q_1^*, \ldots, Q_m^*$. Lemma 19.13 provides a construction for a common friend F, and it remains only to determine the invertible gain G. From (29) we can compute m x 1 vectors $G_1, \ldots, G_m$ such that

$$\mathrm{Im}\,[BG_i] = \mathcal{B} \cap Q_i^* , \quad i = 1, \ldots, m \tag{40}$$

Then

$$Q_i^* = \langle A + BF \mid \mathrm{Im}\,[BG_i] \rangle , \quad i = 1, \ldots, m$$

and it is immediate from (30) that G is invertible.

We conclude the proof that $Q_1^*, \ldots, Q_m^*$ satisfy the geometric conditions in (25) by demonstrating output controllability for the closed-loop state equation. Using (28) and the inclusion $\sum_{j \neq i} Q_j^* \subset \mathrm{Ker}\,[C_i]$ noted in the proof of Lemma 19.12 yields

$$Q_i^* + \mathrm{Ker}\,[C_i] = X , \quad i = 1, \ldots, m$$

But then

$$C_i Q_i^* = C_i \left( Q_i^* + \mathrm{Ker}\,[C_i] \right) = C_i X = \mathcal{Y}_i , \quad i = 1, \ldots, m$$

and the proof is complete.

□□□

After a blizzard of subspaces, and before a matrix-computation procedure for $\mathcal{V}^*$ and hence Q*, it might be helpful to work a simple problem freestyle from the basic theory.

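19.14 Example Consider $X = R^3$ with the standard basis $e_1, e_2, e_3$, and a linear plant specified by

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 3 & 4 \\ 0 & 0 & 5 \end{bmatrix} , \quad B = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ 2 & 0 \end{bmatrix} , \quad C = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{41}$$

The assumptions of Theorem 19.11 are satisfied, and the main task in ascertaining solvability of the noninteracting control problem is to compute $Q_1^*$ and $Q_2^*$, the maximal controllability subspaces contained in Ker [$C_2$] and Ker [$C_1$], respectively. Retracing the approach described immediately above Corollary 19.7, we first compute $\mathcal{V}_1^*$ and $\mathcal{V}_2^*$, the maximal controlled invariant subspaces contained in Ker [$C_2$] and Ker [$C_1$], respectively. Since Ker [$C_1$] is spanned by $e_2, e_3$, and Ker [$C_2$] is spanned by $e_1, e_2$, written

$$\mathrm{Ker}\,[C_1] = \mathrm{span}\,\{e_2, e_3\} , \quad \mathrm{Ker}\,[C_2] = \mathrm{span}\,\{e_1, e_2\}$$

the algorithm in Theorem 19.3 gives

$$\mathcal{V}^0 = \mathrm{span}\,\{e_1, e_2\}$$
$$\mathcal{V}^1 = \mathrm{span}\,\{e_1, e_2\} \cap A^{-1}\left( \mathrm{span}\,\{e_1, e_3\} + \mathrm{span}\,\{e_1, e_2\} \right) = \mathrm{span}\,\{e_1, e_2\}$$

Thus

$$\mathcal{V}_1^* = \mathrm{span}\,\{e_1, e_2\}$$

Friends of $\mathcal{V}_1^*$ can be characterized via the condition $(A + BF)\mathcal{V}_1^* \subset \mathcal{V}_1^*$. That is, writing

$$F = \begin{bmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \end{bmatrix}$$

we consider

$$\begin{bmatrix} 1 + f_{21} & f_{22} & f_{23} \\ 2 & 3 & 4 \\ 2f_{11} & 2f_{12} & 5 + 2f_{13} \end{bmatrix} \mathrm{span}\,\{e_1, e_2\} \subset \mathrm{span}\,\{e_1, e_2\} \tag{42}$$

This gives that F is a friend of $\mathcal{V}_1^*$ if and only if $f_{11} = f_{12} = 0$. The simplest friend of $\mathcal{V}_1^*$ is F = 0, and since $\mathcal{B} \cap \mathcal{V}_1^* = \mathrm{span}\,\{e_1\}$,

$$Q_1^* = \mathrm{span}\,\{e_1\} + A\,\mathrm{span}\,\{e_1\} + A^2\,\mathrm{span}\,\{e_1\} = \mathrm{span}\,\{e_1, e_2\} = \mathcal{V}_1^*$$

A similar calculation gives that

$$Q_2^* = \mathcal{V}_2^* = \mathrm{span}\,\{e_2, e_3\}$$

and F is a friend of $\mathcal{V}_2^*$ if and only if $f_{22} = f_{23} = 0$. Applying the solvability condition (26),

$$\mathcal{B} \cap Q_1^* + \mathcal{B} \cap Q_2^* = \mathrm{span}\,\{e_1\} + \mathrm{span}\,\{e_3\} = \mathcal{B}$$

and noninteracting control is feasible. Using (40) immediately gives the reference-input gain

$$G = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \tag{43}$$

A gain F provides noninteracting control if and only if it is a common friend of $Q_1^*$ and $Q_2^*$. Therefore the class of state-feedback gains for noninteracting control is described by

$$F = \begin{bmatrix} 0 & 0 & f_{13} \\ f_{21} & 0 & 0 \end{bmatrix} \tag{44}$$

where $f_{13}$ and $f_{21}$ are arbitrary.

A straightforward calculation shows that A + BF has a fixed eigenvalue at 3 for any F of the form (44). Thus noninteracting control and exponential stability cannot be achieved simultaneously by static state feedback in this example.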
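The conclusions of Example 19.14 can be checked numerically. The sketch below assumes the output rows $C_1 = [\,1 \;\; 0 \;\; 0\,]$ and $C_2 = [\,0 \;\; 0 \;\; 1\,]$, a choice consistent with $Q_2^* = \mathrm{span}\,\{e_2, e_3\}$ and $Q_1^* = \mathrm{span}\,\{e_1, e_2\}$, together with the gain G that swaps the two inputs. The function names and the particular values of $f_{13}$ and $f_{21}$ are illustrative only. Since n = 3, vanishing of the first three Markov parameters $C_j (A + BF)^k BG_i$ implies that the corresponding transfer function is zero.

```python
import numpy as np

A = np.array([[1., 0., 0.], [2., 3., 4.], [0., 0., 5.]])
B = np.array([[0., 1.], [0., 0.], [2., 0.]])
C1 = np.array([[1., 0., 0.]])    # assumed row, Ker[C1] = span{e2, e3}
C2 = np.array([[0., 0., 1.]])    # assumed row, Ker[C2] = span{e1, e2}
G = np.array([[0., 1.], [1., 0.]])

def closed_loop(f13, f21):
    """A + BF for a gain F with free entries f13 and f21 and zeros elsewhere."""
    F = np.array([[0., 0., f13], [f21, 0., 0.]])
    return A + B @ F

def markov(Cj, Acl, Gi, count=3):
    """Markov parameters Cj Acl^k B Gi for k = 0, ..., count - 1."""
    return [(Cj @ np.linalg.matrix_power(Acl, k) @ B @ Gi).item()
            for k in range(count)]

Acl = closed_loop(f13=-4.0, f21=-3.0)
eigs = np.linalg.eigvals(Acl)
cross_21 = markov(C2, Acl, G[:, [0]])   # input 1 to output 2
cross_12 = markov(C1, Acl, G[:, [1]])   # input 2 to output 1
direct_11 = markov(C1, Acl, G[:, [0]])  # input 1 to output 1
direct_22 = markov(C2, Acl, G[:, [1]])  # input 2 to output 2
print(eigs, cross_21, cross_12, direct_11, direct_22)
```

Changing the two free gain entries moves two of the closed-loop eigenvalues, while the third stays at 3, in agreement with the closing remark of the example.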


Maximal Controlled Invariant Subspace Computation


There are two main steps needed to translate the conceptual algorithm for $\mathcal{V}^*$ in Theorem 19.3 into a numerical algorithm. First is the computation of a basis for the intersection of two subspaces from the subspace bases. Second, and less easy, we need a method to compute a basis for the inverse image of a subspace under a linear map. But a preliminary result converts this second step into two simpler computations. The proof uses the basic linear-algebra fact that if H is an n x q matrix,

$$R^n = \mathrm{Im}\,[H] \oplus \mathrm{Ker}\,[H^T] \tag{45}$$

19.15 Lemma Suppose A is an n x n matrix and H is an n x q matrix. If L is a maximal rank n x l matrix such that $L^T H = 0$, then $A^{-1}\,\mathrm{Im}\,[H] = \mathrm{Ker}\,[L^T A]$.

Proof If $x \in A^{-1}\,\mathrm{Im}\,[H]$, then there exists a vector $y \in \mathrm{Im}\,[H]$ such that Ax = y. Since y can be written as a linear combination of the columns of H, the definition of L gives

$$0 = L^T y = L^T A x$$

That is, $x \in \mathrm{Ker}\,[L^T A]$.

On the other hand suppose $x \in \mathrm{Ker}\,[L^T A]$. Letting y = Ax again, by (45) there exist unique n x 1 vectors $y_a \in \mathrm{Im}\,[H]$ and $y_b \in \mathrm{Ker}\,[H^T]$ such that $y = y_a + y_b$. Then

$$0 = L^T y = L^T y_a + L^T y_b = L^T y_b$$

Furthermore $H^T y_b = 0$ gives $y_b^T H = 0$, and it follows from the maximal rank property of L that $y_b^T$ must be a linear combination of the rows of $L^T$. If the coefficients in this linear combination are $\alpha_1, \ldots, \alpha_l$, then

$$y_b^T y_b = \begin{bmatrix} \alpha_1 & \cdots & \alpha_l \end{bmatrix} L^T y_b = 0 \tag{46}$$

Thus $y_b = 0$ and we have shown that $y = y_a \in \mathrm{Im}\,[H]$. Therefore $x \in A^{-1}\,\mathrm{Im}\,[H]$.

□□□
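Lemma 19.15 is easy to check numerically. In the following minimal sketch the maximal-rank matrix L is taken to be an orthonormal basis for $\mathrm{Ker}\,[H^T]$ obtained from a singular value decomposition (one convenient choice among many), and the matrices A and H are illustrative data, not taken from the text. The helper name `null_space` is hypothetical.

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Orthonormal basis for Ker[M], computed from the SVD."""
    _, s, vt = np.linalg.svd(M)
    return vt[int(np.sum(s > tol)):].T

A = np.array([[1., 0., 0.], [2., 3., 4.], [0., 0., 5.]])   # illustrative data
H = np.array([[1., 0.], [0., 0.], [0., 1.]])               # Im[H] = span{e1, e3}

L = null_space(H.T)        # maximal-rank L with L^T H = 0
X = null_space(L.T @ A)    # columns span Ker[L^T A]

# Every column x of X should satisfy A x in Im[H]: appending A X to H
# must not increase the rank.
rank_H = np.linalg.matrix_rank(H)
in_image = np.linalg.matrix_rank(np.hstack([H, A @ X])) == rank_H
print(X.shape[1], in_image)
```

Because this particular A is invertible, the inverse image of the two-dimensional subspace Im [H] is itself two-dimensional, which is what the column count of X reports.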

Given A, B, and a subspace $\mathcal{K} \subset X$, the following sequence of matrix computations delivers a basis for the maximal controlled invariant subspace $\mathcal{V}^* \subset \mathcal{K}$. We assume that $\mathcal{K}$ is specified as the image of an n-row, full-column-rank matrix $V_0$; in other words, the columns of $V_0$ form a basis for $\mathcal{K}$. Each step of the matrix algorithm implements a portion of the conceptual algorithm in Theorem 19.3, as indicated by parenthetical comments.

19.16 Algorithm

(i) With $\mathrm{Im}\,[V_0] = \mathcal{K} = \mathcal{V}^0$, compute a maximal-rank matrix $L_0$ such that $L_0^T V_0 = 0$. (By Lemma 19.15 with A = I, this gives $\mathcal{V}^0 = \mathrm{Ker}\,[L_0^T]$.)

(ii) Construct a matrix $\hat{V}_0$ by deleting linearly dependent columns from the partitioned matrix $[\,B \;\; V_0\,]$. (Then $\mathrm{Im}\,[\hat{V}_0] = \mathcal{B} + \mathcal{V}^0$.)

(iii) Compute a maximal-rank matrix $L_1$ such that $L_1^T \hat{V}_0 = 0$. (Then, by Lemma 19.15, $\mathrm{Ker}\,[L_1^T A] = A^{-1}(\mathcal{B} + \mathcal{V}^0)$.)

(iv) Compute a maximal-rank matrix $V_1$ such that

$$\begin{bmatrix} L_0^T \\ L_1^T A \end{bmatrix} V_1 = 0 \tag{47}$$

(Thus $\mathrm{Im}\,[V_1] = \mathcal{K} \cap A^{-1}(\mathcal{B} + \mathcal{V}^0) = \mathcal{V}^1$.)

(v) Continue by iterating the previous three steps.

□□□

Specifically the algorithm continues by deleting linearly dependent columns from $[\,B \;\; V_1\,]$ to form $\hat{V}_1$, computing a maximal-rank $L_2$ such that $L_2^T \hat{V}_1 = 0$, and then computing a maximal-rank $V_2$ such that

$$\begin{bmatrix} L_0^T \\ L_2^T A \end{bmatrix} V_2 = 0 \tag{48}$$

Then $\mathcal{V}^2 = \mathrm{Im}\,[V_2]$. Repeating this until the first step k where rank $V_{k+1}$ = rank $V_k$, and $k \leq n$ is guaranteed, gives $\mathcal{V}^* = \mathrm{Im}\,[V_k]$.
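A minimal numerical sketch of Algorithm 19.16 follows. It makes two substitutions that should be read as assumptions of the sketch rather than as part of the algorithm: the maximal-rank matrices are computed as orthonormal null-space bases from singular value decompositions, and instead of deleting linearly dependent columns from $[\,B \;\; V_k\,]$ an orthonormal basis for its image is used, which spans the same subspace. The function names and the fixed absolute tolerance are hypothetical choices, and $V_0$ is assumed to have at least one column.

```python
import numpy as np

def range_basis(M, tol=1e-10):
    """Orthonormal basis for Im[M] (stands in for deleting dependent columns)."""
    if M.shape[1] == 0:
        return M
    u, s, _ = np.linalg.svd(M, full_matrices=False)
    return u[:, s > tol]

def null_space(M, tol=1e-10):
    """Orthonormal basis for Ker[M]; an empty constraint set gives all of R^n."""
    if M.shape[0] == 0:
        return np.eye(M.shape[1])
    _, s, vt = np.linalg.svd(M)
    return vt[int(np.sum(s > tol)):].T

def max_controlled_invariant(A, B, V0, tol=1e-10):
    """Basis for the maximal controlled invariant subspace contained in Im[V0]."""
    n = A.shape[0]
    L0 = null_space(V0.T, tol)              # step (i): Ker[L0^T] = Im[V0]
    Vk = range_basis(V0, tol)
    for _ in range(n):
        V_hat = range_basis(np.hstack([B, Vk]), tol)   # step (ii): Im[B] + Im[Vk]
        L = null_space(V_hat.T, tol)                    # step (iii): L^T V_hat = 0
        V_next = null_space(np.vstack([L0.T, L.T @ A]), tol)   # step (iv)
        if V_next.shape[1] == Vk.shape[1]:  # dimension stopped decreasing
            return V_next
        Vk = V_next
    return Vk

# Data of Example 19.14, with K = span{e1, e2}.
A = np.array([[1., 0., 0.], [2., 3., 4.], [0., 0., 5.]])
B = np.array([[0., 1.], [0., 0.], [2., 0.]])
V0 = np.array([[1., 0.], [0., 1.], [0., 0.]])
V_star = max_controlled_invariant(A, B, V0)
print(V_star.shape[1])   # dimension of the computed subspace
```

Comparing column counts is enough to detect termination because the subspaces generated by the iteration are nested, so equal dimensions imply equal subspaces.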

EXERCISES
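Exercise 19.1 With a basis for $X = R^n$ fixed and $\mathcal{S} \subset X$ a subspace, let

$$\mathcal{S}^\perp = \{\, x \in X \mid x^T v = 0 \text{ for all } v \in \mathcal{S} \,\}$$

(Note that this definition is not coordinate free.) If $\mathcal{W} \subset X$ is another subspace, show that

$$(\mathcal{W} + \mathcal{S})^\perp = \mathcal{W}^\perp \cap \mathcal{S}^\perp$$

If A is an n x n matrix, show that

$$(A^T \mathcal{S})^\perp = A^{-1} \mathcal{S}^\perp$$

Finally show that $(\mathcal{S}^\perp)^\perp = \mathcal{S}$. Hint: For the last part use the fact that for a q x n matrix H, dim Ker [H] + dim Im [H] = n. This is easily proved by choosing a basis for X adapted to Ker [H].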

Exercise 19.2 Corresponding to the linear state equation

kU) =Ax(t) + Bu(z)


suppose
is a specified subspace. Define the corresponding sequence of subspaces (see
Exercise 19.1 for definitions)

(wo =

= WA

+ AT(WL_I

fB1), k = 1,2,

Show that the maximal controlled invariant subspace contained in

is given by

Hint: Compare this algorithm with the algorithm for


Exercise 19.3

and use Exercise 19.1.

For a single-output linear state equation

=Ax(t) + Bu(i)

)'(t) =cx(t)
suppose

c is a finite positive integer such that

j=O
Show that the maximal controlled invariant subspace contained in Ker [c] is
K

=
A

Ker[cAA]

=0

Hint: Use the algorithm in Exercise 19.2 to compute


is the maximal controlled invariant subspace contained in 7( C X.
Exercise 19.4 Suppose
Define a corresponding sequence of subspaces by

= 0

k=l,2,...
= Q*, the maximal controllability subspace contained in
Show that
18.4 show that if F is a friend of '1/i', then

Hint: Using Exercise

+BF)'t(fBrYli*)
Exercise 19.5 For the linear state equation

=Ax(t) + Bu(t)
y(t) = Cx(t)
If
is the maximal controlled invariant subspace contained in
denote the j"-rowof C by
is the maximal controlled invariant subspace contained ,in Ker [C3 1.
Ker [C], and
p,showthat
j=1
p

C fl q*
3=I

Exercise 19.6

Corresponding to the linear state equation

i(t) =Ax(t) + Bu(t)


show that there exists a unique maximal subspace

among all subspaces that satisfy


AZ +

Zc

Furthermore show that

(This relates to perfect tracking as explored in Exercise 18.13.)

Exercise 19.7 Suppose that the disturbance input w(t) to the plant

$$\dot{x}(t) = Ax(t) + Bu(t) + Ew(t)$$
$$y(t) = Cx(t)$$

is measurable. Show that the disturbance decoupling problem is solvable with state/disturbance feedback of the form

$$u(t) = Fx(t) + Kw(t) + Gv(t)$$

if and only if

$$\mathrm{Im}\,[E] \subset \mathcal{V}^* + \mathcal{B}$$

where $\mathcal{V}^*$ is the maximal controlled invariant subspace contained in Ker [C].

Exercise 19.8 Corresponding to the linear state equation

.i(t) =Ax(t) + Bu(t)


suppose

c Xis a subspace,

is the maximal controlled invariant subspace contained in

Q* is the maximal controllability subspace contained in

qI* =
Use

and

Show that

Q*

this fact to restate Theorem 19.11.

Exercise 19.9
If the conditions in Theorem 19.11 for existence of a solution of the
noninteracting control problem are satisfied, show that there is no other set of controllability
subspaces Q, c
i =I
rn, such that

That is, Qi *

Qrn* provide the only solution of (26).

Exercise 19.10 Consider the additional hypothesis p = n for Theorem 19.11 (so that C is invertible). Show that then (26) can be replaced by the equivalent condition

$$Q_i^* + \mathrm{Ker}\,[C_i] = X , \quad i = 1, \ldots, m$$

Exercise 19.11 Consider a linear state equation with m = 2 that satisfies the conditions for noninteracting control in Theorem 19.11. For the noninteracting closed-loop state equation

$$\dot{x}(t) = (A + BF)\,x(t) + BG_1 v_1(t) + BG_2 v_2(t)$$
$$y_1(t) = C_1 x(t)$$
$$y_2(t) = C_2 x(t)$$

consider a state variable change adapted to the nested set of subspaces


span { p,,

span (pI


p,
span

p,,,,....,
Pi

* n Q2

p,,

p,

= Qi*

p, J =

What is the partitioned form of the closed-loop state equation in the new coordinates?

Exercise 19.12 Justify the assumptions rank B = m and rank C = p in Theorem 19.11 by providing simple examples with m = p = 2 to show that removal of either assumption admits obviously unsolvable problems.

NOTES
Note 19.1 Further development of disturbance decoupling, including refinements of the basic problem studied here and output-feedback solutions, can be found in

S.P. Bhattacharyya, "Disturbance rejection in linear systems," International Journal of Systems Science, Vol. 5, pp. 633–637, 1974

J.C. Willems, C. Commault, "Disturbance decoupling by measurement feedback with stability or pole placement," SIAM Journal on Control and Optimization, Vol. 19, pp. 490–504, 1981

We have not discussed the problem of disturbance decoupling with stability, where eigenvalue assignment is not required. But it should be no surprise that this problem involves the stabilizability condition in Theorem 18.28 and the condition Im [E] $\subset \mathcal{S}^*$, where $\mathcal{S}^*$ is the maximal stabilizability subspace contained in Ker [C]. For further information see the references in Note 18.5.

Note 19.2 Numerical aspects of the computation of maximal controlled invariant subspaces are discussed in the papers

B.C. Moore, A.J. Laub, "Computation of supremal (A,B)-invariant and (A,B)-controllability subspaces," IEEE Transactions on Automatic Control, Vol. AC-23, No. 5, pp. 783–792, 1978

A. Linnemann, "Numerical aspects of disturbance decoupling by measurement feedback," IEEE Transactions on Automatic Control, Vol. AC-32, No. 10, pp. 922–926, 1987

The singular values of a matrix A are the nonnegative square roots of the eigenvalues of $A^T A$. The associated singular value decomposition provides efficient methods for calculating sums of subspaces, inverse images, and so on. For an introduction see

V.C. Klema, A.J. Laub, "The singular value decomposition: its computation and some applications," IEEE Transactions on Automatic Control, Vol. 25, No. 2, pp. 164–176, 1980
Note 19.3 The noninteracting control problem, also known simply as the decoupling problem, has a rich history. Early geometric work is surveyed in the paper

A.S. Morse, W.M. Wonham, "Status of noninteracting control," IEEE Transactions on Automatic Control, Vol. AC-16, No. 6, pp. 568–581, 1971

The proof of Theorem 19.11 follows the broad outlines of the development in

A.S. Morse, W.M. Wonham, "Decoupling and pole assignment by dynamic compensation," SIAM Journal on Control and Optimization, Vol. 8, No. 3, pp. 317–337, 1970

with refinements deduced from the treatment of a nonlinear noninteracting control problem in

H. Nijmeijer, J.M. Schumacher, "The regular local noninteracting control problem for nonlinear control systems," SIAM Journal on Control and Optimization, Vol. 24, No. 6, pp. 1232–1245, 1986

Independent early work on the geometric approach to noninteracting control for linear systems is reported in

G. Basile, G. Marro, "A state space approach to noninteracting controls," Ricerche di Automatica, Vol. 1, No. 1, pp. 68–77, 1970

Fundamental papers on algebraic approaches to noninteracting control include

P.L. Falb, W.A. Wolovich, "Decoupling in the design and synthesis of multivariable control systems," IEEE Transactions on Automatic Control, Vol. AC-12, No. 6, pp. 651–659, 1967

E.G. Gilbert, "The decoupling of multivariable systems by state feedback," SIAM Journal on Control and Optimization, Vol. 7, No. 1, pp. 50–63, 1969

L.M. Silverman, H.J. Payne, "Input-output structure of linear systems with application to the decoupling problem," SIAM Journal on Control and Optimization, Vol. 9, No. 2, pp. 199–233, 1971

Note 19.4 The important problem of using static state feedback to simultaneously achieve noninteracting control and exponential stability for the closed-loop state equation is neglected in our introductory treatment. Conditions under which this can be achieved are established via algebraic arguments for the case m = p in the paper by Gilbert cited in Note 19.3. For more general linear plants, geometric conditions are derived in

J.W. Grizzle, A. Isidori, "Block noninteracting control with stability via static state feedback," Mathematics of Control, Signals, and Systems, Vol. 2, No. 4, pp. 315–342, 1989

These authors begin with an alternate geometric formulation of the noninteracting control
problem that involves controlled invariant subspaces containing Inz
and contained in
Ker[C1]. This leads to a different solvability condition that is of independent interest.

If dynamic state feedback is permitted, then solvability of the noninteracting control problem
with static state feedback implies solvability of the problem with exponential stability via
dynamic state feedback. See the papers by Morse and Wonham cited in Note 19.3.
Note 19.5 Another control problem that has been treated extensively via the geometric approach

is the servomechanism or output regulation problem. This involves stabilizing the closed-loop
system while achieving asymptotic tracking of any reference input generated by a specified,
exogenous linear system, and asymptotic rejection of any disturbance signal generated by another
specified, exogenous linear system. The servomechanism problem treated algebraically in
Chapter 14 is an example where the exogenous systems are simply integrators. Consult the
geometric treatment in

B.A. Francis, "The linear multivariable regulator problem," SIAM Journal on Control and
Optimization, Vol. 15, No. 3, pp. 486505, 1977

a paper that contains references to a variety of other approaches. Other problems involving
dynamic state feedback, observers, and dynamic output feedback can be treated from a geometric
viewpoint. See the citations in Note 18.1, and

382

Chapter 19

Applications of Geometric Theory

W.M. Wonham, Linear Multi variable Control: A Geometric Approach, Third Edition, Springer-

Verlag, New York, 1985

0. Basile, 0. Marro, Controlled and Conditioned Invariants in Linear System Theory, Prentice
Hall, Englewood Cliffs, New Jersey, 1992

Note 19.6 Geometric methods are prominent in nonlinear system and control theory, particularly
in approaches that involve transforming a nonlinear system into a linear system by feedback and
state variable changes. An introduction is given in Chapter 7 of

M. Vidyasagar, Nonlinear Systems Analysis, Second Edition, Prentice Hall, Englewood Cliffs,
New Jersey, 1993

and extensive treatments are in


A. Isidori, Nonlinear Control Systems, Second Edition, Springer-Verlag, Berlin, 1989

H. Nijmeijer, A.J. van der Schaft, Nonlinear Dynamical Control Systems, Springer-Verlag, New
York, 1990

20
DISCRETE TIME
STATE EQUATIONS

Discrete-time signals are considered to be sequences of scalars or vectors, as the case
may be, defined for consecutive integers that we refer to as the time index. Rather than
employ the subscript notation for sequences in Chapter 1, for example $\{x_k\}$, we
simply write $x(k)$, saving subscripts for other purposes and leaving the range of interest
of integer $k$ to context or to separate listing.

The basic representation for a discrete-time linear system is the linear state
equation

$$
\begin{aligned}
x(k+1) &= A(k)x(k) + B(k)u(k) \\
y(k) &= C(k)x(k) + D(k)u(k)
\end{aligned}
\tag{1}
$$

The $n \times 1$ vector sequence $x(k)$ is called the state vector, with entries $x_1(k), \ldots, x_n(k)$
called the state variables. The input signal is the $m \times 1$ vector sequence $u(k)$, and $y(k)$
is the $p \times 1$ output signal. Throughout the treatment of (1) we assume that these
dimensions satisfy $m, p \le n$. This is a reasonable assumption since the input influences
the state vector only through the $n \times m$ matrix $B(k)$, and the state vector influences the
output only through the $p \times n$ matrix $C(k)$. That is, input signals with $m > n$ cannot
impact the state vector to a greater extent than a suitable $n \times 1$ input signal. And an
output with $p > n$ can carry no more information about the state than is carried by a
suitable $n \times 1$ output signal.

Default assumptions on the coefficients of (1) are that they are real matrix
sequences defined for all integer $k$, from $-\infty$ to $\infty$. Of course coefficients that are of
interest over a smaller range of integer $k$ can be extended to fit the default simply by
letting the matrix sequences take any convenient values, say zero, outside the range.
Complex coefficient matrices and signals occasionally arise, and special mention is made
in these situations.

The standard terminology is that (1) is time invariant if all coefficient-matrix
sequences are constant. The linear state equation is called time varying if any entry in
any coefficient matrix sequence changes with $k$.

Examples
An immediately familiar, direct source of discrete-time signals is the digital computer.

However discrete-time signals often arise from continuous-time settings as a result of a


measurement or data collection process, for example, economic data that is published
annually. This leads to discrete-time state equations describing relationships among
discrete-time signals that represent sample values of underlying continuous-time signals.

Sometimes technological systems with pulsed behavior, such as radar systems, are
modeled as discrete-time state equations for study of particular aspects. Also discrete-time
state equations arise from continuous-time state equations in the course of
numerical approximation, or as descriptions of an underlying continuous-time state
equation when the input signal is specified digitally. We present examples of these
situations to motivate study of the standard representation in (1).

20.1 Example A simple, classical model in economics for national income $y(k)$ in
year $k$ describes $y(k)$ in terms of consumer expenditure $c(k)$, private investment $i(k)$,
and government expenditure $g(k)$ according to

$$y(k) = c(k) + i(k) + g(k) \tag{2}$$

These quantities are interrelated by the following assumptions. First, consumer
expenditure in year $k+1$ is proportional to the national income in year $k$,

$$c(k+1) = \alpha\, y(k)$$

where the constant $\alpha$ is called, impressively enough, the marginal propensity to
consume. Second, the private investment in year $k+1$ is proportional to the increase in
consumer expenditure from year $k$ to year $k+1$,

$$i(k+1) = \beta\,[\,c(k+1) - c(k)\,]$$

where the constant $\beta$ is a growth coefficient. Typically $0 < \alpha < 1$ and $\beta > 0$.
From these assumptions we can write the two scalar difference equations

$$
\begin{aligned}
c(k+1) &= \alpha\, c(k) + \alpha\, i(k) + \alpha\, g(k) \\
i(k+1) &= (\beta\alpha - \beta)\, c(k) + \beta\alpha\, i(k) + \beta\alpha\, g(k)
\end{aligned}
$$

Defining state variables as $x_1(k) = c(k)$ and $x_2(k) = i(k)$, the output as $y(k)$, and the
input as $g(k)$, we obtain the linear state equation

$$
\begin{aligned}
x(k+1) &= \begin{bmatrix} \alpha & \alpha \\ \beta(\alpha - 1) & \beta\alpha \end{bmatrix} x(k) + \begin{bmatrix} \alpha \\ \beta\alpha \end{bmatrix} g(k) \\
y(k) &= \begin{bmatrix} 1 & 1 \end{bmatrix} x(k) + g(k)
\end{aligned}
\tag{3}
$$

Numbering the years by $k = 0, 1, \ldots$, the initial state is provided by $c(0)$ and $i(0)$.
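The national income model of Example 20.1 can be iterated directly. The following is a minimal sketch in plain Python; the function name `simulate_income` and the numerical values in the usage note are illustrative assumptions, not part of the text.

```python
def simulate_income(alpha, beta, c0, i0, g, steps):
    """Iterate the two scalar difference equations of Example 20.1.

    alpha: marginal propensity to consume, beta: growth coefficient,
    c0, i0: initial consumer expenditure and private investment,
    g: list of government expenditures g(0), g(1), ...
    Returns a list of (c(k), i(k), y(k)) for k = 0, ..., steps - 1.
    """
    c, i = c0, i0
    history = []
    for k in range(steps):
        y = c + i + g[k]                      # output equation (2)
        history.append((c, i, y))
        c_next = alpha * c + alpha * i + alpha * g[k]
        i_next = beta * (alpha - 1.0) * c + beta * alpha * i + beta * alpha * g[k]
        c, i = c_next, i_next
    return history
```

For instance, `simulate_income(0.5, 1.0, 2.0, 1.0, [1.0] * 3, 3)` steps the model through three years with constant government expenditure; each new `c` equals `alpha` times the previous `y`, as the first assumption of the example requires.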
Our next two examples presume modest familiarity with continuous-time state
equations. The examples introduce important issues in discrete-time representations for
the sampled behavior of continuous-time systems.

20.2 Example Numerical approximation of a continuous-time linear state equation
leads directly to a discrete-time linear state equation. The details depend on the
complexity of the approximation chosen for derivatives of continuous-time signals and
whether the sequence of evaluation times is uniformly spaced. We begin with a
continuous-time linear state equation, ignoring the output equation,

$$\dot{z}(t) = F(t)z(t) + G(t)v(t) \tag{4}$$

and a sequence of times $t_k$. This sequence might be pre-selected, or it might be
generated iteratively based on some step-size criterion. Assuming the simplest
approximation of $\dot{z}(t)$ at each $t_k$, namely,

$$\dot{z}(t_k) \approx \frac{z(t_{k+1}) - z(t_k)}{t_{k+1} - t_k}$$

evaluation of (4) for $t = t_k$ gives

$$\frac{z(t_{k+1}) - z(t_k)}{t_{k+1} - t_k} \approx F(t_k)z(t_k) + G(t_k)v(t_k)$$

That is, after rearranging,

$$z(t_{k+1}) \approx [\,I + (t_{k+1} - t_k)F(t_k)\,]\,z(t_k) + (t_{k+1} - t_k)G(t_k)v(t_k)$$

To obtain a discrete-time linear state equation (1) that provides an approximation to the
continuous-time state equation (4), replace the approximation sign by equality, change
the index from $t_k$ to $k$, and redefine the notation according to

$$
\begin{aligned}
x(k) &= z(t_k), & A(k) &= I + (t_{k+1} - t_k)F(t_k) \\
u(k) &= v(t_k), & B(k) &= (t_{k+1} - t_k)G(t_k)
\end{aligned}
$$

If the sequence of evaluation times is equally spaced, say $t_{k+1} = t_k + \delta$ for all $k$,
then the discrete-time linear state equation simplifies a bit, but remains time varying. If
in addition the original continuous-time linear state equation is time invariant, then the
resulting discrete-time state equation also is time invariant.
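The construction in Example 20.2 can be sketched in a few lines. This is a hedged illustration assuming NumPy is available; the function name `euler_discretize` and the sample coefficient matrices are hypothetical choices made only for the demonstration.

```python
import numpy as np

def euler_discretize(F, G, times):
    """Build A(k) = I + (t_{k+1} - t_k) F(t_k) and B(k) = (t_{k+1} - t_k) G(t_k).

    F and G are callables returning the continuous-time coefficient matrices
    at time t; times is the increasing sequence of evaluation times.
    Returns two lists, one entry per step.
    """
    n = F(times[0]).shape[0]
    A, B = [], []
    for tk, tk1 in zip(times[:-1], times[1:]):
        step = tk1 - tk
        A.append(np.eye(n) + step * F(tk))
        B.append(step * G(tk))
    return A, B

# Hypothetical time-invariant example with equally spaced evaluation times.
F = lambda t: np.array([[0.0, 1.0], [-1.0, 0.0]])
G = lambda t: np.array([[0.0], [1.0]])
A_seq, B_seq = euler_discretize(F, G, [0.0, 0.5, 1.0, 1.5])
```

With constant `F`, `G` and equal spacing, every entry of `A_seq` is the same matrix, which illustrates the closing remark of the example that the resulting discrete-time state equation is time invariant.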

20.3 Example Suppose the input to a continuous-time linear state equation (4) is
specified by a sequence $u(k)$ supplied, for example, by a digital computer. We assume
the simplest type of digital-to-analog conversion: a zero-order hold that produces a
corresponding continuous-time input in terms of a fixed $T > 0$ by

$$v(t) = u(k); \quad kT \le t < (k+1)T, \quad k = k_0,\, k_0+1, \ldots$$

With initial time $t_0 = k_0 T$ and initial state $z_0 = z(k_0 T)$, the solution of (4) for all $t \ge t_0$,
discussed in Chapter 3, is unwieldy because of the piecewise-constant nature of $v(t)$.
Therefore we relax the objective to describing the solution only at the time instants
$t = kT$, $k \ge k_0$. Evaluating the continuous-time solution formula

$$z(t) = \Phi_F(t, \tau)\,z(\tau) + \int_{\tau}^{t} \Phi_F(t, \sigma)G(\sigma)v(\sigma)\,d\sigma, \quad t \ge \tau$$

for $t = (k+1)T$ and $\tau = kT$ gives, since $v(\sigma)$ is constant on the resulting integration
range,

$$z[(k+1)T] = \Phi_F[(k+1)T,\, kT]\,z(kT) + \int_{kT}^{(k+1)T} \Phi_F[(k+1)T,\, \sigma]\,G(\sigma)\,d\sigma\; u(k) \tag{6}$$

With the identifications

$$x(k) = z(kT), \quad A(k) = \Phi_F[(k+1)T,\, kT], \quad B(k) = \int_{kT}^{(k+1)T} \Phi_F[(k+1)T,\, \sigma]\,G(\sigma)\,d\sigma \tag{7}$$

for $k = k_0,\, k_0+1, \ldots$, (6) becomes a discrete-time linear state equation in the standard
form (1). An important characteristic of such sampled-data state equations is that $A(k)$
is invertible for every $k$. This follows from the invertibility property of continuous-time
transition matrices.

If the continuous-time linear state equation (4) is time invariant, then the discrete-time
linear state equation (6) is time invariant with coefficients that can be written as
constant matrices involving the matrix exponential of $F$. Specifically the coefficient
matrices in (7) become, after a change of integration variable,

$$A = e^{FT}, \quad B = \int_{0}^{T} e^{F\tau}\,d\tau\; G$$
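For the time-invariant case, both coefficient matrices can be read from a single matrix exponential, using the standard identity that the exponential of the block matrix with rows $[F \;\; G]$ and $[0 \;\; 0]$, scaled by $T$, has $e^{FT}$ and $\int_0^T e^{F\tau}d\tau\,G$ in its top row. The sketch below assumes NumPy; the helper `expm_taylor` is a simple scaled Taylor series written for illustration only, not a production-quality exponential, and both function names are hypothetical.

```python
import numpy as np

def expm_taylor(M, terms=20, squarings=8):
    """Matrix exponential by scaling, truncated Taylor series, and squaring.

    Illustrative only: adequate for small, well-scaled matrices.
    """
    S = M / (2 ** squarings)
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for j in range(1, terms + 1):
        term = term @ S / j          # S^j / j!
        E = E + term
    for _ in range(squarings):
        E = E @ E                    # undo the scaling
    return E

def zoh_discretize(F, G, T):
    """Return A = exp(F T) and B = (integral of exp(F tau) over [0, T]) G."""
    n, m = G.shape
    M = np.zeros((n + m, n + m))
    M[:n, :n] = F
    M[:n, n:] = G
    E = expm_taylor(M * T)
    return E[:n, :n], E[:n, n:]

# Hypothetical example: a double integrator sampled with period T = 0.5.
A, B = zoh_discretize(np.array([[0.0, 1.0], [0.0, 0.0]]),
                      np.array([[0.0], [1.0]]), 0.5)
```

For the double integrator the series terminates, so the result can be checked by hand: $A$ has rows $[1 \;\; T]$ and $[0 \;\; 1]$, and $B$ has entries $T^2/2$ and $T$.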
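
20.4 Example Consider a scalar, $n^{th}$-order linear difference equation in the dependent
variable $y(k)$ with forcing function $u(k)$,

$$y(k+n) + a_{n-1}(k)\,y(k+n-1) + \cdots + a_0(k)\,y(k) = b_0(k)\,u(k) \tag{8}$$

Assuming the initial time is $k_0$, initial conditions that specify the solution for $k \ge k_0$ are
the values

$$y(k_0),\; y(k_0+1),\; \ldots,\; y(k_0+n-1)$$

This difference equation can be rewritten in the form of an $n$-dimensional linear state
equation with input $u(k)$ and output $y(k)$. Define the state variables (entries in the state
vector) by

$$
\begin{aligned}
x_1(k) &= y(k) \\
x_2(k) &= y(k+1) \\
&\;\;\vdots \\
x_n(k) &= y(k+n-1)
\end{aligned}
\tag{9}
$$

Then

$$
\begin{aligned}
x_1(k+1) &= x_2(k) \\
x_2(k+1) &= x_3(k) \\
&\;\;\vdots \\
x_{n-1}(k+1) &= x_n(k)
\end{aligned}
$$

and, according to the difference equation (8),

$$x_n(k+1) = -a_0(k)\,x_1(k) - a_1(k)\,x_2(k) - \cdots - a_{n-1}(k)\,x_n(k) + b_0(k)\,u(k)$$

Reassembling these scalar equations into vector-matrix form gives a time-varying linear
state equation:

$$
\begin{aligned}
x(k+1) &= \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -a_0(k) & -a_1(k) & \cdots & -a_{n-1}(k) \end{bmatrix} x(k) + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ b_0(k) \end{bmatrix} u(k) \\
y(k) &= \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} x(k)
\end{aligned}
\tag{10}
$$

The original initial conditions for $y(k)$ produce an initial state vector for (10) upon
evaluating the definitions in (9) at $k = k_0$.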
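The passage from the scalar difference equation (8) to the state equation (10) can be checked numerically. The following is a minimal sketch assuming NumPy; the function name `companion` and the Fibonacci-type recursion used as test data are illustrative assumptions.

```python
import numpy as np

def companion(a, b0):
    """Coefficient matrices of (10) at a single time index k.

    a = [a_0(k), ..., a_{n-1}(k)] and b0 = b_0(k), both taken from (8).
    """
    n = len(a)
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)                 # x_i(k+1) = x_{i+1}(k)
    A[-1, :] = -np.asarray(a, dtype=float)     # last row from (8)
    B = np.zeros((n, 1))
    B[-1, 0] = b0
    C = np.zeros((1, n))
    C[0, 0] = 1.0                              # y(k) = x_1(k)
    return A, B, C

# Hypothetical example: y(k+2) - y(k+1) - y(k) = u(k) with u(k) = 0,
# y(0) = 0 and y(1) = 1, so that y(k) is the Fibonacci sequence.
A, B, C = companion([-1.0, -1.0], 1.0)
x = np.array([[0.0], [1.0]])                   # x(0) = [y(0), y(1)] from (9)
outputs = []
for k in range(6):
    outputs.append((C @ x).item())
    x = A @ x + B * 0.0
```

Here the coefficients are constant, so the same `A`, `B`, `C` serve every `k`; for time-varying coefficients `companion` would be called once per time index.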

Linearization
Discrete-time linear state equations can be useful in approximating a discrete-time,
time-varying nonlinear state equation of the form

$$
\begin{aligned}
x(k+1) &= f(x(k), u(k), k), \quad x(k_0) = x_0 \\
y(k) &= h(x(k), u(k), k)
\end{aligned}
\tag{11}
$$

Here the usual dimensions for the state, input, and output signals are assumed. Given a
particular nominal input signal $\tilde{u}(k)$ and a particular nominal initial state $\tilde{x}_0$ we can
solve the first equation in (11) by iterating to obtain the resulting nominal solution, or
nominal state trajectory, $\tilde{x}(k)$ for $k = k_0,\, k_0+1, \ldots$. Then the second equation in (11)
provides a corresponding nominal output trajectory $\tilde{y}(k)$. Consider now input signals
and initial states that are close to the nominals. Assuming the corresponding solutions
remain close to the nominal solution, we develop an approximation by truncating the
Taylor series expansions of $f(x, u, k)$ and $h(x, u, k)$ about $(\tilde{x}(k), \tilde{u}(k))$ after first-order terms.
This provides an approximation of the dependence of $f(x, u, k)$ and $h(x, u, k)$ on the
arguments $x$ and $u$, for any time index $k$.
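Adopting the notation

$$u(k) = \tilde{u}(k) + u_\delta(k), \quad x(k) = \tilde{x}(k) + x_\delta(k), \quad y(k) = \tilde{y}(k) + y_\delta(k) \tag{12}$$

the first equation in (11) can be written in the form

$$\tilde{x}(k+1) + x_\delta(k+1) = f(\tilde{x}(k) + x_\delta(k),\; \tilde{u}(k) + u_\delta(k),\; k), \quad \tilde{x}(k_0) + x_\delta(k_0) = x_0$$

Assuming indicated derivatives of the function $f(x, u, k)$ exist, we expand the right side
in a Taylor series about $\tilde{x}(k)$ and $\tilde{u}(k)$ and then retain only the terms through first order.
This is expected to provide a reasonable approximation since $u_\delta(k)$ and $x_\delta(k)$ are
assumed to be small for all $k$. For the $i^{th}$ component, retaining terms through first order
and momentarily dropping most $k$-arguments for simplicity yields

$$
\begin{aligned}
f_i(\tilde{x} + x_\delta,\; \tilde{u} + u_\delta,\; k) \approx f_i(\tilde{x}, \tilde{u}, k)
&+ \frac{\partial f_i}{\partial x_1}(\tilde{x}, \tilde{u}, k)\,x_{\delta 1} + \cdots + \frac{\partial f_i}{\partial x_n}(\tilde{x}, \tilde{u}, k)\,x_{\delta n} \\
&+ \frac{\partial f_i}{\partial u_1}(\tilde{x}, \tilde{u}, k)\,u_{\delta 1} + \cdots + \frac{\partial f_i}{\partial u_m}(\tilde{x}, \tilde{u}, k)\,u_{\delta m}
\end{aligned}
$$

Performing this expansion for $i = 1, \ldots, n$ and arranging into vector-matrix form gives

$$\tilde{x}(k+1) + x_\delta(k+1) \approx f(\tilde{x}(k), \tilde{u}(k), k) + \frac{\partial f}{\partial x}(\tilde{x}(k), \tilde{u}(k), k)\,x_\delta(k) + \frac{\partial f}{\partial u}(\tilde{x}(k), \tilde{u}(k), k)\,u_\delta(k)$$

The notation $\partial f/\partial x$ denotes the Jacobian, an $n \times n$ matrix with $i,j$-entry $\partial f_i/\partial x_j$.
Similarly $\partial f/\partial u$ is an $n \times m$ Jacobian matrix with $i,j$-entry $\partial f_i/\partial u_j$. Since

$$\tilde{x}(k+1) = f(\tilde{x}(k), \tilde{u}(k), k), \quad \tilde{x}(k_0) = \tilde{x}_0$$

the relation between $x_\delta(k)$ and $u_\delta(k)$ is approximately described by a time-varying
linear state equation of the form

$$x_\delta(k+1) = A(k)\,x_\delta(k) + B(k)\,u_\delta(k), \quad x_\delta(k_0) = x_0 - \tilde{x}_0 \tag{13}$$

Here $A(k)$ and $B(k)$ are the Jacobian matrices evaluated using the nominal trajectory
data $\tilde{u}(k)$ and $\tilde{x}(k)$, namely

$$A(k) = \frac{\partial f}{\partial x}(\tilde{x}(k), \tilde{u}(k), k), \quad B(k) = \frac{\partial f}{\partial u}(\tilde{x}(k), \tilde{u}(k), k), \quad k \ge k_0$$

For the nonlinear output equation in (11), the function $h(x, u, k)$ can be expanded
about $x = \tilde{x}(k)$ and $u = \tilde{u}(k)$ in a similar fashion. This gives, after dropping higher-order
terms, the approximate description

$$y_\delta(k) = C(k)\,x_\delta(k) + D(k)\,u_\delta(k) \tag{14}$$

The coefficients again are specified by Jacobians evaluated at the nominal data:

$$C(k) = \frac{\partial h}{\partial x}(\tilde{x}(k), \tilde{u}(k), k), \quad D(k) = \frac{\partial h}{\partial u}(\tilde{x}(k), \tilde{u}(k), k), \quad k \ge k_0$$

If in fact $x_\delta(k_0)$ is small (in norm), $u_\delta(k)$ stays small for $k \ge k_0$, and the solution $x_\delta(k)$
of (13) stays small for $k \ge k_0$, then we expect that the solution of (13) yields an accurate
approximation to the solution of (11) via the definitions in (12). Rigorous assessment of
the validity of this expectation must be based on stability theory for nonlinear state
equations, a topic we do not address.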
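When the partial derivatives are tedious to compute by hand, the Jacobians that define $A(k)$ and $B(k)$ can be approximated numerically. The sketch below assumes NumPy and uses central differences, a numerical substitute for the analytic derivatives in the text; the function names and the sample nonlinear map `f_example` are hypothetical.

```python
import numpy as np

def jacobian(func, z, eps=1e-6):
    """Central-difference approximation of the Jacobian of func at the vector z."""
    z = np.asarray(z, dtype=float)
    f0 = np.asarray(func(z), dtype=float)
    J = np.zeros((f0.size, z.size))
    for j in range(z.size):
        dz = np.zeros_like(z)
        dz[j] = eps
        J[:, j] = (np.asarray(func(z + dz)) - np.asarray(func(z - dz))) / (2 * eps)
    return J

def linearize(f, x_nom, u_nom, k):
    """Approximate A(k) and B(k) at the nominal data (x_nom, u_nom) for index k."""
    A = jacobian(lambda x: f(x, u_nom, k), x_nom)
    B = jacobian(lambda u: f(x_nom, u, k), u_nom)
    return A, B

# Hypothetical nonlinear state equation with n = 2 and m = 1.
def f_example(x, u, k):
    return np.array([x[1], -np.sin(x[0]) + u[0]])

A0, B0 = linearize(f_example, np.zeros(2), np.zeros(1), 0)
```

At the zero nominal solution of this sample map the analytic Jacobians have rows $[0 \;\; 1]$, $[-1 \;\; 0]$ for $A$ and entries $0$, $1$ for $B$, which the finite-difference values reproduce to within the step-size error.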

20.5 Example The normalized logistics equation is a basic model in population
dynamics. With $x(k)$ denoting the size of a population at time $k$, and $\alpha$ a positive
constant, consider the nonlinear state equation

$$x(k+1) = \alpha\,x(k) - \alpha\,x^2(k), \quad x(0) = x_0$$

No input signal appears in this formulation, and deviations from constant nominal
solutions, that is, constant population sizes, are of interest. Such a nominal solution $\tilde{x}$,
often called an equilibrium state, must satisfy

$$\tilde{x} = \alpha\,\tilde{x} - \alpha\,\tilde{x}^2$$

Clearly the possibilities are $\tilde{x} = 0$, corresponding to initially-zero population, or
$\tilde{x} = (\alpha - 1)/\alpha$. This latter solution has meaning as a population only if $\alpha > 1$, a condition
we henceforth assume.

Computing partial derivatives, the linearized state equation about a constant
nominal solution $\tilde{x}$ is given by

$$x_\delta(k+1) = (\alpha - 2\alpha\tilde{x})\,x_\delta(k)$$

A straightforward iteration for $k = 0, 1, \ldots$, yields the solution

$$x_\delta(k) = (\alpha - 2\alpha\tilde{x})^k\,x_\delta(0), \quad k \ge 0$$

Since $\alpha > 1$, if $\tilde{x} = 0$, then this solution of the linearized equation exhibits an
exponentially increasing population for any positive $x_\delta(0)$, no matter how small. Since
the assumption that $x_\delta(k)$ remains small obviously is not satisfied, any conclusion is
suspect. However for the constant nominal $\tilde{x} = (\alpha - 1)/\alpha$, with $1 < \alpha < 3$, the solution of
the linearized state equation indicates that $x_\delta(k)$ approaches zero as $k \to \infty$. That is,
beginning at an initial population near this $\tilde{x}$, we expect the population size to
asymptotically return to $\tilde{x}$.
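The conclusions of Example 20.5 are easy to explore numerically. The following sketch in plain Python iterates the nonlinear equation and evaluates the linearized solution; the function names and the values $\alpha = 2.5$ and initial deviation $0.01$ are assumptions chosen for illustration.

```python
def logistic_orbit(alpha, x0, steps):
    """Iterate x(k+1) = alpha*x(k) - alpha*x(k)**2 starting from x(0) = x0."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(alpha * x - alpha * x * x)
    return xs

def linearized_deviation(alpha, x_nom, dx0, steps):
    """Solution of the linearized equation about the constant nominal x_nom."""
    a = alpha - 2.0 * alpha * x_nom
    return [a ** k * dx0 for k in range(steps + 1)]

alpha = 2.5
x_nom = (alpha - 1.0) / alpha                 # nonzero equilibrium, here 0.6
orbit = logistic_orbit(alpha, x_nom + 0.01, 50)
dev_near_equilibrium = linearized_deviation(alpha, x_nom, 0.01, 50)
dev_near_zero = linearized_deviation(alpha, 0.0, 0.01, 5)
```

For this $\alpha$ the linearized coefficient about the nonzero equilibrium is $2 - \alpha = -0.5$, so the deviation decays, while about $\tilde{x} = 0$ the coefficient is $\alpha = 2.5$ and the linearized deviation grows, in line with the discussion in the example.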

State Equation Implementation


It is apparent that a discrete-time linear state equation can be implemented in software
on a digital computer. A state equation also can be implemented directly in electronic
hardware using devices that perform the three underlying operations involved in the
state equation. The first operation is a (signed) sum of scalar sequences, represented in
Figure 20.6(a).

20.6 Figure The elements of a discrete-time state variable diagram. [Diagram not reproduced: (a) signed sum, (b) unit delay with initial condition terminal, (c) multiplication by a time-varying coefficient.]

The second operation is a unit delay, which conveniently implements the
relationship between the scalar sequences $x(k)$ and $x(k+1)$, with an initial value
assignment at $k = k_0$. This is shown in Figure 20.6(b), but proper interpretation is a bit
delicate. The output signal of the unit delay is the input signal 'shifted to the right by
one.' Assuming all signal values are zero for $k < k_0$, the output signal value at $k_0$ would
be restricted to zero if the initial condition terminal was not present. Put another way, in
terms of a somewhat cumbersome notation in which the arrow marks the entry at index $k_0$, if

$$x(k) = (\,\ldots,\, 0,\, \underset{\uparrow}{x_0},\, x_1,\, x_2,\, \ldots\,)$$

then

$$x(k+1) = (\,\ldots,\, 0,\, \underset{\uparrow}{x_1},\, x_2,\, x_3,\, \ldots\,)$$

So to fabricate $x(k)$ from $x(k+1)$ we use a right shift (delay) and replacement of the
resulting 0 at $k_0$ by $x_0$.
The third operation is multiplication of a scalar signal by a time-varying
coefficient, as shown in Figure 20.6(c).

These basic building blocks can be connected together as prescribed by a given


linear state equation to obtain a state variable diagram. From a theoretical perspective
such a diagram sometimes reveals structural features of the linear state equation that are
not apparent from the coefficient matrices. From an implementation perspective, a state
variable diagram provides a blueprint for hardware realization of the state equation.

20.7 Example The linear state equation (10) is represented by the state variable
diagram shown in Figure 20.8.

20.8 Figure A state variable diagram for Example 20.4.

State Equation Solution


Technical issues germane to the formulation of discrete-time linear state equations are
slight. There is no need to consider properties like the default continuity hypotheses on
input signals or state-equation coefficients in the continuous-time case. Indeed the
coefficient sequences and input signal in a discrete-time linear state equation suffer no
restrictions aside from fixed dimension. Given an initial time $k_0$, initial state $x(k_0) = x_0$,
and input signal $u(k)$ defined for all $k$, we can generate a solution of (1) for $k \ge k_0$ by
the rather pedestrian method of iteration. Simply evaluate (1) for $k = k_0,\, k_0+1, \ldots$ as
follows:

$$
\begin{aligned}
k = k_0: \quad x(k_0+1) &= A(k_0)x_0 + B(k_0)u(k_0) \\
k = k_0+1: \quad x(k_0+2) &= A(k_0+1)x(k_0+1) + B(k_0+1)u(k_0+1) \\
&= A(k_0+1)A(k_0)x_0 + A(k_0+1)B(k_0)u(k_0) + B(k_0+1)u(k_0+1) \\
k = k_0+2: \quad x(k_0+3) &= A(k_0+2)x(k_0+2) + B(k_0+2)u(k_0+2) \\
&= A(k_0+2)A(k_0+1)A(k_0)x_0 + A(k_0+2)A(k_0+1)B(k_0)u(k_0) \\
&\quad + A(k_0+2)B(k_0+1)u(k_0+1) + B(k_0+2)u(k_0+2)
\end{aligned}
\tag{17}
$$

This iteration clearly shows that existence of a solution for $k \ge k_0$ is not a problem.
Uniqueness of the solution is equally easy: $x(k_0+1)$ can be nothing other than
$A(k_0)x_0 + B(k_0)u(k_0)$, and so on. (Entering a small contradiction argument in the
margin might be a satisfying formality for the skeptic.)
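The forward iteration just described translates directly into code. This is a minimal sketch assuming NumPy, with the coefficient sequences and the input signal supplied as functions of the integer time index; the name `iterate_state` and the scalar test data are hypothetical.

```python
import numpy as np

def iterate_state(A, B, u, x0, k0, steps):
    """Generate x(k0), x(k0+1), ..., x(k0+steps) by forward iteration of (1).

    A, B and u are callables of the integer k returning, respectively, an
    n x n matrix, an n x m matrix, and a length-m vector.
    """
    xs = [np.asarray(x0, dtype=float)]
    for k in range(k0, k0 + steps):
        xs.append(A(k) @ xs[-1] + B(k) @ u(k))
    return xs

# Hypothetical scalar example: x(k+1) = (k+1) x(k) + 1, x(0) = 1.
trajectory = iterate_state(lambda k: np.array([[k + 1.0]]),
                           lambda k: np.array([[1.0]]),
                           lambda k: np.array([1.0]),
                           [1.0], 0, 3)
```

No invertibility or continuity assumption is needed for this loop to run, which mirrors the remark that existence and uniqueness of the forward solution are immediate.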

The situation can be quite different when solution of (1) backward in the time
index is attempted. As a first step, given $x_0$ and $u(k_0-1)$, we would want to compute
$x(k_0-1)$ such that, writing (1) at $k = k_0 - 1$,

$$x_0 = A(k_0-1)\,x(k_0-1) + B(k_0-1)\,u(k_0-1) \tag{18}$$

If $A(k_0-1)$ is not invertible, this may yield an infinite number of solutions for $x(k_0-1)$,
or none at all. Therefore neither existence nor uniqueness of solutions for $k < k_0$ can be
claimed in general for (1). Of course if $A(k_0-1)$ is invertible, then (18) gives

$$x(k_0-1) = A^{-1}(k_0-1)\,x_0 - A^{-1}(k_0-1)B(k_0-1)\,u(k_0-1)$$

Pursuing this by iteration, for $k = k_0-2,\, k_0-3, \ldots$, it follows that if $A(k)$ is invertible
for all $k$, then given $k_0$, $x(k_0)$, and $u(k)$ defined for all $k$, there exists a unique solution
$x(k)$ of (1) defined for all $k$, both backward and forward from $k_0$. In the sequel we
typically work only with the forward solution, viewing the backward solution as an
uninteresting artifact.
Having dispensed with the issues of existence and uniqueness of solutions, we
resume the iteration in (17) for $k \ge k_0$. A general form quickly emerges. Convenient
notation involves defining a discrete-time transition matrix, though in general only for
the ordering of arguments corresponding to forward iteration. Specifically, for $k \ge j$ let

$$
\Phi(k, j) = \begin{cases} A(k-1)A(k-2)\cdots A(j), & k \ge j+1 \\ I, & k = j \end{cases}
\tag{19}
$$

By adopting the perhaps-peculiar convention that an empty product is the identity, this
definition can be condensed to one line, and indeed other unwieldy formulas are
simplified. In the presence of more than one transition matrix, we often use a subscript
to avoid confusion, for example $\Phi_A(k, j)$.
The default is to leave $\Phi(k, j)$ undefined for $k \le j-1$. However under the
additional hypothesis that $A(k)$ is invertible for every $k$ we set

$$\Phi(k, j) = A^{-1}(k)A^{-1}(k+1)\cdots A^{-1}(j-1), \quad k \le j-1 \tag{20}$$

Explicit mention is made when this extended definition is invoked.


In terms of transition-matrix notation, the unique solution of (1) provided by the
forward iteration in (17) can be written as

$$x(k) = \Phi(k, k_0)\,x_0 + \sum_{j=k_0}^{k-1} \Phi(k, j+1)B(j)u(j), \quad k \ge k_0+1 \tag{21}$$

And if it is not clear that this emerges from the iteration, (21) can be verified by
substitution into the state equation. Of course $x(k_0) = x_0$, and in many treatments (21) is
extended to include $k = k_0$ by (at least informally) adopting a convention that a
summation is zero if the upper limit is less than the lower limit. However this convention
can cause confusion in manipulating complicated multiple summation formulas, and so
we leave the $k = k_0$ case to separate listing or obvious understanding.
Accounting for the output equation in (1) provides the complete solution

$$
y(k) = \begin{cases} C(k_0)\,x_0 + D(k_0)\,u(k_0), & k = k_0 \\[4pt] C(k)\Phi(k, k_0)\,x_0 + \displaystyle\sum_{j=k_0}^{k-1} C(k)\Phi(k, j+1)B(j)u(j) + D(k)u(k), & k \ge k_0+1 \end{cases}
\tag{22}
$$

Each of these solution formulas, (21) and (22), appears as a sum of a zero-state response,
which is the component of the solution due to the input signal, and a zero-input response,
the component due to the initial state.
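The transition matrix definition (19) and the solution formula (21) can be coded and compared with direct iteration. The sketch below assumes NumPy; the function names `transition` and `solution_21` and the particular time-varying coefficients are hypothetical. The extended definition (20) is deliberately not implemented here.

```python
import numpy as np

def transition(A, k, j, n):
    """Phi(k, j) = A(k-1) A(k-2) ... A(j) for k >= j, per (19).

    The loop is skipped when k == j, so the empty product is the identity.
    """
    if k < j:
        raise ValueError("Phi(k, j) is left undefined for k < j in this sketch")
    Phi = np.eye(n)
    for i in range(j, k):
        Phi = A(i) @ Phi
    return Phi

def solution_21(A, B, u, x0, k0, k, n):
    """Evaluate formula (21): zero-input response plus zero-state response."""
    x = transition(A, k, k0, n) @ x0
    for j in range(k0, k):
        x = x + transition(A, k, j + 1, n) @ B(j) @ u(j)
    return x

# Hypothetical time-varying data with n = 2 and m = 1.
A = lambda k: np.array([[1.0, 0.0], [1.0, float(k)]])
B = lambda k: np.array([[0.0], [1.0]])
u = lambda k: np.array([float(k)])
x_formula = solution_21(A, B, u, np.array([1.0, 2.0]), 0, 4, 2)
```

Evaluating the formula costs more than iterating (1), since each term rebuilds a product; its value lies in exposing the separate dependence on the initial state and on the input.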

A number of response properties of discrete-time linear state equations can be
gathered directly from the solution formulas. From (21) it is clear that the $i^{th}$ column of
$\Phi(k, k_0)$ represents the zero-input response to the initial state $x_0 = e_i$, the $i^{th}$ column
of $I_n$. Thus a transition matrix can be computed for fixed $k_0$ by computing the zero-input
response to $n$ initial states at $k_0$. In general if $k_0$ changes, then the whole
computation must be repeated at the new initial time.

The zero-state response can be investigated in terms of a simple class of input
signals. Define the scalar unit pulse signal by

$$\delta(k) = \begin{cases} 1, & k = 0 \\ 0, & \text{otherwise} \end{cases}$$

Consider the complete solution (22) for fixed $k_0$, $x(k_0) = 0$, and the input signal that has
all zero entries except for a unit pulse as the $i^{th}$ entry. That is, $u(k) = e_i\,\delta(k - k_0)$, where
$e_i$ now is the $i^{th}$ column of $I_m$. This gives

$$y(k) = \begin{cases} D(k_0)\,e_i, & k = k_0 \\ C(k)\Phi(k, k_0+1)B(k_0)\,e_i, & k \ge k_0+1 \end{cases}$$

In words, the zero-state response to $u(k) = e_i\,\delta(k - k_0)$ provides the $i^{th}$ column of
$D(k_0)$, and the $i^{th}$ column of the matrix sequence $C(k)\Phi(k, k_0+1)B(k_0)$, $k \ge k_0+1$.
Repeating for each of the input signals, defined for $i = 1, 2, \ldots, m$, provides the $p \times m$
matrix $D(k_0)$ and the $p \times m$ matrix sequence $C(k)\Phi(k, k_0+1)B(k_0)$, $k \ge k_0+1$.
Unfortunately this information in general reveals little about the zero-state response to
other input signals. But we revisit this issue in Chapter 21 and find that for time-invariant
linear state equations the situation is much simpler.
Additional, standard terminology can be described as follows. The discrete-time
linear state equation (1) is called linear because the right side is linear in $x(k)$ and $u(k)$.
From (22) the zero-input solution is linear in the initial state, and the zero-state solution
is linear in the input signal. The zero-state response is called causal because the response
$y(k)$ evaluated at any $k = k_a \ge k_0$ depends only on the input signal values
$u(k_0), \ldots, u(k_a)$. Additional features of both the zero-input and zero-state response in
general depend on the initial time, again an aspect that simplifies for the time-invariant
case discussed in Chapter 21.
Putting the default situation aside for a moment, similar formulas can be derived
for the complete solution of (1) backward in the time index under the added hypothesis
that $A(k)$ is invertible for every $k$. We leave it as a small exercise in iteration to show
that the complete backward solution for the output signal is

$$y(k) = C(k)\Phi(k, k_0)\,x_0 - \sum_{j=k}^{k_0-1} C(k)\Phi(k, j+1)B(j)u(j) + D(k)u(k), \quad k \le k_0-1$$

where of course the definition (20) is involved.
The iterative nature of the solution of discrete-time state equations would seem to
render features of the transition matrix relatively transparent. This is less true than might
be hoped, and computing explicit expressions for $\Phi(k, j)$ in simple cases is educational.
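
20.9 Example The transition matrix for

$$A(k) = \begin{bmatrix} 1 & 0 \\ 1 & a(k) \end{bmatrix} \tag{24}$$

can be computed by considering the associated pair of scalar state equations

$$
\begin{aligned}
x_1(k+1) &= x_1(k), & x_1(k_0) &= x_{01} \\
x_2(k+1) &= a(k)\,x_2(k) + x_1(k), & x_2(k_0) &= x_{02}
\end{aligned}
$$

and applying the complete solution formula to each. The first equation gives
$x_1(k) = x_{01}$, and then the second equation can be written as

$$x_2(k+1) = a(k)\,x_2(k) + x_{01}, \quad x_2(k_0) = x_{02}$$

From (21), with $B(k)u(k) = x_{01}$ for $k \ge k_0$, we obtain

$$x_2(k) = a(k-1)a(k-2)\cdots a(k_0)\,x_{02} + \sum_{j=k_0}^{k-1} a(k-1)a(k-2)\cdots a(j+1)\,x_{01}, \quad k \ge k_0+1$$

Repacking into matrix notation gives

$$\Phi(k, k_0) = \begin{bmatrix} 1 & 0 \\ \displaystyle\sum_{j=k_0}^{k-1} a(k-1)a(k-2)\cdots a(j+1) & a(k-1)a(k-2)\cdots a(k_0) \end{bmatrix}, \quad k \ge k_0+1$$

Note that the product convention can be deceptive. For example

$$\Phi(1, 0) = \begin{bmatrix} 1 & 0 \\ 1 & a(0) \end{bmatrix} \tag{25}$$

a conclusion that rests on interpreting the (2, 1)-entry as a sum of one empty product.
If $a(k) \ne 0$ for all $k$, then $A(k)$ is invertible for every $k$ and (20) gives

$$\Phi(k, k_0) = \begin{bmatrix} 1 & 0 \\ -\displaystyle\sum_{j=k}^{k_0-1} \frac{1}{a(j)\cdots a(k+1)a(k)} & \dfrac{1}{a(k_0-1)\cdots a(k+1)a(k)} \end{bmatrix}, \quad k \le k_0-1$$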
Transition Matrix Properties


Properties of the discrete-time transition matrix rest on the simple formula (19), with the
occasional involvement of (20), and thus are less striking than continuous-time
counterparts. Indeed the properties listed below have easy proofs that are omitted. We
begin with relationships conveyed directly by (19).

20.10 Property The transition matrix $\Phi(k, j)$ for the $n \times n$ matrix sequence $A(k)$
satisfies

$$
\begin{aligned}
\Phi(k+1, j) &= A(k)\,\Phi(k, j), & k &\ge j \\
\Phi(k, j-1) &= \Phi(k, j)\,A(j-1), & k &\ge j
\end{aligned}
\tag{26}
$$

It is traditional, and in some instances convenient, to recast these identities in
terms of linear, $n \times n$ matrix difference equations. Again, solutions of these difference
equations have essential one-sided natures.

20.11 Property The linear $n \times n$ matrix difference equation

$$X(k+1) = A(k)X(k), \quad X(k_0) = I \tag{27}$$

has the unique solution

$$X(k) = \Phi_A(k, k_0), \quad k \ge k_0$$

This property provides a useful characterization of the discrete-time transition
matrix. Furthermore it is easy to see that if the initial condition is an arbitrary $n \times n$
matrix $X(k_0) = X_0$, in place of the identity, then the unique solution for $k \ge k_0$ is
$X(k) = \Phi_A(k, k_0)X_0$.

20.12 Property The linear $n \times n$ matrix difference equation

$$Z(k-1) = A^T(k-1)Z(k), \quad Z(k_0) = I \tag{28}$$

has the unique solution

$$Z(k) = \Phi_A^T(k_0, k), \quad k \le k_0$$

From this second property we see that $Z^T(k)$ generated by (28) reveals the
behavior of the transition matrix $\Phi_A(k_0, k)$ as the second argument steps backward:
$k = k_0,\, k_0-1,\, k_0-2, \ldots$. The associated $n \times 1$ linear state equation

$$z(k-1) = A^T(k-1)z(k), \quad z(k_0) = z_0, \quad k \le k_0$$

is called the adjoint state equation for

$$x(k+1) = A(k)x(k), \quad x(k_0) = x_0, \quad k \ge k_0$$

The respective solutions

$$
\begin{aligned}
z(k) &= \Phi_A^T(k_0, k)\,z_0, & k &\le k_0 \\
x(k) &= \Phi_A(k, k_0)\,x_0, & k &\ge k_0
\end{aligned}
$$

proceed in opposite directions. However if $A(k)$ is invertible for every $k$, then both
solutions are defined for all $k$.
The following composition property for discrete-time transition matrices is
another instance where index-ordering requires attention.

20.13 Property The transition matrix for an $n \times n$ matrix sequence $A(k)$ satisfies

$$\Phi(k, i) = \Phi(k, j)\,\Phi(j, i), \quad i \le j \le k \tag{29}$$

If $A(k)$ is invertible for every $k$, then (29) holds without restriction on the indices $i$, $j$, $k$.

Invertibility of the transition matrix for an invertible $A(k)$ is a matter of definition
in (20). For emphasis we state a formal property.

20.14 Property If the $n \times n$ matrix sequence $A(k)$ is invertible for every $k$, then the
transition matrix $\Phi(k, j)$ is invertible for every $k$ and $j$, and

$$\Phi^{-1}(k, j) = \Phi(j, k) \tag{30}$$

Note that failure of $A(k)$ to be invertible at even a single value of $k$ has
widespread consequences. If $A(k_0)$ is not invertible, then $\Phi(k, j)$ is not invertible for
$j \le k_0 \le k-1$, since $A(k_0)$ then appears as a factor in the product (19).

State variable changes are of interest for discrete-time linear state equations, and
the appropriate vehicle is an $n \times n$ matrix sequence $P(k)$ that is invertible at each $k$.
Beginning with (1) and letting

$$z(k) = P^{-1}(k)\,x(k)$$

we easily substitute for $x(k)$ and $x(k+1)$ in (1) to arrive at the corresponding linear
state equation in terms of the state variable $z(k)$:

$$
\begin{aligned}
z(k+1) &= P^{-1}(k+1)A(k)P(k)\,z(k) + P^{-1}(k+1)B(k)\,u(k), \quad z(k_0) = P^{-1}(k_0)\,x_0 \\
y(k) &= C(k)P(k)\,z(k) + D(k)\,u(k)
\end{aligned}
\tag{31}
$$

One consequence of this calculation is a relation between two discrete-time transition
matrices, easily proved from the definitions.

20.15 Property Suppose $P(k)$ is an $n \times n$ matrix sequence that is invertible at each $k$.
If the transition matrix for the $n \times n$ matrix sequence $A(k)$ is $\Phi_A(k, j)$, $k \ge j$, then the
transition matrix for

$$F(k) = P^{-1}(k+1)A(k)P(k)$$

is

$$\Phi_F(k, j) = P^{-1}(k)\,\Phi_A(k, j)\,P(j), \quad k \ge j \tag{32}$$
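Property 20.15 can be spot-checked numerically, since the inner factors $P(i)P^{-1}(i)$ in the product of the $F(i)$ cancel in pairs. The sketch below assumes NumPy; the sequences `A` and `P` are hypothetical examples, with `P` chosen to be invertible at every index.

```python
import numpy as np

def transition(M, k, j, n):
    """Phi_M(k, j) = M(k-1) ... M(j) for k >= j, with the identity for k == j."""
    Phi = np.eye(n)
    for i in range(j, k):
        Phi = M(i) @ Phi
    return Phi

A = lambda k: np.array([[0.0, 1.0], [-0.5, float(k)]])
P = lambda k: np.array([[1.0, float(k)], [0.0, 1.0]])    # determinant 1 for every k
F = lambda k: np.linalg.inv(P(k + 1)) @ A(k) @ P(k)

lhs = transition(F, 5, 2, 2)                              # Phi_F(5, 2)
rhs = np.linalg.inv(P(5)) @ transition(A, 5, 2, 2) @ P(2)
```

The two arrays `lhs` and `rhs` agree to rounding error, as (32) predicts; note that `A` itself need not be invertible for the forward-index version of the property.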

Additional Examples
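We examine three additional examples to further illustrate features of the formulation
and solution of discrete-time linear state equations.

20.16 Example Often it is convenient to recast even a linear state equation in terms of
deviations from a nominal solution, particularly a constant, nonzero nominal solution.
Consider again the economic model in Example 20.1, and imagine (if you can) constant
government expenditures, $g(k) = \tilde{g}$. A corresponding constant nominal solution can be
computed from

$$\tilde{x} = \begin{bmatrix} \alpha & \alpha \\ \beta(\alpha-1) & \beta\alpha \end{bmatrix} \tilde{x} + \begin{bmatrix} \alpha \\ \beta\alpha \end{bmatrix} \tilde{g}$$

as

$$\tilde{x} = \begin{bmatrix} 1-\alpha & -\alpha \\ -\beta(\alpha-1) & 1-\beta\alpha \end{bmatrix}^{-1} \begin{bmatrix} \alpha \\ \beta\alpha \end{bmatrix} \tilde{g} = \begin{bmatrix} \dfrac{\alpha}{1-\alpha} \\ 0 \end{bmatrix} \tilde{g} \tag{33}$$

Then the constant nominal output is

$$\tilde{y} = \begin{bmatrix} 1 & 1 \end{bmatrix} \tilde{x} + \tilde{g} = \frac{1}{1-\alpha}\,\tilde{g}$$

We can rewrite the state equation in terms of deviations from this nominal solution, with
deviation variables defined by

$$x_\delta(k) = x(k) - \tilde{x}, \quad g_\delta(k) = g(k) - \tilde{g}, \quad y_\delta(k) = y(k) - \tilde{y}$$

Straightforward substitution into the original state equation (3) gives

$$
\begin{aligned}
x_\delta(k+1) &= \begin{bmatrix} \alpha & \alpha \\ \beta(\alpha-1) & \beta\alpha \end{bmatrix} x_\delta(k) + \begin{bmatrix} \alpha \\ \beta\alpha \end{bmatrix} g_\delta(k) \\
y_\delta(k) &= \begin{bmatrix} 1 & 1 \end{bmatrix} x_\delta(k) + g_\delta(k)
\end{aligned}
\tag{34}
$$

The coefficient matrices are unchanged, and no approximation has occurred in deriving
this representation. An important advantage of (34) is that the nonnegativity constraint
on entries of the various original signals is relaxed for the deviation signals, within the
ranges of deviation signals permitted by the nominal values.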

20.17 Example Another class of continuous-time systems that generates discrete-time
linear state equations involves switches that are closed periodically for a duration that is
a specified fraction of each period. For the electrical circuit shown in Figure 20.18,
suppose $u(k)$ is the fraction of the $k^{th}$ period during which the switch S is closed,
$0 \le u(k) < 1$. Let $T$ denote the constant period, and suppose also that the driving
voltage $v_s$, the resistance $r$, and the inductance $l$ are constants.

20.18 Figure A switched electrical circuit.

Elementary circuit laws give the scalar linear state equation describing the current $x(t)$
as

$$\dot{x}(t) = -\frac{r}{l}\,x(t) + \frac{1}{l}\,v(t)$$

The solution formula for continuous-time linear state equations yields

$$x(t) = e^{-r(t - t_0)/l}\,x(t_0) + \frac{1}{l}\int_{t_0}^{t} e^{-r(t - \tau)/l}\,v(\tau)\,d\tau \tag{35}$$

In any interval $kT \le t < (k+1)T$, the voltage $v(t)$ has the form

$$v(t) = \begin{cases} v_s, & kT \le t < kT + u(k)T \\ 0, & kT + u(k)T \le t < (k+1)T \end{cases}$$

Therefore evaluating (35) for $t = (k+1)T$, $t_0 = kT$ yields

$$x[(k+1)T] = e^{-rT/l}\,x(kT) + \frac{1}{l}\int_{kT}^{kT + u(k)T} e^{-r[(k+1)T - \tau]/l}\,v_s\,d\tau \tag{36}$$

and computing the integral gives

$$\frac{1}{l}\int_{kT}^{kT + u(k)T} e^{-r[(k+1)T - \tau]/l}\,v_s\,d\tau = \frac{v_s}{r}\,e^{-rT/l}\left[\,e^{rTu(k)/l} - 1\,\right]$$

If we assume that $rT/l$ is very small, then

$$e^{rTu(k)/l} \approx 1 + \frac{rT}{l}\,u(k)$$

In this way we arrive at an approximate representation in the form of a discrete-time
linear state equation,

$$x[(k+1)T] = e^{-rT/l}\,x(kT) + \frac{v_s T}{l}\,e^{-rT/l}\,u(k)$$

This is an example of pulse-width modulation; a more general formulation is suggested
in Exercise 20.1.

20.19 Example

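To compute the transition matrix for

$$A(k) = \begin{bmatrix} 1 & a(k) \\ 0 & 1 \end{bmatrix} \tag{37}$$

a mildly clever way to proceed is to write

$$A(k) = I + F(k)$$

where $I$ is the $2 \times 2$ identity matrix, and

$$F(k) = \begin{bmatrix} 0 & a(k) \\ 0 & 0 \end{bmatrix}$$

Since $F(k)F(j) = 0$ regardless of the values of $k$, $j$, the product computation

$$\Phi(k, j) = [\,I + F(k-1)\,][\,I + F(k-2)\,] \cdots [\,I + F(j)\,]$$

becomes the summation

$$\Phi(k, j) = I + F(k-1) + F(k-2) + \cdots + F(j)$$

That is,

$$\Phi(k, j) = \begin{bmatrix} 1 & \displaystyle\sum_{i=j}^{k-1} a(i) \\ 0 & 1 \end{bmatrix}, \quad k \ge j+1 \tag{38}$$

In this example $A(k)$ is invertible for every $k$, and (20) gives

$$\Phi(k, j) = \begin{bmatrix} 1 & -\displaystyle\sum_{i=k}^{j-1} a(i) \\ 0 & 1 \end{bmatrix}, \quad k \le j-1$$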
EXERCISES
Exercise 20.1 Suppose the scalar input signal to the continuous-time, time-invariant linear state
equation

$$\dot{z}(t) = Fz(t) + Gv(t)$$

is specified by a scalar sequence $u(k)$, where $0 \le |u(k)| \le 1$, $k = 0, 1, \ldots$, as follows. For a
fixed $T > 0$ and $k \ge 0$, let

$$v(t) = \mathrm{sgn}[u(k)] = \begin{cases} 1, & u(k) > 0 \\ 0, & u(k) = 0 \\ -1, & u(k) < 0 \end{cases}, \quad kT \le t \le kT + |u(k)|T$$

and

$$v(t) = 0, \quad kT + |u(k)|T < t < (k+1)T$$

For $u(k) = k/5$, $k = 0, \ldots, 5$, sketch $v(t)$ to see why this is called pulse-width modulation.
Formulate a discrete-time state equation that describes the sequence $z(kT)$. For small $|u(k)|$,
show that an approximate linear discrete-time state equation description is

z{(k+l)fl

(Properties of the continuous-time state equation solution are required for this exercise.)

Exercise 20.2

Consider a single-input, single-output, time-invariant, discrete-time, nonlinear

state equation
qI

x(k+l)

=
j=O
qI

y(k) =

j=I
q

+
j=o

j=I

where q is a fixed, positive integer. Under an appropriate assumption show that corresponding to

all but a finite number of constant nominal inputs u (k) = there exist corresponding constant
nominal trajectories
and constant nominal outputs
Derive a general expression for the
linearized state equation for such a nominal solution.

Exercises

401

Exercise 20.3 Linearize the nonlinear state equation

.v1(k+l)

.v,(k+l)

0.5.v1(k)+u(k)
x,(k).v1(k)u(k)+2u2(k)

v(k)= 0.5x,(k)
about constant nominal solutions corresponding to the constant nominal input $u(k) = \tilde{u}$. Explain
any unusual features.

Exercise 20.4 Linearize the nonlinear state equation

.v1(k+l)

x,(k+l)

.r1(k)2u(k)
.v,(k)+2u2(k)

(k) = .v,(k) + 2.v1(k)u(k)


about constant nominal solutions corresponding to the constant nominal input $u(k) = \tilde{u}$. What is
the zero-state response of the linearized state equation to an arbitrary input signal $u_\delta(k)$?

Exercise 20.5 Consider a linear state equation with specified forcing function,

$$x(k+1) = A(k)x(k) + f(k)$$

and specified two-point boundary conditions

$$H_0\,x(k_0) + H_f\,x(k_f) = h$$

on $x(k)$. Here $H_0$ and $H_f$ are $n \times n$ matrices, $h$ is an $n \times 1$ vector, and $k_f > k_0$. Derive a necessary
and sufficient condition for existence of a unique solution that satisfies the boundary conditions.
Exercise 20.6 For the $n \times n$ matrix difference equation

$$ X(k+1) = X(k)A(k) , \quad X(k_0) = X_0 $$

express the unique solution for $k \ge k_0$ in terms of an appropriate transition matrix related to
$\Phi_A(k, j)$. Use this to determine a complete solution formula for the $n \times n$ matrix difference
equation

$$ X(k+1) = A_1(k)X(k)A_2(k) + F(k) , \quad X(k_0) = X_0 $$

where $A_1(k)$, $A_2(k)$, and the forcing function $F(k)$ are $n \times n$ matrix sequences. (The reader versed
in continuous time might like to try the matrix equation

$$ X(k+1) = A_1(k)X(k) + X(k)A_2(k) + F(k) , \quad X(k_0) = X_0 $$

just to see what happens.)
Exercise 20.7 For the linear state equation (34) describing the national economy in Example
20.16, suppose $\alpha = 1/2$ and $\beta = 1$. Compute a general form for the state transition matrix.
Exercise 20.8 Compute the transition matrix $\Phi(k, j)$ for

0 kO
A(k)=

Ok

000

Exercise 20.9 Compute the transition matrix $\Phi(k, j)$ for
1/2]

where $\alpha$ is a real number.

Exercise 20.10 Compute an expression for the transition matrix $\Phi(k, j)$ for

A(k)=
Exercise 20.11 If $\Phi(k, j)$ is the transition matrix for $A(k)$, what is the transition matrix for
$F(k) = A^T(-k)$?

Exercise 20.12 Suppose $A(k)$ has the partitioned form

$$ A(k) = \begin{bmatrix} A_{11}(k) & A_{12}(k) \\ 0 & A_{22}(k) \end{bmatrix} $$

where $A_{11}(k)$ and $A_{22}(k)$ are square (with fixed dimension, of course). Compute an expression for
the transition matrix $\Phi(k, j)$ in terms of the transition matrices for $A_{11}(k)$ and $A_{22}(k)$.

Exercise 20.13 Suppose $A(k)$ is invertible for all $k$. If $x(k)$ is the solution of

$$ x(k+1) = A(k)x(k) , \quad x(k_0) = x_0 $$

and $z(k)$ is the solution of the adjoint state equation

$$ z(k-1) = A^T(k-1)\,z(k) , \quad z(k_0) = z_0 $$

derive a formula for $z^T(k)x(k)$.
Exercise 20.14 Show that the transition matrix for the n x n matrix sequence A (k) satisfies
AI

IIcb(k, k0)II
k0

1k1(k, 1)11 Ikb(j,

kk
I

k0 +

1.

Exercise 20.15 For $n \times n$ matrix sequences $A(k)$ and $F(k)$, show that

$$ \Phi_F(k, k_0) - \Phi_A(k, k_0) = \sum_{j=k_0}^{k-1} \Phi_A(k, j+1)\,[F(j) - A(j)]\,\Phi_F(j, k_0) , \quad k \ge k_0 + 1 $$

Exercise 20.16 Given an $n \times n$ matrix sequence $A(k)$ and a constant $n \times n$ matrix $F$, show (under
appropriate hypotheses) how to define a state variable change that transforms the linear state
equation

$$ x(k+1) = A(k)x(k) $$

into

$$ z(k+1) = Fz(k) $$

What is the variable change if $F = I$? Illustrate this last result by computing $P^{-1}(k+1)A(k)P(k)$
for Example 20.19.


Exercise 20.17 Suppose the $n \times n$ matrix sequence $A(k)$ is invertible at each $k$. Show that the
transition matrix for $A(k)$ can be written in terms of constant $n \times n$ matrices as

j)
if and only if there exists an invertible matrix A satisfying
A(k+1)A1 =A1A(k)
for all k.

NOTES
Note 20.1 Discrete-time and continuous-time linear system theories occupy parallel universes,
with just enough differences to make comparisons interesting. Historically the theory of difference
equations did not receive the mathematical attention devoted to differential equations. Somewhat
the same lack of respect was inherited by the system-theory community. This situation has been
changing rapidly in recent years as the technological world becomes ever more digital.
Treatments of difference equations and discrete-time state equations from a mathematical
point of view can be found in the recent books, listed in increasing order of sophistication,

W.G. Kelley, A.C. Peterson, Difference Equations, Academic Press, San Diego, California, 1991

V. Lakshmikantham, D. Trigiante, Theory of Difference Equations, Academic Press, San Diego,
California, 1988

R.P. Agarwal, Difference Equations and Inequalities, Marcel Dekker, New York, 1992

Recent treatments from a system-theoretic perspective include

F.M. Callier and C.A. Desoer, Linear System Theory, Springer-Verlag, New York, 1991

F. Szidarovszky and A.T. Bahill, Linear Systems Theory, CRC Press, Boca Raton, Florida, 1992

Note 20.2 Existence and uniqueness properties of solutions to difference equations of the forms
we discuss, including the discrete-time nonlinear state equations, follow directly from the iterative
nature of the equations. But these properties can fail in more general settings. For example the
second-order, scalar linear difference equation (that does not fit the form in Example 20.6)

$$ k\,y(k+2) - y(k) = 0 , \quad k \ge 0 $$

with initial conditions $y(0) = 1$, $y(1) = 0$ does not have a solution. And for two-point boundary
conditions, as posed in Exercise 20.5, there may not exist a solution.

Note 20.3 While iteration is the key concept in our theoretical solution of discrete-time state
equations, due to roundoff error it can be folly to adopt this approach as a computational tool. A
standard, scalar example is

$$ x(k+1) = k\,x(k) + u(k) , \quad k \ge 1 $$

with input signal $u(k) = 1$ for all $k$, and initial state $x(1) = 1 - e$, where of course
$e = 2.718281\ldots$. The solution can be written as

$$ x(k) = (k-1)! \left( 1 - e + \sum_{j=1}^{k-1} \frac{1}{j!} \right) , \quad k \ge 1 $$

From the formula

$$ e = 1 + \sum_{j=1}^{\infty} \frac{1}{j!} $$

it is clear that $x(k) < 0$ for $k \ge 1$. However solving numerically by iteration using exact arithmetic
but beginning with a decimal truncation of the initial state quickly yields positive solution values.
For example $x(1) = 1 - 2.718$ produces $x(7) > 0$.
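The sign change described in Note 20.3 can be reproduced with a few lines of exact rational arithmetic. The following is a minimal sketch, assuming the input $u(k) = 1$ and the truncated initial state $x(1) = 1 - 2.718$ from the note; the function name `iterate` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction

def iterate(x1, k_final):
    """Iterate x(k+1) = k*x(k) + 1 exactly from x(1) = x1; return {k: x(k)}."""
    values = {1: x1}
    for k in range(1, k_final):
        values[k + 1] = k * values[k] + 1
    return values

# Initial state 1 - e with e truncated to three decimals, held as an exact rational.
truncated = iterate(1 - Fraction(2718, 1000), 8)

# First index at which the exactly iterated solution becomes positive.
first_positive = min(k for k, v in truncated.items() if v > 0)
print(first_positive, truncated[first_positive])
```

Because the arithmetic is exact, the positive value is caused entirely by the truncated initial state, not by accumulated roundoff during the iteration.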

Note 20.4 The plain fact that a discrete-time transition matrix need not be invertible is
responsible for many phenomena that can be troublesome, or at least annoying. We encounter this
regularly in the sequel, and it raises interesting questions of reformulation. A discussion that
begins in an elementary fashion, but quickly becomes highly mathematical, can be found in

M. Fliess, "Reversible linear and nonlinear discrete-time dynamics," IEEE Transactions on
Automatic Control, Vol. 37, No. 8, pp. 1144-1153, 1992

Note 20.5 The direct transmission term $D(k)u(k)$ in the standard linear state equation causes a
dilemma. It should be included on grounds that a theory of linear systems ought to encompass the
identity system where $D(k)$ is unity, $C(k)$ is zero, and $A(k)$ and $B(k)$ are anything, or nothing.
Also it should be included because physical systems with nonzero $D(k)$ do arise. In many topics,
for example stability and realization, the direct transmission term is a side issue in the theoretical
development and causes no problem. But in other topics, for example feedback and the
polynomial fraction description, a direct transmission complicates the situation. The decision in
this book is to simplify matters by frequently invoking a zero-$D(k)$ assumption.

Note 20.6 Some situations might lead naturally to discrete-time linear state equations in the
more general form

$$ x(k+1) = \sum_{j=0}^{q} A_j(k)\,x(k-j) + \sum_{j=0}^{r} B_j(k)\,u(k-j) $$
$$ y(k) = \sum_{j=0}^{q} C_j(k)\,x(k-j) + \sum_{j=0}^{r} D_j(k)\,u(k-j) $$

Properties of such state equations in the time-invariant case, including relations to the $q = r = 0$
situation we consider, are discussed in

J. Fadavi-Ardekani, S.K. Mitra, B.D.O. Anderson, "Extended state-space models of discrete-time
dynamical systems," IEEE Transactions on Circuits and Systems, Vol. 29, No. 8, pp. 547-556,
1982

Another form is the descriptor or singular linear state equation where $x(k+1)$ in (1) is multiplied
by a not-always-invertible $n \times n$ matrix $E(k+1)$. An early reference is

D.G. Luenberger, "Dynamic equations in descriptor form," IEEE Transactions on Automatic
Control, Vol. 22, No. 3, pp. 312-321, 1977

See also Chapter 8 of the book

L. Dai, Singular Control Systems, Lecture Notes in Control and Information Sciences, Vol. 118,
Springer-Verlag, Berlin, 1989

Finally there is the behavioral approach wherein exogenous signals are not divided into 'inputs'
and 'outputs.' In addition to the references in Note 2.4, a recent, advanced mathematical
treatment is given in

M. Kuijper, First-Order Representations of Linear Systems, Birkhauser, Boston, 1994

Note 20.7 In a number of applications, population models for example, linear state equations
arise where all entries of the coefficient matrices must be nonnegative, and the input, output, and
state sequences must have nonnegative entries. Such positive linear systems are introduced in
D.G. Luenberger, Introduction to Dynamic Systems, John Wiley, New York, 1979

Indeed nonnegativity requirements are ignored in some of our examples.


Note 20.8 There are many approaches to discrete-time representation of a continuous-time state
equation with digitally specified input signal. Some involve more sophisticated digital-to-analog
conversion than the zero-order hold in Example 20.3. For instance a first-order hold performs
straight-line interpolation of the values of the input sequence. Other approaches for
time-invariant systems rely on specifying the transfer function for the discrete-time state equation
(discussed in Chapter 21) more-or-less directly from the transfer function of the continuous-time
state equation. These issues are treated in several basic texts on digital control systems, for
example

K.J. Astrom, B. Wittenmark, Computer Controlled Systems, Second Edition, Prentice Hall,
Englewood Cliffs, New Jersey, 1990

C.L. Phillips, H.T. Nagle, Digital Control System Analysis and Design, Second Edition, Prentice
Hall, Englewood Cliffs, New Jersey, 1990

A more-advanced look at a variety of methods can be found in

Z. Kowalczuk, "On discretization of continuous-time state-space models: A stable-normal
approach," IEEE Transactions on Circuits and Systems, Vol. 38, No. 12, pp. 1460-1477, 1991

The reverse problem, which in the time-invariant case necessarily focuses on properties of the
logarithm of a matrix, also can be studied:

E.I. Verriest, "The continuization of a discrete process and applications in interpolation and
multi-rate control," Mathematics and Computers in Simulation, Vol. 35, pp. 15-31, 1993

21
DISCRETE TIME
TWO IMPORTANT CASES

Two special cases of the general time-varying linear state equation are examined in
further detail in this chapter. First is the time-invariant case, where all coefficient
matrices are constant, and second is the case where the coefficients are periodic matrix
sequences. Special properties of the transition matrix and complete solution formulas are
developed for both situations, and implications are drawn for response characteristics.

Time-Invariant Case
If all coefficient matrices are constant, then standard notation for the discrete-time linear
state equation is

$$ x(k+1) = Ax(k) + Bu(k) $$
$$ y(k) = Cx(k) + Du(k) \qquad (1) $$

Of course we retain the $n \times 1$ state, $m \times 1$ input, and $p \times 1$ output dimensions.
The transition matrix for the matrix $A$ follows directly from the general formula
in the time-varying case as

$$ \Phi(k, j) = A^{k-j} , \quad k \ge j $$

If $A$ is invertible, then this definition extends to $k < j$ without writing a separate
formula. Typically there is no economy in using the transition-matrix notation when $A$
is constant, and we conveniently write formulas in terms of $A^k = \Phi(k, 0)$, leaving
understood the default index range $k \ge 0$.


Continuing to specialize discussions in Chapter 20, the complete solution of (1)
with specified initial state $x(k_0) = x_0$ and specified input $u(k)$ becomes

$$ y(k) = \begin{cases} C x_0 + D u(k_0) , & k = k_0 \\ C A^{k-k_0} x_0 + \sum_{j=k_0}^{k-1} C A^{k-j-1} B u(j) + D u(k) , & k \ge k_0 + 1 \end{cases} $$

(Often the $k = k_0$ case is not separately displayed, though it doesn't quite fit the general
summation expression.) From this formula, with a bit of manipulation, we can uncover a
key feature of time-invariant linear state equations.
Another formula for the response is obtained by replacing $k$ by $q = k - k_0$, and
then changing the summation index from $j$ to $i = j - k_0$,

$$ y(q + k_0) = C A^{q} x_0 + \sum_{i=0}^{q-1} C A^{q-i-1} B u(i + k_0) + D u(q + k_0) , \quad q \ge 1 $$

This describes the evolution of the response to $x(k_0) = x_0$ and an input signal $u(k)$ that
we can assume is zero for $k < k_0$. Brief reflection shows that if the initial time $k_0$ is
changed, but $x_0$ remains the same, and if the input signal is shifted to begin at the new
initial time, the output signal is similarly shifted, but otherwise unchanged. Therefore
we set $k_0 = 0$ without loss of generality for time-invariant linear state equations, and
usually work with the complete response formula

$$ y(k) = C A^{k} x_0 + \sum_{j=0}^{k-1} C A^{k-j-1} B u(j) + D u(k) , \quad k \ge 1 \qquad (3) $$


If the matrix $A$ is invertible, similar observations can be made for the backward
solution, and it is easy to generate the complete solution formula

$$ y(k) = C A^{k} x_0 - \sum_{j=k}^{-1} C A^{k-j-1} B u(j) + D u(k) , \quad k < 0 $$

Again we do not consider solutions for $k < 0$ unless special mention is made.
All these equations and observations apply to the solution formula for the state
vector $x(k)$ by the simple device of considering $p = n$, $C = I_n$, and $D = 0$. In this
setting it is clear from (3) that the zero-input response to $x_0 = e_i$, the $i^{th}$ column of $I_n$, is
$x(k) =$ the $i^{th}$ column of $A^k$, $k \ge 0$. In particular the matrix $A$, and thus the
transition matrix, is completely determined by the zero-input response values $x(1)$ for
the initial states $e_1, \ldots, e_n$, or in fact for any $n$ linearly independent initial states.
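As a numerical check of the complete response formula with $k_0 = 0$, the sketch below compares direct iteration of the state equation against the closed-form sum. The matrices, initial state, and input samples are hypothetical values chosen only for illustration, and the helper names (`matmul`, `matpow`, `y_by_iteration`, `y_by_formula`) are not from the text.

```python
# Hypothetical example data; any constant A, B, C, D of compatible sizes would do.
A = [[0, 1], [-2, -3]]
B = [[0], [1]]
C = [[1, 0]]
D = [[2]]
x0 = [[1], [-1]]
u = [1, 0, 2, -1, 3]  # scalar input samples u(0), ..., u(4)

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def matpow(M, k):
    """M**k for k >= 0 by repeated multiplication."""
    n = len(M)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, M)
    return result

def scale(M, s):
    return [[s * entry for entry in row] for row in M]

def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def y_by_iteration(k):
    """Iterate x(j+1) = A x(j) + B u(j), then read out y(k) = C x(k) + D u(k)."""
    x = x0
    for j in range(k):
        x = add(matmul(A, x), scale(B, u[j]))
    return add(matmul(C, x), scale(D, u[k]))[0][0]

def y_by_formula(k):
    """y(k) = C A^k x0 + sum_{j=0}^{k-1} C A^(k-j-1) B u(j) + D u(k)."""
    total = matmul(C, matmul(matpow(A, k), x0))[0][0]
    for j in range(k):
        total += matmul(matmul(C, matpow(A, k - j - 1)), B)[0][0] * u[j]
    return total + D[0][0] * u[k]

print([(y_by_iteration(k), y_by_formula(k)) for k in range(len(u))])
```

With an empty sum at $k = 0$ the same function also covers the separately displayed initial-time case.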
To discuss properties of the zero-state response of (1), it is convenient to simplify
notation. By defining the $p \times m$ matrix sequence

$$ G(k) = \begin{cases} D , & k = 0 \\ C A^{k-1} B , & k \ge 1 \end{cases} \qquad (4) $$

we can write the (forward) solution (3) as

$$ y(k) = C A^{k} x_0 + \sum_{j=0}^{k} G(k-j)\,u(j) , \quad k \ge 0 \qquad (5) $$

In this form it is useful to interpret $G(k)$ as follows, considering first the scalar-input
case. Recall the scalar unit pulse signal defined by

$$ \delta(k) = \begin{cases} 1 , & k = 0 \\ 0 , & \text{otherwise} \end{cases} \qquad (6) $$

Simple substitution into (5) shows that the zero-state response of (1) to a scalar
unit-pulse input is $y(k) = G(k)$, $k \ge 0$. If $m \ge 2$, then the input signal $u(k) = \delta(k)e_i$, where
now $e_i$ is the $i^{th}$ column of $I_m$, generates the $i^{th}$ column of $G(k)$ as the zero-state
response. Thus $G(k)$ is called, somewhat unnaturally in the multi-input case, the
unit-pulse response. From (5) we then describe the zero-state response of a time-invariant,
discrete-time linear state equation as a convolution of the input signal and the unit-pulse
response. Implicit is the important assertion that the zero-state response of (1) to any
input signal is completely determined by the zero-state responses to a very simple class
of input signals (a single unit pulse, the lonely 1 at 0, if $m = 1$).
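The convolution description can be illustrated numerically. The following is a minimal sketch for a single-input, single-output example with hypothetical coefficients; it checks that the simulated zero-state response to a unit pulse reproduces $G(k)$, and that convolving $G(k)$ with an arbitrary input matches direct simulation. The names `zero_state_response` and `G` are illustrative helpers, not notation fixed by the text beyond $G(k)$ itself.

```python
# Hypothetical single-input, single-output example.
A = [[0, 1], [-2, -3]]
b = [0, 1]
c = [1, 0]
d = 2

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def zero_state_response(u):
    """Simulate x(k+1) = A x(k) + b u(k), y(k) = c x(k) + d u(k) from x(0) = 0."""
    x = [0, 0]
    y = []
    for uk in u:
        y.append(sum(ci * xi for ci, xi in zip(c, x)) + d * uk)
        x = [ax + bi * uk for ax, bi in zip(mat_vec(A, x), b)]
    return y

def G(k):
    """Unit-pulse response: d for k = 0, c A^(k-1) b for k >= 1."""
    if k == 0:
        return d
    v = b
    for _ in range(k - 1):
        v = mat_vec(A, v)
    return sum(ci * vi for ci, vi in zip(c, v))

pulse = [1, 0, 0, 0, 0, 0]
u = [3, -1, 4, 1, -5, 9]

pulse_response = zero_state_response(pulse)
convolution = [sum(G(k - j) * u[j] for j in range(k + 1)) for k in range(len(u))]
print(pulse_response, convolution == zero_state_response(u))
```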
Basic properties of the discrete-time transition matrix in the time-invariant case
follow directly from the list of general properties in Chapter 20. These will not be
repeated, except to note the useful, if obvious, fact that $\Phi(k, 0) = A^k$, $k \ge 0$, is the
unique solution of the $n \times n$ matrix difference equation

$$ X(k+1) = AX(k) , \quad X(0) = I \qquad (7) $$

Further results particular to the time-invariant setting are left to the Exercises, while here
we pursue explicit representations for the transition matrix in terms of the eigenvalues of
$A$.

The z-transform, reviewed in Chapter 1, can be used to develop a representation
for $A^k$ as follows. We begin with the fact that $A^k$ is the unique solution of the $n \times n$
matrix difference equation in (7). Applying the z-transform to both sides of (7) yields an
algebraic equation in $X(z) = \mathcal{Z}[X(k)]$ that solves to

$$ X(z) = z(zI - A)^{-1} $$

This implies, by uniqueness properties of the z-transform, and uniqueness of solutions to
(7), that

$$ \mathcal{Z}[A^k] = z(zI - A)^{-1} = z\,\frac{\mathrm{adj}(zI - A)}{\det(zI - A)} \qquad (9) $$

Of course $\det(zI - A)$ is a degree-$n$ polynomial in $z$, so $(zI - A)^{-1}$ exists for all but at
most $n$ values of $z$. Each entry of $\mathrm{adj}(zI - A)$ is a polynomial of degree at most $n - 1$.
Therefore the z-transform of $A^k$ is a matrix of proper rational functions in $z$.


From (9) we use the inverse z-transform to solve for the matrix sequence $A^k$,
$k \ge 0$. First write

$$ \det(zI - A) = (z - \lambda_1)^{\sigma_1} \cdots (z - \lambda_m)^{\sigma_m} $$

where $\lambda_1, \ldots, \lambda_m$ are the distinct eigenvalues of $A$ with corresponding multiplicities
$\sigma_1, \ldots, \sigma_m \ge 1$. Then partial fraction expansion of each entry in $(zI - A)^{-1}$ gives, after
multiplication through by $z$,

$$ z(zI - A)^{-1} = \sum_{l=1}^{m} \sum_{r=1}^{\sigma_l} W_{lr}\,\frac{z}{(z - \lambda_l)^r} \qquad (10) $$

Each $W_{lr}$ is an $n \times n$ matrix of partial fraction expansion coefficients. Specifically each
entry of $W_{lr}$ is the coefficient of $1/(z - \lambda_l)^r$ in the expansion of the corresponding entry
in the matrix $(zI - A)^{-1}$. (The matrix $W_{lr}$ is complex if the corresponding eigenvalue
$\lambda_l$ is complex.) In fact, using a formula for partial fraction expansion coefficients, $W_{lr}$
can be written as

$$ W_{lr} = \frac{1}{(\sigma_l - r)!}\,\frac{d^{\,\sigma_l - r}}{dz^{\,\sigma_l - r}} \left[ (z - \lambda_l)^{\sigma_l} (zI - A)^{-1} \right] \Big|_{z = \lambda_l} $$

The inverse z-transform of (10), from Table 1.10, then provides an explicit form for the
transition matrix $A^k$ in terms of the distinct eigenvalues of $A$:

$$ A^k = \sum_{l=1}^{m} \sum_{r=1}^{\sigma_l} W_{lr} \binom{k}{r-1} \lambda_l^{\,k+1-r} , \quad k \ge 0 \qquad (12) $$

We emphasize the understanding that any summand where $\lambda_l$ has a negative exponent
must be set to zero. In particular for $k = 0$ the only possibly nonzero terms in (12) occur
for $r = 1$, and a binomial-coefficient convention gives

$$ A^0 = I = \sum_{l=1}^{m} W_{l1} $$

Of course if some eigenvalues are complex, conjugate terms on the right side of (12) can
be combined to give a real representation for the real matrix sequence $A^k$.
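For distinct eigenvalues the expansion has only $r = 1$ terms, and the coefficient matrices reduce to residues of $(zI - A)^{-1}$. The sketch below illustrates this special case for a hypothetical $2 \times 2$ matrix with eigenvalues $-1$ and $-2$; the function names and the choice of $A$ are assumptions made for the illustration, and repeated eigenvalues are not handled.

```python
from fractions import Fraction

# Hypothetical example with distinct eigenvalues -1 and -2.
A = [[0, 1], [-2, -3]]
eigenvalues = [-1, -2]

def adjugate_zI_minus_A(z):
    """adj(zI - A) for the 2 x 2 matrix A."""
    (a, b), (c, d) = A
    return [[z - d, b], [c, z - a]]

def residue_matrix(lam, other):
    """Coefficient of 1/(z - lam) in (zI - A)^(-1) when det(zI - A) = (z - lam)(z - other)."""
    return [[Fraction(entry, lam - other) for entry in row]
            for row in adjugate_zI_minus_A(lam)]

W = [residue_matrix(-1, -2), residue_matrix(-2, -1)]

def power_from_expansion(k):
    """A^k as the sum of W_l * lambda_l^k over the two eigenvalues."""
    return [[sum(W[l][i][j] * eigenvalues[l] ** k for l in range(2))
             for j in range(2)] for i in range(2)]

def power_by_multiplication(k):
    result = [[1, 0], [0, 1]]
    for _ in range(k):
        result = [[sum(result[i][m] * A[m][j] for m in range(2))
                   for j in range(2)] for i in range(2)]
    return result

print(W)
print(power_from_expansion(3), power_by_multiplication(3))
```

At $k = 0$ the two coefficient matrices sum to the identity, in agreement with the binomial-coefficient convention noted above.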

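21.1 Example To compute an explicit form for the transition matrix of

$$ A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} $$

a simple calculation gives

$$ z(zI - A)^{-1} = \begin{bmatrix} \dfrac{z^2}{z^2+1} & \dfrac{z}{z^2+1} \\[2mm] \dfrac{-z}{z^2+1} & \dfrac{z^2}{z^2+1} \end{bmatrix} $$

We continue the computation via the partial fraction expansion ($i = \sqrt{-1}$)

$$ \frac{1}{z^2+1} = \frac{1/(2i)}{z - i} - \frac{1/(2i)}{z + i} $$

Multiplying through by $z$, and sometimes replacing $i$ by its polar form $e^{i\pi/2}$,
Table 1.10 gives the inverse z-transform

$$ \mathcal{Z}^{-1}\!\left[ \frac{z}{z^2+1} \right] = \mathcal{Z}^{-1}\!\left[ \frac{z/(2i)}{z - i} \right] - \mathcal{Z}^{-1}\!\left[ \frac{z/(2i)}{z + i} \right] = \frac{1}{2i}\,e^{ik\pi/2} - \frac{1}{2i}\,e^{-ik\pi/2} = \sin k\pi/2 $$

From this result and a shift property of the z-transform,

$$ \mathcal{Z}^{-1}\!\left[ \frac{z^2}{z^2+1} \right] = \sin[(k+1)\pi/2] = \cos k\pi/2 $$

Therefore

$$ A^k = \begin{bmatrix} \cos k\pi/2 & \sin k\pi/2 \\ -\sin k\pi/2 & \cos k\pi/2 \end{bmatrix} , \quad k \ge 0 $$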
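The closed form in Example 21.1 is easy to confirm by brute force. This is a minimal sketch, assuming the matrix $A$ with rows $(0, 1)$ and $(-1, 0)$; since $\cos k\pi/2$ and $\sin k\pi/2$ take only the values $0$ and $\pm 1$ at integer $k$, the floating-point values are rounded to integers before comparison.

```python
import math

A = [[0, 1], [-1, 0]]

def power_by_multiplication(k):
    """A^k for k >= 0 by repeated multiplication."""
    result = [[1, 0], [0, 1]]
    for _ in range(k):
        result = [[sum(result[i][m] * A[m][j] for m in range(2))
                   for j in range(2)] for i in range(2)]
    return result

def power_closed_form(k):
    """[[cos(k pi/2), sin(k pi/2)], [-sin(k pi/2), cos(k pi/2)]] with exact integer entries."""
    c = round(math.cos(k * math.pi / 2))
    s = round(math.sin(k * math.pi / 2))
    return [[c, s], [-s, c]]

print([power_closed_form(k) == power_by_multiplication(k) for k in range(8)])
```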

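21.2 Example The Jordan form discussed in Example 5.10 also can be used to describe
$A^k$ in explicit terms. With $J = P^{-1}AP$ it is easy to see that

$$ A^k = P J^k P^{-1} , \quad k \ge 0 $$

Here $J$ is block diagonal with $r^{th}$ diagonal block in the form

$$ J_r = \begin{bmatrix} \lambda & 1 & \cdots & 0 \\ & \ddots & \ddots & \vdots \\ & & \lambda & 1 \\ 0 & & & \lambda \end{bmatrix} $$

where $\lambda$ is an eigenvalue of $A$. Clearly $J^k$ also is block diagonal, with $r^{th}$ block $J_r^k$. To
devise a representation for $J_r^k$ we write

$$ J_r = \lambda I + N_r $$

where the only nonzero entries of $N_r$ are 1's above the diagonal. Using the fact that $N_r$
commutes with $\lambda I$, the binomial expansion can be used to obtain

$$ J_r^k = \sum_{j=0}^{k} \binom{k}{j} \lambda^{k-j} N_r^{\,j} , \quad k \ge 0 \qquad (16) $$

Calculating the general form of $N_r^{\,j}$ is not difficult since $N_r$ is nilpotent. For example in
the $3 \times 3$ case $N_r^3 = 0$, and (16) becomes

$$ J_r^k = I\lambda^k + k\lambda^{k-1}N_r + \frac{k(k-1)}{2}\lambda^{k-2}N_r^2 = \begin{bmatrix} \lambda^k & k\lambda^{k-1} & \frac{k(k-1)}{2}\lambda^{k-2} \\ 0 & \lambda^k & k\lambda^{k-1} \\ 0 & 0 & \lambda^k \end{bmatrix} , \quad k \ge 0 $$

It is left understood that a negative exponent renders an entry zero.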
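The binomial form of the Jordan-block power can be checked against repeated multiplication. The following sketch uses a hypothetical $3 \times 3$ block with $\lambda = 2$; entries whose power of $\lambda$ would have a negative exponent are set to zero, following the convention just stated.

```python
from math import comb

lam = 2  # hypothetical eigenvalue
n = 3
J = [[lam, 1, 0], [0, lam, 1], [0, 0, lam]]

def jordan_power_closed_form(k):
    """Entry (i, j) of J^k is C(k, j-i) * lam^(k-(j-i)) for j >= i, zero otherwise."""
    P = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            shift = j - i
            if shift <= k:  # a negative exponent renders the entry zero
                P[i][j] = comb(k, shift) * lam ** (k - shift)
    return P

def jordan_power_by_multiplication(k):
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = [[sum(result[i][m] * J[m][j] for m in range(n))
                   for j in range(n)] for i in range(n)]
    return result

print(jordan_power_closed_form(4))
```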


Any time-invariant linear state equation can be transformed to a state equation
with A in Jordan form by a state variable change, and the resulting explicit nature of the
transition matrix is sometimes useful in exploring properties of linear state equations.
This utility is a bit diminished, however, by the occurrence of complex coefficient
matrices due to complex entries in P when A has complex eigenvalues.

□□□
The z-transform can be applied to the complete solution formula (5) by using the
convolution property and (9). In terms of the notation

$$ Y(z) = \mathcal{Z}[y(k)] , \quad U(z) = \mathcal{Z}[u(k)] , \quad G(z) = \mathcal{Z}[G(k)] $$

we obtain

$$ Y(z) = zC(zI - A)^{-1}x_0 + G(z)U(z) \qquad (17) $$

The linearity and shift properties of the z-transform permit computation of $G(z)$ from
the definition of $G(k)$ in (4) and the z-transform given in (9):

$$ G(z) = \mathcal{Z}[(D, CB, CAB, CA^2B, \ldots)] = C\,\mathcal{Z}[(0, I, A, A^2, \ldots)]\,B + \mathcal{Z}[(D, 0, 0, 0, \ldots)] = C(zI - A)^{-1}B + D $$

This calculation shows that $G(z)$ is a $p \times m$ matrix of proper rational functions (strictly
proper if $D = 0$). Therefore (17) implies that if $U(z)$ is proper rational, then $Y(z)$ is
proper rational. Thus (17) offers a method for computing $y(k)$ that is convenient for
obtaining general expressions in simple examples.
Under the assumption that $x_0 = 0$, the relation between $Y(z)$ and $U(z)$ in (17) is
simply

$$ Y(z) = G(z)U(z) = [C(zI - A)^{-1}B + D]\,U(z) \qquad (18) $$

and $G(z)$ is called the transfer function of the state equation. In the scalar-input case we
note that $\mathcal{Z}[\delta(k)] = 1$, and thus confirm that the transfer function is the z-transform of
the zero-state response of a time-invariant linear state equation to a unit pulse. Also in
the multi-input case it is often said, again somewhat confusingly, that the transfer
function is the z-transform of the unit-pulse response.

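21.3 Example For a time-invariant, two-dimensional linear state equation of the form,
similar to Example 20.4,

$$ x(k+1) = \begin{bmatrix} 0 & 1 \\ -a_0 & -a_1 \end{bmatrix} x(k) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(k) $$
$$ y(k) = \begin{bmatrix} c_0 & c_1 \end{bmatrix} x(k) + d\,u(k) $$

the transfer function calculation becomes

$$ G(z) = \begin{bmatrix} c_0 & c_1 \end{bmatrix} \begin{bmatrix} z & -1 \\ a_0 & z + a_1 \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ 1 \end{bmatrix} + d $$

Since

$$ \begin{bmatrix} z & -1 \\ a_0 & z + a_1 \end{bmatrix}^{-1} = \frac{1}{z^2 + a_1 z + a_0} \begin{bmatrix} z + a_1 & 1 \\ -a_0 & z \end{bmatrix} $$

we obtain

$$ G(z) = \frac{c_1 z + c_0}{z^2 + a_1 z + a_0} + d = \frac{d z^2 + (c_1 + a_1 d) z + c_0 + a_0 d}{z^2 + a_1 z + a_0} $$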
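The transfer function in Example 21.3 can be spot-checked by evaluating $C(zI - A)^{-1}B + D$ at a few rational values of $z$ and comparing with the closed-form ratio of polynomials. The coefficient values below are hypothetical, chosen so that the denominator has no real roots, and the function names are illustrative.

```python
from fractions import Fraction

# Hypothetical coefficients for the two-dimensional state equation of Example 21.3.
a0, a1, c0, c1, d = 3, -2, 5, 7, 4

def inverse_2x2(M):
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def G_state_space(z):
    """Evaluate C (zI - A)^(-1) B + D with A = [[0, 1], [-a0, -a1]], B = [0, 1]^T."""
    inv = inverse_2x2([[z, -1], [a0, z + a1]])
    # (zI - A)^(-1) B is the second column of the inverse.
    return c0 * inv[0][1] + c1 * inv[1][1] + d

def G_closed_form(z):
    """(d z^2 + (c1 + a1 d) z + c0 + a0 d) / (z^2 + a1 z + a0)."""
    return (d * z**2 + (c1 + a1 * d) * z + c0 + a0 * d) / (z**2 + a1 * z + a0)

test_points = [Fraction(1, 2), Fraction(3), Fraction(-5, 3)]
print([G_state_space(z) == G_closed_form(z) for z in test_points])
```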

Periodic Case
The second special case we consider involves linear state equations with coefficients that
are repetitive matrix sequences. A matrix sequence $F(k)$ is called K-periodic if $K$ is a
positive integer such that for all $k$,

$$ F(k+K) = F(k) \qquad (19) $$

It is convenient to call the least such integer $K$ the period of $F(k)$. Of course if $K = 1$,
then $F(k)$ is constant. This terminology applies also to discrete-time signals (vector or
scalar sequences).

Obviously a linear state equation with periodic coefficients can be expected to
have special properties in regard to solution characteristics. First we obtain a useful
representation for $\Phi(k, j)$ under an invertibility hypothesis on the K-periodic $A(k)$.
(This property is a discrete-time version of the Floquet decomposition in Property 5.11.)

21.4 Property Suppose the $n \times n$ matrix sequence $A(k)$ is invertible for every $k$ and
K-periodic. Then the transition matrix for $A(k)$ can be written in the form

$$ \Phi(k, j) = P(k)\,R^{k-j}\,P^{-1}(j) \qquad (20) $$

for all $k$, $j$, where $R$ is a constant (possibly complex), invertible, $n \times n$ matrix, and
$P(k)$ is a K-periodic, $n \times n$ matrix sequence that is invertible for every $k$.

Proof Define an $n \times n$ matrix $R$ by setting

$$ R^K = \Phi(K, 0) \qquad (21) $$

(This is a nontrivial step. It involves existence of a necessarily invertible, though not
unique, $K^{th}$-root of the real, invertible matrix $\Phi(K, 0)$, and a complex $R$ can result. See
Exercises 21.11 and 21.12 for further development, and Note 21.1 for additional
information.) Also define $P(k)$ via

$$ P(k) = \Phi(k, 0)\,R^{-k} \qquad (22) $$

Obviously $P(k)$ is invertible for every $k$. Using the composition property, here valid
for all arguments because of the invertibility assumption on $A(k)$, gives

$$ P(k+K) = \Phi(k+K, 0)\,R^{-(k+K)} = \Phi(k+K, K)\,\Phi(K, 0)\,R^{-K}R^{-k} = \Phi(k+K, K)\,R^{-k} $$

It is straightforward to show, from the definition of the transition matrix and the
periodicity property of $A(k)$, that $\Phi(k+K, K) = \Phi(k, 0)$ for all $k$. Thus we obtain
$P(k+K) = P(k)$ for all $k$.
Finally we use Property 20.14 and (22) to write

$$ \Phi(0, j) = \Phi^{-1}(j, 0) = R^{-j}P^{-1}(j) $$

and then invoke the composition property once more to conclude (20).

21.5 Example

For the 2-periodic matrix sequence

A(k)=
we set

0]

$R^2 = \Phi(2, 0) =$

[_1
?]

which gives

In this case the 2-periodic matrix sequence P (k) is specified by

Confirmation of Property 21.4 is left as an easy calculation.

□□□
This representation for the transition matrix can be used to show that the growth
properties of the zero-input solution of a linear state equation, when $A(k)$ is invertible
for every $k$ and K-periodic, are determined by the eigenvalues of $R^K$. Given any $k_0$
and $x(k_0) = x_0$, we use the composition property and (20) to write the solution at time
$k + jK$, where $k \ge k_0$ and $j > 0$, as

$$ \begin{aligned} x(k+jK) &= \Phi(k+jK, k_0)\,x_0 \\ &= \Phi(k+jK, k+(j-1)K)\,\Phi(k+(j-1)K, k+(j-2)K) \cdots \Phi(k+K, k)\,\Phi(k, k_0)\,x_0 \\ &= P(k+jK)R^K P^{-1}(k+(j-1)K)\,P(k+(j-1)K)R^K P^{-1}(k+(j-2)K) \cdots P(k+K)R^K P^{-1}(k)\,x(k) \end{aligned} $$

The K-periodicity of $P(k)$ helps deflate this expression to

$$ x(k+jK) = P(k)\,(R^K)^j\,P^{-1}(k)\,x(k) \qquad (23) $$

Now the argument above Theorem 5.13 translates directly to the present setting. If all
eigenvalues of $R^K$ have magnitude less than unity, then the zero-input solution goes to
zero. If $R^K$ has at least one eigenvalue with magnitude greater than unity, there are
initial states (formed from corresponding eigenvectors) for which the solution grows
without bound.

The case where $R^K$ has at least one unity eigenvalue relates to existence of
K-periodic solutions, a topic we address next. Since the definition of periodicity dictates
that a periodic sequence is defined for all $k$, state-equation solutions both forward and
backward from the initial time must be considered. Also, since an identically-zero
solution of a linear state equation is a K-periodic solution, we must carefully word
matters to include or exclude this case as appropriate.
21.6 Theorem Suppose $A(k)$ is invertible for every $k$ and K-periodic. Given any $k_0$
there exists a nonzero $x_0$ such that the solution of

$$ x(k+1) = A(k)x(k) , \quad x(k_0) = x_0 \qquad (24) $$

is K-periodic if and only if at least one eigenvalue of $R^K = \Phi(K, 0)$ is unity.

Proof Suppose the real matrix $R^K$ has a unity eigenvalue, and let $z_0$ be an
associated eigenvector. Then $z_0$ is real and nonzero, and the vector sequence

$$ z(k) = R^{k-k_0} z_0 $$

is well defined for all $k$ since $R$ is invertible. Also $z(k)$ is K-periodic since, for any $k$,

$$ z(k+K) = R^{k+K-k_0} z_0 = R^{k-k_0} R^K z_0 = R^{k-k_0} z_0 = z(k) $$

As in the proof of Property 21.4, let $P(k) = \Phi(k, 0)R^{-k}$. Then with the initial state
$x_0 = P(k_0)z_0$, Property 21.4 gives that the corresponding solution of (24) (defined for all
$k$) can be written as

$$ x(k) = P(k)\,R^{k-k_0}\,P^{-1}(k_0)\,x_0 = P(k)z(k) \qquad (25) $$

Since both $P(k)$ and $z(k)$ are K-periodic, $x(k)$ is a K-periodic solution of (24).
Now suppose that given any $k_0$ there is an $x_0 \ne 0$ such that the resulting solution
$x(k)$ of (24) is K-periodic. Then equating the identical vector sequences

$$ x(k) = P(k)\,R^{k-k_0}\,P^{-1}(k_0)\,x_0 $$

and

$$ x(k+K) = P(k+K)\,R^{k+K-k_0}\,P^{-1}(k_0)\,x_0 = P(k)\,R^{k-k_0} R^K P^{-1}(k_0)\,x_0 $$

gives

$$ P^{-1}(k_0)\,x_0 = R^K P^{-1}(k_0)\,x_0 $$

This displays the nonzero vector $P^{-1}(k_0)x_0$ as an eigenvector of $R^K$ associated to a
unity eigenvalue of $R^K$.
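Theorem 21.6 can be illustrated with a small 2-periodic example. The sketch below is built on assumptions made only for illustration: the two matrices `A_seq[0]` and `A_seq[1]` are hypothetical, the initial time is taken as $k_0 = 0$ (so that $P(k_0) = I$ and the initial state is the eigenvector itself), and the unity-eigenvalue test is performed as $\det[I - \Phi(2, 0)] = 0$.

```python
from fractions import Fraction as F

# Hypothetical invertible, 2-periodic coefficient sequence: A(k) = A_seq[k mod 2].
A_seq = [
    [[F(0), F(2)], [F(1, 4), F(0)]],
    [[F(0), F(1)], [F(1, 2), F(0)]],
]
K = 2

def A(k):
    return A_seq[k % K]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def solve(x0, steps):
    """Zero-input solution x(0), ..., x(steps) of x(k+1) = A(k) x(k)."""
    xs = [x0]
    for k in range(steps):
        xs.append([sum(a * x for a, x in zip(row, xs[-1])) for row in A(k)])
    return xs

# Phi(K, 0) = A(K-1) ... A(0); a unity eigenvalue exists iff det(I - Phi(K, 0)) = 0.
phi = matmul(A(1), A(0))
det_I_minus_phi = (1 - phi[0][0]) * (1 - phi[1][1]) - phi[0][1] * phi[1][0]

periodic = solve([F(0), F(1)], 8)   # initial state along the unity-eigenvalue eigenvector
decaying = solve([F(1), F(0)], 8)   # initial state along the other eigenvector
print(phi, det_I_minus_phi)
```

In this example the second eigenvalue of $\Phi(2, 0)$ is $1/4$, so solutions started along its eigenvector shrink by that factor every two steps rather than repeating.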


The sufficiency portion of Theorem 21.6 can be restated in terms of $R$ rather than
$R^K$. If $R$ has a unity eigenvalue, with corresponding eigenvector $z_0$, then it is clear
from repeated multiplication of $Rz_0 = z_0$ by $R$ that $R^K$ has a unity eigenvalue, with $z_0$
again a corresponding eigenvector. The reverse claim is simply not true, a fact we can
illustrate when $A(k)$ is constant.

21.7 Example Consider the linear state equation with $A$ given in Example 21.1. This
state equation fails to exhibit K-periodic solutions for $K = 1, 2, 3$ by the criterion in
Theorem 21.6, since $A$, $A^2$, and $A^3$ do not have a unity eigenvalue. However $A^4 = I$,
and it is clear that every initial state yields a 4-periodic solution.
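The claims in Example 21.7 can be confirmed directly. The following sketch assumes the matrix $A$ of Example 21.1 with rows $(0, 1)$ and $(-1, 0)$, and uses the fact that $A^K$ has a unity eigenvalue exactly when $\det(I - A^K) = 0$.

```python
A = [[0, 1], [-1, 0]]
I2 = [[1, 0], [0, 1]]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def matpow(M, k):
    result = I2
    for _ in range(k):
        result = matmul(result, M)
    return result

def det_I_minus(M):
    """det(I - M) for a 2 x 2 matrix M; zero exactly when M has a unity eigenvalue."""
    return (1 - M[0][0]) * (1 - M[1][1]) - M[0][1] * M[1][0]

dets = [det_I_minus(matpow(A, K)) for K in (1, 2, 3)]
A4 = matpow(A, 4)
print(dets, A4)
```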

□□□
We next consider discrete-time linear state equations where all coefficient matrix
sequences are K-periodic, and the input signal is K-periodic as well. In exploring the
existence of K-periodic solutions, the output equation is superfluous, and it is convenient
to collapse the input notation to write

$$ x(k+1) = A(k)x(k) + f(k) , \quad x(k_0) = x_0 \qquad (26) $$

where $f(k)$ is a K-periodic, $n \times 1$ vector signal. The first result is a simple
characterization of K-periodic solutions to (26) that removes the need to explicitly
consider solutions for $k < k_0$.

21.8 Lemma A solution $x(k)$ of the K-periodic state equation (26), where $A(k)$ is
invertible for every $k$, is K-periodic if and only if $x(k_0 + K) = x_0$.

Proof Necessity is entirely obvious. For sufficiency suppose a solution $x(k)$
satisfies the stated condition, and let $z(k) = x(k+K) - x(k)$. Then $z(k)$ satisfies the
linear state equation

$$ z(k+1) = A(k)z(k) , \quad z(k_0) = 0 $$

This has the unique solution $z(k) = 0$, both forward and backward in $k$, and we conclude
that $x(k)$ is K-periodic.

□□□
Using this lemma we characterize existence of K-periodic solutions of (26) for
every K-periodic $f(k)$. (Refinements dealing with a single, specified, K-periodic $f(k)$
are suggested in the Exercises.)

21.9 Theorem Suppose $A(k)$ is invertible for all $k$ and K-periodic. Then for every
$k_0$ and every K-periodic $f(k)$ there exists an $x_0$ such that (26) has a K-periodic
solution if and only if there does not exist a $z_0 \ne 0$ for which

$$ z(k+1) = A(k)z(k) , \quad z(k_0) = z_0 \qquad (27) $$

has a K-periodic solution.

Proof For any $k_0$, $x_0$, and K-periodic $f(k)$, the corresponding (forward) solution of
(26) is

$$ x(k) = \Phi(k, k_0)\,x_0 + \sum_{j=k_0}^{k-1} \Phi(k, j+1)\,f(j) , \quad k \ge k_0 + 1 $$

By Lemma 21.8, $x(k)$ is K-periodic if and only if

$$ [\,I - \Phi(k_0+K, k_0)\,]\,x_0 = \sum_{j=k_0}^{k_0+K-1} \Phi(k_0+K, j+1)\,f(j) \qquad (28) $$

From Property 21.4 we can write

$$ \Phi(k_0+K, k_0) = P(k_0+K)\,R^K P^{-1}(k_0) = P(k_0)\,R^K P^{-1}(k_0) $$

and, similarly,

$$ \Phi(k_0+K, j+1) = P(k_0)\,R^{k_0+K-j-1}\,P^{-1}(j+1) $$

Using these representations (28) becomes

$$ P(k_0)\,[\,I - R^K\,]\,P^{-1}(k_0)\,x_0 = \sum_{j=k_0}^{k_0+K-1} P(k_0)\,R^{k_0+K-j-1}\,P^{-1}(j+1)\,f(j) \qquad (29) $$

Invoking Theorem 21.6 we will show that this algebraic equation has a solution for
every $k_0$ and every K-periodic $f(k)$ if and only if $R^K$ has no unity eigenvalue.
First suppose $R^K$ has no unity eigenvalue, that is,

$$ \det(I - R^K) \ne 0 $$

Then it is immediate that (29) has a solution for $x_0$, as desired.

Now suppose that (29) has a solution for every $k_0$ and every K-periodic $f(k)$.
Given $k_0$, corresponding to any $n \times 1$ vector $f_0$ we can craft a K-periodic $f(k)$ as
follows. Set

$$ f(k) = P(k+1)\,R^{k+1-k_0-K}\,P^{-1}(k_0)\,f_0 , \quad k = k_0, k_0+1, \ldots, k_0+K-1 \qquad (30) $$

and extend this definition to all $k$ by repeating. (That $f(k)$ is real follows from the
representation in Property 21.4.) For such a K-periodic $f(k)$, (29) becomes

$$ P(k_0)\,[\,I - R^K\,]\,P^{-1}(k_0)\,x_0 = \sum_{j=k_0}^{k_0+K-1} f_0 = K f_0 \qquad (31) $$

For every $f(k)$ of the type constructed above, that is, for every $n \times 1$ vector $f_0$, (31) has
a solution for $x_0$ by assumption. Therefore

$$ \det\{ P(k_0)\,[\,I - R^K\,]\,P^{-1}(k_0) \} = \det(I - R^K) \ne 0 $$

and, again, this is equivalent to the statement that no eigenvalue of $R^K$ is unity.

□□□

It is interesting to specialize this general result to a possibly familiar case. Note


that a time-invariant linear state equation is a K-periodic state equation for any positive
integer K, with R = A. Thus for various values of K we can focus on the existence of
K-periodic solutions for K-periodic input signals.

21.10 Corollary For the time-invariant linear state equation

$$ x(k+1) = Ax(k) + Bu(k) , \quad x(0) = x_0 \qquad (32) $$

suppose $A$ is invertible. If $A^K$ has no unity eigenvalue, then for every K-periodic input
signal $u(k)$ there exists an $x_0$ such that the corresponding solution is K-periodic.

It is perhaps most interesting to reflect on Corollary 21.10 when all eigenvalues of


A have magnitude greater than unity. For then it is clear from (12) that the zero-input
response of (32) is unbounded, but evidently canceled by unbounded components of the
zero-state response to the periodic input when x0 is appropriate, leaving a periodic
solution. We further note that this corollary involves only the sufficiency portion of
Theorem 21.9. Interpreting the necessity portion brings in subtleties, a trivial instance of
which is the case B = 0.
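A scalar example makes the cancellation described above concrete. The following is a minimal sketch under assumed values: a hypothetical scalar state equation $x(k+1) = a\,x(k) + b\,u(k)$ with $a = 2$, $b = 1$, and a 2-periodic input. The periodic initial state is obtained by requiring $x(K) = x_0$, that is, $(1 - a^K)x_0 = \sum_{j=0}^{K-1} a^{K-1-j}\,b\,u(j)$.

```python
from fractions import Fraction

# Hypothetical scalar system with |a| > 1, so the zero-input response is unbounded.
a, b = Fraction(2), Fraction(1)
K = 2
u_period = [Fraction(1), Fraction(3)]

def u(k):
    """K-periodic input signal."""
    return u_period[k % K]

def periodic_initial_state():
    """Solve (1 - a^K) x0 = sum_{j=0}^{K-1} a^(K-1-j) b u(j) for x0."""
    forced = sum(a ** (K - 1 - j) * b * u(j) for j in range(K))
    return forced / (1 - a ** K)

def simulate(x0, steps):
    xs = [x0]
    for k in range(steps):
        xs.append(a * xs[-1] + b * u(k))
    return xs

x0 = periodic_initial_state()
periodic_solution = simulate(x0, 10)
other_solution = simulate(Fraction(0), 10)
print(x0, periodic_solution[:4], other_solution[-1])
```

Starting from any other initial state, such as zero, the difference from the periodic solution is a zero-input response and therefore grows without bound in this example.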

EXERCISES
Exercise 21.1

Using two different methods, compute the transition matrix for


1/2 1/2

A=
Exercise 21.2

1/2 1/2

Using two different methods, compute the transition matrix for

lOt
A= 010
001

Exercise 21.3 For the linear state equation

x(k+l)

x(k)

i2

u(k)

llx(k)

y(k)= [1
compute the response when

u(k) = I,
= [

k0

Exercise 21.4 For the continuous-time linear state equation

i(t) =
y(t) = [0

x(t) +

u(t)
1

1 ]_v(t)

the output of a period-T zero-order hold. Compute the corresponding discretetime linear state equation, and compute the transfer functions of both state equations.
suppose a (1) is

Exercise 21.5

for k

Given an ti x

ii

matrix A, show how to define scalar sequences cz0(k)

0 such that

k0
(By consulting Chapter 5, provide a solution more elegant than brute-force iteration using the
Cayley-Hamilton theorem.)

Exercise 21.6

Suppose the n xii matrix A has eigenvalues

X,,.

Define a set of a x a

matrices by

P0 =1, P1 =A Xi!, P, =(AX,!)(AX11)


P,_1

Show how to define scalar sequences 130(k)

(k) fork 0 such that

Exercise 21.7 A savings account is described by the scalar state equation

$$ x(k+1) = (1 + r/l)\,x(k) + b , \quad x(0) = x_0 $$

where $x(k)$ is the account value after $k$ compounding periods, $r > 0$ is the annual interest rate
($100r\%$) compounded $l$ times per year, and $b$ is the constant deposit ($b > 0$) or withdrawal ($b < 0$)
at the end of each compounding period.
(a) Using a simple summation formula, show that the account value is given by

$$ x(k) = (1 + r/l)^k\,(x_0 + bl/r) - bl/r , \quad k \ge 0 $$

(b) The effective interest rate is the percentage increase in the account value in one year, assuming
$b = 0$. Derive a formula for the effective interest rate. For an annual interest rate of 5%, compute
the effective interest rate for the cases $l = 2$ (semi-annual compounding) and $l = 12$ (monthly
compounding).
(c) Having won the 'million dollar lottery,' you have been given a check for $50,000 and will
receive an additional check for this amount each year for the next 19 years. How much money
should the lottery deposit in an account that pays 5% annual interest, compounded annually, to
cover the 19 additional checks?
Exercise 21.8 The Fibonacci sequence is a sequence in which each value is the sum of its two
predecessors: 1, 1, 2, 3, 5, 8, 13, .... Devise a time-invariant linear state equation and initial
state

$$ x(k+1) = Ax(k) , \quad x(0) = x_0 $$
$$ y(k) = cx(k) $$

that provides the Fibonacci sequence as the output signal. Compute an analytical solution of the
state equation to provide a general expression for the $k^{th}$ Fibonacci number. Show that

$$ \lim_{k \to \infty} \frac{y(k+1)}{y(k)} = \frac{1 + \sqrt{5}}{2} $$

This is the golden ratio that the ancient Greeks believed to be the most pleasing value for the ratio
of length to width of a rectangle.

Exercise 21.9 Consider a time-invariant, continuous-time, single-input, single-output linear state equation where the input signal is delayed by $T_d$ seconds, where $T_d$ is a positive constant:

\[ \dot{z}(t) = F z(t) + G v(t - T_d) , \quad z(0) = z_0 \]
\[ y(t) = C z(t) \]

Solving for $z(t)$, $t \ge 0$, given $z_0$ and an input signal $v(t)$, requires knowledge of the input signal values for $-T_d \le t < 0$. (The initial state vector $z_0$ and input signal values for $t \ge 0$ suffice when $T_d = 0$. From this perspective we say that 'infinite dimensional' initial data is required when $T_d > 0$.) One way to circumvent the situation is to choose an integer $l > 0$ and constant $T > 0$ such that $T_d = lT$, and consider the piecewise-constant input signal

\[ v(t) = v(kT) , \quad kT \le t < (k+1)T \]

Revisiting Example 20.3 and using the state vector

\[ x(k) = \begin{bmatrix} z(kT) \\ v[(k-l)T] \\ \vdots \\ v[(k-1)T] \end{bmatrix} \]

derive a discrete-time linear state equation relating $x(k)$ and $y(kT)$ to $v(kT)$ for $k \ge 0$. What is the dimension of the initial data required to solve the discrete-time state equation? What is the transfer function of this state equation? Hint: The last question can be answered by either a brute-force calculation or a clever calculation.

Exercise 21.10 If $G(z)$ is the transfer function of the single-input, single-output linear state equation

\[ x(k+1) = A x(k) + b u(k) \]
\[ y(k) = c x(k) + d u(k) \]

and $\lambda$ is a complex number satisfying $G(\lambda) = \lambda$, show that $\lambda$ is an eigenvalue of the $(n+1) \times (n+1)$ matrix

\[ \begin{bmatrix} A & b \\ c & d \end{bmatrix} \]

with associated (right) eigenvector

\[ \begin{bmatrix} (\lambda I - A)^{-1} b \\ 1 \end{bmatrix} \]

Find a left eigenvector associated to $\lambda$.

Exercise 21.11 Suppose $M$ is an invertible $n \times n$ matrix with distinct eigenvalues and $K$ is a positive integer. Show that there exists a (possibly complex) $n \times n$ matrix $R$ such that

\[ R^K = M \]

Exercise 21.12 By considering $2 \times 2$ matrices $M$ with one nonzero entry, show that there may or may not exist a $2 \times 2$ matrix $R$ such that $R^2 = M$.
Exercise 21.13 Consider the linear state equation with specified input

\[ x(k+1) = A(k) x(k) + f(k) \]

where $A(k)$ is invertible at each $k$, and $A(k)$ and $f(k)$ are $K$-periodic. Show that there exists a $K$-periodic solution $x(k)$ if there does not exist a $K$-periodic solution of

\[ z(k+1) = A(k) z(k) \]

other than the constant solution $z(k) = 0$. Explain why the converse is not true. (In other words show that the sufficiency portion of Theorem 21.9 applies, but the necessity portion fails when considering a single $f(k)$.)
Exercise 21.14 Consider the linear state equation with specified input

\[ x(k+1) = A(k) x(k) + f(k) \]

where $A(k)$ is invertible at each $k$, and $A(k)$ and $f(k)$ are $K$-periodic. Suppose that there are no $K$-periodic solutions. Show that for every $k_0$ and $x_0$ the solution of the state equation with $x(k_0) = x_0$ is unbounded for $k \ge k_0$. Hint: Use the result of Exercise 21.13.

Exercise 21.15 Establish the following refinement of Theorem 21.9, where $A(k)$ is $K$-periodic and invertible for every $k$, and $f(k)$ is a specified $K$-periodic input. Given $k_0$ there exists an $x_0$ such that the solution of

\[ x(k+1) = A(k) x(k) + f(k) , \quad x(k_0) = x_0 \]
is K-periodic if and only if f (k) is such that

for every K-periodic solution z (k) of the adjoint state equation

\[ z(k-1) = A^T(k-1) z(k) \]
Exercise 21.16 For what values of $\omega$ is the sequence $\sin \omega k$ periodic? Use Exercise 21.15 to
determine, among these values of $\omega$, those for which there exists an $x_0$ such that the resulting
solution of

x(k+1)=
is

1' x(0)=x0

periodic with the same period as $\sin \omega k$.

Exercise 21.17 Suppose that all coefficient matrices in the linear state equation

\[ x(k+1) = A(k) x(k) + B(k) u(k) , \quad x(0) = x_0 \]

are $K$-periodic. Show how to define a time-invariant linear state equation, with the same dimension $n$, but dimension-$mK$ input,

\[ z(k+1) = F z(k) + G v(k) \]

such that for any $x_0$ and any input sequence $u(k)$ we have $z(k) = x(kK)$, $k \ge 0$. If the first state equation has a $K$-periodic output equation,

\[ y(k) = C(k) x(k) + D(k) u(k) \]

show how to define a time-invariant output equation

\[ w(k) = H z(k) + J v(k) \]

so that knowledge of the sequence $w(k)$ provides the sequence $y(k)$. (Note that for the new state equation we might be forced to temporarily abandon our default assumption that the input and output dimensions are no larger than the state dimension.)

NOTES
Note 21.1 The issue of $K^{th}$-roots of an invertible matrix becomes more complicated upon
leaving the diagonalizable case considered in Exercise 21.11. One general approach is to work
with the Jordan form. Consult Section 6.4 of

R.A. Horn, C.R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge,
England, 1991

Note 21.2 Using tools from abstract algebra, a transfer function representation can be developed
for time-varying, discrete-time linear state equations. See

E.W. Kamen, P.P. Khargonekar, K.R. Poolla, "A transfer-function approach to linear time-varying
discrete-time systems," SIAM Journal on Control and Optimization, Vol. 23, No. 4, pp. 550-565,
1985
Note 21.3 In Exercise 21.17 the time-invariant state equation derived from the K-periodic state
equation is sometimes called a K-lifting. Many system properties are preserved in this
correspondence, and various problems can be more easily addressed in terms of the lifted state
equation. The idea also applies to multi-rate sampled-data systems. See, for example,

R.A. Meyer, C.S. Burrus, "A unified analysis of multirate and periodically time-varying digital
filters," IEEE Transactions on Circuits and Systems, Vol. 22, pp. 162-168, 1975

and Section III of

P.P. Khargonekar, A.B. Ozguler, "Decentralized control and periodic feedback," IEEE
Transactions on Automatic Control, Vol. 39, No. 4, pp. 877-882, 1994

and references therein.

22
DISCRETE TIME
INTERNAL STABILITY

Internal stability deals with boundedness properties and asymptotic behavior (as $k \to \infty$) of solutions of the zero-input linear state equation

\[ x(k+1) = A(k) x(k) , \quad x(k_0) = x_0 \tag{1} \]

While bounds on solutions might be of interest for fixed $k_0$ and $x_0$, or for arbitrary initial states at a fixed $k_0$, we focus on bounds that hold regardless of the choice of $k_0$. In a similar fashion the concept we adopt relative to asymptotically-zero solutions is independent of the choice of initial time. These 'uniform in $k_0$' concepts are the most appropriate in relation to input-output stability properties of discrete-time linear state equations that are developed in Chapter 27.
We first characterize stability properties of the linear state equation (1) in terms of bounds on the transition matrix $\Phi(k, k_0)$ for $A(k)$. While this leads to convenient eigenvalue criteria when $A(k)$ is constant, it does not provide a generally useful stability test because of the difficulty in computing explicit expressions for $\Phi(k, k_0)$. Lyapunov stability criteria that provide effective stability tests in the time-varying case are addressed in Chapter 23.

Uniform Stability
The first notion involves boundedness of solutions of (1). Because solutions are linear in the initial state, it is convenient to express the bound as a linear function of the norm of the initial state.

22.1 Definition The discrete-time linear state equation (1) is called uniformly stable if there exists a finite positive constant $\gamma$ such that for any $k_0$ and $x_0$ the corresponding solution satisfies

\[ \|x(k)\| \le \gamma \|x_0\| , \quad k \ge k_0 \tag{2} \]

Evaluation of (2) at $k = k_0$ shows that the constant $\gamma$ must satisfy $\gamma \ge 1$. The adjective uniform in the definition refers precisely to the fact that $\gamma$ must not depend on the choice of initial time, as illustrated in Figure 22.2. A 'nonuniform' stability concept can be defined by permitting $\gamma$ to depend on the initial time, but this is not considered here except to show by a simple example that there is a difference.

22.2 Figure Uniform stability implies the $\gamma$-bound is independent of $k_0$.

22.3 Example Various examples in the sequel are constructed from scalar linear state equations of the form

\[ x(k+1) = \frac{f(k+1)}{f(k)}\, x(k) , \quad x(k_0) = x_0 \tag{3} \]

where $f(k)$ is a sequence of nonzero real numbers. It is easy to see that the transition scalar for such a state equation is

\[ \Phi(k, j) = \frac{f(k)}{f(j)} \]

defined for all $k$, $j$. For the purpose at hand, consider

\[ f(k) = \begin{cases} \exp\{ -(k/2)[1 - (-1)^k] \} , & k \ge 0 \\ 1 , & k < 0 \end{cases} \]

for which

\[ \Phi(k, j) = \begin{cases} \exp\{ -(k/2)[1 - (-1)^k] + (j/2)[1 - (-1)^j] \} , & k \ge j \ge 0 \\ \exp\{ -(k/2)[1 - (-1)^k] \} , & k \ge 0 > j \\ 1 , & 0 > k \ge j \end{cases} \]

Given any $j$ it is clear that $|\Phi(k, j)|$ is bounded for $k \ge j$. Thus given $k_0$ there is a constant $\gamma$ (depending on $k_0$) such that (2) holds. However the dependence of $\gamma$ on $k_0$ is crucial, for if $k_0$ is an odd positive integer and $k = k_0 + 1$,

\[ \Phi(k_0 + 1, k_0) = \exp(k_0) \]

This shows that there is no bound on $\Phi(k_0 + 1, k_0)$ that holds independent of $k_0$, and therefore no bound of the form (2) with $\gamma$ independent of $k_0$. In other words the linear state equation is not uniformly stable, but it could be called 'stable' since each initial state yields a bounded response.

□□□
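The distinction drawn in Example 22.3 can be checked numerically. The following is a minimal sketch, assuming the sequence reads $f(k) = \exp\{-(k/2)[1-(-1)^k]\}$ for $k \ge 0$ and $f(k) = 1$ for $k < 0$; the function names `f` and `phi` are illustrative choices, not notation from the text.

```python
import math

def f(k):
    """Assumed sequence of Example 22.3: 1 for k < 0 and for even k >= 0, exp(-k) for odd k > 0."""
    if k < 0:
        return 1.0
    return math.exp(-(k / 2) * (1 - (-1) ** k))

def phi(k, j):
    """Transition scalar f(k)/f(j) of x(k+1) = [f(k+1)/f(k)] x(k)."""
    return f(k) / f(j)

# For one fixed initial time the transition scalar stays bounded ...
bound_from_3 = max(abs(phi(k, 3)) for k in range(3, 60))

# ... but the one-step gain from an odd initial time k0 equals exp(k0),
# so no single bound works for every initial time.
one_step_gains = [phi(k0 + 1, k0) for k0 in (1, 3, 5, 7)]

print(bound_from_3, one_step_gains)
```

The bound found for the initial time 3 is finite, while the one-step gains grow with the initial time, which is the failure of uniformity described in the example.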
We emphasize again that Definition 22.1 is stated in a form specific to linear state
equations. Equivalence to a more general definition of uniform stability that is used also
in the nonlinear case is the subject of Exercise 22.1.
The basic characterization of uniform stability in terms of the (induced norm of

the) transition matrix is readily discernible from Definition 22.1. Though the proof
requires a bit of finesse, it is similar to the proof of Theorem 22.7 in the sequel, and thus
is left to Exercise 22.3.

22.4 Theorem The linear state equation (1) is uniformly stable if and only if there exists a finite positive constant $\gamma$ such that

\[ \|\Phi(k, j)\| \le \gamma \tag{4} \]

for all $k$, $j$ such that $k \ge j$.

Uniform Exponential Stability


Next we consider a stability property for (1) that addresses both boundedness of solutions and asymptotic behavior of solutions. It implies uniform stability, and imposes an additional requirement that all solutions approach zero exponentially as $k \to \infty$.

22.5 Definition The linear state equation (1) is called uniformly exponentially stable if there exist a finite positive constant $\gamma$ and a constant $0 \le \lambda < 1$ such that for any $k_0$ and $x_0$ the corresponding solution satisfies

\[ \|x(k)\| \le \gamma \lambda^{k - k_0} \|x_0\| , \quad k \ge k_0 \tag{5} \]

Again $\gamma$ is no less than unity, and the adjective uniform refers to the fact that $\gamma$ and $\lambda$ are independent of $k_0$. This is illustrated in Figure 22.6. The property of uniform exponential stability can be expressed in terms of an exponential bound on the transition matrix norm.

22.6 Figure A decaying-exponential bound independent of $k_0$.

22.7 Theorem The linear state equation (1) is uniformly exponentially stable if and only if there exist a finite positive constant $\gamma$ and a constant $0 \le \lambda < 1$ such that

\[ \|\Phi(k, j)\| \le \gamma \lambda^{k - j} \tag{6} \]

for all $k$, $j$ such that $k \ge j$.

Proof First suppose $\gamma > 0$ and $0 \le \lambda < 1$ are such that (6) holds. Then for any $k_0$ and $x_0$ the solution of (1) satisfies, using Exercise 1.6,

\[ \|x(k)\| = \|\Phi(k, k_0) x_0\| \le \|\Phi(k, k_0)\| \, \|x_0\| \le \gamma \lambda^{k - k_0} \|x_0\| , \quad k \ge k_0 \]

and uniform exponential stability is established.

For the reverse implication suppose that the state equation (1) is uniformly exponentially stable. Then there is a finite $\gamma > 0$ and $0 \le \lambda < 1$ such that for any $k_0$ and $x_0$ the corresponding solution satisfies

\[ \|x(k)\| \le \gamma \lambda^{k - k_0} \|x_0\| , \quad k \ge k_0 \]

Given any $k_0$ and $k_a \ge k_0$, let $x_a$ be such that

\[ \|x_a\| = 1 , \quad \|\Phi(k_a, k_0) x_a\| = \|\Phi(k_a, k_0)\| \]

(Such an $x_a$ exists by definition of the induced norm.) Then the initial state $x(k_0) = x_a$ yields a solution of (1) that at time $k_a$ satisfies

\[ \|x(k_a)\| = \|\Phi(k_a, k_0) x_a\| = \|\Phi(k_a, k_0)\| \, \|x_a\| \le \gamma \lambda^{k_a - k_0} \|x_a\| \]

Because $\|x_a\| = 1$, this shows that

\[ \|\Phi(k_a, k_0)\| \le \gamma \lambda^{k_a - k_0} \tag{7} \]

Because such an $x_a$ can be selected for any $k_0$ and $k_a \ge k_0$, the proof is complete.

□□□
Uniform stability and uniform exponential stability are the only internal stability
concepts used in the sequel. Uniform exponential stability is the most important of the
two, and another theoretical characterization is useful.

22.8 Theorem The linear state equation (1) is uniformly exponentially stable if and only if there exists a finite positive constant $\beta$ such that

\[ \sum_{i=j+1}^{k} \|\Phi(k, i)\| \le \beta \tag{8} \]

for all $k$, $j$ such that $k \ge j + 1$.

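Proof If the state equation is uniformly exponentially stable, then by Theorem 22.7 there exist finite $\gamma > 0$ and $0 \le \lambda < 1$ such that

\[ \|\Phi(k, i)\| \le \gamma \lambda^{k - i} \]

for all $k$, $i$ such that $k \ge i$. Then, making use of a change of summation index, and the fact that $0 \le \lambda < 1$,

\[ \sum_{i=j+1}^{k} \|\Phi(k, i)\| \le \sum_{i=j+1}^{k} \gamma \lambda^{k - i} = \gamma \sum_{q=0}^{k-j-1} \lambda^{q} \le \gamma \sum_{q=0}^{\infty} \lambda^{q} = \frac{\gamma}{1 - \lambda} \]

for all $k$, $j$ such that $k \ge j + 1$. Thus (8) is established with $\beta = \gamma / (1 - \lambda)$.

Conversely suppose (8) holds. Using the idea of a telescoping summation, we can write

\[ \Phi(k, j) = I + \sum_{i=j+1}^{k} [\, \Phi(k, i) A(i-1) - \Phi(k, i) \,] = I + \sum_{i=j+1}^{k} \Phi(k, i) [\, A(i-1) - I \,] \]

Therefore, using the fact that (8) with $k = j + 2$ gives the bound $\|A(j+1)\| \le \beta - 1$ for all $j$,

\[ \|\Phi(k, j)\| \le 1 + \sum_{i=j+1}^{k} \|\Phi(k, i)\| \, \|A(i-1) - I\| \le 1 + \beta \sum_{i=j+1}^{k} \|\Phi(k, i)\| \le 1 + \beta^2 \tag{9} \]

for all $k$, $j$ such that $k \ge j + 1$. In completing this proof the composition property of the transition matrix is crucial. So long as $k \ge j + 1$ we can write, cleverly,

\[ \|\Phi(k, j)\| (k - j) = \sum_{i=j+1}^{k} \|\Phi(k, j)\| \le \sum_{i=j+1}^{k} \|\Phi(k, i)\| \, \|\Phi(i, j)\| \le \beta (1 + \beta^2) \]

From this inequality pick an integer $K$ such that $K \ge 2\beta(1 + \beta^2)$, and set $k = j + K$ to obtain

\[ \|\Phi(j + K, j)\| \le 1/2 \tag{10} \]

for all $j$. Patching together the bounds (9) and (10) on time-index ranges of the form $k = j + qK, \ldots, j + (q+1)K - 1$ yields the following inequalities.

\[ \|\Phi(k, j)\| \le 1 + \beta^2 , \quad k = j, \ldots, j + K - 1 \]

\[ \|\Phi(k, j)\| = \|\Phi(k, j+K)\, \Phi(j+K, j)\| \le \|\Phi(k, j+K)\| \, \|\Phi(j+K, j)\| \le \frac{1 + \beta^2}{2} , \quad k = j + K, \ldots, j + 2K - 1 \]

\[ \|\Phi(k, j)\| = \|\Phi(k, j+2K)\, \Phi(j+2K, j+K)\, \Phi(j+K, j)\| \le \|\Phi(k, j+2K)\| \, \|\Phi(j+2K, j+K)\| \, \|\Phi(j+K, j)\| \le \frac{1 + \beta^2}{2^2} , \quad k = j + 2K, \ldots, j + 3K - 1 \]

Continuing in this fashion shows that, for any value of $j$,

\[ \|\Phi(k, j)\| \le \frac{1 + \beta^2}{2^q} , \quad k = j + qK, \ldots, j + (q+1)K - 1 \tag{11} \]

Figure 22.9 offers a picturesque explanation of the bound (11), and with $\lambda = (1/2)^{1/K}$ and $\gamma = 2(1 + \beta^2)$ we have

\[ \|\Phi(k, j)\| \le \gamma \lambda^{k - j} \]

for all $k$, $j$ such that $k \ge j$. Uniform exponential stability follows from Theorem 22.7.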

22.9 Figure Bounds constructed in the proof of Theorem 22.8.

22.10 Remark A restatement of the condition that (8) holds for all $k$, $j$ such that $k \ge j + 1$ is that

\[ \sum_{i=-\infty}^{k} \|\Phi(k, i)\| \le \beta \]

holds for all $k$. Proving this small fact is a recommended exercise.

□□□
For time-invariant linear state equations, where $A(k) = A$ and $\Phi(k, j) = A^{k-j}$, a summation-variable change in (8) shows that uniform exponential stability is equivalent to existence of a finite constant $\beta$ such that

\[ \sum_{k=0}^{\infty} \|A^k\| \le \beta \tag{12} \]

The adjective 'uniform' is superfluous in the time-invariant case, and we drop it in clear contexts. Though exponential stability usually is called asymptotic stability when discussing time-invariant linear state equations, we retain the term exponential stability.
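The finiteness condition on the sum of norms of powers of $A$ can be explored numerically. The sketch below is an illustration under stated assumptions: it uses the induced infinity norm (largest absolute row sum) because it is easy to compute by hand, whereas the text works with the spectral norm; finiteness of the sum does not depend on this choice. The helper names `matmul`, `norm_inf`, and `partial_sums` and both test matrices are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def norm_inf(a):
    """Induced infinity norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in a)

def partial_sums(a, terms):
    """Partial sums of ||A^0|| + ||A^1|| + ... with the given number of terms."""
    n = len(a)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total, sums = 0.0, []
    for _ in range(terms):
        total += norm_inf(power)
        sums.append(total)
        power = matmul(power, a)
    return sums

stable = partial_sums([[0.5, 1.0], [0.0, 0.5]], 80)    # both eigenvalues 0.5
unstable = partial_sums([[1.0, 1.0], [0.0, 1.0]], 80)  # both eigenvalues 1
print(stable[-1], unstable[-1])
```

For the first matrix the partial sums settle near 6, so a finite bound exists; for the second they grow without limit as more terms are added.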

Combining an explicit representation for $A^k$ developed in Chapter 21 with the finiteness condition (12) yields a better-known characterization of exponential stability.

22.11 Theorem A linear state equation (1) with constant A (k) = A is exponentially
stable if and only if all eigenvalues of A have magnitude strictly less than unity.

Proof Suppose the eigenvalue condition holds. Then writing $A^k$ as in (12) of Chapter 21, where $\lambda_1, \ldots, \lambda_m$ are the distinct eigenvalues of $A$, gives

\[ \sum_{k=0}^{\infty} \|A^k\| = \sum_{k=0}^{\infty} \Big\| \sum_{l=1}^{m} \sum_{r=1}^{\sigma_l} W_{lr} \binom{k}{r-1} \lambda_l^{k+1-r} \Big\| \le \sum_{l=1}^{m} \sum_{r=1}^{\sigma_l} \|W_{lr}\| \sum_{k=0}^{\infty} \binom{k}{r-1} |\lambda_l|^{k+1-r} \tag{13} \]

Using $|\lambda_l| < 1$, and the fact that for fixed $r$ the binomial coefficient is a polynomial in $k$, an exercise in bounding infinite sums (namely Exercise 22.6) shows that the right side of (13) is finite. Thus exponential stability follows.
If the magnitude-less-than-unity eigenvalue condition on $A$ fails, then appropriate selection of an eigenvector of $A$ as an initial state can be used to show that the linear state equation is not exponentially stable. Suppose first that $\lambda$ is a real eigenvalue satisfying $|\lambda| \ge 1$, and let $p$ be an associated (necessarily real) eigenvector. The eigenvalue-eigenvector equation easily yields

\[ A^k p = \lambda^k p , \quad k \ge 0 \]

Thus for the initial state $x_0 = p$ it is clear that the corresponding solution of (1), $x(k) = A^k p$, does not go to zero as $k \to \infty$. (Indeed $\|x(k)\|$ grows without bound if $|\lambda| > 1$.) Therefore the state equation is not exponentially stable.

Now suppose that $\lambda$ is a complex eigenvalue of $A$ with $|\lambda| \ge 1$. Again let $p$ be an eigenvector associated with $\lambda$, written

\[ p = \mathrm{Re}[p] + i\, \mathrm{Im}[p] \]

Then

\[ \|A^k p\| = |\lambda|^k \|p\| \ge \|p\| , \quad k \ge 0 \]

and this shows that

\[ A^k p = A^k \mathrm{Re}[p] + i A^k \mathrm{Im}[p] \]

does not approach zero as $k \to \infty$. Therefore at least one of the real initial states $x_0 = \mathrm{Re}[p]$ or $x_0 = \mathrm{Im}[p]$ yields a solution of (1) that does not approach zero. Again this implies the state equation is not exponentially stable.

□□□
This proof, with a bit of elaboration, shows also that $\lim_{k \to \infty} A^k = 0$ is a necessary and sufficient condition for uniform exponential stability in the time-invariant case. The analogous statement for time-varying linear state equations is not true.
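For a $2 \times 2$ constant matrix the eigenvalue criterion of Theorem 22.11 reduces to a quadratic formula. The following is a minimal sketch under that restriction; the function names and the two sample matrices are illustrative assumptions, and a general-dimension test would call for a numerical eigenvalue routine instead.

```python
import cmath

def eigenvalues_2x2(a):
    """Roots of the characteristic polynomial z^2 - trace*z + det of a 2 x 2 matrix."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2

def is_exponentially_stable(a):
    """True when every eigenvalue has magnitude strictly less than one."""
    return all(abs(lam) < 1 for lam in eigenvalues_2x2(a))

damped = [[0.0, 1.0], [-0.5, 1.0]]    # eigenvalues (1 +/- i)/2, magnitude about 0.707
rotation = [[0.0, 1.0], [-1.0, 0.0]]  # eigenvalues +/- i, magnitude exactly 1

print(is_exponentially_stable(damped), is_exponentially_stable(rotation))
```

The second matrix shows why the inequality must be strict: its solutions stay bounded but do not decay.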

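22.12 Example Consider a scalar linear state equation of the form introduced in Example 22.3, with

\[ f(k) = \begin{cases} 1 , & k \le 0 \\ 1/k , & k > 0 \end{cases} \tag{14} \]

Then

\[ \Phi(k, k_0) = \begin{cases} k_0 / k , & k \ge k_0 > 0 \\ 1/k , & k > 0 \ge k_0 \\ 1 , & 0 \ge k \ge k_0 \end{cases} \]

It is obvious that for any $k_0$, $\lim_{k \to \infty} \Phi(k, k_0) = 0$. However with $k_0 = 1$ suppose there exist positive $\gamma$ and $0 \le \lambda < 1$ such that

\[ \Phi(k, 1) = \frac{1}{k} \le \gamma \lambda^{k-1} , \quad k \ge 1 \]

This implies

\[ 1 \le \gamma k \lambda^{k-1} , \quad k \ge 1 \]

which is a contradiction since $0 \le \lambda < 1$ forces $k \lambda^{k-1} \to 0$ as $k \to \infty$. Thus the state equation is not uniformly exponentially stable.

□□□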
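One way to see what goes wrong in Example 22.12 is to count how many steps the transition scalar needs to fall to a fixed fraction of its initial value. The sketch below assumes the transition scalar $\Phi(k, k_0) = k_0/k$ for $k \ge k_0 > 0$; the function names and the choice of the fraction $1/2$ are illustrative.

```python
from fractions import Fraction

def phi(k, k0):
    """Transition scalar k0/k of Example 22.12, valid for k >= k0 > 0."""
    return Fraction(k0, k)

def steps_to_reach(delta, k0):
    """Smallest K >= 0 with phi(k0 + K, k0) <= delta."""
    K = 0
    while phi(k0 + K, k0) > delta:
        K += 1
    return K

half = Fraction(1, 2)
elapsed = [steps_to_reach(half, k0) for k0 in (1, 10, 100)]
print(elapsed)
```

The elapsed time needed to halve the response grows in proportion to the initial time, so no single decay rate can serve for every initial time.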
It is interesting to observe that discrete-time linear state equations can be such that
the response to every initial state is zero after a finite number of time steps. For example

suppose that A (k) is a constant, nilpotent matrix of the form N, in Example 21.2. This
'finite-time asymptotic stability' does not occur in continuous-time linear state
equations.
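A small simulation makes the finite-time behavior concrete. This is a sketch only: the $3 \times 3$ shift matrix below is a hypothetical nilpotent matrix chosen for illustration and is not claimed to be the matrix of Example 21.2.

```python
def step(a, x):
    """One step x(k+1) = A x(k) for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

# Hypothetical nilpotent matrix: ones on the superdiagonal, so N^3 = 0.
N = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]

trajectory = [[3, -2, 5]]
for _ in range(3):
    trajectory.append(step(N, trajectory[-1]))

print(trajectory)
```

Each step shifts the state entries up by one position, and after three steps every initial state has reached zero exactly.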

Uniform Asymptotic Stability


Example 22.12 raises the question of what condition is needed in addition to $\lim_{k \to \infty} \Phi(k, k_0) = 0$ to conclude uniform exponential stability in the time-varying case. The answer turns out to be a uniformity condition, and perhaps this is best examined in terms of another stability definition.

22.13 Definition The linear state equation (1) is called uniformly asymptotically stable if it is uniformly stable, and if given any positive constant $\delta$ there exists a positive integer $K$ such that for any $k_0$ and $x_0$ the corresponding solution satisfies

\[ \|x(k)\| \le \delta \|x_0\| , \quad k \ge k_0 + K \tag{15} \]

Note that the elapsed time $K$ until the solution satisfies the bound (15) must be independent of the initial time. (It is easy to verify that the state equation in Example 22.12 does not have this feature.) The same tools used in proving Theorem 22.8 can be used to show that this 'elapsed-time uniformity' is key to uniform exponential stability.

22.14 Theorem The linear state equation (1) is uniformly asymptotically stable if and only if it is uniformly exponentially stable.

Proof Suppose that the state equation is uniformly exponentially stable, that is, there exist finite positive $\gamma$ and $0 \le \lambda < 1$ such that $\|\Phi(k, j)\| \le \gamma \lambda^{k-j}$ whenever $k \ge j$. Then the state equation clearly is uniformly stable. To show it is uniformly asymptotically stable, for a given $\delta > 0$ select a positive integer $K$ such that $\lambda^K \le \delta / \gamma$. Then for any $k_0$ and $x_0$, and $k \ge k_0 + K$,

\[ \|x(k)\| = \|\Phi(k, k_0) x_0\| \le \|\Phi(k, k_0)\| \, \|x_0\| \le \gamma \lambda^{k - k_0} \|x_0\| \le \gamma \lambda^{K} \|x_0\| \le \delta \|x_0\| , \quad k \ge k_0 + K \]

This demonstrates uniform asymptotic stability.
Conversely suppose the state equation is uniformly asymptotically stable. Uniform stability is implied, so there exists a positive $\gamma$ such that

\[ \|\Phi(k, j)\| \le \gamma \tag{16} \]

for all $k$, $j$ such that $k \ge j$. Select $\delta = 1/2$ and, relying on Definition 22.13, let $K$ be a positive integer such that (15) is satisfied. Then given a $k_0$, let $x_a$ be such that $\|x_a\| = 1$ and

\[ \|\Phi(k_0 + K, k_0) x_a\| = \|\Phi(k_0 + K, k_0)\| \]

With the initial state $x(k_0) = x_a$, the solution of (1) satisfies

\[ \|x(k_0 + K)\| = \|\Phi(k_0 + K, k_0) x_a\| = \|\Phi(k_0 + K, k_0)\| \, \|x_a\| \le \tfrac{1}{2} \|x_a\| \]

from which

\[ \|\Phi(k_0 + K, k_0)\| \le 1/2 \tag{17} \]

Of course such an $x_a$ exists for any given $k_0$, so the argument compels (17) for any $k_0$. Now uniform exponential stability is implied by (16) and (17), exactly as in the proof of Theorem 22.8.
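The first half of the proof chooses an elapsed time $K$ from the exponential bound alone, without reference to the initial time. A minimal sketch of that selection follows, assuming constants $\gamma \ge 1$, $0 \le \lambda < 1$, and $\delta > 0$ are given; the function name `elapsed_time` and the sample values are hypothetical.

```python
def elapsed_time(gamma, lam, delta):
    """Smallest positive integer K with lam**K <= delta / gamma.

    Assumes 0 <= lam < 1, gamma > 0, delta > 0; otherwise the loop need not end.
    """
    K = 1
    while lam ** K > delta / gamma:
        K += 1
    return K

# With gamma = 4 and lam = 0.5, reaching delta = 0.5 needs 0.5**K <= 0.125.
K_half = elapsed_time(4.0, 0.5, 0.5)
print(K_half)
```

The returned integer depends only on the three constants, which is exactly the uniformity that Definition 22.13 demands.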

Additional Examples
Usually in physical examples, including those below, the focus is on stable behavior. But it should be remembered that instability can be a good thing: frugal readers might contemplate their savings accounts.

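22.15 Example In the setting of Example 20.16, where the economic model in Example 20.1 is reformulated in terms of deviations from a constant nominal solution, constant government spending leads to consideration of the linear state equation

\[ x_\delta(k+1) = \begin{bmatrix} \alpha & \alpha \\ \beta(\alpha - 1) & \beta\alpha \end{bmatrix} x_\delta(k) , \quad x_\delta(0) = x_{\delta 0} \tag{18} \]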

In this context exponential stability refers to the property of returning to the constant nominal solution from a deviation represented by the initial state. The characteristic polynomial of the A-matrix is readily computed as

\[ \det(\lambda I - A) = \lambda^2 - \alpha(\beta + 1)\lambda + \alpha\beta \tag{19} \]

and further algebra yields the eigenvalues

\[ \lambda = \frac{\alpha(\beta + 1) \pm \sqrt{\alpha^2(\beta + 1)^2 - 4\alpha\beta}}{2} \]

Even in this simple situation it is messy to analyze the eigenvalue condition for exponential stability. Instead we apply elementary facts about polynomials, namely that the product of the roots of (19) is $\alpha\beta$, while the sum of the roots is $\alpha(\beta + 1)$. This together with the restrictions $0 < \alpha < 1$ and $\beta > 0$ on the coefficients in the state equation leads to the conclusion that (18) is exponentially stable if and only if $\alpha\beta < 1$.
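The conclusion of Example 22.15 can be spot-checked numerically. The sketch below assumes only what the example states about the characteristic polynomial, namely that its roots have product $\alpha\beta$ and sum $\alpha(\beta+1)$; the function names and the grid of sample values are illustrative choices.

```python
import cmath

def eigenvalues(alpha, beta):
    """Roots of z^2 - alpha*(beta + 1)*z + alpha*beta."""
    root_sum = alpha * (beta + 1)
    root_product = alpha * beta
    disc = cmath.sqrt(root_sum * root_sum - 4 * root_product)
    return (root_sum + disc) / 2, (root_sum - disc) / 2

def is_exponentially_stable(alpha, beta):
    """Eigenvalue test of Theorem 22.11 applied to the two roots."""
    return max(abs(lam) for lam in eigenvalues(alpha, beta)) < 1

# Compare the eigenvalue test with the simple condition alpha*beta < 1
# on a few sample values satisfying 0 < alpha < 1 and beta > 0.
samples = [(a, b) for a in (0.2, 0.5, 0.8) for b in (0.5, 1.0, 1.5, 3.0, 6.0)]
agreement = [is_exponentially_stable(a, b) == (a * b < 1) for a, b in samples]
print(all(agreement))
```

On every sample the direct eigenvalue test and the product condition give the same verdict, in line with the argument based on the sum and product of the roots.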

22.16 Example Cohort population models describe the evolution of populations in different age groups as time marches on, taking into account birth rates, survival rates, and immigration rates. We describe such a model with three age groups (cohorts) under the assumption that the female and male populations are identical. Therefore only the female populations need to be counted.

In year $k$ let $x_1(k)$ be the population in the oldest age group, $x_2(k)$ be the population in the middle age group, and $x_3(k)$ be the population in the youngest age group. We assume that in year $k + 1$ the populations in the first two age groups change according to

\[ x_1(k+1) = \beta_2 x_2(k) + u_1(k) \]
\[ x_2(k+1) = \beta_3 x_3(k) + u_2(k) \tag{20} \]

where $\beta_2$ and $\beta_3$ are survival rates from one age group to the next, and $u_1(k)$ and $u_2(k)$ are immigrant populations in the respective age groups. Assuming the birth rates (for females) in the three populations are $\alpha_1$, $\alpha_2$, and $\alpha_3$, the population of the youngest age group is described by

\[ x_3(k+1) = \alpha_1 x_1(k) + \alpha_2 x_2(k) + \alpha_3 x_3(k) + u_3(k) \]

Taking the total population as the output signal, we obtain the linear state equation

\[ x(k+1) = \begin{bmatrix} 0 & \beta_2 & 0 \\ 0 & 0 & \beta_3 \\ \alpha_1 & \alpha_2 & \alpha_3 \end{bmatrix} x(k) + \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} u(k) \]
\[ y(k) = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} x(k) \tag{21} \]

Notice that all coefficients in this linear state equation are nonnegative.

For this model exponential stability corresponds to the vanishing of the three cohort populations in the absence of immigration, presumably because survival rates and birth rates are too low. While it is difficult to check the eigenvalue condition for exponential stability in the absence of numerical values for the coefficients, it is not difficult to confirm the basic intuition. Indeed from Exercise 1.9 a sufficient condition for exponential stability is $\|A\| < 1$. Applying a simple bound for the matrix norm in terms of the matrix entries, from Chapter 1, it follows that if

\[ \alpha_1, \alpha_2, \alpha_3, \beta_2, \beta_3 < 1/3 \]

then the linear state equation is exponentially stable.
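The intuition in Example 22.16 can be illustrated by simulating the cohort model without immigration. This is a sketch under stated assumptions: the coefficient values are hypothetical, each below 1/3, and the code uses the largest absolute row sum as an easily computed norm bound rather than the spectral-norm bound of Chapter 1.

```python
def step(a, x):
    """One step x(k+1) = A x(k) with zero immigration."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

# Hypothetical survival and birth rates, all below 1/3.
beta2, beta3 = 0.30, 0.25
alpha1, alpha2, alpha3 = 0.10, 0.30, 0.20

A = [[0.0, beta2, 0.0],
     [0.0, 0.0, beta3],
     [alpha1, alpha2, alpha3]]

# Largest absolute row sum; a value below one already forces decay.
row_sum_norm = max(sum(abs(v) for v in row) for row in A)

x = [1000.0, 2000.0, 3000.0]   # oldest, middle, youngest cohorts
totals = [sum(x)]              # output y(k): total population
for _ in range(60):
    x = step(A, x)
    totals.append(sum(x))

print(row_sum_norm, totals[0], totals[-1])
```

With these rates the total population shrinks every year and is negligible after sixty years, matching the interpretation of exponential stability as extinction in the absence of immigration.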

EXERCISES
Exercise 22.1 Show that uniform stability of the linear state equation

\[ x(k+1) = A(k) x(k) , \quad x(k_0) = x_0 \]

is equivalent to the following property. Given any positive constant $\varepsilon$ there exists a positive constant $\delta$ such that, regardless of $k_0$, if $\|x_0\| \le \delta$ then the corresponding solution satisfies $\|x(k)\| \le \varepsilon$ for all $k \ge k_0$.

Exercise 22.2 Prove or provide counterexamples to the following claims about the linear state equation

\[ x(k+1) = A(k) x(k) \]

(i) If there exists a constant $\alpha < 1$ such that $\|A(k)\| \le \alpha$ for all $k$, then the state equation is uniformly exponentially stable.

(ii) If $\|A(k)\| < 1$ for all $k$, then the state equation is uniformly exponentially stable.

(iii) If the state equation is uniformly exponentially stable, then there exists a finite constant $\alpha$ such that $\|A(k)\| \le \alpha$ for all $k$.
Exercise 22.3

Prove Theorem 22.4.

Exercise 22.4

For the linear state equation

.v(k+l) =A(k)x(k)
let

k)II

=sup

j=O, I,

where supremum means the least upper bound. Show that the state equation is uniformly
exponentially stable if and only if

lim t'J'J < I


Exercise 22.5 Formulate discrete-time versions of Definition 6.14 and Theorem 6.15 (including
its proof) on Lyapunov transformations.
Exercise 22.6

If

is a complex number with I A. < 1, show how to define a constant

such that

k0

klXkI

Use this to bound k I A. k by a decaying exponential sequence. Then use the well-known series

lal<1
to derive a bound on
I

L =0

where j is a nonnegative integer.

Exercise 22.7 Show that the linear state equation

\[ x(k+1) = A(k) x(k) \]

is uniformly exponentially stable if and only if the state equation

\[ z(k+1) = A^T(-k) z(k) \]

is uniformly exponentially stable. Show by example that this equivalence does not hold for $z(k+1) = A^T(k) z(k)$. Hint: See Exercise 20.11, and for the second part try a 2-dimensional, 3-periodic case where the $A(k)$'s are either diagonal or anti-diagonal.
Exercise 22.8 For a time-invariant linear state equation

\[ x(k+1) = A x(k) \]

use techniques from the proof of Theorem 22.11 to derive both a necessary condition and a sufficient condition for uniform stability that involve only the eigenvalues of $A$. Illustrate the gap in your conditions by $n = 2$ examples.
Exercise 22.9

For a time invariant linear state equation

x(k+l) =Ax(k)
derive a necessary and sufficient condition on the eigenvalues of A such that the response to any x0
is identically zero after a finite number of steps.

Exercise 22.10 For what ranges of constant a is the linear state equation

.v(k+l)=
not

1/2

aL

1/2

x(k)

uniformly exponentially stable? Hint: See Exercise 20.9.

Exercise 22.11 Suppose the linear state equations (not necessarily the same dimension)

\[ x(k+1) = A_{11}(k) x(k) , \quad z(k+1) = A_{22}(k) z(k) \]

are uniformly exponentially stable. Under what condition on $A_{12}(k)$ will the linear state equation with

\[ A(k) = \begin{bmatrix} A_{11}(k) & A_{12}(k) \\ 0 & A_{22}(k) \end{bmatrix} \]

be uniformly exponentially stable? Hint: See Exercise 20.12.

Exercise 22.12 Show that the linear state equation

\[ x(k+1) = A(k) x(k) \]

is uniformly exponentially stable if and only if there exists a finite constant $\gamma$ such that

\[ \sum_{i=j+1}^{k} \|\Phi(k, i)\|^2 \le \gamma \]

for all $k$, $j$ with $k \ge j + 1$.

Exercise 22.13 Prove that the linear state equation

\[ x(k+1) = A(k) x(k) \]

is uniformly exponentially stable if and only if there exists a finite constant $\beta$ such that

\[ \sum_{i=j+1}^{k} \|\Phi(i, j)\| \le \beta \]

for all $k$, $j$ such that $k \ge j + 1$.


NOTES
Note 22.1 A wide variety of stability definitions are in use. For example a list of 12 definitions
(in the context of nonlinear state equations) is given in Section 5.4 of

R.P. Agarwal, Difference Equations and Inequalities, Marcel Dekker, New York, 1992

Note 22.2 A well-known tabular test on the coefficients of a polynomial for magnitude-less-than-unity roots is the Jury criterion. This test avoids the computation of eigenvalues for stability assessment, and it is particularly convenient for low-degree situations such as in Example 22.15. An original source is

E.I. Jury, J. Blanchard, "A stability test for linear discrete-time systems in table form," Proceedings of the Institute of Radio Engineers, Vol. 49, pp. 1947-1948, 1961

and the criterion also is described in most elementary texts on digital control systems.

Note 22.3 Using more sophisticated algebraic techniques, a characterization of uniform asymptotic stability for time-varying linear state equations is given in terms of the spectral radius of a shift mapping in

E.W. Kamen, P.P. Khargonekar, K.R. Poolla, "A transfer-function approach to linear time-varying
discrete-time systems," SIAM Journal on Control and Optimization, Vol. 23, No. 4, pp. 550-565,
1985

Note 22.4 Do the definitions of exponential and asymptotic stability seem unsatisfying, perhaps
because of the emphasis on that never-quite-attained zero state ('asymptopia')? An alternative is
to consider concepts of finite-time stability as in

L. Weiss, J.S. Lee, "Stability of linear discrete-time systems in a finite time interval," Automation
and Remote Control, Vol. 32, No. 12, Part 1, pp. 1915-1919, 1971 (Translated from Avtomatika i
Telemekhanika, Vol. 32, No. 12, pp. 63-68, 1971)

However asymptotic notions of stability have demonstrated greater theoretical utility, probably because of connections to other issues such as input-output stability considered in Chapter 27.

23
DISCRETE TIME
LYAPUNOV STABILITY CRITERIA

We discuss Lyapunov criteria for various stability properties of the zero-input linear state equation

\[ x(k+1) = A(k) x(k) , \quad x(k_0) = x_0 \tag{1} \]

In continuous-time systems these criteria arise with the notion that total energy of an
unforced, dissipative mechanical system decreases as the state of the system evolves in

time. Therefore the state vector approaches a constant value corresponding to zero
energy as time increases. Phrased more generally, stability properties involve the
growth properties of solutions of the state equation, and these properties can be
measured by a suitable (energy-like) scalar function of the state vector. This viewpoint
carries over to discrete-time state equations with little more than cosmetic change.
To illustrate the basic idea, we seek conditions that imply all solutions of the linear state equation (1) are such that $\|x(k)\|^2$ monotonically decreases as $k \to \infty$. For any solution $x(k)$ of (1), the first difference of the scalar function

\[ \|x(k)\|^2 = x^T(k) x(k) \tag{2} \]

can be written as

\[ \|x(k+1)\|^2 - \|x(k)\|^2 = x^T(k) [\, A^T(k) A(k) - I \,] x(k) \tag{3} \]

In this computation $x(k+1)$ is replaced by $A(k)x(k)$ precisely because $x(k)$ is a solution of (1). Suppose that the quadratic form on the right side of (3) is negative definite, that is, suppose the matrix $A^T(k)A(k) - I$ is negative definite at each $k$. (See the review of quadratic forms and sign definiteness in Chapter 1.) Then $\|x(k)\|^2$ decreases as $k$ increases. It can be shown that if this negative definiteness does not asymptotically vanish, that is, if there is a $\nu > 0$ such that $x^T(k)[A^T(k)A(k) - I]x(k) \le -\nu\, x^T(k)x(k)$ for all $k$, then $\|x(k)\|^2$ decreases to zero as $k \to \infty$.
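The first-difference identity for the squared norm is easy to confirm on a numerical instance. The following sketch uses a hypothetical $2 \times 2$ matrix and state vector; the helper names are illustrative and nothing here is specific to the text beyond the identity itself.

```python
def mat_vec(a, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def quad_form(m, x):
    """Quadratic form x^T M x."""
    return sum(x[i] * m[i][j] * x[j] for i in range(len(x)) for j in range(len(x)))

A = [[0.5, 0.2], [-0.1, 0.4]]   # hypothetical value of A(k) at one time step
x = [1.0, -2.0]                 # hypothetical value of x(k)
n = len(A)

# A^T A - I, formed entry by entry.
AtA_minus_I = [[sum(A[m][i] * A[m][j] for m in range(n)) - (1.0 if i == j else 0.0)
                for j in range(n)] for i in range(n)]

Ax = mat_vec(A, x)
lhs = sum(v * v for v in Ax) - sum(v * v for v in x)   # ||x(k+1)||^2 - ||x(k)||^2
rhs = quad_form(AtA_minus_I, x)                        # x^T [A^T A - I] x

print(lhs, rhs)
```

Both sides agree up to rounding and are negative for this choice, so the squared norm decreases over this step.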


Notice that the transition matrix for A (k) is not needed in this calculation, and
growth properties of the scalar function (2) depend on sign-definiteness properties of the

quadratic form in (3). Although this particular calculation results in a restrictive


sufficient condition for a type of asymptotic stability, more general scalar functions than
(2) can be considered.

Formalization of this introductory discussion involves definitions of time-dependent quadratic forms that are useful as scalar functions of the state vector of (1) for stability purposes. Such quadratic forms are called quadratic Lyapunov functions. They can be written as $x^T Q(k) x$, where $Q(k)$ is assumed to be symmetric for all $k$. If $x(k)$ is a solution of (1) for $k \ge k_0$, then we are interested in the increase or decrease of $x^T(k) Q(k) x(k)$ for $k \ge k_0$. This behavior can be assessed from the difference

\[ x^T(k+1) Q(k+1) x(k+1) - x^T(k) Q(k) x(k) \]

Replacing $x(k+1)$ by $A(k)x(k)$ gives

\[ x^T(k+1) Q(k+1) x(k+1) - x^T(k) Q(k) x(k) = x^T(k) [\, A^T(k) Q(k+1) A(k) - Q(k) \,] x(k) \tag{4} \]

To analyze stability properties, various bounds are required on a quadratic Lyapunov function and on the quadratic form (4) that arises as the first difference along solutions of (1). These bounds can be expressed in a variety of ways. For example the condition that there exists a positive constant $\eta$ such that

\[ Q(k) \ge \eta I \tag{5} \]

for all $k$ is equivalent by definition to existence of a positive $\eta$ such that

\[ x^T Q(k) x \ge \eta \|x\|^2 \]

for all $k$ and all $n \times 1$ vectors $x$. Yet another way to write this is to require that there exists a symmetric, positive-definite, constant matrix $M$ such that

\[ x^T Q(k) x \ge x^T M x \]

for all $k$ and all $n \times 1$ vectors $x$. The choice is largely a matter of taste, and the economical sign-definite-inequality notation in (5) is used here.

Uniform Stability
We first consider the property of uniform stability, where solutions are not required to
inevitably approach zero.

23.1 Theorem The linear state equation (1) is uniformly stable if there exists an n x n
matrix sequence Q(k) that for all k is symmetric and such that

$$\eta I \le Q(k) \le \rho I \tag{6}$$

$$A^T(k)Q(k+1)A(k) - Q(k) \le 0 \tag{7}$$

where $\eta$ and $\rho$ are finite positive constants.


Proof Suppose Q(k) satisfies the stated requirements. Given any $k_0$ and $x_0$, the
corresponding solution x(k) of (1) is such that, using a telescoping sum and (7),

$$x^T(k)Q(k)x(k) - x_0^TQ(k_0)x_0 = \sum_{j=k_0}^{k-1} \left[x^T(j+1)Q(j+1)x(j+1) - x^T(j)Q(j)x(j)\right]$$

$$= \sum_{j=k_0}^{k-1} x^T(j)\left[A^T(j)Q(j+1)A(j) - Q(j)\right]x(j) \le 0, \quad k \ge k_0+1$$

From this and the inequalities in (6), we obtain first

$$x^T(k)Q(k)x(k) \le x_0^TQ(k_0)x_0 \le \rho \|x_0\|^2, \quad k \ge k_0$$

and then

$$\eta \|x(k)\|^2 \le \rho \|x_0\|^2, \quad k \ge k_0$$

Therefore

$$\|x(k)\| \le \sqrt{\rho/\eta}\, \|x_0\|, \quad k \ge k_0 \tag{8}$$

Since (8) holds for any $x_0$ and $k_0$, the state equation (1) is uniformly stable by
Definition 22.1.

□□□
A quadratic Lyapunov function that proves uniform stability for a given linear
state equation can be quite complicated to construct. Simple forms typically are chosen
for Q (k), at least in the initial stages of attempting to prove uniform stability of a

particular state equation, and the form is modified in the course of addressing the
conditions (6) and (7). Often it is profitable to consider a family of linear state equations
rather than a particular instance.
23.2 Example Consider a linear state equation of the form

$$x(k+1) = \begin{bmatrix} 0 & 1 \\ a(k) & 0 \end{bmatrix} x(k) \tag{9}$$

where a(k) is a scalar sequence defined for all k. We will choose Q(k) = I, so that
$x^T(k)Q(k)x(k) = x^T(k)x(k) = \|x(k)\|^2$. Then (6) is satisfied by $\eta = \rho = 1$, and

$$A^T(k)Q(k+1)A(k) - Q(k) = A^T(k)A(k) - I = \begin{bmatrix} a^2(k) - 1 & 0 \\ 0 & 0 \end{bmatrix}$$

Applying the negative-semidefiniteness criterion in Theorem 1.4, given more explicitly
for the 2 x 2 case in Example 1.5, would be technical hubris in this obvious case. Clearly
if $|a(k)| \le 1$ for all k, then the hypotheses in Theorem 23.1 are satisfied. Therefore we
have proved (9) is uniformly stable if |a(k)| is bounded by unity for all k. A more
sophisticated choice of Q(k), namely one that depends appropriately on a(k), might
yield uniform stability under weaker conditions on a(k).
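
The conclusion of Example 23.2 can be checked numerically. The following is a minimal sketch, assuming NumPy and an arbitrary randomly generated sequence a(k) with |a(k)| at most 1 (the sequence, the seed, and the initial state are illustrative choices, not part of the example). With Q(k) = I the Lyapunov function is the squared norm of the state, so the norm should never increase along the solution.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(-1.0, 1.0, size=200)      # illustrative sequence with |a(k)| <= 1
x0 = np.array([3.0, -4.0])                # illustrative initial state

x = x0.copy()
norms = [np.linalg.norm(x0)]
for ak in a:
    A_k = np.array([[0.0, 1.0], [ak, 0.0]])   # A(k) from equation (9)
    x = A_k @ x
    norms.append(np.linalg.norm(x))

# With Q(k) = I, ||x(k)||^2 is nonincreasing, so ||x(k)|| <= ||x0|| for all k.
nonincreasing = all(n1 <= n0 + 1e-12 for n0, n1 in zip(norms, norms[1:]))
```

A simulation of this kind only illustrates the bound for one sequence; the proof of Theorem 23.1 is what covers every admissible a(k).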

Uniform Exponential Stability


Theorem 23.1 does not suffice for uniform exponential stability. In Example 23.2 the
choice Q (k) = I proves that (9) with constant a (k) = 1 is uniformly stable, but
Example 21.1 shows this case is not exponentially stable. The needed strengthening of
conditions appears slight at first glance, but this is deceptive. For example Theorem 23.3
with Q (k) = I fails to apply in Example 23.2 for any choice of a (k).
It is traditional to present Lyapunov stability criteria as sufficient conditions based
on assumed existence of a Lyapunov function satisfying certain requirements. Necessity
results are stated separately as 'converse theorems' typically requiring additional
hypotheses on the state equation. However for the discrete-time case at hand no
additional hypotheses are needed, and we abandon tradition to present a Lyapunov
criterion that is both necessary and sufficient.

23.3 Theorem The linear state equation (1) is uniformly exponentially stable if and
only if there exists an n x n matrix sequence Q(k) that for all k is symmetric and such
that

$$\eta I \le Q(k) \le \rho I \tag{10}$$

$$A^T(k)Q(k+1)A(k) - Q(k) \le -\nu I \tag{11}$$

where $\eta$, $\rho$ and $\nu$ are finite positive constants.

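Proof Suppose Q(k) is such that the conditions of the theorem are satisfied. For
any $k_0$, $x_0$ and corresponding solution x(k) of the linear state equation, (11) gives, by
definition of the matrix-inequality notation,

$$x^T(k+1)Q(k+1)x(k+1) - x^T(k)Q(k)x(k) \le -\nu \|x(k)\|^2, \quad k \ge k_0$$

From (10),

$$x^T(k)Q(k)x(k) \le \rho \|x(k)\|^2, \quad k \ge k_0$$

so that

$$-\|x(k)\|^2 \le -\frac{1}{\rho}\, x^T(k)Q(k)x(k), \quad k \ge k_0$$

Therefore

$$x^T(k+1)Q(k+1)x(k+1) - x^T(k)Q(k)x(k) \le -\frac{\nu}{\rho}\, x^T(k)Q(k)x(k), \quad k \ge k_0$$

and this implies

$$x^T(k+1)Q(k+1)x(k+1) \le \left(1 - \frac{\nu}{\rho}\right) x^T(k)Q(k)x(k), \quad k \ge k_0 \tag{12}$$

It is easily argued from (10) and (11) that $\rho \ge \nu$, so

$$0 \le 1 - \frac{\nu}{\rho} < 1$$

Setting $\lambda^2 = 1 - \nu/\rho$ and iterating (12) for $k \ge k_0$ gives

$$x^T(k)Q(k)x(k) \le \lambda^{2(k-k_0)}\, x_0^TQ(k_0)x_0, \quad k \ge k_0$$

Using (10) again we obtain

$$\eta \|x(k)\|^2 \le \lambda^{2(k-k_0)} \rho \|x_0\|^2, \quad k \ge k_0 \tag{13}$$

Note that (13) holds for any $x_0$ and $k_0$. Therefore dividing through by $\eta$ and taking the
positive square root of both sides establishes uniform exponential stability.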

Now suppose that (1) is uniformly exponentially stable. Then there exist $\gamma > 0$
and $0 \le \lambda < 1$ such that, purposefully reversing the customary index ordering,

$$\|\Phi(j,k)\| \le \gamma \lambda^{j-k}$$

for all j, k such that $j \ge k$. We proceed to show that

$$Q(k) = \sum_{j=k}^{\infty} \Phi^T(j,k)\Phi(j,k) \tag{14}$$

satisfies all the conditions in the theorem. First compute the bound (using $\lambda^2 < 1$)

$$\sum_{j=k}^{\infty} \|\Phi^T(j,k)\Phi(j,k)\| \le \sum_{j=k}^{\infty} \gamma^2 \lambda^{2(j-k)} = \gamma^2 \sum_{q=0}^{\infty} \lambda^{2q} = \frac{\gamma^2}{1-\lambda^2}$$

that holds for all k. This shows convergence of the infinite series in (14), so Q(k) is
well defined, and also supplies a value for the constant $\rho$ in (10). Clearly Q(k) in (14)
is symmetric for all k, and the remaining conditions involve the constants $\eta$ in (10) and
$\nu$ in (11).
Writing (14) as

$$Q(k) = I + \sum_{j=k+1}^{\infty} \Phi^T(j,k)\Phi(j,k)$$

it is clear that $Q(k) \ge I$ for all k, so we let $\eta = 1$. To define a suitable $\nu$, first use
Property 20.10 to obtain

$$A^T(k)Q(k+1)A(k) = \sum_{j=k+1}^{\infty} \left[\Phi(j,k+1)A(k)\right]^T \Phi(j,k+1)A(k) = \sum_{j=k+1}^{\infty} \Phi^T(j,k)\Phi(j,k)$$

Therefore Q(k) in (14) is such that

$$A^T(k)Q(k+1)A(k) - Q(k) = -I$$

and we let $\nu = 1$ to complete the proof.

□□□
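
The construction (14) used in the necessity argument can be tried numerically. The sketch below assumes NumPy and a hypothetical time-varying A(k), a rotation scaled by 0.5, chosen only because it is obviously uniformly exponentially stable; the infinite series is truncated after a fixed number of terms, which is an approximation introduced here.

```python
import numpy as np

def A(k):
    # Hypothetical example: rotation by 0.3*k radians scaled by 0.5, so ||A(k)|| = 0.5.
    c, s = np.cos(0.3 * k), np.sin(0.3 * k)
    return 0.5 * np.array([[c, -s], [s, c]])

def Q(k, terms=60):
    # Truncation of (14): sum over j = k, ..., k+terms-1 of Phi(j,k)^T Phi(j,k).
    phi = np.eye(2)                  # Phi(k, k) = I
    total = np.zeros((2, 2))
    for j in range(k, k + terms):
        total += phi.T @ phi
        phi = A(j) @ phi             # Phi(j+1, k) = A(j) Phi(j, k)
    return total

# For the exact series, A^T(k) Q(k+1) A(k) - Q(k) = -I for every k.
residuals = [A(k).T @ Q(k + 1) @ A(k) - Q(k) for k in range(5)]
min_eig = min(np.linalg.eigvalsh(Q(k)).min() for k in range(5))
```

Up to truncation error the residuals equal the negative identity, and the smallest eigenvalue of Q(k) is at least 1, consistent with the choices $\eta = 1$ and $\nu = 1$ in the proof.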
For n = 2 and constant Q(k) = Q, the sufficiency portion of Theorem 23.3 admits
a simple pictorial representation. The condition (10) implies that Q is positive definite,
and therefore the level curves of the real-valued function $x^TQx$ are ellipses in the
$(x_1, x_2)$-plane. The condition (11) implies that for any solution x(k) of the state
equation, the value of $x^T(k)Qx(k)$ is decreasing as k increases. Thus a plot of the
solution x(k) on the $(x_1, x_2)$-plane crosses smaller-value level curves as k increases, as
shown in Figure 23.4. Under the same assumptions a similar pictorial interpretation can
be given for Theorem 23.1. Note that if Q(k) is not constant, then the level curves vary
with k and the picture is much less informative.

23.4 Figure A solution x(k) in relation to level curves for $x^TQx$.

When applying Theorem 23.3 to a particular state equation, we look for a Q(k)
that satisfies (10) and (11), and we invoke the sufficiency portion of the theorem. The
necessity portion provides only the comforting thought that a suitably diligent search
will succeed if in fact the state equation is uniformly exponentially stable.

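23.5 Example Consider again the linear state equation

$$x(k+1) = \begin{bmatrix} 0 & 1 \\ a(k) & 0 \end{bmatrix} x(k)$$

discussed in Example 23.2. The choice

$$Q(k) = \begin{bmatrix} 1 & 0 \\ 0 & 1/|a(k-1)| \end{bmatrix}$$

gives

$$A^T(k)Q(k+1)A(k) - Q(k) = \begin{bmatrix} |a(k)| - 1 & 0 \\ 0 & 1 - 1/|a(k-1)| \end{bmatrix}$$

To address the requirements in Theorem 23.3, suppose there exist constants $\alpha_1$ and $\alpha_2$
such that, for all k,

$$0 < \alpha_1 \le |a(k)| \le \alpha_2 < 1 \tag{17}$$

Then

$$I \le Q(k) \le \frac{1}{\alpha_1} I$$

and

$$A^T(k)Q(k+1)A(k) - Q(k) \le \begin{bmatrix} \alpha_2 - 1 & 0 \\ 0 & \dfrac{\alpha_2 - 1}{\alpha_2} \end{bmatrix}$$

Since

$$\frac{\alpha_2 - 1}{\alpha_2} \le \alpha_2 - 1 < 0$$

we have shown that the state equation is uniformly exponentially stable under the
condition (17).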
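
The diagonal choice of Q(k) in Example 23.5 is easy to test numerically. The following is a minimal sketch, assuming NumPy, illustrative constants alpha1 = 0.2 and alpha2 = 0.9, and a randomly generated a(k) whose magnitude lies between them; none of these values come from the example itself.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha1, alpha2 = 0.2, 0.9                 # assumed constants, 0 < alpha1 <= alpha2 < 1
a = rng.uniform(alpha1, alpha2, size=50) * rng.choice([-1.0, 1.0], size=50)

def A(k):
    return np.array([[0.0, 1.0], [a[k], 0.0]])

def Q(k):
    # Q(k) = diag(1, 1/|a(k-1)|), the choice made in Example 23.5.
    return np.diag([1.0, 1.0 / abs(a[k - 1])])

# Largest eigenvalue of A^T(k) Q(k+1) A(k) - Q(k) over the simulated range of k.
worst = max(
    np.linalg.eigvalsh(A(k).T @ Q(k + 1) @ A(k) - Q(k)).max()
    for k in range(1, 49)
)
```

Here `worst` should not exceed alpha2 - 1, so the first difference is bounded above by a negative-definite constant matrix, as Theorem 23.3 requires.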

Instability
Quadratic Lyapunov functions also can be used to develop instability criteria of various
types. These are useful, for example, in cases where a Q(k) for stability is proving
elusive and the possibility of instability begins to emerge. The following result is a
criterion that, except for one value of k, does not involve a sign-definiteness assumption
on Q(k).

23.6 Theorem Suppose there exists an n x n matrix sequence Q(k) that for all k is
symmetric and such that

$$\|Q(k)\| \le \rho \tag{18}$$

$$A^T(k)Q(k+1)A(k) - Q(k) \le -\nu I \tag{19}$$

where $\rho$ and $\nu$ are finite positive constants. Also suppose there exists an integer $k_a$
such that $Q(k_a)$ is not positive semidefinite. Then the linear state equation (1) is not
uniformly stable.

Proof Suppose x(k) is the solution of (1) with $k_0 = k_a$ and $x_0 = x_a$ such that
$x_a^TQ(k_a)x_a < 0$. Then, from (19),

$$x^T(k)Q(k)x(k) - x_0^TQ(k_0)x_0 = \sum_{j=k_0}^{k-1} \left[x^T(j+1)Q(j+1)x(j+1) - x^T(j)Q(j)x(j)\right]$$

$$= \sum_{j=k_0}^{k-1} x^T(j)\left[A^T(j)Q(j+1)A(j) - Q(j)\right]x(j) \le -\nu \sum_{j=k_0}^{k-1} x^T(j)x(j), \quad k \ge k_0+1 \tag{20}$$

One consequence of this inequality is

$$x^T(k)Q(k)x(k) \le x_0^TQ(k_0)x_0 < 0, \quad k \ge k_0+1$$

In conjunction with (18) and Exercise 1.9, this gives

$$-\rho \|x(k)\|^2 \le x^T(k)Q(k)x(k) \le x_0^TQ(k_0)x_0 < 0, \quad k \ge k_0+1$$

that is,

$$\|x(k)\|^2 \ge \frac{1}{\rho}\, |x_0^TQ(k_0)x_0| > 0, \quad k \ge k_0+1 \tag{21}$$

Also from (20) we can write

$$\nu \sum_{j=k_0}^{k-1} x^T(j)x(j) \le x_0^TQ(k_0)x_0 - x^T(k)Q(k)x(k) \le |x_0^TQ(k_0)x_0| + |x^T(k)Q(k)x(k)| \le 2\,|x^T(k)Q(k)x(k)|, \quad k \ge k_0+1$$

This implies, from (18),

$$\sum_{j=k_0}^{k-1} \|x(j)\|^2 \le \frac{2\rho}{\nu}\, \|x(k)\|^2, \quad k \ge k_0+1 \tag{22}$$

From this point we complete the proof by showing that x(k) is unbounded and
noting that existence of an unbounded solution clearly implies the state equation is not
uniformly stable. Setting up a contradiction argument, suppose there exists a finite $\gamma$
such that $\|x(k)\| \le \gamma$ for all $k \ge k_0$. Then (22) gives

$$\sum_{j=k_0}^{k-1} \|x(j)\|^2 \le \frac{2\rho\gamma^2}{\nu}, \quad k \ge k_0+1$$

But this implies that $\|x(k)\|$ goes to zero as k increases, an implication that
contradicts (21). This contradiction shows that the state-equation solution x(k) cannot
be bounded.

Time-Invariant Case
For a time-invariant linear state equation, we can consider quadratic Lyapunov functions

with constant Q(k) = Q and connect Theorem 23.3 on exponential stability to the
magnitude-less-than-unity eigenvalue condition in Theorem 22.11. Indeed we state
matters in a slightly more general way in order to convey an existence result for
solutions to a well-known matrix equation.

23.7 Theorem Given an n x n matrix A, if there exist symmetric, positive-definite,
n x n matrices M and Q satisfying the discrete-time Lyapunov equation

$$A^TQA - Q = -M \tag{23}$$

then all eigenvalues of A have magnitude (strictly) less than unity. On the other hand if
all eigenvalues of A have magnitude less than unity, then for each symmetric n x n
matrix M there exists a unique solution of (23) given by

$$Q = \sum_{k=0}^{\infty} (A^T)^k M A^k \tag{24}$$

Furthermore if M is positive definite, then Q is positive definite.

Proof If M and Q are symmetric, positive-definite matrices satisfying (23), then
the eigenvalue condition follows from a concatenation of Theorem 23.3 and Theorem
22.11.

For the converse we first note that the eigenvalue condition on A implies
exponential stability, which implies there exist $\gamma > 0$ and $0 \le \lambda < 1$ such that

$$\|A^k\| = \|(A^T)^k\| \le \gamma \lambda^k, \quad k \ge 0$$

Therefore

$$\sum_{k=0}^{\infty} \|(A^T)^k M A^k\| \le \sum_{k=0}^{\infty} \|(A^T)^k\|\, \|M\|\, \|A^k\| \le \frac{\gamma^2 \|M\|}{1-\lambda^2}$$

and Q in (24) is well defined. To show it is a solution of (23), we substitute to find, by
use of a summation-index change,

$$A^TQA - Q = \sum_{k=0}^{\infty} (A^T)^{k+1} M A^{k+1} - \sum_{k=0}^{\infty} (A^T)^k M A^k = \sum_{j=1}^{\infty} (A^T)^j M A^j - \sum_{k=0}^{\infty} (A^T)^k M A^k = -M \tag{25}$$

To show Q in (24) is the unique solution of (23), suppose $\tilde{Q}$ is any solution of (23).
Then rewrite Q to obtain, much as in (25),

$$Q = \sum_{k=0}^{\infty} (A^T)^k \left[-A^T\tilde{Q}A + \tilde{Q}\right] A^k = -\sum_{k=0}^{\infty} (A^T)^{k+1} \tilde{Q} A^{k+1} + \sum_{k=0}^{\infty} (A^T)^k \tilde{Q} A^k = \tilde{Q}$$

That is, any solution of (23) must be equal to the Q given in (24). Finally, since the
k = 0 term in (24) is M itself, it is obvious that M > 0 implies Q > 0.

□□□
We can rephrase Theorem 23.7 somewhat more directly as a stability criterion:
The time-invariant linear state equation x(k+1) = Ax(k) is exponentially stable if and
only if there exists a symmetric, positive-definite matrix Q such that $A^TQA - Q$ is
negative definite. Though not often applied to test stability of a given state equation,
Theorem 23.7 and its generalizations play an important role in further theoretical
developments, especially in linear control theory.
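
The series solution (24) of the discrete-time Lyapunov equation can be evaluated directly. The following is a minimal sketch, assuming NumPy, a hypothetical 2 x 2 matrix A with eigenvalues 0.5 and -0.3, and M = I; the function name `lyapunov_series` and the truncation length are choices made here for illustration.

```python
import numpy as np

def lyapunov_series(A, M, terms=500):
    """Truncated version of (24): sum of (A^T)^k M A^k for k = 0, ..., terms-1."""
    Q = np.zeros_like(M, dtype=float)
    Ak = np.eye(A.shape[0])          # holds A^k
    for _ in range(terms):
        Q += Ak.T @ M @ Ak           # (A^k)^T M A^k = (A^T)^k M A^k
        Ak = A @ Ak
    return Q

A = np.array([[0.5, 1.0], [0.0, -0.3]])   # hypothetical example, eigenvalues 0.5 and -0.3
M = np.eye(2)
Q = lyapunov_series(A, M)

residual = A.T @ Q @ A - Q + M            # should vanish if Q solves (23)
is_positive_definite = np.linalg.eigvalsh(Q).min() > 0
```

Because the eigenvalues of this A have magnitude less than unity, the truncated series converges rapidly, the residual is negligible, and Q is positive definite, in agreement with Theorem 23.7.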

EXERCISES
Using a constant Q that is a scalar multiple of the identity, what are the weakest
conditions on a (k) and a,(k) under which you can prove uniform exponential stability for the
Exercise 23.1

linear state equation

x(k+l)=

a1(k)

a constant, diagonal Q show uniform exponential stability under weaker conditions?

Exercise 23.2 Suppose the n x n matrix A is such that $A^TA < I$. Use a simple Q to show that the
time-invariant linear state equation

$$x(k+1) = FAx(k)$$

is exponentially stable for any n x n matrix F that satisfies $F^TF \le I$.

Exercise 23.3 Revisit Example 23.5 and establish uniform exponential stability under weaker
conditions on A(k) by using the Q(k) suggested in the proof of Theorem 23.3.

Exercise 23.4 Using the Q (k) suggested in the proof of Theorem 23.3, establish conditions on
a (k) and a2(k) such that
aI(k)](k)

x(k+l)=
is

uniformly exponentially stable. Hint: See Exercise 20.10.

Exercise 23.5 For the linear state equation

x(k+l)=

a small positive constant, to derive conditions that guarantee uniform exponential


stability. Are there cases with constant a0 and a where your conditions are violated but the state
equation is uniformly exponentially stable?

Exercise 23.6

Use Theorem 23.7 to derive a necessary and sufficient condition on a0 for

exponential stability of the time-invariant linear state equation

x(k+1)=

0
11
a0

x(k)

Exercise 23.7 Show that the time-invariant linear state equation


0
0

0...

0
0

x(k+1)=

x(k)
0

a0 a
is

exponentially stable if

a,,_1

a,,_j]II

II

Hint: Try a diagonal Q with nice integer entries.

Exercise 23.8 Using a diagonal Q (k) establish conditions on the scalar sequence a (k) such that
the linear state equation

x(k+l)=
is uniformly

112]x(k)

exponentially stable. Does your result say anything about the case a (k) =

Exercise 23.9 Given an n x n matrix A, show that if there exist symmetric, positive-definite,
n x n matrices M and Q satisfying

$$A^TQA - \rho^2 Q = -\rho^2 M$$

with $\rho > 0$, then the eigenvalues of A satisfy $|\lambda| < \rho$. Conversely show that if this eigenvalue
condition is satisfied, then given a symmetric, n x n matrix M there exists a unique solution Q.

Exercise 23.10 Given an n x n matrix A, suppose Q and M are symmetric, positive-semidefinite,
n x n matrices satisfying

$$A^TQA - Q = -M$$

Suppose also that for any n x 1 vector z,

$$z^T(A^T)^k M A^k z = 0, \quad k \ge 0$$

implies

$$\lim_{k \to \infty} A^k z = 0$$

Show that every eigenvalue of A has magnitude less than unity.

Exercise 23.11 Given the linear state equation x(k+1) = A(k)x(k), suppose there exists a real
function v(k, x) that satisfies the following conditions.
(i) There exist continuous, strictly-increasing real functions $\alpha(\cdot)$ and $\beta(\cdot)$ such that $\alpha(0) = \beta(0) = 0$,
and

$$\alpha(\|x\|) \le v(k, x) \le \beta(\|x\|)$$

for all k and x.
(ii) For any $k_0$, $x_0$ and corresponding solution x(k) of the state equation, the sequence v(k, x(k)) is
nonincreasing for $k \ge k_0$.
Prove the state equation is uniformly stable. (This shows that attention need not be restricted to
quadratic Lyapunov functions.) Hint: Use the characterization of uniform stability in Exercise
22.1.

Exercise 23.12 If the linear state equation x (k + 1) = A (k)x (k) is uniformly stable, prove that
there exists a function v (k, x) that has the properties listed in Exercise 23.11. (Since the converse

of Theorem 23.1 seems not to hold, this exercise illustrates an advantage of non-quadratic
Lyapunov functions.) Hint: Let
v(k,x)=sup

k)xH

NOTES
Note 23.1 A standard reference for the material in this chapter is the early paper

R.E. Kalman, J.E. Bertram, "Control system analysis and design via the 'Second Method' of
Lyapunov, Part II, Discrete-Time Systems," Transactions of the ASME, Series D: Journal of Basic
Engineering, Vol. 82, pp. 394-400, 1960

Note 23.2 The conditions for uniform exponential stability in Theorem 23.3 can be weakened in
various ways. Some more-general criteria involve concepts such as reachability and observability
discussed in Chapter 25. But the most general results involve the concepts of stabilizability and
detectability that in these pages are encountered only occasionally, and then mainly for the time-invariant case. Exercise 23.10 provides a look at more general results for the time-invariant case,
as do certain exercises in Chapter 25. See Section 4 of

B.D.O. Anderson, J.B. Moore, "Detectability and stabilizability of time-varying discrete-time
linear systems," SIAM Journal on Control and Optimization, Vol. 19, No. 1, pp. 20-32, 1981

for a result that relates stability of time-varying state equations to existence of a time-varying
solution to a 'time-varying, discrete-time Lyapunov equation.'
Note 23.3 What we have called the discrete-time Lyapunov equation is sometimes called the
Stein equation in recognition of the paper

P. Stein, "Some general theorems on iterants," Journal of Research of the National Bureau of
Standards, Vol. 48, No. 1, pp. 82-83, 1952

24
DISCRETE TIME
ADDITIONAL STABILITY CRITERIA

There are several types of criteria for stability properties of the linear state equation

$$x(k+1) = A(k)x(k), \quad x(k_0) = x_0 \tag{1}$$

in addition to those considered in Chapter 23. The additional criteria make use of
various mathematical tools, sometimes in combination with the Lyapunov results. We
discuss sufficient conditions that are based on the Rayleigh-Ritz inequality, and results
that indicate the types of state-equation perturbations that preserve stability properties.
Also we present an eigenvalue condition for uniform exponential stability that applies
when A(k) is 'slowly varying.'

Eigenvalue Conditions
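At first it might be thought that the pointwise-in-time eigenvalues of A(k) can be used
to characterize internal stability properties of (1), but this is not generally true.

24.1 Example For the linear state equation (1) with

$$A(k) = \begin{cases} \begin{bmatrix} 0 & 2 \\ 1/4 & 0 \end{bmatrix}, & k \text{ even} \\[2ex] \begin{bmatrix} 0 & 1/4 \\ 2 & 0 \end{bmatrix}, & k \text{ odd} \end{cases}$$

the pointwise eigenvalues are constants, given by $\lambda = \pm 1/\sqrt{2}$. But this does not imply
any stability property, for another easy calculation gives

$$\Phi(k, 0) = \begin{cases} \begin{bmatrix} 2^{-2k} & 0 \\ 0 & 2^{k} \end{bmatrix}, & k \text{ even} \\[2ex] \begin{bmatrix} 0 & 2^{k} \\ 2^{-2k} & 0 \end{bmatrix}, & k \text{ odd} \end{cases}$$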
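
The point of Example 24.1 can be seen in a short computation. The sketch below assumes NumPy and takes A(k) to be the two alternating matrices with off-diagonal entries 2 and 1/4 (the arrangement consistent with constant pointwise eigenvalues); the helper `transition` is a name introduced here.

```python
import numpy as np

A_even = np.array([[0.0, 2.0], [0.25, 0.0]])
A_odd = np.array([[0.0, 0.25], [2.0, 0.0]])

def A(k):
    return A_even if k % 2 == 0 else A_odd

def transition(k, k0=0):
    # Phi(k, k0) = A(k-1) A(k-2) ... A(k0), with Phi(k0, k0) = I.
    phi = np.eye(2)
    for j in range(k0, k):
        phi = A(j) @ phi
    return phi

# Pointwise spectral radii stay below one ...
spectral_radii = [max(abs(np.linalg.eigvals(A(k)))) for k in range(2)]
# ... yet the transition matrix norm grows like 2^k.
norms = [np.linalg.norm(transition(k), 2) for k in range(0, 11, 2)]
```

Both pointwise spectral radii equal $1/\sqrt{2}$, while the spectral norm of the transition matrix doubles with every step, so some solutions are unbounded.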

Despite such examples we next show that stability properties can be related to the
pointwise eigenvalues of $A^T(k)A(k)$, in particular to the largest and smallest eigenvalues
of this symmetric, positive-semidefinite matrix sequence. Then at the end of the chapter
we show that the familiar magnitude-less-than-unity condition applied to the pointwise
eigenvalues of A(k) implies uniform exponential stability if A(k) is sufficiently slowly
varying in a specific sense. (Beware the potential eigenvalue confusion.)

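24.2 Theorem For the linear state equation (1), denote the largest and smallest
pointwise eigenvalues of $A^T(k)A(k)$ by $\lambda_{\max}(k)$ and $\lambda_{\min}(k)$. Then for any $x_0$ and $k_0$ the
solution of (1) satisfies

$$\|x_0\| \prod_{j=k_0}^{k-1} \sqrt{\lambda_{\min}(j)} \le \|x(k)\| \le \|x_0\| \prod_{j=k_0}^{k-1} \sqrt{\lambda_{\max}(j)}, \quad k \ge k_0 \tag{4}$$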

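Proof For any n x 1 vector z and any k, the Rayleigh-Ritz inequality gives

$$z^Tz\, \lambda_{\min}(k) \le z^TA^T(k)A(k)z \le z^Tz\, \lambda_{\max}(k)$$

Suppose x(k) is a solution of (1) corresponding to a given $k_0$ and nonzero $x_0$. Then we
can write

$$\|x(k)\|^2 \lambda_{\min}(k) \le \|x(k+1)\|^2 \le \|x(k)\|^2 \lambda_{\max}(k), \quad k \ge k_0$$

Applying this inequality successively for $k = k_0, k_0+1, \ldots, k_0+j-1$ gives

$$\|x_0\|^2 \prod_{i=k_0}^{k_0+j-1} \lambda_{\min}(i) \le \|x(k_0+j)\|^2 \le \|x_0\|^2 \prod_{i=k_0}^{k_0+j-1} \lambda_{\max}(i)$$

Taking the square root, adjusting notation, and using the empty-product-is-unity
convention to include the $k = k_0$ case, we obtain (4).

□□□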
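
The two-sided bound (4) is simple to check on data. The following is a minimal sketch, assuming NumPy and an arbitrary randomly generated sequence of 3 x 3 matrices standing in for A(k); the relative tolerance only absorbs floating-point rounding.

```python
import numpy as np

rng = np.random.default_rng(1)
A_seq = [rng.normal(size=(3, 3)) for _ in range(8)]   # illustrative A(k0), ..., A(k0+7)
x = rng.normal(size=3)                                # illustrative x0

lower = upper = np.linalg.norm(x)     # empty product is unity at k = k0
bounds_hold = True
for A_k in A_seq:
    eigs = np.linalg.eigvalsh(A_k.T @ A_k)            # ascending order
    lower *= np.sqrt(max(eigs[0], 0.0))               # running product of sqrt(lambda_min)
    upper *= np.sqrt(eigs[-1])                        # running product of sqrt(lambda_max)
    x = A_k @ x
    nx = np.linalg.norm(x)
    bounds_hold = bounds_hold and bool(lower * (1 - 1e-9) <= nx <= upper * (1 + 1e-9))
```

For generic matrices the gap between the two products widens quickly with k, which hints at why conditions built on (4) tend to be conservative.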
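By choosing, for each k such that $k \ge k_0$, $x_0$ as a unity-norm vector such that
$\|\Phi(k, k_0)x_0\| = \|\Phi(k, k_0)\|$, we obtain

$$\prod_{j=k_0}^{k-1} \sqrt{\lambda_{\min}(j)} \le \|\Phi(k, k_0)\| \le \prod_{j=k_0}^{k-1} \sqrt{\lambda_{\max}(j)}, \quad k \ge k_0$$
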

This inequality immediately supplies proofs of the following sufficient conditions.

24.3 Corollary The linear state equation (1) is uniformly stable if there exists a finite
constant $\gamma$ such that the largest pointwise eigenvalue of $A^T(k)A(k)$ satisfies

$$\prod_{i=j}^{k} \lambda_{\max}(i) \le \gamma \tag{6}$$

for all k, j such that $k \ge j$.

24.4 Corollary The linear state equation (1) is uniformly exponentially stable if there
exist a finite constant $\gamma$ and a constant $0 \le \lambda < 1$ such that the largest pointwise
eigenvalue of $A^T(k)A(k)$ satisfies

$$\prod_{i=j}^{k} \lambda_{\max}(i) \le \gamma \lambda^{k-j} \tag{7}$$

for all k, j such that $k \ge j$.

These sufficient conditions are quite conservative in the sense that many uniformly
stable or uniformly exponentially stable linear state equations do not satisfy the
respective conditions (6) and (7). See Exercises 24.1 and 24.2.

Perturbation Results
Another approach to obtaining stability criteria is to consider state equations that are
close, in some specific sense, to a linear state equation that possesses a known stability
property. This can be particularly useful when a time-varying linear state equation is
close to a time-invariant linear state equation. While explicit, tight bounds sometimes
are of interest, the focus here is on simple calculations that establish the desired
property. We begin with a Gronwall-Bellman type of inequality (see Note 3.4) for
sequences. Again the empty product convention is employed.

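24.5 Lemma Suppose the scalar sequences v(k) and $\phi(k)$ are such that $v(k) \ge 0$ for
$k \ge k_0$, and

$$\phi(k) \le \begin{cases} \psi, & k = k_0 \\ \psi + \eta \displaystyle\sum_{j=k_0}^{k-1} v(j)\phi(j), & k \ge k_0+1 \end{cases} \tag{8}$$

where $\psi$ and $\eta$ are constants with $\eta \ge 0$. Then

$$\phi(k) \le \psi \prod_{j=k_0}^{k-1} \left[1 + \eta v(j)\right] \le \psi \exp\left[\eta \sum_{j=k_0}^{k-1} v(j)\right], \quad k \ge k_0+1 \tag{9}$$
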

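Proof Concentrating on the first inequality in (9), and inspired by the obvious
$k = k_0+1$ case, we set up an induction proof by assuming that $K \ge k_0+1$ is an integer
such that the inequality (8) implies

$$\phi(k) \le \psi \prod_{j=k_0}^{k-1} \left[1 + \eta v(j)\right], \quad k = k_0+1, \ldots, K \tag{10}$$

Then we want to show that

$$\phi(K+1) \le \psi \prod_{j=k_0}^{K} \left[1 + \eta v(j)\right] \tag{11}$$

Evaluating (8) at k = K+1 and substituting (10) into the right side gives, since $\eta$
and the sequence v(k) are nonnegative,

$$\phi(K+1) \le \psi + \eta \sum_{j=k_0}^{K} v(j)\phi(j) \le \psi \left\{ 1 + \eta v(k_0) + \eta \sum_{j=k_0+1}^{K} v(j) \prod_{i=k_0}^{j-1} \left[1 + \eta v(i)\right] \right\} \tag{12}$$

It remains only to recognize that the right side of (12) is exactly the right side of (11) by
peeling off summands one at a time:

$$1 + \eta v(k_0) + \eta \sum_{j=k_0+1}^{K} v(j) \prod_{i=k_0}^{j-1} \left[1 + \eta v(i)\right]$$

$$= 1 + \eta v(k_0) + \eta v(k_0+1)\left[1 + \eta v(k_0)\right] + \eta \sum_{j=k_0+2}^{K} v(j) \prod_{i=k_0}^{j-1} \left[1 + \eta v(i)\right]$$

$$= \left[1 + \eta v(k_0)\right]\left[1 + \eta v(k_0+1)\right] + \eta \sum_{j=k_0+2}^{K} v(j) \prod_{i=k_0}^{j-1} \left[1 + \eta v(i)\right]$$

$$= \cdots = \prod_{j=k_0}^{K} \left[1 + \eta v(j)\right]$$

Thus we have established (11), and the first inequality in (9) follows by induction.

For the second inequality in (9), it is clear from the power series definition of the
exponential and the nonnegativity of v(k) and $\eta$ that

$$1 + \eta v(j) \le e^{\eta v(j)}$$

So we immediately conclude

$$\phi(k) \le \psi \prod_{j=k_0}^{k-1} \left[1 + \eta v(j)\right] \le \psi \prod_{j=k_0}^{k-1} e^{\eta v(j)} = \psi \exp\left[\eta \sum_{j=k_0}^{k-1} v(j)\right], \quad k \ge k_0+1$$

□□□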
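
Both bounds in Lemma 24.5 can be illustrated numerically. The sketch below assumes NumPy, illustrative constants psi = 1.5 and eta = 0.4 (with psi taken nonnegative), and a random nonnegative v(k); it builds the extremal sequence for which (8) holds with equality, which should then meet the product bound exactly and lie below the exponential bound.

```python
import numpy as np

rng = np.random.default_rng(2)
psi, eta = 1.5, 0.4                      # assumed constants, psi >= 0 and eta >= 0
v = rng.uniform(0.0, 1.0, size=20)       # nonnegative v(k0), ..., v(k0+19)

# Extremal case: phi(k0) = psi and phi(k) = psi + eta * sum_{j<k} v(j) phi(j).
phi = [psi]
for k in range(1, len(v) + 1):
    phi.append(psi + eta * sum(v[j] * phi[j] for j in range(k)))

# Index k here means k - k0; np.prod of an empty slice is 1 (empty-product convention).
prod_bound = [psi * np.prod(1 + eta * v[:k]) for k in range(len(v) + 1)]
exp_bound = [psi * np.exp(eta * v[:k].sum()) for k in range(len(v) + 1)]
```

The agreement between `phi` and `prod_bound` reflects the "peeling off summands" identity in the proof, while the exponential bound is looser but often more convenient.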
Mildly clever use of the complete solution formula and application of this lemma
yield the following two results. In both cases we consider an additive perturbation F(k)
to an A (k) for which stability properties are assumed to be known and require that
F(k) be small in a suitable sense.

24.6 Theorem Suppose the linear state equation (1) is uniformly stable. Then the
linear state equation

$$z(k+1) = [A(k) + F(k)]z(k) \tag{13}$$

is uniformly stable if there exists a finite constant $\beta$ such that for all k,

$$\sum_{j=k}^{\infty} \|F(j)\| \le \beta \tag{14}$$

Proof For any $k_0$ and $z_0$ we can view F(k)z(k) as an input term in (13) and
conclude from the complete solution formula that z(k) satisfies

$$z(k) = \Phi_A(k, k_0)z_0 + \sum_{j=k_0}^{k-1} \Phi_A(k, j+1)F(j)z(j), \quad k \ge k_0+1$$

Of course $\Phi_A(k, j)$ denotes the transition matrix for A(k). By uniform stability of (1)
there exists a constant $\gamma$ such that $\|\Phi_A(k, j)\| \le \gamma$ for all k, j such that $k \ge j$.
Therefore, taking norms,

$$\|z(k)\| \le \gamma \|z_0\| + \sum_{j=k_0}^{k-1} \gamma \|F(j)\|\, \|z(j)\|, \quad k \ge k_0+1$$

Applying Lemma 24.5 gives

$$\|z(k)\| \le \gamma \|z_0\| \exp\left[\gamma \sum_{j=k_0}^{k-1} \|F(j)\|\right], \quad k \ge k_0+1$$

Then the bound (14) yields

$$\|z(k)\| \le \gamma e^{\gamma\beta} \|z_0\|, \quad k \ge k_0+1$$

and uniform stability of (13) is established since $k_0$ and $z_0$ are arbitrary.

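24.7 Theorem Suppose the linear state equation (1) is uniformly exponentially stable.
Then there exists a (sufficiently small) positive constant $\beta$ such that if $\|F(k)\| \le \beta$ for
all k, then

$$z(k+1) = [A(k) + F(k)]z(k) \tag{15}$$

is uniformly exponentially stable.

Proof Suppose constants $\gamma$ and $0 \le \lambda < 1$ are such that

$$\|\Phi_A(k, j)\| \le \gamma \lambda^{k-j}$$

for all k, j such that $k \ge j$. In addition we suppose without loss of generality that
$\lambda > 0$. As in the proof of Theorem 24.6, F(k)z(k) can be viewed as an input term and
the complete solution formula for (15) provides, for any $k_0$ and $z_0$,

$$\|z(k)\| \le \gamma \lambda^{k-k_0} \|z_0\| + \sum_{j=k_0}^{k-1} \gamma \lambda^{k-j-1} \|F(j)\|\, \|z(j)\|, \quad k \ge k_0+1$$

Letting $\phi(k) = \lambda^{-k}\|z(k)\|$ gives

$$\phi(k) \le \gamma \lambda^{-k_0} \|z_0\| + \sum_{j=k_0}^{k-1} \frac{\gamma}{\lambda} \|F(j)\|\, \phi(j), \quad k \ge k_0+1$$

Then Lemma 24.5 and the bound on $\|F(k)\|$ imply

$$\phi(k) \le \gamma \lambda^{-k_0} \|z_0\| \left(1 + \frac{\gamma\beta}{\lambda}\right)^{k-k_0}, \quad k \ge k_0+1$$

In the original notation this becomes

$$\|z(k)\| \le \gamma (\lambda + \gamma\beta)^{k-k_0} \|z_0\|, \quad k \ge k_0$$

Obviously, since $\lambda < 1$, $\beta$ can be chosen small enough that $\lambda + \gamma\beta < 1$, and uniform
exponential stability of (15) follows since $k_0$ and $z_0$ are arbitrary.

□□□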
The different perturbation bounds that preserve the different stability properties in
Theorems 24.6 and 24.7 are significant. For example the scalar state equation with
A(k) = 1 is uniformly stable, but a constant perturbation of the type in Theorem 24.7,
$F(k) = \beta$, for any positive constant $\beta$, no matter how small, yields unbounded solutions.
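
The contrast between the two perturbation conditions shows up in a scalar experiment. The following is a minimal sketch, assuming A(k) = 1 and two illustrative perturbations chosen here: a small constant F(k) = 0.01, which is bounded but not summable, and F(k) = 1/(k+1)^2, which is summable.

```python
import numpy as np

def simulate(F, steps=2000, z0=1.0):
    # Scalar perturbed equation z(k+1) = [1 + F(k)] z(k), starting at k0 = 0.
    z = z0
    for k in range(steps):
        z = (1.0 + F(k)) * z
    return abs(z)

constant = simulate(lambda k: 1e-2)                 # grows like 1.01^k
summable = simulate(lambda k: 1.0 / (k + 1) ** 2)   # stays bounded
```

The summable case respects the bound from Theorem 24.6 with $\gamma = 1$ and $\beta = \pi^2/6$, while the constant perturbation produces growth without bound as the number of steps increases.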


Slowly-Varying Systems
Despite the negative aspect of Example 24.1, it turns out that an eigenvalue condition on
A(k) for uniform exponential stability can be developed under an assumption that A(k)
is slowly varying. The statement of the result is very similar to the continuous-time case,
Theorem 8.7. And again the proof involves the Kronecker product of matrices, which is
defined as follows. If B is an $n_B \times m_B$ matrix with entries $b_{ij}$ and C is an $n_C \times m_C$
matrix, then the Kronecker product $B \otimes C$ is given by the partitioned matrix

$$B \otimes C = \begin{bmatrix} b_{11}C & \cdots & b_{1 m_B}C \\ \vdots & & \vdots \\ b_{n_B 1}C & \cdots & b_{n_B m_B}C \end{bmatrix} \tag{17}$$

Obviously $B \otimes C$ is an $n_B n_C \times m_B m_C$ matrix, and any two matrices are conformable with
respect to this product.

We use only a few of the many interesting properties of the Kronecker product
(though these few are different from the few used in Chapter 8). It is easy to establish
the distributive law

$$(B + C) \otimes (D + E) = B \otimes D + B \otimes E + C \otimes D + C \otimes E$$

assuming, of course, conformability of the indicated matrix additions. Next note that
$B \otimes C$ can be written as a sum of $n_B m_B$ matrices, where each matrix has one
(possibly) nonzero partition $b_{ij}C$ from (17). Then from Exercise 1.8 and an elementary
spectral-norm bound in Chapter 1,

$$\|B \otimes C\| \le n_B m_B \|B\|\, \|C\|$$

(Tighter bounds can be derived from properties of the Kronecker product, but this
suffices for our purposes.) Finally for a Kronecker product of the form $A \otimes A$, where A is
an n x n matrix, it can be shown that the $n^2$ eigenvalues of $A \otimes A$ are simply the $n^2$
products $\lambda_i \lambda_j$, i, j = 1, ..., n, where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of A. Indeed this
is transparent in the case of diagonal A.
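
The Kronecker product facts used in this section can be explored with `numpy.kron`. The sketch below assumes NumPy and small hypothetical matrices A, B, and C chosen only for illustration; it compares the eigenvalues of the Kronecker product of A with itself against the pairwise eigenvalue products, and checks a norm bound with the factor equal to the number of entries of B.

```python
import numpy as np

A = np.array([[0.5, 1.0], [0.0, -0.3]])        # hypothetical example matrix
lam = np.linalg.eigvals(A)

# Eigenvalues of A (x) A versus all products lambda_i * lambda_j.
kron_eigs = np.sort_complex(np.linalg.eigvals(np.kron(A, A)))
products = np.sort_complex(np.array([li * lj for li in lam for lj in lam]))

# Norm bound: ||B (x) C|| <= (number of entries of B) * ||B|| * ||C||.
B = np.array([[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]])
C = np.array([[2.0], [1.0]])
norm_ok = np.linalg.norm(np.kron(B, C), 2) <= B.size * np.linalg.norm(B, 2) * np.linalg.norm(C, 2)
```

The bound with the entry-count factor is far from tight for this example, which is in keeping with the remark that tighter bounds exist but are not needed.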

24.8 Theorem Suppose for the linear state equation (1) there exist constants $\alpha > 0$ and
$0 \le \mu < 1$ such that, for all k, $\|A(k)\| \le \alpha$ and every pointwise eigenvalue of
A(k) satisfies $|\lambda(k)| \le \mu$. Then there exists a positive constant $\beta$ such that (1) is
uniformly exponentially stable if $\|A(k) - A(k-1)\| \le \beta$ for all k.

Proof For each k let Q(k+1) be the solution of

$$A^T(k)Q(k+1)A(k) - Q(k+1) = -I_n \tag{18}$$

Existence, uniqueness, and positive definiteness of Q(k+1) for every k are guaranteed
by Theorem 23.7. Furthermore

$$Q(k+1) = I_n + \sum_{i=1}^{\infty} \left[A^T(k)\right]^i A^i(k) \tag{19}$$

The strategy of the proof is to show that this Q(k+1) satisfies the hypotheses of
Theorem 23.3, thereby concluding uniform exponential stability of (1).

Clearly Q(k+1) in (19) is symmetric, and we immediately have also that
$I \le Q(k)$ for all k. For the remainder of the proof, (18) is rewritten as a linear equation
by using the Kronecker product. Let vec[Q(k+1)] be the $n^2 \times 1$ vector formed by
stacking the n columns of Q(k+1), selecting columns from left to right with the first
column on top. Similarly let $vec[I_n]$ be the $n^2 \times 1$ stack of the columns of $I_n$. With
$A_j(k)$ and $Q_j(k+1)$ denoting the $j^{th}$-columns of A(k) and Q(k+1), we can write

$$Q(k+1)A_j(k) = \sum_{i=1}^{n} a_{ij}(k)Q_i(k+1) = \begin{bmatrix} a_{1j}(k)I_n & \cdots & a_{nj}(k)I_n \end{bmatrix} vec[Q(k+1)]$$

Then the $j^{th}$-column of $A^T(k)Q(k+1)A(k)$ can be written as

$$A^T(k)Q(k+1)A_j(k) = \begin{bmatrix} a_{1j}(k)A^T(k) & \cdots & a_{nj}(k)A^T(k) \end{bmatrix} vec[Q(k+1)] = \left[A_j^T(k) \otimes A^T(k)\right] vec[Q(k+1)]$$

Stacking these columns gives

$$\begin{bmatrix} \left[A_1^T(k) \otimes A^T(k)\right] vec[Q(k+1)] \\ \vdots \\ \left[A_n^T(k) \otimes A^T(k)\right] vec[Q(k+1)] \end{bmatrix} = \left[A^T(k) \otimes A^T(k)\right] vec[Q(k+1)]$$

Thus (18) can be recast as the $n^2 \times 1$ vector equation

$$\left[A^T(k) \otimes A^T(k) - I_{n^2}\right] vec[Q(k+1)] = -vec[I_n] \tag{20}$$
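
The vectorized form (20) turns the Lyapunov equation (18) into an ordinary linear system. The following is a minimal sketch, assuming NumPy and a hypothetical matrix standing in for A(k) at one fixed k; column stacking is expressed through Fortran-order flattening, which is a NumPy detail rather than anything from the text.

```python
import numpy as np

A = np.array([[0.5, 1.0], [0.0, -0.3]])      # hypothetical A(k) at one fixed k
n = A.shape[0]

# Equation (20): [A^T (x) A^T - I] vec[Q] = -vec[I_n], with column-stacking vec.
lhs = np.kron(A.T, A.T) - np.eye(n * n)
vecQ = np.linalg.solve(lhs, -np.eye(n).flatten(order="F"))
Q = vecQ.reshape((n, n), order="F")

residual = A.T @ Q @ A - Q + np.eye(n)       # should vanish if Q solves (18)
min_eig = np.linalg.eigvalsh((Q + Q.T) / 2).min()
```

The linear system is solvable here because no product of two eigenvalues of A equals one, and the resulting Q satisfies $Q \ge I$, consistent with the series (19).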

We proceed by showing that vec[Q(k+1)] is bounded for all k. This implies
boundedness of Q(k+1) for all k by the easily-verified matrix/vector norm property
$\|Q(k+1)\| \le n \|vec[Q(k+1)]\|$. To work this out begin with

$$\det\left[\lambda I_{n^2} - A^T(k) \otimes A^T(k)\right] = \prod_{i,j=1}^{n} \left[\lambda - \lambda_i(k)\lambda_j(k)\right]$$

Evaluating the magnitude of this expression for $\lambda = 1$ and using the magnitude bound
on the eigenvalues of A(k) gives, for all k,

$$\left|\det\left[A^T(k) \otimes A^T(k) - I_{n^2}\right]\right| = \left|\prod_{i,j=1}^{n} \left[1 - \lambda_i(k)\lambda_j(k)\right]\right| \ge (1 - \mu^2)^{n^2} > 0$$

Therefore a simple norm argument involving Exercise 1.12, and the fact noted above that
a bound on $\|A(k)\|$ implies a bound on $\|A^T(k) \otimes A^T(k) - I_{n^2}\|$, yields existence of a
constant $\rho$ such that

$$\|vec[Q(k+1)]\| \le \left\|\left[A^T(k) \otimes A^T(k) - I_{n^2}\right]^{-1}\right\|\, \|vec[I_n]\| \le \rho/n \tag{21}$$

for all k. Thus $\|Q(k+1)\| \le \rho$ for all k, that is, $Q(k) \le \rho I$ for all k.
It remains to show existence of a positive constant $\nu$ such that

$$A^T(k)Q(k+1)A(k) - Q(k) \le -\nu I_n$$

However (18) implies

$$A^T(k)Q(k+1)A(k) - Q(k) = -I_n + [Q(k+1) - Q(k)]$$

so we need only show that there exists a constant $\eta$ such that

$$\|Q(k+1) - Q(k)\| \le \eta < 1 \tag{22}$$

for all k. This is accomplished by again using the representation (20) to show that given
any $0 < \eta < 1$ a sufficiently-small, positive $\beta$ yields

$$\|vec[Q(k+1)] - vec[Q(k)]\| \le \eta/n$$

for all k.
Subtracting successive occurrences of (20) gives

$$\left[A^T(k) \otimes A^T(k) - I_{n^2}\right] vec[Q(k+1)] - \left[A^T(k-1) \otimes A^T(k-1) - I_{n^2}\right] vec[Q(k)] = 0$$

for all k, which can be rearranged in the form

$$\left[A^T(k) \otimes A^T(k) - I_{n^2}\right] \left\{vec[Q(k+1)] - vec[Q(k)]\right\} = \left[A^T(k-1) \otimes A^T(k-1) - A^T(k) \otimes A^T(k)\right] vec[Q(k)]$$

Using norm arguments similar to those in (21), we obtain existence of a constant $\gamma$ such
that

$$\|vec[Q(k+1)] - vec[Q(k)]\| \le \gamma \left\|A^T(k-1) \otimes A^T(k-1) - A^T(k) \otimes A^T(k)\right\| \tag{23}$$

Then the triangle inequality for the norm gives

$$\left\|A^T(k-1) \otimes A^T(k-1) - A^T(k) \otimes A^T(k)\right\| = \big\| [A^T(k) - A^T(k-1)] \otimes [A^T(k) - A^T(k-1)] + A^T(k-1) \otimes [A^T(k) - A^T(k-1)] + [A^T(k) - A^T(k-1)] \otimes A^T(k-1) \big\|$$

$$\le n^2 \beta (\beta + 2\alpha) \tag{24}$$

Putting together the bounds (23) and (24) shows that (22) can be satisfied by selecting $\beta$
sufficiently small. This concludes the proof.


EXERCISES
Exercise 24.1 Use Corollary 24.3 to derive a sufficient condition for uniform stability of the
linear state equation

al(k)](L)

x(k+l)=

Devise a simple example to show that your condition is not necessary.

Exercise 24.2
Use Corollary 24.4 to derive a sufficient condition for uniform exponential
stability of the linear state equation

]k

x(k+l)=

Devise a simple example to show that your condition is not necessary. Use Theorem 24.8 to state
another sufficient condition for uniform exponential stability.
Exercise 24.3 Apply Theorem 24.6 in two different ways to derive two sufficient conditions for
uniform stability of the linear state equation

x(k+l)=
Can you find examples to show that neither of your conditions are necessary?

Exercise 24.4 Suppose A (k) and F (k) are n x ii matrix sequences with MA (k) II <a for all k,
where a is a finite constant. For any fixed, positive integer F, show that given e > 0 there exists a
0 such that

IIF(k) A (k) H

for all k implies

k)II c

k)

for all k. Hint: Use Exercise 20.15.


Exercise 24.5

q(k), and v(k), where ii(k) and v(k) are

Consider the scalar sequences

nonnegative. If
kk,,l
show that
/.I

+ r1(k)

H [I
i=J+l

kk,,+ 1

Hint: Let

r(k)

v(j)4(j)

then show

r(k+l)[l +1(k)v(k)]r(k) +
and use the 'summing factor'

Exercise 24.6

If the n x n matrix sequence A (k) is invertible for all k, and

and

denote the smallest and largest pointwise eigenvalues of AT(k)A (k), show that
AI

j=A.,

AI

kk,, I

114(k,,,k)lI fl

Suppose the linear state equation x(k +1) = A(k)x(k) is uniformly stable, and
consider the state equation
Exercise 24.7

:(k+l)=A(k)z(k)+f(k,:(k))
wheref(k, z) is an nxl vector function. Prove that this new state equation is uniformly stable if
there exist finite constants a and aL, k = 0, 1, 2

IIf(k,

such that
aA IIZII

and

a, a
j=A

for all k. Show by scalar example that the conclusion is false if we weaken the second condition

to finiteness of

for every k.

NOTES
Note 24.1

Extensive coverage of the Kronecker product is provided in

R.A. Horn, C.R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge,
England, 1991

Note 24.2 An early proof of Theorem 24.8 using ideas from complex variables is in

C.A. Desoer, "Slowly varying discrete system $x_{i+1} = A_i x_i$," Electronics Letters, Vol. 6, No. 11, pp.
339-340, 1970

Further developments that involve a weaker eigenvalue condition and establish a weaker form of
stability using the Kronecker product can be found in

F. Amato, G. Celentano, F. Garofalo, "New sufficient conditions for the stability of slowly varying
linear systems," IEEE Transactions on Automatic Control, Vol. 38, No. 9, pp. 1409-1411, 1993

Note 24.3 Various matrix-analysis techniques can be brought to bear on the stability problem,
leading to interesting, though often highly restrictive, conditions. For example in

J.W. Wu, K.S. Hong, "Delay independent exponential stability criteria for time-varying discrete
delay systems," IEEE Transactions on Automatic Control. Vol. 39, No. 4, pp. 811-814, 1994

the following is proved. If the n x n matrix sequence A(k) and the constant n x n matrix F are
such that $|a_{ij}(k)| \le f_{ij}$ for all k and i, j = 1, ..., n, then exponential stability of z(k+1) = Fz(k)
implies uniform exponential stability of x(k+1) = A(k)x(k). Another stability criterion of this
type, requiring that I - F be a so-called M-matrix, is mentioned in

T. Mori, "Further comments on 'A simple criterion for stability of linear discrete systems',"
International Journal of Control, Vol. 43, No. 2, pp. 737-739, 1986
An interesting variant on such problems is to find bounds on the time-varying entries of A(k) such
that if the bounds are satisfied, then

x(k+1) = A(k)x(k)

has a particular stability property. See, for example,

P. Bauer, M. Mansour, J. Duran, "Stability of polynomials with time-variant coefficients," IEEE
Transactions on Circuits and Systems, Vol. 40, No. 6, pp. 423-425, 1993

This problem also can be investigated in terms of perturbation formulations, as in

S.R. Kolla, R.K. Yedavalli, J.B. Farison, "Robust stability bounds on time-varying perturbations
for state space models of linear discrete-time systems," International Journal of Control, Vol. 50,
No. 1, pp. 151-159, 1989

25
DISCRETE TIME
REACHABILITY AND OBSERVABILITY

The fundamental concepts of reachability and observability for an m-input, p-output, n-dimensional linear state equation

$$
\begin{aligned}
x(k+1) &= A(k)x(k) + B(k)u(k), \quad x(k_0) = x_0 \\
y(k) &= C(k)x(k) + D(k)u(k)
\end{aligned}
\tag{1}
$$

are introduced in this chapter. Reachability involves the influence of the input signal on
the state vector and does not involve the output equation. Observability deals with the
influence of the state vector on the output and does not involve the effect of a known
input signal. In addition to their operational definitions in terms of driving the state with
the input and ascertaining the state from the output, these concepts play fundamental
roles in the basic structure of linear state equations addressed in Chapter 26.

Reachability
For a time-varying linear state equation, the connection of the input signal to the state
variables can change with time. Therefore we tie the concept of reachability to a
specific, finite time interval denoted by the integer-index range k = k0, ..., k1, of course
with k1 ≥ k0 + 1. Recall that the solution of (1) for a given input signal and x(k0) = 0 is
conveniently called the zero-state response.

25.1 Definition The linear state equation (1) is called reachable on [k0, k1] if given
any state x1 there exists an input signal such that the corresponding zero-state response
of (1), beginning at k0, satisfies x(k1) = x1.

This definition implies nothing about the zero-state response for k ≥ k1 + 1. In
particular there is no requirement that the state remain at x1 for k ≥ k1 + 1. However the
definition reflects the notion that the input signal can independently influence each state
variable, either directly or indirectly, to an extent that any desired state can be attained
from the zero initial state on the specified time interval.

25.2 Remark A reader familiar with the concept of controllability for continuous-time
state equations in Chapter 9 will notice several differences here. First, in discrete time we
concentrate on reachability from zero initial state rather than controllability to zero final
state. This is related to the occurrence of discrete-time transition matrices that are not
invertible, an occurrence that produces completely uninteresting discrete-time linear
state equations that are controllable to zero. Further exploration is left to the Exercises,
though we note here an extreme, scalar example:

x(k+1) = 0x(k) + 0u(k), x(0) = x0

Second, a time-invariant discrete-time linear state equation might fail to be reachable on
[k0, k1] simply because the time interval is too short, something that does not happen in
continuous time. A single-input, n-dimensional discrete-time linear state equation can
require n steps to reach a specified state. This motivates a small change in terminology
when we consider time-invariant state equations. Third, smoothness issues do not arise
for the input signal in discrete-time reachability. Finally, rank conditions for
reachability of discrete-time state equations emerge in an appealing, direct fashion from
the zero-state solution formula, so Gramian conditions play a less central role than in the
continuous-time case. Therefore, for emphasis and variety, we reverse the order of
discussion from Chapter 9 and begin with rank conditions.

□□□
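A rank condition for reachability arises from a simple rewriting of the zero-state
response formula for (1). Namely we construct partitioned matrices to write

$$
x(k_1) = \sum_{j=k_0}^{k_1-1} \Phi(k_1, j+1)B(j)u(j)
= R(k_0, k_1)
\begin{bmatrix} u(k_1-1) \\ u(k_1-2) \\ \vdots \\ u(k_0) \end{bmatrix}
\tag{2}
$$

where the n x (k1 - k0)m matrix

$$
R(k_0, k_1) = \begin{bmatrix} B(k_1-1) & \Phi(k_1, k_1-1)B(k_1-2) & \cdots & \Phi(k_1, k_0+1)B(k_0) \end{bmatrix}
$$

is called the reachability matrix.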

25.3 Theorem The linear state equation (1) is reachable on [k0, k1] if and only if

$$
\operatorname{rank} R(k_0, k_1) = n
$$


Proof If the rank condition holds, then a simple contradiction argument shows that
the symmetric, positive-semidefinite matrix $R(k_0, k_1)R^T(k_0, k_1)$ is in fact positive
definite, hence invertible. Then given a state x1 we define an input sequence by setting

$$
\begin{bmatrix} u(k_1-1) \\ \vdots \\ u(k_0) \end{bmatrix}
= R^T(k_0, k_1)\left[ R(k_0, k_1)R^T(k_0, k_1) \right]^{-1} x_1
\tag{4}
$$

and letting the immaterial values of the input sequence outside the range k0, ..., k1 - 1
be anything, say 0. With this input the zero-state solution formula, written as in (2),
immediately gives x(k1) = x1.

On the other hand if the rank condition fails, then there exists an n x 1 vector
$x_a \ne 0$ such that $x_a^T R(k_0, k_1) = 0$. If we suppose that the state equation (1) is reachable
on [k0, k1], then there is an input sequence $u_a(k)$ such that

$$
x_a = R(k_0, k_1)
\begin{bmatrix} u_a(k_1-1) \\ \vdots \\ u_a(k_0) \end{bmatrix}
$$

Premultiplying both sides by $x_a^T$, this implies $x_a^T x_a = 0$. But then $x_a = 0$, a contradiction
that shows the state equation is not reachable on [k0, k1].

□□□
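The rank test of Theorem 25.3 is easy to try numerically. The following is only a minimal sketch in plain Python with exact rational arithmetic: the helper names `matmul`, `rank`, and `reachability_matrix`, and the particular coefficient sequences A(k) and B(k), are illustrative assumptions and not part of the text.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def reachability_matrix(A, B, k0, k1):
    """R(k0,k1) = [B(k1-1)  Phi(k1,k1-1)B(k1-2)  ...  Phi(k1,k0+1)B(k0)].

    A and B are functions of the integer k that return lists of rows.
    """
    n = len(B(k0))
    phi = [[int(i == j) for j in range(n)] for i in range(n)]  # Phi(k1, k1) = I
    blocks = []
    for j in range(k1 - 1, k0 - 1, -1):
        blocks.append(matmul(phi, B(j)))   # Phi(k1, j+1) B(j)
        phi = matmul(phi, A(j))            # Phi(k1, j) = Phi(k1, j+1) A(j)
    # Place the column blocks side by side.
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

# Hypothetical 2-dimensional, single-input example.
A = lambda k: [[1, k], [0, 1]]
B = lambda k: [[0], [1]]

R_short = reachability_matrix(A, B, 0, 1)   # one step: a single column
R_long = reachability_matrix(A, B, 0, 2)    # two steps: a 2 x 2 matrix
print(rank(R_short), rank(R_long))
```

For this assumed example the one-step matrix has a single column and cannot have rank 2, while the two-step matrix does, which illustrates how reachability depends on the length of the interval.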
In developing an alternate form for the reachability criterion, it will become
apparent that the matrix W(k0, k1) defined below is precisely $R(k_0, k_1)R^T(k_0, k_1)$. We
often ignore this fact to emphasize similarities to the controllability Gramian in the
continuous-time case.

25.4 Theorem The linear state equation (1) is reachable on [k0, k1] if and only if the
n x n matrix

$$
W(k_0, k_1) = \sum_{j=k_0}^{k_1-1} \Phi(k_1, j+1)B(j)B^T(j)\Phi^T(k_1, j+1)
\tag{5}
$$

is invertible.

Proof Suppose W(k0, k1) is invertible. Then given an n x 1 vector x1 we specify an
input signal by setting

$$
u(k) = B^T(k)\Phi^T(k_1, k+1)W^{-1}(k_0, k_1)x_1, \qquad k = k_0, \ldots, k_1-1
$$

and setting u(k) = 0 for all other values of k. (This choice is readily seen to be
identical to (4).) The corresponding zero-state solution of (1) at k = k1 can be written as

$$
x(k_1) = \sum_{j=k_0}^{k_1-1} \Phi(k_1, j+1)B(j)B^T(j)\Phi^T(k_1, j+1)W^{-1}(k_0, k_1)x_1 = x_1
$$

Thus the state equation is reachable on [k0, k1].

To show the reverse implication by contradiction, suppose that the linear state
equation (1) is reachable on [k0, k1] and W(k0, k1) in (5) is not invertible. Of course the
assumption that W(k0, k1) is not invertible implies there exists a nonzero n x 1 vector
$x_a$ such that

$$
0 = x_a^T W(k_0, k_1)x_a = \sum_{j=k_0}^{k_1-1} x_a^T\Phi(k_1, j+1)B(j)B^T(j)\Phi^T(k_1, j+1)x_a
\tag{7}
$$

But the summand in this expression is simply the nonnegative scalar sequence
$\|x_a^T\Phi(k_1, j+1)B(j)\|^2$, and it follows that

$$
x_a^T\Phi(k_1, j+1)B(j) = 0, \qquad j = k_0, \ldots, k_1-1
\tag{8}
$$

Because the state equation is reachable on [k0, k1], choosing $x_1 = x_a$ there exists an
input $u_a(k)$ such that

$$
x_a = \sum_{j=k_0}^{k_1-1} \Phi(k_1, j+1)B(j)u_a(j)
$$

Multiplying through by $x_a^T$ and using (8) gives $x_a^T x_a = 0$, a contradiction. Thus
W(k0, k1) must be invertible.

□□□
The reachability Gramian in (5), $W(k_0, k_1) = R(k_0, k_1)R^T(k_0, k_1)$, has important
properties, some of which are explored in the Exercises. Obviously for every k1 ≥ k0 + 1
it is symmetric and positive semidefinite. Thus the linear state equation (1) is reachable
on [k0, k1] if and only if W(k0, k1) is positive definite. From either Theorem 25.3 or
Theorem 25.4, it is easily argued that if the state equation is not reachable on [k0, k1],
then it might become so if k1 is increased. And reachability can be lost if k1 is lowered.
Analogous observations can be made about changing k0.
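The constructive half of the proof of Theorem 25.4 can be traced through on a small case. The sketch below is restricted to n = 2 so that a closed-form inverse suffices; it forms the reachability Gramian as the sum of the terms Phi(k1, j+1)B(j)B^T(j)Phi^T(k1, j+1), builds the input u(k) = B^T(k)Phi^T(k1, k+1)W^{-1}(k0, k1)x1, and simulates the state equation from the zero state. The function names and the example coefficients are assumptions made for illustration only.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse2(M):
    """Inverse of an invertible 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def steer(A, B, k0, k1, x1):
    """Return u(k0),...,u(k1-1) and the state reached at k1 from x(k0) = 0 (n = 2)."""
    # Phi(k1, j+1) for j = k0,...,k1-1, built backward from Phi(k1, k1) = I.
    phis, phi = {}, [[1, 0], [0, 1]]
    for j in range(k1 - 1, k0 - 1, -1):
        phis[j] = phi
        phi = matmul(phi, A(j))
    # Reachability Gramian W(k0, k1).
    W = [[0, 0], [0, 0]]
    for j in range(k0, k1):
        PB = matmul(phis[j], B(j))
        term = matmul(PB, transpose(PB))
        W = [[w + t for w, t in zip(wr, tr)] for wr, tr in zip(W, term)]
    Winv_x1 = matmul(inverse2(W), [[v] for v in x1])
    u = [matmul(transpose(matmul(phis[k], B(k))), Winv_x1) for k in range(k0, k1)]
    # Simulate x(k+1) = A(k)x(k) + B(k)u(k) from the zero initial state.
    x = [[0], [0]]
    for k, uk in zip(range(k0, k1), u):
        Ax, Bu = matmul(A(k), x), matmul(B(k), uk)
        x = [[p[0] + q[0]] for p, q in zip(Ax, Bu)]
    return u, [row[0] for row in x]

# Hypothetical example: steer to x1 = (3, 5) on [0, 2].
A = lambda k: [[1, k], [0, 1]]
B = lambda k: [[0], [1]]
inputs, reached = steer(A, B, 0, 2, [3, 5])
print(inputs, reached)
```

The routine assumes the Gramian is invertible; for a state equation that is not reachable on the chosen interval, `inverse2` would fail with a division by zero.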
For a time-invariant linear state equation,

$$
\begin{aligned}
x(k+1) &= Ax(k) + Bu(k), \quad x(k_0) = x_0 \\
y(k) &= Cx(k) + Du(k)
\end{aligned}
\tag{9}
$$

the test for reachability in Theorem 25.3 applies, and the reachability matrix simplifies to

$$
R(k_0, k_1) = \begin{bmatrix} B & AB & \cdots & A^{k_1-k_0-1}B \end{bmatrix}
$$

Therefore reachability on [k0, k1] does not depend on the choice of k0; it only depends
on the number of steps k1 - k0. The Cayley-Hamilton theorem applied to the n x n
matrix A shows that consideration of k1 - k0 > n is superfluous to the rank condition.
On the other hand, in the single-input case (m = 1) it is clear from the dimension of
R(k0, k1) that the rank condition cannot hold with k1 - k0 < n. In view of these matters
we pose a special definition for exclusively time-invariant settings, with k0 = 0, and thus
slightly recast the rank condition. (This can cause slight confusion when specializing
from the time-varying case, but a firm grasp of the obvious suffices to restore clarity.)
25.5 Definition The time-invariant linear state equation (9) is called reachable if given
any state x1 there is a positive integer k1 and an input signal such that the corresponding
zero-state response, beginning at k0 = 0, satisfies x(k1) = x1.

This leads to a result whose proof is immediate from the preceding discussion.

25.6 Theorem The time-invariant linear state equation (9) is reachable if and only if

$$
\operatorname{rank} \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = n
\tag{10}
$$
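As a rough illustration of the time-invariant rank condition and of the remarks on the number of steps, the following sketch computes the rank of [B AB ... A^(s-1)B] for an increasing number of steps s. The three-delay chain used here is a hypothetical example, and the helper names are assumptions.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def reachability_rank(A, B, steps):
    """Rank of [B  AB  ...  A^(steps-1) B] for constant matrices A and B."""
    blocks, power_times_B = [], B
    for _ in range(steps):
        blocks.append(power_times_B)
        power_times_B = matmul(A, power_times_B)
    R = [sum((blk[i] for blk in blocks), []) for i in range(len(A))]
    return rank(R)

# Hypothetical single-input example: a chain of three unit delays (n = 3).
A = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
B = [[0], [0], [1]]
ranks = [reachability_rank(A, B, steps) for steps in (1, 2, 3, 4)]
print(ranks)
```

With a single input the rank can grow by at most one per step, so fewer than n steps cannot give rank n, and by the Cayley-Hamilton theorem steps beyond n add nothing.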

It is interesting to note that reachability properties are not preserved when a time-invariant
linear state equation is obtained by freezing the coefficients of a time-varying
linear state equation. It is easy to pose examples where freezing the coefficients of a
time-varying state equation at a value of k where B(k) is zero destroys reachability.
Perhaps a reverse situation is more surprising.

25.7 Example Consider the linear state equation

$$
x(k+1) = \begin{bmatrix} a_1 & 0 \\ 0 & a_2 \end{bmatrix} x(k) + \begin{bmatrix} b_1(k) \\ b_2(k) \end{bmatrix} u(k)
$$

where the constants $a_1$ and $a_2$ are not equal. For any constant, nonzero values
$b_1(k) = b_1$, $b_2(k) = b_2$, we can call on Theorem 25.6 to show that the (time-invariant)
state equation is reachable. However for the time-varying coefficients

$$
b_1(k) = a_1^k, \qquad b_2(k) = a_2^k
$$

the reachability matrix for the time-varying state equation is

$$
R(k_0, k_1) = \begin{bmatrix} a_1^{k_1-1} & a_1^{k_1-1} & \cdots & a_1^{k_1-1} \\ a_2^{k_1-1} & a_2^{k_1-1} & \cdots & a_2^{k_1-1} \end{bmatrix}
$$

By the rank condition in Theorem 25.3, the time-varying linear state equation is not
reachable on any interval [k0, k1]. Clearly a pointwise-in-time interpretation of the
reachability property can be misleading.
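The contrast in this example can be checked directly. The sketch below assumes the diagonal A-matrix diag(a1, a2) and the time-varying input coefficients b1(k) = a1^k, b2(k) = a2^k, which is the choice consistent with the exponents k1 - 1 in the reachability matrix; the numerical values of a1, a2, k0, k1 are arbitrary illustrative choices.

```python
from fractions import Fraction

a1, a2 = Fraction(2), Fraction(3)   # illustrative values with a1 != a2
k0, k1 = 0, 5

def column(j):
    """Phi(k1, j+1) B(j) for A = diag(a1, a2) and B(j) = [a1**j, a2**j]^T."""
    steps = k1 - j - 1
    return (a1**steps * a1**j, a2**steps * a2**j)

# Columns of R(k0, k1), ordered as j = k1-1, ..., k0.
columns = [column(j) for j in range(k1 - 1, k0 - 1, -1)]
all_equal = all(col == columns[0] for col in columns)

# Frozen (constant, nonzero) coefficients b1, b2:
# det [B  AB] = b1*(a2*b2) - (a1*b1)*b2 = b1*b2*(a2 - a1).
b1, b2 = Fraction(1), Fraction(1)
frozen_det = b1 * (a2 * b2) - (a1 * b1) * b2

print(columns[0], all_equal, frozen_det)
```

Every column of the time-varying reachability matrix is the same vector, so the rank is one regardless of the interval, while the frozen-coefficient determinant is nonzero whenever a1 and a2 differ.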


Observability
The second concept of interest for (1) involves the influence of the state vector on the
output of the linear state equation. It is simplest to consider the case of zero input, and
this does not entail loss of generality since the concept is unchanged in the presence of a
known input signal. Specifically the zero-state response due to a known input signal can
be computed and subtracted from the complete response, leaving the zero-input
response. Therefore we consider the zero-input response of the linear state equation (1)
and invoke an explicit, finite index range in the definition. The notion we want to
capture is whether the output signal is independently influenced by each state variable,
either directly or indirectly. As in our consideration of reachability, k1 ≥ k0 + 1 always
is assumed.

25.8 Definition The linear state equation (1) is called observable on [k0, k1] if any
initial state x(k0) = x0 is uniquely determined by the corresponding zero-input response
y(k) for k = k0, ..., k1 - 1.

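The basic characterizations of observability are similar in form to the reachability
criteria. We begin with a rank condition on a partitioned matrix that is defined directly
from the zero-input response by writing

$$
\begin{bmatrix} y(k_0) \\ y(k_0+1) \\ \vdots \\ y(k_1-1) \end{bmatrix}
=
\begin{bmatrix} C(k_0)x_0 \\ C(k_0+1)\Phi(k_0+1, k_0)x_0 \\ \vdots \\ C(k_1-1)\Phi(k_1-1, k_0)x_0 \end{bmatrix}
= O(k_0, k_1)x_0
\tag{11}
$$

The p(k1 - k0) x n matrix

$$
O(k_0, k_1) =
\begin{bmatrix} C(k_0) \\ C(k_0+1)\Phi(k_0+1, k_0) \\ \vdots \\ C(k_1-1)\Phi(k_1-1, k_0) \end{bmatrix}
\tag{12}
$$

is called the observability matrix.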

25.9 Theorem The linear state equation (1) is observable on [k0, k1] if and only if

$$
\operatorname{rank} O(k_0, k_1) = n
$$

Proof If the rank condition holds, then $O^T(k_0, k_1)O(k_0, k_1)$ is an invertible n x n
matrix. Given the zero-input response y(k0), ..., y(k1 - 1), we can determine the initial
state from (11) according to

$$
x_0 = \left[ O^T(k_0, k_1)O(k_0, k_1) \right]^{-1} O^T(k_0, k_1)
\begin{bmatrix} y(k_0) \\ \vdots \\ y(k_1-1) \end{bmatrix}
$$

On the other hand if the rank condition fails, then there is a nonzero n x 1 vector
$x_a$ such that $O(k_0, k_1)x_a = 0$. Then the zero-input response of (1) to $x(k_0) = x_a$ is

$$
y(k_0) = y(k_0+1) = \cdots = y(k_1-1) = 0
$$

This of course is the same zero-input response as is obtained from the zero initial state,
so clearly the linear state equation is not observable on [k0, k1].

□□□
The proof of Theorem 25.9 shows that for an observable linear state equation the
initial state is uniquely determined by a linear algebraic equation, thus clarifying a vague
aspect of Definition 25.8. Also observe the role of the interval length; for example if
p = 1, then observability on [k0, k1] implies k1 - k0 ≥ n.
The proof of the following alternate version of the observability criterion is left as
an easy exercise.
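The linear algebraic equation for the initial state can be exercised on a small case. The following sketch stacks the rows C(k)Phi(k, k0) into the observability matrix and solves the normal equations exactly; the helper names and the example coefficients are illustrative assumptions, and the solver presumes that O^T O is invertible.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def solve(M, rhs):
    """Solve the square system M x = rhs exactly (Gauss-Jordan, M invertible)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for c in range(n):
        piv = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for i in range(n):
            if i != c:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [row[-1] for row in aug]

def observability_matrix(A, C, k0, k1):
    """Stack C(k) Phi(k, k0) for k = k0, ..., k1-1."""
    n = len(A(k0))
    phi = [[int(i == j) for j in range(n)] for i in range(n)]  # Phi(k0, k0) = I
    rows = []
    for k in range(k0, k1):
        rows.extend(matmul(C(k), phi))
        phi = matmul(A(k), phi)            # Phi(k+1, k0) = A(k) Phi(k, k0)
    return rows

def initial_state(A, C, k0, k1, outputs):
    """x0 = (O^T O)^{-1} O^T y, with outputs the stacked entries of y(k0),...,y(k1-1)."""
    O = observability_matrix(A, C, k0, k1)
    Ot = [list(col) for col in zip(*O)]
    rhs = [row[0] for row in matmul(Ot, [[y] for y in outputs])]
    return solve(matmul(Ot, O), rhs)

# Hypothetical single-output example with a known initial state to recover.
A = lambda k: [[1, k + 1], [0, 1]]
C = lambda k: [[1, 0]]
x, outputs = [[2], [-1]], []
for k in range(0, 3):
    outputs.append(matmul(C(k), x)[0][0])  # zero-input response y(k)
    x = matmul(A(k), x)
recovered = initial_state(A, C, 0, 3, outputs)
print(outputs, recovered)
```

Here p = 1 and n = 2, so at least two output samples are needed; three are used, and the normal equations still return the exact initial state.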

25.10 Theorem The linear state equation (1) is observable on [k0, k1] if and only if
the n x n matrix

$$
M(k_0, k_1) = \sum_{j=k_0}^{k_1-1} \Phi^T(j, k_0)C^T(j)C(j)\Phi(j, k_0)
$$

is invertible.
By writing

$$
O^T(k_0, k_1) = \begin{bmatrix} C^T(k_0) & \Phi^T(k_0+1, k_0)C^T(k_0+1) & \cdots & \Phi^T(k_1-1, k_0)C^T(k_1-1) \end{bmatrix}
$$

we see that the observability Gramian M(k0, k1) is exactly $O^T(k_0, k_1)O(k_0, k_1)$. Just as
the reachability Gramian, it has several interesting properties. The observability
Gramian is symmetric and positive semidefinite, and positive definite if and only if the
state equation is observable on [k0, k1]. It should be clear that the property of
observability is preserved, or can be attained, if the time interval is lengthened, or that it
can be destroyed by shortening the interval.
For the time-invariant linear state equation (9), the observability matrix (12)
simplifies to

$$
O(k_0, k_1) = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{k_1-k_0-1} \end{bmatrix}
\tag{15}
$$

Observability for time-invariant state equations thus involves the length of the time
interval k1 - k0, but not independently the particular values of k0 and k1. Also
consideration of the Cayley-Hamilton theorem motivates a special definition based on
k0 = 0, and a redefinition of the observability matrix leading to a standard criterion.

25.11 Definition The time-invariant linear state equation (9) with k0 = 0 is called
observable if there is a finite positive integer k1 such that any initial state x(0) = x0 is
uniquely determined by the corresponding zero-input response y(k) for k = 0,
1, ..., k1 - 1.

25.12 Theorem The time-invariant linear state equation (9) is observable if and only if

$$
\operatorname{rank} \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} = n
\tag{16}
$$

It is straightforward to show that the properties of reachability on [k0, k1] and
observability on [k0, k1] are invariant under a change of state variables. However one
awkwardness inherent in our definitions is that the properties can come and go as the
interval [k0, k1] changes. This motivates stronger forms of reachability and observability
that apply to fixed-length intervals independent of k0. These new properties, called
l-step reachability and l-step observability, are introduced in Chapter 26.
For the time-invariant case a comparison of (10) and (16) shows that the state
equation

$$
x(k+1) = Ax(k) + Bu(k)
$$

is reachable if and only if the state equation

$$
\begin{aligned}
z(k+1) &= A^T z(k) \\
y(k) &= B^T z(k)
\end{aligned}
$$

is observable. This somewhat peculiar observation permits easy translation of algebraic
consequences of reachability for time-invariant linear state equations into corresponding
results for observability. (See for example Exercises 25.5 and 25.6.) Going further, (10)
and (16) do not depend on whether the state equation is continuous-time or discrete-time;
only the coefficient matrices are involved. This leads to treatments of the structure of
time-invariant linear state equations that encompass both time domains. Such results are
pursued in Chapters 13, 18, and 19.


Additional Examples
The fundamental concepts of reachability and observability have utility in many
different contexts. We illustrate by revisiting some simple situations.

25.13 Example In Example 20.16 a model for the national economy is presented in
terms of deviations from a constant nominal. The state equation is

$$
\begin{aligned}
x_\delta(k+1) &= \begin{bmatrix} \alpha & \alpha \\ \beta(\alpha-1) & \beta\alpha \end{bmatrix} x_\delta(k) + \begin{bmatrix} \alpha \\ \beta\alpha \end{bmatrix} g_\delta(k) \\
y_\delta(k) &= \begin{bmatrix} 1 & 1 \end{bmatrix} x_\delta(k) + g_\delta(k)
\end{aligned}
\tag{17}
$$

where all signals are permitted to take either positive or negative values within suitable
ranges. A question of interest might be whether government spending $g_\delta(k)$ can be used
to reach any desired values (again within a range of model validity) of the state
variables, consumer expenditure and private investment. Theorem 25.6 answers this
affirmatively since

$$
\operatorname{rank} \begin{bmatrix} B & AB \end{bmatrix}
= \operatorname{rank} \begin{bmatrix} \alpha & \alpha^2(1+\beta) \\ \beta\alpha & \beta\alpha(\beta\alpha + \alpha - 1) \end{bmatrix} = 2
$$

Here a quick calculation shows that the determinant of the reachability matrix, $-\beta\alpha^2$, cannot be
zero for the permissible coefficient ranges $0 < \alpha < 1$, $\beta > 0$. Indeed any desired values
can be reached from the nominal levels in just two years.
Another question is whether knowledge of the national income $y_\delta(k)$ for
successive years can be used to ascertain consumer expenditure and private investment.
This reduces to an observability question, and again the answer is affirmative by a simple
calculation:

$$
\det \begin{bmatrix} C \\ CA \end{bmatrix}
= \det \begin{bmatrix} 1 & 1 \\ \alpha + \beta(\alpha-1) & \alpha + \beta\alpha \end{bmatrix} = \beta > 0
$$

Of course observability directly permits calculation of the initial state $x_\delta(0)$ from
$y_\delta(0)$ and $y_\delta(1)$. But then knowledge of subsequent values of $g_\delta(k)$ and the coefficients in
(17) is sufficient to permit calculation of subsequent values of $x_\delta(k)$.
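The two determinant calculations in this example can be spot-checked over a grid of coefficient values. The sketch assumes the coefficient matrices A = [[alpha, alpha], [beta(alpha - 1), beta alpha]], B = [alpha, beta alpha]^T and C = [1 1] for the model (17); the function name `economy_dets` and the particular grid are illustrative choices only.

```python
from fractions import Fraction
from itertools import product

def det2(M):
    """Determinant of a 2 x 2 matrix stored as a list of rows."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def economy_dets(alpha, beta):
    """Return (det [B AB], det [C; CA]) for the assumed coefficients of (17)."""
    A = [[alpha, alpha], [beta * (alpha - 1), beta * alpha]]
    B = [alpha, beta * alpha]
    C = [1, 1]
    AB = [A[0][0] * B[0] + A[0][1] * B[1], A[1][0] * B[0] + A[1][1] * B[1]]
    CA = [C[0] * A[0][0] + C[1] * A[1][0], C[0] * A[0][1] + C[1] * A[1][1]]
    return det2([[B[0], AB[0]], [B[1], AB[1]]]), det2([C, CA])

alphas = [Fraction(i, 10) for i in range(1, 10)]   # sample values in 0 < alpha < 1
betas = [Fraction(i, 4) for i in range(1, 13)]     # sample values with beta > 0

# Compare against the closed forms -beta*alpha**2 and beta.
checks = [economy_dets(a, b) == (-b * a * a, b) for a, b in product(alphas, betas)]
print(all(checks))
```

A grid check of this kind is of course no substitute for the symbolic calculation, but it guards against slips in the algebra.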
25.14 Example In Example 22.16 we introduce the cohort population model

$$
\begin{aligned}
x(k+1) &= \begin{bmatrix} 0 & \beta_2 & 0 \\ 0 & 0 & \beta_3 \\ \alpha_1 & \alpha_2 & \alpha_3 \end{bmatrix} x(k) + \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} u(k) \\
y(k) &= \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} x(k)
\end{aligned}
\tag{19}
$$

The reachability property obviously holds for (19) since the B-matrix is invertible.
However it is interesting to show that if all birth-rate and survival coefficients are
positive, then any desired population distribution can be attained by selection of
immigration levels in any single age group. (We assume that emigration, that is,
negative immigration, is permitted.) For example allowing immigration only into the
second age group gives the state equation

$$
\begin{aligned}
x(k+1) &= \begin{bmatrix} 0 & \beta_2 & 0 \\ 0 & 0 & \beta_3 \\ \alpha_1 & \alpha_2 & \alpha_3 \end{bmatrix} x(k) + \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} u_2(k) \\
y(k) &= \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} x(k)
\end{aligned}
\tag{20}
$$

and the associated reachability matrix is

$$
\begin{bmatrix} 0 & \beta_2 & 0 \\ 1 & 0 & \alpha_2\beta_3 \\ 0 & \alpha_2 & \alpha_1\beta_2 + \alpha_2\alpha_3 \end{bmatrix}
$$

Clearly this has rank three when all coefficients in (20) are positive. A little reflection
shows how this reachability plays out in a 'physical' way. Immigration directly affects
$x_2(k)$ and indirectly affects $x_1(k)$ and $x_3(k)$ through the survival and birth processes.
For this model the observability concept relates to whether individual age-group
populations can be ascertained by monitoring the total population y(k). The
observability matrix is

$$
\begin{bmatrix}
1 & 1 & 1 \\
\alpha_1 & \alpha_2 + \beta_2 & \alpha_3 + \beta_3 \\
\alpha_1(\alpha_3+\beta_3) & \alpha_1\beta_2 + \alpha_2(\alpha_3+\beta_3) & \beta_3(\alpha_2+\beta_2) + \alpha_3(\alpha_3+\beta_3)
\end{bmatrix}
$$

and the rank depends on the particular coefficient values in the state equation. For
example the coefficients

$$
\alpha_1 = 1/2, \qquad \alpha_2 = \alpha_3 = \beta_2 = \beta_3 = 1/4
$$

render the state equation unobservable. While this is perhaps an unrealistic case, with
old-age birth rates so high, further reflection on the physical (social) process provides
insight into the result.

□□□
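The rank claims for the cohort model can be confirmed with exact arithmetic. The sketch assumes the A-matrix with survival coefficients beta2, beta3 above the diagonal and birth-rate coefficients alpha1, alpha2, alpha3 in the last row, which is the arrangement consistent with the reachability and observability matrices displayed in the example; the helper names are illustrative.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

half, quarter = Fraction(1, 2), Fraction(1, 4)
alpha1, alpha2, alpha3, beta2, beta3 = half, quarter, quarter, quarter, quarter

A = [[0, beta2, 0], [0, 0, beta3], [alpha1, alpha2, alpha3]]
C = [[1, 1, 1]]

# Observability matrix [C; CA; CA^2] and its determinant.
CA = matmul(C, A)
CA2 = matmul(CA, A)
obs_det = det3([C[0], CA[0], CA2[0]])

# Reachability matrix [b  Ab  A^2 b] for immigration into the second age group.
b = [[0], [1], [0]]
Ab = matmul(A, b)
A2b = matmul(A, Ab)
reach = [[b[i][0], Ab[i][0], A2b[i][0]] for i in range(3)]
reach_det = det3(reach)

print(obs_det, reach_det)
```

For these coefficient values every row of the observability matrix is a multiple of [1 1 1], so the total population evolves independently of how it is distributed among age groups, while the reachability determinant remains nonzero.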
For those familiar with continuous-time state equations, we return to the sampled-data
situation where the input to a continuous-time linear state equation is the output of a
period-T sampler and zero-order hold. As shown in Example 20.3, the behavior of the
overall system at the sampling instants can be described by a discrete-time linear state
equation. A natural question is whether controllability of the continuous-time state
equation implies reachability of the discrete-time state equation. (A similar question
arises for observability.) We indicate the situation with an example and refer further
developments to references in Note 25.5.

25.15 Example Suppose the single-input, time-invariant linear state equation

$$
\dot{x}(t) = Ax(t) + bu(t)
$$

is such that the n x n (controllability) matrix

$$
\begin{bmatrix} b & Ab & \cdots & A^{n-1}b \end{bmatrix}
\tag{22}
$$

is invertible. Following Example 20.3 the corresponding sampled-data system can be
described, at the sampling instants, by the time-invariant, discrete-time state equation

$$
x[(k+1)T] = e^{AT}x(kT) + \int_0^T e^{At}b\,dt\; u(kT)
$$

The question to be addressed is whether the n x n matrix

$$
\begin{bmatrix} \int_0^T e^{At}b\,dt & e^{AT}\int_0^T e^{At}b\,dt & \cdots & e^{(n-1)AT}\int_0^T e^{At}b\,dt \end{bmatrix}
\tag{23}
$$

is invertible. It is clear that if there are distinct integers q, r in the range 0, ..., n - 1
such that $e^{qAT} = e^{rAT}$, then (23) fails to be invertible. Indeed we call on Example 5.9 to
show that this 'loss of reachability under sampling' can occur. For the controllable linear
state equation

$$
\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t)
$$

we obtain

$$
x[(k+1)T] = \begin{bmatrix} \cos T & \sin T \\ -\sin T & \cos T \end{bmatrix} x(kT) + \begin{bmatrix} 1 - \cos T \\ \sin T \end{bmatrix} u(kT)
\tag{24}
$$

It is easily checked that if $T = l\pi$, where l is any positive integer, then the discrete-time
state equation (24) is not reachable. Adding the output

$$
y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(t)
$$

to the continuous-time state equation, a quick calculation shows that observability is lost
for these same values of T.
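The loss of reachability and observability at T equal to a multiple of pi can be seen numerically. The sketch below assumes the sampled-data coefficients of (24), namely the rotation matrix with entries cos T, sin T and the input vector [1 - cos T, sin T]^T, together with the output y = [1 0]x; the function names are illustrative, and floating-point results at T = pi are zero only up to rounding.

```python
import math

def sampled_oscillator(T):
    """Zero-order-hold model of x' = [[0, 1], [-1, 0]] x + [0, 1]^T u with period T."""
    c, s = math.cos(T), math.sin(T)
    Ad = [[c, s], [-s, c]]
    bd = [1.0 - c, s]
    return Ad, bd

def reachability_det(T):
    """det [bd  Ad bd] for the sampled state equation."""
    Ad, bd = sampled_oscillator(T)
    Abd = [Ad[0][0] * bd[0] + Ad[0][1] * bd[1],
           Ad[1][0] * bd[0] + Ad[1][1] * bd[1]]
    return bd[0] * Abd[1] - Abd[0] * bd[1]

def observability_det(T):
    """det [[1, 0], [cos T, sin T]] = sin T for the output y = [1 0] x."""
    Ad, _ = sampled_oscillator(T)
    return Ad[0][1]

for T in (math.pi / 2, math.pi, 2 * math.pi):
    print(T, reachability_det(T), observability_det(T))
```

Expanding the first determinant by hand gives -2 sin T (1 - cos T), which vanishes exactly when sin T = 0, in agreement with the values of T singled out in the example.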

EXERCISES
Exercise 25.1 Prove Theorem 25.10.

Exercise 25.2 Provide a proof or counterexample to the following claim. Given any n x n
matrix sequence A(k) there exists an n x 1 vector sequence b(k) such that

x(k+1) = A(k)x(k) + b(k)u(k)

is reachable on [0, k1] for some k1 > 0. Repeat the question under the assumption that A(k) is
invertible at each k.

Exercise 25.3 Show that the reachability Gramian satisfies the matrix difference equation

$$
W(k_0, k+1) = A(k)W(k_0, k)A^T(k) + B(k)B^T(k)
$$

for k ≥ k0 + 1. Also prove that

$$
W(k_0, k_1) = \Phi(k_1, k)W(k_0, k)\Phi^T(k_1, k) + W(k, k_1), \qquad k = k_0+1, \ldots, k_1-1
$$

Exercise 25.4 Establish properties of the observability Gramian M(k0, k1) corresponding to the
properties of W(k0, k1) in Exercise 25.3.

Exercise 25.5 Suppose that the time-invariant linear state equation

x(k+1) = Ax(k) + Bu(k)

is reachable and A has magnitude-less-than-unity eigenvalues. Show that there exists a symmetric,
positive-definite, n x n matrix Q satisfying

$$
AQA^T - Q = -BB^T
$$

Exercise 25.6 Suppose that the time-invariant linear state equation

x(k+1) = Ax(k) + Bu(k)

is reachable and there exists a symmetric, positive-definite, n x n matrix Q satisfying

$$
AQA^T - Q = -BB^T
$$

Show that all eigenvalues of A have magnitude less than unity. Hint: Use the (in general complex)
left eigenvectors of A in a clever way.

Exercise 25.7 The linear state equation

x(k+1) = A(k)x(k) + B(k)u(k)
y(k) = C(k)x(k)

is called output reachable on [k0, k1] if for any given p x 1 vector y1 there exists an input signal
u(k) such that the corresponding solution with x(k0) = 0 satisfies y(k1) = y1. Assuming
rank C(k1) = p, show that a necessary and sufficient condition for output reachability on [k0, k1] is
invertibility of the p x p matrix

$$
\sum_{j=k_0}^{k_1-1} C(k_1)\Phi(k_1, j+1)B(j)B^T(j)\Phi^T(k_1, j+1)C^T(k_1)
$$

Explain the role of the rank assumption on C(k1). For the special case m = p = 1, express the
condition in terms of the unit-pulse response of the state equation.


Exercise 25.8 For a time-invariant linear state equation

x(k+1) = Ax(k) + Bu(k)
y(k) = Cx(k)

with rank C = p, continue Exercise 25.7 by deriving a necessary and sufficient condition for
output reachability similar to the condition in Theorem 25.6. If m = p = 1 characterize an output
reachable state equation in terms of its unit-pulse response, and its transfer function.

Exercise 25.9 Suppose the single-input, single-output, n-dimensional, time-invariant linear state
equation

x(k+1) = Ax(k) + bu(k)
y(k) = cx(k)

is reachable and observable. Show that A and bc do not commute if n ≥ 2.

Exercise 25.10 The linear state equation

x(k+1) = A(k)x(k) + B(k)u(k), x(k0) = x0

is called controllable on [k0, k1] if for any given n x 1 vector x0 there exists an input signal u(k)
such that the solution with x(k0) = x0 satisfies x(k1) = 0. Show that the state equation is
controllable on [k0, k1] if and only if the range of $\Phi(k_1, k_0)$ is contained in the range of R(k0, k1).
Under appropriate additional assumptions show that the state equation is controllable on [k0, k1] if
and only if the n x n controllability Gramian

$$
W_C(k_0, k_1) = \sum_{j=k_0}^{k_1-1} \Phi(k_0, j+1)B(j)B^T(j)\Phi^T(k_0, j+1)
$$

is invertible. Show also that if A(k) is invertible at each k, then the state equation is reachable on
[k0, k1] if and only if it is controllable on [k0, k1].

Exercise 25.11 Based on Exercise 25.10, define a natural concept of output controllability for a
time-varying linear state equation. Assuming A(k) is invertible at each k, develop a basic
Gramian criterion for output controllability of the type in Exercise 25.7.

Exercise 25.12 A linear state equation

x(k+1) = A(k)x(k), x(k0) = x0
y(k) = C(k)x(k)

is called reconstructible on [k0, k1] if for any x0 the state x(k1) is uniquely determined by the
response y(k), k = k0, ..., k1 - 1. Prove that observability on [k0, k1] implies reconstructibility on
[k0, k1]. On the other hand give an example that is reconstructible on a fixed [k0, k1], but not
observable on [k0, k1]. Then assume A(k) is invertible at each k, and characterize the
reconstructibility property in terms of the n x n reconstructibility Gramian

$$
M_R(k_0, k_1) = \sum_{j=k_0}^{k_1-1} \Phi^T(j, k_1)C^T(j)C(j)\Phi(j, k_1)
$$

Establish the relationship of reconstructibility to observability in this case.

Exercise 25.13 A time-invariant linear state equation

x(k+1) = Ax(k), x(0) = x0
y(k) = Cx(k)

is called reconstructible if for any x0 the state x(n) is uniquely determined by the response y(k),
k = 0, ..., n - 1. Derive a necessary and sufficient condition for reconstructibility in terms of
the observability matrix. Hint: Consider the null spaces of $A^n$ and the observability matrix.


NOTES
Note 25.1 As noted in Remark 25.2, a discrete-time linear state equation can fail to be reachable
on [k0, k1] simply because k1 - k0 is too small. One way to deal with this is to use a different type
of definition: A discrete-time linear state equation is reachable at time k1 if there exists a (finite)
integer k0 < k1 such that it is reachable on [k0, k1]. Then we call the state equation reachable if it
is reachable at k1 for every k1. This style of formulation is typical in the literature of observability
as well.

Note 25.2 References treating reachability and observability for time-varying, discrete-time
linear state equations include

L. Weiss, "Controllability, realization, and stability of discrete-time systems," SIAM Journal on
Control and Optimization, Vol. 10, No. 2, pp. 230-251, 1972

F.M. Callier, C.A. Desoer, Linear System Theory, Springer-Verlag, New York, 1991

as well as many publications in between. These references also treat the notions of controllability
and reconstructibility introduced in the Exercises, but there is wide variation in the details of
definitions. Concepts of output controllability are introduced in

P.E. Sarachik, E. Kreindler, "Controllability and observability of linear discrete-time systems,"
International Journal of Control, Vol. 1, No. 5, pp. 419-432, 1965

Note 25.3 For periodic linear state equations the concepts of reachability, observability,
controllability, and reconstructibility in both the discrete-time and continuous-time settings are
compared in

S. Bittanti, "Deterministic and stochastic linear periodic systems," in Time Series and Linear
Systems, S. Bittanti, ed., Lecture Notes in Control and Information Sciences, Springer-Verlag,
New York, 1986

So-called structured linear state equations, where the coefficient matrices have some fixed zero
entries, but other entries unknown, also have been studied. Such a state equation is called
structurally reachable if there exists a reachable state equation with the same fixed zero entries,
that is, the same structure. Investigation of this concept usually is based on graph-theoretic
methods. For a discussion of both time-invariant and time-varying formulations and references,
see

S. Poljak, "On the gap between the structural controllability of time-varying and time-invariant
systems," IEEE Transactions on Automatic Control, Vol. 37, No. 12, pp. 1961-1965, 1992

Reachability and observability concepts also can be developed for the positive state equations
mentioned in Note 20.7. Consult

M.P. Fanti, B. Maione, B. Turchiano, "Controllability of multi-input positive discrete-time
systems," International Journal of Control, Vol. 51, No. 6, pp. 1295-1308, 1990

Note 25.4 Additional properties of a reachability nature, in particular the capability of exactly
following a prescribed output trajectory, are discussed in

J.C. Engwerda, "Control aspects of linear discrete time-varying systems," International Journal
of Control, Vol. 48, No. 4, pp. 1631-1658, 1988

Geometric ideas of the type introduced in Chapters 18 and 19 are used in this paper.


Note 25.5 The issue of loss of reachability with sampled input raised in Example 25.15 can be
pursued further. It can be shown that a controllable, continuous-time, time-invariant linear state
equation with input that passes through a period-T sampler and zero-order hold yields a reachable
discrete-time state equation if

$$
\lambda_i - \lambda_j \ne \frac{2\pi q\sqrt{-1}}{T}, \qquad q = 1, 2, \ldots
$$

for every pair of eigenvalues $\lambda_i$, $\lambda_j$ of A. (This condition also is necessary in the single-input case.)
A similar result holds for loss of observability. A proof based on Jordan form (see Exercise 13.5)
is given in

R.E. Kalman, Y.C. Ho, K.S. Narendra, "Controllability of linear dynamical systems,"
Contributions to Differential Equations, Vol. 1, No. 2, pp. 189-213, 1963

A proof based on the rank-condition tests for controllability in Chapter 13 is given in Chapter 3 of

E.D. Sontag, Mathematical Control Theory, Springer-Verlag, New York, 1990

In any case by choosing the sampling period T sufficiently small, that is, sampling at a sufficiently
high rate, this loss of reachability and/or observability can be avoided. Preservation of the weaker
property of stabilizability (see Exercise 14.8 or Definition 18.27) under sampling with zero-order
hold is discussed in

M. Kimura, "Preservation of stabilizability of a continuous-time system after discretization,"
International Journal of System Science, Vol. 21, No. 1, pp. 65-91, 1990

Similar questions for sampling with a first-order hold (see Note 20.8) are considered in

T. Hagiwara, "Preservation of reachability and observability under sampling with a first-order
hold," IEEE Transactions on Automatic Control, Vol. 40, No. 1, pp. 104-107, 1995

26
DISCRETE TIME
REALIZATION

In this chapter we begin to address questions related to the input-output (zero-state)
behavior of the discrete-time linear state equation

$$
\begin{aligned}
x(k+1) &= A(k)x(k) + B(k)u(k), \quad x(k_0) = 0 \\
y(k) &= C(k)x(k) + D(k)u(k)
\end{aligned}
\tag{1}
$$

retaining of course our default dimensions n, m, and p for the state, input, and output.
With zero initial state assumed, the output signal y(k) corresponding to a given input
signal u(k) can be written as

$$
y(k) = \sum_{j=k_0}^{k} G(k, j)u(j), \qquad k \ge k_0
\tag{2}
$$

where

$$
G(k, j) = \begin{cases} D(k), & j = k \\ C(k)\Phi(k, j+1)B(j), & k \ge j+1 \end{cases}
$$

Given the state equation (1), obviously G(k, j) can be computed so that the input-output
behavior is known according to (2). Our interest here is in reversing this
computation, and in particular we want to establish conditions on a specified G(k, j)
that guarantee existence of a corresponding linear state equation. Aside from a certain
theoretical symmetry, general motivation for our interest is provided by problems of
implementing linear input/output behavior. Discrete-time linear state equations can be
constructed in hardware, as mentioned in Chapter 20, or easily programmed in software
for recursive numerical solution.
Some terminology in Chapter 20 that goes with (2) bears repeating. The input-output
behavior is causal since, for any $k_a \ge k_0$, the output value $y(k_a)$ does not depend
on values of the input at times greater than $k_a$. Also the input-output behavior is linear
since the response to a (constant-coefficient) linear combination of input signals
$\alpha u_a(k) + \beta u_b(k)$ is $\alpha y_a(k) + \beta y_b(k)$, in the obvious notation. (In particular the zero-state
response to the all-zero input sequence is the all-zero output sequence.) Thus we are
interested in linear state equation representations for causal, linear input-output behavior
described in the form (2).

Realizability
In considering existence of a linear state equation (1) corresponding to a given G(k, j), it is apparent that D(k) = G(k, k) plays an unessential role. We assume henceforth that D(k) is zero to simplify matters, and as a result we focus on G(k, j) for k, j such that k ≥ j + 1. Also we continue to call G(k, j) the unit-pulse response, even in the multi-input, multi-output case where the terminology is slightly misleading.

When there exists one linear state equation corresponding to a specified G(k, j), there exist many, since a change of state variables leaves G(k, j) unaffected. Also there exist linear state equations of different dimensions that yield a specified unit-pulse response. In particular new state variables that are disconnected from the input, the output, or both can be added to a state equation without changing the associated input-output behavior.

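26.1 Example If the linear state equation (1), with D(k) zero, corresponds to the input-output behavior in (2), then a state equation of the form

$$
\begin{aligned}
\begin{bmatrix} x(k+1) \\ z(k+1) \end{bmatrix} &= \begin{bmatrix} A(k) & 0 \\ 0 & F(k) \end{bmatrix} \begin{bmatrix} x(k) \\ z(k) \end{bmatrix} + \begin{bmatrix} B(k) \\ 0 \end{bmatrix} u(k) \\
y(k) &= \begin{bmatrix} C(k) & 0 \end{bmatrix} \begin{bmatrix} x(k) \\ z(k) \end{bmatrix}
\end{aligned} \tag{3}
$$

yields the same input-output behavior. This is clear from Figure 26.2, or, since the transition matrix for (3) is block diagonal, from the easy calculation

$$
\begin{bmatrix} C(k) & 0 \end{bmatrix} \begin{bmatrix} \Phi_A(k, j+1) & 0 \\ 0 & \Phi_F(k, j+1) \end{bmatrix} \begin{bmatrix} B(j) \\ 0 \end{bmatrix} = C(k)\Phi_A(k, j+1)B(j), \quad k \ge j+1
$$

□□□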
This example shows that if a linear state equation of dimension n has input-output
behavior specified by G (k, j), then for any positive integer q there are state equations
of dimension n + q that have the same input-output behavior. Thus our main theoretical
interest is to consider least-dimension linear state equations corresponding to a specified
G (k, j). A direct motivation is that a least-dimension linear state equation is in some
sense a simplest linear state equation yielding the specified input-output behavior.

26.2 Figure Structure of the linear state equation (3).

26.3 Remark Readers familiar with continuous-time realization theory (Chapter 10) might notice that we do not have the option of defining a weighting pattern in the discrete-time case. This restriction is a consequence of non-invertible transition matrices, and it leads to a number of difficulties in discrete-time realization theory. Methods we use to circumvent some of the difficulties are reminiscent of the continuous-time minimal realization theory for impulse responses discussed in Chapter 11. However not all difficulties can be avoided easily, and our treatment contains gaps. See Notes 26.2 and 26.3.

□□□
Terminology that aids discussion of the realizability problem can be formalized as
follows.
26.4 Definition A linear state equation of dimension n

$$
\begin{aligned}
x(k+1) &= A(k)x(k) + B(k)u(k) \\
y(k) &= C(k)x(k)
\end{aligned} \tag{4}
$$

is called a realization of the unit-pulse response G(k, j) if

$$
G(k, j) = C(k)\Phi(k, j+1)B(j)
$$

for all k, j such that k ≥ j + 1. If a realization (4) exists, then the unit-pulse response is called realizable, and (4) is called a minimal realization if no realization of G(k, j) with dimension less than n exists.

26.5 Theorem The unit-pulse response G(k, j) is realizable if there exist a p x n matrix sequence H(k) and an n x m matrix sequence F(k), both defined for all k, such that

$$
G(k, j) = H(k)F(j) \tag{6}
$$

for all k, j such that k ≥ j + 1.

Proof Suppose there exist (constant-dimension) matrix sequences F(k) and H(k) such that (6) is satisfied. Then it is easy to verify that

$$
\begin{aligned}
x(k+1) &= I\,x(k) + F(k)u(k) \\
y(k) &= H(k)x(k)
\end{aligned} \tag{7}
$$

is a realization of G(k, j), since the transition matrix for an identity matrix is an identity matrix.

□□□
Failure of the factorization condition (6) to be necessary for realizability can be

illustrated with exceedingly simple examples.

26.6 Example The unit-pulse response of the scalar, discrete-time linear state equation

$$
\begin{aligned}
x(k+1) &= u(k) \\
y(k) &= x(k)
\end{aligned} \tag{8}
$$

can be written as $G(k, j) = \delta(k-j-1)$, where $\delta(k)$ is the unit pulse, since the transition matrix (scalar) for 0 is $\Phi(k, j) = \delta(k-j)$. A little thought reveals that there is no way to write this unit-pulse response in the product form in (6).

□□□
While Theorem 26.5 provides a basic sufficient condition for realizability of unit-pulse responses, often it is not very useful because determining if G(k, j) can be factored in the requisite way can be difficult. In addition a simple example shows that there can be attractive alternatives to the realization (7).

26.7 Example For the unit-pulse response

$$
G(k, j) = 2^{-(k-j)}, \quad k \ge j+1
$$

an obvious factorization gives a time-varying, dimension-one realization of the form (7)

$$
\begin{aligned}
x(k+1) &= x(k) + 2^{k}u(k) \\
y(k) &= 2^{-k}x(k)
\end{aligned}
$$

This linear state equation has an unbounded coefficient and clearly is not uniformly exponentially stable. However neither of these displeasing features is shared by the time-invariant, dimension-one realization

$$
\begin{aligned}
x(k+1) &= \tfrac{1}{2}x(k) + \tfrac{1}{2}u(k) \\
y(k) &= x(k)
\end{aligned}
$$
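The comparison in Example 26.7 can be checked numerically. The sketch below assumes the unit-pulse response is $2^{-(k-j)}$ and that the two realizations are, respectively, A(k) = 1, F(k) = $2^k$, H(k) = $2^{-k}$ and the time-invariant triple A = 1/2, B = 1/2, C = 1; the function names are hypothetical and exist only for this illustration.

```python
from fractions import Fraction

def G_target(k, j):
    """Assumed unit-pulse response of Example 26.7: 2^{-(k-j)} for k >= j+1."""
    return Fraction(2) ** (j - k)

def G_time_varying(k, j):
    """Realization of the form (7): Phi = 1, so G(k, j) = H(k) F(j) = 2^{-k} 2^{j}."""
    return Fraction(2) ** (-k) * Fraction(2) ** j

def G_time_invariant(k, j):
    """x(k+1) = x(k)/2 + u(k)/2, y(k) = x(k): G(k, j) = C A^{k-j-1} B."""
    return Fraction(1, 2) ** (k - j - 1) * Fraction(1, 2)

# Compare on a grid of index pairs satisfying k >= j+1.
pairs = [(k, j) for j in range(-3, 4) for k in range(j + 1, j + 7)]
agree = all(G_time_varying(k, j) == G_target(k, j) == G_time_invariant(k, j)
            for k, j in pairs)
```

Both realizations reproduce the same unit-pulse response on the tested index range, while only the first has a coefficient, $2^k$, that grows without bound.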

Transfer Function Realizability

For the time-invariant case realizability conditions and methods for computing a realization can be given in terms of the unit-pulse response G(k), often called in this context the Markov parameter sequence, or in terms of the transfer function. Of course the transfer function is the z-transform of the unit-pulse response. We concentrate here on the transfer function, returning to the Markov-parameter setting at the end of the chapter. That is, in place of the time-domain (convolution) description of input-output (zero-state) behavior

$$
y(k) = \sum_{j=0}^{k} G(k-j)u(j)
$$

the input-output relation is considered in the form

$$
Y(z) = G(z)U(z)
$$

where

$$
G(z) = \sum_{k=0}^{\infty} G(k)z^{-k}
$$

Similarly Y(z) and U(z) are the z-transforms of the output and input signals. We continue to assume D = 0, so G(0) = 0, and the realizability question is: Given a p x m transfer function G(z), when does there exist a time-invariant linear state equation

$$
\begin{aligned}
x(k+1) &= Ax(k) + Bu(k) \\
y(k) &= Cx(k)
\end{aligned} \tag{13}
$$

such that

$$
C(zI - A)^{-1}B = G(z) \tag{14}
$$

(This question is identical in format to its continuous-time sibling, and Theorem 10.10 carries over with no more change than a replacement of s by z.)
26.8 Theorem The transfer function G(z) admits a time-invariant realization (13) if
and only if each entry of G(z) is a strictly-proper rational function of z.

Proof If G(z) has a time-invariant realization (13), then (14) holds. As argued in Chapter 21, each entry of $(zI - A)^{-1}$ is a strictly-proper rational function. Linear combinations of strictly-proper rational functions are strictly-proper rational functions, so each entry of G(z) in (14) is a strictly-proper rational function.
Now suppose that each entry, $G_{ij}(z)$, is a strictly-proper rational function. We can assume that the denominator polynomial of each $G_{ij}(z)$ is monic, that is, the coefficient of the highest power of z is unity. Let

$$
d(z) = z^{r} + d_{r-1}z^{r-1} + \cdots + d_0
$$

be the (monic) least common multiple of these denominator polynomials. Then d(z)G(z) can be written as a polynomial in z with coefficients that are p x m constant matrices:

$$
d(z)G(z) = N_{r-1}z^{r-1} + \cdots + N_1 z + N_0 \tag{15}
$$

From this data we will show that the mr-dimensional linear state equation specified by the partitioned coefficient matrices

$$
A = \begin{bmatrix} 0_m & I_m & \cdots & 0_m \\ \vdots & \vdots & \ddots & \vdots \\ 0_m & 0_m & \cdots & I_m \\ -d_0 I_m & -d_1 I_m & \cdots & -d_{r-1} I_m \end{bmatrix}, \quad
B = \begin{bmatrix} 0_m \\ \vdots \\ 0_m \\ I_m \end{bmatrix}, \quad
C = \begin{bmatrix} N_0 & N_1 & \cdots & N_{r-1} \end{bmatrix}
$$

is a realization of G(z). Let

$$
X(z) = (zI - A)^{-1}B \tag{16}
$$

and partition the mr x m matrix X(z) into r blocks $X_1(z), \ldots, X_r(z)$, each m x m. Multiplying (16) by (zI - A) and writing the result in terms of partitions gives the set of relations

$$
X_{i+1}(z) = zX_i(z), \quad i = 1, \ldots, r-1 \tag{17}
$$

$$
zX_r(z) + d_0X_1(z) + d_1X_2(z) + \cdots + d_{r-1}X_r(z) = I_m \tag{18}
$$

Using (17) to rewrite (18) in terms of $X_1(z)$ gives

$$
X_1(z) = \frac{1}{d(z)} I_m
$$

Therefore, from (17) again,

$$
X(z) = \frac{1}{d(z)} \begin{bmatrix} I_m \\ zI_m \\ \vdots \\ z^{r-1}I_m \end{bmatrix}
$$

Finally multiplying through by C yields

$$
C(zI - A)^{-1}B = \frac{1}{d(z)} \left[ N_0 + N_1 z + \cdots + N_{r-1}z^{r-1} \right] = G(z)
$$

□□□

The realization for G(z) written down in this proof usually is far from minimal, though it is easy to show that it is always reachable.

26.9 Example For m = p = 1 the calculation in the proof of Theorem 26.8 simplifies to yield, in our customary notation, the result that the transfer function of the linear state equation

$$
\begin{aligned}
x(k+1) &= \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & \cdots & -a_{n-1} \end{bmatrix} x(k) + \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} u(k) \\
y(k) &= \begin{bmatrix} c_0 & c_1 & \cdots & c_{n-1} \end{bmatrix} x(k)
\end{aligned} \tag{19}
$$

is given by

$$
G(z) = \frac{c_{n-1}z^{n-1} + \cdots + c_1 z + c_0}{z^{n} + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0} \tag{20}
$$

(The n = 2 case is worked out in Example 21.3.) Thus the reachable realization (19) can be written down by inspection of the numerator and denominator coefficients of a given strictly-proper rational transfer function in (20). An easy drill in contradiction proofs shows that the linear state equation (19) is a minimal realization of the transfer function (20) (and thus also observable) if and only if the numerator and denominator polynomials in (20) have no roots in common. (See Exercise 26.8.) Arriving at the analogous result in the multi-input, multi-output case takes additional work that is carried out in Chapters 16 and 17.
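The "by inspection" construction for a scalar strictly-proper transfer function can be sketched as follows. The helper names are hypothetical, and the check compares the Markov parameters $CA^{k-1}B$ of the companion-form state equation with the coefficients of $z^{-k}$ obtained by long division of numerator by denominator. Coefficient lists are ordered from the constant term upward, which is an assumption of this sketch.

```python
from fractions import Fraction

def companion_realization(a, c):
    """a = [a0, ..., a_{n-1}], c = [c0, ..., c_{n-1}] -> (A, B, C) in companion form."""
    n = len(a)
    A = [[Fraction(int(j == i + 1)) for j in range(n)] for i in range(n - 1)]
    A.append([-Fraction(ai) for ai in a])          # bottom row: -a0 ... -a_{n-1}
    B = [Fraction(0)] * (n - 1) + [Fraction(1)]
    C = [Fraction(ci) for ci in c]
    return A, B, C

def markov_from_state(A, B, C, count):
    """Markov parameters g_k = C A^{k-1} B for k = 1, ..., count."""
    n, v, out = len(B), list(B), []
    for _ in range(count):
        out.append(sum(C[i] * v[i] for i in range(n)))
        v = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    return out

def markov_from_division(a, c, count):
    """Coefficients of z^{-k} in the expansion of c(z)/d(z), by long division."""
    n, g = len(a), []
    for k in range(1, count + 1):
        ck = Fraction(c[n - k]) if k <= n else Fraction(0)
        acc = sum(Fraction(a[n - i]) * g[k - i - 1] for i in range(1, k) if i <= n)
        g.append(ck - acc)
    return g

# Hypothetical example: G(z) = z / (z^2 - 6z + 8)
a, c = [8, -6], [0, 1]
A, B, C = companion_realization(a, c)
g_state = markov_from_state(A, B, C, 6)
g_div = markov_from_division(a, c, 6)
```

For this example both computations give the same sequence, beginning 1, 6, 28, which is the time-domain counterpart of the transfer-function identity (14).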

Minimal Realization
Returning to the time-varying case, we now consider the problems of characterizing and constructing minimal realizations of a specified unit-pulse response. Perhaps it is helpful to mention some simple-to-prove observations that are used in the development. The first is that properties of reachability on $[k_0, k_f]$ and observability on $[k_0, k_f]$ are not affected by a change of state variables. Second if (4) is an n-dimensional realization of a given unit-pulse response, then the linear state equation obtained by changing variables according to $z(k) = P^{-1}(k)x(k)$ also is an n-dimensional realization of the same unit-pulse response.
It is not surprising, in view of Example 26.1, that reachability and observability play a role in characterizing minimality. However these concepts do not provide the whole story, an unfortunate fact we illustrate by example and discuss in Note 26.3.

26.10 Example The discrete-time linear state equation

$$
\begin{aligned}
x(k+1) &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} x(k) + \begin{bmatrix} 1 \\ \delta(k-1) \end{bmatrix} u(k) \\
y(k) &= \begin{bmatrix} 1 & \delta(k) \end{bmatrix} x(k)
\end{aligned} \tag{21}
$$

is both reachable and observable on any interval containing k = 0, 1, 2. However the unit-pulse response of the state equation can be written as

$$
G(k, j) = 1 + \delta(k)\delta(j-1) = 1, \quad k \ge j+1
$$

since $\delta(k)\delta(j-1)$ is zero for k ≥ j + 1. The state equation (21) is not a minimal realization of this unit-pulse response, for indeed a minimal realization is provided by the scalar state equation

$$
\begin{aligned}
z(k+1) &= z(k) + u(k) \\
y(k) &= z(k)
\end{aligned}
$$

□□□
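A short numerical check may help here. The sketch assumes the coefficients of the two-dimensional state equation in Example 26.10 are A(k) = I, B(k) = [1, δ(k-1)]ᵀ, and C(k) = [1, δ(k)]; with A(k) = I the transition matrix is the identity, so the reachability matrix over a window is just a list of B columns. The function names are hypothetical.

```python
def delta(k):
    """Unit pulse."""
    return 1 if k == 0 else 0

def B(k): return (1, delta(k - 1))
def C(k): return (1, delta(k))

def G(k, j):
    """Unit-pulse response C(k) Phi(k, j+1) B(j) with Phi = I (assumed A(k) = I)."""
    return sum(c * b for c, b in zip(C(k), B(j)))

def R(k, l):
    """Columns of the l-step reachability matrix [B(k-1) ... B(k-l)] when Phi = I."""
    return [B(k - i) for i in range(1, l + 1)]

def rank_of_columns(cols):
    """Rank of a matrix with two rows, given as a list of 2-vector columns."""
    if all(c == (0, 0) for c in cols):
        return 0
    for u in cols:
        for v in cols:
            if u[0] * v[1] - u[1] * v[0] != 0:
                return 2
    return 1

all_ones = all(G(k, j) == 1 for j in range(-4, 5) for k in range(j + 1, j + 8))
rank_near_pulse = rank_of_columns(R(2, 2))   # window covers k = 0, 1
rank_far_away = rank_of_columns(R(6, 3))     # window covers k = 3, 4, 5
```

Under these assumptions the unit-pulse response equals 1 for all k ≥ j + 1, and the reachability matrix has full rank only on windows that include the isolated pulse in B(k). This is the behavior that the stronger, uniform notions introduced next are designed to exclude.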
One way to avoid difficulty is to adopt stronger notions of reachability and observability.

26.11 Definition The linear state equation (1) is called l-step reachable if l is a positive integer such that (1) is reachable on $[k_0, k_0 + l]$ for any $k_0$.

It turns out to be more convenient, and of course equivalent, to consider intervals of the form $[k_0 - l, k_0]$. In this setting we drop the subscript 0 and rewrite the reachability matrix $R(k_0, k_f)$ for consideration of l-step reachability as follows. For any integer l ≥ 1 let

$$
R_l(k) = R(k-l, k) = \begin{bmatrix} B(k-1) & \Phi(k, k-1)B(k-2) & \cdots & \Phi(k, k-l+1)B(k-l) \end{bmatrix} \tag{22}
$$

and similarly evaluate the corresponding reachability Gramian to write

$$
W(k-l, k) = \sum_{j=k-l}^{k-1} \Phi(k, j+1)B(j)B^{T}(j)\Phi^{T}(k, j+1)
$$

Then from Theorem 25.3 and Theorem 25.4 we conclude the following characterizations of l-step reachability in terms of either $R_l(k)$ or W(k-l, k).

26.12 Theorem The linear state equation (1) is l-step reachable if and only if

$$
\operatorname{rank} R_l(k) = n
$$

for all k, or equivalently W(k-l, k) is invertible for all k.

For observability we propose an analogous setup, with a minor difference in the form of the time interval so that subsequent formulas are pretty.

26.13 Definition The linear state equation (1) is called l-step observable if l is a positive integer such that (1) is observable on $[k_0, k_0 + l]$ for any $k_0$.
It is convenient to rewrite the observability matrix and observability Gramian for consideration of l-step observability. For any integer l ≥ 1 let

$$
O_l(k) = O(k, k+l) = \begin{bmatrix} C(k) \\ C(k+1)\Phi(k+1, k) \\ \vdots \\ C(k+l-1)\Phi(k+l-1, k) \end{bmatrix} \tag{23}
$$

and evaluate the observability Gramian to write

$$
M(k, k+l) = \sum_{j=k}^{k+l-1} \Phi^{T}(j, k)C^{T}(j)C(j)\Phi(j, k)
$$

26.14 Theorem The linear state equation (1) is l-step observable if and only if

$$
\operatorname{rank} O_l(k) = n
$$

for all k, or equivalently M(k, k+l) is invertible for all k.

It should be clear that if (1) is l-step reachable, then it is (l+q)-step reachable for any integer q ≥ 0. The same is true of l-step observability, and so for a particular linear state equation we usually phrase observability and reachability in terms of the largest of the two l's to simplify terminology. Also note that by a simple index change a linear state equation is l-step reachable if and only if W(k, k+l) is invertible for all k. We sometimes shift the arguments of l-step Gramians in this way for convenience in stating results. Finally reachability and observability for a time-invariant, dimension-n linear state equation are the same as n-step reachability and n-step observability.
26.15 Theorem Suppose the linear state equation (4) is a realization of the unit-pulse response G(k, j). If there is a positive integer l such that (4) is both l-step reachable and l-step observable, then (4) is a minimal realization of G(k, j).
Proof Suppose G(k, j) has a dimension-n realization (4) that is l-step reachable and l-step observable, but is not minimal. Then we can assume there is an (n-1)-dimensional realization

$$
\begin{aligned}
z(k+1) &= \hat{A}(k)z(k) + \hat{B}(k)u(k) \\
y(k) &= \hat{C}(k)z(k)
\end{aligned} \tag{24}
$$

and write

$$
G(k, j) = C(k)\Phi_A(k, j+1)B(j) = \hat{C}(k)\Phi_{\hat{A}}(k, j+1)\hat{B}(j)
$$

for all k, j such that k ≥ j + 1. These matters can be arranged in matrix form. For any k we use the composition property for transition matrices to write the lp x lm partitioned-matrix equality

$$
\begin{aligned}
\begin{bmatrix} G(k, k-1) & \cdots & G(k, k-l) \\ \vdots & & \vdots \\ G(k+l-1, k-1) & \cdots & G(k+l-1, k-l) \end{bmatrix}
&= \begin{bmatrix} C(k)B(k-1) & \cdots & C(k)\Phi_A(k, k-l+1)B(k-l) \\ \vdots & & \vdots \\ C(k+l-1)\Phi_A(k+l-1, k)B(k-1) & \cdots & C(k+l-1)\Phi_A(k+l-1, k-l+1)B(k-l) \end{bmatrix} \\
&= O_l(k)R_l(k)
\end{aligned} \tag{25}
$$

(This is printed in a sparse format, though it should be clear that the (i, j)-partition is the p x m matrix equality

$$
\begin{aligned}
G(k+i-1, k-j) &= C(k+i-1)\Phi_A(k+i-1, k-j+1)B(k-j) \\
&= C(k+i-1)\Phi_A(k+i-1, k)\Phi_A(k, k-j+1)B(k-j), \quad 1 \le i, j \le l
\end{aligned}
$$

the right side of which is the $i^{th}$-block row of $O_l(k)$ multiplying the $j^{th}$-block column of $R_l(k)$.) Of course a similar matrix arrangement in terms of the coefficients of the realization (24) gives, in the obvious notation,

$$
O_l(k)R_l(k) = \hat{O}_l(k)\hat{R}_l(k)
$$

for all k. Since $\hat{O}_l(k)$ has n-1 columns and $\hat{R}_l(k)$ has n-1 rows, we conclude that rank $[O_l(k)R_l(k)] \le n-1$. This contradiction to the hypotheses of l-step reachability and l-step observability of (4) completes the proof.

□□□
Another troublesome aspect of the discrete-time minimal realization problem, illustrated in Exercise 26.1, also is avoided by considering only realizations that are l-step reachable and l-step observable. Behind an orgy of indices the proof of the following result is similar to the proof of Theorem 10.14, and also similar to a proof requested in Exercise 11.9. (We overlook a temporary notational collision of G's.)

26.16 Theorem Suppose the discrete-time linear state equations (4) and

$$
\begin{aligned}
z(k+1) &= F(k)z(k) + G(k)u(k) \\
y(k) &= H(k)z(k)
\end{aligned}
$$

both are l-step reachable and l-step observable (hence minimal) realizations of the same unit-pulse response. Then there is a state variable change $z(k) = P^{-1}(k)x(k)$ relating the two realizations.

Proof By assumption,

$$
C(k)\Phi_A(k, j+1)B(j) = H(k)\Phi_F(k, j+1)G(j) \tag{26}
$$

for all k, j such that k ≥ j + 1. As in the proof of Theorem 26.15, (25) in particular, this data can be arranged in partitioned-matrix form. Since l is fixed throughout the proof, we use subscripts on the l-step reachability and l-step observability matrices to keep track of the realization. Thus, by assumption,

$$
O_a(k)R_a(k) = O_f(k)R_f(k) \tag{27}
$$

for all k. Now define the n x n matrices

$$
\begin{aligned}
P_r(k) &= R_a(k)R_f^{T}(k)\left[ R_f(k)R_f^{T}(k) \right]^{-1} \\
P_o(k) &= \left[ O_f^{T}(k)O_f(k) \right]^{-1} O_f^{T}(k)O_a(k)
\end{aligned}
$$

Using (27) yields $P_o(k)P_r(k) = I$ for all k, which implies invertibility of both matrices for all k. The remainder of the proof involves showing that a suitable variable change is $P(k) = P_r(k)$, $P^{-1}(k) = P_o(k)$.

From (27) we obtain

$$
\left[ O_f^{T}(k+1)O_f(k+1) \right]^{-1} O_f^{T}(k+1)O_a(k+1)R_a(k+1) = P^{-1}(k+1)R_a(k+1) = R_f(k+1) \tag{28}
$$

the first block column of which gives

$$
P^{-1}(k+1)B(k) = G(k)
$$

for all k. Similarly,

$$
O_a(k)R_a(k)R_f^{T}(k)\left[ R_f(k)R_f^{T}(k) \right]^{-1} = O_a(k)P(k) = O_f(k) \tag{29}
$$

the first block row of which gives

$$
C(k)P(k) = H(k)
$$

for all k.

It remains to establish the relation between A(k) and F(k), and for this we rearrange the data in (26) as

$$
\begin{bmatrix} C(k+1)\Phi_A(k+1, k) \\ C(k+2)\Phi_A(k+2, k) \\ \vdots \\ C(k+l)\Phi_A(k+l, k) \end{bmatrix} R_a(k) = \begin{bmatrix} H(k+1)\Phi_F(k+1, k) \\ H(k+2)\Phi_F(k+2, k) \\ \vdots \\ H(k+l)\Phi_F(k+l, k) \end{bmatrix} R_f(k)
$$

(This corresponds to deleting the top block row from (27) and adding a new block bottom row.) Applying the composition property of the transition matrix, a more compact form is

$$
O_a(k+1)A(k)R_a(k) = O_f(k+1)F(k)R_f(k) \tag{30}
$$

From (28) and (29) we obtain

$$
O_f(k+1)P^{-1}(k+1)A(k)P(k)R_f(k) = O_f(k+1)F(k)R_f(k)
$$

Multiplying on the left by $O_f^{T}(k+1)$ and on the right by $R_f^{T}(k)$, and using the invertibility of $O_f^{T}(k+1)O_f(k+1)$ and $R_f(k)R_f^{T}(k)$, gives that

$$
P^{-1}(k+1)A(k)P(k) = F(k)
$$

for all k.

□□□
A sufficient condition for realizability and a construction procedure for an l-step reachable and l-step observable (hence minimal) realization can be developed in terms of matrices defined from a specified unit-pulse response G(k, j). Given positive integers l, q we define an (lp) x (qm) behavior matrix corresponding to G(k, j) as

$$
\Gamma_{lq}(k, j) = \begin{bmatrix}
G(k, j) & G(k, j-1) & \cdots & G(k, j-q+1) \\
G(k+1, j) & G(k+1, j-1) & \cdots & G(k+1, j-q+1) \\
\vdots & \vdots & & \vdots \\
G(k+l-1, j) & G(k+l-1, j-1) & \cdots & G(k+l-1, j-q+1)
\end{bmatrix} \tag{31}
$$

for all k, j such that k ≥ j + 1. In terms of a realization (4), this can be written more compactly as

$$
\Gamma_{lq}(k, j) = O_l(k)\Phi(k, j+1)R_q(j+1)
$$

In particular for j = k-1, similar to (25),

$$
\Gamma_{lq}(k, k-1) = O_l(k)R_q(k) \tag{32}
$$
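Behavior matrices are easy to build and rank-test by machine. The sketch below handles only the scalar case m = p = 1 and uses exact rational Gaussian elimination; the two unit-pulse responses are hypothetical examples chosen for illustration, not taken from the text.

```python
from fractions import Fraction

def behavior_matrix(G, l, q, k, j):
    """Gamma_{lq}(k, j) for a scalar unit-pulse response G(k, j), k >= j+1."""
    return [[G(k + r, j - s) for s in range(q)] for r in range(l)]

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

G1 = lambda k, j: Fraction(2) ** (j - k)   # hypothetical: 2^{-(k-j)}
G2 = lambda k, j: k - j                    # hypothetical: a ramp in k - j

rank_G1 = rank(behavior_matrix(G1, 3, 3, 5, 2))
rank_G2 = rank(behavior_matrix(G2, 3, 3, 5, 2))
```

At the sampled indices the first behavior matrix has rank 1 and the second has rank 2, consistent with the factorization (32) for realizations of dimension one and two, respectively. A rank condition that is required for all k, j must of course be argued symbolically or checked over the whole index range of interest.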

Analysis of two consecutive behavior matrices for suitable l, q, corresponding to a specified G(k, j), leads to a realization construction involving submatrices of $\Gamma_{lq}(k, k-1)$. This result is based on elementary matrix algebra, but unfortunately the hypotheses are rather restrictive. More general treatments based on more sophisticated algebraic tools are mentioned in Note 26.2.
A few observations might be helpful in digesting proofs involving behavior matrices. A submatrix, unlike a partition, need not be formed from entries in adjacent rows and columns. For example one 2 x 2 submatrix of a 3 x 3 matrix A, with entries $a_{ij}$, is

$$
\begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix}
$$

It is useful to contemplate the properties of a large, rank-n matrix in regard to an n x n invertible submatrix. In particular any column (row) of the matrix can be uniquely expressed as a linear combination of the n columns (rows) corresponding to the columns (rows) of the invertible submatrix.

Matrix-algebra concepts associated with $\Gamma_{lq}(k, j)$ in the sequel are applied pointwise in k and j (with k ≥ j + 1). For example linear independence of rows of $\Gamma_{lq}(k, j)$ involves linear combinations of the rows using scalar coefficients that depend on k and j. Finally it is useful to write (31) in more detail on a large sheet of paper, and use sharp pencils in a variety of colors to explore the geography of behavior matrices developed in the proofs.

26.17 Theorem Suppose for the unit-pulse response G(k, j) there exist positive integers l, q, n such that l, q ≤ n and

$$
\operatorname{rank} \Gamma_{lq}(k, j) = \operatorname{rank} \Gamma_{l+1,q+1}(k, j) = n \tag{33}
$$

for all k, j with k ≥ j + 1. Also suppose there is a fixed n x n submatrix of $\Gamma_{lq}(k, j)$ that is invertible for all k, j with k ≥ j + 1. Then G(k, j) is realizable and has a minimal realization of dimension n.
Proof Assume (33) holds and F(k, j) is an n x n submatrix of $\Gamma_{lq}(k, j)$ that is invertible for all k, j with k ≥ j + 1. Let $F_c(k, j)$ be the p x n matrix comprising those columns of $\Gamma_{1q}(k, j)$ that correspond to columns of F(k, j), and let

$$
C_c(k, j) = F_c(k, j)F^{-1}(k, j) \tag{34}
$$

Then the coefficients in the $i^{th}$-row of $C_c(k, j)$ specify the linear combination of rows of F(k, j) that gives the $i^{th}$-row of $F_c(k, j)$. Similarly let $F_r(k, j)$ be the n x m matrix formed from those rows of $\Gamma_{l1}(k, j)$ that correspond to rows of F(k, j), and let

$$
B_r(k, j) = F^{-1}(k, j)F_r(k, j)
$$

The $i^{th}$-column of $B_r(k, j)$ specifies the linear combination of columns of F(k, j) that gives the $i^{th}$-column of $F_r(k, j)$. Then we claim that

$$
G(k, j) = C_c(k, j)F_r(k, j) = C_c(k, j)F(k, j)B_r(k, j) \tag{35}
$$

for all k, j with k ≥ j + 1. This relationship holds because, by (33), any row of $\Gamma_{lq}(k, j)$ can be represented as a linear combination of those rows of $\Gamma_{lq}(k, j)$ that correspond to rows of F(k, j). (Again, throughout this proof, the linear combinations resulting from the rank property (33) have scalar coefficients that are functions of k and j defined for k ≥ j + 1.)

In particular consider the single-input, single-output case. If m = p = 1, the hypotheses imply l = q = n, $F(k, j) = \Gamma_{nn}(k, j)$, and $F_c(k, j)$ is the first row of $\Gamma_{nn}(k, j)$. Therefore $C_c(k, j) = e_1^{T}$, the first row of $I_n$. Similarly $B_r(k, j) = e_1$, and (35) turns out to be the obvious

$$
G(k, j) = e_1^{T}\Gamma_{nn}(k, j)e_1 = \Gamma_{11}(k, j)
$$

(At various stages of this proof, consideration of the m = p = 1 case is a good way to ease into the admittedly-complicated general situation.)
The next step is to show that $C_c(k, j)$ is independent of j. From (34) we can write

$$
C_c(k, j-1)F(k, j-1) = F_c(k, j-1)
$$

But in $\Gamma_{l,q+1}(k, j)$ each column of F(k, j-1) occurs m columns to the right of the corresponding column of F(k, j). And the columns of $F_c(k, j-1)$ have the same relative locations with respect to columns of $F_c(k, j)$. Thus the rank condition (33) again implies that the $i^{th}$-row of $C_c(k, j)$ specifies the linear combination of rows of $\Gamma_{l,q+1}(k, j)$ corresponding to rows of F(k, j-1) that yields the $i^{th}$-row of $F_c(k, j-1)$ in $\Gamma_{l,q+1}(k, j)$. Since the rows of $\Gamma_{l,q+1}(k, j)$ are extensions of the rows of $\Gamma_{lq}(k, j)$, it follows that

$$
C_c(k, j-1) = C_c(k, j)
$$

and, with some abuse of notation, we let

$$
C_c(k) = C_c(k, k-1) = F_c(k, k-1)F^{-1}(k, k-1) \tag{36}
$$

A similar argument can be used to show that $B_r(k, j)$ is independent of k. Then with more of the same abuse of notation we let

$$
B_r(j) = B_r(j+1, j) = F^{-1}(j+1, j)F_r(j+1, j) \tag{37}
$$

and rewrite (35) as

$$
G(k, j) = C_c(k)F(k, j)B_r(j) \tag{38}
$$

for all k, j with k ≥ j + 1.


The remainder of the proof involves reworking the factorization of the unit-pulse response in (38) into a factorization of the type provided by a realization. To this end the notation

$$
F_s(k, j) = F(k+1, j)
$$

is temporarily convenient. Clearly $F_s(k, j)$ is an n x n submatrix of $\Gamma_{l+1,q+1}(k, j)$, and each entry of $F_s(k, j)$ occurs exactly p rows below the corresponding entry of F(k, j). Therefore the rank condition (33) implies that each row of $F_s(k, j)$ can be written as a linear combination of the rows of F(k, j). That is, collecting these linear combination coefficients into an n x n matrix A(k, j),

$$
F_s(k, j) = A(k, j)F(k, j)
$$

However we can show that A(k, j) is independent of j as follows. Each entry of $F_s(k, j-1) = F(k+1, j-1)$ occurs m columns to the right of the corresponding entry in F(k+1, j), and the rank condition implies

$$
F_s(k, j-1) = A(k, j)F(k, j-1)
$$

Also

$$
F_s(k, j-1) = A(k, j-1)F(k, j-1)
$$

and using the invertibility of F(k, j-1) gives

$$
A(k, j) = A(k, j-1)
$$

Therefore we let

$$
A(k) = F_s(k, k-1)F^{-1}(k, k-1)
$$

The transition matrix corresponding to A(k) is given by

$$
\Phi_A(k, j) = F(k, i)F^{-1}(j, i)
$$

as is easily verified by checking, for any k, j with k ≥ j,

$$
\Phi_A(k+1, j) = F(k+1, i)F^{-1}(j, i) = F_s(k, i)F^{-1}(j, i) = A(k)F(k, i)F^{-1}(j, i) = A(k)\Phi_A(k, j) \tag{40}
$$

together with $\Phi_A(j, j) = I$. In this calculation the parameter i must be no greater than either k-1 or j-1.

To continue we show that $F^{-1}(k, i)F(k, j)$ is not a function of k. Let

$$
E(k, i, j) = F^{-1}(k, i)F(k, j)
$$

Then, for example, the first column of E(k, i, j) specifies the linear combination of columns of F(k, i) that yields the first column of F(k, j). Each entry of F(k+1, i) occurs in $\Gamma_{l+1,q+1}(k, i)$ exactly p rows below the corresponding entry of F(k, i), and a similar statement holds for the first-column entries of F(k+1, j). Therefore the first column of E(k, i, j) also specifies the linear combination of columns of F(k+1, i) that gives the first column of F(k+1, j). Of course we also have

$$
E(k+1, i, j) = F^{-1}(k+1, i)F(k+1, j)
$$

and from this we conclude that the first column of E(k+1, i, j) is identical to the first column of E(k, i, j). Continuing this argument in a column-by-column fashion shows that E(k+1, i, j) = E(k, i, j), that is, E(k, i, j) is independent of k. We use this fact to set

$$
F^{-1}(k, i)F(k, j) = F^{-1}(j+1, i)F(j+1, j)
$$

which gives

$$
F(k, j) = F(k, i)F^{-1}(j+1, i)F(j+1, j) = \Phi_A(k, j+1)F(j+1, j)
$$

Then applying (36) and (37) shows that the factorization (38) can be written as

$$
G(k, j) = C_c(k)\Phi_A(k, j+1)F(j+1, j)B_r(j) = F_c(k, k-1)F^{-1}(k, k-1)\Phi_A(k, j+1)F_r(j+1, j)
$$

for all k, j with k ≥ j + 1. Thus it is clear that an n-dimensional realization of G(k, j) is specified by

$$
\begin{aligned}
A(k) &= F_s(k, k-1)F^{-1}(k, k-1) \\
B(k) &= F_r(k+1, k) \\
C(k) &= F_c(k, k-1)F^{-1}(k, k-1)
\end{aligned} \tag{42}
$$

Finally since l, q ≤ n, $\Gamma_{nn}(k, j)$ has rank at least n for all k, j such that k ≥ j + 1. Therefore $\Gamma_{nn}(k, k-1)$ has rank at least n for all k. Then (32) gives that the realization we have constructed is n-step reachable and n-step observable, hence minimal.

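26.18 Example Given the unit-pulse response

$$
G(k, j) = 2^{k}\sin[\pi(k-j)/4]
$$

the realizability test in Theorem 26.17 and realization construction in the proof begin with rank calculations. With drudgery relieved by a convenient software package, we find that

$$
\Gamma_{22}(k, j) = \begin{bmatrix} 2^{k}\sin[\pi(k-j)/4] & 2^{k}\sin[\pi(k-j+1)/4] \\ 2^{k+1}\sin[\pi(k-j+1)/4] & 2^{k+1}\sin[\pi(k-j+2)/4] \end{bmatrix}
$$

is invertible for all k, j with k ≥ j + 1. On the other hand further calculation yields det $\Gamma_{33}(k, j) = 0$ on the same index range. Thus the rank condition (33) is satisfied with l = q = n = 2, and we take $F(k, j) = \Gamma_{22}(k, j)$. Then

$$
F(k, k-1) = \begin{bmatrix} 2^{k}\sin \pi/4 & 2^{k}\sin \pi/2 \\ 2^{k+1}\sin \pi/2 & 2^{k+1}\sin 3\pi/4 \end{bmatrix}
$$

Straightforward calculation of $F_s(k, j) = F(k+1, j)$ leads to

$$
F_s(k, k-1)F^{-1}(k, k-1) = \begin{bmatrix} 0 & 1 \\ -4 & 2\sqrt{2} \end{bmatrix}
$$

Since $F_c(k, k-1)$ is the first row of $\Gamma_{22}(k, k-1)$, and $F_r(k+1, k)$ is the first column of $\Gamma_{22}(k+1, k)$, the minimal realization specified by (42) is

$$
\begin{aligned}
x(k+1) &= \begin{bmatrix} 0 & 1 \\ -4 & 2\sqrt{2} \end{bmatrix} x(k) + \begin{bmatrix} 2^{k+1}\sin \pi/4 \\ 2^{k+2}\sin \pi/2 \end{bmatrix} u(k) \\
y(k) &= \begin{bmatrix} 1 & 0 \end{bmatrix} x(k)
\end{aligned}
$$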
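The realization in Example 26.18 can be spot-checked in floating point. The coefficient values below are the ones obtained by applying (42) to $G(k, j) = 2^k \sin[\pi(k-j)/4]$ with $F(k, j) = \Gamma_{22}(k, j)$, namely a constant A with rows (0, 1) and (-4, 2√2), $B(k) = [2^{k+1}\sin(\pi/4),\ 2^{k+2}\sin(\pi/2)]^T$, and C = [1, 0]; they are stated here as assumptions of the sketch, and the function names are hypothetical.

```python
import math

A = [[0.0, 1.0], [-4.0, 2.0 * math.sqrt(2.0)]]   # assumed A(k), constant in k
C = [1.0, 0.0]                                    # assumed C(k)

def B(k):
    """Assumed B(k) = F_r(k+1, k), the first column of Gamma_22(k+1, k)."""
    return [2.0 ** (k + 1) * math.sin(math.pi / 4),
            2.0 ** (k + 2) * math.sin(math.pi / 2)]

def G_target(k, j):
    return 2.0 ** k * math.sin(math.pi * (k - j) / 4)

def G_realized(k, j):
    """C Phi(k, j+1) B(j), where Phi(k, j+1) = A^(k-j-1) because A is constant."""
    v = B(j)
    for _ in range(k - j - 1):
        v = [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]
    return C[0] * v[0] + C[1] * v[1]

# Largest error relative to the scale 2^k over a grid of indices with k >= j+1.
worst = max(abs(G_realized(k, j) - G_target(k, j)) / 2.0 ** k
            for j in range(0, 4) for k in range(j + 1, j + 12))
```

The scaled discrepancy `worst` is at the level of rounding error, which supports the hand calculation without replacing it.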

Time-Invariant Case
The issue of characterizing minimal realizations is simpler for time-invariant systems, and converse results missing from the time-varying case, Theorem 26.15, fall neatly into place. We offer a summary statement in terms of the standard notations

$$
y(k) = \sum_{j=0}^{k} G(k-j)u(j) \tag{43}
$$

for time-invariant input-output behavior (with G(0) = 0), and

$$
\begin{aligned}
x(k+1) &= Ax(k) + Bu(k) \\
y(k) &= Cx(k)
\end{aligned} \tag{44}
$$

for a time-invariant realization. Completely repetitious parts of the proof are omitted.

26.19 Theorem A time-invariant realization (44) of the unit-pulse response G(k) in (43) is a minimal realization if and only if it is reachable and observable. Any two minimal realizations of G(k) are related by a (constant) change of state variables.

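Proof If (44) is a reachable and observable realization of G(k), then a direct specialization of the contradiction argument in the proof of Theorem 26.15 shows that it is a minimal realization of G(k).
Now suppose (44) is a (dimension-n) minimal realization of G(k), but that it is not reachable. Then there exists an n x 1 vector q ≠ 0 such that

$$
q^{T}\begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = 0
$$

Indeed $q^{T}A^{k}B = 0$ for all k ≥ 0 by the Cayley-Hamilton theorem. Let $P^{-1}$ be an invertible n x n matrix with bottom row $q^{T}$, and let $z(k) = P^{-1}x(k)$ to obtain the linear state equation

$$
\begin{aligned}
z(k+1) &= \hat{A}z(k) + \hat{B}u(k) \\
y(k) &= \hat{C}z(k)
\end{aligned} \tag{45}
$$

which also is a dimension-n, minimal realization of G(k). We can partition the coefficient matrices as

$$
\hat{A} = P^{-1}AP = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ \hat{A}_{21} & \hat{A}_{22} \end{bmatrix}, \quad
\hat{B} = P^{-1}B = \begin{bmatrix} \hat{B}_1 \\ 0 \end{bmatrix}, \quad
\hat{C} = CP = \begin{bmatrix} \hat{C}_1 & \hat{C}_2 \end{bmatrix}
$$

where $\hat{A}_{11}$ is (n-1) x (n-1), $\hat{B}_1$ is (n-1) x m, and $\hat{C}_1$ is p x (n-1). In terms of these partitions we know by construction of P that

$$
\hat{A}\hat{B} = P^{-1}AB = \begin{bmatrix} \hat{A}_{11}\hat{B}_1 \\ \hat{A}_{21}\hat{B}_1 \end{bmatrix} = \begin{bmatrix} \hat{A}_{11}\hat{B}_1 \\ 0 \end{bmatrix}
$$

Furthermore since the bottom row of $P^{-1}A^{k}B$ is zero for all k ≥ 0,

$$
\hat{A}^{k}\hat{B} = \begin{bmatrix} \hat{A}_{11}^{k}\hat{B}_1 \\ 0 \end{bmatrix}, \quad k \ge 0 \tag{46}
$$

Using this fact it is straightforward to produce an (n-1)-dimensional realization of G(k) since

$$
\hat{C}\hat{A}^{k}\hat{B} = \begin{bmatrix} \hat{C}_1 & \hat{C}_2 \end{bmatrix} \begin{bmatrix} \hat{A}_{11}^{k}\hat{B}_1 \\ 0 \end{bmatrix} = \hat{C}_1\hat{A}_{11}^{k}\hat{B}_1, \quad k \ge 0
$$

Of course this contradicts the original minimality assumption. A similar argument leads to a similar contradiction if we assume the minimal realization (44) is not observable. Therefore the minimal realization must be both reachable and observable.
Finally showing that all time-invariant minimal realizations of a specified unit-pulse response are related by a constant variable change is a simple specialization of the proof of Theorem 26.16.

□□□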
We next pursue a condition that implies existence of a time-invariant realization for a unit-pulse response written in the time-varying format G(k, j). The discussion of the zero-state response for a time-invariant linear state equation at the beginning of Chapter 21 immediately suggests the condition

$$
G(k, j) = G(k-j, 0) \tag{47}
$$

for all k, j with k ≥ j + 1. A change of notation helps to simplify the verification of this suggestion, and directly connects to the time-invariant context. Assuming G(k, j) satisfies (47) we replace k-j by the single index k and further abuse the overworked G-notation to write

$$
G(k) = G(k, 0), \quad k \ge 1 \tag{48}
$$

This simplifies the notation for an associated behavior matrix $\Gamma_{lq}(k, j) = \Gamma_{lq}(k-j, 0)$, defined for k ≥ j + 1, to

$$
\Gamma_{lq}(k) = \begin{bmatrix}
G(k) & G(k+1) & \cdots & G(k+q-1) \\
G(k+1) & G(k+2) & \cdots & G(k+q) \\
\vdots & \vdots & & \vdots \\
G(k+l-1) & G(k+l) & \cdots & G(k+l+q-2)
\end{bmatrix} \tag{49}
$$

Of course if a unit-pulse response G(k), k ≥ 1, is specified in the context of the input-output representation (43), then behavior matrices of the form (49) can be written directly.
Continuing in the style of Theorem 26.17, we state a sufficient condition for time-invariant realizability of a unit-pulse response and a construction for a minimal realization. The proof is quite similar, employing linear-algebraic arguments pointwise in k, but is included for completeness.

26.20 Theorem Suppose the unit-pulse response G(k, j) satisfies (47) for all k, j with k ≥ j + 1. Using the notation in (48), (49), suppose also that there exist integers l, q, n such that l, q ≤ n and

$$
\operatorname{rank} \Gamma_{lq}(k) = \operatorname{rank} \Gamma_{l+1,q+1}(k) = n, \quad k \ge 1 \tag{50}
$$

Finally suppose that there is a fixed n x n submatrix of $\Gamma_{lq}(k)$ that is invertible for all k ≥ 1. Then the unit-pulse response admits a time-invariant realization of dimension n, and this is a minimal realization.
Proof Let F(k) be an n x n submatrix of $\Gamma_{lq}(k)$ that is invertible for all k ≥ 1. Let $F_c(k)$ be the p x n matrix comprising those columns of $\Gamma_{1q}(k)$ that correspond to columns of F(k), and let $F_r(k)$ be the n x m matrix of rows of $\Gamma_{l1}(k)$ that correspond to rows of F(k). Then let

$$
C_c(k) = F_c(k)F^{-1}(k), \quad B_r(k) = F^{-1}(k)F_r(k)
$$

The $i^{th}$-row of $C_c(k)$ gives the coefficients in the linear combination of rows of F(k) that produces the $i^{th}$-row of $F_c(k)$. Similarly the $i^{th}$-column of $B_r(k)$ specifies the linear combination of columns of F(k) that produces the $i^{th}$-column of $F_r(k)$. Also the $i^{th}$-row of $C_c(k)$ gives the coefficients in the linear combination of rows of $F_r(k)$ that gives the $i^{th}$-row of $\Gamma_{11}(k) = G(k)$. That is,

$$
G(k) = C_c(k)F_r(k) = C_c(k)F(k)B_r(k), \quad k \ge 1 \tag{51}
$$

Next we show that $C_c(k)$ is a constant matrix. In $\Gamma_{l,q+1}(k)$ each entry of F(k+1) occurs m columns to the right of the corresponding entry of F(k). By the rank property (50) the linear combination of rows of F(k+1) specified by the $i^{th}$-row of $C_c(k)$ gives (uniquely by the invertibility of F(k+1)) the row of entries that occurs m columns to the right of entries of the $i^{th}$-row of $F_c(k)$. This is precisely the $i^{th}$-row of $F_c(k+1)$, which also can be uniquely expressed as the $i^{th}$-row of $C_c(k+1)$ multiplying F(k+1). Thus we conclude that $C_c(k) = C_c(k+1)$ for k ≥ 1 and write, with some abuse of notation,

$$
C_c = F_c(1)F^{-1}(1)
$$

From a similar argument it follows that $B_r(k)$ is a constant matrix, and we write

$$
B_r = F^{-1}(1)F_r(1)
$$

Then (51) becomes

$$
G(k) = C_cF(k)B_r = F_c(1)F^{-1}(1)F(k)F^{-1}(1)F_r(1), \quad k \ge 1 \tag{52}
$$

The remainder of the proof is devoted to converting this factorization into a form from which a time-invariant realization can be recognized. Consider the submatrix $F_s(k) = F(k+1)$ of $\Gamma_{l+1,q}(k)$. Of course there is an n x n matrix A(k) such that

$$
F_s(k) = A(k)F(k) \tag{53}
$$

However arguments similar to those above show that A(k) is a constant matrix, and we let $A = F_s(1)F^{-1}(1)$. Then from (53), written in the form F(k+1) = AF(k), we conclude

$$
F(k) = A^{k-1}F(1), \quad k \ge 1
$$

and thus rewrite (52) as

$$
G(k) = \left[ F_c(1)F^{-1}(1) \right] A^{k-1}F_r(1), \quad k \ge 1
$$

Now it is clear that a realization is specified by

$$
\begin{aligned}
A &= F_s(1)F^{-1}(1) \\
B &= F_r(1) \\
C &= F_c(1)F^{-1}(1)
\end{aligned} \tag{54}
$$

The final step is to show that this realization is minimal. However this follows in a now-familiar way by writing (49) in terms of the realization as

$$
\Gamma_{lq}(k) = O_l A^{k-1} R_q, \quad k \ge 1
$$

where $O_l$ and $R_q$ denote the (constant) l-step observability and q-step reachability matrices of the realization, and invoking the rank condition (50) to obtain rank $O_l$ = rank $R_q$ = n. Thus the realization is reachable and observable, hence minimal by Theorem 26.19.
26.21

Example

Consider the unit-pulse response


[2(2L)

G(k)=
with $\alpha$ a real parameter, inserted for illustration. Then $\Gamma_{11}(k) = G(k)$, and


2

a(2L_l)

4
2

For $\alpha = 0$,

$$\operatorname{rank} \Gamma_{11}(k) = \operatorname{rank} \Gamma_{22}(k) = 2, \quad k \ge 1$$

so a minimal realization of $G(k)$ has dimension two. Clearly a suitable fixed, invertible submatrix is

F(k)=F'11(k)=2k

20
1

Then
F(k) =

F(k+l)
F(k)

Fr(k) =

and the prescription in (54) gives the minimal realization ($\alpha = 0$)

.v(k+l)=

y(k)=

x(k)

(56)

For the parameter value $\alpha = 2$, it is left as an exercise to show that minimal realizations again have dimension two. If $\alpha \ne 0, 2$, then matters are more interesting. Calculations with the help of a software package yield

$$\operatorname{rank} \Gamma_{22}(k) = \operatorname{rank} \Gamma_{33}(k) = 3, \quad k \ge 1$$

The upper-left $3 \times 3$ submatrix of $\Gamma_{22}(k)$ is obviously not invertible, but selecting columns 1, 2, and 4 of the first three rows of $\Gamma_{22}(k)$ gives the invertible (for all $k \ge 1$) matrix

2

F(k)=2L

a(2k_1)

2a(2kl_1)
2


(57)

4a(2L12_1)

This specifies a minimal realization as follows. From $F_s(k) = F(k+1)$ we get

12cc 56cc

4
4
8
16 56cc 240cc

and, from F(1),


16cc

16a(a+2)

8cc2

4cc

32cc 4+6cc
4+6cc 8cc 2a
828cc

Columns 1, 2 and 4 of $\Gamma_{12}(1)$ give


12cc]

and the first three rows of $\Gamma_{21}(1)$ provide

Fr(1)

2cc

12cc

Then a minimal realization is specified by ($\alpha \ne 0, 2$)

001

0 20

BFr(l)

8 0 6
?

42cc
2

12cc

(58)

This realization can be verified by computing $CA^{k-1}B$, $k \ge 1$, and a check of reachability and observability confirms minimality.

Realization from Markov Parameters


There is an alternate formulation of the realizability and minimal-realization problems in the time-invariant case that, in contrast to Theorem 26.20, leads to a necessary and sufficient condition. We use exclusively the time-invariant notation, and first note that the unit-pulse response $G(k)$ in (43) comprises a sequence of $p \times m$ matrices with $G(0) = 0$ since state equations with $D = 0$ are considered. Simplifying the notation to $G_k = G(k)$, the unit-pulse response sequence

$$G_0 = 0, \; G_1, \; G_2, \; \ldots$$

is called in this context the Markov parameter sequence. From the zero-state solution formula, it is clear that the time-invariant state equation (44) is a realization of the unit-pulse response (Markov parameter sequence) if and only if

$$G_i = CA^{i-1}B, \quad i = 1, 2, \ldots \tag{59}$$

This shows that the realizability and minimal realization problems in the time-invariant
case can be viewed as the matrix-algebra problems of existence and computation of a
minimal-dimension matrix factorization of the form (59) for a given Markov parameter
sequence.
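As a small illustration of (59), the sketch below computes the first few Markov parameters $G_i = CA^{i-1}B$ of a state equation with exact rational arithmetic. The coefficient matrices are hypothetical values chosen only for this example, and the helper names `mat_mul` and `markov_parameters` are not from the text.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def markov_parameters(A, B, C, count):
    """Return [G_1, ..., G_count] with G_i = C A^(i-1) B."""
    params = []
    power_times_b = B            # holds A^(i-1) B, starting with i = 1
    for _ in range(count):
        params.append(mat_mul(C, power_times_b))
        power_times_b = mat_mul(A, power_times_b)
    return params


# Hypothetical single-input, single-output state equation of dimension 2.
A = [[Fraction(0), Fraction(1)], [Fraction(-1, 8), Fraction(3, 4)]]
B = [[Fraction(0)], [Fraction(1)]]
C = [[Fraction(1), Fraction(0)]]

# Each G_i is a 1 x 1 matrix here, so extract the scalar entry.
G = [g[0][0] for g in markov_parameters(A, B, C, 4)]
```

The loop carries $A^{i-1}B$ forward instead of forming powers of $A$, so each additional Markov parameter costs one matrix-vector product.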

The Markov parameter sequence also can be obtained from a given transfer function representation $G(z)$. Since $G(z)$ is the z-transform of the unit-pulse response,

$$G(z) = G_0 + G_1 z^{-1} + G_2 z^{-2} + G_3 z^{-3} + \cdots \tag{60}$$

taking account of $G_0 = 0$, and assuming the indicated limits exist, we let the complex variable $z$ become large (through real, positive values) to obtain

$$G_1 = \lim_{z \to \infty} zG(z)$$

$$G_2 = \lim_{z \to \infty} z\,[\,zG(z) - G_1\,]$$

$$G_3 = \lim_{z \to \infty} z\,[\,z^2G(z) - zG_1 - G_2\,]$$

and so on.

Alternatively if $G(z)$ is a matrix of strictly-proper rational functions, as by Theorem 26.8 it must be if it is realizable, then this limit calculation can be implemented by polynomial division. For each entry of $G(z)$, divide the denominator polynomial into the numerator polynomial to produce a power series in $z^{-1}$. Arranging these power series in matrix-coefficient form, the Markov parameter sequence appears as the sequence of $p \times m$ coefficients in (60).
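The polynomial-division route can be sketched for a single scalar entry as follows. This is a minimal illustration, assuming a strictly-proper rational function given by coefficient lists in descending powers of $z$; the function name and the sample transfer function $z/(z^2 - 3z + 2)$ are hypothetical choices, not taken from the text.

```python
from fractions import Fraction


def markov_from_transfer_function(num, den, count):
    """Coefficients g_1..g_count in num(z)/den(z) = sum_i g_i z^(-i).

    num and den list polynomial coefficients in descending powers of z,
    and the ratio must be strictly proper (deg num < deg den).
    """
    n = len(den) - 1
    if len(num) > n:
        raise ValueError("transfer function must be strictly proper")
    # b[i-1] is the numerator coefficient of z^(n-i), padded with leading zeros.
    b = [Fraction(0)] * (n - len(num)) + [Fraction(c) for c in num]
    d = [Fraction(c) for c in den]
    g = []
    for i in range(1, count + 1):
        # Matching the coefficient of z^(n-i) in num(z) = den(z) * series:
        # b_i = d_0 g_i + d_1 g_(i-1) + ... + d_n g_(i-n)
        acc = b[i - 1] if i <= n else Fraction(0)
        for t in range(1, min(i - 1, n) + 1):
            acc -= d[t] * g[i - t - 1]
        g.append(acc / d[0])
    return g


# Hypothetical example: G(z) = z / (z^2 - 3z + 2), whose Markov
# parameters are G_i = 2^i - 1.
sequence = markov_from_transfer_function([1, 0], [1, -3, 2], 5)
```

The recursion is ordinary long division written coefficient by coefficient, so it yields as many Markov parameters as requested without computing any limits.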

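The time-invariant realization problem for a given Markov parameter sequence leads to consideration of the set of what are often called in this context block Hankel matrices,

$$\Gamma_{lq} = \begin{bmatrix} G_1 & G_2 & \cdots & G_q \\ G_2 & G_3 & \cdots & G_{q+1} \\ \vdots & \vdots & & \vdots \\ G_l & G_{l+1} & \cdots & G_{l+q-1} \end{bmatrix}, \quad l, q = 1, 2, \ldots \tag{61}$$

Indeed the form of (61) is not surprising once it is recognized that $\Gamma_{lq}$ is $\Gamma_{lq}(1)$ from (49).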

Using (59) it is straightforward to verify that the $q$-step reachability and $l$-step observability matrices

$$R_q = \begin{bmatrix} B & AB & \cdots & A^{q-1}B \end{bmatrix}, \quad O_l = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{l-1} \end{bmatrix}$$

for a realization of a Markov parameter sequence are related to the block Hankel matrices by

$$\Gamma_{lq} = O_l R_q, \quad l, q = 1, 2, \ldots \tag{62}$$
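The factorization (62) is easy to check numerically on a small case. The sketch below does so for a hypothetical single-input, single-output realization whose Markov parameters are $G_i = 2^i - 1$; the matrices and helper names are assumptions made for illustration only.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def reachability_matrix(A, B, q):
    """[B  AB  ...  A^(q-1)B] for a single-input equation (B is n x 1)."""
    cols, v = [], B
    for _ in range(q):
        cols.append([row[0] for row in v])
        v = mat_mul(A, v)
    return [[col[i] for col in cols] for i in range(len(A))]


def observability_matrix(A, C, l):
    """Rows C, CA, ..., CA^(l-1) for a single-output equation (C is 1 x n)."""
    rows, w = [], C
    for _ in range(l):
        rows.append(w[0][:])
        w = mat_mul(w, A)
    return rows


# Hypothetical realization with Markov parameters G_i = 2^i - 1.
A = [[0, 1], [-2, 3]]
B = [[1], [3]]
C = [[1, 0]]

product = mat_mul(observability_matrix(A, C, 3), reachability_matrix(A, B, 3))
# Scalar Hankel matrix with (i, j) entry G_(i+j-1), indices starting at 1.
hankel_33 = [[2 ** (i + j + 1) - 1 for j in range(3)] for i in range(3)]
```

Here `product` and `hankel_33` agree entry by entry, which is the content of (62) for $l = q = 3$.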

The pattern of entries in (61), when $q$ is permitted to increase indefinitely, captures essential algebraic features of the realization problem. This leads to a realizability criterion for Markov parameter sequences and a method for computing minimal realizations.

26.22 Theorem The unit-pulse response $G(k)$ in (43) admits a time-invariant realization (44) if and only if there exist positive integers $l$, $q$, $n$ with $l, q \le n$ such that

$$\operatorname{rank} \Gamma_{lq} = \operatorname{rank} \Gamma_{l+1,q+j} = n, \quad j = 1, 2, \ldots \tag{63}$$

If this rank condition holds, then the dimension of a minimal realization of $G(k)$ is $n$.
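For a scalar Markov parameter sequence the ranks in (63) can be explored with exact arithmetic, as in the sketch below. It uses a hypothetical sequence $G_i = 2^i - 1$ and can only examine finitely many Hankel matrices, so it illustrates the rank pattern rather than verifying (63) for all $j$.

```python
from fractions import Fraction


def hankel(markov, l, q):
    """Scalar Hankel matrix with (i, j) entry G_(i+j-1); markov[0] is G_1."""
    return [[Fraction(markov[i + j]) for j in range(q)] for i in range(l)]


def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    r = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r


# Hypothetical data: the first twelve terms of G_i = 2^i - 1.
markov = [2 ** i - 1 for i in range(1, 13)]
ranks = [rank(hankel(markov, 2, 2)),
         rank(hankel(markov, 3, 3)),
         rank(hankel(markov, 3, 6))]
```

Exact fractions avoid the tolerance questions that a floating-point rank computation would raise for nearly dependent rows.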

Proof Assuming $l$, $q$, and $n$ are such that the rank condition (63) holds, we will construct a minimal realization for $G(k)$ of dimension $n$ by a procedure roughly similar to that in preceding proofs.

Let $H_q$ denote the $n \times qm$ submatrix formed from the first $n$ linearly independent rows of $\Gamma_{lq}$. Also let $H_q^s$ be another $n \times qm$ submatrix defined as follows. The $i$-th row of $H_q^s$ is the row of $\Gamma_{l+1,q}$ that is $p$ rows below the row of $\Gamma_{l+1,q}$ that is the $i$-th row of $H_q$. A realization of $G(k)$ can be constructed in terms of related submatrices. Let

(a) $F$ be the invertible $n \times n$ matrix comprising the first $n$ linearly independent columns of $H_q$,

(b) $F_s$ be the $n \times n$ matrix occupying the same column positions in $H_q^s$ as does $F$ in $H_q$,

(c) $F_c$ be the $p \times n$ matrix occupying the same column positions in $\Gamma_{1q}$ as does $F$ in $H_q$,

(d) $F_r$ be the $n \times m$ matrix comprising the first $m$ columns of $H_q$.

Then consider the coefficient matrices defined by

$$A = F_sF^{-1}, \quad B = F_r, \quad C = F_cF^{-1} \tag{64}$$

Since $F_s = AF$, entries in the $i$-th row of $A$ specify the linear combination of rows of $F$ that results in the $i$-th row of $F_s$. Therefore the $i$-th row of $A$ also specifies the linear combination of rows of $H_q$ yielding the $i$-th row of $H_q^s$, that is, $H_q^s = AH_q$.

In fact a more general relationship holds. Let $H_j$ be the extension or restriction of $H_q$ in $\Gamma_{lj}$, $j = 1, 2, \ldots$. That is, each row of $H_q$, which is a row of $\Gamma_{lq}$, either is truncated (if $j < q$) or extended (if $j > q$) to match the corresponding row of $\Gamma_{lj}$. Similarly define $H_j^s$ as the row extension or restriction of $H_q^s$ in $\Gamma_{l+1,j}$. Then (63) implies

$$H_j^s = AH_j, \quad j = 1, 2, \ldots \tag{65}$$

Also

$$H_j = \begin{bmatrix} F_r & H_{j-1}^s \end{bmatrix}, \quad j = 2, 3, \ldots \tag{66}$$

For example $H_1$ and $H_2$ are formed by the rows in

$$\begin{bmatrix} G_1 \\ G_2 \\ \vdots \\ G_l \end{bmatrix}, \quad \begin{bmatrix} G_1 & G_2 \\ G_2 & G_3 \\ \vdots & \vdots \\ G_l & G_{l+1} \end{bmatrix}$$

respectively, that correspond to the first $n$ linearly independent rows in $\Gamma_{lq}$. But then $H_1^s$ can be described as the rows of $H_2$ with the first $m$ entries deleted, and from the definition of $F_r$ it is immediate that $H_2 = [\,F_r \;\; H_1^s\,]$.

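Using (65) and (66) gives

$$H_j = \begin{bmatrix} F_r & AH_{j-1} \end{bmatrix} = \begin{bmatrix} F_r & AF_r & AH_{j-2}^s \end{bmatrix}, \quad j = 3, 4, \ldots$$

and, continuing,

$$H_j = \begin{bmatrix} F_r & AF_r & \cdots & A^{j-1}F_r \end{bmatrix} = \begin{bmatrix} B & AB & \cdots & A^{j-1}B \end{bmatrix}, \quad j = 1, 2, \ldots \tag{67}$$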

From (64) the $i$-th row of $C$ specifies the linear combination of rows of $F$ that gives the $i$-th row of $F_c$. But then the $i$-th row of $C$ specifies the linear combination of rows of $H_j$ that gives the $i$-th row of $\Gamma_{1j}$. Since every row of $\Gamma_{lj}$ can be written as a linear combination of rows of $H_j$, it follows that

$$\Gamma_{1j} = CH_j = \begin{bmatrix} CB & CAB & \cdots & CA^{j-1}B \end{bmatrix} = \begin{bmatrix} G_1 & G_2 & \cdots & G_j \end{bmatrix}, \quad j = 1, 2, \ldots$$

Therefore

$$G_k = CA^{k-1}B, \quad k = 1, 2, \ldots \tag{68}$$

and this shows that (64) specifies an $n$-dimensional realization for $G(k)$. Furthermore it is clear from a simple contradiction argument involving (62) and the rank condition (63) that this realization is minimal.
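The construction (64) can be sketched in code for the scalar case $p = m = 1$. The version below makes a simplifying assumption that is not part of the proof: the leading $n \times n$ Hankel block is taken to be invertible, so that $F$ is simply that block and no search for independent rows and columns is needed. The sample sequence $G_i = 2^i - 1$ and all function names are hypothetical.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse(M):
    """Inverse by Gauss-Jordan elimination; raises StopIteration if singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


def realize(markov, n):
    """Scalar-case (64), assuming the leading n x n Hankel block is invertible."""
    G = [Fraction(g) for g in markov]                           # G[0] is G_1
    F = [[G[i + j] for j in range(n)] for i in range(n)]
    Fs = [[G[i + j + 1] for j in range(n)] for i in range(n)]   # one row lower
    Fc = [[G[j] for j in range(n)]]                             # first Hankel row
    Fr = [[G[i]] for i in range(n)]                             # first column
    Finv = inverse(F)
    return mat_mul(Fs, Finv), Fr, mat_mul(Fc, Finv)


markov = [2 ** i - 1 for i in range(1, 9)]    # hypothetical data
A, B, C = realize(markov, 2)

# Regenerate the Markov parameters from the computed realization.
regenerated, v = [], B
for _ in markov:
    regenerated.append(mat_mul(C, v)[0][0])
    v = mat_mul(A, v)
```

Comparing `regenerated` with `markov` is the check suggested by (68); it confirms the realization only over the data supplied.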

To prove the necessity portion of the theorem, suppose that $G(k)$ has a time-invariant realization. Then from (62) and the Cayley-Hamilton theorem there must exist integers $l$, $q$, $n$, with $l, q \le n$, such that the rank condition (63) holds.

It should be emphasized that the rank test (63) involves an infinite sequence of
behavior matrices and thus the complete Markov sequence. Truncation to finite data is
problematic in the sense that we can never know when there is sufficient data to compute
a realization. This can be illustrated with a simple, but perhaps exaggerated, example.
26.23 Example

The Markov parameter sequence for the transfer function

G()

1/2 + z'(:2)

1/2)

begins innocently enough as

$$G_0 = 0; \quad G_i = 1/2^i, \quad i = 1, 2, \ldots, 99$$

Addressing Theorem 26.22 leads to Hankel matrices where each column appears to be a power of 1/2 times the first column. Of course this is based on Hankel matrices of the form (61) with $l + q \le 100$, and just when it appears safe to conclude from (63) that $n = 1$, the rank begins increasing as even larger Hankel matrices are contemplated. In fact the observations in Example 26.9 lead to the conclusion that the dimension of minimal realizations of $G(z)$ is $n = 101$.

Additional Examples
The appearance of nonminimal state equations in particular settings can reflect a
disconcerting artifact of the modeling process, or an underlying reality. We indicate the
possibilities in two specific situations.
26.24 Example A particular case of the cohort population model in Example 22.16, as
mentioned in Example 25.14, leads to the linear state equation

01/40
x(k+1) =

1/4

x(k) +

1/2 1/4 1/4

y(k) =

[1

u(k)

1 1x(k)

(69)

This is not a minimal realization since it is not observable. Focusing on input-output behavior, a reduction in dimension is difficult to 'see' from the coefficient matrices in the state equation, but computing $y(k+1)$ leads to the equation

$$y(k+1) = (1/2)\,y(k) + u(k) \tag{70}$$

It is left as an exercise to show that both (69) and (70) have the same transfer function,

$$G(z) = \frac{1}{z - 1/2}$$

Needless to say the state equation in (69) is an inflated representation of the effect of the immigration input on the total-population output.

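26.25 Example When describing a sampled-data system by a discrete-time linear state equation, minimality can be lost in a dramatic fashion. From Example 25.15 consider the continuous-time, minimal state equation

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t)$$

$$y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(t) \tag{71}$$

If $u(t)$ is produced by a period-$T$ zero-order hold, then the discrete-time description is

$$x[(k+1)T] = \begin{bmatrix} \cos T & \sin T \\ -\sin T & \cos T \end{bmatrix} x(kT) + \begin{bmatrix} 1 - \cos T \\ \sin T \end{bmatrix} u(kT)$$

$$y(kT) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(kT)$$

For the sampling period $T = \pi$, the state equation becomes

$$x[(k+1)T] = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} x(kT) + \begin{bmatrix} 2 \\ 0 \end{bmatrix} u(kT)$$

$$y(kT) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(kT) \tag{72}$$

This state equation is neither reachable nor observable, and its transfer function is

$$G(z) = \frac{2}{z + 1}$$

Worse, suppose $T = 2\pi$. In this case the discrete-time linear state equation has transfer function $G(z) = 0$, which implies that the zero-state response of (71) to any period-$T$ sample-and-hold input signal is zero at every sampling instant. Matters are exactly so: everything interesting is happening between the sampling instants!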
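The loss of minimality under sampling can be reproduced numerically. The sketch below assumes, for illustration, that the continuous-time system is the harmonic oscillator $\dot{x}_1 = x_2$, $\dot{x}_2 = -x_1 + u$, $y = x_1$ under a zero-order hold; that choice, the function names, and the tolerance are assumptions of this sketch. For a two-dimensional single-input, single-output equation, reachability and observability reduce to nonzero $2 \times 2$ determinants.

```python
import math


def sampled_oscillator(T):
    """Zero-order-hold discretization of x1' = x2, x2' = -x1 + u, y = x1."""
    c, s = math.cos(T), math.sin(T)
    A = [[c, s], [-s, c]]
    B = [1.0 - c, s]
    C = [1.0, 0.0]
    return A, B, C


def det2(rows):
    """Determinant of a 2 x 2 matrix given as two rows."""
    return rows[0][0] * rows[1][1] - rows[0][1] * rows[1][0]


def is_minimal(A, B, C, tol=1e-9):
    """True when [B AB] and [C; CA] both have nonzero determinant."""
    AB = [A[0][0] * B[0] + A[0][1] * B[1], A[1][0] * B[0] + A[1][1] * B[1]]
    CA = [C[0] * A[0][0] + C[1] * A[1][0], C[0] * A[0][1] + C[1] * A[1][1]]
    reach = det2([[B[0], AB[0]], [B[1], AB[1]]])
    obs = det2([C, CA])
    return abs(reach) > tol and abs(obs) > tol


results = {T: is_minimal(*sampled_oscillator(T))
           for T in (1.0, math.pi, 2.0 * math.pi)}
```

A tolerance is needed because `math.sin(math.pi)` is a tiny nonzero floating-point number rather than exactly zero.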

EXERCISES
Exercise 26.1 Show that the scalar linear state equations

$$x(k+1) = x(k) + \delta(k-1)u(k)$$

$$y(k) = \delta(k-2)x(k)$$

and

:(k+l) =:(k) + (kl)u(k)


=
both are minimal realizations of the same unit-pulse response. Are they related by a change of state variables?

Exercise 26.2 Prove or find a counterexample to the following claim. If a discrete-time, time-varying linear state equation of dimension $n$ is $l$-step reachable for some positive integer $l$, then it is $n$-step reachable.

Exercise 26.3 Suppose the linear state equations

x(k+I)=Lv(k) +B(k)u(k)
y(k) = C(k)x(k)
and

:(k+l) =Iz(k) + F(k)u(k)


y(k) =H(k):(k)
both are $l$-step reachable and observable realizations of the unit-pulse response $G(k, j)$. Show that there exists a constant, invertible matrix $P$ such that $z(k) = P^{-1}x(k)$, and provide an expression for $P$.

Exercise 26.4 If the time-invariant, single-input, single-output, $n$-dimensional linear state equation

$$x(k+1) = Ax(k) + bu(k)$$

$$y(k) = cx(k) + du(k)$$

is a realization of the transfer function $G(z)$, provide an $(n+1)$-dimensional realization of

H(:) =

G(:) I

that can be written by inspection.

Exercise 26.5 Suppose the time-invariant, single-input, single-output linear state equations

$$x_a(k+1) = Ax_a(k) + bu(k)$$

$$y(k) = cx_a(k)$$

and

$$x_b(k+1) = Fx_b(k) + gu(k)$$

$$y(k) = hx_b(k)$$

are both minimal. Does this imply that the linear state equation

y(k)=
is

[c

h]x(k)

minimal? Repeat the question for the state equation

[0Ju(k)

x(k+1)=
y(k)
[C

O]x(k)

Exercise 26.6 Use Theorem 26.8 and properties of the z-transform to describe a necessary and sufficient condition for realizability of a given (time-invariant) unit-pulse response $G(k)$.


Exercise 26.7 Show that a transfer function $G(z)$ is realizable by a time-invariant linear state equation (with $D$ possibly nonzero)

$$x(k+1) = Ax(k) + Bu(k)$$

$$y(k) = Cx(k) + Du(k)$$

if and only if each entry of $G(z)$ is a proper rational function (numerator polynomial degree no greater than denominator polynomial degree).

Exercise 26.8 Prove the following generalization of an observation in Example 26.9. The single-input, single-output, time-invariant linear state equation

$$x(k+1) = Ax(k) + bu(k)$$

$$y(k) = cx(k)$$

is minimal (as a realization of its transfer function) if and only if the polynomials $\det(zI - A)$ and $c\,\operatorname{adj}(zI - A)\,b$ have no roots in common.

Exercise 26.9 Given any $n \times n$ matrix sequence $A(k)$ that is invertible at each $k$, do there exist $n \times 1$ and $1 \times n$ vector sequences $b(k)$ and $c(k)$ such that

$$x(k+1) = A(k)x(k) + b(k)u(k)$$

$$y(k) = c(k)x(k)$$

is a minimal realization? Repeat the question for constant $A$, $b$, and $c$.


Exercise 26.10 Compute a minimal realization of the Fibonacci sequence

0, 1, 1, 2, 3, 5, 8, 13, ...

using Theorem 26.22. (This can be compared with Exercise 21.8.)


Exercise 26.11 Compute a minimal realization corresponding to the Markov parameter sequence

0, 1, 1, 1, 1, 1, 1, 1, ...

Then compute a minimal realization corresponding to the 'truncated' sequence

0, 1, 1, 1, 0, 0, 0, 0, ...
Exercise 26.12 Suppose the first 5 values of the Markov parameter sequence $G_0, G_1, G_2, \ldots$ are known to be 0, 0, 1, 1/2, 1/2, but the rest are a mystery. Show that a minimal realization of the transfer function

:2(:_I)
fits the known data. Compute a dimension-2 state equation that also fits the known data. (This
shows that issues of minimality are more subtle when only a portion of the Markov parameter
sequence is known.)

NOTES

Note 26.1 The summation representation (2) for input-output behavior can be motivated more-or-less directly from properties of linearity and causality imposed on a general notion of 'discrete-time system.' (This is more difficult to do in the case of integral representations for a linear, causal, continuous-time system, as mentioned in Note 10.1.) Considering the single-input case for simplicity, the essential step is to define $G(k, j)$, $k \ge j$, as the response of the causal 'system' to the unit-pulse input $u(k) = \delta(k - j)$, for each value of $j$. Then writing an arbitrary input signal defined for $k = k_0, k_0+1, \ldots$ as a linear combination of unit pulses,

$$u(k) = u(k_0)\delta(k - k_0) + u(k_0+1)\delta(k - k_0 - 1) + \cdots$$

linearity implies that the response to this input is

$$y(k) = G(k, k_0)u(k_0) + G(k, k_0+1)u(k_0+1) + \cdots + G(k, k)u(k) = \sum_{j=k_0}^{k} G(k, j)u(j), \quad k \ge k_0$$

Going further, imposing the notion of time invariance easily gives

$$y(k) = \sum_{j=k_0}^{k} G(k-j, 0)u(j), \quad k \ge k_0$$

Additional, technical considerations do arise, however. For example if we want to discuss the response to inputs beginning at $-\infty$, that is, let $k_0 \to -\infty$, then convergence of the sum must be considered. The details of such lofty (some might say airy) issues of formulation and representation are respectfully avoided here. For a brief yet authoritative account, see Chapter 2 of

E.D. Sontag, Mathematical Control Theory, Springer-Verlag, New York, 1990

Further aspects, and associated pathologies, are discussed in

A.P. Kishore, J.B. Pearson, "Kernel representations and properties of discrete-time input-output systems," Linear Algebra and Its Applications, Vol. 205-206, pp. 893-908, 1994
Note 26.2 Early sources for discrete-time realization theory are the papers

D.S. Evans, "Finite-dimensional realizations of discrete-time weighting patterns," SIAM Journal on Applied Mathematics, Vol. 22, No. 1, pp. 45-67, 1972

L. Weiss, "Controllability, realization, and stability of discrete-time systems," SIAM Journal on Control and Optimization, Vol. 10, No. 2, pp. 230-251, 1972

In particular the latter paper presents a construction for a minimal realization of an assumed-realizable unit-pulse response based on $\Gamma_{lq}(k, k-1)$. Further developments of the basic results using more sophisticated algebraic tools are discussed in

J.J. Ferrer, "Realization of Linear Discrete Time-Varying Systems," PhD Dissertation, University
of Florida, 1984.
Note 26.3 The difficulty inherent in using the basic reachability and observability concepts to characterize the structure of discrete-time, time-varying, linear state equations is even more severe than Example 26.10 indicates. Consider a scalar case, with $c(k) = 1$ for all $k$, and

$$a(k) = b(k) = \begin{cases} 1, & k \text{ odd} \\ 0, & k \text{ even} \end{cases}$$

Under any semi-reasonable definition of reachability, nonzero states cannot be reached at time $k_f$ for any odd $k_f$, but can be reached for any even $k_f$. This suggests a bold reformulation where the dimension of a realization is permitted to change at each time step. Using highly-technical operator theoretic formulations, such theories are discussed in the article

I. Gohberg, M.A. Kaashoek, L. Lerer, in Time-Variant Systems and Interpolation, I. Gohberg, editor, Birkhauser, Basel, pp. 261-295, 1992

and in Chapter 3 of the published PhD Thesis

A.J. Van der Veen, Time-Varying System Theory and Computational Modeling, Technical University of Delft, The Netherlands, 1993 (ISBN 90-53226-005-6)

Note 26.4 The realization problem also can be addressed when restrictions are placed on the
class of admissible state equations. For a realization theory that applies to a class of linear state
equations with nonnegative coefficient entries, see

H. Maeda, S. Kodama, "Positive realizations of difference equations," IEEE Transactions on Circuits and Systems, Vol. 28, No. 1, pp. 39-47, 1981

Note 26.5 The canonical structure theorem discussed in Note 10.2 is more difficult to formulate
in the time-varying, discrete-time case because the dimensions of various subspaces, such as the
subspace of reachable states, can change with time. This is addressed in

S. Bittanti, P. Bolzern, "On the structure theory of discrete-time linear systems," International Journal of Systems Science, Vol. 17, pp. 33-47, 1986

For the K-periodic case it is shown that the structure theorem can be based on fixed-dimension
subspaces related to the concepts of controllability and reconstructibility. See also

O.M. Grasselli, "A canonical decomposition of linear periodic discrete-time systems," International Journal of Control, Vol. 40, No. 1, pp. 201-214, 1984
Note 26.6 The problem of system identification deals with ascertaining mathematical models of systems based on observed data, usually in the context of imperfect data. Ignoring the imperfect-data issue, at this high level of discourse the realization problem is hopelessly intertwined with the identification problem. A neat separation is effected by defining system identification as the problem of ascertaining a mathematical description of input-output behavior from observations of input-output data, and leaving the realization problem as we have considered it. This unfortunately ignores legitimate identification problems such as determination, from observed input-output data, of unknown coefficients in a state-equation representation of a system. Of course the pragmatic remain unperturbed, viewing such problem definition and classification issues as mere philosophy. In any case a basic introduction to system identification is provided in

L. Ljung, System Identification: Theory for the User, Prentice Hall, Englewood Cliffs, New Jersey, 1987

27
DISCRETE TIME
INPUT-OUTPUT STABILITY

In this chapter we consider stability properties appropriate to the input-output behavior (zero-state response) of the linear state equation

$$x(k+1) = A(k)x(k) + B(k)u(k)$$

$$y(k) = C(k)x(k) \tag{1}$$

That is, the initial state is fixed at zero and attention is focused on boundedness of the response to bounded inputs. The $D(k)u(k)$ term is absent in (1) because a bounded $D(k)$ does not affect the treatment, while an unbounded $D(k)$ provides an unbounded response to an appropriate constant input. Of course the input-output behavior of (1) is specified by the unit-pulse response

$$G(k, j) = C(k)\Phi(k, j+1)B(j), \quad k \ge j+1$$

and stability results are characterized in terms of boundedness properties of $\|G(k, j)\|$. For the time-invariant case, input-output stability also can be characterized conveniently in terms of the transfer function of the linear state equation.

Uniform Bounded-Input Bounded-Output Stability

Bounded-input, bounded-output stability is most simply discussed in terms of the largest value (over time) of the norm of the input signal, $\|u(k)\|$, in comparison to the largest value of the corresponding response norm, $\|y(k)\|$. We use the standard notion of supremum to make this precise. For example

$$\nu = \sup_{k \ge k_0} \|u(k)\|$$

is defined as the smallest constant such that $\|u(k)\| \le \nu$ for $k \ge k_0$. If no such bound exists, we write

$$\sup_{k \ge k_0} \|u(k)\| = \infty$$

The basic stability notion is that the input-output behavior should exhibit finite 'gain' in terms of the input and output suprema.

27.1 Definition The linear state equation (1) is called uniformly bounded-input, bounded-output stable if there exists a finite constant $\eta$ such that for any $k_0$ and any input signal $u(k)$ the corresponding zero-state response satisfies

$$\sup_{k \ge k_0} \|y(k)\| \le \eta \sup_{k \ge k_0} \|u(k)\| \tag{3}$$

The adjective 'uniform' has two meanings in this definition. It emphasizes the fact that the same $\eta$ can be used for all values of $k_0$ and for all input signals. (An equivalent definition is explored in Exercise 27.1; see also Note 27.1.)

27.2 Theorem The linear state equation (1) is uniformly bounded-input, bounded-output stable if and only if there exists a finite constant $\rho$ such that the unit-pulse response satisfies

$$\sum_{i=j}^{k-1} \|G(k, i)\| \le \rho \tag{4}$$

for all $k$, $j$ with $k \ge j+1$.

Proof Assume first that such a $\rho$ exists. Then for any $k_0$ and any input signal $u(k)$ the corresponding zero-state response of (1) satisfies

$$\|y(k)\| = \Big\| \sum_{j=k_0}^{k-1} G(k, j)u(j) \Big\| \le \sum_{j=k_0}^{k-1} \|G(k, j)\|\,\|u(j)\|, \quad k \ge k_0+1$$

(Of course $y(k_0) = 0$ in accordance with the assumption that $D(k)$ is zero.) Replacing $\|u(j)\|$ by its supremum over $j \ge k_0$, and using (4),

$$\|y(k)\| \le \sum_{j=k_0}^{k-1} \|G(k, j)\| \sup_{k \ge k_0} \|u(k)\| \le \rho \sup_{k \ge k_0} \|u(k)\|, \quad k \ge k_0+1$$

Therefore, taking the supremum of the left side over $k \ge k_0$, (3) holds with $\eta = \rho$, and the state equation is uniformly bounded-input, bounded-output stable.

Suppose now that (1) is uniformly bounded-input, bounded-output stable. Then there exists a constant $\eta$ so that, in particular, the zero-state response for any $k_0$ and any input signal such that

$$\sup_{k \ge k_0} \|u(k)\| \le 1$$

satisfies

$$\sup_{k \ge k_0} \|y(k)\| \le \eta$$

To set up a contradiction argument, suppose no finite $\rho$ exists that satisfies (4). In other words for any constant $\rho$ there exist $j_\rho$ and $k_\rho \ge j_\rho + 1$ such that

$$\sum_{i=j_\rho}^{k_\rho - 1} \|G(k_\rho, i)\| > \rho$$

application of Exercise 1.l9 implies that there exist


indices r, q such that the r,q-entry of the unit-pulse response satisfies

Taking p =

Jq+l, and

IGrq(kipi)I >11
I =j9

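With $k_0 = j_\eta$ consider an $m \times 1$ input signal $u(k)$ defined for $k \ge k_0$ as follows. Set $u(k) = 0$ for $k \ge k_\eta$, and for $k = k_0, \ldots, k_\eta - 1$ set every component of $u(k)$ to zero except for the $q$-th component specified by

$$u_q(k) = \begin{cases} 1, & G_{rq}(k_\eta, k) > 0 \\ 0, & G_{rq}(k_\eta, k) = 0 \\ -1, & G_{rq}(k_\eta, k) < 0 \end{cases}$$

This input signal satisfies $\|u(k)\| \le 1$, for every $k \ge k_0$, but the $r$-th component of the corresponding zero-state response satisfies, by (5),

$$y_r(k_\eta) = \sum_{j=k_0}^{k_\eta - 1} G_{rq}(k_\eta, j)u_q(j) = \sum_{j=k_0}^{k_\eta - 1} |G_{rq}(k_\eta, j)| > \eta$$

Since $\|y(k_\eta)\| \ge |y_r(k_\eta)|$, a contradiction is obtained that completes the proof.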

□□□
The condition on (4) in Theorem 27.2 can be restated as existence of a finite constant $\rho$ such that, for all $k$,

$$\sum_{i=-\infty}^{k-1} \|G(k, i)\| \le \rho \tag{6}$$

In the case of a time-invariant linear state equation, the unit-pulse response is given by

$$G(k, j) = CA^{k-j-1}B, \quad k \ge j+1$$

Succumbing to a customary notational infelicity, we rewrite $G(k, j)$ as $G(k - j)$. Then a change of summation index in (6) shows that a necessary and sufficient condition for uniform bounded-input, bounded-output stability is finiteness of the sum

$$\sum_{k=1}^{\infty} \|G(k)\| \tag{7}$$

Relation to Uniform Exponential Stability


We now turn to establishing connections between uniform bounded-input, bounded-output stability, a property of the zero-state response, and uniform exponential stability, a property of the zero-input response. The properties are not equivalent, as a simple example indicates.

27.3 Example

The time-invariant linear state equation

x(k+l)=

1/2

y(k)= [1

x(k)+

u(k)

O]x(k)

is not exponentially stable, since the eigenvalues of $A$ are 1/2, 2. However the unit-pulse response is given by $G(k) = (1/2)^{k-1}$, $k \ge 1$, and therefore the state equation is uniformly bounded-input, bounded-output stable since (7) is finite.
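The gap between the two stability notions can be seen numerically. The sketch below uses one hypothetical set of matrices consistent with the stated eigenvalues and unit-pulse response, namely $A = \mathrm{diag}(1/2, 2)$, $b = [1 \;\; 0]^T$, $c = [1 \;\; 0]$; these are an assumption of the sketch and are not claimed to be the matrices of the example.

```python
from fractions import Fraction

# Assumed instance: eigenvalues 1/2 and 2, with the unstable mode
# neither excited by the input nor visible at the output.
A = [[Fraction(1, 2), Fraction(0)], [Fraction(0), Fraction(2)]]
B = [Fraction(1), Fraction(0)]
C = [Fraction(1), Fraction(0)]


def step(A, v):
    """One application of A to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def pulse_response_sum(A, B, C, terms):
    """Partial sum of |G(k)| = |C A^(k-1) B| for k = 1..terms."""
    total, v = Fraction(0), B[:]          # v holds A^(k-1) B
    for _ in range(terms):
        total += abs(sum(c * x for c, x in zip(C, v)))
        v = step(A, v)
    return total


def zero_input_state(A, x0, steps):
    """State after the given number of steps with zero input."""
    x = x0[:]
    for _ in range(steps):
        x = step(A, x)
    return x


partial = pulse_response_sum(A, B, C, 20)
hidden = zero_input_state(A, [Fraction(0), Fraction(1)], 10)
```

The partial sums of (7) stay below 2, while a zero-input response started along the second state component doubles at every step.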

□□□
In the time-invariant setting of this example, a description of the key difficulty is that scalar exponentials appearing in $A^{k-1}$ can be missing from $G(k)$. Reachability and observability play important roles in addressing this issue, since we are considering the relation between input-output (zero-state) and internal (zero-input) stability concepts.

In one direction the connection between input-output and internal stability is easy to establish, and a division of labor proves convenient.

27.4 Lemma Suppose the linear state equation (1) is uniformly exponentially stable, and there exist finite constants $\beta$ and $\mu$ such that

$$\|B(k)\| \le \beta, \quad \|C(k)\| \le \mu \tag{8}$$

for all $k$. Then the state equation also is uniformly bounded-input, bounded-output stable.
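Proof Using the transition matrix bound $\|\Phi(k, j)\| \le \gamma\lambda^{k-j}$ implied by uniform exponential stability,

$$\sum_{i=j}^{k-1} \|G(k, i)\| \le \sum_{i=j}^{k-1} \|C(k)\|\,\|\Phi(k, i+1)\|\,\|B(i)\| \le \mu\beta\gamma \sum_{i=j}^{k-1} \lambda^{k-i-1} = \mu\beta\gamma \sum_{q=0}^{k-j-1} \lambda^{q}$$

for any $k$, $j$ with $k \ge j+1$. Since $0 \le \lambda < 1$, we let $(k - j) \to \infty$ on the right side to obtain the bound

$$\sum_{i=j}^{k-1} \|G(k, i)\| \le \frac{\mu\beta\gamma}{1 - \lambda}, \quad k \ge j+1$$

Therefore the state equation is uniformly bounded-input, bounded-output stable by Theorem 27.2.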

□□□
The coefficient bounds in (8) clearly are needed to obtain the implication in
Lemma 27.4. However the simple proof might suggest that uniform exponential stability
is an excessively strong condition for uniform bounded-input, bounded-output stability.
To dispel this notion we elaborate on Example 22.12.

27.5 Example The scalar linear state equation

$$x(k+1) = a(k)x(k) + u(k), \quad x(k_0) = x_0$$

$$y(k) = x(k)$$

with

$$a(k) = \begin{cases} 1, & k \le 0 \\ k/(k+1), & k \ge 1 \end{cases}$$

is not uniformly exponentially stable, as shown by calculation of the transition scalar in Example 22.12. However the state equation is uniformly stable, and the zero-input response goes to zero for all initial states. Despite these worthy properties, for $k_0 = 1$ and the bounded input $u(k) = 1$, $k \ge 1$, the zero-state response is unbounded:

$$y(k) = \sum_{j=1}^{k-1} \Phi(k, j+1) \cdot 1 = \frac{1}{k} \sum_{j=1}^{k-1} (j+1) = \frac{1}{k} \left[ \frac{k(k+1)}{2} - 1 \right], \quad k \ge 2$$

□□□
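The unbounded zero-state response in Example 27.5 can be confirmed by direct simulation. The sketch below iterates $x(k+1) = a(k)x(k) + u(k)$ from $x(1) = 0$ with $u(k) = 1$ and compares the result with the closed form $(k+1)/2 - 1/k$, which is a rearrangement of the sum in the example. The value $a(k) = 1$ for $k \le 0$ is read from a garbled line of the source and plays no role when $k_0 = 1$; the function names are assumptions of this sketch.

```python
from fractions import Fraction


def a(k):
    """Coefficient a(k): 1 for k <= 0 and k/(k+1) for k >= 1."""
    return Fraction(1) if k <= 0 else Fraction(k, k + 1)


def zero_state_response(k_final, k0=1):
    """Return [y(k0), ..., y(k_final)] for u(k) = 1 and x(k0) = 0."""
    x, ys = Fraction(0), []
    for k in range(k0, k_final + 1):
        ys.append(x)              # y(k) = x(k)
        x = a(k) * x + 1          # x(k+1) = a(k) x(k) + u(k)
    return ys


ys = zero_state_response(50)
closed_form = [Fraction(k + 1, 2) - Fraction(1, k) for k in range(1, 51)]
```

The response grows roughly like $k/2$, so no finite gain in the sense of Definition 27.1 can exist for this state equation.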
To develop implications of uniform bounded-input, bounded-output stability for uniform exponential stability in a convenient way, we introduce a strengthening of the reachability and observability properties in Chapter 25. Adopting the $l$-step reachability and observability properties in Chapter 26 is a start, but we go further by assuming these $l$-step properties have a certain uniformity with respect to the time index.

Recall from Chapter 25 the reachability Gramian

$$W(k_0, k_f) = \sum_{j=k_0}^{k_f - 1} \Phi(k_f, j+1)B(j)B^T(j)\Phi^T(k_f, j+1) \tag{10}$$

For a positive integer $l$, we consider reachability on intervals of the form $k - l, \ldots, k$. Obviously the corresponding Gramian takes the form

$$W(k-l, k) = \sum_{j=k-l}^{k-1} \Phi(k, j+1)B(j)B^T(j)\Phi^T(k, j+1)$$

First we deal with linear state equations where the output is precisely the state vector ($C(k)$ is the $n \times n$ identity). In this instance the natural terminology is uniform bounded-input, bounded-state stability.

27.6 Theorem Suppose for the linear state equation

$$x(k+1) = A(k)x(k) + B(k)u(k)$$

$$y(k) = x(k)$$

there exist finite positive constants $\alpha$, $\beta$, $\varepsilon$, and a positive integer $l$ such that

$$\|A(k)\| \le \alpha, \quad \|B(k)\| \le \beta, \quad \varepsilon I \le W(k-l, k) \tag{11}$$

for all $k$. Then the state equation is uniformly bounded-input, bounded-state stable if and only if it is uniformly exponentially stable.

Proof If the state equation is uniformly exponentially stable, then the desired conclusion is supplied by Lemma 27.4. Indeed the bounds in (11) involving $A(k)$ and $W(k-l, k)$ are superfluous for this part of the proof.

For the other direction assume the linear state equation is uniformly bounded-input, bounded-state stable. Applying Theorem 27.2, with $C(k) = I$, there exists a finite constant $\rho$ such that

$$\sum_{i=j}^{k-1} \|\Phi(k, i+1)B(i)\| \le \rho \tag{12}$$

for all $k$, $j$ such that $k \ge j+1$. Our strategy is to show that this implies existence of a finite constant $\psi$ such that

$$\sum_{i=j+1}^{k} \|\Phi(k, i)\| \le \psi$$

for all $k$, $j$ such that $k \ge j+1$, and thus conclude uniform exponential stability by Theorem 22.8.
We use some elementary consequences of the hypotheses as follows. First assume that $\alpha \ge 1$, without loss of generality, so that the bound on $A(k)$ implies
(13)

Also the lower bound on the Gramian in (11) together with Exercise 1.15 gives

$$W^{-1}(k-l, k) \le \frac{1}{\varepsilon} I$$

for all $k$, and therefore

$$\|W^{-1}(k-l, k)\| \le \frac{1}{\varepsilon}$$

for all $k$.

Thus prepared we shrewdly write, for any $k$, $i$ such that $k \ge i$,

$$\Phi(k, i) = \Phi(k, i)W(i-l, i)W^{-1}(i-l, i) = \sum_{q=i-l}^{i-1} \Phi(k, q+1)B(q)B^T(q)\Phi^T(i, q+1)W^{-1}(i-l, i)$$

and next the consequences described above are applied to this expression. In particular, since $0 \le i - q - 1 \le l - 1$ in the summation,

q+l)II

q+l)II

<

q=iI,...,il
Therefore
k

i=j+l

1114

ii

IkI)(k,i)II

IkD(k,q+1)B(q)II
i=j+I q=iI

for all $k$, $j$ such that $k \ge j+1$. The remainder of the proof is devoted to bounding the right side of this expression by a finite constant $\psi$.

In the inside summation on the right side of (14), replace the index $q$ by $r = q - i + l$. Then interchange the order of summation to write the right side of (14) as

a 1114

Ii

r=O

i=j+I

r+il+l)B(r+il)II

On the inside summation in this expression, replace the index $i$ by $s = r + i - l$ to obtain

k+rI

II

s+l)B(s)II

(15)

r=O s=j+I+rI

Next we use the composition property to bound (15) by


A+rI

II

s+l)B(s)II

k+r/+1)II
r=O
II

II

k+rI

,.=o

s=j+I+i!

s+l)B(s)II

Finally applying (12), which obviously holds with $k$ and $j$ replaced by $k + r - l + 1$ and $j + r - l + 1$, respectively, we can write (14) as


'I'

IIcb(k, 1)11
1=] +

This bound holds for all $k$, $j$ such that $k \ge j+1$. Obviously the right side of this expression provides a definition for a finite constant $\psi$ that establishes uniform exponential stability by Theorem 22.8.

□□□
To address the general case, where $C(k)$ is not an identity matrix, recall that the observability Gramian for the state equation (1) is defined by

$$M(k_0, k_f) = \sum_{j=k_0}^{k_f - 1} \Phi^T(j, k_0)C^T(j)C(j)\Phi(j, k_0)$$

We use the concept of $l$-step observability discussed in Chapter 26, that is, observability on index ranges of the form $k, \ldots, k+l$, where $l$ is a fixed, positive integer. The corresponding Gramian is

$$M(k, k+l) = \sum_{j=k}^{k+l-1} \Phi^T(j, k)C^T(j)C(j)\Phi(j, k)$$

27.7 Theorem Suppose that for the linear state equation (1) there exist finite positive constants $\alpha$, $\beta$, $\mu$, $\varepsilon_1$, $\varepsilon_2$, and a positive integer $l$ such that

$$\|A(k)\| \le \alpha, \quad \|B(k)\| \le \beta, \quad \|C(k)\| \le \mu$$

$$\varepsilon_1 I \le W(k-l, k), \quad \varepsilon_2 I \le M(k, k+l)$$

for all $k$. Then the state equation is uniformly bounded-input, bounded-output stable if and only if it is uniformly exponentially stable.

Proof Again uniform exponential stability implies uniform bounded-input, bounded-output stability by Lemma 27.4. So suppose that (1) is uniformly bounded-input, bounded-output stable and $\eta$ is such that the zero-state response satisfies

$$\sup_{k \ge k_0} \|y(k)\| \le \eta \sup_{k \ge k_0} \|u(k)\| \tag{18}$$

for all $k_0$ and all inputs $u(k)$. We first show that the associated state equation with $C(k) = I$, namely,

$$x(k+1) = A(k)x(k) + B(k)u(k)$$

$$y_a(k) = x(k) \tag{19}$$

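is uniformly bounded-input, bounded-state stable. To set up a contradiction argument, assume the negation. Then for the positive constant $\eta\sqrt{l/\varepsilon_2}$ there exists a $k_0$, $k_a > k_0$, and bounded input signal $u_b(k)$ such that the zero-state response of (19) satisfies

$$\|y_a(k_a)\| = \|x(k_a)\| > \eta\sqrt{l/\varepsilon_2}\, \sup_{k \ge k_0} \|u_b(k)\| \tag{20}$$

Furthermore we can assume that $u_b(k)$ satisfies $u_b(k) = 0$ for $k \ge k_a$. Applying $u_b(k)$ to (1), keeping the same initial time $k_0$, the zero-state response of (1) satisfies

$$l \sup_{k \ge k_0} \|y(k)\|^2 \ge \sum_{j=k_a}^{k_a+l-1} \|y(j)\|^2 = x^T(k_a)M(k_a, k_a+l)x(k_a)$$

Invoking the hypothesis on the observability Gramian, and then (20), gives

$$l \sup_{k \ge k_0} \|y(k)\|^2 \ge \varepsilon_2 \|x(k_a)\|^2 > l\,\eta^2 \Big( \sup_{k \ge k_0} \|u_b(k)\| \Big)^2$$

Then the elementary property of the supremum

$$\Big( \sup_{k \ge k_0} \|y(k)\| \Big)^2 = \sup_{k \ge k_0} \|y(k)\|^2$$

yields

$$\sup_{k \ge k_0} \|y(k)\| > \eta \sup_{k \ge k_0} \|u_b(k)\| \tag{21}$$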
Thus we have shown that the bounded input ub(k) is such that the bound (18) for
uniform bounded-input, bounded-output stability of (1) is violated. This contradiction
implies (19) is uniformly bounded-input, bounded-state stable. Then by Theorem 27.6

Time-Invariant Case
the state equation (19) is uniformly exponentially stable, and hence (1) also is uniformly

exponentially stable.

Time-Invariant Case
Complicated manipulations in the proofs of Theorem 27.6 and Theorem 27.7 motivate
separate consideration of the time-invariant case, where simpler characterizations of
stability, reachability, and observability properties yield relatively straightforward
proofs. For the time-invariant linear state equation

x(k+l) =Ax(k) + Bu(k)

y(k)=Cx(k)

(22)

the main task in proving an analog of Theorem 27.7 is to show that reachability, observability, and finiteness of (see (7))

$$\sum_{k=1}^{\infty} \|C A^{k-1} B\| \qquad (23)$$

imply finiteness of (see (12) of Chapter 22)

$$\sum_{k=1}^{\infty} \|A^{k-1}\|$$

27.8 Theorem Suppose the time-invariant linear state equation (22) is reachable and
observable. Then the state equation is uniformly bounded-input, bounded-output stable
if and only if it is exponentially stable.

Proof Clearly exponential stability implies uniform bounded-input, bounded-output stability since

$$\sum_{k=1}^{\infty} \|C A^{k-1} B\| \le \|C\|\, \|B\| \sum_{k=1}^{\infty} \|A^{k-1}\|$$

Conversely suppose (22) is uniformly bounded-input, bounded-output stable. Then (23) is finite, and this implies

$$\lim_{k \to \infty} C A^{k-1} B = 0 \qquad (24)$$

A clear consequence is

$$\lim_{k \to \infty} C A^{k} B = 0$$

that is,

$$\lim_{k \to \infty} C A\, A^{k-1} B = \lim_{k \to \infty} C A^{k-1} A B = 0$$

This can be repeated to conclude

$$\lim_{k \to \infty} C A^{i} A^{k-1} A^{j} B = 0; \quad i, j = 0, 1, \ldots, n \qquad (25)$$

Arranging the data in (25) in matrix form gives

$$\lim_{k \to \infty} \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} A^{k-1} \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = 0 \qquad (26)$$

By the reachability and observability hypotheses, we can select n linearly independent columns of the reachability matrix to form an invertible, n x n matrix $R_a$, and n linearly independent rows of the observability matrix to form an invertible, n x n matrix $O_a$. Then, from (26),

$$\lim_{k \to \infty} O_a A^{k-1} R_a = 0$$

Therefore

$$\lim_{k \to \infty} A^{k-1} = 0$$

and exponential stability follows by the eigenvalue-contradiction argument in the proof of Theorem 22.11.
For some purposes it is useful to express the condition for uniform bounded-input, bounded-output stability of (22) in terms of the transfer function $G(z) = C(zI - A)^{-1}B$. We use the familiar terminology that a pole of G(z) is a (complex, in general) value of z, say $z_o$, such that $|G_{ij}(z_o)| = \infty$ for some i and j.
Suppose each entry of G(z) has magnitude-less-than-unity poles. Then a partial-fraction-expansion computation in conjunction with Exercise 22.6 shows that for the corresponding unit-pulse response G(k)

$$\sum_{k=1}^{\infty} \|G(k)\| \qquad (27)$$

is finite, and any realization of G(z) is uniformly bounded-input, bounded-output stable. On the other hand if (27) is finite, then the exponential terms in any entry of G(k) must have magnitude less than unity. (Write a general entry in terms of distinct exponentials, and use a contradiction argument, being careful of zero coefficients.) But then every entry of G(z) has magnitude-less-than-unity poles. Supplying this reasoning with a little more specificity proves a standard result.

27.9 Theorem The time-invariant linear state equation (22) is uniformly bounded-input, bounded-output stable if and only if all poles of the transfer function $G(z) = C(zI - A)^{-1}B$ have magnitude less than unity.

For the time-invariant linear state equation (22), the relation between input-output
stability and internal stability depends on whether all distinct eigenvalues of A appear
as poles of $G(z) = C(zI - A)^{-1}B$. (Review Example 27.3 from a transfer-function
perspective.) Assuming reachability and observability guarantees that this is the case.
Unfortunately eigenvalues of A sometimes are called 'poles of A,' a loose terminology
that at best invites confusion.
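
The gap between input-output stability and internal stability can be seen numerically. The following is a minimal sketch using NumPy; the matrices are hypothetical (they are not Example 27.3) and are chosen so that the eigenvalue 2 of A is not reachable. For this choice the transfer function is 1/(z - 0.5), so the sum in (27) converges even though A is not exponentially stable.

```python
import numpy as np

# Hypothetical example (not from the text): the eigenvalue 2 of A is not reachable,
# so it never shows up in the unit-pulse response C A^(k-1) B.
A = np.array([[0.5, 0.0],
              [0.0, 2.0]])
B = np.array([[1.0],
              [0.0]])
C = np.array([[1.0, 1.0]])

def pulse_response_sum(A, B, C, terms=200):
    """Partial sum of ||C A^(k-1) B|| for k = 1, ..., terms."""
    total = 0.0
    Ak = np.eye(A.shape[0])              # holds A^(k-1)
    for _ in range(terms):
        total += np.linalg.norm(C @ Ak @ B, 2)
        Ak = A @ Ak
    return total

partial_sum = pulse_response_sum(A, B, C)
spectral_radius = max(abs(np.linalg.eigvals(A)))
reach_rank = np.linalg.matrix_rank(np.hstack([B, A @ B]))

print("partial sum of ||G(k)||:", partial_sum)
print("largest eigenvalue magnitude of A:", spectral_radius)
print("rank of [B AB]:", reach_rank)
```

The partial sum settles near 2 while the spectral radius of A is 2 and the reachability matrix has rank 1, which is the situation that the reachability and observability hypotheses of Theorem 27.8 exclude.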

EXERCISES
Exercise 27.1

Show that the linear state equation

x(k+1) = A(k)x(k) + B(k)u(k)
y(k) = C(k)x(k)

is uniformly bounded-input, bounded-output stable if and only if given any finite, positive constant $\delta$ there exists a finite, positive constant $\varepsilon$ such that the following property holds for any $k_o$. If the input signal satisfies

$$\|u(k)\| \le \delta, \quad k \ge k_o$$

then the corresponding zero-state response satisfies

$$\|y(k)\| \le \varepsilon, \quad k \ge k_o$$

(Note that $\varepsilon$ depends only on $\delta$, not on the particular input signal, nor on $k_o$.)

Exercise 27.2 Is the linear state equation

1/210
0 0 0
00I

.v(k+l)=
y(k)=

[1

v(k)+

u(k)

uniformly bounded-input, bounded-output stable? Is it uniformly exponentially stable?

Exercise 27.3

Is the linear state equation

.v(k+l)

01
=

2I

v(k)= [1

.v(k)

0
+

u(k)

1]v(k)

uniformly bounded-input, bounded-output stable? Is it uniformly exponentially stable?

Exercise 27.4 Suppose the p x m transfer function G(z) is strictly proper rational with one pole at z = 1 and all other poles with magnitude less than unity. Prove that any realization of G(z) is not uniformly bounded-input, bounded-output stable by exhibiting a bounded input that yields an unbounded response.

Exercise 27.5 We call the linear state equation (1) bounded-input, bounded-output stable if for any $k_o$ and bounded input signal u(k) the zero-state response is bounded. Try to show that the boundedness condition on (4) is necessary and sufficient for this stability property by mimicking the proof of Theorem 27.2. Describe any difficulties you encounter.
Exercise 27.6 Show that a time-invariant, discrete-time linear state equation is reachable if and only if there exist a positive constant $\alpha$ and a positive integer $l$ such that for all k

$$\alpha I \le W(k-l, k)$$

Give an example of a time-varying linear state equation that does not satisfy this condition, but is reachable on $[k-l, k]$ for all k and some positive integer $l$.

Exercise 27.7 Prove or provide a counterexample to the following claim about time-varying, discrete-time linear state equations. If the state equation is uniformly bounded-input, bounded-output stable and the input signal goes to zero as $k \to \infty$, then the corresponding zero-state response also goes to zero as $k \to \infty$. What about the time-invariant case?

Exercise 27.8 Consider a uniformly bounded-input, bounded-output stable, single-input, time-invariant, discrete-time linear state equation with transfer function G(z). If $\lambda$ and $\mu$ are real constants with absolute values less than unity, show that the zero-state response y(k) to

kO
satisfies

Under what conditions can such a relationship hold if the state equation is not uniformly bounded-input, bounded-output stable?

NOTES
Note 27.1 In Definition 27.1 the condition (3) can be restated as

$$\|y(k)\| \le \eta \sup_{k \ge k_o} \|u(k)\|, \quad k \ge k_o$$

but two sup's provide a nice symmetry. In any case our definition is tailored to linear systems. The equivalent definition examined in Exercise 27.1 has the advantage that it is suitable for nonlinear systems. Finally the uniformity issue behind Exercise 27.5 is discussed further in Note 12.1.

Note 27.2 A proof of the equivalence of uniform exponential stability and uniform bounded-input, bounded-output stability under the weaker hypotheses of uniform stabilizability and uniform detectability is given in

B.D.O. Anderson, "Internal and external stability of linear time-varying systems," SIAM Journal on Control and Optimization, Vol. 20, No. 3, pp. 408-413, 1982

28
DISCRETE TIME
LINEAR FEEDBACK

The theory of linear systems provides the foundation for linear control theory via the notion of feedback. In this chapter we introduce basic concepts and results of linear control theory for time-varying, discrete-time linear state equations.
Linear control involves modification of the behavior of a given m-input, p-output, n-dimensional linear state equation

x(k+1) = A(k)x(k) + B(k)u(k)
y(k) = C(k)x(k)                                   (1)

in this context often called the plant or open-loop state equation, by applying linear feedback. As shown in Figure 28.1, linear state feedback replaces the plant input u(k) by

u(k) = K(k)x(k) + N(k)r(k)                        (2)

where r(k) is the new name for the m x 1 input signal. Default assumptions are that the m x n matrix sequence K(k) and the m x m matrix sequence N(k) are defined for all k. Substituting (2) into (1) gives a new linear state equation, called the closed-loop state equation, described by

x(k+1) = [A(k) + B(k)K(k)]x(k) + B(k)N(k)r(k)
y(k) = C(k)x(k)                                   (3)

Similarly linear output feedback takes the form

u(k) = L(k)y(k) + N(k)r(k)                        (4)

28.1 Figure Structure of linear state feedback.

where again the matrix sequences L(k) and N(k) are assumed to be defined for all k. Output feedback, a special case of state feedback, is diagramed in Figure 28.2. The resulting closed-loop state equation is described by

x(k+1) = [A(k) + B(k)L(k)C(k)]x(k) + B(k)N(k)r(k)
y(k) = C(k)x(k)                                   (5)

One important (though obvious) feature of both types of linear feedback is that the closed-loop state equation remains a linear state equation. The feedback specified in (2) or (4) is called static because at any k the value of u(k) depends only on the values of r(k) and x(k), or y(k), at that same time index. (This is perhaps dangerous terminology, since the coefficient matrix sequences N(k) and K(k), or L(k), are not in general 'static.') Dynamic feedback, where u(k) is the output of a linear state equation with inputs r(k) and x(k), or y(k), is encountered in Chapter 29. If the coefficient matrix sequences in (2) or (4) are constant, then the feedback is called time invariant.

28.2 Figure Structure of linear output feedback.

28.3 Remark The absence of D(k) in (1) is not entirely innocent, as it circumvents situations where feedback can lead to an undefined closed-loop state equation. In a single-input, single-output example, with D(k) = L(k) = 1 for all k, the output and feedback equations

y(k) = C(k)x(k) + u(k)
u(k) = y(k) + N(k)r(k)

leave the closed-loop output undefined.

Effects of Feedback
We begin by considering relationships between the closed-loop state equation and the plant. This is the initial step in describing what can be achieved by feedback. The available answers turn out to be disappointingly complicated for the general case in that convenient relationships are not obtained. However matters are more encouraging in the time-invariant case, particularly when z-transform representations are used. First the effect of linear feedback on the transition matrix is considered. Then we address the effect on input-output behavior.
In the course of the development, we sometimes encounter the inverse of a matrix of the form $[I - F(z)]$, where F(z) is a square matrix of strictly-proper rational functions. To justify invertibility note that $\det [I - F(z)]$ is a rational function of z, and it must be a nonzero rational function since $\|F(z)\| \to 0$ as $|z| \to \infty$. Therefore $[I - F(z)]^{-1}$ exists for all but a finite number of values of z, and, from the adjugate-over-determinant formula, it is a matrix of rational functions. (This reasoning applies also to the familiar matrix $(zI - A)^{-1} = (1/z)(I - A/z)^{-1}$, though a more explicit argument is used in Chapter 21.)

28.4 Theorem Let $\Phi_A(k, j)$ be the transition matrix for the open-loop state equation (1) and $\Phi_{A+BK}(k, j)$ be the transition matrix for the closed-loop state equation (3) resulting from state feedback (2). Then

$$\Phi_{A+BK}(k, j) = \Phi_A(k, j) + \sum_{i=j}^{k-1} \Phi_A(k, i+1)\, B(i) K(i)\, \Phi_{A+BK}(i, j) \qquad (6)$$

for all k, j such that $k \ge j+1$. If the open-loop state equation and state feedback both are time-invariant, then the z-transform of the closed-loop transition matrix can be expressed in terms of the z-transform of the open-loop transition matrix as

$$z(zI - A - BK)^{-1} = [I - (zI - A)^{-1}BK]^{-1}\, z(zI - A)^{-1} \qquad (7)$$

Proof For any j we establish (6) by an induction on k, beginning with the obvious case of k = j+1:

$$\Phi_{A+BK}(j+1, j) = A(j) + B(j)K(j) = \Phi_A(j+1, j) + \Phi_A(j+1, j+1)\, B(j)K(j)\, \Phi_{A+BK}(j, j)$$

Supposing that (6) holds for k = j+J, where J is a positive integer, write

$$\Phi_{A+BK}(j+J+1, j) = A(j+J)\, \Phi_{A+BK}(j+J, j) + B(j+J)K(j+J)\, \Phi_{A+BK}(j+J, j)$$

Using the inductive hypothesis to replace the first $\Phi_{A+BK}(j+J, j)$ on the right side,

$$\Phi_{A+BK}(j+J+1, j) = A(j+J)\Phi_A(j+J, j) + \sum_{i=j}^{j+J-1} A(j+J)\Phi_A(j+J, i+1)\, B(i)K(i)\, \Phi_{A+BK}(i, j) + B(j+J)K(j+J)\, \Phi_{A+BK}(j+J, j)$$

Including the last term as an i = j+J summand gives

$$\Phi_{A+BK}(j+J+1, j) = \Phi_A(j+J+1, j) + \sum_{i=j}^{j+J} \Phi_A(j+J+1, i+1)\, B(i)K(i)\, \Phi_{A+BK}(i, j)$$

to conclude the argument.
For a time-invariant situation, rewriting (6) in terms of powers of A, with j = 0, gives

$$(A + BK)^k = A^k + \sum_{i=0}^{k-1} A^{k-1-i} BK (A + BK)^i, \quad k \ge 1 \qquad (8)$$

and both sides can be interpreted as identity matrices for k = 0. Also we can view the summation term as a one-unit delay of the convolution

$$\sum_{i=0}^{k} A^{k-i} BK (A + BK)^i$$

Then the z-transform, using in particular the convolution and delay properties, yields

$$z(zI - A - BK)^{-1} = z(zI - A)^{-1} + z^{-1}\, z(zI - A)^{-1}\, BK\, z(zI - A - BK)^{-1}$$

an expression that easily rearranges to (7).
□□□
It is a simple matter to modify Theorem 28.4 for linear output feedback by
replacing K(k) by L(k)C(k).
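
The relation between open-loop and closed-loop transition matrices in Theorem 28.4 can be spot-checked numerically. The sketch below assumes NumPy and uses randomly generated, hypothetical coefficient sequences; the helper names `transition` and `rhs` are illustrative only. It compares the closed-loop transition matrix with the open-loop transition matrix plus the sum over i = j, ..., k-1 of Phi_A(k, i+1) B(i) K(i) Phi_{A+BK}(i, j).

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, steps = 3, 2, 6

# Hypothetical time-varying data, indexed k = 0, ..., steps - 1.
A = [rng.standard_normal((n, n)) for _ in range(steps)]
B = [rng.standard_normal((n, m)) for _ in range(steps)]
K = [rng.standard_normal((m, n)) for _ in range(steps)]
A_cl = [A[i] + B[i] @ K[i] for i in range(steps)]   # closed-loop A(k) + B(k)K(k)

def transition(mats, k, j):
    """Phi(k, j) = mats[k-1] ... mats[j] for k >= j (identity when k == j)."""
    Phi = np.eye(n)
    for i in range(j, k):
        Phi = mats[i] @ Phi
    return Phi

def rhs(k, j):
    """Open-loop transition matrix plus the feedback summation term."""
    total = transition(A, k, j)
    for i in range(j, k):
        total = total + transition(A, k, i + 1) @ B[i] @ K[i] @ transition(A_cl, i, j)
    return total

max_err = max(
    np.abs(transition(A_cl, k, j) - rhs(k, j)).max()
    for j in range(steps)
    for k in range(j + 1, steps + 1)
)
print("largest discrepancy:", max_err)
```

The discrepancy should be at rounding-error level for every pair k >= j+1 in the tested range.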
Convenient relationships between the input-output representations (unit-pulse
responses) for the plant and closed-loop state equation are not available for either state
or output feedback in general. However explicit formulas can be derived in the time-invariant case for output feedback.

28.5 Theorem If G(k) is the unit-pulse response of the time-invariant state equation

x(k+1) = Ax(k) + Bu(k)
y(k) = Cx(k)

and $\hat{G}(k)$ is the unit-pulse response of the time-invariant, closed-loop state equation

x(k+1) = [A + BLC]x(k) + BNr(k)
y(k) = Cx(k)

obtained by time-invariant linear output feedback, then

$$\hat{G}(k) = G(k)N + \sum_{j=0}^{k} G(k-j)\, L\, \hat{G}(j), \quad k \ge 0 \qquad (9)$$

Also the transfer function of the closed-loop state equation can be expressed in terms of the transfer function of the open-loop state equation by

$$\hat{G}(z) = [I - G(z)L]^{-1} G(z) N \qquad (10)$$

Proof Recalling that

$$G(k) = \begin{cases} 0, & k = 0 \\ CA^{k-1}B, & k \ge 1 \end{cases} \qquad \hat{G}(k) = \begin{cases} 0, & k = 0 \\ C(A + BLC)^{k-1}BN, & k \ge 1 \end{cases}$$

we make use of (8) with k replaced by k-1, and K replaced by LC, to obtain

$$C(A + BLC)^{k-1}BN = CA^{k-1}BN + \sum_{i=0}^{k-2} CA^{k-2-i}B\, L\, C(A + BLC)^{i}BN, \quad k \ge 2$$

Changing the summation index i to j = i+1 gives

$$\hat{G}(k) = G(k)N + \sum_{j=1}^{k-1} G(k-j)\, L\, \hat{G}(j), \quad k \ge 2$$

As a consequence of the values of G(k) and $\hat{G}(k)$ at k = 0, 1, this relationship extends to (9). Finally the z-transform of (9), making use of the convolution property, yields

$$\hat{G}(z) = G(z)N + G(z) L \hat{G}(z)$$

from which (10) follows easily.
□□□
An alternate expression for $\hat{G}(z)$ in (10) can be derived using a matrix identity posed in Exercise 28.1. This exercise verifies that

$$\hat{G}(z) = G(z)[I - LG(z)]^{-1} N \qquad (11)$$

Of course in the single-input, single-output case, both (10) and (11) reduce to

$$\hat{G}(z) = \frac{G(z)N}{1 - G(z)L}$$

In a different notation, with different sign conventions for feedback, this is a familiar
formula in elementary control systems.
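
The unit-pulse response relation of Theorem 28.5 can also be checked numerically. The following sketch assumes NumPy and randomly generated, hypothetical constant matrices; `pulse` is an illustrative helper name. It compares the closed-loop unit-pulse response with G(k)N plus the convolution of G with L times the closed-loop response.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p, horizon = 3, 2, 2, 6

# Hypothetical time-invariant plant and output feedback gains.
A = 0.5 * rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
L = 0.3 * rng.standard_normal((m, p))
N = rng.standard_normal((m, m))

def pulse(A, B, C, k):
    """Unit-pulse response: zero at k = 0 and C A^(k-1) B for k >= 1."""
    if k == 0:
        return np.zeros((C.shape[0], B.shape[1]))
    return C @ np.linalg.matrix_power(A, k - 1) @ B

G = [pulse(A, B, C, k) for k in range(horizon)]
G_hat = [pulse(A + B @ L @ C, B @ N, C, k) for k in range(horizon)]

max_rel_err = 0.0
for k in range(horizon):
    rhs = G[k] @ N
    for j in range(k + 1):
        rhs = rhs + G[k - j] @ L @ G_hat[j]
    scale = 1.0 + np.abs(G_hat[k]).max()
    max_rel_err = max(max_rel_err, np.abs(G_hat[k] - rhs).max() / scale)

print("largest relative discrepancy:", max_rel_err)
```

Because G(0) and the closed-loop response at k = 0 are both zero, the j = 0 and j = k summands contribute nothing, which is how the proof extends the k >= 2 formula to all k >= 0.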

State Feedback Stabilization


One of the first specific objectives that arises in considering the capabilities of feedback involves stabilization of a given plant. The basic problem is that of choosing a state feedback gain K(k) such that the resulting closed-loop state equation is uniformly exponentially stable. (In addressing uniform exponential stability, the input gain N(k) plays no role. However we should note that boundedness assumptions on N(k), B(k), and C(k) yield uniform bounded-input, bounded-output stability, as discussed in Chapter 27.) Despite the complicated, implicit relation between the open- and closed-loop transition matrices, it turns out that exhibiting a control law to accomplish stabilization is indeed manageable, though under strong hypotheses.

Actually somewhat more than uniform exponential stability can be achieved. For
this discussion it is convenient to revise Definition 22.5 on uniform exponential stability
by attaching nomenclature to the decay rate and recasting the bound.

28.6 Definition The linear state equation (1) is called uniformly exponentially stable with rate $\lambda$, where $\lambda$ is a constant satisfying $\lambda > 1$, if there exists a constant $\gamma$ such that for any $k_o$ and $x_o$ the corresponding zero-input solution satisfies

$$\|x(k)\| \le \gamma\, \lambda^{-(k-k_o)} \|x_o\|, \quad k \ge k_o$$

28.7 Lemma Suppose $\lambda$ and $\alpha$ are constants larger than unity. Then the linear state equation (1) is uniformly exponentially stable with rate $\lambda\alpha$ if the linear state equation

$$z(k+1) = \alpha A(k) z(k)$$

is uniformly exponentially stable with rate $\lambda$.

Proof It is easy to show that x(k) satisfies

$$x(k+1) = A(k)x(k), \quad x(k_o) = x_o$$

if and only if $z(k) = \alpha^{(k-k_o)} x(k)$ satisfies

$$z(k+1) = \alpha A(k) z(k), \quad z(k_o) = x_o \qquad (12)$$

Now suppose $\lambda, \alpha > 1$, and assume there is a $\gamma$ such that for any $x_o$ and $k_o$ the resulting solution of (12) satisfies

$$\|z(k)\| \le \gamma\, \lambda^{-(k-k_o)} \|x_o\|, \quad k \ge k_o$$

Then, substituting for z(k),

$$\|\alpha^{(k-k_o)} x(k)\| = \alpha^{(k-k_o)} \|x(k)\| \le \gamma\, \lambda^{-(k-k_o)} \|x_o\|$$

Multiplying through by $\alpha^{-(k-k_o)}$,

$$\|x(k)\| \le \gamma\, (\lambda\alpha)^{-(k-k_o)} \|x_o\|, \quad k \ge k_o$$

we conclude that (1) is uniformly exponentially stable with rate $\lambda\alpha$.
□□□
In this terminology a higher rate implies a more-rapidly-decaying bound on the zero-input response. Of course uniform exponential stability in the context of our previous terminology is uniform exponential stability at some unspecified rate $\lambda > 1$.
The stabilization result we present relies on an invertibility assumption on A(k), and on a uniformity condition that involves l-step reachability for the state equation (1). These strong hypotheses permit a relatively straightforward proof. The invertibility assumption can be circumvented, as discussed in Notes 28.2 and 28.3, but at substantial cost in simplicity.
Recall from Chapter 25 the reachability Gramian

$$W(k_o, k_f) = \sum_{j=k_o}^{k_f-1} \Phi(k_f, j+1)\, B(j) B^T(j)\, \Phi^T(k_f, j+1) \qquad (13)$$

We impose a uniformity condition in terms of W(k, k+l), which of course relates to the l-step reachability discussed in Chapters 26 and 27. In an attempt to control notation, we use also the related symmetric matrix

$$W_\alpha(k_o, k_f) = \sum_{j=k_o}^{k_f-1} \alpha^{4(k_o-j)}\, \Phi(k_o, j+1)\, B(j) B^T(j)\, \Phi^T(k_o, j+1) \qquad (14)$$

for $\alpha > 1$. This definition presumes invertibility of the transition matrix, and is not recognizable as a reachability Gramian. However $W_\alpha(k, k+l)$ can be loosely described as an $\alpha$-weighted version of $\Phi(k, k+l) W(k, k+l) \Phi^T(k, k+l)$, a quantity further interpreted in Note 28.1.
In the following lengthy proof $A^{-T}(k)$ denotes the transposed inverse of A(k), equivalently the inverted transpose of A(k). Properties of the invertible transition matrix for invertible A(k) are freely used. One example is in a calculation providing the identity

$$A(k)\, W_\alpha(k, k+l)\, A^T(k) = B(k)B^T(k) + \alpha^{-4}\, W_\alpha(k+1, k+l) \qquad (15)$$

the validation of which is recommended as a warm-up exercise for the reader.

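28.8 Theorem For the linear state equation (1), suppose A(k) is invertible at every k, and suppose there exist a positive integer $l$ and positive constants $\varepsilon_1$ and $\varepsilon_2$ such that

$$\varepsilon_1 I \le \Phi(k, k+l)\, W(k, k+l)\, \Phi^T(k, k+l) \le \varepsilon_2 I \qquad (16)$$

for all k. Then given a constant $\alpha > 1$ the state feedback gain

$$K(k) = -B^T(k)\, A^{-T}(k)\, W_\alpha^{-1}(k, k+l) \qquad (17)$$

is such that the resulting closed-loop state equation is uniformly exponentially stable with rate $\alpha$.
Proof To ease notation we write the closed-loop state equation as

$$x(k+1) = \hat{A}(k) x(k)$$

where

$$\hat{A}(k) = A(k) - B(k)B^T(k)\, A^{-T}(k)\, W_\alpha^{-1}(k, k+l)$$

The strategy of the proof is to show that the state equation

$$z(k+1) = \alpha \hat{A}(k) z(k)$$

is uniformly exponentially stable by applying the requisite Lyapunov stability criterion with the choice

$$Q(k) = W_\alpha^{-1}(k, k+l) \qquad (18)$$

Then Lemma 28.7 gives the desired result.
To apply Theorem 23.3 we first note that Q(k) is symmetric. Also

$$\alpha^{-4(l-1)}\, \Phi(k, k+l) W(k, k+l) \Phi^T(k, k+l) \le W_\alpha(k, k+l) \le \Phi(k, k+l) W(k, k+l) \Phi^T(k, k+l)$$

for all k, so (16) implies

$$\varepsilon_1 \alpha^{-4(l-1)} I \le W_\alpha(k, k+l) \le \varepsilon_2 I \qquad (19)$$

for all k. In particular existence of the inverse in (17) and (18) is obvious, and Exercise 1.15 gives

$$\frac{1}{\varepsilon_2} I \le Q(k) \le \frac{\alpha^{4(l-1)}}{\varepsilon_1} I \qquad (20)$$

for all k. Therefore it remains only to show that there is a positive constant $\nu$ such that

$$[\alpha \hat{A}(k)]^T Q(k+1) [\alpha \hat{A}(k)] - Q(k) \le -\nu I$$

for all k.
We begin with the first term, writing

$$[\alpha \hat{A}(k)]^T Q(k+1) [\alpha \hat{A}(k)] = \alpha^2 \left[ I - W_\alpha^{-1}(k, k+l) A^{-1}(k) B(k) B^T(k) \right] A^T(k)\, W_\alpha^{-1}(k+1, k+l+1)\, A(k) \left[ I - A^{-1}(k) B(k) B^T(k) A^{-T}(k) W_\alpha^{-1}(k, k+l) \right]$$

Making use of (15), rewritten in the form

$$\left[ I - A^{-1}(k) B(k) B^T(k) A^{-T}(k) W_\alpha^{-1}(k, k+l) \right] = \alpha^{-4} A^{-1}(k)\, W_\alpha(k+1, k+l)\, A^{-T}(k)\, W_\alpha^{-1}(k, k+l)$$

and the corresponding transpose, gives

$$[\alpha \hat{A}(k)]^T Q(k+1) [\alpha \hat{A}(k)] = \alpha^{-6}\, W_\alpha^{-1}(k, k+l) A^{-1}(k)\, W_\alpha(k+1, k+l)\, W_\alpha^{-1}(k+1, k+l+1)\, W_\alpha(k+1, k+l)\, A^{-T}(k) W_\alpha^{-1}(k, k+l) \qquad (21)$$

We commence bounding this expression using the inequality

$$W_\alpha(k+1, k+l+1) = \sum_{j=k+1}^{k+l} \alpha^{4(k+1-j)}\, \Phi(k+1, j+1) B(j) B^T(j) \Phi^T(k+1, j+1) \ge W_\alpha(k+1, k+l)$$

which implies

$$W_\alpha^{-1}(k+1, k+l+1) \le W_\alpha^{-1}(k+1, k+l)$$

Thus (21) gives

$$[\alpha \hat{A}(k)]^T Q(k+1) [\alpha \hat{A}(k)] \le \alpha^{-6}\, W_\alpha^{-1}(k, k+l) \left[ A^{-1}(k)\, W_\alpha(k+1, k+l)\, A^{-T}(k) \right] W_\alpha^{-1}(k, k+l)$$

Applying (15) again yields

$$[\alpha \hat{A}(k)]^T Q(k+1) [\alpha \hat{A}(k)] \le \alpha^{-6}\, W_\alpha^{-1}(k, k+l) \left[ \alpha^4 W_\alpha(k, k+l) - \alpha^4 A^{-1}(k) B(k) B^T(k) A^{-T}(k) \right] W_\alpha^{-1}(k, k+l) \le \alpha^{-2}\, W_\alpha^{-1}(k, k+l)$$

Therefore

$$[\alpha \hat{A}(k)]^T Q(k+1) [\alpha \hat{A}(k)] - Q(k) \le -(1 - \alpha^{-2})\, Q(k) \le -\frac{1 - \alpha^{-2}}{\varepsilon_2} I$$

for all k. Since $\alpha > 1$ this defines the requisite $\nu$, and the proof is complete.
□□□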
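
The gain of Theorem 28.8, K(k) = -B^T(k) A^{-T}(k) W_alpha^{-1}(k, k+l), can be computed directly from the definition of the alpha-weighted matrix W_alpha. The following is a minimal sketch assuming NumPy; the 2-periodic plant, the choice l = 2, and alpha = 1.5 are hypothetical values picked for illustration, and the function names are not from the text.

```python
import numpy as np

def A(k):
    # Hypothetical plant: invertible at every k, unstable, and 2-periodic.
    return np.array([[1.2, 1.0],
                     [0.0, 0.8 + 0.3 * (k % 2)]])

def B(k):
    return np.array([[0.0],
                     [1.0]])

def phi_back(k, j):
    """Phi(k, j) for j >= k, the inverse of Phi(j, k) = A(j-1) ... A(k)."""
    P = np.eye(2)
    for i in range(k, j):
        P = A(i) @ P
    return np.linalg.inv(P)

def W_alpha(k, l, alpha):
    """Sum over j = k, ..., k+l-1 of alpha^(4(k-j)) Phi(k,j+1) B(j) B(j)^T Phi(k,j+1)^T."""
    W = np.zeros((2, 2))
    for j in range(k, k + l):
        v = phi_back(k, j + 1) @ B(j)
        W += alpha ** (4 * (k - j)) * (v @ v.T)
    return W

def gain(k, l, alpha):
    """State feedback gain K(k) = -B(k)^T A(k)^(-T) W_alpha(k, k+l)^(-1)."""
    return -B(k).T @ np.linalg.inv(A(k)).T @ np.linalg.inv(W_alpha(k, l, alpha))

l, alpha, steps = 2, 1.5, 30
Phi_cl = np.eye(2)
for k in range(steps):
    Phi_cl = (A(k) + B(k) @ gain(k, l, alpha)) @ Phi_cl   # closed-loop Phi(k+1, 0)

closed_loop_norm = np.linalg.norm(Phi_cl, 2)
weighted_norm = alpha ** steps * closed_loop_norm
print(closed_loop_norm, weighted_norm)
```

If the closed loop is uniformly exponentially stable with rate alpha, the closed-loop transition matrix norm multiplied by alpha^k stays bounded; in this sketch both printed quantities should be very small after 30 steps.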
For a time-invariant linear state equation,

x(k+1) = Ax(k) + Bu(k)
y(k) = Cx(k)                                      (22)

it is an easy matter to specialize Theorem 28.8 to obtain a constant linear state feedback gain that stabilizes in the invertible-A case. However a constant stabilizing gain that does not require invertibility of A can be obtained by applying results special to time-invariant state equations, including an exercise on the discrete-time Lyapunov equation from Chapter 23. This alternative provides a constant state-feedback gain described in terms of the reachability Gramian

$$W_n = \sum_{k=0}^{n-1} A^k B B^T (A^T)^k \qquad (23)$$

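28.9 Theorem Suppose the n-dimensional, time-invariant linear state equation (22) is reachable. Then the constant state feedback gain

$$K = -B^T (A^T)^n\, W_{n+1}^{-1}\, A^{n+1} \qquad (24)$$

is such that the resulting closed-loop state equation is exponentially stable.

Proof First note that $W_{n+1}$ indeed is invertible by the reachability hypothesis. We next make use of the easily verified fact that the eigenvalues of a product of square matrices are independent of the ordering in the product. Thus the eigenvalues of

$$A + BK = A - BB^T(A^T)^n W_{n+1}^{-1} A^{n+1} = \left[ I - BB^T(A^T)^n W_{n+1}^{-1} A^{n} \right] A$$

are the same as the eigenvalues of

$$A \left[ I - BB^T(A^T)^n W_{n+1}^{-1} A^{n} \right] = A - ABB^T(A^T)^n W_{n+1}^{-1} A^{n} = \left[ I - ABB^T(A^T)^n W_{n+1}^{-1} A^{n-1} \right] A$$

which in turn are the same as the eigenvalues of

$$A \left[ I - ABB^T(A^T)^n W_{n+1}^{-1} A^{n-1} \right] = A - A^2 BB^T(A^T)^n W_{n+1}^{-1} A^{n-1}$$

Repeating this commutation process, it can be shown that all eigenvalues of A + BK have magnitude less than unity by showing that all eigenvalues of

$$F = A - A^{n+1} BB^T (A^T)^n W_{n+1}^{-1}$$

have magnitude less than unity. For this we use a Lyapunov stability argument that is set up as follows. Begin with

$$F W_{n+1} F^T = \left[ A - A^{n+1}BB^T(A^T)^n W_{n+1}^{-1} \right] W_{n+1} \left[ A - A^{n+1}BB^T(A^T)^n W_{n+1}^{-1} \right]^T$$
$$= A W_{n+1} A^T - 2A^{n+1}BB^T(A^T)^{n+1} + A^{n+1}BB^T(A^T)^n W_{n+1}^{-1} A^n BB^T (A^T)^{n+1}$$

Simple manipulations on (23) provide the identity

$$A \left[ W_{n+1} - A^n BB^T (A^T)^n \right] A^T = W_{n+1} - BB^T$$

so that

$$F W_{n+1} F^T = W_{n+1} - BB^T - A^{n+1}BB^T(A^T)^{n+1} + A^{n+1}BB^T(A^T)^n W_{n+1}^{-1} A^n BB^T (A^T)^{n+1}$$

This can be written in the form

$$F W_{n+1} F^T - W_{n+1} = -M \qquad (25)$$

where M is the symmetric matrix

$$M = BB^T + A^{n+1}B \left[ I - B^T(A^T)^n W_{n+1}^{-1} A^n B \right] B^T (A^T)^{n+1}$$

With the objective of proving $M \ge 0$, Exercise 28.2 can be used to obtain

$$M = BB^T + A^{n+1}B \left[ I + B^T(A^T)^n W_n^{-1} A^n B \right]^{-1} B^T (A^T)^{n+1} \qquad (26)$$

Clearly $[I + B^T(A^T)^n W_n^{-1} A^n B]$ is positive definite, and the inverse of a positive-definite, symmetric matrix is a positive-definite, symmetric matrix. Therefore $M \ge 0$.
We complete the proof by applying Exercise 23.10 to (25) to show that all eigenvalues of F have magnitude less than unity. This involves showing that for any n x 1 vector z the condition

$$z^T F^k M (F^T)^k z = 0, \quad k \ge 0 \qquad (27)$$

implies

$$\lim_{k \to \infty} z^T F^k = 0 \qquad (28)$$

From (26), and positive definiteness of $[I + B^T(A^T)^n W_n^{-1} A^n B]^{-1}$, it follows that (27) gives

$$z^T F^k A^{n+1} B = 0, \quad k \ge 0$$

that is,

$$z^T \left[ A - A^{n+1}BB^T(A^T)^n W_{n+1}^{-1} \right]^k A^{n+1} B = 0, \quad k \ge 0$$

Evaluating this expression sequentially for k = 0, k = 1, and so on, it is easy to prove that

$$z^T A^{n+j} B = 0, \quad j \ge 1$$

This implies

$$z^T A^{n+1} \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = 0$$

Invoking the reachability hypothesis gives

$$z^T A^{n+1} = 0 \qquad (29)$$

But then it is clear that

$$\lim_{k \to \infty} z^T F^k = \lim_{k \to \infty} z^T \left[ A - A^{n+1}BB^T(A^T)^n W_{n+1}^{-1} \right]^k = \lim_{k \to \infty} z^T A^k = 0$$

and we have finished the proof.
□□□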
Chapter 28

532

Discrete Time: Linear Feedback

If the linear state equation (22) is l-step reachable, in the obvious sense, with
l < n, the above result and its proof can be restated with n replaced by l.
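
As a concrete illustration of Theorem 28.9, the sketch below computes the gain K = -B^T (A^T)^n W_{n+1}^{-1} A^{n+1} for a small hypothetical plant in which A is singular, so the invertible-A construction of Theorem 28.8 does not apply. NumPy is assumed, and `reach_gramian` is an illustrative helper name.

```python
import numpy as np

# Hypothetical plant: A is singular (eigenvalues 2 and 0) and (A, B) is reachable.
A = np.array([[2.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
n = A.shape[0]

def reach_gramian(A, B, terms):
    """Sum of A^k B B^T (A^T)^k for k = 0, ..., terms - 1."""
    W = np.zeros_like(A)
    Ak = np.eye(A.shape[0])
    for _ in range(terms):
        W = W + Ak @ B @ B.T @ Ak.T
        Ak = A @ Ak
    return W

W = reach_gramian(A, B, n + 1)                  # W_{n+1}
An = np.linalg.matrix_power(A, n)
K = -B.T @ An.T @ np.linalg.inv(W) @ A @ An     # -B^T (A^T)^n W_{n+1}^{-1} A^{n+1}

closed_loop_radius = max(abs(np.linalg.eigvals(A + B @ K)))
print("K =", K)
print("closed-loop spectral radius:", closed_loop_radius)
```

For this data the Gramian is diag(5, 1) and the gain works out by hand to K = [-3.2, -1.6], which moves the eigenvalues from {2, 0} to {0.4, 0}.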

Eigenvalue Assignment
Another approach to stabilization in the time-invariant case is via results on eigenvalue

placement using the controller form in Chapter 13. Of course placing eigenvalues can
accomplish much more than stabilization, since the eigenvalues determine some basic
characteristics of both the zero-input and zero-state responses. Invertibility of A is not
required for these results.
Given a set of desired eigenvalues, the objective is to compute a constant state
feedback gain K such that the closed-loop state equation

x(k+1) =

(A

+BK)x(k)

(30)

has precisely these eigenvalues. In almost all situations eigenvalues are specified to
have magnitude less than unity for exponential stability. The capability of assigning
specific values for the magnitudes directly influences the rate of decay of the zero-input

response component, and assigning imaginary parts influences the frequencies of


oscillation that occur.

Because of the minor, fussy issue that eigenvalues of a real-coefficient state


equation must occur in complex-conjugate pairs, it is convenient to specify, instead of
eigenvalues, a real-coefficient, degree-n characteristic polynomial for (30). That is, the
ability to arbitrarily assign the real coefficients of the closed-loop characteristic
polynomial implies the ability to suitably arbitrarily assign closed-loop eigenvalues.
28.10 Theorem Suppose the time-invariant linear state equation (22) is reachable and rank B = m. Then for any monic, degree-n polynomial $p(\lambda)$ there is a constant state feedback gain K such that $\det(\lambda I - A - BK) = p(\lambda)$.

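Proof Suppose that the reachability indices of (22) (a natural terminology change from Chapter 13) are $\rho_1, \ldots, \rho_m$, and the state variable change to controller form in Theorem 13.9 is applied. Then the controller-form coefficient matrices are

$$PAP^{-1} = A_o + B_o U P^{-1}, \quad PB = B_o R$$

and given $p(\lambda) = \lambda^n + p_{n-1}\lambda^{n-1} + \cdots + p_0$ a feedback gain $K_{CF}$ for the new state equation can be computed as follows. Clearly

$$PAP^{-1} + PBK_{CF} = A_o + B_o U P^{-1} + B_o R K_{CF} = A_o + B_o (U P^{-1} + R K_{CF}) \qquad (31)$$

Reviewing the form of the integrator coefficient matrices $A_o$ and $B_o$, the i-th row of $UP^{-1} + RK_{CF}$ becomes row $\rho_1 + \cdots + \rho_i$ of $PAP^{-1} + PBK_{CF}$. With this observation there are several ways to proceed. One is to set

$$K_{CF} = -R^{-1} U P^{-1} + R^{-1} \begin{bmatrix} e_{\rho_1 + 1} \\ e_{\rho_1 + \rho_2 + 1} \\ \vdots \\ e_{\rho_1 + \cdots + \rho_{m-1} + 1} \\ -p_0 \;\; -p_1 \;\; \cdots \;\; -p_{n-1} \end{bmatrix}$$

where $e_j$ denotes the j-th row of the n x n identity matrix. Then from (31),

$$PAP^{-1} + PBK_{CF} = A_o + B_o \begin{bmatrix} e_{\rho_1 + 1} \\ \vdots \\ e_{\rho_1 + \cdots + \rho_{m-1} + 1} \\ -p_0 \;\; -p_1 \;\; \cdots \;\; -p_{n-1} \end{bmatrix} = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -p_0 & -p_1 & \cdots & -p_{n-1} \end{bmatrix} \qquad (32)$$

Either by straightforward calculation or review of Example 26.9 it can be shown that $PAP^{-1} + PBK_{CF}$ has the desired characteristic polynomial. Of course the characteristic polynomial of $A + BK_{CF}P$ is the same as the characteristic polynomial of

$$P(A + BK_{CF}P)P^{-1} = PAP^{-1} + PBK_{CF}$$

Therefore the choice $K = K_{CF}P$ is such that the characteristic polynomial of A + BK is $p(\lambda)$.
□□□
The input gain N(k) does not participate in stabilization, or eigenvalue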

placement, obviously because these objectives pertain to the zero-input response of the
closed-loop state equation. The gain N (k) becomes important when zero-state response
behavior is an issue. One illustration is provided by Exercise 28.6, and another occurs in
the next section.
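
For a single-input plant, a characteristic polynomial can be assigned with a few lines of code. The sketch below uses Ackermann's formula, a standard single-input technique, rather than the controller-form construction used in the proof of Theorem 28.10; the sign is adjusted for the convention u(k) = Kx(k) used in this chapter. NumPy is assumed, and the plant data and the function name `assign_single_input` are hypothetical.

```python
import numpy as np

def assign_single_input(A, B, coeffs):
    """Gain K (1 x n) with det(lambda I - A - BK) = lambda^n + p_{n-1} lambda^(n-1) + ... + p_0.

    coeffs = [p_0, p_1, ..., p_{n-1}]. Uses Ackermann's formula with the sign
    flipped for the feedback convention u = K x. Requires (A, B) reachable.
    """
    n = A.shape[0]
    # Reachability matrix [B AB ... A^(n-1)B].
    R = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
    # p(A) = A^n + p_{n-1} A^(n-1) + ... + p_0 I.
    pA = np.linalg.matrix_power(A, n)
    for i, c in enumerate(coeffs):
        pA = pA + c * np.linalg.matrix_power(A, i)
    # Last row of R^{-1}, obtained by solving R^T x = e_n.
    last_row = np.linalg.solve(R.T, np.eye(n)[:, -1])
    return -(last_row @ pA).reshape(1, n)

# Hypothetical plant with singular A; desired p(lambda) = lambda^2 - 0.5 lambda + 0.06.
A = np.array([[2.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
K = assign_single_input(A, B, [0.06, -0.5])
char_poly = np.poly(A + B @ K)      # coefficients, highest power first
print("K =", K)
print("closed-loop characteristic polynomial:", char_poly)
```

The desired polynomial has roots 0.2 and 0.3, so the closed-loop state equation is exponentially stable even though A itself is singular, consistent with the remark that invertibility of A is not required for eigenvalue assignment.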

Noninteracting Control
The stabilization and eigenvalue placement problems employ linear state feedback to change the dynamical behavior of a given plant: asymptotic character of the zero-input response, overall speed of response, and so on. Another capability of feedback is that structural features of the zero-state response of the closed-loop state equation can be changed. As an illustration we consider a plant of the form (1) with the additional assumption that p = m, and discuss the problem of noninteracting control. Repeating the state equation here for convenience,

x(k+1) = A(k)x(k) + B(k)u(k)
y(k) = C(k)x(k)                                   (33)

this problem involves using linear state feedback

u(k) = K(k)x(k) + N(k)r(k)                        (34)

to achieve two input-output objectives on a specified time interval $k_o, \ldots, k_f$. First the closed-loop state equation

x(k+1) = [A(k) + B(k)K(k)]x(k) + B(k)N(k)r(k)
y(k) = C(k)x(k)                                   (35)

should be such that for $i \ne j$ the j-th input component $r_j(k)$ has no effect on the i-th output component $y_i(k)$ for $k = k_o, \ldots, k_f$. The second objective, imposed in part to avoid a trivial situation where all output components are uninfluenced by any input component, is that the closed-loop state equation should be output reachable in the sense of Exercise 25.7.
It is clear from the problem statement that the zero-input response is not a consideration in noninteracting control, so we assume for simplicity that $x(k_o) = 0$. Then the first objective is equivalent to the requirement that the closed-loop unit-pulse response

$$\hat{G}(k, j) = C(k)\, \Phi_{A+BK}(k, j+1)\, B(j) N(j)$$

be a diagonal matrix for all k and j such that $k_o \le j < k \le k_f$. A closed-loop state equation with this property can be viewed from an input-output perspective as a collection of m independent, single-input, single-output linear systems. This simplifies the output reachability objective: from Exercise 25.7 output reachability is achieved if none of the diagonal entries of $\hat{G}(k_f, j)$ are identically zero for $j = k_o, \ldots, k_f - 1$. (This condition also is necessary for output reachability if rank $C(k_f) = m$.)
To further simplify the analysis, the closed-loop input-output representation can be rewritten to exhibit each output component. Let $C_1(k), \ldots, C_m(k)$ denote the rows of the m x n matrix C(k). Then the i-th row of $\hat{G}(k, j)$ is

$$\hat{G}_i(k, j) = C_i(k)\, \Phi_{A+BK}(k, j+1)\, B(j) N(j) \qquad (36)$$

and the i-th output component is described by

$$y_i(k) = \sum_{j=k_o}^{k-1} \hat{G}_i(k, j)\, r(j), \quad k \ge k_o + 1$$

In this format the objective of noninteracting control is that the rows of $\hat{G}(k, j)$ have the form

$$\hat{G}_i(k, j) = g_i(k, j)\, e_i, \quad i = 1, \ldots, m \qquad (37)$$

for all k, j such that $k_o \le j < k \le k_f$, where $e_i$ denotes the i-th row of $I_m$. Furthermore each $g_i(k_f, j)$ must not be identically zero on the range $j = k_o, \ldots, k_f - 1$.

It is convenient to adopt a special notation for factors that appear in the unit-pulse response of the plant (33). Let

$$L_A^{j}[C_i](k+1) = C_i(k+j+1)\, A(k+j) \cdots A(k+1), \quad j = 0, 1, 2, \ldots \qquad (38)$$

where the j = 0 case is

$$L_A^{0}[C_i](k+1) = C_i(k+1)$$

A property we use in the sequel is

$$L_A^{j+1}[C_i](k) = L_A^{j}[C_i](k+1)\, A(k), \quad j = 0, 1, 2, \ldots$$

(This notation can be interpreted in terms of recursive application of a linear operator on 1 x n matrix sequences that involves an index shift and post-multiplication by A(k). While such an interpretation emphasizes similarities to the continuous-time case in Chapter 14, it is neither needed nor helpful here.)
We use an analogous notation in relation to the closed-loop linear state equation (35):

$$L_{A+BK}^{j}[C_i](k+1) = C_i(k+j+1) [A(k+j) + B(k+j)K(k+j)] \cdots [A(k+1) + B(k+1)K(k+1)], \quad j = 0, 1, \ldots$$

It is easy to verify that

$$\hat{G}_i(k+l, k) = L_{A+BK}^{l-1}[C_i](k+1)\, B(k) N(k), \quad l = 1, 2, \ldots \qquad (39)$$

We next introduce a basic structural concept for the plant (33). The underlying calculation is a sequence of time-index shifts of the i-th component of the zero-state response of (33) until the input u(k) appears with a coefficient that is not identically zero on the index range of interest. Begin with

$$y_i(k+1) = C_i(k+1) x(k+1) = C_i(k+1) A(k) x(k) + C_i(k+1) B(k) u(k)$$

If $C_i(k+1)B(k) = 0$ for $k = k_o, \ldots, k_f - 1$, then

$$y_i(k+2) = C_i(k+2) A(k+1) x(k+1) = C_i(k+2) A(k+1) A(k) x(k) + C_i(k+2) A(k+1) B(k) u(k), \quad k = k_o, \ldots, k_f - 2$$

In continuing this calculation the coefficient of u(k) in the l-th index shift is

$$L_A^{l-1}[C_i](k+1)\, B(k)$$

up to and including the shifted index value where the coefficient of the input signal is nonzero. The number of shifts until the input appears with nonzero coefficient is of main interest, and a key assumption is that this number does not change with the index k.

28.11 Definition The linear state equation (33) is said to have constant relative degree $\kappa_1, \ldots, \kappa_m$ on $[k_o, k_f]$ if $\kappa_1, \ldots, \kappa_m$ are finite positive integers such that

$$L_A^{l}[C_i](k+1)\, B(k) = 0, \quad k = k_o, \ldots, k_f - l - 1, \quad l = 0, \ldots, \kappa_i - 2$$
$$L_A^{\kappa_i - 1}[C_i](k+1)\, B(k) \ne 0, \quad k = k_o, \ldots, k_f - \kappa_i \qquad (40)$$

for $i = 1, \ldots, m$.

We emphasize that, for each i, the constant $\kappa_i$ must be such that the relations in (40) hold at every k in the index ranges shown. Implicit in the definition is the requirement $k_f \ge k_o + \max[\kappa_1, \ldots, \kappa_m]$. Application of (40) provides a useful identity relating the open-loop and closed-loop L-notations, the proof of which is left as an easy exercise.

28.12 Lemma Suppose the linear state equation (33) has constant relative degree $\kappa_1, \ldots, \kappa_m$ on $[k_o, k_f]$. Then for any state feedback gain K(k), and $i = 1, \ldots, m$,

$$L_{A+BK}^{l}[C_i](k+1) = L_A^{l}[C_i](k+1), \quad k = k_o, \ldots, k_f - l - 1, \quad l = 0, \ldots, \kappa_i - 1 \qquad (41)$$

Conditions sufficient for existence of a solution to the noninteracting control problem on a specified time-index range are proved by intricate but elementary calculations involving the open-loop and closed-loop L-notations. A side issue of concern is that N(k) could fail to be invertible for some values of k, so that the closed-loop state equation ignores portions of the reference input yet is output reachable on $[k_o, k_f]$. However our proof optionally involves use of an N(k) that is invertible at each $k = k_o, \ldots, k_f - 1$. In a similar vein note that the following existence condition cannot be satisfied unless rank C(k) = rank B(k) = m, $k = k_o, \ldots, k_f - \min[\kappa_1, \ldots, \kappa_m]$.

28.13 Theorem Suppose the linear state equation (33) with $p = m$ has constant relative degree $\kappa_1, \ldots, \kappa_m$ on $[k_0, k_f]$, where $k_f \geq k_0 + \max[\kappa_1, \ldots, \kappa_m]$. Then there exist feedback gains $K(k)$ and $N(k)$, with $N(k)$ invertible for $k = k_0, \ldots, k_f - 1$, that provide noninteracting control on $[k_0, k_f]$ if the $m \times m$ matrix

$$\Delta(k) = \begin{bmatrix} L_A^{\kappa_1 - 1}[C_1](k+1)B(k) \\ \vdots \\ L_A^{\kappa_m - 1}[C_m](k+1)B(k) \end{bmatrix} \qquad (42)$$

is invertible at each $k = k_0, \ldots, k_f - \min[\kappa_1, \ldots, \kappa_m]$.


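Proof We want to choose gains $K(k)$ and $N(k)$ to satisfy (37) for $k_0 \leq j < k \leq k_f$ and for each $i = 1, \ldots, m$. This can be addressed by considering, for an arbitrary $i$,

$$G_i(k+l, k)$$

for $k = k_0, \ldots, k_f - l$ and $l = 1, \ldots, k_f - k_0$. Beginning with $1 \leq l \leq \kappa_i - 1$, (39), Lemma 28.12, and the definition of $\kappa_i$ can be applied to obtain

$$G_i(k+l, k) = L_{A+BK}^{l-1}[C_i](k+1)B(k)N(k) = L_A^{l-1}[C_i](k+1)B(k)N(k) = 0; \quad k = k_0, \ldots, k_f - l, \quad l = 1, \ldots, \kappa_i - 1$$

Continuing for $l = \kappa_i$, and using Lemma 28.12 again, gives

$$G_i(k+\kappa_i, k) = L_A^{\kappa_i - 1}[C_i](k+1)B(k)N(k), \quad k = k_0, \ldots, k_f - \kappa_i$$

The invertibility condition on $\Delta(k)$ in (42) permits the gain selection

$$N(k) = \Delta^{-1}(k), \quad k = k_0, \ldots, k_f - \min[\kappa_1, \ldots, \kappa_m] \qquad (43)$$

where of course $k_f - \kappa_i \leq k_f - \min[\kappa_1, \ldots, \kappa_m]$, regardless of $i$. This yields

$$G_i(k+\kappa_i, k) = e_i, \quad k = k_0, \ldots, k_f - \kappa_i$$

where $e_i$ denotes the $i$-th row of $I_m$, and a particular implication is $G_i(k_f, k_f - \kappa_i) = e_i$, a condition that proves output reachability.

Next, for $l = \kappa_i + 1$ consider

$$G_i(k+\kappa_i+1, k) = L_{A+BK}^{\kappa_i}[C_i](k+1)B(k)N(k), \quad k = k_0, \ldots, k_f - \kappa_i - 1$$

where we can write, using a property mentioned previously, and Lemma 28.12,

$$L_{A+BK}^{\kappa_i}[C_i](k+1) = L_{A+BK}^{\kappa_i - 1}[C_i](k+2)\,[A(k+1) + B(k+1)K(k+1)] = L_A^{\kappa_i - 1}[C_i](k+2)\,[A(k+1) + B(k+1)K(k+1)] \qquad (44)$$

Choosing the gain

$$K(k) = -\Delta^{-1}(k) \begin{bmatrix} L_A^{\kappa_1}[C_1](k) \\ \vdots \\ L_A^{\kappa_m}[C_m](k) \end{bmatrix}, \quad k = k_0, \ldots, k_f - \min[\kappa_1, \ldots, \kappa_m] \qquad (45)$$

yields

$$L_{A+BK}^{\kappa_i}[C_i](k+1) = L_A^{\kappa_i - 1}[C_i](k+2)A(k+1) - L_A^{\kappa_i - 1}[C_i](k+2)B(k+1)\Delta^{-1}(k+1) \begin{bmatrix} L_A^{\kappa_1}[C_1](k+1) \\ \vdots \\ L_A^{\kappa_m}[C_m](k+1) \end{bmatrix} = L_A^{\kappa_i}[C_i](k+1) - e_i \begin{bmatrix} L_A^{\kappa_1}[C_1](k+1) \\ \vdots \\ L_A^{\kappa_m}[C_m](k+1) \end{bmatrix}$$

This gives

$$L_{A+BK}^{\kappa_i}[C_i](k+1) = 0, \quad k = k_0, \ldots, k_f - \kappa_i - 1 \qquad (46)$$

so, interestingly enough, $G_i(k+\kappa_i+1, k) = 0$ for $k = k_0, \ldots, k_f - \kappa_i - 1$.

The next step is to consider $l = \kappa_i + 2$, that is

$$G_i(k+\kappa_i+2, k) = L_{A+BK}^{\kappa_i + 1}[C_i](k+1)B(k)N(k), \quad k = k_0, \ldots, k_f - \kappa_i - 2$$

Making use of (46) we find that

$$L_{A+BK}^{\kappa_i + 1}[C_i](k+1) = L_{A+BK}^{\kappa_i}[C_i](k+2)\,[A(k+1) + B(k+1)K(k+1)] = 0, \quad k = k_0, \ldots, k_f - \kappa_i - 2$$

and continuing for successive values of $l$ gives

$$G_i(k+l, k) = 0; \quad k = k_0, \ldots, k_f - l, \quad l = \kappa_i + 1, \ldots, k_f - k_0$$

This holds regardless of the values of $K(k)$ and $N(k)$ for the index range $k = k_f - \min[\kappa_1, \ldots, \kappa_m] + 1, \ldots, k_f - 1$. Thus we can extend the definitions in (43) and (45) in any convenient manner, and of course maintain invertibility of $N(k)$.

In summary, by choice of $K(k)$ and $N(k)$ we have satisfied (37) with

$$g_i(k, j) = \begin{cases} 0, & k < j + \kappa_i \\ 1, & k = j + \kappa_i \\ 0, & k > j + \kappa_i \end{cases} \qquad (47)$$

for all $k$, $j$ such that $k_0 \leq j < k \leq k_f$. Noting that the feedback gains (43) and (45) are independent of the index $i$, noninteracting control is achieved for the corresponding closed-loop state equation (35).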

□□□

There are features of this proof that deserve special mention. The first is that explicit formulas are provided for gains $N(k)$ and $K(k)$ that provide noninteracting control. (Typically many other gains also work.) It is interesting that these gains yield a closed-loop state equation with zero-state response that is time-invariant in nature, though the closed-loop state equation usually has time-varying coefficient matrices. Furthermore the closed-loop state equation is uniformly bounded-input, bounded-output stable, a desirable property we did not specify in the problem formulation. However it is not necessarily internally stable.

Necessary conditions for the noninteracting control problem are difficult to state for time-varying, discrete-time linear state equations unless further requirements are placed on the closed-loop input-output behavior. (See Note 28.4.) However Theorem 28.13 can be restated as a necessary and sufficient condition in the time-invariant case.

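For a time-invariant linear plant (22), the $k$-index range is superfluous, and we set $k_0 = 0$ and let $k_f \to \infty$. Then the notion of constant relative degree reduces to existence of finite positive integers $\kappa_1, \ldots, \kappa_m$ such that

$$C_i A^{l} B = 0, \quad l = 0, \ldots, \kappa_i - 2$$

$$C_i A^{\kappa_i - 1} B \neq 0 \qquad (48)$$

for $i = 1, \ldots, m$.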

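28.14 Theorem Suppose the time-invariant linear state equation (22) with $p = m$ has relative degree $\kappa_1, \ldots, \kappa_m$. Then there exist constant feedback gains $K$ and invertible $N$ that achieve noninteracting control if and only if the $m \times m$ matrix

$$\Delta = \begin{bmatrix} C_1 A^{\kappa_1 - 1} B \\ \vdots \\ C_m A^{\kappa_m - 1} B \end{bmatrix} \qquad (49)$$

is invertible.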

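Proof We omit the sufficiency proof, because it follows directly as a specialization of the proof of Theorem 28.13. For necessity suppose that $K$ and invertible $N$ achieve noninteracting control. Then from (37) and Lemma 28.12, making the usual notation change from $G_i(k+\kappa_i, k)$ to $G_i(\kappa_i)$ in the time-invariant case,

$$G_i(\kappa_i) = C_i(A + BK)^{\kappa_i - 1}BN = C_i A^{\kappa_i - 1}BN = g_i(\kappa_i)\,e_i, \quad i = 1, \ldots, m$$

where $e_i$ denotes the $i$-th row of $I_m$. Arranging these row vectors in a matrix gives

$$\Delta N = \text{diagonal}\,\{\, g_1(\kappa_1), \ldots, g_m(\kappa_m) \,\}$$

It follows immediately that $\Delta$ is invertible.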
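The sufficiency construction can be tried numerically in the time-invariant case. The sketch below assumes the constant-coefficient versions of the gains in (43) and (45), namely $N = \Delta^{-1}$ and $K = -\Delta^{-1}[C_i A^{\kappa_i}]$ stacked over $i$; the plant matrices and the function names are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

def relative_degree(A, B, c_row):
    """Smallest kappa >= 1 with c_row @ A^(kappa-1) @ B nonzero, as in (48)."""
    row = np.asarray(c_row, dtype=float)
    for kappa in range(1, A.shape[0] + 1):
        if not np.allclose(row @ B, 0.0):
            return kappa
        row = row @ A
    raise ValueError("no relative degree of at most n found for this output row")

def decoupling_gains(A, B, C):
    """Return (kappas, Delta, K, N) with N = Delta^{-1}, K = -Delta^{-1} [C_i A^kappa_i]."""
    m = C.shape[0]
    kappas = [relative_degree(A, B, C[i]) for i in range(m)]
    Delta = np.vstack([C[i] @ np.linalg.matrix_power(A, kappas[i] - 1) @ B
                       for i in range(m)])
    shifted = np.vstack([C[i] @ np.linalg.matrix_power(A, kappas[i])
                         for i in range(m)])
    N = np.linalg.inv(Delta)   # raises LinAlgError when Delta in (49) is singular
    K = -N @ shifted
    return kappas, Delta, K, N

# Hypothetical plant (not taken from the text): n = 3, m = p = 2.
A = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
C = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])

kappas, Delta, K, N = decoupling_gains(A, B, C)

def closed_loop_markov(l):
    """G(l) = C (A + BK)^(l-1) B N, the closed-loop unit-pulse response at lag l."""
    return C @ np.linalg.matrix_power(A + B @ K, l - 1) @ B @ N
```

For this plant the closed-loop unit-pulse response is diagonal at every lag, with the only nonzero entry of the $i$-th row occurring at lag $\kappa_i$, which is the time-invariant counterpart of (47).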

28.15 Example For the plant

0100
x(k+1)=

h(k)

1101
y(k)=
simple

x(k) +

u(k)

00]x(k)

(50)

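calculations give

$$L_A^{0}[C_1](k+1)B(k) = \begin{bmatrix} 0 & 0 \end{bmatrix}, \quad L_A^{1}[C_1](k+1)B(k) = \begin{bmatrix} 1 & 1 \end{bmatrix}, \quad L_A^{0}[C_2](k+1)B(k) = \begin{bmatrix} b(k) & 0 \end{bmatrix}$$

Suppose $[k_0, k_f]$ is an interval such that $b(k) \neq 0$ for $k = k_0, \ldots, k_f - 1$, with $k_f \geq k_0 + 2$. Then the plant has constant relative degree $\kappa_1 = 2$, $\kappa_2 = 1$ on $[k_0, k_f]$. Furthermore

$$\Delta(k) = \begin{bmatrix} 1 & 1 \\ b(k) & 0 \end{bmatrix}$$

is invertible for $k = k_0, \ldots, k_f - 1$. The gains in (43) and (45) yield the state feedback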

?]x(k)

[? ?

(51)

+ [?

and the resulting noninteracting closed-loop state equation is

10

1001
x(k+1)=

x(k) +

0000
100 1 0]
i

10
k

(This is a time-invariant closed-loop state equation, though typically the result will be such that only the zero-state response exhibits time-invariance.) A quick calculation shows that the closed-loop zero-state response is

$$y(k) = \begin{bmatrix} r_1(k-2) \\ r_2(k-1) \end{bmatrix} \qquad (52)$$

(interpreting input signals with negative arguments as zero), and the properties of noninteraction and output reachability obviously hold.

Additional Examples
We return to familiar examples to further illustrate the utility of linear feedback for modifying the behavior of linear systems.

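28.16 Example For the cohort population model introduced in Example 22.16,

$$x(k+1) = \begin{bmatrix} 0 & \beta_2 & 0 \\ 0 & 0 & \beta_3 \\ \alpha_1 & \alpha_2 & \alpha_3 \end{bmatrix} x(k) + \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} u(k)$$

$$y(k) = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} x(k) \qquad (53)$$

consider specifying the immigrant populations as constant proportions of the populations according to

$$u(k) = \begin{bmatrix} 0 & k_{12} & 0 \\ 0 & 0 & k_{23} \\ k_{31} & k_{32} & k_{33} \end{bmatrix} x(k)$$

Then the resulting population model is

$$x(k+1) = \begin{bmatrix} 0 & \beta_2 + k_{12} & 0 \\ 0 & 0 & \beta_3 + k_{23} \\ \alpha_1 + k_{31} & \alpha_2 + k_{32} & \alpha_3 + k_{33} \end{bmatrix} x(k)$$

$$y(k) = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} x(k) \qquad (54)$$

and we see that specifying the immigrant population in this way is equivalent to specifying the survival and birth rates in each age group. Of course this extraordinary flexibility is due to the fact that each state variable in (53) is independently driven by an input component.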
Suppose next that immigration is permitted into the youngest age group only. That is,

$$u(k) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ k_1 & k_2 & k_3 \end{bmatrix} x(k)$$

This yields

$$x(k+1) = \begin{bmatrix} 0 & \beta_2 & 0 \\ 0 & 0 & \beta_3 \\ \alpha_1 + k_1 & \alpha_2 + k_2 & \alpha_3 + k_3 \end{bmatrix} x(k)$$

$$y(k) = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} x(k) \qquad (55)$$

Thus the youth-only immigration policy is equivalent to specifying the birth rate in each age group. A quick calculation shows that the characteristic polynomial for (55) is

$$\lambda^3 - (\alpha_3 + k_3)\lambda^2 - \beta_3(\alpha_2 + k_2)\lambda - \beta_2\beta_3(\alpha_1 + k_1)$$

It is clear that, assuming $\beta_2, \beta_3 > 0$, the immigration proportions can be chosen to obtain any desired coefficients for the closed-loop characteristic polynomial. (By Theorem 28.10 such a conclusion also follows from checking the reachability of the linear state equation (53) with the first two inputs removed.) This immigration policy might be of interest if (53) is exponentially stable, leading to a vanishing population, or has a pair of complex (conjugate) eigenvalues, leading to an unacceptably oscillatory behavior. Other single-cohort immigration policies can be investigated in a similar way.
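The coefficient matching for the youth-only policy is easy to check numerically. The sketch below assumes the closed-loop structure of (55), with cohort matrix rows $[0\ \beta_2\ 0]$, $[0\ 0\ \beta_3]$, $[\alpha_1{+}k_1\ \alpha_2{+}k_2\ \alpha_3{+}k_3]$; the numerical birth and survival rates, the desired eigenvalues, and the helper names are hypothetical.

```python
import numpy as np

# Hypothetical rates, chosen only for illustration.
alpha = np.array([0.2, 0.9, 0.4])   # birth rates alpha_1, alpha_2, alpha_3
beta2, beta3 = 0.8, 0.7             # survival rates, both assumed positive

def closed_loop(k):
    """Closed-loop cohort matrix under youth-only immigration gains k = (k1, k2, k3)."""
    return np.array([[0.0, beta2, 0.0],
                     [0.0, 0.0, beta3],
                     [alpha[0] + k[0], alpha[1] + k[1], alpha[2] + k[2]]])

def gains_for(d2, d1, d0):
    """Gains giving the characteristic polynomial l^3 + d2 l^2 + d1 l + d0.

    Obtained by matching coefficients with
    l^3 - (alpha3 + k3) l^2 - beta3 (alpha2 + k2) l - beta2 beta3 (alpha1 + k1).
    """
    k3 = -d2 - alpha[2]
    k2 = -d1 / beta3 - alpha[1]
    k1 = -d0 / (beta2 * beta3) - alpha[0]
    return np.array([k1, k2, k3])

desired = np.poly([0.5, 0.3, 0.2])          # monic polynomial with these roots
k = gains_for(*desired[1:])
achieved = np.poly(closed_loop(k))          # characteristic polynomial coefficients
```

The division by $\beta_2\beta_3$ and by $\beta_3$ in `gains_for` is where the positivity assumption on the survival rates enters.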

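28.17 Example As concluded in Example 25.13, the state equation describing the national economy in Example 20.16,

$$x_\delta(k+1) = \begin{bmatrix} \alpha & \alpha \\ \beta(\alpha - 1) & \beta\alpha \end{bmatrix} x_\delta(k) + \begin{bmatrix} \alpha \\ \beta\alpha \end{bmatrix} g_\delta(k)$$

$$y_\delta(k) = \begin{bmatrix} 1 & 1 \end{bmatrix} x_\delta(k) + g_\delta(k) \qquad (56)$$

is reachable for any coefficient values in the permissible range $0 < \alpha < 1$, $\beta > 0$. Suppose that we want a strategy for government spending $g_\delta(k)$ that will return deviations in consumer expenditure $x_{\delta 1}(k)$ and private investment $x_{\delta 2}(k)$ to zero (corresponding to a presumably-comfortable nominal) from any initial deviation. For a linear feedback strategy

$$g_\delta(k) = \begin{bmatrix} k_1 & k_2 \end{bmatrix} x_\delta(k)$$

the closed-loop state equation is

$$x_\delta(k+1) = \begin{bmatrix} \alpha(k_1 + 1) & \alpha(k_2 + 1) \\ \beta\alpha(k_1 + 1) - \beta & \beta\alpha(k_2 + 1) \end{bmatrix} x_\delta(k) \qquad (57)$$

with characteristic polynomial

$$\lambda^2 - [\alpha(k_1 + 1) + \beta\alpha(k_2 + 1)]\lambda + \alpha\beta(k_2 + 1)$$

An inspired notion is to choose $k_1$ and $k_2$ to place both eigenvalues of (57) at zero. This leads to the choices $k_1 = k_2 = -1$, and the closed-loop state equation becomes

$$x_\delta(k+1) = \begin{bmatrix} 0 & 0 \\ -\beta & 0 \end{bmatrix} x_\delta(k) \qquad (58)$$

Thus for any initial state $x_\delta(0)$ we obtain $x_\delta(2) = 0$, either by direct calculation or a more general argument using the Cayley-Hamilton theorem on the zero-eigenvalue state equation (58). (See Note 28.5.)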
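The two-step return to zero can be confirmed by direct calculation. This is a minimal sketch that assumes the plant coefficient matrices $A = [\alpha\ \ \alpha;\ \beta(\alpha-1)\ \ \beta\alpha]$ and $B = [\alpha;\ \beta\alpha]$ for the national economy model; the numerical values of $\alpha$ and $\beta$ and the initial deviation are hypothetical.

```python
import numpy as np

alpha, beta = 0.5, 1.0   # hypothetical values in the range 0 < alpha < 1, beta > 0

A = np.array([[alpha, alpha],
              [beta * (alpha - 1.0), beta * alpha]])
B = np.array([[alpha],
              [beta * alpha]])
K = np.array([[-1.0, -1.0]])          # the choices k1 = k2 = -1

A_cl = A + B @ K                      # closed-loop coefficient matrix

# Propagate an arbitrary initial deviation for two steps.
x0 = np.array([3.0, -7.0])
x1 = A_cl @ x0
x2 = A_cl @ x1
```

Since `A_cl` is nilpotent of index two, `x2` vanishes for every `x0`, which is the Cayley-Hamilton argument in numerical form.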


EXERCISES
Exercise 28.1 Assuming existence of the indicated inverses, show that

$$P(I_m - QP)^{-1} = (I_n - PQ)^{-1}P$$

where $P$ is $n \times m$ and $Q$ is $m \times n$. Use this identity to derive (11) from (10), and compare this approach to the block-diagram method used to compute (11) in Chapter 14.

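Exercise 28.2 Specialize the matrix-inverse formula in Lemma 16.18 to the case of a real matrix $V$. Derive the so-called matrix inversion lemma

$$(V_{11} - V_{12}V_{22}^{-1}V_{21})^{-1} = V_{11}^{-1} + V_{11}^{-1}V_{12}(V_{22} - V_{21}V_{11}^{-1}V_{12})^{-1}V_{21}V_{11}^{-1}$$

by assuming invertibility of both $V_{11}$ and $V_{22}$, computing the 1,1-block of $V^{-1}$ from $VV^{-1} = I$, and comparing.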

Exercise 28.3 Given a constant $\alpha > 1$, show how to modify the feedback gain in Theorem 28.9 so that the closed-loop state equation is uniformly exponentially stable with rate $\alpha$.
Exercise 28.4 Show that for any $K$ the time-invariant state equation

$$x(k+1) = (A + BK)x(k) + Bu(k)$$
$$y(k) = Cx(k)$$

is reachable if and only if

$$x(k+1) = Ax(k) + Bu(k)$$
$$y(k) = Cx(k)$$

is reachable. Repeat the problem in the time-varying case. Hint: While an explicit argument can be used in the time-invariant case, apparently an indirect approach is required in the time-varying case.

Exercise 28.5 In the time-invariant case show that a closed-loop state equation resulting from
static linear output feedback is observable if and only if the open-loop state equation is
observable. Is the same true for static linear state feedback?

Exercise 28.6 A time-invariant linear state equation

x(k+1)=Ax(k) +Bu(k)
y(k) = Cx(k)
with p =

is said to have identity dc-gain if for any given rn x 1 vector


vector i such that
That is, for all

5=

there exists an ii x

Under the assumption that

AI B

CO

is invertible, show that


(a) if an $m \times n$ $K$ is such that $(I - A - BK)$ is invertible, then $C(I - A - BK)^{-1}B$ is invertible,
(b) if $K$ is such that $(I - A - BK)$ is invertible, then there exists an $m \times m$ matrix $N$ such that the closed-loop state equation

$$x(k+1) = (A + BK)x(k) + BNr(k)$$
$$y(k) = Cx(k)$$

has identity dc-gain.

Exercise 28.7 Repeat Exercise 28.6 (b), omitting the hypothesis that $(I - A - BK)$ is invertible.

Exercise 28.8 Based on Exercise 28.6 present conditions on a time-invariant linear state equation with $p = m$ under which there exists a feedback $u(k) = Kx(k) + Nr(k)$ yielding an exponentially stable closed-loop state equation with transfer function $G(z)$ such that $G(1)$ is diagonal and invertible. These requirements define what is sometimes called an asymptotically noninteracting closed-loop system. Justify this terminology in terms of input-output behavior.

Exercise 28.9 Consider a variation on the cohort population model of Example 28.16 where the output is the state vector ($C = I$). Show how to choose state feedback (immigration policy) $u(k) = Kx(k)$ so that the output satisfies $y(k) = y(0)$, $k \geq 0$. Show how to arrive at your result by computing, and then modifying, a noninteracting control law.

Exercise 28.10 For the time-invariant case, under what condition is the noninteracting state equation provided by Theorem 28.14 reachable? Observable? Show that if $\kappa_1 + \cdots + \kappa_m = n$, then the closed-loop state equation can be rendered exponentially stable in addition to noninteracting.

NOTES
Note 28.1 The state feedback stabilization result in Theorem 28.8 is based on

V.H.L. Cheng, "A direct way to stabilize continuous-time and discrete-time linear time-varying systems," IEEE Transactions on Automatic Control, Vol. 24, No. 4, pp. 641-643, 1979

Since invertibility of $A(k)$ is assumed, the uniformity condition (16) can be rewritten as a uniform $l$-step controllability condition

$$\varepsilon_1 I \leq W_C(k, k+l) \leq \varepsilon_2 I$$

where the controllability Gramian $W_C(k_0, k_f)$ is defined in Exercise 25.10.

Note 28.2 Results similar to Theorem 28.8 can be established without assuming $A(k)$ is invertible for every $k$. The paper

J.B. Moore, B.D.O. Anderson, "Coping with singular transition matrices in estimation and control stability theory," International Journal of Control, Vol. 31, No. 3, pp. 571-586, 1980

does so based on a dual problem of estimator stability and a clever reformulation of the stability property. This paper also reviews the history of the stabilization problem. Further stabilization results under hypotheses weaker than reachability are discussed in

B.D.O. Anderson, J.B. Moore, "Detectability and stabilizability of time-varying discrete-time linear systems," SIAM Journal on Control and Optimization, Vol. 19, No. 1, pp. 20-32, 1981

Note 28.3 The time-invariant stabilization result in Theorem 28.9 is proved for invertible $A$ in

D.L. Kleinman, "Stabilizing a discrete, constant, linear system with application to iterative methods for solving the Riccati equation," IEEE Transactions on Automatic Control, Vol. 19, No. 3, pp. 252-254, 1974

Our proof for the general case is borrowed from

E.W. Kamen, P.P. Khargonekar, "On the control of linear systems whose coefficients are functions of parameters," IEEE Transactions on Automatic Control, Vol. 29, No. 1, pp. 25-33, 1984

Using an operator-theoretic representation, this proof has been generalized to time-varying systems by P.A. Iglesias, thereby again avoiding the assumption that $A(k)$ is invertible for every $k$.

Note 28.4 The noninteracting control problem is most often discussed in terms of continuous-time systems, and several sources are listed in Note 14.7. An early paper treating a very strong form of noninteracting control in the time-varying, discrete-time case is

V. Sankaran, M.D. Srinath, "Decoupling of linear discrete time systems by state variable feedback," Journal of Mathematical Analysis and Applications, Vol. 39, pp. 338-345, 1972

From a theoretical viewpoint, differences between the discrete-time and continuous-time versions of the time-invariant noninteracting control problem are transparent, and indeed the treatment in Chapter 19 encompasses both. For periodic discrete-time systems, a treatment using sophisticated geometric tools can be found in

O.M. Grasselli, S. Longhi, "Block decoupling with stability of linear periodic systems," Journal of Mathematical Systems, Estimation, and Control, Vol. 3, No. 4, pp. 427-458, 1993

Note 28.5 The important notion of deadbeat control, introduced in Example 28.17, involves linear feedback that places all eigenvalues at zero. This results in the closed-loop state being driven to zero in finite time from any initial state. For a detailed treatment of this and other aspects of eigenvalue placement, consult

V. Kucera, Analysis and Design of Discrete Linear Control Systems, Prentice Hall, London, 1991

A deadbeat-control result for $l$-step reachable, time-varying linear state equations is in

P.P. Khargonekar, K.R. Poolla, "Polynomial matrix fraction representations for linear time-varying systems," Linear Algebra and Its Applications, Vol. 80, pp. 1-37, 1986
Note 28.6 The controller-form argument used to demonstrate eigenvalue placement by state feedback is not recommended for numerical computation. See

P. Petkov, N.N. Christov, M. Konstantinov, "A computational algorithm for pole assignment of linear multi-input systems," IEEE Transactions on Automatic Control, Vol. 31, No. 11, pp. 1044-1047, 1986

G.S. Miminis, C.C. Paige, "A direct algorithm for pole assignment of time-invariant multi-input systems," Automatica, Vol. 24, pp. 242-256, 1988

Note 28.7 A highly-sophisticated treatment of feedback control for time-varying linear systems, using operator-theoretic representations and focusing on optimal control, is provided in

A. Halanay, V. Ionescu, Time-Varying Discrete Linear Systems, Birkhauser, Basel, 1994

29
DISCRETE TIME
STATE OBSERVATION

An important variation on the notion of feedback in linear systems occurs in the theory

of state observation, and state observation in turn plays an important role in control
problems involving output feedback. In rough terms state observation involves using
current and past values of the plant input and output signals to generate an estimate of
the (assumed unknown) current state. Of course as the time index k gets larger there is
more information available, and a better estimate is expected. A more precise
formulation is based on an idealized objective. Given a linear state equation

$$x(k+1) = A(k)x(k) + B(k)u(k), \quad x(k_0) = x_0$$
$$y(k) = C(k)x(k) \qquad (1)$$

with the initial state $x_0$ unknown, the goal is to generate an $n \times 1$ vector sequence $\hat{x}(k)$ that is an estimate of $x(k)$ in the sense

$$\lim_{k \to \infty} [x(k) - \hat{x}(k)] = 0 \qquad (2)$$

It is assumed that the procedure for producing $\hat{x}(k_a)$ at any $k_a \geq k_0$ can make use of the values of $u(k)$ and $y(k)$ for $k = k_0, \ldots, k_a$, as well as knowledge of the coefficient matrices in (1).

If (1) is observable on $[k_0, k_a]$, a suggestion in Example 25.13 for obtaining a state estimate is to first compute the initial state from knowledge of $u(k)$ and $y(k)$, $k = k_0, \ldots, k_a$. Then solve (1) for $k \geq k_0$, yielding an estimate that is exact at any $k \geq k_0$, though not current. That is, the estimate is delayed because of the wait until $k_a$, the time required to compute $x_0$, and then the time to compute the current state from $x_0$. In any case observability is a key part of the state observation problem. How feedback enters the problem is less clear, for it depends on a different idea: using another linear state equation, called an observer, to generate an estimate of the state of (1).

Observers
The standard approach to state observation for (1), motivated partly on grounds of hindsight, is to generate an asymptotic estimate using another linear state equation that accepts as inputs the plant input and output signals, $u(k)$ and $y(k)$. As diagramed in Figure 29.1, consider the problem of choosing an $n$-dimensional linear state equation of the form

$$\hat{x}(k+1) = F(k)\hat{x}(k) + G(k)u(k) + H(k)y(k), \quad \hat{x}(k_0) = \hat{x}_0 \qquad (3)$$

with the property that (2) holds for any initial states $x_0$ and $\hat{x}_0$. A natural requirement to impose is that if $\hat{x}_0 = x_0$, then for every input signal $u(k)$ we should have $\hat{x}(k) = x(k)$ for all $k \geq k_0$. Forming a state equation for $x(k) - \hat{x}(k)$, simple algebraic manipulation shows that this requirement is satisfied if coefficients of (3) are chosen as

$$F(k) = A(k) - H(k)C(k)$$
$$G(k) = B(k)$$

Then (3) becomes

$$\hat{x}(k+1) = A(k)\hat{x}(k) + B(k)u(k) + H(k)[y(k) - \hat{y}(k)], \quad \hat{x}(k_0) = \hat{x}_0$$
$$\hat{y}(k) = C(k)\hat{x}(k) \qquad (4)$$

where for convenience in writing the observer state equation we have defined the output estimate $\hat{y}(k)$. The only remaining coefficient to specify is the $n \times p$ matrix sequence $H(k)$, the observer gain, and this step is best motivated by considering the error in the state estimate. (Also the observer initial state must be set, and in the absence of any better information we usually let $\hat{x}_0 = 0$.)

29.1 Figure Observer structure for generating a state estimate.

From (1) and (4) the estimate error

$$e(k) = x(k) - \hat{x}(k)$$

satisfies the linear state equation

$$e(k+1) = [A(k) - H(k)C(k)]e(k), \quad e(k_0) = x_0 - \hat{x}_0 \qquad (5)$$

Therefore (2) is satisfied if H (k) can be chosen so that (5) is uniformly exponentially

stable. Such a selection of H (k) completely specifies the linear state equation (4) that
generates the estimate. Of course uniform exponential stability of (5) is stronger than
necessary for satisfaction of (2), but we prefer this strength for reasons that will be clear
when output-feedback stabilization is considered.
The problem of choosing an observer gain $H(k)$ to stabilize (5) bears an obvious resemblance to the problem of choosing a stabilizing state-feedback gain $K(k)$ in Chapter 28, and we take advantage of this in the development. Recall that the observability Gramian for the state equation (1) is given by

$$M(k_0, k_f) = \sum_{j=k_0}^{k_f - 1} \Phi^T(j, k_0)C^T(j)C(j)\Phi(j, k_0)$$

where $\Phi(k, j)$ is the transition matrix for $A(k)$. For notational convenience an $\alpha$-weighted variant of $M(k_0, k_f)$ is defined as

$$M_\alpha(k_0, k_f) = \sum_{j=k_0}^{k_f - 1} \alpha^{4(j - k_f + 1)}\,\Phi^T(j, k_0)C^T(j)C(j)\Phi(j, k_0)$$

The explicit hypotheses we make involve $M(k-l+1, k+1)$, and this connects to the notion of $l$-step observability in Chapters 26 and 27. See also Note 29.2.

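29.2 Theorem For the linear state equation (1), suppose $A(k)$ is invertible at each $k$, and suppose there exist a positive integer $l$ and positive constants $\varepsilon_1$ and $\varepsilon_2$ such that

$$\varepsilon_1 I \leq \Phi^T(k-l+1, k+1)\,M(k-l+1, k+1)\,\Phi(k-l+1, k+1) \leq \varepsilon_2 I \qquad (6)$$

for all $k$. Then given a constant $\alpha > 1$ the observer gain

$$H(k) = \left[\Phi^T(k-l+1, k+1)\,M_\alpha(k-l+1, k+1)\,\Phi(k-l+1, k+1)\right]^{-1} A^{-T}(k)C^T(k) \qquad (7)$$

is such that the resulting observer-error state equation (5) is uniformly exponentially stable with rate $\alpha$.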

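Proof Given $\alpha > 1$, first note that (6) implies

$$\frac{\varepsilon_1}{\alpha^{4(l-1)}} I \leq \Phi^T(k-l+1, k+1)\,M_\alpha(k-l+1, k+1)\,\Phi(k-l+1, k+1) \leq \varepsilon_2 I$$

for all $k$, so that existence of the inverse in (7) is clear. To show that (7) yields an error state equation (5) that is uniformly exponentially stable with rate $\alpha$, we will apply Theorem 28.8 to show that the gain $-H^T(-k)$ is such that the linear state equation

$$f(k+1) = \{\, A^T(-k) + C^T(-k)[-H^T(-k)] \,\}\, f(k) \qquad (8)$$

is uniformly exponentially stable with rate $\alpha$. Then the result established in Exercise 22.7 concludes the proof.

To simplify notation let

$$\tilde{A}(k) = A^T(-k), \quad \tilde{B}(k) = C^T(-k), \quad \tilde{K}(k) = -H^T(-k)$$

and consider the linear state equation

$$z(k+1) = \tilde{A}(k)z(k) + \tilde{B}(k)u(k) \qquad (9)$$

From Exercise 20.11 it follows that the transition matrix for $\tilde{A}(k)$ is given in terms of the transition matrix for $A(k)$ by

$$\tilde{\Phi}(k, j) = \Phi^T(-j+1, -k+1)$$

Setting up Theorem 28.8 for (9), we use (13) of Chapter 28 to write the $l$-step reachability Gramian as

$$\tilde{W}(k, k+l) = \sum_{j=k}^{k+l-1} \tilde{\Phi}(k+l, j+1)\tilde{B}(j)\tilde{B}^T(j)\tilde{\Phi}^T(k+l, j+1) = \sum_{j=k}^{k+l-1} \Phi^T(-j, -k-l+1)C^T(-j)C(-j)\Phi(-j, -k-l+1)$$

A change of summation variable from $j$ to $q = -j$ gives

$$\tilde{W}(k, k+l) = \sum_{q=-k-l+1}^{-k} \Phi^T(q, -k-l+1)C^T(q)C(q)\Phi(q, -k-l+1) = M(-k-l+1, -k+1)$$

Then replacing $k$ by $-k$ in (6) yields, for all $k$,

$$\varepsilon_1 I \leq \Phi^T(-k-l+1, -k+1)\,\tilde{W}(k, k+l)\,\Phi(-k-l+1, -k+1) \leq \varepsilon_2 I$$

and this can be written as

$$\varepsilon_1 I \leq \tilde{\Phi}(k, k+l)\,\tilde{W}(k, k+l)\,\tilde{\Phi}^T(k, k+l) \leq \varepsilon_2 I$$

Thus the hypotheses of Theorem 28.8 are satisfied, and the gain

$$\tilde{K}(k) = -\tilde{B}^T(k)\tilde{A}^{-T}(k)\tilde{W}_\alpha^{-1}(k, k+l) \qquad (10)$$

with $\tilde{W}_\alpha(k, k+l)$ specified by (14) of Chapter 28 renders (9) uniformly exponentially stable with rate $\alpha$.

The remainder of the proof is devoted to disentangling the notation to verify $H(k)$ given in (7). Of course (10) immediately translates to

$$-H^T(-k) = -C(-k)A^{-1}(-k)\tilde{W}_\alpha^{-1}(k, k+l)$$

from which

$$H(k) = \tilde{W}_\alpha^{-1}(-k, -k+l)\,A^{-T}(k)C^T(k) \qquad (11)$$

Using (14) of Chapter 28, we write

$$\tilde{W}_\alpha(-k, -k+l) = \sum_{j=-k}^{-k+l-1} \alpha^{4(-k-j)}\,\tilde{\Phi}(-k, j+1)\tilde{B}(j)\tilde{B}^T(j)\tilde{\Phi}^T(-k, j+1)$$

$$= \sum_{j=-k}^{-k+l-1} \alpha^{4(-k-j)}\,\Phi^T(-j, k+1)C^T(-j)C(-j)\Phi(-j, k+1)$$

$$= \sum_{q=k-l+1}^{k} \alpha^{4(q-k)}\,\Phi^T(q, k+1)C^T(q)C(q)\Phi(q, k+1)$$

The composition property

$$\Phi(q, k+1) = \Phi(q, k-l+1)\,\Phi(k-l+1, k+1)$$

gives

$$\tilde{W}_\alpha(-k, -k+l) = \Phi^T(k-l+1, k+1)\,M_\alpha(k-l+1, k+1)\,\Phi(k-l+1, k+1)$$

and substituting this into (11) yields (7) to complete the proof.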

Output Feedback Stabilization


An important application of state observation arises in the context of linear feedback

when not all the state variables are available, or measured, so that the choice of state
feedback gain is restricted to have certain columns zero. The situation can be illustrated
in terms of the stabilization problem for (1) when stability cannot be achieved by static
output feedback. Our program is to first demonstrate that this predicament can occur and
then proceed to develop a general remedy involving dynamic output feedback.

29.3 Example The time-invariant linear state equation

$$x(k+1) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} x(k) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(k)$$
$$y(k) = \begin{bmatrix} 0 & 1 \end{bmatrix} x(k)$$

with static output feedback

$$u(k) = Ly(k)$$

yields the closed-loop state equation

$$x(k+1) = \begin{bmatrix} 0 & 1 \\ 1 & L \end{bmatrix} x(k)$$

The closed-loop characteristic polynomial is

$$\lambda^2 - L\lambda - 1$$

and, since the product of roots is $-1$ for every choice of $L$, the closed-loop state equation is not exponentially stable for any value of $L$. This limitation of static output feedback is not due to a failure of reachability or observability. Indeed state feedback, involving both $x_1(k)$ and $x_2(k)$, can be used to arbitrarily assign eigenvalues.

□□□
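The limitation of static output feedback in this example can be seen by sweeping the gain. The sketch below takes the plant matrices as $A = [0\ 1;\ 1\ 0]$, $B = [0;\ 1]$, $C = [0\ 1]$, which is an assumption chosen to agree with the closed-loop matrix and the product-of-roots argument in the example; the sweep range is arbitrary.

```python
import numpy as np

# Assumed plant matrices, consistent with the closed-loop form [0 1; 1 L].
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[0.0, 1.0]])

def closed_loop_eigs(L):
    """Eigenvalues of A + B L C for a scalar static output feedback gain L."""
    return np.linalg.eigvals(A + L * (B @ C))

gains = np.linspace(-5.0, 5.0, 41)
products = np.array([np.prod(closed_loop_eigs(L)).real for L in gains])
largest = np.array([np.max(np.abs(closed_loop_eigs(L))) for L in gains])
```

Because the eigenvalue product stays at $-1$ for every gain in the sweep, at least one eigenvalue always has magnitude no smaller than one, so no choice of `L` gives exponential stability.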
A natural intuition is to generate an estimate of the plant state and then try to stabilize by estimated-state feedback. This vague but powerful notion can be directly implemented using an observer, yielding stabilization by linear dynamic output feedback. Based on (4) consider

$$\hat{x}(k+1) = A(k)\hat{x}(k) + B(k)u(k) + H(k)[y(k) - C(k)\hat{x}(k)]$$
$$u(k) = K(k)\hat{x}(k) + N(k)r(k)$$

The resulting closed-loop state equation, shown in Figure 29.4, can be written as a partitioned $2n$-dimension linear state equation,

$$\begin{bmatrix} x(k+1) \\ \hat{x}(k+1) \end{bmatrix} = \begin{bmatrix} A(k) & B(k)K(k) \\ H(k)C(k) & A(k) - H(k)C(k) + B(k)K(k) \end{bmatrix} \begin{bmatrix} x(k) \\ \hat{x}(k) \end{bmatrix} + \begin{bmatrix} B(k)N(k) \\ B(k)N(k) \end{bmatrix} r(k)$$

$$y(k) = \begin{bmatrix} C(k) & 0_{p \times n} \end{bmatrix} \begin{bmatrix} x(k) \\ \hat{x}(k) \end{bmatrix} \qquad (15)$$

29.4 Figure Observer-based dynamic output feedback.

The problem is to choose the feedback gain $K(k)$, now applied to the state estimate, and the observer gain $H(k)$ to achieve uniform exponential stability for (15). (Again the gain $N(k)$ plays no role in the zero-input response.)

29.5 Theorem
For the linear state equation (1) with A (k) invertible at each k,
suppose there exist positive constants
and a positive integer / such
that

a1!

1< +1)W(k, k +l)bT(k, k +1) a2!

a'

k+l)M (kI+1, k+l)'t'(kl+l, k+l)

for all k, and


ki

11A' (i) II 13i + 132(ki 1)

for all k, j such that k j + 1. Then for any constants a> 1 and
observer gains

> 1 the feedback and

K(k) = _BT(k)A_T(k)W1;i (k, k+1)


(16)
are such that the closed-loop state equation (15) is uniformly exponentially stable with
rate a.

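Proof In considering uniform exponential stability for (15), $r(k)$ can be ignored (or set to zero). We first apply the state variable change, using suggestive notation,

$$\begin{bmatrix} x(k) \\ e(k) \end{bmatrix} = \begin{bmatrix} I_n & 0_n \\ I_n & -I_n \end{bmatrix} \begin{bmatrix} x(k) \\ \hat{x}(k) \end{bmatrix} \qquad (17)$$

It is left as a simple exercise to show that (15) is uniformly exponentially stable with rate $\alpha$ if the state equation in the new state variables,

$$\begin{bmatrix} x(k+1) \\ e(k+1) \end{bmatrix} = \begin{bmatrix} A(k) + B(k)K(k) & -B(k)K(k) \\ 0_n & A(k) - H(k)C(k) \end{bmatrix} \begin{bmatrix} x(k) \\ e(k) \end{bmatrix} \qquad (18)$$

is uniformly exponentially stable with rate $\alpha$. Let $\Phi(k, j)$ denote the transition matrix corresponding to (18), and let $\Phi_a(k, j)$ and $\Phi_e(k, j)$ denote the $n \times n$ transition matrices for $A(k) + B(k)K(k)$ and $A(k) - H(k)C(k)$, respectively. Then the result of Exercise 20.12 yields

$$\Phi(k, j) = \begin{bmatrix} \Phi_a(k, j) & -\displaystyle\sum_{i=j}^{k-1} \Phi_a(k, i+1)B(i)K(i)\Phi_e(i, j) \\ 0_n & \Phi_e(k, j) \end{bmatrix}, \quad k \geq j + 1$$

Writing $\Phi(k, j)$ as a sum of three matrices, each with one nonzero partition, the triangle inequality and Exercise 1.8 provide the inequality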


II'D(k, j)II

j)II + 1k1,(k, 1)11


AI

i+l)B

+ II

j)II

k j+l

(19)

i =1

and (presumably not large) 11 > 1, Theorems 28.8 and 29.2 imply
the feedback and observer gains in (16) are such that there is a constant y for which
For constants a>

1)11

/)11

II

kj
Then
AI

AI

I +l)B

II

j)II

IIB(i) II IIK(I) II

y2(iia)

+1

Using an inequality established in the proof of Theorem 28.8,


IlK (i) II

II BT(i) Ill Ar(l) III


118(1)11

"(i;T: (I, 1+1)1

(1)11

This gives
AI

lIE

1)11

a)"4

Then the elementary bound (Exercise 22.6)

eln(ii)

k0

(20)

yields
AI

II

i + )B (i)K
1

1)11

13

+ e 1,1(11)

a(AJ)

j+1

For $k = j$ the summation term in (19) is replaced by zero, and thus we see that each term on the right side of (19) is bounded by an exponential decaying with rate $\alpha$ for $k \geq j$. This completes the proof.

Reduced-Dimension Observers
The above discussion of state observers ignores information about the state of the plant that is provided directly by the plant output signal. For example if output components are state variables (each row of $C(k)$ has a single unity entry) there is no need to estimate what is available. We should be able to make use of this information and construct an observer only for state variables that are not directly known from the output.

Assuming the linear state equation (1) is such that $\operatorname{rank} C(k) = p$ at every $k$, a state variable change can be employed that leads to the development of a reduced-dimension observer with dimension $n - p$. Let

$$P^{-1}(k) = \begin{bmatrix} C(k) \\ P_b(k) \end{bmatrix} \qquad (21)$$

where $P_b(k)$ is an $(n-p) \times n$ matrix that is arbitrary at this point, subject to the invertibility requirement on $P(k)$. Then letting $z(k) = P^{-1}(k)x(k)$ the state equation in the new state variables can be written in the partitioned form

$$\begin{bmatrix} z_a(k+1) \\ z_b(k+1) \end{bmatrix} = \begin{bmatrix} F_{11}(k) & F_{12}(k) \\ F_{21}(k) & F_{22}(k) \end{bmatrix} \begin{bmatrix} z_a(k) \\ z_b(k) \end{bmatrix} + \begin{bmatrix} G_1(k) \\ G_2(k) \end{bmatrix} u(k), \quad \begin{bmatrix} z_a(k_0) \\ z_b(k_0) \end{bmatrix} = P^{-1}(k_0)x_0$$

$$y(k) = \begin{bmatrix} I_p & 0_{p \times (n-p)} \end{bmatrix} \begin{bmatrix} z_a(k) \\ z_b(k) \end{bmatrix} \qquad (22)$$

where $F_{11}(k)$ is $p \times p$, $G_1(k)$ is $p \times m$, $z_a(k)$ is $p \times 1$, and the remaining partitions have corresponding dimensions. Obviously $z_a(k) = y(k)$, and the following argument shows how to obtain an asymptotic estimate of the $(n-p) \times 1$ state partition $z_b(k)$. This is all that is needed, in addition to $y(k)$, to obtain an asymptotic estimate of $x(k)$.
Suppose for a moment that we have computed an $(n-p)$-dimensional observer for $z_b(k)$ of the form (slightly different from the full-dimension case)

$$z_c(k+1) = F(k)z_c(k) + G_a(k)u(k) + G_b(k)z_a(k)$$
$$\hat{z}_b(k) = z_c(k) + H(k)z_a(k) \qquad (23)$$

That is, for known $u(k)$, but regardless of the initial values $z_b(k_0)$, $z_c(k_0)$, $z_a(k_0)$, and the resulting $z_a(k)$ from (22), the solutions of (22) and (23) are such that

$$\lim_{k \to \infty} [z_b(k) - \hat{z}_b(k)] = 0$$

Then an asymptotic estimate for the state vector $z(k)$ in (22), the first $p$ components of which are perfect estimates, can be written in the form

$$\begin{bmatrix} \hat{z}_a(k) \\ \hat{z}_b(k) \end{bmatrix} = \begin{bmatrix} I_p & 0_{p \times (n-p)} \\ H(k) & I_{n-p} \end{bmatrix} \begin{bmatrix} y(k) \\ z_c(k) \end{bmatrix}$$

Pursuing this setup we examine the problem of computing an $(n-p)$-dimensional observer of the form (23) for an $n$-dimensional state equation in the special form (22). Of course the focus in this problem is on the $(n-p) \times 1$ error signal

$$e_b(k) = z_b(k) - \hat{z}_b(k)$$

that satisfies the error state equation

$$e_b(k+1) = z_b(k+1) - z_c(k+1) - H(k+1)z_a(k+1)$$

$$= F_{21}(k)z_a(k) + F_{22}(k)z_b(k) + G_2(k)u(k) - F(k)z_c(k) - G_a(k)u(k) - G_b(k)z_a(k) - H(k+1)F_{11}(k)z_a(k) - H(k+1)F_{12}(k)z_b(k) - H(k+1)G_1(k)u(k)$$

Using (23) to substitute for $z_c(k)$ and rearranging gives

$$e_b(k+1) = F(k)e_b(k) + [F_{22}(k) - H(k+1)F_{12}(k) - F(k)]\,z_b(k) + [F_{21}(k) + F(k)H(k) - G_b(k) - H(k+1)F_{11}(k)]\,z_a(k) + [G_2(k) - G_a(k) - H(k+1)G_1(k)]\,u(k), \quad e_b(k_0) = z_b(k_0) - \hat{z}_b(k_0)$$

Again a reasonable requirement on the observer is that, regardless of $u(k)$, $z_a(k_0)$, and the resulting $z_a(k)$, the lucky occurrence $\hat{z}_b(k_0) = z_b(k_0)$ should yield $e_b(k) = 0$ for all $k \geq k_0$. This objective is attained by making the coefficient choices

$$F(k) = F_{22}(k) - H(k+1)F_{12}(k)$$
$$G_b(k) = F_{21}(k) + F(k)H(k) - H(k+1)F_{11}(k)$$
$$G_a(k) = G_2(k) - H(k+1)G_1(k) \qquad (24)$$

with the resulting error state equation

$$e_b(k+1) = [F_{22}(k) - H(k+1)F_{12}(k)]\,e_b(k), \quad e_b(k_0) = z_b(k_0) - \hat{z}_b(k_0) \qquad (25)$$
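The coefficient choices for the reduced-dimension observer can be checked by simulation in the time-invariant case, where $H(k+1) = H(k) = H$. The following is a minimal sketch assuming the observer form $z_c(k+1) = Fz_c(k) + G_a u(k) + G_b z_a(k)$, $\hat{z}_b(k) = z_c(k) + Hz_a(k)$ with $F = F_{22} - HF_{12}$, $G_b = F_{21} + FH - HF_{11}$, $G_a = G_2 - HG_1$; all numerical values are hypothetical, and $H$ is picked by hand so that $F$ is nilpotent.

```python
import numpy as np

# Hypothetical partitioned plant with n = 3, p = 1, m = 1.
F11 = np.array([[0.5]])
F12 = np.array([[1.0, 0.0]])
F21 = np.array([[0.2], [-0.3]])
F22 = np.array([[0.9, 1.0], [0.0, 1.1]])
G1 = np.array([[0.0]])
G2 = np.array([[0.0], [1.0]])

H = np.array([[2.0], [1.21]])        # places both eigenvalues of F22 - H F12 at zero

F = F22 - H @ F12
Gb = F21 + F @ H - H @ F11
Ga = G2 - H @ G1

rng = np.random.default_rng(0)
za = np.array([1.0])                 # measured partition, z_a(k) = y(k)
zb = np.array([-2.0, 0.5])           # unmeasured partition
zc = np.zeros(2)                     # observer state, deliberately mismatched

errors = []
for k in range(6):
    zb_hat = zc + H @ za             # estimate of z_b(k)
    errors.append(zb - zb_hat)
    u = rng.standard_normal(1)       # arbitrary known input
    za_next = F11 @ za + F12 @ zb + G1 @ u
    zb_next = F21 @ za + F22 @ zb + G2 @ u
    zc = F @ zc + Ga @ u + Gb @ za   # observer update uses only u(k) and z_a(k)
    za, zb = za_next, zb_next
```

The recorded errors obey $e_b(k+1) = Fe_b(k)$ regardless of the input sequence, and with this nilpotent $F$ the estimate is exact from the second step on.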

To complete the specification of the reduced-dimension observer in (23), we consider conditions under which an $(n-p) \times p$ gain $H(k)$ can be chosen to yield uniform exponential stability at any desired rate for (25). These conditions are supplied by Theorem 29.2, where $A(k)$ and $C(k)$ are interpreted as $F_{22}(k)$ and $F_{12}(k)$ respectively, and the associated transition matrix and observability Gramian are correspondingly adjusted.

Return now to the state observation problem for the original state variable $x(k)$ in (1). The observer estimate for $z_b(k)$ obviously leads to an estimate

$$\hat{x}(k) = P(k) \begin{bmatrix} I_p & 0_{p \times (n-p)} \\ H(k) & I_{n-p} \end{bmatrix} \begin{bmatrix} y(k) \\ z_c(k) \end{bmatrix} \qquad (26)$$

The $n \times 1$ estimate error $e(k) = x(k) - \hat{x}(k)$ is given by

$$e(k) = P(k)[z(k) - \hat{z}(k)] = P(k) \begin{bmatrix} 0_{p \times 1} \\ e_b(k) \end{bmatrix}$$

Thus if (25) is uniformly exponentially stable with rate $\alpha > 1$, and if there exists a finite constant $\rho$ such that $\|P(k)\| \leq \rho$ for all $k$ (thereby removing some arbitrariness from $P_b(k)$ in (21)), then $\|e(k)\|$ goes to zero exponentially with rate $\alpha$.

Statement of a summary theorem is left to the dedicated reader, with reminders that the assumption on $C(k)$ used in (21) must be recalled, boundedness of $P(k)$ is required, and $F_{22}(k)$ must be invertible. Collecting the various hypotheses makes obvious an unsatisfying aspect of our treatment: hypotheses are required on the new-variable state equation (22), as well as on the original state equation (1). However this situation can be neatly rectified in the time-invariant case, where the simpler observability criterion can be used to express all the hypotheses in terms of the original state equation.

Time-Invariant Case
When specialized to the case of a time-invariant linear state equation,

$$
\begin{aligned}
x(k+1) &= Ax(k) + Bu(k), \quad x(0) = x_0 \\
y(k) &= Cx(k)
\end{aligned}
\tag{27}
$$

the full-dimension state observation problem can be connected to the state feedback stabilization problem in a much simpler fashion than is the case in Theorem 29.2. The form we choose for the observer is, from (4),

$$
\begin{aligned}
\hat{x}(k+1) &= A\hat{x}(k) + Bu(k) + H[\,y(k) - \hat{y}(k)\,], \quad \hat{x}(0) = \hat{x}_0 \\
\hat{y}(k) &= C\hat{x}(k)
\end{aligned}
\tag{28}
$$

and the error state equation is, from (5),

$$
e(k+1) = (A - HC)\,e(k), \quad e(0) = x_0 - \hat{x}_0
$$

Now the problem of choosing $H$ so that this error equation is exponentially stable with prescribed rate, or so that $A - HC$ has a prescribed characteristic polynomial, can be recast in a form familiar from Chapter 28. Let

$$
\hat{A} = A^T, \quad \hat{B} = C^T, \quad \hat{K} = -H^T
$$

Then the characteristic polynomial of $A - HC$ is identical to the characteristic polynomial of

$$
(A - HC)^T = \hat{A} + \hat{B}\hat{K}
$$

Also observability of (27) is equivalent to the reachability assumption needed to apply either Theorem 28.9 on stabilization, or Theorem 28.10 on eigenvalue assignment. (Neither of these requires invertibility of $A$.) Alternatively observer form in Chapter 13 can be used to prove more directly that if rank $C = p$ and (27) is observable, then $H$ can be chosen to obtain any desired characteristic polynomial for the error state equation. An advantage of the eigenvalue-assignment approach is the capability of placing all eigenvalues of $A - HC$ at zero, thereby guaranteeing $e(n) = 0$.
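The duality argument and the zero-eigenvalue property can be checked numerically. The following is a minimal sketch; the plant matrices and the gain `H` are illustrative assumptions chosen for this check, not values prescribed by the discussion above.

```python
import numpy as np

# Illustrative two-dimensional plant (an assumption for this sketch).
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
C = np.array([[0.0, 1.0]])

# Hypothetical observer gain chosen so that A - HC has both eigenvalues at zero.
H = np.array([[1.0],
              [0.0]])

# Dual quantities defined in the text: A_hat = A^T, B_hat = C^T, K_hat = -H^T.
A_hat = A.T
B_hat = C.T
K_hat = -H.T

# np.poly returns the characteristic polynomial coefficients of a square matrix.
poly_error = np.poly(A - H @ C)
poly_dual = np.poly(A_hat + B_hat @ K_hat)

# With all eigenvalues at zero, (A - HC)^n = 0, so e(n) = 0 for any e(0).
n = A.shape[0]
e0 = np.array([[3.0], [-7.0]])
e_n = np.linalg.matrix_power(A - H @ C, n) @ e0

print(poly_error, poly_dual, e_n.ravel())
```

Because the two polynomials coincide, any gain-selection procedure for state feedback applied to the dual pair yields an observer gain after transposing and changing sign.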

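Specialization of Theorem 29.5 on output feedback stabilization to the time-invariant case can be described in terms of eigenvalue assignment, and again the invertibility assumption on $A$ is avoided. Time-invariant linear feedback of the estimated state yields a dimension-$2n$ closed-loop state equation of the form (15):

$$
\begin{aligned}
\begin{bmatrix} x(k+1) \\ \hat{x}(k+1) \end{bmatrix} &= \begin{bmatrix} A & BK \\ HC & A - HC + BK \end{bmatrix} \begin{bmatrix} x(k) \\ \hat{x}(k) \end{bmatrix} + \begin{bmatrix} BN \\ BN \end{bmatrix} r(k) \\
y(k) &= \begin{bmatrix} C & 0_{p \times n} \end{bmatrix} \begin{bmatrix} x(k) \\ \hat{x}(k) \end{bmatrix}
\end{aligned}
\tag{29}
$$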

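The state variable change (17) shows that the characteristic polynomial for (29) is the same as the characteristic polynomial for the linear state equation

$$
\begin{aligned}
\begin{bmatrix} x(k+1) \\ e(k+1) \end{bmatrix} &= \begin{bmatrix} A + BK & -BK \\ 0_n & A - HC \end{bmatrix} \begin{bmatrix} x(k) \\ e(k) \end{bmatrix} + \begin{bmatrix} BN \\ 0 \end{bmatrix} r(k) \\
y(k) &= \begin{bmatrix} C & 0_{p \times n} \end{bmatrix} \begin{bmatrix} x(k) \\ e(k) \end{bmatrix}
\end{aligned}
\tag{30}
$$

Taking advantage of block triangular structure, the characteristic polynomial of (30) is

$$
\det(\lambda I - A - BK)\,\det(\lambda I - A + HC)
$$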
This calculation has revealed a remarkable eigenvalue separation property. The $2n$ eigenvalues of the closed-loop state equation (29) are given by the $n$ eigenvalues of the observer and the $n$ eigenvalues that would be obtained by linear state feedback (instead of linear estimated-state feedback). If (27) is reachable and observable, then $K$ and $H$ can be chosen such that the characteristic polynomial for (29) is any specified monic, degree-$2n$ polynomial.
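The separation property lends itself to a quick numerical check. The sketch below uses an illustrative plant and hypothetical gains `K` and `H` (all assumptions, picked so that the four eigenvalues are distinct) and compares the characteristic polynomial of the closed-loop matrix in (29) with the product of the two smaller polynomials.

```python
import numpy as np

# Illustrative plant and hypothetical gains (assumptions for this sketch).
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[0.0, 1.0]])
K = np.array([[-1.1, 0.7]])    # places eigenvalues of A + BK at 0.5 and 0.2
H = np.array([[0.88],
              [0.10]])         # places eigenvalues of A - HC at 0.3 and -0.4

# Closed-loop coefficient matrix of (29), with state [x; x_hat].
A_cl = np.block([[A,     B @ K],
                 [H @ C, A - H @ C + B @ K]])

# Characteristic polynomial of (29) versus the product predicted by (30).
poly_closed_loop = np.poly(A_cl)
poly_product = np.convolve(np.poly(A + B @ K), np.poly(A - H @ C))

eigs_closed_loop = np.sort(np.linalg.eigvals(A_cl).real)
print(poly_closed_loop, poly_product, eigs_closed_loop)
```

The two coefficient vectors agree to rounding error, so the feedback gain and the observer gain can be designed independently.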
Another property of the closed-loop state equation that is equally remarkable
concerns input-output behavior. The transfer function for (29) is identical to the transfer
function for (30), and a quick calculation, again making use of the block-triangular
structure in (30), shows that this transfer function is

$$
G(z) = C(zI - A - BK)^{-1}BN
\tag{31}
$$

That is, linear estimated-state feedback leads to the same input-output (zero-state)
behavior as does linear state feedback.
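A numerical spot check of this input-output equivalence is straightforward. In the sketch below the plant, the gains, and the sample points are illustrative assumptions, and `transfer` is a hypothetical helper that evaluates a transfer function at one complex number.

```python
import numpy as np

# Illustrative plant and hypothetical gains (assumptions for this sketch).
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[0.0, 1.0]])
K = np.array([[-1.1, 0.7]])
H = np.array([[0.88],
              [0.10]])
N = np.array([[1.0]])

# Coefficient matrices of the closed-loop state equation (29).
A_cl = np.block([[A,     B @ K],
                 [H @ C, A - H @ C + B @ K]])
B_cl = np.vstack([B @ N, B @ N])
C_cl = np.hstack([C, np.zeros((1, 2))])


def transfer(Am, Bm, Cm, z):
    """Evaluate Cm (zI - Am)^{-1} Bm at the complex number z."""
    size = Am.shape[0]
    return Cm @ np.linalg.solve(z * np.eye(size) - Am, Bm)


# Sample points chosen away from the closed-loop eigenvalues.
test_points = [2.0, 1.5 + 0.5j, -3.0]
G_estimated = [transfer(A_cl, B_cl, C_cl, z) for z in test_points]
G_state = [transfer(A + B @ K, B @ N, C, z) for z in test_points]
print(G_estimated, G_state)
```

The observer eigenvalues cancel in the transfer function, which is why only $A + BK$ appears in (31).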

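29.6 Example  For the reachable and observable linear state equation encountered in Example 29.3,

$$
\begin{aligned}
x(k+1) &= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} x(k) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(k) \\
y(k) &= \begin{bmatrix} 0 & 1 \end{bmatrix} x(k)
\end{aligned}
\tag{32}
$$

the full-dimension observer (28) has the form

$$
\begin{aligned}
\hat{x}(k+1) &= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \hat{x}(k) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(k) + \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} [\,y(k) - \hat{y}(k)\,] \\
\hat{y}(k) &= \begin{bmatrix} 0 & 1 \end{bmatrix} \hat{x}(k)
\end{aligned}
$$

The resulting estimate-error equation is

$$
e(k+1) = \begin{bmatrix} 0 & 1 - h_1 \\ 1 & -h_2 \end{bmatrix} e(k)
$$

By setting $h_1 = 1$, $h_2 = 0$ to place both eigenvalues of the error equation at zero, we obtain the appealing property that $e(k) = 0$ for $k \ge 2$, regardless of $e(0)$. That is, the state estimate is exact after two time units. Then the observer becomes

$$
\hat{x}(k+1) = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \hat{x}(k) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(k) + \begin{bmatrix} 1 \\ 0 \end{bmatrix} y(k)
\tag{33}
$$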

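To achieve stabilization consider estimated-state feedback of the form

$$
u(k) = K\hat{x}(k) + r(k)
\tag{34}
$$

where $r(k)$ is the scalar reference input signal. Choosing $K = [\,k_1 \;\; k_2\,]$ to place both eigenvalues of

$$
A + BK = \begin{bmatrix} 0 & 1 \\ 1 + k_1 & k_2 \end{bmatrix}
$$

at zero leads to $K = [\,-1 \;\; 0\,]$. Then substituting (34) into the plant (32) and observer (33) yields the closed-loop description

$$
\begin{aligned}
x(k+1) &= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} x(k) + \begin{bmatrix} 0 & 0 \\ -1 & 0 \end{bmatrix} \hat{x}(k) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} r(k) \\
\hat{x}(k+1) &= \begin{bmatrix} 1 \\ 0 \end{bmatrix} y(k) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} r(k) \\
y(k) &= \begin{bmatrix} 0 & 1 \end{bmatrix} x(k)
\end{aligned}
$$

This can be rewritten as the 4-dimensional linear state equation

$$
\begin{aligned}
\begin{bmatrix} x(k+1) \\ \hat{x}(k+1) \end{bmatrix} &= \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x(k) \\ \hat{x}(k) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} r(k) \\
y(k) &= \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} x(k) \\ \hat{x}(k) \end{bmatrix}
\end{aligned}
\tag{35}
$$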

Easy algebraic calculations verify that (35) has all 4 eigenvalues at zero, and that $x(k) = \hat{x}(k) = 0$ for $k \ge 3$, regardless of initial state. Thus exponential stability, which cannot be attained by static output feedback, is achieved by dynamic output feedback. Finally the transfer function for (35) is calculated as

$$
G(z) = \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} z & -1 & 0 & 0 \\ -1 & z & 1 & 0 \\ 0 & -1 & z & 0 \\ 0 & 0 & 0 & z \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} = \frac{1}{z}
$$

and (31) is readily verified. Indeed the zero-state response of (35) is simply a one-unit delay of the reference input signal.
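The claims of this example can be reproduced by direct simulation. The sketch below assumes the coefficient values $A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, $B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, $C = [\,0 \;\; 1\,]$, which are our reading of the example and are consistent with the gains $h_1 = 1$, $h_2 = 0$, $K = [\,-1 \;\; 0\,]$ and the one-unit delay stated in the text; `simulate` is a hypothetical helper.

```python
import numpy as np

# Plant, observer gain, and feedback gain as read from the example (assumed values).
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[0.0, 1.0]])
H = np.array([[1.0],
              [0.0]])
K = np.array([[-1.0, 0.0]])

# Closed-loop matrices of the form (29) with N = 1, state [x; x_hat].
A_cl = np.block([[A,     B @ K],
                 [H @ C, A - H @ C + B @ K]])
B_cl = np.vstack([B, B])
C_cl = np.hstack([C, np.zeros((1, 2))])


def simulate(r, z0):
    """Return the outputs y(0), ..., y(len(r)-1) and the state after len(r) steps."""
    z = z0.copy()
    outputs = []
    for r_k in r:
        outputs.append((C_cl @ z).item())
        z = A_cl @ z + B_cl * r_k
    return np.array(outputs), z


# Zero-state response: the output should equal the reference delayed by one step.
r = np.array([1.0, -2.0, 0.5, 4.0, 3.0])
y_zero_state, _ = simulate(r, np.zeros((4, 1)))

# Zero-input response: the state should vanish after three steps.
_, z_after_3 = simulate(np.zeros(3), np.array([[1.0], [2.0], [-3.0], [4.0]]))
print(y_zero_state, z_after_3.ravel())
```

Simulation avoids computing eigenvalues of a nilpotent matrix, which is numerically delicate, and checks the finite-time behavior directly.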

Specialization of our treatment of reduced-dimension observers to the time-invariant case also proceeds in a straightforward fashion. We assume rank $C = p$ and choose $P_b(k)$ in (21) to be constant. Then every time-varying coefficient matrix in (22) becomes a constant matrix. This yields a dimension-$(n-p)$ observer described by

$$
\begin{aligned}
z_c(k+1) &= (F_{22} - HF_{12})\,z_c(k) + (G_2 - HG_1)\,u(k) \\
 &\quad + (F_{21} + F_{22}H - HF_{12}H - HF_{11})\,z_a(k) \\
\hat{z}_b(k) &= z_c(k) + Hz_a(k) \\
\hat{x}(k) &= P \begin{bmatrix} z_a(k) \\ \hat{z}_b(k) \end{bmatrix}
\end{aligned}
\tag{36}
$$

typically with the initial condition $z_c(0) = 0$. The error equation for the estimate of $z_b(k)$ is the obvious specialization of (25):

$$
e_b(k+1) = (F_{22} - HF_{12})\,e_b(k), \quad e_b(0) = z_b(0) - \hat{z}_b(0)
\tag{37}
$$

For the reduced-dimension observer in (36), the $(n-p) \times p$ gain matrix $H$ can be chosen to provide exponential stability for (37), or to provide any desired characteristic polynomial. This is shown in the proof of the following summary statement.

29.7 Theorem  Suppose the time-invariant linear state equation (27) is observable and rank $C = p$. Then there is an observer gain $H$ such that the reduced-dimension observer defined by (36) has an exponentially-stable error state equation (37).

Proof  Selecting a constant $(n-p) \times n$ matrix $P_b$ such that the constant matrix $P$ defined in (21) is invertible, the state variable change $z(k) = P^{-1}x(k)$ yields an observable, time-invariant state equation of the form (22). Specifically the coefficient matrices of main interest are

$$
P^{-1}AP = \begin{bmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{bmatrix}, \quad CP = \begin{bmatrix} I_p & 0 \end{bmatrix}
\tag{38}
$$

where $F_{11}$ is $p \times p$ and $F_{22}$ is $(n-p) \times (n-p)$. In order to prove that $H$ can be chosen to exponentially stabilize (37), or to yield

$$
\det(\lambda I - F_{22} + HF_{12}) = q(\lambda)
$$

where $q(\lambda)$ is any degree-$(n-p)$ monic polynomial, we need only show that the $(n-p)$-dimensional state equation

$$
\begin{aligned}
z_d(k+1) &= F_{22}\,z_d(k) \\
w(k) &= F_{12}\,z_d(k)
\end{aligned}
\tag{39}
$$

is observable.

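Proceeding by contradiction, suppose (39) is not observable. Then there exists a nonzero $(n-p) \times 1$ vector $v$ such that

$$
\begin{bmatrix} F_{12} \\ F_{12}F_{22} \\ \vdots \\ F_{12}F_{22}^{\,n-p-1} \end{bmatrix} v = 0
$$

Furthermore the Cayley-Hamilton theorem gives $F_{12}F_{22}^{\,k}v = 0$ for all $k \ge 0$. But then a straightforward iteration shows that

$$
\begin{bmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{bmatrix}^k \begin{bmatrix} 0_{p \times 1} \\ v \end{bmatrix} = \begin{bmatrix} 0_{p \times 1} \\ F_{22}^{\,k}v \end{bmatrix}, \quad k \ge 0
$$

and therefore

$$
\begin{bmatrix} I_p & 0 \end{bmatrix} \begin{bmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{bmatrix}^k \begin{bmatrix} 0_{p \times 1} \\ v \end{bmatrix} = 0, \quad k = 0, \ldots, n-1
$$

Interpreting this in terms of the block rows of the $np \times n$ observability matrix corresponding to (38) yields a contradiction to the observability hypothesis on (27).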

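29.8 Example  To compute a reduced-dimension observer for the linear state equation (32) in Example 29.6,

$$
\begin{aligned}
x(k+1) &= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} x(k) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(k) \\
y(k) &= \begin{bmatrix} 0 & 1 \end{bmatrix} x(k)
\end{aligned}
$$

we begin with a state variable change (21) to obtain the special form of the output matrix in (22). Letting

$$
P = P^{-1} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
$$

gives

$$
\begin{aligned}
\begin{bmatrix} z_a(k+1) \\ z_b(k+1) \end{bmatrix} &= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} z_a(k) \\ z_b(k) \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u(k) \\
y(k) &= \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} z_a(k) \\ z_b(k) \end{bmatrix}
\end{aligned}
\tag{40}
$$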

The reduced-dimension observer in (36) becomes the scalar state equation

$$
\begin{aligned}
z_c(k+1) &= -Hz_c(k) - Hu(k) + (1 - H^2)\,y(k) \\
\hat{z}_b(k) &= z_c(k) + Hy(k)
\end{aligned}
\tag{41}
$$

The choice $H = 0$ defines an observer with zero-eigenvalue, scalar error equation $e_b(k+1) = 0$, $k \ge 0$, from (37). Then from (36) the observer can be written as

$$
\begin{aligned}
z_c(k+1) &= y(k) \\
\hat{x}(k) &= \begin{bmatrix} z_c(k) \\ y(k) \end{bmatrix}
\end{aligned}
\tag{42}
$$

Thus $z_c(k)$ is an estimate of $x_1(k)$, while $y(k)$ provides $x_2(k)$ exactly. Note that the estimated state from this observer is exact for $k \ge 1$, as compared to the estimate obtained from the full-dimension observer in Example 29.6 which is exact for $k \ge 2$.
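The one-step exactness of the reduced-dimension estimate can be confirmed by simulation. The sketch below builds the observer coefficients of (36) from an assumed plant ($A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, $B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, $C = [\,0 \;\; 1\,]$, our reading of Example 29.6) with the gain $H = 0$; the function `run` and the test signals are hypothetical.

```python
import numpy as np

# Plant as read from Example 29.6 (assumed values) and the variable change P.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[0.0, 1.0]])
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])     # here P equals its own inverse
p = 1                          # number of outputs

# Partition the transformed coefficients as in (22).
Fz = np.linalg.inv(P) @ A @ P
Gz = np.linalg.inv(P) @ B
F11, F12 = Fz[:p, :p], Fz[:p, p:]
F21, F22 = Fz[p:, :p], Fz[p:, p:]
G1, G2 = Gz[:p], Gz[p:]

# Observer gain and the coefficient matrices of (36).
H = np.array([[0.0]])
F_obs = F22 - H @ F12
G_a = G2 - H @ G1
G_b = F21 + F22 @ H - H @ F12 @ H - H @ F11


def run(x0, u_seq):
    """Simulate plant and reduced observer; return the norm of x - x_hat at each step."""
    x = x0.copy()
    z_c = np.zeros((A.shape[0] - p, 1))   # initial condition z_c(0) = 0
    errors = []
    for u in u_seq:
        y = C @ x
        x_hat = P @ np.vstack([y, z_c + H @ y])
        errors.append(float(np.linalg.norm(x - x_hat)))
        z_c = F_obs @ z_c + G_a * u + G_b @ y
        x = A @ x + B * u
    return errors


errors = run(np.array([[5.0], [-2.0]]), [1.0, -1.0, 2.0, 0.5])
print(errors)
```

Only the first entry of `errors` is nonzero, reflecting the unknown initial value of $x_1$; from $k = 1$ onward the estimate matches the state.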


A Servomechanism Problem
As another illustration of state observation and estimated-state feedback, consider a plant affected by a disturbance and pose multiple objectives for the closed-loop state equation. Specifically consider a time-invariant plant of the nonstandard form

$$
\begin{aligned}
x(k+1) &= Ax(k) + Bu(k) + Ew(k), \quad x(0) = x_0 \\
y(k) &= Cx(k) + Fw(k)
\end{aligned}
\tag{43}
$$

We assume that $w(k)$ is a $q \times 1$ disturbance signal that is unavailable for use in feedback. For simplicity suppose $p = m$. Using output feedback, the first objective for the closed-loop state equation is that the output signal should track constant reference-input signals with asymptotically-zero error in the face of unknown constant disturbance signals. Second, the coefficients of the characteristic polynomial should be arbitrarily assignable. This type of problem often is called a servomechanism problem.

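The basic idea in addressing this problem is to use an observer to generate asymptotic estimates of both the plant state and the constant disturbance. As in earlier observer constructions, it may not be apparent at the outset how to do this. But writing the plant (43) together with the constant disturbance $w(k)$ in the form of an 'augmented' plant provides the key. Namely we describe the constant disturbance as the linear state equation $w(k+1) = w(k)$ (with unknown $w(0)$) to write

$$
\begin{aligned}
\begin{bmatrix} x(k+1) \\ w(k+1) \end{bmatrix} &= \begin{bmatrix} A & E \\ 0 & I \end{bmatrix} \begin{bmatrix} x(k) \\ w(k) \end{bmatrix} + \begin{bmatrix} B \\ 0 \end{bmatrix} u(k) \\
y(k) &= \begin{bmatrix} C & F \end{bmatrix} \begin{bmatrix} x(k) \\ w(k) \end{bmatrix}
\end{aligned}
\tag{44}
$$

and then adapt the observer structure suggested in (28) to this $(n+q)$-dimensional linear state equation. With the observer gain partitioned appropriately, this leads to the observer state equation

$$
\begin{aligned}
\begin{bmatrix} \hat{x}(k+1) \\ \hat{w}(k+1) \end{bmatrix} &= \begin{bmatrix} A & E \\ 0 & I \end{bmatrix} \begin{bmatrix} \hat{x}(k) \\ \hat{w}(k) \end{bmatrix} + \begin{bmatrix} B \\ 0 \end{bmatrix} u(k) + \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} [\,y(k) - \hat{y}(k)\,] \\
\hat{y}(k) &= \begin{bmatrix} C & F \end{bmatrix} \begin{bmatrix} \hat{x}(k) \\ \hat{w}(k) \end{bmatrix}
\end{aligned}
\tag{45}
$$

Since

$$
\begin{bmatrix} A & E \\ 0 & I \end{bmatrix} - \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \begin{bmatrix} C & F \end{bmatrix} = \begin{bmatrix} A - H_1C & E - H_1F \\ -H_2C & I - H_2F \end{bmatrix}
$$

the augmented-state-estimate error equation, in the obvious notation, satisfies

$$
\begin{bmatrix} e_x(k+1) \\ e_w(k+1) \end{bmatrix} = \begin{bmatrix} A - H_1C & E - H_1F \\ -H_2C & I - H_2F \end{bmatrix} \begin{bmatrix} e_x(k) \\ e_w(k) \end{bmatrix}
\tag{46}
$$

However instead of separately considering this error equation and feedback of the augmented-state estimate to the input of the augmented plant (44), we directly analyze the closed-loop state equation. With linear feedback of the form

$$
u(k) = K_1\hat{x}(k) + K_2\hat{w}(k) + Nr(k)
\tag{47}
$$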

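the closed-loop state equation can be written as

$$
\begin{aligned}
\begin{bmatrix} x(k+1) \\ \hat{x}(k+1) \\ \hat{w}(k+1) \end{bmatrix} &= \begin{bmatrix} A & BK_1 & BK_2 \\ H_1C & A + BK_1 - H_1C & E + BK_2 - H_1F \\ H_2C & -H_2C & I - H_2F \end{bmatrix} \begin{bmatrix} x(k) \\ \hat{x}(k) \\ \hat{w}(k) \end{bmatrix} + \begin{bmatrix} BN \\ BN \\ 0 \end{bmatrix} r(k) + \begin{bmatrix} E \\ H_1F \\ H_2F \end{bmatrix} w(k) \\
y(k) &= \begin{bmatrix} C & 0 & 0 \end{bmatrix} \begin{bmatrix} x(k) \\ \hat{x}(k) \\ \hat{w}(k) \end{bmatrix} + Fw(k)
\end{aligned}
\tag{48}
$$

It is convenient to use the $x$-estimate error variable and change the sign of the disturbance estimate to simplify the analysis of this complicated linear state equation. With the state variable change

$$
\begin{bmatrix} x(k) \\ e_x(k) \\ -\hat{w}(k) \end{bmatrix} = \begin{bmatrix} I_n & 0_n & 0_{n \times q} \\ I_n & -I_n & 0_{n \times q} \\ 0_{q \times n} & 0_{q \times n} & -I_q \end{bmatrix} \begin{bmatrix} x(k) \\ \hat{x}(k) \\ \hat{w}(k) \end{bmatrix}
$$

the closed-loop state equation becomes

$$
\begin{aligned}
\begin{bmatrix} x(k+1) \\ e_x(k+1) \\ -\hat{w}(k+1) \end{bmatrix} &= \begin{bmatrix} A + BK_1 & -BK_1 & -BK_2 \\ 0 & A - H_1C & E - H_1F \\ 0 & -H_2C & I - H_2F \end{bmatrix} \begin{bmatrix} x(k) \\ e_x(k) \\ -\hat{w}(k) \end{bmatrix} + \begin{bmatrix} BN \\ 0 \\ 0 \end{bmatrix} r(k) + \begin{bmatrix} E \\ E - H_1F \\ -H_2F \end{bmatrix} w(k) \\
y(k) &= \begin{bmatrix} C & 0 & 0 \end{bmatrix} \begin{bmatrix} x(k) \\ e_x(k) \\ -\hat{w}(k) \end{bmatrix} + Fw(k)
\end{aligned}
\tag{49}
$$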

The characteristic polynomial of (49) is identical to the characteristic polynomial of (48). Because of the block-triangular structure of (49), it is clear that the closed-loop characteristic polynomial coefficients depend only on the choice of gains $K_1$, $H_1$, and $H_2$. Furthermore from (46) it is clear that a separation of the augmented-state-estimate error eigenvalues and the eigenvalues of $A + BK_1$ has been achieved.
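This separation can be illustrated numerically. The sketch below uses a scalar plant and gains that are purely illustrative assumptions ($n = m = p = q = 1$), builds the closed-loop matrix of (48) and the error matrix of (46), and compares characteristic polynomials.

```python
import numpy as np

# Illustrative scalar plant of the form (43) (assumed values).
A = np.array([[2.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
E = np.array([[1.0]])
F = np.array([[0.0]])

# Hypothetical gains: A + B K1 = 0.5, error eigenvalues at 0.2 and 0.4.
K1 = np.array([[-1.5]])
K2 = np.array([[-1.0]])   # K2 does not affect the eigenvalues
H1 = np.array([[2.4]])
H2 = np.array([[0.48]])
I_q = np.eye(1)

# Closed-loop matrix of (48), with state [x; x_hat; w_hat].
A_cl = np.block([
    [A,      B @ K1,              B @ K2],
    [H1 @ C, A + B @ K1 - H1 @ C, E + B @ K2 - H1 @ F],
    [H2 @ C, -H2 @ C,             I_q - H2 @ F],
])

# Augmented-state-estimate error matrix of (46).
A_err = np.block([
    [A - H1 @ C, E - H1 @ F],
    [-H2 @ C,    I_q - H2 @ F],
])

poly_closed_loop = np.poly(A_cl)
poly_product = np.convolve(np.poly(A + B @ K1), np.poly(A_err))
print(poly_closed_loop, poly_product)
```

Changing `K2` leaves both polynomials unchanged, which is why $K_2$ remains free for the tracking and disturbance rejection objective.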
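Temporarily assuming that (49) is exponentially stable, we can address the choice of gains $N$ and $K_2$ to achieve the input-output objectives of asymptotic tracking and disturbance rejection. A careful partitioned multiplication verifies that

$$
\begin{bmatrix} zI - A - BK_1 & BK_1 & BK_2 \\ 0 & zI - A + H_1C & -E + H_1F \\ 0 & H_2C & zI - I + H_2F \end{bmatrix}^{-1}
= \begin{bmatrix} (zI - A - BK_1)^{-1} & -(zI - A - BK_1)^{-1} \begin{bmatrix} BK_1 & BK_2 \end{bmatrix} \begin{bmatrix} zI - A + H_1C & -E + H_1F \\ H_2C & zI - I + H_2F \end{bmatrix}^{-1} \\ 0 & \begin{bmatrix} zI - A + H_1C & -E + H_1F \\ H_2C & zI - I + H_2F \end{bmatrix}^{-1} \end{bmatrix}
$$

and another gives

$$
\begin{aligned}
Y(z) &= C(zI - A - BK_1)^{-1}BN\,R(z) + C(zI - A - BK_1)^{-1}E\,W(z) \\
 &\quad - \begin{bmatrix} C(zI - A - BK_1)^{-1}BK_1 & C(zI - A - BK_1)^{-1}BK_2 \end{bmatrix} \begin{bmatrix} zI - A + H_1C & -E + H_1F \\ H_2C & zI - I + H_2F \end{bmatrix}^{-1} \begin{bmatrix} E - H_1F \\ -H_2F \end{bmatrix} W(z) + FW(z)
\end{aligned}
\tag{50}
$$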

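Constant reference and disturbance inputs are described by

$$
R(z) = \frac{z}{z-1}\,r_0, \quad W(z) = \frac{z}{z-1}\,w_0
$$

where $r_0$ and $w_0$ are $m \times 1$ and $q \times 1$ vectors, respectively. The only terms in (50) that contribute to the asymptotic value of the response are those partial-fraction-expansion terms for $Y(z)$ corresponding to denominator roots at $z = 1$. Computing the coefficients of such terms using the partitioned-matrix fact

$$
\begin{bmatrix} I - A + H_1C & -E + H_1F \\ H_2C & H_2F \end{bmatrix}^{-1} \begin{bmatrix} E - H_1F \\ -H_2F \end{bmatrix} = \begin{bmatrix} 0 \\ -I_q \end{bmatrix}
$$

gives

$$
\lim_{k \to \infty} y(k) = C(I - A - BK_1)^{-1}BN\,r_0 + \bigl[\,C(I - A - BK_1)^{-1}E + C(I - A - BK_1)^{-1}BK_2 + F\,\bigr]\,w_0
\tag{51}
$$

Alternatively the final-value theorem for $z$-transforms can be used to obtain this result.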
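The partitioned-matrix fact and its effect on the disturbance term can be verified by a direct multiplication. In the sketch below $M$ is a shorthand introduced here for $(I - A - BK_1)^{-1}$; it is not notation from the text.

```latex
\begin{bmatrix} I - A + H_1C & -E + H_1F \\ H_2C & H_2F \end{bmatrix}
\begin{bmatrix} 0 \\ -I_q \end{bmatrix}
= \begin{bmatrix} E - H_1F \\ -H_2F \end{bmatrix}
\quad\Longrightarrow\quad
-\begin{bmatrix} CMBK_1 & CMBK_2 \end{bmatrix}
\begin{bmatrix} 0 \\ -I_q \end{bmatrix}
= CMBK_2,
\qquad M = (I - A - BK_1)^{-1}
```

Evaluating the error block of the inverse at $z = 1$ therefore removes $H_1$, $H_2$, and $K_1$ from the third disturbance term and leaves only the contribution $CMBK_2\,w_0$ in the limit.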

We are now prepared to establish the eigenvalue assignment property using (48), and the tracking and disturbance rejection property using (51).

29.9 Theorem  Suppose the plant (43) is reachable for $E = 0$, the augmented plant (44) is observable, and the $(n+m) \times (n+m)$ matrix

$$
\begin{bmatrix} A - I & B \\ C & 0 \end{bmatrix}
\tag{52}
$$

is invertible. Then linear dynamic output feedback of the form (47), (45) has the following properties. The gains $K_1$, $H_1$, and $H_2$ can be chosen such that the closed-loop state equation (48) is exponentially stable with any desired characteristic polynomial coefficients. Furthermore the gains

$$
\begin{aligned}
N &= [\,C(I - A - BK_1)^{-1}B\,]^{-1} \\
K_2 &= -NC(I - A - BK_1)^{-1}E - NF
\end{aligned}
\tag{53}
$$

are such that for any constant reference input $r(k) = r_0$ and constant disturbance $w(k) = w_0$ the response of the closed-loop state equation satisfies

$$
\lim_{k \to \infty} y(k) = r_0
\tag{54}
$$

Proof  By the observability assumption in conjunction with (46), and the reachability assumption in conjunction with $A + BK_1$, we know from previous results that $K_1$, $H_1$, and $H_2$ can be chosen to achieve any specified degree-$(2n+q)$ characteristic polynomial for (49), and thus for (48). Then Exercise 28.7 can be applied to conclude, under the invertibility condition on (52), that $C(I - A - BK_1)^{-1}B$ is invertible. Therefore the gains $N$ and $K_2$ in (53) are well defined, and substituting (53) into (51) gives (54).
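The tracking and disturbance rejection claim can be exercised on a small example. The following is a minimal sketch: the scalar plant, the gains `K1`, `H1`, `H2`, and the helper names `servo_gains` and `simulate` are all assumptions made for illustration, while `N` and `K2` are computed from the gain formulas (53).

```python
import numpy as np


def servo_gains(A, B, C, E, F, K1):
    """Gains N and K2 from (53); assumes C (I - A - B K1)^{-1} B is invertible."""
    n = A.shape[0]
    M = np.linalg.inv(np.eye(n) - A - B @ K1)
    N = np.linalg.inv(C @ M @ B)
    K2 = -N @ C @ M @ E - N @ F
    return N, K2


def simulate(A, B, C, E, F, K1, K2, N, H1, H2, r0, w0, steps):
    """Run plant (43) with observer (45) and feedback (47); return the final output."""
    n, q = A.shape[0], E.shape[1]
    x = np.ones((n, 1))          # arbitrary plant initial state
    x_hat = np.zeros((n, 1))     # observer starts with no information
    w_hat = np.zeros((q, 1))
    for _ in range(steps):
        y = C @ x + F @ w0
        u = K1 @ x_hat + K2 @ w_hat + N @ r0
        innovation = y - (C @ x_hat + F @ w_hat)
        # Update all three states simultaneously from the values at time k.
        x, x_hat, w_hat = (A @ x + B @ u + E @ w0,
                           A @ x_hat + E @ w_hat + B @ u + H1 @ innovation,
                           w_hat + H2 @ innovation)
    return C @ x + F @ w0


# Illustrative scalar plant and stabilizing gains (assumed values).
A = np.array([[2.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
E = np.array([[1.0]])
F = np.array([[0.0]])
K1 = np.array([[-1.5]])                       # A + B K1 = 0.5
H1, H2 = np.array([[2.4]]), np.array([[0.48]])  # error eigenvalues 0.2 and 0.4

N, K2 = servo_gains(A, B, C, E, F, K1)
r0 = np.array([[3.0]])
w0 = np.array([[-4.0]])
y_final = simulate(A, B, C, E, F, K1, K2, N, H1, H2, r0, w0, steps=200)
print(N, K2, y_final)
```

For this plant the formulas give $N = 0.5$ and $K_2 = -1$, so the feedback cancels the estimated disturbance, and the output settles at $r_0$ regardless of the value chosen for `w0`.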

EXERCISES
Exercise 29.1  For the time-varying linear state equation (1), suppose the $(n-p) \times n$ matrix sequence $P_b(k)$ and the uniformly exponentially stable, $(n-p)$-dimensional state equation

$$
z(k+1) = F(k)z(k) + G_a(k)u(k) + G_b(k)y(k)
$$

satisfy the following additional conditions for all $k$:

$$
\begin{aligned}
&\operatorname{rank} \begin{bmatrix} C(k) \\ P_b(k) \end{bmatrix} = n \\
&F(k)P_b(k) + G_b(k)C(k) = P_b(k+1)A(k) \\
&G_a(k) = P_b(k+1)B(k)
\end{aligned}
$$

Show that the $(n-p) \times 1$ error vector $e_b(k) = z(k) - P_b(k)x(k)$ satisfies

$$
e_b(k+1) = F(k)e_b(k)
$$

Writing

$$
\begin{bmatrix} C(k) \\ P_b(k) \end{bmatrix}^{-1} = \begin{bmatrix} H(k) & J(k) \end{bmatrix}
$$

where $H(k)$ is $n \times p$, show that under an appropriate additional hypothesis

$$
\hat{x}(k) = H(k)y(k) + J(k)z(k)
$$

provides an asymptotic estimate for $x(k)$.

Exercise 29.2  Apply Exercise 29.1 to a linear state equation of the form (22), selecting (slight abuse of notation)

$$
P_b(k) = \begin{bmatrix} -H(k) & I_{n-p} \end{bmatrix}
$$

Compare the resulting reduced-dimension observer with (23).

Exercise 29.3  In place of (3) consider adopting an observer of the form

$$
\hat{x}(k+1) = F(k)\hat{x}(k) + G(k)u(k) + H(k)y(k+1)
$$

where the estimated state is computed in terms of the current output value, rather than the previous output value. Show how to define $F(k)$ and $G(k)$ to obtain an unforced linear state equation for the estimate error. Can Theorem 29.2 be used to obtain a uniformly exponentially stabilizing gain $H(k)$ for the estimate error of this new form of observer?

Exercise 29.4 For the plant

x(k+l)
+ {

v(k) = [0

] u(k)

1 [x(k)

compute a dimension-2 observer that produces a zero-error estimate for $k \ge 2$. Then compute a reduced-dimension observer that produces a zero-error estimate for $k \ge 1$.

Exercise 29.5  Suppose the time-invariant linear state equation

$$
\begin{aligned}
z(k+1) &= Az(k) + Bu(k) \\
y(k) &= \begin{bmatrix} I_p & 0_{p \times (n-p)} \end{bmatrix} z(k)
\end{aligned}
$$

is reachable and observable. Consider dynamic output feedback of the form

$$
u(k) = K\hat{z}(k) + Nr(k)
$$

where $\hat{z}(k)$ is an asymptotic state estimate generated via the reduced-dimension observer specified by (36). Characterize the eigenvalues of the closed-loop state equation. What is the closed-loop transfer function? Apply this result to Example 29.8, and compare to Example 29.6.

Exercise 29.6  Consider a time-invariant plant described by

$$
\begin{aligned}
x(k+1) &= Ax(k) + Bu(k) \\
y(k) &= C_1x(k) + D_1u(k)
\end{aligned}
$$

Suppose the vector $r(k)$ is a reference input signal, and

$$
v(k) = C_2x(k) + D_{21}r(k) + D_{22}u(k)
$$

is a vector signal available for feedback. For the time-invariant, $n_c$-dimensional dynamic feedback

$$
\begin{aligned}
z(k+1) &= Fz(k) + Gv(k) \\
u(k) &= Hz(k) + Jv(k)
\end{aligned}
$$

compute, under appropriate assumptions, the coefficient matrices $\hat{A}$, $\hat{B}$, $\hat{C}$, and $\hat{D}$ for the $(n + n_c)$-dimensional closed-loop state equation.

Exercise 29.7 Continuing Exercise 29.6. suppose


D has full column
rank. D2, has full row rank, and the dynamic feedback state equation is reachable and observable.
Define matrices B,, and C',, by setting B = B,D and
=
For the closed-loop state
equation, use the reachability and observability criteria in Chapter 13 to show:
(a) If tile complex number
is such that rank {
A B I <n + a,, then X,, is an eigenvalue of
A. (h) If the complex number X,, is such that
C
rank

then

xoIA

<n +n

X, is an eigenvalue of A B0C

NOTES
Note 29.1  Reduced-dimension observer theory for time-varying, discrete-time linear state equations is discussed in the early papers

E. Tse, M. Athans, "Optimal minimal-order observer-estimators for discrete linear time-varying systems," IEEE Transactions on Automatic Control, Vol. 15, No. 4, pp. 416-426, 1970

T. Yoshikawa, H. Kobayashi, "Comments on 'Optimal minimal-order observer-estimators for discrete linear time-varying systems'," IEEE Transactions on Automatic Control, Vol. 17, No. 2, pp. 272-273, 1972

C.T. Leondes, L.M. Novak, "Reduced-order observers for linear discrete-time systems," IEEE Transactions on Automatic Control, Vol. 19, No. 1, pp. 42-46, 1974

The discrete-time case also is covered in the book

J. O'Reilly, Observers for Linear Systems, Academic Press, London, 1983

Note 29.2  Using the notion of reconstructibility presented in Exercise 25.12, the uniformity hypothesis involving the $l$-step Gramian in Theorem 29.2 can be written more simply as a uniform reconstructibility condition. This observation and Note 28.1 lead to similar recastings of the hypotheses of Theorem 29.5.


Note 29.3  The use of an exogenous system assumption to describe a class of unknown disturbance signals is a powerful tool in control theory. Our treatment of the time-invariant servomechanism problem assumes an exogenous system that generates constant disturbances, but generalizations are not difficult once the basic idea is in hand. The discrete-time and continuous-time theories are quite similar, and references are cited in Note 15.7.

Author Index

Ackermann. J., 262


Aeyels. D., 156
Agarwal, R.P., 403. 436
Ailon. A.. 156
Aling. H.. 355
Amato, F.. 46 1

Anderson, B.D,O,, 130, 202. 217, 288. 327,


404. 449. 520. 544
Antoulas, A.C.. 202
Apostol, T.M., 73
Arbib, M.A., 181, 202, 288
Ascher. U.M.. 57
Astrom. K.J.. 405
Athans, M., 567

Bittanti. S.. 157.475. 507


Blair, W.B., 73
Blanchard, J., 436
Blomberg, H., 311
Boley. D., 181
Bolzern. P., 507
Bongiorno. LI.. 287. 288
Brockett, R.W.. 21. 56, 156. 261

Bruni.C.. 181. 202


Brunovsky, p., 157, 263
Bucy, R.S.. 22, 239
Burrus. C.S., 422
Byrnes. C.!.. 263

C
Callier, F.M.. 327, 403, 475

Bahill, A.T., 403


Baratchart. L., 156
Barnett. S.. 113.311
Barmish. B.R., 113
Basile, G., 354, 355, 381, 382
Bass,R.W., 261
Bauer, P., 461
Belevitch, V.. 238
Bellman, R., 56, 57. 113, 141
Bentsman. J.. 141
Berlinski. D.J., 39
Berman. A., 98
Bernstein, D.S.. 96

Bertram. i.E., 129,449


Bhattacharyya, S.P., 289, 380

Campbell, S.L.. 157


Celentano, G.,461
Champetier, C., 263
Chen. CT.. 156. 327
Cheng, V.H.L.. 261.544
Christov. N.N.. 21. 545
Chua, L.O.. 38
Colaneri, P., 157
Commault. C.. 380
Coppel, WA., 113. 141

D
D'Alessandro, P.. 180

Dai, L.. 39, 404


Damen, A.A.H.. 202
D'Angelo. H.. 97

Davison, E.J., 263, 289


DeCarlo, R.A.. 22, 262

Delchamps,D.F.,21,181,311
Desoer, C.A., 21, 38, 39, 56, 140, 217, 289,
327, 403, 460,475
Dickinson, B.W., 262

Doyle, J.C., 289


Duran, J., 461

Hara, S.. 289


Harris. C.J.. 113
Hariman, p.. 56
Hautus, M.L.J., 238, 355
Helmke, U., 201
Heymann. M.. 262
Hinrichsen, D., 141
Hippe, p., 327
Ho, Y.C., 476

Ho,B.L.,20l
Engwerda. J.L.. 475
Evans, D.S., 506

Hong, KS.. 461


Horn. R.A.. 21. 96. 141.422.460

Hou,M..289

Fadavi-Ardekani. J., 404


Faib, P.L., 22, 181, 202, 263, 288, 381
Fang. C.H., 326
Fanti, M.P., 475

Iglesias, PA., 545

Farison, J.B., 461

Isidori,A., 180, 181,202,381,382

Farkas, M., 97
Ferrer. J.J., 506
Fliess, M., 39, 404
Francis. B.A.. 38

Ikcda.M..261,288
llchmann. A.. 140. 239.311.356
lonescu, V., 545

J
Johnson.

C.R., 21,96, 141.422. 460

Freund,E.,263
Fulks,W.,22

Johnson. C.D.. 73. 288


Jury, E.I., 436

Furuta, K., 289

K
Kaashoek, M.A., 507

Gantmacher, F.R., 21
Garofalo, F., 461
Gilbert, E.G., 180,381
Godfrey, K., 98
Gohberg. 1., 507
Golub, G.H., 21
Grasse, K.A., 157
Grasselli, O.M.. 327, 507. 545
Grimm, J., 156
Grizzle, J.W., 381
Gronwall, T.H., 56
Grotch, H., 38
Guardabassi, G.. 157

H
Hagiwara, T., 476

Hahn. W., 129, 238


Hajdasinski, A.K., 202
Halanay, A., 545
Halliday, D., 38

Kaczorek, 1., 327


Kailath. 1., 21, 39, 73, 201. 239, 262. 310, 326
Kajiya, F.. 181
Kalman, D., 181
Kalman, R.E,, 129. 180. 181. 201, 202. 238,
261, 288. 449. 476
Kamen. E.W., 97, 201, 261, 327, 422, 436, 545
Kano,H.. 157
Kaplan. W.. 97, 113
Karnopp, B.H., 38
Kelley, W.G.. 403
Kenney. C., 239
KhaIil, H., 129
Khargonekar, p_p.. 130, 217,261,262.311,422,
436, 545
Kimura, H., 262
Kimura, M., 476
Kishore, A.P.. 506
Klein, G.. 262

Kleinman, DL., 261,544


Kiema, V.C.. 22. 380

Kobayashi. H.. 567
Kodama, S., 181, 261, 288. 507
Kolla, S.R., 461
Konstantinov, M.M.. 21. 545
Kowalczuk, Z., 405
Kriendler. E., 264, 475
Kucera, V.V.. 327. 545
Kuh, E.S., 38
Kuijper, M., 405

Lakshmikanthum, V., 403

Langholz,G..I56
Lau,G.Y., 140
Laub, A.J., 22, 239, 380
Lee. J.S., 436.
Leondes, C.T., 567
Lerer. L., 507
Lewis. F.L., 39
Linnemann, A.. 380
Ljung, L., 507
Longhi, S.. 327. 545

Luenberger, D.G., 98, 239, 287, 404,405


Lukes, D.L., 56, 73. 97, 141

Maeda, H., 181, 261, 288, 507


Magni, J.F.. 263
Maiione, B.. 475
Mansour, M., 461
Marino, R., 356
Markus, L., 140
Marro, G., 354, 355, 381, 382
Mattheij, R.M.M., 57
McKelvey, J.P., 38
Meadows, HE., 157
Meerkov,S.M.. 141
Meyer. R.A., 422
Michel, A.N., 56, 96
Miles, J.F.. 113
Miller, R.K.. 56. 96
Miminus, G.S.. 545
Mitra, S.K., 404
Moler, C.. 98
Moore, B.C., 201, 262, 380
Moore, J.B.. 130, 217, 288,449, 544
Mon. S., 289
Mon. T., 461
Morales, C.H., 73


Morse, A.S., 263, 354, 380. 381


Moylan. P.J., 217
Muiholland, R.J., 97
Muller. P.C., 289

N
Nagle. H.T., 405
Narendra, K.S., 476
Neumann. M., 98
Newmann, M.M., 287
Nichols, N.K., 157
Nijmeijer, H., 381, 382
Nishimura. T., 157
Novak, L.M., 567
Numberger, I., 311

0
Ohta, Y., 181

O'Reilly, J.. 287. 288, 567


Owens, D.H., 140
Ozguler, A.B., 422

P
Paige, CC., 545
Pascoal, A.M., 130
Payne, H.J.,2l7, 381
Pearson, J.B., 506
Peterson, A.C., 403
Petkov. P.H., 21. 545
Phillips, C.L., 405
Polak,E..311
Poljak, S., 475
Poolla, K.R., 311, 422,436, 545
Popov. V.M., 238. 239
Porter, W.A., 263
Pratzel-Wolters, D., 140
Pritchard. A.J., 141

R
Ramar, K., 239

Ramaswami, B.. 239

Ravi,R., 130,217
Reid, W.T., 57
Resnick, R., 38
Respondek, W., 356
Richards. J.A., 97
Rosenbrock, H.H., 262. 327
Rotea, M.A., 262


Ruberti, A., 180, 181, 202


Rugh, W.J., 264
Russel, R.D., 57

Weinert. H.L., 217


Weiss, L., 22, 180, 436, 475, 506
Wilde, R.W., 289
Willems, J.C., 39, 356, 380
Willenis, J.L.. 113

Sam, M.K, 327


Sandberg, I.W., 180
Sankaran, V.. 545
Sarachik, P.E., 264, 475
Schrnale, W., 311
Schrader, C.B., 327
Schulman, J.D.. 327
Schumacher, J.M., 355, 381

Shaked, U., 217


Shokoohi, S.. 201
Silverman, L.M.,22, 157, 201, 217, 239, 38!
Skoog, R.A., 140
Smith. H.W., 289
Solo. V., 141
Sontag, ED., 38. 156. 288, 476. 506
Soroka. E., 217
Srinath. M.D.. 545
Stein, G., 289
Stein. P.. 449
Stem. R.J.. 98

Wang, S.H.. 263


Wang, Y.T., 289

Wittenmark. B.. 405


Wolovich, W.A., 263.311,381

Wonham, W.M., 262. 354. 355, 380. 381. 382


Wu,J.W., 461
Wu,M.Y., 14!

V
Yamabe,H., 140
Yang, F.. 289
Yedavalli, R.K., 461

Ylinen,R.,311
Yoshikawa. T., 567
Youla, D.C.. 180, 217
Yuksel, Y.O., 287, 288

Strang, G., 21

Zadeh, L.A., 2!, 38, 56,97

Szidarovszky. F., 403

Zhu, J.J.. 73

T
Tannenbaurn. A.. 26

Terrell,W.J., 157
Thomasian, A.J., 217
Trigiante, D., 403
Tse, E. 567
Turchiano, B., 475

U
Unger,A., 181

V
Van den Hof, P.M.J., 202
Van der Schaft, A.J., 356, 382

VanderVeen. A.J.,507
Van Dooren, P.M.. 201
Van Loan, C.F.. 21,98

Vardulakis, A.I.G., 311


Verriest, E.I., 201, 405
Vidyasagar. M., 39. 96. 129.311,382

Subject Index

C
(A,B) invariant, 354 see Controlled invariant
Absolute convergence. 13, 43, 46,59
Adapted basis, 335

Adjoint state equation. 62. 69. 73


discrete-time, 396, 402
Adjugate, 3, 77, 94, 291, 319, 408
Almost invariant. 356
Analytic function, 13. 14, 22, 59, 76, 77, 156
Augmented plant. 280. 562

Canonical form, 239


Canonical structure theorem, 180, 238, 339, 355
discrete-time, 507
Cauchy-Schwarz inequality. 2
Causal, 49, 159, 160, 180
discrete-time. 393, 477, 506
Cayley-Hamilton theorem. 4.76, 192. 196, 197.
331, 338,419. 466
Change of state variables, 66. 70. 72. 75. 78.
107, 162, 173, 179, 200, 219, 222, 231.
233. 236, 237, 248, 272, 330, 335

discrete-time, 397,402411.434.478.483.
Balanced realization, 201
Basis. 2, 328
adapted. 335
Behavior matrix, 184, 189, 201
discrete-time, 488, 495
Behavioral approach. 39, 404
Bezout identity, 297, 301, 338
Bilinear state equation, 37, 93
Binomial expansion. 18, 76
Biproper. 310
Block Hankel matrix, see Hankel matrix
Blocking zeros, 326
Bounded-input, bounded-output stability, 216
discrete-time. 519
uniform, see Uniform...
Brunovsky form, 263
Bucket system, 87, 109. 150, 175, 213

487. 493, 504, 532, 552, 554


Characteristic
exponents. 97
multipliers, 97
polynomial. 4.76. 113, 247. 275. 349. 367.
408, 532, 556, 563
Closed-loop state equation. 240, 247, 249, 270, 275,
280, 324, 342, 345. 349. 358, 362, 368
discrete-time, 521, 532, 534, 551, 557, 563
Closed-loop transfer function. 243, 258, 276.
283. 324. 358, 368
discrete-time, 524, 557, 564
Cofactor. 3. 65
Cohort population model, 432, 470, 502, 541
Column degree, 303, 309
coefficient matrix, 304
Column Hermite form, 300

Column reduced, 304, 305, 307, 309
Common left divisor, 299
Common right divisor, 292
Commutative, 3, 59, 73, 75, 93, 96
Compartmental model, 98, 181
Compatible subspaces, 369
Complete solution, 48, 68, 80
discrete-time, 393, 407
Complex conjugate, 3
conjugate-transpose, 2. 7, 9

1,1,410
Component state equation, 336
Composition property, 63, 75, 103, 161, 209
discrete-time, 396, 427,486,515
Computational issues, 21, 22, 57, 98, 239, 376.
380
discrete-time, 403, 545
Conditioned invariant subspace, 354, 355
Controllability. 142, 156, 162, 172, 219, 226,
248, 334, 463
index, 237
indices, 223, 226, 236, 259, 316
instantaneous, 183, 199, 239
matrix, 146, 172, 190, 195, 212, 219, 222, 332
output, 154, 249, 264. 353, 368
path, 156
PBH tests, 238
periodic. 157
rank condition, 145, 146, 156, 183, 221, 238
uniformity condition, 208, 245, 270
Controllability, discrete-time, 463. 474, 475
1-step. 544

output, 474

uniformity condition, 544


Controllability Gramian, 144, 146. 153, 163,
207, 245, 258, 268, 285, 332
discrete-time, 474, 544
Controllability subspace, 345
compatible, 369
maximal, 363. 367, 369, 378
Controllable state, 156, 331, 333
Controllable subspace, 331, 342
Controllable subsystem, 341
Controlled invariant subspace, 341
compatible, 369
maximal, 358, 376, 378, 380
Controller form, 171, 222, 239, 247, 259,
263. 316,483, 532, 545
Convergence, II, 22, 44

absolute, 13,44,46,59
uniform, 12, 13, 22,4244,46, 55, 59, 156
Convolution, IS, 80. 81
discrete-time, 17,408,411
Coprime
polynomial fraction description, 298. 301,
302, 309, 313, 317
polynomial matrices, 292, 297, 299, 300
polynomials, 338

D
dc-gain, 36, 284
discrete-time, 543, 565
Deadbeat, 430, 542, 545, 556
Decoupling, see Noninteracting control
Default assumptions, 23
discrete-time, 383. 392, 394
Degree

McMilIan, 313, 315, 317


polynomial, 77, 303
polynomial fraction description. 291
Delay, 16, 390.420,461
Descriptor state equation, 39, 157
discrete-time, 404
Detectability. 130, 217, 286, 352, 520, 544
Determinant, 36.9, 15, 65, 242, 290, 301, 304,
318, 408. 523
Difference equation, 403
matrix, 395, 396,401,408
386
,,
Difference, first, 437
Differential equation
matrix, 61, 62, 67. 69, 70. 72, 73, 153
27, 34. 35, 69, 138
Direct sum, 329, 338, 370, 376
Direct transmission. 38
discrete-time, 404
Disturbance rejection, 280, 289, 381
discrete-time, 562, 568
Disturbance decoupling, 357. 362, 379, 380

E
Economic model, 384, 397,432,470, 542
Eigenvalue, 4,8, 10, 1820
pointwise. 10,71, 131, 135, 140,450,452,456
Eigenvalue assignment, 247, 258, 259, 262, 270,
275, 278, 280, 324, 349, 355, 362
discrete-time, 532, 545, 556

Eigenvalue separation property, 276, 284
discrete-time, 557
Elgenvector, 4, 105, 221, 232. 429,473
assignment. 262
Electrical circuit model, 25. 38. 92. 177, 398
Elementary column operations, 300, 305
Elementary row operations. 294, 296
Elementary polynomial fraction, 291, 301
Empty product. 392. 395, 452
Equilibrium state, 35, 389
Euclidean norm, see Norm
Existence of solutions, 41,46, 47, 56, 62, 68, 77
discrete-time. 391. 403
Existence of periodic solutions, 84, 85, 87, 9497
discrete-time. 414, 416, 418. 421
Exogenous system, 289, 568
Exponential of a matrix. 59. 7479. 81, 98.
179, 330

bounds on, 59, 72, 104,128,138,140


integral of,7l,93, 104
Exponential stability. 104, 124, 211, 238
discrete-time. 238, 429, 445, 517
uniform, see Uniform exponential stability


G
Golden ratio. 419

Gramian, controllability, 144, 146, 153, 163,


207. 245. 258, 268. 285. 332
discrete-time, 474, 544
Gramian. 1-step observability, 469.485. 515.
548, 552
Gramian. 1-step reachability. 469, 484, 513, 515.
527, 552
Gramian. observability, 149. 163. 167. 210, 267.
285. 337
discrete-time. 468. 515, 548
Gramian, output reachability, 155
discrete-time, 473
Gramian, reachability, 155
discrete-time, 465, 473, 513, 515, 527, 529
Gramian. reconstructibility, 288
discrete-time, 474, 567
Greatest common divisor.
left, 299, 300
right 292, 293
Gronwall inequality, 56
Gronwall-Bellman inequality, 45. 54, 56, 134.
139

discrete-time. 452. 454, 455, 459


Feedback, dynamic, 241, 262, 269, 275, 281,
284, 285, 287, 327
discrete-time, 522, 551, 556, 563. 566,567
Feedback, output, 240. 243. 260, 262, 269. 275.
284, 285, 288, 327
discrete-time. 521, 524, 550, 556, 563, 566
Feedback stabilization, see Stabilization
Feedback, state, 36, 236, 237. 240. 242, 244, 247,
249, 258263, 323, 341, 345, 355,
358, 362, 367
discrete-time. 521. 523, 525, 532, 534,

H
Hankel matrix, 194,201,202,499,502
Harmonic oscillator. 78, 96, 117
Hermite form
column, 300
row, 295
Hermitian matrix, 9
Hermitian transpose, 2, 7, 9
Hill equation. 97

543545

Feedback, static, 241, 262


discrete-time, 522
Fibonacci sequence. 200.419, 505
Final value theorem, l5, 283
discrete-time, 17, 564
First-order hold, 476
Floquet decomposition, 81. 95, 97, 108
discrete-time, 413
Frequency response, 95
Friend, 343
Functional reproducibility, 156,475
Fundamental matrix, 56, 69

1,1,410
Identification, 507
Identity dc-gain, 36, 284
discrete-time, 543, 565
Identity matrix, 3
Image, 5, 329
Impulse response, 49, 80. 159, 181, 182. 194.
197, 202, 249, 253
Inclusion, 329, 352
Induced norm, 68, 1921, 101, 106.426,432
Initial value theorem, 15, 194


discrete-time, 17.499
Input-output behavior. 48. 80. 81. 158. 169. ISO
203, 237, 249. 276. 280. 331
discrete-time. 393, 407. 477, 481, 493, 508.
534, 557, 562
Input signal, 23. 48. 49. 80. 85. 143. 156. 321.
322

discrete-time. 383. 393. 408. 416. 464. 508


Instability, 51. 110, 122, 337. 375
discrete-time,418, 432, 443, 539
Instantaneously controllable. 183, 199, 239
Instantaneously observable. 183. 199. 239
Integrating factor, 61, 68
Integrator coefficient matrices. 225. 226, 248.
263, 314.315317, 323
Integrator polynomial matrices. 314318. 323
Inicresi rate, 419
Intersection. 329. 336. 352. 353
Invariant factors. 262
Invariant subspace. 330. 352
Inverse
image. 329, 336. 352. 353
Laplace transform, 14. 18. 77, 171
matrix. 3.4. 10. 15. 17. 1920, 242. 259. 291
301. 523, 527, 543, 564
system. 216. 217
:-transform. 16. 18. 409
Iteration. 387. 391. 403

J
Jacobian, 29, 388, 389
Jordan form, 78, 84, 85, 96, 235, 410, 476
  real, 78, 96
Jury criterion, 436

K
Kalman filter, 288
Kernel, 5, 329
K-periodic, see Periodic
Kronecker product, 135, 141, 456, 460

L
Laplace expansion, 3, 65, 66, 304
Laplace transform, 14, 18, 77, 81, 87, 97, 169, 171, 194, 241, 283, 290, 319, 322, 355
  table, 18
Leading principal minors, 9
Left coprime, 299, 301
Left divisor, 299
Leibniz rule, 11, 47, 60
Liapunov, see Lyapunov
Lifting, 422
Limit, 11, 15, 17, 41, 43, 98, 105, 106, 128, 212, 265, 283, 305, 430, 431, 434, 499, 546, 564, 565
Linear independence, 2, 4, 144, 156
Linearization, 28, 39
  discrete-time, 387
Linear input-output, 49, 80, 81, 158, 169, 180
  discrete-time, 393, 408, 478, 506
Linear state equation, 23, 39, 49, 160, 330
  causal, 49, 159, 160, 180
  periodic, 81, 84, 85, 95-97, 157, 164
  time invariant, 23, 50, 80
  time varying, 23, 49
Linear state equation, discrete-time, 383, 393, 404, 406, 420, 479
  causal, 393, 477, 506
  periodic, 416, 421, 422, 475, 507, 545
  time invariant, 384, 402, 406
  time varying, 384
Logarithm of a matrix, 81, 95, 96, 405
Logistics equation, 389
l-step controllability, 544
l-step observability, 469, 485, 487, 515, 548, 552, 567
l-step reachability, 469, 484, 513, 527, 532, 545, 549, 552
Lyapunov equation, 124, 127, 135, 139, 153, 154, 246
  discrete-time, 445, 448, 449, 456, 473, 529
Lyapunov function, 115, 129
  discrete-time, 438, 440, 448
Lyapunov transformation, 107, 111, 113
  discrete-time, 434, 528

M
Magnitude, 3
Markov parameters, 194
  discrete-time, 481, 498
Matrix, 1
  adjugate, 3, 77, 94, 291, 319, 408
  behavior, 184, 189, 201, 488, 495
  calculus, 10, 43, 60
  characteristic polynomial, 4, 76, 113, 247, 275, 367, 408, 532, 556, 563
  cofactor, 3, 65
  computation, 21, 22, 57, 98, 239, 376, 380, 403, 545
  determinant, 3-6, 9, 15, 65, 242, 290, 301, 304, 318, 408, 523
  diagonal, 4, 47, 141, 147, 339
  difference equation, 395, 396, 401, 408
  differential equation, 61, 62, 67, 69, 70, 72, 73, 153
  eigenvalue, 4, 8, 10, 18-20
  eigenvector, 4, 105, 221, 232, 429, 473
  exponential, 59, 74-79, 81, 98, 179, 330
  function, 10
  fundamental, 56, 69
  Hankel, 194, 201, 202, 499, 502
  Hermitian, 9
  Hermitian transpose, 2, 7, 9
  identity, 3
  image, 5, 22, 329
  induced norm, 6-8, 19-21, 101, 106, 426, 432
  inverse, 3, 4, 10, 15, 17, 242, 259, 291, 301, 523, 527, 543, 564
  inversion lemma, 543
  Jacobian, 29, 388, 389
  Jordan form, 78, 84, 85, 96, 235, 410, 476
  kernel, 5, 329
  Kronecker product, 135, 141, 456, 460
  leading principal minors, 9
  logarithm, 81, 95, 96, 405
  measure, 141
  negative (semi)definite, 8, 9, 114, 437
  nilpotent, 3, 18, 79, 411, 431, 556
  null space, 5, 329
  page, 202
  parameterized, 10
  partition, 6, 19, 70, 153, 170, 185, 282, 301, 435, 552, 564
  polynomial, 15, 290
  positive (semi)definite, 8, 9, 115, 438
  principal minors, 8, 9
  range space, 5, 22, 329
  rank, 5, 6, 22, 320
  rational, 14-17, 77, 242, 290, 301, 408, 523
  root of, 413, 420, 422
  similarity, 4, 75, 219, 231, 248, 330, 366, 410
  singular values, 22, 380
  spectral norm, 6-8, 19-21
  spectral radius, 19
  submatrix, 185, 489
  symmetric, 8, 18-21
  trace, 3, 4, 8, 64, 69, 75, 95, 138
  transpose, 2, 3, 5, 7, 8, 527

Maximal
  controllability subspace, 363, 367, 369, 378
  controlled invariant subspace, 358, 376, 378, 380
McMillan degree, 313, 315, 317
Minimal realization, 160, 162, 183, 185, 190, 195, 312
  discrete-time, 479, 483, 493, 498
Modulus, see Magnitude
Monic polynomial, 169, 295, 481

N
Natural logarithm, see Logarithm
Negative (semi)definite, 8, 9, 114, 437
Nilpotent, 3, 18, 79, 411, 431, 556
Nominal solution, 28, 29
  discrete-time, 387
Noninteracting control, 249, 263, 367, 380
  asymptotic, 544
  discrete-time, 533, 545
Nonlinear state equation, 28, 33, 35-37, 46, 93, 113, 140
  discrete-time, 387, 400, 436, 460
Nonsingular polynomial matrix, 290, 301
Norm
  Euclidean, 2, 10
  induced, 6-8, 19-21
  spectral, 6-8, 19-21
  supremum of, 129, 203, 216, 434, 448, 508, 516, 520
Null space, 5, 329

O
Observability, 148, 156, 231, 337
  index, 237, 259, 318
  instantaneous, 183, 199, 239
  matrix, 150, 189, 195, 218, 231, 337
  PBH tests, 238
  rank condition, 149, 150, 183, 232
  uniformity condition, 210, 267, 270, 285, 288
Observability Gramian, 149, 163, 167, 210, 267, 285, 337
  discrete-time, 468, 515, 548
Observability, discrete-time, 467, 476, 483
  l-step, 469, 485, 487, 515, 548, 552, 567
  matrix, 467, 468, 500, 518, 560
  rank condition, 467, 469
  uniformity condition, 515, 548, 552, 567
Observable subsystem, 341
Observer, 266, 275, 281, 287
  gain, 267, 271, 274, 275, 278, 287
  initial state, 266, 287, 547
  reduced dimension, 272, 278, 285, 287
  robust, 289
  with unknown input, 288
Observer, discrete-time, 547, 553, 556, 562
  gain, 548, 552, 555, 556, 560
  reduced-dimension, 553, 559, 566, 567
Observer form, 232, 239, 275, 318, 556
Open-loop state equation, 240
  discrete-time, 521
Operational amplifier, 34
Output controllability, 154, 249, 264, 353, 368
  discrete-time, 474
Output feedback, 240, 243, 260, 262, 269, 275, 284, 285, 287, 288, 327
  discrete-time, 327, 521, 524, 550, 556, 563, 566
Output injection, 263, 286, 352
Output reachability, 155
  discrete-time, 473, 534
Output regulation, see Servomechanism
Output signal, 23, 48, 272
  discrete-time, 383, 553
Output variable change, 263

P
Page matrix, 202
Partial fraction expansion, 14, 16, 77, 104, 171, 213, 408, 429, 518, 564
Partial realization, 202, 505
Partial sums, 12, 41
Partitioned matrix, 6, 19, 70, 153, 170, 185, 282, 301, 435, 552, 564
Path controllability, 156
PBH tests, 238
Peano-Baker series, 44, 46, 53, 56, 58, 63
Pendulum, 90, 96
Perfect tracking, 379
Period, 81, 412
Periodic linear state equation, 81, 84, 85, 95-97, 157, 164
  discrete-time, 415, 416, 421, 422, 475, 507, 545
Periodic matrix functions, 81
Periodic matrix sequences, 412, 413
Periodic solutions, 84, 85, 87, 94, 95, 97
  discrete-time, 414, 416, 418, 421
Perturbed state equation, 133, 139-141
  discrete-time, 454, 455, 461
Piecewise continuous, 23, 48, 85, 86
Plant, 240, 249, 270, 280, 323, 341, 351, 357, 362, 367
  discrete-time, 521, 525, 533, 551, 553, 562
Pole, 97, 213, 318, 326, 327
  discrete-time, 518, 519
Pole multiplicity, 318
Polynomial
  characteristic, 4, 76, 113, 247, 275, 349, 367, 408, 532, 556, 563
  coprime, 338
  degree, 77, 303
  monic, 169, 295, 481
Polynomial fraction description, 290, 312
  coprime, 298, 301, 302, 309, 313, 317
  degree, 291
  elementary, 291, 301
  left, 291, 318
  right, 291, 316
Polynomial matrices, 15, 290
  common left divisor, 299
  common right divisor, 292
  greatest common left divisor, 299, 300
  greatest common right divisor, 292, 293
  integrator, 314-318, 323
  left coprime, 299, 300
  left divisor, 299
  right coprime, 292, 297
  right divisor, 291
Polynomial matrix, 15, 290
  column degree, 303, 309
  column degree coefficient matrix, 304
  column Hermite form, 300
  column reduced, 304, 305, 307, 309
  left divisor, 299
  nonsingular, 290, 301
  right divisor, 291
  row degree, 303, 309
  row degree coefficient matrix, 308
  row Hermite form, 295
  row reduced, 308
  Smith form, 311
  unimodular, 290, 291, 294, 296, 298, 300, 301, 306
Positive linear system, 98, 181, 405, 475, 507
Positive (semi)definite, 8, 9, 115, 438
Power series, 13, 59, 73, 74
Precompensator, 259
Principal minors, 8, 9
Product convention, 392, 395, 452
Proper rational function, 16, 77, 81, 408, 411, 505
Pseudo-state, 323
Pulse response, see Unit-pulse response
Pulse-width modulation, 399, 400

Q
Quadratic form, 8, 115, 438
  sign definiteness, 8, 9, 115, 438
Quadratic Lyapunov function, see Lyapunov

R
Range space, 5, 22, 329
Rank, 5, 6, 22, 320
Rate of exponential stability, 244
  discrete-time, 526
Rational function, 14-17, 77, 242, 523
  biproper, 310
  proper, 16, 77, 81, 408, 411, 505
  strictly proper, 14, 77, 81, 169, 291, 307, 308, 310, 313, 315, 317, 411, 481, 523
Rayleigh-Ritz inequality, 8, 132, 451
Reachability, 155, 334, 353
Reachability, discrete-time, 462, 469, 475, 483
  l-step, 469, 484, 513, 527, 532, 545, 549, 552
  matrix, 463, 466, 493, 500, 518
  rank condition, 463, 466
  uniformity condition, 513, 526
Reachability Gramian, 155
  discrete-time, 465, 473, 484, 513, 515, 527, 529
Realizable, 160, 181
  impulse response, 184, 185, 195
  transfer function, 169, 194, 202
  weighting pattern, 160, 171, 178
Realizable, discrete-time, 479, 506, 507
  unit-pulse response, 479, 489, 495, 500
  transfer function, 481
Realization, 160
  balanced, 201
  minimal, 160, 162, 183, 185, 190, 195, 312
  partial, 202
  periodic, 164
  time invariant, 167, 169, 189, 194, 202
Realization, discrete-time, 477
  minimal, 479, 483, 493, 498
  time invariant, 483, 493, 495
Reconstructibility, 156, 288
  discrete-time, 474, 567
Reduced-dimension observer, 272, 278, 285, 287
  discrete-time, 553, 559, 566, 567
Relative degree, 251, 254, 260
  discrete-time, 536, 539
Right coprime, 292, 297, 298, 302, 309
Right divisor, 291
Robust observer, 289
Robust stability, 113, 141
  discrete-time, 461
Rocket model, 24, 29, 38, 51
Routh-Hurwitz criterion, 113
Row degree, 303, 309
  coefficient matrix, 308
Row Hermite form, 295
Row reduced, 308

S
Sampled data, 385, 405, 471, 476, 503
Satellite model, 31, 38, 50, 110, 151, 256
Sensitivity, 33
Servomechanism problem, 280, 289, 381
  discrete-time, 562, 568
Similarity transformation, 4, 75, 78, 219, 231, 248, 330, 366, 410
Singular state equation, 39, 157
  discrete-time, 404
Singular values, 22, 380
Smith form, 311
Smith-McMillan form, 311
Span, 2, 328
Spectral norm, 6-8, 19-21
Spectral radius, 19
Stability
  bounded-input, bounded-output, 216
    discrete-time, 519
  eigenvalue condition, 104, 112, 113, 124, 131, 135, 140, 153, 154
    discrete-time, 429, 436, 445, 448, 450, 452, 456, 460, 473
  exponential, 104, 124, 211, 238
    discrete-time, 238, 429, 445, 517
  finite time, 431, 436
  total, 215
  uniform, 99, 110, 113, 116, 122, 133
    discrete-time, 423, 425, 433, 435, 438, 448, 452, 454, 460
  uniform asymptotic, 106
    discrete-time, 431, 436
  uniform bounded-input bounded-output, 203, 206, 211, 244, 319
    discrete-time, 508, 515, 517, 520, 525
  uniform bounded-input bounded-state, 207, 215
    discrete-time, 513
  uniform exponential, 101, 106, 112, 117, 124, 130, 133-135
    discrete-time, 425, 431, 434, 435, 440, 449, 452, 455, 511
Stabilizability, 130, 259, 351, 352, 449, 476, 520, 544
Stabilizability subspace, 352, 355, 380
Stabilization
  output feedback, 269, 275, 288
  state feedback, 244, 247, 258, 261, 262
Stabilization, discrete-time
  output feedback, 550
  state feedback, 525, 544
Stable subspace, 337, 351
State equation, 23
  adjoint, 62, 69, 73
  bilinear, 37, 93
  closed loop, 240, 247, 249, 270, 275, 280, 324, 342, 358, 362, 368
  linear, 23, 39, 49, 160, 330
  nonlinear, 28, 33, 35-37, 46, 93, 113, 140
  open loop, 240
  periodic, 81, 84, 95, 97, 157, 164
  time invariant, 23, 50, 80
  time varying, 23, 49
State equation, discrete-time, 383
  adjoint, 396, 402
  closed loop, 521, 532, 534, 551, 557, 563
  linear, 383, 393, 404, 406, 420, 479
  nonlinear, 387, 400, 436, 460
  open loop, 521
  periodic, 415, 416, 421, 422, 475, 507, 545
  structured, 475
  time invariant, 384, 406
  time varying, 384
State feedback, 36, 236, 237, 240, 242, 244, 247, 249, 258-263, 323, 341, 345, 351, 355, 358, 362, 367
  discrete-time, 521, 523, 525, 532, 534, 543-545
State observer, see Observer
State space, 330
State variables, 23
  change of, 66, 70, 72, 75, 78, 107, 162, 173, 179, 200, 219, 222, 231, 233, 236, 237, 248, 272, 330, 335, 339
State variables, discrete-time, 383, 553
  change of, 397, 402, 411, 434, 478, 483, 487, 493, 504, 532, 552, 554
State variable diagram, 34, 35, 39, 226
  discrete-time, 390, 391
State vector, 23, 323
  discrete-time, 383
Static feedback, 241, 262
  discrete-time, 522
Stein equation, 449
Strictly proper rational function, 14, 77, 81, 169, 242, 290, 307, 312, 481, 523
Structured state equation, 475
Submatrix, 185, 489
Subspace
  (A,B) invariant, 354
  almost invariant, 356
  compatibility, 369
  conditioned invariant, 354, 355
  controllability, 345
  controllable, 331, 342
  controlled invariant, 341
  direct sum, 329, 338, 370, 376
  inclusion, 329, 352
  intersection, 329, 336, 352, 353
  invariant, 330, 352
  inverse image, 329, 336, 352, 353
  stabilizability, 352, 355, 380
  stable, 337, 351
  sum, 329, 331, 352, 353
  unobservable, 336, 343, 352
  unstable, 337, 351
Sum of subspaces, 329, 331, 352, 353
Supremum, 129, 203, 211, 216, 434, 448, 508, 516, 520
Symmetric matrix, 8, 18-21
System
  exogenous, 289, 568
  identification, 507
  inverse, 216, 217
  matrix, 327


T
Taylor series, 13, 28, 59, 73, 74, 388
Total stability, 215
T-periodic, see Periodic
Trace, 3, 4, 8, 64, 69, 75, 95, 138
Tracking
  asymptotic, 280, 381, 562
  perfect, 379
Transfer function, 81, 169, 194, 213, 243, 291, 302, 312, 316, 318, 323, 341
  McMillan degree, 313, 315, 317
  closed loop, 243, 258, 276, 283, 324, 358, 368
Transfer function, discrete-time, 412, 481, 499, 518, 524
  closed loop, 524, 557, 564
Transition matrix, 43, 58
  commuting case, 59, 73, 82, 141
  derivative, 53, 62, 74
  determinant, 64, 75
  Floquet decomposition, 81
  inverse, 66, 75
  open/closed-loop, 242
  partitioned, 71, 271
  power series, 73
  time-invariant case, 59, 74
Transition matrix, discrete-time, 392, 395
  Floquet decomposition, 413
  inverse, 396
  open/closed-loop, 523
  partitioned, 402, 552
  time-invariant case, 406
Transmission zero, 320, 321, 325, 355
Transpose, 2, 3, 5, 7, 8, 527
Triangle inequality, 2, 11, 271, 552
Two-point boundary conditions, 55, 57
  discrete-time, 401, 403

U
Uncontrollable subsystem, 341
Uniform asymptotic stability, 106
  discrete-time, 431, 436
Uniform bounded-input, bounded-output stability, 203, 206, 211, 244, 319
  discrete-time, 508, 515, 517, 520, 525
Uniform bounded-input, bounded-state stability, 207, 215
  discrete-time, 513
Uniform convergence, 12, 13, 22, 42-44, 46, 55, 59, 156
Uniform exponential stability, 101, 106, 112, 117, 124, 130, 133-135
  discrete-time, 425, 431, 434, 435, 440, 449, 452, 455, 511
  rate, see Rate of uniform exponential stability
Uniform stability, 99, 110, 113, 116, 122, 133
  discrete-time, 423, 425, 433, 435, 438, 448, 452, 454, 460
Unimodular polynomial matrix, 290, 291, 294, 296, 298, 301, 306
Uniqueness of solutions, 45, 48, 56, 62
  discrete-time, 391, 392, 395, 396
Unit delay, 390
Unit pulse, 393, 408
Unit-pulse response, 393, 408, 412, 478, 494, 498, 506, 509, 518
Unobservable subspace, 336, 343, 352
Unobservable subsystem, 341
Unstable subspace, 337, 351, 352
Unstable system, 51, 110, 122, 337, 375
  discrete-time, 418, 432, 443, 539

V
Vec, 136, 457
Vector space, 1, 328

W
Weierstrass M-Test, 13, 42, 59
Weighting pattern, 160, 180, 181, 479
  open/closed-loop, 243

Z
Zero-input response, 48, 80, 99, 148, 206, 319, 321, 330
  discrete-time, 393, 407, 423, 467
Zero matrix, 3
Zero-order hold, 385, 471, 476
Zeros
  blocking, 326
  of analytic functions, 156
  transmission, 320, 325, 327, 355
Zero-state response, 48, 80, 158, 180, 203, 206, 249, 321
  discrete-time, 393, 407, 462, 466, 477, 508
z-transform, 16, 408, 411, 481, 499, 524, 564
  table, 18
