
Appendix A

Mathematical Preliminaries
A.1 Algebraic Structures
Basic set theory is assumed.
A.1.1 Groups, Rings, and Fields
A binary operation $*$ in a set $S$ is a map $S \times S \to S$.

Definition A.1 A nonempty set $G$ with a binary operation $*$, denoted by $(G, *)$, is said to be a group if
1. $(a * b) * c = a * (b * c)$ for all $a, b, c \in G$, (associative law)
2. there exists $e \in G$ such that $e * a = a * e = a$ for all $a \in G$, (existence of identity)
3. for each $a \in G$, there exists $x \in G$ such that $a * x = x * a = e$. (existence of inverse)

Definition A.2 A group $(G, *)$ is said to be abelian if $a * b = b * a$ for all $a, b \in G$.
Example A.1
1. Let $\mathbb{R}^{n \times m}$ denote the set of $n \times m$ real matrices. Then $(\mathbb{R}^{n \times m}, +)$ is an abelian group. The identity element is the zero matrix and the inverse is just the negation. Similarly $(\mathbb{C}^{n \times m}, +)$, the set of $n \times m$ complex matrices together with the addition operation, is an abelian group with the identity being the zero matrix and the inverse being the negation.
2. Let $GL(n, \mathbb{R})$ and $GL(n, \mathbb{C})$ denote the sets of $n \times n$ nonsingular real matrices and complex matrices respectively. Then $(GL(n, \mathbb{R}), \cdot)$ and $(GL(n, \mathbb{C}), \cdot)$ are groups. In both groups, the identity element is the identity matrix and the inverse is the usual matrix inverse. These two groups are called the general linear groups.
3. Let $O(n)$ denote the set of $n \times n$ real orthogonal matrices. Then $(O(n), \cdot)$ is a group. The identity element is the identity matrix and the inverse is the usual matrix inverse (which is equal to the matrix transpose). Similarly, let $U(n)$ denote the set of $n \times n$ complex unitary matrices. Then $(U(n), \cdot)$ is a group. The identity element is the identity matrix and the inverse is the usual matrix inverse (which is equal to the matrix conjugate transpose). The groups $O(n)$ and $U(n)$ are called the orthogonal group and the unitary group respectively.
4. Let $SO(n)$ denote the set of $n \times n$ real orthogonal matrices with determinant equal to 1. Then $(SO(n), \cdot)$ is a group. This group is called the special orthogonal group. Clearly, $SO(n)$ is a subgroup of $O(n)$, which is in turn a subgroup of $GL(n, \mathbb{R})$. Similarly, let $SU(n)$ denote the set of $n \times n$ complex unitary matrices with determinant equal to 1. Then $(SU(n), \cdot)$ is a group. This group is called the special unitary group. Clearly, $SU(n)$ is a subgroup of $U(n)$, which is in turn a subgroup of $GL(n, \mathbb{C})$.
5. The set of all subsets of a set $S$ is denoted by $2^S$. Then $(2^S, \cup)$ and $(2^S, \cap)$ satisfy the associative law and the existence of identity. The identity elements are the empty set and the whole set $S$ respectively. However, they do not satisfy the existence of inverse. Hence neither of them is a group.
Definition A.3 A nonempty set $\mathcal{R}$ with a binary operation $+$, called addition, and a binary operation, called multiplication, is said to be a ring if
1. $(\mathcal{R}, +)$ is an abelian group,
2. $(ab)c = a(bc)$ for all $a, b, c \in \mathcal{R}$, (associative law of multiplication)
3. $a(b + c) = ab + ac$ and $(b + c)a = ba + ca$ for all $a, b, c \in \mathcal{R}$. (distributive law)

A ring $\mathcal{R}$ is said to be commutative if the multiplication is commutative, i.e., $ab = ba$ for all $a, b \in \mathcal{R}$.

A ring $\mathcal{R}$ is said to be a ring with unity if there exists $u \in \mathcal{R}$ such that $ua = au = a$ for all $a \in \mathcal{R}$.

Example A.2
1. The set of all $n \times n$ real (or complex) matrices is a ring with unity.
2. The set of integers $\mathbb{Z}$ is a ring with unity.
3. The set of all univariate polynomials with the usual addition and multiplication is a commutative ring with unity.

Definition A.4 A commutative ring $F$ with unity is said to be a field if every nonzero element of $F$ has a multiplicative inverse in $F$.
Example A.3
1. The set of real numbers $\mathbb{R}$, the set of complex numbers $\mathbb{C}$, and the set of rational numbers $\mathbb{Q}$ are all fields.
2. The set of all univariate rational functions is a field.
A.1.2 Equivalence Relations
A relation $\sim$ in a set $S$ is a map $S \times S \to \{0, 1\}$.

Definition A.5 A relation $\sim$ in a set $S$ is said to be an equivalence relation if
1. $a \sim a$ for all $a \in S$, (reflexive)
2. $a \sim b$ implies $b \sim a$, (symmetric)
3. $a \sim b$ and $b \sim c$ imply $a \sim c$. (transitive)
Example A.4
1. Let $S$ be the set of all individuals in a country, a city, or a university. Then having the same surname defines an equivalence relation, i.e., we can say $a \sim b$ if $a$ and $b$ have the same surname. Also having the same gender defines another equivalence relation, i.e., we can say $a \sim b$ if $a$ and $b$ are both male or both female.
2. Let $\mathbb{F}$ be either $\mathbb{R}$ or $\mathbb{C}$. In the set $\mathbb{F}^{n \times n}$ of $n \times n$ matrices, matrices $A$ and $B$ are said to be similar, denoted by $A \sim B$, if there is a nonsingular matrix $T \in GL(n, \mathbb{F})$ such that $A = T^{-1}BT$. Then $\sim$ is an equivalence relation.
3. Denote the set of all $n \times n$ real symmetric matrices by $\mathcal{S}^n$. Two matrices $A, B \in \mathcal{S}^n$ are said to be congruent, denoted by $A \cong B$, if there is a nonsingular matrix $T \in GL(n, \mathbb{R})$ such that $A = T'BT$. Here the prime $'$ means matrix transpose. Then $\cong$ is an equivalence relation. Similarly, denote the set of all $n \times n$ complex Hermitian matrices by $\mathcal{H}^n$. Two matrices $A, B \in \mathcal{H}^n$ are said to be congruent, denoted by $A \cong B$, if there is a nonsingular matrix $T \in GL(n, \mathbb{C})$ such that $A = T^*BT$. Here the star $*$ means matrix conjugate transpose. Then $\cong$ is an equivalence relation.
4. Two matrices $A, B \in \mathbb{R}^{n \times n}$ are said to be orthogonally similar, denoted by $A \approx B$, if there is an orthogonal matrix $U \in O(n)$ such that $A = U'BU$. Then $\approx$ is an equivalence relation. Two matrices $A, B \in \mathbb{C}^{n \times n}$ are said to be unitarily similar, denoted by $A \approx B$, if there is a unitary matrix $U \in U(n)$ such that $A = U^*BU$. Then $\approx$ is an equivalence relation. Apparently $A \approx B$ implies $A \sim B$.
A subset of $S$ containing all mutually equivalent elements is called an equivalence class. Each pair of equivalence classes are either identical or disjoint, and $S$ is simply the union of all equivalence classes. In other words, the set $S$ can be partitioned into disjoint subsets, called equivalence classes, by grouping the equivalent elements together. The set of all equivalence classes is called the quotient of $S$ with respect to $\sim$, and is sometimes denoted by $S/\sim$.
Example A.5
1. Let $S$ be the set of all individuals in a country and let the equivalence relation $\sim$ be having the same gender, i.e., we say $a \sim b$ if $a$ and $b$ are both male or both female. Then $S$ is divided into two equivalence classes containing all male individuals and all female individuals respectively. The quotient set $S/\sim$ contains two members: the male class and the female class.
2. For $\mathbb{C}^{n \times n}$ with equivalence relation $\sim$, it follows from the Jordan canonical form theorem that each equivalence class consists of matrices with the same Jordan canonical form. The quotient set $\mathbb{C}^{n \times n}/\sim$ consists of all possible $n \times n$ Jordan matrices.
3. For $\mathcal{S}^n$ with equivalence relation $\cong$, it follows from Sylvester's law of inertia that each equivalence class consists of symmetric matrices with the same inertia. The quotient $\mathcal{S}^n/\cong$ is the set of all possible inertias.
4. For a set $S$, the power $S^n$ consists of all ordered $n$-tuples of elements of $S$. Now let us define a relation $\sim$ in $S^n$ in the following way: $x \sim y$ if they contain the same elements of $S$, possibly ordered in a different way. Apparently $\sim$ is an equivalence relation. The quotient set $S^n/\sim$ is the set of bags of $n$ elements, possibly with repetitions, of $S$.
A.1.3 Partial Order
Definition A.6 A relation $\preceq$ in a set $S$ is said to be a partial order if
1. $a \preceq a$ for all $a \in S$, (reflexive)
2. $a \preceq b$ and $b \preceq a$ imply $a = b$, (anti-symmetric)
3. $a \preceq b$ and $b \preceq c$ imply $a \preceq c$. (transitive)

A set with a partial order is called a partially ordered set or poset, denoted by $(S, \preceq)$. We read $a \preceq b$ as "$a$ precedes $b$" or "$b$ supersedes $a$".
Example A.6
1. For $A, B \in \mathcal{S}^n$, define $A \leq B$ if $x'Ax \leq x'Bx$ for all $x \in \mathbb{R}^n$. Then $(\mathcal{S}^n, \leq)$ is a poset. For $A, B \in \mathcal{H}^n$, define $A \leq B$ if $x^*Ax \leq x^*Bx$ for all $x \in \mathbb{C}^n$. Then $(\mathcal{H}^n, \leq)$ is a poset.
2. For $A = [a_{ij}], B = [b_{ij}] \in \mathbb{R}^{n \times m}$, we say $A \leq B$ if $a_{ij} \leq b_{ij}$ for all $1 \leq i \leq n$, $1 \leq j \leq m$. Then $(\mathbb{R}^{n \times m}, \leq)$ is a poset.
3. For $x, y \in \mathbb{R}^n$, we say that $x$ is majorized by $y$, denoted by $x \prec y$, if
\[
\begin{aligned}
\max_{1 \le i \le n} x_i &\le \max_{1 \le i \le n} y_i \\
\max_{1 \le i_1 < i_2 \le n} (x_{i_1} + x_{i_2}) &\le \max_{1 \le i_1 < i_2 \le n} (y_{i_1} + y_{i_2}) \\
&\;\;\vdots \\
\max_{1 \le i_1 < i_2 < \cdots < i_{n-1} \le n} (x_{i_1} + x_{i_2} + \cdots + x_{i_{n-1}}) &\le \max_{1 \le i_1 < i_2 < \cdots < i_{n-1} \le n} (y_{i_1} + y_{i_2} + \cdots + y_{i_{n-1}}) \\
x_1 + x_2 + \cdots + x_n &= y_1 + y_2 + \cdots + y_n.
\end{aligned}
\]
We say that $x$ is weakly majorized by $y$, denoted by $x \prec_w y$, if the last equality above is changed to the inequality $\leq$. For example, $[2\ 2\ 2]' \prec [1\ 2\ 3]'$. We can see that $\prec$ orders the level of fluctuation when the average is the same. Also notice that $[1\ 2\ 3]' \prec_w [3\ 3\ 3]'$. This shows that $\prec_w$ orders the level of fluctuation and the average in a combined way. (A numerical check of these two examples is sketched right after this example.) Neither the majorization $\prec$ nor the weak majorization $\prec_w$ is a partial order in $\mathbb{R}^n$ since neither is anti-symmetric. Nevertheless, both $\prec$ and $\prec_w$ are partial orders in $\mathbb{R}^n/\sim$.
4. In a set $S$ with two equivalence relations $\sim_1$ and $\sim_2$, we say that $\sim_1$ is finer than $\sim_2$, denoted by $\sim_1 \preceq\ \sim_2$, if $a \sim_1 b$ implies $a \sim_2 b$. When $\sim_1$ is finer than $\sim_2$, we also say that $\sim_2$ is coarser than $\sim_1$. For example, in $\mathbb{F}^{n \times n}$, the equivalence relation $\approx$ is finer than $\sim$. In $\mathcal{H}^n$, $\approx$ is finer than $\cong$. For a set $S$ with a set $\mathcal{E}$ of equivalence relations, $\preceq$ defines a partial order, i.e., $(\mathcal{E}, \preceq)$ is a poset.
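Since the maximum of $x_{i_1} + \cdots + x_{i_k}$ over $1 \le i_1 < \cdots < i_k \le n$ is simply the sum of the $k$ largest entries of $x$, the majorization conditions in item 3 reduce to comparisons of sorted partial sums. A minimal Python sketch along these lines (the function name and tolerance are illustrative choices) confirms the two examples given there:

import numpy as np

def majorized(x, y, weak=False):
    # True if x is majorized (or, with weak=True, weakly majorized) by y.
    xs = np.sort(np.asarray(x, float))[::-1]    # entries, nonincreasing
    ys = np.sort(np.asarray(y, float))[::-1]
    cx, cy = np.cumsum(xs), np.cumsum(ys)
    if not np.all(cx[:-1] <= cy[:-1] + 1e-12):  # partial-sum inequalities
        return False
    # last condition: equality for majorization, inequality for weak
    return cx[-1] <= cy[-1] + 1e-12 if weak else np.isclose(cx[-1], cy[-1])

print(majorized([2, 2, 2], [1, 2, 3]))             # True
print(majorized([1, 2, 3], [3, 3, 3], weak=True))  # True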
Let $\mathcal{T}$ be a subset of a poset $(S, \preceq)$. An element $a \in S$ is said to be a lower bound of $\mathcal{T}$ if $a \preceq b$ for all $b \in \mathcal{T}$. Similarly, an element $a \in S$ is said to be an upper bound of $\mathcal{T}$ if $b \preceq a$ for all $b \in \mathcal{T}$. A lower bound $a$ of $\mathcal{T}$ is said to be the greatest lower bound or infimum of $\mathcal{T}$ if $c \preceq a$ for any other lower bound $c$ of $\mathcal{T}$. The greatest lower bound of $\mathcal{T}$ is denoted by $\inf \mathcal{T}$. An upper bound $a$ of $\mathcal{T}$ is said to be the least upper bound or supremum of $\mathcal{T}$ if $a \preceq c$ for any other upper bound $c$ of $\mathcal{T}$. The least upper bound of $\mathcal{T}$ is denoted by $\sup \mathcal{T}$. It is easy to see that $\inf \mathcal{T}$ and $\sup \mathcal{T}$, if they exist, are unique.

Let $a, b \in S$. Then $\inf\{a, b\}$ is also denoted by $a \wedge b$, called the meet of $a, b$; and $\sup\{a, b\}$ is also denoted by $a \vee b$, called the join of $a, b$.

Definition A.7 A poset $(S, \preceq)$ is called a lattice if $a \wedge b$ and $a \vee b$ exist for all $a, b \in S$.
Example A.7
1. Let $\mathbb{N}$ be the set of natural numbers $\{1, 2, \ldots\}$. We say $a \mid b$ if $a$ divides $b$. Then $\mid$ is a partial order and $(\mathbb{N}, \mid)$ is a lattice. Here $a \wedge b$ is the greatest common divisor and $a \vee b$ is the least common multiple.
2. Another partial order in $\mathbb{N}$ is given by $\leq$, the usual order in terms of magnitude. This partial order is special since for $a, b \in \mathbb{N}$, either $a \leq b$ or $b \leq a$. Hence this partial order is actually a total order. In this case, $a \wedge b = \min\{a, b\}$ and $a \vee b = \max\{a, b\}$.
3. Let $P$ be the set of all nonzero monic polynomials. We say $a \mid b$ if $a$ divides $b$. Then $\mid$ is a partial order and $(P, \mid)$ is a lattice. Here $a \wedge b$ is the greatest common divisor and $a \vee b$ is the least common multiple.
4. For a given set $S$, $(2^S, \subseteq)$ is a lattice. Here the meet and join are the intersection and union respectively.
5. For the set of real-valued continuous functions on $[0, 1]$, denoted by $C[0, 1]$, we say $f \leq g$ if $f(x) \leq g(x)$ for all $x \in [0, 1]$. $(C[0, 1], \leq)$ is a lattice with $(f \wedge g)(x) = \min\{f(x), g(x)\}$ and $(f \vee g)(x) = \max\{f(x), g(x)\}$.
6. $(\mathcal{S}^n, \leq)$ and $(\mathcal{H}^n, \leq)$ are not lattices. We leave it as an exercise to check this.
7. For $x, y \in \mathbb{R}^n/\sim$, if their entries have different total sums, then it is impossible to find their meet or join under the partial order $\prec$. Hence $(\mathbb{R}^n/\sim, \prec)$ is not a lattice. On the other hand, $(\mathbb{R}^n/\sim, \prec_w)$ is a lattice. We leave it as an exercise to find the meet and join under the partial order $\prec_w$ for each pair in $\mathbb{R}^n/\sim$.
A lattice $S$ is said to be modular if the modular law is satisfied, i.e., for all $a, b, c \in S$ with $a \preceq c$,
\[
a \vee (b \wedge c) = (a \vee b) \wedge c.
\]
We leave it as an exercise for the reader to verify whether each of the lattices in Example A.7 is modular.
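In the divisibility lattice $(\mathbb{N}, \mid)$ of Example A.7, the meet and join are the gcd and lcm, and the modular law can be spot-checked numerically. A minimal Python sketch with illustrative sample values:

from math import gcd

def meet(a, b):                 # a ∧ b in (N, |): greatest common divisor
    return gcd(a, b)

def join(a, b):                 # a ∨ b in (N, |): least common multiple
    return a * b // gcd(a, b)

# Modular law: if a | c, then a ∨ (b ∧ c) = (a ∨ b) ∧ c.
a, b, c = 4, 6, 8               # here a | c holds
assert join(a, meet(b, c)) == meet(join(a, b), c)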
A.1.4 Linear Spaces and Linear Transformations
Definition A.8 A set $\mathcal{X}$ is said to be a linear space (vector space) over a field $\mathbb{F}$, denoted by $(\mathcal{X}, \mathbb{F})$, if there is a binary operation $+$, called addition, in $\mathcal{X}$ and there is a map $\mathbb{F} \times \mathcal{X} \to \mathcal{X}$, called multiplication by scalars, such that
1. $(\mathcal{X}, +)$ is an abelian group,
2. $(\alpha\beta)x = \alpha(\beta x)$ for all $\alpha, \beta \in \mathbb{F}$ and $x \in \mathcal{X}$,
3. $\alpha(x + y) = \alpha x + \alpha y$ for all $\alpha \in \mathbb{F}$ and $x, y \in \mathcal{X}$,
4. $(\alpha + \beta)x = \alpha x + \beta x$ for all $\alpha, \beta \in \mathbb{F}$ and $x \in \mathcal{X}$,
5. $1x = x$ for all $x \in \mathcal{X}$, where $1$ is the unity in $\mathbb{F}$.

Example A.8
1. The set of $n$-vectors over $\mathbb{F}$, denoted by $\mathbb{F}^n$, is a linear space over $\mathbb{F}$.
2. The set of $n \times m$ matrices with entries in $\mathbb{F}$, denoted by $\mathbb{F}^{n \times m}$, is a linear space over $\mathbb{F}$.
3. The set of all polynomials with coefficients in $\mathbb{F}$ is a linear space.
4. The set of bilateral sequences of the form $(\ldots, x(-1), [x(0)], x(1), \ldots)$, where the square bracket marks the entry at time $0$, is a linear space.
A subset $\mathcal{U}$ of a linear space is said to be a subspace if it is itself a linear space. Let $\mathcal{U}$ and $\mathcal{V}$ be subspaces of $\mathcal{X}$. Then the intersection and the sum of $\mathcal{U}$ and $\mathcal{V}$:
\[
\mathcal{U} \cap \mathcal{V} = \{x : x \in \mathcal{U} \text{ and } x \in \mathcal{V}\}
\]
\[
\mathcal{U} + \mathcal{V} = \{u + v : u \in \mathcal{U} \text{ and } v \in \mathcal{V}\}
\]
are also subspaces of $\mathcal{X}$. These operations make the set of subspaces of $\mathcal{X}$, partially ordered by the set inclusion $\subseteq$, a lattice.

If $\mathcal{U} \cap \mathcal{V} = \{0\}$, then $\mathcal{U}$ and $\mathcal{V}$ are said to be linearly independent. In this case, we write $\mathcal{U} + \mathcal{V}$ as $\mathcal{U} \oplus \mathcal{V}$. More generally, a number of linear subspaces $\mathcal{U}_1, \ldots, \mathcal{U}_m$ of $\mathcal{X}$ are said to be linearly independent if
\[
x_1 + x_2 + \cdots + x_m = 0
\]
for some $x_i \in \mathcal{U}_i$, $i = 1, 2, \ldots, m$, implies $x_i = 0$ for all $i$. If $\mathcal{U}_i$, $i = 1, 2, \ldots, m$, are linearly independent, we write
\[
\mathcal{U}_1 + \mathcal{U}_2 + \cdots + \mathcal{U}_m \quad \text{or} \quad \sum_{i=1}^m \mathcal{U}_i
\]
as
\[
\mathcal{U}_1 \oplus \mathcal{U}_2 \oplus \cdots \oplus \mathcal{U}_m \quad \text{or} \quad \bigoplus_{i=1}^m \mathcal{U}_i.
\]
Several vectors $x_1, x_2, \ldots, x_m \in \mathcal{X}$ are said to be linearly independent if
\[
\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_m x_m = 0
\]
implies $\alpha_1 = \alpha_2 = \cdots = \alpha_m = 0$. The largest number of linearly independent vectors in $\mathcal{X}$ is called the dimension of $\mathcal{X}$. The dimension can be infinity. For a finite dimensional space, let the dimension be $n$. Then a set of $n$ linearly independent vectors $x_1, x_2, \ldots, x_n$ is called a basis of $\mathcal{X}$.
Let $\mathcal{X}$ and $\mathcal{Y}$ be linear spaces over $\mathbb{F}$. A map $A$ from $\mathcal{X}$ to $\mathcal{Y}$ is a linear transformation if
\[
A(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 A x_1 + \alpha_2 A x_2.
\]
The null space or kernel of $A$ is
\[
\mathcal{N}(A) = \{x \in \mathcal{X} : Ax = 0\}.
\]
The range or image of $A$ is
\[
\mathcal{R}(A) = \{y \in \mathcal{Y} : y = Ax \text{ for some } x \in \mathcal{X}\}.
\]
Clearly, $A$ is injective or one-to-one iff $\mathcal{N}(A) = \{0\}$, and $A$ is surjective or onto iff $\mathcal{R}(A) = \mathcal{Y}$. Putting these together, we see that $A$ is bijective or a one-one correspondence iff $\mathcal{N}(A) = \{0\}$ and $\mathcal{R}(A) = \mathcal{Y}$.
Let $\mathcal{U}$ be a subspace of $\mathcal{X}$. Denote
\[
A\mathcal{U} = \{Au : u \in \mathcal{U}\}
\]
and call it the image of $\mathcal{U}$ under $A$. Let $\mathcal{V}$ be a subspace of $\mathcal{Y}$. Denote
\[
A^{-1}\mathcal{V} = \{x \in \mathcal{X} : Ax \in \mathcal{V}\}
\]
and call it the inverse image of $\mathcal{V}$. This notation by no means implies that $A$ is invertible. Using this notation, we see that $\mathcal{R}(A) = A\mathcal{X}$ and $\mathcal{N}(A) = A^{-1}\{0\}$.
It is not hard to verify the following relations (cf. Exercise A.6):
\[
A(\mathcal{U}_1 + \mathcal{U}_2) = A\mathcal{U}_1 + A\mathcal{U}_2 \tag{A.1}
\]
\[
A(\mathcal{U}_1 \cap \mathcal{U}_2) \subseteq A\mathcal{U}_1 \cap A\mathcal{U}_2 \tag{A.2}
\]
\[
A^{-1}(\mathcal{V}_1 + \mathcal{V}_2) \supseteq A^{-1}\mathcal{V}_1 + A^{-1}\mathcal{V}_2 \tag{A.3}
\]
\[
A^{-1}(\mathcal{V}_1 \cap \mathcal{V}_2) = A^{-1}\mathcal{V}_1 \cap A^{-1}\mathcal{V}_2. \tag{A.4}
\]
One can also show that (A.2) becomes an equality if and only if
\[
\mathcal{N}(A) \cap (\mathcal{U}_1 + \mathcal{U}_2) = \mathcal{N}(A) \cap \mathcal{U}_1 + \mathcal{N}(A) \cap \mathcal{U}_2 \tag{A.5}
\]
and (A.3) becomes an equality if and only if
\[
\mathcal{R}(A) \cap (\mathcal{V}_1 + \mathcal{V}_2) = \mathcal{R}(A) \cap \mathcal{V}_1 + \mathcal{R}(A) \cap \mathcal{V}_2 \tag{A.6}
\]
(cf. Exercise A.6).
Now consider a linear transformation $A : \mathcal{X} \to \mathcal{X}$. A subspace $\mathcal{U} \subseteq \mathcal{X}$ is said to be $A$-invariant if $A\mathcal{U} \subseteq \mathcal{U}$. In particular, $\{0\}$, $\mathcal{X}$, $\mathcal{N}(A)$ and $\mathcal{R}(A)$ are $A$-invariant subspaces. The set of $A$-invariant subspaces is a lattice under the partial order $\subseteq$. Hence it is a sublattice of the set of all subspaces of $\mathcal{X}$.

Let $\mathcal{U}$ be an $A$-invariant subspace. Then $A$ can also be considered as a map from $\mathcal{U}$ to $\mathcal{U}$. This map is called the restriction of $A$ on $\mathcal{U}$ and is denoted by $A|_{\mathcal{U}}$.
A.1.5 Normed Linear Space
Consider a linear space $\mathcal{X}$ over a field $\mathbb{F}$. For us $\mathbb{F}$ is either the real field $\mathbb{R}$ or the complex field $\mathbb{C}$. A norm $\|\cdot\|$ in $\mathcal{X}$ is a function $\mathcal{X} \to \mathbb{R}$ satisfying
1. $\|x\| > 0$ for all $x \neq 0$,
2. $\|\alpha x\| = |\alpha| \|x\|$ for all $x \in \mathcal{X}$ and $\alpha \in \mathbb{F}$,
3. $\|x + y\| \leq \|x\| + \|y\|$ for all $x, y \in \mathcal{X}$.

Common norms in $\mathbb{F}^n$ are the Hölder $p$-norms, $1 \leq p \leq \infty$, defined by
\[
\left\| \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \right\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}
\]
for $1 \leq p < \infty$ and
\[
\left\| \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \right\|_\infty = \max_{1 \leq i \leq n} |x_i|.
\]
It is not a trivial matter to show that $\|\cdot\|_p$ indeed satisfies the third requirement above. We leave it to the reader as an exercise.
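For concreteness, the Hölder norms, and a spot check of the triangle inequality, can be computed with standard library routines. The following Python lines are an illustrative sketch, not a proof:

import numpy as np

x = np.array([3.0, -4.0, 0.0])
for p in (1, 2, np.inf):
    print(p, np.linalg.norm(x, p))     # 7.0, 5.0, 4.0

y = np.array([1.0, 2.0, -2.0])         # triangle inequality for p = 2
assert np.linalg.norm(x + y, 2) <= np.linalg.norm(x, 2) + np.linalg.norm(y, 2)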
In the space of matrices, several classes of norms are used in different situations. First, the Hölder $p$-norms can be extended to the space of matrices $\mathbb{F}^{n \times m}$:
\[
\left|\!\left|\!\left| \begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix} \right|\!\right|\!\right|_p = \left( \sum_{i=1}^n \sum_{j=1}^m |a_{ij}|^p \right)^{1/p}.
\]
Here the reason why $|||\cdot|||$ instead of $\|\cdot\|$ is used for these norms is purely a notational matter and is to avoid confusion with other classes of matrix norms. Another way to define $|||\cdot|||_p$ is by introducing the so-called vec operator:
\[
\mathrm{vec}\begin{bmatrix} A_1 & A_2 & \cdots & A_m \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix}.
\]
Here $A_i$ are the columns of $A$. Then
\[
|||A|||_p = \|\mathrm{vec}\,A\|_p.
\]
The case when $p = 2$ is of particular interest. The norm $|||\cdot|||_2$ is called the Frobenius norm, and is also denoted by $\|\cdot\|_F$.
Secondly, let the singular values of a matrix $A \in \mathbb{F}^{n \times m}$ be $\sigma_1(A), \sigma_2(A), \ldots, \sigma_{\min\{n,m\}}(A)$, ordered nonincreasingly. Define
\[
|A|_p = \left\| \begin{bmatrix} \sigma_1(A) \\ \vdots \\ \sigma_{\min\{n,m\}}(A) \end{bmatrix} \right\|_p.
\]
It is nontrivial to show that $|\cdot|_p$ is indeed a norm, which is left as an exercise.
A third class of matrix norms is also used. For $A \in \mathbb{F}^{n \times m}$, define
\[
\|A\|_p = \sup_{x \in \mathbb{F}^m,\, x \neq 0} \frac{\|Ax\|_p}{\|x\|_p} = \sup_{\|x\|_p = 1} \|Ax\|_p.
\]
This norm is called the induced $p$-norm. It is not hard to show
\[
\|A\|_1 = \max_{1 \leq j \leq m} \sum_{i=1}^n |a_{ij}| \tag{A.7}
\]
\[
\|A\|_2 = \sigma_1(A) \tag{A.8}
\]
\[
\|A\|_\infty = \max_{1 \leq i \leq n} \sum_{j=1}^m |a_{ij}|. \tag{A.9}
\]
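Formulas (A.7)-(A.9), as well as the identity $|||A|||_2 = \|\mathrm{vec}\,A\|_2$ given above, are easy to confirm numerically. A Python sketch on a small sample matrix (illustrative only):

import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

# (A.7): induced 1-norm = maximum absolute column sum
assert np.isclose(np.linalg.norm(A, 1), np.abs(A).sum(axis=0).max())
# (A.8): induced 2-norm = largest singular value
assert np.isclose(np.linalg.norm(A, 2), np.linalg.svd(A, compute_uv=False)[0])
# (A.9): induced inf-norm = maximum absolute row sum
assert np.isclose(np.linalg.norm(A, np.inf), np.abs(A).sum(axis=1).max())
# Frobenius norm = Hölder 2-norm of vec(A)
assert np.isclose(np.linalg.norm(A, 'fro'), np.linalg.norm(A.flatten(), 2))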
Let $\ell_p(\mathbb{Z})$, $1 \leq p \leq \infty$, be the set of bilateral sequences
\[
x = (\ldots, x(-1), [x(0)], x(1), \ldots), \quad x(k) \in \mathbb{F},
\]
satisfying
\[
\sum_{k=-\infty}^{\infty} |x(k)|^p < \infty
\]
for $p < \infty$ and
\[
\sup_{-\infty < k < \infty} |x(k)| < \infty
\]
for $p = \infty$. Define the norm in $\ell_p(\mathbb{Z})$ by
\[
\|x\|_p = \left( \sum_{k=-\infty}^{\infty} |x(k)|^p \right)^{1/p}
\]
for $p < \infty$ and
\[
\|x\|_\infty = \sup_{-\infty < k < \infty} |x(k)|.
\]
A.1.6 Inner Product Space
An inner product on a linear space $\mathcal{X}$ is a function $\langle \cdot, \cdot \rangle : \mathcal{X} \times \mathcal{X} \to \mathbb{F}$ such that
1. $\langle x, x \rangle > 0$ for all $x \neq 0$,
2. $\langle x, y \rangle = \overline{\langle y, x \rangle}$ for all $x, y \in \mathcal{X}$,
3. $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$ for all $x, y, z \in \mathcal{X}$,
4. $\langle x, \alpha y \rangle = \alpha \langle x, y \rangle$ for all $x, y \in \mathcal{X}$ and $\alpha \in \mathbb{F}$.

A linear space with an inner product is called an inner product space.
Proposition A.1 (Cauchy-Schwarz inequality)
\[
|\langle x, y \rangle| \leq \sqrt{\langle x, x \rangle \langle y, y \rangle}.
\]
Proof The case when $\langle x, y \rangle = 0$ is trivial. Assume now that $\langle x, y \rangle \neq 0$. For all $\alpha \in \mathbb{F}$,
\[
0 \leq \langle x + \alpha y, x + \alpha y \rangle = \langle x, x \rangle + \alpha \langle x, y \rangle + \overline{\alpha \langle x, y \rangle} + |\alpha|^2 \langle y, y \rangle = \langle x, x \rangle + 2\,\mathrm{Re}(\alpha \langle x, y \rangle) + |\alpha|^2 \langle y, y \rangle.
\]
Let $\alpha = \frac{\overline{\langle x, y \rangle}}{|\langle x, y \rangle|}\, t$. Then for all $t \in \mathbb{R}$,
\[
0 \leq \langle x, x \rangle + 2|\langle x, y \rangle|\, t + \langle y, y \rangle\, t^2.
\]
This can happen only if
\[
(2|\langle x, y \rangle|)^2 - 4\langle x, x \rangle \langle y, y \rangle \leq 0,
\]
which yields $|\langle x, y \rangle|^2 \leq \langle x, x \rangle \langle y, y \rangle$. □

Proposition A.1 implies that an inner product induces a norm $\|x\| = \sqrt{\langle x, x \rangle}$. Another consequence of Proposition A.1 is that the inner product is a continuous function in both variables.
The norm induced by an inner product satisfies the so-called parallelogram law.

Theorem A.1 (Parallelogram law)
\[
\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2.
\]

It is quite trivial to verify this, but the converse question is more interesting: given a norm in a linear space, how can one know if the norm is induced by an inner product? The following theorem answers this question; its proof is left as an exercise (Exercise A.14).

Theorem A.2 $\|\cdot\|$ is induced by an inner product if
\[
\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2.
\]
In $\mathbb{F}^n$, the standard inner product is
\[
\langle x, y \rangle = x^* y.
\]
The norm induced by this inner product is the Hölder 2-norm $\|\cdot\|_2$.

In $\mathbb{F}^{n \times m}$, let us define
\[
\langle A, B \rangle = \mathrm{tr}(A^* B).
\]
The norm induced by this inner product is the Frobenius norm $\|\cdot\|_F$.

In $\ell_2(\mathbb{Z})$, let us define
\[
\langle x, y \rangle = \sum_{k=-\infty}^{\infty} \overline{x(k)}\, y(k).
\]
It is an easy exercise to show that $\langle x, y \rangle$ is well defined for all $x, y \in \ell_2(\mathbb{Z})$ and that $\langle \cdot, \cdot \rangle$ satisfies the requirements for an inner product. The induced norm is exactly $\|\cdot\|_2$.
Two vectors $x, y$ are said to be orthogonal, denoted by $x \perp y$, if $\langle x, y \rangle = 0$. Several vectors $x_i$, $i = 1, 2, \ldots, m$, are said to be orthogonal if they are pairwise orthogonal. They are said to be orthonormal if they are orthogonal and $\|x_i\| = 1$ for all $i$.
Let $x_1, x_2, \ldots, x_m$ be a set of linearly independent vectors. Construct a new set of vectors in the following way:
\[
u_i = \left( x_i - \sum_{j=1}^{i-1} \langle u_j, x_i \rangle u_j \right) \Big/ \left\| x_i - \sum_{j=1}^{i-1} \langle u_j, x_i \rangle u_j \right\|, \quad i = 1, 2, \ldots, m.
\]
One can check in a straightforward way that $u_1, u_2, \ldots, u_m$ are orthonormal. This process of getting a set of orthonormal vectors from a set of linearly independent vectors is called the Gram-Schmidt orthonormalization.
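The construction translates directly into code. The following Python sketch orthonormalizes the columns of a matrix by the formula above (illustrative only; for serious numerical work a QR factorization is preferred):

import numpy as np

def gram_schmidt(X):
    # Columns of X are assumed linearly independent.
    n, m = X.shape
    U = np.zeros((n, m), dtype=complex)
    for i in range(m):
        v = X[:, i].astype(complex)
        for j in range(i):
            v -= np.vdot(U[:, j], X[:, i]) * U[:, j]   # subtract <u_j, x_i> u_j
        U[:, i] = v / np.linalg.norm(v)                # normalize
    return U

X = np.array([[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
U = gram_schmidt(X)
print(np.allclose(U.conj().T @ U, np.eye(2)))          # True: orthonormal columns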
The orthogonal complement of a vector $x \in \mathcal{X}$, denoted by $x^\perp$, is the set of vectors in $\mathcal{X}$ orthogonal to $x$, i.e.,
\[
x^\perp = \{y \in \mathcal{X} : x \perp y\}.
\]
The orthogonal complement of a set $S$ of vectors in $\mathcal{X}$, denoted by $S^\perp$, is the set of vectors in $\mathcal{X}$ orthogonal to every member of $S$, i.e.,
\[
S^\perp = \{y \in \mathcal{X} : x \perp y \text{ for all } x \in S\}.
\]
If $\mathcal{U}$ is a subspace, it is easy to see that
\[
\mathcal{U} \oplus \mathcal{U}^\perp = \mathcal{X}.
\]
A.2 Matrix Analysis
A.2.1 Matrix Operations
Basic matrix operations such as addition, multiplication, and inverse are assumed. Also assumed are matrix functions such as determinant and trace. Two special matrix operations will be introduced in this subsection.
The first is the Schur complement. Let
\[
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \in \mathbb{F}^{(n_1+n_2) \times (m_1+m_2)}
\]
be a partitioned matrix. If $A_{11}$ is invertible, then the Schur complement of $A_{11}$ in $A$ is defined as
\[
A/_{11} = A_{22} - A_{21} A_{11}^{-1} A_{12}.
\]
If $A_{12}$ is invertible, then the Schur complement of $A_{12}$ in $A$ is defined as
\[
A/_{12} = A_{21} - A_{22} A_{12}^{-1} A_{11}.
\]
If $A_{21}$ is invertible, then the Schur complement of $A_{21}$ in $A$ is defined as
\[
A/_{21} = A_{12} - A_{11} A_{21}^{-1} A_{22}.
\]
If $A_{22}$ is invertible, then the Schur complement of $A_{22}$ in $A$ is defined as
\[
A/_{22} = A_{11} - A_{12} A_{22}^{-1} A_{21}.
\]
The following theorem can be easily verified.

Theorem A.3 Assume the existence of the inverses required in each identity.
1. $\det(A) = \det(A_{11}) \det(A/_{11})$.
2. $\det(A) = \det(A_{22}) \det(A/_{22})$.
3. $A^{-1} = \begin{bmatrix} A_{11}^{-1} + A_{11}^{-1} A_{12} (A/_{11})^{-1} A_{21} A_{11}^{-1} & -A_{11}^{-1} A_{12} (A/_{11})^{-1} \\ -(A/_{11})^{-1} A_{21} A_{11}^{-1} & (A/_{11})^{-1} \end{bmatrix}$.
4. $A^{-1} = \begin{bmatrix} (A/_{22})^{-1} & -(A/_{22})^{-1} A_{12} A_{22}^{-1} \\ -A_{22}^{-1} A_{21} (A/_{22})^{-1} & A_{22}^{-1} + A_{22}^{-1} A_{21} (A/_{22})^{-1} A_{12} A_{22}^{-1} \end{bmatrix}$.
5. $A^{-1} = \begin{bmatrix} (A/_{22})^{-1} & (A/_{12})^{-1} \\ (A/_{21})^{-1} & (A/_{11})^{-1} \end{bmatrix}$.
The second matrix operation is the linear fractional transformation. Let
\[
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \in \mathbb{F}^{(n_1+n_2) \times (m_1+m_2)}, \quad X \in \mathbb{F}^{m_2 \times n_2}.
\]
The lower linear fractional transformation is defined as
\[
\mathcal{F}_l(A, X) = A_{11} + A_{12} X (I - A_{22} X)^{-1} A_{21}.
\]
The upper linear fractional transformation is defined as
\[
\mathcal{F}_u(A, X) = A_{22} + A_{21} X (I - A_{11} X)^{-1} A_{12}.
\]
Now let
\[
B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} \in \mathbb{F}^{(n_2+n_3) \times (m_2+m_3)}, \quad Y \in \mathbb{F}^{m_3 \times n_3}.
\]
Then what is the composition
\[
\mathcal{F}_l(A, \mathcal{F}_l(B, Y))?
\]
Let us define the star product of $A$ and $B$ to be
\[
A \star B = \begin{bmatrix} \mathcal{F}_l(A, B_{11}) & A_{12} (I - B_{11} A_{22})^{-1} B_{12} \\ B_{21} (I - A_{22} B_{11})^{-1} A_{21} & \mathcal{F}_u(B, A_{22}) \end{bmatrix}.
\]
Then it is straightforward to see
\[
\mathcal{F}_l(A, \mathcal{F}_l(B, Y)) = \mathcal{F}_l(A \star B, Y).
\]
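The composition rule can be verified numerically. A Python sketch using square blocks of equal size and small random data (so that the inverses involved exist with overwhelming probability; illustrative only):

import numpy as np

def blocks(M, k):
    return M[:k, :k], M[:k, k:], M[k:, :k], M[k:, k:]

def f_l(M, X, k):
    # Lower LFT F_l(M, X) for a 2k-by-2k block matrix M.
    M11, M12, M21, M22 = blocks(M, k)
    return M11 + M12 @ X @ np.linalg.inv(np.eye(k) - M22 @ X) @ M21

def f_u(M, X, k):
    # Upper LFT F_u(M, X).
    M11, M12, M21, M22 = blocks(M, k)
    return M22 + M21 @ X @ np.linalg.inv(np.eye(k) - M11 @ X) @ M12

def star(A, B, k):
    # Star product A * B as defined above.
    A11, A12, A21, A22 = blocks(A, k)
    B11, B12, B21, B22 = blocks(B, k)
    return np.block([
        [f_l(A, B11, k), A12 @ np.linalg.inv(np.eye(k) - B11 @ A22) @ B12],
        [B21 @ np.linalg.inv(np.eye(k) - A22 @ B11) @ A21, f_u(B, A22, k)]])

k = 2
rng = np.random.default_rng(1)
A = 0.3 * rng.standard_normal((2 * k, 2 * k))
B = 0.3 * rng.standard_normal((2 * k, 2 * k))
Y = 0.3 * rng.standard_normal((k, k))
assert np.allclose(f_l(A, f_l(B, Y, k), k), f_l(star(A, B, k), Y, k))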
A.2.2 Matrix Eigenvalue Problem
In this subsection, we will be concerned with square matrices $A \in \mathbb{C}^{n \times n}$.

Definition A.9 The characteristic polynomial of $A$ is defined to be
\[
c_A(z) = \det(zI - A).
\]

Proposition A.2 Let $A \in \mathbb{C}^{n \times n}$ and $\lambda \in \mathbb{C}$. The following statements are equivalent:
1. $c_A(\lambda) = 0$.
2. There exists nonzero $x \in \mathbb{C}^n$ such that $Ax = \lambda x$.
3. There exists nonzero $y \in \mathbb{C}^n$ such that $y^* A = \lambda y^*$.

A complex number $\lambda$ satisfying item 1 of Proposition A.2 is called an eigenvalue of $A$. The vectors $x$ and $y$ satisfying items 2 and 3 of Proposition A.2 are called respectively the right and left eigenvectors of $A$ corresponding to the eigenvalue $\lambda$. The set of eigenvalues of $A$, called the spectrum of $A$, is denoted by $\sigma(A)$. The maximum modulus of the eigenvalues of $A$, called the spectral radius of $A$, is denoted by $\rho(A)$.

We can see that $A \in \mathbb{C}^{n \times n}$ has $n$ eigenvalues counting multiplicity. Hence $\sigma(A)$ lies in $\mathbb{C}^n/\sim$.
The matrix function $(zI - A)^{-1}$ is called the resolvent of $A$. From linear algebra we know
\[
(zI - A)^{-1} = \frac{\mathrm{Adj}(zI - A)}{c_A(z)}
\]
where $\mathrm{Adj}(zI - A)$ means the adjugate of $zI - A$.
Theorem A.4 (Leverrier-Souriau-Faddeeva-Frame) Let
\[
c_A(z) = z^n + a_1 z^{n-1} + a_2 z^{n-2} + \cdots + a_n,
\]
\[
\mathrm{Adj}(zI - A) = B_1 z^{n-1} + B_2 z^{n-2} + \cdots + B_n.
\]
Then $a_1, \ldots, a_n$ and $B_1, \ldots, B_n$ can be computed using the following recursive formulas:
\[
B_1 = I,
\]
\[
a_i = -\frac{1}{i}\, \mathrm{tr}(B_i A), \quad i = 1, 2, \ldots, n,
\]
\[
B_{i+1} = B_i A + a_i I, \quad i = 1, 2, \ldots, n-1.
\]
Proof Since
\[
c_A(z) I = \mathrm{Adj}(zI - A)(zI - A),
\]
we have
\[
(z^n + a_1 z^{n-1} + a_2 z^{n-2} + \cdots + a_n) I = B_1 z^n + (B_2 - B_1 A) z^{n-1} + (B_3 - B_2 A) z^{n-2} + \cdots - B_n A.
\]
This gives $B_1 = I$, $B_{i+1} = B_i A + a_i I$ for $i = 1, 2, \ldots, n-1$, and $a_n I + B_n A = 0$.
Observe
\[
\frac{d}{dz} c_A(z) = n z^{n-1} + (n-1) a_1 z^{n-2} + (n-2) a_2 z^{n-3} + \cdots + a_{n-1}.
\]
On the other hand, differentiating $\det(zI - A)$ row by row gives a sum of $n$ determinants, the $k$th of which is obtained from $\det(zI - A)$ by replacing its $k$th row with the $k$th row of the identity matrix; the $k$th such determinant is the $(k, k)$ cofactor of $zI - A$. Hence
\[
\frac{d}{dz} c_A(z) = \mathrm{tr}[\mathrm{Adj}(zI - A)] = \mathrm{tr}(B_1) z^{n-1} + \mathrm{tr}(B_2) z^{n-2} + \cdots + \mathrm{tr}(B_n).
\]
This shows $(n-i) a_i = \mathrm{tr}(B_{i+1})$ for $i = 1, 2, \ldots, n-1$. Plugging $B_{i+1} = B_i A + a_i I$ in, we get $(n-i) a_i = \mathrm{tr}(B_i A) + n a_i$ for $i = 1, 2, \ldots, n-1$. Rearrangement gives $a_i = -\frac{1}{i}\, \mathrm{tr}(B_i A)$ for $i = 1, 2, \ldots, n-1$. Finally $a_n = -\frac{1}{n}\, \mathrm{tr}(B_n A)$ follows from $0 = B_n A + a_n I$. □
Theorem A.5 (Cayley-Hamilton) $c_A(A) = 0$.

Proof From the proof of Theorem A.4,
\[
0 = B_n A + a_n I = B_{n-1} A^2 + a_{n-1} A + a_n I = \cdots = A^n + a_1 A^{n-1} + \cdots + a_{n-1} A + a_n I. \qquad \Box
\]
Assume that $A$ has $l$ distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_l$. Then
\[
c_A(z) = (z - \lambda_1)^{n_1} (z - \lambda_2)^{n_2} \cdots (z - \lambda_l)^{n_l}
\]
where $n_i \geq 1$ and $\sum_{i=1}^l n_i = n$. Here $n_i$ is called the (algebraic) multiplicity of the eigenvalue $\lambda_i$.
Consider
\[
\mathcal{E}_i = \mathcal{N}(A - \lambda_i I)
\]
and
\[
\tilde{\mathcal{E}}_i = \mathcal{N}[(A - \lambda_i I)^{n_i}].
\]
Here the space $\mathcal{E}_i$ is called the eigenspace associated with eigenvalue $\lambda_i$ and is the set of eigenvectors associated with eigenvalue $\lambda_i$, together with the origin. The space $\tilde{\mathcal{E}}_i$ is called the generalized eigenspace associated with eigenvalue $\lambda_i$.

The following observations are quite obvious:
O1 $\mathcal{E}_i$ and $\tilde{\mathcal{E}}_i$ are $A$-invariant subspaces.
O2 $\mathcal{E}_i \subseteq \tilde{\mathcal{E}}_i$.
O3 $1 \leq \dim \mathcal{E}_i \leq \dim \tilde{\mathcal{E}}_i$.
Here comes a less obvious observation.

Proposition A.3 $\dim \tilde{\mathcal{E}}_i \leq n_i$.

Proof Suppose that $\dim \tilde{\mathcal{E}}_i = d > n_i$ for a fixed $i$. Find a subspace $\mathcal{T}_i$, a complement of $\tilde{\mathcal{E}}_i$, such that
\[
\tilde{\mathcal{E}}_i \oplus \mathcal{T}_i = \mathbb{C}^n.
\]
We have to have $\dim \mathcal{T}_i = n - d$. Choose a matrix $E \in \mathbb{C}^{n \times d}$ whose columns form a basis of $\tilde{\mathcal{E}}_i$ and a matrix $F \in \mathbb{C}^{n \times (n-d)}$ whose columns form a basis of $\mathcal{T}_i$. Then $P = \begin{bmatrix} E & F \end{bmatrix}$ is a nonsingular matrix. Since $\tilde{\mathcal{E}}_i$ is $A$-invariant, there is a matrix $\tilde{A}_{11} \in \mathbb{C}^{d \times d}$ such that $AE = E\tilde{A}_{11}$. Hence
\[
P^{-1} A P = \begin{bmatrix} \tilde{A}_{11} & \tilde{A}_{12} \\ 0_{(n-d) \times d} & \tilde{A}_{22} \end{bmatrix}.
\]
Let $\lambda$ be an eigenvalue of $\tilde{A}_{11}$. Then there exists nonzero $x_1 \in \mathbb{C}^d$ such that
\[
\tilde{A}_{11} x_1 = \lambda x_1.
\]
This means $AEx_1 = \lambda E x_1$. Since $Ex_1$ belongs to $\tilde{\mathcal{E}}_i$, it follows that $(A - \lambda_i I)^{n_i} E x_1 = (\lambda - \lambda_i)^{n_i} E x_1 = 0$, which forces $\lambda = \lambda_i$. This shows that all eigenvalues of $\tilde{A}_{11}$ are equal to $\lambda_i$. Consequently,
\[
\det(zI - A) = \det(zI_d - \tilde{A}_{11}) \det(zI_{n-d} - \tilde{A}_{22}) = (z - \lambda_i)^d \det(zI_{n-d} - \tilde{A}_{22}),
\]
which contradicts the fact that the multiplicity of $\lambda_i$ is $n_i$. □
Because of Proposition A.3, observation O3 above can be amended as
O3' $1 \leq \dim \mathcal{E}_i \leq \dim \tilde{\mathcal{E}}_i \leq n_i$.
Theorem A.6 For each $A \in \mathbb{C}^{n \times n}$, it holds that $\bigoplus_{i=1}^l \tilde{\mathcal{E}}_i = \mathbb{C}^n$.
Proof Using partial fraction expansion, we see that there exist polynomials $q_i$, $i = 1, 2, \ldots, l$, such that
\[
\frac{1}{c_A(z)} = \sum_{i=1}^l \frac{q_i(z)}{(z - \lambda_i)^{n_i}}.
\]
Multiplying both sides by $c_A(z)$, we obtain
\[
1 = \sum_{i=1}^l q_i(z) p_i(z)
\]
where $p_i(z) = \frac{c_A(z)}{(z - \lambda_i)^{n_i}}$. Thus,
\[
I = \sum_{i=1}^l q_i(A) p_i(A).
\]
For $x \in \mathbb{C}^n$, let
\[
x_i = q_i(A) p_i(A) x.
\]
Then
\[
x = x_1 + x_2 + \cdots + x_l.
\]
Since
\[
(A - \lambda_i I)^{n_i} x_i = (A - \lambda_i I)^{n_i} q_i(A) p_i(A) x = q_i(A) c_A(A) x = 0,
\]
it follows that $x_i \in \tilde{\mathcal{E}}_i$. This shows that $\mathbb{C}^n = \sum_{i=1}^l \mathcal{N}[(A - \lambda_i I)^{n_i}]$.

To show that this sum is a direct sum, we need to show that if
\[
0 = x_1 + x_2 + \cdots + x_l
\]
with $x_i \in \tilde{\mathcal{E}}_i$, then $x_i = 0$ for each $i$. Assume on the contrary that $x_1 \neq 0$. Then
\[
x_1 = -x_2 - \cdots - x_l.
\]
Since $x_1 \in \tilde{\mathcal{E}}_1$, we have $(A - \lambda_1 I)^{n_1} x_1 = 0$. Since $x_i \in \tilde{\mathcal{E}}_i$ for $i \geq 2$, it follows that $p_1(A) x_1 = 0$. Since $(z - \lambda_1)^{n_1}$ and $p_1(z)$ are coprime, there exist polynomials $h_1(z)$ and $h_2(z)$ such that
\[
h_1(z)(z - \lambda_1)^{n_1} + h_2(z) p_1(z) = 1.
\]
Therefore,
\[
x_1 = [h_1(A)(A - \lambda_1 I)^{n_1} + h_2(A) p_1(A)] x_1 = 0.
\]
This is a contradiction. □
Because of Theorem A.6, observation O3 above can be further amended as
O3'' $1 \leq \dim \mathcal{E}_i \leq \dim \tilde{\mathcal{E}}_i = n_i$.
Since $\mathcal{E}_i \subseteq \tilde{\mathcal{E}}_i$ for each $i = 1, 2, \ldots, l$, Theorem A.6 also implies that the eigenspaces $\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_l$ are linearly independent. However, $\dim \mathcal{E}_i$ can be anywhere from 1 through $\dim \tilde{\mathcal{E}}_i$.
Example A.9
For the matrix
\[
A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{bmatrix},
\]
the only distinct eigenvalue is $\lambda_1 = 2$. We have $\dim \mathcal{E}_1 = 1$ and $\dim \tilde{\mathcal{E}}_1 = 3$.
For the matrix
\[
A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix},
\]
the only distinct eigenvalue is again $\lambda_1 = 2$. We have $\dim \mathcal{E}_1 = 2$ and $\dim \tilde{\mathcal{E}}_1 = 3$.
For the matrix
\[
A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix},
\]
the only distinct eigenvalue is still $\lambda_1 = 2$. We have $\dim \mathcal{E}_1 = 3$ and $\dim \tilde{\mathcal{E}}_1 = 3$.
We see that $\mathcal{E}_i \neq \tilde{\mathcal{E}}_i$ in general and the eigenspaces do not span the whole space $\mathbb{C}^n$ in general. The gap between the eigenspace $\mathcal{E}_i$ and the generalized eigenspace $\tilde{\mathcal{E}}_i$ is the cause of all the trouble in matrix eigenstructure analysis. For a matrix $A$, if there is a gap between the eigenspace and the generalized eigenspace corresponding to one of its eigenvalues, then $A$ is said to be defective; otherwise $A$ is said to be nondefective.

If $A$ has $n$ distinct eigenvalues, i.e., $n_i = 1$ for all $i$, then $\dim \mathcal{E}_i = \dim \tilde{\mathcal{E}}_i$ for all $i$, i.e., $A$ is nondefective. Hence a defective matrix must have repeated eigenvalues.
For a matrix $P \in GL(n, \mathbb{C})$, the transformation $A \mapsto P^{-1}AP$ is called a similarity transformation. Two matrices $A$ and $B$ are said to be similar if there exists a $P \in GL(n, \mathbb{C})$ such that $B = P^{-1}AP$. Two similar matrices have the same eigenvalues. A matrix $A \in \mathbb{C}^{n \times n}$ is said to be diagonalizable if there exists a $P \in GL(n, \mathbb{C})$ such that $P^{-1}AP$ is a diagonal matrix.
Theorem A.7 $A \in \mathbb{C}^{n \times n}$ is diagonalizable if and only if $A$ is nondefective.

Proof If $A$ is nondefective, then $\dim \mathcal{E}_i = n_i$ and $\bigoplus_{i=1}^l \mathcal{E}_i = \mathbb{C}^n$. Choose a basis in $\mathcal{E}_i$ to form a matrix $P_i \in \mathbb{C}^{n \times n_i}$. Then $AP_i = \lambda_i P_i$. Define
\[
P = \begin{bmatrix} P_1 & P_2 & \cdots & P_l \end{bmatrix}.
\]
Then
\[
AP = P \begin{bmatrix} \lambda_1 I_{n_1} & & & \\ & \lambda_2 I_{n_2} & & \\ & & \ddots & \\ & & & \lambda_l I_{n_l} \end{bmatrix},
\]
i.e., $P^{-1}AP$ is diagonal.

If $A$ is diagonalizable, then there exists a $P \in GL(n, \mathbb{C})$ such that
\[
P^{-1}AP = \begin{bmatrix} \lambda_1 I_{n_1} & & & \\ & \lambda_2 I_{n_2} & & \\ & & \ddots & \\ & & & \lambda_l I_{n_l} \end{bmatrix}
\]
where $\lambda_1, \ldots, \lambda_l$ are the distinct eigenvalues of $A$, each with multiplicity $n_i$. Even if initially the diagonal elements are not ordered so that equal ones are grouped together, one can reorder them by modifying $P$. Partition $P$ as
\[
P = \begin{bmatrix} P_1 & P_2 & \cdots & P_l \end{bmatrix}
\]
such that $P_i \in \mathbb{C}^{n \times n_i}$. Then it is easy to see that $\mathcal{E}_i = \mathcal{N}(A - \lambda_i I) = \mathcal{R}(P_i)$ and $\dim \mathcal{E}_i = n_i$. This implies that $A$ is nondefective. □
In case $A$ is not diagonalizable, what is the best we can do? Choose an arbitrary basis in each $\tilde{\mathcal{E}}_i$. Put these basis vectors together to form an $n \times n$ matrix $P$. Note that these basis vectors, i.e., the columns of $P$, form a basis of $\mathbb{C}^n$. Under this basis, $A$ has the matrix representation
\[
P^{-1}AP = \begin{bmatrix} A_1 & & & \\ & A_2 & & \\ & & \ddots & \\ & & & A_l \end{bmatrix}.
\]
Clearly, the eigenvalues of each $A_i$ are all equal to $\lambda_i$. Hence $A$ can be transformed into a block diagonal matrix. The eigenvalues of each diagonal block are the same. The size of each diagonal block is equal to the multiplicity of that particular eigenvalue.
then the best we can achieve is the so-called Jordan canonical form.
Theorem A.8 For each A C
nn
, there exists P (L(n, C) such that
P
1
AP = P
1
i
A
i
Q
i
=
_

_
J
1
J
2
.
.
.
J
l
_

_
where
J
i
=
_

_
J
i1
J
i2
.
.
.
J
im
i
_

_
C
n
i
n
i
where
J
ij
=
_

i
1

i
1
.
.
.
.
.
.

i
1

i
_

_
.
A matrix in the form of $J_{ij}$ is called a Jordan block. The number of Jordan blocks $m_i$ in the matrix $J_i$, all having the same diagonal element $\lambda_i$, can be anywhere from 1 through $n_i$. If $J_i$ has $n_i$ Jordan blocks, then it is actually diagonal.

Finding the Jordan canonical form is generally not an easy task, especially in terms of numerical computation. It is then often wise not to ask for the simplest form, but rather for a form which is simple enough to explicitly demonstrate some properties, e.g., the eigenvalues. This will be the topic of the following subsection.
A.2.3 Matrix Factorizations
The standard inner product of $x, y \in \mathbb{F}^n$ is defined as
\[
\langle x, y \rangle = x^* y.
\]
The norm in $\mathbb{F}^n$ induced by this inner product is the Hölder 2-norm:
\[
\|x\|_2 = \sqrt{\langle x, x \rangle} = \sqrt{x^* x}.
\]
Let now $x_1, x_2, \ldots, x_m$ be a set of linearly independent vectors in $\mathbb{F}^n$. The Gram-Schmidt orthonormalization process gives a set of orthonormal vectors $u_1, u_2, \ldots, u_m$ by
\[
u_i = \left( x_i - \sum_{j=1}^{i-1} \langle u_j, x_i \rangle u_j \right) \Big/ \left\| x_i - \sum_{j=1}^{i-1} \langle u_j, x_i \rangle u_j \right\|, \quad i = 1, 2, \ldots, m.
\]
Notice that $u_i$ only depends on $x_1, \ldots, x_i$. In matrix form, this can be written as
\[
\begin{bmatrix} u_1 & u_2 & \cdots & u_m \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & \cdots & x_m \end{bmatrix} T
\]
where $T = [t_{ij}] \in \mathbb{F}^{m \times m}$ satisfies $t_{ij} = 0$ for $i > j$.
A matrix $T = [t_{ij}] \in \mathbb{F}^{n \times n}$ is said to be upper triangular if $t_{ij} = 0$ for $i > j$, i.e., all elements below the main diagonal are zero. The set of upper triangular matrices is closed under addition, multiplication, and multiplication by scalars, so it is called an algebra. A matrix $U \in \mathbb{F}^{n \times m}$ is said to be an isometry if $U^* U = I$, i.e., all columns of $U$ are orthonormal. For an isometry $U$, we have $\langle Ux, Uy \rangle = \langle x, y \rangle$, i.e., an isometry preserves the inner product. In particular, $\|Ux\| = \langle Ux, Ux \rangle^{1/2} = \langle x, x \rangle^{1/2} = \|x\|$, i.e., an isometry also preserves the norm. A matrix $U \in \mathbb{F}^{n \times m}$ is said to be a co-isometry if $UU^* = I$, i.e., all rows of $U$ are orthonormal. A matrix which is both an isometry and a co-isometry is said to be unitary. For a unitary matrix $U$, we have $U^{-1} = U^*$. The set of unitary matrices is closed under multiplication only, so it is a group.
Proposition A.4 (QR factorization) Let $A \in \mathbb{F}^{n \times m}$ with $n \geq m$. Then there exist an isometry $Q$ and an upper triangular matrix $R$ such that $A = QR$.

Proof We will only prove the case when $A$ has full column rank. The general case is left as an exercise. Let
\[
A = \begin{bmatrix} x_1 & x_2 & \cdots & x_m \end{bmatrix}.
\]
Then $x_1, x_2, \ldots, x_m$ are linearly independent. By the Gram-Schmidt orthonormalization, we can find orthonormal vectors $u_1, u_2, \ldots, u_m$ and an upper triangular matrix $T$ such that
\[
\begin{bmatrix} u_1 & u_2 & \cdots & u_m \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & \cdots & x_m \end{bmatrix} T.
\]
Since
\[
t_{ii} = \left\| x_i - \sum_{j=1}^{i-1} \langle u_j, x_i \rangle u_j \right\|^{-1},
\]
the matrix $T$ is nonsingular and $T^{-1}$ is also upper triangular. Let $Q = \begin{bmatrix} u_1 & u_2 & \cdots & u_m \end{bmatrix}$, which is an isometry, and $R = T^{-1}$. Then $A = QR$ is the desired factorization. □
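In practice the QR factorization is computed by library routines rather than by Gram-Schmidt. A Python sketch (illustrative only; note that numpy's convention may produce negative diagonal entries in $R$):

import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
Q, R = np.linalg.qr(A)                    # Q: 3x2 isometry, R: 2x2 upper triangular
print(np.allclose(Q.T @ Q, np.eye(2)))    # True
print(np.allclose(Q @ R, A))              # True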
Theorem A.9 (Schur) For each $A \in \mathbb{F}^{n \times n}$ with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ in any prescribed order, there exists a unitary matrix $U \in \mathbb{C}^{n \times n}$ such that
\[
U^* A U = T
\]
is an upper triangular matrix with diagonal entries $t_{ii} = \lambda_i$.

Proof Let $x_1$ be an eigenvector of $A$ corresponding to eigenvalue $\lambda_1$. Augment $x_1$ to form a basis of $\mathbb{C}^n$: $x_1, x_2, \ldots, x_n$. Let $X_1 = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}$. Carry out the QR factorization $X_1 = Q_1 R_1$ where $Q_1$ is unitary and $R_1$ is upper triangular. Then the first column of $Q_1$ is still an eigenvector of $A$ corresponding to $\lambda_1$. Thus
\[
Q_1^* A Q_1 = \begin{bmatrix} \lambda_1 & * \\ 0 & A_1 \end{bmatrix}
\]
where $A_1 \in \mathbb{C}^{(n-1) \times (n-1)}$ and it has eigenvalues $\lambda_2, \ldots, \lambda_n$. Using the same procedure, we can find a unitary matrix $Q_2 \in \mathbb{C}^{(n-1) \times (n-1)}$ such that
\[
Q_2^* A_1 Q_2 = \begin{bmatrix} \lambda_2 & * \\ 0 & A_2 \end{bmatrix}.
\]
Let
\[
U_2 = \begin{bmatrix} 1 & 0 \\ 0 & Q_2 \end{bmatrix}.
\]
Then
\[
U_2^* Q_1^* A Q_1 U_2 = \begin{bmatrix} \lambda_1 & * & * \\ 0 & \lambda_2 & * \\ 0 & 0 & A_2 \end{bmatrix}.
\]
Continue this process to produce unitary matrices $Q_i \in \mathbb{C}^{(n-i+1) \times (n-i+1)}$ such that
\[
Q_i^* A_{i-1} Q_i = \begin{bmatrix} \lambda_i & * \\ 0 & A_i \end{bmatrix}
\]
and $U_i \in \mathbb{C}^{n \times n}$ with
\[
U_i = \begin{bmatrix} I & 0 \\ 0 & Q_i \end{bmatrix}.
\]
Then the matrix
\[
U = Q_1 U_2 \cdots U_{n-1}
\]
is unitary and $U^* A U$ gives the desired form. □
Theorem A.9 is the basis for numerical eigenvalue computation. Based on Theorem A.9, one can also derive the extremely important singular value decomposition (SVD). The detailed derivation is left as an exercise.

Theorem A.10 (SVD) For $A \in \mathbb{F}^{n \times m}$, there exist unitary matrices $U \in \mathbb{F}^{n \times n}$, $V \in \mathbb{F}^{m \times m}$, and a nonnegative diagonal matrix $S \in \mathbb{R}^{n \times m}$ such that
\[
A = USV^*.
\]
The diagonal entries of $S$ are called the singular values of the matrix $A$ and are usually denoted by $\sigma_1(A), \sigma_2(A), \ldots, \sigma_{\min\{m,n\}}(A)$, ordered nonincreasingly.
A.2.4 Real Symmetric and Hermitian Matrices
Let us also denote $\mathcal{S}^n$ by $\mathcal{H}^n(\mathbb{R})$ and $\mathcal{H}^n$ by $\mathcal{H}^n(\mathbb{C})$. The study of matrices in $\mathcal{H}^n(\mathbb{F})$ is often associated with the study of quadratic forms or quadratic functions $\mathbb{F}^n \to \mathbb{R}$ of the form
\[
q(x) = x^* A x
\]
where $A \in \mathcal{H}^n(\mathbb{F})$. Put another way: there is a one-one correspondence between quadratic forms and Hermitian matrices.

A matrix $A \in \mathcal{H}^n(\mathbb{F})$ is said to be
positive definite if $x^* A x > 0$ for all nonzero $x \in \mathbb{F}^n$;
positive semi-definite if $x^* A x \geq 0$ for all $x \in \mathbb{F}^n$;
negative definite if $x^* A x < 0$ for all nonzero $x \in \mathbb{F}^n$;
negative semi-definite if $x^* A x \leq 0$ for all $x \in \mathbb{F}^n$;
indefinite if it is none of the above.

We use the notation $A > 0$, $A \geq 0$, $A < 0$, $A \leq 0$ to mean $A$ being positive definite, positive semi-definite, negative definite, and negative semi-definite, respectively. The set of all $n \times n$ positive semi-definite matrices is denoted by $\mathcal{P}^n$.

We often use $A \leq B$ to mean $B - A \geq 0$. Then the relation $\leq$ is a partial order in $\mathcal{H}^n(\mathbb{F})$. The partially ordered set $(\mathcal{H}^n(\mathbb{F}), \leq)$, however, is not a lattice, as we have argued.
What is the easiest way to test whether a given matrix $A \in \mathcal{H}^n(\mathbb{F})$ is positive definite, positive semi-definite, negative definite, or negative semi-definite? The following theorem gives an answer.

Theorem A.11 The following statements are equivalent:
1. $A > 0$.
2. $\sigma(A) > 0$.
3. $\det \begin{bmatrix} a_{11} & \cdots & a_{1i} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{ii} \end{bmatrix} > 0$ for all $i = 1, 2, \ldots, n$.
If our purpose is to test whether $A$ is positive semi-definite, can we simply replace all "$>$" in the above theorem by "$\geq$"? The answer is a big no. This is an example where intuition based on continuity is often misleading.
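A small counterexample makes the point. The following Python lines (illustrative only) exhibit a matrix whose leading principal minors are all nonnegative but which is not positive semi-definite:

import numpy as np

A = np.array([[0.0, 0.0],
              [0.0, -1.0]])
print(A[0, 0], np.linalg.det(A))   # both leading principal minors equal 0
print(np.linalg.eigvalsh(A))       # [-1.  0.]: A is not positive semi-definite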
More generally, a matrix $A \in \mathcal{H}^n(\mathbb{F})$ can have eigenvalues distributed over the real line. We define the triple of the numbers of negative eigenvalues, zero eigenvalues, and positive eigenvalues, all counting multiplicity, as the inertia of $A$, denoted by
\[
\iota(A) = \{\iota_-(A), \iota_0(A), \iota_+(A)\}
\]
where $\iota_-(A)$, $\iota_0(A)$, $\iota_+(A)$ are the numbers of negative eigenvalues, zero eigenvalues, and positive eigenvalues of $A$ respectively. For example, an $n \times n$ positive definite matrix $A$ has $\iota(A) = \{0, 0, n\}$. The signature of the matrix $A$ is defined to be $\iota_+(A) - \iota_-(A)$. Also note that the rank of $A$ is $\iota_+(A) + \iota_-(A)$.
In the quadratic form $x^* A x$, if we make a change of variables $x = Pz$ where $P \in GL(n, \mathbb{F})$, then the quadratic form becomes $z^* P^* A P z$. The corresponding matrix changes from $A$ to $P^* A P$. The transformation from $A$ to $P^* A P$ is called a congruence transformation. If $P$ happens to be unitary, i.e., if $P^* = P^{-1}$, then the congruence transformation is also a similarity transformation. This means that a unitary congruence transformation is also a unitary similarity transformation. An immediate consequence of the Schur decomposition theorem is that for each $A \in \mathcal{H}^n(\mathbb{F})$, there exists a $P \in GL(n, \mathbb{F})$ such that
\[
P^* A P = \begin{bmatrix} I_{\iota_+(A)} & 0 & 0 \\ 0 & 0_{\iota_0(A)} & 0 \\ 0 & 0 & -I_{\iota_-(A)} \end{bmatrix}. \tag{A.10}
\]
To see this more clearly, one can first apply a unitary congruence transformation to transform $A$ to a diagonal matrix with the eigenvalues on the diagonal and then follow with another diagonal congruence transformation to normalize the nonzero diagonal elements. The matrix on the right hand side of (A.10) is called a signature matrix, whose trace gives the signature of $A$. One can then immediately show the following Sylvester inertia theorem.
Theorem A.12 $\iota(P^* A P) = \iota(A)$ for each $P \in GL(n, \mathbb{F})$.

This theorem says that the inertia is invariant under congruence transformations. It leads to another important theorem, whose proof is left as an exercise.
Theorem A.13 Let $M = \begin{bmatrix} A & B \\ B^* & C \end{bmatrix} \in \mathcal{H}^{n_1+n_2}$ and let $A$ be invertible. Then $\iota(M) = \iota(A) + \iota(M/_{11})$. In particular, $M > 0$ if and only if $A > 0$ and $M/_{11} > 0$.
Let $f : \mathbb{R}^l \to \mathcal{H}^n(\mathbb{F})$ be an affine map. In particular, $f$ maps $x = [x_1 \ \cdots \ x_l]' \in \mathbb{R}^l$ to
\[
f(x) = H_0 + x_1 H_1 + x_2 H_2 + \cdots + x_l H_l
\]
for some given $H_0, H_1, \ldots, H_l \in \mathcal{H}^n(\mathbb{F})$. An inequality of the form
\[
f(x) \geq 0
\]
is called a Linear Matrix Inequality (LMI). An LMI problem is as follows: given $H_0, H_1, \ldots, H_l \in \mathcal{H}^n(\mathbb{F})$, find $x$ so that the LMI is satisfied, i.e., find $x$ in $f^{-1}(\mathcal{P}^n)$.

Theorem A.14 $f^{-1}(\mathcal{P}^n)$ is a closed convex set.

Because of this theorem and the special structure of LMIs, there are efficient algorithms to solve an LMI problem. Many engineering problems, especially linear system analysis and synthesis problems, can be converted into LMI problems.
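In the simplest cases, feasibility of a candidate $x$ can be checked directly from the smallest eigenvalue of $f(x)$; the actual search over $x$ is done by dedicated interior-point solvers. A minimal Python sketch with hypothetical data $H_0, H_1$ (illustrative only):

import numpy as np

H0 = np.array([[1.0, 0.0], [0.0, -1.0]])   # hypothetical LMI data
H1 = np.array([[0.0, 0.0], [0.0,  1.0]])

def lmi_holds(x1, tol=1e-10):
    # f(x) >= 0 iff the smallest eigenvalue of f(x) is nonnegative.
    return np.linalg.eigvalsh(H0 + x1 * H1).min() >= -tol

print(lmi_holds(0.0), lmi_holds(2.0))   # False True
# Consistent with Theorem A.14: the feasible set of x1 is convex (here [1, oo)).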
Exercises
A.1 Give an example for each of the following algebraic structures: group, ring, field, equivalence relation, poset, linear space, normed linear space, inner product space.
A.2 Show that the posets of real symmetric matrices $(\mathcal{S}^n, \leq)$ and Hermitian matrices $(\mathcal{H}^n, \leq)$ are not lattices when $n > 1$.
A.3 Let $\mathcal{R}, \mathcal{S}, \mathcal{T}$ be subspaces of $\mathbb{F}^n$.
1. Show that in general
\[
\mathcal{R} \cap (\mathcal{S} + \mathcal{T}) \neq \mathcal{R} \cap \mathcal{S} + \mathcal{R} \cap \mathcal{T},
\]
i.e., the lattice of all subspaces of $\mathbb{F}^n$ is not distributive.
2. Show that if $\mathcal{S} \subseteq \mathcal{R}$, then
\[
\mathcal{R} \cap (\mathcal{S} + \mathcal{T}) = \mathcal{S} + \mathcal{R} \cap \mathcal{T},
\]
i.e., the lattice of all subspaces of $\mathbb{F}^n$ is modular.
3. Show that if any one of the following is true, then all are true:
\[
\mathcal{R} \cap (\mathcal{S} + \mathcal{T}) = \mathcal{R} \cap \mathcal{S} + \mathcal{R} \cap \mathcal{T}
\]
\[
\mathcal{S} \cap (\mathcal{R} + \mathcal{T}) = \mathcal{S} \cap \mathcal{R} + \mathcal{S} \cap \mathcal{T}
\]
\[
\mathcal{T} \cap (\mathcal{R} + \mathcal{S}) = \mathcal{T} \cap \mathcal{R} + \mathcal{T} \cap \mathcal{S}.
\]
A.4 Show that $(\mathbb{R}^n/\sim, \prec_w)$ is a lattice. How can the meet and join be computed?
A.5 The space of complex $n \times n$ Hermitian matrices $\mathcal{H}^n$ is a linear space over $\mathbb{R}$. What is its dimension? Find a basis for this linear space.
A.6 Verify (A.1)-(A.4) and also verify that (A.2) and (A.3) become equalities when (A.5) and (A.6) hold respectively.
A.7 Show that the set of invariant subspaces of a linear transformation $A : \mathcal{X} \to \mathcal{X}$ is a lattice, i.e., show that if $\mathcal{U}$ and $\mathcal{V}$ are $A$-invariant, so are $\mathcal{U} \cap \mathcal{V}$ and $\mathcal{U} + \mathcal{V}$.
A.8 Let $A : \mathcal{X} \to \mathcal{X}$ be a linear transformation where $\dim \mathcal{X} = n$.
1. Show that in general
\[
\mathcal{R}(A) + \mathcal{N}(A) \neq \mathcal{X}.
\]
2. Show that the following three statements are equivalent:
(a) $\mathcal{R}(A) + \mathcal{N}(A) = \mathcal{X}$.
(b) $A^{-1}\mathcal{U} = \mathcal{U} + \mathcal{N}(A)$ for all $A$-invariant subspaces $\mathcal{U}$.
(c) $A\mathcal{V} = \mathcal{R}(A) \cap \mathcal{V}$ for all $A$-invariant subspaces $\mathcal{V}$.
3. Show that
\[
\mathcal{R}(A^n) + \mathcal{N}(A^n) = \mathcal{X}
\]
always holds. Hence $(A^n)^{-1}\mathcal{U} = \mathcal{U} + \mathcal{N}(A^n)$ and $A^n \mathcal{V} = \mathcal{R}(A^n) \cap \mathcal{V}$ for all $A$-invariant subspaces $\mathcal{U}$ and $\mathcal{V}$.
A.9 Let $A : \mathcal{X} \to \mathcal{X}$ be a linear transformation and $\mathcal{U} \subseteq \mathcal{X}$ be a subspace. Prove that $A(A^{-1}\mathcal{U}) = \mathcal{U} \cap \mathcal{R}(A)$ and $A^{-1}(A\mathcal{U}) = \mathcal{U} + \mathcal{N}(A)$.
A.10 Let $A, B$ be matrices with the same number of rows. Prove that $\mathcal{R}(A) \subseteq \mathcal{R}(B)$ if and only if $\mathcal{N}(B^*) \subseteq \mathcal{N}(A^*)$. Also show the following equalities:
1. $\mathcal{R}(A)^\perp = \mathcal{N}(A^*)$.
2. $\mathcal{R}(A) = \mathcal{R}(AA^*)$ and $\mathcal{N}(A) = \mathcal{N}(A^*A)$.
A.11 Verify (A.7)-(A.9).
A.12 Prove or disprove that a continuous function $f : \mathcal{X} \to \mathcal{Y}$, where $\mathcal{X}$ and $\mathcal{Y}$ are real normed linear spaces, satisfying $f(x_1 + x_2) = f(x_1) + f(x_2)$ for all $x_1, x_2 \in \mathcal{X}$, is a linear transformation.
A.13 Prove the Pythagorean theorem: If $x, y \in \mathbb{R}^n$ and $x \perp y$, then $\|x\|_2^2 + \|y\|_2^2 = \|x + y\|_2^2$.
A.14 Prove that a norm in a linear space satisfying the parallelogram law
\[
\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2
\]
is induced by an inner product. (Hint: the proof for a linear space over $\mathbb{C}$ is slightly different from that for a linear space over $\mathbb{R}$. Prove it for both cases.)
A.15 Assume compatibility and existence of all involved inverses.
1. Show all identities in Theorem A.3.
2. Show that
\[
(A - BD^{-1}C)^{-1} = A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1}.
\]
A.16 Consider the star product in $\mathbb{F}^{2n \times 2n}$:
\[
A \star B = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \star \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} \mathcal{F}_l(A, B_{11}) & A_{12}(I - B_{11}A_{22})^{-1}B_{12} \\ B_{21}(I - A_{22}B_{11})^{-1}A_{21} & \mathcal{F}_u(B, A_{22}) \end{bmatrix}.
\]
1. The star identity is a matrix $J$ such that
\[
J \star A = A \star J = A
\]
for all $A$. Find $J$.
2. The star inverse of $A$ is a matrix $A^\star$, if it exists, such that
\[
A \star A^\star = A^\star \star A = J.
\]
Find $A^\star$.
A.17 Let $A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$ and $X$ be unitary constant matrices with $I - A_{22}X$ invertible. Show that $\mathcal{F}_l(A, X)$ is also a unitary matrix.
A.18 A matrix $A$ is said to be normal if $A^*A = AA^*$.
1. Show that Hermitian matrices, skew-Hermitian matrices, and unitary matrices are normal.
2. Show that a normal matrix is diagonalizable by a unitary similarity transformation (using the Schur theorem).
3. If an $n \times n$ normal matrix $A$ has eigenvalues $\lambda_1, \ldots, \lambda_n$, what are its singular values?
A.19 Prove the following statements:
1. The eigenvalues of a nilpotent matrix are all zero.
2. The eigenvalues of a unitary matrix have unit absolute values.
3. The eigenvalues of a Hermitian matrix are real.
4. The eigenvalues of a skew-Hermitian matrix are imaginary.
A.20 A matrix $A \in \mathbb{F}^{n \times n}$ is said to be idempotent if $A^2 = A$.
1. Show that an idempotent matrix is diagonalizable.
2. Show that if $A \in \mathbb{F}^{n \times n}$ is idempotent, then $\mathcal{N}(A) \oplus \mathcal{R}(A) = \mathbb{F}^n$.
3. What are the possible eigenvalues of an idempotent matrix?
A.21 Two matrices $A, B \in \mathbb{C}^{n \times n}$ are said to be simultaneously diagonalizable if there exists $P \in GL(n, \mathbb{C})$ such that $P^{-1}AP$ and $P^{-1}BP$ are diagonal. They are said to commute if $AB = BA$.
1. Show that if $A, B$ are simultaneously diagonalizable, then they commute.
2. Show that if $A, B$ are diagonalizable and commute, then they are simultaneously diagonalizable. (Hint: One may first prove the case when all eigenvalues of $A$ are distinct and then extend to the general case.)
A.22 Use the Schur decomposition theorem to show the following:
1. For $A \in \mathbb{F}^{n \times m}$, there exist unitary matrices $U$, $V$, and a nonnegative diagonal matrix $S$ such that
\[
A = USV^*.
\]
(This decomposition is called the singular value decomposition (SVD).)
2. For each $A \in \mathbb{F}^{n \times n}$, there exist a positive semidefinite matrix $P$ and a unitary matrix $U$ such that
\[
A = PU.
\]
(This decomposition is called the polar decomposition.)
3. A positive definite matrix $A$ can be factorized as $A = T^*T$ where $T$ is upper triangular. (This factorization is called the Cholesky factorization.)
4. For a $2n \times 2n$ unitary matrix $W$, there exist $n \times n$ unitary matrices $U_1, U_2, V_1, V_2$ such that
\[
W = \begin{bmatrix} U_1 & 0 \\ 0 & U_2 \end{bmatrix} \begin{bmatrix} C & S \\ -S & C \end{bmatrix} \begin{bmatrix} V_1^* & 0 \\ 0 & V_2^* \end{bmatrix}
\]
where
\[
C = \begin{bmatrix} \cos\theta_1 & & \\ & \ddots & \\ & & \cos\theta_n \end{bmatrix}, \quad S = \begin{bmatrix} \sin\theta_1 & & \\ & \ddots & \\ & & \sin\theta_n \end{bmatrix}
\]
for some $\theta_i \in [0, \frac{\pi}{2}]$, $i = 1, \ldots, n$. (This decomposition is called the CS decomposition.)
A.23 Let $A$ be a $2n \times 2n$ reversed diagonal matrix with reversed diagonal entries $d_1, \ldots, d_{2n} \in \mathbb{R}$ ordered from top-right to bottom-left, i.e., the $(i, 2n+1-i)$ entry of $A$ is $d_i$ and all other entries are zero. What are the eigenvalues and singular values of $A$?
A.24 For $A \in \mathbb{C}^{m \times n}$ with singular values $\sigma_1(A), \sigma_2(A), \ldots, \sigma_{\min\{m,n\}}(A)$, what are the eigenvalues of
\[
\begin{bmatrix} 0 & A \\ A^* & 0 \end{bmatrix}?
\]
A.25 Let $A \in \mathbb{C}^{n \times n}$.
1. Show that there exist a diagonalizable matrix $D$ and a nilpotent matrix $N$ such that $A = D + N$.
2. Show that the generalized eigenspace of $A$ corresponding to eigenvalue $\lambda_i$ is equal to $\mathcal{N}(D - \lambda_i I)$.
3. Show that $\mathcal{N}(A - \lambda_i I) = \mathcal{N}(D - \lambda_i I) \cap \mathcal{N}(N)$.
A.26 Write a MATLAB program to implement the LSFF algorithm. The input of the program should be the matrix $A$ and the output should be a vector containing the coefficients of $c_A$ and a three dimensional array containing the coefficients of $\mathrm{Adj}(zI - A)$. Use your program to compute $c_A$ and $\mathrm{Adj}(zI - A)$ for
\[
A = \begin{bmatrix} -4 & -6 & -4 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}.
\]
A.27 Familiarize yourself with MATLAB commands qr, schur, svd, polar, and
chol.
A.28 Show the following:
1. Let $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{n \times p}$. Then there exists $C \in \mathbb{R}^{p \times m}$ such that $BC = A$ if and only if $\mathcal{R}(A) \subseteq \mathcal{R}(B)$.
2. Let $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{p \times m}$. Then there exists $C \in \mathbb{R}^{n \times p}$ such that $CB = A$ if and only if $\mathcal{N}(B) \subseteq \mathcal{N}(A)$.
A.29 Show that the set of complex Hermitian matrices $\mathcal{H}^n$ is not a linear space over $\mathbb{C}$, but rather a linear space over $\mathbb{R}$. What is its dimension? Find an orthonormal basis under the standard matrix inner product $\langle A, B \rangle = \mathrm{tr}(A^*B)$.
A.30 Prove Theorem A.13.
A.31 Use Theorem A.13 to prove Theorem A.11.
A.32 Show the following:
1. $\det(I + AB) = \det(I + BA)$;
2. $\mathrm{tr}(AB) = \mathrm{tr}(BA)$.
A.33 Consider the linear equation
\[
Ax = b
\]
where $A \in \mathbb{F}^{m \times n}$, $b \in \mathbb{F}^m$, and $x \in \mathbb{F}^n$. If $A$ is square and nonsingular, we can get the unique solution of the equation easily as $x = A^{-1}b$. However, we are often interested in the case when $A$ is rectangular or singular.
1. Assume $\mathrm{rank}\,A = m$. Then there may be infinitely many solutions. Show that the minimum norm solution, i.e., the solution with the smallest 2-norm, is $A^*(AA^*)^{-1}b$.
2. Assume $\mathrm{rank}\,A = n$. Then there may be no solution. Show that the so-called least squares solution, i.e., the $x$ which minimizes $\|Ax - b\|_2$, is $(A^*A)^{-1}A^*b$.
3. In general, we are interested in the minimum norm least squares solution, which is the one with the smallest 2-norm among all $x$ that minimize $\|Ax - b\|_2$. Find the minimum norm least squares solution of the linear equation.
4. For the linear equation
\[
\begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix} x = \begin{bmatrix} 1 \\ 2 \end{bmatrix},
\]
find its minimum norm least squares solution.
A.34 Let the singular values of $A \in \mathbb{C}^{m \times n}$ be $\sigma_1(A), \sigma_2(A), \ldots, \sigma_{\min\{m,n\}}(A)$. Prove that
\[
\|A\|_F = \sqrt{\sum_{i=1}^{\min\{m,n\}} \sigma_i^2(A)}.
\]
A.35 Show that for each $A \in \mathcal{P}^n$, there exists a unique $B \in \mathcal{P}^n$ such that $A = B^2$. This $B$ is called the square root of $A$, denoted by $A^{1/2}$. For
\[
A = \begin{bmatrix} 2 & \sqrt{2} \\ \sqrt{2} & 3 \end{bmatrix},
\]
compute $A^{1/2}$.
A.36 Show that for $A, B \in \mathcal{H}^n(\mathbb{F})$ with $A > 0$, there exists $P \in GL(n, \mathbb{F})$ such that $P^*AP$ and $P^*BP$ are simultaneously diagonal.
A.37 Prove the following statements about matrices $A, B \in \mathcal{H}^n(\mathbb{F})$:
1. If either $A > 0$ or $B > 0$, then the eigenvalues of $AB$ are real.
2. If both $A > 0$ and $B > 0$, then the eigenvalues of $AB$ are positive.
A.38 The linear space $\mathbb{R}^{n \times n}$ has an inner product $\langle A, B \rangle = \mathrm{tr}(A'B)$ and has two subspaces:
\[
\mathcal{S}^n = \{A \in \mathbb{R}^{n \times n} : A' = A\}
\]
\[
\mathcal{T}^n = \{A \in \mathbb{R}^{n \times n} : A' = -A\}
\]
which are called the symmetric matrix space and the skew-symmetric matrix space.
1. Show that $\mathcal{S}^n \perp \mathcal{T}^n$ and $\mathcal{S}^n \oplus \mathcal{T}^n = \mathbb{R}^{n \times n}$.
2. Let $L$ be the linear transformation on $\mathbb{R}^{n \times n}$ mapping $X$ to $A'XA - X$. Show that $\mathcal{S}^n$ and $\mathcal{T}^n$ are invariant subspaces of $L$.
A.39 The spectral radius of $A \in \mathbb{F}^{n \times n}$, denoted by $\rho(A)$, is the maximum modulus of all its eigenvalues. Show that $\|A\|_p \geq \rho(A)$, where $\|\cdot\|_p$ is the induced matrix $p$-norm. Also show that for each $\epsilon > 0$, there exists a $P \in GL(n, \mathbb{F})$ such that $\|P^{-1}AP\|_2 < \rho(A) + \epsilon$.