Professional Documents
Culture Documents
A Solution Manual
Georg Muntingh and Christian Schulz
Preface
This solution manual gradually appeared over the course of several years, when first
Christian Schulz (20052008) and then I (20092013) were guiding the exercise sessions
of a numerical linear algebra course at the University of Oslo.
We would like to thank Tom Lyche for providing solutions to some of the exercises. Several students contributed by pointing out mistakes and improvements to the
exercises and solutions. Any remaining mistakes are, of course, our own.
Blindern, November 2013,
Georg Muntingh.
Contents
Preface
Chapter 0. Preliminaries
Exercise 0.24: The AT A inner product
Exercise 0.25: Angle between vectors in complex case
Exercise 0.41: The inverse of a general 2 2 matrix
Exercise 0.42: The inverse of a 2 2 matrix
Exercise 0.43: Sherman-Morrison formula
Exercise 0.44: Cramers rule; special case
Exercise 0.45: Adjoint matrix; special case
Exercise 0.47: Determinant equation for a plane
Exercise 0.48: Signed area of a triangle
Exercise 0.49: Vandermonde matrix
Exercise 0.50: Cauchy determinant
Exercise 0.51: Inverse of the Hilbert matrix
1
1
1
1
1
2
2
2
2
3
3
4
5
7
7
7
7
8
8
9
9
9
9
9
10
10
10
10
Chapter 2. LU Factorizations
Exercise 2.3: Column oriented backsolve (TODO)
Exercise 2.6: Computing the inverse of a triangular matrix
Exercise 2.15: Row interchange
Exercise 2.16: LU of singular matrix
Exercise 2.17: LU and determinant
Exercise 2.18: Diagonal elements in U
Exercise 2.20: Finite sums of integers
Exercise 2.21: Operations
11
11
11
11
12
12
12
13
14
ii
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
2.22:
2.30:
2.39:
2.61:
2.62:
2.63:
2.67:
14
14
14
15
15
15
15
16
16
16
17
17
18
18
19
20
22
22
22
23
23
24
25
25
26
26
27
28
28
28
28
28
29
29
29
30
30
30
30
30
31
31
31
32
32
32
iii
32
33
34
34
35
35
36
36
37
37
37
38
40
40
40
40
41
42
42
42
43
43
43
43
44
44
45
45
47
48
48
48
49
50
50
iv
50
50
51
51
51
52
52
53
55
55
56
56
56
57
57
58
58
59
59
61
62
62
63
64
64
64
65
65
65
66
66
67
68
69
69
69
70
71
72
72
72
72
73
73
73
74
75
75
75
76
76
77
77
79
79
79
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
vi
79
80
80
80
81
81
82
82
83
84
85
85
85
CHAPTER 0
Preliminaries
Exercise 0.24: The AT A inner product
Assume that A Rmn has linearly independent columns. We show that h, iA :
(x, y) 7 xT AT Ay satisfies the axioms of an inner product on a real vector space
V, as described in Definition 0.20. Let x, y, z V and a, b R, and let h, i be the
standard inner product on V.
Positivity. One has hx, xiA = xT AT Ax = hAx, Axi 0, with equality holding
if and only if Ax = 0. Since Ax is a linearly combination of the columns of A with
coefficients the entries of x, and since the columns of A are assumed to be linearly
independent, one has Ax = 0 if and only if x = 0.
Symmetry. One has hx, yiA = xT AT Ay = (xT AT Ay)T = yT AT Ax = hy, xiA .
Linearity. One has hax + by, ziA = (ax + by)T AT Az = axT AT Az + byT AT Az =
ahx, ziA + bhy, ziA .
Exercise 0.25: Angle between vectors in complex case
By the Cauchy-Schwarz inequality for a complex inner product space,
|hx, yi|
1.
0
kxkkyk
Note that taking x and y perpendicular yields zero, taking x and y equal yields one,
and any value in between can be obtained by picking an appropriate affine combination
of these two cases.
Since the cosine decreases monotonously from one to zero on the interval [0, /2],
there is a unique argument [0, /2] such that
cos =
|hx, yi|
.
kxkkyk
1 3 1 2
= 0.
x2 =
/
2 6 2 1
2
A = 3
6
matrix
6 3
2 6 .
3
2
1+2 3
1+1 2 6
(1)
(1) 3
6
2
T
2+2 2
2+1 6 3
(1)
adjA = (1)
6
3 2
6
3
(1)3+2 2
(1)3+1
3
2 6
T
14
21 42
= 42 14 21 .
21 42 14
6
1+3 3 2
(1)
6 3
2
3
2+3 2 6
(1)
6 3
2
3
3+3 2 6
(1)
3 2
6
x y z 1 a
0
x1 y1 z1 1 b 0
x2 y2 z2 1 c = 0 .
x3 y3 z3 1
d
0
2
Since the coordinates a, b, c, d of the plane are not all zero, the above matrix is singular, implying that its determinant is zero. Computing this determinant by cofactor
expansion of the first row gives the equation
y1 z1 1
x1 z1 1
x1 y1 1
x1 y1 z1
+ y2 z2 1 x x2 z2 1 y + x2 y2 1 z x2 y2 z2 = 0
y3 z3 1
x3 z3 1
x3 y3 1
x3 y3 z3
of the plane.
Exercise 0.48: Signed area of a triangle
Let T denote the triangle with vertices P1 , P2 , P3 . Since the area of a triangle is
invariant under translation, we can assume P1 = A = (0, 0), P2 = (x2 , y2 ), P3 = (x3 , y3 ),
B = (x3 , 0), and C = (x2 , 0). As is clear from Figure 2, the area A(T ) can be expressed
as
A(T ) = A(ABP3 ) + A(P3 BCP2 ) A(ACP2 )
1
1
1
= x3 y3 + (x2 x3 )y2 + (x2 x3 )(y3 y2 ) x2 y2
2
2
2
1 1 1
1
= 0 x2 x3 ,
2 0 y y
2
3
which is what needed to be shown.
Exercise 0.49: Vandermonde matrix
For any n = 1, 2, . . ., let
1 x1 x21
1 x2 x22
2
Dn := 1 x3 x3
..
... ...
.
1 x x2
n
n
x1n1
xn1
2
n1
x3
..
..
.
.
n1
x
n
be the determinant of the Vandermonde matrix in the Exercise. Clearly the formula
Y
(?)
DN =
(xi xj )
1j<iN
holds for N = 1 (in which case the product is empty and defined to be 1) and N = 2.
Let us assume (?) holds for N = n 1 > 2. Since the determinant is an alternating
multilinear form, adding a scalar multiple of one column to another does not change
the value of the determinant. Subtracting xkn times column k from column k + 1 for
k = n 1, n 2, . . . , 1, we find
1 x1 xn x21 x1 xn x1n1 x1n2 xn
1 x2 xn x22 x2 xn x2n1 x2n2 xn
n1
n2
2
Dn = 1 x3 xn x3 x3 xn x3 x3 xn .
..
..
..
..
...
.
.
.
.
1 x x x2 x x xn1 xn2 x
n
n
n n
n
n
n
n
3
xn1 xn
x
x
n1
n
n1
(xi xj ).
1j<in
A = (ai,j )i,j =
1
i + j
i,j
1
+
= 2. 1
..
1
n +1
ci,j =
Qn
n
Y
k=1 (i
1 +2
1
2 +2
..
.
1
n +2
...
1
1 +n
1
2 +n
1
n +n
..
.
(i + k ).
k=1
k6=j
1i<jn
for some constant k. To determine k, we can evaluate det C at a particular value, for
instance any {i , j }i,j satisfying 1 + 1 = = n + n = 0. In that case C becomes
a diagonal matrix with determinant
det C =
n Y
n
Y
(i + k ) =
i=1 k=1
k6=i
n Y
n
Y
(i k ) =
i=1 k=1
k6=i
(i k )
1i<kn
Y
1i<kn
(k i ).
det A =
1i<jn
1i<jn
n
Y
(i + j )
i,j=1
(b) Deleting row l and column k from A, results in the matrix Al,k associated to
the vectors [1 , . . . , l1 , l+1 , . . . , n ] and [1 , . . . , k1 , k+1 , . . . , n ]. By the adjoint
formula for the inverse A1 = (bk,l ) and by (??),
det Al,k
bk,l := (1)k+l
det A
n
Y
Y
Y
(i + j )
(i j )
(i j )
i,j=1
= (1)
k+l
n
Y
1i<jn
i,j6=l
1i<jn
i,j6=k
(i + j )
i,j=1
i6=l
j6=k
n
Y
(i j )
1i<jn
(i j )
1i<jn
n
Y
(s + l )
(s + k )
= (l + k )
s=1
s6=l
n
Y
s=1
s6=k
n
Y
s=1
s6=l
n
Y
s=1
s6=k
n
Y
s6=l
s6=k
(s l )
= (l + k )
s + k
l
s=1 s
(s k )
s + l
,
s
k
s=1
tni,j
= (i + j 1)
n
n
Y
s+i1Ys+j1
s=1
s6=j
sj
si
s=1
s6=i
1 i, j n.
for i = 1, 2, . . . .
Clearly (?) holds when i = j = 1. Suppose that (?) holds for some (i, j). Then
n
n
Y
s+1+i1 Y s+j1
tni+1,j = (i + j)
sj
s1i
s=1
s=1
s6=j
s6=i+1
n+1
Y
n
Y
(s + i 1) (s + j 1)
= (i + j)
1
s=2
n
2
(i + j) Y
(s j)
s=1
n1
Y
(s i)
s=1
s6=j
s=0
s6=i
n
n
Y
Y
(s + i 1) (s + j 1)
s=1
s=1
s6=i
n
Y
(s i)
s=1
s6=i
n
n
Y
Y
1 i2 n2
s+i1
s+j1
=
(i + j 1)
(i + j 1)
2
i+j i
sj
si
s=1
s=1
s6=j
s6=i
1 i n
f (i)f (j)
i + j i2
f (i + 1)f (j)
,
=
(i + 1) + j 1
so that (?) holds for (i + 1, j). Carrying out a similar calculation for (i, j + 1), or using
the symmetry of Tn , we conclude by induction that (?) holds for any i, j.
=
CHAPTER 1
1 1 1
1 1 3
2 8 3
Example 1.1
1
1
0
1
1
0
1 1 1
1 1
2 2
0 2
2
6 5 1
0 0
[A|b], yields
1 1
2
2 .
1 7
k = 1, . . . , n 1.
This clearly holds for k = 1, since A is strictly diagonally dominant. Assume (?) holds
for some k satisfying 1 k n 2. Using (1.4), the reverse triangle inequality, the
induction hypothesis, and strict diagonal dominance of A,
a
c
k
k
|dk+1 | |ck | |ak | |dk+1 | |ak | > |ck+1 |,
|uk+1 | = dk+1
uk
|uk |
and it follows by induction that an LU factorization exists. Moreover, since the LU
factorization is completely determined by (1.4), it is unique.
Exercise 1.9: LU factorization of 2nd derivative matrix
Let L = (lij )ij , U = (rij )ij and T be as in the exercise. Clearly L is unit lower triangular
and U is upper triangular. We compute the product LU by separating cases for its
entries. There are several ways to carry out and write down this computation, some
more precise than others. For instance,
(LU)11 = 1 2 = 2;
i1
i+1
1 + 1
= 2,
i
i
i1
i
=
= 1,
i
i1
= 1 1 = 1,
(LU)ii =
for i = 2, . . . , m;
(LU)i,i1
for i = 2, . . . , m;
(LU)i1,i
for i = 2, . . . , m;
for |i j| 2.
(LU)ij = 0,
It follows that T = LU is an LU factorization.
7
for 1 j i m.
In order to show that S = T1 , we multiply S by T and show that the result is the
identity matrix. To simplify notation we define sij := 0 whenever i = 0, i = m + 1,
j = 0, or j = m + 1. With 1 j < i m, we find
m
X
si,k Tk,j = si,j1 + 2si,j si,j+1
ST i,j =
k=1
= 1
ST
j,i
m
X
i
m+1
(j + 1 + 2j j 1) = 0,
k=1
i
i+1
j+2 1
j 1
j
m+1
m+1
i 1 2i + i + 1
= j + 2j j + j
= 0,
m+1
m
X
ST i,i =
si,k Tk,i = si,i1 + 2si,i si,i+1
i1
= 1
m+1
k=1
= 1
i
m+1
(i 1) + 2 1
i
m+1
i+1
i 1
m+1
i=1
i = 3(i1 + i ) = 3
yi+1 yi1
,
h
It follows that
!
XX
i
aij ei eTj
XX
i
kl
aij ei eTj
kl
XX
i
aij ik jl = akl
(B)ij = A A
ij
n
X
k=1
k=1
ij
n
X
aik bjk =
k=1
n
X
a:k bT:k
ij
k=1
Exercise 1.18: System with many right hand sides; compact form
Let A, B, and X be as in the Exercise.
(=): Suppose AX = B. Multiplying this equation from the right by ej yields
Ax.j = b.j for j = 1, . . . , p.
(=): Suppose Ax.j = b.j for j = 1, . . . , p. Let I = Ip denote the identity matrix.
Then
AX = AXI = AX[e1 , . . . , ep ] = [AXe1 , . . . , AXep ]
= [Ax.1 , . . . , Ax.p ] = [b.1 , . . . , b.p ] = B.
Exercise 1.19: Block multiplication example
The product AB of two matrices A and B is defined precisely when the number of
columns of A is equal to the number of rows of B. For both sides in the equation
AB = A1 B1 to make sense, both pairs (A, B) and (A1 , B1 ) need to be compatible in
this way. Conversely, if the number of columns of A equals the number of rows of B
and the number of columns of A1 equals the number of rows of B1 , then there exists
integers m, p, n, and s with 1 s p such that
A Cm,p , B Cp,n , A1 Cm,s , A2 Cm,ps , B1 Cs,n .
Then
(AB)ij =
p
X
k=1
aik bkj =
s
X
k=1
p
X
aik bkj +
k=s+1
aT
1 0T
aT B1
CAB =
=
=
.
0 C1 0 A1 0 B1
0 C1 A1 0 B1
0 C1 A1 B1
10
CHAPTER 2
LU Factorizations
Exercise 2.3: Column oriented backsolve (TODO)
0
b1k
a11 a1,k a1,k+1 a1,n
..
..
..
..
...
...
.. .
.
.
.
.
0
ak,k ak,k+1 ak,n bkk
= 1 .
ak+1,k+1 ak+1,n 0
. 0
..
...
.. .
0
.
..
an,n
0
0
Evaluating the upper block matrix multiplication then yields (2.4). Solving the above
system thus yields the kth column of B.
Performing this block multiplication for k = n, n 1, . . . , 1, we see that the computations after step k only use the first k 1 leading principal submatrices of A. It
follows that the column bk computed at step k can be stored in row (or column) k of
A without altering the remaining computations.
Exercise 2.15: Row interchange
Suppose we are given an LU factorization
1 1
1 0 u11 u12
=
.
0 1
l21 1
0 u22
Carrying out the matrix multiplication on the right hand side, one finds that
1 1
u11
u12
=
,
0 1
l21 u11 l21 u12 + u22
11
implying that u11 = u12 = 1. It follows that necessarily l21 = 0 and u22 = 1, and the
pair
1 0
1 1
L=
,
U=
0 1
0 1
1 1
is the only possible LU factorization of the matrix
. One directly checks that
0 1
this is indeed an LU factorization.
Exercise 2.16: LU of singular matrix
Suppose we are given an LU factorization
1 1
1 0 u11 u12
=
.
1 1
l21 1
0 u22
Carrying out the matrix multiplication on the right hand side, one finds that
1 1
u11
u12
=
,
1 1
l21 u11 l21 u12 + u22
implying that u11 = u12 = 1. It follows that necessarily l21 = 1/u11 = 1 and u22 =
1 l21 u12 = 0, and the pair
1 0
1 1
L=
,
U=
1 1
0 0
1 1
. One directly checks that
is the only possible LU factorization of the matrix
1 1
this is indeed an LU factorization.
Exercise 2.17: LU and determinant
Suppose A has an LU factorization A = LU. Then, by Lemma 2.11, A[k] = L[k] U[k]
is an LU factorization for k = 1, . . . , n. By induction, the cofactor expansion of the
determinant yields that the determinant of a triangular matrix is the product of its
diagonal entries. One therefore finds that det(L[k] ) = 1, det(U[k] ) = u11 ukk and
det(A[k] ) = det(L[k] U[k] ) = det(L[k] ) det(U[k] ) = u11 ukk
for k = 1, . . . , n.
Exercise 2.18: Diagonal elements in U
From Exercise 2.17, we know that det(A[k] ) = u11 ukk for k = 1, . . . , n. Since A
is nonsingular, its determinant det(A) = u11 unn is nonzero. This implies that
det(A[k] ) = u11 ukk 6= 0 for k = 1, . . . , n, yielding a11 = u11 for k = 1 and a
well-defined quotient
det(A[k] )
u1,1 uk1,k1 uk,k
=
= uk,k ,
det(A[k1] )
u1,1 uk1,k1
for k = 2, . . . , n.
12
1 xm+1
.
1x
Then
Pm0 =
1 (m + 1)xm + mxm+1
,
(x 1)2
1 + 2 + + m = Pm0 (1)
1 (m + 1)xm + mxm+1
x1
(x 1)2
m(m + 1)xm1 + m(m + 1)xm
= lim
x1
2(x 1)
1
= m(m + 1),
2
establishing (2.10). In addition it follows that
= lim
1 + 3 + + 2m 1 =
m
X
(2k 1) = m + 2
m
X
k = m + m(m + 1) = m2 ,
k=1
k=1
which establishes (2.12). Next, applying lHopitals rule three times, we find that
1 2 + 2 3 + + (m 1) m = Pm00 (1)
is equal to
2 + (m2 + m)xm1 + 2(1 m2 )xm + (m2 m)xm+1
x1
(x 1)3
2
m2
(m 1)(m + m)x
+ 2m(1 m2 )xm1 + (m + 1)(m2 m)xm
= lim
x1
3(x 1)2
2
m3
(m 2)(m 1)(m + m)x
+ 2(m 1)m(1 m2 )xm2 + m(m + 1)(m2 m)xm1
= lim
x1
6(x 1)
1
= (m 1)m(m + 1),
3
lim
1 + 2 + + m =
m
X
k=1
k =
m
X
(k 1)k + k =
k=1
m
X
k=1
(k 1)k +
m
X
k=1
1
1
1
1
= (m 1)m(m + 1) + m(m + 1) = (m + 1)(m + )m,
3
2
3
2
which establishes (2.11).
13
n
X
i=1
j=1
i + i(i + 1) + (n i)(2i 1) =
i=1
j=i+1
n
X
i2 + 2ni n + i
i=1
= n2 + (2n + 1)
n
X
i=1
n
X
i2
i=1
1
1
= n2 + n(n + 1)(2n + 1) n(n + 1)(2n + 1)
2
6
1
2
1
1
= n2 + n(n + 1)(2n + 1) = n3 + n = n(2n2 + 1)
3
3
3
3
arithmetic operations. A similar calculation gives the same result for the product BA.
Exercise 2.30: Making a block LU into an LU (TODO)
15
CHAPTER 3
4 1 1
0
1
4
0 1
.
1
0
4 1
0 1 1
4
P
In every row i, one has |aii | = 4 > 2 = | 1| + | 1| + |0| = j6=i |aij |. In other words,
A is strictly diagonally dominant.
= . 21
.
.
.
.
..
..
.
..
..
..
..
..
..
.
.
.
(A)br1 (A)br2 (A)brs
Abr1 Abr2 Abrs
The identity (A1 + A2 ) B = (A1 B) + (A2 B) follows from
..
..
..
..
.
.
.
.
(A1 + A2 )br1 (A1 + A2 )br2 (A1 + A2 )brs
=
..
..
..
..
.
.
.
.
A1 br1 + A2 br1 A1 br2 + A2 br2 A1 brs + A2 brs
A2 br1 A2 br2
A1 brs
A2 brs
follows from
Ab11 Ab1s
.. C
= ...
.
Abr1 Abrs
.
.
..
=
.
Ab c
Ab1s ct1
11 t1
.
..
..
.
Abr1 ct1 Abrs ct1
..
=A
.
b c
11 t1 b1s ct1
.
..
..
.
br1 ct1
Bc11
..
=A
.
Bct1
Ab11 c1u
..
.
Abr1 c1u
..
.
Ab11 ctu
..
.
Abr1 ctu
b11 c1u
..
.
br1 c1u
..
.
b11 ctu
..
.
br1 ctu
brs ct1
Bc1u
.. .
.
Bctu
Ab1s c1u
..
Abrs c1u
Ab1s ctu
..
.
Abrs ctu
b1s c1u
..
.
brs c1u
b1s ctu
..
.
brs ctu
again because |cos(x)| < 1 for any x (0, ). Since, in addition, M is symmetric,
Lemma 2.41 implies that M is symmetric positive definite.
Exercise 3.13: Eigenvalues 2 2 for 2D test matrix
One has
1
2d + 2a
1
2d a a 0
1
a 2d 0 a 1 2d + 2a
Ax =
a 0 2d a 1 = 2d + 2a = (2d + 2a) 1 = x,
1
2d + 2a
0 a a 2d 1
0 0 0 0
v00 v01 v02 v03
v10
0
v13
,
= 0
v20
0
v23 0
0 0 0 0
v30 v31 v32 v33
leaving four equations
to determine the interior points v11 , v12 , v21 , v22 . As 6h2 /12 =
1/ 2(m + 1)2 = 1/18 for m = 2, we obtain
20v11 4v01 4v10 4v21 4v12 v00 v20 v02 v22
1
= (8f11 + f01 + f10 + f21 + f12 ),
18
20v21 4v11 4v20 4v31 4v22 v10 v30 v12 v32
1
= (8f21 + f11 + f20 + f31 + f22 ),
18
20v12 4v02 4v11 4v22 4v13 v01 v21 v03 v23
1
= (8f12 + f02 + f11 + f22 + f13 ),
18
20v22 4v12 4v21 4v32 4v23 v11 v31 v13 v33
1
= (8f22 + f12 + f21 + f32 + f23 ),
18
Using the values known from the boundary condition, these equations can be simplified
to
1
20v11 4v21 4v12 v22 = (8f11 + f01 + f10 + f21 + f12 ),
18
18
1
(8f21 + f11 + f20 + f31 + f22 ),
18
1
20v12 4v11 4v22 v21 = (8f12 + f02 + f11 + f22 + f13 ),
18
1
20v22 4v12 4v21 v11 = (8f22 + f12 + f21 + f32 + f23 ).
18
(b) For f (x, y) = 2 2 sin(x) sin(y), one finds
Substituting these
20 4
4 20
4 1
1 4
Solving this system we find that v11 = v12 = v21 = v22 = 5 2 /66.
Exercise 3.15: Matrix equation for nine point scheme
(a) Let
2 1 0
1 2 1
1 2 1
0 1 2
v11 v1m
..
...
V = ...
.
vm1 vmm
for j, k = 0, . . . , m + 1,
for (s, t) ,
for (s, t) .
where A := (T I + I T)2 ,
for i = 1, . . . , m.
It follows that A = M2 has eigenpairs (i + j )2 , si sj , for i, j = 1, . . . , m. (Note
that we can verify this directly by multiplying A by si sj and using the mixed product
rule.) Since the i are positive, the eigenvalues of A are positive. We conclude that A
is symmetric positive definite.
20
21
CHAPTER 4
(j1)(k1)
N
,
2
i
N
N := e
= cos
2
N
i sin
2
N
.
1 1
1
1
1 i 1 i
F4 =
1 1 1 1 .
1 i 1 i
Computing the transpose and Hermitian transpose gives
1 1
1
1
1 1
1
1
1 i 1 i
1 i 1 i
H
= F4 ,
F
=
FT4 =
4
1 1 1 1 6= F4 ,
1 1 1 1
1 i 1 i
1 i 1 i
which is what needed to be shown.
Exercise 4.6: Sine transform as Fourier transform
According to Lemma 4.2, the Discrete Sine Transform can be computed from the
Discrete Fourier Transform by (Sm x)k = 2i (F2m+2 z)k+1 , where
z = [0, x1 , . . . , xm , 0, xm , . . . , x1 ]T .
For m = 1 this means that
i
and S1 x1 = (F4 z)2 .
2
1
1
Since h = m+1 = 2 for m = 1, computing the DST directly gives
S1 x1 = sin(h)x1 = sin
x 1 = x1 ,
2
while computing the Fourier transform gives
1 1
1
1
0
0
1 i 1 i x1 2ix1
F4 z =
1 1 1 1 0 = 0 .
1 i 1 i
x1
2ix1
z = [0, x1 , 0, x1 ]T
m X
m
X
m X
m X
m X
m
X
sip spk slr srj
fkl
p + r
m X
m X
m X
m
ip
pk
rj
lr
X
sin
sin
sin
sin
m+1
m+1
m+1
m+1
= h4
fkl ,
p
r
sin2 2(m+1)
+ sin2 2(m+1)
p=1 r=1 k=1 l=1
p=1 r=1 k=1 l=1
p=1 r=1
TV + VT = h2 F.
Let T = SDS1 be the orthogonal diagonalization of T from Equation (4.4), and write
X = VS and C = h2 FS.
(a) Multiplying Equation (?) from the right by S, one obtains
TX + XD = TVS + VSD = TVS + VTS = h2 FS = C.
(b) Writing C = [c1 , . . . , cm ], X = [x1 , . . . , xm ] and applying the rules of block
multiplication, we find
[c1 , . . . , cm ] =
=
=
=
=
=
C
TX + XD
T[x1 , . . . , xm ] + X[1 e1 , . . . , m em ]
[Tx1 + 1 Xe1 , . . . , Txm + m Xem ]
[Tx1 + 1 x1 , . . . , Txm + m xm ]
[(T + 1 I)x1 , . . . , (T + m I)xm ],
which is equivalent to System (4.9). To find X, we therefore need to solve the m tridiagonal linear systems of (4.9). Since the eigenvalues 1 , . . . , m are positive, each matrix
T + j I is diagonally dominant. By Theorem 1.7, every such matrix is nonsingular and
has a unique LU factorization. Algorithms 1.3 and 1.4 then solve the corresponding
system (T + j I)xj = cj in O(m) operations for some constant . Doing this for all
m columns x1 , . . . , xm , one finds the matrix X in O(m2 ) operations.
(c) To find V, we first find C = h2 FS by performing O(2m3 ) operations. Next we
find X as in step b) by performing O(m2 ) operations. Finally we compute V = 2hXS
by performing O(2m3 ) operations. In total, this amounts to O(4m3 ) operations.
23
6:
i j
j,k=1
V SXS
For the individual steps in this algorithm, the time complexities are shown in the
following table.
step
1
O(1)
complexity
3
2
O(m )
O(m)
5
3
O(m )
6
2
O(m )
O(m3 )
Hence the overall complexity is determined by the four matrix multiplications and
given by O(m3 ).
Exercise 4.11: Fast solution of biharmonic equation
From Exercise 3.16 we know that T Rmm is the second derivative matrix. According
to Lemma 3.8, the eigenpairs (j , sj ), with j = 1, . . . , m, of T are given by
sj = [sin(jh), sin(2jh), . . . , sin(mjh)]T ,
j = 2 2 cos(jh) = 4 sin2 (jh/2),
and satisfy sTj sk = j,k /(2h) for all j, k, where h := 1/(m + 1). Using, in order, that
U = SXS, TS = SD, and S2 = I/(2h), one finds that
h4 F = T2 U + 2TUT + UT2
4h6 gjk
4h6 gjk
h6 gjk
=
=
.
2
(j + k )2
4(j + k )2
4 sin2 (jh/2) + 4 sin2 (kh/2)
25
function U = simplefastbiharmonic(F)
m = length(F);
h = 1/(m+1);
hv = pi*h*(1:m);
sigma = sin(hv/2).2;
S = sin(hv*(1:m));
G = S*F*S;
X = (h6)*G./(4*(sigma*ones(1,m)+ones(m,1)*sigma).2);
U = zeros(m+2,m+2);
U(2:m+1,2:m+1) = S*X*S;
end
function V = standardbiharmonic(F)
m = length(F);
h = 1/(m+1);
T = gallery(tridiag, m, -1, 2, -1);
A = kron(T2, eye(m)) + 2*kron(T,T) + kron(eye(m),T2);
b = h.4*F(:);
x = A\b;
V = zeros(m+2, m+2);
V(2:m+1,2:m+1) = reshape(x,m,m);
end
After specifying m = 4 by issuing the command F = ones(4,4), the commands simplefastbiharmonic(F) and standardbiharmonic(F) both return
the matrix
0
0
0
0
0
0
0 0.0015 0.0024 0.0024 0.0015 0
.
0 0.0024 0.0037 0.0037 0.0024 0
0 0.0015 0.0024 0.0024 0.0015 0
0
0
0
0
0
0
For large m, it is more insightful to plot the data returned by our Matlab functions.
For m = 50, we solve and plot our system with the commands in Listing 4.3.
26
Listing 4.3. Solving the biharmonic equation and plotting the result
1
2
3
4
5
F = ones(50, 50);
U = simplefastbiharmonic(F);
V = standardbiharmonic(F);
surf(U);
surf(V);
simplefastbiharmonic
standardbiharmonic
x 10
x 10
0
60
0
60
60
40
60
40
40
20
40
20
20
0
20
0
On the face of it, these plots seem to be virtually identical. But exactly how close are
they? We investigate this by plotting the difference with the command surf(U-V),
which gives
simplefastbiharmonic minus standardbiharmonic
14
x 10
2
1.5
0.5
0
60
60
40
40
20
20
0
We conclude that their maximal difference is of the order of 1014 , which makes them
indeed very similar.
Exercise 4.14: Fast solution of biharmonic equation using 9 point rule
(TODO)
27
CHAPTER 5
qn1 qn2 q1 q0
1
0
0
0
1
0
0 .
A () = det(A I) = det
..
..
..
..
...
.
.
.
.
0
0
By the rules of determinant evaluation, we can substract from any column a linear
combination of the other columns without changing the value of the determinant.
Multiply columns 1, 2, . . . , n 1 by n1 , n2 , . . . , and adding the corresponding
linear combination to the final column, we find
qn1 qn2 q1 f ()
1
0
0
0
1
0
0 = (1)n f (),
A () = det
..
..
..
..
..
.
.
.
.
.
0
0
1
0
where the second equality follows from cofactor expansion along the final column.
Multiplying this equation by (1)n yields the statement of the Exercise.
(b) Similar to (a), by multiplying rows 2, 3, . . . , n by , 2 , . . . , n1 and adding the
corresponding linear combination to the first row.
Exercise 5.16: Schur decomposition example
The matrix U is unitary, as U U = UT U = I. One directly verifies that
1 1
T
R := U AU =
.
0
4
Since this matrix is upper triangular, A = URUT is a Schur decomposition of A.
Exercise 5.19: Skew-Hermitian matrix
By definition, a matrix C is skew-Hermitian if C = C.
=: Suppose that C = A + iB, with A, B Rm,m , is skew-Hermitian. Then
A iB = C = C = (A + iB) = AT iBT ,
which implies that AT = A and B = BT (use that two complex numbers coincide
if and only if their real parts coincide and their imaginary parts coincide). In other
words, A is skew-Hermitian and B is real symmetric.
=: Suppose that we are given matrices A, B Rm,m such that A is skewHermitian and B is real symmetric. Let C = A + iB. Then
C = (A + iB) = AT iBT = A iB = (A + iB) = C,
meaning that C is skew-Hermitian.
29
j=1
3 0 1
A = 4 1 2 ,
4 0 1
1 1 0
J = 0 1 0 ,
0 0 1
2 1 1
S = 4 0 0 .
4 0 2
Exercise 5.46: Big Jordan example
The matrix A has Jordan form A = SJS1 , with
3 1 0 0 0 0 0 0
0 3 0 0 0 0 0 0
0 0 2 1 0 0 0 0
0 0 0 2 1 0 0 0
1
, S =
J=
0 0 0 0 2 0 0 0
9
0 0 0 0 0 2 1 0
0 0 0 0 0 0 2 0
0 0 0 0 0 0 0 2
14
28
42
56
70
84
98
49
9
18
27
36
45
54
63
0
5
10
15
20
16
12
8
4
6
12
18
24
12
9
6
3
0
0
0
0
9
0
0
0
8
7
6
5
4
3
2
1
9 9
0 0
0 9
0 0
.
0 0
0 0
0 0
0 0
1 1 0
1 n 0
(?)
Jn = 0 1 0 = 0 1 0 .
0 0 1
0 0 1
Clearly (?) holds for n = 1. Suppose (?)
1 n 0 1 1
Jn+1 = Jn J = 0 1 0 0 1
0 0 1 0 0
0
1 n+1 0
0 = 0
1
0 ,
1
0
0
1
31
2
100
1 100
100 1
A = (SJS ) = SJ S = 4
4
2 1 1
1 100 0 0
0 1 0 1
= 4 0 0
4 0 2 0 0 1 0
1
1 1
1 100 0
2 1 1
0 0 0 1 0 4 0 0
0 2 0 0 1 4 0 2
41 0
201 0 100
1
0
= 400 1 200 .
2
1
1
400 0 199
2
2
32
33
CHAPTER 6
1 1
A = 2 2 ,
2 2
1 2 2
,
A =
1 2 2
T
9 9
A A=
.
9 9
T
18 0
= 0 0 .
0 0
The normalized eigenvectors v1 , v2 of AT A corresponding to the eigenvalues 1 , 2 are
the columns of the matrix
1 1 1
.
V = [v1 v2 ] =
2 1 1
Using Theorem 6.7.3 one finds u1 , which can be extended to an orthonormal basis
{u1 , u2 , u3 } using Gram-Schmidt Orthogonalization (see Theorem 0.29). The vectors
u1 , u2 , u3 constitute a matrix
1 2 2
1
U = [u1 u2 u3 ] = 2 2 1 .
3 2 1 2
34
A SVD of A is therefore
1 2
1
2 2
A=
3 2 1
given by
2
18 0 1
1
1
1 0 0
.
2 1 1
2
0 0
0
...
AT A =
0
0
the matrix
0 0
. . . .. ..
. . .
0 0
0 1
has eigenpairs
(0, ej ) for j = 1, . . . , n 1 and (1, en ). Then = eT1 R1,n and
V = en , en1 , . . . , e1 Rn,n . Using Theorem 6.7.3 we get u1 = 1, yielding U = 1 .
A SVD for A is therefore given by
A = eTn = 1 eT1 en , en1 , . . . , e1 .
(c) In this exercise
1 0
A=
,
0 3
A = A,
1 0
A A=
.
0 9
T
2
1 2
1
1
2
0
0
3
4
2 2 1 ,
V=
=
,
U=
.
0 1 0
3 2 1 2
5 4 3
From Theorem 6.16 we know that V1 forms an orthonormal basis for span(AT ) =
span(B), V2 an orthonormal basis for ker(A) and U2 an orthonormal basis for ker(AT ) =
ker(B). Hence
span(B) = v1 + v2 ,
ker(A) = v3
36
and
ker(B) = 0.
A 0 vi
A ui
0 pi for i = r + 1, . . . , n
0 A
ui
Avi
i qi for i = 1, . . . , r
Cqi =
=
=
A 0 vi
A ui
0 qi for i = r + 1, . . . , n
0 A uj
0
0
Crj =
=
=
= 0 rj , for j = n + 1, . . . , m.
A 0
0
A uj
0
This gives a total of n + n + (m n) = m + n eigen pairs.
Exercise 6.27: Rank example
We are given the singular value decomposition
1
1
1
1
6
2
2
2
2
1
1
1
1
0
2
2
2
2
A = UVT =
1
1
1
1
0
2 2
2
2
1
1
1
1
0
2
2
2
2
0
6
0
0
0
0
0
0
2
3
2
3
1
3
2
3
31
32
1
3
23
2
3
kA BkF 22 + 32 = 62 + 02 = 6.
37
2
2
1
0
2 2 1
D
0
A1 = A := U
VT =
2 2 1 .
0
2 2 1
1
if i = j;
1
if i < j;
bij =
2n
2
if (i, j) = (n, 1);
0
otherwise.
while the column vector x = (xj )j Rn is given by
1
if j = n;
xj =
2n1j otherwise.
For the final entry in the matrix product Bx one finds that
(Bx)n =
n
X
j=1
For any of the remaining indices i 6= n, the i-th entry of the matrix product Bx can
be expressed as
(Bx)i =
n
X
bij xj = bin +
j=1
= 1 + 2
n1
X
2n1j bij
j=1
n1i
bii +
n1
X
2n1j bij
j=i+1
= 1 + 2
n1i
n1
X
2n1j
j=i+1
= 1 + 2
n1i
n2i
n2i
X
j 0 =0
1
= 1 + 2n1i 2n2i
1
2
j 0
1 n1i
2
12
= 0.
As B has a nonzero kernel, it must be singular. The matrix A, on the other hand,
is nonsingular, as its determinant is (1)n 6=p0. The matrices A and B differ only in
their (n, 1)-th entry, so one has kA BkF = |an1 bn1 |2 = 22n . In other words, the
tiniest perturbation can make a matrix with large determinant singular.
38
min
CRn,n
rank(C)=r
39
CHAPTER 7
Matrix Norms
Exercise 7.4: Consistency of sum norm?
Observe that the sum norm is a matrix norm. This follows since it is equal to the
l1 -norm of the vector v = vec(A) obtained by stacking the columns of a matrix A on
top of each other.
Let A = (aij )ij and B = (bij )ij be matrices for which the product AB is defined.
Then
X
X X
kABkS =
aik bkj
|aik | |bkj |
i,j
i,j,k
|aik | |blj | =
i,j,k,l
X
i,k
|aik |
l,j
where the first inequality follows from the triangle inequality and multiplicative property of the absolute value | |. Since A and B where arbitrary, this proves that the
sum norm is consistent.
Exercise 7.5: Consistency of max norm?
Observe that the max norm is a matrix norm. This follows since it is equal to the
l -norm of the vector v = vec(A) obtained by stacking the columns of a matrix A on
top of each other.
To show that the max norm is not consistent we use a counter example. Let
A = B = (1)2i,j=1 . Then
1 1 1 1
2 2
1 1 1 1
1 1 1 1 = 2 2 = 2 > 1 = 1 1 1 1 ,
M
M
M
M
contradicting kAB||M kAkM kBkM .
Exercise 7.6: Consistency of modified max norm?
Exercise 7.5 shows that the max norm is not consistent. In this Exercise we show that
the max norm can be modified so as to define
a consistent matrix norm.
(a) Let A Cm,n and define kAk := mnkAkM as in the Exercise. To show that
k k defines a consistent matrix norm we have to show that it fulfills the three matrix
norm properties and that it is submultiplicative. Let A, B Cm,n be any matrices and
any scalar.
nmkA + BkM nm kAkM + kBkM = kAk + kBk.
kABk = mn max
ai,k bk,j
1im
1jn
k=1
q
mn max
1im
1jn
|ai,k ||bk,j |
k=1
mn max
max |bk,j |
1im
1kq
1jn
q
X
!
|ai,k |
k=1
q mn
max |ai,k |
1im
1kq
!
max |bk,j |
1kq
1jn
= kAkkBk.
(b) For any A Cm,n , let
kAk(1) := mkAkM
and
kAk(2) := nkAkM .
Comparing with the solution of part (a) we see, that the points of positivity, homogeneity and subadditivity are fulfilled here as well, making kAk(1) and kAk(2) valid
matrix norms. Furthermore, for any A Cm,q , B Cq,n ,
q
!
!
X
kABk(1) = m max
ai,k bk,j m max |ai,k | q max |bk,j |
1im
1im
1kq
1jn
1kq
1jn
(1)
k=1
(1)
= kAk kBk ,
kABk(2) = n max |
1im
1jn
q
X
!
ai,k bk,j | q
k=1
(2)
max |ai,k | n
1im
1kq
!
max |bk,j |
1kq
1jn
= kAk(2) kBk ,
which proves the submultiplicativity of both norms.
j=1
i=1 j=1
i=1 j=1
k=1
which shows that the matrix norm k kS is subordinate to the vector norm k k1 .
41
j=1
j=1,...,n
j=1
= kAkM kxk1 .
(b) Assume that the maximum in the definition of kAkM is attained in column l,
implying that kAkM = |ak,l | for some k. Let el be the lth standard basis vector. Then
kel k1 = 1 and
kAel k = max |ai,l | = |ak,l | = |ak,l | 1 = kAkM kel k1 ,
i=1,...,m
kAxk
.
kxk1
By (b), equality is attained for any standard basis vector el for which there exists a k
such that kAkM = |ak,l |. We conclude that
kAkM = max
x6=0
kAxk
,
kxk1
|y Ax| =
max
kxk2 =1=kyk2
kxk2 =1=kyk2
1 |y x|
max
|y UV x| =
max
1 kyk2 kxk = 1 .
kxk2 =1=kyk2
kxk2 =1=kyk2
max
kxk2 =1=kyk2
|y x|
max
kxk2 =1=kyk2
|y Ax|.
1
=
n
1
kxk2
= max n
,
06=xC kAxk2
kAxk2
min
06=xCn kxk2
2 1
A=
,
1 2
1 2 1
.
=
3 1 2
Using Theorem 7.12, one finds kAk1 = kAk = 3 and kA1 k1 = kA1 k = 1. The
singular values 1 2 of A are the square roots of the zeros of
0 = det(AT A I) = (5 )2 16 = 2 10 + 9 = ( 9)( 1).
Using Theorem 7.14, we find kAk2 = 1 = 3 and kA1 k2 = 21 = 1. Alternatively,
since A is symmetric positive definite, we know from (7.11) that kAk2 = 1 and
kA1 k2 = 1/2 , where 1 = 3 is the biggest eigenvalue of A and 2 = 1 is the
smallest.
Exercise 7.21: Unitary invariance of the spectral norm
Suppose V is a rectangular matrix satisfying V V = I. Then
kVAk22 = max kVAxk22 = max x A V VAx
kxk2 =1
kxk2 =1
kxk2 =1
equal to its Euclidean norm, considered as a vector. In order for kAuk2 < kAk2
to hold, there needs to exist another unit vector v for which one has the inequality
kAuk2 < kAvk2 of Euclidean norms. In other words, we need to pick a matrix A that
scales more in the direction v than in the direction u. For instance, if
2 0
0
1
A=
,
u=
,
v=
,
0 1
1
0
then
kAk2 = max kAxk2 kAvk2 = 2 > 1 = kAuk2 .
kxk2 =1
(|1 x1 |p + + |n xn |p )1/p
(x1 ,...,xn )6=0
(|x1 |p + + |xn |p )1/p
max
kAxkp
kAej kp
= (A).
kxkp
kej kp
Together, the above two statements imply that kAkp = (A) for any diagonal matrix
A and any p satisfying 1 p .
Exercise 7.24: Spectral norm of a column vector
We write A Cm,1 for the matrix corresponding to the column vector a Cm . Write
kAkp for the operator p-norm of A and kakp for the vector p-norm of a. In particular
kAk2 is the spectral norm of A and kak2 is the Euclidean norm of a. Then
kAkp = max
x6=0
kAxkp
|x|kakp
= max
= kakp ,
x6=0
|x|
|x|
kAk1 = max
1jn
kAk = max
1im
m
X
i=1
n
X
j=1
i=1 j=1
!
|ai,j |
= max
1jn
!
|ai,j |
= max
1im
m
X
!
|bi,j |
i=1
n
X
= k |A| k1 ,
!
|bi,j |
= k |A| k ,
j=1
|ai,j ||xj |
= k |A| |x| k2 .
kAxk2 =
ai,j xj
i=1
i=1
j=1
j=1
Now let x with kx k2 = 1 be a vector for which kAk2 = kAx k2 . That is, let x be
a unit vector for which the maximum in the definition of 2-norm is attained. Observe
that |x | is then a unit vector as well, k |x | k2 = 1. Then, by the above estimate of
l2 -norms and definition of the 2-norm,
kAk2 = kAx k2 k |A| |x | k2 k |A| k2 .
44
(d) By Theorem 7.12, we can solve this exercise by finding a matrix A for which A
and |A| have different largest singular values. As A is real and symmetric, there exist
a, b, c R such that
a b
|a| |b|
A=
,
|A| =
,
b c
|b| |c|
2
2
a + b2 ab + bc
a + b2 |ab| + |bc|
T
T
A A=
,
|A| |A| =
.
ab + bc b2 + c2
|ab| + |bc| b2 + c2
To simplify these equations we first try the case a + c = 0. This gives
2
2
a + b2
0
a + b2 2|ab|
T
T
A A=
,
|A| |A| =
.
0
a2 + b 2
2|ab| a2 + b2
To get different norms we have to choose a, b in such a way that the maximal eigenvalues
of AT A and |A|T |A| are different. Clearly AT A has a unique eigenvalue := a2 + b2
and putting the characteristic polynomial () = (a2 + b2 )2 4|ab|2 of |A|T |A| to
zero yields eigenvalues := a2 + b2 2|ab|. Hence |A|T |A| has maximal eigenvalue
+ = a2 + b2 + 2|ab| = + 2|ab|. The spectral norms of A and |A| therefore differ
whenever both a and b are nonzero. For example, when a = b = c = 1 we find
1 1
A=
,
kAk2 = 2,
k |A| k2 = 2.
1 1
Exercise 7.32: Sharpness of perturbation bounds
Suppose Ax = b and Ay = b + e. Let K = K(A) = kAkkA1 k be the condition
number of A. Let yA and yA1 be unit vectors for which the maxima in the definition
of the operator norms of A and A1 are attained. That is, kyA k = 1 = kyA1 k,
kAk = kAyA k, and kA1 k = kA1 yA1 k. If b = AyA and e = yA1 , then
kA1 ek
kA1 yA1 k
kyA1 k
kek
ky xk
=
=
= kA1 k = kAkkA1 k
=K
,
1
kxk
kA bk
kyA k
kAyA k
kbk
showing that the upper bound is sharp. If b = yA1 and e = AyA , then
ky xk
kA1 ek
kyA k
1
1
kAyA |
1 kek
=
=
=
=
=
,
kxk
kA1 bk
kA1 yA1 k
kA1 k
kAkkA1 k kyA1 k
K kbk
showing that the lower bound is sharp.
Exercise 7.33: Condition number of 2nd derivative matrix
Recall that T = tridiag(1, 2, 1) and, by Exercise 1.10, T1 is given by
1
T1 ij = T1 ji = (1 ih)j > 0, 1 j i m, h =
.
m+1
From Theorems 7.12 and 7.14, we have the following explicit expressions for the 1-, 2and -norms
n
m
X
X
1
kAk1 = max
|ai,j |, kAk2 = 1 , kA1 k2 =
, kAk = max
|ai,j |
1jn
1im
m
i=1
j=1
for any matrix A Cm,n , where 1 is the largest singular value of A, m the smallest
singular value of A, and we assumed A to be nonsingular in the third equation.
45
(a) For the matrix T this gives kTk1 = kTk = m + 1 for m = 1, 2 and kTk1 =
kTk = 4 for m 3. For the inverse we get kT1 k1 = kT1 k = 21 = 18 h2 for m = 1
and
1 2 1
1 2 1
1
= kT1 k
kT k1 =
= 1 =
3 1 2 1
3 1 2
for m = 2. For m > 2, one obtains
m
m
j1
X
X
1 X
(1 jh)i +
(1 ih)j
T ij =
i=1
i=1
i=j
j1
m
X
(1 jh)i +
i=1
j1
X
(1 ih)j
(1 ih)j
i=1
i=1
(j 1)j jm
(j 1)j
= (1 jh)
+
(2 jh)
2
2
2
j
= (m + 1 j)
2
1
1
=
j j 2,
2h
2
1
which is a quadratic function in j that attains its maximum at j = 2h
= m+1
. For
2
odd m > 1, this function takes its maximum at integral j, yielding kT1 k1 = 81 h2 .
For even m > 2, on the other hand, the maximum over all integral j is attained at
j = m2 = 1h
or j = m+2
= 1+h
, which both give kT1 k1 = 18 (h2 1).
2h
2
2h
Similarly, we have for the infinity norm of T1
m
m
i1
X
X
1
1
1 X
(1 ih)j +
(1 jh)i =
i i2 ,
T i,j =
2h
2
j=1
j=1
j=i
and hence kT1 k = kT1 k1 . This is what one would expect, as T (and therefore
T1 ) is symmetric. We conclude that the 1- and -condition numbers of T are
2
m = 1;
1 6
m = 2;
cond1 (T) = cond (T) =
2
h
2
m odd, m > 1;
2
h 1 m even, m > 2.
(b) Since the matrix T is symmetric, TT T = T2 and the eigenvalues of TT T are
the squares of the eigenvalues 1 , . . . , n of T. As all eigenvalues of T are positive,
each singular value of T is equal to an eigenvalue. Using that i = 2 2 cos(ih), we
find
1 = |m | = 2 2 cos(mh) = 2 + 2 cos(h),
m = |1 | = 2 2 cos(h).
It follows that
1
1 + cos(h)
=
= cot2
cond2 (T) =
m
1 cos(h)
46
h
2
.
1
tan2 x
<
1
.
x2
2
3
s(x, y) =
kx + ixk2 kx ixk2
k(1 + i)xk2 k(1 i)xk2
=
4
4
(|1 + i| |1 i|)kxk2
= 0,
=
4
so that hx, xi = kxk2 0, with equality holding precisely when x = 0.
(2) Conjugate symmetry. Since s(x, y) is real, s(x, y) = s(y, x), s(ax, ay) =
|a|2 s(x, y), and s(x, y) = s(x, y),
s(x, ix) =
hy, xi = s(y, x)is(y, ix) = s(x, y)is(ix, y) = s(x, y)is(x, iy) = hx, yi.
(3) Linearity in the first argument. Assuming the parallelogram identity,
1
1
1
1
2s(x, z) + 2s(y, z) = kx + zk2 kz xk2 + ky + zk2 kz yk2
2
2
2
2
2
2
1
x
+
y
x
y
1
x
+
y
x
y
z
+
=
z+
+
2
2
2
2
2
2
2
2
1
z + x + y x y
1
z x + y + x y
2
2
2
2
2
2
2
2
x y
2
x y
2
x + y
x + y
=
z +
+
z
2
2
2
2
2
2
x
+
y
x
+
y
z
=
z
+
2
2
x+y
=4s
,z ,
2
implying that s(x + y, z) = s(x, z) + s(y, z). It follows that
hx + y, zi = s(x + y, z) + is(x + y, iz)
= s(x, z) + s(y, z) + is(x, iz) + is(y, iz)
= s(x, z) + is(x, iz) + s(y, z) + is(y, iz)
= hx, zi + hy, zi.
That hax, yi = ahx, yi follows, mutatis mutandis, from the proof of Theorem 7.42.
47
-1
0.5
0.5
0.5
-0.5
0.5
-1
-0.5
0.5
-1
-0.5
0.5
-0.5
-0.5
-0.5
-1
-1
-1
48
i=1
i=1
i=1
|xi |
=n
n
|xi |
=
n
n
i=1
i=1
i=1
i=1
Since the function x 7 x1/p is monotone, we obtain
!1/q
!1/p
n
n
X
X
n1/q kxkq = n1/q
|xi |q
n1/p
|xi |p
= n1/p kxkp ,
i=1
i=1
|xi |q
|xi |q
= kxkq .
i=1
i=1
i=1
kxkq
= n1/q kxk ,
i=1
i=1
proving the right inequality. Using that the map x 7 x1/q is monotone, the left
inequality follows from
n
X
q
q
kxk = (max |xi |)
|xi |q = kxkqq .
i
i=1
49
CHAPTER 8
aij xk (j)
aij xk (j) + bi ,
aii
j=1
j=i+1
which is identical to Jacobis method (8.3).
Exercise 8.13: Convergence of the R-method when eigenvalues have
positive real part (TODO)
=
0 1
0 41 3 4
34 0
1 0 1 2
1 0
1
1
G1 = I M1 A = I (D AL ) A =
0 1
34 41 3 4
0 2
=
.
0 32
p
From this, we find (GJ ) = 3/2 and (G1 ) = 3/2, both of which are bigger than 1.
Exercise 8.18: Strictly diagonally dominance; The J method
If A = (aij )ij is strictly diagonally dominant, then it is nonsingular and a11 , . . . , ann 6=
0. For the Jacobi method, one finds
a1n
12
0
aa11
aa13
a11
11
a2n
a21
0
aa23
a22
a22
22
a3n
a31 a32
1
0
.
.
.
.
.
.
.
.
.
.
.
.
.
n2
n3
n1
aann
aann
0
aann
By Theorem 7.12, the -norm can be expressed as the maximum, over all rows, of
the sum of absolute values of the entries in a row. Using that A is strictly diagonally
dominant, one finds
P
X aij
|a |
= max j6=i ij < 1.
kGk = max
aii 1in |aii |
i
j6=i
As by Lemma 7.11 the -norm is consistent, Corollary 8.9 implies that the Jacobi
method converges for any strictly diagonally dominant matrix A.
Exercise 8.19: Strictly diagonally dominance; The GS method
Let A = AL + D AR be decomposed as a sum of a lower triangular, a diagonal,
and an upper triangular part. By Equation (8.14), the approximate solutions x(k) are
related by
Dx(k+1) = AL x(k+1) + AR x(k) + b
in the Gauss Seidel method. Let x be the exact solution of Ax = b. It follows that
the errors (k) := x(k) x are related by
D(k+1) = AL (k+1) + AR (k) .
Let r and ri be as in the exercise. Let k 0 be arbitrary. We show by induction on i
that
(?)
(k+1)
|j
| rk(k) k ,
for j = 1, . . . , i.
51
We conclude that the Gauss Seidel method converges for strictly diagonally dominant
matrices.
Exercise 8.23: Convergence example for fix point iteration
(k)
(k)
lim xi
= lim 1 ak = 1 lim ak = 1,
k
(k)
for i = 1, 2.
(k)
When |a| > 1, however, |x1 | = |x2 | = |1 ak | becomes arbitrary large with k and
(k)
limk xi diverges.
The eigenvalues of G are the zeros of the characteristic polynomial 2 a2 =
( a)( + a), and we find that G has spectral radius (G) = 1 , where := 1 |a|.
Equation (8.32) yields an estimate k = log(10)s/(1 |a|) for the smallest number of
iterations k so that (G)k 10s . In particular, taking a = 0.9 and s = 16, one
expects at least k = 160 log(10) 368 iterations before (G)k 1016 . On the other
hand, 0.9k = |a|k = 10s = 1016 when k 350, so in this case the estimate is fairly
accurate.
Exercise 8.24: Estimate in Lemma 8.22 can be exact
As the eigenvalues of the matrix GJ are the zeros of 2 1/4 = ( 1/2)( + 1/2) = 0,
one finds the spectral radius (GJ ) = 1/2. In this example, the Jacobi iteration process
is described by
1
1 1
2 0
1
0 2
(k+1)
(k)
,
c=
= 21 .
x
= GJ x + c,
GJ = 1
0 2
1
0
2
2
The initial guess
0
(0)
x =
0
52
(k)
(k)
can be quite slow. This makes it an impractical method for computing the spectral
radius of A.
(a) The Matlab code
1
2
3
4
5
6
7
8
9
n = 5
a = 10
l = 0.9
for k = n-1:200
L(k) = nchoosek(k,n-1)*a(n-1)*l(k-n+1);
end
stairs(L)
x 10
2.5
1.5
0.5
0
0
20
40
60
80
100
120
140
160
180
200
1
2
3
4
k = n-1;
finds that f (k) dives for the first time below 108 at k = 470. We conclude that the
matrix Ak is close to zero only for a very high power k.
(b) Let E = E1 := (A I)/a be the n n matrix in the exercise, and write
0 Ink
Ek :=
Rn,n .
0 0
Clearly Ek = Ek for k = 1. Suppose that Ek = Ek for some k satisfying 1 k n 1.
Using the rules of block multiplication,
Ek+1 = Ek E1
0k,1 , Ik 0k,nk1
0nk,k Ink
=
nk1
0k,k 0k,nk
0nk,k+1 0I1,nk1
0nk,k+1 Ink1
=
0k,k+1 0k,nk1
= Ek+1 .
Alternatively, since
1 if j = i + 1,
(E)ij =
0 otherwise,
(E )ij =
1 if j = i + k,
0 otherwise,
one has
(Ek+1 )ij = (Ek E)ij =
=
1 if j = i + k + 1,
0 otherwise,
1
j=0
which is what needed to be shown.
54
55
CHAPTER 9
X
1
1
1X
Q(y) = yT UDUT y bT y = vT Dv cT v =
j vj2
cj vj ,
2
2
2 j=1
j=1
which is what needed to be shown.
Exercise 9.4: Steepest descent iteration
In the method of Steepest Descent we choose, at the kth iteration, the search direction
pk = rk = b Axk and optimal step length
k :=
rTk rk
.
rTk Ark
2 1
A=
,
1 2
0
b=
,
0
and an initial guess x0 = [1, 1/2]T of its minimum. The corresponding residual is
0
2 1
1
3/2
r0 = b Ax0 =
=
.
0
1 2
1/2
0
Performing the steps in Equation (9.6) twice yields
rT r0
9/4
1
2 1 3/2
3
= ,
t0 = Ar0 =
=
, 0 = 0T =
1 2
0
3/2
9/2
2
r0 t0
1 3/2
1
1
1/4
3/2
3
0
x1 =
+
=
, r1 =
=
1/2
0
1/2
0
3/2
3/4
2
2
rT r1
9/16
1
2 1
0
3/4
t1 = Ar1 =
=
, 1 = 1T =
= ,
1 2
3/4
3/2
9/8
2
r1 t1
1 0
1 3/4
1/4
1/4
0
3/8
x2 =
+
=
, r2 =
=
.
1/2
1/8
3/4
0
2 3/4
2 3/2
Moreover, assume that for some k 1 one has
1
k 0
1k
k 1
, x2k1 = 4
, r2k1 = 3 4
,
(?)
t2k2 = 3 4
2
1
1/2
56
(??)
t2k1 = 3 4
1
,
2
x2k = 4
1
,
1/2
r2k = 3 4
1/2
.
0
Then
t2k = 3 4
1
2 1 1/2
1(k+1)
,
=34
1/2
1 2
0
9 42k ( 21 )2
rT2k r2k
1
=
=
,
1
T
2
r2k t2k
9 42k 2
1
1
(k+1) 1
k
k 1/2
,
+ 34
= 4
x2k+1 = 4
2
1/2
0
2
1
1
1(k+1)
(k+1) 0
k 1/2
34
r2k+1 = 3 4
=34
,
0
1/2
1
2
2 1 0
(k+1)
(k+1) 1
t2k+1 = 3 4
=34
,
1 2
1
2
2k =
rT2k+1 r2k+1
9 42(k+1)
1
= ,
=
2k+1 = T
2(k+1)
94
2
2
r2k+1 t2k+1
1
1
(k+1) 1
(k+1) 0
(k+1)
x2k+2 = 4
+ 34
= 4
,
2
1
1/2
2
1
(k+1) 1
(k+1) 0
(k+1) 1/2
r2k+2 = 3 4
=34
34
,
2
1
0
2
Using the method of induction, we conclude that (?), (??), and k = 1/2 hold for any
k 1.
Exercise 9.7: Conjugate gradient iteration, II
Using x(0) = 0, one finds
x(1) = x(0) +
(b Ax(0) , b Ax(0) )
(b, b)
(b Ax(0) ) =
b.
(0)
2
(0)
(b Ax , Ab A x )
(b, Ab)
(1)
(b, b)
9 0
0
=
=
.
b=
3/2
(b, Ab)
18 3
We find, in order,
(0)
(0)
=r
1
0 = ,
4
3
1 (1)
0 = , r = 2 ,
0
2
3
2
1
(2)
= 23 ,
1 = , x =
.
2
3
4
0
=
,
3
p(1)
Since the residual vectors r(0) , r(1) , r(2) must be orthogonal, it follows that r(2) = 0 and
x(2) must be an exact solution. This can be verified directly by hand.
57
where, by (9.3),
1
f () := Q(xk + pk ) = Q(xk ) pTk rk + 2 pTk Apk
2
is a quadratic polynomial in . Since A is assumed to be positive definite, necessarily
pTk Apk > 0. Therefore f has a minimum, which it attains at
=
pTk rk
.
pTk Apk
Applying (9.15) repeatedly, one finds that the search direction pk for the conjugate
gradient method satisfies
rTk1 rk1
rTk rk
rTk rk
pk = rk + T
pk1 = rk + T
rk1 + T
pk2 =
rk1 rk1
rk1 rk1
rk2 rk2
As p0 = r0 , the difference pk rk is a linear combination of the vectors rk1 , . . . , r0 ,
each of which is orthogonal to rk . It follows that pTk rk = rTk rk and that the step length
is optimal for
=
rTk rk
= k .
pTk Apk
Exercise 9.10: Starting value in cg
s0 = r0 Ay0 = r0 ,
q0 = s0 = r0 ,
sTk sk
,
qTk Aqk
yk+1 = yk + k qk ,
sk+1 = sk k Aqk ,
sTk+1 sk+1
,
qk+1 = sk+1 + k qk .
k :=
sTk sk
How are the iterates yk and xk related? As remarked above, s0 = r0 and q0 = r0 = p0 .
Suppose sk = rk and qk = pk for some k 0. Then
sk+1 = sk k Aqk = rk
rTk rk
Apk = rk k Apk = rk+1 ,
pTk Apk
rTk+1 rk+1
pk = pk+1 .
rTk rk
It follows by induction that sk = rk and qk = pk for all k 0. In addition,
qk+1 = sk+1 + k qk = rk+1 +
yk+1 yk = k qk =
rTk rk
pk = xk+1 xk ,
pTk Apk
58
for any k 0,
so that yk = xk x0 .
Exercise 9.15: The A-inner product
We verify the axioms of Definition 0.20.
Positivity: Since A is positive definite, x 6= 0 = hx, xi = xT Ax > 0. On the
other hand, x = 0 = hx, xi = xT Ax = 0T A0 = 0. It follows that hx, xi 0
for all x, with equality if and only if x = 0.
Symmetry: One has hx, yi = xT Ay = (xT Ay)T = yT AT x = yT Ax = hy, xi
for all vectors x and y.
Linearity: One has hax + by, zi = (ax + by)T Az = axT Az + byT Az = ahx, zi +
bhy, zi for all real numbers a, b and vectors x, y, z.
Exercise 9.17: Program code for testing steepest descent
Replacing the steps in (9.16) by those in (9.6), Algorithm 9.13 changes into the following
algorithm for testing the method of Steepest Descent.
Listing 9.1. Testing the method of Steepest Descent
1
2
3
4
5
6
7
8
9
10
11
12
13
To check that this program is correct, we compare its output with that of cgtest.
1
2
3
Running these commands yields Figure 1, which shows that the difference between
both tests is of the order of 109 , well within the specified tolerance.
As in Tables 9.12 and 9.14, we let the tolerance be tol = 108 and run sdtest for
the m m grid for various m, to find the number of iterations Ksd required before
||rKsd ||2 tol ||r0 ||2 . Choosing a = 1/9 and d = 5/18 yields the averaging matrix, and
we find the following table.
n
Ksd
2 500
10 000
40 000
1 000 000
4 000 000
37
35
32
26
24
Choosing a = 1 and d = 2 yields the Poisson matrix, and we find the following
table.
59
10
x 10
8
0
50
40
50
30
40
30
20
20
10
10
0
n
Ksd /n
Ksd
100
400
1 600
2 500
10 000
40 000
4.1900
419
4.0325
1 613
3.9112
6 258
3.8832
9 708
3.8235
38 235
3.7863
151 451
KJ
385
8 386
KGS
194
4 194
KSOR
35
164
324
645
Kcg
16
94
188
370
37
75
Here the number of iterations KJ , KGS , and KSOR of the Jacobi, Gauss-Seidel and
SOR methods are taken from Table 8.1, and Kcg is the number of iterations in the
Conjugate Gradient method.
Since Ksd /n seems to tend towards a constant, it seems that the method of Steepest
Descent requires O(n) iterations
for solving the Poisson problem for some given accu
racy, as opposed to the O( n) iterations required by the Conjugate Gradient method.
The number of iterations in the method of Steepest Descent is comparable to the number of iterations in the Jacobi method, while the number of iterations in the Conjugate
Gradient method is of the same order as in the SOR method.
The spectral
condition number of the mm Poisson matrix is = 1+cos(h) /(1
cos h) . Theorem 9.16 therefore states that
(?)
||x xk ||A
||x x0 ||A
1
+1
k
= cos
m+1
60
.
How can we relate this to the tolerance in the algorithm, which is specified in terms
of the Euclidean norm? Since
kxk2A
xT Ax
=
kxk22
xT x
is the Rayleigh quotient of x, Lemma 5.26 implies the bound
min kxk22 kxk2A max kxk22 ,
with min = 4 1cos(h) the smallest and max = 4 1+cos(h) the largest eigenvalue
of A. Combining these bounds with Equation (?) yields
k s
1 + cos m+1
1
kx xk k2
k
cos
=
.
kx x0 k2
+1
m+1
1 cos m+1
Replacing k by the number of iterations Ksd for the various values of m shows that
this estimate holds for the tolerance of 108 .
Exercise 9.18: Using cg to solve normal equations
We need to perform Algorithm 9.11 with AT A replacing A and AT b replacing b. For
the system AT Ax = AT b, Equations (9.13), (9.14), and (9.15) become
x(k+1) = x(k) + k p(k) ,
k =
r(k) T r(k)
r(k) T r(k)
=
,
p(k) T AT Ap(k)
(Ap(k) )T Ap(k)
k =
r(k+1) T r(k+1)
,
r(k) T r(k)
with p(0) = r(0) = b AT Ax(0) . Hence we only need to change the computation of
r(0) , k , and r(k+1) in Algorithm 9.11, which yields the implementation in Listing 9.2.
61
8
20
4
(0)
r , Ar(0) , A2 r(0) = b, Ab, A2 b = 0 , 4 , 16 .
0
4
0
(b) As x(0) = 0 we have p(0) = r(0) = b. We have for k = 0, 1, 2, . . . Equations
(9.13), (9.14), and (9.15),
x(k+1) = x(k) + k p(k) ,
k =
r(k) T r(k)
,
p(k) T Ap(k)
k =
r(k+1) T r(k+1)
,
r(k) T r(k)
0 = ,
x = 0 ,
2
0
8
2
1
(2)
4 ,
1 = ,
x =
3
3 0
3
3
(3)
2 = ,
x = 2 ,
4
1
p = 2 ,
r = 2 ,
0 = ,
4
0
0
0
4
1
4
1
(2)
(2)
0 ,
8 ,
r =
1 = ,
p =
3 4
9
9 12
0
0
(3)
(3)
0
r =
,
2 = 0,
p = 0 .
0
0
(c) By definition we have W0 = {0}. From the solution of part (a) we know
that Wk = span(b0 , Ab0 , . . . , Ak1 b0 ), where the vectors b, Ab and A2 b are linearly
independent. Hence we have dim Wk = k for k = 0, 1, 2, 3.
62
From (b) we know that the residual r(3) = b Ax(3) = 0. Hence x(3) is the exact
solution to Ax = b.
We observe that r(0) = 4e1 , r(1) = 2e2 and r(2) = (4/3)e3 and hence the r(k) for
k = 0, 1, 2 are linear independent and orthogonal to each other. Thus we are only left to
show that Wk is the span of r(0) , . . . , r(k1) . We observe that b = r(0) , Ab = 2r(0) 2r(1)
and A2 b = 5r(0) 8r(1) + 3r(2) . Hence span(b, Ab, . . . , Abk1 ) = span(r(0) , . . . , r(k1) )
for k = 1, 2, 3. We conclude that, for k = 1, 2, 3, the vectors r(0) , . . . , r(k1) form an
orthogonal basis for Wk .
One can verify directly that p(0) , p(1) , and p(2) are A-orthogonal. Moreover, observing that b = p(0) , Ab = (5/2)p(0) 2p(1) , and A2 b = 7p(0) (28/3)p(1) + 3p(2) ,
it follows that
span(b, Ab, . . . , Abk1 ) = span(p(0) , . . . , p(k1) ),
for k = 1, 2, 3.
63
CHAPTER 10
= x v = y.
x=x
I2 T
v v
vT v
vT v
(b) Since kxk22 = kyk22 ,
hx y, x + yi = hx, xi hy, xi + hx, yi hy, yi = kxk22 kyk22 = 0,
which means that x y and x + y are orthogonal. Px is the orthogonal projection of
x into span(x + y), because it satisfies
T
T
v xv
1v v
hx Px, (x + y)i =
, (x + y) =
v, (x + y)
vT v
2 vT v
1
=
(x y), (x + y) = 0,
2
for an arbitrary element (x + y) in span(x + y).
x/ e1
e1 /(1) e1
u= p
=p
= 2e1 ,
1 x1 /
1 1/(1)
and
0
H = I uuT =
...
0
0
1
..
.
0
0
0
.
. . . ..
.
1
64
2
1 2 2
t
vv
1
v = x y = 1 ,
Q = I 2 t = 2 2 1 ,
vv
3 2 1 2
1
such that Qx = y.
Exercise 10.7: 2 2 Householder transformation
Let Q = IuuT R2,2 be any Householder transformation. Then u = [u1 u2 ]T R2 is
a vector satisfying u21 +u22 = kuk22 = 2, implying that the components of u are related via
u21 1 = 1 u22 . Moreover, as 0 u21 , u22 kuk2 = 2, one has 1 u21 1 = 1 u22 1,
and there exists an angle 0 [0, 2) such that cos(0 ) = u21 1 = 1 u22 . For such an
angle 0 , one has
p
p
p
u1 u2 = 1 + cos 0 1 cos 0 = 1 cos2 0 = sin(0 ).
We thus find an angle := 0 for which
cos() sin()
cos(0 ) sin(0 )
1 u21 u1 u1
.
=
=
Q=
sin() cos()
sin(0 ) cos(0 )
u1 u2 1 u22
Furthermore, we find
2
cos(2)
cos
cos sin cos
sin cos2
.
=
=
Q
=
sin cos sin
sin(2)
sin
2 sin cos
When applied to the vector [cos , sin ]T , therefore, Q doubles the angle and reflects
the result in the y-axis.
Exercise 10.16: QR decomposition
That Q is orthonormal, and therefore unitary, can be shown directly by verifying that
QT Q = I. A direct computation shows that QR = A. Moreover,
2 2
0 2
R
1
R=
0 0 =: 02,2 ,
0 0
where R1 is upper triangular. It follows that A = QR is a QR decomposition.
65
1 1
1
1
1
2
2
,
A = Q1 R 1 ,
Q1 :=
R1 :=
.
0 2
2 1 1
1 1
1
0 1
A = [a1 , a2 , a3 ] = 2 1 0
2
2 1
be as in the Exercise. We wish to find Householder transformations Q1 , Q2 that produce
zeros in the columns a1 , a2 , a3 of A. Applying Algorithm 10.4 to the first column of
A, we find
3 2 1
2
1
0
1 .
Q1 A := (I u1 uT1 )A = 0
u1 = 1 ,
3 1
0
1
0
Next we need to map the bottom element (Q1 A)3,2 of the second column to zero,
without changing the first row of Q1 A. For this, we apply Algorithm 10.4 to the
vector [0, 1]T to find
1
0 1
0
T
u2 =
and
Q2 := I u2 u2 =
,
1
1 0
which is a Householder transformation of size 2 2. Since
3 2 1
1 0
Q2 Q1 A :=
Q1 A = 0 1 0 ,
0 H2
0
0 1
it follows that the Householder transformations Q1 and Q2 bring A into upper triangular form.
(b) Clearly the matrix Q3 := I is orthogonal and R := Q3 Q2 Q1 A is upper
triangular with positive diagonal elements. It follows that
A = QR,
1 3
1
1 3
7
A = [a1 , a2 , a3 ] =
1 1 4 .
1 1 2
66
v1 = a1 =
1 ,
1
2
2
v2 = a2 12 v1 =
2 ,
2
12 =
13 =
aT2 v1
= 1,
v1T v1
aT3 v1
3
= ,
T
2
v1 v1
23 =
Hence we have
1 2 3
1 2
3
V=
1 2 3 ,
1 2 3
5
aT3 v2
= ,
T
4
v2 v2
4 4 6
= 1 0 4 5 ,
R
4 0 0 4
3
3
v3 = a3 13 v1 23 v2 =
3 .
3
and
2 0 0
D = 0 4 0 .
0 0 6
we obtain
Using Q1 = VD1 and R1 = DR,
1 1 1
2
2
3
1 1 1
1
0 4 5 .
A = Q1 R 1 =
2 1 1 1 0 0 6
1 1 1
r cos
x=
,
r sin
cos sin
P=
.
sin cos
Using the angle difference identities for the sine and cosine functions,
cos( ) = cos cos + sin sin ,
sin( ) = sin cos cos sin ,
we find
cos cos + sin sin
r cos( )
Px = r
=
.
sin cos + cos sin
r sin( )
67
O(n ) +
n1
X
4 + 2 + 6(n + 1 k) = O(n ) + 6
k=1
2
n1
X
k=1
= O(n2 ) + 3n + 9n 12 = O(n2 )
arithmetic operations.
68
(n + 2 k)
CHAPTER 11
Least Squares
Exercise 11.7: Straight line fit (linear regression)
In each case, we are given an over-determined system Ax = b with corresponding
normal equations A Ax = A b.
(a) In this case A = [1, 1, . . . , 1]T , x = [x1 ], and b = [y1 , y2 . . . , ym ]T , implying that
1 t1
y1
1 t2
y
x1
,
.2 ,
A=
x
=
,
b
=
.
.
.. ..
..
x2
1 tm
ym
so that
m
t1 + + tm
,
A A=
t1 + + tm t21 + + t2m
y1 + + ym
A b=
.
t1 y1 + + tm ym
The solution x = [x1 , x2 ]T to the normal equations describes the line y(t) = x2 t + x1
closest to the points (t1 , y1 ), . . . , (tm , ym ), in the sense that the total error
kAx
bk22
m
X
x1 + ti x2 yi
2
i=1
m
X
y(ti ) yi
2
i=1
is minimal.
Exercise 11.8: Straight line fit using shifted power form
We are given an over-determined system Ax = b, with
1 t1 t
y1
1 t t
y2
2
x
A = ..
x= 1 ,
b=
.. ,
... ,
x2
.
.
ym
1 tm t
t1 + t2 + + tm
,
t =
m
A A=
,
t1 + + tm mt (t1 t)2 + + (tm t)2
69
y1 + + ym
A b=
.
(t1 t)y1 + + (tm t)ym
The solution x = [x1 , x2 ]T to the normal equations describes the line y(t) = x2 (t t)+x1
closest to the points (t1 , y1 ), . . . , (tm , ym ), in the sense that the total error
kAx
bk22
m
X
x1 + (ti t)x2 yi
2
i=1
m
X
y(ti ) yi
2
i=1
is minimal.
(b) Solving the normal equations, we find x1 = 2.375 and x2 = 0.87. Plotting the
data (ti , yi ) and the fitted linear function y(t) = x2 (t t) + x1 gives
4
3.5
3
2.5
2
1.5
1
998
999
1000
1001
1002
i = 1, . . . , m.
(a) Let c1 = x1 /2, c2 = x2 /2, and r2 = x3 + c21 + c22 as in the Exercise. Then, for
i = 1, . . . , m,
0 = (ti c1 )2 + (yi c2 )2 r2
x 2 x 2
x1 2
x2 2
1
2
= ti
+ yi
x3
2
2
2
2
= t2i + yi2 ti x1 yi x2 x3 ,
from which Equation (11.6) follows immediately. Once x1 , x2 , and x3 are determined,
we can compute
r
x2
1 2 1 2
x1
c2 = ,
r=
x + x + x3 .
c1 = ,
2
2
4 1 4 2
(b) The linear least square problem is to minimize kAx bk22 , with
t1 y1 1
t1 + y12
x1
.
.
.
.
.. .. ,
..
A = ..
b=
,
x = x2 .
2
x3
tm ym 1
t2m + ym
(c) Whether or not A has independent columns depends on the data ti , yi . For
instance, if ti = yi = 1 for all i, then the columns of A are clearly dependent. In
70
general, A has independent columns whenever we can find three points (ti , yi ) not on
a straight line.
(d) For these points the matrix A becomes
1 4 1
A = 3 2 1 ,
1 0 1
which clearly is invertible. We find
1
x1
1 4 1
17
2
13 = 4 .
x = x2 = 3 2 1
x3
1 0 1
1
1
It follows that c1 = 1, c2 = 2, and r = 2. The points (t, y) = (1, 4), (3, 2), (1, 0)
therefore all lie on the circle
(t 1)2 + (y 2)2 = 4,
as shown in the following picture.
4
3
2
1
0
-1
B = A := V U ,
:=
Rn,m .
0nr,r 0nr,mr
Note that = and = . Let us use this and the unitarity of U and
V to show that B satisfies the first two properties from the Exercise.
(1) ABA = UV V U UV = U V = UV = A
(2) BAB = V U UV V U = V U = V U = B
Moreover, since in addition the matrices and are Hermitian,
(3) (BA) = A B = V U U V = V V
= V( ) V = V V = V U UV = BA
(4) (AB) = B A = U V V U = U U
= U( ) U = U U = UV V U = AB
71
1 1
A = 1 1 ,
0 0
1 1 1 0
B=
4 1 1 0
1 1
1
1
1
0
AB = 1 1
=
1 1 0
4
0 0
1 1
1 1 1 0
1 1 =
BA =
4 1 1 0 0 0
1 1 0
1
1 1 0 ,
2 0 0 0
1 1 1
,
2 1 1
1 1
1 1
1 1 1
= 1 1 = A,
ABA = A(BA) = 1 1
1 1
2
0 0
0 0
1 1 1 1 1 1 0
1 1 1 0
BAB = (BA)B =
=
= B.
2 1 1 4 1 1 0
4 1 1 0
By Exercises 11.15 and 11.16, we conclude that B must be the pseudoinverse of A.
Exercise 11.18: Linearly independent columns and generalized inverse
If A Cm,n has independent columns then both A and A have rank n m. Then,
by Theorem 6.18, A A must have rank n as well. Since A A is an n n-matrix of
maximal rank, it is nonsingular and we can define B := (A A)1 A . We verify that
B satisfies the four axioms of Exercise 11.15.
(1) ABA = A(A A)1 A A = A
(2) BAB = (A A)1 A A(A A)1 A = (A A)1 A = B
vu
A
=
kuk22 kvk22
kuk22 kvk22
is well defined. We verify the four axioms of Exercise 11.15 to show that B must be
the pseudoinverse of A.
uv vu uv
ukvk22 kuk22 v
=
= uv = A;
kuk22 kvk22
kuk22 kvk22
vu uv vu
vkuk22 kvk22 u
vu
(2) BAB =
=
=
= B;
kuk42 kvk42 kuk42 kvk42
kuk22 kvk22
vu uv
vu uv
= BA;
(3) (BA) =
=
2
2
2
2
kuk2 kvk 2 kuk2 kvk 2
uv vu
uv vu
(4) (AB) =
=
= AB.
2
2
kuk2 kvk2
kuk22 kvk22
This proves that B is the pseudoinverse of A.
(1) ABA =
V1 1
1 U1 .
U1
1 V1 as well. We conclude that (A ) = (A ) .
(b) Since A := V1 1
1 U1 is a singular value factorization, it has pseudo inverse
1 1
(A ) = (U1 ) (1 ) V1 = U1 1 V1 = A.
(c) Let 6= 0. Since the matrix A has singular value factorization U1 (1 )V1 ,
it has pseudo inverse
1
(A) = V1 (1 )1 U1 = 1 V1 1
1 U1 = A .
73
(a) Write

A = U_A Σ_A V_A* = [U_{A,1} U_{A,2}] [Σ_{A,1}; 0] V_A*,    B = U_B Σ_B V_B* = U_B [Σ_{B,1} 0] [V_{B,1} V_{B,2}]*.

This gives

A†A = V_A Σ_{A,1}^{−1} U_{A,1}* U_{A,1} Σ_{A,1} V_A* = V_A Σ_{A,1}^{−1} Σ_{A,1} V_A* = I

and

BB† = U_B Σ_{B,1} V_{B,1}* V_{B,1} Σ_{B,1}^{−1} U_B* = U_B Σ_{B,1} Σ_{B,1}^{−1} U_B* = I.

Moreover we get

(AA†)* = (U_{A,1} Σ_{A,1} V_A* V_A Σ_{A,1}^{−1} U_{A,1}*)* = (U_{A,1} U_{A,1}*)* = U_{A,1} U_{A,1}* = AA†

and

(B†B)* = (V_{B,1} Σ_{B,1}^{−1} U_B* U_B Σ_{B,1} V_{B,1}*)* = (V_{B,1} V_{B,1}*)* = V_{B,1} V_{B,1}* = B†B.

We now let E := AB and F := B†A†. Hence we want to show that E† = F. We do that by showing that F satisfies the properties given in Exercise 11.15:

EFE = A BB† A†A B = AB = E,
FEF = B† A†A BB† A† = B†A† = F,
(FE)* = (B† A†A B)* = (B†B)* = B†B = B† A†A B = FE,
(EF)* = (A BB† A†)* = (AA†)* = AA† = A BB† A† = EF.
(b) We have

A = [a b]    and    B = [c; d].

This gives

A^T A = [a^2 ab; ab b^2]    and    B^T B = c^2 + d^2.

Hence we want to choose a and b such that A^T A is diagonal, and c and d such that B^T B is the square of a nice number. Thus we set b = 0, c = 3 and d = 4, yielding

A = [a 0],    A^T A = [a^2 0; 0 0],    B = [3; 4],    B^T B = 25 = 5^2.

We hence can derive the following singular value decompositions and pseudoinverses,

A = 1 · [a 0] · [1 0; 0 1],    A† = (1/a) [1; 0],
B = (1/5) [3 −4; 4 3] · [5; 0] · 1,    B† = (1/25) [3 4].

We thus get

(AB)† = (3a)† = 1/(3a)    and    B†A† = 3/(25a),

and have

(AB)† = 1/(3a) ≠ (9/25) · (1/(3a)) = 3/(25a) = B†A†

for all nonzero a ∈ R.
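With, e.g., a = 1 (an arbitrary nonzero choice), this counterexample is confirmed by:

a = 1;
A = [a 0]; B = [3; 4];
pinv(A*B)                        % 1/(3a) = 0.3333...
pinv(B)*pinv(A)                  % 3/(25a) = 0.1200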
Let A = U_1 Σ_1 V_1* be a singular value factorization of A, so that A† = V_1 Σ_1^{−1} U_1* and A* = V_1 Σ_1 U_1*. Then A† = A* if and only if Σ_1^{−1} = Σ_1, which happens precisely when all nonzero singular values of A are one.
Exercise 11.25: Linearly independent columns
By Exercise 11.18, if A has rank n, then A† = (A*A)^{−1}A*. Then A(A*A)^{−1}A*b = AA†b, which is the orthogonal projection of b into span(A) by Theorem 11.10.
Exercise 11.26: Analysis of the general linear system
In this exercise, we can write

Σ = [Σ_1 0; 0 0],    Σ_1 = diag(σ_1, . . . , σ_r).

(b) By (a), the linear system Ax = b has a solution if and only if the system

Σy = [σ_1 y_1; . . . ; σ_r y_r; 0; . . . ; 0] = [c_1; . . . ; c_r; c_{r+1}; . . . ; c_n] = c

has a solution y. Since σ_1, . . . , σ_r ≠ 0, this system has a solution if and only if c_{r+1} = · · · = c_n = 0. We conclude that Ax = b has a solution if and only if c_{r+1} = · · · = c_n = 0.

(c) By (a), the linear system Ax = b has a solution if and only if the system Σy = c has a solution. Hence we have the following three cases.

r = n: Here y_i = c_i/σ_i for i = 1, . . . , n provides the only solution to the system Σy = c, and therefore x = Vy is the only solution to Ax = b. It follows that the system has exactly one solution.

r < n, c_i = 0 for i = r + 1, . . . , n: Here each solution y must satisfy y_i = c_i/σ_i for i = 1, . . . , r. The remaining y_{r+1}, . . . , y_n, however, can be chosen arbitrarily. Hence we have infinitely many solutions to Σy = c, as well as to Ax = b.

r < n, c_i ≠ 0 for some i with r + 1 ≤ i ≤ n: In this case it is impossible to find a y that satisfies Σy = c, and therefore the system Ax = b has no solution at all.
Let

A = [1 2; 1 1; 1 1],    b = [b_1; b_2; b_3]

be as in the Exercise.

(a) By Exercise 11.18, the pseudoinverse of A is

A† = (A^T A)^{−1} A^T = [−1 1 1; 1 −1/2 −1/2].
The orthogonal projection of b into span(A) is therefore

b^1 := AA†b = [1 0 0; 0 1/2 1/2; 0 1/2 1/2] [b_1; b_2; b_3] = [b_1; (b_2 + b_3)/2; (b_2 + b_3)/2],

so that the orthogonal projection of b into ker(A^T) is

b^2 := (I − AA†)b = [0 0 0; 0 1/2 −1/2; 0 −1/2 1/2] [b_1; b_2; b_3] = [0; (b_2 − b_3)/2; (b_3 − b_2)/2],

where we used that b = b^1 + b^2.
(b) By Theorem 7.12, the 2-norms ‖A‖_2 and ‖A†‖_2 can be found by computing the largest singular values of the matrices A and A†. The largest singular value σ_1 of A is the square root of the largest eigenvalue λ_1 of A^T A, which satisfies

0 = det(A^T A − λ_1 I) = det [3 − λ_1, 4; 4, 6 − λ_1] = λ_1^2 − 9λ_1 + 2,

so that λ_1 = (9 + √73)/2. Similarly, the largest singular value of A† is the square root of the largest eigenvalue μ_1 of (A†)^T A†, which satisfies

0 = det( (A†)^T A† − μ I ) = det( (1/4)[8 −6 −6; −6 5 5; −6 5 5] − μ I ) = −(μ/2)(2μ^2 − 9μ + 1).

Alternatively, we could have used that the largest singular value of A† is the inverse of the smallest singular value of A (this follows from the singular value factorization). It follows that

μ_1 = (9 + √73)/4 = 2/(9 − √73).

We conclude

K(A) = ‖A‖_2 ‖A†‖_2 = √( (9 + √73)/(9 − √73) ) = (9 + √73)/(2√2) ≈ 6.203.
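The value of K(A) can be double-checked in Matlab (a minimal sketch):

A = [1 2; 1 1; 1 1];
cond(A)                          % 6.2030...
(9 + sqrt(73)) / (2*sqrt(2))     % the same value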
Exercise 11.34: Equality in perturbation bound (TODO)
(b) For ε = 0, we get the same solution as in (a). For ε ≠ 0, however, the solution to the system

[3, 3 + ε; 3 + ε, 3 + 2ε] [x′_1; x′_2] = [7; 7 + 2ε]

is

[x′_1; x′_2] = (1/ε^2) [−(3 + 2ε), 3 + ε; 3 + ε, −3] [7; 7 + 2ε] = [2 − 1/ε; 1/ε].

This differs markedly from the solution x = A†b of (a), which shows that the solution from (a) is more accurate.
CHAPTER 12
Write

A(t) := D + t(A − D),    D := diag(a_{11}, . . . , a_{nn}),    t ∈ R,

for the affine combinations of A and its diagonal part D. Let t_1, t_2 ∈ [0, 1], with t_1 < t_2, so that A(t_1), A(t_2) are convex combinations of A and D. For any eigenvalue μ of A(t_2), we are asked to show that A(t_1) has an eigenvalue λ such that

(⋆)    |λ − μ| ≤ C (t_2 − t_1)^{1/n},    C := 2( ‖D‖_2 + ‖A − D‖_2 ).

In particular, every eigenvalue of A(t) is a continuous function of t.

Applying Theorem 12.8 with A(t_1) and E = A(t_2) − A(t_1), one finds that A(t_1) has an eigenvalue λ such that

|λ − μ| ≤ ( ‖A(t_1)‖_2 + ‖A(t_2)‖_2 )^{1−1/n} ‖A(t_2) − A(t_1)‖_2^{1/n}.

Applying the triangle inequality to the definition of A(t_1) and A(t_2), and using that the function x ↦ x^{1−1/n} is monotonically increasing,

|λ − μ| ≤ ( 2‖D‖_2 + (t_1 + t_2)‖A − D‖_2 )^{1−1/n} ‖A − D‖_2^{1/n} (t_2 − t_1)^{1/n}.

Finally, using that t_1 + t_2 ≤ 2, that the function x ↦ x^{1/n} is monotonically increasing, and that ‖A − D‖_2 ≤ 2‖D‖_2 + 2‖A − D‖_2, one obtains (⋆).
Exercise 12.6: Nonsingularity using Gerschgorin
We compute the Gerschgorin disks

R_1 = R_4 = C_1 = C_4 = {z ∈ C : |z − 4| ≤ 1},
R_2 = R_3 = C_2 = C_3 = {z ∈ C : |z − 4| ≤ 2}.

Then, by Gerschgorin's circle theorem, each eigenvalue of A lies in

(R_1 ∪ · · · ∪ R_4) ∩ (C_1 ∪ · · · ∪ C_4) = {z ∈ C : |z − 4| ≤ 2}.

In particular A has only nonzero eigenvalues, implying that A must be nonsingular.
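Assuming A is the 4 × 4 tridiagonal matrix with diagonal 4 and off-diagonal 1 (the matrix producing exactly these disks), the inclusion can be verified numerically:

A = diag(4*ones(4,1)) + diag(ones(3,1),1) + diag(ones(3,1),-1);
e = eig(A);                      % 2.3820, 3.3820, 4.6180, 5.6180
all(abs(e - 4) <= 2)             % true: all eigenvalues lie in the disks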
Exercise 12.7: Gerschgorin, strictly diagonally dominant matrix
Suppose A is a strictly diagonally dominant matrix. For such a matrix, one finds Gerschgorin disks

R_i = { z ∈ C : |z − a_{ii}| ≤ Σ_{j≠i} |a_{ij}| }.

Since |a_{ii}| > Σ_{j≠i} |a_{ij}| for all i, the origin is not an element of any of the R_i, and therefore neither of the union ∪_i R_i, nor of the intersection (∪_i R_i) ∩ (∪_i C_i) (which is smaller). Then, by Gerschgorin's circle theorem, A only has nonzero eigenvalues, implying that det(A) = det(A − 0 · I) ≠ 0 and that A is nonsingular.
Exercise 12.12: ∞-norm of a diagonal matrix
Let A = diag(λ_1, . . . , λ_n) be a diagonal matrix. The spectral radius ρ(A) is the largest absolute value of an eigenvalue of A. Since |x_i| ≤ 1 for all i whenever ‖x‖_∞ = 1, with equality for x = (1, . . . , 1)^T, one has

‖A‖_∞ = max_{‖x‖_∞ = 1} ‖Ax‖_∞ = max_{‖x‖_∞ = 1} max{ |λ_1 x_1|, . . . , |λ_n x_n| } = max{ |λ_1|, . . . , |λ_n| } = ρ(A).
In total, the algorithm requires

N := Σ_{k=1}^{n−2} ( 4(n − k)^2 + 4n(n − k) )

arithmetic operations. This sum can be computed by either using the formulae for Σ_{k=1}^{n−2} k and Σ_{k=1}^{n−2} k^2, or using that the highest-order term can be found by evaluating an associated integral. One finds that the algorithm requires of the order

N ≈ ∫_0^n ( 4(n − k)^2 + 4n(n − k) ) dk = (10/3) n^3

arithmetic operations.
Exercise 12.17: Number of arithmetic operations
The multiplication v*C involves (n − k)^2 floating point multiplications and about (n − k)^2 floating point additions. Next, computing the outer product v(v*C) involves (n − k)^2 floating point multiplications, and subtracting C − v(v*C) needs (n − k)^2 subtractions. In total we find 4(n − k)^2 arithmetic operations, which have to be carried out for k = 1, . . . , n − 2, meaning that the algorithm requires

N := Σ_{k=1}^{n−2} 4(n − k)^2

arithmetic operations. This sum can be computed by either using the formulae for Σ_{k=1}^{n−2} k and Σ_{k=1}^{n−2} k^2, or using that the highest-order term can be found by evaluating an associated integral. One finds that the algorithm requires of the order

N ≈ ∫_0^n 4(n − k)^2 dk = (4/3) n^3

arithmetic operations.
Let

A = [4 1 0 0; 1 4 1 0; 0 1 4 1; 0 0 1 4],    α = 4.5.

Applying the recursive procedure described in Corollary 12.21, we find the diagonal elements d_1(α), d_2(α), d_3(α), d_4(α) of the matrix D in the factorization A − αI = LDL^T:

d_1(α) = 4 − 9/2 = −1/2,
d_2(α) = 4 − 9/2 − 1^2/(−1/2) = +3/2,
d_3(α) = 4 − 9/2 − 1^2/(+3/2) = −7/6,
d_4(α) = 4 − 9/2 − 1^2/(−7/6) = +5/14.

As precisely two of these are negative, Corollary 12.21 implies that there are precisely two eigenvalues of A strictly smaller than α = 4.5. As

det(A − 4.5 I) = det(LDL^T) = d_1(α) d_2(α) d_3(α) d_4(α) ≠ 0,

the matrix A does not have an eigenvalue equal to 4.5. We conclude that the remaining two eigenvalues must be bigger than 4.5.
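A numerical check (a minimal sketch) confirms the count:

A = toeplitz([4 1 0 0]);         % the matrix above
sort(eig(A))'                    % 2.3820  3.3820  4.6180  5.6180
sum(eig(A) < 4.5)                % ans = 2 eigenvalues below alpha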
Suppose 5 + √24 < d_{n,k} ≤ 10 for some k; we show that this implies that 5 + √24 < d_{n,k+1} ≤ 10. First observe that (5 + √24)^2 = 25 + 10√24 + 24 = 49 + 10√24. From Equations (1.4) we know that d_{n,k+1} = 10 − 1/d_{n,k}, which yields d_{n,k+1} < 10 since d_{n,k} > 0. Moreover, 5 + √24 < d_{n,k} implies

d_{n,k+1} = 10 − 1/d_{n,k} > 10 − 1/(5 + √24) = (50 + 10√24 − 1)/(5 + √24) = (49 + 10√24)/(5 + √24) = 5 + √24.

Hence 5 + √24 < d_{n,k+1} ≤ 10, and we conclude that 5 + √24 < d_{n,k} ≤ 10 for any n ≥ 1 and 1 ≤ k ≤ n.
(b) We have A = LDL^T with L lower triangular and with ones on the diagonal. As a consequence,

det(A) = det(L) det(D) det(L^T) = det(D) = Π_{i=1}^n d_i > (5 + √24)^n.
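The recursion d_{n,k+1} = 10 − 1/d_{n,k} and the bound are easily observed numerically (a minimal sketch):

d = 10;                          % d_{n,1} = 10
for k = 1:20
  d = 10 - 1/d;                  % d_{n,k+1} = 10 - 1/d_{n,k}
end
[d, 5 + sqrt(24)]                % d stays above 5 + sqrt(24) = 9.8990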
(b) Since Ā is symmetric, it admits an orthogonal diagonalization Ā = Ū^T D̄ Ū. Let E := U^T D^{−1/2} Ū^T. Then E, as the product of three nonsingular matrices, is nonsingular. Its inverse is given explicitly by F := Ū D^{1/2} U, since

FE = Ū D^{1/2} U U^T D^{−1/2} Ū^T = Ū D^{1/2} D^{−1/2} Ū^T = Ū Ū^T = I.

With Ā = D^{−1/2} U A U^T D^{−1/2} as in (a), we obtain

E^T A E = Ū D^{−1/2} U A U^T D^{−1/2} Ū^T = Ū Ā Ū^T = D̄.

Similarly, B = U^T D U implies U B U^T = D, which yields

E^T B E = Ū D^{−1/2} U B U^T D^{−1/2} Ū^T = Ū D^{−1/2} D D^{−1/2} Ū^T = I.

We conclude that for a symmetric matrix A and a symmetric positive definite matrix B, the congruence transformation X ↦ E^T X E simultaneously diagonalizes the matrices A and B, and even maps B to the identity matrix.
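A small numerical illustration of this construction (the matrices are our choice; eig is used for both orthogonal diagonalizations):

A = [2 1; 1 0];                  % symmetric
B = [2 1; 1 3];                  % symmetric positive definite
[Ub, D] = eig(B);  U = Ub';      % B = U'*D*U
Abar = D^(-1/2) * U*A*U' * D^(-1/2);
[U1, Dbar] = eig(Abar);  Ubar = U1';   % Abar = Ubar'*Dbar*Ubar
E = U' * D^(-1/2) * Ubar';
E'*A*E                           % diagonal (equal to Dbar)
E'*B*E                           % identity matrix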
(a) The following Matlab function counts the eigenvalues of A = tridiag(c, d, c) that are strictly smaller than x, by counting the negative pivots d_i(x) in the factorization A − xI = LDL^T.

function k=count(c,d,x)
% Count the eigenvalues of tridiag(c,d,c) smaller than x.
n = length(d);
k = 0; u = d(1)-x;            % u is the current pivot d_i(x)
if u < 0
  k = k+1;
end
for i = 2:n
  umin = abs(c(i-1))*eps;     % safeguard against division by a tiny pivot
  if abs(u) < umin
    if u < 0
      u = -umin;
    else
      u = umin;
    end
  end
  u = d(i)-x-c(i-1)^2/u;      % next pivot of A - x*I = L*D*L^T
  if u < 0
    k = k+1;
  end
end
(b) Let A = tridiag(c, d, c) and m be as in the Exercise. The following Matlab program computes a small interval [a, b] around the mth eigenvalue λ_m of A and returns the point in the middle of this interval.
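A minimal sketch of such a program: it combines bisection with the count function from (a) and uses Gerschgorin's theorem for the initial interval. The function name findeigv matches the table in (c), while the details and the tolerance are our own reconstruction.

function lambda=findeigv(c,d,m)
% Approximate the m-th smallest eigenvalue of A = tridiag(c,d,c)
% by bisection, using count to locate it.
cc = abs(c(:));
r = [0; cc] + [cc; 0];           % Gerschgorin radii
a = min(d(:) - r);               % initial interval [a, b]
b = max(d(:) + r);               % contains all eigenvalues
while b - a > 1e-14*max(1, abs(a) + abs(b))
  x = (a + b)/2;
  if count(c,d,x) < m            % fewer than m eigenvalues below x
    a = x;                       % lambda_m lies in [x, b]
  else
    b = x;                       % lambda_m lies in [a, x]
  end
end
lambda = (a + b)/2;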
(c) The following table shows a comparison between the values and errors obtained by the different methods.

method        value                 error
exact         0.02413912051848666   0
findeigv      0.02413912051848621   4.44 · 10^{−16}
Matlab eig    0.02413912051848647   1.84 · 10^{−16}
CHAPTER 13
The QR Algorithm
Exercise 13.4: Orthogonal vectors
In the Exercise it is implicitly assumed that u*u ≠ 0, and therefore u ≠ 0. If u and Au − λu are orthogonal, then

0 = ⟨u, Au − λu⟩ = u*(Au − λu) = u*Au − λ u*u.

Dividing by u*u yields

λ = u*Au / (u*u).
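For instance, for a 2 × 2 example of our choosing:

A = [2 1; 1 3]; u = [1; 1];
lambda = (u'*A*u) / (u'*u);      % Rayleigh quotient, here 3.5
u' * (A*u - lambda*u)            % orthogonality: ans = 0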
Exercise 13.14: QR convergence detail (TODO)