Author(s): G. W. Stewart
Source: SIAM Review, Vol. 19, No. 4 (Oct., 1977), pp. 634-662
Published by: Society for Industrial and Applied Mathematics
Stable URL: http://www.jstor.org/stable/2030248
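(1.1a)  $$A^\dagger A A^\dagger = A^\dagger,$$
(1.1b)  $$A A^\dagger A = A,$$
(1.1c)  $$(A A^\dagger)^H = A A^\dagger,$$
(1.1d)  $$(A^\dagger A)^H = A^\dagger A.$$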
The pseudo-inverse and its generalizations have been extensively investigated and widely applied. One reason for this interest in the pseudo-inverse is that it permits the succinct expression of some important geometric constructions in n-dimensional space. This paper will be concerned with the pseudo-inverse and two related geometric constructions: the orthogonal projection onto a subspace and the linear least squares problem.
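The orthogonal projection onto a subspace $\mathcal{X}$ is the unique Hermitian, idempotent matrix $P$ whose column space [denoted by $\mathcal{R}(P)$] is $\mathcal{X}$. It follows from (1.1) that the matrix
$$P_A = A A^\dagger$$
is the orthogonal projection onto $\mathcal{R}(A)$ and that
$$R_A = A^\dagger A$$
is the projection onto $\mathcal{R}(A^H)$, the row space of $A$.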
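The four Penrose conditions and the projection properties of $P_A$ and $R_A$ can be checked numerically on a small case. The sketch below is only an illustration: it assumes a real rank-one matrix, for which the pseudo-inverse has the closed form $A^\dagger = A^T / \|A\|_F^2$, and the helper names (`matmul`, `close`, `rank_one_pinv`) are hypothetical and not part of the paper.

```python
# Sketch: check the four Penrose conditions for a real rank-one matrix.
# Assumption (not from the paper): for a rank-one matrix A = u v^T the
# pseudo-inverse has the closed form A^+ = A^T / ||A||_F^2.

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def close(X, Y, tol=1e-12):
    """True when X and Y agree entrywise up to tol."""
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def rank_one_pinv(A):
    """Pseudo-inverse of a nonzero real rank-one matrix."""
    fro2 = sum(a * a for row in A for a in row)
    return [[a / fro2 for a in row] for row in transpose(A)]

u, v = [1.0, 2.0, 2.0], [3.0, 4.0]
A = [[ui * vj for vj in v] for ui in u]    # 3 x 2, rank one
Ap = rank_one_pinv(A)                      # 2 x 3

P_A = matmul(A, Ap)    # projection onto the column space R(A)
R_A = matmul(Ap, A)    # projection onto the row space R(A^H)

penrose = [
    close(matmul(matmul(Ap, A), Ap), Ap),  # A^+ A A^+ = A^+
    close(matmul(matmul(A, Ap), A), A),    # A A^+ A = A
    close(transpose(P_A), P_A),            # (A A^+)^H = A A^+
    close(transpose(R_A), R_A),            # (A^+ A)^H = A^+ A
]
idempotent = close(matmul(P_A, P_A), P_A) and close(matmul(R_A, R_A), R_A)
print(penrose, idempotent)
```

For matrices of higher rank the closed form used here no longer applies, and a singular value decomposition would be the natural tool instead.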
The second construction is the solution of the linear least squares problem of choosing a vector $x$ to minimize
(1.3)  $$\rho(x) = \|b - Ax\|_2,$$
where $b$ is a fixed vector and $\|\cdot\|_2$ denotes the usual Euclidean norm. The solutions of this problem are given by
(1.4)  $$x = A^\dagger b + (I - R_A) z,$$
where $z$ is an arbitrary vector.
ON THE PERTURBATION OF PSEUDO-INVERSES
It follows that $x = A^\dagger b$ is the unique solution of (1.3) that has minimal norm.
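A small worked instance of (1.4) may make the minimal-norm statement concrete. The data $A = (1 \;\; 1)$ and $b = 2$ are chosen here purely for illustration and do not come from the paper.

```latex
\[
A = \begin{pmatrix} 1 & 1 \end{pmatrix}, \quad b = 2, \qquad
A^\dagger = \tfrac12 \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
R_A = A^\dagger A = \tfrac12 \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.
\]
\[
x = A^\dagger b + (I - R_A) z
  = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
  + t \begin{pmatrix} 1 \\ -1 \end{pmatrix},
\qquad t = \tfrac12 (z_1 - z_2),
\qquad \|x\|_2^2 = 2 + 2t^2 .
\]
```

Every such $x$ satisfies $Ax = b$, and the norm is smallest at $t = 0$, that is, at $x = A^\dagger b$.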
The object of this paper is to describe the effects of perturbations in $A$ on $A^\dagger$, on $P_A$, and on $A^\dagger b$; i.e., on the pseudo-inverse, on the projection onto $\mathcal{R}(A)$, and on the solution of the linear least squares problem. Such descriptions are important for three reasons. First, the results are useful mathematical tools. Second, in numerical applications the elements of $A$ will seldom be known exactly, and it is necessary to have bounds on the effects of the uncertainties in $A$. Finally, many numerical processes for computing projections and least squares solutions behave as if exact computations had been performed on a perturbed matrix $A + E$, where $E$ is a small matrix whose size depends on the algorithm and the arithmetic used in its execution.
We shall be concerned with three kinds of results: perturbation bounds, asymptotic expressions, and derivatives. The perturbation bounds are needed in the applications mentioned above. Asymptotic expressions and derivatives are useful computational tools when the perturbation is actually known. Moreover they can be used to check the sharpness of the perturbation bounds. Not surprisingly it is rather difficult to obtain a reasonably sharp perturbation bound that tells the complete story of the effects of the perturbations. Asymptotic forms and derivatives are easier to come by.
In order to make this survey reasonably self-contained, we begin in § 2 with a review of the necessary background. In § 3 we develop the perturbation theory for the pseudo-inverse, in § 4 for the projection $P_A$, and in § 5 for the least squares solution $A^\dagger b$.
Notes and references. For background on the generalized inverse see the books by Ben-Israel and Greville (1974), Boullion and Odell (1971), and Rao and Mitra (1971). The expression (1.1) is due to Penrose (1955), (1956), whose papers initiated the current interest in the pseudo-inverse.
Many articles on perturbation theory for pseudo-inverses and related problems have appeared in the literature. To date the most complete survey of the problem has been given by Wedin (1973). In addition to collecting and unifying earlier material, this paper will present some new results.
2. Preliminaries.
Notation. Throughout this paper we shall use the notational conventions of Householder (1964). Specifically, matrices are denoted by upper case italic and Greek letters, vectors by lower case italic letters, and scalars by lower case Greek letters. The symbol $\mathbb{C}$ denotes the set of complex numbers, $\mathbb{C}^n$ the set of complex $n$-vectors, and $\mathbb{C}^{m \times n}$ the set of complex $m \times n$ matrices. The matrix $A^H$ is the conjugate transpose of $A$. The column space of $A$ is denoted by $\mathcal{R}(A)$, and its orthogonal complement by $\mathcal{R}(A)^\perp$.
We shall be concerned with a fixed matrix $A \in \mathbb{C}^{m \times n}$ with
$$\operatorname{rank}(A) = r.$$
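(2.1)  $$U^H A V = \begin{pmatrix} A_{11} & 0 \\ 0 & 0 \end{pmatrix},$$
$$U^H E V = \begin{pmatrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{pmatrix}, \qquad U^H B V = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \begin{pmatrix} A_{11} + E_{11} & E_{12} \\ E_{21} & E_{22} \end{pmatrix}.$$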
These forms will be called reduced forms of the matrices $A$, $B$, and $E$, and in the sequel we shall often assume that the matrices are in reduced form. In this case, the pseudo-inverse is given by
(2.2)  $$A^\dagger = \begin{pmatrix} A_{11}^{-1} & 0 \\ 0 & 0 \end{pmatrix}.$$
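The reduced form may be chosen so that
$$A_{11} = \Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r),$$
where
$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0.$$
This reduced form is called the singular value decomposition of the matrix $A$, and the numbers $\sigma_i$ are called the singular values of $A$. From the relation (2.2) and the fact that $(U^H A V)^\dagger = V^H A^\dagger U$, it follows that
$$A^\dagger = V \begin{pmatrix} \Sigma^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^H.$$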
The $i$th singular value of a matrix $A$, which will be denoted by $\sigma_i(A)$, can be written in the form
(2.3)  $$\sigma_i(A) = \sup_{\dim(\mathcal{X}) = i}\ \inf_{x \in \mathcal{X},\ \|x\|_2 = 1} \|Ax\|_2 \qquad (i = 1, 2, \ldots, n),$$
where
(2.4)  $$\|y\|_2 = \sqrt{y^H y}$$
is the usual Euclidean norm. This characterization provides a natural convention for numbering the singular values of a rectangular matrix: $A \in \mathbb{C}^{m \times n}$ has $n$ singular values.
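Two inequalities that we shall need in the sequel follow fairly directly from (2.3). They are
$$\sigma_i(A) - \sigma_1(E) \le \sigma_i(A + E) \le \sigma_i(A) + \sigma_1(E)$$
and
(2.5)  $$\sigma_i(AC) \le \begin{cases} \sigma_i(A)\,\sigma_1(C), \\ \sigma_1(A)\,\sigma_i(C). \end{cases}$$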
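A norm on $\mathbb{C}^{m \times n}$ is a real-valued function $\|\cdot\|$ that satisfies the conditions
(2.6)
1. $A \ne 0 \Rightarrow \|A\| > 0$,
2. $\|\alpha A\| = |\alpha|\,\|A\|$,
3. $\|A + B\| \le \|A\| + \|B\|$.
A norm is unitarily invariant if
$$\|U^H A V\| = \|A\|$$
for all unitary matrices $U$ and $V$. The perturbation bounds in this paper will be cast in terms of unitarily invariant norms, whose properties will now be described.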
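Let $U$ and $V$ be the unitary matrices realizing the singular value decomposition of the matrix $A \in \mathbb{C}^{m \times n}$. Then for any unitarily invariant norm $\|\cdot\|_{m,n}$
(2.7)  $$\|A\|_{m,n} = \|U^H A V\|_{m,n} = \left\| \begin{pmatrix} \Sigma & 0 \\ 0 & 0 \end{pmatrix} \right\|_{m,n}.$$
Thus $\|A\|_{m,n}$ is a function of the singular values of $A$, say
(2.8)  $$\|A\|_{m,n} = \varphi_{m,n}(\sigma_1, \sigma_2, \ldots, \sigma_n).$$
The function $\varphi_{m,n}$ is monotone in the sense that
(2.9)  $$0 \le \sigma_i \le \tau_i \quad (i = 1, 2, \ldots, n) \ \Longrightarrow\ \varphi_{m,n}(\sigma_1, \ldots, \sigma_n) \le \varphi_{m,n}(\tau_1, \ldots, \tau_n).$$
We shall say that the norm $\|\cdot\|_{m,n}$ is generated by $\varphi_{m,n}$.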
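An important norm is the spectral norm $\|\cdot\|_2$ generated by the function $\varphi_2$ defined by
$$\varphi_2(\sigma_1, \sigma_2, \ldots, \sigma_n) = \max\{|\sigma_1|, |\sigma_2|, \ldots, |\sigma_n|\}.$$
Equivalently,
(2.10)  $$\|A\|_2 = \sup_{\|x\|_2 = 1} \|Ax\|_2,$$
where $\|\cdot\|_2$ on the right denotes the Euclidean norm defined by (2.4).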
The spectral norm satisfies an important consistency relation with other unitarily invariant norms. If $\|\cdot\|$ is a unitarily invariant norm generated by $\varphi$, then it follows from (2.5) and (2.9) that
(2.11)  $$\|CD\| \le \begin{cases} \|C\|\,\|D\|_2, \\ \|C\|_2\,\|D\| \end{cases}$$
whenever $CD \in \mathbb{C}^{m \times n}$ and, respectively, $C \in \mathbb{C}^{m \times n}$ or $D \in \mathbb{C}^{m \times n}$.
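A second example of a unitarily invariant norm is the Frobenius norm generated by the function
$$\varphi_F(\sigma_1, \ldots, \sigma_n) = (\sigma_1^2 + \cdots + \sigma_n^2)^{1/2}.$$
For any matrix $A \in \mathbb{C}^{m \times n}$
$$\|A\|_F^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} |\alpha_{ij}|^2 = \operatorname{trace}(A^H A).$$
The Frobenius norm satisfies the consistency relation
$$\|CD\|_F \le \|C\|_F\,\|D\|_F.$$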
Since we shall be dealing with matrices of varying dimensions, we shall work with a family of unitarily invariant norms defined on $\bigcup_{m,n=1}^{\infty} \mathbb{C}^{m \times n}$. It is important that the individual norms so defined interact with one another properly. Accordingly, we make the following definition.
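DEFINITION 2.1. Let $\|\cdot\| : \bigcup_{m,n=1}^{\infty} \mathbb{C}^{m \times n} \to \mathbb{R}$ be a family of unitarily invariant norms. Then $\|\cdot\|$ is uniformly generated if there is a symmetric function $\varphi$, defined for all infinite sequences with only a finite number of nonzero terms, such that
$$\|A\| = \varphi(\sigma_1(A), \sigma_2(A), \ldots, \sigma_n(A), 0, 0, \ldots)$$
for all $A \in \mathbb{C}^{m \times n}$. It is normalized if
$$\|x\| = \|x\|_2$$
for any vector $x$ considered as a matrix.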
foranyvectorx considered
theconditions
(2.6). Any
mustsatisfy
Thefunction
spintheabovedefinition
Indeedwe have
canbe normalized.
normdefined
bysucha function
p(P-1(X), 0, 0, ' ' ') =
0, 0, ..),
p(1IX112,
= Ui|X
of
,u thatis independent
thatllx
112
forsomeconstant
fromwhichitfollows
11
'p
thedimension
of
then
the
normalized
family
generates
ofx. The function
A1S
norms.
A uniformly generated family of norms has some nice properties. First, since the nonzero singular values of a matrix and its conjugate transpose are the same, we have
$$\|A^H\| = \|A\|.$$
Second, if a matrix is bordered by zero matrices, its norm remains unchanged; i.e.,
(2.12)  $$\left\| \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix} \right\| = \|A\|.$$
In particular if $A$ is in reduced form, then
$$\|A\| = \|A_{11}\| \quad\text{and}\quad \|A^\dagger\| = \|A_{11}^{-1}\|.$$
It is also a consequence of (2.12) that (2.11) holds for a uniformly generated family of norms whenever the product $CD$ is defined, as may be seen by bordering $C$ and $D$ with zero matrices until they are both square.
A third property is that if $\|\cdot\|$ is normalized then
(2.13)  $$\|A\|_2 \le \|A\|.$$
In fact from (2.11) and the fact that $\|x\| = \|x\|_2$, we have
(2.14)  $$\|Ax\|_2 = \|Ax\| \le \|A\|\,\|x\|_2$$
for all $x$. But by (2.10) $\|A\|_2$ is the smallest number for which (2.14) holds for all $x$, from which (2.13) follows. A trivial corollary of (2.11) and (2.13) is that $\|\cdot\|$ is consistent:
$$\|CD\| \le \|C\|\,\|D\|.$$
Finally we observe that
(2.15)  $$\forall x\ \ \|Cx\|_2 \le \|Dx\|_2 \ \Longrightarrow\ \|C\| \le \|D\|.$$
To prove this implication note that by (2.3) the hypothesis implies that $\sigma_i(C) \le \sigma_i(D)$. Hence the inequality $\|C\| \le \|D\|$ follows from (2.9).
In the sequel $\|\cdot\|$ will always refer to a uniformly generated, normalized, unitarily invariant norm.
Perturbation of matrix inverses. We shall later need some results on the inverses of perturbations of nonsingular matrices. These are summarized in the following theorem.
THEOREM 2.2. Let $A$ and $B = A + E$ be nonsingular. Then
(2.16)  $$\frac{\|B^{-1} - A^{-1}\|}{\|A^{-1}\|} \le \hat\kappa \frac{\|E\|}{\|A\|},$$
where
(2.17)  $$\hat\kappa = \|A\|\,\|B^{-1}\|_2.$$
If $A$ is nonsingular and
(2.18)  $$\|A^{-1}\|_2\,\|E\| < 1,$$
then $B$ is a fortiori nonsingular. In this case
(2.19)  $$\|B^{-1}\| \le \|A^{-1}\| / \gamma,$$
and
(2.20)  $$\frac{\|B^{-1} - A^{-1}\|}{\|A^{-1}\|} \le \frac{\kappa}{\gamma} \frac{\|E\|}{\|A\|},$$
where
(2.21)  $$\kappa = \|A\|\,\|A^{-1}\|_2$$
and
$$\gamma = 1 - \kappa \|E\| / \|A\| > 0.$$
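The second half of the theorem is easy to check numerically on a small case. The following sketch is only an illustration: it assumes real $2 \times 2$ matrices, takes the Frobenius norm as the normalized unitarily invariant norm $\|\cdot\|$, and uses made-up data; the helper names (`inv2`, `fro`, `spec2`) are hypothetical.

```python
import math

def inv2(M):
    """Inverse of a nonsingular real 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def fro(M):
    """Frobenius norm, used here as the norm ||.|| of the theorem."""
    return math.sqrt(sum(x * x for row in M for x in row))

def spec2(M):
    """Spectral norm of a real 2 x 2 matrix (its largest singular value)."""
    (a, b), (c, d) = M
    t = a * a + b * b + c * c + d * d        # sigma_1^2 + sigma_2^2
    det = a * d - b * c                      # +/- sigma_1 * sigma_2
    disc = math.sqrt(max(t * t - 4.0 * det * det, 0.0))
    return math.sqrt((t + disc) / 2.0)

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Illustrative data (not from the paper).
A = [[2.0, 1.0], [1.0, 1.0]]
E = [[0.01, -0.02], [0.03, 0.01]]
B = [[a + e for a, e in zip(ra, re)] for ra, re in zip(A, E)]

Ainv, Binv = inv2(A), inv2(B)
kappa = fro(A) * spec2(Ainv)               # (2.21)
gamma = 1.0 - kappa * fro(E) / fro(A)      # equals 1 - ||A^-1||_2 ||E||
assert gamma > 0                           # hypothesis (2.18)

lhs_219, rhs_219 = fro(Binv), fro(Ainv) / gamma
lhs_220 = fro(sub(Binv, Ainv)) / fro(Ainv)
rhs_220 = kappa * fro(E) / (gamma * fro(A))
print(lhs_219 <= rhs_219, lhs_220 <= rhs_220)
```

Shrinking `E` drives `gamma` toward one, so that the right-hand side of (2.20) approaches the familiar first-order estimate $\kappa \|E\| / \|A\|$.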
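The projection onto $\mathcal{R}(A)^\perp$ will be denoted by
$$P_A^\perp = I - P_A.$$
Likewise
$$R_A^\perp = I - R_A$$
will denote the projection onto $\mathcal{R}(A^H)^\perp$.
When $A$ is in reduced form, its projections can be easily written out:
$$P_A = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} \in \mathbb{C}^{m \times m}, \qquad P_A^\perp = \begin{pmatrix} 0 & 0 \\ 0 & I_{m-r} \end{pmatrix} \in \mathbb{C}^{m \times m},$$
$$R_A = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} \in \mathbb{C}^{n \times n}, \qquad R_A^\perp = \begin{pmatrix} 0 & 0 \\ 0 & I_{n-r} \end{pmatrix} \in \mathbb{C}^{n \times n}.$$
It follows that
$$\|P_A A R_A\| = \|A_{11}\|$$
and
$$\|P_A E R_A\| = \|E_{11}\|, \quad \|P_A E R_A^\perp\| = \|E_{12}\|, \quad \|P_A^\perp E R_A\| = \|E_{21}\|, \quad \|P_A^\perp E R_A^\perp\| = \|E_{22}\|.$$
These identities enable us to pass from results for the reduced form to general results stated in terms of projections of $A$ and $E$.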
We shall need some properties of norms of projections later. These are summarized in the following theorem.
THEOREM 2.3. For any $A$ and $B$ the following statements are true.
1. If $\operatorname{rank}(A) = \operatorname{rank}(B)$, then the singular values of $P_A P_B^\perp$ and $P_B P_A^\perp$ are the same so that
$$\|P_A P_B^\perp\| = \|P_B P_A^\perp\|.$$
Moreover the nonzero singular values $\sigma$ of $P_A P_B^\perp$ correspond to pairs $\pm\sigma$ of eigenvalues of $P_B - P_A$, so that
$$\|P_B - P_A\|_2 = \|P_A P_B^\perp\|_2 = \|P_B P_A^\perp\|_2.$$
IIPBPA'I
IIPBPA
11
Proof. Proofs of parts 1 and 2 are readily found in the literature; however, a proof of part 1, based on a useful matrix decomposition, is given in the Appendix to this paper. For part 3 write $P_B = P_1 + P_2$ where $\operatorname{rank}(P_1) = \operatorname{rank}(A)$ and $P_A P_2 = 0$ (i.e., $\mathcal{R}(P_2)$ is orthogonal to $\mathcal{R}(A)$). Then
$$\|P_A P_B^\perp\| = \|P_A (I - P_1)\| = \|P_1 P_A^\perp\|,$$
the last equality following from part 1. Now for any $x$
$$\|P_1 P_A^\perp x\|_2 \le \|P_B P_A^\perp x\|_2,$$
and the result follows from (2.15). □
When $B = A + E$, we can estimate $\|P_B P_A^\perp\|$ in terms of $E$.
THEOREM 2.4. The product $P_B P_A^\perp$ can be written in the form
(2.22)  $$P_B P_A^\perp = (B^\dagger)^H R_B E^H P_A^\perp.$$
Hence
(2.23)  $$\|P_B P_A^\perp\| \le \|B^\dagger\|_2\,\|E\|,$$
and, if $\operatorname{rank}(B) \le \operatorname{rank}(A)$,
(2.24)  $$\|P_B P_A^\perp\| \le \min\{\|B^\dagger\|_2, \|A^\dagger\|_2\}\,\|E\|.$$
Proof. We have
$$P_B P_A^\perp = (B^\dagger)^H B^H P_A^\perp = (B^\dagger)^H E^H P_A^\perp = (B^\dagger)^H R_B E^H P_A^\perp,$$
THEOREM
(2.25)
Proof. We shall use the reduced forms of $A$ and $B$. First suppose (2.25) holds.
")]R
But
R(A)
[)
P BlllQ
J
E2109
PHE12
E22J
The first row and column of $P^H B_{11} Q$ is zero. If $E_{21} q \ne 0$, then the nonzero vector
(E21q)
The treatment of unitarily invariant norms in finite dimensional spaces has often been a little sloppy. In infinite dimensional settings there is usually only one space and one generating function, and the same is true in a finite dimensional setting when one is concerned with square matrices. However, when one considers rectangular matrices with varying dimensions, different norms can be used for different dimensions, and there is no reason why these norms should interact nicely. How bad things can get is illustrated by the family of norms $\|\cdot\|$ defined for $A \in \mathbb{C}^{m \times n}$ by
=- IIA112.
IIA11
n
is an acute perturbation of $A$. All these theorems are based on expressions for $B^\dagger$, which also yield asymptotic expressions for $B^\dagger$ and expressions for the derivative of $A^\dagger$.
Lower bounds. Before proceeding to obtain bounds on $\|B^\dagger - A^\dagger\|$, we shall
(3.1)  $$\|B^\dagger - A^\dagger\|_2 \ge 1 / \|E\|_2.$$
(3.2)  $$\|B^\dagger\|_2 \ge 1 / \|E\|_2.$$
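Proof. Suppose for definiteness that $\operatorname{rank}(B) \ge \operatorname{rank}(A)$. Then there is, say, a vector $y \in \mathcal{R}(B)$ with $\|y\|_2 = 1$ such that $y \in \mathcal{R}(A)^\perp$ (otherwise work with $A^H$ and $B^H$). Thus
$$1 = y^H y = y^H P_B y = y^H B B^\dagger y = y^H E B^\dagger y \le \|E\|_2\,\|B^\dagger y\|_2,$$
which shows that $\|B^\dagger y\|_2$, and hence $\|B^\dagger\|_2$, is not less than $1 / \|E\|_2$. From this and the fact that $A^\dagger y = 0$,
$$\frac{1}{\|E\|_2} \le \|B^\dagger y\|_2 = \|(B^\dagger - A^\dagger) y\|_2 \le \|B^\dagger - A^\dagger\|_2. \qquad \square$$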
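The lower bounds can be attained. The following diagonal example, an illustrative choice that is not taken from the paper, has $\operatorname{rank}(B) > \operatorname{rank}(A)$ and meets both (3.1) and (3.2) with equality when $0 < \epsilon \le 1$.

```latex
\[
A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad
E = \begin{pmatrix} 0 & 0 \\ 0 & \epsilon \end{pmatrix}, \quad
B = A + E = \begin{pmatrix} 1 & 0 \\ 0 & \epsilon \end{pmatrix},
\qquad 0 < \epsilon \le 1 .
\]
\[
A^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad
B^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & \epsilon^{-1} \end{pmatrix},
\qquad
\|B^\dagger - A^\dagger\|_2 = \|B^\dagger\|_2 = \epsilon^{-1} = \frac{1}{\|E\|_2} .
\]
```

The smaller the perturbation that raises the rank, the larger the change in the pseudo-inverse, which is why the upper bounds that follow are restricted to perturbations that preserve rank.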
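(3.3)  $$B^\dagger - A^\dagger = -B^\dagger P_B E R_A A^\dagger + B^\dagger P_B P_A^\perp - R_B^\perp R_A A^\dagger,$$
(3.4)  $$B^\dagger - A^\dagger = -B^\dagger P_B E R_A A^\dagger + (B^H B)^\dagger R_B E^H P_A^\perp + R_B^\perp E^H P_A (A A^H)^\dagger.$$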
- At C
IlBt
max{IIAt112,
IIBtII2}IIEII,
Frobenius
Proof. The proof is a slight modification of the proof given by Wedin (1973). We shall give only the proof for the Frobenius norm.
(3.5)
+ IIPBP1
112).
+ F2112
tPA112
IIF1
?- IBtII2(IIPBEA
But fromTheorems2.4 and 2.5
|
t|2 + IIPBPA
IF
IIPBEAtpA112+ IIPBPAIIF
IIPBEA
+ IIP_EA t2 = IIEAtl2
IIPBEA tII2
Hence
IIEIIIIA
t1l2
+ F2IIF
C-IAtII2IIBtII2IIEIIF.
JIF1
(3.6)
Also fromTheorem2.5
(3.7)
=
RBRAIF= IIAtI2IIRAR
BIIF
IIF3IIFI|AtII2IR
tERBIIF
C IIAtII2IIEIIF,
= IIA
tI2lIA
tIl21lBtll2llEII.
|lBt- Atll AIIA
where
table.
A is giveninthefollowing
(3.8)
>
X X1
Arbitrary
Spectral
Frobenius
rank(A)<min (m,n)
rank(A) = m $ n =min(m,n)
di
rank(A)=m=n
(1+14)/2
(lBt
(3.9)
- AtIIc
|EIl
Al
/A1K
~~~~~~~IIBtII2
where
K
= IAJJ
IIA'I12
B-OA
[ 1 + _2(F)9] 1/2]
ao-
(1 + ao2)1/2= (1 +
2)1/21
we have
a-1
q (aF)
aq, (F).
+ o (|IFII).
q, (F)= IIFII
For largeF, r,P(F)is bounded:
qls,(F)
Ir
Ir1.
Finally, for the spectral norm
$$\psi_2(F) = \|F\|_2 / (1 + \|F\|_2^2)^{1/2}.$$
(F)
satisfies
1 Ftl
(3.11)
and
(3.12)
II()-V (I
) ||=Q(F) .
Proof.It is easilyverifiedthat
(I)
(3.13)
(I+FHF)-(l F(
11
_2(F)]2
fromwhich(3.11) follows.Also if
G = (F)
- (I
0),
then
GGH
I(I+
FHF)1.
whichestablishes(3.12). [
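The main result is based on an explicit representation of $B^\dagger$. We shall work with the reduced forms of $A$ and $B$.
THEOREM 3.7. Let $B$ be an acute perturbation of $A$. Then
(3.14)  $$B^\dagger = (I \;\; F_{12})^\dagger\, B_{11}^{-1} \begin{pmatrix} I \\ F_{21} \end{pmatrix}^\dagger,$$
where
$$F_{21} = E_{21} B_{11}^{-1}, \qquad F_{12} = B_{11}^{-1} E_{12}.$$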
I
(B) =[(
:)
(E22
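Since $B_{11}(B_{11}^{-1} E_{12}) = E_{12}$, we must have
$$\begin{pmatrix} E_{12} \\ E_{22} \end{pmatrix} = \begin{pmatrix} B_{11} \\ E_{21} \end{pmatrix} B_{11}^{-1} E_{12},$$
from which it follows that
(3.15)  $$B = \begin{pmatrix} I \\ F_{21} \end{pmatrix} B_{11} (I \;\; F_{12}).$$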
P'ERA
= IIAIIIB1I112.
Then
(3.16)
(IA
IIKAI
tII
IAII
IiI(K4
+l
IIAIIJ
IAII
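$$J_{21} = \begin{pmatrix} I_r \\ F_{21} \end{pmatrix}, \quad I_{21} = \begin{pmatrix} I_r \\ 0 \end{pmatrix}, \quad I_{12} = (I_r \;\; 0), \quad J_{12} = (I_r \;\; F_{12}).$$
$$B^\dagger - A^\dagger = (J_{12}^\dagger - I_{12}^\dagger) A_{11}^{-1} I_{21}^\dagger + J_{12}^\dagger A_{11}^{-1} (J_{21}^\dagger - I_{21}^\dagger) + J_{12}^\dagger (B_{11}^{-1} - A_{11}^{-1}) J_{21}^\dagger$$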
- A 11?J
IIJt12(B11
IIA1IK11F1111
IIA11II'
By Lemma 3.6
(3.19)
(F12) = IA
(B 1E12)
IIA1l'IIqi,
11IIq,
11111Q(
IIA11)
and likewise
(3.20)
1Jtf2A
-I1t)II
(Jtl1
The bound (3.16) follows on combining (3.17), (3.18), (3.19), and (3.20) and remembering that $\|A_{11}^{-1}\| = \|A^\dagger\|$.
However, the bound additionally shows that $E_{12}$ and $E_{21}$ can have at most a bounded effect on $\|B^\dagger - A^\dagger\|$.
When $A$ is square and nonsingular, $E_{12}$ and $E_{21}$ are void, and the bound reduces to that of Theorem 2.2. Note that the number $\hat\kappa$, defined in analogy with (2.17), plays an analogous role here.
As in the second part of Theorem 3.2, if $E_{11}$ is sufficiently small, we can estimate $\|B_{11}^{-1}\|_2$ in terms of $\|A_{11}^{-1}\|_2$ and $\|E_{11}\|$. This gives the following corollary.
Theorem
COROLLARY
3.9.
3.8, let
(3.21)
K =
IIA11
IIAtII2
and supposethat
1,
IlAtIJ21 JE1111<
so that
y
Then
(3.22)
IlBtll IIAtjj/^y
and
IIBt-AtIIK 11F111 E2
1 i~(~A)+
ation
f BI1f1
vhIAB11
Proof. FromIItI
teeA
(3.23)
____K
hav
IIA
1P'q'P
E12\
By Theorem 2.2
$$\|B_{11}^{-1}\| \le \|A_{11}^{-1}\| / \gamma = \|A^\dagger\| / \gamma,$$
which establishes (3.22). Also $\hat\kappa \le \kappa / \gamma$, and (3.23) follows from (3.16).
The number $\kappa$ is defined in analogy with (2.21). For $E_{11}$ sufficiently small, $\hat\kappa \cong \kappa$, and (3.16) and (3.23) give essentially the same bound.
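Asymptotic forms and derivatives. Asymptotic forms for $B^\dagger$ may be obtained from either (3.4) or (3.14). Of course for $B^\dagger$ to approach $A^\dagger$ we must have $\operatorname{rank}(A) = \operatorname{rank}(B)$; and since we are assuming that $E$ is arbitrarily small, $B$ may be assumed to be an acute perturbation of $A$. In this case
$$B^\dagger = A^\dagger + O(\|E\|),$$
and
$$P_B = B B^\dagger = (A + E)[A^\dagger + O(\|E\|)] = P_A + O(\|E\|).$$
Hence from (3.4)
(3.24)  $$B^\dagger = A^\dagger - A^\dagger P_A E R_A A^\dagger + (A^H A)^\dagger R_A E^H P_A^\perp + R_A^\perp E^H P_A (A A^H)^\dagger + O(\|E\|^2).$$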
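It follows immediately from (3.24) that if $A(\tau)$ is a differentiable function of $\tau$ with
$$\operatorname{rank}[A(\tau)] = \operatorname{rank}[A(\tau')]$$
for all $\tau$ and $\tau'$, then $A(\tau)^\dagger$ is a differentiable function of $\tau$ and
(3.25)  $$\frac{dA^\dagger}{d\tau} = -A^\dagger P_A \frac{dA}{d\tau} R_A A^\dagger + (A^H A)^\dagger R_A \frac{dA^H}{d\tau} P_A^\perp + R_A^\perp \frac{dA^H}{d\tau} P_A (A A^H)^\dagger.$$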
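The derivative formula can be sanity-checked against a central finite difference. The sketch below is an illustration under simplifying assumptions that are not part of the paper: the matrix is real, $3 \times 2$, and of full column rank, so that $A^\dagger = (A^T A)^{-1} A^T$, $A^\dagger P_A = A^\dagger$, $R_A = I$, and the term containing $I - R_A$ vanishes. The matrices `A0`, `A1` and all helper names are hypothetical.

```python
# Sketch: finite-difference check of the derivative of the pseudo-inverse,
# specialised to a real 3 x 2 matrix A(t) = A0 + t*A1 of full column rank.
# In that case the formula reduces to
#   dA^+/dt = -A^+ (dA/dt) A^+ + (A^T A)^{-1} (dA/dt)^T (I - A A^+).

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def axpy(X, Y, s):
    """Entrywise X + s * Y."""
    return [[x + s * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def pinv_full_col(A):
    """Pseudo-inverse of a real matrix with two independent columns."""
    At = transpose(A)
    return matmul(inv2(matmul(At, A)), At)

A0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A1 = [[0.0, 1.0], [1.0, 0.0], [0.0, -1.0]]    # dA/dt

def A_of(t):
    return axpy(A0, A1, t)

t, h = 0.3, 1e-6
A = A_of(t)
Ap = pinv_full_col(A)
I3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
P_perp = axpy(I3, matmul(A, Ap), -1.0)             # I - P_A
G_inv = inv2(matmul(transpose(A), A))              # (A^H A)^+ in this case

term1 = matmul(matmul(Ap, A1), Ap)                 # A^+ (dA/dt) A^+
term2 = matmul(matmul(G_inv, transpose(A1)), P_perp)
formula = axpy(term2, term1, -1.0)                 # -term1 + term2

central = axpy(pinv_full_col(A_of(t + h)), pinv_full_col(A_of(t - h)), -1.0)
finite_diff = [[x / (2.0 * h) for x in row] for row in central]

err = max(abs(f - g) for rf, rg in zip(formula, finite_diff) for f, g in zip(rf, rg))
print(err)
```

The agreement holds only while the rank stays constant along the path; at a value of `t` where the two columns become dependent the pseudo-inverse is not differentiable and the check would fail.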
dr
ar
Ht
(I A 1E[H) + O(IE1111JE2111)
and
(I F12)t
F~~
(12A
11)
+ O(II,1F11111F1211).
Hence from(3.14)
(AHAA11)1E2H+ O(IF1111I1JE2111)
EH(A
HAllA AH)-lE H
+ O(JJF11E12
IIE2111)
IE1211
Theorem
3.8. Then
(4.1)  $$\|P_B P_A^\perp\|_2 \le \frac{\hat\kappa \|E_{21}\|_2 / \|A\|_2}{[1 + (\hat\kappa \|E_{21}\|_2 / \|A\|_2)^2]^{1/2}} < 1.$$
matxF21
The matrix
R()=R
I)
PB-
PA
(F21(I+F2HF2D)-1
_ ((I+FHF21)-1-I
(I+FHF21)D1F2HA
F21(I+F2AF21)21F2I'
fromwhichit is easilyverifiedthat
(4.3)
(PBP
)2
(FHF21(I+FAHF21)-1
r2(F21)/[l
'
= IIF21112K11E21112/IIA 112
o(F21)
In terms of projections,
$$\|P_B - P_A\|_2 \le \frac{\hat\kappa \|P_A^\perp E R_A\|_2 / \|A\|_2}{[1 + (\hat\kappa \|P_A^\perp E R_A\|_2 / \|A\|_2)^2]^{1/2}}.$$
The bound is interesting in several ways. First it depends not at all on $E_{12}$ and $E_{22}$. Second its dependence on $E_{11}$ is only through the constant $\hat\kappa$. Third the bound is always less than unity. Finally, it goes to zero along with $E_{21}$. We may summarize this last observation in the following corollary.
COROLLARY 4.2. Regarding $B$ as variable, a sufficient condition for
$$\lim_{B \to A} P_B = P_A$$
is
$$\lim_{B \to A} P_A^\perp B R_A = 0.$$
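The bound (4.1) can be attained. In the following illustrative example, which is not from the paper, $A$ is already in reduced form with $A_{11} = B_{11} = 1$ and $E_{21} = \epsilon > 0$, and $\hat\kappa$ is assumed to be $\|A\|_2 \|B_{11}^{-1}\|_2 = 1$.

```latex
\[
A = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
B = \begin{pmatrix} 1 \\ \epsilon \end{pmatrix}, \qquad
P_A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad
P_B = \frac{1}{1 + \epsilon^2}
      \begin{pmatrix} 1 & \epsilon \\ \epsilon & \epsilon^2 \end{pmatrix}.
\]
\[
P_B - P_A = \frac{1}{1 + \epsilon^2}
      \begin{pmatrix} -\epsilon^2 & \epsilon \\ \epsilon & \epsilon^2 \end{pmatrix}
\quad\text{has eigenvalues}\quad \pm \frac{\epsilon}{(1 + \epsilon^2)^{1/2}},
\qquad\text{so}\qquad
\|P_B - P_A\|_2 = \frac{\epsilon}{(1 + \epsilon^2)^{1/2}} .
\]
```

This equals the right-hand side of (4.1) with $\hat\kappa \|E_{21}\|_2 / \|A\|_2 = \epsilon$. It is the sine of the angle between the two column spaces, it stays below unity for every $\epsilon$, and it tends to zero with $E_{21}$.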
nay replace iX by
(4.4)
_
-pA
PB-A
(I E2 1112)
F2H, + 0 (I E2 1113)
+ O(IE21113) O(I1E21112)
FF21
J
In terms of projections
(4.5)  $$P_B = P_A + P_A^\perp E R_A A^\dagger + (A^\dagger)^H R_A E^H P_A^\perp + O(\|P_A^\perp E R_A\|^2).$$
(4.6)  $$\frac{dP_A}{d\tau} = P_A^\perp \frac{dA}{d\tau} R_A A^\dagger + (A^\dagger)^H R_A \frac{dA^H}{d\tau} P_A^\perp.$$
dr
+ PBERAA.
(b2
X1=A -ib1
and
X2 = 0.
r= b - Ax
is givenby
11r112=11b2112.
$$\|b_1\|_2 = \eta \|A\|_2 \|x\|_2.$$
Since $b_1 = A_{11} x_1$, we have $\eta \le 1$. Also $\|x\|_2 \le \|A^\dagger\|_2 \|b_1\|_2$, which shows that $\eta \ge \kappa^{-1}$.
(5.2)  $$\frac{\|h\|_2}{\|x\|_2} \le \kappa \eta \frac{\|P_A k\|_2}{\|P_A b\|_2}.$$
(5.3)
But $\|x\|_2 = \eta^{-1} \|b_1\|_2 / \|A\|_2$, which combined with (5.3) yields (5.2). □
Theorem 5.1 shows that the perturbation in $x$ is determined by the projection of $k$ onto $\mathcal{R}(A)$. However, $P_A k$ is normalized by $\|P_A b\|_2$, and if this latter quantity is small, the perturbation may be large. Since
$$\|b\|_2^2 = \|P_A b\|_2^2 + \|r\|_2^2,$$
THEOREM 5.2. Let $x = A^\dagger b$ and $x + h = B^\dagger b$, where $B = A + E$ is an acute perturbation of $A$. Then
(54)
~~Jh1J2JIE11JJ2
jIAIIj2
(5X412
FE
J
~IF2k(Jb2JJ2
ffF21112~
\T1b1112
FA12 IjAII2
IJAII2)
Proof.Write
(5.5)
Then
(5.6)
)bl2-'
IfJJ2(Bl1-Ai
2II11XI2,
112
IIA
and
(5.7)
(V12 -It2)A
-lbl2-
q12(l
A 112 IIX112-
Now
- I]bl
(5.8) Jt2B l l (Jt1- It 1)b= Jt2B11[(I+F2H1F2l)-l
To bound the firsttermin (5.8), note that
Hence
(I+F2lF21)
-I=-(I+F2lF21)
(5.9)
+ FjF21y1)J2JJF2JJF21b1JJ2
= JIBiJ2JJ(I
JJB
=' JIBIIIE2II2iIIXII2
= [K
JJJJ2J2JJ21Blblb12
11
I2A1uIX12.
(5.10)
IIJ21B11
(I+F2HAF21)-1F21b2II2
C-IIBf11
121IIE211
11b211
11b2112
112
=
I2IIF21II2
lb1
IIB11
IIX121JAJ112
-1 AE
21112
11b211
|X112
The bound (5.4) follows on combining (5.5)-(5.10). □
The first two terms in (5.4) are unexceptionable. The first term corresponds to the classical result for linear systems and is the only nonzero term when $A$ is square and nonsingular. The second term depends on $P_A E R_A^\perp$ and vanishes when $A$ is of full column rank, as it is in many applications.
The third term requires more explanation. If terms of second order in $\|E_{21}\|$ are ignored, this expression becomes essentially
(5.11)  $$\hat\kappa^2 \eta \frac{\|b_2\|_2}{\|b_1\|_2} \frac{\|E_{21}\|_2}{\|A\|_2} = \hat\kappa^2 \eta \tan\theta \frac{\|E_{21}\|_2}{\|A\|_2},$$
where $\theta$ is the angle subtended by $b$ and $\mathcal{R}(A)$. The number $\hat\kappa^2 \eta \tan\theta$ can vary from 0 to $\infty$. It is small when $\theta$ is small (i.e. the residual vector is small). It is also reduced in size when $\|E_{11}\|_2$ is small and $x$ reflects the ill-conditioning of $A$ so that $\kappa\eta \cong 1$. When $x$ does not reflect the ill-conditioning of $A$ and $\theta$ is significant, it is of order $\kappa^2$, thus making the third term in (5.4) the dominant one.
We have bounded the third term in the decomposition (5.5) in such a way as to reflect its behavior when $E_{21}$ is small. In fact it is bounded for all values of $E_{21}$, and the third term in (5.4) may be replaced by
-||b||2
Kl
11
E21\
+
IA1
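and
$$\hat r = b - B\hat x = P_B^\perp b, \qquad \hat x = B^\dagger b,$$
then
$$\|\hat r - r\|_2 \le \|P_B - P_A\|_2 \|b\|_2,$$
and $\|P_B - P_A\|_2$ can be bounded by (4.1) in Theorem 4.1.
In applications one may not be interested in $\hat r$; rather one is interested in the residual $\tilde r$ of $\hat x$ with respect to the matrix $A$:
$$\tilde r = b - A\hat x.$$
If we write
$$\tilde r - r = (P_B^\perp - P_A^\perp) b + E\hat x,$$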
then
Asymptotic forms and derivatives. An asymptotic form for the perturbed least squares solution $\hat x$ can be obtained from (3.4):
x= x -
(5.12)
An equivalent asymptotic formula, which may be useful in computational work, can be derived from the reduced form (3.23). The derivative formula corresponding to (5.12) is
dA
dx= -APtPA-R
dr
dr
AXRI
dAH
dr
dAH
dr
b
PAb
An inverse perturbation theorem. Theorem 5.2 shows how a perturbation in $A$ can affect the least squares solution. Here we consider the question: given a vector $\hat x$, under what conditions is $\hat x$ the least squares solution of a slightly perturbed problem? One such condition is given in the following theorem.
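THEOREM 5.3. Let $\hat x \in \mathbb{C}^n$ be given. Let $x = A^\dagger b$, $r = b - Ax$, and $\hat r = b - A\hat x$. If
$$\|\hat r\|_2^2 = \|r\|_2^2 + \epsilon^2,$$
then there is a matrix $E$ of rank unity with
(5.13)  $$\|E\|_2 = \epsilon / \|\hat x\|_2$$
such that $\hat x$ is a least squares solution of the perturbed problem with matrix $A + E$.
Proof. Let $e = A(x - \hat x) \in \mathcal{R}(A)$. Since $r \in \mathcal{R}(A)^\perp$,
$$\|\hat r\|_2^2 = \|r + e\|_2^2 = \|r\|_2^2 + \|e\|_2^2,$$
which shows that $\|e\|_2 = \epsilon$. Let
$$E = e\hat x^H / \|\hat x\|_2^2.$$
Then $E$ satisfies (5.13) and $\mathcal{R}(E) \subset \mathcal{R}(A)$. Hence $\mathcal{R}(A + E) \subset \mathcal{R}(A)$. But
$$b - (A + E)\hat x = r \in \mathcal{R}(A)^\perp,$$
so that the residual of $\hat x$ in the perturbed problem is orthogonal to $\mathcal{R}(A + E)$. □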
A consequence of this theorem is that there is little use hunting for the exact minimizing $x$. Provided the residual is nearly minimal, the approximate solution $\hat x$, however inaccurate, is the exact solution of a slightly perturbed problem.
It is sometimes desirable that the perturbation matrix $E$ in Theorem 5.3 not alter some of the columns of $A$ (e.g. a column may be dates in years). This can be done as follows. Let $\tilde x$ be the vector obtained from $\hat x$ by setting to zero the components corresponding to the columns that are not to be disturbed. Then
$$\tilde E = e\tilde x^H / \|\tilde x\|_2^2$$
is the required matrix. Of course $\|\tilde x\|_2 \le \|\hat x\|_2$ so that $\|\tilde E\|_2 \ge \|E\|_2$; however $\|\tilde E\|_2$ may still be small enough for practical purposes.
Notes and references. Much of the perturbation theory for pseudo-inverses has been a byproduct of the search for bounds for the linear least squares problem. Golub and Wilkinson (1966) gave a first order analysis of the problem and were the first to note the dependence of the solution on $\kappa^2$. Rigorous upper bounds were derived by Hanson and Lawson (1969), Pereyra (1969), and Stewart (1969). Wedin (1969) also gives bounds. More recent treatments have been given by Lawson and Hanson (1974) and Abdelmalek (1974). Van der Sluis (1975) was the first to point out the mitigating effect of $\eta$ in (5.11).
The inverse perturbation theorem is new.
Appendix. In this Appendix, we shall give a proof of part one of Theorem 2.3 that is based on a general decomposition of unitary matrices, a decomposition that is of independent interest. In establishing the decomposition, we shall use the
mA tomeanA E
notation
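THEOREM A.1. Let the unitary matrix $W \in \mathbb{C}^{n \times n}$ be partitioned in the form
$$W = \begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix},$$
where $W_{11} \in \mathbb{C}^{r \times r}$ with $2r \le n$. Then there are unitary matrices $U = \operatorname{diag}(U_1, U_2)$ and $V = \operatorname{diag}(V_1, V_2)$, with $U_1, V_1 \in \mathbb{C}^{r \times r}$, such that
(A.1)  $$U^H W V = \begin{pmatrix} \Gamma & -\Sigma & 0 \\ \Sigma & \Gamma & 0 \\ 0 & 0 & I_{n-2r} \end{pmatrix},$$
where the block rows and columns have dimensions $r$, $r$, and $n - 2r$,
$$\Gamma = \operatorname{diag}(\gamma_1, \gamma_2, \ldots, \gamma_r) \ge 0$$
and
$$\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r) \ge 0.$$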
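Proof. Let
$$\Gamma = U_1^H W_{11} V_1$$
be the singular value decomposition of $W_{11}$, with the diagonal elements of $\Gamma$ ordered so that $\gamma_1 \le \gamma_2 \le \cdots \le \gamma_k < 1 = \gamma_{k+1} = \cdots = \gamma_r$; i.e.,
$$\Gamma = \operatorname{diag}(\Gamma', I_{r-k}),$$
where the diagonal elements of $\Gamma'$ are less than unity. The matrix
$$\begin{pmatrix} W_{11} \\ W_{21} \end{pmatrix} V_1$$
has orthonormal columns. Hence
$$I = \left[ \begin{pmatrix} W_{11} \\ W_{21} \end{pmatrix} V_1 \right]^H \left[ \begin{pmatrix} W_{11} \\ W_{21} \end{pmatrix} V_1 \right] = \Gamma^2 + (W_{21} V_1)^H (W_{21} V_1).$$
Since $I$ and $\Gamma^2$ are diagonal matrices, so is $(W_{21} V_1)^H (W_{21} V_1)$, which says that the columns of $W_{21} V_1$ are orthogonal. Since the $i$th diagonal entry of $I - \Gamma^2$ is the square of the norm of the $i$th column of $W_{21} V_1$, only the first $k \le n - r$ columns of $W_{21} V_1$ are nonzero. Let $U_2 \in \mathbb{C}^{(n-r) \times (n-r)}$ be any unitary matrix whose first $k$ columns are the normalized columns of $W_{21} V_1$. Then
$$U_2^H W_{21} V_1 = \begin{pmatrix} \Sigma \\ 0 \end{pmatrix},$$
where
$$\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_k, 0, \ldots, 0) = \operatorname{diag}(\Sigma', 0),$$
with $r - k$ trailing zeros. Since
$$\operatorname{diag}(U_1, U_2)^H \begin{pmatrix} W_{11} \\ W_{21} \end{pmatrix} V_1 = \begin{pmatrix} \Gamma \\ \Sigma \\ 0 \end{pmatrix}$$
has orthonormal columns, we must have
(A.2)  $$\gamma_i^2 + \sigma_i^2 = 1 \qquad (i = 1, 2, \ldots, r).$$
In particular, $\Sigma'$ is nonsingular.
In a like manner we may determine a unitary matrix $V_2 \in \mathbb{C}^{(n-r) \times (n-r)}$ such that
$$U_1^H W_{12} V_2 = (T, 0),$$
where $T = \operatorname{diag}(\tau_1, \tau_2, \ldots, \tau_r)$ and $\tau_i \le 0$ $(i = 1, 2, \ldots, r)$. Since, as above, $\gamma_i^2 + \tau_i^2 = 1$, it follows from (A.2) that $T = -\Sigma$.
Set $U = \operatorname{diag}(U_1, U_2)$ and $V = \operatorname{diag}(V_1, V_2)$. Then the foregoing shows that the matrix
$$X = U^H W V$$
can be partitioned in the form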
k
r-k
(A.3)
X=
0/ o
O
I
k
r- k
n - 2r
r-k
n-2r
-it
0
0
0
0
X33
X34
X35
r-k
O0t
X43
X44
45
X53
X54
X55
Since columns 1 and 4 in the partition (A.3) are orthogonal, we have
$$\Sigma' X_{34} = 0,$$
and since $\Sigma'$ is nonsingular, we have $X_{34} = 0$. Likewise $X_{35}$, $X_{43}$, and $X_{53}$ are zero. From the orthogonality of columns 1 and 3 in (A.3) it follows that
$$-\Gamma' \Sigma' + \Sigma' X_{33} = 0,$$
from which it follows that $X_{33} = \Gamma'$.
The matrix $X$ is thus seen to have the form
r-k
Ft
O
o
I
-11
O
r k
n - 2i
r-k
X=
k
-
ot
n-2r
r-k
o
O
X44
X45
X54
X55
F/
'
The matrix
X5(4
X5s)
Set
is unitary.
U2= diag(Ik, U3)U2
and U= diag(U1, U2).Then
UHWV= diag(Ir+k,
k
r-k
n-2r
/F' 0 -'
0 I 0
0
O
k
r-k
X=
k
r-k
n-2r
U3)X,
r-k
IX
F'
0
0
which, considering the dimensions of the matrices involved, is precisely of the form (A.1). □
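To establish part one of Theorem 2.3, let $\mathcal{X} = \mathcal{R}(A)$ and $\mathcal{Y} = \mathcal{R}(B)$ and let $\mathcal{X}^\perp$ and $\mathcal{Y}^\perp$ denote their orthogonal complements. Assume that
$$r = \dim(\mathcal{X}) = \dim(\mathcal{Y}) \le m/2$$
(there is no loss of generality in assuming the last inequality, since in the sequel we can also work with $\mathcal{X}^\perp$ and $\mathcal{Y}^\perp$). Let $X = (X_1, X_2)$ and $Y = (Y_1, Y_2)$ be unitary matrices with $\mathcal{R}(X_1) = \mathcal{X}$ and $\mathcal{R}(Y_1) = \mathcal{Y}$. Let
$$W = X^H Y = \begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix}$$
be partitioned conformally with $X$ and $Y$. If $U = \operatorname{diag}(U_1, U_2)$ and $V = \operatorname{diag}(V_1, V_2)$ are the matrices whose existence is insured by Theorem A.1 and we set
$$\tilde X_i = X_i U_i \quad (i = 1, 2), \qquad \tilde X = (\tilde X_1, \tilde X_2)$$
and
$$\tilde Y_i = Y_i V_i \quad (i = 1, 2), \qquad \tilde Y = (\tilde Y_1, \tilde Y_2),$$
then
(A.4)  $$\tilde X_1^H \tilde Y_1 = \Gamma, \qquad \tilde X_1^H \tilde Y_2 = (-\Sigma, 0), \qquad \tilde X_2^H \tilde Y_1 = \begin{pmatrix} \Sigma \\ 0 \end{pmatrix}, \qquad \tilde X_2^H \tilde Y_2 = \begin{pmatrix} \Gamma & 0 \\ 0 & I_{n-2r} \end{pmatrix}.$$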
Likewise
PBPA = (yl Yl')(X2X2') =
and the nonzero singular values of both matrices are easily seen to be the numbers $\sigma_i$. Now consider
r2-I
PB
r:s
Fr
(F
=
PA=Y1Y1X1X1
h)
_z2
rs
5o
The nonzero eigenvalues of this matrix are the eigenvalues of the $2 \times 2$ matrices
(o'
(ri'Yi
im)
O'i
Notes and references. The matrix decomposition in Theorem A.1 has apparently not been explicitly stated before; however, it is implicit in the works of Davis and Kahan (1970) and Björck and Golub (1973). The diagonal elements of $\Gamma$ are the cosines of the "canonical angles" between the subspaces $\mathcal{R}(A)$ and $\mathcal{R}(B)$, and the columns of $\tilde X_1$ and $\tilde Y_1$ form biorthogonal bases subtending these angles. The use of these canonical bases, particularly when they have been transformed into the forms (A.4), often enables one to obtain routine computational proofs of geometric theorems that would otherwise require considerable ingenuity to establish.
REFERENCES
N. N. ABDELMALEK (1974), On the solution of the linear least squares problem and pseudo-inverses, Computing, 13, pp. 215-228.
S. N. AFRIAT (1957), Orthogonal and oblique projectors and the characteristics of pairs of vector spaces, Proc. Cambridge Philos. Soc., 53, pp. 800-816.
A. BEN-ISRAEL (1966), On error bounds for generalized inverses, SIAM J. Numer. Anal., 3, pp. 585-592.
A. BEN-ISRAEL AND T. N. E. GREVILLE (1974), Generalized Inverses: Theory and Applications, John Wiley, New York.
Å. BJÖRCK AND G. H. GOLUB (1973), Numerical methods for computing angles between linear subspaces, Math. Comp., 27, pp. 579-594.
T. L. BOULLION AND P. L. ODELL (1971), Generalized Inverse Matrices, John Wiley, New York.
CHANDLER DAVIS AND W. M. KAHAN (1970), The rotation of eigenvectors by a perturbation. III, SIAM J. Numer. Anal., 7, pp. 1-46.
I. C. GOHBERG AND M. G. KREIN (1969), Introduction to the Theory of Nonself-adjoint Operators, American Mathematical Society, Providence, R.I.
G. H. GOLUB (1965), Numerical methods for solving linear least squares problems, Numer. Math., 7, pp. 206-216.
G. H. GOLUB AND J. H. WILKINSON (1966), Note on the iterative refinement of least squares solution, Numer. Math., 9, pp. 139-148.
G. H. GOLUB AND V. PEREYRA (1973), The differentiation of pseudoinverses and nonlinear least squares problems whose variables separate, SIAM J. Numer. Anal., 10, pp. 413-432.
G. H. GOLUB AND V. PEREYRA (1975), Differentiation of pseudoinverses, separable nonlinear least squares problems, and other tales, manuscript.
R. J. HANSON AND C. L. LAWSON (1969), Extensions and applications of the Householder algorithm for solving linear least squares problems, Math. Comp., 23, pp. 787-812.
J. Z. HEARON AND J. W. EVANS (1968), Differentiable generalized inverses, J. Res. Nat. Bur. Stand., Sect. B, 72B, pp. 109-113.