CU PG-I
Anirban Kundu
Contents

1 Introduction: Two-dimensional Vectors
2 Linear Vector Spaces
  2.1 Dual Space
  2.2 Cauchy-Schwarz Inequality
  2.3 Metric Space
  2.4 Linear Independence and Basis
3 Linear Operators
  3.1 Inverse and Adjoint Operators
  3.2 Projection Operators
  3.3 Eigenvalues and Eigenvectors
4 Matrices
  4.1 Orthogonal and Unitary Matrices
  4.2 Representation
  4.3 Eigenvalues of a Matrix
  4.4 Degenerate Eigenvalues
  4.5 Functions of a Matrix
1 Introduction: Two-dimensional Vectors

Let us recapitulate what we learnt about vectors (for simplicity, consider 2-dimensional vectors in
the cartesian coordinates). Any vector A can be written as

    A = a₁ i + a₂ j ,                                                      (1)

where i and j are unit vectors along the x- and y-axes respectively, and a₁ and a₂ are real
numbers. From now on, we will use the shorthand A = (a₁, a₂) for eq. (1). This is called an
ordered pair, because (a₁, a₂) ≠ (a₂, a₁). A set of n such numbers where ordering is important
is known as an n-tuple.
Two two-dimensional vectors A = (a1 , a2 ) and B = (b1 , b2 ) can be added to give another
two-dimensional vector C = (c1 , c2 ) with c1(2) = a1(2) + b1(2) .
We can multiply any vector A = (a1 , a2 ) by a real number d to get the vector D = dA.
The individual components are multiplied by d, so the magnitude of the vector increases by
a factor of d.
The null vector 0 = (0, 0) always satisfies A + 0 = 0 + A = A. Also, for every vector
A = (a₁, a₂) there is a vector −A = (−a₁, −a₂) so that A + (−A) = 0.
The scalar product of two vectors A and B is defined as

    A·B = Σᵢ₌₁² aᵢbᵢ .                                                     (2)

We can also write this simply as aᵢbᵢ, with the convention that every repeated index is summed
over. This is known as the Einstein convention.
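The scalar product of eq. (2) can be checked with a small sketch; the helper name `dot` is ours, chosen just for illustration:

```python
# Scalar product of eq. (2): A.B = sum over the repeated index i.
def dot(a, b):
    """Sum a_i b_i over the repeated index (the Einstein convention, written out)."""
    return sum(ai * bi for ai, bi in zip(a, b))

A = (3.0, 4.0)
B = (1.0, 2.0)
print(dot(A, B))   # a1*b1 + a2*b2 = 3 + 8 = 11
print(dot(A, A))   # |A|^2 = 25
```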
2 Linear Vector Spaces

Let us consider an assembly of some abstract objects, which we will denote as | ⟩. If we want to
label them, we might call them |1⟩ or |j⟩. (You will see this object a lot in quantum mechanics; this
is called a ket. In fact, we will develop the idea of vector spaces keeping its application in quantum
mechanics in mind.) Let this assembly be called S. We say that the kets live in the space S. This
will be called a linear vector space (LVS) if the kets satisfy the following properties: the sum of any
two kets is also a member of S; multiplication by a scalar α gives another member α|a⟩ ∈ S; there
is a null ket |0⟩ with |a⟩ + |0⟩ = |a⟩; and the operations are associative and distributive, with
1|a⟩ = |a⟩ and (α + β)|a⟩ = α|a⟩ + β|a⟩.

Thus, 0|a⟩ = |0⟩ ∀ |a⟩ ∈ S, and we can safely write 0 for |0⟩.
This also defines subtraction of vectors,

    |i⟩ − |j⟩ = |i⟩ + (−1)|j⟩ = |i⟩ + |j′⟩ .                               (6)
2.1 Dual Space

The scalar product of two vectors |a⟩ and |b⟩ in S is a number, denoted by ⟨a|b⟩ (the symbol ⟨ | is
called a bra, so that ⟨ | ⟩ gives a closed bracket. The notation is due to Dirac.) The properties of
the scalar product are as follows.

1. ⟨a|b⟩ = ⟨b|a⟩*. Thus, in general, ⟨a|b⟩ ≠ ⟨b|a⟩, but ⟨a|a⟩ is real. Also, ⟨a|a⟩ ≥ 0, where the
equality sign holds only if |a⟩ = 0. This defines √⟨a|a⟩ as the magnitude of the vector |a⟩.

2. If |d⟩ = α|a⟩ + β|b⟩, then ⟨c|d⟩ = α⟨c|a⟩ + β⟨c|b⟩ is a linear function of α and β. However,
⟨d|c⟩ = α*⟨a|c⟩ + β*⟨b|c⟩ is a linear function of α* and β*, and not of α and β.

If somehow ⟨a|b⟩ = ⟨b|a⟩, the LVS is called a real vector space. Otherwise, it is complex. The
LVS of two-dimensional vectors is a real vector space, as A·B = B·A. That of the complex numbers
(example 2 above) is a complex vector space.
In quantum mechanics, the space in which the wavefunctions live is also an LVS. This is known
as the Hilbert space¹, after the celebrated German mathematician David Hilbert. We can indeed
check that the Hilbert space is an LVS; in particular, that is why the superposition principle in
quantum mechanics holds. The wavefunctions are, however, complex quantities, and the scalar
product is defined as

    ⟨ψ₁|ψ₂⟩ = ∫ ψ₁* ψ₂ d³x .                                               (7)
    δᵢⱼ = 1 if i = j ,   δᵢⱼ = 0 if i ≠ j .                                (8)
2.2 Cauchy-Schwarz Inequality

    ⟨a|a⟩⟨b|b⟩ ≥ ⟨b|a⟩⟨a|b⟩ .                                              (13)

Eq. (13) is known as the Cauchy-Schwarz inequality. For ordinary vectors, this just means

    |A|²|B|² ≥ |A·B|²  ⟹  |cos θ| ≤ 1 .                                    (14)

If |3⟩ = |1⟩ + |2⟩, then

    ⟨3|3⟩ = ⟨1|1⟩ + ⟨2|2⟩ + 2 Re⟨1|2⟩
          ≤ ⟨1|1⟩ + ⟨2|2⟩ + 2√(⟨1|1⟩⟨2|2⟩)   (using CS inequality) ,       (15)

so that √⟨3|3⟩ ≤ √⟨1|1⟩ + √⟨2|2⟩.

2.3 Metric Space
A set R is called a metric space if a real, non-negative number ρ(a, b) is associated with any pair of
its elements a, b ∈ R (remember that a and b need not be numbers) and (1) ρ(a, b) = ρ(b, a); (2)
ρ(a, b) = 0 only when a = b; (3) ρ(a, b) + ρ(b, c) ≥ ρ(a, c). The number ρ(a, b) may be called the
distance between a and b. The third condition is nothing but the triangle inequality.

Do not confuse ρ(a, b) with ⟨a|b⟩. In particular, ρ(a, a) = 0 (where a is some point in the LVS),
but √⟨a|a⟩ (where |a⟩ is a vector) defines the length or norm of that vector. For example, ⟨ψ|ψ⟩ = 1
means that the wavefunction has been normalized to unity. More precisely, if one thinks of |a⟩ as the
radius vector starting at the origin and ending at the point a, and similarly for b, then ρ(a, b) is
the norm of the vector |a⟩ − |b⟩ (or the other way round).

If we have three vectors |a⟩, |b⟩, and |c⟩ in an LVS and we define

    |1⟩ = |a⟩ − |b⟩ ,   |2⟩ = |b⟩ − |c⟩ ,   |3⟩ = |a⟩ − |c⟩ ,              (16)

then |1⟩, |2⟩, |3⟩ satisfy the triangle inequality, and also the first two conditions of a metric space,
so we can say:

If the scalar product is defined in an LVS, it is a metric space.

Note that the scalar product need not be defined for all linear vector spaces, but we will not
discuss those spaces.
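The three metric axioms above can be checked numerically for the distance induced by the scalar product; this is a sketch, with `rho` as our name for the induced distance ρ(a, b):

```python
import math

def norm(v):
    """Length of a vector: the square root of the scalar product with itself."""
    return math.sqrt(sum(x * x for x in v))

def rho(a, b):
    """Distance induced by the scalar product: the norm of |a> - |b>."""
    return norm([x - y for x, y in zip(a, b)])

a, b, c = (0.0, 0.0), (3.0, 4.0), (6.0, 0.0)
assert rho(a, b) == rho(b, a)               # (1) symmetry
assert rho(a, a) == 0.0                     # (2) rho vanishes only for identical points
assert rho(a, b) + rho(b, c) >= rho(a, c)   # (3) triangle inequality
```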
3. Similarly, in the two-dimensional plane polar coordinate system, the separation between two
points (r, θ) and (r + dr, θ + dθ) is

    ds² = dr² + r²dθ² .                                                    (18)
4. The separation between two space-time points in special relativity is

    ds² = c²dt² − dx² − dy² − dz² .                                        (20)

Note the minus sign. This is a special property of the space-time coordinates, called the
Minkowski coordinates. Special relativity tells us that this interval is invariant no matter
which inertial frame you are in. One can trace the extra minus sign in front of the spatial
coordinates to the fact that there are two distinct vector spaces for 4-dimensional vectors, one
dual to the other, unlike the self-dual nature of ordinary 3-dimensional space. When you take
the scalar product of a vector from one space with the dual of another vector from the dual space, there comes the
minus sign, because the dual vector is formed by keeping the time component of the original
vector unchanged, while flipping the sign of the spatial components².
It is nontrivial to write (20) in the Einstein convention because of the relative minus sign
between the time and space coordinates. How one deals with this is discussed in detail later.
2.4 Linear Independence and Basis

A set of n vectors |i⟩ is said to be linearly independent if

    Σᵢ₌₁ⁿ aᵢ|i⟩ = 0                                                        (21)

necessarily means all aᵢ = 0. If there are at least two aᵢ's that are nonzero, the vectors are called
linearly dependent.

The maximum number of linearly independent vectors in a space is called its dimension. If this
number is finite, the space is finite-dimensional. If the number is infinite, the space is infinite-dimensional too.
The three-dimensional space can have at most three linearly independent vectors, that is why
we call it three-dimensional. On the other hand, there are infinitely many independent states for a
particle in, say, an infinitely deep one-dimensional potential well (we take the depth to be infinite
so that all such states are bound; for a well of finite depth, there will be a finite number of bound
states and an infinite number of unbound states). When we expand any arbitrary function in a
Fourier series, there are infinitely many sine or cosine functions in the expansion, and they are
linearly independent³, so this is another infinite-dimensional LVS.

² It can be the other way around, only flipping the time component but keeping the spatial components unchanged.
The only thing that matters is the relative minus sign between the time and the space components.
If any vector |a⟩ in an LVS can be written as

    |a⟩ = Σᵢ₌₁ⁿ aᵢ|i⟩ ,                                                    (22)

the set of |i⟩ vectors forms a basis of the LVS. The number of basis vectors is obviously the dimension
of the space. We say that the basis vectors |i⟩ span the space. The numbers aᵢ are called components
of the vector |a⟩ in the |i⟩ basis (components depend on the choice of basis).

Given a basis, the components are unique. Suppose the vector |a⟩ can be written both as
Σ aᵢ|i⟩ and Σ bᵢ|i⟩. Subtracting one from the other, Σ (aᵢ − bᵢ)|i⟩ = 0, so by the condition of linear
independence of the basis vectors, aᵢ = bᵢ for all i.
Starting from any basis |aᵢ⟩, where i can be finite or infinite (but these basis vectors need not
be either orthogonal or normalized), one can always construct another, orthonormal, basis |i⟩. This
is known as Gram-Schmidt orthogonalization. The procedure is as follows.
1. Normalize the first vector of the original basis:

    |1⟩ = |a₁⟩ / √⟨a₁|a₁⟩ .                                                (23)

Thus, ⟨1|1⟩ = 1.

2. Construct |2′⟩ by taking |a₂⟩ and projecting out the part proportional to |1⟩:

    |2′⟩ = |a₂⟩ − ⟨1|a₂⟩|1⟩ ,                                              (24)

which ensures ⟨1|2′⟩ = 0. Divide |2′⟩ by its norm √⟨2′|2′⟩ to get the normalized vector |2⟩.

3. In general, construct

    |m′⟩ = |aₘ⟩ − Σᵢ₌₁^(m−1) ⟨i|aₘ⟩|i⟩ .                                   (25)

It is easy to check that |m′⟩ is orthogonal to |i⟩, i = 1 to m − 1. Normalize |m′⟩ to unit norm,

    |m⟩ = |m′⟩ / √⟨m′|m′⟩ .                                                (26)
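The steps (23)-(26) translate directly into a short routine; the function name `gram_schmidt` and the sample vectors are our own choices for the sketch:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors following eqs. (23)-(26)."""
    basis = []
    for v in vectors:
        w = v.astype(complex)
        for e in basis:
            w = w - np.vdot(e, w) * e          # subtract <i|a_m>|i> for each earlier |i>
        basis.append(w / np.sqrt(np.vdot(w, w).real))   # normalize |m'> to unit norm
    return basis

vecs = [np.array([1.0, 1.0, 0.0]),
        np.array([1.0, 0.0, 1.0]),
        np.array([0.0, 1.0, 1.0])]
basis = gram_schmidt(vecs)
for i, e in enumerate(basis):
    for j, f in enumerate(basis):
        # <i|j> = delta_ij for the new basis
        assert abs(np.vdot(e, f) - (1.0 if i == j else 0.0)) < 1e-12
```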
3 Linear Operators
A function f (x) associates a number y with another number x according to a certain rule. For
example, f (x) = x2 associates, with every number x, its square. The space for x and y need not
be identical. For example, if x is any real number, positive, negative, or zero, f (x) is confined only
to the non-negative part of the real number space.
Similarly, we can assign to every vector |x⟩ of an LVS another vector |y⟩, either of the same
LVS or of a different one, according to a certain rule. We simply write this as

    |y⟩ = O|x⟩ ,                                                           (27)

and O is called an operator which, acting on |x⟩, gives |y⟩. We often put a hat on O, like Ô, to
indicate that this is an operator. Unless there is a possibility of confusion, we will not use the hat.
We will be interested in linear operators, satisfying

    O[α|a⟩ + β|b⟩] = αO|a⟩ + βO|b⟩ .                                       (28)
A function f(x) may not be defined for all x; f(x) = √x is not defined for x < 0 if both x and f(x)
are confined to be real. Similarly, O|x⟩ may not be defined for all |x⟩. The set of vectors |x⟩ ∈ S
for which O|x⟩ is defined is called the domain of the operator O.
O|x⟩ may take us outside S. The totality of all such O|x⟩, where |x⟩ is any vector in S and in
the domain of O, is called the range of the operator O. In quantum mechanics, we often encounter
situations where the range is S itself, or a part of it. We'll see examples of both.
The identity operator 1 takes a vector to itself without any multiplicative factor: 1|x⟩ = |x⟩ ∀ |x⟩ ∈ S.

The null operator 0 annihilates all vectors in S: 0|x⟩ = |0⟩ = 0 ∀ |x⟩ ∈ S.

If A and B are two linear operators acting on S, A = B means A|x⟩ = B|x⟩ ∀ |x⟩ ∈ S.

C = A + B means C|x⟩ = A|x⟩ + B|x⟩ ∀ |x⟩ ∈ S.

D = AB means D|x⟩ = A[B|x⟩] ∀ |x⟩ ∈ S. Note that AB is not necessarily the same as BA.
A good example is the angular momentum operators in quantum mechanics: JₓJᵧ ≠ JᵧJₓ.
If AB = BA, the commutator [A, B] = AB − BA is zero, and we say that the operators
commute.

The identity operator obviously commutes with any other operator A, as A1|x⟩ = A|x⟩ and
1[A|x⟩] = A|x⟩.

One can multiply an operator with a number: if A|x⟩ = |y⟩, then (λA)|x⟩ = λ|y⟩. Obviously,
λA = Aλ.
One can formally write higher powers of the operators. For example, A²|x⟩ = A[A|x⟩].
Similarly,

    e^A ≡ 1 + A + A²/2! + A³/3! + ⋯                                        (30)
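The exponential series (30) can be evaluated for a matrix by truncating the sum; this is a sketch (the function `exp_series` and the truncation length are our choices, not part of the notes):

```python
import numpy as np

def exp_series(A, terms=30):
    """e^A = 1 + A + A^2/2! + A^3/3! + ... (eq. 30), truncated after `terms` terms."""
    result = np.eye(A.shape[0])
    power = np.eye(A.shape[0])
    for n in range(1, terms):
        power = power @ A / n          # builds A^n / n! incrementally
        result = result + power
    return result

# For a nilpotent A (A^2 = 0) the series terminates exactly: e^A = 1 + A.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
assert np.allclose(exp_series(A), np.eye(2) + A)

# e^B e^(-B) = 1, one of the exercises below.
B = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
assert np.allclose(exp_series(B) @ exp_series(-B), np.eye(2))
```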
The operator A can also act on the dual space S_D. If A|a⟩ = |c⟩, one may write ⟨b|A|a⟩ = ⟨b|c⟩.
The vector ⟨d| = ⟨b|A is defined in such a way that ⟨d|a⟩ = ⟨b|c⟩.
This is quite a common practice in quantum mechanics, e.g.,

    ⟨ψ₁|H|ψ₂⟩ = ∫ ψ₁* H ψ₂ d³x .                                           (31)

Note that ⟨b|A is not the dual of A|b⟩. To see this, consider the operator λ1. Acting on |x⟩,
this gives λ|x⟩. The dual of this is λ*⟨x|, which can be obtained by operating λ*1, and not
λ1, on ⟨x|.
Q. If A and B are linear operators, show that A + B and AB are also linear operators.
Q. If A + B = 1 and AB = 0, what is the value of A² + B²?
Q. Show that e^A e^(−A) = 1.
Q. If [A, B] = 0, show that e^A e^B = e^(A+B).
Q. If [A, B] = B, show that e^A B e^(−A) = eB.
Q. If O|x⟩ = −|x⟩, check whether O is linear. Do the same if O|x⟩ = [|x⟩]*.
3.1 Inverse and Adjoint Operators

    A⁻¹A = AA⁻¹ = 1 ,                                                      (32)

where we have used the fact that any operator O multiplying 1 gives O.

The inverse in this case is also unique. To prove this, suppose we have two different inverses
A₁⁻¹ and A₂⁻¹ (whether left or right does not matter any more). Now

    A₁⁻¹ = A₁⁻¹ 1 = A₁⁻¹ (A A₂⁻¹) = (A₁⁻¹ A) A₂⁻¹ = 1 A₂⁻¹ = A₂⁻¹ .       (33)

The inverse of a product is

    (AB)⁻¹ = B⁻¹A⁻¹ ,                                                      (34)

as (AB)⁻¹(AB) = B⁻¹(A⁻¹A)B = 1. Similarly, (AB)(AB)⁻¹ = 1.
Suppose the scalar product is defined in S. If there is an operator B corresponding to an
operator A such that

    ⟨a|A|b⟩ = ⟨b|B|a⟩* ∀ |a⟩, |b⟩ ∈ S ,                                    (35)

then B is called the adjoint operator of A and denoted by A†. Thus, it follows that ⟨b|A† is the
dual vector of A|b⟩.

Now, ⟨a|(A†)†|b⟩ = ⟨b|A†|a⟩* = ⟨a|A|b⟩, so (A†)† = A. Also,

    ⟨a|A†B†|b⟩ = [⟨a|A†][B†|b⟩] = {[⟨b|B][A|a⟩]}* = ⟨b|BA|a⟩* = ⟨a|(BA)†|b⟩ ,   (36)

where we have used the duality property of the vectors. Thus, for any two operators A and B,

    A†B† = (BA)† .                                                         (37)
But the first integral is zero, as both wavefunctions must vanish at the boundary of the integration region.
Now complete the proof by showing that −i d/dx is hermitian. This shows that momentum is indeed a
hermitian operator in quantum mechanics.
Another important class of operators is where U† = U⁻¹. They are called unitary operators.
One can write

    ||U|a⟩||² = [⟨a|U†][U|a⟩] = ⟨a|U†U|a⟩ = ⟨a|U⁻¹U|a⟩ = ⟨a|a⟩ = |||a⟩||² ,   (39)

which means that operation by unitary operators keeps the length or norm of any vector unchanged.
The nomenclature is quite similar to that used for matrices. We will show later how one can
represent⁴ the action of an operator on a vector by conventional matrix multiplication.

Note that the combination |a⟩⟨b| acts as a linear operator. Operating on a ket, this gives a ket;
operating on a bra, this gives a bra:

    (|a⟩⟨b|)|c⟩ = ⟨b|c⟩|a⟩ ,   ⟨d|(|a⟩⟨b|) = ⟨d|a⟩⟨b| .                    (40)

Also,

    ⟨x|(|a⟩⟨b|)|y⟩ = (⟨x|a⟩)(⟨b|y⟩) = [⟨y|b⟩⟨a|x⟩]* = [⟨y|(|b⟩⟨a|)|x⟩]* ,   (41)

so the adjoint of |a⟩⟨b| is |b⟩⟨a|.
3.2 Projection Operators

Consider the LVS S of two-dimensional vectors, schematically written as |x⟩. Let |i⟩ and |j⟩ be the
two unit vectors along the x- and y-axes. The operator Pᵢ = |i⟩⟨i|, acting on any vector |x⟩, gives
⟨i|x⟩|i⟩, a vector along the x-direction with a magnitude ⟨i|x⟩.

Obviously, the set Pᵢ|x⟩ is a one-dimensional LVS. It contains all those vectors of S that lie
along the x-direction, contains the null element, and also the unit vector |i⟩, which can be obtained
by Pᵢ|i⟩. Such a space S′, all of whose members are members of S but not the other way round, is
called a nontrivial subspace of S. The null vector, and the whole set S itself, are trivial subspaces.

The operator Pᵢ is an example of the class known as projection operators. We will denote them
by P. These operators project out a subspace of S. Once a part is projected out, another projection
cannot do anything more, so P² = P. A projection operator must also be hermitian, since it is
necessary that it projects out the same part of the original space S and the dual space S_D. Any
operator that is hermitian and satisfies P² = P is called a projection operator.
Suppose P₁ and P₂ are two projection operators. They project out different parts of the original
LVS. Is P₁ + P₂ a projection operator too? If P₁† = P₁ and P₂† = P₂, then (P₁ + P₂)† = P₁ + P₂. However,

    (P₁ + P₂)² = P₁² + P₂² + P₁P₂ + P₂P₁ = (P₁ + P₂) + P₁P₂ + P₂P₁ ,       (42)

so P₁P₂ + P₂P₁ must be zero. Multiplying from the left by P₁ and using P₁² = P₁ gives P₁P₂ +
P₁P₂P₁ = 0. Similarly, multiply by P₁ from the right, and subtract one from the other, to get

    P₁P₂ − P₂P₁ = 0 ,                                                      (43)

so that the only solution is P₁P₂ = P₂P₁ = 0. Projection operators like this are called orthogonal
projection operators. As an important example, for any P, 1 − P is an orthogonal projection
operator. They sum up to 1, which projects the entire space onto itself.

In short, if several projection operators P₁, P₂, ⋯, Pₙ satisfy

    PᵢPⱼ = Pᵢ for i = j ,   PᵢPⱼ = 0 otherwise ,                           (44)

then Σᵢ Pᵢ is also a projection operator.
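The defining properties above (P² = P, hermiticity, orthogonality, and completeness) can be verified for the two-dimensional example Pᵢ = |i⟩⟨i|; a numerical sketch:

```python
import numpy as np

ket_i = np.array([[1.0], [0.0]])   # |i>, unit vector along x
ket_j = np.array([[0.0], [1.0]])   # |j>, unit vector along y
P_i = ket_i @ ket_i.T              # |i><i| as an outer product
P_j = ket_j @ ket_j.T              # |j><j|

assert np.allclose(P_i @ P_i, P_i)           # P^2 = P: projecting twice adds nothing
assert np.allclose(P_i, P_i.T)               # hermitian (real symmetric here)
assert np.allclose(P_i @ P_j, 0)             # orthogonal projection operators
assert np.allclose(P_i + P_j, np.eye(2))     # the complete set sums up to 1
```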
3.3 Eigenvalues and Eigenvectors

If the effect of an operator A on a vector |a⟩ is to yield the same vector multiplied by some constant,

    A|a⟩ = a|a⟩ ,                                                          (45)

we call it an eigenvalue equation, the vector |a⟩ an eigenvector of A, and a the corresponding eigenvalue of A.

If there is even one vector |x⟩ which is a simultaneous eigenvector of both A and B, with
eigenvalues a and b respectively, then A and B commute when acting on |x⟩. This is easy to show, as

    (AB)|x⟩ = A(b|x⟩) = bA|x⟩ = ab|x⟩ ,   (BA)|x⟩ = B(a|x⟩) = aB|x⟩ = ab|x⟩ .   (46)

If two eigenvectors |a₁⟩ and |a₂⟩ of an operator are degenerate, i.e., they correspond to
the same eigenvalue a, any linear combination c|a₁⟩ + d|a₂⟩ is also an eigenvector, with the same
eigenvalue (prove it). This is not true for non-degenerate eigenvectors.
Suppose both A and B have non-degenerate eigenvectors, and [A, B] = 0. Also suppose |x⟩ is
an eigenvector (often called an eigenket) of A with eigenvalue a. We can write

    [A, B]|x⟩ = 0|x⟩ = 0  ⟹  AB|x⟩ = BA|x⟩  ⟹  A(B|x⟩) = a(B|x⟩) ,        (47)

or B|x⟩ is also an eigenvector of A with the same eigenvalue a. But A has non-degenerate eigenvalues, so this can only happen if B|x⟩ is just some multiplicative constant times |x⟩, or B|x⟩ = b|x⟩.
Thus, commuting operators must have simultaneous eigenvectors if they are non-degenerate.

One can have a counterexample from the angular momentum algebra of quantum mechanics.
The vectors are labelled by the angular momentum j and its projection m on some axis, usually
taken to be the z-axis. These vectors, |jm⟩, are eigenvectors of the operator J² = Jₓ² + Jᵧ² + J_z².⁵
They are also eigenvectors of J_z but not of Jₓ or Jᵧ. So here is a situation where J² and Jₓ
commute but they do not have simultaneous eigenvectors. The reason is that all these |jm⟩ states
are degenerate with respect to J², with an eigenvalue of j(j + 1)ℏ².
The eigenvalues of hermitian operators are necessarily real. Suppose A is hermitian, A = A†,
and A|a⟩ = a|a⟩. Then

    ⟨a|A|a⟩ = a⟨a|a⟩ ,   ⟨a|A|a⟩ = ⟨a|A†|a⟩ = ⟨a|A|a⟩* = a*⟨a|a⟩ ,         (48)

as the scalar product ⟨a|a⟩ is real. So a = a*, or hermitian operators have real eigenvalues.

If an hermitian operator has two different eigenvalues corresponding to two different eigenvectors, these eigenvectors must be orthogonal to each other, i.e., their scalar product must be zero.
Suppose for an hermitian operator A, A|a⟩ = a|a⟩ and A|b⟩ = b|b⟩. So,

    ⟨b|A|a⟩ = a⟨b|a⟩ ,   ⟨a|A|b⟩ = ⟨b|A†|a⟩* = ⟨b|A|a⟩* = b⟨a|b⟩  ⟹  ⟨b|A|a⟩ = b⟨b|a⟩ ,   (49)

using the fact that b is real and ⟨a|b⟩ = ⟨b|a⟩*. Subtracting one from the other, and noting that
a ≠ b, we get ⟨a|b⟩ = 0, or they are orthogonal to each other.
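Both properties, real eigenvalues and orthogonal eigenvectors, can be checked on a concrete hermitian matrix; the particular matrix below is our own choice for the sketch:

```python
import numpy as np

# An arbitrary 2x2 hermitian matrix, A = A-dagger.
A = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])
assert np.allclose(A, A.conj().T)

evals, evecs = np.linalg.eigh(A)
# Eigenvalues of a hermitian operator are real.
assert np.allclose(evals.imag, 0)
# Eigenvectors belonging to the two different eigenvalues are orthogonal.
assert abs(np.vdot(evecs[:, 0], evecs[:, 1])) < 1e-12
```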
4 Matrices

An m × n matrix A has m rows and n columns, and the ij-th element Aᵢⱼ lives in the i-th row and
the j-th column. Thus, 1 ≤ i ≤ m and 1 ≤ j ≤ n. If m = n, A is called a square matrix.
The sum of two matrices A and B is defined only if they are of the same dimensionality, i.e.,
both have an equal number of rows and an equal number of columns. In that case, C = A + B means
Cᵢⱼ = Aᵢⱼ + Bᵢⱼ for every pair (i, j).
The inner product C = AB is defined if and only if the number of columns of A is equal to the
number of rows of B. In this case, we write

    C = AB  ⟹  Cᵢⱼ = Σₖ₌₁ⁿ Aᵢₖ Bₖⱼ ;                                       (50)

one can also drop the explicit summation sign using the Einstein convention for repeated indices.

⁵ Although we have used the cartesian symbols x, y, z, the angular momentum operators can act on a completely
different space.
If A is an m × n matrix and B is an n × p matrix, C will be an m × p matrix. Only if m = p
are both AB and BA defined. They are of the same dimensionality if m = n = p, i.e., both A and B
are square matrices. Even if the product is defined both ways, the two need not commute; AB is not
necessarily equal to BA, and in this respect matrices differ from ordinary numbers, whose products
always commute.
The direct, outer, or Kronecker product of two matrices is defined as follows. If A is an m × m
matrix and B is an n × n matrix, then the direct product C = A ⊗ B is an mn × mn matrix with
elements C_pq = Aᵢⱼ Bₖₗ, where p = n(i − 1) + k and q = n(j − 1) + l. For example, if A and B are
both 2 × 2 matrices,

    A ⊗ B = ( a₁₁B  a₁₂B )
            ( a₂₁B  a₂₂B )

          = ( a₁₁b₁₁  a₁₁b₁₂  a₁₂b₁₁  a₁₂b₁₂ )
            ( a₁₁b₂₁  a₁₁b₂₂  a₁₂b₂₁  a₁₂b₂₂ )
            ( a₂₁b₁₁  a₂₁b₁₂  a₂₂b₁₁  a₂₂b₁₂ )
            ( a₂₁b₂₁  a₂₁b₂₂  a₂₂b₂₁  a₂₂b₂₂ ) .                           (51)
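The index rule for the direct product can be checked against `numpy.kron`, which builds exactly this block structure; a sketch:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 5], [6, 7]])
C = np.kron(A, B)                      # the mn x mn direct product A (x) B

assert C.shape == (4, 4)
# C_pq = A_ij B_kl with p = n(i-1)+k, q = n(j-1)+l (1-based indices, n = 2).
i, j, k, l = 2, 1, 1, 2                # pick one element to check
p, q = 2 * (i - 1) + k, 2 * (j - 1) + l
assert C[p - 1, q - 1] == A[i - 1, j - 1] * B[k - 1, l - 1]
```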
A row matrix R of dimensionality 1 × m has only one row and m columns. A column matrix C
of dimensionality m × 1 similarly has only one column but m rows. Here, both RC and
CR are defined; the first is a number (or a 1 × 1 matrix), and the second an m × m square matrix.

The unit matrix of dimension n is an n × n square matrix whose diagonal entries are 1 and all
other entries are zero: 1ᵢⱼ = δᵢⱼ. The unit matrix commutes with any other matrix: A1 = 1A = A,
assuming that the product is defined both ways (so that A is also a square matrix of the same dimension).
From now on, unless mentioned explicitly, all matrices will be taken to be square ones.
If two matrices P and Q satisfy PQ = QP = 1, P and Q are called inverses of each other, and
we denote Q by P⁻¹. It is easy to show that the left and right inverses are identical; the proof is along
the same lines as the proof for linear operators.
The necessary and sufficient condition for the inverse of a matrix A to exist is a nonzero determinant: det A ≠ 0. The matrices with zero determinant are called singular matrices and do not
have an inverse. Note that for a square array

    ( a₁  a₂  a₃ )
    ( b₁  b₂  b₃ )
    ( c₁  c₂  c₃ )

the determinant is defined as εᵢⱼₖ... aᵢbⱼcₖ..., where εᵢⱼₖ... is an extension of the usual Levi-Civita
symbol: +1 for an even permutation of (i, j, k, ...) = (1, 2, 3, ...), −1 for an odd permutation, and 0
if any index is repeated.
If we strike out the i-th row and the j-th column of the n × n determinant, the determinant of
the reduced (n − 1) × (n − 1) matrix is called the ij-th minor of the original matrix. For example,
if we omit the first row (with the aᵢ) and each of the columns in turn, we get the M₁ⱼ minors. The
determinant Dₙ for this n × n matrix can also be written as

    Dₙ = Σⱼ₌₁ⁿ (−1)^(1+j) aⱼ M₁ⱼ .                                         (52)

If the i-th row is omitted instead, the first factor would have been (−1)^(i+j).

As A⁻¹A = 1, (det A⁻¹)(det A) = 1, as unit matrices of any dimension always have unit
determinant.
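The expansion in minors of eq. (52) translates directly into a recursive routine; `det` is our own illustrative name, and the code uses 0-based indices so the sign factor becomes (−1)^j:

```python
def det(M):
    """Determinant by expanding in minors along the first row, as in eq. (52)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Strike out row 1 and column j+1 to form the minor M_1j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)   # (-1)^(1+j) a_j M_1j, 0-based
    return total

assert det([[2, 7], [6, 21]]) == 0          # rows proportional: singular matrix
assert det([[1, 2, 3], [0, 1, 4], [5, 6, 0]]) == 1
```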
A similarity transformation takes a square matrix A to

    A′ = RAR⁻¹ .                                                           (53)

The determinant remains invariant under such a transformation:

    det A′ = det(RAR⁻¹) = det R det A det R⁻¹ = det A .                    (54)
Another thing that remains invariant under a similarity transformation is the trace of a matrix,
which is just the algebraic sum of the diagonal elements: tr A = Σᵢ Aᵢᵢ. Even if A and B do not
commute, their traces commute: tr(AB) = tr(BA). This can be generalized: the trace
of the product of any number of matrices remains invariant under a cyclic permutation of those
matrices. The proof follows from the definitions of the trace and the product of matrices:

    tr(ABC⋯P) = Σᵢ (ABC⋯P)ᵢᵢ = Σ_{i,j,k,...,p} Aᵢⱼ Bⱼₖ Cₖₗ ⋯ P_pᵢ .        (55)

All the indices are summed over, so we can start from any point; e.g., if we start from the index k,
we get the trace as tr(C⋯PAB).
Note that this is valid only if the matrices are finite-dimensional. For infinite-dimensional matrices,
tr(AB) need not be equal to tr(BA). A good example can be given from quantum mechanics. One can write
both position and momentum operators, x and p, as infinite-dimensional matrices. The uncertainty relation,
written in the form of matrices, now reads [x, p] = iℏ1. The trace of the right-hand side is definitely nonzero;
in fact, it is infinite, because the unit matrix is infinite-dimensional. The trace of the left-hand side is also
nonzero, as tr(xp) ≠ tr(px); they are infinite-dimensional matrices too.
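For finite-dimensional matrices the cyclic property of the trace is easy to verify numerically; the random matrices below are just a convenient test case:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

# tr(AB) = tr(BA) even though AB != BA.
assert not np.allclose(A @ B, B @ A)
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
# A cyclic permutation of a longer product leaves the trace unchanged.
assert np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B))
```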
Two diagonal matrices A_d and B_d always commute, since

    (A_d B_d)ᵢₖ = Σⱼ (A_d)ᵢⱼ (B_d)ⱼₖ = (A_d)ᵢᵢ (B_d)ᵢᵢ δᵢₖ ,              (56)

as the product is nonzero only when i = j = k. We get an identical result for (B_d A_d)ᵢₖ, so
they always commute. A diagonal matrix need not commute with a nondiagonal matrix.
The complex conjugate A* of a matrix A is given by (A*)ᵢⱼ = (Aᵢⱼ)*, i.e., by simply taking
the complex conjugate of each entry. A* = A only if all the entries are real.

The transpose Aᵀ of a matrix A is given by (Aᵀ)ᵢⱼ = Aⱼᵢ, i.e., by interchanging the rows and
the columns. The transpose of an m × n matrix is an n × m matrix; the transpose of a row
matrix is a column matrix, and vice versa. We have

    (AB)ᵀᵢⱼ = (AB)ⱼᵢ = Aⱼₖ Bₖᵢ = (Bᵀ)ᵢₖ (Aᵀ)ₖⱼ = (BᵀAᵀ)ᵢⱼ ,               (58)

or (AB)ᵀ = BᵀAᵀ.

The hermitian conjugate A† of a matrix A is given by (A†)ᵢⱼ = (Aⱼᵢ)*, i.e., by interchanging the
row and the column entries and then taking the complex conjugate (the order of these
operations does not matter). If A is real, A† = Aᵀ.
The three Pauli matrices are

    σ₁ = ( 0  1 )    σ₂ = ( 0  −i )    σ₃ = ( 1   0 )
         ( 1  0 ) ,       ( i   0 ) ,       ( 0  −1 ) .                    (59)

As σ₂² = 1, one can write

    exp(iσ₂θ/2) = cos(θ/2) + iσ₂ sin(θ/2) .

Q. The Pauli matrices satisfy [σᵢ, σⱼ] = 2iεᵢⱼₖσₖ and {σᵢ, σⱼ} = 2δᵢⱼ1. Show that for any two vectors
A and B,

    (σ⃗·A)(σ⃗·B) = A·B + iσ⃗·(A×B) .                                        (60)
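The identity (60) can be verified numerically for particular vectors A and B (our choices below are arbitrary); a sketch:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
sigma = [s1, s2, s3]

A = np.array([1.0, 2.0, 3.0])
B = np.array([-1.0, 0.5, 2.0])
sA = sum(a * s for a, s in zip(A, sigma))     # sigma . A
sB = sum(b * s for b, s in zip(B, sigma))     # sigma . B

lhs = sA @ sB
rhs = np.dot(A, B) * np.eye(2) \
      + 1j * sum(c * s for c, s in zip(np.cross(A, B), sigma))
assert np.allclose(lhs, rhs)                  # eq. (60)
```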
4.1 Orthogonal and Unitary Matrices

A real matrix O satisfying OOᵀ = OᵀO = 1 is called orthogonal. For an n × n orthogonal matrix,
the diagonal entries of OOᵀ = 1 give n conditions of the form

    Σₖ O₁ₖ² = 1 ,                                                          (61)

and the off-diagonal entries give ½n(n−1) conditions of the form

    Σₖ O₁ₖ O₂ₖ = 0 .                                                       (62)

Thus, the total number of independent elements is n² − n − ½n(n−1) = ½n(n−1). Note that
OᵀO = 1 does not give any new constraints; it is just the transpose of the original equation.
Rotation in an n-dimensional space is nothing but transforming a vector by operators which can
be represented (we are yet to come to the exact definition of representation) by n × n orthogonal
matrices, with ½n(n−1) independent elements, or angles. Thus, a 2-dimensional rotation can be
parametrized by only one angle, and a 3-dimensional rotation by three, which are known as the Eulerian
angles⁶.
One can have an identical exercise for the n × n unitary matrix U. We start with 2n² real
elements, as the entries are complex numbers. The condition

    (UU†)ᵢⱼ = Uᵢₖ(U†)ₖⱼ = Uᵢₖ U*ⱼₖ = δᵢⱼ                                   (63)

gives the constraints. There are again n such equations with the right-hand side equal to 1, which
look like

    Σₖ |U₁ₖ|² = 1                                                          (64)

for i = j = 1, and so on. All entries on the left-hand side are necessarily real. There are ⁿC₂ =
½n(n−1) conditions with the right-hand side equal to zero, which look like

    Σₖ U₁ₖ U*₂ₖ = 0 .                                                      (65)

However, the entries are complex, so a single such equation is actually two equations, for the real
and the imaginary parts. Thus, the total number of independent elements is 2n² − n − n(n−1) = n².
Again, U†U = 1 does not give any new constraints; it is just the hermitian conjugate of the original
equation.
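A concrete unitary matrix, built here as U = exp(iH) with H hermitian (a standard construction, our own choice for the sketch), indeed satisfies the constraint (63) and preserves norms as in eq. (39):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
H = (M + M.conj().T) / 2               # a hermitian matrix

# U = exp(iH) via the eigen-decomposition of H.
evals, V = np.linalg.eigh(H)
U = V @ np.diag(np.exp(1j * evals)) @ V.conj().T

assert np.allclose(U @ U.conj().T, np.eye(3))    # U U-dagger = 1
v = rng.standard_normal(3)
assert np.isclose(np.linalg.norm(U @ v), np.linalg.norm(v))   # norm preserved
```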
4.2 Representation

Suppose we have an orthonormal basis |i⟩, so that any vector |a⟩ can be written as in (22). If the
space is n-dimensional, one can express these basis vectors as n-component column matrices, with
all entries equal to zero except one, which is unity. For example, in a 3-dimensional space, one can
write the orthonormal basis vectors as

    |1⟩ = ( 1 )    |2⟩ = ( 0 )    |3⟩ = ( 0 )
          ( 0 ) ,        ( 1 ) ,        ( 0 ) .                            (66)
          ( 0 )          ( 0 )          ( 1 )

Of course there is nothing sacred about the orthonormal basis, but it makes the calculation easier.
The vector |a⟩ can be expressed as

    |a⟩ = ( a₁ )
          ( a₂ ) .                                                         (67)
          ( a₃ )

Consider an operator A that takes |a⟩ to |b⟩, i.e., A|a⟩ = |b⟩. Obviously |b⟩ has the same dimensionality
as |a⟩, and can be written in a form similar to (67). The result is the same if we express the operator
A as an n × n matrix A with the following property:

    Aᵢⱼ aⱼ = bᵢ .                                                          (68)

We now call the matrix A a representation of the operator A, and the column matrices a, b
representations of the vectors |a⟩ and |b⟩ respectively.

Examples:
1. In a two-dimensional space, suppose A|1⟩ = |1⟩ and A|2⟩ = −|2⟩. Then a₁₁ = 1, a₂₂ = −1,
a₁₂ = a₂₁ = 0, so that

    A = ( 1   0 )
        ( 0  −1 ) .                                                        (69)

2. In a three-dimensional space, take A|1⟩ = |2⟩, A|2⟩ = |3⟩, A|3⟩ = |1⟩. Thus, a₂₁ = a₃₂ =
a₁₃ = 1 and the rest of the entries are zero, and

    A = ( 0  0  1 )
        ( 1  0  0 ) .                                                      (70)
        ( 0  1  0 )
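The matrix of example 2 can be checked against eq. (68): acting on the basis column vectors, it cycles them, and cycling three times brings every vector back; a sketch:

```python
import numpy as np

# Representation of the operator of example 2: A|1> = |2>, A|2> = |3>, A|3> = |1>.
A = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]])
ket1 = np.array([1, 0, 0])
ket2 = np.array([0, 1, 0])
ket3 = np.array([0, 0, 1])

assert np.array_equal(A @ ket1, ket2)
assert np.array_equal(A @ ket2, ket3)
assert np.array_equal(A @ ket3, ket1)
# Cycling three times gives back the identity operator.
assert np.array_equal(A @ A @ A, np.eye(3, dtype=int))
```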
3. Suppose the Hilbert space is 2-dimensional (i.e., the part of the original infinite-dimensional
space in which we are interested) and the operator A acts like A|ψ₁⟩ = (1/√2)[|ψ₁⟩ + |ψ₂⟩] and A|ψ₂⟩ =
4.3 Eigenvalues of a Matrix

If there is a square matrix A and a column matrix a such that Aa = λa, then a is called an
eigenvector of A, and λ is the corresponding eigenvalue. Again, this is exactly the same as what we got
for operators and vectors, eq. (45).

A square matrix can be diagonalized by a similarity transformation: A_d = RAR⁻¹. For a diagonal matrix, the eigenvectors are just the orthonormal basis vectors, with the corresponding diagonal
entries as eigenvalues. (A note of caution: this is strictly true only for non-degenerate eigenvalues,
i.e., when all diagonal entries are different. Degenerate eigenvalues pose more complications which
will be discussed later.) If the matrix A is real symmetric, it can be diagonalized by an orthogonal
transformation, i.e., R becomes an orthogonal matrix. If A is hermitian, it can be diagonalized by
a unitary transformation:

    A_d = UAU† ,                                                           (71)

where U† = U⁻¹. While the inverse does not exist if the determinant is zero, even such a matrix
can be diagonalized. However, the determinant remains invariant under a similarity transformation, so
at least one of the eigenvalues will be zero for such a singular matrix.
The trace also remains invariant under similarity transformations. Thus, it is really easy to find
the eigenvalues of a 2 × 2 matrix. Suppose the matrix is

    ( a  b )
    ( c  d ) ,

and the eigenvalues are λ₁ and λ₂. We need to solve two simultaneous equations,

    λ₁λ₂ = ad − bc ,   λ₁ + λ₂ = a + d .                                   (72)

As an example, consider the 3 × 3 matrix all of whose elements are equal to 1:

    A = ( 1  1  1 )
        ( 1  1  1 ) .                                                      (73)
        ( 1  1  1 )

The determinant is zero (all minors are zero for A), so there must be at least one zero eigenvalue.
How do we know how many eigenvalues are actually zero?
Suppose the ij-th element of an n × n matrix A is denoted by aᵢⱼ. If the system of equations

    a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ = 0 ,
    a₂₁x₁ + a₂₂x₂ + ⋯ + a₂ₙxₙ = 0 ,
    ⋯                                                                      (74)

has only the trivial solution x₁ = x₂ = ⋯ = xₙ = 0, then the equations are linearly independent and the matrix is nonsingular, i.e., det A is nonzero. In this case no eigenvalue can be zero, and the matrix is said to be
of rank n.

If one of these equations can be expressed as a linear combination of the others, then no unique
solution of (74) is possible. The matrix is singular, i.e., A⁻¹ does not exist, and one of
the eigenvalues is zero. If there are m linearly dependent rows (or columns) and n − m linearly
independent rows (or columns), the matrix is said to be of rank n − m, and there are m
zero eigenvalues.
Only one row of (73) is independent; the other two rows are identical to it, so linearly dependent,
and the rank is 1. Therefore, two of the eigenvalues are zero. The trace must be invariant, so the
eigenvalues are (0, 0, 3).
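The rank argument for the all-ones matrix of (73) can be checked directly; a sketch:

```python
import numpy as np

A = np.ones((3, 3))                    # all three rows identical
assert np.linalg.matrix_rank(A) == 1   # only one linearly independent row

evals = np.sort(np.linalg.eigvalsh(A))  # A is real symmetric
# Rank 1 means two zero eigenvalues; the invariant trace fixes the third: 0 + 0 + 3.
assert np.allclose(evals, [0.0, 0.0, 3.0])
```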
The eigenvectors are always arbitrary up to an overall sign. Consider the matrix

    A = ( 1  1 )
        ( 1  1 ) .

The secular equation is

    det(λ1 − A) = | λ−1   −1  |
                  |  −1   λ−1 |  =  0 ,                                    (75)

which boils down to λ(λ − 2) = 0, so the eigenvalues are 0 and 2 (this can be checked just by
looking at the determinant and trace, without even caring about the secular equation). For λ = 0,
the equation for the eigenvector is

    ( −1  −1 ) ( x )
    ( −1  −1 ) ( y )  =  0 ,                                               (76)

or x + y = 0. Thus, we can choose the normalized eigenvector as (1/√2, −1/√2), but we could have
taken the minus sign in the first component too. Similarly, the second eigenvector, corresponding
to λ = 2, can be chosen as (1/√2, 1/√2).
Q. For what value of x will the matrix

    ( 2  7 )
    ( 6  x )

have a zero eigenvalue? What is the other eigenvalue? Show that in this case the second row is
linearly dependent on the first row.
Q. What is the rank of the matrix whose eigenvalues are (i) 2, 1, 0; (ii) 1, 1, 2, 2; (iii) i, i, 0, 0?
Q. The 3 rows of a 3 × 3 matrix are (a, b, c); (2a, b, c); and (6a, 0, 4c). What is the rank of this
matrix?
Q. Write down the secular equation for the matrix A for which a12 = a21 = 1 and the other elements
are zero. Find the eigenvalues and eigenvectors.
4.4 Degenerate Eigenvalues

If the eigenvalues of a matrix (or an operator) are degenerate, the eigenvectors are not unique.
Consider the operator A with two eigenvectors |x⟩ and |y⟩ having the same eigenvalue a, so that

    A|x⟩ = a|x⟩ ,   A|y⟩ = a|y⟩ .                                          (77)

Any linear combination of |x⟩ and |y⟩ will have the same eigenvalue. Consider the combination
|m⟩ = α|x⟩ + β|y⟩, for which

    A[α|x⟩ + β|y⟩] = α(A|x⟩) + β(A|y⟩) = a[α|x⟩ + β|y⟩] = a|m⟩ .           (78)

Thus one can take any linearly independent combination of the basis vectors for which the eigenvalues are degenerate (technically, we say the basis vectors that span the degenerate subspace), and
those new vectors are equally good as a basis. One can, of course, find an orthonormal basis too,
using the Gram-Schmidt method. The point to remember is that if a matrix, or an operator, has
degenerate eigenvalues, the eigenvectors are not unique.
Examples:

1. The unit matrix in any dimension has all eigenvalues degenerate, equal to 1. The eigenvectors
can be chosen to be the standard orthonormal set, with one element unity and the others zero.
But any linear combination of them is also an eigenvector. Since any vector in that LVS is a linear
combination of those orthonormal basis vectors, any vector is an eigenvector of the unit matrix,
with eigenvalue 1, which is obvious: 1|a⟩ = |a⟩.

2. Consider a 2 × 2 matrix

    A = ( a  b )
        ( c  d )

with eigenvalues λ₁ and λ₂ and eigenvectors (p₁, q₁) and (p₂, q₂). The matrix A + 1 must have the
same eigenvectors, as they are also the eigenvectors of the 2 × 2 unit matrix 1. The new eigenvalues,
μ₁ and μ₂, will satisfy

    μ₁ + μ₂ = (a + 1) + (d + 1) = a + d + 2 = λ₁ + λ₂ + 2 ,
    μ₁μ₂ = (a + 1)(d + 1) − bc = (ad − bc) + a + d + 1 = λ₁λ₂ + λ₁ + λ₂ + 1 ,   (79)

so that μ₁ = λ₁ + 1 and μ₂ = λ₂ + 1.
3. Consider the matrix

    A = ( 1  0  0 )
        ( 0  0  1 ) ,                                                      (80)
        ( 0  1  0 )

for which the secular equation is (λ − 1)(λ² − 1) = 0, so that the three eigenvalues are 1, 1,
and −1. First, we find the eigenvector for the non-degenerate eigenvalue −1, which gives x = 0 and
y + z = 0. So a suitably normalized eigenvector is

    |1⟩ = (1/√2) (  0 )
                 (  1 ) .                                                  (81)
                 ( −1 )

For λ = 1, the only equation that we have is y − z = 0, and there are infinitely many possible ways
to solve this equation. We can just pick a suitable choice:

    |2⟩ = (1/√2) ( 0 )
                 ( 1 ) .                                                   (82)
                 ( 1 )

The third eigenvector, if we want the basis to be orthonormal, can be found by the Gram-Schmidt
method. Another easy way is to take the cross product of these two eigenvectors, and we find
⟨3| = (1, 0, 0).
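The eigenvalues of the matrix (80), and the fact that only the non-degenerate eigenvector is fixed while the degenerate subspace allows arbitrary combinations, can be verified numerically; a sketch:

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
evals, evecs = np.linalg.eigh(A)       # eigenvalues in ascending order
assert np.allclose(evals, [-1.0, 1.0, 1.0])

# The non-degenerate eigenvector (eigenvalue -1) is fixed up to an overall sign ...
v = evecs[:, 0]
expected = np.array([0.0, 1.0, -1.0]) / np.sqrt(2)
assert np.allclose(v, expected) or np.allclose(v, -expected)

# ... while any combination inside the degenerate subspace is again an eigenvector.
w = 0.3 * evecs[:, 1] + 0.7 * evecs[:, 2]
assert np.allclose(A @ w, w)
```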
4.5 Functions of a Matrix

One can write a function of a square matrix just as one wrote the functions of operators. In fact,
to a very good approximation, what goes for operators goes for square matrices too. Thus, if |a⟩ is
an eigenvector of A with eigenvalue a, then A²|a⟩ = a²|a⟩ and Aⁿ|a⟩ = aⁿ|a⟩.

Suppose A is some n × n matrix. Consider the determinant of λ1 − A, which is a polynomial in
λ, with highest power λⁿ, and can be written as

    det(λ1 − A) = λⁿ + c_{n−1}λⁿ⁻¹ + ⋯ + c₁λ + c₀ .                        (83)

The equation

    det(λ1 − A) = λⁿ + c_{n−1}λⁿ⁻¹ + ⋯ + c₁λ + c₀ = 0                      (84)

is known as the secular or characteristic equation for A. The n roots correspond to the n eigenvalues of A. The Cayley-Hamilton theorem states that if we replace λ by A in (84), the polynomial
in A should be equal to zero:

    Aⁿ + c_{n−1}Aⁿ⁻¹ + ⋯ + c₁A + c₀1 = 0 .                                 (85)

Acting with this polynomial on an eigenvector |a⟩ with eigenvalue a gives

    [Aⁿ + c_{n−1}Aⁿ⁻¹ + ⋯ + c₀1]|a⟩ = [aⁿ + c_{n−1}aⁿ⁻¹ + ⋯ + c₀]|a⟩ = 0 ,   (86)

from (84). This is true for all eigenvectors, so the matrix polynomial must identically be zero.
To see what we exactly mean by the Cayley-Hamilton theorem, consider the matrix

    A = ( a  b )
        ( c  d ) .

The characteristic equation is

    | λ−a   −b  |
    |  −c   λ−d |  =  λ² − (a + d)λ + (ad − bc) = 0 .                      (87)

The Cayley-Hamilton theorem then says

    ( a  b )²  −  (a + d) ( a  b )  +  (ad − bc) ( 1  0 )  =  0 .          (88)
    ( c  d )             ( c  d )               ( 0  1 )
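Eq. (88) can be checked for a particular 2 × 2 matrix (the numbers below are an arbitrary choice); a sketch:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
a, b, c, d = A.ravel()

# Cayley-Hamilton for a 2x2 matrix, eq. (88):
# A^2 - (a+d) A + (ad - bc) 1 should be the zero matrix.
CH = A @ A - (a + d) * A + (a * d - b * c) * np.eye(2)
assert np.allclose(CH, np.zeros((2, 2)))
```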