Linear Algebra
Introduction
The back cover of Gilbert Strang's book Introduction to Linear Algebra summarizes all of linear algebra:
\[ Ax = b \ (N \times N) \qquad Ax = b \ (M \times N) \qquad Ax = \lambda x \ (N \times N) \qquad Av = \sigma u \ (M \times N) \]
If you understand these equations, then you know almost everything there is to know about linear algebra theory, and you need not continue reading. In this class, we can't cover the whole of linear algebra (which is typically taught over one or two semesters). Instead, we cover just a few concepts from linear algebra that are important to electrical engineering.
Matrices
Definition 1. A matrix is an M × N array of numbers that has M rows and N columns.

Example 2.1.
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 2 \\ 6 & -3 & 1 \end{bmatrix} \]
In MATLAB, one would create this matrix using the following command:
A = [1 2 3; 2 5 2; 6 -3 1]
Elements within a row are separated by spaces, and the rows are delimited by semicolons.
2.1 A matrix is an operator
A matrix A acts on a vector x to produce the vector Ax. You can think of this like a system with x as the input and Ax as the output.

Example 2.2.
\[ x = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \qquad Ax = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 2 \\ 6 & -3 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \]
The output will be a 3 × 1 vector. We'll get to the mechanics of matrix multiplication in a moment. In MATLAB, one would create this vector and perform the matrix multiplication using the following commands:
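x = [1; 2; 1]
A*x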
Figure 1 shows graphically what operation the rotation matrix
\[ R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \]
performs: it rotates a vector v by an angle θ so that it becomes v′.

[Figure 1: The vector v′ is rotated by θ from v.]

To derive this matrix, we use trigonometry. Let θ1 be the angle of v and θ2 = θ1 + θ the angle of v′. Then
\[ v_x = |v|\cos\theta_1 \qquad v_y = |v|\sin\theta_1 \]
and similarly,
\[ v'_x = |v'|\cos\theta_2 \qquad v'_y = |v'|\sin\theta_2 . \]
Substituting these expressions into the expressions for v′_x and v′_y (noting that |v′| = |v|), and using v_y = v_x tan θ1, we get
\[ v'_x = v_x\, \frac{\cos\theta_1\cos\theta - \sin\theta_1\sin\theta}{\cos\theta_1} = v_x\cos\theta - v_y\sin\theta \]
\[ v'_y = v_x\sin\theta + v_y\cos\theta \]
or, in matrix form,
\[ \begin{bmatrix} v'_x \\ v'_y \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} v_x \\ v_y \end{bmatrix}, \]
thus v′ = Rv.
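As a quick numerical check in MATLAB (a sketch; the angle and vector are arbitrary choices):

theta = pi/4;                                        % rotate by 45 degrees
R = [cos(theta) -sin(theta); sin(theta) cos(theta)];
v = [1; 0];
R*v                                                  % returns [0.7071; 0.7071]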
2.2 Matrix multiplication

Let
\[ A = \begin{bmatrix} 3 & 1 \\ 4 & 2 \end{bmatrix}, \qquad v = \begin{bmatrix} 2 \\ 1 \end{bmatrix}. \]
One way to view the operation of matrix multiplication is as the dot product of the rows of the matrix with the column of the vector. (In this section of the course, we will write the dot product in matrix notation.)
\[ Av = \begin{bmatrix} 3 & 1 \\ 4 & 2 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 7 \\ 10 \end{bmatrix} \]
Another way to view the operation of matrix multiplication is as a linear combination of the columns of the matrix:
\[ Av = 2\begin{bmatrix} 3 \\ 4 \end{bmatrix} + 1\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 6 \\ 8 \end{bmatrix} + \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 7 \\ 10 \end{bmatrix} \]
This is the way that I want you to think of the operation of matrix multiplication, because this first part of the course is largely about linear combinations of vectors, matrices, and functions. The operation of the matrix on a vector produces a vector that is a linear combination of the columns of the matrix.

Multiplication of a matrix by a matrix can be thought of in a similar way, except that each column of the resulting matrix is a linear combination of the columns of the first matrix in the multiplication:
\[ M = AB = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix} \]
The first column of M is
\[ b_{11}\begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix} + b_{21}\begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix} + b_{31}\begin{bmatrix} a_{13} \\ a_{23} \end{bmatrix} \]
and the second column is
\[ b_{12}\begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix} + b_{22}\begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix} + b_{32}\begin{bmatrix} a_{13} \\ a_{23} \end{bmatrix}. \]
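A quick MATLAB check of the two views of matrix multiplication (a sketch; the variable names are ours):

A = [3 1; 4 2];
v = [2; 1];
A*v                          % row (dot-product) view
v(1)*A(:,1) + v(2)*A(:,2)    % column view: the same vector, [7; 10]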
Vector spaces
The concept of a vector space is fundamental to understanding linear systems in many fields of electrical engineering. We'll define a vector space, and then introduce the column space of a matrix. There are only two fundamental linear operations that we can perform on vectors:
1. Addition: x + y
2. Multiplication by a scalar: cx
Notice that the result of each of these operations is a vector in the same space as the original vectors.
For example, we can add the vectors
\[ \begin{bmatrix} 1 \\ 3 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 2 \\ 1 \end{bmatrix}. \]
Both of these vectors exist in $\mathbb{R}^2$, and so does their sum. Multiplying a vector by a scalar produces another vector in the same space as the first vector:
\[ 2\begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 6 \\ 4 \end{bmatrix}. \]
We can combine the fundamental linear operations to form the subspace spanned by x and y.

Definition 2. The subspace spanned by x and y contains all linear combinations of x and y: cx + dy.
3.1 The column space of a matrix
Now, remember that for b = Ax, b is a linear combination of the column vectors of the matrix A. The vector b is restricted to some subspace, and that subspace is defined by the matrix.

Example 3.2. Here is a simple example to demonstrate the column space of a matrix. Consider the matrix
\[ A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}. \]
Here is a picture that shows the two column vectors (we'll call them v1 and v2) of the matrix A. If we perform the only two operations (multiplication of a vector by a scalar and addition of vectors), it should be clear that the resulting vector must lie on the line on which the column vectors lie. Therefore, although the matrix is 2 × 2, the column space of the matrix is one dimensional, i.e., it does not span $\mathbb{R}^2$.
[Figure: the column vectors v1 and v2 of A lie along the same line.]
Another way to see this limitation of the matrix is through the linear system of equations the matrix encodes, Ax = b:
\[ \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}, \]
which can be written as
\[ x_1 + 2x_2 = b_1 \]
\[ 2x_1 + 4x_2 = b_2. \]
Solving these equations, we find that 2b1 − b2 = 0. What this says is that the resulting vector b must have components b1 and b2 that satisfy the equation 2b1 − b2 = 0. We are not allowed to pick arbitrary numbers for b1 and b2, only those that make the vector lie on the line 2b1 − b2 = 0.
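In MATLAB, one can verify this directly (a sketch):

A = [1 2; 2 4];
rank(A)    % 1: the columns are dependent, so the column space is a line
null(A)    % a basis vector for the nullspace of A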
Example 3.3. The vectors
\[ u = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} \qquad v = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} \qquad w = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]
form the columns of the matrix A,
\[ A = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}. \]
These vectors are shown in Figure 2.
[Figure 2: the vectors u, v, and w.]
For
\[ x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \]
can the vector Ax lie anywhere in $\mathbb{R}^3$? Your undergraduate class in linear algebra answered this question in great detail. In this class, we illustrate the concept with some simple examples. Say we know the vector Ax = b. What combination of u, v, and w produces the vector b? (Is there a combination?) You've seen this question before as a system of linear equations:
\[ x_1 = b_1 \qquad x_2 - x_1 = b_2 \qquad x_3 - x_2 = b_3 \]
which has the solution
\[ x_1 = b_1 \qquad x_2 = b_1 + b_2 \qquad x_3 = b_1 + b_2 + b_3. \]
Substituting these expressions into the expression for Ax, we get
\[ b = b_1 \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + (b_1 + b_2) \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} + (b_1 + b_2 + b_3) \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]
\[ = b_1 u + (b_1 + b_2)v + (b_1 + b_2 + b_3)w. \]
We see that we are free to choose any b1, b2, b3. In other words, any vector b can be formed from the vectors u, v, w. The column space of A spans $\mathbb{R}^3$.
Example 3.4. Now we modify A from the previous example and call it C:
\[ C = \begin{bmatrix} 1 & 0 & 1 \\ -1 & 1 & 0 \\ 0 & -1 & -1 \end{bmatrix}. \]
Following the same procedure as in the previous example, we find that the components of b = Cx must satisfy
\[ b_1 + b_2 + b_3 = 0, \]
which is the equation of a plane. We are not free to choose any b1, b2, b3; we have to choose b1, b2, b3 on the plane. The vector b must point to a point on the plane. The column space of C does not span $\mathbb{R}^3$ because we can't form any vector we want using a linear combination of the columns of C. This restriction arises because the third column is a linear combination of the first two columns:
\[ \underbrace{\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}}_{\text{1st col.}} + \underbrace{\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}}_{\text{2nd col.}} = \underbrace{\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}}_{\text{3rd col.}} \]
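A MATLAB check of the two examples (a sketch, using the matrices as written above):

A = [1 0 0; -1 1 0; 0 -1 1];
C = [1 0 1; -1 1 0; 0 -1 -1];
rank(A)    % 3: the columns of A span R^3
rank(C)    % 2: the columns of C only span a plane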
In electrical engineering applications, we try to find sets of vectors that span the whole space (a basis).
3.2 Basis
Definition 3. A basis is a set of linearly independent vectors that span a space.

Definition 4. A set of vectors {v1, . . . , vN} is linearly independent if c1v1 + · · · + cNvN = 0 only for c1 = c2 = · · · = cN = 0.

Definition 5. Two vectors v1 and v2 are orthogonal if $v_1^T v_2 = 0$.

To normalize a vector, divide the vector by its length, so that $\|v\| = \sqrt{v^T v} = 1$.

Definition 6. A set of vectors forms an orthonormal basis if the vectors span the subspace, and
\[ v_i^T v_j = \begin{cases} 0 & \text{for } i \neq j \\ 1 & \text{for } i = j \end{cases} \]
3.3 The Gram-Schmidt procedure
The Gram-Schmidt procedure finds an orthonormal basis from a matrix with independent column vectors. We start with three vectors (the method extends to n vectors): a1, a2, a3. We want to produce orthonormal vectors q1, q2, and q3. To do this, we form an orthogonal set of vectors b1, b2, b3, and normalize them to produce the orthonormal basis q1, q2, q3. The Gram-Schmidt method works by projecting one vector onto another. The projection of a vector v onto another vector u is
\[ \mathrm{proj}_u(v) = \frac{u^T v}{u^T u}\, u. \]
It is easy to see how this expression works. The dot product in the numerator is $\|u\|\|v\|\cos\theta$, where θ is the angle between the vectors. The denominator is $\|u\|\|u\|$. One factor of $\|u\|$ in the denominator cancels the $\|u\|$ in the numerator, and the other normalizes the final factor u. Therefore, the result is a vector of length $\|v\|\cos\theta$ pointing in the direction of u. We can now write the Gram-Schmidt procedure using the projection operator notation.
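In MATLAB, this projection can be written as an anonymous function (a sketch):

proj = @(u, v) (u'*v) / (u'*u) * u;    % projection of v onto u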
Step 1 Choose b1 = a1.

Step 2 Project a2 onto b1:
\[ p_1 = \mathrm{proj}_{b_1}(a_2) = \frac{b_1^T a_2}{b_1^T b_1}\, b_1 \]

Step 3 Then
\[ b_2 = a_2 - p_1 = a_2 - \frac{b_1^T a_2}{b_1^T b_1}\, b_1 \]

[Figure: a2, its projection p1 onto b1 = a1, and b2 = a2 − p1.]

The vector b2 is orthogonal to b1:
\[ b_1^T b_2 = b_1^T a_2 - \frac{b_1^T a_2}{b_1^T b_1}\, b_1^T b_1 = 0 \]
Step 4 Project a3 onto b1 and onto b2:
\[ \mathrm{proj}_{b_1}(a_3) = \frac{b_1^T a_3}{b_1^T b_1}\, b_1 \qquad \mathrm{proj}_{b_2}(a_3) = \frac{b_2^T a_3}{b_2^T b_2}\, b_2 \]
and we can construct a vector in the b1-b2 plane from these components:
\[ p_2 = \mathrm{proj}_{b_1}(a_3) + \mathrm{proj}_{b_2}(a_3). \]
Then, using the same idea that was used above to construct b2, we can see that
\[ b_3 = a_3 - p_2 = a_3 - \frac{b_1^T a_3}{b_1^T b_1}\, b_1 - \frac{b_2^T a_3}{b_2^T b_2}\, b_2. \]
Note that, in the figure, a3 does not (in general) lie in the b1-b2 plane.

[Figure: a3, its projection p2 into the b1-b2 plane, and b3 = a3 − p2.]
Example. Apply the procedure to the vectors
\[ a_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \qquad a_2 = \begin{bmatrix} 2 \\ 0 \\ 2 \end{bmatrix} \qquad a_3 = \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix}. \]

Step 1 Choose b1 = a1:
\[ b_1 = a_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \]
Step 2 Find p1:
\[ b_1^T a_2 = \begin{bmatrix} 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 2 \\ 0 \\ 2 \end{bmatrix} = 2 \qquad b_1^T b_1 = \begin{bmatrix} 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = 2 \]
\[ p_1 = \frac{2}{2} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \]
Step 3 Find b2:
\[ b_2 = a_2 - p_1 = \begin{bmatrix} 2 \\ 0 \\ 2 \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \]
Check that b2 is orthogonal to b1:
\[ \begin{bmatrix} 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} = 0 \]
Step 4 Find b3:
\[ b_3 = a_3 - \frac{b_1^T a_3}{b_1^T b_1}\, b_1 - \frac{b_2^T a_3}{b_2^T b_2}\, b_2 = \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix} - \frac{4}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} - \frac{4}{6}\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} \]
Step 5 Normalize to obtain the orthonormal basis:
\[ q_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \qquad q_2 = \frac{1}{\sqrt{6}} \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \qquad q_3 = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} \]
Some comments on the Gram-Schmidt procedure: it can easily be extended to any number of vectors. However, MATLAB does not implement the Gram-Schmidt procedure because it can lead to numerically unstable results. In MATLAB, one would use qr (QR factorization) instead. The columns of the output matrix Q are the basis vectors.
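For example, applying qr to the matrix whose columns are a1, a2, a3 from the worked example above (a sketch; qr may return the basis vectors with opposite signs):

A = [1 2 3; 1 0 1; 0 2 1];    % columns are a1, a2, a3
[Q, R] = qr(A);
Q                             % equals [q1 q2 q3] up to sign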
3.4 Matrix spaces and function spaces
We can perform the same operations on matrices and functions that we do on vectors, i.e., multiply by a scalar and add. Therefore, in the same way as we have created vector spaces, we can create matrix spaces and function spaces. A matrix space contains all linear combinations of a set of matrices. For example,
\[ A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad A_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \quad A_3 = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \quad A_4 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}. \]
This basis can be used to represent matrices of the form M = c1A1 + c2A2 + c3A3 + c4A4.

A function space contains all linear combinations of a set of functions:
\[ f(x) = a_0 + a_1 \cos x + b_1 \sin x + a_2 \cos 2x + \cdots \]
For example, the solutions of a linear differential equation form a function space:
\[ \frac{d^2y}{dx^2} = -y \quad\Rightarrow\quad y = c \sin x + d \cos x \]
\[ \frac{d^2y}{dx^2} = y \quad\Rightarrow\quad y = c e^{x} + d e^{-x} \]
In the same way that we can define an orthonormal basis for a vector space, one can find orthonormal bases for matrix and function spaces.
4.1 Why eigenvalues and eigenvectors are useful

1. Differential equations. Consider the coupled pair of differential equations
\[ \frac{dz}{dt} = y \qquad \frac{dy}{dt} = z. \]
In matrix notation:
\[ u = \begin{bmatrix} z \\ y \end{bmatrix} \qquad \frac{du}{dt} = Au \qquad A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]
If λ1 and λ2 are eigenvalues of A, and x1 and x2 are eigenvectors of A, then we can write
\[ u = c_1 e^{\lambda_1 t} x_1 + c_2 e^{\lambda_2 t} x_2. \]
We can check that these equations (note that there are two equations in the expression above) are solutions to the system of differential equations.
\[ \text{LHS:}\quad \frac{du}{dt} = c_1 \lambda_1 e^{\lambda_1 t} x_1 + c_2 \lambda_2 e^{\lambda_2 t} x_2 \]
\[ \text{RHS:}\quad Au = c_1 e^{\lambda_1 t} A x_1 + c_2 e^{\lambda_2 t} A x_2 = c_1 e^{\lambda_1 t} \lambda_1 x_1 + c_2 e^{\lambda_2 t} \lambda_2 x_2 \]
since Ax1 = λ1x1 and Ax2 = λ2x2.
Writing u as a linear combination of eigenvectors simplified the differential equations: they became decoupled. It is remarkable that on one side of the equations we took the derivative, and on the other side we multiplied by a matrix, yet both sides are the same.
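A numerical check of this solution in MATLAB (a sketch; the initial condition and time are arbitrary choices):

A = [0 1; 1 0];
[S, L] = eig(A);               % columns of S are x1 and x2
u0 = [1; 2];  t = 0.7;
c = S \ u0;                    % expand u0 in the eigenvector basis
S * (exp(diag(L)*t) .* c)      % c1*exp(l1*t)*x1 + c2*exp(l2*t)*x2
expm(A*t) * u0                 % matrix-exponential solution; the two agree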
2. Markov processes (Markov chains). The state evolves as
\[ x_1 = A x_0 \qquad x_2 = A x_1 \qquad x_3 = A x_2 \qquad \ldots \]
so in general
\[ x_k = A x_{k-1} = A^k x_0. \]
In many problems, we want to know x10, or x100, or maybe $\lim_{k\to\infty} x_k$. If x is an eigenvector of the matrix A and the associated eigenvalue is λ, then
\[ Ax = \lambda x \]
\[ A^2 x = AAx = A\lambda x = \lambda^2 x \]
\[ A^3 x = AAAx = AA\lambda x = A\lambda^2 x = \lambda^3 x \]
so in general
\[ A^k x = \lambda^k x. \]
In general, the initial state of a system u0 is unlikely to be an eigenvector of the system, but you know from earlier that we can express u0 as a linear combination of basis vectors. Let's use the eigenvectors as our basis. Let the matrix A have two eigenvectors, x1 and x2, with associated eigenvalues λ1 and λ2. Write
\[ u_0 = c_1 x_1 + c_2 x_2 \]
then
\[ u_1 = A u_0 = A(c_1 x_1 + c_2 x_2) = c_1 \lambda_1 x_1 + c_2 \lambda_2 x_2 \]
\[ \vdots \]
\[ u_k = c_1 \lambda_1^k x_1 + c_2 \lambda_2^k x_2 \]
The eigenvector basis is a good basis in which to work (if you can). Notice that the eigenvectors of $A^k$ are the same as those of A.
4.2 What are eigenvalues and eigenvectors?

An eigenvector of A is a vector x for which Ax is in the same direction as x; the eigenvalue is the factor by which the vector is scaled.

[Figure 3: A two-dimensional graphical interpretation of eigenvectors: Ax is parallel to x.]

Preview (for linear systems): eigenvalues are the gains associated with each fundamental mode (harmonic) of the system.

Example 4.1. Permutation matrix:
\[ A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \qquad A \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} y_2 \\ y_1 \end{bmatrix} \]
The matrix swaps the two components of a vector, so an eigenvector must be a vector whose direction is unchanged by the swap. Try
\[ x_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}: \qquad \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = 1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
so λ1 = 1. Try
\[ x_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}: \qquad \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = -1 \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]
so λ2 = −1.
4.3 Finding eigenvalues and eigenvectors
We want to know how to find the λ's and x's such that
\[ Ax = \lambda x. \]
Note that there are, in general, going to be multiple (λ, x) pairs. To find the eigenvalues, one solves the characteristic equation of the matrix. We start with
\[ Ax = \lambda x \]
\[ \lambda x = \lambda I x \qquad \text{(multiplying by the identity changes nothing)} \]
so
\[ (A - \lambda I)x = 0. \]
In the above equation, the value λ is subtracted from each of the diagonal components of A. For there to be a solution other than x = 0, the matrix A − λI must be singular (a singular matrix is a square matrix that has no inverse), so
\[ \det(A - \lambda I) = 0. \]
This equation is called the characteristic (or eigenvalue) equation.

Example 4.2.
\[ A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} \]
\[ \det(A - \lambda I) = (3-\lambda)(3-\lambda) - 1 = \lambda^2 - 6\lambda + 8 = (\lambda - 4)(\lambda - 2) \]
so
\[ \lambda_1 = 4 \qquad \lambda_2 = 2. \]
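In MATLAB, the eigenvalues can be found with eig:

A = [3 1; 1 3];
eig(A)    % returns the eigenvalues 2 and 4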
In this class, we are not going to go through the formal procedure for finding the eigenvectors; you'll find that procedure in any linear algebra textbook. Instead, we'll find the eigenvectors by solving (A − λI)x = 0 by inspection for each eigenvalue.
For λ1 = 4:
\[ (A - \lambda_1 I)x_1 = \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix} x_1 = 0 \quad\Rightarrow\quad x_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
For λ2 = 2:
\[ (A - \lambda_2 I)x_2 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} x_2 = 0 \quad\Rightarrow\quad x_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]
Check:
\[ A x_1 = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = 4 \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \lambda_1 x_1 \]
\[ A x_2 = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \lambda_2 x_2 \]
4.4 Steady state of a Markov chain
Example 4.3. Eigenvectors and eigenvalues can tell one about the steady state of a system. The fraction of rental cars in Seattle starts at 1/50 = 0.02, and the fraction of rental cars outside Seattle starts at 49/50 = 0.98. Every month:
Every month, 80% of the cars in Seattle stay, and 20% of the cars in Seattle leave.
Every month, 5% of the cars outside Seattle come in, and 95% of the cars outside Seattle stay outside.
We want to figure out what percentage of cars is in Seattle and what percentage is outside Seattle after k months. This problem can be described by a Markov chain. We form the state transition matrix:
\[ A = \begin{bmatrix} 0.80 & 0.05 \\ 0.20 & 0.95 \end{bmatrix} \]
(This matrix is actually the transpose of what is usually called the state transition matrix in probability theory.) To see how to use this matrix, imagine that at some point there are 80 cars in Seattle and 20 cars outside Seattle. A month later, the number of cars in Seattle will have changed to (0.8)(80) + (0.05)(20) = 64 + 1 = 65 cars, because 80% of the cars in Seattle stay and 5% of the cars outside Seattle come into Seattle.

We are going to compute the percentage of cars in Seattle and outside Seattle (see the starting conditions given at the start of the example). We encode this information in a state vector:
\[ u_0 = \begin{bmatrix} 0.02 \\ 0.98 \end{bmatrix} \]
After one month,
\[ u_1 = \begin{bmatrix} 0.80 & 0.05 \\ 0.20 & 0.95 \end{bmatrix} \begin{bmatrix} 0.02 \\ 0.98 \end{bmatrix} = \begin{bmatrix} 0.065 \\ 0.935 \end{bmatrix}, \]
and after two months,
\[ u_2 = \begin{bmatrix} 0.09875 \\ 0.90125 \end{bmatrix}. \]
Does this system stabilize? If so, to what percentages? We use eigenvalues and eigenvectors to compute the answer. The eigenvalues and eigenvectors of the matrix A are
\[ \lambda_1 = 1,\ x_1 = \begin{bmatrix} 0.2 \\ 0.8 \end{bmatrix} \qquad \lambda_2 = 0.75,\ x_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}. \]
Now, we write u0 as a linear combination of the eigenvectors of the system:
\[ u_0 = \begin{bmatrix} 0.02 \\ 0.98 \end{bmatrix} = c_1 x_1 + c_2 x_2 = c_1 \begin{bmatrix} 0.2 \\ 0.8 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]
Note that u0 is not an eigenvector of the system. It could have been an eigenvector, but in general it will not be. We now solve for c1 and c2:
\[ 0.02 = 0.2 c_1 + c_2 \]
\[ 0.98 = 0.8 c_1 - c_2 \]
Solving these equations, we find that c1 = 1 and c2 = −0.18. Therefore,
\[ u_0 = \begin{bmatrix} 0.2 \\ 0.8 \end{bmatrix} - 0.18 \begin{bmatrix} 1 \\ -1 \end{bmatrix}. \]
Now, we can easily find the future states:
\[ u_1 = A u_0 = A\left( \begin{bmatrix} 0.2 \\ 0.8 \end{bmatrix} - 0.18 \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right) = 1 \begin{bmatrix} 0.2 \\ 0.8 \end{bmatrix} - 0.18\,\lambda_2 \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]
\[ \vdots \]
\[ u_k = A^k u_0 = 1^k \begin{bmatrix} 0.2 \\ 0.8 \end{bmatrix} - 0.18\,(0.75)^k \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]
As k → ∞, the (0.75)^k term vanishes. Therefore, 20% of the cars will eventually be in Seattle, and 80% outside Seattle. Notice that the eigenvector with eigenvalue 1 became the steady state.
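A quick MATLAB check (a sketch):

A = [0.80 0.05; 0.20 0.95];
u0 = [0.02; 0.98];
A^120 * u0         % approximately [0.2; 0.8], the steady state
[S, L] = eig(A)    % one eigenvalue is 1; its eigenvector gives the steady state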
4.4.1 The state transition matrix

In general, the state transition matrix for a two-state system (in the convention used above) is
\[ A = \begin{bmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{bmatrix} \]
where p00 is the probability that the system will transition from state 0 to state 0 (i.e., remain in state 0), p01 is the probability that the system will transition from state 1 to state 0, p10 is the probability that the system will transition from state 0 to state 1, and p11 is the probability that the system remains in state 1.
4.5 PageRank
From The Anatomy of a Large-Scale Hypertextual Web Search Engine: "PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web."
4.6 Diagonalizing a matrix
For N independent eigenvectors of A (N × N), create a matrix S whose columns are the eigenvectors:
\[ S = \begin{bmatrix} | & | & & | \\ x_1 & x_2 & \cdots & x_N \\ | & | & & | \end{bmatrix} \]
Multiplying by A (the columns of AS are A times the columns of S, written in matrix notation):
\[ AS = \begin{bmatrix} | & | & & | \\ Ax_1 & Ax_2 & \cdots & Ax_N \\ | & | & & | \end{bmatrix} = \begin{bmatrix} | & | & & | \\ \lambda_1 x_1 & \lambda_2 x_2 & \cdots & \lambda_N x_N \\ | & | & & | \end{bmatrix} = \begin{bmatrix} | & | & & | \\ x_1 & x_2 & \cdots & x_N \\ | & | & & | \end{bmatrix} \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_N \end{bmatrix} = S\Lambda \]
where Λ is a diagonal matrix called the eigenvalue matrix. Then
\[ AS = S\Lambda \qquad A = S\Lambda S^{-1} \]
\[ A^2 = S\Lambda S^{-1} S\Lambda S^{-1} = S\Lambda^2 S^{-1} \qquad \ldots \qquad A^k = S\Lambda^k S^{-1}. \]
In MATLAB, [S, L] = eig(A) returns the eigenvector matrix S and the diagonal eigenvalue matrix L.
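A quick check of A^k = SΛ^k S⁻¹ in MATLAB (a sketch):

A = [3 1; 1 3];
[S, L] = eig(A);
S * L^5 / S    % S * L^5 * inv(S)
A^5            % the same matrix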
If all eigenvalues of A have magnitude less than 1, then A^k → 0 as k → ∞. More generally, the terms associated with eigenvalues of magnitude less than 1 die out as k grows, which is why a system such as the Markov chain above eventually comes to rest at its steady state.
Special matrices
(This section is not complete.) There are many matrices that have special properties. For example, symmetric matrices (Aᵀ = A) have real eigenvalues and a full set of orthogonal eigenvectors. An orthogonal matrix is a square matrix with orthonormal columns; if Q is an orthogonal matrix, then Qᵀ = Q⁻¹. A special matrix that you've seen in these notes is a Markov matrix M, in which all elements are greater than 0 and the elements in each column sum to 1. For this type of matrix, the largest eigenvalue is 1, and the columns of M^k approach the steady-state eigenvector s, which satisfies Ms = s. Knowledge of these properties can be very useful in working with these matrices, and the special properties of the matrix are the properties of the system implemented (or modeled) by the matrix. Your undergraduate classes probably dedicated some time to describing many of these special matrices and their properties.
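For instance, the symmetric-matrix properties can be checked numerically (a sketch):

A = [2 1; 1 2];     % symmetric: A' equals A
[S, L] = eig(A);
diag(L)             % real eigenvalues
S' * S              % the identity: the eigenvectors are orthonormal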
Complex vectors and matrices

Motivation: the fast Fourier transform (FFT) is an efficient way of multiplying by the Fourier matrix. The Fourier matrix is therefore very useful, but it contains complex numbers, and we need to modify some matrix and vector operations slightly to make the mathematics make sense.

First, we need to define the length of a vector that contains complex numbers. For vectors that contain only real numbers, the length of the vector is the square root of the inner product:
\[ l = \sqrt{x^T x} \qquad \text{for } x \text{ real.} \]
Now, what happens if a vector z contains complex numbers,
\[ z = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_N \end{bmatrix}? \]
It turns out that
\[ z^T z = \begin{bmatrix} z_1 & z_2 & \cdots & z_N \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_N \end{bmatrix} \]
can vanish for a nonzero vector. For example,
\[ \begin{bmatrix} 1 & j \end{bmatrix} \begin{bmatrix} 1 \\ j \end{bmatrix} = 1 + j^2 = 0, \]
but the vector obviously does not have 0 length. To fix this problem, we define the length of the vector as
\[ \|z\|^2 = (z^*)^T z = z^H z \]
where z^H is called the Hermitian (conjugate transpose) of z. The inner product between two (complex) vectors becomes y^H x. We also need to adapt the definition of a symmetric matrix (Aᵀ = A) to handle matrices that contain complex numbers. A symmetric complex matrix is defined by the Hermitian transpose (instead of simply the transpose for real matrices):
\[ A^H = A \qquad \text{where } A^H = (A^*)^T. \]

Example 6.1.
\[ A = \begin{bmatrix} 2 & 3+j \\ 3-j & 5 \end{bmatrix} \qquad A^H = (A^*)^T = \begin{bmatrix} 2 & 3+j \\ 3-j & 5 \end{bmatrix} = A \]
so the matrix is Hermitian.

With these definitions, we are now in a position to define the complex equivalents of perpendicular (orthogonal) vectors and of orthogonal matrices.

Definition 7. A set of complex vectors q1, q2, . . ., qN is said to be mutually orthonormal if
\[ q_i^H q_j = \begin{cases} 0 & i \neq j \\ 1 & i = j \end{cases} \]

Definition 8. A unitary matrix (the complex matrix equivalent of an orthogonal matrix) satisfies
\[ Q^H Q = I. \]

Example 6.2. A matrix Q whose columns q1, q2, . . ., qN form a set of mutually orthonormal vectors is a unitary matrix:
\[ Q = \begin{bmatrix} | & | & & | \\ q_1 & q_2 & \cdots & q_N \\ | & | & & | \end{bmatrix} \]
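In MATLAB, the operator ' is the conjugate (Hermitian) transpose, so these checks are direct (a sketch):

A = [2 3+1j; 3-1j 5];
isequal(A, A')    % true: A is Hermitian
eig(A)            % real eigenvalues, as for any Hermitian matrix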
6.1 The Fourier matrix

The discrete Fourier transform of a length-N signal x[i] is
\[ X[k] = \sum_{i=0}^{N-1} x[i]\, w_N^{ik}, \qquad w_N = e^{-j2\pi/N}. \]
We can rewrite the discrete Fourier transform in matrix notation: X = F_N x, where
\[ F_N = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & w_N & w_N^2 & \cdots & w_N^{N-1} \\ 1 & w_N^2 & w_N^4 & \cdots & w_N^{2(N-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & w_N^{N-1} & w_N^{2(N-1)} & \cdots & w_N^{(N-1)^2} \end{bmatrix}. \]

Example 6.3.
\[ F_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -j & -1 & j \\ 1 & -1 & 1 & -1 \\ 1 & j & -1 & -j \end{bmatrix} \]
The columns of the matrix are orthogonal and form a basis. Note that the matrix is not unitary (as written), because the column vectors are not mutually orthonormal; one needs a scaling factor to normalize the column vectors. The Fourier transform of the four-element vector
\[ x = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \]
is
\[ X = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -j & -1 & j \\ 1 & -1 & 1 & -1 \\ 1 & j & -1 & -j \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]
One way to see that this is the correct result is that the Fourier transform of a DC signal is a delta (here scaled by N = 4).
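One can reproduce this in MATLAB (a sketch; applying fft to the identity matrix yields the DFT matrix):

F4 = fft(eye(4))    % the 4-by-4 Fourier (DFT) matrix
F4 * ones(4, 1)     % [4; 0; 0; 0]: a DC input produces a scaled delta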
You'll learn much more about the Fourier transform and related topics in EE 518. We'll cover a little more in the next class.