
Appendix A1

Review of Linear Algebra


Matrix and vector
An $m \times n$ matrix A is an array of m rows and n columns of elements:

$$A = [a_{ij}]_{m\times n} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

where $a_{ij}$ denotes the element in the ith row and jth column of A, often referred
to as the (i,j)-element of A.

The elements in a matrix can be anything of interest (numbers, functions or
symbols), but we will focus on matrices with real numbers as elements.

A vector of dimension n is an $n \times 1$ matrix.

The transpose of an $m \times n$ matrix A, denoted by $A^T$, is the $n \times m$ matrix with $a_{ij}$
at its jth row and ith column:

$$A^T = [a_{ji}]_{n\times m} = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{pmatrix}$$

In particular, a vector v of dimension n can be written as

$$v = (v_1\; v_2\; \cdots\; v_n)^T \quad\text{or}\quad v^T = (v_1\; v_2\; \cdots\; v_n)$$

A matrix A is said to be symmetric if $A^T = A$, or equivalently, $a_{ij} = a_{ji}$ for all
$i, j = 1, \ldots, n = m$ (it must be a square matrix).

The scalar multiple of a matrix: $\lambda A = [\lambda a_{ij}]_{m\times n}$.

The sum of matrices: $A + B = [a_{ij} + b_{ij}]_{m\times n}$.

The product of $A = [a_{ij}]_{m\times n}$ and $B = [b_{ij}]_{n\times l}$:

$$AB = C = [c_{ij}]_{m\times l} \quad\text{with}\quad c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}$$

Note that $AB \neq BA$ in general.

In particular, for $v = (v_1\; v_2\; \cdots\; v_n)^T$ and $u = (u_1\; u_2\; \cdots\; u_n)^T$,

$$u^T v = v^T u = \sum_{i=1}^{n} u_i v_i = u_1v_1 + u_2v_2 + \cdots + u_nv_n$$

An $n \times n$ matrix I is called an identity matrix if it has the form

$$I = \operatorname{diag}(1, 1, \ldots, 1) = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix}$$

It satisfies $AI = IA = A$ for any $n \times n$ matrix A.
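A minimal numerical sketch of these operations in NumPy (the matrices below are arbitrary examples chosen for illustration, not taken from the notes):

```python
import numpy as np

# Two arbitrary example matrices (assumptions for illustration only).
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

print(A.T)            # transpose A^T
print(2.5 * A)        # scalar multiple
print(A + B)          # matrix sum
print(A @ B)          # matrix product AB
print(B @ A)          # BA, which differs from AB here
print(np.allclose(A @ B, B @ A))   # False: AB != BA in general

I = np.eye(2)         # 2 x 2 identity matrix
print(np.allclose(A @ I, A) and np.allclose(I @ A, A))  # True: AI = IA = A
```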

Quadratic form

For any $n \times n$ matrix $A = [a_{ij}]$ and vector $x = (x_1\; x_2\; \cdots\; x_n)^T$, the product

$$x^T A x = (x_1\; x_2\; \cdots\; x_n)
\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
= \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} x_i x_j$$

is called a quadratic form. In particular,

$$(x\; y)\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}
= a_{11}x^2 + (a_{12} + a_{21})xy + a_{22}y^2$$

An $n \times n$ symmetric matrix A is said to be:

- positive definite if $x^T A x > 0$ for any $n \times 1$ vector $x \neq 0$;
- positive semi-definite if $x^T A x \geq 0$ for any $n \times 1$ vector $x$.
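As a quick sketch (with an arbitrary symmetric example matrix assumed for illustration), the quadratic form can be evaluated either as $x^TAx$ or via the double sum:

```python
import numpy as np

# Arbitrary 2 x 2 symmetric example matrix (an assumption for illustration).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = np.array([1.0, -2.0])

q1 = x @ A @ x                                   # x^T A x
q2 = sum(A[i, j] * x[i] * x[j]                   # double-sum form
         for i in range(2) for j in range(2))
print(q1, q2)   # both equal a11*x1^2 + (a12 + a21)*x1*x2 + a22*x2^2 = 10.0
```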

Determinant

For a $2 \times 2$ matrix $A = [a_{ij}]$, its determinant is denoted and defined by

$$|A| = a_{11}a_{22} - a_{12}a_{21}$$

For a $3 \times 3$ matrix $A = [a_{ij}]$, the minor $M_{ij}$ of $a_{ij}$ is the determinant of the
$2 \times 2$ matrix obtained by deleting row i and column j from A.

The co-factor $A_{ij}$ of $a_{ij}$ is defined by $A_{ij} = (-1)^{i+j} M_{ij}$.

The determinant of a $3 \times 3$ matrix A can be calculated by

$$|A| = a_{i1}A_{i1} + a_{i2}A_{i2} + a_{i3}A_{i3} \quad\text{(expansion by row i)}$$

or

$$|A| = a_{1j}A_{1j} + a_{2j}A_{2j} + a_{3j}A_{3j} \quad\text{(expansion by column j)}$$

for any row i = 1, 2, 3 or any column j = 1, 2, 3.

Similarly, we can calculate $|A|$ for an $n \times n$ matrix A with n = 4, 5, ….

For example,

$$|A| = \begin{vmatrix} 2 & 4 & 5 \\ 0 & 3 & -2 \\ -3 & 6 & 8 \end{vmatrix}
= 2\begin{vmatrix} 3 & -2 \\ 6 & 8 \end{vmatrix}
- 4\begin{vmatrix} 0 & -2 \\ -3 & 8 \end{vmatrix}
+ 5\begin{vmatrix} 0 & 3 \\ -3 & 6 \end{vmatrix}
= 2[24 - (-12)] - 4[0 - 6] + 5[0 - (-9)] = 141$$

(expansion by row 1); or

$$|A| = 2\begin{vmatrix} 3 & -2 \\ 6 & 8 \end{vmatrix}
- 0\begin{vmatrix} 4 & 5 \\ 6 & 8 \end{vmatrix}
+ (-3)\begin{vmatrix} 4 & 5 \\ 3 & -2 \end{vmatrix}
= 2[24 - (-12)] + (-3)[-8 - 15] = 141$$

(expansion by column 1).
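A short NumPy check of the same calculation (using the matrix as reconstructed above):

```python
import numpy as np

# Matrix from the determinant example above.
A = np.array([[ 2.0, 4.0,  5.0],
              [ 0.0, 3.0, -2.0],
              [-3.0, 6.0,  8.0]])

print(np.linalg.det(A))          # about 141.0

def minor(M, i, j):
    """Determinant of M with row i and column j deleted."""
    sub = np.delete(np.delete(M, i, axis=0), j, axis=1)
    return np.linalg.det(sub)

# Cofactor expansion along row 1 (index 0).
det_row1 = sum((-1) ** (0 + j) * A[0, j] * minor(A, 0, j) for j in range(3))
print(det_row1)                  # also about 141.0
```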


In particular, for a triangular matrix the determinant is the product of the
diagonal elements:

$$\begin{vmatrix} a_{11} & & & 0 \\ a_{21} & a_{22} & & \\ \vdots & \vdots & \ddots & \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}
= a_{11}a_{22}\cdots a_{nn}$$

Linear independence of vectors

Vectors $v_1, v_2, \ldots, v_k$ are said to be linearly independent if

$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k = 0$$

holds only for scalars $\alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$.

Vectors $v_1, v_2, \ldots, v_k$ are said to be linearly dependent if

$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k = 0$$

holds for at least one $\alpha_i \neq 0$.

Equivalently, $v_1, v_2, \ldots, v_k$ are linearly dependent if there exists a $v_i$ such that,
for some scalars $\beta_j$, $j \neq i$,

$$v_i = \sum_{j \neq i}\beta_j v_j = \sum_{j \neq i}\left(-\frac{\alpha_j}{\alpha_i}\right)v_j \qquad (\alpha_i \neq 0)$$

That is, one of $v_1, v_2, \ldots, v_k$ is a linear combination of the others.

Rank of matrix

For an $m \times n$ matrix A, the maximum number of linearly independent rows
(columns) is called the row (column) rank of A.

It can be proven that the row and column ranks of A are always equal. This
common value is defined to be the rank of A, denoted by Rank(A).

Obviously, $\operatorname{Rank}(A) \leq \min(m, n)$ for an $m \times n$ matrix A.

A row operation on a matrix is one of: (i) multiply a row by a non-zero scalar;
(ii) interchange two rows; and (iii) add a scalar multiple of one row to
another.

Rank(A) can be found by performing row operations on matrix A until
reaching an upper triangular matrix. The number of non-zero rows in this
upper triangular matrix is equal to Rank(A).

Example A1.1.
$$A = \begin{pmatrix} 1 & 3 & 2 & 0 \\ 2 & 8 & 8 & 2 \\ 3 & 6 & 9 & 0 \\ 4 & 10 & 1 & -3 \end{pmatrix}
\;\xrightarrow{\substack{R2 - 2R1 \\ R3 - 3R1 \\ R4 - 4R1}}\;
\begin{pmatrix} 1 & 3 & 2 & 0 \\ 0 & 2 & 4 & 2 \\ 0 & -3 & 3 & 0 \\ 0 & -2 & -7 & -3 \end{pmatrix}$$

$$\;\xrightarrow{\substack{R3 + 1.5R2 \\ R4 + R2}}\;
\begin{pmatrix} 1 & 3 & 2 & 0 \\ 0 & 2 & 4 & 2 \\ 0 & 0 & 9 & 3 \\ 0 & 0 & -3 & -1 \end{pmatrix}
\;\xrightarrow{R4 + \frac{1}{3}R3}\;
\begin{pmatrix} 1 & 3 & 2 & 0 \\ 0 & 2 & 4 & 2 \\ 0 & 0 & 9 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

Thus Rank(A) = 3.
Note: A need not be square for its rank to be determined in this way.
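A brief numerical check of Example A1.1 (using the matrix as reconstructed above):

```python
import numpy as np

# Matrix of Example A1.1 as reconstructed above.
A = np.array([[1.0,  3.0, 2.0,  0.0],
              [2.0,  8.0, 8.0,  2.0],
              [3.0,  6.0, 9.0,  0.0],
              [4.0, 10.0, 1.0, -3.0]])

print(np.linalg.matrix_rank(A))   # 3
```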

Inverse matrix

For an $n \times n$ matrix A, its inverse matrix, denoted by $A^{-1}$, is an $n \times n$ matrix
(if it exists) such that $AA^{-1} = A^{-1}A = I$. If $A^{-1}$ exists, A is said to be invertible.

An $n \times n$ matrix A is invertible if and only if $|A| \neq 0$, which is equivalent to
Rank(A) = n; A is said to have full rank in this case.

The inverse $A^{-1}$ of a square matrix A, if it exists, can be found by row
operations as follows:

$$(A \mid I) \;\xrightarrow{\text{row operations}}\; (I \mid B) \qquad\text{(A1.1)}$$

Then $A^{-1} = B$.

An $n \times n$ matrix A is said to be orthogonal if $AA^T = A^TA = I$, or $A^T = A^{-1}$.

$A = (a_1\; \cdots\; a_n)$ is orthogonal if and only if $a_1, \ldots, a_n$ are orthonormal vectors
in the sense that $a_i^T a_i = 1$ and $a_i^T a_j = 0$ for $i \neq j$, $1 \leq i, j \leq n$.

Example A1.2.
$$A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 3 & 2 & -1 \end{pmatrix}, \qquad
(A \mid I) = \left(\begin{array}{ccc|ccc} 0 & 1 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 3 & 2 & -1 & 0 & 0 & 1 \end{array}\right)$$

$$\xrightarrow{R1 \leftrightarrow R2}
\left(\begin{array}{ccc|ccc} 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 3 & 2 & -1 & 0 & 0 & 1 \end{array}\right)
\xrightarrow{R3 - 3R1}
\left(\begin{array}{ccc|ccc} 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 2 & -4 & 0 & -3 & 1 \end{array}\right)$$

$$\xrightarrow{R3 - 2R2}
\left(\begin{array}{ccc|ccc} 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & -6 & -2 & -3 & 1 \end{array}\right)
\xrightarrow{R3 \div (-6)}
\left(\begin{array}{ccc|ccc} 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1/3 & 1/2 & -1/6 \end{array}\right)$$

$$\xrightarrow{\substack{R1 - R3 \\ R2 - R3}}
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -1/3 & 1/2 & 1/6 \\ 0 & 1 & 0 & 2/3 & -1/2 & 1/6 \\ 0 & 0 & 1 & 1/3 & 1/2 & -1/6 \end{array}\right)$$

$$A^{-1} = \begin{pmatrix} -1/3 & 1/2 & 1/6 \\ 2/3 & -1/2 & 1/6 \\ 1/3 & 1/2 & -1/6 \end{pmatrix}
= \frac{1}{6}\begin{pmatrix} -2 & 3 & 1 \\ 4 & -3 & 1 \\ 2 & 3 & -1 \end{pmatrix}$$
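A quick verification of Example A1.2 with NumPy (using the matrix as reconstructed above):

```python
import numpy as np

# Matrix of Example A1.2 as reconstructed above.
A = np.array([[0.0, 1.0,  1.0],
              [1.0, 0.0,  1.0],
              [3.0, 2.0, -1.0]])

A_inv = np.linalg.inv(A)
print(A_inv * 6)                              # 6*A^{-1}: the integer matrix above
print(np.allclose(A @ A_inv, np.eye(3)))      # True: A A^{-1} = I
```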

Alternatively, $A^{-1}$ of $A = [a_{ij}]_{n\times n}$ can be determined by

$$A^{-1} = \frac{1}{|A|}\,[A_{ij}]^T \qquad\text{(A1.2)}$$

where $A_{ij}$ is the co-factor of $a_{ij}$.

For example, the inverse of a $2 \times 2$ matrix $A = [a_{ij}]$ is given by

$$A^{-1} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}^{-1}
= \frac{1}{|A|}\begin{pmatrix} A_{11} & A_{21} \\ A_{12} & A_{22} \end{pmatrix}
= \frac{1}{a_{11}a_{22} - a_{12}a_{21}}\begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}$$

provided $|A| = a_{11}a_{22} - a_{12}a_{21} \neq 0$.

For n > 2, method (A1.1) is usually more efficient than (A1.2).

Linear equations

A system of linear equations can be expressed in matrix form as

$$Ax = b \qquad\text{(A1.3)}$$

where A is an $m \times n$ matrix, x is an $n \times 1$ vector and b is an $m \times 1$ vector, with x
as the unknown to be solved, while b is given.

(A1.3) consists of m equations for the n unknowns in $x = (x_1\; \cdots\; x_n)^T$.

The solutions to (A1.3) have three possibilities:

- a unique solution;
- no solution;
- infinitely many solutions.

If $m < n$, a unique solution is impossible. We will focus on the case with
$m = n$, i.e., A is a square matrix.

Assume A to be an $n \times n$ square matrix from now on.

$Ax = b$ has a unique solution $x = A^{-1}b$ if and only if A is invertible, or
equivalently, $|A| \neq 0$ or A has full rank n.

In particular, the homogeneous equation $Ax = 0$ has the unique solution
$x = 0$ if and only if A is invertible.

If $|A| = 0$ (hence $\operatorname{Rank}(A) < n$), the homogeneous equation $Ax = 0$ must have
infinitely many solutions.

If $\operatorname{Rank}(A) = k < n$, then $n - k$ of the elements $x_1, x_2, \ldots, x_n$ in the solution to
$Ax = 0$ can be taken free. Thus the set of all solutions to $Ax = 0$ forms a linear
space of dimension $n - k$.

For the non-homogeneous equation $Ax = b$ with $b \neq 0$ and $|A| = 0$:

- there is no solution if $\operatorname{Rank}(A) < \operatorname{Rank}(A \mid b)$;
- there are infinitely many solutions if $\operatorname{Rank}(A) = \operatorname{Rank}(A \mid b)$.
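A small NumPy sketch of these cases (the matrices below are arbitrary illustrations, not examples from the notes):

```python
import numpy as np

# Invertible case: unique solution x = A^{-1} b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)
print(x, np.allclose(A @ x, b))     # unique solution with A x = b

# Singular case: |A2| = 0, so solvability depends on Rank(A2) vs Rank([A2 | b]).
A2 = np.array([[1.0, 2.0],
               [2.0, 4.0]])
b_no  = np.array([1.0, 0.0])        # Rank([A2 | b]) > Rank(A2): no solution
b_inf = np.array([1.0, 2.0])        # ranks equal: infinitely many solutions
for b2 in (b_no, b_inf):
    r_A  = np.linalg.matrix_rank(A2)
    r_Ab = np.linalg.matrix_rank(np.column_stack([A2, b2]))
    print(r_A, r_Ab, "no solution" if r_Ab > r_A else "infinitely many")
```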

Eigenvalues and eigenvectors

Given an $n \times n$ matrix A, if $Av = \lambda v$ for some scalar $\lambda$ and vector $v \neq 0$, then
$\lambda$ is called an eigenvalue of A and v is an eigenvector corresponding to $\lambda$.

An eigenvalue $\lambda$ of A and its corresponding eigenvector v must satisfy

$$(A - \lambda I)v = 0 \qquad\text{(A1.4)}$$

This can be viewed as a homogeneous equation with unknown v.

Equation (A1.4) has a non-zero solution if and only if

$$|A - \lambda I| = 0 \qquad\text{(A1.5)}$$

Since an eigenvector satisfies $v \neq 0$, any eigenvalue $\lambda$ of A must satisfy equation (A1.5).

For an $n \times n$ matrix A, $|A - \lambda I|$ is a polynomial of $\lambda$ with degree n.
Hence equation (A1.5) has n roots, labeled as $\lambda_1, \lambda_2, \ldots, \lambda_n$ (they need not be
distinct), which give all eigenvalues of A.

For each eigenvalue solved from (A1.5), there is at least one eigenvector.

If $v_1, v_2, \ldots, v_k$ are eigenvectors corresponding to distinct eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_k$ ($k \leq n$), then they are linearly independent.

This can be seen as follows. If $v_1, v_2, \ldots, v_k$ are linearly dependent, then one of
them can be expressed as a linear combination of other linearly independent
vectors from $v_1, v_2, \ldots, v_k$. Without loss of generality, let

$$v_1 = \alpha_2 v_2 + \cdots + \alpha_m v_m \neq 0 \quad (2 \leq m \leq k) \qquad\text{(A1.6)}$$

Since $Av_i = \lambda_i v_i$, $i = 1, 2, \ldots, n$, $\lambda_1 v_1 = Av_1$ and (A1.6) imply

$$\lambda_1(\alpha_2 v_2 + \cdots + \alpha_m v_m) = A(\alpha_2 v_2 + \cdots + \alpha_m v_m) = \alpha_2\lambda_2 v_2 + \cdots + \alpha_m\lambda_m v_m$$

$$\Rightarrow\quad \alpha_2(\lambda_1 - \lambda_2)v_2 + \cdots + \alpha_m(\lambda_1 - \lambda_m)v_m = 0$$

$$\Rightarrow\quad \alpha_2(\lambda_1 - \lambda_2) = \cdots = \alpha_m(\lambda_1 - \lambda_m) = 0$$

(since $v_2, \ldots, v_m$ are linearly independent).

Thus $\alpha_2 = \cdots = \alpha_m = 0$ as $\lambda_1, \lambda_2, \ldots, \lambda_k$ are distinct. This contradicts (A1.6)
and hence $v_1, v_2, \ldots, v_k$ must be linearly independent.

If the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of A are all distinct, then A has n linearly
independent eigenvectors $v_1, v_2, \ldots, v_n$. If not, it is possible (but not necessary)
that A has fewer than n linearly independent eigenvectors.

If $\lambda$ is a single root of equation (A1.5), there is only one linearly independent
eigenvector v corresponding to $\lambda$.

If m is the largest number such that $(\lambda - \lambda_1)^m$ is a factor of $|A - \lambda I|$, then $\lambda_1$ is
said to be an eigenvalue of multiplicity m.

For an eigenvalue $\lambda_1$ of multiplicity m:

- There are $k \leq m$ linearly independent eigenvectors $v_1, v_2, \ldots, v_k$;
- Any other eigenvector must be a linear combination $c_1v_1 + \cdots + c_kv_k$ of
  $v_1, v_2, \ldots, v_k$, where $c_1, \ldots, c_k$ are scalars;
- There exist m linearly independent vectors $v_1, v_2, \ldots, v_m$, each satisfying

$$(A - \lambda_1 I)^l v = 0 \quad\text{for some } l = 1, 2, \ldots, m. \qquad\text{(A1.7)}$$

Similar transform

Let A be a square matrix with an eigenvalue $\lambda$ and its corresponding
eigenvector v. If V is an invertible matrix, then

$$(V^{-1}AV)(V^{-1}v) = V^{-1}Av = V^{-1}(\lambda v) = \lambda(V^{-1}v) \qquad\text{(A1.8)}$$

Since V is invertible and $v \neq 0$, we must have $V^{-1}v \neq 0$. Hence (A1.8) shows
that $\lambda$ is also an eigenvalue of $V^{-1}AV$; in other words, A and $V^{-1}AV$ have
the same eigenvalues.

$V^{-1}AV$ is called a similar transform of A. Therefore we have shown that
eigenvalues are invariant under a similar transform.

If an $n \times n$ matrix A has n linearly independent eigenvectors $v_1, v_2, \ldots, v_n$, then
$V = (v_1\; v_2\; \cdots\; v_n)$ has an inverse $V^{-1}$.

Since $Av_j = \lambda_j v_j$, $j = 1, \ldots, n$,

$$AV = A(v_1\; v_2\; \cdots\; v_n) = (Av_1\; Av_2\; \cdots\; Av_n) = (\lambda_1 v_1\; \lambda_2 v_2\; \cdots\; \lambda_n v_n)$$

Let $v_j = (v_{1j}\; v_{2j}\; \cdots\; v_{nj})^T$, $j = 1, \ldots, n$, and $D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Then

$$AV = (\lambda_1 v_1\; \lambda_2 v_2\; \cdots\; \lambda_n v_n)
= \begin{pmatrix} \lambda_1 v_{11} & \lambda_2 v_{12} & \cdots & \lambda_n v_{1n} \\ \lambda_1 v_{21} & \lambda_2 v_{22} & \cdots & \lambda_n v_{2n} \\ \vdots & \vdots & & \vdots \\ \lambda_1 v_{n1} & \lambda_2 v_{n2} & \cdots & \lambda_n v_{nn} \end{pmatrix}
= \begin{pmatrix} v_{11} & v_{12} & \cdots & v_{1n} \\ v_{21} & v_{22} & \cdots & v_{2n} \\ \vdots & \vdots & & \vdots \\ v_{n1} & v_{n2} & \cdots & v_{nn} \end{pmatrix}
\begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix}
= VD \qquad\text{(A1.9)}$$

Thus $V^{-1}AV = D$ is a diagonal matrix. That is, A can be diagonalised by a
similar transform.

This is possible even if the eigenvalues of A are not all distinct.

Example A1.3. The matrix


$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 2 \end{pmatrix}$$

has one double eigenvalue $\lambda_1 = \lambda_2 = 1$ and one single eigenvalue $\lambda_3 = 2$.

It is easy to find three linearly independent eigenvectors $v_1, v_2, v_3$ to form the
matrix $V = (v_1\; v_2\; v_3)$ to diagonalise A:

$$(A - \lambda_1 I)(v_1\; v_2) = (A - I)(v_1\; v_2)
= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}(v_1\; v_2) = 0
\;\Rightarrow\; v_1 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix},\; v_2 = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}, \quad\text{and}$$

$$(A - \lambda_3 I)v_3 = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 1 & 1 & 0 \end{pmatrix}v_3 = 0
\;\Rightarrow\; v_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
\;\Rightarrow\; V = (v_1\; v_2\; v_3) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & -1 & 1 \end{pmatrix}$$

It can be easily checked that $V^{-1}AV = \operatorname{diag}(\lambda_1, \lambda_2, \lambda_3) = \operatorname{diag}(1, 1, 2)$.
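A short NumPy check of Example A1.3 (with A and V as reconstructed above):

```python
import numpy as np

# Matrix and eigenvector matrix of Example A1.3 as reconstructed above.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 1.0, 2.0]])
V = np.array([[ 1.0,  0.0, 0.0],
              [ 0.0,  1.0, 0.0],
              [-1.0, -1.0, 1.0]])

D = np.linalg.inv(V) @ A @ V
print(np.round(D, 10))                 # diag(1, 1, 2)

# np.linalg.eig returns the same eigenvalues (possibly in another order,
# with differently scaled eigenvectors).
print(np.linalg.eig(A)[0])
```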

Triangularisation

An $n \times n$ matrix A with fewer than n linearly independent eigenvectors cannot
be diagonalised. However, it can be triangularised by a similar transform:

$$V^{-1}AV = U, \text{ an upper triangular matrix} \qquad\text{(A1.10)}$$

The transform matrix V in (A1.10) can be obtained by solving the linear
equations in (A1.7):

$$(A - \lambda_i I)^l v = 0, \qquad l = 1, 2, \ldots, m_i \qquad\text{(A1.11)}$$

for each eigenvalue $\lambda_i$ of A with multiplicity $m_i$, i = 1, 2, …, k, where
$m_1 + \cdots + m_k = n$.

Equations (A1.11) can give n linearly independent vectors $v_1, v_2, \ldots, v_n$, and
then $V = (v_1\; v_2\; \cdots\; v_n)$ satisfies (A1.10).

To see why, look at the case with m = 2 for example. Let $\lambda$ be an eigenvalue
of multiplicity 2 for a $2 \times 2$ matrix A.

If $(A - \lambda I)v = 0$ has only one linearly independent solution $v_1$, then any other
eigenvector of A has the form $cv_1$ for some scalar $c \neq 0$.

Let $v_2$ be a solution to $(A - \lambda I)^2 v = 0$, linearly independent of $v_1$. Then
$(A - \lambda I)v_2$ is an eigenvector, which implies $(A - \lambda I)v_2 = cv_1$ for some $c \neq 0$,
so that $Av_2 = cv_1 + \lambda v_2$. Thus

$$AV = A(v_1\; v_2) = (Av_1\; Av_2) = (\lambda v_1 \;\; cv_1 + \lambda v_2) = \lambda V + (0 \;\; cv_1) \qquad\text{(A1.12)}$$

Since $(V^{-1}v_1\; V^{-1}v_2) = V^{-1}(v_1\; v_2) = V^{-1}V = I$, we have
$cV^{-1}v_1 = c\binom{1}{0} = \binom{c}{0} \neq 0$, and it follows from (A1.12) that $V^{-1}AV$ is upper triangular:

$$V^{-1}AV = V^{-1}\bigl[\lambda V + (0 \;\; cv_1)\bigr] = \lambda I + (0 \;\; cV^{-1}v_1)
= \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} + \begin{pmatrix} 0 & c \\ 0 & 0 \end{pmatrix}
= \begin{pmatrix} \lambda & c \\ 0 & \lambda \end{pmatrix}$$

Orthogonal transform

By the spectral theorem in matrix theory, any $n \times n$ symmetric real matrix A
has real eigenvalues $\lambda_1, \ldots, \lambda_n$ and there is an orthogonal matrix V such that

$$V^TAV = V^{-1}AV = \operatorname{diag}(\lambda_1, \ldots, \lambda_n) \qquad\text{(A1.13)}$$

which is referred to as an orthogonal transform of A.

To obtain this orthogonal matrix V, we can first find linearly independent
eigenvectors $u_1, \ldots, u_n$ of A corresponding to $\lambda_1, \ldots, \lambda_n$ (which exist by the
spectral theorem). Then $u_i$ and $u_j$ are orthogonal (i.e., $u_i^T u_j = 0$) for $\lambda_i \neq \lambda_j$.

If $\lambda_1 = \cdots = \lambda_m$ is an eigenvalue of multiplicity m and $u_1, \ldots, u_m$ ($1 < m \leq n$) are
the corresponding eigenvectors, take (Gram–Schmidt orthonormalisation)

$$v_1 = u_1, \quad v_1 \leftarrow \frac{v_1}{\|v_1\|}; \qquad
v_2 = u_2 - (v_1^T u_2)v_1, \quad v_2 \leftarrow \frac{v_2}{\|v_2\|}; \qquad \ldots;$$

$$v_m = u_m - (v_1^T u_m)v_1 - \cdots - (v_{m-1}^T u_m)v_{m-1}, \quad v_m \leftarrow \frac{v_m}{\|v_m\|},
\qquad\text{where } \|v\| = \sqrt{v^T v}$$

It is easy to check that $v_1, \ldots, v_m$ are orthonormal eigenvectors corresponding
to the eigenvalue $\lambda_1 = \cdots = \lambda_m$.

Do this for each eigenvalue of A to obtain orthonormal eigenvectors $v_1, \ldots, v_n$
corresponding to $\lambda_1, \ldots, \lambda_n$. Let $V = (v_1\; \cdots\; v_n)$. Then

$$V^TV = \begin{pmatrix} v_1^T \\ \vdots \\ v_n^T \end{pmatrix}(v_1\; \cdots\; v_n)
= \begin{pmatrix} v_1^Tv_1 & \cdots & v_1^Tv_n \\ \vdots & & \vdots \\ v_n^Tv_1 & \cdots & v_n^Tv_n \end{pmatrix}
= \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix} = I$$

Thus V is an orthogonal matrix and satisfies (A1.13).

Furthermore, it can be shown that a symmetric matrix A is positive definite if
and only if all eigenvalues of A are positive; and A is positive semi-definite if
and only if all eigenvalues of A are nonnegative.

Thus if A is an $n \times n$ positive semi-definite matrix, then (A1.13) holds with an
orthogonal matrix V and $\lambda_i \geq 0$ for $i = 1, \ldots, n$.

Square root of a matrix

Any matrix B satisfying $B^2 = A$ is a square root of A, which is not unique.

However, a positive semi-definite matrix has a unique positive semi-definite
square root. To see this, recall that (A1.13) holds with an orthogonal matrix V
and $\lambda_i \geq 0$, $i = 1, \ldots, n$, if A is positive semi-definite. Hence we can define

$$A^{1/2} = V \operatorname{diag}\bigl(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_n}\bigr) V^T$$

Then $A^{1/2}$ is the unique positive semi-definite square root of A:

$$A^{1/2}A^{1/2}
= V \operatorname{diag}\bigl(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_n}\bigr) V^T V \operatorname{diag}\bigl(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_n}\bigr) V^T
= V \operatorname{diag}(\lambda_1, \ldots, \lambda_n) V^T$$

$$= V V^T A V V^T = A$$

(since $VV^T = V^TV = I$ for an orthogonal matrix V).
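A minimal sketch of (A1.13) and the matrix square root in NumPy, using an arbitrary positive definite example matrix (an assumption for illustration):

```python
import numpy as np

# Arbitrary symmetric positive definite example matrix.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

lam, V = np.linalg.eigh(A)        # eigen-decomposition for a symmetric matrix
print(np.allclose(V.T @ A @ V, np.diag(lam)))   # orthogonal transform (A1.13)
print(np.all(lam > 0))                          # all eigenvalues positive here

A_half = V @ np.diag(np.sqrt(lam)) @ V.T        # A^{1/2} as defined above
print(np.allclose(A_half @ A_half, A))          # True: (A^{1/2})^2 = A
```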

Appendix A2

Law of iterated expectation


Given any two continuous random variables X and Y, let

- $f(x, y)$ denote the joint density of (X, Y);
- $f_X(x)$ and $f_Y(y)$ the marginal densities of X and Y, respectively;
- $f_{X|Y}(x \mid y)$ the conditional density of X given Y = y.

Then

$$f(x, y) = f_{X|Y}(x \mid y)\,f_Y(y) \quad\text{and}\quad f_X(x) = \int f(x, y)\,dy$$

Define

$$g(y) = E[X \mid Y = y] = \int x\, f_{X|Y}(x \mid y)\,dx$$

Then by definition, $E[X \mid Y] = g(Y)$.

It follows that

$$E\bigl\{E[X \mid Y]\bigr\} = E[g(Y)] = \int g(y)\,f_Y(y)\,dy
= \iint x\, f_{X|Y}(x \mid y)\,f_Y(y)\,dx\,dy$$

$$= \iint x\, f(x, y)\,dx\,dy
= \int x \left(\int f(x, y)\,dy\right) dx
= \int x\, f_X(x)\,dx = E[X]$$

Thus we have the law of iterated expectation:

$$E[X] = E\bigl\{E[X \mid Y]\bigr\} \qquad\text{(A2.1)}$$

This holds for discrete and mixed distributions as well.

More generally, for any bivariate function $h(x, y)$,

$$E[h(X, Y)] = E\bigl\{E[h(X, Y) \mid Y]\bigr\}
= \int E\bigl[h(X, y) \mid Y = y\bigr]\,f_Y(y)\,dy \qquad\text{(A2.2)}$$

if Y is continuous, and (because $\Pr(E) = E[I_E]$ for any event E)

$$\Pr\bigl(h(X, Y) \in A\bigr) = \int \Pr\bigl(h(X, y) \in A \mid Y = y\bigr)\,f_Y(y)\,dy \qquad\text{(A2.3)}$$
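A quick Monte Carlo sketch of (A2.1), using an arbitrary example where the conditional mean is known (Y ~ Exp(1) and X | Y = y ~ N(y, 1), so E[X | Y] = Y and E[X] = E[Y] = 1; these model choices are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Illustrative model (an assumption): Y ~ Exp(1), X | Y = y ~ N(y, 1).
Y = rng.exponential(scale=1.0, size=n)
X = rng.normal(loc=Y, scale=1.0)

g_Y = Y                       # here E[X | Y] = Y exactly
print(X.mean())               # ~ 1.0 : direct estimate of E[X]
print(g_Y.mean())             # ~ 1.0 : estimate of E{ E[X | Y] }
```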

Appendix A3

Maximum Likelihood Estimator


Consistency

Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed (i.i.d.) random
variables with a parametric distribution function $F(x \mid \theta)$ and representative
$X \sim F(x \mid \theta)$.

First consider a univariate parameter $\theta$ with true value $\theta_0$.

Let $l(\theta \mid X_i)$ denote the contribution of $X_i$ to the likelihood function,

$$s_i(\theta) = \frac{\partial}{\partial\theta}\log l(\theta \mid X_i)
\quad\text{and}\quad
v_i(\theta) = \frac{\partial^2}{\partial\theta^2}\log l(\theta \mid X_i)$$

The log-likelihood function is

$$\log L(\theta) = \sum_{i=1}^{n}\log l(\theta \mid X_i)$$

Define

$$\dot L(\theta) = \frac{\partial}{\partial\theta}\log L(\theta) = \sum_{i=1}^{n} s_i(\theta)
\quad\text{and}\quad
\ddot L(\theta) = \frac{\partial^2}{\partial\theta^2}\log L(\theta) = \sum_{i=1}^{n} v_i(\theta)$$

If the sample is complete with true density $f(x \mid \theta_0)$, then

$$E[s_1(\theta)] = \int \left(\frac{\partial}{\partial\theta}\log f(x \mid \theta)\right) f(x \mid \theta_0)\,dx
= \int \frac{\partial f(x \mid \theta)/\partial\theta}{f(x \mid \theta)}\, f(x \mid \theta_0)\,dx$$

Assume that the support of $f(x \mid \theta)$ does not depend on $\theta$. Then

$$E[s_1(\theta_0)] = \int \frac{\partial f(x \mid \theta)}{\partial\theta}\bigg|_{\theta=\theta_0}\,dx
= \frac{\partial}{\partial\theta}\int f(x \mid \theta)\,dx\bigg|_{\theta=\theta_0}
= \frac{\partial}{\partial\theta}(1) = 0 \qquad\text{(A3.1)}$$

Hence

$$E[\dot L(\theta_0)] = n\,E[s_1(\theta_0)] = 0 \qquad\text{(A3.2)}$$

Let $\hat\theta$ be the maximum likelihood estimator (MLE) of $\theta_0$, so that $\dot L(\hat\theta) = 0$.
Then by Taylor expansion,

$$0 = \dot L(\hat\theta) \approx \dot L(\theta_0) + \ddot L(\theta_0)(\hat\theta - \theta_0) \qquad\text{(A3.3)}$$

By the law of large numbers and (A3.1), as $n \to \infty$,

$$\frac{1}{n}\dot L(\theta_0) \to E[s_1(\theta_0)] = 0
\quad\text{and}\quad
\frac{1}{n}\ddot L(\theta_0) \to E[v_1(\theta_0)]$$

Hence by (A3.3), for large n,

$$\hat\theta - \theta_0 \approx -\frac{\dot L(\theta_0)}{\ddot L(\theta_0)}
= -\frac{n^{-1}\dot L(\theta_0)}{n^{-1}\ddot L(\theta_0)}
\approx -\frac{E[s_1(\theta_0)]}{E[v_1(\theta_0)]} = 0 \qquad\text{(A3.4)}$$

This explains the consistency of the MLE.

Asymptotic normality

By the central limit theorem and (A3.1),

$$\frac{1}{\sqrt n}\dot L(\theta_0) = \frac{1}{\sqrt n}\sum_{i=1}^{n} s_i(\theta_0)
\to N\bigl(0, \operatorname{Var}[s_1(\theta_0)]\bigr) \qquad\text{(A3.5)}$$

By (A3.3),

$$\sqrt n\,(\hat\theta - \theta_0) \approx -\frac{n^{-1/2}\dot L(\theta_0)}{n^{-1}\ddot L(\theta_0)}
\to N\!\left(0, \frac{\operatorname{Var}[s_1(\theta_0)]}{\bigl(E[v_1(\theta_0)]\bigr)^2}\right) \qquad\text{(A3.6)}$$

Moreover, let $f_1 = f(X_1 \mid \theta)$. Then

$$v_1(\theta) = \frac{\partial^2\log f_1}{\partial\theta^2}
= \frac{\partial}{\partial\theta}\left(\frac{1}{f_1}\frac{\partial f_1}{\partial\theta}\right)
= \frac{1}{f_1}\frac{\partial^2 f_1}{\partial\theta^2} - \frac{1}{f_1^2}\left(\frac{\partial f_1}{\partial\theta}\right)^2
= \frac{1}{f_1}\frac{\partial^2 f_1}{\partial\theta^2} - \left(\frac{\partial\log f_1}{\partial\theta}\right)^2
= \frac{1}{f_1}\frac{\partial^2 f_1}{\partial\theta^2} - s_1(\theta)^2 \qquad\text{(A3.7)}$$

By (A3.7) and similar to the derivation of (A3.1), we get

$$E[v_1(\theta_0)] = \int \frac{\partial^2 f(x \mid \theta)/\partial\theta^2\big|_{\theta_0}}{f(x \mid \theta_0)}\, f(x \mid \theta_0)\,dx - E\bigl[s_1^2(\theta_0)\bigr]
= \frac{\partial^2}{\partial\theta^2}(1) - E\bigl[s_1^2(\theta_0)\bigr]
= -E\bigl[s_1^2(\theta_0)\bigr] \qquad\text{(A3.8)}$$

By (A3.1) and (A3.8),

$$\operatorname{Var}[s_1(\theta_0)] = E\bigl[s_1^2(\theta_0)\bigr] - \bigl(E[s_1(\theta_0)]\bigr)^2 = -E[v_1(\theta_0)] \qquad\text{(A3.9)}$$

Consequently,

$$\operatorname{Var}[\dot L(\theta_0)] = n\operatorname{Var}[s_1(\theta_0)] = -n\,E[v_1(\theta_0)] = -E[\ddot L(\theta_0)] \equiv I(\theta_0) \qquad\text{(A3.10)}$$

where $I(\theta)$ is the Fisher information.

It follows from (A3.6) and (A3.10) that

$$\hat\theta - \theta_0 \sim N\!\left(0, \frac{-E[v_1(\theta_0)]}{n\bigl(E[v_1(\theta_0)]\bigr)^2}\right)
= N\!\left(0, \frac{-1}{n\,E[v_1(\theta_0)]}\right)
= N\!\left(0, \frac{1}{I(\theta_0)}\right)$$

approximately for large n.

Therefore the asymptotic distribution of the MLE is given by

$$\hat\theta \sim N\bigl(\theta_0, I(\theta_0)^{-1}\bigr)$$

This also shows that, for large n, the variance of the MLE $\hat\theta$ of $\theta_0$ can be
approximated by

$$\operatorname{Var}(\hat\theta) \approx I(\theta_0)^{-1}
\quad\text{and estimated by}\quad
\widehat{\operatorname{Var}}(\hat\theta) = I(\hat\theta)^{-1}$$
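A small simulation sketch of these asymptotics, using the exponential distribution with rate $\theta$ as an arbitrary illustrative model (so the MLE is $1/\bar X$ and $I(\theta_0) = n/\theta_0^2$; this choice of model is an assumption for illustration, not taken from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
theta0, n, n_rep = 2.0, 500, 5000

# Illustrative model (an assumption): X ~ Exponential with rate theta0,
# so the MLE is 1 / (sample mean) and the Fisher information is n / theta0^2.
X = rng.exponential(scale=1.0 / theta0, size=(n_rep, n))
theta_hat = 1.0 / X.mean(axis=1)

print(theta_hat.mean())                 # close to theta0 (consistency)
print(theta_hat.var())                  # close to 1/I(theta0) = theta0^2 / n
print(theta0**2 / n)

# Standardised MLE should be roughly N(0, 1):
z = (theta_hat - theta0) * np.sqrt(n) / theta0
print(z.mean(), z.std())                # roughly 0 and 1
```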

Multivariate parameter vector

For a multivariate vector $\theta$ of k parameters, the above results remain valid in
a multivariate version.

The Taylor expansion in (A3.3):

$$0 = \dot L(\hat\theta) \approx \dot L(\theta_0) + \ddot L(\theta_0)(\hat\theta - \theta_0) \qquad\text{(A3.11)}$$

still holds, but with a $k \times 1$ vector $\dot L(\theta_0)$ and a $k \times k$ matrix $\ddot L(\theta_0)$.

The consistency in (A3.4) now becomes

$$\hat\theta - \theta_0 \approx -\left(\frac{1}{n}\ddot L(\theta_0)\right)^{-1}\frac{1}{n}\dot L(\theta_0)
\approx -\bigl(E[v_1(\theta_0)]\bigr)^{-1}E[s_1(\theta_0)] = 0$$

By the multivariate central limit theorem, (A3.5) remains valid, and similar
arguments to (A3.8) show that the variance matrix in (A3.5) is

$$\operatorname{Var}[s_1(\theta_0)] = E\bigl[s_1(\theta_0)s_1(\theta_0)^T\bigr] = -E[v_1(\theta_0)] \qquad\text{(A3.12)}$$

Combining (A3.5), (A3.11), (A3.12) and the law of large numbers, we get

$$\frac{1}{\sqrt n}I(\theta_0)(\hat\theta - \theta_0)
\approx -\frac{1}{\sqrt n}\ddot L(\theta_0)(\hat\theta - \theta_0)
\approx \frac{1}{\sqrt n}\dot L(\theta_0)
\to N\bigl(0, -E[v_1(\theta_0)]\bigr) \text{ in distribution as } n \to \infty \qquad\text{(A3.13)}$$

For a $k \times 1$ random vector X with $E[X] = 0$, we have $\operatorname{Var}(X) = E[XX^T]$, so
that with any $m \times k$ constant matrix A,

$$\operatorname{Var}(AX) = E\bigl[(AX)(AX)^T\bigr] = A\,E[XX^T]\,A^T = A\operatorname{Var}(X)A^T \qquad\text{(A3.14)}$$

Thus (A3.13) shows that asymptotically as $n \to \infty$,

$$\operatorname{Var}(\hat\theta) \approx I(\theta_0)^{-1}\bigl(-n\,E[v_1(\theta_0)]\bigr)I(\theta_0)^{-1}
= I(\theta_0)^{-1}I(\theta_0)I(\theta_0)^{-1} = I(\theta_0)^{-1}$$

and consequently,

$$\hat\theta \sim N\bigl(\theta_0, I(\theta_0)^{-1}\bigr)
\quad\text{with variance estimator}\quad
\widehat{\operatorname{Var}}(\hat\theta) = I(\hat\theta)^{-1}$$

Moreover, it follows from (A3.12) that $I(\theta_0) = -n\,E[v_1(\theta_0)] = n\operatorname{Var}[s_1(\theta_0)]$ is
a positive-definite matrix, and hence has a positive-definite square root

$$I(\theta_0)^{1/2} = \bigl(-n\,E[v_1(\theta_0)]\bigr)^{1/2}
\quad\text{with inverse}\quad
I(\theta_0)^{-1/2} = \frac{1}{\sqrt n}\bigl(-E[v_1(\theta_0)]\bigr)^{-1/2}$$

As a result, (A3.13) and (A3.14) imply

$$I(\theta_0)^{1/2}(\hat\theta - \theta_0)
= \bigl(-E[v_1(\theta_0)]\bigr)^{-1/2}\,\frac{1}{\sqrt n}I(\theta_0)(\hat\theta - \theta_0)
\to N\Bigl(0, \bigl(-E[v_1(\theta_0)]\bigr)^{-1/2}\bigl(-E[v_1(\theta_0)]\bigr)\bigl(-E[v_1(\theta_0)]\bigr)^{-1/2}\Bigr)
= N(0, I_k)$$

where $I_k$ is the $k \times k$ identity matrix.

It follows that

$$(\hat\theta - \theta_0)^T I(\theta_0)(\hat\theta - \theta_0) \to \chi_k^2 \qquad\text{(A3.15)}$$

MLE for the U(0, θ) distribution

Let $X_1, \ldots, X_n$ be a (complete) random sample from the $U(0, \theta)$ distribution.
Then the likelihood of $\theta$ is

$$L(\theta) = \theta^{-n}\, I_{(0,\theta)}(x_1)\cdots I_{(0,\theta)}(x_n)
= \theta^{-n}\, I_{(0,\theta)}(x_{(1)})\, I_{(0,\theta)}(x_{(n)})
= \begin{cases} \theta^{-n} & \text{if } x_{(n)} \leq \theta \\ 0 & \text{if } x_{(n)} > \theta \end{cases}$$

where $x_{(1)} = \min\{x_1, x_2, \ldots, x_n\}$ and $x_{(n)} = \max\{x_1, x_2, \ldots, x_n\}$.

Thus $L(\theta)$ reaches its maximum at $\theta = x_{(n)}$.

Therefore the maximum likelihood estimator of $\theta$ is $\hat\theta = X_{(n)}$.

The cdf of $X_{(n)}$ is given by

$$\Pr\bigl(X_{(n)} \leq x\bigr) = \Pr(X_1 \leq x, \ldots, X_n \leq x)
= \left(\frac{x}{\theta}\right)^n I_{(0,\theta)}(x) + I_{[\theta,\infty)}(x)$$

It follows that for large n,

$$\Pr\bigl(n(\theta - X_{(n)}) > x\bigr) = \Pr\!\left(X_{(n)} < \theta - \frac{x}{n}\right)
= \left(\frac{\theta - x/n}{\theta}\right)^n = \left(1 - \frac{x}{n\theta}\right)^n$$

Consequently,

$$\lim_{n\to\infty}\Pr\bigl(n(\theta - X_{(n)}) > x\bigr)
= \lim_{n\to\infty}\left(1 - \frac{x}{n\theta}\right)^n = e^{-x/\theta}$$

This shows that the limiting distribution of $n(\theta - X_{(n)})$ is
exponential with mean $\theta$.

Thus the MLE of $\theta$ for the uniform distribution over the interval $(0, \theta)$ is not
asymptotically normal.
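A short simulation sketch of this limit ($\theta_0 = 2$ is an arbitrary choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
theta0, n, n_rep = 2.0, 1000, 20000

# The MLE of theta for U(0, theta) is the sample maximum.
X = rng.uniform(0.0, theta0, size=(n_rep, n))
theta_hat = X.max(axis=1)

T = n * (theta0 - theta_hat)       # approximately Exp with mean theta0
print(T.mean(), T.std())           # both close to theta0 = 2 for an exponential
```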

Properties of MLE with survival data

The above arguments for consistency and asymptotic normality are based on
complete data. For survival data subject to censoring, the results remain valid,
but the derivations become more complex.

For example, the contribution $l(\theta \mid X_i)$ to the likelihood becomes

$$l(\theta \mid X_i) = f(X_i \mid \theta)^{\delta_i}\, S(X_i \mid \theta)^{1-\delta_i}$$

where $\delta_i$ is the censoring indicator of $X_i$.

(A3.1) and (A3.8) still hold with the log-likelihood of $X_i$:

$$\log l(\theta \mid X_i) = \delta_i\log f(X_i \mid \theta) + (1 - \delta_i)\log S(X_i \mid \theta) \qquad\text{(A3.16)}$$

under any censoring distribution.

Let $X^*$ denote the failure time (not subject to censoring) with cdf $F(x \mid \theta)$,
and C the censoring random variable.

Given C = c, the cdf of $X = X^* \wedge c$ is

$$\Pr(X \leq x) = \Pr(X^* \wedge c \leq x)
= \begin{cases} F(x \mid \theta_0) & \text{if } x < c \\ 1 & \text{if } x \geq c \end{cases}$$

That is, given C = c, X has a mixed distribution with density $f(x \mid \theta_0)$ for
$x < c$ and a mass at c with probability

$$\Pr(X = c) = \Pr(X^* \geq c) = 1 - F(c \mid \theta_0) = S(c \mid \theta_0)$$

Note that the censoring indicator is $\delta = I\{X^* < c\}$. Therefore by (A3.16),

$$s_1(\theta) = \frac{\partial\log l}{\partial\theta}
= \delta\,\frac{1}{f}\frac{\partial f}{\partial\theta}
+ (1 - \delta)\,\frac{1}{S}\frac{\partial S}{\partial\theta} \qquad\text{(A3.17)}$$

where for convenience we write

$$l = l(\theta \mid X_1), \qquad f = f(X_1 \mid \theta) \qquad\text{and}\qquad S = S(X_1 \mid \theta)$$

It follows that

$$E[s_1(\theta) \mid C]
= E\!\left[\delta\,\frac{1}{f}\frac{\partial f}{\partial\theta}\,\Big|\,C\right]
+ E\!\left[(1 - \delta)\,\frac{1}{S}\frac{\partial S}{\partial\theta}\,\Big|\,C\right]
= \int_0^C \frac{\partial f(x \mid \theta)/\partial\theta}{f(x \mid \theta)}\, f(x \mid \theta_0)\,dx
+ S(C \mid \theta_0)\,\frac{\partial S(C \mid \theta)/\partial\theta}{S(C \mid \theta)}$$

$$E[s_1(\theta_0)] = E\bigl\{E[s_1(\theta_0) \mid C]\bigr\}
= E\!\left[\int_0^C \frac{\partial f(x \mid \theta)}{\partial\theta}\bigg|_{\theta_0}\,dx
+ \frac{\partial S(C \mid \theta)}{\partial\theta}\bigg|_{\theta_0}\right] \qquad\text{(A3.18)}$$

If the integration and differentiation in (A3.18) are interchangeable, then

$$E[s_1(\theta_0)]
= E\!\left[\frac{\partial}{\partial\theta}\int_0^C f(x \mid \theta)\,dx\bigg|_{\theta_0}
+ \frac{\partial S(C \mid \theta)}{\partial\theta}\bigg|_{\theta_0}\right]
= E\!\left[\frac{\partial}{\partial\theta}\bigl\{F(C \mid \theta) + S(C \mid \theta)\bigr\}\bigg|_{\theta_0}\right]
= E\!\left[\frac{\partial}{\partial\theta}(1)\bigg|_{\theta_0}\right] = 0 \qquad\text{(A3.19)}$$

Furthermore, (A3.16) and (A3.17) imply

$$v_1(\theta) = \frac{\partial^2\log l}{\partial\theta\,\partial\theta^T}
= \delta\,\frac{\partial}{\partial\theta^T}\!\left(\frac{1}{f}\frac{\partial f}{\partial\theta}\right)
+ (1 - \delta)\,\frac{\partial}{\partial\theta^T}\!\left(\frac{1}{S}\frac{\partial S}{\partial\theta}\right)$$

$$= \delta\left[\frac{1}{f}\frac{\partial^2 f}{\partial\theta\,\partial\theta^T}
- \frac{1}{f^2}\frac{\partial f}{\partial\theta}\left(\frac{\partial f}{\partial\theta}\right)^T\right]
+ (1 - \delta)\left[\frac{1}{S}\frac{\partial^2 S}{\partial\theta\,\partial\theta^T}
- \frac{1}{S^2}\frac{\partial S}{\partial\theta}\left(\frac{\partial S}{\partial\theta}\right)^T\right]$$

$$= -s_1(\theta)s_1(\theta)^T
+ \delta\,\frac{1}{f}\frac{\partial^2 f}{\partial\theta\,\partial\theta^T}
+ (1 - \delta)\,\frac{1}{S}\frac{\partial^2 S}{\partial\theta\,\partial\theta^T} \qquad\text{(A3.20)}$$

(using $\delta^2 = \delta$, $(1-\delta)^2 = 1-\delta$ and $\delta(1-\delta) = 0$, since $\delta = I\{X^* < C\}$ takes only
the values 0 and 1).

Thus similar arguments to (A3.19) lead to

$$E[v_1(\theta_0)] = -E\bigl[s_1(\theta_0)s_1(\theta_0)^T\bigr]
+ E\!\left[\int_0^C \frac{\partial^2 f(x \mid \theta)}{\partial\theta\,\partial\theta^T}\bigg|_{\theta_0}\,dx
+ \frac{\partial^2 S(C \mid \theta)}{\partial\theta\,\partial\theta^T}\bigg|_{\theta_0}\right]$$

$$= -E\bigl[s_1(\theta_0)s_1(\theta_0)^T\bigr]
+ E\!\left[\frac{\partial^2}{\partial\theta\,\partial\theta^T}\bigl\{F(C \mid \theta) + S(C \mid \theta)\bigr\}\bigg|_{\theta_0}\right]
= -E\bigl[s_1(\theta_0)s_1(\theta_0)^T\bigr] \qquad\text{(A3.21)}$$

By the law of large numbers and central limit theorem, (A3.19) and (A3.21)
show that the following properties of the MLE remain true for survival data:

(i) $\hat\theta \to \theta_0$ in probability as $n \to \infty$; and

(ii) $\hat\theta \sim N\bigl(\theta_0, I(\theta_0)^{-1}\bigr)$ approximately for large n.

MLE for U(0, θ) with censoring

If the data include both uncensored and censored points, let $x_j$ be the largest
censored point, so that $\delta_j = 0$. The likelihood becomes

$$L(\theta) = \prod_{i=1}^{n}\left(\frac{1}{\theta}\right)^{\delta_i}\left(1 - \frac{x_i}{\theta}\right)^{1-\delta_i}
= \theta^{-n}\prod_{i=1}^{n}(\theta - x_i)^{1-\delta_i} \qquad\text{(A3.22)}$$

for $\theta \geq x_{(n)} = \max\{x_1, \ldots, x_n\}$, and $L(\theta) = 0$ if $\theta < x_{(n)}$.

Extend $L(\theta)$ in (A3.22) to all $\theta > x_j$. Then

$$\frac{\partial\log L(\theta)}{\partial\theta}
= \frac{\partial}{\partial\theta}\left[-n\log\theta + \sum_{i=1}^{n}(1 - \delta_i)\log(\theta - x_i)\right]
= -\frac{n}{\theta} + \sum_{i=1}^{n}\frac{1 - \delta_i}{\theta - x_i}$$

$$\to +\infty \text{ as } \theta \downarrow x_j,
\qquad\text{and}\qquad
\approx \frac{-n + \sum_i(1 - \delta_i)}{\theta} < 0,\;\; \to 0 \text{ as } \theta \to \infty$$

Thus $\partial\log L(\theta)/\partial\theta = 0$ has a solution $\tilde\theta > x_j$, which is a maximum point of the
extended $L(\theta)$ over $\theta > x_j$. The MLE of $\theta$ is then $\hat\theta = \max\{\tilde\theta, x_{(n)}\}$, since the
actual likelihood $L(\theta) = 0$ for $\theta < x_{(n)}$, which implies $\hat\theta \geq x_{(n)}$.

If $x_{(n)}$ is censored, then $x_j = x_{(n)} < \tilde\theta$. Hence $\hat\theta = \tilde\theta > x_{(n)}$.

If $x_{(n)}$ is uncensored ($x_{(n)} > x_j$), then either $\tilde\theta > x_{(n)}$ or $\tilde\theta \leq x_{(n)}$ is possible
(depending on the data). When $\tilde\theta \leq x_{(n)}$, the MLE of $\theta$ is $\hat\theta = x_{(n)}$.

To see an example, let n = 2 and $x_2 > x_1 = 1$. Suppose that $x_1 = 1$ is censored
and $x_2 = x_{(n)}$ is uncensored. Then for $\theta > x_j = 1$,

$$\frac{\partial\log L(\theta)}{\partial\theta} = -\frac{2}{\theta} + \frac{1}{\theta - 1} = 0
\quad\Rightarrow\quad \tilde\theta = 2$$

Hence $\hat\theta = 2$ if $1 < x_2 = x_{(n)} \leq 2$; or $\hat\theta = x_{(n)} = x_2$ if $x_2 > 2$.
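A tiny numerical sketch of this two-point example, maximising the extended likelihood (A3.22) on a grid (the values x2 = 1.5 and x2 = 3 are assumptions chosen only to show both cases):

```python
import numpy as np

def log_L(theta, x, delta):
    """Extended censored U(0, theta) log-likelihood (A3.22), theta above all data points."""
    return -len(x) * np.log(theta) + np.sum((1 - delta) * np.log(theta - x))

delta = np.array([0, 1])                 # x1 = 1 is censored, x2 is uncensored
for x2 in (1.5, 3.0):                    # assumed values illustrating x2 <= 2 and x2 > 2
    x = np.array([1.0, x2])
    grid = np.linspace(x.max() + 1e-6, 10.0, 20001)
    values = [log_L(t, x, delta) for t in grid]
    theta_hat = grid[int(np.argmax(values))]
    print(x2, round(theta_hat, 3))       # about 2.0 when x2 = 1.5, about 3.0 when x2 = 3.0
```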

Furthermore, by (A3.18), with $f(x \mid \theta) = 1/\theta$ and $S(x \mid \theta) = 1 - x/\theta$ for $U(0, \theta)$,

$$E[s_1(\theta_0)] = E\!\left[-\int_0^C \frac{1}{\theta_0^2}\, I_{(0,\theta_0)}(x)\,dx
+ \frac{C}{\theta_0^2}\, I_{\{C < \theta_0\}}\right]$$

$$= E\!\left[-\frac{C}{\theta_0^2}\, I_{\{C < \theta_0\}} - \frac{1}{\theta_0}\, I_{\{C \geq \theta_0\}}
+ \frac{C}{\theta_0^2}\, I_{\{C < \theta_0\}}\right]
= -\frac{1}{\theta_0}E\bigl[I_{\{C \geq \theta_0\}}\bigr]
= -\frac{1}{\theta_0}\Pr(C \geq \theta_0) \qquad\text{(A3.23)}$$

Similarly, by (A3.20), with $\partial^2 f(x \mid \theta)/\partial\theta^2\big|_{\theta_0} = 2/\theta_0^3$ and
$\partial^2 S(C \mid \theta)/\partial\theta^2\big|_{\theta_0} = -2C/\theta_0^3$,

$$E[v_1(\theta_0)] = -E\bigl[s_1^2(\theta_0)\bigr]
+ E\!\left[\int_0^C \frac{2}{\theta_0^3}\, I_{(0,\theta_0)}(x)\,dx
- \frac{2C}{\theta_0^3}\, I_{\{C < \theta_0\}}\right]$$

$$= -E\bigl[s_1^2(\theta_0)\bigr]
+ E\!\left[\frac{2C}{\theta_0^3}\, I_{\{C < \theta_0\}} + \frac{2}{\theta_0^2}\, I_{\{C \geq \theta_0\}}
- \frac{2C}{\theta_0^3}\, I_{\{C < \theta_0\}}\right]
= -E\bigl[s_1^2(\theta_0)\bigr] + \frac{2}{\theta_0^2}\Pr(C \geq \theta_0) \qquad\text{(A3.24)}$$
(A3.23) and (A3.24) show that

$$E[s_1(\theta_0)] = 0 \quad\text{and}\quad
\operatorname{Var}[s_1(\theta_0)] = E\bigl[s_1^2(\theta_0)\bigr] = -E[v_1(\theta_0)] \qquad\text{(A3.25)}$$

if and only if $\Pr(C \geq \theta_0) = 0$, i.e., $\Pr(C < \theta_0) = 1$.

Thus the MLE of $\theta$ in the censored $U(0, \theta)$ model remains asymptotically normal if
$\dot L(\hat\theta) = 0$ (i.e., $\hat\theta = \tilde\theta$ solves the score equation) and $\Pr(C < \theta_0) = 1$.

Recall that a key regularity condition for the properties of the MLE is that the
support of the failure distribution does not depend on unknown parameters,
which can generally ensure interchangeable integration and differentiation in
(A3.1) and (A3.8), or in (A3.19) and (A3.21), in order to obtain (A3.25).

For example, $U(0, \theta)$ does not satisfy this condition, and in this case

$$\int \frac{\partial f(x \mid \theta)}{\partial\theta}\,dx
= \int_0^\theta\left(-\frac{1}{\theta^2}\right)dx = -\frac{1}{\theta}
\;\neq\; 0 = \frac{\partial}{\partial\theta}(1) = \frac{\partial}{\partial\theta}\int_0^\theta f(x \mid \theta)\,dx$$

This is, however, not a necessary condition. Even when it fails, (A3.25) may still
hold, as in the censored $U(0, \theta)$ model with $\Pr(C < \theta_0) = 1$.

When the support of the failure distribution depends on unknown parameters,
we need to check (A3.25) specifically instead of relying on the general theory.
