Questions:
1. Please can you introduce yourself and describe your role at NTU?
2. What are your main research interests?
3. How do you normally publish your research outputs?
4. What does open access mean to you?
5. Why is open access important to you?
6. What are the benefits of open access publishing?
7. What advice would you give colleagues in relation to open access publishing?
Possible things to consider for inclusion in your answers:
What advice would you give colleagues in relation to open access publishing?
- If you have not published your research through Gold OA, you can still be RCUK and REF compliant by submitting the bibliographic data and full text to IRep as soon as the work is accepted for publication
- Depositing in IRep is easy
- Help is available: the Library Research Team can provide guidance on choosing where to publish, copyright, embargoes, and APCs
Principal component analysis, or PCA, is a technique that is widely used for applications such as dimensionality reduction, lossy data compression, feature extraction, and data visualization (Jolliffe, 2002). It is also known as the Karhunen-Loève transform.

There are two commonly used definitions of PCA that give rise to the same algorithm. PCA can be defined as the orthogonal projection of the data onto a lower dimensional linear space, known as the principal subspace, such that the variance of the projected data is maximized (Hotelling, 1933). Equivalently, it can be defined as the linear projection that minimizes the average projection cost, defined as the mean squared distance between the data points and their projections (Pearson, 1901).

Consider a data set of observations {xn} where n = 1, ..., N, and xn is a Euclidean variable with dimensionality D. Our goal is to project the data onto a space having dimensionality M < D while maximizing the variance of the projected data. For the moment, we shall assume that the value of M is given. Later in this chapter, we shall consider techniques to determine an appropriate value of M from the data.
To begin with, consider the projection onto a one-dimensional space (M = 1). We can define the direction of this space using a D-dimensional vector u1, which for convenience (and without loss of generality) we shall choose to be a unit vector so that u1^T u1 = 1 (note that we are only interested in the direction defined by u1, not in the magnitude of u1 itself). Each data point xn is then projected onto a scalar value u1^T xn. The mean of the projected data is u1^T x̄, where x̄ is the sample set mean given by
\[ \bar{x} = \frac{1}{N} \sum_{n=1}^{N} x_n \]   (12.1)

and the variance of the projected data is given by

\[ \frac{1}{N} \sum_{n=1}^{N} \left( u_1^T x_n - u_1^T \bar{x} \right)^2 = u_1^T S u_1 \]   (12.2)

where S is the data covariance matrix defined by

\[ S = \frac{1}{N} \sum_{n=1}^{N} (x_n - \bar{x})(x_n - \bar{x})^T. \]   (12.3)
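As a quick numerical check of (12.2), the following sketch confirms that the variance of the scalar projections equals u1^T S u1. The data set, the unit vector, and all variable names are illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))        # N = 100 data points of dimension D = 3 (illustrative)
u1 = rng.normal(size=3)
u1 /= np.linalg.norm(u1)             # unit vector, so u1^T u1 = 1

x_bar = X.mean(axis=0)               # sample mean, equation (12.1)
S = (X - x_bar).T @ (X - x_bar) / len(X)   # covariance matrix, equation (12.3)

proj = X @ u1                        # scalar projections u1^T x_n
var_proj = np.mean((proj - u1 @ x_bar) ** 2)   # left-hand side of (12.2)

print(np.isclose(var_proj, u1 @ S @ u1))       # True: matches u1^T S u1
```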
We now maximize the projected variance u1^T S u1 with respect to u1. Clearly, this has to be a constrained maximization to prevent ||u1|| → ∞. The appropriate constraint comes from the normalization condition u1^T u1 = 1. To enforce this constraint, we introduce a Lagrange multiplier that we shall denote by λ1, and then make an unconstrained maximization of
\[ u_1^T S u_1 + \lambda_1 \left( 1 - u_1^T u_1 \right). \]   (12.4)
By setting the derivative with respect to u1 equal to zero, we see that this quantity will have a stationary point when

\[ S u_1 = \lambda_1 u_1 \]   (12.5)

which says that u1 must be an eigenvector of S. If we left-multiply by u1^T and make use of u1^T u1 = 1, we see that the variance is given by

\[ u_1^T S u_1 = \lambda_1 \]   (12.6)

and so the variance will be a maximum when we set u1 equal to the eigenvector having the largest eigenvalue λ1. This eigenvector is known as the first principal component.
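A minimal sketch of this result, on a made-up correlated data set: the eigenvector of S with the largest eigenvalue achieves a projected variance of exactly λ1, as in (12.6), and no random unit direction exceeds it:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))   # correlated toy data (illustrative)
x_bar = X.mean(axis=0)
S = (X - x_bar).T @ (X - x_bar) / len(X)

# eigh returns eigenvalues of a symmetric matrix in ascending order
eigvals, eigvecs = np.linalg.eigh(S)
u1, lam1 = eigvecs[:, -1], eigvals[-1]    # first principal component and its eigenvalue

assert np.isclose(u1 @ S @ u1, lam1)      # equation (12.6)

# no random unit direction achieves a higher projected variance
for _ in range(1000):
    u = rng.normal(size=4)
    u /= np.linalg.norm(u)
    assert u @ S @ u <= lam1 + 1e-12
```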
We can define additional principal components in an incremental fashion by choosing each new direction to be that which maximizes the projected variance
amongst all possible directions orthogonal to those already considered. If we consider the general
case of an M-dimensional projection space, the optimal linear projection for which the variance of the projected data is maximized is now defined by the M eigenvectors u1, ..., uM of the data covariance matrix S corresponding to the M largest eigenvalues λ1, ..., λM. This is easily shown using proof by induction.
To summarize, principal component analysis involves evaluating the mean x̄ and the covariance matrix S of the data set and then finding the M eigenvectors of S corresponding to the M largest eigenvalues. Algorithms for finding eigenvectors and eigenvalues can be found in Golub and Van Loan (1996). Note that the computational cost of computing the full eigenvector decomposition for a matrix of size D × D is O(D³). If we plan to project our data onto the first M principal components, then we only need to find the first M eigenvalues and eigenvectors. This can be done with more efficient techniques, such as the power method (Golub and Van Loan, 1996), that scale like O(MD²).
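The power method is simple to sketch. The following illustrative implementation (the function name, convergence tolerance, and toy covariance are my own choices, not from the text) repeatedly multiplies a vector by S and renormalizes, converging to the eigenvector with the largest eigenvalue:

```python
import numpy as np

def power_method(S, n_iter=1000, tol=1e-10, seed=0):
    """Leading eigenvector/eigenvalue of a symmetric PSD matrix S by power iteration."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=S.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iter):
        v = S @ u
        v /= np.linalg.norm(v)            # renormalize to a unit vector
        if np.linalg.norm(v - u) < tol:   # converged
            break
        u = v
    lam = u @ S @ u                       # Rayleigh quotient gives the eigenvalue
    return u, lam

# Usage on a toy covariance matrix: agrees with a full eigendecomposition.
S = np.cov(np.random.default_rng(2).normal(size=(5, 200)))
u1, lam1 = power_method(S)
print(np.isclose(lam1, np.linalg.eigvalsh(S).max()))   # True
```

Subsequent components can be obtained the same way after deflating S, i.e. subtracting lam1 * np.outer(u1, u1) and iterating again.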
Because we wish to find a sequential sampling scheme, we shall suppose that a set of samples and weights have been obtained at time step n, and that we have subsequently observed the value of x_{n+1}, and we wish to find the weights and samples at time step n + 1. We first sample from the distribution p(z_{n+1} | X_n). This is straightforward since, again using Bayes' theorem,

\[
\begin{aligned}
p(z_{n+1} \mid X_n) &= \int p(z_{n+1} \mid z_n, X_n)\, p(z_n \mid X_n)\, \mathrm{d}z_n \\
&= \int p(z_{n+1} \mid z_n)\, p(z_n \mid X_n)\, \mathrm{d}z_n \\
&= \int p(z_{n+1} \mid z_n)\, p(z_n \mid x_n, X_{n-1})\, \mathrm{d}z_n \\
&= \frac{\displaystyle\int p(z_{n+1} \mid z_n)\, p(x_n \mid z_n)\, p(z_n \mid X_{n-1})\, \mathrm{d}z_n}{\displaystyle\int p(x_n \mid z_n)\, p(z_n \mid X_{n-1})\, \mathrm{d}z_n} \\
&= \sum_{l} w_n^{(l)}\, p\bigl(z_{n+1} \mid z_n^{(l)}\bigr)
\end{aligned}
\]   (13.119)

where we have made use of the conditional independence properties

\[ p(z_{n+1} \mid z_n, X_n) = p(z_{n+1} \mid z_n) \]   (13.120)

\[ p(x_n \mid z_n, X_{n-1}) = p(x_n \mid z_n) \]   (13.121)

which follow from the application of the d-separation criterion to the graph in Figure 13.5. The distribution given by (13.119) is a mixture distribution, and samples can be drawn by choosing a component l with probability given by the mixing coefficients w^(l) and then drawing a sample from the corresponding component.

In summary, we can view each step of the particle filter algorithm as comprising two stages. At time step n, we have a sample representation of the posterior distribution p(z_n | X_n) expressed as samples {z_n^(l)} with corresponding weights {w_n^(l)}. This can be viewed as a mixture representation of the form (13.119). To obtain the corresponding representation for the next time step, we first draw L samples from the mixture distribution (13.119), and then for each sample we use the new observation x_{n+1} to evaluate the corresponding weights w_{n+1}^(l) ∝ p(x_{n+1} | z_{n+1}^(l)). This is illustrated, for the case of a single variable z, in Figure 13.23.

The particle filtering, or sequential Monte Carlo, approach has appeared in the literature under various names including the bootstrap filter (Gordon et al., 1993), survival of the fittest (Kanazawa et al., 1995), and the condensation algorithm (Isard and Blake, 1998).
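The two-stage update is straightforward to sketch in code. Here is a minimal illustration for a linear-Gaussian state space model; the model, its parameters, the observation sequence, and all function names are my own illustrative choices, not from the text. Sampling from the mixture (13.119) amounts to resampling particles according to their weights and propagating them through p(z_{n+1} | z_n); the new weights come from the likelihood p(x_{n+1} | z_{n+1}):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative linear-Gaussian state space model (assumed, not from the text):
#   z_{n+1} = a * z_n + process noise,    x_n = z_n + observation noise
a, sigma_z, sigma_x = 0.9, 0.5, 1.0

def particle_filter_step(z, w, x_next):
    """One particle filter update: samples/weights at step n -> step n+1."""
    L = len(z)
    # Stage 1: draw L samples from the mixture (13.119) -- choose components
    # with probability w^(l), then sample from p(z_{n+1} | z_n^(l)).
    idx = rng.choice(L, size=L, p=w)
    z_next = a * z[idx] + sigma_z * rng.normal(size=L)
    # Stage 2: reweight with the likelihood of the new observation,
    # w_{n+1}^(l) proportional to p(x_{n+1} | z_{n+1}^(l)), then normalize.
    w_next = norm.pdf(x_next, loc=z_next, scale=sigma_x)
    return z_next, w_next / w_next.sum()

# Usage: track a single scalar state over a made-up observation sequence.
L = 1000
z = rng.normal(size=L)          # initial particles from the prior
w = np.full(L, 1.0 / L)         # uniform initial weights
for x in [0.3, 0.1, -0.4]:
    z, w = particle_filter_step(z, w, x)
print("posterior mean estimate:", np.sum(w * z))
```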