Introduction: Learning from Data
Dr Gavin Brown
Machine Learning and Optimization Research Group

Learning from Data
(No definition of a field is perfect; the diagram above is just one interpretation, mine ;-)

Learning from Data
The world is drowning in data.
Book sales: Amazon makes 250,000 sales/deliveries per day
Genetics: 100,000 genes sequenced while-u-wait (almost)
Search: ~10 billion Google Images / 48 hrs per min uploaded to YouTube
Health records: NHS plan to have 60m electronic records in place by 2015
This theme studies algorithms that enable us to extract meaning from data.

Learning from Data
COMP61011: Foundations of Machine Learning (Prediction)
COMP61021: Modeling & Visualization of High Dimensional Data (Description)
Lecturer: Dr Gavin Brown

Machine Learning and Data Mining
Spam emails
How can we predict if something is spam/genuine?

Machine Learning and Data Mining
HISTORICAL HEALTH RECORDS
x1      x2      Label
98.7    157.6   1
93.6    138.8   0
42.8    171.9   0
92.8    154.5   1
Learning Algorithm (Weeks 3-4)

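To make the idea of learning from labelled records concrete, here is a minimal sketch of a one-nearest-neighbour predictor built from the four rows of the health records table. Nearest neighbour methods appear in the COMP61011 syllabus, but this is only an illustration, not necessarily the algorithm the slide has in mind. The function name `predict_1nn` and the query point are assumptions made for the example.

```python
import math

# Rows copied from the "historical health records" table: (x1, x2, label).
RECORDS = [
    (98.7, 157.6, 1),
    (93.6, 138.8, 0),
    (42.8, 171.9, 0),
    (92.8, 154.5, 1),
]


def predict_1nn(x1, x2, records=RECORDS):
    """Return the label of the stored record closest to (x1, x2).

    Closeness is plain Euclidean distance on the two features.
    """
    nearest = min(records, key=lambda r: math.dist((x1, x2), (r[0], r[1])))
    return nearest[2]


# Hypothetical new record (not from the slides): its closest neighbour
# is the row (92.8, 154.5), so the predicted label is 1.
print(predict_1nn(95.0, 156.0))
```

With only four records this is a toy, but the pattern (store labelled examples, then predict a label for an unseen point) is the same one a real learning algorithm follows.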
COMP61011: Foundations of Machine Learning (Prediction), Period 1 (Oct/Nov)
COMP61021: Modeling & Visualization of High Dimensional Data (Description), Period 2 (Nov/Dec)
Lecturer: Dr Ke Chen

Modeling and Visualization of High Dimensional Data
Gene Maps
The human body has about 24,000 active genes; soon you will be able to buy your own gene map for a few hundred pounds.
How can we visualize this?

Modeling and Visualization of High Dimensional Data
Image processing
Gesture recognition: how can we represent the motion of a human with so many complex joints and angles?

Pre-requisite knowledge
• Vectors
• Matrix properties, e.g. determinant, rank, inverse
• Vector Space properties, e.g. orthonormal basis
• Eigenvectors and Eigenvalues
• Matrix Calculus, e.g. derivatives in matrix form
• Optimisation basics, e.g. Lagrange multipliers
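
As a rough self-check on the last item, here is a small worked Lagrange multiplier example. The problem (maximise $xy$ subject to $x + y = 1$) is chosen purely for illustration and does not come from the course material.

```latex
\begin{aligned}
&\text{maximise } f(x,y) = xy \quad \text{subject to } g(x,y) = x + y - 1 = 0 \\
&\mathcal{L}(x,y,\lambda) = xy - \lambda\,(x + y - 1) \\
&\frac{\partial \mathcal{L}}{\partial x} = y - \lambda = 0, \qquad
 \frac{\partial \mathcal{L}}{\partial y} = x - \lambda = 0, \qquad
 \frac{\partial \mathcal{L}}{\partial \lambda} = -(x + y - 1) = 0 \\
&\Rightarrow\; x = y = \lambda = \tfrac{1}{2}, \qquad f\!\left(\tfrac{1}{2},\tfrac{1}{2}\right) = \tfrac{1}{4}
\end{aligned}
```

If each step here is easy to follow, the optimisation prerequisite is probably in place.
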
Learning from Data ….. Prerequisites
MATHEMATICS
This is a mathematical subject. You must be comfortable with probabilities and algebra.
PROGRAMMING
You must be able to program, and pick up a new language relatively easily. We provide support for Matlab.
http://studentnet.cs.manchester.ac.uk/pgt/COMP61011
http://studentnet.cs.manchester.ac.uk/pgt/COMP61021
Matlab
MATrix LABoratory
1. If you don’t like maths.
61011 is reasonably challenging. 61021 is HARD.
Another valid name for machine learning is “Computational Statistics”.
3. If you have the “I want to use machine learning to do X” syndrome
This is a real technical subject. It’s not magic.
BTW… You will learn nothing about “Big Data”, or how to deal with it

Syllabus
COMP61011 (Foundations of Machine Learning)
• Linear Models
• Support Vector Machines
• Nearest Neighbour Methods
• Decision Trees
• Combining Models - ensemble methods, mixtures of experts, boosting
• Feature Selection
• Probabilistic Classifiers and Bayes Theorem
• Algorithm assessment - overfitting, generalisation, comparing two algorithms