Completely random measures and related models

Sinead Williamson

Computational and Biological Learning Laboratory
University of Cambridge

January 20, 2011


Outline

1. Background
2. Lévy processes
3. Completely random measures
4. Applications
    Normalized random measures
    Neutral-to-the-right processes
    Exchangeable matrices

A little measure theory

Set: e.g. integers, real numbers, people called James.
    May be finite, countably infinite, or uncountably infinite.
Algebra: Class $\mathcal{T}$ of subsets of a set $T$ s.t.
    1. $T \in \mathcal{T}$.
    2. If $A \in \mathcal{T}$, then $A^c \in \mathcal{T}$.
    3. If $A_1, \dots, A_K \in \mathcal{T}$, then $\cup_{k=1}^K A_k = A_1 \cup A_2 \cup \dots \cup A_K \in \mathcal{T}$ (closed under finite unions).
    4. If $A_1, \dots, A_K \in \mathcal{T}$, then $\cap_{k=1}^K A_k = A_1 \cap A_2 \cap \dots \cap A_K \in \mathcal{T}$ (closed under finite intersections).
σ-Algebra: Algebra that is closed under countably infinite unions and intersections.

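On a finite set the algebra axioms can be checked by brute force. The sketch below is only an illustration: the helper name `is_algebra` and the example families are assumptions, not part of the talk.

```python
from itertools import combinations

def is_algebra(universe, family):
    """Check the algebra axioms for a family of subsets of a finite set."""
    universe = frozenset(universe)
    family = {frozenset(a) for a in family}
    if universe not in family:
        return False                      # axiom 1: T itself is a member
    for a in family:
        if universe - a not in family:
            return False                  # axiom 2: closed under complement
    for a, b in combinations(family, 2):
        if a | b not in family or a & b not in family:
            return False                  # axioms 3-4: pairwise unions and intersections
    return True

T = {1, 2, 3, 4}
good = [set(), {1, 2}, {3, 4}, T]
bad = [set(), {1}, T]                     # missing the complement {2, 3, 4}
print(is_algebra(T, good), is_algebra(T, bad))
```

Pairwise closure is enough here because closure under any finite union or intersection follows by induction.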
A little measure theory

Measurable space: Combination $(T, \mathcal{T})$ of a set and a σ-algebra on that set.
Measure: Function $\mu$ from a σ-algebra to the non-negative reals (plus $+\infty$) s.t.
    1. $\mu(\emptyset) = 0$.
    2. For all countable collections of disjoint sets $A_1, A_2, \dots \in \mathcal{T}$, $\mu(\cup_k A_k) = \sum_k \mu(A_k)$.

Probability measures

Probability distribution: Measure $P$ on some measurable space $(\Omega, \mathcal{F})$ s.t. $P(\Omega) = 1$.
Intuition: Subsets = events; measure of a subset = probability of that event.
Discrete probability distribution: assigns measure 1 to a countable subset of $\Omega$.
Continuous probability distribution: assigns measure 0 to every singleton $\{x\}$, $x \in \Omega$.
Atoms: singletons with positive measure.

Representing the real world

Kolmogorov: Two types of object: experimental observations, and the random phenomena underlying them.

Real world                   Mathematical world
Random phenomena             Probability space $(\Omega, \mathcal{F}, P)$
Experiment                   Algebra
Experimental observations    Collection of random variables

Representing the real world

Random variables $X : (\Omega, \mathcal{F}) \to (S_X, \mathcal{S}_X)$ are mappings from the underlying probability space to our observation space.
This mapping, combined with the probability distribution on $(\Omega, \mathcal{F})$, induces a probability distribution $\mu_X := P \circ X^{-1}$ on the observation space.
We call $\mu_X$ the distribution of our observations.
[Figure: the mapping $X$ from $\Omega$ into the observation space $S_X$.]
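The induced distribution $P \circ X^{-1}$ can be computed exactly on a finite probability space. The two-dice space and the names `omega`, `X` and `pushforward` below are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction

# Underlying probability space: two fair dice, uniform P on 36 outcomes.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = {w: Fraction(1, 36) for w in omega}

def X(w):
    """Random variable: the observed sum of the two dice."""
    return w[0] + w[1]

def pushforward(P, X):
    """Return mu_X with mu_X({x}) = P(X^{-1}({x}))."""
    mu = defaultdict(Fraction)
    for w, p in P.items():
        mu[X(w)] += p
    return dict(mu)

mu_X = pushforward(P, X)
print(mu_X[7])
```

Here the observation space is the set of sums 2 to 12, and `mu_X` is the distribution of the observations.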

Characteristic functions

Often, it is useful to represent random variables and probability distributions in terms of their characteristic function.
For a random variable $X$ taking values in $\mathbb{R}^d$ with distribution $\mu_X$,

$$\Phi_X(u) = \int_{\mathbb{R}^d} e^{i\langle u, y\rangle} \mu_X(dy) = E[e^{i\langle u, X\rangle}]$$

If $\mu_X$ admits a density (i.e. $\mu_X(dy) = p(y)\,\nu(dy)$, with $\nu$ Lebesgue measure), then the characteristic function is the Fourier transform of that density.

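As a quick sanity check, the expectation form of the characteristic function can be estimated by Monte Carlo and compared with a known closed form. The sketch assumes a scalar standard normal, whose characteristic function is $e^{-u^2/2}$; the sample size and the value of `u` are arbitrary choices.

```python
import cmath
import math
import random

def empirical_cf(samples, u):
    """Monte Carlo estimate of E[exp(i u X)] for scalar X."""
    return sum(cmath.exp(1j * u * x) for x in samples) / len(samples)

rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]

u = 1.5
estimate = empirical_cf(samples, u)
exact = math.exp(-0.5 * u * u)   # characteristic function of N(0, 1) at u
print(abs(estimate - exact))
```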
Infinitely divisible distributions

We say a probability measure $\mu$ is infinitely divisible if, for each $n \in \mathbb{N}$:
    We can write $\mu$ as the n-fold self-convolution $\mu^{(n)} * \cdots * \mu^{(n)}$ of some distribution $\mu^{(n)}$.
    (Equivalently) The nth root $\Phi^{(n)}$ of the characteristic function of $\mu$ is the characteristic function of some probability measure.
    (Equivalently) For any $X \sim \mu$, we can write $X = \sum_{i=1}^n X^{(i)}$ (in distribution), where the $X^{(i)} \sim \mu^{(n)}$ are iid.

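The characteristic-function criterion is easy to verify for a concrete family. The sketch below uses the Poisson distribution as an assumed example: the nth root of the Poisson(λ) characteristic function is the characteristic function of Poisson(λ/n).

```python
import cmath

def poisson_cf(rate, u):
    """Characteristic function of a Poisson(rate) random variable."""
    return cmath.exp(rate * (cmath.exp(1j * u) - 1.0))

rate, n = 3.0, 7
max_gap = 0.0
for u in (-2.0, 0.3, 1.0):
    whole = poisson_cf(rate, u)
    # n-fold self-convolution corresponds to the n-th power of the cf.
    from_parts = poisson_cf(rate / n, u) ** n
    max_gap = max(max_gap, abs(whole - from_parts))
print(max_gap)
```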
(The celebrated) Lévy-Khintchine formula

Theorem: Lévy-Khintchine
A distribution $\mu$ on $\mathbb{R}^d$ is infinitely divisible iff its characteristic function $\Phi_\mu$ can be represented in the form:

$$\Phi_\mu(u) = \exp\left( i\langle b, u\rangle - \frac{1}{2}\langle u, Au\rangle + \int_{(\mathbb{R}^d - \{0\}) \times S_X} \left( e^{i\langle u, z\rangle} - 1 - i\langle u, z\rangle I(|z| \le 1) \right) \nu(dz, ds) \right),$$

for some uniquely defined vector $b \in \mathbb{R}^d$, non-negative-definite symmetric matrix $A$, and measure $\nu$ on $(\mathbb{R}^d - \{0\}) \times S_X$ satisfying:

$$\int_{(\mathbb{R}^d - \{0\}) \times S_X} (|z|^2 \wedge 1)\, \nu(dz, ds) < \infty .$$

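To see the formula at work, here is a one-dimensional sketch that evaluates the Lévy-Khintchine form for a finite, purely atomic Lévy measure and checks it against two known cases. The function name `lk_cf`, the restriction to $d = 1$, and dropping the location coordinate $s$ are all simplifying assumptions.

```python
import cmath

def lk_cf(u, b, A, jumps):
    """Levy-Khintchine characteristic function in one dimension.

    jumps is a list of (z, weight) pairs standing in for a finite,
    purely atomic Levy measure; the location coordinate s is ignored.
    """
    exponent = 1j * b * u - 0.5 * A * u * u
    for z, weight in jumps:
        # The compensator term only applies to small jumps, |z| <= 1.
        compensator = 1j * u * z if abs(z) <= 1.0 else 0.0
        exponent += weight * (cmath.exp(1j * u * z) - 1.0 - compensator)
    return cmath.exp(exponent)

# Poisson(rate): A = 0, nu = rate * delta_1, and b = rate cancels the compensator.
rate, u = 2.5, 0.8
poisson = cmath.exp(rate * (cmath.exp(1j * u) - 1.0))
poisson_gap = abs(lk_cf(u, b=rate, A=0.0, jumps=[(1.0, rate)]) - poisson)

# Standard normal: b = 0, A = 1, no jumps.
gauss_gap = abs(lk_cf(1.0, b=0.0, A=1.0, jumps=[]) - cmath.exp(-0.5))
print(poisson_gap, gauss_gap)
```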
Notation

We call:
    $b$ the drift;
    $A$ the Gaussian covariance matrix;
    $\nu$ the Lévy measure;
    the triplet $(A, \nu, b)$ the generating triplet.

Lévy processes

A Lévy process is a stochastic process $X = (X_t)_{t \ge 0}$ s.t.
    1. $X_0 = 0$.
    2. $X$ has independent increments, i.e. for each $n \in \mathbb{N}$ and each $t_1 \le \dots \le t_{n+1}$, the random variables $(X_{t_{i+1}} - X_{t_i}, 1 \le i \le n)$ are independent.
    3. $X$ is stochastically continuous, i.e. for every $\epsilon > 0$ and $t \ge 0$,

       $$\lim_{s \to t} P(|X_t - X_s| > \epsilon) = 0 . \quad (1)$$

    4. Sample paths of $X$ are right-continuous with left limits.

A Lévy process is homogeneous if its increments are stationary, i.e. if the distribution of $X_{t+s} - X_t$ does not depend on $t$.

Lévy processes and infinite divisibility

Theorem: Infinite divisibility
$X_t$ is infinitely divisible for all $t \ge 0$.

Proof
(Homogeneous case) Since $X$ has independent and stationary increments, for any $n \in \mathbb{N}$ we can write $X_t = \sum_{k=1}^n (X_{kt/n} - X_{(k-1)t/n})$, a sum of $n$ iid random variables. Therefore, $X_t$ is infinitely divisible.

Lévy processes and infinite divisibility

Infinite divisibility means the Lévy-Khintchine formula holds.
So, we can describe a Lévy process in terms of a drift vector, a Gaussian covariance matrix and a Lévy measure.
A related result, the Lévy-Itô decomposition, tells us that any Lévy process can be decomposed into the superposition of three Lévy processes:
    A continuous, deterministic process, governed by the drift.
    A continuous, random process (Brownian motion), governed by the Gaussian covariance matrix.
    A pure-jump, random process, governed by the Lévy measure.

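The three-part decomposition suggests a simple way to simulate a path on a time grid. The sketch below assumes a scalar process whose jump part is compound Poisson (a finite Lévy measure with positive total rate), so it does not cover infinite-activity processes; the function name and all parameter values are illustrative.

```python
import math
import random

def simulate_levy_path(b, sigma, jump_rate, jump_sampler, horizon, steps, rng):
    """Grid simulation: drift + Brownian motion + compound Poisson jumps."""
    dt = horizon / steps
    path = [0.0]                                          # X_0 = 0
    for _ in range(steps):
        drift = b * dt                                    # deterministic part
        diffusion = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)   # Brownian part
        jumps = 0.0
        t = rng.expovariate(jump_rate)                    # first jump time in this step
        while t < dt:
            jumps += jump_sampler(rng)                    # pure-jump part
            t += rng.expovariate(jump_rate)
        path.append(path[-1] + drift + diffusion + jumps)
    return path

rng = random.Random(1)
path = simulate_levy_path(b=0.5, sigma=1.0, jump_rate=2.0,
                          jump_sampler=lambda r: r.gauss(0.0, 0.3),
                          horizon=1.0, steps=200, rng=rng)
print(path[-1])
```

Each grid increment is the sum of three independent pieces, one per component of the decomposition.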
Subordinators

A subordinator is a Lévy process with non-decreasing sample paths.
A Lévy process on $\mathbb{R}_+$ has non-decreasing sample paths iff:
    $A = 0$ ← no Gaussian component.
    $b \ge 0$ ← deterministic component is non-decreasing.
    $\int_{(-\infty, 0)} \nu(dz \times \mathbb{R}_+) = 0$ ← no negative jumps.
    $\int_{(0, 1]} z\, \nu(dz \times \mathbb{R}_+) < \infty$ ← ensures conditions of Lévy process.
If $0 < \nu < \infty$, then $X$ has countably infinitely many jumps.

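A standard example satisfying these conditions is the gamma subordinator, which has no Gaussian part, zero drift and only positive jumps. The sketch below samples it on a grid using independent gamma increments; the parameter names `alpha` (shape per unit time) and `c` (rate) are assumptions chosen to match the usual gamma parameterization.

```python
import random

def gamma_subordinator(alpha, c, horizon, steps, rng):
    """Sample X on a grid using independent Gamma(alpha*dt, rate c) increments."""
    dt = horizon / steps
    path = [0.0]
    for _ in range(steps):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        increment = rng.gammavariate(alpha * dt, 1.0 / c)
        path.append(path[-1] + increment)
    return path

rng = random.Random(2)
path = gamma_subordinator(alpha=5.0, c=1.0, horizon=1.0, steps=100, rng=rng)
nondecreasing = all(x <= y for x, y in zip(path, path[1:]))
print(nondecreasing, path[-1])
```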
Completely random measures

Random measure: Mapping $M : (\Omega, \mathcal{F}) \to (S_M, \mathcal{S}_M)$, where $(S_M, \mathcal{S}_M)$ is a measurable space of measures.
Completely random measure (CRM): Random measure $M$ such that $M(A_1)$ and $M(A_2)$ are independent whenever $A_1$ and $A_2$ are disjoint.
CRMs can be decomposed into three parts:
    1. An atomic measure with random atom locations and random atom masses.
    2. An atomic measure with (at most countable) fixed atom locations and random atom masses.
    3. A non-random measure.
Parts 2 and 3 can be easily dealt with, so we only consider part 1.

Completely random measures and Lévy processes

CRM: Distribution over measures that assign independent masses to disjoint subsets.
This distribution is infinitely divisible, so Lévy-Khintchine applies.
CRMs are closely related to Lévy processes:
    If $X$ is a subordinator, then the measure $M$ defined so that $M(s, t] = X_t - X_s$ for $s < t$ is a CRM.
    If $M$ is a completely random measure on $\mathbb{R}_+$, then its cumulative function is a subordinator.
Just as a subordinator (with $\nu > 0$) has a countably infinite number of jumps, a CRM assigns positive mass to a countably infinite number of locations:

$$M = \sum_{i=1}^{\infty} \pi_i \delta_{t_i},$$

where $\pi_i > 0$ for all $i$.

Completely random measures and Poisson processes

Can categorize atoms as (size, location) pairs in some space $\mathbb{R}_+ \times S_X$.
Define a Poisson point process on this space with intensity given by the Lévy measure $\nu(dz, ds)$.
Events of the Poisson point process give the sizes and locations of the atoms of the CRM.
Homogeneous CRM ↔ $\nu(dz, ds) = \nu_z(dz)\nu_s(ds)$.

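The Poisson process construction can be sketched directly when the Lévy measure is finite, so that the number of atoms is Poisson distributed. Everything below is an illustrative assumption: a product-form (homogeneous) Lévy measure with exponential sizes, uniform locations on $[0, 1)$, and the helper names `sample_poisson`, `sample_crm_atoms` and `mass`.

```python
import random

def sample_poisson(mean, rng):
    """Poisson(mean) draw: count unit-rate arrivals in [0, mean]."""
    n, t = 0, rng.expovariate(1.0)
    while t < mean:
        n += 1
        t += rng.expovariate(1.0)
    return n

def sample_crm_atoms(total_rate, size_sampler, location_sampler, rng):
    """Atoms (size, location) of a homogeneous CRM with finite Levy measure."""
    n_atoms = sample_poisson(total_rate, rng)
    return [(size_sampler(rng), location_sampler(rng)) for _ in range(n_atoms)]

def mass(atoms, lo, hi):
    """M((lo, hi]) = sum of the sizes of atoms located in (lo, hi]."""
    return sum(z for z, s in atoms if lo < s <= hi)

rng = random.Random(3)
atoms = sample_crm_atoms(total_rate=20.0,
                         size_sampler=lambda r: r.expovariate(1.0),
                         location_sampler=lambda r: r.random(),
                         rng=rng)
print(len(atoms), mass(atoms, 0.0, 0.5), mass(atoms, 0.5, 1.0))
```

Masses of disjoint intervals depend on disjoint sets of Poisson points, which is where the complete randomness comes from. A Lévy measure with infinite total mass, such as that of the gamma process, would need a truncation or a different sampler.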
Example: Gamma process

Let $H$ be a measure over some space $(S_X, \mathcal{S}_X)$.
Distribution over measures such that the mass assigned to a given subset $A \in \mathcal{S}_X$ is Gamma distributed with shape $\alpha H(A)$ and rate $c$, for $c, \alpha > 0$.
Such a distribution is a CRM with Lévy measure

$$\nu(dz, ds) = \frac{\alpha e^{-cz}}{z}\, dz\, H(ds) .$$

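On a finite partition the gamma process reduces to independent gamma masses, which makes it easy to sample. The sketch assumes the shape/rate reading of the parameters (shape $\alpha H(A)$, rate $c$); the base-measure values in `H` are hypothetical.

```python
import random

def gamma_process_masses(alpha, c, base_masses, rng):
    """Independent masses M(A_k) ~ Gamma(shape alpha * H(A_k), rate c)."""
    # random.gammavariate takes (shape, scale), so scale = 1 / c.
    return [rng.gammavariate(alpha * h, 1.0 / c) for h in base_masses]

alpha, c = 2.0, 4.0
H = [0.2, 0.3, 0.5]            # hypothetical H(A_k) for a 3-set partition
rng = random.Random(4)
draws = [gamma_process_masses(alpha, c, H, rng) for _ in range(20000)]
mean_first = sum(d[0] for d in draws) / len(draws)
expected_first = alpha * H[0] / c      # Gamma mean = shape / rate
print(mean_first, expected_first)
```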
Normalized random measures

Completely random measures are distributions over measures with random (finite) total measure.
In Stats and ML, we are often interested in probability measures.
Obvious solution: Normalize!
Example: Dirichlet process = normalized Gamma process.
Example: Normalized stable process.

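On a finite partition, normalizing independent gamma masses gives a Dirichlet-distributed weight vector, which is the finite-dimensional picture behind the Dirichlet process example. The sketch below assumes unit rate and hypothetical concentration values.

```python
import random

def dirichlet_from_gammas(concentrations, rng):
    """Normalize independent Gamma(a_k, 1) masses to get Dirichlet(a_1..a_K) weights."""
    masses = [rng.gammavariate(a, 1.0) for a in concentrations]
    total = sum(masses)
    return [m / total for m in masses]

rng = random.Random(5)
weights = dirichlet_from_gammas([0.4, 0.6, 1.0], rng)
print(weights, sum(weights))
```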
Survival analysis

Objective: Estimate distribution over time $T$ at which a specified event occurs for a given individual.
Examples:
    Deaths of patients in a study.
    Failure times of mechanical components.
    Time at which a user leaves a website.
Observations:
    Observe individuals $i = 1, \dots, n$ over time.
    Record times $T_i = t_i \in \mathbb{R}_+$ at which events occur.
Right-censoring:
    Each individual $i$ is observed over some time interval $[0, c_i]$.
    If $T_i > c_i$, the event is unobserved (censored) for individual $i$.

Representing distribution over event times

Cumulative distribution function $F(t) = P(T < t) = \int_0^t f(u)\,du$.
Hazard rate $h(t) = \frac{f(t)}{1 - F(t)}$.
Cumulative hazard (def. 1): $H(t) = \int_0^t h(u)\,du$.
Cumulative hazard (def. 2): $A(t) = -\log(1 - F(t))$.
Definitions coincide if the cdf is continuous.
[Figure: CDF, hazard rate and cumulative hazard plotted against time, for time from 0 to 15.]
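For a continuous cdf the two cumulative hazard definitions can be compared numerically. The sketch assumes exponential event times with a hypothetical rate, for which the hazard rate is constant; the midpoint-rule integrator is just one convenient choice.

```python
import math

def cdf(t, rate):
    return 1.0 - math.exp(-rate * t)          # exponential event times

def density(t, rate):
    return rate * math.exp(-rate * t)

def hazard(t, rate):
    return density(t, rate) / (1.0 - cdf(t, rate))

def cumulative_hazard_def1(t, rate, steps=10000):
    """H(t): midpoint-rule integral of the hazard rate over [0, t]."""
    dt = t / steps
    return sum(hazard((k + 0.5) * dt, rate) for k in range(steps)) * dt

def cumulative_hazard_def2(t, rate):
    """A(t) = -log(1 - F(t))."""
    return -math.log(1.0 - cdf(t, rate))

t, rate = 3.0, 0.7
H1 = cumulative_hazard_def1(t, rate)
A2 = cumulative_hazard_def2(t, rate)
print(H1, A2)
```

Both values should be close to `rate * t`, since the exponential hazard is constant.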
Neutral-to-the-right processes

Doksum (1974): A random distribution function $F(t)$ is neutral-to-the-right if, for each $k > 1$ and $t_1 < \dots < t_k$, the normalised increments

$$F(t_1),\ \frac{F(t_2) - F(t_1)}{1 - F(t_1)},\ \dots,\ \frac{F(t_k) - F(t_{k-1})}{1 - F(t_{k-1})}$$

are independent.
Doksum (1974): $F(t)$ is neutral-to-the-right iff its cumulative hazard (def. 2) is the cumulative function of a completely random measure.
Hjort (1990): $F(t)$ is neutral-to-the-right iff its cumulative hazard (def. 1) is the cumulative function of a completely random measure.
In both cases, $F(t)$ is conjugate under observed and right-censored observations (Ferguson and Phadia, 1979; Hjort, 1990).

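The Doksum characterisation can be illustrated on a grid: take a cumulative function $A$ with independent non-negative increments and set $F(t) = 1 - e^{-A(t)}$. The sketch below assumes gamma-distributed increments and hypothetical parameter values; it is an illustration, not the construction used in the cited papers.

```python
import math
import random

def ntr_cdf_on_grid(alpha, c, base_increments, rng):
    """F(t_k) = 1 - exp(-A(t_k)), with A built from independent gamma increments."""
    A, F = 0.0, []
    for h in base_increments:
        A += rng.gammavariate(alpha * h, 1.0 / c)   # independent CRM increment
        F.append(1.0 - math.exp(-A))
    return F

rng = random.Random(6)
F = ntr_cdf_on_grid(alpha=3.0, c=1.0, base_increments=[0.2] * 10, rng=rng)

# Each normalised increment equals 1 - exp(-(increment of A)), so the
# normalised increments inherit independence from the increments of A.
normalised = [(b - a) / (1.0 - a) for a, b in zip(F, F[1:])]
print(F[-1], normalised[:3])
```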
Example: Beta process

CRM with Lévy measure

$$\nu(dz, ds) = c(s) z^{-1} (1 - z)^{c(s) - 1}\, dz\, H(ds) ,$$

where $c$ is a non-negative, piecewise continuous function and $H$ is a (def. 2) hazard function.
Note: Lévy measure depends on atom location (inhomogeneous).
Discrete measure with atom masses in $(0, 1)$.
Intuition: Infinitesimal limit of beta-distributed atom masses.
Survival analysis intuition:
    Atom location = time.
    Atom size = probability of event at that time, given survival until that time.

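The "infinitesimal limit" intuition can be sketched with a discrete-time approximation: put one beta-distributed atom at each grid time, with mean equal to the base hazard increment there. The parameterization Beta$(c h_k, c(1 - h_k))$, the constant `c`, and the grid values are assumptions made for illustration.

```python
import random

def discrete_beta_process(c, base_increments, rng):
    """Atom masses z_k ~ Beta(c*h_k, c*(1 - h_k)) at grid times; E[z_k] = h_k."""
    return [rng.betavariate(c * h, c * (1.0 - h)) for h in base_increments]

def survival_curve(atom_masses):
    """S(t_k) = prod_{j <= k} (1 - z_j): survive each grid time in turn."""
    curve, alive = [], 1.0
    for z in atom_masses:
        alive *= 1.0 - z
        curve.append(alive)
    return curve

rng = random.Random(7)
masses = discrete_beta_process(c=2.0, base_increments=[0.05] * 40, rng=rng)
S = survival_curve(masses)
print(S[-1])
```

Each atom size plays the role of the probability of the event at that time given survival so far, so the survival curve is the running product of `1 - z`.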
Application: Exchangeable matrices

A sequence is exchangeable if any permutation of that sequence has equal probability.
de Finetti: There exists an underlying measure, conditioned on which the sequence is iid.
Recipe for exchangeable distribution: Combine a distribution over measures with an appropriate (*cough* conjugate) likelihood.
Example: Dirichlet process + “multinomial” distribution → Chinese restaurant process.

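The Chinese restaurant process has a simple sequential sampler: each new customer joins an existing table with probability proportional to its size, or starts a new table with probability proportional to a concentration parameter. The function name and the value of `alpha` below are illustrative.

```python
import random

def chinese_restaurant_process(n_customers, alpha, rng):
    """Sequentially seat customers; returns (table index per customer, table sizes)."""
    counts, seating = [], []
    for n in range(n_customers):
        # Existing table k has weight counts[k]; a new table has weight alpha.
        r = rng.random() * (n + alpha)
        table = len(counts)                 # default: open a new table
        for k, size in enumerate(counts):
            if r < size:
                table = k
                break
            r -= size
        if table == len(counts):
            counts.append(0)
        counts[table] += 1
        seating.append(table)
    return seating, counts

rng = random.Random(8)
seating, counts = chinese_restaurant_process(50, alpha=1.5, rng=rng)
print(counts)
```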
Application: Exchangeable matrices

We can use CRMs to define exchangeable distributions over matrices with infinitely many columns.
Each column corresponds to an atom of the CRM-distributed measure.

Beta process + Bernoulli likelihood → Indian buffet process (Griffiths and Ghahramani, 2005)
Gamma process + Poisson likelihood → infinite gamma-Poisson process (Titsias, 2007)

[Figure: example matrices.]
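The Indian buffet process can be sampled with the usual sequential scheme: customer $i$ takes each existing dish $k$ with probability $m_k / i$, then adds a Poisson$(\alpha / i)$ number of new dishes. This is a minimal sketch; the helper `sample_poisson` and the value of `alpha` are assumptions.

```python
import random

def sample_poisson(mean, rng):
    """Poisson(mean) draw: count unit-rate arrivals in [0, mean]."""
    n, t = 0, rng.expovariate(1.0)
    while t < mean:
        n += 1
        t += rng.expovariate(1.0)
    return n

def indian_buffet_process(n_customers, alpha, rng):
    """Return a binary matrix (list of rows); columns are dishes (atoms)."""
    dish_counts, rows = [], []
    for i in range(1, n_customers + 1):
        # Take existing dish k with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in dish_counts]
        new_dishes = sample_poisson(alpha / i, rng)
        row.extend([1] * new_dishes)
        dish_counts = [m + z for m, z in zip(dish_counts, row)] + [1] * new_dishes
        rows.append(row)
    width = len(dish_counts)
    # Pad earlier rows with zeros so every row has one entry per dish.
    return [row + [0] * (width - len(row)) for row in rows]

rng = random.Random(9)
Z = indian_buffet_process(10, alpha=3.0, rng=rng)
for row in Z:
    print(row)
```

Only finitely many columns are non-zero for any finite number of rows, which is why the matrix can be stored with a finite width.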
