Completely random measures and related models
Sinead Williamson
Computational and Biological Learning Laboratory
University of Cambridge
Background
Lévy processes
Completely random measures
Applications
  Normalized random measures
  Neutral-to-the-right processes
  Exchangeable matrices
1 Background
Set: e.g. integers, real numbers, people called James.
May be finite, countably infinite, or uncountably infinite.
Algebra: Class $\mathcal{T}$ of subsets of a set $T$ s.t.
1 $T \in \mathcal{T}$.
2 If $A \in \mathcal{T}$, then $A^c \in \mathcal{T}$.
3 If $A_1, \dots, A_K \in \mathcal{T}$, then $\cup_{k=1}^K A_k = A_1 \cup A_2 \cup \dots \cup A_K \in \mathcal{T}$ (closed under finite unions).
4 If $A_1, \dots, A_K \in \mathcal{T}$, then $\cap_{k=1}^K A_k = A_1 \cap A_2 \cap \dots \cap A_K \in \mathcal{T}$ (closed under finite intersections).
σ-Algebra: Algebra that is closed under countably infinite unions and intersections.
A little measure theory
Random variables $X : (\Omega, \mathcal{F}) \to (S_X, \mathcal{S}_X)$ are mappings from the underlying probability space to our observation space.
This mapping, combined with the probability distribution $P$ on $(\Omega, \mathcal{F})$, induces a probability distribution $\mu_X := P \circ X^{-1}$ on the observation space.
We call $\mu_X$ the distribution of our observations.
[Figure: the mapping $X$ from $\Omega$ to $S_X$.]
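As a minimal sketch of the induced distribution $\mu_X = P \circ X^{-1}$, assume a finite sample space of two fair coin flips and let $X$ count the heads. The sample space, the names `omega`, `P`, `X` and `pushforward` are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical finite probability space: two fair coin flips.
omega = ["HH", "HT", "TH", "TT"]
P = {w: Fraction(1, 4) for w in omega}

def X(w):
    """Random variable: number of heads in the outcome w."""
    return w.count("H")

def pushforward(P, X):
    """Return mu_X = P o X^{-1} as a dict {observation: probability}."""
    mu = defaultdict(Fraction)
    for w, p in P.items():
        mu[X(w)] += p  # add P({w}) to the observation that w maps to
    return dict(mu)

mu_X = pushforward(P, X)
print(mu_X)
```

Each observation value collects the probability of its preimage under $X$, which is exactly what the composition $P \circ X^{-1}$ expresses.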
Characteristic functions
Often, it is useful to represent random variables and probability distributions in terms of their characteristic function.
For a random variable $X$ taking values in $\mathbb{R}^d$ with distribution $\mu_X$,
$$\Phi_X(u) = \int_{\mathbb{R}^d} e^{i\langle u, y\rangle}\, \mu_X(dy) = \mathbb{E}\left[e^{i\langle u, X\rangle}\right]$$
If $\mu_X$ admits a density (i.e. $\mu_X(dy) = p(y)\nu(dy)$), then the characteristic function is the Fourier transform of that density.
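The expectation form of the characteristic function suggests a simple Monte Carlo check. The sketch below assumes a standard normal $X$ in one dimension (an illustrative choice), whose characteristic function is $e^{-u^2/2}$; the helper name `empirical_cf` is hypothetical.

```python
import cmath
import math
import random

def empirical_cf(samples, u):
    """Monte Carlo estimate of E[exp(i u X)] from samples of X."""
    return sum(cmath.exp(1j * u * x) for x in samples) / len(samples)

rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]

u = 1.0
estimate = empirical_cf(samples, u)
exact = math.exp(-u * u / 2)  # characteristic function of N(0, 1) at u
print(estimate, exact)
```

The estimate is complex-valued; for a symmetric distribution such as this one its imaginary part should be close to zero.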
Infinitely divisible distributions
We say a probability measure $\mu$ is infinitely divisible if, for each $n \in \mathbb{N}$:
  We can write $\mu$ as the n-fold self-convolution $\mu^{(n)} * \dots * \mu^{(n)}$ of some distribution $\mu^{(n)}$.
  (Equivalently) The nth root $\Phi^{(n)}$ of the characteristic function of $\mu$ is the characteristic function of some probability measure.
  (Equivalently) For any $X \sim \mu$, we can write $X \overset{d}{=} \sum_{i=1}^{n} X^{(i)}$, where the $X^{(i)} \sim \mu^{(n)}$ are independent.
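The third characterisation can be checked numerically. As a sketch, assume $\mu$ is a Gamma(a, 1) distribution, for which $\mu^{(n)}$ is Gamma(a/n, 1); the values of `a`, `n` and the sample size are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(1)
a, n, num_samples = 2.0, 4, 20000  # illustrative values

# X = X^(1) + ... + X^(n), with X^(i) ~ Gamma(a / n, 1) drawn independently.
sums = [sum(rng.gammavariate(a / n, 1.0) for _ in range(n))
        for _ in range(num_samples)]

# Gamma(a, 1) has mean a and variance a; the summed pieces should match.
mean = statistics.fmean(sums)
var = statistics.variance(sums)
print(mean, var)
```

Matching the first two moments is only a sanity check, not a proof that the two distributions agree.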
(The celebrated) Lévy-Khintchine formula
Theorem: Lévy-Khintchine
A distribution $\mu$ on $\mathbb{R}^d$ is infinitely divisible iff its characteristic function $\Phi_\mu$ can be represented in the form:
$$\Phi_\mu(u) = \exp\left( i\langle b, u\rangle - \frac{1}{2}\langle u, Au\rangle + \int_{(\mathbb{R}^d \setminus \{0\}) \times S_X} \left( e^{i\langle u, z\rangle} - 1 - i\langle u, z\rangle I(|z| \le 1) \right) \nu(dz, ds) \right),$$
for some uniquely defined vector $b \in \mathbb{R}^d$, positive semi-definite symmetric matrix $A$, and measure $\nu$ on $\mathbb{R}^d$ satisfying:
$$\int_{(\mathbb{R}^d \setminus \{0\}) \times S_X} (|z|^2 \wedge 1)\, \nu(dz, ds) < \infty.$$
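A small worked check may help make the formula concrete. The sketch below takes $d = 1$, ignores the location coordinate $s$, and assumes a Poisson distribution with rate `lam`, whose triplet under the truncation above is $b = \lambda$, $A = 0$, $\nu = \lambda\delta_1$. The function names and the value of `lam` are illustrative assumptions.

```python
import cmath
import math

lam = 3.0  # illustrative Poisson rate

def cf_from_pmf(u, terms=200):
    """Characteristic function of Poisson(lam), summed directly from the pmf."""
    total, log_pmf = 0j, -lam  # log P(K = 0)
    for k in range(terms):
        total += cmath.exp(1j * u * k + log_pmf)
        log_pmf += math.log(lam) - math.log(k + 1)  # move to log P(K = k + 1)
    return total

def cf_levy_khintchine(u):
    """Levy-Khintchine form with b = lam, A = 0 and nu = lam * delta_1."""
    drift = 1j * lam * u
    # The single jump size is z = 1, so the indicator I(|z| <= 1) equals 1.
    jump = lam * (cmath.exp(1j * u) - 1 - 1j * u)
    return cmath.exp(drift + jump)

for u in (0.3, 1.0, 2.5):
    print(u, cf_from_pmf(u), cf_levy_khintchine(u))
```

The drift term cancels the compensator $-i u \lambda$ inside the integral, leaving the familiar $\exp(\lambda(e^{iu} - 1))$.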
Notation
We call:
  $b$ the drift;
  $A$ the Gaussian covariance matrix;
  $\nu$ the Lévy measure;
  the triplet $(A, \nu, b)$ the generating triplet.
Lévy processes
A Lévy process is a stochastic process $X = (X_t)_{t \ge 0}$ s.t.
1 $X_0 = 0$.
2 $X$ has independent increments, i.e. for each $n \in \mathbb{N}$ and each $t_1 \le \dots \le t_{n+1}$, the random variables $(X_{t_{i+1}} - X_{t_i},\ 1 \le i \le n)$ are independent.
3 $X$ is stochastically continuous, i.e. for every $\varepsilon > 0$ and $s \ge 0$,
$$\lim_{t \to s} P(|X_t - X_s| > \varepsilon) = 0. \quad (1)$$
Theorem: Infinite divisibility
$X_t$ is infinitely divisible for all $t \ge 0$.
Proof
(Homogeneous case) For any $n \in \mathbb{N}$, write $X_t = \sum_{k=1}^{n} (X_{kt/n} - X_{(k-1)t/n})$. Since $X$ has independent increments, and in the homogeneous case increments over intervals of equal length are identically distributed, this expresses $X_t$ as the sum of $n$ i.i.d. random variables. Therefore, $X_t$ is infinitely divisible.
Lévy processes and infinite divisibility
Infinite divisibility means the Lévy-Khintchine formula holds.
So, we can describe a Lévy process in terms of a drift vector, a Gaussian covariance matrix and a Lévy measure.
A related result, the Lévy-Itô decomposition, tells us that any Lévy process can be decomposed into the superposition of three Lévy processes:
  A continuous, deterministic process, governed by the drift.
  A continuous, random process (Brownian motion), governed by the Gaussian covariance matrix.
  A pure-jump, random process, governed by the Lévy measure.
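The three-part decomposition can be sketched as a simulation on a time grid. This is a minimal illustration under a strong assumption: the Lévy measure is finite, so the jump part is a compound Poisson process with Gaussian jump sizes. Lévy measures with infinite total mass would need truncation or another approximation. All parameter names and defaults are hypothetical.

```python
import math
import random

def simulate_levy_path(T=1.0, steps=1000, b=0.5, sigma=1.0,
                       jump_rate=5.0, jump_scale=0.3, seed=0):
    """Simulate X on a grid as drift + Brownian motion + compound Poisson jumps."""
    rng = random.Random(seed)
    dt = T / steps
    path = [0.0]  # X_0 = 0
    for _ in range(steps):
        drift = b * dt
        gaussian = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Jumps in this step: count exponential waiting times (rate jump_rate)
        # that fit inside dt, which gives a Poisson number of jumps.
        jumps, elapsed = 0.0, rng.expovariate(jump_rate)
        while elapsed < dt:
            jumps += rng.gauss(0.0, jump_scale)
            elapsed += rng.expovariate(jump_rate)
        path.append(path[-1] + drift + gaussian + jumps)
    return path

path = simulate_levy_path()
print(len(path), path[0], path[-1])
```

Each grid increment is built from fresh random draws, so increments over disjoint steps are independent, as the definition of a Lévy process requires.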
Subordinators
A subordinator is a Lévy process with non-decreasing sample paths.
Random measure: Mapping $M : (\Omega, \mathcal{F}) \to (S_M, \mathcal{S}_M)$, where $S_M$ is a set of measures equipped with a σ-algebra $\mathcal{S}_M$.
CRM: Distribution over measures that assign independent masses to disjoint subsets.
This distribution is infinitely divisible, so Lévy-Khintchine applies.
CRMs are closely related to Lévy processes:
  If $X$ is a subordinator, then the measure $M$ defined so that $M(s, t] = X_t - X_s$ for $s < t$ is a CRM.
  If $M$ is a completely random measure on $\mathbb{R}_+$, then its cumulative function is a subordinator.
Just as a subordinator (with non-zero Lévy measure $\nu$) has a countably infinite number of jumps, a CRM assigns positive mass to a countably infinite number of locations:
$$M = \sum_{i=1}^{\infty} \pi_i \delta_{t_i},$$
where $\pi_i > 0$ for all $i$.
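The link between an atomic measure and its cumulative function can be sketched with a handful of atoms. The list below is a hypothetical finite truncation of the infinite sum, and the helper names `mass` and `cumulative` are illustrative.

```python
# Hypothetical atoms (t_i, pi_i) of a measure M = sum_i pi_i * delta_{t_i} on R+.
atoms = [(0.2, 0.5), (0.7, 1.25), (1.5, 0.25), (2.4, 2.0)]

def mass(atoms, s, t):
    """M(s, t]: total weight of the atoms located in the interval (s, t]."""
    return sum(pi for loc, pi in atoms if s < loc <= t)

def cumulative(atoms, t):
    """X_t = M(0, t], a non-decreasing step function of t."""
    return mass(atoms, 0.0, t)

# M(s, t] = X_t - X_s, and masses on disjoint intervals add up.
s, t = 0.5, 2.0
print(mass(atoms, s, t), cumulative(atoms, t) - cumulative(atoms, s))
```

Because every weight is positive, the cumulative function only ever jumps upwards, which is the sample-path behaviour of a subordinator.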
Completely random measures and Poisson processes
Can categorize atoms as (size, location) pairs in some space $\mathbb{R}_+ \times S_X$.
Define a Poisson point process on this space whose mean measure is the Lévy measure $\nu(dz, ds)$.
Events of the Poisson point process give the size and location of the atoms of the CRM.
Homogeneous CRM ↔ $\nu(dz, ds) = \nu_z(dz)\nu_s(ds)$.
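The Poisson process construction can be sketched directly when the Lévy measure is finite. The code below assumes a hypothetical homogeneous Lévy measure $\nu(dz, ds) = \text{rate} \cdot e^{-z}\,dz\,ds$ on $\mathbb{R}_+ \times [0, 1]$, so the number of atoms is Poisson with mean `rate`. Lévy measures with infinite mass (as in the gamma or beta process) are not covered by this sketch.

```python
import random

def sample_poisson(rng, mean):
    """Poisson(mean) draw: count unit-rate exponential arrivals before `mean`."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def sample_crm_atoms(rate=10.0, seed=0):
    """Atoms (size, location) of a homogeneous CRM on [0, 1].

    Assumed (hypothetical) finite Levy measure:
    nu(dz, ds) = rate * exp(-z) dz * ds, with total mass `rate`.
    """
    rng = random.Random(seed)
    n_atoms = sample_poisson(rng, rate)
    # Given the count, points are i.i.d. from the normalised intensity:
    # size ~ Exponential(1), location ~ Uniform[0, 1).
    return [(rng.expovariate(1.0), rng.random()) for _ in range(n_atoms)]

atoms = sample_crm_atoms()
total_mass = sum(size for size, _ in atoms)
print(len(atoms), total_mass)
```

The product form of the assumed intensity is what lets sizes and locations be drawn independently, matching the homogeneous case on the slide.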
Example: Gamma process
Let $H$ be a measure over some space $(S_X, \mathcal{S}_X)$.
Completely random measures are distributions over measures with random (finite) total measure.
In Stats and ML, we are often interested in probability measures.
Obvious solution: Normalize!
Example: Dirichlet process = normalized Gamma process.
Example: Normalized stable process.
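The Dirichlet process example can be illustrated on a finite partition. The sketch assumes the common parameterisation in which a gamma process with base measure $H$ assigns independent Gamma($H(A_k)$, 1) masses to disjoint sets $A_k$; this parameterisation and the base masses below are assumptions, not taken from the slides. Normalising those masses gives a Dirichlet($H(A_1), \dots, H(A_K)$) vector.

```python
import random

def normalized_gamma_masses(base_masses, rng):
    """Normalise independent Gamma(H(A_k), 1) masses on a partition A_1..A_K."""
    gammas = [rng.gammavariate(h, 1.0) for h in base_masses]
    total = sum(gammas)
    return [g / total for g in gammas]

rng = random.Random(2)
base_masses = [1.0, 2.0, 3.0]  # hypothetical values of H(A_1), H(A_2), H(A_3)
draws = [normalized_gamma_masses(base_masses, rng) for _ in range(20000)]

# Dirichlet(1, 2, 3) has mean (1/6, 2/6, 3/6).
means = [sum(d[k] for d in draws) / len(draws) for k in range(3)]
print(means)
```

The normalised vector always sums to one, so each draw is a probability distribution over the cells of the partition.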
Survival analysis
Objective: Estimate distribution over time $T$ at which a specified event occurs for a given individual.
Examples:
  Deaths of patients in a study.
  Failure times of mechanical components.
  Time at which a user leaves a website.
Observations:
  Observe individuals $i = 1, \dots, n$ over time.
  Record times $T_i = t_i \in \mathbb{R}_+$ at which events occur.
Right-censoring:
  Each individual $i$ is observed over some time interval $[0, c_i]$.
  If $T_i > c_i$, the event is unobserved (censored) for individual $i$.
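The right-censoring rule can be sketched as a small data-preparation step. The function name `right_censor` and the example times are hypothetical; each individual yields either the event time, or the censoring time together with a flag saying the event was not seen.

```python
def right_censor(event_times, censor_times):
    """Return (observed_time, event_observed) pairs under right-censoring."""
    observations = []
    for t_i, c_i in zip(event_times, censor_times):
        if t_i <= c_i:
            observations.append((t_i, True))   # event seen at time t_i
        else:
            observations.append((c_i, False))  # only know that T_i > c_i
    return observations

# Hypothetical event and censoring times for four individuals.
event_times = [2.0, 5.5, 1.2, 9.0]
censor_times = [4.0, 4.0, 4.0, 10.0]
print(right_censor(event_times, censor_times))
```

A censored record still carries information: it says the event time exceeds $c_i$, which is what a likelihood for censored data uses.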
Representing distribution over event times
Cumulative distribution function $F(t) = P(T < t) = \int_0^t f(u)\,du$.
Hazard rate $h(t) = \frac{f(t)}{1 - F(t)}$.
Cumulative hazard (def. 1): $H(t) = \int_0^t h(u)\,du$.
Cumulative hazard (def. 2): $A(t) = -\log(1 - F(t))$.
Definitions coincide if the cdf is continuous.
[Figure: CDF, hazard rate and cumulative hazard plotted against time (0 to 15).]
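The agreement of the two cumulative hazard definitions for a continuous cdf can be checked numerically. The sketch assumes a Weibull distribution with shape `K` and unit scale (an illustrative choice), and integrates the hazard rate with a midpoint rule.

```python
import math

K = 1.5  # illustrative Weibull shape; the scale is fixed to 1

def cdf(t):
    return 1.0 - math.exp(-t ** K)

def density(t):
    return K * t ** (K - 1) * math.exp(-t ** K)

def hazard(t):
    return density(t) / (1.0 - cdf(t))

def cumulative_hazard_def1(t, steps=10000):
    """H(t): midpoint-rule integral of the hazard rate over [0, t]."""
    dt = t / steps
    return sum(hazard((j + 0.5) * dt) for j in range(steps)) * dt

def cumulative_hazard_def2(t):
    """A(t) = -log(1 - F(t))."""
    return -math.log(1.0 - cdf(t))

print(cumulative_hazard_def1(2.0), cumulative_hazard_def2(2.0))
```

For this distribution both quantities equal $t^K$ in closed form, so the two printed values should agree up to the integration error.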
Neutral-to-the-right processes
Doksum (1974): A random distribution function $F(t)$ is neutral-to-the-right if, for each $k > 1$ and $t_1 < \dots < t_k$, the normalised increments
$$F(t_1),\ \frac{F(t_2) - F(t_1)}{1 - F(t_1)},\ \dots,\ \frac{F(t_k) - F(t_{k-1})}{1 - F(t_{k-1})}$$
are independent.
Doksum (1974): $F(t)$ is neutral-to-the-right iff its cumulative hazard (def. 2) is the cumulative function of a completely random measure.
Hjort (1990): $F(t)$ is neutral-to-the-right iff its cumulative hazard (def. 1) is the cumulative function of a completely random measure.
In both cases, the prior on $F(t)$ is conjugate under observed and right-censored observations (Ferguson and Phadia, 1979; Hjort, 1990).
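Doksum's characterisation can be sketched by building $F$ from independent increments of $A(t) = -\log(1 - F(t))$. The gamma-distributed increments and the time grid below are illustrative assumptions; the point is only that each normalised increment of $F$ then depends on a single increment of $A$, which is why the normalised increments are independent.

```python
import math
import random

rng = random.Random(3)
times = [0.5, 1.0, 2.0, 4.0]

# Independent non-negative increments of A over (0, t_1], (t_1, t_2], ...
# Gamma-distributed increments are an illustrative choice.
increments = [rng.gammavariate(0.5, 1.0) for _ in times]

A, F = [], []
running = 0.0
for inc in increments:
    running += inc
    A.append(running)
    F.append(1.0 - math.exp(-running))  # invert A(t) = -log(1 - F(t))

# Normalised increments of F: each one depends on a single increment of A.
normalised = [F[0]] + [(F[j] - F[j - 1]) / (1.0 - F[j - 1])
                       for j in range(1, len(F))]
expected = [1.0 - math.exp(-inc) for inc in increments]
print(normalised, expected)
```

The two printed lists agree up to floating-point error, since $(F(t_j) - F(t_{j-1}))/(1 - F(t_{j-1})) = 1 - e^{-(A(t_j) - A(t_{j-1}))}$.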
Example: Beta process
CRM with Lévy measure
$$\nu(dz, ds) = c(s)\, z^{-1} (1 - z)^{c(s) - 1}\, dz\, H(ds),$$
where $c$ is a non-negative, piecewise continuous function and $H$ is a (def. 2) hazard function.
We can use CRMs to define exchangeable distributions over matrices with infinitely many columns.
Each column corresponds to an atom of the CRM-distributed measure.
Beta process + Bernoulli likelihood → Indian Buffet process (Griffiths and Ghahramani, 2005)
Gamma process + Poisson likelihood → infinite gamma-Poisson process (Titsias, 2007)
[Figure: example matrix with non-negative integer entries.]
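As a sketch of the beta-Bernoulli case, the Indian Buffet process can be sampled with its sequential restaurant scheme: row $n$ reuses an existing column $k$ with probability $m_k / n$ (where $m_k$ counts earlier rows using it) and then opens Poisson($\alpha / n$) new columns. The parameter `alpha` and the function names are illustrative assumptions; the slides do not give this algorithm.

```python
import random

def sample_poisson(rng, mean):
    """Poisson(mean) draw: count unit-rate exponential arrivals before `mean`."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def sample_ibp(num_rows, alpha, seed=0):
    """Binary matrix from the Indian buffet process restaurant scheme."""
    rng = random.Random(seed)
    rows, counts = [], []  # counts[k] = number of rows using column k so far
    for n in range(1, num_rows + 1):
        # Reuse existing column k with probability counts[k] / n.
        row = [1 if rng.random() < m / n else 0 for m in counts]
        # Open Poisson(alpha / n) new columns, all set to 1 for this row.
        new = sample_poisson(rng, alpha / n)
        row.extend([1] * new)
        counts = [m + z for m, z in zip(counts, row)] + [1] * new
        rows.append(row)
    width = len(counts)
    # Pad earlier rows with zeros so the matrix is rectangular.
    return [row + [0] * (width - len(row)) for row in rows]

Z = sample_ibp(num_rows=6, alpha=2.0)
for row in Z:
    print(row)
```

Only finitely many columns are ever non-zero for a finite number of rows, which is how the infinite-column matrix is represented in practice.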