
Statistical signal processing

▪ Instructor
• Jun-Won Choi
• Office: Room 303-1, Engineering Center Annex (공업센터별관)
• Phone: 02-2220-2316
• Office hours: meetings can be scheduled personally via email
Topics to be covered
▪ Basic probability and random variables
▪ Random processes
▪ Detection theory (Bayesian, minimax, Neyman-Pearson, composite hypothesis)
▪ Bayesian estimation theory (MMSE, MAP, Kalman, forward-backward algorithm)
▪ Deterministic parameter estimation theory (MVUE, Cramér-Rao lower bound, sufficient statistics, ML, EM)
▪ Bayesian networks (graph-based inference)
▪ Sequential Monte Carlo methods (particle filter, MCMC)
Introduction
▪ Two approaches to signal processing
• Signal is a vector! → Vector signal processing
• Signal is a random process! → Statistical signal processing
Random variable
▪ Reference: H. Stark and J. W. Woods, Probability, Statistics, and Random Processes for Engineers, 4th ed., Pearson.
Definition of probability
▪ Sample space Ω = the set of all outcomes
▪ Individual outcome ω ∈ Ω
▪ Event E = subset of the sample space, assigned a probability P(E)
Basic probability
▪ Probability space (Ω, F, P)
• Ω: sample space
• F: event space
• P: probability measure
▪ Three axioms for the probability measure
• P(E) ≥ 0 for every event E
• P(Ω) = 1
• For mutually disjoint events E₁, E₂, ...,  P(∪ᵢ Eᵢ) = Σᵢ P(Eᵢ)
Basic probability
▪ Properties
• P(∅) = 0,  P(Eᶜ) = 1 − P(E)
• A ⊂ B  ⇒  P(A) ≤ P(B)
• P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Basic probability
▪ Independence
• Two events A and B are independent if and only if P(A ∩ B) = P(A)P(B)
▪ Conditional probability
• The probability of the event A given the event B: P(A|B) = P(A ∩ B) / P(B)
Probability
▪ Total probability
• For disjoint and exhaustive events {Eᵢ}:  P(B) = Σᵢ P(B|Eᵢ)P(Eᵢ)
Probability
▪ Example
• In a certain country:
• 60% of registered voters are Republicans
• 30% are Democrats
• 10% are Independents
• When those voters were asked about increasing military spending:
• 40% of the Republicans opposed it
• 65% of the Democrats opposed it
• 55% of the Independents opposed it
• What is the probability that a randomly selected voter in this country opposes increased military spending?
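• A worked solution using the total probability formula above, with O = "opposes increased spending" and R, D, I the party events:

  P[O] = P[O|R]P[R] + P[O|D]P[D] + P[O|I]P[I]
       = (0.40)(0.60) + (0.65)(0.30) + (0.55)(0.10)
       = 0.24 + 0.195 + 0.055 = 0.49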
Probability
▪ Bayes' formula

  P[Eⱼ|B] = P[B, Eⱼ] / P[B] = P[B|Eⱼ]P[Eⱼ] / P[B]

• Let {Eᵢ} be a set of disjoint and exhaustive events; then

  P[Eⱼ|B] = P[B|Eⱼ]P[Eⱼ] / Σᵢ P[B|Eᵢ]P[Eᵢ]
Probability
▪ Example
• We know that a registered voter opposed increased military spending. What is the probability that this voter is a Democrat?
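• A worked solution using Bayes' formula with the numbers from the previous example:

  P[D|O] = P[O|D]P[D] / P[O] = (0.65)(0.30) / 0.49 = 0.195 / 0.49 ≈ 0.398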
Definition of random variable
▪ A random variable X maps the original sample space Ω to the real line ℝ.
▪ An event in Ω is mapped to a subset B of ℝ.
▪ The inverse image under X of every Borel set B ⊂ ℝ must be an event.

Induced probability space: (ℝ, B, P_X)
Basic random variable
▪ Random variable
• Cumulative distribution function (CDF): F_X(x) = P[X ≤ x]
Cumulative distribution function
▪ Example
• A bus arrives at random in (0, T]. Let the random variable X denote the time of arrival. Suppose that the bus is equally (uniformly) likely to come at any time within (0, T]. What is F_X(x)?
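• A sketch of the answer, assuming X is uniform on (0, T]:

  F_X(x) = 0 for x ≤ 0,   F_X(x) = x/T for 0 < x ≤ T,   F_X(x) = 1 for x > T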
Probability density function
▪ Probability density function

  f_X(x) = dF_X(x) / dx

• Properties
  1. F_X(x) = ∫_{−∞}^{x} f_X(ξ) dξ = P[X ≤ x]
  2. P[x₁ < X ≤ x₂] = ∫_{x₁}^{x₂} f_X(ξ) dξ = F_X(x₂) − F_X(x₁)
Basic random variable
▪ Probability of an event A:  P[X ∈ A] = ∫_A f_X(x) dx
Probability mass function
▪ Probability mass function of a discrete random variable

  P_X(xᵢ) = Pr(X = xᵢ)

▪ CDF of a discrete random variable

  F_X(x) = P[X ≤ x] = Σ_{all xᵢ ≤ x} P_X(xᵢ)
Probability density function
▪ Gaussian random variable N(μ, σ²)

  f_X(x) = (1 / √(2πσ²)) exp(−(x − μ)² / (2σ²)),   −∞ < x < ∞

  ∫_{−∞}^{∞} f_X(x) dx = 1
Continuous random variables
▪ Rayleigh distribution

  f_X(x) = (x/σ²) e^{−x²/(2σ²)} u(x)

  Mean: σ√(π/2)    Variance: (2 − π/2)σ²

• If X ~ N(0, σ²) and Y ~ N(0, σ²) are independent, then √(X² + Y²) follows the Rayleigh distribution.

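This relationship is easy to check numerically. A minimal NumPy sketch; the value of σ, the seed, and the sample size are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n = 2.0, 1_000_000

# Two independent N(0, sigma^2) samples and their envelope sqrt(X^2 + Y^2).
x = rng.normal(0.0, sigma, n)
y = rng.normal(0.0, sigma, n)
r = np.sqrt(x**2 + y**2)

# Compare empirical moments against the Rayleigh formulas above.
print(r.mean(), sigma * np.sqrt(np.pi / 2))   # mean ~ sigma*sqrt(pi/2)
print(r.var(), (2 - np.pi / 2) * sigma**2)    # variance ~ (2 - pi/2)*sigma^2
```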
Continuous random variables
▪ Exponential distribution

  f_X(x) = (1/μ) e^{−x/μ} u(x)

  Mean: μ    Variance: μ²

• If X ~ N(0, σ²) and Y ~ N(0, σ²) are independent, then X² + Y² follows the exponential distribution with μ = 2σ².
Continuous random variables
▪ Uniform distribution

  f_X(x) = 1/(b − a) for a ≤ x ≤ b,  0 otherwise

  Mean: (a + b)/2    Variance: (b − a)²/12
Continuous random variables
▪ Laplacian distribution

  f_X(x) = (1/(√2 σ)) exp(−√2 |x| / σ)

  Mean: 0    Variance: σ²
Continuous random variables
▪ Chi-square distribution with k degrees of freedom (DOF)

  f_X(x) = (1 / (2^{k/2} Γ(k/2))) x^{k/2 − 1} e^{−x/2} u(x)

  Mean: k    Variance: 2k

• If X₁, ..., X_k are independent standard normal r.v.s, then

  Q = Σ_{i=1}^{k} Xᵢ²  follows the chi-square distribution.

▪ Gamma function

  Γ(n) = (n − 1)!  if n is a positive integer
  Γ(n) = ∫₀^∞ x^{n−1} e^{−x} dx  if n is a complex number with a positive real part
Continuous random variables
▪ Gamma distribution Γ(α, β) (α > 0, β > 0)

  f_X(x) = (β^α / Γ(α)) x^{α − 1} e^{−βx} u(x)

  α: shape parameter,  β: rate parameter

  Mean: α/β    Variance: α/β²
Discrete and mixed random variables
▪ Probability mass function for a discrete r.v.

  P_X(x) = P(X = x)
         = P(X ≤ x) − P(X < x)
         = F_X(x) − F_X(x⁻)

• If the CDF is a continuous function, P_X(x) = 0.

▪ For a discrete r.v.,

  F_X(x) = P[X ≤ x] = Σ_{all xᵢ ≤ x} P_X(xᵢ)
Discrete random variables
▪ Bernoulli random variable B

  P_B(k) = q for k = 0,  p for k = 1,  0 else

▪ Binomial random variable K with parameters n and p

  P_K(k) = (n choose k) p^k q^{n−k} for 0 ≤ k ≤ n,  0 else
Discrete random variables
▪ Poisson random variable X

  P_X(k) = (λ^k / k!) e^{−λ} for k ≥ 0,  0 else

▪ Geometric random variable K with parameters p > 0, q > 0 (p + q = 1)

  P_K(k) = p q^k for k ≥ 0,  0 else
Joint distribution
▪ Joint CDF

  F_{X,Y}(x, y) = P[X ≤ x, Y ≤ y]

▪ Joint PDF

  f_{X,Y}(x, y) = ∂²F_{X,Y}(x, y) / (∂x ∂y)
Joint distributions and densities
▪ Marginalization

  f_X(x) = ∫_{−∞}^{∞} f_{XY}(x, y) dy
  f_Y(y) = ∫_{−∞}^{∞} f_{XY}(x, y) dx
Joint distributions and densities
▪ Conditional density:  f_{X|Y}(x|y) = f_{XY}(x, y) / f_Y(y)
Joint distributions and densities
▪ Independent random variables

  F_{XY}(x, y) = F_X(x) F_Y(y)
  f_{XY}(x, y) = f_X(x) f_Y(y)

  f_X(x | Y = y) = f_X(x)
  f_Y(y | X = x) = f_Y(y)
Expected value of random variable
▪ Expected value or mean of a random variable X

  E[X] = ∫_{−∞}^{∞} x f_X(x) dx

▪ Expected value of the random variable Y = g(X)

  E[g(X)] = ∫_{−∞}^{∞} g(x) f_X(x) dx
Expected value of random variable
▪ Conditional expectation

  E[X | Y = y] = ∫_{−∞}^{∞} x f_{X|Y}(x|y) dx

• E[X | Y] is the MMSE estimate of X given Y.
Expected value of random variable
▪ Variance of a random variable

  VAR(X) = E[(X − E[X])²] = E[X²] − (E[X])²
Expected value of a random variable
▪ Expected value of Z = g(X, Y)

  E[Z] = ∫_{−∞}^{∞} z f_Z(z) dz = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f_{XY}(x, y) dx dy
Conditional expectation
▪ Conditional expectation of X given that the event B has occurred (continuous case)

  E[X | B] = ∫_{−∞}^{∞} x f_{X|B}(x | B) dx
Conditional expectation
▪ Property of conditional expectation (law of total expectation)

  E[Y] = E[E[Y | X]]
Joint moments
▪ Correlation of X and Y

  Cor(X, Y) = E[XY]

▪ Covariance of X and Y

  Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = E[XY] − μ_X μ_Y
Joint moments
▪ We say that X and Y are uncorrelated when Cov(X, Y) = 0, i.e.,

  E[XY] = E[X]E[Y]

▪ If X and Y are uncorrelated, then VAR(X + Y) = VAR(X) + VAR(Y)
Joint moments
▪ Schwarz's inequality:  (E[XY])² ≤ E[X²] E[Y²]
Joint moments
▪ If X and Y are independent, they are uncorrelated:

  f_{XY}(x, y) = f_X(x) f_Y(y)  ⇒  E[XY] = E[X]E[Y]

▪ The converse is not true!

Random vector
 A random vector contains multiple RVs.
 Consider 𝑛 dimensional random vector X  ( X 1 , X 2 ,..., X n )

• CDF FX ( x )  P [ X 1  x1 ,..., X n  x n ]

 n FX ( x )
• PDF f X (x ) 
 x1 ...  x n
Joint distribution and density
▪ Joint CDF of two random vectors X = (X₁, ..., X_n) and Y = (Y₁, ..., Y_m)

  F_{X,Y}(x, y) = P[X ≤ x, Y ≤ y]

• Marginalization

  f_X(x) = ∫_{−∞}^{∞} ... ∫_{−∞}^{∞} f_{X,Y}(x, y) dy₁ ... dy_m
Expectation vectors and covariance matrices
▪ The expected value of X = (X₁, ..., X_n)ᵀ

  μ = E[X] = (μ₁, ..., μ_n)ᵀ = (E[X₁], ..., E[X_n])ᵀ

  μᵢ = ∫ ... ∫ xᵢ f_X(x₁, ..., x_n) dx₁ ... dx_n = ∫ xᵢ f_{Xᵢ}(xᵢ) dxᵢ
Expectation vectors and covariance matrices
▪ The covariance matrix of X

  K = E[(X − μ)(X − μ)ᵀ]

• K is a symmetric matrix:

  K_ij = E[(Xᵢ − μᵢ)(Xⱼ − μⱼ)] = E[(Xⱼ − μⱼ)(Xᵢ − μᵢ)] = K_ji

• The diagonal entries are the variances:

  K = [ σ₁²  ...  K_1n ]
      [  ⋮    ⋱    ⋮   ]
      [ K_n1 ...  σ_n² ]
Expectation vectors and covariance matrices
▪ The correlation matrix R is given by

  R = E[XXᵀ]

▪ The covariance matrix K and the correlation matrix R are related by

  K = E[XXᵀ] − μμᵀ = R − μμᵀ
Expectation vectors and covariance matrices
▪ We call two random vectors X and Y uncorrelated if

  E[XYᵀ] = μ_X μ_Yᵀ

▪ We call two random vectors X and Y independent if

  f_{X,Y}(x, y) = f_X(x) f_Y(y)
Properties of covariance matrix
▪ An n×n matrix M is said to be positive semi-definite if, for all z,

  zᵀMz ≥ 0

▪ A matrix M is said to be positive definite if, for all z ≠ 0,

  zᵀMz > 0

▪ A covariance matrix K is always positive semi-definite.
Multidimensional Gaussian Law
▪ For a jointly Gaussian random vector X = (X₁, ..., X_n)ᵀ, the pdf is given by

  f_X(x) = (1 / ((2π)^{n/2} |K_XX|^{1/2})) exp(−(1/2)(x − μ)ᵀ K_XX⁻¹ (x − μ))

▪ If the elements of the random vector X are independent,

  f_X(x) = (1 / ((2π)^{n/2} σ₁ ... σ_n)) exp(−Σ_{i=1}^{n} (xᵢ − μᵢ)² / (2σᵢ²))

  K_XX = diag(σ₁², ..., σ_n²)
Multidimensional Gaussian Law
▪ Let X be an n-dimensional RV with covariance matrix K_XX and mean vector μ_X.
▪ Then Y = AX is an RV with mean vector μ_Y = Aμ_X and covariance matrix K_YY = A K_XX Aᵀ:

  μ_Y = E[Y] = E[AX] = A E[X] = A μ_X

  K_YY = E[(Y − μ_Y)(Y − μ_Y)ᵀ]
       = E[(AX − Aμ_X)(AX − Aμ_X)ᵀ]
       = A E[(X − μ_X)(X − μ_X)ᵀ] Aᵀ
       = A K_XX Aᵀ
Multidimensional Gaussian Law
▪ Example
• A zero-mean normal random vector X = (X₁, X₂)ᵀ has covariance matrix K_XX given by

  K_XX = [ 3  1 ]
         [ 1  3 ]

• If  Y = (1/√2) [ 1   1 ] X,  find the distribution of Y.
                 [ 1  −1 ]
Whitening transformation
▪ A random vector X is called "white" if K_XX = I.
▪ Consider a random vector X which has a covariance matrix K_XX. Assume that the eigendecomposition of K_XX is given by K_XX = UΛUᵀ.
▪ Then the linear transformation Y = Λ^{−1/2} Uᵀ X is white.

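A minimal NumPy sketch of this transformation; the covariance matrix (reused from the earlier example), seed, and sample size are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)

# Covariance to be whitened (the K_XX from the earlier example).
K = np.array([[3.0, 1.0],
              [1.0, 3.0]])

# Correlated zero-mean Gaussian samples, one column per sample.
X = rng.multivariate_normal(np.zeros(2), K, size=100_000).T

# Eigendecomposition K = U Lambda U^T (eigh exploits symmetry).
lam, U = np.linalg.eigh(K)

# Whitening transform Y = Lambda^{-1/2} U^T X.
Y = np.diag(lam ** -0.5) @ U.T @ X

print(np.cov(Y))   # approximately the 2x2 identity matrix
```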

Random process
▪ Reference: Peyton Z. Peebles, Probability, Random Variables and Random Signal Principles, McGraw-Hill.
Random process
▪ What is a random process?
• Introduce a time index to a random variable: X(t, s), where s is an outcome in the sample space; with s omitted we write X(t).
• A random process represents an ensemble of time functions.
• A realization of the ensemble is called a sample path.
Random process
▪ Several sample paths
• A random process becomes a random variable when the time index is fixed.
Random process
▪ Joint CDF

  F_X(x₁; t₁) = P(X(t₁) ≤ x₁)
  F_X(x₁, x₂; t₁, t₂) = P(X(t₁) ≤ x₁, X(t₂) ≤ x₂)
  F_X(x₁, ..., x_n; t₁, ..., t_n) = P(X(t₁) ≤ x₁, ..., X(t_n) ≤ x_n)

▪ Joint PDF

  f_X(x₁, ..., x_n; t₁, ..., t_n) = ∂ⁿF_X(x₁, ..., x_n; t₁, ..., t_n) / (∂x₁ ... ∂x_n)
Random process
▪ Expectation

  μ(t) = E[X(t)] = ∫_{−∞}^{∞} x f(x; t) dx

▪ Autocorrelation

  R_XX(t₁, t₂) = E[X(t₁)X(t₂)] = ∫∫ x₁ x₂ f(x₁, x₂; t₁, t₂) dx₁ dx₂
Random process
▪ A random process is said to be stationary if all its statistical properties do not change with time:

  f_X(x₁, ..., x_n; t₁, ..., t_n) = f_X(x₁, ..., x_n; t₁ + Δ, ..., t_n + Δ)

▪ A random process is said to be wide-sense stationary (WSS) if

  μ(t) = μ(t + Δ) = μ₀
  R_XX(t₁, t₂) = R_XX(t₁ + Δ, t₂ + Δ)
  R_XX(τ) ≜ R_XX(t₂ − t₁)

• Note that the autocorrelation is a function of only the time difference τ = t₂ − t₁.
Random process
▪ Exercise) Check whether the following random process is wide-sense stationary:

  X(t) = A cos(ω₀t + Θ),   Θ ~ Uniform(0, 2π)
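• A sketch of the check, averaging over Θ:

  E[X(t)] = (A/2π) ∫₀^{2π} cos(ω₀t + θ) dθ = 0

  R_XX(t, t + τ) = E[A² cos(ω₀t + Θ) cos(ω₀(t + τ) + Θ)] = (A²/2) cos(ω₀τ)

  Both are independent of t, so the process is WSS.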
Random process
▪ Ergodic process
• For a stationary process, consider sample (time) averages over a single sample path:

  x̄ = lim_{T→∞} (1/2T) ∫_{−T}^{T} x(t) dt

  R̄_XX(τ) = lim_{T→∞} (1/2T) ∫_{−T}^{T} x(t) x(t + τ) dt

• An ergodic process should satisfy the following two properties:

  x̄ = μ₀
  R̄_XX(τ) = R_XX(τ)
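As an illustration, the cosine process from the earlier exercise satisfies both properties; a NumPy sketch with assumed values of A, ω₀, the sampling step, and the lag:

```python
import numpy as np

rng = np.random.default_rng(2)

# One long sampled path of X(t) = A*cos(w0*t + theta), theta ~ Uniform(0, 2*pi).
A, w0, dt, n = 1.0, 0.5, 0.01, 200_000
theta = rng.uniform(0.0, 2 * np.pi)   # a single realization
t = np.arange(n) * dt
x = A * np.cos(w0 * t + theta)

# Time-averaged mean vs. the ensemble mean (0).
print(x.mean())

# Time-averaged autocorrelation at lag tau vs. (A^2/2)*cos(w0*tau).
k = 100                               # lag in samples
tau = k * dt
print(np.mean(x[:-k] * x[k:]), 0.5 * A**2 * np.cos(w0 * tau))
```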
Random process
▪ Properties of the autocorrelation function for a W.S.S. random process

  |R_XX(τ)| ≤ R_XX(0)
  R_XX(−τ) = R_XX(τ)
  R_XX(0) = E[X²(t)]
Random process
▪ Cross-correlation function

  R_XY(t₁, t₂) = E[X(t₁)Y(t₂)]

• When X and Y are jointly wide-sense stationary:

  R_XY(τ) = E[X(t)Y(t + τ)]

• When X and Y are independent:

  R_XY(τ) = E[X(t)]E[Y(t + τ)] = μ_X μ_Y
Random process
▪ W.S.S. Gaussian random process

  f_X(x₁, ..., x_n; t₁, ..., t_n) = (1 / √((2π)ⁿ |Σ|)) exp(−(1/2)(x − μ)ᵀ Σ⁻¹ (x − μ))

  E[X(t)] = μ

• The (i, j)-th entry of Σ is the auto-covariance at lag tᵢ − tⱼ:

  Σ(i, j) = C_XX(tᵢ − tⱼ) = R_XX(tᵢ − tⱼ) − μ²
Random process
▪ Poisson random process
• Example: the number of guests visiting a restaurant.
Random process
▪ Poisson random process
• First-order distribution

  P[X(t) = k] = ((λt)^k / k!) e^{−λt},   E[X(t)] = λt

• Second-order joint distribution

  P[X(t₁) = k₁, X(t₂) = k₂] = P[X(t₂) = k₂ | X(t₁) = k₁] P[X(t₁) = k₁]
                            = ((λt₁)^{k₁} e^{−λt₁} / k₁!) · ((λ(t₂ − t₁))^{k₂−k₁} e^{−λ(t₂−t₁)} / (k₂ − k₁)!)
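A small NumPy sketch of the independent-increments structure used in the second-order distribution above; λ, t₁, t₂, and the sample count are assumed values:

```python
import numpy as np

rng = np.random.default_rng(3)
lam, t1, t2, n = 2.0, 3.0, 5.0, 500_000

# Counts via independent increments:
# X(t1) ~ Poisson(lam*t1) and X(t2) - X(t1) ~ Poisson(lam*(t2 - t1)).
x_t1 = rng.poisson(lam * t1, n)
x_t2 = x_t1 + rng.poisson(lam * (t2 - t1), n)

print(x_t1.mean(), lam * t1)   # E[X(t1)] = lam*t1
print(x_t2.mean(), lam * t2)   # E[X(t2)] = lam*t2
```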
Random process
▪ Frequency analysis of random processes
• Frequency analysis of a deterministic waveform:

  X(jω) = ∫_{−∞}^{∞} x(t) e^{−jωt} dt
  x(t) = (1/2π) ∫_{−∞}^{∞} X(jω) e^{jωt} dω

• The Fourier transform would not exist for a random process: a sample path generally has infinite energy.
Random process
▪ Power spectral density (PSD)
• Consider a W.S.S. random process X(t).
• The PSD of X(t) is given by

  S_XX(ω) = ∫_{−∞}^{∞} R_XX(τ) e^{−jωτ} dτ

• Why is it called "power spectral density"?
Random process
▪ Power spectral density (PSD)
• Signal power of one sample path:

  P = lim_{T→∞} (1/2T) ∫_{−T}^{T} x²(t) dt

• Since P is a random variable, take the expectation:

  Power = E[P] = lim_{T→∞} (1/2T) ∫_{−T}^{T} E[x²(t)] dt

• Use Parseval's theorem with the truncated transform X_T(jω) = ∫_{−T}^{T} x(t) e^{−jωt} dt:

  ∫_{−T}^{T} x²(t) dt = (1/2π) ∫_{−∞}^{∞} |X_T(jω)|² dω

• Combining, the power density appears under the integral:

  Power = (1/2π) ∫_{−∞}^{∞} lim_{T→∞} E[|X_T(jω)|²] / (2T) dω = (1/2π) ∫_{−∞}^{∞} S_XX(ω) dω
Random process
▪ Properties of the PSD

  S_XX(ω) ≥ 0
  S_XX(−ω) = S_XX(ω) if X(t) is real
  S_XX(ω) is real

• The autocorrelation R_XX(τ) and the power spectral density S_XX(ω) are a Fourier transform pair:

  S_XX(ω) = ∫_{−∞}^{∞} R_XX(τ) e^{−jωτ} dτ
  R_XX(τ) = (1/2π) ∫_{−∞}^{∞} S_XX(ω) e^{jωτ} dω
Random process
▪ White noise process

  R_XX(τ) = (N₀/2) δ(τ)    S_XX(ω) = N₀/2
Random process
▪ Rayleigh fading channel
• The envelope is a Rayleigh r.v.:

  f_X(x) = x e^{−x²/2} u(x)
Random process
▪ Rayleigh fading channel
• Autocorrelation

  R_XX(τ) = J₀(2πf_d τ)

  where J₀ is the zeroth-order Bessel function and f_d is the maximum Doppler frequency.

• PSD (Jakes spectrum)

  S_XX(f) = 1 / (π f_d √(1 − (f/f_d)²))  for |f| ≤ f_d,  0 otherwise
Random process
▪ Martingale process:  E[X(t_{n+1}) | X(t₁), ..., X(t_n)] = X(t_n)
Random process
▪ Markov process
• The present state summarizes all of the past trajectory that is relevant to the future.
• Conditional independence: X and Z are conditionally independent given Y if

  P(X, Z | Y) = P(X | Y) P(Z | Y)

  Equivalently, P(Z | X, Y) = P(Z | Y).
Random process
▪ Markov process: the future is conditionally independent of the past given the present,

  P[X(t_{n+1}) ≤ x | X(t_n), ..., X(t₁)] = P[X(t_{n+1}) ≤ x | X(t_n)]
Random process
▪ Discrete-state Markov process
• (Trellis diagram: states Sunny / Cloudy / Rainy evolving from Day 1 to Day 4, with transitions between consecutive days.)
Random process
▪ Discrete-state Markov process
• Probability state vector: π(n) with entries πᵢ(n) = P[X_n = i]
• Probability transition matrix: P with entries P_ij = P[X_{n+1} = j | X_n = i], so that π(n+1) = π(n)P
Random process
▪ Using the chain rule and the Markov property, the joint distribution of a Markov process is given by

  P(X₁, ..., X_n) = P(X₁) ∏_{k=2}^{n} P(X_k | X_{k−1})

▪ If the Markov process is in its equilibrium distribution π, then π = πP, as illustrated in the sketch below.
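A small NumPy sketch of these relations on the weather chain from the trellis above; the transition probabilities are made-up values for illustration:

```python
import numpy as np

# Hypothetical transition matrix over (Sunny, Cloudy, Rainy);
# row i holds P[X_{n+1} = j | X_n = i].
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.4, 0.4]])

pi = np.array([1.0, 0.0, 0.0])   # Day 1: sunny with probability 1

# Propagate the state probability vector: pi(n+1) = pi(n) P.
for _ in range(30):
    pi = pi @ P

print(pi)        # converges to the equilibrium distribution
print(pi @ P)    # equilibrium check: pi P ~ pi
```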
Hidden Markov models
▪ Hidden Markov models (HMM)
• The Markov chain process Z_i is not observable.
• The observation Y_i depends only on Z_i (example: Y_i = Z_i + N_i).
• Various applications in signal processing and machine learning.
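A minimal simulation of this generative model (two hidden states; the transition matrix and noise level are made-up values):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical two-state chain Z_i in {0, 1}.
P = np.array([[0.95, 0.05],
              [0.10, 0.90]])
n, sigma = 200, 0.3

# Sample the hidden Markov chain.
z = np.empty(n, dtype=int)
z[0] = 0
for i in range(1, n):
    z[i] = rng.choice(2, p=P[z[i - 1]])

# Observations from the slide's example model Y_i = Z_i + N_i.
y = z + rng.normal(0.0, sigma, n)
print(y[:10])   # noisy view of the hidden state sequence
```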
Random process
▪ Cross power spectral density

  S_XY(ω) = ∫_{−∞}^{∞} R_XY(τ) e^{−jωτ} dτ
  S_YX(ω) = ∫_{−∞}^{∞} R_YX(τ) e^{−jωτ} dτ
Random process
▪ Linear system with random input

  X(t) → [ LTI system h(t) ] → Y(t)

  Y(t) = ∫_{−∞}^{∞} h(λ) X(t − λ) dλ = ∫_{−∞}^{∞} X(λ) h(t − λ) dλ
Random process
▪ Linear system with random input
• Expectation of the output:

  μ_Y = E[Y(t)] = ∫_{−∞}^{∞} h(λ) E[X(t − λ)] dλ = (∫_{−∞}^{∞} h(λ) dλ) μ_X

• Autocorrelation & PSD:

  R_YY(τ) = R_XX(τ) * h(−τ) * h(τ)
  S_YY(ω) = |H(jω)|² S_XX(ω)
Random process
▪ Linear system with random input (see the sketch below)
• Cross-correlation:

  R_XY(τ) = R_XX(τ) * h(τ)      R_YX(τ) = R_XX(τ) * h(−τ)

  S_XY(ω) = S_XX(ω) H(jω)      S_YX(ω) = S_XX(ω) H(−jω)
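A numerical check of S_YY(ω) = |H(jω)|² S_XX(ω); a sketch using SciPy's Welch estimator, with an arbitrary FIR filter and white Gaussian input as assumptions:

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(5)
n = 2**18

# White Gaussian input: S_XX is flat (here equal to 1).
x = rng.normal(0.0, 1.0, n)

# An arbitrary FIR impulse response h, assumed for illustration.
h = np.array([1.0, 0.5, 0.25])
y = signal.lfilter(h, [1.0], x)

# Welch PSD estimates of the input and output.
f, sxx = signal.welch(x, nperseg=1024)
_, syy = signal.welch(y, nperseg=1024)

# |H|^2 on the same frequency grid (rfft of h zero-padded to nperseg).
H2 = np.abs(np.fft.rfft(h, 1024)) ** 2

# The ratio S_YY/S_XX should track |H(jw)|^2 up to estimation noise.
print(np.max(np.abs(syy / sxx - H2)))
```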
