All real signals are random because any measurable signal is corrupted by
noise. When unknown parameters are not directly measurable, we go for
estimation and estimate them from known, measurable quantities. So some
data or measurements are a prerequisite for initiating estimation. Since noise
is a zero-mean process, averaging can reduce it to a large extent. The
steps in estimation are: modeling (a physical or mathematical model), data
collection, estimation of the unknown parameters, and finally validation of
the results. An example of estimation is DSB-SC signal transmission and reception.
In discrete-time estimation, the noise samples differ with time, so the estimated
parameter also varies with time. Uncertainty about which bit was transmitted is
the problem in discrete systems. The pdf of the estimate depends upon the
noise model, the form of the estimation function and the original signal structure.
The assumptions about noise are that it is Gaussian, which usually holds
due to the central limit theorem, and that it is zero-mean white, due
to which the noise samples are uncorrelated. Sometimes the whiteness assumption
is removed. If the noise is not zero mean, we can make it zero mean by
subtracting the mean. Since the noise is assumed to be Gaussian, x(n) will also
be Gaussian and so will the estimates (x(n) = s(n) + w(n)). The desired
properties of an estimate are that its mean, i.e. E(θ̂), should equal the true
parameter, and that its variance should be small.
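A minimal sketch of the averaging idea for the standard DC-level-in-WGN example (the values of A, σ and N below are just illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A, sigma, N = 5.0, 1.0, 10_000          # true DC level, noise std, sample count
x = A + sigma * rng.standard_normal(N)  # x[n] = A + w[n], w[n] ~ N(0, sigma^2)

A_hat = x.mean()                        # averaging suppresses the zero-mean noise
print(A_hat)                            # close to A; var(A_hat) = sigma^2 / N
```

The estimate's variance σ²/N shrinks as more samples are averaged, which is exactly why some data is a prerequisite for estimation.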
We can find many estimators, and among these many may be unbiased, but
only a few will have minimum variance. Unbiased means the expectation of the
estimate equals the original parameter. MVU estimators can also be called MMSE
unbiased estimators. An MVU estimator sometimes does not exist, for two possible
reasons: there may not be any unbiased estimator, or none of the unbiased
estimators has a uniformly minimum variance. So, to check whether the
estimator derived is indeed the MVUE, or to check whether some
other MVUE exists, we use the Cramer-Rao Lower Bound (CRLB) and see if some
estimator satisfies it; or we restrict ourselves to linear unbiased estimators and
find the MVLU. We cannot always find the MVU estimator.
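To see why unbiasedness alone is not enough, compare two unbiased estimators of a DC level in a toy setup (all values illustrative): using only the first sample versus averaging all samples. Both have the right mean, but very different variances.

```python
import numpy as np

rng = np.random.default_rng(7)
A, sigma, N, M = 1.0, 1.0, 20, 5000     # true level, noise std, samples, trials
x = A + sigma * rng.standard_normal((M, N))

est1 = x[:, 0]            # first sample only: unbiased, variance sigma^2
est2 = x.mean(axis=1)     # sample mean: unbiased, variance sigma^2 / N
print(est1.mean(), est1.var())   # both empirical means are near A,
print(est2.mean(), est2.var())   # but the variances differ by a factor of N
```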
The CRLB is a lower bound on the variance of any unbiased estimator: no
unbiased estimator can have a variance less than the CRLB. The CRLB tells us the
best we can ever expect to do with an unbiased estimator (study the
CRLB theorem). Some uses of the CRLB are feasibility studies (i.e. can we meet
our specifications?), judging proposed estimators, sometimes
providing the form of the MVU estimator, and demonstrating the importance of
physical or signal parameters to the estimation problem. (If the variance is zero,
then x is deterministic and there is no need for the CRLB.)
Using the curvature of the log-likelihood, we are measuring the accuracy: the
first derivative gives the slope, the second derivative gives the curvature, and
sharpness is measured using curvature. If the curvature increases, the pdf
becomes more concentrated, the accuracy increases, and the variance decreases.
This is why we take the second derivative in the CRLB theorem. Since this
curvature in general depends on the data, we average over the random
vector to get the average curvature.
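The curvature idea can be checked numerically for the DC level in WGN, where the second derivative of the log-likelihood works out to the constant −N/σ² (a sketch, with illustrative values):

```python
import numpy as np

rng = np.random.default_rng(1)
A, sigma, N = 2.0, 1.0, 50
x = A + sigma * rng.standard_normal(N)

def loglike(theta):
    # log p(x; theta) for WGN, dropping the theta-independent constant
    return -np.sum((x - theta) ** 2) / (2 * sigma**2)

# numerical second derivative (curvature) at the true parameter
h = 1e-4
curv = (loglike(A + h) - 2 * loglike(A) + loglike(A - h)) / h**2
print(curv)            # ≈ -N / sigma^2, so the CRLB is sigma^2 / N
```

Here the curvature happens not to depend on the data x, so no averaging over x is needed; in general it does, which is exactly why the expectation appears in the CRLB theorem.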
In finding the CRLB, if the second derivative still depends on x, it does not give
a uniform curvature; so we average out x by taking the
expected value with respect to x (otherwise this step can be skipped). The result
may still depend on θ, so we evaluate it at each specific value of θ desired.
Dependence on θ means the bound is non-uniform. The CRLB is defined for the
estimation problem and not for a particular estimator; so for a given estimation
problem the CRLB is unique. An estimator that is unbiased and attains the CRLB
is said to be an "efficient estimator". Not all estimators are efficient; not even
all MVU estimators are efficient. If the factorization test on the first partial
derivative fails, we cannot find the efficient estimator this way (but does it
exist? (doubt)).
The CRLB for signals is marked star. If the signal is very sensitive to a change
in the parameter, then the CRLB is small and we get a very accurate estimate.
Transformation of parameters: if α = g(θ), then CRLB_α = (∂g(θ)/∂θ)² · CRLB_θ.
Example: finding the estimate of the SNR (A²/σ²) given the estimates of A and the variance.
If θ has an efficient estimator, then α̂ = g(θ̂) is an efficient estimator of α if
g(θ) has the form g(θ) = aθ + b (proof marked star).
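A small sketch of the chain-rule bound for, say, α = A² with Â from the DC-level-in-WGN example, where CRLB_θ = σ²/N (numbers illustrative):

```python
import numpy as np

# CRLB for alpha = g(theta) = theta^2, given the CRLB for theta itself
A, sigma, N = 3.0, 1.0, 100
crlb_theta = sigma**2 / N          # CRLB for estimating the DC level A
dg = 2 * A                         # dg/dtheta evaluated at the true A
crlb_alpha = dg**2 * crlb_theta    # (dg/dtheta)^2 * CRLB_theta
print(crlb_alpha)                  # ≈ 0.36 for these illustrative values
```

Note that since g(θ) = θ² is not of the form aθ + b, efficiency of Â does not carry over to Â² exactly; the relation holds only asymptotically.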
For the vector case, instead of the variance we take the covariance matrix and
the Fisher information matrix. The diagonal elements of the inverse Fisher
information matrix give the CRLB bounds. The expression for the (m,n)th element
of the Fisher information matrix is marked star. The inverse Fisher information
matrix is positive semidefinite (star). CRLB definition from text or note star (in
the note after some pages).
The transformation-of-parameters formula for the vector parameter case is
obtained by finding the Jacobian (star).
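For reference, the starred expression for the (m,n)th element of the Fisher information matrix, and the resulting diagonal bound, are usually written as:

```latex
\left[\mathbf{I}(\boldsymbol{\theta})\right]_{mn}
  = -E\!\left[\frac{\partial^{2}\ln p(\mathbf{x};\boldsymbol{\theta})}
                   {\partial\theta_{m}\,\partial\theta_{n}}\right],
\qquad
\operatorname{var}(\hat{\theta}_{i}) \;\ge\; \left[\mathbf{I}^{-1}(\boldsymbol{\theta})\right]_{ii}.
```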
Linear model with WGN: since an MVU estimator is not always guaranteed, we
can define a class of models for which an MVU estimator can always be found.
That is, if the N-point data can be modeled as x = Hθ + w (where x is N×1, H is
N×p, θ is p×1, w is N×1), then an MVU estimator is guaranteed and we can find
it and the CRLB as usual. The MVU estimator is θ̂ = (HᵀH)⁻¹Hᵀx and the
variance or CRLB is σ²(HᵀH)⁻¹ (derivation marked star). (HᵀH)⁻¹Hᵀ is the
pseudo-inverse of H (we cannot invert H itself because it is not square). For
(HᵀH)⁻¹ to exist, H should have rank p: HᵀH is p×p, and it is invertible exactly
when H has full column rank. θ̂ is a linear transformation of the Gaussian
vector x, so θ̂ also has a Gaussian distribution, i.e. θ̂ ~ 𝒩(θ, σ²(HᵀH)⁻¹). The
estimator is also efficient, because the first derivative of the log-likelihood can
be written in the required factored form; if it can be written in that form, the
variance is the CRLB itself. By taking E(θ̂), we can see that it is unbiased.
Examples of linear models are curve fitting, DC level in AWGN, line fitting,
Fourier analysis, system identification, etc. (equations and matrices for curve
fitting, line fitting, Fourier analysis and system identification marked star). In
linear models with known signal components, i.e. if x = Hθ + s + w with s
known, take θ̂ = (HᵀH)⁻¹Hᵀ(x − s). If the noise is coloured, then
θ̂ = (HᵀC⁻¹H)⁻¹HᵀC⁻¹x and the CRLB or covariance is (HᵀC⁻¹H)⁻¹. For the DC
level in AWGN, H is the N×1 column of ones.
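A sketch of the linear-model MVU estimator for the line-fitting example (intercept, slope, noise level all illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N, sigma = 100, 0.5
A, B = 1.5, -0.7                         # true intercept and slope (illustrative)
n = np.arange(N)

H = np.column_stack([np.ones(N), n])     # N x 2 observation matrix for a line
theta = np.array([A, B])
x = H @ theta + sigma * rng.standard_normal(N)

# MVU estimator for the linear model: theta_hat = (H^T H)^{-1} H^T x
theta_hat = np.linalg.solve(H.T @ H, H.T @ x)
cov = sigma**2 * np.linalg.inv(H.T @ H)  # CRLB / covariance of theta_hat
print(theta_hat, np.diag(cov))
```

Using `np.linalg.solve` on the normal equations avoids forming (HᵀH)⁻¹ explicitly for the estimate itself, which is the numerically preferred route.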
BLUE: sometimes an MVU estimator does not exist or cannot be found. Till
now we have assumed the pdf and derived all estimators from it, but sometimes
the pdf may not be available. So we go for a suboptimal estimator, the BLUE. For
the BLUE, we need to know only the first and second moments of the pdf. The
estimator is restricted to be linear in the data. There may be a large number of
linear estimators, but the best linear unbiased estimator, or BLUE, is the one
which is unbiased and has minimum variance. For the DC level in AWGN, the
BLUE is the MVU estimator itself, so the BLUE is optimal there; but in some
cases it is suboptimal. The prerequisites of the BLUE are: H must be
deterministic to get unbiasedness, the input sequence has to be deterministic,
and the noise must have zero mean with a known positive definite covariance
matrix C. Then θ̂_BLUE = (HᵀC⁻¹H)⁻¹HᵀC⁻¹x and the variance is (HᵀC⁻¹H)⁻¹.
The data are weighted by C⁻¹, so if a measurement is noisier it is given less
weight, and we get a better result. The BLUE is optimal (i.e. the MVUE) only if
the noise is Gaussian (important points of the BLUE derivation marked star).
Also, here we have assumed that the data model is linear; if it is actually not
linear, then the estimator is not optimal even if the noise is Gaussian. (Note: in
the linear model of the previous paragraph, the estimator with the C⁻¹ term is
still efficient as long as the noise is Gaussian; it is only for non-Gaussian noise
that this form reduces to being merely the best linear estimator.)
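The weighting behaviour of the BLUE can be sketched for a DC level in zero-mean noise whose per-sample variances are known but unequal (illustrative setup):

```python
import numpy as np

rng = np.random.default_rng(3)
N, A = 200, 4.0                              # sample count and true DC level
var_n = 0.5 + 2.0 * rng.random(N)            # known, unequal noise variances
x = A + np.sqrt(var_n) * rng.standard_normal(N)

H = np.ones((N, 1))                          # DC level: H is a column of ones
C_inv = np.diag(1.0 / var_n)                 # C^{-1}: noisier samples weigh less

# BLUE: theta_hat = (H^T C^{-1} H)^{-1} H^T C^{-1} x
A_blue = np.linalg.solve(H.T @ C_inv @ H, H.T @ C_inv @ x)[0]
var_blue = np.linalg.inv(H.T @ C_inv @ H)[0, 0]   # = 1 / sum(1/var_n)
print(A_blue, var_blue)
```

The BLUE's variance 1/Σ(1/σₙ²) is never worse than that of the plain sample mean, which is the "better result" the weighting buys.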
MLE: if the pdf is known, then first we try to find the MVUE, and if it cannot be
found we go for the MLE; in that case the MLE is the best practical estimator. It
is optimal for large enough data records and it is a "turn-the-crank" method, so
the MLE is asymptotically unbiased and asymptotically efficient; for the MVU
estimator we have no such general method. For small data records it is not
necessarily optimal, and it can also be computationally complex. It is found by
locating the parameter value that maximizes the log-likelihood function. An
example of the MLE when the MVUE does not exist is the DC level in noise
whose variance equals the DC value A; in this case the MLE is asymptotically
efficient. If a truly efficient estimator exists, then the ML procedure finds it, i.e.
θ̂_ML ~ 𝒩(θ, I⁻¹(θ)). The data size N needed for these asymptotic results to
hold is found by Monte Carlo simulations.
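For the DC-level example with noise variance equal to A, solving the likelihood equation gives the closed form Â = −1/2 + √(1/4 + (1/N)Σ x²[n]). A small Monte Carlo sketch (illustrative values) showing the estimate tightening around the true A as N grows:

```python
import numpy as np

rng = np.random.default_rng(4)
A, M = 2.0, 2000                 # true parameter; number of Monte Carlo trials
for N in (10, 100, 1000):
    x = A + np.sqrt(A) * rng.standard_normal((M, N))   # noise variance = A too
    A_ml = -0.5 + np.sqrt(0.25 + np.mean(x**2, axis=1))
    print(N, A_ml.mean(), A_ml.var())   # mean -> A, variance -> 0 as N grows
```

Runs like this are exactly how one checks, empirically, at what N the asymptotic (unbiased, efficient) behaviour kicks in.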
(Useful Gaussian moment: if A ~ 𝒩(μ, σ²), then E(A⁴) = μ⁴ + 6μ²σ² + 3σ⁴.)
If we find the MLE of the phase of a sinusoid (here we cannot find the MVUE),
we get the implementation of a correlator: the estimate comes out as a tan⁻¹ of
the ratio Q/I of the quadrature and in-phase components. (We find the MLE by
minimizing something like an error; see the phase-estimation example to see
how (star).) In the invariance property of the MLE (i.e. the MLE of a
transformed parameter α = g(θ)), the transformation may or may not be
one-to-one. If it is not one-to-one, then we have to take the modified likelihood
function: for each α we take the likelihood over all values of θ that map to that
α, take the maximum of these, and then maximize the result over α. See the
text for the property (Theorem 7.2). If we do not have a closed-form expression
for the MLE, we go for numerical methods: a brute-force method (computing
p(x; θ) on a fine grid of values), a greedy maximization algorithm (which may
not converge, or may converge to a local maximum, so the initial guess is
important), or the Newton-Raphson method (equation star).
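A sketch of the Newton-Raphson iteration for an MLE with no closed form; the Cauchy location parameter used here is my own illustrative choice, not an example from the text:

```python
import numpy as np

rng = np.random.default_rng(5)
theta_true, N = 1.0, 500
x = theta_true + rng.standard_cauchy(N)   # Cauchy location: no closed-form MLE

def d1(t):
    # first derivative of the Cauchy log-likelihood w.r.t. the location t
    u = x - t
    return np.sum(2 * u / (1 + u**2))

def d2(t):
    # second derivative (the curvature used in the Newton step)
    u = x - t
    return np.sum(2 * (u**2 - 1) / (1 + u**2) ** 2)

t = np.median(x)                          # a good initial guess matters here
for _ in range(20):                       # Newton-Raphson: t <- t - l'(t)/l''(t)
    t = t - d1(t) / d2(t)
print(t)                                  # close to theta_true for large N
```

Starting from the median keeps the iteration in the basin of the global maximum; a poor start can send Newton-Raphson to a local maximum or diverge, which is the caveat noted above.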
For the vector parameter case, also study Theorem 7.3. The vector MLE is also
asymptotically unbiased and efficient, and the invariance property also holds. If
the model is linear and Gaussian, we need not go for the MLE because it will
give the MVU estimator itself, i.e. θ̂ = (HᵀC⁻¹H)⁻¹HᵀC⁻¹x with covariance
(HᵀC⁻¹H)⁻¹.
Least squares: here we cannot establish that the estimator is efficient, because
we do not know the pdf or the MVU estimator. The weighted LS criterion is
J = Σ_{n=0}^{N−1} w_n (x[n] − s[n])², where w_n is the weight given to the nth
sample.
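A minimal sketch of the weighted LS solution for a DC level, with inverse-variance weights chosen purely for illustration (for s[n] = A, minimizing Σ wₙ(x[n] − A)² gives A = Σ wₙx[n] / Σ wₙ):

```python
import numpy as np

rng = np.random.default_rng(6)
N, A = 100, 3.0
var_n = np.where(np.arange(N) < 50, 0.1, 4.0)   # half the samples much noisier
x = A + np.sqrt(var_n) * rng.standard_normal(N)

w = 1.0 / var_n                       # inverse-variance weights (my choice)
A_wls = np.sum(w * x) / np.sum(w)     # minimizes sum_n w_n (x[n] - A)^2 over A
print(A_wls)
```

No pdf is assumed anywhere in this computation, which is the point of LS: it works from the data model alone, at the cost of not being able to claim efficiency.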