
A sub-band-based feature reconstruction approach for robust speaker recognition
By: RAJAT SHUKLA (1209531025)
SACHIN PAL (1209531027)
GUIDED BY: Mrs. SHILPI SHUKLA

INTRODUCTION
Speaker recognition is the identification of a person from the characteristics of his or her voice (voice biometrics). It is also called voice recognition. There is a difference between speaker recognition (recognizing who is speaking) and speech recognition (recognizing what is being said).
In automatic speaker or speech recognition, the lack of robustness has remained a major challenge.

HUMAN SPEECH
Human speech contains numerous discriminative features that can be used to identify speakers.
Speech contains significant energy from zero frequency up to around 5 kHz.
The objective of automatic speaker recognition is to extract, characterize, and recognize the information about speaker identity.
The properties of the speech signal change markedly as a function of time.

ROBUSTNESS
Robustness is the ability of a computer system to
cope with errors during execution.
Robustness can also be defined as the ability of an
algorithm to continue operating despite abnormalities
in input, calculations, etc.
In communication, robustness means that the strength of the speech signal at the receiver's end should be the same as when it was originally transmitted.
NO DATA SHOULD BE LOST OR DAMAGED.

FEATURE RECONSTRUCTION
Feature extraction starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps and, in some cases, leading to better human interpretation.
Feature extraction is related to dimensionality reduction.
A commonly used dimensionality reduction technique is PCA (Principal Component Analysis).
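
For illustration, the sketch below extracts the Mel log-spectral feature vectors that the block diagram later operates on. It is a minimal sketch assuming the librosa library is available; the 16 kHz sampling rate, 24 Mel channels, 512-point FFT, and 10 ms hop are illustrative choices, not values taken from these slides.

```python
import numpy as np
import librosa  # assumed available; any Mel filterbank implementation would do

def mel_log_spectra(wav_path, n_mels=24):
    """Return one Mel log-spectral feature vector per frame, shape (frames, n_mels)."""
    y, sr = librosa.load(wav_path, sr=16000)   # speech energy lies well below 8 kHz
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=512, hop_length=160, n_mels=n_mels)  # 10 ms hop at 16 kHz
    return np.log(mel + 1e-10).T               # log compression; epsilon avoids log(0)
```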

PCA
PCA was invented in 1901 by Karl Pearson.
The main purposes of a principal component analysis are to analyze the data to identify patterns and to use those patterns to reduce the dimensions of the dataset with minimal loss of information.

General information on PCA

Approximation of the data matrix: X = TP + E, where X is the data matrix, T is the scoring matrix, P is the loading matrix, and E is the noise.

General steps of PCA:
* Pretreatment of data: scaling
* Calculate the covariance / correlation matrix
* Calculate the eigenvalues and eigenvectors (PC1, PC2, which constitute the loading matrix)
* Calculate the scores: [T] = [X][P^T]^(-1)

[Figure: data points in the (X1, X2) plane enclosed by an ellipse whose axes are PC1 and PC2. PC2 is orthogonal to PC1. The eigenvalues (λ1 and λ2) determine the lengths of the major and minor axes of the ellipse. Q1, the slope of the major axis, is the ratio of the elements of the eigenvector corresponding to the larger eigenvalue λ1; Q2, the slope of the minor axis, is the ratio of the elements of the eigenvector corresponding to the second-largest eigenvalue λ2.]
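
As an illustration, here is a minimal NumPy sketch of the steps listed above. It is a sketch under stated assumptions, not the authors' implementation: because the loading matrix P has orthonormal columns, the score step [T] = [X][P^T]^(-1) reduces to T = XP, which is what the code computes. All function and variable names are our own.

```python
import numpy as np

def pca(X, n_components=2):
    """PCA via the steps above: scale, covariance, eigen-decomposition, scores."""
    # Pretreatment: mean-center and scale each variable to unit variance
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Covariance matrix of the scaled data
    C = np.cov(Xs, rowvar=False)
    # Eigenvalues and eigenvectors; the leading eigenvectors (PC1, PC2, ...)
    # form the columns of the loading matrix P
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]      # sort by decreasing eigenvalue
    P = eigvecs[:, order[:n_components]]   # loading matrix
    T = Xs @ P                             # scores: T = XP (orthonormal loadings)
    return T, P, eigvals[order]
```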

BASIC BLOCK DIAGRAM

Clean speech: extract Mel log-spectral features
→ Divide into 2 sub-bands (SB1: channels 1 to P/2; SB2: channels P/2+1 to P)
→ Execute reconstruction on SB1 and SB2 (increased robustness)
→ Recombine the reconstructed vectors
→ Speech output
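
The slides do not spell out the reconstruction step itself, so the sketch below substitutes a PCA-based projection per sub-band (keep the leading principal components and discard the noise term E) as one plausible instance of the pipeline; the function names and the n_keep parameter are hypothetical.

```python
import numpy as np

def subband_reconstruct(features, n_keep=8):
    """Split (frames, P) Mel log-spectral vectors into two sub-bands,
    reconstruct each, then recombine, mirroring the block diagram."""
    P = features.shape[1]                                     # number of Mel channels
    sb1, sb2 = features[:, :P // 2], features[:, P // 2:]     # SB1, SB2
    return np.hstack([reconstruct(sb, n_keep) for sb in (sb1, sb2)])

def reconstruct(X, n_keep):
    """Project onto the n_keep leading principal components and back,
    discarding the noise subspace (the E term in X = TP + E)."""
    mu = X.mean(axis=0)
    Xc = X - mu
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    Pk = eigvecs[:, np.argsort(eigvals)[::-1][:n_keep]]       # top-n_keep loadings
    return Xc @ Pk @ Pk.T + mu                                # X ≈ TP^T, noise removed
```

With the 24-channel features from the earlier sketch, each sub-band has P/2 = 12 channels, so keeping 8 components per sub-band discards only the lowest-variance directions.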
