
2006

Speech/Audio Signal Processing in MATLAB/Simulink



J.-S. Roger Jang
CS Dept., Tsing Hua Univ., Taiwan
http://www.cs.nthu.edu.tw/~jang
jang@cs.nthu.edu.tw


About Me
Experiences:
1993-1995: The MathWorks, Inc.
1995-now: CS Dept., Tsing Hua Univ., Taiwan

Research interests:
Speech/Audio Signal Processing, Fuzzy Logic, Neural Networks, Pattern Recognition, Biometric Identification, Document Classification, Web-based Technologies

Programming languages:
MATLAB, C, JavaScript, VBScript, Perl

2010/8/26


Outline
Wave file manipulation
Reading, writing, recording ...

Time-domain processing
Delay, filtering, sptools

Frequency-domain processing
Spectrogram

Pitch determination
Auto-correlation, SIFT, AMDF, HPS ...

Others
Formant estimation, speech coding

Toolbox/Blockset Used
MATLAB
Simulink
Signal Processing Toolbox
DSP Blockset


MATLAB Primer
Before you start, you need to get familiar with MATLAB. Please read the MATLAB Primer at the following page:
http://neural.cs.nthu.edu.tw/jang/demo/demoDownload.asp
Exercise:
1. Please plot two curves y=sin(2*t) and y=cos(3*t) in the same figure.
2. Please plot x vs. y where x=sin(2*t) and y=cos(3*t).

To Read a Wave File


To read an MS .wav file (PCM format only): wavread
y = wavread(file)
[y, fs, nbits] = wavread(file)
[y, fs, nbits, opts] = wavread(file)
[...] = wavread(file, n)
[...] = wavread(file, [n1, n2])

If the wav file is stereo, y will be a two-column matrix.


To Read a Wav File


Example (wavRead01.m):
[y, fs] = wavread('singapore.wav');
plot((1:length(y))/fs, y);
xlabel('Time in seconds');
ylabel('Amplitude');

Exercise
1. Plot the waveform of rrrrr.wav. Use MATLAB's zoom button to find where the consecutive curled R's occur.
2. Plot the two-channel waveform in flanger.wav.

Solution to the Previous Exercise


wavRead02.m:
[y, fs] = wavread('flanger.wav');
subplot(2,1,1), plot((1:length(y))/fs, y(:,1));
subplot(2,1,2), plot((1:length(y))/fs, y(:,2));


To Play Wav Files


To play sound using the Windows audio output device: wavplay, sound, soundsc
wavplay(y, fs)
wavplay(y, fs, 'async'): non-blocking call
wavplay(y, fs, 'sync'): blocking call
sound(y, fs)
soundsc(y, fs): autoscale the sound

Example (wavPlay01.m)
[y, fs] = wavread('rrrrr.wav');
wavplay(y, fs);

Exercise
Follow the example to play flanger.wav.

To Read/Play Using DSP Blocks


To read/play sound using DSP Blockset:
DSP Blockset/DSP Sources/From Wave File
DSP Blockset/DSP Sinks/To Wave Device

Example:

Frame-based operation!

Exercise:
Create a model as shown above.

Solution
Solution to the previous exercise: slWavFilePlay01.mdl


To Write a Wave File


To write MS wave files: wavwrite
wavwrite(y, fs, nbits, wavefile)
nbits must be 8 or 16.
y must have two columns for stereo data.
Amplitude values outside [-1,1] are clipped.

Example (wavWrite01.m)
[y, fs] = wavread('rrrrr.wav');
wavwrite(y, fs*1.2, 8, 'testout.wav');
!start testout.wav

Exercise
Try out the above example.

To Record a Wave File


To record wave files:
1. Use the recording utility under WinXP.
2. Use wavrecord under MATLAB.
3. Use From Wave Device under Simulink, under DSP Blockset/Platform Specific IO/Windows (Win32)

Example
1. Go ahead and try the WinXP recording utility!
2. Try wavRecord01.m
3. Try slWavFileRecord01.mdl

Exercise:
Try out the above examples.

Time-Domain Speech Signals


A typical time-domain plot of speech signals:

Amplitude: volume or intensity
Frequency: pitch


Changing Wave Playback Parameters


To control the play of a sound:
Normal: wavplay(y, fs)
High volume: wavplay(2*y, fs)
Low volume: wavplay(0.5*y, fs)
High pitch (and faster): wavplay(y, 1.2*fs)
Low pitch (and slower): wavplay(y, 0.8*fs)
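
As a rough sketch of what these playback variants do, the lines below compute the playing time and the perceived pitch for each setting without playing anything. The 440 Hz test tone, the 8 kHz sampling rate, and all variable names are assumptions made for illustration; the clipping check assumes the player limits samples to [-1, 1].

```matlab
% Illustrative only: a synthetic tone stands in for a recorded wave file.
fs = 8000;                          % assumed sampling rate (Hz)
toneHz = 440;                       % assumed test-tone frequency (Hz)
t = (0:fs-1)'/fs;                   % one second of sample times
y = 0.8*sin(2*pi*toneHz*t);

% Changing the playback rate changes duration and pitch together.
durNormal = length(y)/fs;           % played at fs
durFast   = length(y)/(1.2*fs);     % played at 1.2*fs: shorter
durSlow   = length(y)/(0.8*fs);     % played at 0.8*fs: longer
pitchFast = 1.2*toneHz;             % tone heard when played at 1.2*fs
pitchSlow = 0.8*toneHz;             % tone heard when played at 0.8*fs

% Changing the amplitude changes volume; too much gain risks clipping.
yLoud  = 2*y;
yQuiet = 0.5*y;
clippedFraction = mean(abs(yLoud) > 1);   % share of samples beyond [-1, 1]

% To listen, uncomment one of these:
% sound(y, fs); sound(y, 1.2*fs); sound(y, 0.8*fs);
```

The numbers make the trade-off visible: raising the playback rate raises the pitch only by shortening the sound, which is why the take-home exercise on keeping the time span is not trivial.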

Exercise:
Try wavPlay01.m and trace the code. Create wavPlay02.m such that you can record your own voice on the fly.


Time-Domain Signal Processing


Take-home exercise:
How to get a high pitch with the same time span?


Synthetic Sounds
Use a sine wave generator (under DSP blocksets) to produce sounds
Single frequency:

Multiple frequencies:

Amplitude modulation:

Exercise:
Create the above models.
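
For readers without Simulink at hand, the three kinds of synthetic sound can also be sketched in plain MATLAB. This is not the model from the slide; the frequencies (440 Hz, 660 Hz, a 5 Hz modulator), the modulation depth, and the variable names are assumptions chosen for illustration.

```matlab
% Illustrative only: command-line counterparts of the three sine-source models.
fs = 8000;                           % assumed sampling rate (Hz)
t = (0:fs-1)'/fs;                    % one second of sample times

% Single frequency
single = sin(2*pi*440*t);

% Multiple frequencies: sum of two sines, scaled to stay within [-1, 1]
multi = (sin(2*pi*440*t) + sin(2*pi*660*t))/2;

% Amplitude modulation: a slow 5 Hz envelope applied to a 440 Hz carrier
depth = 0.5;                         % assumed modulation depth
envelope = 1 + depth*sin(2*pi*5*t);
am = envelope .* sin(2*pi*440*t) / (1 + depth);

% To listen, uncomment:
% sound(single, fs); sound(multi, fs); sound(am, fs);
```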


Solution
Solution to the previous exercise: sineSource01 sineSource02 sineSource03


Delay in Speech/Audio
What is a delay in a signal?
y(n) --> y(n-k)

What effects can delay generate?


Echo
Reverberation
Chorus
Flanging


Single Delay in Audio Signal


Block diagram:
Input: u(n)
Delay: z^(-k) (a delay of k samples)
Output: y(n) = u(n) + a*u(n-k)

Simulink model:

Exercise:
Create the above model.
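
The single-delay relation y(n) = u(n) + a*u(n-k) can also be checked outside Simulink. The sketch below writes it as an FIR filter and applies it to a unit impulse so the echo is easy to see; the sampling rate, the 0.25 s delay, and the gain 0.5 are assumptions for illustration.

```matlab
% Illustrative only: single echo y(n) = u(n) + a*u(n-k) as an FIR filter.
fs = 8000;                      % assumed sampling rate (Hz)
delaySec = 0.25;                % assumed echo delay (s)
k = round(delaySec*fs);         % delay in samples
a = 0.5;                        % assumed echo gain

u = zeros(fs, 1);               % one second of input ...
u(1) = 1;                       % ... holding a unit impulse

b = [1; zeros(k-1, 1); a];      % coefficients of 1 + a*z^(-k)
y = filter(b, 1, u);            % y(n) = u(n) + a*u(n-k)

% The impulse reappears k samples later, scaled by a.
echoIndex = find(y ~= 0);
```

Replacing u with a recorded signal gives one audible echo after k/fs seconds.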

Multiple Delay in Audio Signal


How to create karaoke effects:
Input: u(n)
Delay: z^(-k), with feedback gain a
Output: y(n) = u(n) + a*u(n-k) + a^2*u(n-2k) + a^3*u(n-3k) + ...

Simulink model:


Multiple Delay in Audio Signal


Parameter values:
Feedback gain a < 1
Actual delay time = k/fs
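
One way to try these parameter values offline is to write the repeated echo as a feedback (recursive) filter, y(n) = u(n) + a*y(n-k), whose impulse response is 1, a, a^2, a^3, ... spaced k samples apart. The sketch below is an assumption about how the feedback loop can be expressed with filter, not a transcription of the Simulink model; fs, k, and a are illustrative values.

```matlab
% Illustrative only: repeated echoes via a feedback comb filter.
fs = 8000;                        % assumed sampling rate (Hz)
k = 1000;                         % assumed delay in samples
a = 0.6;                          % feedback gain, must satisfy a < 1
delaySec = k/fs;                  % actual delay time = k/fs

u = zeros(fs, 1);                 % one second of input ...
u(1) = 1;                         % ... holding a unit impulse

den = [1; zeros(k-1, 1); -a];     % coefficients of 1 - a*z^(-k)
y = filter(1, den, u);            % y(n) = u(n) + a*y(n-k)

% Echo amplitudes at n = 1, k+1, 2k+1, 3k+1 should be 1, a, a^2, a^3.
echoes = y(1 + (0:3)'*k);
```

With a close to 1 the echoes die out slowly; with a at or above 1 they would never decay, which is why the gain must stay below 1.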

Exercise:
Create the above model and change some parameters to see their effects.
Modify the model to take microphone input (so you can start singing karaoke now!)
Use a configurable subsystem to include all possible input files and the microphone. (See next page.)


Multiple Delay in Audio Signal


How to use a configurable subsystem block?
1. Create a library (say, wavinput.mdl)
2. Get a Configurable Subsystem block
3. Fill the dialog box with the library name


Audio Flanging
Flanging sound:
A sound similar to the sound of a jet plane flying overhead, or a "whooshing" sound
Pitch modulation due to a variable delay

Simulink demo:
dspafxf.mdl (all platforms)
dspafxf_nt.mdl (for 95/98/NT)


Audio Flanging
Simulink model:

Original spectrogram:

Modified spectrogram:


Signal Processing Using sptool


To invoke sptool, type sptool.


Speech Production
How is speech produced?
Speech is produced when air is forced from the lungs through the vocal cords (glottis) and along the vocal tract.

Analogy to System Theory:


Input: air forced into the vocal cords
Output: media vibration
System (or filter): vocal tract
Pitch frequency: frequency of the input
Formant frequency: resonant frequency

Source Filter Model of Speech


The source-filter model of speech production:
Speech is split into a rapidly varying excitation signal and a slowly varying filter. The envelope of the power spectra contains the vocal tract information.

Two important characteristics of the model are the fundamental (pitch) frequency (f0) and the formants (F1, F2, F3, ...).


Frame Analysis of Speech Signal


Speech waveform:

Zoom in

Overlap Frame
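
A minimal sketch of frame blocking with overlap follows. The frame size of 256 samples and the overlap of 128 samples are assumed values (the slide does not give them), and a synthetic tone stands in for a speech recording.

```matlab
% Illustrative only: split a signal into overlapping frames, one per column.
fs = 8000;                               % assumed sampling rate (Hz)
y = sin(2*pi*200*(0:fs-1)'/fs);          % stand-in for a speech signal

frameSize = 256;                         % assumed frame length (samples)
overlap   = 128;                         % assumed overlap (samples)
hop       = frameSize - overlap;         % advance between frame starts

numFrames = floor((length(y) - overlap)/hop);
frames = zeros(frameSize, numFrames);
for i = 1:numFrames
    startIdx = (i-1)*hop + 1;
    frames(:, i) = y(startIdx : startIdx + frameSize - 1);
end

% Consecutive frames share 'overlap' samples:
% the tail of frame i equals the head of frame i+1.
```

Each column can then be analyzed on its own (energy, autocorrelation, spectrum), which is the basis of the short-time methods on the following slides.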


Spectrogram
Spectrogram (specgram.m) displays short-time frequency contents:

Waveform:

Spectrogram:
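
To show what "short-time frequency contents" means, the sketch below builds a magnitude spectrogram by hand with fft instead of calling specgram, so it needs no toolbox. The frame size, hop, Hamming window formula, and the 1 kHz test tone are assumptions for illustration.

```matlab
% Illustrative only: magnitude spectrogram from windowed FFTs of each frame.
fs = 8000;                                   % assumed sampling rate (Hz)
t = (0:fs-1)'/fs;
y = sin(2*pi*1000*t);                        % 1 kHz test tone

frameSize = 256;                             % assumed frame length
hop = 128;                                   % assumed hop size
win = 0.54 - 0.46*cos(2*pi*(0:frameSize-1)'/(frameSize-1));  % Hamming window

numFrames = floor((length(y) - frameSize)/hop) + 1;
S = zeros(frameSize/2 + 1, numFrames);       % rows: frequency, columns: time
for i = 1:numFrames
    idx = (i-1)*hop + (1:frameSize)';
    X = fft(y(idx) .* win);
    S(:, i) = abs(X(1:frameSize/2 + 1));     % keep 0 .. fs/2
end

freqAxis = (0:frameSize/2)'*fs/frameSize;    % frequency of each row (Hz)
timeAxis = ((0:numFrames-1)*hop + frameSize/2)/fs;  % frame centers (s)

[peakVal, peakBin] = max(S(:, 1));
peakFreq = freqAxis(peakBin);                % strongest frequency in frame 1

% To display, uncomment:
% imagesc(timeAxis, freqAxis, 20*log10(S + eps)); axis xy;
% xlabel('Time in seconds'); ylabel('Frequency in Hz');
```

For the steady tone every column looks the same; for speech the columns change over time, and the changing pattern is what the spectrogram displays.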


Real-time Spectrogram
Try dspstfft_win32:

Spectrum:

Spectrogram:


Pitch and Formants


Pitch and formants can be defined visually:
First formant F1
Second formant F2
Pitch period = 1/f0


Spectrogram Reading
Spectrogram reading tutorial:
http://cslu.cse.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html

Waveform:

Spectrogram:


Pitch Determination Algorithms


Time-domain:
Auto-correlation
AMDF (Average Magnitude Difference Function)
Gold-Rabiner algorithm (1969)

Frequency-domain:
Cepstrum (Noll 1964)
Harmonic product spectrum (Schroeder 1968)

Others:
SIFT (Simple inverse filter tracking)
Maximum likelihood
Neural network approach


Autocorrelation of Each Frame


Let s(k) be a frame of size 128.
Compare s(k) with its shifted copy s(k-L), for example L = 30:
x(30) = dot product of the overlapped part = sum(s(31:128).*s(1:98))
Autocorrelation x(L): the lag of the peak (30 in this example) is the pitch period.
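
The same computation can be sketched for every lag at once. The frame below is synthetic, with an assumed period of 32 samples at an assumed 8 kHz sampling rate, and the lag search range of 16 to 64 is also an assumption; real speech frames are noisier and usually need extra care such as clipping or normalization.

```matlab
% Illustrative only: pitch period of one frame from its autocorrelation.
fs = 8000;                        % assumed sampling rate (Hz)
N = 128;                          % frame size, as on the slide
period = 32;                      % assumed true pitch period (samples)
s = sin(2*pi*(1:N)'/period);      % synthetic periodic frame

maxLag = 64;                      % assumed largest lag to test
x = zeros(maxLag + 1, 1);         % x(L+1) holds the autocorrelation at lag L
for L = 0:maxLag
    % dot product of the overlapped part of s(k) and s(k-L)
    x(L + 1) = sum(s(L+1:N) .* s(1:N-L));
end

minLag = 16;                      % skip small lags near the lag-0 maximum
[peakVal, idx] = max(x(minLag+1 : maxLag+1));
pitchLag = minLag + idx - 1;      % lag of the peak, in samples
pitchHz = fs/pitchLag;            % pitch frequency f0
```

Because fewer samples overlap as L grows, the peaks shrink at larger lags, which here helps the first peak win over its multiples.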


Autocorrelation via DSP Blockset


Real-time autocorrelation demo:

Exercise:
Construct the above model and try it.

Pitch Tracking via Autocorrelation


Real-time pitch tracking via autocorrelation: pitch2.mdl


Formant Analysis
Characteristics of formants:
Formants are perceptually defined.
The corresponding physical property is the set of resonance frequencies of the vocal tract.
Formant analysis is useful because the positions of the first two formants largely identify a vowel.

Computation methods:
Peak picking on the smoothed spectrum
Peak picking on the LP spectrum
Factoring for the LP roots
Fitting of mixture of Gaussians
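
As a small illustration of the "LP roots" approach, the sketch below creates a signal with one known resonance, estimates linear prediction coefficients from its autocorrelation, and reads the resonance frequency off the angle of the complex root. The 700 Hz resonance, the pole radius, the order of 2, and the variable names are assumptions; real speech would need a higher LP order and a windowed frame, and only base MATLAB functions (filter, toeplitz, roots) are used here instead of a toolbox routine.

```matlab
% Illustrative only: recover a known resonance from the roots of an LP polynomial.
fs = 8000;                               % assumed sampling rate (Hz)
f0 = 700;                                % assumed resonance ("formant") in Hz
r = 0.95;                                % assumed pole radius (sets bandwidth)
theta = 2*pi*f0/fs;
aTrue = [1, -2*r*cos(theta), r^2];       % all-pole filter with poles r*exp(+-j*theta)

u = zeros(1024, 1);
u(1) = 1;
s = filter(1, aTrue, u);                 % signal shaped by the resonance

p = 2;                                   % LP order (one resonance = one pole pair)
R = zeros(p + 1, 1);                     % autocorrelation at lags 0..p
for k = 0:p
    R(k + 1) = sum(s(1:end-k) .* s(1+k:end));
end

% Normal (Yule-Walker) equations: toeplitz(R(0..p-1)) * a = -R(1..p)
aEst = [1; -(toeplitz(R(1:p)) \ R(2:p+1))];

poles = roots(aEst);
poles = poles(imag(poles) > 0);          % keep one root of each conjugate pair
formantHz = angle(poles)*fs/(2*pi);      % root angle -> frequency in Hz
```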

Formant Analysis
Track Draw:
A package for formant synthesis with options to sketch formant tracks on a spectrogram.
http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html

Formant Location Algorithm
MATLAB code by Michelle Jamrozik
http://ece.clemson.edu/speech/files.htm


Speech Waveform Coding


Time domain coding
PCM: Pulse Code Modulation
DPCM: Differential PCM
ADPCM: Adaptive Differential PCM (dspadpcm.mdl)
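
A toy comparison of PCM and DPCM may help fix the idea. The sketch below quantizes a test tone directly (PCM) and then quantizes only the difference from the previous reconstructed sample (DPCM) with the same step size. It is a simplified illustration under assumed values (8 bits, a 200 Hz tone, a previous-sample predictor) and is not the scheme used in dspadpcm.mdl; ADPCM would additionally adapt the step size over time.

```matlab
% Illustrative only: uniform PCM versus first-order DPCM with the same step.
fs = 8000;                          % assumed sampling rate (Hz)
t = (0:fs-1)'/fs;
y = 0.9*sin(2*pi*200*t);            % slowly varying test signal

nbits = 8;                          % assumed PCM word length
step = 2/2^nbits;                   % quantizer step for the range [-1, 1]

% PCM: quantize each sample on its own.
pcmCodes = round(y/step);
yPcm = pcmCodes*step;

% DPCM: quantize the difference from the previous reconstructed sample.
dCodes = zeros(size(y));
yDpcm = zeros(size(y));
pred = 0;                           % predictor state: last reconstructed sample
for n = 1:length(y)
    d = y(n) - pred;                % prediction error
    dCodes(n) = round(d/step);      % code that would be transmitted
    yDpcm(n) = pred + dCodes(n)*step;
    pred = yDpcm(n);
end

pcmRange  = max(abs(pcmCodes));     % largest PCM code magnitude
dpcmRange = max(abs(dCodes));       % largest DPCM code magnitude
```

Both reconstructions stay within half a step of the input, but the DPCM codes span a much smaller range, so they could be sent with fewer bits per sample.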

Frequency domain coding


Sub-band coding
Transform coding

Speech Coding in MATLAB
http://www.eas.asu.edu/~speech/education/educ1.html

Conclusions
Ideal tools for speech/audio signal processing:
MATLAB
Simulink
Signal Processing Toolbox
DSP Blockset

Advantages:
Reliable functions: well-established and tested
Visible graphical algorithm design tools
High-level programming language yet C-compatible
Powerful visualization capabilities
Easy debugging
Integrated environment



References
[1] Discrete-Time Processing of Speech Signals, by Deller, Proakis and Hansen, Prentice Hall, 1993
[2] Fundamentals of Speech Recognition, by Rabiner and Juang, Prentice Hall, 1993
[3] Effects Explained, http://www.harmonycentral.com/Effects/effects-explained.html
[4] TrackDraw, http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html
[5] Speech Coding in MATLAB, http://www.eas.asu.edu/~speech/education/educ1.html