
SYNOPSIS REPORT

(July 2017 - September 2017)


On

AUTOMATIC TONIC IDENTIFICATION


IN INDIAN ART MUSIC

BY

Mahesh Y. Pawar
In
MTech. Signal Processing (Electronics &
Telecommunication Engineering)

MIS NO: 121697010


Under the guidance of
Dr. S. P. Mahajan

DEPARTMENT OF ELECTRONICS AND TELECOMMUNICATION ENGINEERING
COLLEGE OF ENGINEERING, PUNE
29th September 2017
Contents
1 Introduction 1

2 Literature Survey 3

3 Problem Statement 6

4 Objectives 6

5 Scope 6

6 Methodology 6

7 Resources required 7
7.1 Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
7.2 Toolbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

8 Plan of execution 8

9 Targeted Publications 9
9.1 Conferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
9.2 Journals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

List of Figures
1 Iterative Tonic identification System . . . . . . . . . . . . . . 2
2 Method for Pitch class distribution from Audio . . . . . . . . 4
3 Block Diagram of Proposed Method . . . . . . . . . . . . . . . 7

List of Tables
1 Expected time line of the project . . . . . . . . . . . . . . . . 8

1 Introduction
The tonic is a basic requirement for singers, especially in Indian Classical Music, where a drone instrument is played in the background and all other accompanying instruments are tuned with respect to it. The tonic is the base pitch used for raga rendition: once the tonic is established, the full pitch range is explored, and all the tones in the musical progression are constantly related to this tonic pitch. Automatic tonic identification is therefore an important prerequisite for automatic raga recognition. In this work, various features are extracted from both instrumental music and vocal excerpts and are then used for tonic candidate selection. The main goals of this work are to extract musical information from Indian Art Music, perform automatic tonic identification, and provide a real-time implementation.
Initially, the audio signal is analysed using a multi-pitch approach, and the resulting pitch estimates are used to construct pitch histograms. The tonic is identified from the prominent peaks of a pitch histogram: a classification technique is then used to select, among the candidate peaks, the one that represents the tonic. The second method also analyses the predominant melody pitches to estimate the tonic octave.
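As an illustration of this step, the following Python sketch (assuming frame-wise F0 estimates are already available from a multi-pitch analysis) builds a pitch histogram on a cent scale and picks its most prominent peaks as tonic candidates; the 55 Hz reference, 10-cent bin width and peak-prominence threshold are illustrative choices, not the exact settings of [1].

import numpy as np
from scipy.signal import find_peaks

def pitch_histogram(f0_hz, ref_hz=55.0, bin_cents=10, n_octaves=5):
    # Histogram of pitch values, in cents above ref_hz, over n_octaves octaves.
    f0 = np.asarray(f0_hz, dtype=float)
    f0 = f0[f0 > 0]                                   # drop unvoiced frames
    cents = 1200.0 * np.log2(f0 / ref_hz)             # Hz -> cents above reference
    edges = np.arange(0, n_octaves * 1200 + bin_cents, bin_cents)
    hist, _ = np.histogram(cents, bins=edges)
    centers = edges[:-1] + bin_cents / 2.0
    return hist, centers

def tonic_candidates(hist, centers, n_candidates=10):
    # Return the n most prominent histogram peaks as (cents, height) arrays.
    peaks, props = find_peaks(hist, prominence=1)
    order = np.argsort(props["prominences"])[::-1][:n_candidates]
    idx = peaks[order]
    return centers[idx], hist[idx]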
Fundamental frequency estimation is divided into pre-processing, F0 extraction and post-processing. Predominant melody extraction aims at estimating the predominant melody line from a polyphonic audio recording. For the pitch-class distribution, features such as pitch-class distributions (PCD), pitch-class profiles (PCP), harmonic pitch-class profiles and chroma features are used [1].
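The sketch below illustrates one simple way to compute a pitch-class distribution from frame-wise F0 values by folding them into a single octave; it is an illustrative computation under assumed settings (C4 reference, 12 bins), not the exact PCD/PCP feature of the cited work.

import numpy as np

def pitch_class_distribution(f0_hz, ref_hz=261.63, n_bins=12):
    # Fold voiced F0 values into one octave (1200 cents) and count per pitch class.
    f0 = np.asarray(f0_hz, dtype=float)
    f0 = f0[f0 > 0]
    cents = 1200.0 * np.log2(f0 / ref_hz) % 1200.0     # fold into one octave
    bins = np.round(cents / (1200.0 / n_bins)).astype(int) % n_bins
    pcd = np.bincount(bins, minlength=n_bins).astype(float)
    return pcd / max(pcd.sum(), 1.0)                   # normalise to a distribution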
Sinusoid extraction is performed by means of a spectral transform, spectral peak picking, and correction of the sinusoid amplitudes and frequencies. The extracted sinusoids are used to compute a salience function, a time-frequency representation indicating the salience of different pitches over time. Tonic pitch candidates are then computed from the salience function, each candidate being represented by a frequency and an amplitude value. Candidates are generated by detecting peaks of the salience function, computing a pitch histogram from them, and extracting the candidates as the peaks of this histogram [1, 5].
These peaks correspond to the prominent pitches of the lead instrument, the vocal part or other predominant accompanying instruments present in the recording at every point in time. Finally, the correct candidate is selected using both the frequency and the amplitude of each of the 10 or more extracted candidates.
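The following sketch shows the idea of a harmonic-summation salience function for a single analysis frame, computed from spectral peak frequencies and amplitudes; the bin resolution, number of harmonics and harmonic weighting (alpha) are illustrative parameters in the spirit of salience functions such as that of Salamon & Gomez, not the exact formulation of [1].

import numpy as np

def frame_salience(peak_freqs, peak_amps, fmin=55.0, n_bins=600,
                   bin_cents=10, n_harmonics=8, alpha=0.8):
    # Salience of candidate F0s spaced bin_cents apart, starting at fmin.
    salience = np.zeros(n_bins)
    cand_cents = np.arange(n_bins) * bin_cents          # candidate F0 grid (cents)
    for f, a in zip(peak_freqs, peak_amps):
        for h in range(1, n_harmonics + 1):
            # Cents of the implied F0 if this spectral peak were the h-th harmonic.
            f0_cents = 1200.0 * np.log2(f / (h * fmin))
            b = int(round(f0_cents / bin_cents))
            if 0 <= b < n_bins:
                salience[b] += (alpha ** (h - 1)) * a    # weight higher harmonics less
    return cand_cents, salience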
The second method is divided into two stages: tonic pitch-class identification, performed using a multi-pitch analysis similar to the first method, and tonic octave identification using predominant melody information. The second method is therefore applicable to both vocal and instrumental music. Its tonic pitch-class identification involves two main steps, multi-pitch analysis and candidate selection, and it differs from the first method in the candidate selection step, specifically in the class-labelling strategy used to train the classifier: the class label assigned to each instance is the best rank of the tonic pitch class amongst all the candidates.
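A hypothetical sketch of this candidate-selection step as a classification problem is given below, using a decision tree (as planned in the objectives). The feature set (pitch intervals and amplitude ratios of the candidates relative to the top histogram peak) and the training data X_train, y_train are assumptions for illustration only, not the exact features of [1].

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def candidate_features(cand_cents, cand_amps):
    # Intervals (cents) and amplitude ratios of candidates w.r.t. the top peak.
    intervals = np.asarray(cand_cents, dtype=float) - cand_cents[0]
    ratios = np.asarray(cand_amps, dtype=float) / cand_amps[0]
    return np.concatenate([intervals, ratios])

def train_candidate_classifier(X, y, max_depth=5):
    # X: one feature vector per recording; y: rank of the true tonic candidate.
    return DecisionTreeClassifier(max_depth=max_depth).fit(X, y)

# Example use with a hypothetical annotated training set:
# clf = train_candidate_classifier(X_train, y_train)
# tonic_rank = clf.predict(candidate_features(cents, amps).reshape(1, -1))[0]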
The overall tonic identification system aims at iteratively utilizing all the available data and obtaining results with maximum confidence; the tonic is identified using both audio data and metadata [1].

Figure 1: Iterative Tonic identification System

2 Literature Survey
Substantial work on tonic identification was done in the paper “Automatic Tonic Identification in Indian Art Music” [1], in which the authors proposed two methods and evaluated them on Indian Classical Music. The reported accuracy is 93.83% for excerpts sung by male singers, compared to 90.3% for those sung by female singers. The results are better for male singers because the training dataset is substantially populated by male performances (77.8%): the frequency range for the computation of pitch histograms was selected for the highest overall accuracy and, due to the dominance of male performances, the selected frequency range appears to be biased towards male singers. The most frequent error types were selecting the fifth (Pa) or the fourth (Ma) in another octave as the tonic; a fundamental cause of these errors is confusion between Ma-tuned and Pa-tuned drones. In tonic pitch identification, the percentage of Ma-type errors is considerably higher (6.02%) for songs performed by female singers. The first method obtains 93.05% accuracy overall.
In the paper “Automatic Carnatic Raga Identification using Octave Mapping and Note Quantization” [2], the authors proposed a method to identify Melakartha ragas from audio recordings of Carnatic Classical Music. The proposed method is divided into an acoustic model and a music language model. A modified autocorrelation method in the acoustic model detects the pitch frequency of the musical note being played. In the music language model, the detected pitch frequencies are mapped from different octaves to the middle octave (octave C4) and quantized to standard note frequencies, to account for the characteristic variations of swaras in Carnatic Music. From these frequencies, the unique frequencies with the highest occurrence counts are computed and the raga is identified. The algorithm was tested on recordings of human vocals, the veena and the piano; the raga identification accuracy for these recordings was 80.56% and 89.81% respectively.
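The octave mapping and note quantization described above can be illustrated with the short Python sketch below; it assumes equal-tempered note frequencies for simplicity, whereas the paper additionally accounts for Carnatic swara variations.

import numpy as np

C4 = 261.63   # Hz, reference for the middle octave

def map_to_middle_octave(f_hz):
    # Shift a positive (voiced) frequency by whole octaves until it lies in [C4, 2*C4).
    f = float(f_hz)
    while f < C4:
        f *= 2.0
    while f >= 2.0 * C4:
        f /= 2.0
    return f

def quantize_to_note(f_hz):
    # Snap a frequency in the C4 octave to the nearest of the 12 semitone frequencies.
    notes = C4 * 2.0 ** (np.arange(12) / 12.0)
    return notes[np.argmin(np.abs(notes - f_hz))]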
In the paper “Comparison of fundamental frequency detection methods and introducing simple self-repairing algorithm for musical applications” [7], the authors present a comparison of five commonly used methods for fundamental frequency detection, specifically in vocal and melodic instrument signals. The highest detection efficiency was reached by the autocorrelation function (ACF) and the modified autocorrelation function (MACF) [2]. A self-repairing algorithm is also described in the paper; it is a useful tool for correcting inaccurately detected fundamental frequencies with respect to the relevant musical notes.
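For reference, a minimal autocorrelation-based F0 estimator for a single voiced frame is sketched below; the search range and the simple arg-max lag selection are illustrative simplifications of the ACF method compared in [7].

import numpy as np

def acf_pitch(frame, sr, fmin=80.0, fmax=500.0):
    # Estimate F0 (Hz) of one voiced frame from the peak of its autocorrelation.
    frame = np.asarray(frame, dtype=float) - np.mean(frame)
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)                         # smallest lag = highest F0
    lag_max = min(int(sr / fmin), len(acf) - 1)      # largest lag = lowest F0
    lag = lag_min + np.argmax(acf[lag_min:lag_max])
    return sr / lag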
In the paper “Students-t Mixture model based multi-instrument recognition in polyphonic music” [6], the authors address the problem of multi-instrument recognition in polyphonic music signals, where individual instruments are modelled within a stochastic framework using Students-t Mixture Models (tMMs). A mixture of these instrument models is imposed on the polyphonic signal model, and the mixture weights are estimated in a latent-variable framework from polyphonic data using an Expectation Maximization (EM) algorithm. The output of the algorithm is an Instrument Activity Graph (IAG), from which it is possible to find out which instruments are active at a given time. An average F-ratio of 0.75 is obtained for polyphonies containing 2-5 instruments.
The article “Melody Extraction from Polyphonic Music Signals: Approaches, Applications and Challenges” [3] surveys melody extraction algorithms, which aim to produce a sequence of frequency values corresponding to the pitch of the dominant melody in a musical recording. The article provides a comprehensive comparative analysis of melody extraction algorithms and discusses the challenges of melody extraction, evaluation methodology, algorithmic performance and future development. The generic structure of multi-pitch analysis systems was proposed by Salamon & Gomez [4]: the multi-pitch analysis part of these systems is typically performed using a signal representation, salience function computation and F0-candidate extraction. Several salience-based algorithms for predominant melody extraction are presented in the article. A pitch salience function is computed from the audio signal; potential F0 candidates are extracted based on their prominence in this salience function; and, by applying tracking rules, the system identifies the F0 trajectory that best represents the melody line.
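As a simplified illustration of this final decision step, the sketch below greedily selects, frame by frame, the strongest salience bin while penalising large pitch jumps; real salience-based systems use full contour tracking, so this greedy rule is only a rough approximation of the idea, with illustrative parameter values.

import numpy as np

def greedy_melody(salience, bin_cents=10, max_jump_cents=200, penalty=0.5):
    # salience: (n_frames, n_bins) array -> melody line in cents, one value per frame.
    n_frames, n_bins = salience.shape
    melody = np.zeros(n_frames)
    prev = None
    for t in range(n_frames):
        s = salience[t].copy()
        if prev is not None:
            jump = np.abs(np.arange(n_bins) * bin_cents - prev)
            s[jump > max_jump_cents] *= penalty      # penalise large pitch jumps
        b = int(np.argmax(s))
        melody[t] = b * bin_cents
        prev = melody[t]
    return melody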
In the paper “A knowledge based signal processing approach to tonic identification in Indian classical music” [5], the authors describe several techniques for detecting the tonic pitch value in Indian Classical Music. The tonic is detected by processing the pitch histograms of Indian classical music recordings.

Figure 2: Method for Pitch class distribution from Audio

The processing of pitch histograms using group delay functions, and its ability to amplify certain traits of Indian music in the pitch histogram, is discussed. Three different strategies to detect the tonic are proposed, namely the concert method, the template matching method and the segmented histogram method.
In the article “Tonal Representations for Music Retrieval: from version identification to Query-by-Humming” [4], the authors compared the use of different music representations for retrieving alternative performances of the same musical piece. From the audio signal, descriptors representing its melody, bass line and harmonic progression are extracted using state-of-the-art algorithms, and combining these descriptors increases version detection accuracy. The authors also proposed a melody-based retrieval approach and demonstrated how melody representations are extracted.
In the paper “Real-time Audio Effects with DSP Algorithms and DirectSound” [8], the authors focus on the software implementation of DSP-oriented algorithms for different audio effects, such as echo, 3-tap echo, vibrato, tremolo and chorus. The main contribution of this work is a software application able to process, in real time, the audio signals received from an electric guitar through a PC sound card. The application also includes a digital filter class used to implement a low-pass filter as well as other filter types, a tone generator, and the option of storing the sound samples in a .wav file. The application is written in C# and uses the DirectSound library to process the sound samples at a low level.
Two methods for tonic identification were proposed by Sankalp Gulati et al. [1]. In previous work, the analysis was based only on the predominant melody and the drone was not exploited; only the tonic pitch class was identified, with no information about the tonic octave. The first method identifies the tonic pitch in a single step but is not applicable to instrumental music, because instrumental pieces are annotated only with the tonic pitch class. The second method is applicable to both vocal and instrumental music.

3 Problem Statement
The proposed system aims at retrieving musical information and developing an algorithm for automatic tonic identification for both instrumental and vocal excerpts of Indian Art Music. It further includes a real-time implementation of the proposed algorithm as an Android app.

4 Objectives
The objectives of this work are:

• Design an innovative algorithm for Automatic Tonic Identification.

• Use a decision tree classifier as a rule-based approach to identify the tonic octave from the melody histogram.

• Design the algorithm for real-time implementation.

• Develop an Android app.

5 Scope
The implementation of automatic tonic identification will clear the main obstacle on the path to automatic raga identification in Indian Classical Music, and will also help in intonation and motif analysis of music. Apart from this, the extracted musical information can lead to software environments that allow music lovers to learn music without a tutor. The various extracted features will also help in creating unbiased evaluation systems for examinations that involve the use of audio.

6 Methodology
The database comprises more than 300 full-length audio songs, containing both vocal (more than 200) and instrumental (more than 100) pieces. There will be two methods: the first identifies the tonic pitch and is applicable to vocal pieces, whereas the second identifies the tonic pitch class and caters to both vocal and instrumental excerpts. The second method also addresses octave estimation, which is required only for vocal excerpts and not for instrumental pieces. Using a multi-pitch histogram, the top 10 or more candidate frequencies are extracted for tonic selection. Finally, a simple MATLAB GUI is developed.
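A minimal end-to-end sketch of how these stages could be chained for a single recording is given below; it reuses the helper functions sketched in earlier sections (acf_pitch, pitch_histogram, tonic_candidates), uses placeholder frame parameters, and simply returns the most prominent candidate instead of the classifier-based selection of the proposed method.

def identify_tonic(audio, sr, frame_len=2048, hop=512):
    # Frame-wise F0 via the ACF sketch, then histogram-based candidate ranking.
    f0 = []
    for start in range(0, len(audio) - frame_len, hop):
        f0.append(acf_pitch(audio[start:start + frame_len], sr))
    hist, centers = pitch_histogram(f0)
    cand_cents, cand_amps = tonic_candidates(hist, centers, n_candidates=10)
    # Placeholder decision: take the most prominent candidate; the proposed
    # system would instead rank the candidates with the trained classifier.
    return 55.0 * 2.0 ** (cand_cents[0] / 1200.0)      # cents above 55 Hz -> Hz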

The method presented by Sankalp Gulati can fail to identify the correct tonic when the drone instrument is very quiet; this problem can be addressed by applying source separation upfront and using only the extracted signal component to identify the tonic pitch [1].

Figure 3: Block Diagram of Proposed Method

7 Resources required
7.1 Software
MATLAB R2013b, AUDACITY and PRAAT, FL Studio.

7.2 Toolbox
MIR Toolbox, Signal Processing Toolbox, Auditory Toolbox.

8 Plan of execution

Table 1: Expected time line of the project

Month            Expected work

Jul. to Aug.     • Literature survey
                 • Database collection

Sept. to Oct.    • Testing of vocal and instrumental pieces in AUDACITY and PRAAT
                 • Extracting features
                 • Implementation of the first method

Nov. to Dec.     • Implementation of the second method
                 • Testing the algorithm
                 • Testing the algorithm on the collected database

Jan. to Feb.     • Obtaining and improving results
                 • Research for further optimization
                 • Target for conference

Mar. to Apr.     • Creation of the MATLAB GUI, real-time implementation
                 • Target for conference

May to June      • Android app development

9 Targeted Publications
9.1 Conferences
• IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

• International Conference on Communication and Signal Processing (ICCSP).

9.2 Journals
• IEEE Signal Processing Magazine.

• IEEE Transactions on Audio, Speech, and Language Processing.

• IET Signal Processing.

• International Journal of Signal and Imaging Systems Engineering (INDERSCIENCE).

• Journal of the Audio Engineering Society.

• Signals and Communication, SADHANA, published by the Indian Academy of Sciences.

• Signal, Image and Video Processing (Springer).

• EURASIP Journal on Advances in Signal Processing (Springer).

References
[1] Sankalp Gulati, Ashwin Bellur, Justin Salamon, Ranjani H. G., Vignesh Ishwar, Hema A. Murthy and Xavier Serra, “Automatic Tonic Identification in Indian Art Music”, Journal of New Music Research, Vol. 43, Issue 1, 2014.

[2] Rohan T. Pillai and Shrinivas P. Mahajan, “Automatic Carnatic Raga Identification using Octave Mapping and Note Quantization”, in Proc. ICCSP 2017, India.

[3] Justin Salamon, Emilia Gomez, Daniel P. W. Ellis and Gael Richard, “Melody Extraction from Polyphonic Music Signals: Approaches, Applications and Challenges”, IEEE Signal Processing Magazine, 2014.

[4] Justin Salamon, Joan Serra and Emilia Gomez, “Tonal Representations for Music Retrieval: from version identification to Query-by-Humming”, International Journal of Multimedia Information Retrieval, 2012.

[5] Ashwin Bellur, Vignesh Ishwar, Xavier Serra and Hema A. Murthy, “A knowledge based signal processing approach to tonic identification in Indian classical music”, in Proc. of the 2nd CompMusic Workshop, Istanbul, Turkey, 2012.

[6] Harshavardhan Sundar, Ranjani H. G. and T. V. Sreenivas, “Students-t Mixture model based multi-instrument recognition in polyphonic music”, in Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013.

[7] Miroslav Stanek and Tomas Smatana, “Comparison of fundamental frequency detection methods and introducing simple self-repairing algorithm for musical applications”, Music Technology Group, University Pompeu Fabra, Spain, 2016.

[8] Petre Anghelescu, “Real-time Audio Effects with DSP Algorithms and DirectSound”, in International Conference ECAI, 6th edition, 2014.

Dr. S. P. Mahajan                                   Mahesh Y. Pawar
MTech. Project Guide                                MTech. Student

Department of Electronics & Telecommunication
COEP, Pune

Date- 29/09/2017
Place- COEP, Pune
