
Biomechanical modeling of register transitions and the role of vocal tract resonators a)


Isao T. Tokuda b)
School of Information Science, Japan Advanced Institute of Science and Technology, Nomi-city, Ishikawa 923-1292, Japan

Marco Zemke
Institut für Informatik, Humboldt University Berlin, Rudower Chaussee 25, 12489 Berlin, Germany

Malte Kob
Erich Thienhaus Institute, University of Music Detmold, Neustadt 22, 23756 Detmold, Germany

Hanspeter Herzel
Institute for Theoretical Biology, Humboldt University Berlin, Invalidenstraße 43, 10115 Berlin, Germany

(Received 3 March 2009; revised 27 July 2009; accepted 31 July 2009)


Biomechanical modeling and bifurcation theory are applied to study phonation onset and register transition. A four-mass body-cover model with a smooth geometry is introduced to reproduce characteristic features of chest and falsetto registers. Sub- and supraglottal resonances are modeled using a wave-reflection model. Simulations for increasing and decreasing subglottal pressure reveal that the phonation onset exhibits amplitude jumps and hysteresis, indicating a subcritical Hopf bifurcation. The onset pressure is reduced by vocal tract resonances. Hysteresis is also observed for the voice breaks at the chest-falsetto transition. Varying the length of the subglottal resonator has only minor effects on this register transition. In contrast, supraglottal resonances have a strong effect on the pitch at which the chest-falsetto transition occurs. An experiment on glissando singing shows that the supraglottis indeed influences the register transition.
© 2010 Acoustical Society of America. [DOI: 10.1121/1.3299201]
PACS number(s): 43.70.Bk, 43.70.Gr [DAB]

a) This paper is based on a talk presented at the 6th ICVPB, Tampere, Finland, 6-9 August 2008.
b) Author to whom correspondence should be addressed. Electronic mail: isao@jaist.ac.jp

J. Acoust. Soc. Am. 127 (3), March 2010, Pages: 1528-1536
0001-4966/2010/127(3)/1528/9/$25.00
Author's complimentary copy

I. INTRODUCTION

Voice registers have been introduced for perceptually distinct types of vocal qualities that can be maintained over some ranges of pitch and loudness (Titze, 2000). The perceptive classification can be accompanied by measurements of voice source parameters such as spectral slope or glottal open quotient (Henrich et al., 2005; Salomão and Sundberg, 2008). Characteristic features are vocal breaks at register transitions associated with pitch and amplitude jumps (Roubeau et al., 1987; Švec et al., 1999; Miller et al., 2000). It has been shown by Hirano et al. (1970) that the thyroarytenoid (TA) muscle and the cricothyroid (CT) muscle regulate register transitions. Consequently, the perceptive aspects of registers are inherently related to laryngeal features of vocal fold vibrations. In chest phonation the vocal folds are thick and glottal closure is complete, whereas in falsetto only the vocal fold edges vibrate (Vilkman et al., 1995). Despite extensive experimental investigations using acoustic signals (Hollien, 1974; Sundberg and Gauffin, 1979), electromyography (Shipp and McGlone, 1971), electroglottography (Henrich et al., 2005), and videokymography (Švec et al., 2008), many questions regarding register transitions remain open: What determines the pitch of involuntary register transition? What is the role of sub- and supraglottal resonances? Is there hysteresis at register transition?

In order to address these problems, biomechanical modeling can complement experimental studies. Even though some register-like phenomena have been described in two-mass models (Sciamarella and d'Alessandro, 2004; Zaccarelli et al., 2006), an appropriate representation of vibratory modes in chest and falsetto requires more advanced models such as the body-cover model of Story and Titze (1995), the two-dimensional model of Adachi and Yu (2005), and the three-mass model of Tokuda et al. (2007). Moreover, a smoothed glottal geometry improves the classical two-mass model of Ishizaka and Flanagan (1972) considerably (Pelorson et al., 1994; Lous et al., 1998).

In this paper, we use a four-mass body-cover polygon model (Tokuda et al., 2008) to study register transitions and the influence of resonators. Sub- and supraglottal resonances are described using the wave-reflection model (Kelly and Lochbaum, 1962; Liljencrants, 1985; Story, 1995; Titze, 2006). Our model simulations reveal hysteresis at the phonation onset and at the chest-falsetto transition, which is consistent with experimental data (Berry et al., 1996; Horáček et al., 2004). We find that vocal tract resonances have a pronounced effect on the chest-falsetto transition.

II. FOUR-MASS BODY-COVER POLYGON MODEL

There are complex high-dimensional models of vocal fold vibrations (Alipour et al., 2000; Titze, 2006; Gömmel et al., 2008; Zhang, 2009) describing anatomical and physiological details. However, many parameters are not precisely known and a comprehensive bifurcation analysis is difficult (Berry et al., 1994). Consequently, we constructed a low-dimensional model that is consistent with the basic experimental observations.

Our model is based on the body-cover differentiation proposed by Story and Titze (1995), a three-mass representation of the cover (Tokuda et al., 2007), and a smooth vocal fold geometry as in Lous et al. (1998). The advantage of dividing the cover layer into three masses is that they are suitable for representing the coexistence of different vibratory patterns, which may correspond to chest and falsetto registers (Tokuda et al., 2007). The chosen parameter values are similar to the values in these papers and are consistent with muscle activation rules (Titze and Story, 2002). A detailed discussion of the modeling and a complete set of equations and parameters are given in Appendixes A and B and in a recent thesis (Zemke, 2008).

FIG. 1. Schematic illustration of the four-mass polygon model of the vocal folds.

Figure 1 visualizes our four-mass polygon model. The three cover masses allow wave-like vibrations of the whole vocal folds with a complete closure of the glottis in chest-register simulation. For other parameter sets, high-pitched oscillations with diminished closure of the glottis are simulated, which resemble falsetto register. Figure 2 shows chest-like vibrations (fundamental frequency of 96 Hz) for the default parameters listed in Appendix A. The phase shifts in the opening areas shown in the upper graph allow the energy transfer from the air flow to the masses and contribute to a skewing of the glottal pulses in the lower graph.

FIG. 2. Chest-like vibration simulated by the default parameter setting. Time series of glottal areas $a_i = l h_i$ ($i = 1, 2, 3$; $l$ is the glottal length) between the three masses are shown in the upper graph (opening area in cm^2), whereas the corresponding volume flow U (in cm^3/ms) is shown in the lower graph. The horizontal axis is time in ms.

In order to simulate register transitions, we recall rules for controlling low-dimensional vocal fold models with muscle activation (Smith et al., 1992; Titze and Story, 2002). An active CT muscle decreases the vibrating mass and increases stiffness. We introduce a tension parameter T which mimics the CT muscle (Steinecke and Herzel, 1995). Masses are divided by T and stiffness parameters are multiplied by T, and thus the fundamental frequency of the model is roughly proportional to the parameter T. Increasing T from 1 to 7 allows pitch variation from 100 to 600 Hz exhibiting register transitions (detailed simulations in Sec. IV). Figure 3 shows an example of simulating falsetto-like vocal fold oscillations (fundamental frequency of 369 Hz) with the tension parameter T = 3.6. An almost sinusoidal glottal volume flow with weaker harmonics is observed.
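The scaling rule for the tension parameter can be illustrated with a small Python sketch. It is not the authors' simulation code: it treats each mass as an isolated, undamped mass-spring oscillator (an assumption made only for illustration), takes the default masses and stiffnesses from Table II in Appendix A, and uses the hypothetical helper names `apply_tension`, `natural_frequency_hz`, and `frequency_ratio`. For such an oscillator, dividing the mass by T and multiplying the stiffness by T scales the natural frequency by exactly T, which is the reason the coupled model's F0 is expected to be roughly proportional to T.

```python
import math

# Default values from Table II (masses in g, stiffnesses in N/m).
MASSES_G = {"m1": 0.009, "m2": 0.009, "m3": 0.003, "mb": 0.05}
STIFFNESS_N_PER_M = {"m1": 6.0, "m2": 6.0, "m3": 2.0, "mb": 30.0}


def apply_tension(masses_g, stiffness, tension):
    """Divide every mass by T and multiply every stiffness by T."""
    scaled_m = {name: m / tension for name, m in masses_g.items()}
    scaled_k = {name: k * tension for name, k in stiffness.items()}
    return scaled_m, scaled_k


def natural_frequency_hz(mass_g, stiffness_n_per_m):
    """Natural frequency of one isolated, undamped mass-spring oscillator.

    This is NOT the fundamental frequency of the coupled model; it only
    illustrates how the parameter scaling acts on a single oscillator.
    """
    mass_kg = mass_g * 1e-3
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2.0 * math.pi)


def frequency_ratio(name, tension):
    """Ratio of the scaled to the unscaled natural frequency of one mass."""
    scaled_m, scaled_k = apply_tension(MASSES_G, STIFFNESS_N_PER_M, tension)
    scaled = natural_frequency_hz(scaled_m[name], scaled_k[name])
    default = natural_frequency_hz(MASSES_G[name], STIFFNESS_N_PER_M[name])
    return scaled / default
```

Calling `frequency_ratio("m1", 3.6)` returns the tension value itself up to rounding, for any of the four masses. In the full model the proportionality is only approximate, because coupling, collisions, and the aerodynamic forces also shape the oscillation.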
III. MODELING VOCAL TRACT RESONANCES

For male speech, the linear source-filter theory (Fant, 1960) was quite successful (Stevens, 1999). However, as already shown by Ishizaka and Flanagan (1972), source-filter coupling is essential if the fundamental frequency F0 is comparable to the formant frequencies (Titze, 2008; Titze et al., 2008). Register transitions can be regarded as bifurcations of limit cycle oscillations (see Tokuda et al., 2007 for a detailed discussion). It is known that bifurcations, i.e., sudden transitions due to slow parameter variations, might depend strongly on system parameters. Thus, it is possible that even medium effects of the resonances (see Titze et al., 2008 for data on source-filter interactions) can shift register transitions drastically. It has been suggested that particular subglottal resonances govern involuntary register transitions (Titze, 2000; Zhang et al., 2006).

In order to study source-filter coupling, we implemented the wave-reflection model (Kelly and Lochbaum, 1962; Liljencrants, 1985; Story, 1995; Titze, 2006). Details of the simulations are given in Appendix B. For simplicity we approximate the resonators by uniform tubes characterized by their length and area. This simplification gives direct insight into how the resonance frequencies, given by the tube lengths, affect the location of register transitions.

FIG. 3. Falsetto-like vibration simulated with the tension parameter T = 3.6. (Upper graph: opening areas a1, a2, a3 in cm^2; lower graph: volume flow in cm^3/ms; horizontal axis: time in ms.)

IV. SIMULATION RESULTS

A. Simulation of gliding pitch

Register transitions and source-tract interaction are often studied using glissando singing (Henrich et al., 2005), experimental variation in vocal fold tension (Tokuda et al., 2007), or gliding of the fundamental frequency in biomechanical models (Titze, 2008). In Fig. 4, we compare a glissando of an untrained singer with simulations of a corresponding F0 glide in our four-mass model coupled to sub- and supraglottal resonators. The singer's glissando in Fig. 4(a) exhibits register transitions with frequency jumps around 3.3 and 7.8 s at slightly different pitches. There is an abrupt phonation onset at 1.2 s and a smoother offset with some irregularities. Glissando is simulated in Fig. 4(b) by varying our tension parameter T from 1 to 5.5 and then back. We find a frequency jump at 6.7 s (T = 3.8, F0 = 390 Hz) and a backward transition at 17 s (T = 3.3, F0 = 350 Hz). These differences between chest-falsetto and falsetto-chest transitions are a landmark of hysteresis (see Tokuda et al., 2007 for a detailed discussion of bifurcations leading to hysteresis). Hysteresis indicates that there are coexisting vibratory regimes (limit cycles) for a range of parameters. Moreover, hysteresis implies that there are voice breaks instead of the passaggi of trained singers.

In addition to register transitions, subharmonics are occasionally observed, e.g., at 8.7 and 14.1 s. It has been discussed earlier (Berry et al., 1996; Tokuda et al., 2007) that register transitions are often accompanied by nonlinear phenomena such as subharmonics and chaos. The gross features of the experimental and simulated F0 glides in Fig. 4 are similar. The study of hysteresis at phonation onset/offset requires a more detailed Hopf bifurcation analysis.

FIG. 4. (Color online) (a) Spectrogram of human voice with a gliding fundamental frequency F0. (b) Model simulation of the gliding F0. (Axes: time in s, frequency in Hz.)

B. Bifurcation diagrams of phonation onset

Phonation onset and offset can be studied in the context of Hopf bifurcations (see, e.g., Lucero, 1998; Mergell et al., 2000). Hopf bifurcation theory describes the onset of self-sustained oscillations due to parameter variations. Smooth oscillation onset is associated with a supercritical Hopf bifurcation, whereas hysteresis and amplitude jumps indicate a subcritical Hopf bifurcation (see, e.g., Guckenheimer and Holmes, 1983 for details).

In the case of phonation onset, increasing the subglottal pressure induces vocal fold oscillations above a critical value. In the simplified two-mass model (Steinecke and Herzel, 1995), no hysteresis was observed. In contrast, excised larynx experiments revealed a clear pressure difference between onset and offset values of about 0.2 kPa (Berry et al., 1996).

FIG. 5. Phonation onset. No vocal tract is attached to the vocal fold model in (a), whereas in (b) the vocal tract is attached. The tension parameter is set to T = 4. Local maxima of the opening area of the lower mass, $a_1 = l h_1$ (in cm^2), are plotted against the subglottal pressure (in kPa) for both increasing (crosses) and decreasing (circles) subglottal pressure. The small graphs inside each diagram represent the volume flow U (cm^3/ms) corresponding to each branch of the onset curve.

Such a hysteretic phonation onset/offset is shown in our model simulations in Fig. 5. Figure 5(a) refers to the model without vocal tract resonators. There are amplitude jumps at 0.52 kPa (onset) and at 0.34 kPa (offset). Differences between increasing and decreasing pressures indicate a subcritical Hopf bifurcation. On the other hand, Fig. 5(b) shows that both sub- and supercritical bifurcations can occur in the four-mass model to which sub- and supraglottal resonators are attached. As discussed in Titze (1988), the resonators reduce the threshold pressure. The phonation onset around 0.15 kPa seems rather smooth, implying a supercritical Hopf bifurcation. The generated phonation on the middle branch, however, has a relatively small amplitude and becomes unstable around 0.165 kPa, where it jumps to a more stable one with larger amplitude on the upper branch. This jump creates hysteresis in the model simulations.
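The link between a subcritical Hopf bifurcation and hysteresis can be reproduced with a generic textbook amplitude equation. The Python sketch below is not the four-mass model: it assumes the radial normal form dr/dt = mu*r + r^3 - r^5, where mu is a hypothetical control parameter playing the role of the subglottal pressure and r is the oscillation amplitude. All function names are illustrative. The parameter is stepped slowly upward and then downward, and the amplitude is followed from one step to the next, as in the increasing/decreasing pressure sweeps of Fig. 5.

```python
def settle(mu, r, dt=0.02, steps=10000):
    """Integrate dr/dt = mu*r + r**3 - r**5 with Euler steps; return final r."""
    for _ in range(steps):
        r += dt * (mu * r + r**3 - r**5)
    return r


def sweep(mus, r_start, floor=1e-3):
    """Follow the amplitude while mu is stepped through the list `mus`.

    A small floor keeps a residual perturbation alive so that the rest
    state can destabilize once it loses stability.
    """
    r = r_start
    amplitudes = []
    for mu in mus:
        r = max(settle(mu, r), floor)
        amplitudes.append(r)
    return amplitudes


def first_crossing(mus, amplitudes, level, rising):
    """First mu at which the amplitude is above (rising) or below the level."""
    for mu, amp in zip(mus, amplitudes):
        if (amp > level) if rising else (amp < level):
            return mu
    return None


# Control parameter grid: -0.40, -0.38, ..., 0.20 and back.
MUS_UP = [i / 100 for i in range(-40, 21, 2)]
MUS_DOWN = list(reversed(MUS_UP))

amps_up = sweep(MUS_UP, r_start=1e-3)
amps_down = sweep(MUS_DOWN, r_start=amps_up[-1])

onset_mu = first_crossing(MUS_UP, amps_up, level=0.5, rising=True)
offset_mu = first_crossing(MUS_DOWN, amps_down, level=0.5, rising=False)
```

In this normal form the rest state loses stability at mu = 0, while the large-amplitude branch persists down to the saddle-node point mu = -1/4, so `onset_mu` comes out larger than `offset_mu` and both states coexist in between. Replacing the right-hand side by mu*r - r**3 (the supercritical case) removes the jump and the hysteresis.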
C. Influence of vocal tract on chest-falsetto transition

We induced register transitions in our model by changing the tension parameter T gradually and by measuring the fundamental frequency F0, the amplitude of the opening area, and the number of colliding cover masses. We observed a steady increase in F0, with collision of all three cover masses at low F0 and collision of only the top mass at high F0. For simplicity, a binary classification is applied to draw the register transitions of Figs. 6-8 as follows: collision of all three cover masses is termed chest, whereas collision of fewer masses is termed falsetto. If only the upper masses collide, the open quotient (OQ), defined as OQ = (duration of the open phase of the glottis)/(pitch period), becomes large, as known from measurements in singers (Henrich et al., 2005). In our bifurcation diagrams, with the tension T as the bifurcation parameter, we plotted the fundamental frequency F0 on the x-axis instead of T, since this allows a direct comparison with glissando spectrograms.
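The binary register rule and the open quotient can be written down in a few lines. The following Python sketch only illustrates the definitions in the paragraph above. The function names are hypothetical, and the open phase is detected by a simple "area greater than zero" test on one sampled period, which is an assumption of this sketch and not necessarily how the quantities were evaluated in the simulations.

```python
def classify_register(colliding_cover_masses):
    """Binary rule: all three cover masses collide -> chest, fewer -> falsetto."""
    if not 0 <= colliding_cover_masses <= 3:
        raise ValueError("the model has exactly three cover masses")
    return "chest" if colliding_cover_masses == 3 else "falsetto"


def open_quotient(glottal_area, samples_per_period):
    """Open quotient of one pitch period of a uniformly sampled area signal.

    OQ = (duration of the open phase) / (pitch period); a sample counts
    as open when the glottal area is strictly positive.
    """
    one_period = glottal_area[:samples_per_period]
    if len(one_period) < samples_per_period:
        raise ValueError("signal is shorter than one period")
    open_samples = sum(1 for area in one_period if area > 0.0)
    return open_samples / samples_per_period
```

For example, a period of ten samples in which the area is positive for three samples gives an open quotient of 0.3, and `classify_register(1)` labels a cycle in which only one cover mass collides as falsetto.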
Figures 6 and 7 compare register transitions of the isolated four-mass model with those of the complete model including sub- and supraglottal resonances. In both cases, we find a pronounced hysteresis of about 30-40 Hz but relatively small jumps of the amplitudes. Most notable is the dramatic shift in the transition due to the coupling to vocal tract resonators. This observation reveals that the chest-falsetto transition depends sensitively on source-tract interactions.

FIG. 6. Register transition of the vocal fold model without vocal tract. Frequency domains for chest and falsetto registers are drawn in the upper graph, whereas the corresponding bifurcation diagrams are drawn in the lower graph. The curves were drawn by both increasing (dotted line with crosses) and decreasing (solid line with circles) the tension parameter T. In the bifurcation diagram, local maxima of the opening area of the lower mass, $a_1 = l h_1$ (in cm^2), were plotted against frequency (in Hz).

FIG. 7. Register transition of the vocal fold model with vocal tract. The default lengths for sub- and supraglottis are Lsub = 24.7 cm and Lsup = 17.5 cm, respectively.

FIG. 8. Dependence of the register transition on the supraglottal length. The right and left graphs show the case of short supraglottis (Lsup = 13.125 cm) and long supraglottis (Lsup = 21.875 cm), respectively.

In order to substantiate this finding, we varied the length of the sub- and supraglottal tubes. First, we decreased and increased the length of the subglottal tube by 25%. It turned out that there are only minor effects on the register transition. The length changes led to shifts in the transition point by 10-15 Hz (no graphs shown). In contrast, the supraglottal resonance had a profound effect: changing the default length of 17.5 cm to 75% or 125% induced major shifts in the frequencies at which register transitions are observed (see Fig. 8). Note that these vocal tract lengths are within the physiological range. For all considered vocal tract lengths, subharmonics were observed slightly above the chest-falsetto transition. Hysteresis with a frequency difference of about 30-40 Hz and subharmonics were robust features of our simulation. The pitch of the register transition was, however, strongly affected by the formant frequencies.
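To see how strongly the tube length moves the formants, the three supraglottal lengths can be compared with the standard quarter-wavelength estimate. The Python sketch below assumes an ideal lossless uniform tube that is closed at the glottis and open at the lips, with resonances F_n = (2n - 1) c / (4 L); this idealization is an assumption of the sketch, whereas the wave-reflection model of the paper also includes wall losses and a radiation load, so its resonances will differ somewhat. The sound speed c = 350 m/s is the value given in Appendix B, and the function name is hypothetical.

```python
SOUND_SPEED_CM_PER_S = 35000.0  # c = 350 m/s, as stated in Appendix B


def quarter_wave_resonances(length_cm, count=3):
    """Resonances (Hz) of an ideal uniform tube, closed at one end, open at the other."""
    if length_cm <= 0.0:
        raise ValueError("tube length must be positive")
    return [(2 * n - 1) * SOUND_SPEED_CM_PER_S / (4.0 * length_cm)
            for n in range(1, count + 1)]


# Default, long (125%), and short (75%) supraglottal tubes from the text.
DEFAULT_CM, LONG_CM, SHORT_CM = 17.5, 21.875, 13.125
first_formants = {
    "default": quarter_wave_resonances(DEFAULT_CM)[0],
    "long": quarter_wave_resonances(LONG_CM)[0],
    "short": quarter_wave_resonances(SHORT_CM)[0],
}
```

Under this idealization the first resonance lies at 500 Hz for the default tube, 400 Hz for the long tube, and about 667 Hz for the short tube, that is, within or close to the F0 range of 100 to 600 Hz covered by the tension sweep.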

V. EXPERIMENT

Our numerical study has shown that the supraglottal resonance has a strong influence on the pitch of the chest-falsetto transition. In order to examine this effect, we have carried out an experimental study of glissando on vowel /i/. This vowel has been chosen since /i/ is one of the vowels that provide the lowest first formant frequency F1, so that the F0-F1 interaction can be easily observed in the singing experiment. Four subjects were asked to perform F0 gliding from low F0 to high F0 and then back to low F0. Two recordings were obtained from each subject. The subjects were all untrained males with no evidence of laryngeal pathology. Both the speech signal and the electroglottographic (EGG) signal were simultaneously recorded.

FIG. 9. (Color online) (a) Spectrogram of a male subject on vowel /i/ with a gliding fundamental frequency F0. The first formant F1 estimated from the speech signal is indicated by a solid line. (b) OQ computed from the EGG signal of (a). A region of high OQ, corresponding to falsetto register, is separated from low OQ regions by dashed lines.

Figure 9 shows an example of the recording data. The first formant F1 was estimated from the speech signal by the conventional technique based on linear prediction analysis (McCandless, 1974). The spectrogram shows that, as the fundamental frequency F0 increases and crosses the first formant F1 = 250 Hz, a frequency jump is induced at t = 4.6 s. The same frequency jump is observed when the fundamental frequency decreases and crosses the first formant at t = 6.5 s. As indicated by the OQ computed from the EGG signal by the method of Henrich et al. (2005), these frequency jumps are accompanied by a register change. The regime of falsetto register, characterized by high OQ, is clearly distinguished from the regime of chest register by dashed lines in Fig. 9(b). The timing of the register change coincides with the F0-F1 crossings quite well. This implies that the register transition is induced by the source-filter interaction, which is known to become strong when F0 and F1 are close to each other (Story et al., 2000; Titze, 2008). Among the four subjects, coincidence of the register transition and the F0-F1 crossing has been observed in three subjects, whereas the other subject showed register transitions at a pitch much higher than the first formant. Our experimental study therefore implies that the supraglottal resonance has indeed a strong influence on the register transitions. This influence of course depends on the individual characteristics of the subject; well-trained singers should know how to sustain the chest register by avoiding the voice instability. Hence it is reasonable that not all subjects in our experiment showed a clear influence of the resonator on the registers.

To study how different vowels affect the register transition, we further collected statistical data from the three subjects who showed the influence of the resonator. Each subject was asked to perform F0 gliding on both vowels /a/ and /i/, where ten recordings were obtained for each vowel. Averages and standard deviations of the fundamental frequency F0 at which the register transition takes place when increasing F0 were computed from the ten data sets, as summarized in Table I. Because of the high variability of the register transitions, the standard deviations were relatively large. According to Welch's t-test, the mean frequency difference between /a/ and /i/ was statistically significant for subject I at the 1% level. For the other two subjects, the difference was not significant.

TABLE I. Average and standard deviation of the fundamental frequency F0 at which the register transition takes place when increasing F0. For each subject, ten recordings were collected.

Subject    I (Hz)      II (Hz)     III (Hz)
/a/        270 ± 19    271 ± 18    238 ± 20
/i/        240 ± 14    275 ± 13    231 ± 20

We remark that Titze et al. (2008) used the same experimental framework in the context of voice instability induced by source-tract coupling. Their main focus was, however, on the frequency jumps, and not much attention was paid to the register change. They found many frequency jumps induced by the F0-F1 crossing accompanied by hysteresis, in particular for male subjects. Our observation essentially agrees with their study.
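The Welch's t-test comparison of the two vowels reported in this section can be retraced from the summary statistics in Table I. The Python sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom from means, standard deviations, and sample sizes. It works from the rounded table values, so it can only approximate whatever was computed from the raw recordings, and the function and variable names are hypothetical.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    var_a = sd_a ** 2 / n_a  # squared standard error of sample A
    var_b = sd_b ** 2 / n_b  # squared standard error of sample B
    t_stat = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Table I: (mean, standard deviation) in Hz for /a/ and /i/, ten recordings each.
TABLE_I = {
    "I": ((270.0, 19.0), (240.0, 14.0)),
    "II": ((271.0, 18.0), (275.0, 13.0)),
    "III": ((238.0, 20.0), (231.0, 20.0)),
}

results = {
    subject: welch_t(a[0], a[1], 10, i[0], i[1], 10)
    for subject, (a, i) in TABLE_I.items()
}
```

For subject I the statistic is about 4.0 with roughly 16.5 degrees of freedom, well above the two-sided 1% critical value of about 2.9 for that many degrees of freedom, whereas for subjects II and III the magnitude of t stays below 1. This matches the significance pattern stated in the text. If SciPy is available, `scipy.stats.ttest_ind_from_stats(..., equal_var=False)` should give the same statistic together with a p-value.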

VI. SUMMARY AND DISCUSSION

From the nonlinear dynamics point of view, voice registers are distinct types of limit cycle oscillations. In this context, phonation onset refers to a Hopf bifurcation and register transitions are associated with bifurcations of limit cycles. In Tokuda et al. (2007), we characterized register transitions in excised larynx experiments and simulations by two-dimensional bifurcation diagrams. In that paper, we analyzed a simple three-mass cover model.

Here we introduced a more realistic four-mass polygon model coupled to sub- and supraglottal resonators. This model was used to study bifurcations at phonation onset and the chest-falsetto transition.

In Mergell et al. (2000), a smooth phonation onset was quantified using the normal form of a supercritical Hopf bifurcation. This model explained high-speed glottographic data in a reasonable way. In excised larynx experiments, however, amplitude jumps and hysteresis were reported at phonation onset (Berry et al., 1996). The present simulation without vocal tract exhibits a subcritical Hopf bifurcation, where the associated hysteresis is in good agreement with the excised larynx experiments. Another simulation with vocal tract showed that the coupling to the resonators lowers the phonation onset threshold. It remains to be tested experimentally under which circumstances the phonation onset can be regarded as a super- or subcritical Hopf bifurcation.

It is well known that register transitions in untrained singers are accompanied by vocal breaks (see, e.g., Švec and Pešák, 1994). In our simulations, we indeed find sudden jumps of pitch and amplitude while varying the tension parameter T smoothly. In computer simulations, the associated phenomenon of hysteresis can be studied more easily than in the glissando of singers. It turns out that our model exhibits clear pitch differences between chest-falsetto and falsetto-chest transitions.

In order to analyze the role of sub- and supraglottal resonators in register transition, we varied the length of the tubes by 25%. We find that the length of the subglottal tube has only minor effects on the register transition. In contrast, the supraglottal resonator strongly influences the pitch of the chest-falsetto transition. To examine this effect, a simple experiment has been carried out based on glissando singing on vowels /i/ and /a/. We have found a strong correlation between the register transition and the source-filter interaction for most of the subjects. A significant difference in the register transition point between the sung vowels was also detected in one subject despite high variability of the register breaks. These results provide a good indication that the supraglottal resonator indeed has an influence on the register transitions.

We remark that the present experiment is only preliminary and that further investigations with more recording trials and with more subjects having various singing backgrounds are indispensable. To further study the influence of the resonators, the following experiments may also be of great interest. (i) The vocal tract anatomy can affect the register shift, i.e., vocal tract length might be correlated with the pitch of register transitions. (ii) Singing into a tube (Hatzikirou et al., 2006) should shift register transitions.

We finally note that comparable frequency jumps are also frequent in a variety of animal vocalizations (Wilden et al., 1998; Tembrock, 1996; Riede and Zuberbühler, 2003).

ACKNOWLEDGMENTS

We thank Markus Hess and Frank Müller for the opportunity of acoustic and EGG recordings. We are grateful to Tobias Riede for stimulating discussions. This work was supported by SCOPE (Grant No. 071705001) of the Ministry of Internal Affairs and Communications (MIC), Japan.

APPENDIX A: DETAILED FOUR-MASS MODEL

Figure 1 shows a schematic illustration of the four-mass polygon model. Following the body-cover theory (Hirano, 1974; Hirano and Kakita, 1985), this model is composed of a body part $m_b$ and a cover part, which is divided into three masses $m_i$ (lower: $i = 1$, middle: $i = 2$, and upper: $i = 3$). Following the simplifications of (1) neglecting the cubic nonlinearities of the oscillators, (2) neglecting the additional pressure drop at the inlet and considering the Bernoulli flow only below the narrowest part of the glottis (Steinecke and Herzel, 1995), and (3) assuming symmetry between the left and the right vocal folds, the model equations read as

$$m_1 \ddot{y}_1 + r_1(\dot{y}_1 - \dot{y}_b) + k_1(y_1 - y_b) + \Theta(-h_1)\, c_1 \frac{h_1}{2} + k_{1,2}(y_1 - y_2) = F_1, \tag{A1}$$

$$m_2 \ddot{y}_2 + r_2(\dot{y}_2 - \dot{y}_b) + k_2(y_2 - y_b) + \Theta(-h_2)\, c_2 \frac{h_2}{2} + k_{1,2}(y_2 - y_1) + k_{2,3}(y_2 - y_3) = F_2, \tag{A2}$$

$$m_3 \ddot{y}_3 + r_3(\dot{y}_3 - \dot{y}_b) + k_3(y_3 - y_b) + \Theta(-h_3)\, c_3 \frac{h_3}{2} + k_{2,3}(y_3 - y_2) = F_3, \tag{A3}$$

$$m_b \ddot{y}_b + r_b \dot{y}_b + k_b y_b + r_1(\dot{y}_b - \dot{y}_1) + k_1(y_b - y_1) + r_2(\dot{y}_b - \dot{y}_2) + k_2(y_b - y_2) + r_3(\dot{y}_b - \dot{y}_3) + k_3(y_b - y_3) = 0. \tag{A4}$$

TABLE II. Default parameters of the four-mass polygon model.

Parameter                                    Symbol    Nominal value
Cover mass 1                                 m1        0.009 g
Cover mass 2                                 m2        0.009 g
Cover mass 3                                 m3        0.003 g
Body mass                                    mb        0.05 g
Stiffness of cover mass 1                    k1        6.0 N/m
Stiffness of cover mass 2                    k2        6.0 N/m
Stiffness of cover mass 3                    k3        2.0 N/m
Stiffness of body mass                       kb        30.0 N/m
Stiffness between cover masses 1 and 2       k1,2      1.0 N/m
Stiffness between cover masses 2 and 3       k2,3      0.5 N/m
Damping ratio of cover mass 1                ζ1        0.1
Damping ratio of cover mass 2                ζ2        0.4
Damping ratio of cover mass 3                ζ3        0.4
Damping ratio of body mass                   ζb        0.4
Collision stiffness of cover mass 1          c1        3 k1
Collision stiffness of cover mass 2          c2        3 k2
Collision stiffness of cover mass 3          c3        3 k3
Prephonatory displacement of cover mass 1    h01       0.036 cm
Prephonatory displacement of cover mass 2    h02       0.036 cm
Prephonatory displacement of cover mass 3    h03       0.036 cm
Height at vocal fold entrance                h0        1.8 cm
Height at vocal fold exit                    h4        1.8 cm
Mass displacement at point 0                 x0        0 cm
Mass displacement at point 1                 x1        0.05 cm
Mass displacement at point 2                 x2        0.2 cm
Mass displacement at point 3                 x3        0.275 cm
Mass displacement at point 4                 x4        0.3 cm
Glottal length                               l         1.4 cm

The dynamical variables $y_i$ represent displacements of the masses $m_i$, where the corresponding glottal opening is given by $h_i = h_{0i} + 2y_i$ ($h_{0i}$ is the prephonatory length; $i = 1, 2, 3$). The constant parameters $r_i$, $k_i$, and $c_i$ represent damping, stiffness, and collision stiffness of the masses $m_i$, respectively, whereas $k_{i,j}$ represents the coupling strength between two masses $m_i$ and $m_j$. The damping is determined as $r_i = 2\zeta_i\sqrt{m_i k_i}$ using the damping ratio $\zeta_i$. The collision function is approximated as $\Theta(x) = 0$ ($x \le 0$); $\Theta(x) = 1$ ($x > 0$).

The aerodynamic force, $F_i$, acting on each mass is derived as follows. First, the vocal fold geometry is described by a pair of four massless plates, as shown in Fig. 1. The flow channel height $h(x,t)$ is a piecewise linear function, composed of $h_{1,0}$ ($x_0 \le x \le x_1$), $h_{2,1}$ ($x_1 \le x \le x_2$), $h_{3,2}$ ($x_2 \le x \le x_3$), and $h_{4,3}$ ($x_3 \le x \le x_4$), which are determined as

$$h_{i,i-1}(x,t) = \frac{h_i(t) - h_{i-1}(t)}{x_i - x_{i-1}}\,(x - x_{i-1}) + h_{i-1}(t), \tag{A5}$$

where $i = 1, 2, 3, 4$ and $h_0$ and $h_4$ are constants. Assuming Bernoulli flow, the pressure distribution $P(x,t)$ below the narrowest part of the glottis, $h_{\min} = \min(h_1, h_2, h_3)$, is described as

$$P_s = P(x,t) + \frac{\rho}{2}\left(\frac{U}{h(x,t)\,l}\right)^2 = P_0 + \frac{\rho}{2}\left(\frac{U}{h_{\min}\,l}\right)^2, \tag{A6}$$

where $\rho$ represents the air density ($\rho = 1.13$ kg/m^3), $P_s$ is the subglottal pressure, $P_0$ is the supraglottal pressure, and $l$ is the length of the glottis. In the case that no resonators are attached to the vocal folds, the subglottal pressure $P_s$ is considered to be constant ($P_s = 0.8$ kPa), whereas the supraglottal pressure is assumed zero ($P_0 = 0$ kPa). This gives a simple formula for the glottal volume flow velocity as $U = \sqrt{2P_s/\rho}\; h_{\min}\, l\, \Theta(h_{\min})$. The aerodynamic forces on the plates are induced by the pressure $P(x,t)$ along the flow channel. As $m_1$, $m_2$, and $m_3$ support the plates, the aerodynamic force on point $i$ ($i = 1, 2, 3$) is found to be

$$F_i(t) = \int_{x_{i-1}}^{x_i} \frac{x - x_{i-1}}{x_i - x_{i-1}}\, P(x,t)\, dx + \int_{x_i}^{x_{i+1}} \frac{x_{i+1} - x}{x_{i+1} - x_i}\, P(x,t)\, dx. \tag{A7}$$

As shown by Lous et al. (1998), this integral can be solved analytically for the pressure distribution $P(x,t)$ described by Eq. (A6).
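The channel geometry of Eq. (A5) and the Bernoulli relation of Eq. (A6) can be checked numerically for the resting configuration. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: all quantities are converted to SI units, the default values are taken from Table II, the no-resonator case (constant Ps = 0.8 kPa, P0 = 0) is assumed, and the helper names are hypothetical.

```python
import math

RHO = 1.13               # air density in kg/m^3
GLOTTAL_LENGTH = 0.014   # l = 1.4 cm, in m
PS = 800.0               # subglottal pressure 0.8 kPa, in Pa

# Plate end points x0..x4 and channel heights h0..h4 at rest (Table II), in m.
XS = [0.0, 0.0005, 0.002, 0.00275, 0.003]
HS = [0.018, 0.00036, 0.00036, 0.00036, 0.018]


def channel_height(x, xs=XS, hs=HS):
    """Piecewise-linear channel height h(x) through the points (xs[i], hs[i])."""
    for i in range(1, len(xs)):
        if xs[i - 1] <= x <= xs[i]:
            slope = (hs[i] - hs[i - 1]) / (xs[i] - xs[i - 1])
            return slope * (x - xs[i - 1]) + hs[i - 1]
    raise ValueError("x lies outside the flow channel")


def glottal_flow(ps, h_min, length=GLOTTAL_LENGTH, rho=RHO):
    """U = sqrt(2 Ps / rho) * h_min * l for an open glottis, zero when closed."""
    if h_min <= 0.0:
        return 0.0
    return math.sqrt(2.0 * ps / rho) * h_min * length


def upstream_pressure(ps, flow, h, length=GLOTTAL_LENGTH, rho=RHO):
    """Bernoulli pressure P = Ps - (rho/2) (U / (h l))^2 upstream of h_min."""
    return ps - 0.5 * rho * (flow / (h * length)) ** 2


h_min = min(HS[1:4])            # narrowest of h1, h2, h3
flow = glottal_flow(PS, h_min)  # volume flow in m^3/s
```

With the resting openings of 0.036 cm the sketch gives a flow of roughly 1.9e-4 m^3/s, that is, about 0.19 cm^3/ms. The pressure returned by `upstream_pressure` stays close to Ps at the wide entrance and drops to P0 = 0 at the narrowest point, as required by Eq. (A6).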
Parameter values used as the default situation of the present study are summarized in Table II. These values have been carefully selected in accordance with the previous studies (Ishizaka and Flanagan, 1972; Story and Titze, 1995; Lous et al., 1998; Titze and Story, 2002). The observed phenomena were in general robust and did not show a strong dependence on the selected parameter values. To simulate the coexistence of the chest and falsetto registers, the damping ratio of the lower mass was set to be relatively small compared with the other masses. The small damping ratio activates the lower mass so that it leads to a large movement of all the cover masses to produce a chest-like register, which can easily coexist with a falsetto-like register.

The tension parameter T is also introduced to control the frequency of the four masses as

$$m_i' = m_i / T, \tag{A8}$$

$$k_i' = k_i\, T, \tag{A9}$$

where $i = 1, 2, 3, b$.

The initial values for all simulations were set as $x_1 = 0.02$ cm, $x_2 = 0.015$ cm, $x_3 = 0.01$ cm, $x_b = 0$ cm, and $\dot{x}_1 = \dot{x}_2 = \dot{x}_3 = \dot{x}_b = 0$ cm/s. To integrate the four-mass model equations (A1)–(A4), Euler's method was applied with an integration step of $\Delta t = 11.4/8\ \mu$s. The model equations were also simulated by using the MATLAB ODE solver
Tokuda et al.: Modeling register transitions


m2y 2 + r2y 2 y b + k2y 2 y b + h2c2

APPENDIX B: WAVE-REFLECTION MODEL FOR SUB- AND SUPRAGLOTTIS

Sub- and supraglottal resonances were described by using the wave-reflection model (Kelly and Lochbaum, 1962; Liljencrants, 1985; Story, 1995; Titze, 2006), which is a time-domain model of the propagation of one-dimensional planar acoustic waves through a collection of uniform cylindrical tubes. The supraglottal system was modeled as a simple uniform tube (area of 3 cm$^2$ and length of 17.5 cm), which is divided into 44 cylindrical sections. The area function for the subglottal tract was based on the one proposed by Zañartu et al. (2007) and is composed of 62 cylindrical sections. For both sub- and supraglottal systems, the section length $\Delta z$ was set to 17.5/44 cm. This determines the sampling time interval as $\Delta t = \Delta z / c = 11.4\ \mu$s, where $c = 350$ m/s stands for the sound velocity. The corresponding sampling frequency is 88 kHz.
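The relation between section length, sampling interval, and sampling frequency can be checked with a few lines of Python. This is only an arithmetic sketch based on the values quoted above; the variable names are chosen here for illustration.

```python
# Values quoted in the text.
SOUND_SPEED_CM_PER_S = 35000.0   # c = 350 m/s
SUPRAGLOTTAL_LENGTH_CM = 17.5
N_SUPRAGLOTTAL_SECTIONS = 44
N_SUBGLOTTAL_SECTIONS = 62

# Section length shared by the sub- and supraglottal tubes.
dz_cm = SUPRAGLOTTAL_LENGTH_CM / N_SUPRAGLOTTAL_SECTIONS

# One sample is the time a wave needs to cross one section.
dt_s = dz_cm / SOUND_SPEED_CM_PER_S
fs_hz = 1.0 / dt_s

# Euler step of Appendix A: one eighth of the acoustic sampling interval.
euler_step_s = dt_s / 8.0

# Total length implied for the subglottal tube (derived, not quoted in the text).
subglottal_length_cm = N_SUBGLOTTAL_SECTIONS * dz_cm

if __name__ == "__main__":
    print(f"dz = {dz_cm:.4f} cm")
    print(f"dt = {dt_s * 1e6:.2f} microseconds")
    print(f"fs = {fs_hz / 1e3:.1f} kHz")
    print(f"Euler step = {euler_step_s * 1e6:.3f} microseconds")
    print(f"subglottal length = {subglottal_length_cm:.2f} cm")
```

The computed interval of about 11.36 μs rounds to the 11.4 μs stated above, and its inverse is 88 kHz.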
The attenuation factor for the resonators was approximated as $a_k = 1 - (0.007 / A_k^{1/2})\,\Delta z$, where $A_k$ is the area of the $k$th cylinder. The radiation resistance and radiation inertance at the lips were

$$R_r = \frac{128 \rho c}{9 \pi^2 A_L}, \qquad I_r = \frac{8 \rho}{3 \pi^{3/2} A_L^{1/2}}, \tag{B1}$$

respectively, where the lip area $A_L$ was set to be equal to the area of the last section of the supraglottis.
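The attenuation factor and the radiation load of Eq. (B1) are simple closed-form expressions, so a minimal Python sketch may help to get a feeling for their magnitudes. The CGS unit convention and the air density value below are assumptions made here, not values stated in this appendix, and the function names are hypothetical.

```python
import math

# Assumed constants (CGS units); the air density is an assumption, not from the text.
RHO = 1.14e-3   # air density in g/cm^3 (assumed)
C = 35000.0     # sound velocity in cm/s (c = 350 m/s)


def attenuation_factor(area_cm2, dz_cm):
    """a_k = 1 - (0.007 / sqrt(A_k)) * dz for one cylindrical section."""
    return 1.0 - 0.007 / math.sqrt(area_cm2) * dz_cm


def radiation_resistance(lip_area_cm2):
    """R_r = 128 rho c / (9 pi^2 A_L)."""
    return 128.0 * RHO * C / (9.0 * math.pi ** 2 * lip_area_cm2)


def radiation_inertance(lip_area_cm2):
    """I_r = 8 rho / (3 pi^(3/2) sqrt(A_L))."""
    return 8.0 * RHO / (3.0 * math.pi ** 1.5 * math.sqrt(lip_area_cm2))


if __name__ == "__main__":
    dz = 17.5 / 44   # section length in cm
    area = 3.0       # uniform supraglottal tube area in cm^2
    print("attenuation factor per section:", attenuation_factor(area, dz))
    print("radiation resistance:", radiation_resistance(area))
    print("radiation inertance:", radiation_inertance(area))
```

For the uniform 3 cm$^2$ tube the attenuation factor stays very close to one per section, so losses accumulate only gradually over the 44 sections.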
To couple the sub- and supraglottal resonators to the vocal fold model, an interactive source-filter coupling was realized according to Titze (2006, 2008). In this formulation, the glottal flow is given by

$$U = \frac{a_g c}{k_t} \left\{ -\frac{a_g}{A^*} \pm \left[ \left( \frac{a_g}{A^*} \right)^2 + \frac{2 k_t}{\rho c^2} \left( P_l + 2 p_s^+ - 2 p_e^- \right) \right]^{1/2} \right\}, \tag{B2}$$

where $A^* = A_s A_e / (A_s + A_e)$, with $A_s$ and $A_e$ being the subglottal and supraglottal entry areas, respectively. $k_t$ is a transglottal pressure coefficient set as 1.0. $P_l$ stands for the lung pressure, whereas $p_s^+$ and $p_e^-$ represent the incident partial wave pressures arriving from the subglottis and supraglottis, respectively. In the present study, the subglottal and supraglottal entry areas were set to be equal to that of the last section of the subglottal system and that of the initial section of the supraglottal system, respectively. The lung pressure was set as $P_l = 1.2$ kPa. Since the subglottal pressure $P_s$ is time dependent in this formulation, the pressure value was averaged over a long-term simulation to plot $P_s$ in Fig. 5(b).
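To make the flow formula of Eq. (B2) concrete, the following Python sketch evaluates its positive root for a non-negative driving pressure $P_l + 2p_s^+ - 2p_e^-$. The function name, the CGS units, the air density value, the entry areas used in the example, and the handling of a closed glottis or a negative driving pressure are all assumptions made here for illustration; they are not taken from the appendix.

```python
import math

RHO = 1.14e-3   # air density in g/cm^3 (assumed value)
C = 35000.0     # sound velocity in cm/s


def glottal_flow(a_g, A_s, A_e, P_l, p_s_plus, p_e_minus, k_t=1.0):
    """Positive root of Eq. (B2) in CGS units (areas cm^2, pressures dyn/cm^2).

    Returns 0 for a closed glottis (a_g <= 0). A negative driving pressure
    is not handled in this sketch.
    """
    if a_g <= 0.0:
        return 0.0
    drive = P_l + 2.0 * p_s_plus - 2.0 * p_e_minus
    if drive < 0.0:
        raise ValueError("negative driving pressure is outside this sketch")
    a_star = A_s * A_e / (A_s + A_e)
    ratio = a_g / a_star
    root = math.sqrt(ratio ** 2 + 2.0 * k_t * drive / (RHO * C ** 2))
    return (a_g * C / k_t) * (-ratio + root)


if __name__ == "__main__":
    # Hypothetical example: glottal area 0.1 cm^2, entry areas 3 cm^2,
    # lung pressure 1.2 kPa = 12000 dyn/cm^2, no incident waves.
    u = glottal_flow(0.1, 3.0, 3.0, 12000.0, 0.0, 0.0)
    print(f"U = {u:.1f} cm^3/s")
```

The returned flow satisfies the underlying pressure balance $k_t \rho U^2 / (2 a_g^2) = P_l + 2p_s^+ - 2p_e^- - \rho c U / A^*$, which offers a quick consistency check of the reconstruction.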

J. Acoust. Soc. Am., Vol. 127, No. 3, March 2010

Adachi, S., and Yu, J. (2005). Two-dimensional model of vocal fold vibration for sound synthesis of voice and soprano singing, J. Acoust. Soc. Am. 117, 3213–3224.
Alipour, F., Berry, D. A., and Titze, I. R. (2000). A finite-element model of vocal-fold vibration, J. Acoust. Soc. Am. 108, 3003–3012.
Berry, D. A., Herzel, H., Titze, I. R., and Krischer, K. (1994). Interpretation of biomechanical simulations of normal and chaotic vocal fold oscillation with empirical eigenfunctions, J. Acoust. Soc. Am. 95, 3595–3604.
Berry, D. A., Herzel, H., Titze, I. R., and Story, B. H. (1996). Bifurcations in excised larynx experiments, J. Voice 10, 129–138.
Fant, G. (1960). The Acoustic Theory of Speech Production (Mouton, The Hague, The Netherlands).
Gömmel, A., Butenweg, C., and Kob, M. (2008). Calculation model of the influence of the vocal fold shape and the ventricular folds on the laryngeal flow, in Proceedings of the 6th International Conference on Voice Physiology and Biomechanics, Tampere, Finland, pp. 194–196.
Guckenheimer, J., and Holmes, P. (1983). Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields (Springer-Verlag, New York).
Hatzikirou, H., Fitch, W. T., and Herzel, H. (2006). Voice instabilities due to source-tract interactions, Acta Acust. 92, 468–475.
Henrich, N., d'Alessandro, C., Doval, B., and Castellengo, M. (2005). Glottal open quotient in singing: Measurements and correlation with laryngeal mechanisms, vocal intensity, and fundamental frequency, J. Acoust. Soc. Am. 117, 1417–1430.
Hirano, M. (1974). Morphological structure of the vocal cord as a vibrator and its variations, Folia Phoniatr. (Basel) 26, 89–94.
Hirano, M., and Kakita, Y. (1985). Cover-body theory of vocal cord vibration, in Speech Science, edited by R. G. Daniloff (College Hill Press, San Diego, CA), pp. 1–46.
Hirano, M., Vennard, W., and Ohala, J. (1970). Regulation of register, pitch, and intensity of voice, Folia Phoniatr. (Basel) 22, 1–20.
Hollien, H. (1974). On vocal registers, J. Phonetics 2, 125–143.
Horáček, J., Švec, J. G., Veselý, J., and Vilkman, E. (2004). Bifurcations in excised larynges caused by vocal fold elongation, in Proceedings of the International Conference on Voice Physiology and Biomechanics, edited by A. Giovanni, P. Dejonckere, and M. Ouaknine, Laboratory of Audio-Phonology, Marseille, France, pp. 87–89.
Ishizaka, K., and Flanagan, J. L. (1972). Synthesis of voiced sounds from a two-mass model of the vocal cords, Bell Syst. Tech. J. 51, 1233–1268.
Kelly, J. L., and Lochbaum, C. (1962). Speech synthesis, in Proceedings of the 4th International Congress on Acoustics, Paper No. G42, pp. 1–4.
Liljencrants, J. (1985). Speech synthesis with a reflection-type line analog, Ph.D. thesis, Royal Institute of Technology, Stockholm, Sweden.
Lous, N. J., Hofmans, G. C., Veldhuis, R. N. J., and Hirschberg, A. (1998). A symmetrical two mass vocal fold model coupled to vocal tract and trachea, with application to prosthesis design, Acta Acust. 84, 1135–1150.
Lucero, J. C. (1998). Subcritical Hopf bifurcation at phonation onset, J. Sound Vib. 218, 344–349.
McCandless, S. (1974). An algorithm for automatic formant extraction using linear prediction spectra, IEEE Trans. Acoust., Speech, Signal Process. 22, 135–141.
Mergell, P., Herzel, H., and Titze, I. R. (2000). Irregular vocal fold vibration: High-speed observation and modeling, J. Acoust. Soc. Am. 108, 2996–3002.
Miller, D. G., Švec, J. G., and Schutte, H. K. (2002). Measurement of characteristic leap interval between chest and falsetto registers, J. Voice 16, 8–19.
Pelorson, X., Hirschberg, A., van Hassel, R. R., Wijnands, A. P. J., and Auregan, Y. (1994). Theoretical and experimental study of quasi-steady flow separation within the glottis during phonation, J. Acoust. Soc. Am. 96, 3416–3431.
Riede, T., and Zuberbühler, K. (2003). Pulse register phonation in Diana monkey alarm calls, J. Acoust. Soc. Am. 113, 2919–2926.
Roubeau, B., Chevrie-Muller, C., and Arabia-Guidet, C. (1987). Electroglottographic study of the changes of voice registers, Folia Phoniatr. (Basel) 39, 280–289.
Salomão, G. L., and Sundberg, J. (2008). Relation between perceived voice register and flow glottogram parameters in males, J. Acoust. Soc. Am. 124, 546–551.
Sciamarella, D., and d'Alessandro, C. (2004). On the acoustic sensitivity of a symmetrical two-mass model of the vocal folds to the variation of control parameters, Acta Acust. 90, 746–761.
Shipp, T., Robert, E., and McGlone, R. E. (1971). Laryngeal dynamics

ODE45, and it was confirmed that essentially the same results were obtained.
To draw the bifurcation diagrams of Figs. 5–7, 20 local maxima of the opening area of the lower mass, $a_1 = l h_1$, were plotted after discarding the transients. For each subsequent parameter value, the final state of the preceding simulation was used as the initial condition.
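The sweep procedure described above, in which each run starts from the final state of the preceding one, can be sketched with a toy system. The example below does not implement the four-mass model: as a stand-in it integrates the amplitude equation of a subcritical Hopf normal form, $\dot{r} = \mu r + r^3 - r^5$, with Euler's method, and it records the final amplitude instead of 20 local maxima. The parameter names, step size, and small noise floor are assumptions chosen purely for illustration.

```python
def integrate_amplitude(mu, r0, dt=0.01, n_steps=5000):
    """Euler integration of dr/dt = mu*r + r**3 - r**5; returns the final amplitude."""
    r = r0
    for _ in range(n_steps):
        r += dt * (mu * r + r ** 3 - r ** 5)
    return r


def sweep(mu_values, r_start=1e-3, noise_floor=1e-3):
    """Step through mu_values, reusing the final state of each run as the next initial state."""
    amplitudes = []
    r = r_start
    for mu in mu_values:
        # Hypothetical noise floor: keeps the state from decaying to exactly zero.
        r = max(r, noise_floor)
        r = integrate_amplitude(mu, r)
        amplitudes.append(r)
    return amplitudes


# Control parameter from -0.4 to 0.2 in steps of 0.05 (13 values).
mu_up = [-0.4 + 0.05 * i for i in range(13)]
mu_down = mu_up[::-1]

amp_up = sweep(mu_up)              # increasing parameter
amp_down = sweep(mu_down)[::-1]    # decreasing parameter, reordered to match mu_up

if __name__ == "__main__":
    for mu, up, down in zip(mu_up, amp_up, amp_down):
        print(f"mu = {mu:+.2f}   up: {up:.4f}   down: {down:.4f}")
```

In this toy system the increasing and decreasing sweeps disagree over a range of negative $\mu$, which is the kind of hysteresis that such a continuation-style sweep is designed to reveal.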
The spectrogram of Fig. 4(b) was computed using the minimum glottal area $a_{\min} = l h_{\min}$ with the following parameters: sampling rate of 44 kHz, window length of 8192 sample points, overlap of 496 sample points, and Hanning window.


Titze, I. R. (2008). Nonlinear source-filter coupling in phonation: Theory, J. Acoust. Soc. Am. 123, 2733–2749.
Titze, I. R., Riede, T., and Popolo, P. (2008). Nonlinear source-filter coupling in phonation: Vocal exercises, J. Acoust. Soc. Am. 123, 1902–1915.
Titze, I. R., and Story, B. (2002). Rules for controlling low-dimensional vocal fold models with muscle activation, J. Acoust. Soc. Am. 112, 1064–1076.
Tokuda, I., Zemke, M., and Herzel, H. (2008). Biomechanical modeling of voice registers and their transitions, in Proceedings of the 6th International Conference on Voice Physiology and Biomechanics, Tampere, Finland, pp. 98–100.
Tokuda, I. T., Horáček, J., Švec, J. G., and Herzel, H. (2007). Comparison of biomechanical modeling of register transitions and voice instabilities with excised larynx experiments, J. Acoust. Soc. Am. 122, 519–531.
Vilkman, E., Alku, P., and Laukkanen, A. (1995). Vocal-fold collision mass as a differentiator between registers in the low-pitch range, J. Voice 9, 66–73.
Wilden, I., Herzel, H., Peters, G., and Tembrock, G. (1998). Subharmonics, biphonation, and deterministic chaos in mammal vocalization, Bioacoustics 9, 171–196.
Zaccarelli, R., Elemans, C. P. H., Fitch, W. T., and Herzel, H. (2006). Modelling bird songs: Voice onset, overtones and registers, Acta Acust. 92, 741–748.
Zañartu, M., Mongeau, L., and Wodicka, G. R. (2007). Influence of acoustic loading on an effective single mass model of the vocal folds, J. Acoust. Soc. Am. 121, 1119–1129.
Zemke, M. (2008). Biomechanical-aerodynamical modeling of register transitions of the human voice, Diploma thesis, Humboldt University of Berlin, Berlin, Germany.
Zhang, Z. (2009). Characteristics of phonation onset in a two-layer vocal fold model, J. Acoust. Soc. Am. 125, 1091–1102.
Zhang, Z., Neubauer, J., and Berry, D. A. (2006). The influence of subglottal acoustics on laboratory models of phonation, J. Acoust. Soc. Am. 120, 1558–1569.


associated with voice frequency change, J. Speech Hear. Res. 14, 761–768.
Smith, M. E., Berke, G. S., Gerratt, B. R., and Kreiman, J. (1992). Laryngeal paralyses: Theoretical considerations and effects on laryngeal vibration, J. Speech Hear. Res. 35, 545–554.
Steinecke, I., and Herzel, H. (1995). Bifurcations in an asymmetric vocal fold model, J. Acoust. Soc. Am. 97, 1874–1884.
Stevens, K. (1999). Acoustic Phonetics (MIT, Cambridge, MA).
Story, B. H. (1995). Physiologically-based speech simulation using an enhanced wave-reflection model of the vocal tract, Ph.D. thesis, University of Iowa, Iowa City, IA.
Story, B. H., Laukkanen, A.-M., and Titze, I. R. (2000). Acoustic impedance of an artificially lengthened and constricted vocal tract, J. Voice 14, 455–469.
Story, B. H., and Titze, I. R. (1995). Voice simulation with a body-cover model of the vocal folds, J. Acoust. Soc. Am. 97, 1249–1260.
Sundberg, J., and Gauffin, J. (1979). Waveform and spectrum of the glottal voice source, in Frontiers of Speech Communication Research, edited by B. Lindblom and S. Oehman (Academic, London), pp. 301–320.
Švec, J. G., and Pešák, J. (1994). Vocal breaks from the modal to falsetto register, Folia Phoniatr. (Basel) 46, 97–103.
Švec, J. G., Schutte, H. K., and Miller, D. G. (1999). On pitch jumps between chest and falsetto registers in voice: Data from living and excised human larynges, J. Acoust. Soc. Am. 106, 1523–1531.
Švec, J. G., Sundberg, J., and Hertegård, S. (2008). Three registers in an untrained female singer analyzed by videokymography, strobolaryngoscopy and sound spectrography, J. Acoust. Soc. Am. 123, 347–353.
Tembrock, G. (1996). Akustische Kommunikation bei Säugetieren (Acoustic Communication in Mammals) (Wissenschaftliche Buchgesell, Darmstadt, Germany).
Titze, I. R. (1988). The physics of small-amplitude oscillation of the vocal folds, J. Acoust. Soc. Am. 83, 1536–1552.
Titze, I. R. (2000). Principles of Voice Production, 2nd ed. (National Center for Voice and Speech, Iowa City, IA).
Titze, I. R. (2006). Myoelastic Aerodynamic Theory of Phonation (National Center for Voice and Speech, Iowa City, IA).
