Transaural stereo, generic for binaural stereo processed for cancellation of loudspeaker-to-ear crosstalk, results from the use of minimum-phase filters in shuffler configuration.
Simplifying the filters further at short wavelengths makes the listener position noncritical.
Full spatial qualities appear in a conventional stereo playback that avoids early reflections.
Inverse shufflers provide precise transaural pan functions for multitrack work.
0 INTRODUCTION
The composite-signal structure is subsequently inverted (decomposition) in the intervening loudspeaker-to-ear transmission to produce the intended sounds at
the ears. On the way to the ears, in addition to the
direct transmission, left to left and right to right, there
occur the cross transmissions of left to right and right
to left. The latter are traditionally called crosstalk (from
telephony), and the composition-decomposition
scheme cited is a nonadaptive precancellation of crosstalk. It consists of the "planting" of a crosstalk process,
in advance, that is devised to be the inverse of the
acoustic crosstalk expected to occur subsequently. When
properly done, the net result is the elimination of all
evidence of crosstalk.
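The precancellation idea can be illustrated at a single frequency. The following is a minimal sketch, assuming a symmetric listening setup in which S stands for the same-side transmission and A for the alternate-side (crosstalk) transmission; the numerical values of S and A and of the intended ear signals are made up for illustration and are not measured data.

```python
# Single-frequency sketch of nonadaptive crosstalk precancellation.
# S, A and the ear signals below are hypothetical complex gains.

def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def apply(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

S = 1.0 + 0.0j   # same-side transmission (assumed value)
A = 0.4 - 0.3j   # alternate-side transmission, i.e. crosstalk (assumed value)

acoustic = [[S, A], [A, S]]       # loudspeakers -> ears
canceler = invert_2x2(acoustic)   # the "planted" inverse crosstalk process

binaural = [0.7 + 0.2j, -0.1 + 0.5j]        # intended sounds at the ears
speaker_feeds = apply(canceler, binaural)   # composed loudspeaker signals
at_ears = apply(acoustic, speaker_feeds)    # decomposition by the acoustics

# Without the canceler the ears receive a crosstalk-contaminated mixture.
uncorrected = apply(acoustic, binaural)

max_error = max(abs(e - b) for e, b in zip(at_ears, binaural))
uncorrected_error = max(abs(e - b) for e, b in zip(uncorrected, binaural))
```

With the canceler in place `max_error` is at rounding level, while `uncorrected_error` is of the order of the crosstalk itself.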
PAPERS
or that only a portion of the performing ensemble requires the spatial delineation available through artificial-head pickup. Such artistic decisions remain, of course,
with the producing authority, and it is the responsibility
of the engineer to provide incisive imaging, to the extent
possible, where desired. Transaural technology may
be viewed as providing improved options for that purpose, not necessarily a whole new recording style.
A better choice for incisive imaging, however, cannot
be made. In a previous paper, Cooper, using calculations
from Bauck's thesis [2], showed [3, Fig. 8] the required
loudspeaker-signal specifications for two examples of
imaging. None of the conventional stereo methods
produces signals that in any way resemble these specifications, except at low frequencies. Conventional
stereo has not sought to devise loudspeaker signals to
meet imaging-signal specifications at the ears, as was
required in these calculations, except in the low-frequency work of Blumlein [4]. Specifically, none of the
existing pan-pot formulas meet these specifications,
nor do any of the stereo microphone arrays, whether
coincident or spaced, whether using directional elements
or not.
Some recording engineers, seeking a spacious effect,
use widely spaced microphones in a concert-hall setting.
It is known, of course, that the signals so obtained are
highly decorrelated, and it is also a known fact, in
concert-hall acoustics, that highly decorrelated ear
sounds are identified with spacious acoustic impressions. Unfortunately, the interaural correlation will always be greater than the correlation at the loudspeakers,
because of crosstalk. The net result is that the spacious
effect is perceived as confined to an "acoustic stage,"
as in a different space from that of the listener. An
important aspect of the concert-hall experience is lost.
The use of widely spaced microphones with binaural
synthesis and suitable delay, however, will give the
recording engineer much greater control over the representation of the sound of the hall. Thus many more
venues may be exploited to advantage. At the same
time, a full spatial envelopment of the listener can be
provided to the extent desired. Many recording engineers will discover, also, that imaging and spaciousness
are not mutually exclusive, but, as has long been known
in concert-hall acoustics, belong together. Placing them
together is natural in transaural technology.
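The remark above that crosstalk makes the interaural correlation exceed the correlation at the loudspeakers can be checked with a deliberately simplified model. The sketch below assumes unit-power loudspeaker signals L and R with correlation coefficient rho, and a real, frequency-independent crosstalk gain a (0 < a <= 1), so that the ear signals are L + a·R and R + a·L. Real crosstalk is delayed and frequency dependent, so this is only a low-frequency caricature; the function name and the values tried are hypothetical.

```python
def interaural_correlation(rho, a):
    """Correlation coefficient of ear signals e_L = L + a*R, e_R = R + a*L.

    L and R are unit-power loudspeaker signals with correlation rho.
    Covariance: rho*(1 + a^2) + 2a.  Variance of each ear: 1 + a^2 + 2*a*rho.
    """
    return (rho * (1 + a * a) + 2 * a) / (1 + a * a + 2 * a * rho)

# Fully decorrelated loudspeaker signals, crosstalk 6 dB down (a = 0.5).
decorrelated_case = interaural_correlation(0.0, 0.5)

# The excess over rho is 2a(1 - rho^2) / (1 + a^2 + 2a*rho), which is
# positive for 0 < a <= 1 and -1 < rho < 1.
always_greater = all(
    interaural_correlation(rho, a) > rho
    for rho in (-0.9, -0.5, 0.0, 0.5, 0.9)
    for a in (0.1, 0.5, 1.0)
)
```

In this model even fully decorrelated loudspeaker signals arrive at the ears with a correlation of 0.8 when the crosstalk is only 6 dB down.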
At first the recording engineer will want to try only
the simplest things from transaural technology. Indeed,
it is likely that only the simpler equipment will become
available at first. Existing techniques will necessarily
continue to be used, and the improvements of transaural
technology will, in some instances, be adapted to that.
For reviews of existing techniques, the writings of Eargle may be consulted [5]. The evolution of such techniques to suit a binaural style of recording is not amenable to detailed prediction, and will not be attempted here.
It is possible, however, to sketch a catalog of specific
kinds of transaural-related equipment, the development
of which may be foreseen. Some of these items are
discussed in a later section.
0.7 Summary
The principal purpose of this paper is to report on
improvements we have discovered in a particular signal-processing scheme, the crosstalk-canceling scheme of
Atal and Schroeder. These improvements, which are
largely practical, offer the possibility of a significant
restructuring of stereo recording to make for extraordinary improvements in stereo quality.
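J. Audio Eng. Soc., Vol. 37, No. 1/2, 1989 January/February

$$S' = \frac{S}{S^2 - A^2} \tag{1a}$$

$$A' = \frac{-A}{S^2 - A^2} \tag{1b}$$

$$N' = S' - A' \tag{2a}$$

and

$$P' = S' + A' \tag{2b}$$

$$N' = \frac{1}{S - A} \tag{3a}$$

and

$$P' = \frac{1}{S + A} \tag{3b}$$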
Thus the matrix of the shuffler transfer functions, diagonal with elements N' and P', is the inverse of the
diagonal acoustic matrix for difference-and-sum ear
sounds with elements N = S - A and P = S + A.
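The equivalence between the full inverse matrix and the shuffler form can be checked numerically. This is a minimal single-frequency sketch: the values of S and A are hypothetical, and the factor of 1/2 used when re-forming left and right from sum and difference is a normalization chosen here, not one taken from the paper.

```python
def lattice_canceler(left, right, S, A):
    """Apply the inverse of [[S, A], [A, S]] directly."""
    det = S * S - A * A
    return ((S * left - A * right) / det,
            (S * right - A * left) / det)

def shuffler_canceler(left, right, S, A):
    """Same operation in shuffler form: difference and sum filtered separately."""
    diff = (left - right) / (S - A)   # N' = 1/N with N = S - A
    summ = (left + right) / (S + A)   # P' = 1/P with P = S + A
    # Re-form left and right; the 1/2 is an assumed normalization.
    return ((summ + diff) / 2, (summ - diff) / 2)

S = 1.0 + 0.0j        # hypothetical same-side transfer value
A = 0.35 - 0.25j      # hypothetical alternate-side transfer value
left, right = 0.6 + 0.1j, -0.2 + 0.4j

lattice_out = lattice_canceler(left, right, S, A)
shuffler_out = shuffler_canceler(left, right, S, A)
mismatch = max(abs(a - b) for a, b in zip(lattice_out, shuffler_out))
```

The shuffler needs only two filters, one for the difference and one for the sum, where the direct form needs four.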
1.3 Minimum-Phase Characteristics
In 1977 Mehrgardt and Mellert [19] showed experimentally that the head-related transfer functions are
of minimum phase to within a frequency-independent
delay, a delay that is incident-angle dependent. They
proceeded via the Hilbert transform of the log-magnitude
response to calculate the minimum-phase part of the
phase response. The remainder, or excess-phase part,
[Fig. 1 graphic not reproduced. Panel (a): filter array. Panel (b): impulse response; horizontal axis DELAY (ms).]
Fig. 1. (a) Atal-Schroeder crosstalk-canceling filter array. (b) Plot of the impulse response of its cross filter, C = -A/S. The filter matrix of (a), adapted from [12], is the inverse of the matrix of acoustic transfer functions, the matrix showing S for the transmission from a loudspeaker to the same-side ear and A for the transmission to the alternate-side ear. The curve of (b), adapted from [18], is the impulse response of C and shows the process to be completed in very few milliseconds.
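The step described in Section 1.3, obtaining the minimum-phase part of the phase response from the log-magnitude response, can be sketched numerically. The code below uses the folded-cepstrum method, a discrete equivalent of the Hilbert-transform relation; it is a swapped-in illustration and not necessarily the procedure of Mehrgardt and Mellert. The test filter 1 - 0.5 z^-1 is a hypothetical minimum-phase example, and the plain O(n^2) DFT is used only to keep the sketch free of dependencies.

```python
import cmath
import math

def dft(x, inverse=False):
    """Plain discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out

def minimum_phase_from_magnitude(magnitude):
    """Minimum-phase phase response (radians) from a strictly positive
    magnitude response sampled on a full DFT grid of even length."""
    n = len(magnitude)
    log_mag = [math.log(m) for m in magnitude]
    cepstrum = [c.real for c in dft(log_mag, inverse=True)]
    # Fold the even cepstrum onto positive time to make it causal.
    folded = [0.0] * n
    folded[0] = cepstrum[0]
    for i in range(1, n // 2):
        folded[i] = 2.0 * cepstrum[i]
    folded[n // 2] = cepstrum[n // 2]
    # The imaginary part of the transformed causal cepstrum is the phase.
    return [v.imag for v in dft(folded)]

n = 64
h = [1.0, -0.5] + [0.0] * (n - 2)          # hypothetical minimum-phase filter
H = dft(h)
estimated = minimum_phase_from_magnitude([abs(v) for v in H])
true_phase = [cmath.phase(v) for v in H]
max_phase_error = max(abs(e - t) for e, t in zip(estimated, true_phase))
```

For a measured head-related transfer function, subtracting this phase from the measured phase would leave the excess-phase part, which the text identifies as a frequency-independent delay.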
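[Equations (4a) through (8b) are garbled in the source. Legible fragments: Eqs. (4) and (5) contain the terms $|S|^2$ and $|A|^2$, with $|P|$ on the left-hand side of one of them; Eqs. (8a) and (8b), under the label "Shuffler", pair $N$ and $P$ respectively with $E(\theta_0)$.]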
our loudspeaker playback of recordings from the Neumann KU-80 head. As indicated in the Introduction,
the stereo effect was of a "hole in the middle you could
drive a truck through," as one listener said. When converted to transaural, using the crosstalk canceler built
with the functions of Fig. 3, Schroeder's description
of "nothing less than amazing" spatial and imaging
qualities certainly applied, but it was possible to notice
that the equalization was "a little off." Later, the appearance of a "hole" tendency in this recording would
alert us to early reflections in a listening setup. As we
also noted, recordings from the Aachen head (0° equalization) provided stereo of unequaled excellence by
ordinary standards. Certainly, no "hole" was observed,
even without cancellation.
1.6 System Transfer Functions
In the following, M will be used to designate either
N or P. It will be understood to be a function of frequency and incidence angle. Thus for natural directional
hearing, either member of the pair of overall transfer
functions from a source at angle $\theta$ to the ears is designated

$$H_n = M_n(\theta) \tag{9}$$
$$H_t = \frac{M_a(\theta)\, M_n(\theta_0)}{M_x(\theta_0)} \tag{11}$$
which to undertake departures in the service of practicality. It is usually one of the strengths of starting
from an optimal position that departures from the optimum in design parameters produce remarkably
small effects.
1.7 Practical Design Considerations
Except for a custom-designed crosstalk canceler, it
is not to be expected that Mx will be the same as M0,
and a commercial release of a transaural recording would
have to embody an Mx that would be required to be
satisfactory for a wide range of listener heads, each
with its own M0. Generally this is not a difficult requirement. It has been found, for example, that the
crosstalk canceler based on a spherical-model head [2],
[3] produces immensely satisfying results for a wide
range of listeners' heads. Heads that are somewhat
small may be placed somewhat nearer the loudspeakers,
and those that are somewhat large may be placed at a
somewhat greater distance, as may be seen from the
structure of head functions, but the exact placement
does not seem to be a critical matter for most listeners.
What is probably the case is not that a sphere is
necessarily a best fit, but that it is a "comfortable" fit
for most heads just because of its inexactness. While
the advantages of inexactness merit further exploration,
we have tried another aspect for inexact treatment, the
domain of wavelengths shorter than about 50 mm (frequencies higher than about 6 kHz). The first experimental crosstalk-canceler filters followed, after a
somewhat abrupt transition, the null-crosstalk contour
of Eq. (6) for the shorter wavelengths. We attribute
the tolerance in listener movement to these aspects of
inexactness in Mx filters.
The choice of a rather abrupt "cut" in our first experimental canceler may have been somewhat extreme.
We do notice a tendency for sibilantlike sounds and
clicklike sounds to be mislocated, generally toward
the front. This is a confirmation, extended to short
wavelengths, of the importance of interaural phase.
Although this style of design variation has proved instructive, we are now inclined to rely on a more uniform
distribution of inexactness, of which the spherical
functions are a good example. Another variation of
interest is that of introducing a gradual taper, as shown
in Fig. 3(a), dashed line, wherein the upper and lower
envelopes approach the null-crosstalk contour in a
somewhat less accelerated manner for short wavelengths, replacing the more abrupt cut.
We visualize these styles of inexactness as defining
a volume of space near each ear of the listener, a space
over which cancellation is satisfactorily accurate. We
visualize this volume as being of smaller extent for the
shorter wavelengths, and we suppose that it is appropriate to be less exact at these shorter wavelengths.
We also believe, despite our successes with spherical
functions, that we need to continue to investigate this
problem. Thus the tolerance we have gained for listener
movement, already satisfactory for most purposes, may
be extended.
2 BINAURAL SYNTHESIS
2.1 Synthesis Filters
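[Equations (13) through (17) are garbled in the source. Legible fragments: a synthesis transfer function $H_s$ formed from $M_s(\theta_i)$ and further functions of $\theta_0$ and $\theta_i$.]

and

$$K = \frac{2}{|S/A| + |A/S|} \tag{18}$$

It is seen that

[Equation (19) is not recoverable from the source.]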
[Equation (20) is garbled in the source; the legible fragment relates $|M|$ to the factor $(1 \pm K)^{1/2}$.]
Fig. 6. Plots of interaural difference in level |S/A| versus angle of incidence for various frequencies (Hz). Adapted from
[22].
Fig. 7. Plots of alternation envelopes, square root of 1 ± K, versus angle of incidence for three highest frequencies of Fig.
6.
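The connection between the level differences of Fig. 6 and the envelopes of Fig. 7 can be sketched as a short calculation. The sketch assumes K = 2/(|S/A| + |A/S|) and takes the envelopes to be (1 ± K)^(1/2), expressed in decibels relative to the null-crosstalk level (|S|^2 + |A|^2)^(1/2); both are readings of a damaged source rather than confirmed formulas, and the function names and the 20 dB example value are hypothetical.

```python
import math

def crosstalk_k(level_difference_db):
    """K from the interaural level difference 20*log10|S/A| (assumed formula)."""
    r = 10.0 ** (level_difference_db / 20.0)   # |S/A|
    return 2.0 / (r + 1.0 / r)

def alternation_envelopes_db(level_difference_db):
    """Upper and lower envelopes sqrt(1 + K) and sqrt(1 - K) in dB."""
    k = crosstalk_k(level_difference_db)
    upper = 10.0 * math.log10(1.0 + k)          # 20*log10(sqrt(1 + K))
    lower = 10.0 * math.log10(1.0 - k) if k < 1.0 else float("-inf")
    return upper, lower

# Cross-check against the direct extremes |S| + |A| and |S| - |A|.
s_mag, a_mag = 10.0, 1.0                        # a 20 dB level difference
null_level = math.hypot(s_mag, a_mag)
direct_upper = 20.0 * math.log10((s_mag + a_mag) / null_level)
direct_lower = 20.0 * math.log10((s_mag - a_mag) / null_level)
upper_db, lower_db = alternation_envelopes_db(20.0)
```

As the level difference grows, K falls toward zero and both envelopes close in on the null-crosstalk level, which is consistent with treating the shortest wavelengths less exactly.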
3.2 Monitoring
Facilities for earphone monitoring require 30° free-field equalization as above, if it is not internal to the
earphone. If the program material to be monitored is
in the form of loudspeaker signals (whether transaural
or conventional stereo), there would also be needed a
binaural-synthesizer version of a circuit devised by
Bauer [17], the so-called Bauer box. The two inputs
would be processed to simulate 30°.
Loudspeaker monitoring would require transaural
monitor equipment to derive the proper signals from
binaural material. It could embody a crosstalk canceler
of standard grade adopted for mass distribution. Some
means of assurance of adherence to a standard would
be needed for full reliance on such monitoring. Also,
[Figure not reproduced: block diagram with the labels "Ensemble", "Art. Head", and "Binaural Output"; caption not recovered.]
A virtual loudspeaker is a transaural image synthesized to simulate the effect of a loudspeaker placed at
a specified image location. The process involves binaural synthesis followed by transaural conversion.
For example, an experimental processor has been constructed that makes a pair of loudspeakers placed at
15° sound as if the loudspeakers had been placed at
30°. Applications are indicated below.
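The virtual-loudspeaker process, binaural synthesis followed by transaural conversion, can be sketched at one frequency in sum-and-difference form. Everything in the head model below is hypothetical: `toy_head` uses a unit same-side path and a delayed, attenuated alternate-side path, which is neither the spherical model nor measured data, and the 1 kHz test frequency and signal values are arbitrary.

```python
import cmath
import math

def toy_head(theta_deg, freq_hz, ear_spacing_m=0.18, speed_of_sound=343.0):
    """Hypothetical head model returning (S, A) for a loudspeaker at theta."""
    s = math.sin(math.radians(theta_deg))
    delay = ear_spacing_m * s / speed_of_sound   # assumed interaural delay
    gain = 10.0 ** (-6.0 * s / 20.0)             # assumed head shadow
    S = 1.0 + 0.0j
    A = gain * cmath.exp(-2j * math.pi * freq_hz * delay)
    return S, A

def ears(left_feed, right_feed, S, A):
    """Ear signals for a symmetric loudspeaker pair."""
    return S * left_feed + A * right_feed, A * left_feed + S * right_feed

def virtual_loudspeaker_feeds(left, right, actual_deg, virtual_deg, freq_hz):
    """Feeds for loudspeakers at actual_deg that imitate a pair at virtual_deg."""
    S_a, A_a = toy_head(actual_deg, freq_hz)
    S_v, A_v = toy_head(virtual_deg, freq_hz)
    # Synthesis (virtual N, P) followed by cancellation (1/N, 1/P of actual).
    diff = (left - right) * (S_v - A_v) / (S_a - A_a)
    summ = (left + right) * (S_v + A_v) / (S_a + A_a)
    return (summ + diff) / 2, (summ - diff) / 2

freq = 1000.0
left, right = 1.0 + 0.0j, 0.3 - 0.2j
feeds = virtual_loudspeaker_feeds(left, right, 15.0, 30.0, freq)
heard = ears(feeds[0], feeds[1], *toy_head(15.0, freq))
target = ears(left, right, *toy_head(30.0, freq))
virtual_error = max(abs(h - t) for h, t in zip(heard, target))
```

Within this toy model the ear signals produced by the processed feeds at 15° match those that unprocessed feeds would produce at 30°.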
4.1 Correction of Loudspeaker Placement
[7] M. C. Killion, "Equalization Filter for Eardrum-Pressure Recording Using a KEMAR Manikin," J. Audio Eng. Soc., vol. 27, pp. 13-16 (1979 Jan./Feb.).
[8] B. S. Atal and M. R. Schroeder, "Apparent Sound
Source Translator," U.S. patent 3,236,949 (1966 Feb.
22).
[9] M. R. Schroeder and B. S. Atal, "Computer
Simulation of Sound Transmission in Rooms," IEEE
Conv. Rec., pt. 7, pp. 150-155 (1963).
[10] M. R. Schroeder, "Digital Simulation of Sound
Transmission in Reverberant Spaces," J. Acoust. Soc.
Am., vol. 47, pp. 424-431 (1970 Feb.).
[11] M. R. Schroeder, "Computer Models for Concert Hall Acoustics," Am. J. Phys., vol. 41, pp. 461-471 (1973 Apr.).
[12] M. R. Schroeder, "Models of Hearing," Proc.
IEEE, vol. 63, pp. 1332-1350 (1975 Sept.).
[13] P. Damaske, "Head-Related Two-Channel
Stereophony with Loudspeaker Reproduction," J. Acoust.
Soc. Am., vol. 50, pt. 2, pp. 1109-1115 (1971 Oct.).
[14] T. Mori, G. Fujiki, N. Takahashi, and F. Maruyama, "Precision Sound-Image-Localization Technique Utilizing Multitrack Tape Masters," J. Audio
Eng. Soc. (Engineering Reports), vol. 27, pp. 32-38
(1979 Jan./Feb.).
[15] H. W. Gierlich and K. Genuit, "Processing Artificial-Head Recordings," J. Audio Eng. Soc. (Engineering Reports), vol. 37, this issue, pp. 35-40. Also,
W. Bray, private communication (1987 Nov.).
[16] E. L. Torick, A. Di Mattia, A. J. Rosenheck,
[17] B. B. Bauer, "Stereophonic Earphones and Binaural
Loudspeakers," J. Audio Eng. Soc., vol. 9, pp.
148-151 (1961 Apr.).
[18] H. Møller, "Cancellation of Crosstalk in Artificial-Head Recordings Reproduced through Loudspeakers," J. Audio Eng. Soc., vol. 37, this issue, pp.
31-34.
[19] S. Mehrgardt and V. Mellert, "Transformation
Characteristics of the External Human Ear," J. Acoust.
Soc. Am., vol. 61, pp. 1567-1576 (1977 June).
[20] D. H. Cooper and J. L. Bauck, "Corrections
to L. Schwarz, 'On the Theory of Diffraction of a Plane
Soundwave Around a Sphere' ['Zur Theorie der Beugung einer ebenen Schallwelle an der Kugel,' Akust.
Z., vol. 8, pp. 91-117 (1943)]," J. Acoust. Soc. Am.,
vol. 80, pp. 1793-1802 (1986 Dec.).
[21] E. L. Torick, private communication (1975
Nov.).
[22] H. Mertens, "Directional Hearing in Stereophony- Theory and Experimental Verification," EBU
Rev., pt. A, no. 92, pp. 146-168 (1965 Aug.).
[23] J. S. Russotti, T. P. Santoro, and G. B. Haskell,
"Proposed Technique for Earphone Calibration,"
J. Audio Eng. Soc., vol. 36, pp. 643-650 (1988 Sept.).
[24] M. A. Gerzon, "Ambisonics in Multichannel
Broadcasting and Video," J. Audio Eng. Soc., vol. 33,
pp. 859-871 (1985 Nov.).
THE AUTHORS
D. H. Cooper
J. L. Bauck