Machine Translation 15: 149-185, 2000.
© 2001 Kluwer Academic Publishers. Printed in the Netherlands.

NINE ISSUES IN SPEECH TRANSLATION

MARK SELIGMAN
GETA, Université Joseph Fourier, 385 rue de la Bibliothèque, 38041 Grenoble Cedex 9, France
(E-mail: markseligman@earthlink.net)
1. Introduction
This paper reviews some aspects of the author's research in spoken language translation (SLT) since 1992. Since the purpose is to prompt discussion, the treatment is informal, programmatic, and speculative. There is frequent reference to work in progress - in other words, work for which evaluation is incomplete.

The paper sketches work in nine areas: interactive disambiguation; system architecture; data structures; the interface between speech recognition (SR) and analysis; the use of natural pauses for segmenting utterances; example-based machine translation; dialogue acts; the tracking of lexical co-occurrences; and the resolution of translation mismatches. There is no attempt to provide a balanced survey of the SLT scene. Instead, the hope is to provide a provocative and somewhat personal look at the field by spotlighting it from nine directions - in some respects, to offer an editorial rather than purely a report.

One of the most significant and difficult aspects of the SLT problem is the need to integrate effectively many different sorts of knowledge: phonological, prosodic, morphological, syntactic, semantic, discourse, and domain knowledge should ideally work together to produce the most accurate and helpful translation. Thus a trend toward greater integration of knowledge sources is visible in current SLT research (e.g., in the Verbmobil project, Wahlster, 1993), and most of the work described below is in this integrative direction. Many of the issues to be discussed here could in fact be addressed by dedicated pieces of software playing parts in an integrated SLT system. The paper's conclusion will review the issues by sketching an idealized system of this sort - a kind of personal dream team in which the components are team members.
2. Interactive Disambiguation
In the present state of the art, several stages of SLT leave ambiguities which current techniques cannot yet resolve correctly and automatically. Such residual ambiguity plagues SR, analysis, transfer, and generation alike.

Since users can generally resolve these ambiguities, it seems reasonable to incorporate facilities for interactive disambiguation into SLT systems, especially those aiming for broad coverage. A good idea of the range of work in this area can be gained from Boitet (1996a).

In fact, Seligman (1997) suggests that, by stressing such interactive disambiguation - for instance, by using highly interactive commercial dictation systems for input, and by adapting existing techniques for interactive disambiguation of text translation (Boitet, 1996b; Blanchon, 1996) - practically usable SLT systems may be constructable in the near term. In such "quick and dirty" or "low road" SLT systems, user interaction is substituted for system integration. For example, the interface between SR and analysis can be supplied entirely by the user, who can correct SR results before passing them to translation components, thus bypassing any attempt at effective communication or feedback between SR and MT.

The argument, however, is not that the "high road" toward integrated and maximally automatic systems should be abandoned. Rather, it is that the low road of forgoing integration and embracing interaction may offer the quickest route to widespread usability, and that experience with real use is vital for progress. Clearly, the high road is the most desirable for the longer term: integration of knowledge sources is a fundamental issue for both cognitive and computer science, and maximally automatic use is intrinsically desirable. The suggestion, then, is that the low and high roads be traveled in tandem; and that even systems aiming for full automaticity recognize the need for interactive resolution when automatic
resolution is insufficient. As progress is made along the high road and increasing knowledge can be applied to automatic ambiguity resolution, interactive resolution should be necessary less often. When it is necessary, its quality should be improved: questions put to the user should become more sensible and more tightly focused.

These design concepts have been informally and partly tested in two demos, first at the Machine Translation Summit in San Diego in October 1997, and a second time at the meeting of C-Star II (Consortium for Speech Translation Advanced Research) in Grenoble, France, in January 1998. Both demos were organized and conducted under the supervision of Mary Flanagan, and both demo systems were based upon a text-based chat translation system previously built by Flanagan's team at CompuServe, Inc. The company's proprietary on-line chat technology was used, as distinct from Internet Relay Chat, or IRC (Pyra, 1995).1

In an on-line chat session, users most often converse as a group, though one-on-one conversations are also easy to arrange. Each conversant has a small window used for typing input. Once the input text is finished, the user sends it to the chat server by pressing Return. The text comes back to the sender after an imperceptible interval, and appears in a larger window, prefaced by a header indicating the author. Since this larger window receives input from all parties to the chat conversation, it soon comes to resemble the transcript of a cocktail party, often with several conversations interleaved.

Each party normally sees the "same" transcript window. However, prior to the SLT demos, CompuServe had arranged to place at the chat server a commercial translation system of the direct variety, enabling several translation directions. Once the user of this experimental chat system had selected a direction (say English-French), all lines in the transcript window would appear in the source language (in this case, English), even if some of the contributions originated in the target language (here, French). Bilingual text conversations were thus enabled between English typists and writers of French, German, Spanish, or Italian.

At the time of the demos, total delay from the pressing of Return until the arrival of translated text in the interlocutor's transcript window averaged about six seconds, well within tolerable limits for conversation.2
At the author's suggestion and with his consultation, highly interactive SLT demos were created by adding SR front ends and speech-synthesis back ends to CompuServe's text-based chat-translation system. Two laptops were used, one running English input and output software (in addition to the CompuServe client, modified as explained below), and one running the comparable French programs. Commercial dictation software was employed for SR. For the first demo, both sides used discrete dictation, in which short pauses are required between words; for the second demo, English was dictated continuously - that is, without required pauses - while French continued to be dictated discretely.3
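The control flow of such a "low road" turn can be sketched as below. This is not CompuServe's actual code; all function bodies, names, and the demo translation memory are invented stand-ins, and the `edit` callable stands in for the dictation product's interactive correction interface.

```python
def recognize_speech(audio):
    """Stand-in for a commercial dictation engine."""
    return "I'm going skiing."

def confirm_with_user(candidate, edit=None):
    """The user reviews and may correct the SR output before translation."""
    return edit(candidate) if edit else candidate

def translate(text):
    """Stand-in for the server-side English-French chat translation."""
    demo_memory = {"I'm going skiing.": "Je vais faire du ski."}
    return demo_memory.get(text, text)

def synthesize(text):
    """Stand-in for the speech-synthesis back end."""
    print("[TTS]", text)

def low_road_turn(audio, edit=None):
    candidate = recognize_speech(audio)
    # User interaction replaces SR/MT integration: only corrected text
    # is passed on, so SR and MT never communicate directly.
    corrected = confirm_with_user(candidate, edit)
    target = translate(corrected)
    synthesize(target)
    return target
```

The point of the sketch is the placement of the correction step: everything downstream of `confirm_with_user` sees clean text, which is what lets off-the-shelf dictation and translation components be chained without any shared knowledge.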
(2) French: Qu'est-ce que vous faites plus tard? (What are you doing later?)
    English: I'm going skiing. (Je vais faire du ski.)
    French: Vous n'avez pas besoin de travailler? (You don't need to work?)
    English: I'll take my computer with me. (Je prendrai mon ordinateur avec moi.)
    French: Où est-ce que vous mettrez l'ordinateur pendant que vous skiez? (Where will you put the computer while you ski?)
    English: In my pocket. (Dans ma poche.)

As these examples suggest, the level of language remained basic, and sentences were purposely kept short, with standard grammar and punctuation.
A primary purpose of the chat SLT demos was to show that SLT is both feasible and suitable for on-line chat users, at least at the proof-of-concept level.

In my own view, the demos were successful in this respect. The basic feasibility of the approach appears in the fact that most demo utterances were translated
3. System Architecture
An ideal architecture for "high road", or highly integrated, SLT systems would allow global coordination of, cooperation between, and feedback among, components (SR, analysis, transfer, etc.), thus moving away from linear or pipeline arrangements. For instance, SR, as it moves through an utterance, should be able to benefit from preliminary analysis results for segments earlier in the utterance. The architecture should also be modular, so that a variety of configurations can be tried: it should be possible, for instance, to exchange competing SR components; and it should be possible to combine components not explicitly intended for work together, even if these are written in different languages or running on different machines.

Blackboard architectures have been proposed (Erman & Lesser, 1990) to permit cooperation among components. In such systems, all participating components read from and write to a central set of data structures, the blackboard. To share this common area, however, the components must all "speak a common (software) language". Modularity thus suffers, since it is difficult to assemble a system from components developed separately. Further, blackboard systems are widely seen as difficult to debug, since control is typically distributed, with each component determining independently when to act and what actions to take.
In order to maintain the cooperative benefits of a blackboard system while enhancing modularity and facilitating central coordination or control of components, Seligman and Boitet (1993) and Boitet and Seligman (1994) proposed and demonstrated a "whiteboard" architecture for SLT. As in the blackboard architecture, a central data structure is maintained which contains selected results of all components. However, the components do not access this whiteboard directly. Instead, only a privileged program called the Coordinator can read from it and write to it. Each component communicates with the Coordinator and the whiteboard via a go-between program called a Manager, which handles messages to and from the Coordinator in a set of mailbox files. Because files are used as data-holding areas in this way, components (and their managers) can be freely distributed across many machines.9

Managers are not only mailmen, but interpreters: they translate between the reserved language of the whiteboard and the native languages of the components, which are thus free to differ. In our demo, the whiteboard was maintained in a commercial Lisp-based object-oriented language, while components included independently-developed SR, analysis, and word-lookup components written in C.
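The Coordinator/Manager arrangement can be sketched as follows. This is a minimal illustration, not the demo code: an in-memory queue stands in for the mailbox files, and the component functions and message format are invented.

```python
import queue

class Manager:
    """Go-between: runs its component, translates the native-format result
    into the whiteboard's reserved format, and mails it to the Coordinator."""
    def __init__(self, name, component, coordinator_inbox):
        self.name, self.component, self.inbox = name, component, coordinator_inbox

    def run(self, payload):
        result = self.component(payload)                     # native format
        self.inbox.put({"from": self.name, "data": result})  # reserved format

class Coordinator:
    """Only the Coordinator reads from and writes to the whiteboard."""
    def __init__(self):
        self.whiteboard = {}        # central store of selected results
        self.inbox = queue.Queue()  # stand-in for the mailbox files

    def collect(self):
        while not self.inbox.empty():
            msg = self.inbox.get()
            self.whiteboard[msg["from"]] = msg["data"]

coord = Coordinator()
sr = Manager("SR", lambda audio: ["i'm", "going", "skiing"], coord.inbox)
analysis = Manager("analysis", lambda words: ("S", words), coord.inbox)

sr.run(b"raw audio")
coord.collect()
analysis.run(coord.whiteboard["SR"])  # the Coordinator forwards SR results
coord.collect()
```

Because components see only their Manager's message traffic, never the whiteboard itself, a component written in any language (or running on any machine) can be swapped in so long as its Manager can translate its native output.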
Overall, the whiteboard architecture can be seen as an adaptation of blackboard
4. Data Structures
We have argued the desirability for system coordination of a central data structure where selected results of various components are assembled. The question remains how that data structure should be arranged. The ideal structure should clarify all of the relevant relationships, in particular clearing up the matter of representational "levels" - a confusing term with several competing interpretations.

Boitet and Seligman (1994) presented several arguments for the use of interrelated lattices for maintaining components' results. Here I present one possible elaboration, suggesting a multi-dimensional set of structures in which three meanings of "level" are kept distinct (Figure 1).
We first distinguish an arbitrary number of "Stages of Translation", with each stage viewable as a long scroll of paper extending across our view from left to right. Left-right is the time dimension, with earlier elements on the left. The Stage 0 scroll represents the raw input to the SLT system, including for example the unprocessed speech input from both speakers and the record of one speaker's mouse clicks on an on-screen map, such as might be used for a direction-finding task. In its full extent from left to right, Stage 0 would thus include the raw input for a translation session once complete, for example, for a dialogue to be translated.
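A toy rendering of this picture: each stage is a time-ordered scroll of items (start time, end time, content), and the session is a list of such scrolls. All field names and sample events here are invented for illustration.

```python
import bisect

stages = [[] for _ in range(3)]  # Stage 0, 1, 2

def add_item(stage, start, end, content):
    """Insert an item, preserving the left-right (time) ordering of the scroll."""
    bisect.insort(stages[stage], (start, end, content))

# Stage 0: raw input from both speakers - speech plus, e.g., mouse clicks
add_item(0, 0.0, 1.4, ("speaker_A", "speech", "<raw audio>"))
add_item(0, 0.9, 1.0, ("speaker_A", "mouse_click", (120, 45)))

# Stage 1: first processing results, e.g. phoneme spotting, on the same time axis
add_item(1, 0.0, 0.3, ("speaker_A", "phoneme", "k"))
```

The shared time axis is the key property: an item at any stage can be related to the raw input and to items at other stages purely through its start and end times.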
Stage 1 contains the results of the first stage of processing, whatever processes might be involved. This scroll, viewed as unrolling behind Stage 0, might for instance include twin sets of lattices representing the results of phoneme spotting within both speakers' raw input. Stages 2, 3, ..., n unroll in turn behind Stage 1, receding in depth. Stage 2 might include source-language syntactic trees; Stage 3
   in the corpus investigated, pause units are in fact about 60% the length of entire utterances, on the average, when measured in Japanese morphemes. The average length of pause units was 5.89 morphemes, as compared with 9.39 for whole utterances. Further, pause units are less variable in length than entire utterances: the standard deviation is 5.79 as compared with 12.97.
2. Would hesitations give even shorter, and thus perhaps even more manageable, segments if used as alternate or additional boundaries? The answer seems to be that because hesitations so often coincide with pause boundaries, the segments they mark out are nearly the same as the segments marked by pauses alone. No combination of expressions was found which gave segments as much as one morpheme shorter than pause units on average.
3. Is the syntax within pause units relatively manageable? A manual survey showed that, once hesitation expressions are filtered from them, some 90% of the pause units studied can be parsed using standard Japanese grammars; a variety of special problems appear in the remaining 10%.
4. Is translation of isolated pause units a possibility? We found that a majority of the pause units in four dialogues gave understandable translations into English when translated by hand.

The study provided encouragement for a "divide and conquer" analysis strategy, in which parsing and perhaps translation of pause units is carried out before, or even without, attempts to create coherent analyses of entire utterances.
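The measurements above amount to a simple computation over pause-annotated transcripts, which can be sketched as follows. The corpus below is a toy stand-in for the ATR data, with morphemes space-separated and "<p>" marking a pause boundary (an invented notation).

```python
from statistics import mean

utterances = [
    "a b c <p> d e",          # two pause units of 3 and 2 morphemes
    "f g <p> h i j k <p> l",  # three pause units of 2, 4, and 1 morphemes
]

def pause_units(utt):
    """Split an utterance into pause units at the pause markers."""
    return [seg.split() for seg in utt.split("<p>")]

unit_lengths = [len(u) for utt in utterances for u in pause_units(utt)]
utt_lengths = [len(utt.replace("<p>", " ").split()) for utt in utterances]

# Pause units come out shorter on average than whole utterances,
# and so, one hopes, more manageable for analysis.
print(mean(unit_lengths), mean(utt_lengths))
```

On the real corpus this comparison yielded the 5.89 vs. 9.39 morpheme averages cited above.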
As mentioned, parsability of spontaneous utterances might be enhanced by filtering hesitation expressions from them in preprocessing. Research on spotting techniques for such expressions would thus seem to be worthwhile. Researchers can exploit a speaker's tendency to lengthen hesitations, and to use them just before or after natural pauses.

Use of pause information for "dividing utterances into meaningful chunks" during SLT of Japanese is described by Takezawa et al. (1999). Pauses are used as segment boundaries in several commercial dictation products, but no descriptions are available.
7. Example-Based SLT
Example-based MT (EBMT) (Nagao, 1984; Sato, 1991) is translation by analogy. An EBMT system translates source-language sentences by reference to an "example base", or set of source-language utterances paired with their target-language equivalents. In developing such a system, the hope is to improve translation quality by reusing correct and idiomatic translations; to partly automate grammar development; and to gain insight into language learning.

Two EBMT systems are now being applied to SLT: the TDMT (Transfer-driven MT) system developed at ATR (Furuse & Iida, 1996; Iida et al., 1996; Sumita & Iida, 1992), used in the ATR-Matrix SLT system (Takezawa et al., 1999); and the PanEBMT system (Brown, 1996) of CMU, used along with transfer-based MT
b. conference of Kyoto
c. conference in Kyoto
d. Kyoto conference

We could hope to provide such improved translations if we had an example base showing for instance that (6a) had been translated as (6b) or (6c), and that (7a) had been rendered as (7b) or (7c).

b. conference in Tokyo
c. Tokyo conference
d. New York conference
The strategy would be to recognize a close similarity between the new input (5a) and these previously translated noun phrases, based on the semantic similarity between kyoto on one hand and tokyo and nyu yoku on the other. The same sort of pattern matching could be performed against a noun phrase in the example base differing from the input at more than one point, for example (8), where miitingu ('meeting') is semantically similar to kaigi ('conference').
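The matching strategy just described can be sketched in miniature. The similarity values, the tiny thesaurus, the example base, and the final word-substitution step are all invented for illustration; real EBMT systems use richer similarity measures and structured adaptation.

```python
# Hand-made word-similarity table: a crude stand-in for a thesaurus.
SIMILAR = {("kyoto", "tokyo"): 0.8, ("kyoto", "nyu yoku"): 0.7,
           ("kaigi", "miitingu"): 0.9}

def word_sim(a, b):
    if a == b:
        return 1.0
    return SIMILAR.get((a, b)) or SIMILAR.get((b, a)) or 0.0

def score(input_words, example_words):
    """Sum of slot-by-slot similarities between input and a stored example."""
    if len(input_words) != len(example_words):
        return 0.0
    return sum(word_sim(a, b) for a, b in zip(input_words, example_words))

examples = {("tokyo", "no", "kaigi"): "conference in Tokyo",
            ("nyu yoku", "no", "miitingu"): "New York meeting"}

new_input = ("kyoto", "no", "kaigi")
best = max(examples, key=lambda ex: score(new_input, ex))
# Crude stand-in for slot substitution in the retrieved translation:
translation = examples[best].replace("Tokyo", "Kyoto")
```

Here "kyoto no kaigi" retrieves the "tokyo no kaigi" example, because kyoto scores high against tokyo and the remaining slots match exactly, and yields "conference in Kyoto" by substitution.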
9.3. EVALUATION
We are presently reporting the implementation of facilities intended to enable many experiments concerning morphological and morpho-semantic co-occurrence; the experiments themselves remain for the future. Clearly, further testing is necessary to demonstrate the reliability and usefulness of the approach. A principal aim would be to determine how large the corpus must be before consistent co-occurrence predictions are obtained. Nevertheless, some indication of the basic usability of the data is in order.
Tools have been provided for comparing two corpora with respect to any of the fields in the records relating to morphs, morph co-occurrences, cats, or cat co-occurrences. Using these, we treated 15 of our dialogues as a training corpus, and the one remaining dialogue as a test corpus. We compared the two corpora in terms of conditional probabilities for morph co-occurrences. In both cases, statistically unsmoothed scores were used for simplicity of interpretation.

We found 5,162 co-occurrence pairs above a conditional probability threshold of 0.10 in the training corpus and 1,552 in the test. Since 509 pairs occurred in both corpora, the training corpus covered 509 out of 1,552, or 33%, of the test corpus. That is, one third of the morph co-occurrences with conditional probabilities above 0.10 in the test corpus were anticipated by the training corpus.
This coverage seems respectable, considering that the training corpus was small and that neither statistical nor semantic smoothing was used. More important than coverage, however, is the presence of numerous pairs for which good co-occurrence predictions were obtained. Such predictions differ from those made using n-grams in that they need not be chained, and thus need not cover the input to be useful: if consistently good co-occurrence predictions can be recognized, they can be exploited selectively.

The figures obtained for cats and cat co-occurrences are comparable.
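The coverage calculation reported above can be sketched as follows. The toy corpora and the simple within-utterance pair counting are invented stand-ins; the real tooling's records and conditioning are richer.

```python
from collections import Counter
from itertools import combinations

def cooc_pairs(corpus, threshold=0.10):
    """Return unordered morph pairs whose unsmoothed conditional probability,
    count(pair) / count(first morph), exceeds the threshold."""
    unigrams, pairs = Counter(), Counter()
    for utt in corpus:
        morphs = utt.split()
        unigrams.update(morphs)
        pairs.update(combinations(sorted(set(morphs)), 2))
    return {p for p, n in pairs.items() if n / unigrams[p[0]] > threshold}

train = ["kaigi wa kyoto desu", "kaigi wa doko desu ka"]
test = ["kaigi wa itsu desu ka"]

train_pairs, test_pairs = cooc_pairs(train), cooc_pairs(test)
# Coverage: what fraction of the test pairs did training anticipate?
coverage = len(train_pairs & test_pairs) / len(test_pairs)
```

On the real data this calculation gave 509 shared pairs out of 1,552 test pairs, the 33% coverage figure cited above.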
(21) a. He ate.
     b. Tabemashita.
        EAT-past

(23) *bought book

b. He is studying.
11. Conclusions
The first section of the paper described a "low road" or "quick and dirty" approach to SLT, in which interactive disambiguation of SR and translation is temporarily substituted for system integration. This approach, I believe, is likely to yield broad-coverage systems with usable quality sooner than approaches which aim for maximally automatic operation based upon tight integration of knowledge sources and components.

Two demonstrations of "quick and dirty" SLT over the Internet were reported. For the demos, an experimental chat translation system created by CompuServe, Inc. was provided with front and back ends, using commercial dictation products for speech input and commercial speech-synthesis engines for speech output. The dictation products' standard interfaces were used to debug dictation results interactively. While evaluation of these experiments remained informal, coverage was much broader than in most SLT experiments to date - in the tens of thousands of words. While interactive control of translation was lacking, output quality was probably sufficient for many social exchanges.

But while the "low road" may offer the fastest route to usable broad-coverage SLT systems, automatic operation based upon knowledge-source integration is certain to remain desirable in the longer run. Hence the balance of the paper has concentrated on aspects of integrated systems.

Taken together, the nine areas of research examined in the paper suggest a nine-item wish list for an experimental SLT system.

1. The system would include facilities for interactive disambiguation of both speech and translation candidates.
2. Its architecture would allow modular reconfiguration and global coordination of components.
Acknowledgements
Warmest appreciation to CompuServe, Inc. for making the chat-based SLT demonstrations possible. In particular, thanks are due to Mary Flanagan, then Manager, Advanced Technologies, and to Sophie Toole, then Supervisor, Language Support. Ms. Flanagan authorized and oversaw both demos. Ms. Toole organized and conducted the Grenoble demo and played an active role in making the SR and speech-synthesis software operational. Thanks also to Phil Jensen and Doug Chinnock, translation system engineers. The demos made use of pre-existing proprietary software.

Work on all nine of the issues discussed here began at ATR Interpreting Telecommunications Laboratories in Kyoto, Japan. I am very grateful for the support and stimulation I received there.

Thanks also to numerous colleagues at GETA (Groupe d'Études pour la Traduction Automatique) at the Université Joseph Fourier in Grenoble, France; and at DFKI (Deutsches Forschungszentrum für Künstliche Intelligenz) in Saarbrücken, Germany.

The opinions expressed throughout are mine alone.
Notes
1. CompuServe's chat translation project was discontinued in early 1998. All trademarks are hereby acknowledged.
2. A later commercial chat translation service, that of Uni-verse, Inc. (now discontinued), gave a comparable throughput in 2-3 seconds.
3. Continuous French was released just before the second demo, but because little testing time was available, a decision was made to forego its use.
4. SpeechLinks software from SpeechOne, Inc.
5. By March 1998, upgrades of the continuous software had already made this macro less necessary. Direct dictation to the chat window would then have been possible without it, with some sacrifice of advanced features for voice-driven interactive correction of errors.
6. Kowalski et al. (1995) arranged the only previous demonstration known to the author of SLT using commercial dictation software for input (though at least one group (Miike et al., 1988) had previously simulated SLT after a fashion by automatically translating typed conversations). Since Kowalski's users (spectators at twin exposition displays in Boston, Massachusetts and Lyons, France) were untrained, little interactive correction of dictation was possible. For this and other reasons, translation quality was generally low (Burton Rosenberg, personal communication); but as the main purpose of the demo was to make an artistic and social statement concerning future hi-tech possibilities for cross-cultural communication, this was no great cause for concern. Text was transmitted via FTP, rather than via chat as in the experiments reported here. See Seligman (1997) for a fuller account.
7. www.itl.atr.co.jp/matrix/c-star/matrix.en.html
8. www.c-star.org
9. Mailbox files were extensively and successfully used in the French entry in the C-Star II SLT demo of July 22, 1999 (www.c-star.org).
10. Inclusion of other levels is also possible. At the lower limit, assuming the grammar were stochastic, one could even use sub-phone speech segments as grammar terminals, thus subsuming even HMM-based phone recognition in the parsing regime. At an intermediate level between phones and words, syllables could be used.
11. The parse tree was not used for analysis, however. Instead, it was discarded, and a unification-based parser began a new parse for MT purposes on a text string passed from speech recognition.
12. For one example of extensive related work in the framework of the Verbmobil system, see Kompe et al. (1997).
13. A related but distinct proposal appears in Hosaka et al. (1994).
14. Entropic Research Laboratory, Washington, DC, 1993.
15. PanEBMT operates solo only when the entire source expression can be rendered with a single memorized target expression.
16. The sort of generalization suggested here - from graded semantic similarity measurements to graded measurements of similarity along multiple dimensions - should not be confused with that of Generalized EBMT, the example-based technique proposed for CMU's PanEBMT engine. That engine utilizes no graded similarity measurements along any scale. Its generalization instead involves substitution of semantic tags for lexical items in examples and in input, so that for example John Hancock was in Washington becomes (PERSON) was in (CITY).
17. In this calculation, fixed elements are treated differently from variable elements, and variable elements can be weighted to varying degrees: the heads of complex structures are differently weighted than non-heads.
18. See for example the website of the Discourse Resource Initiative, www.georgetown.edu/luper-foy/Discourse-Treebank/dri-home.html, with links to recent workshops, or browse Walker (1999), especially regarding attempted standardization of Japanese discourse labeling (Ichikawa et al., 1999).
References

Aberdeen, John, Sam Bayer, Sasha Caskey, Laurie Damianos, Alan Goldschen, Lynette Hirschman, Dan Loehr and Hugo Trappe: 1999, 'Implementing Practical Dialogue Systems with the DARPA Communicator Architecture', IJCAI-99 Workshop on Knowledge and Reasoning in Practical Dialogue Systems, Stockholm, Sweden, pp. 81-86.
Alexandersson, Jan, Norbert Reithinger and Elisabeth Maier: 1997, 'Insights into the Dialogue Processing of Verbmobil', Fifth Conference on Applied Natural Language Processing, Washington, DC, pp. 33-40.
Barnett, Jim, Kevin Knight, Inderjeet Mani and Elaine Rich: 1990, 'Knowledge and Natural Language Processing', Communications of the ACM 33(8), 50-71.
Black, Alan: 1997, 'Predicting the Intonation of Discourse Segments from Examples in Dialogue Speech', in Y. Sagisaka, N. Campbell and N. Higuchi (eds), Computing Prosody, Springer Verlag, Berlin, pp. 117-128.
Black, Ezra, Roger Garside and Geoffrey Leech: 1993, Statistically-driven Computer Grammars of English: The IBM/Lancaster Approach, Rodopi, Amsterdam.
Blanchon, Hervé: 1996, 'A Customizable Interactive Disambiguation Methodology and Two Implementations to Disambiguate French and English Input', in C. Boitet (1996a), pp. 190-200.
Boitet, Christian (ed.): 1996a, Proceedings of MIDDIM-96 Post-COLING Seminar on Interactive Disambiguation, Le Col de Porte, France.
Boitet, Christian: 1996b, 'Dialogue-based Machine Translation for Monolinguals and Future Self-explaining Documents', in C. Boitet (1996a), pp. 75-85.
Boitet, Christian and Mark Seligman: 1994, 'The "Whiteboard" Architecture: A Way to Integrate Heterogeneous Components of NLP Systems', COLING 94, The 15th International Conference on Computational Linguistics, Kyoto, Japan, pp. 426-430.
Brown, Ralph D.: 1996, 'Example-based Machine Translation in the Pangloss System', COLING-96, The 16th International Conference on Computational Linguistics, Copenhagen, Denmark, pp. 169-174.
Dohsaka, K.: 1990, 'Identifying the Referents of Zero-pronouns in Japanese Based on Pragmatic Constraint Interpretation', 9th European Conference on Artificial Intelligence, ECAI '90, Stockholm, Sweden, pp. 240-245.
Erman, Lee D. and Victor R. Lesser: 1990, 'The Hearsay-II Speech Understanding System: A Tutorial', in A. Waibel and K.-F. Lee (eds), Readings in Speech Recognition, Morgan Kaufmann, San Mateo, CA, pp. 235-245.
Fano, Robert M.: 1961, Transmission of Information: A Statistical Theory of Communications, MIT Press, Cambridge, MA.
Fillmore, Charles J., Paul Kay and Catherine O'Connor: 1988, 'Regularity and Idiomaticity in Grammatical Constructions: The Case of Let Alone', Language 64, 501-538.
Flanagan, Mary: 1997, 'Machine Translation of Interactive Texts', in MT Summit VI, Machine Translation: Past Present Future, San Diego, CA, p. 50.
Frederking, Robert, Alexander Rudnicky and Christopher Hogan: 1997, 'Interactive Speech Translation in the DIPLOMAT Project', Spoken Language Translation: Proceedings of a Workshop Sponsored by the Association for Computational Linguistics and by the European Network in Language and Speech (ELSNET), Madrid, Spain, pp. 61-66.
Furukawa Ryo, Yato Fumihiro and Loken-Kim Kyung-ho: 1993, Denwa kaiwa o maruchimedia kaiwa no tokuchobunseki [Multimedia Dialogue Feature Analysis of Telephone Conversations]. Technical Report TR-IT-0020, ATR Interpreting Telecommunications Laboratories, Kyoto, Japan.
Furuse, Osamu and Hitoshi Iida: 1996, 'Incremental Translation Using Constituent Boundary Patterns', COLING-96, The 16th International Conference on Computational Linguistics, Copenhagen, Denmark, pp. 412-417.
Görz, Günther, Marcus Kesseler, Jörg Spilker and Hans Weber: 1996, 'Research on Architectures for Integrated Speech/Language Systems in Verbmobil', COLING-96, The 16th International Conference on Computational Linguistics, Copenhagen, Denmark, pp. 484-489.
Grosz, Barbara J., Aravind K. Joshi and Scott Weinstein: 1983, 'Providing a Unified Account of Definite Noun Phrases in Discourse', 21st Annual Meeting of the Association for Computational Linguistics, Cambridge, MA, pp. 44-50.
Hearst, Marti A.: 1994, 'Multi-paragraph Segmentation of Expository Text', 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, NM, pp. 9-16.
Hosaka, Junko, Mark Seligman and Harald Singer: 1994, 'Pause as a Phrase Demarcator for Speech and Language Processing', COLING 94, The 15th International Conference on Computational Linguistics, Kyoto, Japan, pp. 987-991.
Ichikawa, A., M. Araki, Y. Horiuchi, M. Ishizaki, S. Itabashi, T. Itoh, H. Kashioka, K. Kato, H. Kikuchi, H. Koiso, T. Kumagai, A. Kurematsu, K. Maekawa, S. Nakazato, M. Tamoto, S. Tutiya, Y. Yamashita and T. Yoshimura: 1999, 'Evaluation of Annotation Schemes for Japanese Discourse', in M. Walker (1999), pp. 26-34.
Iida, Hiroshi, Eichiro Sumita and Osamu Furuse: 1996, 'Spoken Language Translation Method Using Examples', COLING-96, The 16th International Conference on Computational Linguistics, Copenhagen, Denmark, pp. 1074-1077.
Iwadera, T., M. Ishizaki and T. Morimoto: 1995, 'Recognizing an Interactional Structure and Topics of Task-oriented Dialogues', Proceedings of the European Workshop on Spoken Dialogue Systems, Vigsø, Denmark, pp. 41-44.
Jokinen, Kristiina, Hideki Tanaka and Akio Yokoo: 1998, 'Context Management with Topics for Spoken Dialogue Systems', COLING-ACL '98: 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Montreal, Canada, pp. 631-637.
Joshi, Aravind K. and Scott Weinstein: 1981, 'Control of Inference: Role of Some Aspects of Discourse Structure - Centering', Seventh International Joint Conference on Artificial Intelligence (IJCAI-81), Vancouver, BC, pp. 385-387.
Julia, L., L. Neumeyer, M. Charafeddine, A. Cheyer and J. Dowding: 1997, 'HTTP://WWW.SPEECH.SRI.COM/DEMOS/ATIS.HTML', Working Notes of the AAAI-97 Spring Symposium Workshop on Natural Language Processing for the Web, Stanford, CA, pp. 72-76.
Jurafsky, Daniel: 1993, A Cognitive Model of Sentence Interpretation: The Construction Grammar Approach. Technical Report TR-93-077, International Computer Science Institute and Department of Linguistics, University of California, Berkeley.
Kay, Paul: 1990, 'Even', Linguistics and Philosophy 13, 59-216.
Reithinger, Norbert and Martin Klesen: 1997, 'Dialogue Act Classification Using Language Models', Proceedings of the 5th European Conference on Speech Communication and Technology (Eurospeech), Rhodes, Greece, pp. 2235-2238.
Sato, Satoshi: 1991, Example-based Machine Translation, Doctoral thesis (in Japanese), Kyoto University, Japan.
Schütze, Hinrich: 1998, 'Automatic Word Sense Discrimination', Computational Linguistics 24, 97-124.
Searle, J.: 1969, Speech Acts, Cambridge University Press, Cambridge, England.
Seligman, Mark: 1991, Generating Discourses from Networks Using an Inheritance-based Grammar, Dissertation, Department of Linguistics, University of California, Berkeley.
Seligman, Mark: 1994a, CO-OC: Semi-automatic Production of Resources for Tracking Morphological and Semantic Co-occurrences in Spontaneous Dialogues. Technical Report TR-IT-0084, ATR Interpreting Telecommunications Laboratories, Kyoto, Japan.
Seligman, Mark: 1994b, CNTR: Basic Functions for Centering Experiments with ASURA. Technical Report TR-IT-0085, ATR Interpreting Telecommunications Laboratories, Kyoto, Japan.
Seligman, Mark: 1997, 'Interactive Real-time Translation via the Internet', in K. Mahesh (1997), pp. 142-148.
Seligman, Mark, Jan Alexandersson and Kristiina Jokinen: 1999, 'Tracking Morphological and Semantic Co-occurrences in Spontaneous Dialogues', IJCAI-99 Workshop on Knowledge and Reasoning in Practical Dialogue Systems, Stockholm, Sweden, pp. 105-111.
Seligman, Mark and Christian Boitet: 1993, 'A "Whiteboard" Architecture for Automatic Speech Translation', Proceedings of ISSD-93, International Symposium on Spoken Dialogue - New Directions in Human and Man-machine Communication, Tokyo, Japan, pp. 243-246.
Seligman, Mark, Christian Boitet and Boubaker Meddeb-Hamrouni: 1998a, 'Transforming Lattices into Non-deterministic Automata with Optional Null Arcs', COLING-ACL '98: 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Montreal, Canada, pp. 1205-1211.
Seligman, Mark, Laurel Fais and Mutsuko Tomokiyo: 1995, A Bilingual Set of Communicative Act Labels for Spontaneous Dialogues. Technical Report TR-IT-0081, ATR Interpreting Telecommunications Laboratories, Kyoto, Japan.
Seligman, Mark, Mary Flanagan and Sophie Toole: 1998b, 'Dictated Input for Broad-coverage Speech Translation', Association for Machine Translation in the Americas (AMTA-98), Workshop on Embedded MT Systems: Design, Construction, and Evaluation of Systems with an MT Component, Langhorne, PA.
Seligman, Mark, Junko Hosaka and Harald Singer: 1997, '"Pause Units" and Analysis of Spontaneous Japanese Dialogues: Preliminary Studies', in E. Meier, M. Mast and S. Luperfoy (eds), Dialogue Processing in Spoken Language Systems, Springer, Berlin, pp. 110-112.
Seligman, Mark, Masami Suzuki and Tsuyoshi Morimoto: 1993, Semantic-level Transfer in Japanese-German Speech Translation: Some Experiences. Technical Report NLC93-13, Institute of Electronics, Information, and Communication Engineers (IEICE), Japan.
Sidner, Candace: 1979, Toward a Computational Theory of Definite Anaphora Comprehension in English. Technical Report AI-TR-537, MIT, Cambridge, MA.
Sobashima, Yasuhiro and Hitoshi Iida: 1995, 'A Multi-dimensional Analogy-based, Context-dependent, Bottom-up Parsing Method for Spoken Dialogues', Third Natural Language Processing Pacific Rim Symposium NLPRS '95, Seoul, Korea, pp. 586-591.
Sobashima, Yasuhiro and Mark Seligman: 1994, 'Yorei to no tagenteki ruijido keisan ni motodzuku bunmyaku izon no kobun kaiseki ho' [Parsing Method for Example-based Analysis Integrating Multiple Knowledge Sources], Shadan hojin joho shori gakkai dai 49 kai zenkoku taikai koen ronbunshu, Vol. 3, Sapporo, Japan, pp. 103-104.