Jerker Rönnberg
Department of Behavioural Sciences,
Linköping University,
S-581 83 Linköping, Sweden
E-mail: JR@IBV.LIU.SE
few years ago (Rönnberg et al, 1998b). The current formulation is an attempt to develop this framework a step further, and to specify the general parameters of a model, and some of their interactions.
The components involved in the general formulation of the framework are: perceptual input channels, working memory, and long-term memory (Rönnberg, 2003). The model acknowledges top-down processes, in the sense that it allows for information retrieved from long-term memory affecting processing of the perceptual input, specifically utilizing script-related and semantic information about the particular communicative situation or topic (Samuelsson & Rönnberg, 1993). As a rule of thumb, then, the more degraded, distorted or unintelligible the perceived signal, the greater the contribution from long-term memory, and, conversely, the more specified the perceptual input (sometimes conveyed by several poorly specified information channels in concert), the less the need for top-down processing. The reciprocal relationship between top-down and poorly specified bottom-up processing in speech understanding is also assumed to be modulated by the general capacity for processing and storage of information in an on-line working-memory system. Lexical access or word recognition in long-term memory is assumed to be achieved through phonological, and/or contextual, successive specification in real time (see Luce & Pisoni (1998) and Marslen-Wilson (1987) for spoken word identification).
It is important to note that interaction (i.e. bottom-up processing interacting with top-down processing) in the general sense of other auditory speech perception models such as the TRACE model (McClelland & Elman, 1986), the Cohort model (Marslen-Wilson, 1987) and the NAM model (Luce & Pisoni, 1998) is assumed. Owing to the general precondition of poor language input and assumption of multi-modal interaction (Rönnberg et al, 1998b), the framework emphasizes prosodic and syllabic aspects as one means of disambiguating the signal into word boundaries for lexical activation, a feature that has not been made explicit in the above speech perception models (Andersson, 2001). In the word-initial cohort (Marslen-Wilson, 1987), for example, reliance is put on similarities among candidate items that share initial phonetic information. Generally, the present framework is more similar to the NAM, because it also assumes a matching process between input and representations in memory that takes into account both phonological similarity and word frequency (i.e. semantic constraints). The framework takes the whole speech-understanding process a step further by assuming that explicit working-memory capacity is called for only when there is a mismatch in the word decision or lexical activation process. This assumption concerns processing economy (for details, see below). The present framework also seeks to expand the domain of speech understanding into sign understanding. In general, the framework and the model are about the ease with which the processing of multi-modal input and language understanding is accomplished. The model is not a model of language understanding per se.

Classes of working memory model
There are several classes of working-memory model, developed for different theoretical and applied purposes, supported by different kinds of data (see Richardson et al (1996) for an overview). In this context, it is important to note that the tradition following the model of Baddeley and colleagues (e.g. Baddeley, 2000) emphasizes that working-memory resources comprise amodal as well as modality-specific components. Thus, initially a central, executive component as well as two modality-specific slave systems (i.e. the phonological and visuospatial systems), meeting different attention, storage and processing demands, were assumed (Baddeley & Hitch, 1974). Later research on, for example, visual similarity on verbal recall and on the impact of meaning on sentence recall has prompted Baddeley (2000) to suggest a synergistic, episodic memory buffer, capable of binding together information from long-term memory and from the slave systems.
An alternative type of model follows Just and Carpenter and colleagues (e.g. Just & Carpenter, 1992), where language computation resources are seen as more global processing and storage capacities. This model contrasted more sharply with the Baddeley & Hitch (1974) model before the latest revision by Baddeley (2000). Also, recognizing both the modality-specific and modality-free nature of embedded cognition, Wilson (2001) has discussed the similarities and differences that exist when working-memory systems for signed and spoken language are compared. It is with her reasoning and arguments as background that we wish to pursue the general hypothesis that it is reasonable to formulate a general model as a more explicit starting point for comparisons across communication modes and languages.
Thus, when we examine the data and arguments that are pertinent to determining the level and kind of working-memory model for sign and speech which are most appropriate, there seems to be sufficient support to initially test a model where modality specificity is less emphasized, and integration of information is favoured (see below for arguments). Furthermore, attempting to generalize to a cross-language, comparative approach, we will necessarily have to consider potentially common, amodal bases for such a working-memory system.

Empirical arguments for the modality-free speech communication modes aspect of working memory
1. Cross-modal and ‘auditory’ cortical activity may come about as a result of interventions by means of cochlear implants in postlingually deaf users (Zatorre, 2001), and auditory cortical overcompensations, as well as visual cortical recruitment, may represent the resulting, new perceptual strategies (Giraud et al, 2000). Primary visual cortex areas (V1 and V2) are recruited in the cochlear implant user, and the degree to which this is done is also a function of the length of the post-implant phase (Giraud, Price, Graham, Truy & Frackowiak, 2001). Tactile stimuli in the congenitally deaf tactile aid user activate secondary auditory areas (Levänen, 1998). Silent speech-reading activates the auditory cortex (Calvert et al, 1997; Ludman et al, 2000; MacSweeney et al, 2001), and the activation is dependent on speech-reading skill (Ludman et al, 2000) and auditory deprivation (MacSweeney et al, 2001).
Thus, there is strong neurophysiological evidence that in many cases the interdependence, plasticity, and substitutability of brain tissue for different speech-processing modes are commonplace. Depending on the type of intervention and alternative sensory input used, new ‘audio-visual–tactile’, speech-processing strategies may thus be developed. The auditory input channel does not seem to be privileged for speech
Cognition in the hearing impaired and deaf as a bridge between Rönnberg S69
signal and dialogue: a framework and a model
understanding. Speech-relevant information may arrive in different ways (Leybaert, 1998). In general, it seems to be the case that compensatory pressure is at hand when the perceptual input is poorly specified (Rönnberg et al, 1998b), or needs to be recombined to represent a ‘new’ perceptual platform for the profoundly hearing-impaired or deaf individual. This sets the stage for a more general and abstract working-memory model that capitalizes on common bases and potentials of a system that deals with poorly specified input.
2. The generality of certain cognitive and working-memory-related functions has been empirically demonstrated and successfully applied to several communicative domains, such as tactile aids (Rönnberg et al, 1998a; Andersson, Lyxell, Rönnberg & Spens, 2001a,b), cochlear implants (Lyxell et al, 1996, 1998), and visual speech understanding (review: Rönnberg, Samuelsson & Lyxell, 1998c). Briefly, the predictors are: (1) phonology, measured by rhyme tests (i.e. tapping the syllable level); (2) lexical access speed; (3) verbal processing and storage capacity; and (4) verbal inference-making. It is here ventured that communicative proficiency mediated by such predictors, across visual, non-auditory and alternative communication systems, adds to the plausibility of a rather abstract design of a working-memory system. It is clear that speed of lexical access, the nature and quality of phonological representations, and sometimes more complex information-processing indices, represent the predictors both within and across sensory aid domains, and sometimes also across pre- and post-training of speech-understanding performance (Rönnberg et al, 1998b).
3. Research on speech-reading expertise (Lyxell, 1994; Rönnberg, 1993; Rönnberg et al, 1999) shows that, despite differences in onset of hearing impairment or deafness, and despite differences in communicative habits, language use and strategy, there is convergence on the fact that, for successful speech-readers, bottom-up cognitive functions such as lexical-access speed and phonology are relatively normal, whereas top-down cognitive skills, such as complex storage and processing of information, as well as inference-making, are exceptional compared to normal participants. Thus, ‘higher-order’ cognitive and working-memory-related skills are required for speech-understanding expertise, beyond the threshold that is set by the lower-order functions (Rönnberg et al, 1998a), and they operate independently of communication mode.

Empirical arguments for the modality-free language aspect of working memory
1. Lesion data and neurophysiological data seem to suggest similarities in the ways in which left-hemisphere neural networks are active during language processing. What can also be seen is that the similarities hold true across different levels of language and imaging technique (reviews: Rönnberg, Söderfeldt & Risberg, 2000; Wilson, 2001). As regards ‘inner speech’, it was found in a recent imaging study that ‘inner signing’ of sentences engages similar functional networks in the brain to those associated with the activation of the phonological loop: left inferior frontal cortex, rather than visuospatial areas (McGuire et al, 1997).
2. Additional neurophysiological evidence seems to support the notion of an ‘abstract’ site for carrying phonological, syllable-like representations, i.e. planum temporale (Petitto et al, 2000). The syllable may be a reasonably good approximation for the unit that a cross-language working-memory system is based on. This empirical finding undoubtedly fits with the linguistic argument that there are organizational similarities between sign and spoken syllables: positions and movements in sign correspond to margins and nuclei in speech (Brentari, Poizner & Kegl, 1995).
3. Behavioural data also suggest rather strongly that there are similarities across signed and spoken languages with respect to ‘phonological similarity’ effects in working memory (Wilson & Emmorey, 1997), and with respect to the capacity of the ‘phonological’ loop (Marschark & Mayer, 1998). Phonological loop effects have also been observed with cued speech (Leybaert & Lechat, 2001), and hence seem to constitute a rather general operating characteristic of working memory for language. On the whole, many classical working-memory effects are analogous for sign and speech (i.e. similarity, length and suppression effects (Wilson, 2001)).
Given the empirical arguments for the modality-free nature of working memory in terms of neurophysiological, behavioural and cognitive similarities (across speech communication modes and languages), a tentative model can be formulated. The model is abstract in the sense of modality and language independence, and assumes that four parameters are especially crucial in predictions of the ease with which speech and language can be understood. As already indicated, it further assumes that perceptual input channels, long-term memory and working-memory interactions set the structural frame for interactions among parameters.

Model parameters: across communication modes and languages
1. The first assumption is that the interface between perception/perceptual input systems and cognition for speech and sign processing is at the sublexical, syllable level. The accuracy and quality of phonological, syllable-like representations determine parameter fp(P). This assumption is supported by both theory and data. Four points can be adduced here, as follows.
(a) The assumption is founded on the facts that phonological representations stored in long-term memory deteriorate over time and are critical for initial processing of spoken, visual language (i.e. indicated by a text-based, rhyme test (Lyxell, Rönnberg & Samuelsson, 1994; Andersson, 2001)). Phonemically based processing in letter span tasks (i.e. tapping the phonological loop in working memory) does not deteriorate (Andersson, 2001). In addition, the interpretation of correlations with this test is less clear, as the similarity index was negatively correlated with visual speech understanding, suggesting also a visual component (Andersson et al, 2001a). In a phonological lexical-access test, where the subject is instructed to decide as quickly as possible whether a non-word (sometimes a pseudohomophone) sounds like a real word, processing is more explicitly geared toward a phonemic analysis; otherwise the task cannot be solved correctly (Andersson et al, 2001a). Again, the point here is that this task is not as sensitive as a predictor of visual,
the two sources of potential mismatch generally depends on the quality of the input signal and contextual specification of script use; in general, the mismatch depends on the way in which bottom-up and top-down processing balance each other in the language-understanding task. Experimentally, one could, for example, keep a well-defined context, with highly predicted materials, manipulating only the quality of the phonological input to tease out the contribution of the long-term phonological representations. The next logical step would be to offer less specified contextual support to estimate parameter fe(E).
4. Given a high value of parameter fe(E), the assumption is that explicit resources are invoked, named the capacity parameter, fc(C), in working memory; when fc(C) is high, explicit processing, inference-making and complex information processing are made easier. fc(C) represents the general, amodal capacity of the working-memory system. It is not tied to a specific loop function, but represents a general processing and storage capacity (Daneman & Merikle, 1996).

Corollary: The fp(P)fs(S) interaction sets the perceptual, implicit platform (threshold) (Rönnberg et al, 1998c) for language understanding, on the basis of which explicit resources may be needed, for which the fe(E)fc(C) interaction represents an important moderator function.

Expectations and applications
Based on the model, a few expectations and applications that follow and are compatible with existing data are presented below.

Across communication modes
1. Tests that assess phonological, syllable-based representations in long-term memory should have a high predictive power for speech understanding, irrespective of communicative form (visual, auditory, or tactile, or combinations; see Andersson et al (2001a,b) and Rönnberg et al (1998a,c)).
2. Cognitive ageing affects the speed of information processing. It can be expected that the fp(P)fs(S) interaction will be further reinforced in elderly, hearing-impaired listeners/speech-readers (Pichora-Fuller, this supplement).
3. The fe(E)fc(C) interaction can account for the facts that (1) skilled speech-readers capitalize better on contextual constraints (i.e. they have a higher fc(C); Samuelsson & Rönnberg, 1991), (2) younger listeners (compared to older listeners with similar audiograms) do better in noise conditions (Pichora-Fuller et al, 1995; Tun & Wingfield, 1999), and (3) high-capacity subjects adapt more easily to different processing strategies in hearing aids (Lunner, this supplement).
4. One way of conceiving acclimatization effects in hearing aid fitting, or with other sensory aids, is to assume that the overall ratio between explicit fe(E) and implicit processes (fp(P)fs(S)) is reduced as acclimatization proceeds. fc(C) may be tied to further adaptation of the processing scheme in the hearing aid (Lunner, this supplement). Procedures that vary the balance between explicit and implicit processing need to be developed to estimate these processes with standardized testing for a given type of hearing aid. The ratio may be a useful indicator for long-term evaluations.
5. Various contextual effects determine working-memory predictions and the degree of mismatch: this holds true for dialogue and experimental sentences; context and lexical activation also interact, except for the phonology–lexical-access interaction. Materials for assessing and estimating explicit and implicit processes can be constructed and developed for different communication modes.

Across languages
1. The existence of phonological problems in sign language users is expected. For example, specific language impairments in sign, with phonological, ‘non-word repetition’ problems, should be observed (Bishop, North & Donlan, 1996; Briscoe, Bishop & Norbury, 2001), particular phonological awareness problems in ‘phonological dyslexia’ (i.e. ordering and composition of sublexical syllable-like units, problems with analysis of lexical into sublexical units, and synthesis of sublexical units into lexical units (Höien et al, 1995)), as well as the possibility that phonological loop deficits in sign will determine lower rates of acquisition of a foreign sign language vocabulary (Baddeley, Gathercole & Papagno, 1998).
2. fs(S)fp(P) interactions in parkinsonian patients using sign may demand high fc(C) estimates to allow fluency of dialogue and to compensate for the ‘whispering’, slow sign production that is typically employed by the parkinsonian signer (Kegl, Cohen & Poizner, 1999).
3. fe(E)fc(C) interactions are expected in contextually unexpected or poorly specified, signed situations, especially in elderly signers.
4. Similar neuronal networks are expected to be activated for signed and spoken working-memory tasks.

Discussion
In the overall evaluation of generality across speech communication modes, points 1–3 hold up to experimental scrutiny, whereas points 4 and 5 have not been properly assessed and evaluated. When it comes to generality across languages, points 1, 3 and 4 have not been properly researched. While we await a firmer basis for decision in this respect, it can be stated that the model parameters put forward here are not crucially dependent on modality specificity per se (Wilson, 2001). It is quite conceivable that all parameters are equally applicable to both signed and spoken languages.
However, there is also a set of modality-specific findings: recall order effects are modality specific, spatial rehearsal and an irrelevant-sign effect are de facto found for sign language (Wilson, 2001), as well as modality-specific recency and recall preferences (Rönnberg & Nilsson, 1987). There is sign-specific activity (e.g. in the right hemisphere (Rönnberg et al, 2000)), as well as other domain-specific working-memory effects that oppose the simple, language-neutral, amodal view (Smith & Jonides, 1997).
A compromise view at this early stage of testing of the memory part of the model would be to use Baddeley’s recent
[Figures 2 and 3: plots not reproduced. Vertical axis: ELU (Low, Moderate, High); separate curves for High C and Low C subjects.]
Figure 2. Hypothetical ELU functions for high- and low-fc(C) subjects, given a high level of fp(P)fs(S) and a variation of the [fp(P)fs(S)]/fe(E) ratios above and below 1.
Figure 3. Hypothetical ELU functions for high- and low-fc(C) subjects, given a low level of fp(P)fs(S) and a variation of the [fp(P)fs(S)]/fe(E) ratios above and below 1.
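The quantities named in the captions of Figures 2 and 3 can be made concrete with a small numerical sketch. It only evaluates the quotient given in the Conclusions, ELU = [fp(P)fs(S)]/[fe(E)fc(C)], together with the [fp(P)fs(S)]/fe(E) ratio that the captions vary above and below 1. The article does not specify the functions fp, fs, fe and fc or their scales, so the sketch treats each as an already-evaluated positive number; the function names and all numeric values are hypothetical illustrations, and the output is not meant to reproduce the curve shapes of the figures.

```python
"""Sketch: evaluating the ELU quotient for hypothetical parameter values."""


def implicit_explicit_ratio(fp: float, fs: float, fe: float) -> float:
    """Return [fp(P)fs(S)]/fe(E), the ratio varied in Figures 2 and 3."""
    if fe <= 0:
        raise ValueError("fe must be positive")
    return (fp * fs) / fe


def elu(fp: float, fs: float, fe: float, fc: float) -> float:
    """Return ELU = [fp(P)fs(S)]/[fe(E)fc(C)] for already-evaluated parameters."""
    if fc <= 0:
        raise ValueError("fc must be positive")
    return implicit_explicit_ratio(fp, fs, fe) / fc


if __name__ == "__main__":
    # Hypothetical values on an arbitrary positive scale (not from the article).
    fp, fs = 2.0, 3.0
    for fe in (12.0, 6.0, 3.0):  # ratio below 1, equal to 1, above 1
        for fc in (1.0, 2.0):    # two assumed capacity values
            ratio = implicit_explicit_ratio(fp, fs, fe)
            print(f"fe={fe} fc={fc} ratio={ratio} ELU={elu(fp, fs, fe, fc)}")
```

Keeping the ratio in its own function mirrors the way the captions vary [fp(P)fs(S)]/fe(E) separately from fc(C); any empirically founded ELU function would need parameter estimates that the article leaves to future work.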
Critical empirical tests of communication mode and language specificity are needed to further shape the model (Wilson, 2001). For example, the weighting of fp(P) and fs(S) may vary across language, and the particular phonological tests used. Presently, it is expected that syllable-based rhyming tests represent straightforward independent tests that capture fp(P) (Andersson, 2001), lexical-access speed is a good approximation of a general cognitive speed component, fs(S) (Rönnberg, 1990; Pichora-Fuller, this supplement; Tun & Wingfield, 1999), and the reading (or listening) span test is a good first estimate of explicit capacity in general, fc(C) (e.g. Lunner, this supplement; Rönnberg, 1995). fc(C) is also highly correlated with the actual ability to make intelligent guesses and linguistic inferences (Lyxell & Rönnberg, 1989). Such a test can be developed with different languages and communication modes.

Conclusions
This article builds on the impressive empirical evidence, behavioural and neurophysiological, that exists for modality-free speech and language understanding. A framework, which assumes a continuous interaction between perceptual input channels, long-term memory, and working memory, was assumed as a starting point for conceptualizing the bottlenecks that determine the ease with which language can be understood, spoken, or signed. The theoretical context of speech perception models and working-memory models has been addressed.
On the basis of this general framework for language understanding, an explicit model was proposed. In brief, the model assumes four interacting, modality-free parameters: phonology (quality and precision) fp(P), speed fs(S), explicit processing fe(E), and general storage and processing capacity fc(C) in working memory. A general formalized description was also proposed, where ease of language understanding ELU = [fp(P)fs(S)]/[fe(E)fc(C)]. Hypothetical graphs were used to illustrate some of the interacting features and limiting conditions of the model. Applications and expectations across speech communication modes and language were pointed out. Experimentation, combined with neuroimaging, is needed to evaluate the modality specificity of the model (e.g. comparing working memory for sign and speech). Systematic manipulation of parameters is needed to derive empirically founded ELU functions, functions which may assist the development of new tests and the future conceptualization of ‘cognitive’ sensory aids.

Acknowledgments
This research is supported by a grant from the Swedish Council for Social Research (30305108).

References
Andersson, U. (2001). Cognitive deafness. The deterioration of phonological representations in adults with an acquired severe hearing loss and its implications for speech understanding. Dissertation. Studies from the Swedish Institute for Disability Research No. 3.
Andersson, U., Lyxell, B., Rönnberg, J. & Spens, K.-E. (2001a). Cognitive predictors of visual speech understanding. Journal of Deaf Studies and Deaf Education, 6, 116–129.
Andersson, U., Lyxell, B., Rönnberg, J. & Spens, K.-E. (2001b). A follow-up study on the effects of speech tracking training on visual speechreading of sentences and words: cognitive prerequisites and chronological age. Journal of Deaf Studies and Deaf Education, 6, 103–116.
Baddeley, A.D. (2000). The episodic buffer: a new component of working memory? Trends in Cognitive Sciences, 4, 417–423.
Baddeley, A., Gathercole, S. & Papagno, C. (1998). The phonological loop as a language learning device. Psychological Review, 105, 158–173.
Baddeley, A.D. & Hitch, G. (1974). Working memory. In G.H. Bower (Ed.), The psychology of learning and motivation (pp. 47–89). London: Academic Press.
Bernstein, L. & Auer, E. (2003). Speech perception and spoken word recognition. In: M. Marschark & P.E. Spencer (Eds.), The handbook of deaf studies, language, and education (pp. 379–391). Oxford: Oxford University Press.
Bernstein, L.E., Demorest, M.E. & Tucker, P.E. (2000). Speech perception without hearing. Perception & Psychophysics, 62, 233–252.
R. Campbell, B. Dodd & D. Burnham (Eds.), Hearing by eye: Part II. Advances in the psychology of speechreading and audiovisual speech (pp. 143–153). London: Lawrence Erlbaum Associates.
Rönnberg, J., Söderfeldt, B. & Risberg, J. (2000). The cognitive neuroscience of signed language. Acta Psychologica, 105, 237–254.
Samuelsson, S. & Rönnberg, J. (1991). Script activation in lipreading. Scandinavian Journal of Psychology, 32, 124–143.
Samuelsson, S. & Rönnberg, J. (1993). Implicit and explicit use of scripted constraints in lipreading. European Journal of Cognitive Psychology, 5, 201–233.
Smith, E.E. & Jonides, J. (1997). Working memory: a view from neuroimaging. Cognitive Psychology, 33, 5–42.
Tun, P.A. & Wingfield, A. (1999). One voice too many: adult age differences in language processing with different types of distracting sounds. Journal of Gerontology B Psychological Sciences, 54, 317–327.
Wagner, R.K. & Torgesen, J.K. (1987). The nature of phonological processing and its causal role in the acquisition of reading skills. Psychological Bulletin, 101, 192–212.
Wilson, M. (2001). The case for sensorimotor coding in working memory. Psychonomic Bulletin & Review, 8, 44–57.
Wilson, M. & Emmorey, K. (1997). Working memory for sign language: a window into the architecture of the working memory system. Journal of Deaf Studies and Deaf Education, 2, 121–130.
Zatorre, R.J. (2001). Do you see what I’m saying? Interactions between auditory and visual cortices in cochlear implant users. Neuron, 31, 13–14.