
Review

International Journal of Audiology 2003; 42:S68–S76

Jerker Rönnberg
Department of Behavioural Sciences, Linköping University, and The Swedish Institute for Disability Research, S-581 83 Linköping, Sweden (E-mail: JR@IBV.LIU.SE)

Cognition in the hearing impaired and deaf as a bridge between signal and dialogue: a framework and a model

Key Words: Cognition; Working memory; Speech-reading; Speech understanding; Sign language; Modality-free; Working memory model

Abstract: This paper focuses on the role of cognition in visual language processing in the deaf and hard of hearing. Although there are modality-specific cognitive findings in the literature on comparisons across speech communication modes and language (sign and speech), there is an impressive bulk of evidence that supports the notion of general modality-free cognitive functions in speech and sign processing. A working-memory framework is proposed for the cognitive involvement in language understanding (sign and speech). On the basis of multiple sources of behavioural and neuroscience data, four important parameters for language understanding are described in some detail: quality and precision of phonology, long-term memory access speed, degree of explicit processing, and general processing and storage capacity. Their interaction forms an important parameter space, and general predictions and applications can be derived for both spoken and signed language conditions. The model is mathematically formulated at a general level, hypothetical ease-of-language-understanding (ELU) functions are presented, and similarities and differences from current working-memory and speech perception formulations are pointed out.

Introduction

Individual-difference research on cognitive abilities associated with visual communication skills such as speech-reading and speech understanding has a relative stronghold in aural rehabilitation and audiology (Jeffers & Barley, 1971). Characteristic of previous work is the tendency to associate single perceptual or cognitive predictors with single language-test variables. More recent research has started to focus on the multicomponent nature of cognitive functions that support speech-reading (e.g. Cowie & Douglas-Cowie, 1992; Gailey, 1987). We have been able to show that, for adventitiously hearing-impaired persons, certain individual cognitive abilities that together form an interacting cognitive architecture mediate effective visual speech perception and understanding (Rönnberg, 1995, 2003). The fact that subsets of individual cognitive functions support speech understanding under different conditions, with or without sensory aids (e.g. Lyxell et al, 1996; Rönnberg, Andersson, Lyxell & Spens, 1998a; Lunner, 2002), represents an important first cornerstone of a general, comprehensive multi-modal model of speech understanding with poorly specified linguistic input (Rönnberg et al, 1998b). In this article, it is claimed that the actual interface between perception and cognition in speech understanding constitutes the bridge between signal processing and dialogue. Theoretical and practical implications are likely to result as a consequence of a model which addresses this important interface.

It is here emphasized that the term speech understanding rather than speech perception should be used, because when speech signals (i.e. auditory, visual, or tactile) are poorly specified or distorted, there is always the inherent task demand of substituting missing information with inference-making to reach understanding (Lyxell et al, 1998; Rönnberg et al, 1998b). In this sense, speech understanding is a broader term that includes both bottom-up and top-down processes. Task demands will always contribute to the relative weights of and interactions between the two types of process.

Understanding speech implies extraction of meaning and overall comprehension of a message. In a dialogue this becomes especially crucial, because meaning and meaning-extracting activities always rely on negotiations and co-constructions between the two parties (e.g. Markova & Linell, 1996). The general appraisal, then, is that speech-understanding tests are much more cognitively embedded or demanding than speech perception tests, and, typically, test materials are more complex and ecologically relevant (Borg, this supplement; Larsby & Arlinger, 1994). Speech understanding—especially in dialogue-like situations—is therefore likely to be dependent on the interaction between implicit, bottom-up processes and explicit, top-down processes, and not only on the particulars of bottom-up word recognition processes (Bernstein & Auer, 2003; Bernstein, Demorest & Tucker, 2000).

The cognitive framework

A framework for a general working-memory system for the processing of poorly specified language input was presented a few years ago (Rönnberg et al, 1998b). The current formulation is an attempt to develop this framework a step further, and to specify the general parameters of a model, and some of their interactions.

The components involved in the general formulation of the framework are: perceptual input channels, working memory, and long-term memory (Rönnberg, 2003). The model acknowledges top-down processes, in the sense that it allows for information retrieved from long-term memory affecting processing of the perceptual input, specifically utilizing script-related and semantic information about the particular communicative situation or topic (Samuelsson & Rönnberg, 1993). As a rule of thumb, then, the more degraded, distorted or unintelligible the perceived signal, the greater the contribution from long-term memory, and, conversely, the more specified the perceptual input (sometimes conveyed by several poorly specified information channels in concert), the less the need for top-down processing. The reciprocal relationship between top-down and poorly specified bottom-up processing in speech understanding is also assumed to be modulated by the general capacity for processing and storage of information in an on-line working-memory system. Lexical access or word recognition in long-term memory is assumed to be achieved through phonological, and/or contextual, successive specification in real time (see Luce & Pisoni (1998) and Marslen-Wilson (1987) for spoken word identification).

It is important to note that interaction (i.e. bottom-up processing interacting with top-down processing) in the general sense of other auditory speech perception models such as the TRACE model (McClelland & Elman, 1986), the Cohort model (Marslen-Wilson, 1987) and the NAM model (Luce & Pisoni, 1998) is assumed. Owing to the general precondition of poor language input and assumption of multi-modal interaction (Rönnberg et al, 1998b), the framework emphasizes prosodic and syllabic aspects as one means of disambiguating the signal into word boundaries for lexical activation, a feature that has not been made explicit in the above speech perception models (Andersson, 2001). In the word-initial cohort (Marslen-Wilson, 1987), for example, reliance is put on similarities among candidate items that share initial phonetic information. Generally, the present framework is more similar to the NAM, because it also assumes a matching process between input and representations in memory that takes into account both phonological similarity and word frequency (i.e. semantic constraints). The framework takes the whole speech-understanding process a step further by assuming that explicit working-memory capacity is called for only when there is a mismatch in the word decision or lexical activation process. This assumption concerns processing economy (for details, see below). The present framework also seeks to expand the domain of speech understanding into sign understanding. In general, the framework and the model are about the ease with which the processing of multi-modal input and language understanding is accomplished. The model is not a model of language understanding per se.

Classes of working memory model

There are several classes of working-memory model, developed for different theoretical and applied purposes, supported by different kinds of data (see Richardson et al (1996) for an overview). In this context, it is important to note that the tradition following the model of Baddeley and colleagues (e.g. Baddeley, 2000) emphasizes that working-memory resources comprise amodal as well as modality-specific components. Thus, initially a central, executive component as well as two modality-specific slave systems (i.e. the phonological and visuospatial systems), meeting different attention, storage and processing demands, were assumed (Baddeley & Hitch, 1974). Later research on, for example, visual similarity on verbal recall and on the impact of meaning on sentence recall has prompted Baddeley (2000) to suggest a synergistic, episodic memory buffer, capable of binding together information from long-term memory and from the slave systems.

An alternative type of model follows Just and Carpenter and colleagues (e.g. Just & Carpenter, 1992), where language computation resources are seen as more global processing and storage capacities. This model contrasted more sharply with the Baddeley & Hitch (1974) model before the latest revision by Baddeley (2000). Also, recognizing both the modality-specific and modality-free nature of embedded cognition, Wilson (2001) has discussed the similarities and differences that exist when working-memory systems for signed and spoken language are compared. It is with her reasoning and arguments as background that we wish to pursue the general hypothesis that it is reasonable to formulate a general model as a more explicit starting point for comparisons across communication modes and languages.

Thus, when we examine the data and arguments that are pertinent to determining the level and kind of working-memory model for sign and speech which are most appropriate, there seems to be sufficient support to initially test a model where modality specificity is less emphasized, and integration of information is favoured (see below for arguments). Furthermore, attempting to generalize to a cross-language, comparative approach, we will necessarily have to consider potentially common, amodal bases for such a working-memory system.

Empirical arguments for the modality-free speech communication modes aspect of working memory

1. Cross-modal and ‘auditory’ cortical activity may come about as a result of interventions by means of cochlear implants in postlingually deaf users (Zatorre, 2001), and auditory cortical overcompensations, as well as visual cortical recruitment, may represent the resulting, new perceptual strategies (Giraud et al, 2000). Primary visual cortex areas (V1 and V2) are recruited in the cochlear implant user, and the degree to which this is done is also a function of the length of the post-implant phase (Giraud, Price, Graham, Truy & Frackowiak, 2001). Tactile stimuli in the congenitally deaf tactile aid user activate secondary auditory areas (Levänen, 1998). Silent speech-reading activates the auditory cortex (Calvert et al, 1997; Ludman et al, 2000; MacSweeney et al, 2001), and the activation is dependent on speech-reading skill (Ludman et al, 2000) and auditory deprivation (MacSweeney et al, 2001).

Thus, there is strong neurophysiological evidence that in many cases the interdependence, plasticity, and substitutability of brain tissue for different speech-processing modes are commonplace. Depending on the type of intervention and alternative sensory input used, new ‘audio-visual–tactile’, speech-processing strategies may thus be developed.
The auditory input channel does not seem to be privileged for speech understanding. Speech-relevant information may arrive in different ways (Leybaert, 1998). In general, it seems to be the case that compensatory pressure is at hand when the perceptual input is poorly specified (Rönnberg et al, 1998b), or needs to be recombined to represent a ‘new’ perceptual platform for the profoundly hearing-impaired or deaf individual. This sets the stage for a more general and abstract working-memory model that capitalizes on common bases and potentials of a system that deals with poorly specified input.

2. The generality of certain cognitive and working-memory-related functions has been empirically demonstrated and successfully applied to several communicative domains, such as tactile aids (Rönnberg et al, 1998a; Andersson, Lyxell, Rönnberg & Spens, 2001a,b), cochlear implants (Lyxell et al, 1996, 1998), and visual speech understanding (review: Rönnberg, Samuelsson & Lyxell, 1998c). Briefly, the predictors are: (1) phonology, measured by rhyme tests (i.e. tapping the syllable level); (2) lexical access speed; (3) verbal processing and storage capacity; and (4) verbal inference-making. It is here ventured that communicative proficiency mediated by such predictors, across visual, non-auditory and alternative communication systems, adds to the plausibility of a rather abstract design of a working-memory system. It is clear that speed of lexical access, the nature and quality of phonological representations, and sometimes more complex information-processing indices, represent the predictors both within and across sensory aid domains, and sometimes also across pre- and post-training of speech-understanding performance (Rönnberg et al, 1998b).

3. Research on speech-reading expertise (Lyxell, 1994; Rönnberg, 1993; Rönnberg et al, 1999) shows that, despite differences in onset of hearing impairment or deafness, and despite differences in communicative habits, language use and strategy, there is convergence on the fact that, for successful speech-readers, bottom-up cognitive functions such as lexical-access speed and phonology are relatively normal, whereas top-down cognitive skills, such as complex storage and processing of information, as well as inference-making, are exceptional compared to normal participants. Thus, ‘higher-order’ cognitive and working-memory-related skills are required for speech-understanding expertise, beyond the threshold that is set by the lower-order functions (Rönnberg et al, 1998a), and they operate independently of communication mode.

Empirical arguments for the modality-free language aspect of working memory

1. Lesion data and neurophysiological data seem to suggest similarities in the ways in which left-hemisphere neural networks are active during language processing. What can also be seen is that the similarities hold true across different levels of language and imaging technique (reviews: Rönnberg, Söderfeldt & Risberg, 2000; Wilson, 2001). As regards ‘inner speech’, it was found in a recent imaging study that ‘inner signing’ of sentences engages similar functional networks in the brain to those associated with the activation of the phonological loop: left inferior frontal cortex, rather than visuospatial areas (McGuire et al, 1997).

2. Additional neurophysiological evidence seems to support the notion of an ‘abstract’ site for carrying phonological, syllable-like representations, i.e. planum temporale (Petitto et al, 2000). The syllable may be a reasonably good approximation for the unit that a cross-language working-memory system is based on. This empirical finding undoubtedly fits with the linguistic argument that there are organizational similarities between sign and spoken syllables: positions and movements in sign correspond to margins and nuclei in speech (Brentari, Poizner & Kegl, 1995).

3. Behavioural data also suggest rather strongly that there are similarities across signed and spoken languages with respect to ‘phonological similarity’ effects in working memory (Wilson & Emmorey, 1997), and with respect to the capacity of the ‘phonological’ loop (Marschark & Mayer, 1998). Phonological loop effects have also been observed with cued speech (Leybaert & Lechat, 2001), and hence seem to constitute a rather general operating characteristic of working memory for language. On the whole, many classical working-memory effects are analogous for sign and speech (i.e. similarity, length and suppression effects (Wilson, 2001)).

Given the empirical arguments for the modality-free nature of working memory in terms of neurophysiological, behavioural and cognitive similarities—across speech communication modes and languages—a tentative model can be formulated. The model is abstract in the sense of modality and language independence, and assumes that four parameters are especially crucial in predictions of the ease with which speech and language can be understood. As already indicated, it further assumes that perceptual input channels, long-term memory and working-memory interactions set the structural frame for interactions among parameters.

Model parameters: across communication modes and languages

1. The first assumption is that the interface between perception/perceptual input systems and cognition for speech and sign processing is at the sublexical, syllable level. The accuracy and quality of phonological, syllable-like representations determine parameter fp(P). This assumption is supported by both theory and
data. Four points can be adduced here, as follows.

(a) The assumption is founded on the facts that phonological representations stored in long-term memory deteriorate over time and are critical for initial processing of spoken, visual language (i.e. indicated by a text-based, rhyme test (Lyxell, Rönnberg & Samuelsson, 1994; Andersson, 2001)). Phonemically based processing in letter span tasks (i.e. tapping the phonological loop in working memory) does not deteriorate (Andersson, 2001). In addition, the interpretation of correlations with this test is less clear, as the similarity index was negatively correlated with visual speech understanding, suggesting also a visual component (Andersson et al, 2001a). In a phonological lexical-access test, where the subject is instructed to decide as quickly as possible whether a non-word (sometimes a pseudohomophone) sounds like a real word, processing is more explicitly geared toward a phonemic analysis—otherwise the task cannot be solved correctly (Andersson et al, 2001a). Again, the point here is that this task is not as sensitive a predictor of visual, sentence-based speech understanding as the rhyme tasks. One reason why the syllable level of phonology is important may be that not only is initial phonemic information involved in a rhyme test, but also the suprasegmental and syllabic stress information are important (cf. the NAM (Luce & Pisoni, 1998)). The latter aspects of the test relate to intonation and prosody, at both the word and sentence levels (Kjelgaard & Speer, 1999; Lindfield, Wingfield & Goodglass, 1999), which under any circumstances may represent compensatory cues, when we know that certain phonemes are not easily distinguished from lip movements only (Rönnberg et al, 1998c).

(b) A further aspect of this assumption of syllabic primacy of the conceptualized interface is that the phonological processor must be relatively ‘abstract’; that is, it is assumed to work with non-sound, tactile and visual input, as well as with auditory input. When perceptually amalgamating tactile, prosodic information with visual, lip-read information, it is hard to see that the perceptual–cognitive bridge is phonemic when the initial word information offered is less precise to begin with (Rönnberg, 1993). Therefore, two alternative interpretations remain: either visual–tactile perception works because phonemic information is derived before lexical access, as a result of some further preprocessing of information, or it is assumed that sublexical, syllable-like information is represented in long-term memory and is sufficient to accomplish lexical access. As already implied, we opt for the latter alternative, because it is more parsimonious and functional, and in conjunction with script-related constraints, speech understanding may run smoothly.

(c) The syllable assumption is also compatible with the idea that syllables represent an important perceptual–cognitive bridge for natural languages, but not necessarily for artefacts, like script. The written language bias is based on a long history of research, e.g. on reading development. Here, it is clear that phonemes are crucial, because phonemic tests, but not rhyme tests, are crucial indices for beginning readers (Höien, Lundberg, Stanovich & Bjaalid, 1995). However, this is not necessarily true for dynamic, non-sound stimuli that drive language-understanding processes.

(d) It is conceivable that the syllable is also an important unit of analysis when it comes to possible generalization of this model to signed languages. This possibility is supported by some neurophysiological data on sign language (Petitto et al, 2000).

In our view (Rönnberg et al, 1998a), then, the above data imply that the brain has the capacity for a variety of ‘phonological’ codes—coding schemes that capitalize on a rapid linguistic combinatorial capacity for sublexical units such as the syllable.

2. The second assumption is that long-term memory access speed is important for perceptual decoding and lexical and semantic access, giving parameter fs(S) (Pichora-Fuller, Schneider & Daneman, 1995; Pichora-Fuller, this supplement; Rönnberg, 1990; Tun & Wingfield, 1999).

Corollary: fp(P) and fs(S) are assumed to interact. The inference is that the interactions typically take place in the implicit and automatic, on-line mode of language processing. Given that the input level of processing in the cognitive system is at the syllable level, and that representations in long-term memory carry sufficient syllable-like properties, then it is also possible to argue that processing is implicit with that unit of analysis. This is because rhyming skills are developed earlier than phonemic skills (e.g. Wagner & Torgesen, 1987), and rhyming tasks are typically executed implicitly rather than explicitly, compared to other more analytical phonological lexical-access tasks (Andersson et al, 2001a).

3. The mechanism whereby processing is switched over to an explicit, controlled mode, fe(E), is assumed to be determined by a general mismatch mechanism (Näätänen, 1986; Näätänen & Escera, 2000)—a mechanism that is assumed to operate at two levels of representation in long-term memory: the phonological and the semantic script levels both contribute to lexical access, and therefore set limits for language understanding in the dialogue. When there is a mismatch signal from long-term memory, working-memory resources are assumed to be allocated to inference-making and disambiguation of the message.

(a) When lexical and semantic access are hindered by a less favourable fp(P)fs(S) interaction, then explicit, controlled processing resources need to be invoked. Put in other words, when the perceived signals are too distorted (due to the impairment or signal properties) to activate long-term memory phonological representations, the probability of a mismatch signal occurring is higher. For example, either the speed of processing is insufficient to keep up with a conversation, or poor phonological representations, or the interaction between them (or the interaction with the peripheral hearing loss in the case of speech), affect lexical access to such an extent that lexical entry and the meaning of the message are lost. Hence, a mismatch signal is produced.

(b) The expectancy or prediction given by the context and discourse (Samuelsson & Rönnberg, 1993) is a powerful determinant of visual speech understanding. For example, as long as the materials are typical of a particular script or communicative situation, activation of the lexicon proceeds with sufficient efficiency, and, hence, processing proceeds implicitly. When materials to be understood are below the level of prediction (i.e. are less typical), the probability of a mismatch signal increases. For materials at or above the level of prediction, processing is assumed to proceed implicitly.

Corollary: Working memory is continuously testing predictions and the quality/speed of phonological lexical access in its interactions with long-term memory and perceptual input. Processing is implicit as long as no mismatch occurs. Thus, the system is conservative and economical in the sense that working-memory resources should not be unnecessarily deployed. This assumption is not made explicit in current models of working memory (Richardson et al, 1996). When processing becomes explicit, the degree of mismatch determines the parameter fe(E) (Samuelsson & Rönnberg, 1993). The mismatch signal can be derived from two sources: one phonological and the other contextual, both relying on the relationship between long-term memory representations and input signal. The exact weighting of the two sources of potential mismatch generally depends on the quality of the input signal and contextual specification of script use—in general, the mismatch depends on the way in which bottom-up and top-down processing balance each other in the language-understanding task. Experimentally, one could, for example, keep a well-defined context, with highly predicted materials, manipulating only the quality of the phonological input to tease out the contribution of the long-term phonological representations. The next logical step would be to offer less specified contextual support to estimate parameter fe(E).
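As a rough illustration of the switching logic just described, the following Python sketch treats each incoming segment as producing a phonological activation and a contextual (script) fit, and recruits explicit resources only when a mismatch arises from either source. The paper specifies this mechanism only verbally; the data structure, the 0-1 activation scores and the thresholds below are illustrative assumptions of this sketch, not part of the model.

```python
from dataclasses import dataclass


@dataclass
class LongTermMemory:
    """Assumed minimal stand-in for the two levels of representation named above."""
    phonological_threshold: float = 0.5  # minimal syllable-level activation for lexical access
    contextual_threshold: float = 0.5    # minimal fit with the currently active script/topic


def process_segment(phonological_activation: float,
                    contextual_fit: float,
                    ltm: LongTermMemory) -> dict:
    """Return the processing mode and, when explicit, a toy degree of mismatch feeding fe(E)."""
    phon_mismatch = max(0.0, ltm.phonological_threshold - phonological_activation)
    context_mismatch = max(0.0, ltm.contextual_threshold - contextual_fit)

    if phon_mismatch == 0.0 and context_mismatch == 0.0:
        # No mismatch from either source: the lexicon is activated with sufficient
        # efficiency and processing stays implicit (the economical default).
        return {"mode": "implicit", "fe": 0.0}

    # A mismatch signal from either source switches processing to the explicit,
    # controlled mode; its size is taken here to determine fe(E), i.e. the
    # working-memory resources allocated to inference-making and disambiguation.
    return {"mode": "explicit", "fe": phon_mismatch + context_mismatch}


# Example: a well-specified signal in a typical script stays implicit,
# whereas a distorted signal on an unexpected topic triggers explicit processing.
print(process_segment(0.8, 0.9, LongTermMemory()))  # implicit, fe = 0.0
print(process_segment(0.3, 0.4, LongTermMemory()))  # explicit, fe ~ 0.3
```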
4. Given a high value of parameter fe(E), the assumption is that explicit resources are invoked, named the capacity parameter, fc(C), in working memory; when fc(C) is high, explicit processing, inference-making and complex information processing are made easier. fc(C) represents the general, amodal capacity of the working-memory system. It is not tied to a specific loop function, but represents a general processing and storage capacity (Daneman & Merikle, 1996).

Corollary: The fp(P)fs(S) interaction sets the perceptual, implicit platform (threshold) (Rönnberg et al, 1998c) for language understanding, on the basis of which explicit resources may be needed, for which the fe(E)fc(C) interaction represents an important moderator function.

Expectations and applications

Based on the model, a few expectations and applications that follow and are compatible with existing data are presented below.

Across communication modes

1. Tests that assess phonological, syllable-based representations in long-term memory should have a high predictive power for speech understanding, irrespective of communicative form (visual, auditory, or tactile, or combinations; see Andersson et al (2001a,b) and Rönnberg et al (1998a,c)).

2. Cognitive ageing affects the speed of information processing. It can be expected that the fp(P)fs(S) interaction will be further reinforced in elderly, hearing-impaired listeners/speech-readers (Pichora-Fuller, this supplement).

3. The fe(E)fc(C) interaction can account for the facts that (1) skilled speech-readers capitalize better on contextual constraints (i.e. they have a higher fc(C); Samuelsson & Rönnberg, 1991), (2) younger listeners (compared to older listeners with similar audiograms) do better in noise conditions (Pichora-Fuller et al, 1995; Tun & Wingfield, 1999), and (3) high-capacity subjects adapt more easily to different processing strategies in hearing aids (Lunner, this supplement).

4. One way of conceiving acclimatization effects in hearing aid fitting, or with other sensory aids, is to assume that the overall ratio between explicit, fe(E), and implicit, fp(P)fs(S), processes is reduced as acclimatization proceeds. fc(C) may be tied to further adaptation of the processing scheme in the hearing aid (Lunner, this supplement). Procedures that vary the balance between explicit and implicit processing need to be developed to estimate these processes with standardized testing for a given type of hearing aid. The ratio may be a useful indicator for long-term evaluations.

5. Various contextual effects determine working-memory predictions and the degree of mismatch: this holds true for dialogue and experimental sentences; context and lexical activation also interact, except for the phonology–lexical-access interaction. Materials for assessing and estimating explicit and implicit processes can be constructed and developed for different communication modes.

Across languages

1. The existence of phonological problems in sign language users is expected. For example, specific language impairments in sign, with phonological, ‘non-word repetition’ problems, should be observed (Bishop, North & Donlan, 1996; Briscoe, Bishop & Norbury, 2001), particular phonological awareness problems in ‘phonological dyslexia’ (i.e. ordering and composition of sublexical syllable-like units, problems with analysis of lexical into sublexical units, and synthesis of sublexical units into lexical units (Höien et al, 1995)), as well as the possibility that phonological loop deficits in sign will determine lower rates of acquisition of a foreign sign language vocabulary (Baddeley, Gathercole & Papagno, 1998).

2. fs(S)fp(P) interactions in parkinsonian patients using sign may demand high fc(C) estimates to allow fluency of dialogue and to compensate for the ‘whispering’, slow sign production that is typically employed by the parkinsonian signer (Kegl, Cohen & Poizner, 1999).

3. fe(E)fc(C) interactions are expected in contextually unexpected or poorly specified, signed situations, especially in elderly signers.

4. Similar neuronal networks are expected to be activated for signed and spoken working-memory tasks.

Discussion

In the overall evaluation of generality across speech communication modes, points 1–3 hold up to experimental scrutiny, whereas points 4 and 5 have not been properly assessed and evaluated. When it comes to generality across languages, points 1, 3 and 4 have not been properly researched. While we await a firmer basis for decision in this respect, it can be stated that the model parameters put forward here are not crucially dependent on modality specificity per se (Wilson, 2001). It is quite conceivable that all parameters are equally applicable to both signed and spoken languages.

However, there is also a set of modality-specific findings: recall order effects are modality specific, spatial rehearsal and an irrelevant-sign effect are de facto found for sign language (Wilson, 2001), as well as modality-specific recency and recall preferences (Rönnberg & Nilsson, 1987). There is sign-specific activity (e.g. in the right hemisphere (Rönnberg et al, 2000)), as well as other domain-specific working-memory effects that oppose the simple, language-neutral, amodal view (Smith & Jonides, 1997).
A compromise view at this early stage of testing of the memory part of the model would be to use Baddeley’s recent development of the working-memory model, which is very clear on the possibility of integration of sensory input in the episodic buffer (Baddeley, 2000). One way of deciding the viability of the opposing arguments would be to design experiments which allow for modality-free and modality-specific classical working-memory effects—across languages in bilingual individuals—and to do this as a neuroscience, imaging project. Specificity versus generality in working-memory representations would thus be tested explicitly.

It should also be noted that current working-memory models have not relied on the processing economy inherent in the assumption of implicit processing, given a lack of mismatch signals. For example, the models by Logie (1996) and Baddeley (2000) seem to rely on automatic activation of both long-term memory representations and corresponding modality-specific ‘slave systems’. This non-constrained activity in working memory seems appropriate for memory and cognitive tasks as such, but less so when cognition is used in time-pressed, on-line, language-processing tasks. In the present model, selective activation of working-memory resources is time-consuming and should be done only when mismatch signals occur.

As a summary and stricter formulation, the model can also be couched in mathematical terms, partly because it is quite possible to estimate effect sizes through proper experimentation. Thus, one needs to determine the function(s) that relate(s) ease of language understanding (ELU) to functions of the four (standardized) parameters involved: ELU refers primarily to understanding messages at the sentence level, but is assumed to be useful also for the discourse level of understanding. fp(P) refers to the accuracy and quality of syllable-like representations in long-term memory; fs(S) refers to long-term memory access speed; the degree of mismatch, phonologically or semantically, determines fe(E); and, finally, fc(C) represents the general processing and storage capacity of working memory. Proper estimation of ELU demands experimentation to evaluate interactions among variables, as well as separate testing of abilities, such as parameter fc(C). Thus, each parameter is in turn a function of type of test, potential subcomponents, and linear and non-linear interactions among subcomponents.

Nevertheless, one general point of departure is the fundamental relationship between the implicit and explicit components in the general formulation of the model. The general form may be stated as follows: ELU = [fp(P)fs(S)]/[fe(E)fc(C)]. When fe(E) is small, the three remaining individual parameters point to a high ELU value. This represents the general case when input is typical and expected and the perceptual mechanism works with sufficient precision and automaticity in relation to the degree of specification of the input. However, in many instances, the above conditions do not hold true for the hearing-impaired listener or language user. Therefore, it is important to address the different conditions, or hypothetical outcomes of principal interest, when the ratio between the implicit and explicit components varies.

When [fp(P)fs(S)]/fe(E) is >1, assuming standardized variables, ELU is generally better than when [fp(P)fs(S)]/fe(E) < 1. In addition, for low [fp(P)fs(S)]/fe(E) ratios, e.g. when fe(E) is relatively high, it is expected that fc(C)—in accordance with the general formulation—includes an added compensatory function such that performance improves in a U-shaped fashion. Previous case studies support such a supposition (Rönnberg et al, 1998c).

A further general restriction of the model is that this is only possible given a sufficiently precise (but still distorted) perceptual input; that is, we must assume a family of functions where a minimal threshold (Mt) of fp(P)fs(S) is assumed (e.g. Mt = 20% accuracy). For fp(P)fs(S) > Mt, the U-shaped function is generally true, and for fp(P)fs(S) < Mt, the [fp(P)fs(S)]/fe(E) ratio gives essentially very small possibilities for compensation by means of fc(C).

Furthermore, for intermediate levels of fp(P)fs(S) (Figure 1), the most pronounced U-shaped, compensatory function is expected for high-capacity individuals when the ratio of [fp(P)fs(S)]/fe(E) is allowed to vary from >1 to <1. This is assumed to be possible because (1) there is still room for explicit compensation when fc(C) is high, and (2) the phonological input is sufficiently poor to trigger a mismatch signal—demanding a higher fe(E)—but at the same time sufficiently precise to support the further interaction with fc(C). For high levels of initial fp(P)fs(S) (Figure 2), the variation of fc(C) will presumably—as in the case for levels above, but close to, an Mt of fp(P)fs(S) (Figure 3)—result in a less pronounced U-function for high-capacity individuals. In the high-fp(P)fs(S) case, this is because there is less room for improvement; in the low-fp(P)fs(S) case, this happens because there are very few elements on which to base further processing. In addition, it may be the case that the monotonic decrease in ELU for all levels of initial fp(P)fs(S) (as a function of the decrease in ratio) varies as to when the interaction with fc(C) starts. For high compared to low levels of initial fp(P)fs(S), it is expected that processing is more resistant to disruption; for example, it takes more noise to disrupt an initially stable and implicit processing mode; that is, in principle, it takes a lower [fp(P)fs(S)]/fe(E) ratio to initiate the interaction with fc(C) (Figure 2), whereas the reverse is assumed to hold true for low levels of initial fp(P)fs(S) (Figure 3).

Testing and simulation of the model will have to entail parameter estimation, standardization, and variation of the actual functions relating ELU to the critical zones above and below [fp(P)fs(S)]/fe(E) = 1, for different levels of fp(P)fs(S), combined with a variation of fe(E)fc(C).

Figure 1. Hypothetical ELU functions for high- and low-fc(C) subjects, given an intermediate level of fp(P)fs(S) and a variation of the [fp(P)fs(S)]/fe(E) ratios above and below 1.
Figure 2. Hypothetical ELU functions for high- and low-fc(C) subjects, given a high level of fp(P)fs(S) and a variation of the [fp(P)fs(S)]/fe(E) ratios above and below 1.

Figure 3. Hypothetical ELU functions for high- and low-fc(C) subjects, given a low level of fp(P)fs(S) and a variation of the [fp(P)fs(S)]/fe(E) ratios above and below 1.
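The kind of testing and simulation called for above can be sketched in a few lines. The snippet below varies the mismatch parameter fe(E) for a high- and a low-capacity individual at an intermediate fp(P)fs(S) level (cf. Figure 1) and reports the regime the text predicts: implicit processing for ratios above 1, hypothesized compensation by high fc(C) below 1, and little room for compensation below the threshold Mt. The literal ratio form ELU = [fp(P)fs(S)]/[fe(E)fc(C)] and the verbal constraints come from the model; the standardization to the 0-1 range, the default Mt = 0.2 and the capacity cut-off are illustrative assumptions.

```python
def elu_regime(fp: float, fs: float, fe: float, fc: float, mt: float = 0.2) -> tuple:
    """Toy classification of the ELU regimes discussed in the text.

    All parameters are assumed to be standardized to (0, 1]; the literal ratio
    ELU = [fp(P)fs(S)]/[fe(E)fc(C)] is returned alongside a qualitative label.
    """
    implicit = fp * fs          # the implicit, perceptual platform fp(P)fs(S)
    ratio = implicit / fe       # balance of implicit vs. explicit processing
    elu = implicit / (fe * fc)  # the general form stated in the model

    if implicit < mt:
        # Below the minimal threshold Mt there are very few elements to build on;
        # even a high fc(C) gives essentially very small possibilities for compensation.
        label = "below Mt: poor understanding expected"
    elif ratio > 1.0:
        label = "ratio > 1: implicit, smooth understanding expected"
    elif fc >= 0.7:             # assumed cut-off for a 'high-capacity' individual
        label = "ratio < 1: U-shaped compensation via high fc(C) hypothesized"
    else:
        label = "ratio < 1: low ease of understanding expected"
    return elu, ratio, label


# Sweep the mismatch parameter fe(E) at an intermediate fp(P)fs(S) level,
# comparing a high-capacity and a low-capacity individual (cf. Figure 1).
for fc in (0.9, 0.3):
    print(f"fc(C) = {fc}")
    for fe in (0.3, 0.5, 0.7, 0.9):
        elu, ratio, label = elu_regime(fp=0.7, fs=0.7, fe=fe, fc=fc)
        print(f"  fe(E)={fe:.1f}  ratio={ratio:.2f}  ELU={elu:.2f}  {label}")
```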

Critical empirical tests of communication mode and language specificity are needed to further shape the model (Wilson, 2001). For example, the weighting of fp(P) and fs(S) may vary across language, and with the particular phonological tests used. Presently, it is expected that syllable-based rhyming tests represent straightforward independent tests that capture fp(P) (Andersson, 2001), lexical-access speed is a good approximation of a general cognitive speed component, fs(S) (Rönnberg, 1990; Pichora-Fuller, this supplement; Tun & Wingfield, 1999), and the reading (or listening) span test is a good first estimate of explicit capacity in general, fc(C) (e.g. Lunner, this supplement; Rönnberg, 1995). fc(C) is also highly correlated with the actual ability to make intelligent guesses and linguistic inferences (Lyxell & Rönnberg, 1989). Such a test can be developed with different languages and communication modes.

Conclusions

This article builds on the impressive empirical evidence, behavioural and neurophysiological, that exists for modality-free speech and language understanding. A framework, which assumes a continuous interaction between perceptual input channels, long-term memory, and working memory, was taken as a starting point for conceptualizing the bottlenecks that determine the ease with which language, spoken or signed, can be understood. The theoretical context of speech perception models and working-memory models has been addressed.

On the basis of this general framework for language understanding, an explicit model was proposed. In brief, the model assumes four interacting, modality-free parameters: phonology (quality and precision) fp(P), speed fs(S), explicit processing fe(E), and general storage and processing capacity fc(C) in working memory. A general formalized description was also proposed, where ease of language understanding ELU = [fp(P)fs(S)]/[fe(E)fc(C)]. Hypothetical graphs were used to illustrate some of the interacting features and limiting conditions of the model. Applications and expectations across speech communication modes and language were pointed out.

Experimentation, combined with neuroimaging, is needed to evaluate the modality specificity of the model (e.g. comparing working memory for sign and speech). Systematic manipulation of parameters is needed to derive empirically founded ELU functions, functions which may assist the development of new tests and the future conceptualization of ‘cognitive’ sensory aids.

Acknowledgments

This research is supported by a grant from the Swedish Council for Social Research (30305108).

References

Andersson, U. (2001). Cognitive deafness. The deterioration of phonological representations in adults with an acquired severe hearing loss and its implications for speech understanding. Dissertation. Studies from the Swedish Institute for Disability Research No. 3.
Andersson, U., Lyxell, B., Rönnberg, J. & Spens, K.-E. (2001a). Cognitive predictors of visual speech understanding. Journal of Deaf Studies and Deaf Education, 6, 116–129.
Andersson, U., Lyxell, B., Rönnberg, J. & Spens, K.-E. (2001b). A follow-up study on the effects of speech tracking training on visual speechreading of sentences and words: cognitive prerequisites and chronological age. Journal of Deaf Studies and Deaf Education, 6, 103–116.
Baddeley, A.D. (2000). The episodic buffer: a new component of working memory? Trends in Cognitive Neuroscience, 4, 417–423.
Baddeley, A., Gathercole, S. & Papagno, C. (1998). The phonological loop as a language learning device. Psychological Review, 105, 158–173.
Baddeley, A.D. & Hitch, G. (1974). Working memory. In G.A. Bower (Ed.), The psychology of learning and motivation (pp. 47–89). London: Academic Press.
Bernstein, L. & Auer, E. (2003). Speech perception and spoken word recognition. In M. Marschark & P.E. Spencer (Eds.), The handbook of deaf studies, language, and education (pp. 379–391). Oxford: Oxford University Press.
Bernstein, L.E., Demorest, M.E. & Tucker, P.E. (2000). Speech perception without hearing. Perception & Psychophysics, 62, 233–252.
Bishop, D.V.M., North, T. & Donlan, C. (1996). Nonword repetition as a behavioural marker for inherited language impairment: evidence from a twin study. Journal of Child Psychology and Psychiatry and Allied Disciplines, 37, 391–403.
Brentari, D., Poizner, H. & Kegl, J. (1995). Aphasic and Parkinsonian signing: differences in phonological disruption. Brain and Language, 48, 69–105.
Briscoe, J., Bishop, D.V.M. & Norbury, C.F. (2001). Phonological processing, language, and literacy: a comparison of children with mild-to-moderate sensorineural hearing loss and those with specific language impairment. Journal of Child Psychology and Psychiatry and Allied Disciplines, 42, 329–340.
Calvert, G., Bullmore, E., Brammer, M., Campbell, R., Woodruff, P., McGuire, P., et al (1997). Activation of auditory cortex during silent speechreading. Science, 276, 593–596.
Cowie, R.E. & Douglas-Cowie, E. (1992). Postlingually acquired deafness: speech deterioration and the wider consequences. New York: Mouton de Gruyter.
Daneman, M. & Merikle, P.M. (1996). Working memory and language comprehension: a meta-analysis. Psychonomic Bulletin & Review, 3(4), 422–433.
Gailey, L. (1987). Psychological parameters of lipreading skill. In B. Dodd & R. Campbell (Eds.), Hearing by eye: the psychology of lipreading (pp. 115–141). London: Lawrence Erlbaum.
Giraud, A.-L., Price, C.J., Graham, J.M., Truy, E. & Frackowiak, R.S.J. (2001). Cross-modal plasticity underpins language recovery after cochlear implantation. Neuron, 30, 657–663.
Giraud, A.-L., Truy, E., Frackowiak, R.S.J., Grégoire, M.-C., Pujol, J.-F. & Collet, L. (2000). Differential recruitment of the speech processing system in healthy subjects and rehabilitated cochlear implant patients. Brain, 123, 1391–1402.
Höien, T., Lundberg, I., Stanovich, K.E. & Bjaalid, I.-K. (1995). Components of phonological awareness. Reading and Writing, 7, 171–188.
Jeffers, J. & Barley, M. (1971). Speechreading. Springfield, IL: Charles C. Thomas Publishers.
Just, M.A. & Carpenter, P.A. (1992). A capacity theory of comprehension—individual differences in working memory. Psychological Review, 99, 122–149.
Kegl, J., Cohen, H. & Poizner, H. (1999). Articulatory consequences of Parkinson’s disease: perspectives from two modalities. Brain and Cognition, 40, 355–386.
Kjelgaard, M.M. & Speer, S.H. (1999). Prosodic facilitation and interference in the resolution of temporary syntactic closure ambiguity. Journal of Memory & Language, 40, 153–194.
Larsby, B. & Arlinger, S. (1994). Speech recognition and just-follow-conversation tasks for normal-hearing and hearing-impaired listeners with different maskers. Audiology, 33, 165–176.
Leybaert, J. (1998). Phonological representations in deaf children: the importance of early linguistic experience. Scandinavian Journal of Psychology, 39, 169–173.
Leybaert, J. & Lechat, J. (2001). Phonological similarity effects in memory for serial order of cued speech. Journal of Speech, Language, and Hearing Research, 44, 949–963.
Levänen, S. (1998). Neuromagnetic studies of human auditory cortex function and reorganisation. Scandinavian Audiology, 27(suppl 49), 1–6.
Lindfield, K.C., Wingfield, A. & Goodglass, H. (1999). The role of prosody in the mental lexicon. Brain & Language, 68, 312–317.
Logie, R.H. (1996). The seven ages of working memory. In J.T.E. Richardson, R.W. Engle, L. Hasher, R.H. Logie, E.R. Stoltzfus & R.T. Zacks (Eds.), Working memory and human cognition. Oxford: Oxford University Press.
Ludman, C.N., Summerfield, A.Q., Hall, D., Elliott, M., Foster, J., Hykin, J.L., et al (2000). Lip-reading ability and patterns of cortical activation studied using fMRI. British Journal of Audiology, 34, 225–230.
Luce, P.A. & Pisoni, D.A. (1998). Recognising spoken words: the neighborhood activation model. Ear & Hearing, 19, 1–36.
Lyxell, B. (1994). Skilled speechreading—a single case study. Scandinavian Journal of Psychology, 35, 212–219.
Lyxell, B., Arlinger, S., Andersson, J., Bredberg, G., Harder, H. & Rönnberg, J. (1998). Phonological representation and speech understanding with cochlear implants in deafened adults. Scandinavian Journal of Psychology, 39, 175–179.
Lyxell, B., Arlinger, S., Andersson, J., Harder, H., Näsström, E., Svensson, H., et al (1996). Information-processing capabilities and cochlear implants: pre-operative predictors for speech understanding. Journal of Deaf Studies & Deaf Education, 1, 190–201.
Lyxell, B. & Rönnberg, J. (1989). Information-processing skills and speechreading. British Journal of Audiology, 23, 339–347.
Lyxell, B., Rönnberg, J. & Samuelsson, S. (1994). Internal speech functioning and speechreading in deafened and normal hearing adults. Scandinavian Audiology, 23, 181–185.
MacSweeney, M., Campbell, R., Calvert, G.A., McGuire, P.K., David, A.S., Suckling, J., et al (2001). Dispersed activation in the left temporal cortex for speech-reading in congenitally deaf people. Proceedings of the Royal Society of London Series B—Biological Sciences, 268, 451–457.
Markova, I. & Linell, P. (1996). Coding elementary contributions to dialogue: individual acts versus dialogical interactions. Journal for the Theory of Social Behavior, 26, 353–373.
Marschark, M. & Mayer, T.S. (1998). Interactions of language and memory in deaf children and adults. Scandinavian Journal of Psychology, 39, 145–148.
Marslen-Wilson, W. (1987). Functional parallelism in spoken word recognition. Cognition, 25, 71–103.
McClelland, J.L. & Elman, J.L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1–86.
McGuire, P.K., Robertson, D., Thacker, A., David, A.S., Kitson, N., Frackowiak, R.S.J., et al (1997). Neural correlates of thinking in sign language. Neuroreport, 8, 695–698.
Näätänen, R. (1996). Neurophysiological basis of the echoic memory as suggested by event-related potentials and magnetoencephalogram. In F. Klix & H. Hagendorf (Eds.), Human memory and cognitive capabilities (pp. 615–628). Amsterdam: North Holland.
Näätänen, R. & Escera, C. (2000). Mismatch negativity: clinical and other applications. Audiology and Neuro-otology, 5, 105–110.
Petitto, L.A., Zatorre, R.J., Gauna, K., Nikelski, E.J., Dostie, D. & Evans, A.C. (2000). Speech-like cerebral activity in profoundly deaf people processing signed languages: implications for the neural basis of human language. Proceedings of the National Academy of Sciences of the United States of America, 97(25), 13961–13966.
Pichora-Fuller, M.K., Schneider, B.A. & Daneman, M. (1995). How young and old adults listen to and remember speech in noise. Journal of the Acoustical Society of America, 97, 593–608.
Richardson, J.T.E., Engle, R.W., Hasher, L., Logie, R.H., Stoltzfus, E.R. & Zacks, R.T. (1996). Working memory and human cognition. Oxford: Oxford University Press.
Rönnberg, J. (1990). Cognitive and communicative function: the effects of chronological age and ‘handicap age’. European Journal of Cognitive Psychology, 2, 253–273.
Rönnberg, J. (1993). Cognitive characteristics of skilled tactiling: the case of GS. European Journal of Cognitive Psychology, 5, 19–33.
Rönnberg, J. (1995). What makes a skilled speechreader? In G. Plant & K. Spens (Eds.), Profound deafness and speech communication (pp. 393–416). London: Whurr Publishers.
Rönnberg, J. (2003). Working memory, neuroscience, and language: evidence from the deaf and hard of hearing individuals. In M. Marschark & P.E. Spencer (Eds.), The handbook of deaf studies, language and education (pp. 478–489). Oxford: Oxford University Press.
Rönnberg, J., Andersson, J., Andersson, U., Johansson, K., Lyxell, B. & Samuelsson, S. (1998b). Cognition as a bridge between signal and dialogue: communication in the hearing impaired and deaf. Scandinavian Audiology, 27(suppl 49), 101–108.
Rönnberg, J., Andersson, U., Lyxell, B. & Spens, K. (1998a). Vibrotactile speechreading support: cognitive prerequisites for training. Journal of Deaf Studies & Deaf Education, 3, 143–156.
Rönnberg, J., Andersson, J., Samuelsson, S., Söderfeldt, B., Lyxell, B. & Risberg, J. (1999). A speechreading expert: the case of MM. Journal of Speech, Language and Hearing Research, 42, 5–20.
Rönnberg, J. & Nilsson, L.-G. (1987). The modality effect, sensory handicap, and compensatory functions. Acta Psychologica, 65, 263–283.
Rönnberg, J., Samuelsson, S. & Lyxell, B. (1998c). Conceptual constraints in sentence-based lipreading in the hearing impaired. In R. Campbell, B. Dodd & D. Burnham (Eds.), Hearing by eye: Part II. Advances in the psychology of speechreading and audiovisual speech (pp. 143–153). London: Lawrence Erlbaum Associates.
Rönnberg, J., Söderfeldt, B. & Risberg, J. (2000). The cognitive neuroscience of signed language. Acta Psychologica, 105, 237–254.
Samuelsson, S. & Rönnberg, J. (1991). Script activation in lipreading. Scandinavian Journal of Psychology, 32, 124–143.
Samuelsson, S. & Rönnberg, J. (1993). Implicit and explicit use of scripted constraints in lipreading. European Journal of Cognitive Psychology, 5, 201–233.
Smith, E.E. & Jonides, J. (1997). Working memory: a view from neuroimaging. Cognitive Psychology, 33, 5–42.
Tun, P.A. & Wingfield, A. (1999). One voice too many: adult age differences in language processing with different types of distracting sounds. Journal of Gerontology B Psychological Sciences, 54, 317–327.
Wagner, R.K. & Torgesen, J.K. (1987). The nature of phonological processing and its causal role in the acquisition of reading skills. Psychological Bulletin, 101, 192–212.
Wilson, M. (2001). The case for sensorimotor coding in working memory. Psychonomic Bulletin & Review, 8, 44–57.
Wilson, M. & Emmorey, K. (1997). Working memory for sign language: a window into the architecture of the working memory system. Journal of Deaf Studies and Deaf Education, 2, 121–130.
Zatorre, R.J. (2001). Do you see what I’m saying? Interactions between auditory and visual cortices in cochlear implant users. Neuron, 31, 13–14.