
Looking backward and forward at classroom-based language assessment
Stephen Stoynoff

ELT Journal Volume 66/4 Special issue October 2012; doi:10.1093/elt/ccs041
© The Author 2012. Published by Oxford University Press; all rights reserved.
Downloaded from http://eltj.oxfordjournals.org/ at Universidad de Chile on January 28, 2015

In the past few decades, approaches to language assessment and perspectives
on learning have changed. This article highlights those developments with the
greatest significance for teachers and classroom-based language assessment,
including the emergence of new perspectives on the nature of language ability
and learning, use of an expanded array of assessment procedures, concern
for the consequences of assessment, interest in embedding assessment in
teaching and learning, and attention to the assessment of young learners.
It concludes with a discussion of the implications of these developments for
future research in classroom-based assessment and for teachers.

While scholars debate the precise interpretations and origins of Janus,
the notion is widely associated with the two-headed Roman god of
endings and beginnings who was capable of simultaneously looking
backward and forward. As the term of one ELT Journal editor ends and
the term of another editor begins in January, Janus is a fitting metaphor
not only for the editorial transition that is underway but also for an
article that considers some of the developments of the past few decades
and their impact on classroom-based language assessment. I have
chosen to characterize these developments as (a) endings that I believe
have occurred, (b) transitions that are underway, and (c) beginnings that
are not yet fully realized.

Approaches to language assessment

In a discussion of perspectives on language assessment, Bachman
(2007) reviewed testing practices over the past five decades and
categorized them into seven approaches:

skills and elements
direct testing/performance assessment
pragmatic language testing
communicative language testing
interaction-ability (communicative language ability)
task-based performance assessment
interactional language assessment.

Although the list implies a chronological sequence, in fact some
approaches overlap and several are actually extensions, reformulations,
or elaborations of previous approaches based on theoretical
developments and the results of subsequent research.

The Skills and Elements approach emphasizes the assessment of
reading, writing, speaking, and listening skills and key aspects (or
elements) of language. Language elements include the phonological,
grammatical, lexical, and cultural knowledge tapped during the
integrated use of language skills. Direct Testing, also referred to by
some as Performance Assessment, emphasizes the use of tasks that
simulate real-world language use. Pragmatic Language Testing is
based on the premise that language ability is not comprised of a set of
distinct skills and elements but rather is a unified general ability that is
optimally assessed through such integrative assessment procedures as
an interview, a composition, a dictation, or a cloze passage.

Communicative Language Testing, according to Bachman (ibid.),
reflects the intersection of research in several disciplines, including
functional linguistics, sociolinguistics, psycholinguistics, and language
teaching. It focuses on what language users know about the language
and the extent to which they use their knowledge appropriately in
communicative language use situations. For another perspective and
fuller discussion of communicative language testing, see Morrow
(2012).

A number of scholars have extended and elaborated the communicative
language testing approach into what Bachman labels the
Interaction-ability approach. This approach assumes language ability
is comprised of multiple, highly interrelated subcompetencies that
interact in a language use situation and assessment tasks are developed
on the basis of the most salient features in a language use situation. As
interest in the use of tasks has grown, two orientations to Task-based
Performance Assessment have evolved. The first focuses on the types
of tasks, processes, and abilities to be assessed, and in this regard, it is
similar to the Interaction-ability approach. The second orientation is
essentially an iteration of Performance Assessment: tasks are based on
real-world communicative language use and performance is evaluated
on the basis of real-world considerations.

The six approaches described above focus on the language ability
possessed by test takers, or the tasks they are able to perform, or both.
Conversely, the Interactional approach emphasizes the interaction
of language ability, social contexts, and the communication that
occurs and is jointly constructed by participants. It is an approach
to assessment that is informed by a sociocultural perspective and it
acknowledges assessment as a form of social practice, albeit a highly
complex practice affected by multiple interacting factors.

Looking backward

In a cursory examination of the table of contents of Volumes 49 (1995)
through 65 (2011) of the Journal, I identified approximately 40 articles
that addressed what I considered to be language assessment issues.

An article by Upshur and Turner (1995) in 49/1 noted the expanding
use of performance-based tasks in the assessment of second language
ability, and the authors thought this called for more attention to the
development of adequate rating scales and procedures to support the
use of this type of assessment in classroom contexts. They described a
process teachers could use to construct context-specific and empirically
derived rating scales suitable for assessing task-based language
performance in classroom situations. The process entailed selecting
a sample of task performances from the target group of learners,
arranging the sample performances hierarchically to represent the full
range of performance on the task, and developing a set of specific
yes/no questions (representing a scale) related to the task that could
then be used to differentiate between levels of task performance.
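
The yes/no procedure described above can be pictured as a small decision tree: each question separates stronger from weaker performances, and the path of answers leads to a level. The following Python sketch is only an illustration under stated assumptions. The questions, the four levels, and the names `Question`, `rate`, and `RETELL_SCALE` are hypothetical and are not taken from Upshur and Turner's article.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Question:
    """One yes/no question that splits performances into two groups."""
    text: str
    if_yes: Union["Question", int]  # next question, or a final level
    if_no: Union["Question", int]


def rate(scale: Union[Question, int], answer: Callable[[str], bool]) -> int:
    """Follow the rater's yes/no answers until a level is reached."""
    node = scale
    while isinstance(node, Question):
        node = node.if_yes if answer(node.text) else node.if_no
    return node


# Hypothetical four-level scale for a story-retelling task.
RETELL_SCALE = Question(
    "Is the retelling coherent as a whole?",
    if_yes=Question("Is it delivered with little hesitation?", 4, 3),
    if_no=Question("Are at least some story elements present?", 2, 1),
)

# A rater's judgements for one learner performance (invented data).
judgements = {
    "Is the retelling coherent as a whole?": True,
    "Is it delivered with little hesitation?": False,
}
level = rate(RETELL_SCALE, lambda q: judgements[q])
print(level)  # prints 3
```

In practice the questions would be written by the teacher from the ranked sample performances, so a scale of this kind is specific to one task and one group of learners.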

Language assessment specialists acknowledge that many factors affect
performance on language tasks, including characteristics of the test
taker and the task, and the factors interact in complex ways. Upshur
and Turner considered the influence of scoring procedures on the
assessment of task performance and how the procedures can affect
the reliability and validity of a rating scale and assessors' awarding
of scores. Their approach to scale construction addressed some of
the unintended effects of a scoring procedure on the evaluation of
task performance, and the authors offered teachers a practical and
systematic process for improving the quality of a performance-based
classroom assessment.

Questions about the quality of teacher-constructed classroom
assessments continue to be debated. Some argue we can improve
the quality of classroom assessments by applying the same
psychometric principles and test quality considerations used with
large-scale, standardized assessments. Other notable language
assessment specialists (McKay 2006; Rea-Dickins 2008; Davison
and Leung 2009) contend it is not possible or appropriate to apply
the principles and values of the traditional psychometric orientation
to the development and use of classroom-based assessments because
psychometric and classroom considerations are too divergent. If this
is indeed the case, the validation of classroom-based assessments will
likely require new ways of looking at the key test qualities of reliability
and validity.

Issue one of Volume 49 also included an article by Prodromou (1995)
that explored backwash, a term that has come to be more widely
referred to as washback, and its implications for second language
teaching and assessment practices. The author argued that a number
of prevailing practices limited the potential for positive washback on
language examinations and in second language learning classrooms.
These practices included (a) assessing isolated, sentence-level
samples of language by means of multiple-choice, gap-filling, and
transformation item types; (b) limiting the time learners have to
respond to assessment tasks; (c) valuing accuracy more than language
development and form more than content; and (d) failing to align
assessment procedures with curricula and teaching pedagogy.

It is clear not only from the Upshur and Turner article but also from
what was reported in other journals at the time that assessment
practices were beginning to change. New approaches to language
assessment, including the interaction-ability, task-based performance
assessment, and interactional approaches, were affecting both what
was assessed and how it was assessed. Test developers and classroom
teachers were employing a broader range of assessment procedures
and relying less on objectively scored selected response item types
epitomized by the multiple-choice test. In the years leading up to and
following publication of Prodromou's article, task-based assessment
was increasingly perceived as a more authentic and direct assessment
procedure and an appropriate approach to assessing the knowledge and
skills needed to use language for communicative purposes. At the same
time, language teaching curricula and pedagogy increasingly focused
on developing learners' capacity to perform relevant communicative
tasks in both in-class and out-of-class contexts and teachers' classroom
assessment procedures routinely required learners to construct oral and
written responses to tasks.

In the years since Prodromou's article was published, considerable
attention has been directed to the consequences of using language
assessments. The impetus for much of the research is attributable to
the introduction of educational reforms and interest in determining
the effect of curricular changes on student achievement in schools
and educational systems. Concern for the consequences of assessment
has led major test developers like Cambridge ESOL and Educational
Testing Service (ETS) to investigate the impact of their tests on
stakeholders and to use the research results to promote improvements
in assessment practices. Today most high-stakes English language
tests include task-based performance assessment and because the tasks
are based on the results of both theoretical and empirical research,
they better replicate the language content and experiences examinees
encounter in non-test language use situations, including in language
learning classrooms. I submit that concern for test consequences
has not only increased the positive washback for examinees and
those preparing to take the tests but it has also contributed to more
congruency between the tests and current classroom assessment
practices (Stoynoff 2009).

DIALANG is a good example of progress in the effort to align
large-scale language assessments more closely with classroom
assessment purposes and practices. It is a self-directed assessment
available for 14 languages (including English) that is accessed for free
via the internet. Examinees are able to assess their reading, writing,
listening, grammar, and vocabulary abilities in a foreign language,
and they receive feedback on their strengths and weaknesses as well
as their language level based on the Common European Framework
of Reference for Languages (CEFR). DIALANG reflects recent trends
in assessment, including the emphasis on learner autonomy, the
diagnosis of gaps, and the provision of feedback. Moreover, unlike most
high-stakes tests, DIALANG does not impose time constraints on test
takers, and this is another feature of the assessment that contributes to
positive washback.

What endings have occurred?

Certainly, the hegemony of the psychometric orientation to assessment
has ended, and there is less reliance on an approach to assessment
that yields a single test score obtained by means of multiple-choice
items that focus on discrete language skills and elements. The adoption
of other approaches to language assessment has led to the use of a
broader range of assessment procedures, and these procedures take
into account the nature of language ability, the characteristics of test
takers and tasks, and the context of language use.

It is also clear the consequences of assessment are no longer
overlooked. There is widespread recognition that the use of
assessments and the interpretation of assessment results have
consequences for individuals, teaching and learning, and society
at large. Since the consequences can often be significant, language
teachers are obliged to monitor the effects of assessment and to seek
to minimize the negative consequences and maximize the positive
consequences of assessment use.

Additionally, the separation of teaching, learning, and assessment has
largely ended. Most teachers no longer view assessment as something
that only occurs after the fact. Rather they recognize the benefits of
conducting assessment before, during, and following teaching and
learning.

What transitions are underway?

In addition to these endings, several transitions appear to be
underway. The emergence of a new (sociocultural) perspective on
learning has occurred and it is becoming the dominant paradigm. It
generally views learning as a process that is developmental, socially
constructed, interactive, and reflective. Proponents of the sociocultural
perspective believe individual knowledge and learning are developed
through socially mediated interaction in a context that is affected by
social, cultural, and individual considerations (Shepard 2006). In
the language assessment literature, the perspective is manifest in a
concern for fairness in assessment and consideration of the social,
political, and educational implications of assessment for individuals
and language teaching and learning. As this orientation becomes more
prevalent, classroom assessment practices will increasingly reflect the
characteristics Davison and Leung (op.cit.) use to define teacher-based
assessment. Classroom-based assessment will

integrate the teacher fully into the assessment process including planning assessment, evaluating performance, and making decisions based on the results of assessment;
be conducted by and under the direction of the learner's teacher (as opposed to an external assessor);
yield multiple samples of learner performance that are collected over time and by means of multiple assessment procedures and activities;
be applied and adapted to meet the teaching and learning objectives of different classes and students;
integrate learners into the assessment process and utilize self- and peer-assessment in addition to teacher-assessment of learning;
foster opportunities for learners to engage in self-initiated enquiry;
offer learners immediate and constructive feedback; and
monitor, evaluate, and modify assessment procedures to optimize teaching and learning.

The growing influence of the sociocultural perspective has led
some assessment specialists to reconsider the distinctions between
formative and summative assessment. Typically, formative
assessment has been conducted informally and frequently in
classrooms for the primary purpose of promoting learning, and
summative assessment has been formally planned and periodically
conducted for the primary purpose of documenting learners' progress
or achievement. Yet, if formative and summative assessments are to be
more effectively used to support learning, they need to be conceptually
aligned and teachers need to be able to integrate the results of both
forms of assessment into a more unified process.

Recent approaches to language assessment and a sociocultural
view of learning have spurred interest in alternatives to the use
of a single, discrete-point test comprised of selected response
items to assess language ability. Some of the most frequently cited
alternative assessment procedures used in language classrooms are
portfolios, projects, journals, conferences, observations, interviews,
and simulations. In addition to utilizing a variety of assessment
procedures, alternative assessment often means sharing responsibility
for evaluating task performance and having learners engage in self- and
peer-evaluation of performance. Advocates of alternative assessment
list among its advantages the fact it (a) is consistent with emerging
perspectives on teaching and learning (i.e. those that view learning as
developmental, socially constructed, interactive, and reflective in nature)
and (b) is capable of contributing more information about learners than
a single test comprised of selected response items (Fox 2008). In the
past decade, alternative assessment procedures, including the use of
portfolio assessment, peer-assessment, and self-assessment, have been
widely discussed in ELT Journal as evidenced by the publication of more
than 20 articles.

As interest in integrating classroom teaching, learning, and assessment
has grown, several scholars have responded by proposing frameworks
to guide teachers in organizing classroom assessment into a
more systematic process. Davison (Davison and Leung op.cit.), for
example, proffers a practical, four-step process (planning assessment,
collecting information on student learning, making judgements about
performance, and providing appropriate feedback or advice) to assist
teachers. Dynamic Assessment (DA) is a theory-based framework that
integrates assessment and instruction by combining the two into a
single activity that both diagnoses and develops the learner's ability
by providing a key form of support (mediation) whereby the learner
is made aware of problems and assisted in overcoming them. DA is
based on the notion of Vygotsky's Zone of Proximal Development and
the belief that important cognitive functions are mediated through
social interactions and physical and symbolic artefacts (Poehner 2009).
Purpura (2009) argues teachers need to use frameworks that are
based on both theoretical and empirical research in second language
acquisition (SLA) and language testing; his learning-oriented model of
language assessment represents such a framework. Each of the above
examples addresses the issue of formative assessment and embeds
classroom assessment in the learning process.

As standards-based curricula proliferate and the emphasis on
classroom-based formative assessment grows, attention has focused
on young learners. McKay (op.cit.: 1) defined young language learners
as those between the ages of 5 and 12 who are learning a second
or foreign language during the first six or seven years of formal
schooling. She averred these learners are distinct from the general
language learning population because they are experiencing cognitive,
social, emotional, and physical growth and developing literacy.
They are also psychologically fragile and sensitive to the comments
and actions of teachers and peers. These factors must be taken into
account when assessing this population. Some of the issues related
to the assessment of young learners include consideration of (a) the
context of the language learning, (b) the purpose(s) of assessment
and appropriateness of the assessment procedure(s), and (c) the
developmental and cultural factors affecting young learners and the
impact of these factors on the design, use, and interpretation of results
of language assessments.

To summarize, a review of developments in classroom-based
assessment over the last decade reveals professional practice is in
transition. A new perspective toward learning and assessment is
emerging, and it has implications for teachers and classroom-based
assessment practices. The sociocultural perspective is prompting a
reconsideration of how to forge stronger connections between language
learning and assessment. Teachers are using an expanded array of
(alternative) assessment procedures, and there are frameworks available
to assist them in systematically integrating assessment into language
learning. Finally, the distinct characteristics and needs of young
language learners are being taken into account and efforts are being
made to use appropriate assessment procedures with these learners and
to make appropriate interpretations with the assessment results.

What beginnings are not fully realized?

Research on classroom-based language assessment is nascent and
therefore many issues warrant investigation. To date, the majority of
published studies have focused on teachers' assessment practices;
the influence of language assessment frameworks and standards on
assessment practices; the impact of language policies and political
agendas on assessment practices; teachers' knowledge, abilities, and
beliefs; and teacher professional development in assessment. Based on
the available evidence, teachers' assessment practices appear to vary
greatly and are influenced by numerous factors, including teachers'
professional experience and training and the characteristics of the
assessment schemes they are required to use (Davison and Leung
op.cit.). In terms of future developments in classroom-based language
assessment, scholars have called for more research on the following
issues.

The effect of socioculturally oriented assessment practices and
procedures on language learning, including the effect of different
assessors (self, peer, and teacher)

A new period in classroom-based language assessment is beginning.
It is one in which teaching, learning, and assessment are viewed as
interconnected and embedded in socially mediated interaction that
occurs in contexts of language development and use. Research that
investigates the effect of socially situated assessment practices on
language learning will provide empirical evidence to support the use of
classroom practices that are consonant with the new perspective. This
research needs to investigate the extent to which different assessment
procedures, including the use of self-, peer-, and teacher-assessment,
promote student reflection, self-directed learning, and a positive
learning environment. There is also a need for research that (a)
integrates developments in SLA and language assessment, (b) occurs
longitudinally and in multiple phases, and (c) focuses on the formative
purposes of assessment.

The effect of using specific frameworks and systematic processes
on the quality (particularly the qualities of reliability and validity) of
various classroom assessment procedures

As noted previously, one of the most vexing issues in classroom-based
assessment has to do with the quality of the assessment procedures.
In particular, it is not clear what standard to apply to the key qualities
of reliability and validity and some classroom teachers do not know
how to validate classroom assessments. For instance, score reliability
on an assessment can be enhanced by training teachers and students
in the design and use of the assessment procedure, and validity can
be enhanced by ensuring assessment tasks adequately sample and
represent the content, processes, and complexity of what students are
learning. Case studies that explore and report on the development and
validation of classroom-based assessments for various purposes and in
specific contexts can reveal how interpretations of assessment results
contribute to instructional decisions and promote student learning. The
dissemination of case study results will encourage best practices and
further understandings of classroom assessment.

The impact of language policies and educational reforms
on classroom-based assessment (especially with regard to
standards-based classroom assessment)

One of the most significant educational reforms in recent years has
been the widespread adoption of standards-based curricula. However,
there are limited accounts of how standards affect teachers' classroom
assessment practices or how teachers use standards, including the
extent to which the standards influence the assessments teachers adopt,
adapt, or develop for use in their classrooms. See Llosa (2011) for a
recent review of issues and research on standards-based classroom
assessment. There is a need for more descriptions of the effects of
particular standards-based classroom assessment practices on language
teaching and learning.

The language assessment of young learners

Because so little research has been conducted on the assessment of
young learners, many basic issues remain unexplored. These include
such considerations as the kinds of language assessments teachers
report using in the formative assessment of young learners or the kinds
of guidelines, frameworks, or resources teachers report affect their
assessment practices with young learners. It is also important to gain
insights into the extent to which the second language standards and
performance descriptions used with young learners are consistent with
their developmental level and based on theoretical conceptualizations
and empirical research.

The assessment knowledge, abilities, beliefs, and professional
development needs of teachers

Finally, teachers' assessment knowledge, abilities, and beliefs
affect their assessment practices, and research that explores these
considerations has implications for teacher preparation and
professional development (Taylor 2009). Survey research can
provide some of the essential empirical evidence needed to establish
what teachers need to know and what they need to do when they
conduct classroom assessment. However, survey results need to be
complemented with other empirical evidence of the effect of teacher
characteristics on assessment practices.

Implications for language teachers

The developments described in this paper have a number of
implications for language teachers. First, teachers need to reflect on
their assessment practices and beliefs and determine how they can
use assessment practices and results to improve student language
learning. Assessment frameworks can guide teachers in organizing
the assessment process in their classrooms. Second, teachers need to
make optimal use of assessments. This means selecting appropriate
assessment procedures based on curricular aims, the assessment
purpose, and the learners. Third, teachers need to attain and
sustain sufficient expertise in assessment to fulfil their professional
responsibilities. In most cases, this will include receiving training in
assessment during a teacher preparation programme and periodic
professional development in assessment thereafter. Finally, teachers
need to investigate their classroom-based assessment practices and
share their findings.

Clearly, we are only beginning to examine issues related to
classroom-based assessment and teachers have a role to play in
advancing research in this area. Case study methodology is a
particularly relevant research design for teachers who want to explore
aspects of their professional practice, and I reported in 58/4 (Stoynoff
2004) on how the methodology has been applied to investigations of
ELT practice. Since case study investigations can include qualitative or
quantitative data or both, they can be tailored to the research expertise,
interests, and local circumstances of the teacher-researcher.

Janus's ability to look backward and forward was unique among
Roman deities. As ELT professionals, we have the same ability as Janus
and using it ensures we stay abreast of developments and aware of
future directions in our field. Presently, classroom-based assessment
issues are among the least explored and most important in language
assessment. In the years ahead, we can expect more attention to and
increased understanding of them.

References
Bachman, L. 2007. 'What is the construct? The
dialectic of abilities and contexts in defining
constructs in language assessment' in J. Fox,
M. Wesche, D. Bayliss, L. Cheng, C. Turner, and
C. Doe (eds.). Language Testing Reconsidered.
Ottawa, Canada: University of Ottawa Press.
Davison, C. and C. Leung. 2009. 'Current issues
in English language teacher-based assessment'.
TESOL Quarterly 43/3: 393–415.
Fox, J. 2008. 'Alternative assessment' in
E. Shohamy and N. Hornberger (eds.).
Encyclopedia of Language and Education Volume 7
Language Testing and Assessment. New York, NY:
Springer.
Llosa, L. 2011. 'Standards-based classroom
assessments of English proficiency: a review
of issues, current developments, and future
directions for research'. Language Testing 28/3:
367–82.
McKay, P. 2006. Assessing Young Language
Learners. Cambridge: Cambridge University
Press.
Morrow, K. 2012. 'Communicative language
testing' in C. Coombe, P. Davidson, B. O'Sullivan,
and S. Stoynoff (eds.). The Cambridge Guide
to Second Language Assessment. Cambridge:
Cambridge University Press.
Poehner, M. 2009. 'Group dynamic assessment:
mediation for the L2 classroom'. TESOL Quarterly
43/3: 471–91.
Prodromou, L. 1995. 'The backwash effect: from
testing to teaching'. ELT Journal 49/1: 13–25.
Purpura, J. 2009. 'The impact of large-scale
and classroom-based language assessments on
the individual' in L. Taylor and C. Weir (eds.).
Language Testing Matters. Cambridge: Cambridge
University Press.
Rea-Dickins, P. 2008. 'Classroom-based language
assessment' in E. Shohamy and N. Hornberger
(eds.). Encyclopedia of Language and Education
Volume 7 Language Testing and Assessment. New
York, NY: Springer.
Shepard, L. 2006. 'Classroom assessment' in
R. Brennan (ed.). Educational Measurement
(fourth edition). Westport, CT: Praeger
Publishers.
Stoynoff, S. 2004. 'Case studies in TESOL
practice'. ELT Journal 58/4: 379–93.
Stoynoff, S. 2009. 'Recent developments in
language assessment and the case of four
large-scale tests of ESOL ability'. Language
Teaching 42/1: 1–40.
Taylor, L. 2009. 'Developing assessment literacy'.
Annual Review of Applied Linguistics 29: 21–36.
Upshur, J. and C. Turner. 1995. 'Constructing
rating scales for second language tests'. ELT
Journal 49/1: 3–12.

The author
Stephen Stoynoff is Professor in the Department
of English, Minnesota State University, Mankato,
USA, where he teaches graduate courses in
second language testing, literacy development, and
research methods, in the MA TESL program. He
is co-editor, with Carol Chapelle, of ESOL Tests and
Testing (TESOL 2005) and co-editor, with Christine
Coombe, Peter Davidson, and Barry O'Sullivan, of
The Cambridge Guide to Second Language Assessment
(2012). He is past editor of TESOL Journal and a
former member of the ELT Journal Editorial Panel.
Email: stephen.stoynoff@mnsu.edu