Student assessment plays a central role in teaching and learning. In their daily teaching, teachers devote a large part of their preparation time to creating instruments and observation procedures, marking, recording, and synthesizing results in informal and formal reports. A number of studies of the assessment practices used by teachers in regular school classrooms have been undertaken (e.g., Rogers, 1991; Wilson, 1998; 2000). In contrast, less is known about the assessment practices employed by instructors of English as a Second Language (ESL) and English as a Foreign Language (EFL), particularly at the tertiary level. This article reports a comparative survey conducted in ESL/EFL contexts represented by Canadian ESL, Hong Kong ESL/EFL, and Chinese EFL, in which 267 ESL or EFL instructors participated, and documents the purposes, methods, and procedures of assessment in these three contexts. The findings demonstrate the complex and multifaceted roles that assessment plays in different teaching and learning settings. They also provide insights into the nature of assessment practices in relation to ESL/EFL classroom teaching and learning at the tertiary level.
1 We have chosen to use the term instructors to refer to those who are teaching ESL/EFL at the tertiary level, and the term teachers to refer to those who are teaching in the school system.
III Methodology
For the purposes of the present study, assessment is defined as the process of collecting information about a student to aid in decision-making about the progress and language development of the student. Evaluation is defined as the interpretation of assessment results that describes the worth or merit of a student's performance in relation to a set of learner expectations or standards of performance.
1 Survey questionnaire
The survey questionnaire (see Appendix 1) comprised five parts illustrating major constructs in classroom assessment (see Code of Fair Testing Practices for Education, 1988; Rogers, 1993; Standards for Teacher Competence in Educational Assessment of Students, 1990); its design was based on the studies reviewed in Section I above. The questionnaire was piloted among a small number of instructors in each of the three contexts. Approximately 40 minutes was required to complete the survey questionnaire.
2 Samples
Purposive sampling was employed to select ESL/EFL instructors teaching at universities in the provinces of Alberta, British Columbia, and Ontario in Canada, in Hong Kong, and in Beijing in China. The three samples represented, respectively, three ESL/EFL instructional settings: English-dominant, bilingual (English and Cantonese), and Mandarin-dominant, as it was expected that teaching and learning contexts would differ. In each of these locations, ESL/EFL instructors at each university that formally offered an ESL/EFL program were sent a questionnaire and a self-addressed envelope. Four researchers co-ordinated the study: two in the West and East of Canada, respectively, one in Hong Kong, and one in China. Wherever possible, meetings were held with the co-ordinator at each university to explain the purpose of the study and to answer any questions prior to the distribution of the questionnaires. A glossary of assessment terms was also provided to promote the validity of the questionnaire. Altogether, 461 questionnaires were distributed: 191 in Canada, 140 in Hong Kong, and 130 in Beijing. Of this number, 98 (51.3%) were returned in Canada, 45 (32.0%) in Hong Kong, and 124 (95.3%) in Beijing.
3 Analyses
The responses to the survey questionnaire were entered into a computer file with 100% verification. Examination for missing item-level data revealed that the amount of missing data was such that 4 respondents were excluded from the analysis. The final sample sizes were 95 for Canada, 44 for Hong Kong, and 124 for Beijing.
Descriptive analyses were used to summarize the bio-demographic information provided by the respondents. These analyses revealed that it was not possible to cross any of the bio-demographic and teaching variables with setting due to insufficient sample sizes in some of the cells. Consequently, the comparative
2 For example, as will be shown, there was a significant difference for the purpose 'formally document growth in learning of my students'. However, the pattern of differences among the three settings did not meet the property of transitivity. While the percentage for Beijing was significantly lower than the percentage for Canada, the differences between Canada and Hong Kong and between Beijing and Hong Kong were not significant. In cases such as these, the significant difference is not claimed.
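The transitivity pattern described in this footnote can be illustrated with pairwise comparisons of proportions. The article does not state which statistical test or correction was used, so the sketch below is an assumption: two-sided two-proportion z-tests with a Bonferroni-adjusted alpha for the three pairwise comparisons, using counts reconstructed (rounded) from the reported n and % for 'formally document growth in learning of my students'.

```python
import math

def two_prop_ztest(x1, n1, x2, n2):
    """Two-sided two-proportion z-test using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                      # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF (via math.erf)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Counts reconstructed from the reported n and %:
# Canada 82.1% of 93, Hong Kong 64.1% of 41, Beijing 51.6% of 111.
canada, hong_kong, beijing = (76, 93), (26, 41), (57, 111)
alpha = 0.05 / 3                                   # Bonferroni adjustment for 3 comparisons

pairs = [(("Canada", canada), ("Beijing", beijing)),
         (("Canada", canada), ("Hong Kong", hong_kong)),
         (("Beijing", beijing), ("Hong Kong", hong_kong))]
for (name_a, (x_a, n_a)), (name_b, (x_b, n_b)) in pairs:
    z, p = two_prop_ztest(x_a, n_a, x_b, n_b)
    verdict = "significant" if p < alpha else "not significant"
    print(f"{name_a} vs {name_b}: z = {z:.2f}, p = {p:.3f} ({verdict})")
```

Under these assumptions, only Canada vs. Beijing reaches significance, while the Canada–Hong Kong and Beijing–Hong Kong differences do not, reproducing the non-transitive pattern the footnote describes.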
IV Results
This section provides background information on the instructors in the three contexts, as these variables influence and determine the assessment practices (Breen et al., 1997). This is followed by the results on the purposes, methods, and procedures of assessment used by ESL/EFL instructors.
1 Description of participants
The percentage of male instructors was higher in Hong Kong (36.4%) than in either Canada (14.7%) or Beijing (14.5%). The instructors in Beijing were younger than those in Canada and Hong Kong (83.9% vs. 38.9% and 34.1%, respectively, under 41 years of age). Regarding educational qualifications, all but 3 instructors in Canada and 5 instructors in Beijing possessed a university degree, but there were differences in the highest degree attained: 90.7% of the instructors in Hong Kong possessed a master's or doctoral degree, in comparison to 55.8% in Canada and 59.5% in Beijing. Lastly, slightly more than two out of five instructors in Beijing reported that they had attended a course or workshop of more than three hours in which the topic of assessment and evaluation was considered. In contrast, approximately four out of five instructors in Canada and Hong Kong indicated they had attended a full course on assessment and evaluation (45.3%, 43.1%) or a course in which assessment and evaluation were key topics (31.6%, 52.3%).
The instructors in Hong Kong had more years of ESL/EFL teaching experience than their counterparts in Canada and in Beijing; respectively, 17.16, 13.58, and 10.32 years. More than 90% of the instructors in Hong Kong and Beijing had full-time teaching appointments, as compared to approximately 75% in Canada. The majority of courses taught in Hong Kong and Beijing were university degree courses (93.2% and 93.7%); in Canada they were evenly divided between degree (52.6%) and diploma/certificate courses (45.3%). The ranges and mean numbers of classes taught were similar across the three groups, with the average class size in Beijing the largest (45 students) followed, in turn, by Hong Kong (19 students) and Canada (15 students).
Purposes for assessing students                                Canada        Hong Kong     Beijing       Significant
                                                               n      %      n      %      n      %      differences
Student centered
Obtain information on my students' progress                    94   98.8    42   97.4    124   97.9
Provide feedback to my students as they progress
  through the course                                           95   98.8    44  100.0    119   86.6      B < (C H)
Diagnose strengths and weaknesses in my students               94   97.6    43   89.7    121   86.6
Determine final grades for my students                         94   91.7    43   92.3    115   75.3      B < (C H)
Motivate my students to learn                                  94   86.9    43   79.5    124   93.8
Formally document growth in learning of my students            93   82.1    41   64.1    111   51.6
Make my students work harder                                   91   64.3    41   69.2    119   94.8      (C H) < B
Prepare students for tests they will need to take
  in the future (e.g., TOEFL, MELAB, CET)                      94   53.4    41    7.7    121   68.0      H < (C B)
Instruction
Plan my instruction                                            94   92.9    42   66.7    117   87.6      H < (C B)
Diagnose strengths and weaknesses in my own
  teaching and instruction                                     94   76.2    43   64.1    119   92.8      (H C) < B
Group my students at the right level of instruction
  in my class                                                  91   65.5    41    5.1    119   60.8      H < (C B)
Administration
Provide information to the central administration              93   83.3    44   89.7    116   65.0      B < (C H)
Provide information to an outside funding agency               88   16.7    40   12.8    108    8.2
Beijing reported using the results from their student assessments and evaluations to plan their instruction, in contrast with two-thirds of the instructors in Hong Kong. Further, slightly more than nine out of 10 instructors in Beijing used their assessment results to diagnose strengths and weaknesses in their own teaching and instruction, compared with approximately three out of four instructors in Canada and fewer than two out of three instructors in Hong Kong. Lastly, while between six and seven out of 10 instructors in Canada and Beijing used their assessments and evaluations to group their students for instructional purposes, only one out of 20 instructors in Hong Kong used the results for this purpose.
Methods used to assess reading (% of instructors)              Canada   Hong Kong   Beijing   Significant differences
Instructor-made
Short answer items                                             82.9     66.7        86.1
Matching items                                                 75.7     27.7        61.5      H < (C B)
Interpretive items                                             75.7     40.0        50.8      (H B) < C
True-false items                                               72.9     33.3        86.9      H < (C B)
Multiple-choice items                                          68.6     46.7        91.8      (H C) < B
Cloze items                                                    62.9     40.0        69.7
Sentence completion items                                      61.4     20.0        63.9      H < (C B)
Editing                                                        50.0      6.7        48.4      H < (C B)
Completion of forms (e.g., application)                        27.1      6.7        20.5
Student-conducted
Student summaries of what they read                            91.4     66.7        89.3
Student journal                                                61.4      6.7        19.7      (H B) < C
Oral interviews/questioning                                    58.6     40.0        85.2      (H C) < B
Peer assessment                                                47.1     20.0        28.7
Read aloud/dictation                                           37.1     20.0        68.8      (H C) < B
Self assessment                                                40.0     26.7        41.0
Student portfolio                                              35.7     20.0        13.9
Non-instructor developed
Standardized reading test                                      27.1      6.7        83.6      H < C < B
Methods used to assess writing (% of instructors)              Canada   Hong Kong   Beijing   Significant differences
Instructor-made
Short essay                                                    91.9     86.0        88.6
Editing a sentence or paragraph                                86.5     53.5        69.5      (H B) < C
Long essay                                                     55.4     72.1        13.3      B < (C H)
Multiple-choice items to identify grammatical
  errors in a sentence                                         32.4     14.0        48.6      H < (C B)
Matching items                                                 25.7      2.3        25.7      H < (C B)
True-false items                                               16.2      4.6        26.7
Student-conducted
Student journal                                                73.0     18.6        31.4      (H B) < C
Peer assessment                                                60.8     41.9        38.1
Student portfolio                                              55.4     44.2        10.5      B < (C H)
Self assessment                                                51.4     32.6        34.3
Non-instructor developed
Standardized writing test                                      27.0     14.0        75.2      (H C) < B
writing did not vary significantly among the three settings (see Table 3).
• Standardized testing: As with reading, the use of a standardized measure, in this case of writing, was significantly greater in Beijing (75.2%) than in Canada (27.0%) and Hong Kong (14.0%).
Methods used to assess listening and speaking
(% of instructors)                                             Canada   Hong Kong   Beijing   Significant differences
Instructor-made
Take notes                                                     70.3     20.5        70.5      H < (C B)
Prepare summaries of what is heard                             69.1     30.8        75.6      H < (C B)
Multiple-choice items following listening
  to a spoken passage                                          58.0     12.8        83.2      H < C < B
Student-conducted
Oral presentation                                              95.1     94.9        88.2
Oral interviews/dialogues                                      77.8     51.3        79.8      H < (C B)
Oral discussion with each student                              72.8     43.6        73.1      H < (C B)
Retell a story after listening to a passage                    71.6     10.2        85.7      H < (C B)
Provide an oral description of an event or thing               62.9     23.1        71.4      H < (C B)
Peer assessment                                                51.8     48.7        21.8      B < (C H)
Oral reading/dictation                                         49.3     12.8        79.0      H < (C B)
Self assessment                                                40.7     35.9        20.2
Follow directions given orally                                 39.5     15.4        41.1      H < (C B)
Public speaking                                                37.0     20.5        53.8
Give oral directions                                           29.6      7.7        39.5      H < (C B)
Non-instructor developed
Standardized speaking test                                     18.5     12.8        32.8
Standardized listening test                                    25.9      5.1        79.0      H < C < B
feedback and to report to their students, and how much time they devoted to assessment-related activities in relation to their teaching.
Proportion of total time      Canada (n = 94)      Hong Kong (n = 43)      Beijing (n = 122)
Acknowledgements
Support for the project was made possible in part through funds
from the Social Sciences and Humanities Research Council
VI References
Alderson, J.C. and Hamp-Lyons, L. 1996: TOEFL preparation courses: a study of washback. Language Testing 13, 280-97.
Anderson, J.O. 1989: Evaluation of student achievement: teacher practices and educational measurement. The Alberta Journal of Educational Research 35, 123-33.
Anderson, J.O. 1990: Assessing classroom achievement. The Alberta Journal of Educational Research 36, 13.
Andrews, S. and Fullilove, J. 1993: Backwash and the use of English oral: speculations on the impact of a new examination upon sixth form English language testing in Hong Kong. New Horizons 34, 46-52.
Breen, M.P., Barratt-Pugh, C., Derewianka, B., House, H., Hudson, C., Lumley, T. and Rohl, M. 1997: Profiling ESL Children: how teachers interpret and use national and state assessment frameworks. Canberra City, Australia: Department of Employment, Education, Training and Youth Affairs.
Calderhead, J. 1996: Teachers: beliefs and knowledge. In Berliner, D.C. and Calfee, R.C., editors, Handbook of educational psychology. New York: Macmillan Library Reference, 709-25.
Cheng, L. 1999: Changing assessment: washback on teacher perspectives and actions. Teaching and Teacher Education 15, 253-71.
Cheng, L. and Gao, L. 2002: Passage dependence in standardized reading comprehension: exploring the College English Test. Asian Journal of English Language Teaching 12, 161-78.
Code of Fair Testing Practices for Education 1988: Washington, DC: Joint Committee on Testing Practices. Available online from: http://www.apa.org/science/fairtestcode.html (March 2004).
Cumming, A. 2001: ESL/EFL instructors' practices for writing assessment: specific purposes or general purposes? Language Testing 18, 207-24.
Davison, C. and Leung, C. 2001: Researching teacher-based language assessment: whose criteria, whose language? Colloquium presented at the American Association of Applied Linguistics, St. Louis, MO.
Davison, C. and Leung, C. 2002: Problems in (re)interpreting construct validity: diverse communities of practice in school-based teacher assessment. Colloquium presented at the American Association of Applied Linguistics, Salt Lake City, UT.
Purpose/Reason      Yes      No
Reading
If you do not teach reading, please put a check mark here and
go to the next page.
Instruction: Please put a check mark (X) in the space to the left for each method you use to evaluate your students in reading. Spaces have been provided at the end of the list for methods not on the list. If you use other methods, please be sure to write or describe what the other methods are.
1. Read aloud/dictation
2. Oral interviews/questioning
3. Teacher-made tests containing
   a. cloze items
   b. sentence completion items
   c. true-false items
   d. matching items
   e. multiple-choice items
   f. interpretative items (e.g., reading passage; interpret a map or a set of directions)
   g. forms such as an application form or an order form of some kind
   h. short answer items
   i. editing a piece of writing
4. Student summaries of what is read
5. Student journal
6. Student portfolio
7. Peer assessment
8. Self assessment
9. Standardized reading tests
10. Other:
11. Other:
Writing
If you do not teach writing, please put a check mark here and
go to the next page.
Instruction set 1: Please put a check mark (X) in the space to the
left for each method you use to evaluate your students in writing.
Spaces have been provided at the end of the list for methods not on
the list. If you use other methods, please be sure to write or describe
what the other methods are.
Instruction set 1: Please put a check mark (X) in the space to the left for each method you use to evaluate your students' oral skills. Spaces have been provided at the end of the list for methods not on the list. If you use other methods, please be sure to write or describe what the other methods are.
1. Oral reading/dictation
2. Oral interviews/dialogues
3. Oral discussion with each student
4. Oral presentations
5. Public speaking
6. Teacher-made tests asking students to
a. give oral directions
b. follow directions given orally
c. provide an oral description of an
event or object
d. prepare summaries of what is heard
e. answer multiple-choice test items
following a listening passage
f. take notes
g. retell a story after listening to a passage
7. Peer assessment
8. Self assessment
9. Standardized speaking test
10. Standardized listening tests
11. Other:
12. Other:
a. Verbal feedback
b. Checklist
c. Written comments
d. Teaching diary/log
e. Conference with student
f. Total test score
g. A letter grade
h. Other (please specify): _________________________________________________
a. part-time.
b. full-time.
6. I am teaching
a. Yes
b. No
End of Questionnaire