Abstract
This paper examines the importance of learner characteristics in relation to
learner performance on ESL tests. It is argued here that test taker
characteristics are not included in the design of most ESL tests. Empirical
evidence is provided to support the hypothesis that performance on various
ESL tests is closely related to test takers’ educational and language
backgrounds. It is also argued that in order to account for those factors,
and thus decrease test bias, the theoretical definition of language
proficiency should be modified. Finally, some guidelines for dealing with
test taker characteristics are suggested.
Introduction
For many years, language testers have focused on the theoretical and statistical
dimensions of language testing. However, there is another dimension which has not
received sufficient attention, namely, the people who take these tests.
As early as 1961, Carroll pointed out that the diversity of students’ backgrounds and
their previous preparations would make the task of language testers very demanding.
Probably because of the complexities involved in this issue, language testers have
virtually ignored a crucial factor in language testing: the characteristics of the
test taker.
In this article, some of the theoretical and practical issues in ESL testing will be
examined. More specifically, a) some of the inadequacies of various definitions of
language proficiency with respect to test taker characteristics will be discussed, b)
empirical evidence in support of the relationship between test taker characteristics and
performance on language tests will be provided, and c) guidelines for dealing with test
taker characteristics in language testing will be suggested.
Theoretical Problems
One of the principles of a scientific theory is the generation of hypotheses in which
variables can be defined as clearly as possible. By a scientific theory, I mean a theory
which can be substantiated and validated through empirical investigation. And by a
hypothesis, I mean a tentative statement which predicts the relationship between two
or more variables.
Measures of language proficiency Hossein Farhady
1979; Farhady, 1980a) have been developed, and some have been supported by
research results. These theories have generated numerous hypotheses involving such
variables as the structure of language, instruments, test takers, and so forth. However,
because of inadequate definitions, some of these hypotheses need to be reconsidered
and the variables involved reexamined.
It should be noted that the lack of a scientific theory will weaken the external and/or
internal validity of research and thus the validity of the results obtained from such
research projects. Furthermore, if the hypotheses of the theory are poorly stated, they
will result in poorly defined variables which will also make the interpretation of the
results less defensible.
Language proficiency is one of the most poorly defined concepts in the field of
language testing. Nevertheless, despite differing theoretical views as to its
definition, one point on which many scholars seem to agree is that proficiency
tests focus on students’ ability to use language. Proficiency tests are
supposed to be independent of the ways in which language is acquired. Brière (1972)
points out that the parameters of language proficiency are not easy to identify.
Acknowledging the complexities involved in the concept of language proficiency,
Brière states:
The term ‘proficiency’ may be defined as: the degree of competence or the
capability in a given language demonstrated by an individual at a given point
in time independent of a specific textbook, chapter in the book, or
pedagogical method (1972, p.332).
Such a complicated definition could very well result in vague hypotheses about
language proficiency and language proficiency tests. They could be vague with
respect to unspecified terms such as “competence”, “capability”, “demonstrated”, and
“individual”. The term competence could refer to linguistic, socio-cultural, or other
types of competence. The term capability could refer to the ability of the learner to
recognize, comprehend, or produce language elements (or a combination of them).
Demonstration of knowledge could be in either the written or the oral mode. Finally,
the expression individual could refer to a language learner as listener, speaker, or
both. These concepts should be clarified and their characteristics should be identified
in order to develop explicit hypotheses.
other types of language tests. This may be one of the factors that have slowed progress
in testing language proficiency. Some scholars believe that language proficiency
testing is the least advanced area in language testing (Clark, 1972). Although it is not
an easy task to account for all aspects of language proficiency, it may be possible to,
at least, clarify some of the ambiguous concepts involved in the definition of
proficiency.
One of the major problems with the definitions above, and others as well, is that none
of them includes test taker characteristics as a potential dimension in language testing.
Theoreticians as well as practitioners have simply assumed that what the learners have
learned and how they have learned it are irrelevant to language proficiency. This is, in
my view, a gross and misleading assumption.
It has been demonstrated that learners from different educational backgrounds have
certain performance profiles which indicate strengths and weaknesses in different
language skills (Farhady, 1978; Hisama, 1977, 1978). Due to the educational policies
in their home countries, students have differing views, conceptions, and perceptions of
language tests as well as language instruction. Most of them differ in their relative
needs for the use of language in their academic and social lives. The seriousness of the
problem was observed twenty years ago by Carroll, who stated:
There are many variables on which present tests are simply not designed to
provide information.

For example, factors such as learners’ experience with test types, their weak
and strong areas in various language skills, their knowledge of how and where
to use language, the objectives of language courses they may be taking, and the
relevance of these objectives to the students’ academic as well as social lives, to
name a few, have not been incorporated into the design of language proficiency
testing.
Including any of these variables in a theory of language testing will require
careful investigation and detailed examination of the nature of language tests. Testers
should consider what the tests are measuring and what they should be measuring;
what they expect a test to accomplish and what they should expect; which learner
characteristics are included in language testing and which learner characteristics
should be included. In short, the critical issue which deserves serious attention is what
language testing is versus what it should be.
review the arguments for and against such terminology because they have been
frequently discussed in the literature (Oller, 1976, 1978; Spolsky, 1972, 1978; Clark,
1979; Hinofotis, 1976; Rand, 1972; Vollmer, 1979). What I intend to do is attempt to
define some of the terms and propose research hypotheses which are empirically
testable.
Empirical Evidence
The data presented here are part of a large-scale study designed to develop and
validate functional language tests (Farhady, 1980). The experiment was carried out at
UCLA with 800 incoming foreign students who took the UCLA English as a Second
Language Placement Examination (ESLPE). The components of the Fall 1979 ESLPE
are presented in Table 1.
One part of the data analysis examined the relationship between learner variables and
learner performance. These variables included sex, university status (graduate vs.
undergraduate), major field of study, and nationality. The analyses were conducted
using standardized scores (T-scores) to eliminate the effect of the unequal number of
items in the various subtests in the ESLPE.
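The T-score transformation used for this purpose is standard: each raw score is expressed in standard-deviation units within its own subtest and rescaled to a mean of 50 and a standard deviation of 10. A minimal sketch (the raw scores shown are invented, not ESLPE data):

```python
import statistics

def t_scores(raw_scores):
    """Convert one subtest's raw scores to T-scores (mean 50, SD 10).

    Rescaling every subtest to the same mean and standard deviation
    removes the effect of unequal numbers of items across subtests.
    """
    mean = statistics.mean(raw_scores)
    sd = statistics.pstdev(raw_scores)  # SD of the examinee group
    return [50 + 10 * (x - mean) / sd for x in raw_scores]

# Invented raw scores from a short subtest; after conversion they are
# on the same 50/10 scale as any longer subtest.
print([round(t, 1) for t in t_scores([3, 4, 2, 5, 1])])
# → [50.0, 57.1, 42.9, 64.1, 35.9]
```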
To compare the scores of male and female subjects on the study measures, t-tests
were performed. The results, presented in Table 2, indicate no significant
difference between male and female students on any subtest except listening
comprehension, where female students significantly outperformed male students.
These results are illustrated in Figure 1.
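The male/female comparison is an ordinary two-sample t-test. A sketch of the pooled-variance form (the T-score samples below are invented for illustration and are not taken from Table 2):

```python
import math
import statistics

def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic with its degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)  # pooled variance
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Invented listening-comprehension T-scores for two groups of examinees.
females = [52, 55, 49, 58]
males = [48, 50, 46, 44]
t, df = two_sample_t(females, males)
print(round(t, 2), df)  # → 2.79 6
```

The observed |t| would then be compared against the critical value for the given degrees of freedom to decide significance.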
The higher performance of graduate students on reading and grammar subtests may be
due to their extensive practice in these areas, whereas undergraduate students have not
had enough opportunities to master grammatical rules or read much material in
English. On the other hand, the higher performance of undergraduate students on
listening comprehension could be due to various factors such as their age
(undergraduate students tend to be younger than graduates) or length of residence in
English speaking communities. It could also be due to recent changes in educational
systems in foreign countries which emphasize oral-aural skills more than traditional
systems did.
The eighth group included 152 subjects whose major fields could not be determined;
these subjects were therefore excluded from the analyses. Descriptive statistics for
the subtest scores by major field of study are presented in Table 5 and illustrated
in Figure 3.
The results of the ANOVAs, reported in Table 6, indicate that students from different
major fields of study performed significantly differently on the reading, grammar, and
functional subtests. These differences should be regarded as suggestive rather than
definitive because they could be due to the procedures used to classify the various
major fields. A more detailed classification of the major fields and multivariate
analyses would be necessary to validate these results.
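The analysis behind Table 6 is a one-way ANOVA per subtest: between-group variance is compared with within-group variance. A minimal sketch of the F statistic (the three groups of scores are invented, not ESLPE data):

```python
import statistics

def one_way_f(groups):
    """One-way ANOVA F statistic across k groups of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = statistics.mean([x for g in groups for x in g])
    # Variability of group means around the grand mean vs. within groups.
    ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k)), k - 1, n - k

# Invented subtest scores for three major-field groups.
f, df_between, df_within = one_way_f([[1, 2, 3], [2, 3, 4], [4, 5, 6]])
print(round(f, 2), df_between, df_within)  # → 7.0 2 6
```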
The findings of previous research on the ESLPE (Farhady, 1978, 1979b) suggested
that there was a significant relationship between the examinees’ nationalities and their
performance profiles. It was hypothesized that this relationship was due to different
educational policies in different countries. That is, in some countries, ESL instruction
may emphasize one language skill more than others. Therefore, the data in the present
study were analyzed to confirm or disconfirm the previous findings.
Twelve countries (those with more than 15 students taking the ESLPE) were included
in the analysis. The descriptive statistics are presented in Table 7 and illustrated in
Figure 4.
The results of the ANOVAs, reported in Table 8, support the findings of the previous
investigations. The data indicate that nationality is strongly related to the students’
performance on the various subtests. For all subtests, the F values are significant at
the .01 level, indicating that students from different countries performed differently
on the various language skill tests in this study.
The results reported here do not support the findings of previous research on the
ESLPE. Sanneh (1977) reported that factors such as students’ sex, status, major field
of study, and nationality did not have a significant relationship to student performance
on the ESLPE. However, the data in this study suggest that these factors are
significantly related to student performance on the ESLPE. The discrepancies between
the results of this study and those reported by Sanneh could be due to modifications
in the content of the ESLPE, to changes in the student population and their
proficiency patterns, or to both.
Discussion
The differential performance of the subjects from different countries suggests that
students coming to the university do not have similar training with respect to different
language skills. It is not clear at this point why in some countries the focus of
instruction is on one skill rather than another. What is clear, however, is that these
differences do exist and should be dealt with somehow. Identifying and/or controlling
the instructional factors, which are related to variations in incoming students’
performance, is probably not within the power of language testers because they do not
prescribe English language programs around the world. The important point, however,
is that such differences, which may influence the efficiency of ESL testing and
teaching at the universities, should be accounted for by modifying the test content,
instructional objectives, or both.
Since there was a significant difference in the total scores of language groups on the
ESLPE, it could be assumed that test taker characteristics were the factors which
resulted in the different performance patterns. Thus, incorporating some of these
variables in the design of testing programs would be a step in the right direction.
No matter what the purpose of the test may be, learner variables will definitely
influence test scores in one way or another. That is, placement, selection, aptitude,
proficiency, and other uses of language test scores will be sensitive to those who take
the test. Furthermore, performance on discrete point, integrative, functional, or other
types of tests will also be sensitive to those who take the test. Therefore, considering
these factors in language test designs seems warranted.
Suggestions
Most existing ESL tests do not include learner variables in the data analysis or in
the interpretation of test scores, and have therefore probably failed to assess
learners’ language abilities accurately. Consequently, a number of language learners
may have been misplaced in ESL classes or denied admission to universities because of
their low scores on language tests. These potential misjudgments could frustrate
students or reduce their motivation. To avoid some of these undesirable consequences,
the steps suggested below may be useful.
There are a number of factors that could contribute to improving ESL testing
processes. These factors could be classified into three major categories: psychometric,
typologic, and learner factors.
The psychometric factors involve the reliability and validity of the tests. Almost all
language tests have been reported to have reasonably high internal consistency (alpha)
coefficients. However, except for concurrent validity reports, few language tests have
been examined for their content and construct validity. Almost all language tests seem
to consist of randomly selected items of non-specified content materials. This is one
area that could be improved. Administrators in ESL programs should be willing to
examine the correspondence between test items and instructional objectives in order
to increase the content validity of their tests.
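The internal consistency (alpha) coefficient mentioned above can be computed directly from an examinee-by-item score matrix. A minimal sketch of Cronbach's alpha (the 0/1 item scores are invented):

```python
import statistics

def cronbach_alpha(rows):
    """Cronbach's alpha from an examinee-by-item score matrix
    (one row per examinee, one column per item)."""
    k = len(rows[0])                    # number of items
    columns = list(zip(*rows))          # transpose to item columns
    item_var = sum(statistics.variance(c) for c in columns)
    total_var = statistics.variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var / total_var)

# Two invented items that always agree give the maximum alpha of 1.0.
print(cronbach_alpha([[1, 1], [0, 0], [1, 1], [0, 0]]))  # → 1.0
```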
For example, for a program intended to prepare competent speakers, a multiple choice
test of grammar, no matter how reliable and valid it may be, will be neither sufficient
nor appropriate. In other words, a given test will not be suitable for all examinees in
all programs. There should be a direct relationship between the student, the test, and
the instructional objectives (Carroll, 1980).
The typologic factors refer to the types of tests being used. Discrete-point, integrative,
and functional tests all have their own advantages and disadvantages. It is not safe to
assume that either one, or a combination of all for that matter, will constitute a perfect
measure of language proficiency.
If the characteristics of these tests were known (i.e., if item specifications had been
clearly developed) and if ESL programs were designed with specified objectives (i.e.,
if instructional objectives had been clearly developed), then there would be no need
for compromise, because it would be easy to decide on an appropriate test format
regardless of learner variables. However, item specifications for most of the tests have
not been developed and the tests seem to serve very similar purposes and provide
similar information about examinees’ performance. Nor have the objectives of
instructional programs been clearly defined. Most ESL programs follow a similar
general English instruction format. Therefore, a definite decision cannot be made
about the format of the tests at present.
The learner factors involve variables related to the people who take the test. This is
the area which needs careful reexamination.
It has been demonstrated that, given reliable and seemingly valid tests, different
types of tests provide similar information about examinees’ proficiency (Farhady,
1979a, 1979b). However, students with different backgrounds tend to perform
differently on various language tasks. There seem to be two solutions to this problem:
a short-term solution and a long-term solution.
A short-term solution, which is reasonably easy and immediately applicable, calls for
a detailed analysis of test scores with a multidimensional design, i.e., including the
learner variables mentioned above. The variables which yield significant differences
among examinees will be selected for further exploration. Through various multiple
regression analyses, adjustment formulas could be developed. Finally, examinees’
total scores would be computed on the basis of regression coefficients associated with
every subskill score. Such total scores will be unbiased, at least statistically, with
respect to learner variables and test formats.
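The adjustment formulas described above amount to estimating regression weights for the subskill scores and computing a weighted composite. A sketch under invented data, using the ordinary least-squares normal equations; the examinee rows, criterion values, and recovered weights are illustrative, not results from this study:

```python
def fit_weights(X, y):
    """Least-squares regression weights for an adjustment formula, solving
    the normal equations (X'X)b = X'y by Gaussian elimination. X holds one
    row of subskill scores per examinee; y is the criterion measure."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    M = [row[:] + [Xty[i]] for i, row in enumerate(XtX)]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, k):
            f = M[r][i] / M[i][i]
            for c in range(i, k + 1):
                M[r][c] -= f * M[i][c]
    b = [0.0] * k
    for i in range(k - 1, -1, -1):
        b[i] = (M[i][k] - sum(M[i][c] * b[c] for c in range(i + 1, k))) / M[i][i]
    return b

# Invented example: the criterion is exactly 2*skill1 + 3*skill2. An
# examinee's adjusted total is then sum(weight * subskill score).
weights = fit_weights([[1, 0], [0, 1], [1, 1], [2, 1]], [2, 3, 5, 7])
print([round(w, 6) for w in weights])  # → [2.0, 3.0]
```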
A long-term solution would apply to all language programs. That is, any program
would start with a detailed analysis of learner needs. Then, the instructional objectives
of the program would be established and achievement criteria would be determined.
Finally, the testing procedures would be similar to those suggested by Carroll (1980).
According to Carroll, there would be a two-phase testing program for language
learners. The first phase would assess the learners’ general, or what Hinofotis (1981)
calls base level, English. Those who receive a satisfactory score on this test would go
on to the next phase and take a test which is developed on the basis of careful analyses
of learner needs. This test would assess a selective functional proficiency of the
learners in various academic areas (Farhady, 1981). This means that learners from
different educational disciplines might be required to take different tests.
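The two-phase design reduces to a simple routing rule. A sketch with an invented cutoff and invented test labels (neither comes from Carroll's or this paper's proposals):

```python
def route(base_score, discipline, cutoff=60):
    """Two-phase testing flow: phase 1 screens general ('base level')
    English; only examinees at or above the cutoff proceed to a
    discipline-specific phase-2 test. The cutoff of 60 is invented."""
    if base_score < cutoff:
        return "phase 1: continue general English placement"
    return f"phase 2: functional proficiency test for {discipline}"

print(route(72, "engineering"))
print(route(51, "engineering"))
```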
In this manner, neither learner variables nor format factors can interfere with the
decisions made on the basis of test scores because the test is devised to measure the
elements which are necessary for the group. This is where tests of English for specific
programs could be developed and utilized effectively.
Conclusions
Making conclusive statements about the accurate assessment of learners’ language
proficiency is premature at this point. However, I believe that language testing is at a
critical stage of evolution. The trend is shifting, on the one hand, from testing
linguistic elements to testing communicative functions, and on the other hand, from
using one all-purpose language test to specific and discipline-oriented measures.
Therefore, it seems crucial to consider as many variables as possible and take them
into account in designing language tests. Without careful planning, the diversification
of language tests will not be as effective as it should be.
* This is the revised version of the paper printed in TESOL Quarterly (1982), 16(1).
Bibliography
Brière, E.J. (1972). Are we really measuring proficiency with our foreign language
tests? In H.B. Allen & R.N. Campbell (eds.), Teaching English as a second
language: A book of readings (2nd ed.). New York: McGraw-Hill Book
Company.
_____ (ed.) (1978). Direct testing of speaking proficiency: Theory and application.
Princeton, NJ: Educational Testing Service.
_____ (1979). Direct vs. semi-direct tests of speaking ability. In E.J. Briere & F.
Hinofotis (eds.), New concepts in language testing: Some recent studies.
Washington, D.C.: TESOL.
_____ (1979a). The disjunctive fallacy between discrete-point and integrative tests.
TESOL Quarterly, 13(3).
_____ (1980c). On the plausibility of the unitary language proficiency factor. In J.W.
Oller, Jr. (ed.), Issues in language testing research. Rowley, Mass.: Newbury
House Publishers, Inc.
_____ (1980d). New directions for ESL proficiency testing: Language proficiency
factor. In J.W. Oller, Jr. (ed.), Issues in language testing research. Rowley, Mass.:
Newbury House Publishers, Inc.
_____ (1981). Perspectives on language testing: Past, present, future. Nagoya Gakuin
University Roundtable on languages, linguistics, and literature. Nagoya Gakuin
University, Seta, Aichi, Japan.
_____ (1978). An analysis of various ESL proficiency tests. In J.W. Oller, Jr. & K.
Perkins (eds.), Research in language testing. Rowley, Mass.: Newbury House
Publishers, Inc.
_____ (1961). Language testing: The construction and use of foreign language tests.
New York: McGraw-Hill Book Company.
Oller, J.W., Jr. (1976). Evidence for a general language proficiency factor: An
expectancy grammar. Die Neueren Sprachen, 2, 165-171.
Oller, J.W., Jr. & F. Hinofotis (1980). Two mutually exclusive hypotheses about
second language ability: Indivisible or partially divisible competence. In J.W.
Oller & K. Perkins (eds.), Research in language testing. Rowley, Mass.:
Newbury House Publishers, Inc.
_____ (1978). Approaches to language testing. Arlington, VA: Center for Applied
Linguistics.
Upshur, J.A. (1979). Functional proficiency theory and a research role for language
tests. In E.J. Briere & F. Hinofotis (eds.), New concepts in language testing: Some
recent studies. Washington, D.C.: TESOL.
Vollmer, H.J. (1979). Why are we interested in general language proficiency? Paper
_____ (1980). Competing hypotheses about second language ability: A plea for
caution. Berlin: Osnabruck.