Costs and Benefits of Educational Testing Programs

By Richard Phelps, Ph.D.
Economist
Education Consumers Consultants Network
Benefit-cost analysis is embedded in all studies that ask the essential question of an activity, "Is it worth
doing?" Benefit-cost analysis is a set of techniques, philosophy, and logic that can impose an order and rigor
on the process used to answer the essential question.
The logic of benefit-cost analysis is that of the accountant's spreadsheet. Indeed, one could accurately
describe it as economists' accounting method. The essential idea is to capture all relevant costs and benefits,
broadly considered, on one sheet of paper and weigh them in the balance. If the enterprise or project shows
more benefit than cost (i.e., net benefits are positive), it can be said to be economically worthwhile. It is
assumed that the researcher will do an honest and responsible job of trying to capture all the relevant benefits
and costs. If they can't be estimated with any precision, the researcher should at least enumerate them and
leave it to the reader to estimate their value.
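The ledger logic described above can be sketched in a few lines of Python. The line items and dollar amounts below are hypothetical placeholders, not estimates from any study; the sketch only shows how benefits and costs are tallied on one sheet and weighed against each other.

```python
def net_benefit(benefits, costs):
    """Return total benefits minus total costs for one project."""
    return sum(benefits.values()) - sum(costs.values())


# Hypothetical line items (illustrative dollar amounts only).
benefits = {"information": 40.0, "motivation": 60.0}
costs = {"test_materials": 5.0, "scoring": 6.0}

result = net_benefit(benefits, costs)
print(f"Net benefit: {result:.2f}")
print("Economically worthwhile" if result > 0 else "Not worthwhile")
```

A benefit or cost that cannot be priced can still be listed by name beside the ledger, so that the reader can judge its value.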
What one person considers a benefit, however, another person may not. Indeed, what one person considers a
benefit, another person may regard as a cost. The details of benefit-cost analyses, then, are often subject to
debate. It is, however, considered incumbent upon the researcher to properly identify what perspective she is
adopting. Ideally, a benefit-cost analysis calculates the benefits and costs as they accrue to all of society -
such is the nature of a social benefit-cost analysis. Anything less - an analysis that calculates benefits and
costs for a sub-group - is a private benefit-cost analysis, and the researcher is obligated to explicitly declare it
as such.
Benefit-cost analysis should be most welcome in education research. Benefit-cost analysis imposes a
structure in which "the whole picture" gets considered. It provides a framework that can impose rigor and
honesty onto evaluations that could otherwise be sloppy.
By the same token, most readers are probably also well aware of how benefit-cost analysis can be misused. A
researcher can make unreasonable or dishonest estimates, ignore some relevant benefits or costs,
include some irrelevant benefits or costs, or double count. There can be a tendency among advocates to
exclude or include benefits or costs according to their preferences.
What costs and benefits are relevant? Generally, they are the marginal costs or benefits that are attributable to
the activity in question and not another activity. When someone argues that the cost of a test is X, the
appropriate cost to cite is the marginal cost of the test, the cost that can be attributed to the existence of the
test and not to any other activity. Looked at another way, a marginal cost of a test is a cost that is caused by
the test, one that doesn't exist without the test. A heuristic for determining whether an activity or object is
a marginal cost of a test: take the test away and see whether the activity or object disappears.
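The "take the test away" heuristic can be illustrated with a small Python sketch. The budget lines and amounts are hypothetical, chosen only to show that a cost which is the same with or without the test (such as building overhead) contributes nothing to the marginal cost.

```python
def marginal_cost(costs_with_test, costs_without_test):
    """Sum only the portion of each cost that disappears when the test is removed."""
    return sum(
        amount - costs_without_test.get(item, 0.0)
        for item, amount in costs_with_test.items()
    )


# Hypothetical per-student budget lines (illustrative amounts only).
with_test = {"building_overhead": 100.0, "test_booklets": 5.0, "scoring": 4.0}
without_test = {"building_overhead": 100.0}

print(marginal_cost(with_test, without_test))  # 9.0
```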
It turns out that the costs of standardized testing are minuscule by comparison with huge potential benefits.
This fact is little known among educators, as few mainstream education researchers are trained to attempt
such studies, more common to economists, and the few who have attempted such studies have produced
bungled, or biased, results.
In the early 1990s, the Center for the Study of Testing, Evaluation, and Educational Policy (CSTEEP), at
Boston College, calculated a "high" estimate of $22.7 billion spent on standardized testing per year. U.S.
schools, the CSTEEP report claimed, suffer from "too much standardized testing" that amounts to "a complete
and utter waste of resources." Their estimate breaks down to about $575 per student per year.
A report from the federally-funded Center for Research on Evaluation, Standards, and Student Testing
(CRESST) counted cost components in much the same way as the CSTEEP study and estimated the costs of a
certain state test at between $848 and $1,792 per student tested ($1,320 would be mid-range).
Testing critics exaggerate their cost estimates by counting the costs of any activities "related to" a test as costs
of a test. In the CRESST study of Kentucky's performance-based testing program, for example, teachers were
asked to count the number of hours they spent "preparing materials related to the assessment program for
classroom use." In an instructional program like Kentucky's, with the intention of unifying all instruction and
assessment into a "seamless" web, where the curriculum and the test mutually determine each
other, all instruction throughout the entire school year will be "related to" the assessment.
The CSTEEP study counted even more cost items, such as student time. The CSTEEP researchers assumed
that there is no instructional value whatsoever to student time preparing for or taking a test (i.e., students learn
absolutely nothing while preparing for or taking tests). Then they calculated the present discounted value of
that "lost" learning time against future earnings, assuming all future earnings to be the direct outcome of
school instruction. The CSTEEP researchers also counted building overhead (maintenance and capital costs)
for the amount of time spent testing, even though those costs are constant (i.e., "sunk") and not affected by the
existence of a test. In sum, CSTEEP counts any and all costs incurred at the same time as tests, not just those
caused by testing, that is, those that would not exist without testing.
In stark contrast to these incredible estimates are the actual prices charged for tests such as the ACT, SAT,
and AP exams, ranging from $20 to $70 a student. The makers of these tests must cover all their costs, or
they would go out of business.
The bipartisan U.S. General Accounting Office (GAO) also conducted a survey of state and local testing
directors and administrators to learn the costs of statewide and districtwide tests. The GAO estimate of $15 to
$33 per student contrasts markedly with the CSTEEP and CRESST estimates of $575 and $1,320. Moreover, the GAO
estimates counted all relevant costs, including the cost of teacher time used in administering tests. The GAO
estimate for the total national cost of systemwide testing of about $500 million contrasts with a CSTEEP
estimate 45 times higher.
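The gap between these estimates can be checked with simple arithmetic. The sketch below uses only the figures quoted in the text; the variable names are illustrative.

```python
# Figures as quoted in the text.
csteep_national = 22.7e9     # CSTEEP "high" national estimate, dollars per year
gao_national = 500e6         # GAO national estimate, dollars per year
csteep_per_student = 575.0
cresst_per_student = 1320.0  # mid-range of $848 to $1,792
gao_per_student = (15.0, 33.0)

national_ratio = csteep_national / gao_national
print(f"CSTEEP / GAO national total: {national_ratio:.1f}x")

for label, estimate in [("CSTEEP", csteep_per_student), ("CRESST", cresst_per_student)]:
    low = estimate / gao_per_student[1]   # compared with the GAO upper bound
    high = estimate / gao_per_student[0]  # compared with the GAO lower bound
    print(f"{label} per student: {low:.0f}x to {high:.0f}x the GAO range")
```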
The GAO estimated all-inclusive, stand-alone marginal costs of large-scale, systemwide tests, costs that
would pertain in a situation where the tests had to be administered independent of any school system
structure or schedule, say during the summer months and by hired personnel. The independent SAT, ACT,
and AP exams are administered this way.
The GAO study's estimates can be recalculated under two reasonable assumptions: (1) that the tests, as is usually
the case, would be administered during the regular school year, using regular school personnel, and would be
integral parts of the school system curricular and instructional plan; and (2) that the tests would be used in
many school districts to replace, rather than supplement, some preexisting test. With these adjustments,
marginal costs become $2 per student for multiple-choice and $11 per student for performance tests.
Far from being the hugely expensive enterprise that some testing critics claim for it, standardized testing is not
very expensive by most standards. Even under the rather unrealistic assumptions of the GAO study's upper-
bound estimates, systemwide tests impose a time and cost burden, as one state testing director put it, "on a
par with field trips."
Distilled to the most rudimentary elements, the main benefits of standardized testing are four - information,
motivation, organizational clarity, and goodwill. But, that amounts to quite a thorough distillation. The
information benefits alone can manifest themselves in several different forms, to several different audiences.
Test results can tell us about the performance of an individual student. They can provide information about a
teacher, a curriculum, a textbook, a school, a program, a district, or a state policy. Moreover, the information
provided by test results can inform one or more among many parties - parents, voters, employers, higher
education institutions, other schools, state departments of education, and so on.
Perhaps the simplest, and least disputed, benefit of standardized tests is in diagnosis. Test results can
pinpoint a student's academic strengths and weaknesses, areas that need work, and areas where help is
needed. Test scores provide a measurement tool that can be used to judge the effectiveness of preexisting or
proposed school programs. Test results can inform teachers, schools, and school systems about their
curricular and instructional strengths and weaknesses. That may lead to a better alignment of curriculum with
instruction, a benefit often enumerated by teachers and administrators in evaluations of testing programs.
Teachers have also reported that they learn more about their students, their own teaching, and other teachers'
methods from high-stakes external tests.
Information can also be used for accountability purposes. Higher-level school system administrators can use
information to make judgments about performance at the school or school district level and to increase
efficiency. In an environment of school choice (e.g., school districts with open enrollment), information about
school performance can help parent-student school shoppers to make a better-informed selection.
Finally, information benefits can consist of signaling, screening, and credentialing effects. College admissions
counselors and employers can make a more informed decision about applicants' academic achievement with
test scores than they can without. Colleges, for example, use measures of predictive validity (correlation
coefficient of entrance test score with college achievement) to justify requiring applicants to submit scores from
college admissions tests (ACT or SAT). Allocative efficiency (efficient sorting of applicants to
organizations) is more difficult to measure, but is a relevant benefit as well.
Of the four main categories of benefits listed above, information is arguably the only one common to
educational tests whether or not they have "stakes," and whether or not they are conducted "internally" or
"externally." The other categories of benefits - motivation, organizational clarity and efficiency, and goodwill -
are unlikely to occur when tests "do not count."
Motivation may not be an end in itself, but can lead to desirable behaviors, such as a student paying greater
attention in class and studying more - activities that, in turn, lead to the accumulation of more knowledge and
understanding. Like information benefits, motivation can affect many different parties to the educational
enterprise and provide benefits to many different sectors of our society. Motivational effects are manifest when
rewards are provided to, or punishments imposed upon, students, teachers, administrators, schools, districts,
programs, service providers, politicians, or even parents. The beneficial effects of motivated efforts accrue to
all of the parties above, as well as to employers, higher education institutions, and society in general.
Just one example of the organizational clarity or efficiency benefit of standardized testing is provided by the
testimony of teachers in many states, provinces, and countries who participate in test development,
administration, and scoring. Overwhelmingly, they assert that the experience helps them as instructors. After
struggling, along with other teachers and testing experts, to design and score assessments fairly, they
understand better how their students might misunderstand concepts and how they might better explain the
concepts. Moreover, they can much more efficiently align their own instructional program with state standards
after undergoing a deep immersion into the state standards.
The final general category of benefit cited above - goodwill - is certainly the most often overlooked, and is the
most difficult to measure, but may be the most important. The public pays for the public schools and hands
over responsibility for its children's welfare to the public school authorities for substantial periods of time. The
public has a right to objective, impartial information about the performance of the public schools' main function
- the academic achievement of their children. Classroom grades are unreliable and often invalid sources of
such information. Standardized tests, when they are used validly, provide far more reliable and trustworthy
information.
Examples of goodwill, then, include: renewed public confidence in the school system; public faith that the
schools really are working to uphold standards; and the peace of mind that teachers and school administrators
might gain in the wake of the new parental and public trust. Students have also reported in some surveys
feelings of genuine achievement and accomplishment when they pass important, meaningful tests.
Even the four categories of benefits mentioned above, in all their varied manifestations, do not cover the
gamut. Still other benefits probably exist, but may be more difficult to pin down, more hypothetical, or more
difficult to measure. The economist John Bishop, for example, argues that it is illogical and counterproductive
to insist that a teacher be both a "coach" and a "judge." The teacher is a coach when she helps a student to
succeed; a judge when she grades a student's test and decides that the student should not be promoted to the
next grade or level of education. By Bishop's theory, this dual role puts the teacher in a moral dilemma that is
often resolved through social promotion. Most teachers would rather be coaches than judges and, so, promote
students to the next level even though they are not ready. After a few years of social promotion, of course,
students may be so far behind that they cannot possibly succeed by any objective standard. They may
become disillusioned, give up trying, and drop out. Bishop argues for external high-stakes testing as a means
to free each teacher to be a coach the student can trust to help him meet the challenge of the examination
which is "external" to both of them.
We may have reached the point in the United States where standardized tests provide the only pure measure
of subject-matter mastery. For some time now, education schools have encouraged teachers to grade
students using a cornucopia of criteria that include perceived persistence or effort, perceived level of handicap
due to background, participation or enthusiasm, and perceived need. Subject matter mastery is just one factor, and
usually not the most important one, considered in calculating a student's course grade. In addition to the
missionary directive of the education schools, Bishop's theory of the irreconcilability of the coach and judge
roles may also explain the degradation of grades. But, regardless of the reason, if standardized tests are,
indeed, the only trustworthy measure of academic achievement, can our society afford to not use them?
External standardized tests may be the only reliable source of information on education performance not
controlled by groups with an incentive to corrupt or suppress it.
Even for teachers who desire to grade their students only on the basis of academic achievement, few have
training in testing and measurement. Those who criticize standardized tests for their alleged imperfections of
structure and content seldom mention that standardized tests are written, tested, and retested by large groups
of Ph.D.s with highly technical training in testing and measurement. By contrast, the typical classroom teacher
has had no training in testing and measurement.
The full effect of all the benefits mentioned above, numerous as they are, cannot be seen so long as
standardized tests are in use. "External" measures, such as systemwide standardized test scores, serve as a
check on other measures of performance (psychometricians label this phenomenon generally "restriction of
range"). To fully appreciate the benefits of external standardized testing, one must imagine a society without
standardized testing. What would happen to grade inflation if there were no standardized test scores to which
one could compare the grades? How much effort would students, teachers, and administrators make to
improve achievement if there were no standardized tests with which to check their progress?
Economic studies that have focused primarily on the motivational, or incentive, effects of high-stakes testing
programs estimate average benefits to students over their lifetimes of around $13,000 per subject area tested.
That is, students in jurisdictions with high-stakes testing programs tend to learn more, and that increased
amount of knowledge and skill is rewarded throughout their lives, through higher wages and greater job
security.
Psychologists have conducted many studies - in excess of a thousand, actually - on the predictive validity and
allocative efficiency of tests. Education professors have attacked the dollar estimates based on such studies
but even they concede benefits on the order of $5,000 to $8,000 per student lifetime.
Total testing benefits vastly outweigh the costs, by a benefit-to-cost ratio that probably exceeds a thousand.
The benefits can be so high because they affect a large number of people and they produce lasting and
cumulative effects. Meanwhile, the testing costs are low and incurred only once or a few times.
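A rough sense of the benefit-to-cost ratio can be had by dividing the per-student lifetime benefit figures quoted above by the per-student cost figures quoted earlier. Pairing these numbers directly is a simplifying assumption made here for illustration: it ignores discounting, the number of tests a student takes over a school career, and the fact that the studies measure different kinds of benefit. The function and labels are hypothetical.

```python
def benefit_cost_ratio(benefit, cost):
    """Return benefit divided by cost; cost must be positive."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return benefit / cost


# Per-student figures as quoted in the text (dollars).
benefits = {
    "incentive-effect studies": 13000.0,
    "critics' low concession": 5000.0,
}
costs = {
    "multiple-choice (adjusted)": 2.0,
    "performance (adjusted)": 11.0,
    "GAO upper bound": 33.0,
}

for b_label, b in benefits.items():
    for c_label, c in costs.items():
        print(f"{b_label} / {c_label}: {benefit_cost_ratio(b, c):,.0f}")
```

Under these pairings the ratio runs from the hundreds into the thousands, depending on which benefit and cost estimates are combined.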
For further reading:
Bishop, John H. "Education Quality and the Economy," Paper presented at the Conference on
Socio-Economics of the Society for the Advancement of Socio-Economics, Arlington, VA, 1995.
Boudreau, John W. "Economic Considerations in Estimating the Utility of Human Resource Productivity
Improvement Programs," Personnel Psychology, v.36, 1983.
Hunter, John E. and Schmidt, Frank L. (1982) "Fitting People to Jobs: The Impact of Personnel Selection on
National Productivity," in Marvin D. Dunnette and Edwin A. Fleishman, Eds. Human Performance and
Productivity: Volume 1 -- Human Capability Assessment. Hillsdale, NJ, Lawrence Erlbaum Associates.
Phelps, Richard P. "Estimating the Cost of Standardized Student Testing in the United States," Journal of
Education Finance. v.25, n.3, Winter 2000, pp. 343-380.
Phelps, Richard P. "Test Basher Benefit-Cost Analysis," Network News & Views, Education Excellence
Network, (http://www.edexcellence.net/issuespl/subject/standar/testbash.html).
Schmidt, Frank L. and Hunter, John E. (1998) "The Validity and Utility of Selection Methods in Personnel
Psychology: Practical and Theoretical Implications of 85 Years of Research Findings," Psychological Bulletin,
v.124.
Solmon, Lewis C. and Cheryl L. Fagnano, "Speculations on the Benefits of Large-scale Teacher Assessment
Programs: How 78 Million Dollars Can be Considered a Mere Pittance," Journal of Education Finance,
Vol.16, Summer, 1990, pp. 21-36.
U.S. General Accounting Office, Student Testing: Current Extent and Expenditures, With Cost Estimates
for a National Examination. GAO/PEMD-93-8. Washington, D.C.: author, January, 1993.
