WHAT TO EXPECT
Content Update
BASIC CONCEPTS
Test
An instrument designed to measure any quality, ability, skill or knowledge.
Comprised of test items of the area it is designed to measure.
Measurement
A process of quantifying the degree to which someone/something possesses a given trait (i.e.
quality, characteristics or features)
A process by which traits, characteristics and behaviors are differentiated.
Assessment
A process of gathering and organizing data into an interpretable form to provide a basis for
decision-making
It is a prerequisite to evaluation. It provides the information which enables evaluation to take place.
Evaluation
A process of systematic analysis of both qualitative and quantitative data in order to make sound
judgment or decision.
It involves judgment about the desirability of changes in students.
MODES OF ASSESSMENT
c. Matching Type - consists of two parallel columns: Column A, the column of premises
from which a match is sought; Column B, the column of responses from which the
selection is made.
2. Supply Test
a. Short Answer - uses a direct question that can be answered by a word, a phrase, a
number, or a symbol
b. Completion Test - consists of an incomplete statement
3. Essay Test
a. Restricted Response - limits the content of the response by restricting the scope
of the topic
b. Extended Response - allows the students to select any factual information that
they think is pertinent and to organize their answers in accordance with their best
judgment
b. Scales - instruments that indicate the extent or degree of one's response
1) Rating Scale - measures the degree or extent of one's attitudes, feelings, and
perceptions about ideas, objects and people by marking a point along a 3- or 5-point
scale
2) Semantic Differential Scale - measures the degree of one's attitudes, feelings and
perceptions about ideas, objects and people by marking a point along a 5-, 7- or 11-point
scale of semantic adjectives
3) Likert Scale - measures the degree of one's agreement or disagreement with positive
or negative statements about objects and people
b. Surveys - measure the values held by an individual by writing one or many responses
to a given question
c. Essays - allow the students to reveal and clarify their preferences, hobbies, attitudes,
feelings, beliefs, and interests by writing their reactions or opinions to a given question
Specific Suggestions
B. Matching Type
1. Use only homogeneous materials in a single matching exercise.
2. Include an unequal number of responses and premises, and instruct the pupils that a response may
be used once, more than once, or not at all.
3. Keep the list of items to be matched brief, and place the shorter responses at the right.
4. Arrange the list of responses in logical order.
5. Indicate in the directions the basis for matching the responses and premises.
6. Place all the items for one matching exercise on the same page.
C. Multiple Choice
1. The stem of the item should be meaningful by itself and should present a definite problem.
2. The stem should include as much of the item as possible and should be free of irrelevant information.
3. Use a negatively stated item stem only when a significant learning outcome requires it.
4. Highlight negative words in the stem for emphasis.
5. All the alternatives should be grammatically consistent with the stem of the item.
6. An item should only have one correct or clearly best answer.
7. Items used to measure understanding should contain novelty, but beware of too much.
8. All distractors should be plausible.
9. Verbal association between the stem and the correct answer should be avoided.
10. The relative length of the alternatives should not provide a clue to the answer.
11. The alternatives should be arranged logically.
12. The correct answer should appear in each of the alternative positions an approximately equal
number of times but in random order.
13. Use of special alternatives such as "none of the above" or "all of the above" should be done
sparingly.
14. Do not use multiple choice items when other types are more appropriate.
15. Always have the stem and alternatives on the same page.
16. Break any of these rules when you have a good reason for doing so.
D. Essay Type of Test
1. Restrict the use of essay questions to those learning outcomes that cannot be satisfactorily
measured by objective items.
2. Formulate questions that will call forth the behavior specified in the learning outcome.
3. Phrase each question so that the pupil's task is clearly indicated.
4. Indicate an approximate time limit for each question.
5. Avoid the use of optional questions.
VALIDITY - is the degree to which a test measures what it is intended to measure. It is the usefulness
of the test for a given purpose. It is the most important criterion of a good examination.
Appropriateness of Test - it should measure the abilities, skills and information it is supposed
to measure
Directions - they should indicate how the learners should answer and record their answers
Reading Vocabulary and Sentence Structure - they should be based on the intellectual level of
maturity and background experience of the learners
Difficulty of Items - it should have items that are neither too difficult nor too easy, to be able to
discriminate the bright from the slow pupils
Construction of Items - it should not provide clues, so it will not be a test on clues, nor should
it be ambiguous, so it will not be a test on interpretation
Length of Test - it should be just long enough to measure what it is supposed to
measure, and not so short that it cannot adequately measure the performance we
want to measure
Arrangement of Items - it should have items that are arranged in ascending level of difficulty,
starting with the easy ones so that pupils are encouraged to continue taking the test
Patterns of Answers - it should not allow the creation of patterns in answering the test
RELIABILITY - it refers to the consistency of scores obtained by the same person when retested using
the same instrument or one that is parallel to it.
Length of the test - as a general rule, the longer the test, the higher the reliability. A longer
test provides a more adequate sample of the behavior being measured and is less distorted by
chance factors like guessing.
Difficulty of the test - ideally, achievement tests should be constructed such that the average
score is 50 percent correct and the scores range from zero to near perfect. The bigger the
spread of scores, the more reliable the measured difference is likely to be. A test is reliable if
the coefficient of correlation is not less than 0.85.
Objectivity - can be obtained by eliminating the bias, opinions or judgments of the person
who checks the test.
Administrability - the test should be administered with ease, clarity and uniformity so that
scores obtained are comparable. Uniformity can be obtained by setting the time limit and oral
instructions.
Scorability - the test should be easy to score such that directions for scoring are clear, the
scoring key is simple, and provisions for answer sheets are made
Economy - the test should be given in the cheapest way, which means that answer sheets
must be provided so the test can be given from time to time
Adequacy - the test should contain a wide sampling of items to determine the educational
outcomes or abilities so that the resulting scores are representative of the total performance
in the areas measured
Method              Type of Reliability     Procedure                               Statistical Measure
                    Measure

Test-Retest         Measure of stability    Give a test twice to the same group     Pearson r
                                            with any time interval between sets,
                                            from several minutes to several years

Equivalent Forms    Measure of equivalence  Give parallel forms of the test at the  Pearson r
                                            same time

Test-Retest with    Measure of stability    Give parallel forms of the test with    Pearson r
Equivalent Forms    and equivalence         increased time intervals between forms

Split Half          Measure of internal     Give a test once. Score equivalent      Pearson r and
                    consistency             halves of the test (e.g. odd- and       Spearman-Brown formula
                                            even-numbered items)

Kuder-Richardson    Measure of internal     Give the test once, then correlate the  Kuder-Richardson
                    consistency             proportion/percentage of the students   Formula 20 and 21
                                            passing and not passing a given item
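As a rough illustration of the split-half method listed above, the sketch below correlates the two half-test scores with Pearson r and then applies the Spearman-Brown step-up, r_full = 2r / (1 + r). The half-test scores and the function names are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_brown(r_half):
    """Estimate full-test reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)

# Hypothetical scores of five pupils on the odd and even halves of one test.
odd_half = [1, 2, 3, 4, 5]
even_half = [2, 1, 4, 3, 5]

r_half = pearson_r(odd_half, even_half)
print(round(r_half, 2))                  # 0.8
print(round(spearman_brown(r_half), 2))  # 0.89
```

The stepped-up value is higher than the half-test correlation because the full test is twice as long as each half.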
Procedure:
1. Group the data in the form of a frequency distribution.
2. Compute the midpoints of all class limits (M).
3. Multiply the midpoints by their corresponding frequencies (M x F).
4. Get the sum of the products of the midpoints and frequencies (ΣMF).
5. Divide the sum by the number of cases (N).

Example:
Class Limits    M     F     MF
45-49           47    2     94
40-44           42    0     0
35-39           37    12    444
30-34           32    13    416
25-29           27    10    270
20-24           22    5     110
15-19           17    4     68
10-14           12    4     48
                      N=50  ΣMF=1450

Formula:
$\bar{X} = \frac{\sum MF}{N} = \frac{1450}{50} = 29$
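The grouped-data mean procedure above can be sketched in a few lines of Python. The class limits and frequencies are copied from the example table; the function name `grouped_mean` is only an illustrative choice.

```python
# (lowest score, highest score, frequency) for each class in the example table.
classes = [(45, 49, 2), (40, 44, 0), (35, 39, 12), (30, 34, 13),
           (25, 29, 10), (20, 24, 5), (15, 19, 4), (10, 14, 4)]

def grouped_mean(rows):
    """Mean of grouped data: sum of (midpoint x frequency) divided by N."""
    n = sum(f for _, _, f in rows)
    sum_mf = sum(((low + high) / 2) * f for low, high, f in rows)
    return sum_mf / n

print(grouped_mean(classes))  # 29.0
```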
Procedure:
1. Arrange the scores from highest to lowest or vice versa.
2. Get the middlemost score/s, its/their identical score/s and its/their counterparts either above or
below the middlemost score/s.
3. Compute their average; the average is the median score.

Example:
a. N is odd (N=7)
1. 90
2. 88
3. 87
4. 85
5. 85
6. 84
7. 82
Middlemost scores: 87, 85, 85 (items 3 to 5)
$\text{Median} = \frac{87 + 85 + 85}{3} = 85.67$

b. N is even (N=10)
1. 90
2. 88
3. 87
4. 85
5. 84
6. 83
7. 83
8. 82
9. 81
10. 80
$\text{Median} = \frac{85 + 84 + 83 + 83}{4} = 83.75$
MEDIAN
2) Grouped Data: (N>30)
Procedure:
1. Add up or accumulate the frequencies starting from the lowest to the highest class limit. Call this
the cumulative frequency (CF).
2. Find one-half of the number of cases in the distribution (N/2).
3. Find the cumulative frequency which is equal to or closest to (but higher than) half of the number
of cases. The class containing this frequency is the median class.
4. Find the lowest limit (LL) of the median class by subtracting 0.5 from the lowest score of the
median class.
5. Get the cumulative frequency of the class below the median class (CFb).
6. Subtract this from half of the number of cases in the distribution (N/2 - CFb).
7. Get the frequency of the median class (FMdn).
8. Find the class interval (i), then follow the formula below.

Formula:
$\tilde{X} = LL + i\left(\frac{\frac{N}{2} - CFb}{FMdn}\right)$

Where:
LL = lowest limit of the median class
i = class interval
N/2 = half the number of cases
CFb = cumulative frequency below the median class
FMdn = frequency of the median class

Example:
Class Limits    F     CF
45-49           2     50
40-44           0     48
35-39           12    48
30-34           13    36    (median class; F = FMdn)
25-29           10    23    (CFb)
20-24           5     13
15-19           4     8
10-14           4     4
                N=50

i = 5
N/2 = 50/2 = 25
LL = 30 - 0.5 = 29.5

$\tilde{X} = 29.5 + 5\left(\frac{25 - 23}{13}\right) = 29.5 + 5\left(\frac{2}{13}\right) = 29.5 + 0.77 = 30.27$
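A minimal Python sketch of the grouped-data median steps above, using the frequencies from the example table. The function name and the (lowest score, frequency) row layout are assumptions made for illustration.

```python
def grouped_median(rows, width):
    """rows: (lowest score of class, frequency), ordered from the lowest class up."""
    n = sum(f for _, f in rows)
    if n == 0:
        raise ValueError("empty distribution")
    half = n / 2                       # N/2
    cum_below = 0                      # CFb, updated as we move up the classes
    for lowest_score, f in rows:
        if f > 0 and cum_below + f >= half:   # first class whose CF reaches N/2
            ll = lowest_score - 0.5           # LL of the median class
            return ll + width * (half - cum_below) / f
        cum_below += f

rows = [(10, 4), (15, 4), (20, 5), (25, 10), (30, 13), (35, 12), (40, 0), (45, 2)]
median = grouped_median(rows, width=5)
print(round(median, 2))  # 30.27
```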
MODE
1) Ungrouped Data: (N<30)
When there are more than three modes, they are called polymodal or multimodal
When there is no mode, it is described as a rectangular distribution
2) Grouped Data
a. Crude Mode - refers to the midpoint of the class limit with the highest frequency.
Procedure:
1. Find the class limit with the highest frequency.
2. Get the midpoint of that class limit.
3. The midpoint is the crude mode.

Example:
Class Limits    F
45-49           2
40-44           0
35-39           12
30-34           13    (highest frequency)
25-29           10
20-24           5
15-19           4
10-14           4
                N=50

Mode = midpoint of 30-34 = 32
b. Refined Mode - refers to the mode obtained from an ordered arrangement or a class
frequency distribution
Procedure:
1. Get the mean and the median of the grouped data.
2. Multiply the median by three (3Mdn).
3. Multiply the mean by two (2Mn).
4. Find the difference.

Formula:
$\hat{X} = 3\,Mdn - 2\,Mn$

Example:
Using the mean (29) and the median (30.27) of the grouped data:
$\hat{X} = 3\tilde{X} - 2\bar{X} = 3(30.27) - 2(29) = 90.81 - 58 = 32.81$
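Both mode estimates for grouped data can be checked with a short Python sketch. It assumes the class table from the example and the mean (29) and median (30.27) reported in this reviewer; the function names are illustrative only.

```python
# (lowest score, highest score, frequency) for each class in the example table.
classes = [(45, 49, 2), (40, 44, 0), (35, 39, 12), (30, 34, 13),
           (25, 29, 10), (20, 24, 5), (15, 19, 4), (10, 14, 4)]

def crude_mode(rows):
    """Midpoint of the class with the highest frequency."""
    low, high, _ = max(rows, key=lambda row: row[2])
    return (low + high) / 2

def refined_mode(median, mean):
    """Refined mode computed as 3 x median - 2 x mean."""
    return 3 * median - 2 * mean

print(crude_mode(classes))                # 32.0
print(round(refined_mode(30.27, 29), 2))  # 32.81
```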
The value that represents a set of data will be the basis in determining whether the group is
performing better or poorer than the other groups.
MEASURES OF VARIABILITY
RANGE (R)
1. Ungrouped Data - the difference between the highest and lowest score
2. Grouped Data - the difference between the highest limit of the highest class limit and the lowest
limit of the lowest class limit
Procedure:
1. Find the mean ($\bar{X}$).
2. Subtract the mean from each score to get the deviation ($d = X - \bar{X}$).
3. Square the deviation ($d^2$).
4. Get the sum of the squared deviations ($\sum d^2$).
5. Divide the sum by the number of cases minus one ($\sum d^2 / (N - 1)$).
6. Get the square root of the quotient.

Example:
X       d       d2
5       -4      16
7       -2      4
9       0       0
11      2       4
13      4       16
                Σd2 = 40
Mean = 9
N = 5

Formula:
$SD = \sqrt{\frac{\sum d^2}{N - 1}} = \sqrt{\frac{40}{4}} = 3.16$
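The ungrouped standard deviation steps above translate directly into Python. This is a minimal sketch using the five example scores; the function name `sample_sd` is an illustrative choice, and the result should match `statistics.stdev` from the standard library, which also divides by N - 1.

```python
import math

def sample_sd(scores):
    """Standard deviation with N - 1 in the denominator, following the steps above."""
    n = len(scores)
    mean = sum(scores) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in scores)   # sum of d squared
    return math.sqrt(sum_sq_dev / (n - 1))

sd = sample_sd([5, 7, 9, 11, 13])
print(round(sd, 2))  # 3.16
```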
STANDARD DEVIATION (SD)
Grouped Data
CLASS DEVIATION METHOD
Procedure:
1. As in computing the mean, get the deviation (d) of each class and the product of the frequency
and the deviation (fd).
2. Multiply the product of the frequency and deviation (fd) by the deviation (d) to get (fd2).
3. Get the summation of fd2.
4. Compute the standard deviation using the formula below.

Formula:
$SD = i\sqrt{\frac{\sum fd^2}{N} - \frac{(\sum fd)^2}{N^2}}$

where:
i = interval
N = number of cases
Σfd = sum of the products of frequency and deviation
Σfd2 = sum of the products of frequency and squared deviation

Example:
Class Limits    f     d     fd     fd2
45-49           2     3     6      18
40-44           0     2     0      0
35-39           12    1     12     12
30-34           13    0     0      0
25-29           10    -1    -10    10
20-24           5     -2    -10    20
15-19           4     -3    -12    36
10-14           4     -4    -16    64
                N=50        Σfd=-30  Σfd2=160

$SD = 5\sqrt{\frac{160}{50} - \frac{(-30)^2}{50^2}} = 5\sqrt{3.2 - 0.36} = 5\sqrt{2.84} = 5(1.685) \approx 8.4$
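The class deviation method can be verified with a short Python sketch. The frequencies and deviations come from the example table; the function name is hypothetical. Carried to two decimals, the result is 8.43, which the worked example reports as 8.4.

```python
import math

def sd_class_deviation(freqs, deviations, width):
    """SD = i * sqrt(sum(f*d^2)/N - (sum(f*d)/N)^2) for grouped data."""
    n = sum(freqs)
    sum_fd = sum(f * d for f, d in zip(freqs, deviations))
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, deviations))
    return width * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

# Rows ordered from the highest class (45-49) to the lowest (10-14).
freqs = [2, 0, 12, 13, 10, 5, 4, 4]
deviations = [3, 2, 1, 0, -1, -2, -3, -4]

sd = sd_class_deviation(freqs, deviations, width=5)
print(round(sd, 2))  # 8.43
```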
MIDPOINT METHOD
Procedure:
1. Square the midpoint (M2) and multiply it by the frequency. (Shortcut: multiply the midpoint (M)
by the product of the frequency and midpoint (fM).)
2. Write the product of (M) and (fM) and label the column (fM2).
3. Use the formula below to compute the standard deviation.

Formula:
$SD = \sqrt{\frac{\sum fM^2}{N} - (\bar{X})^2}$

Example:
Class Limits    f     M     fM     fM2
45-49           2     47    94     4418
40-44           0     42    0      0
35-39           12    37    444    16428
30-34           13    32    416    13312
25-29           10    27    270    7290
20-24           5     22    110    2420
15-19           4     17    68     1156
10-14           4     12    48     576
                N=50        ΣfM=1450  ΣfM2=45600

Mean = 29

$SD = \sqrt{\frac{45600}{50} - (29)^2} = \sqrt{912 - 841} = \sqrt{71} \approx 8.4$
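A similar sketch checks the midpoint method against the same data. The midpoints and frequencies are taken from the example table, and the function name is illustrative. It should return the same value as the class deviation method, since both formulas describe the same grouped distribution.

```python
import math

def sd_midpoint(rows):
    """rows: (midpoint, frequency). SD = sqrt(sum(f*M^2)/N - mean^2)."""
    n = sum(f for _, f in rows)
    sum_fm = sum(f * m for m, f in rows)
    sum_fm2 = sum(f * m * m for m, f in rows)
    mean = sum_fm / n
    return math.sqrt(sum_fm2 / n - mean ** 2)

rows = [(47, 2), (42, 0), (37, 12), (32, 13), (27, 10), (22, 5), (17, 4), (12, 4)]
sd = sd_midpoint(rows)
print(round(sd, 2))  # 8.43
```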
HOW to INTERPRET the STANDARD DEVIATION
The result will help you determine if the group is homogeneous or not.
The result will also help you determine the number of students that fall below and above the average
performance.
$Q_3 = \frac{88 + 87}{2} = 87.5$
$QD = \frac{Q_3 - Q_1}{2} = \frac{87.5 - 77.5}{2} = 5$
Grouped Data
Procedure:
1. Compute the value of the 1st Quartile.

Formula:
$Q_1 = LL + i\left(\frac{\frac{N}{4} - CFb}{FQ_1}\right)$

where:
Q1 = 1st Quartile
LL = lowest limit of the Q1 class
N/4 = one-fourth of the total number of cases
CFb = cumulative frequency below the Q1 class
FQ1 = frequency of the Q1 class
i = interval

Example:
Class Limits    F     CF
45-49           2     50
40-44           0     48
35-39           12    48
30-34           13    36
25-29           10    23
20-24           5     13    (Q1 class)
15-19           4     8
10-14           4     4
                N=50

i = 5
Q1 class: N/4 = 50/4 = 12.5

$Q_1 = 19.5 + 5\left(\frac{12.5 - 8}{5}\right) = 24$
Procedure:
2. Compute the value of the 3rd Quartile.

Formula:
$Q_3 = LL + i\left(\frac{\frac{3N}{4} - CFb}{FQ_3}\right)$

where:
Q3 = 3rd Quartile
LL = lowest limit of the Q3 class
3N/4 = three-fourths of the total number of cases
CFb = cumulative frequency below the Q3 class
FQ3 = frequency of the Q3 class
i = interval

Example:
Class Limits    F     CF
45-49           2     50
40-44           0     48
35-39           12    48    (Q3 class)
30-34           13    36
25-29           10    23
20-24           5     13
15-19           4     8
10-14           4     4
                N=50

i = 5
Q3 class: 3N/4 = 150/4 = 37.5

$Q_3 = 34.5 + 5\left(\frac{37.5 - 36}{12}\right) = 35.125$

3. Compute the quartile deviation (QD), also called the semi-interquartile range.

Formula:
$QD = \frac{Q_3 - Q_1}{2} = \frac{35.125 - 24}{2} = 5.56$
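The quartile computations above follow the same pattern as the grouped median, so one helper can cover both quartiles. The sketch below assumes the example frequency table; the function name and the (lowest score, frequency) row layout are illustrative choices.

```python
def grouped_quartile(rows, width, k):
    """k-th quartile (k = 1, 2 or 3) of grouped data; rows run from the lowest class up."""
    n = sum(f for _, f in rows)
    target = k * n / 4                 # N/4 for Q1, 3N/4 for Q3
    cum_below = 0                      # CFb
    for lowest_score, f in rows:
        if f > 0 and cum_below + f >= target:
            ll = lowest_score - 0.5    # lowest limit of the quartile class
            return ll + width * (target - cum_below) / f
        cum_below += f
    raise ValueError("no class reaches the requested quartile")

rows = [(10, 4), (15, 4), (20, 5), (25, 10), (30, 13), (35, 12), (40, 0), (45, 2)]
q1 = grouped_quartile(rows, width=5, k=1)
q3 = grouped_quartile(rows, width=5, k=3)
qd = (q3 - q1) / 2
print(q1, q3, qd)  # 24.0 35.125 5.5625
```

The unrounded QD of 5.5625 corresponds to the 5.56 reported in the worked example.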
The result will help you determine if the group is homogeneous or not.
The result will also help you determine the number of students that fall below and above the average
performance.
Kuder-Richardson Formula 20
$KR_{20} = \frac{K}{K - 1}\left(1 - \frac{\sum pq}{S^2}\right)$
Where:
K = number of items of a test
p = proportion of the examinees who got the item right
q = proportion of the examinees who got the item wrong
S2 = variance (standard deviation squared)

Kuder-Richardson Formula 21
$KR_{21} = \frac{K}{K - 1}\left(1 - \frac{Kpq}{S^2}\right)$
Where:
$p = \frac{\bar{X}}{K}$
$q = 1 - p$
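To make the two Kuder-Richardson formulas concrete, here is a small Python sketch. The response matrix is hypothetical (1 = item answered correctly, 0 = wrong), and the use of the population variance (dividing by N) for S2 is an assumption, since the formulas above do not say which variance to use.

```python
from statistics import mean, pvariance

def kr20(responses):
    """KR-20 from a matrix of 0/1 item scores (one row per examinee)."""
    k = len(responses[0])                       # number of items
    n = len(responses)                          # number of examinees
    s2 = pvariance([sum(row) for row in responses])
    sum_pq = 0.0
    for item in range(k):
        p = sum(row[item] for row in responses) / n   # proportion correct
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / s2)

def kr21(responses):
    """KR-21, which needs only the mean and variance of the total scores."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    s2 = pvariance(totals)
    p = mean(totals) / k
    q = 1 - p
    return (k / (k - 1)) * (1 - (k * p * q) / s2)

# Hypothetical data: five examinees, four items.
responses = [[1, 1, 1, 1],
             [1, 1, 1, 0],
             [1, 1, 0, 0],
             [1, 0, 0, 0],
             [0, 0, 0, 0]]

print(round(kr20(responses), 2))  # 0.8
print(round(kr21(responses), 2))  # 0.67
```

KR-21 treats all items as equally difficult, so it comes out lower than KR-20 here because the hypothetical items differ in difficulty.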
STANDARD SCORES
Indicate the pupil's relative position by showing how far his raw score is above or below average
Express the pupil's performance in terms of standard units from the mean
Represented by the normal probability curve or what is commonly called the normal curve
Used to have a common unit to compare raw scores from different tests
PERCENTILE
tells the percentage of examinees that lie below one's score
Example:
P85 = 70 (This means the person who scored 70 performed better than 85% of the
examinees)
Formula:
$P_{85} = LL + i\left(\frac{85\%N - CFb}{FP_{85}}\right)$
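The percentile formula can be sketched in Python as follows. This reviewer gives no worked grouped example for percentiles, so the sketch borrows the frequency table used for the median and quartiles as an assumption; the function name is also illustrative.

```python
def grouped_percentile(rows, width, percent):
    """rows: (lowest score of class, frequency), ordered from the lowest class up."""
    n = sum(f for _, f in rows)
    target = percent * n / 100          # e.g. 85%N
    cum_below = 0                       # CFb
    for lowest_score, f in rows:
        if f > 0 and cum_below + f >= target:
            ll = lowest_score - 0.5     # lowest limit of the percentile class
            return ll + width * (target - cum_below) / f
        cum_below += f
    raise ValueError("no class reaches the requested percentile")

# Frequency table assumed from the grouped-data examples (N = 50, i = 5).
rows = [(10, 4), (15, 4), (20, 5), (25, 10), (30, 13), (35, 12), (40, 0), (45, 2)]
p85 = grouped_percentile(rows, width=5, percent=85)
print(round(p85, 2))  # 37.21
```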
Z-SCORES
tells the number of standard deviations equivalent to a given raw score
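Formula:
$Z = \frac{X - \bar{X}}{SD}$
Where:
X = individual's raw score
$\bar{X}$ = mean of the normative group
SD = standard deviation of the normative group
Example:
$Z = \frac{X - \bar{X}}{SD} = \frac{27 - 26}{2} = \frac{1}{2} = 0.5$
$Z = \frac{X - \bar{X}}{SD} = \frac{25 - 26}{2} = \frac{-1}{2} = -0.5$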
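The z-score formula is a one-line computation. The sketch below reproduces the two examples (raw scores 27 and 25, mean 26, SD 2); the function name is an illustrative choice.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies above or below the mean."""
    return (raw - mean) / sd

print(z_score(27, 26, 2))  # 0.5
print(z_score(25, 26, 2))  # -0.5
```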
T-SCORES
refers to any set of normally distributed standard scores that has a mean of 50 and
a standard deviation of 10
computed after converting raw scores to z-scores, to get rid of negative values
Example:
Joseph's T-score = 50 + 10(0.5)
= 50 + 5
= 55
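The T-score conversion shown above, T = 50 + 10z, can be sketched as below. The function name is illustrative; the second call uses the z-score of -0.5 from the z-score example to show that a negative z-score still yields a positive T-score.

```python
def t_score(z):
    """Convert a z-score to a T-score (mean 50, standard deviation 10)."""
    return 50 + 10 * z

print(t_score(0.5))   # 55.0
print(t_score(-0.5))  # 45.0
```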
Could be in:
percent such as 70%, 88% or 92%
letters such as A, B, C, D or F
numbers such as 1.0, 1.5, 2.75, 5
descriptive expressions such as Outstanding (O), Very Satisfactory (VS), Satisfactory (S),
Moderately Satisfactory (MS), Needs Improvement (NI)
Could represent:
how a student is performing in relation to other students (norm-referenced grading)
the extent to which a student has mastered a particular body of knowledge (criterion-
referenced grading)
how a student is performing in relation to a teacher's judgment of his or her potential
Could be for:
Certification that gives assurance that a student has mastered a specific content or
achieved a certain level of accomplishment
Selection that provides basis in identifying or grouping students for certain educational paths
or programs
Direction that provides information for diagnosis and planning
Motivation that emphasizes specific material or skills to be learned and helping students to
understand and improve their performance
1. Explain your grading system to the students early in the course and remind them of the grading
policies regularly.
2. Base grades on a predetermined and reasonable set of standards.
3. Base your grades on as much objective evidence as possible.
4. Base grades on the student's attitude as well as achievement, especially at the elementary and
high school level.
5. Base grades on the student's relative standing compared to classmates.
6. Base grades on a variety of sources.
7. As a rule, do not change grades once computed.
8. Become familiar with the grading policy of your school and with your colleagues' standards.
9. When failing a student, closely follow school procedures.
10. Record grades on report cards and cumulative records.
11. Guard against bias in grading.
12. Keep pupils informed of their standing in the class.