LET - REVIEW - Measurement - Assessment of Learning

LICENSURE EXAMINATION FOR TEACHERS (LET)
WHAT TO EXPECT
FOCUS: PROFESSIONAL EDUCATION
AREA: ASSESSMENT OF LEARNING

LET Competencies:
1. Diagnose learning strengths and difficulties

2. Construct appropriate test items for given objectives
3. Use/Interpret measures of central tendency, variability and standard
scores
4. Assign marks and grades
5. Apply basic concepts and principles of evaluation in classroom
instruction, testing and measurement
Content Update
BASIC CONCEPTS
Test
An instrument designed to measure any quality, ability, skill or knowledge.
Comprised of test items of the area it is designed to measure.
Measurement
A process of quantifying the degree to which someone/something possesses a given trait (i.e.
quality, characteristics or features)
A process by which traits, characteristics and behaviors are differentiated.
Assessment
A process of gathering and organizing data into an interpretable form to have a basis for decision-making
It is a prerequisite to evaluation. It provides the information which enables evaluation to take place.
Evaluation
A process of systematic analysis of both qualitative and quantitative data in order to make sound
judgment or decision.
It involves judgment about the desirability of changes in students.
MODES OF ASSESSMENT
Traditional
  Description: The objective paper-and-pen test, which usually assesses low-level thinking skills
  Examples: Standardized Tests; Teacher-made Tests
  Advantages: Scoring is objective; administration is easy because students can take the test at the same time
  Disadvantages: Preparation of the instrument is time-consuming; prone to cheating

Performance
  Description: A mode of assessment that requires actual demonstration of skills or creation of products of learning
  Examples: Practical Test; Oral and Aural Tests; Projects
  Advantages: Preparation of the instrument is relatively easy; measures behaviours that cannot be deceived
  Disadvantages: Scoring tends to be subjective without rubrics; administration is time-consuming

Portfolio
  Description: A process of gathering multiple indicators of student progress to support course goals in a dynamic, ongoing and collaborative process
  Examples: Working Portfolios; Show Portfolios; Documentary Portfolios
  Advantages: Measures students' growth and development; intelligence-fair
  Disadvantages: Development is time-consuming; rating tends to be subjective without rubrics
FOUR TYPES OF EVALUATION PROCEDURES

Placement Evaluation
  - done before instruction
  - determines mastery of prerequisite skills
  - not graded

Summative Evaluation
  - done after instruction
  - certifies mastery of the intended learning outcomes
  - graded
  - examples: quarter exams, unit or chapter tests, final exams

Formative Evaluation
  - reinforces successful learning
  - provides continuous feedback to both students and teachers concerning learning successes and failures
  - not graded
  - examples: short quizzes, recitations

Diagnostic Evaluation
  - determines recurring or persistent difficulties
  - searches for the underlying causes of these problems that do not respond to first-aid treatment
  - helps formulate a plan for detailed remedial instruction
  - determines the extent of what the pupils have achieved or mastered in the objectives of the intended instruction
  - determines the students' strengths and weaknesses
  - places the students in specific learning groups to facilitate teaching and learning
  - serves as a pretest for the next unit
  - serves as a basis in planning for relevant instruction

  - administered during instruction
  - designed to formulate a plan for remedial instruction
  - modifies the teaching and learning process
  - not graded
PRINCIPLES OF EVALUATION
Significance
Evaluation is an essential component of the teaching-learning process.
Continuity
Evaluation takes place before, during and after instruction.
Scope
Evaluation should be comprehensive and as varied as the scope of objectives.
Compatibility
Evaluation must be well-matched with the stated objectives.
Validity
There must be a close relationship between what an evaluation instrument actually measures and what it is
supposed to measure.
Objectivity
Although effective evaluation should use all the available information, it is generally believed that this
information is more worthwhile if it is objectively obtained.
Reliability
An evaluation instrument should be consistent in measuring what it does measure.
Diagnostic Value
Effective evaluation should distinguish not only between levels of learners' performance but also between the
processes which result in acceptable performance.
Participation
Evaluation should be a cooperative effort of school administrators, teachers, students and parents.
Variety
Evaluation procedures are of different types like standardized tests, teacher-made tests, systematic
observation, rating scales, etc.
Fairness
Evaluation should provide students equal opportunity to demonstrate their knowledge, skills and
performance.
DIFFERENT TYPES OF TESTS
(Types of tests grouped by main point for comparison)

Purpose: Psychological vs. Educational
  Psychological
  - Aims to measure students' intelligence or mental ability, in a large degree without reference to what the student has learned
  - Measures the intangible characteristics of an individual (e.g. Aptitude Tests, Personality Tests, Intelligence Tests)
  Educational
  - Aims to measure the result of instruction and learning (e.g. Achievement Tests, Performance Tests)

Scope of Content: Survey vs. Mastery
  Survey
  - Covers a broad range of objectives
  - Measures general achievement in certain subjects
  - Constructed by trained professionals
  Mastery
  - Covers a specific objective
  - Measures fundamental skills and abilities
  - Typically constructed by the teacher

Interpretation: Norm-Referenced vs. Criterion-Referenced
  Norm-Referenced
  - Result is interpreted by comparing one student's performance with other students' performance
  - Some will really pass
  - There is competition for a limited percentage of high scores
  - Describes a pupil's performance compared to others
  Criterion-Referenced
  - Result is interpreted by comparing a student's performance with a predefined standard
  - All or none may pass
  - There is no competition for a limited percentage of high scores
  - Describes a pupil's mastery of course objectives

Language Mode: Verbal vs. Non-Verbal
  Verbal
  - Words are used by students in attaching meaning to or responding to test items
  Non-Verbal
  - Students do not use words in attaching meaning to or in responding to test items (e.g. graphs, numbers, 3-D subjects)

Construction: Standardized vs. Informal
  Standardized
  - Constructed by a professional item writer
  - Covers a broad range of content covered in a subject area
  - Uses mainly multiple choice
  - Items written are screened and the best items are chosen for the final instrument
  - Can be scored by a machine
  - Interpretation of results is usually norm-referenced
  Informal
  - Constructed by a classroom teacher
  - Covers a narrow range of content
  - Various types of items are used
  - Teacher picks or writes items as needed for the test
  - Scored manually by the teacher
  - Interpretation is usually criterion-referenced

Manner of Administration: Individual vs. Group
  Individual
  - Mostly given orally or requires actual demonstration of skill
  - One-on-one situations, thus many opportunities for clinical observation
  - Chance to follow up an examinee's response in order to clarify or comprehend it more clearly
  Group
  - This is a paper-and-pen test
  - Loss of rapport, insight and knowledge about each examinee
  - Same amount of time needed to gather information from one student

Effect of Biases: Objective vs. Subjective
  Objective
  - Scorer's personal judgment does not affect the scoring
  - Worded so that only one answer is acceptable
  - Little or no disagreement on what is the correct answer
  Subjective
  - Affected by the scorer's personal opinions, biases and judgments
  - Several answers are possible
  - Disagreement is possible on what is the correct answer

Time Limit and Level of Difficulty: Power vs. Speed
  Power
  - Consists of a series of items arranged in ascending order of difficulty
  - Measures a student's ability to answer more and more difficult items
  Speed
  - Consists of items approximately equal in difficulty
  - Measures a student's speed or rate and accuracy in responding

Format: Selective vs. Supply
  Selective
  - There are choices for the answer
  - Multiple Choice, True or False, Matching Type
  - Can be answered quickly
  - Prone to guessing
  - Time-consuming to construct
  Supply
  - There are no choices for the answer
  - Short Answer, Completion, Restricted or Extended Essay
  - May require a longer time to answer
  - Less chance of guessing but prone to bluffing
  - Time-consuming to answer and score
Types of Test According to FORMAT
1. Selective Type - provides choices for the answer
a. Multiple Choice - consists of a stem which describes the problem and 3 or more
alternatives which give the suggested solutions. The incorrect alternatives are the
distractors.
b. True-False or Alternative Response - consists of a declarative statement that one
has to mark true or false, right or wrong, correct or incorrect, yes or no, fact or
opinion, and the like.
c. Matching Type - consists of two parallel columns: Column A, the column of premises
from which a match is sought; Column B, the column of responses from which the
selection is made.
2. Supply Test
a. Short Answer - uses a direct question that can be answered by a word, a phrase, a
number, or a symbol
b. Completion Test - consists of an incomplete statement
3. Essay Test
a. Restricted Response - limits the content of the response by restricting the scope
of the topic
b. Extended Response - allows the students to select any factual information that
they think is pertinent and to organize their answers in accordance with their best
judgment
Types of NON-COGNITIVE TEST
1. Closed-Item or Forced-Choice Instruments - ask for one or a specific answer
a. Checklist - measures students' preferences, hobbies, attitudes, feelings, beliefs,
interests, etc. by marking a set of possible responses
b. Scales - instruments that indicate the extent or degree of one's response
1) Rating Scale - measures the degree or extent of one's attitudes, feelings, and
perceptions about ideas, objects and people by marking a point along a 3- or 5-point
scale
2) Semantic Differential Scale - measures the degree of one's attitudes, feelings and
perceptions about ideas, objects and people by marking a point along a 5-, 7- or 11-point
scale of semantic adjectives
3) Likert Scale - measures the degree of one's agreement or disagreement with positive
or negative statements about objects and people
c. Alternative Response - measures students' preferences, hobbies, attitudes, feelings,
beliefs, interests, etc. by choosing between two possible responses
d. Ranking - measures students' preferences or priorities by ranking a set of responses
2. Open-Ended Instruments - open to more than one answer
a. Sentence Completion - measures students' preferences over a variety of attitudes and
allows students to answer by completing an unfinished statement which may vary in
length
b. Surveys - measure the values held by an individual by writing one or many responses
to a given question
c. Essays - allow the students to reveal and clarify their preferences, hobbies, attitudes,
feelings, beliefs, and interests by writing their reactions or opinions to a given question
General Suggestions in Writing Tests
1. Use your test specifications as guide to item writing.

2. Write more test items than needed.
3. Write the test items well in advance of the testing date.
4. Write each test item so that the task to be performed is clearly defined.
5. Write each test item in appropriate reading level.
6. Write each test item so that it does not provide help in answering other items in the test.
7. Write each test item so that the answer is one that would be agreed upon by experts.
8. Write each test item so that it is at the proper level of difficulty.
9. Whenever a test is revised, recheck its relevance.
Specific Suggestions
SUPPLY TYPE OF TESTS

1. Word the item/s so that the required answer is both brief and specific.
2. Do not take statements directly from textbooks to use as a basis for short answer items.
3. A direct question is generally more desirable than an incomplete statement.
4. If the item is to be expressed in numerical units, indicate type of answer wanted.
5. Blanks should be equal in length.
6. Answers should be written before the item number for easy checking.
7. When completion items are to be used, do not have too many blanks. Blanks should be at the center
of the sentence and not at the beginning.
SELECTIVE TYPE OF TESTS

A. Alternative-Response
1. Avoid broad statements.
2. Avoid trivial statements.
3. Avoid the use of negative statements especially double negatives.
4. Avoid long and complex sentences.
5. Avoid including two ideas in one sentence unless cause and effect relationship is being measured.
6. If opinion is used, attribute it to some source unless the ability to identify opinion is being specifically
measured.
7. True statements and false statements should be approximately equal in length.
8. The number of true statements and false statements should be approximately equal.
9. Start with false statement since it is a common observation that the first statement in this type is
always positive.
B. Matching Type
1. Use only homogeneous materials in a single matching exercise.
2. Include an unequal number of responses and premises, and instruct the pupils that response may
be used once, more than once, or not at all.
3. Keep the list of items to be matched brief, and place the shorter responses at the right.
4. Arrange the list of responses in logical order.
5. Indicate in the directions the basis for matching the responses and premises.
6. Place all the items for one matching exercise on the same page.
C. Multiple Choice
1. The stem of the item should be meaningful by itself and should present a definite problem.
2. The stem should include as much of the item as possible and should be free of irrelevant information.
3. Use a negatively stated item stem only when significant learning outcome requires it.
4. Highlight negative words in the stem for emphasis.
5. All the alternatives should be grammatically consistent with the stem of the item.
6. An item should only have one correct or clearly best answer.
7. Items used to measure understanding should contain novelty, but beware of too much.
8. All distractors should be plausible.
9. Verbal association between the stem and the correct answer should be avoided.
10. The relative length of the alternatives should not provide a clue to the answer.
11. The alternatives should be arranged logically.
12. The correct answer should appear in each of the alternative positions an approximately equal
number of times but in random order.
13. Use of special alternatives such as none of the above or all of the above should be done
sparingly.
14. Do not use multiple choice items when other types are more appropriate.
15. Always have the stem and alternatives on the same page.
16. Break any of these rules when you have a good reason for doing so.
D. Essay Type of Test
1. Restrict the use of essay questions to those learning outcomes that cannot be satisfactorily
measured by objective items.
2. Formulate questions that will call forth the behavior specified in the learning outcome.
3. Phrase each question so that the pupil's task is clearly indicated.
4. Indicate an approximate time limit for each question.
5. Avoid the use of optional questions.
SUGGESTIONS IN WRITING NON-TEST INSTRUMENTS OF ATTITUDINAL NATURE

1. Avoid statements that refer to the past rather than to the present.
2. Avoid statements that are factual or capable of being interpreted as factual.
3. Avoid statements that may be interpreted in more than one way.
4. Avoid statements that are irrelevant to the psychological object under consideration.
5. Avoid statements that are likely to be endorsed by almost everyone or by almost no one.
6. Select statements that are believed to cover the entire range of affective scale of interests.
7. Keep the language of the statements simple, clear and direct.
8. Statements should be short, rarely exceeding 20 words.
9. Each statement should contain only one complete thought.
10. Statements containing universals such as all, always, none and never often introduce ambiguity and
should be avoided.
11. Words such as only, just, merely, and others of similar nature should be used with care and
moderation in writing statements.
12. Whenever possible, statements should be in the form of simple statements rather than in the form of
compound or complex sentences.
13. Avoid the use of words that may not be understood by those who are to be given the completed
scale.
14. Avoid the use of double negatives.
CRITERIA TO CONSIDER IN CONSTRUCTING GOOD TESTS
VALIDITY - is the degree to which a test measures what it is intended to measure. It is the usefulness
of the test for a given purpose. It is the most important criterion of a good examination.
FACTORS influencing the validity of tests in general
Appropriateness of Test - it should measure the abilities, skills and information it is supposed
to measure
Directions - they should indicate how the learners should answer and record their answers
Reading Vocabulary and Sentence Structure - these should be based on the intellectual level of
maturity and background experience of the learners
Difficulty of Items - it should have items that are not too difficult and not too easy, to be able to
discriminate the bright from the slow pupils
Construction of Items - it should not provide clues so it will not be a test on clues, nor should
it be ambiguous so it will not be a test on interpretation
Length of Test - it should be of sufficient length to measure what it is supposed to
measure, and not so short that it cannot adequately measure the performance we
want to measure
Arrangement of Items - it should have items that are arranged in ascending level of difficulty,
starting with the easy ones so that pupils will persist in taking the test
Patterns of Answers - it should not allow the creation of patterns in answering the test
WAYS of Establishing Validity
Face Validity - is done by examining the physical appearance of the test
Content Validity - is done through a careful and critical examination of the objectives of the
test so that it reflects the curricular objectives
Criterion-Related Validity - is established statistically such that a set of scores revealed by a
test is correlated with scores obtained in another external predictor or measure. It has two
purposes:
Concurrent Validity - describes the present status of the individual by correlating
the sets of scores obtained from two measures given concurrently
Predictive Validity - describes the future performance of an individual by correlating
the sets of scores obtained from two measures given at a longer time interval
Construct Validity - is established statistically by comparing psychological traits or factors
that influence scores in a test, e.g. verbal, numerical, spatial, etc.
Convergent Validity - is established if the instrument also describes another, similar trait
other than the one it is intended to measure (e.g. a Critical Thinking Test may be correlated
with a Creative Thinking Test)
Divergent Validity - is established if an instrument can describe only the intended
trait and not other traits (e.g. a Critical Thinking Test may not be correlated with a
Reading Comprehension Test)
RELIABILITY - it refers to the consistency of scores obtained by the same person when retested using
the same instrument or one that is parallel to it.
FACTORS affecting Reliability
Length of the Test - as a general rule, the longer the test, the higher the reliability. A longer
test provides a more adequate sample of the behavior being measured and is less distorted by
chance factors like guessing.
Difficulty of the Test - ideally, achievement tests should be constructed such that the average
score is 50 percent correct and the scores range from zero to near perfect. The bigger the
spread of scores, the more reliable the measured difference is likely to be. A test is reliable if
the coefficient of correlation is not less than 0.85.
Objectivity - can be obtained by eliminating the bias, opinions or judgments of the person
who checks the test.
Administrability - the test should be administered with ease, clarity and uniformity so that
scores obtained are comparable. Uniformity can be obtained by setting the time limit and oral
instructions.
Scorability - the test should be easy to score: directions for scoring are clear, the
scoring key is simple, and provisions for answer sheets are made
Economy - the test should be given in the cheapest way, which means that answer sheets
must be provided so the test can be given from time to time
Adequacy - the test should contain a wide sampling of items to determine the educational
outcomes or abilities so that the resulting scores are representative of the total performance
in the areas measured
METHODS OF ESTABLISHING RELIABILITY

Test-Retest
  Type of Reliability Measure: Measure of stability
  Procedure: Give a test twice to the same group, with any time interval between sets from several minutes to several years
  Statistical Measure: Pearson r

Equivalent Forms
  Type of Reliability Measure: Measure of equivalence
  Procedure: Give parallel forms of the test at the same time
  Statistical Measure: Pearson r

Test-Retest with Equivalent Forms
  Type of Reliability Measure: Measure of stability and equivalence
  Procedure: Give parallel forms of the test with increased time intervals between forms
  Statistical Measure: Pearson r

Split Half
  Type of Reliability Measure: Measure of internal consistency
  Procedure: Give a test once. Score equivalent halves of the test (e.g. odd- and even-numbered items)
  Statistical Measure: Pearson r and Spearman Brown Formula

Kuder-Richardson
  Type of Reliability Measure: Measure of internal consistency
  Procedure: Give the test once, then correlate the proportion/percentage of the students passing and not passing a given item
  Statistical Measure: Kuder-Richardson Formula 20 and 21
SHAPES OF FREQUENCY POLYGONS
1. Normal / Bell-Shaped / Symmetrical
2. Positively Skewed - most scores are below the mean and there are extremely high scores
3. Negatively Skewed - most scores are above the mean and there are extremely low scores
4. Leptokurtic - highly peaked and the tails are more elevated above the baseline
5. Mesokurtic - moderately peaked
6. Platykurtic - flattened peak
7. Bimodal Curve - curve with 2 peaks or modes
8. Polymodal Curve - curve with 3 or more modes
9. Rectangular Distribution - there is no mode
FOUR TYPES OF MEASUREMENT SCALES

Nominal
  Characteristics: Groups and labels data
  Example: Gender (1-male; 2-female)

Ordinal
  Characteristics: Ranks data; distances between points are indefinite
  Example: Income (1-low, 2-average, 3-high)

Interval
  Characteristics: Distances between points are equal; no absolute zero
  Examples: Test scores, Temperature (a zero score in a test does not mean no knowledge at all)

Ratio
  Characteristics: Distances between points are equal; has an absolute zero
  Examples: Height, Weight (a zero weight means no weight at all)
MEASURES OF CENTRAL TENDENCY AND VARIABILITY
Measures of central tendency describe the representative value of a set of data.
Measures of variability describe the degree of spread or dispersion of a set of data.

APPROPRIATE STATISTICAL TOOLS AND THE ASSUMPTIONS WHEN USED

When the frequency distribution is regular or symmetrical (normal); usually used when data are numeric (interval or ratio)
  Central Tendency: Mean - the arithmetic average
  Variability: Standard Deviation - the root-mean-square of the deviations from the mean

When the frequency distribution is irregular or skewed; usually used when the data are ordinal
  Central Tendency: Median - the middle score in a group of scores that are ranked
  Variability: Quartile Deviation - the average deviation of the 1st and 3rd quartiles from the median

When the distribution of scores is normal and a quick answer is needed; usually used when the data are nominal
  Central Tendency: Mode - the most frequent score
  Variability: Range - the difference between the highest and the lowest score in the distribution
MEASURES OF CENTRAL TENDENCY

MEAN
1) Ungrouped Data: used for few cases (N<30)
a. Get the sum of scores (ΣX)
b. Divide the sum by the number of cases (N)
Formula:
$$\bar{X} = \frac{\sum X}{N}$$
Example:
Scores are 5, 7, 9, 11 and 13
Sum of scores is 45
$$\bar{X} = \frac{\sum X}{N} = \frac{45}{5} = 9$$
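The two steps above can be sketched in Python as follows. This is only an illustrative sketch; the function name `mean_ungrouped` is an assumption made for this example and does not come from the reviewer.

```python
def mean_ungrouped(scores):
    """Arithmetic mean: sum of the scores divided by the number of cases (N)."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


# Scores from the example above: the sum is 45 and N is 5.
print(mean_ungrouped([5, 7, 9, 11, 13]))  # 9.0
```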
2) Grouped Data: used for large cases (N>30)

a. MIDPOINT METHOD
Procedure:
1. Group the data in the form of a frequency distribution
2. Compute the midpoints of all class limits (M)
3. Multiply the midpoints by their corresponding frequencies (M x F)
4. Get the sum of the products of the midpoints and frequencies (ΣMF)
5. Divide the sum by the number of cases (N)
Example:
Class Limits    M     F     MF
45-49           47    2     94
40-44           42    0     0
35-39           37    12    444
30-34           32    13    416
25-29           27    10    270
20-24           22    5     110
15-19           17    4     68
10-14           12    4     48
                      N=50  ΣMF=1450
Formula:
$$\bar{X} = \frac{\sum MF}{N} = \frac{1450}{50} = 29$$
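The midpoint method can be checked with a short Python sketch. The tuple layout `(lower limit, upper limit, frequency)` and the name `grouped_mean_midpoint` are assumptions made for this illustration; the data are the frequency distribution from the example.

```python
# Each class is (lower score limit, upper score limit, frequency).
CLASSES = [
    (45, 49, 2), (40, 44, 0), (35, 39, 12), (30, 34, 13),
    (25, 29, 10), (20, 24, 5), (15, 19, 4), (10, 14, 4),
]


def grouped_mean_midpoint(classes):
    """Mean of grouped data: sum of (midpoint x frequency) divided by N."""
    n = sum(freq for _, _, freq in classes)
    sum_mf = sum(((low + high) / 2) * freq for low, high, freq in classes)
    return sum_mf / n


print(grouped_mean_midpoint(CLASSES))  # 29.0
```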
b. CLASS DEVIATION METHOD

Procedure:
1. Choose an arbitrary starting point or origin from any of the class limits.
2. Get the midpoint of the class limit you have chosen as your starting point. Call this your Assumed Mean (AM).
3. Get the deviation (D) of each class limit from the class limit where the assumed mean is. The deviation of the class limit of the Assumed Mean is 0. Add 1 (+1) consecutively for each class limit higher than the Assumed Mean and subtract 1 (-1) consecutively for each class limit lower than the Assumed Mean.
4. Multiply the frequencies by their deviations (FD).
5. Add the products of the frequencies and deviations (ΣFD).
6. Divide the sum by the number of cases (ΣFD/N).
7. Multiply the quotient by the size of the class interval (i).
8. Add the product to the Assumed Mean.
Formula:
$$\bar{X} = AM + i\left(\frac{\sum FD}{N}\right)$$
Example:
Class Limits    F     D     FD
45-49           2     3     6
40-44           0     2     0
35-39           12    1     12
30-34           13    0     0     - Origin (AM=32)
25-29           10    -1    -10
20-24           5     -2    -10
15-19           4     -3    -12
10-14           4     -4    -16
                N=50        ΣFD=-30
i=5
$$\bar{X} = AM + i\left(\frac{\sum FD}{N}\right) = 32 + 5\left(\frac{-30}{50}\right) = 32 + 5(-0.6) = 32 + (-3) = 29$$
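As a check on the class deviation method, the following Python sketch computes the same mean from an assumed origin. The function name and the `origin_index` parameter are hypothetical, and the sketch assumes the classes are listed from highest to lowest, as in the example table. Any class can serve as the origin; the resulting mean should be the same.

```python
# Classes listed from highest to lowest: (lower limit, upper limit, frequency).
CLASSES = [
    (45, 49, 2), (40, 44, 0), (35, 39, 12), (30, 34, 13),
    (25, 29, 10), (20, 24, 5), (15, 19, 4), (10, 14, 4),
]


def grouped_mean_class_deviation(classes, origin_index):
    """Mean = AM + i * (sum of FD / N), with AM the midpoint of the origin class."""
    low, high, _ = classes[origin_index]
    assumed_mean = (low + high) / 2
    interval = high - low + 1  # size of the class interval (i)
    n = sum(freq for _, _, freq in classes)
    # Classes above the origin (smaller index) get +1, +2, ...; those below get -1, -2, ...
    sum_fd = sum(freq * (origin_index - k) for k, (_, _, freq) in enumerate(classes))
    return assumed_mean + interval * sum_fd / n


print(grouped_mean_class_deviation(CLASSES, origin_index=3))  # origin 30-34, AM = 32
```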
MEDIAN
1) Ungrouped Data:
Case 1. The total number of cases is an odd number.
Procedure:
1. Arrange the scores from highest to lowest or vice versa
2. Get the middlemost score. That score is the median score.
Example: (N=11)
1. 100
2. 98
3. 97
4. 96
5. 94
6. 92 - median score
7. 91
8. 90
9. 88
10. 87
11. 87
Case 2. The total number of cases is an even number.
Procedure:
1. Arrange the scores from highest to lowest or vice versa
2. Get the two middlemost scores.
3. Compute the average of the two middlemost scores. The average is the median score.
Example: (N=8)
1. 100
2. 98
3. 97
4. 96 - middlemost score
5. 94 - middlemost score
6. 92
7. 91
8. 90
$$\text{Median} = \frac{96 + 94}{2} = 95$$
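Cases 1 and 2 can be sketched in Python as shown below. The function name `median_ungrouped` is an assumption for this illustration, and the sketch covers only the odd and even cases, not the treatment of tied middle scores.

```python
def median_ungrouped(scores):
    """Middlemost score (odd N) or average of the two middlemost scores (even N)."""
    if not scores:
        raise ValueError("at least one score is required")
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_ungrouped([100, 98, 97, 96, 94, 92, 91, 90, 88, 87, 87]))  # 92
print(median_ungrouped([100, 98, 97, 96, 94, 92, 91, 90]))              # 95.0
```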
Case 3. The middlemost score occurs twice, thrice or more.
Procedure:
1. Arrange the scores from highest to lowest or vice versa
2. Get the middlemost score/s, its/their identical score/s and its/their counterparts either above or below the middlemost score/s.
3. Compute their average; the average is the median score.
Example:
a. N is odd (N=7)
1. 90
2. 88
3. 87
4. 85 - middlemost score
5. 85
6. 84
7. 82
$$\text{Median} = \frac{87 + 85 + 85}{3} = 85.67$$
Example:
b. N is even (N=10)
1. 90
2. 88
3. 87
4. 85
5. 84
6. 83
7. 83
8. 82
9. 81
10. 80
$$\text{Median} = \frac{85 + 84 + 83 + 83}{4} = 83.75$$
MEDIAN
2) Grouped Data: (N>30)
Procedure:
1. Add up or accumulate the frequencies starting from the lowest to the highest class limit. Call this the cumulative frequency (CF).
2. Find one-half of the number of cases in the distribution (N/2).
3. Find the cumulative frequency which is equal to or closest to (but higher than) half of the number of cases. The class containing this frequency is the median class.
4. Find the lowest limit (LL) of the median class by subtracting 0.5 from the lowest score of the median class.
5. Get the cumulative frequency of the class below the median class (CFb).
6. Subtract this from half of the number of cases in the distribution (N/2 - CFb).
7. Get the frequency of the median class (FMdn).
8. Find the class interval (i), then follow the formula below.
Formula:
$$\tilde{X} = \text{LL} + i\left(\frac{\frac{N}{2} - \text{CFb}}{\text{FMdn}}\right)$$
Where:
LL = lowest limit of the median class
i = class interval
N/2 = half the number of cases
CFb = cumulative frequency below
the median class
FMdn = frequency of the median class
Example:
Class Limits    F            CF
45-49           2            50
40-44           0            48
35-39           12           48
30-34           13 (FMdn)    36 - median class
25-29           10           23 (CFb)
20-24           5            13
15-19           4            8
10-14           4            4
                N=50
i = 5
N/2 = 50/2 = 25
LL = 30 - 0.5 = 29.5
$$\tilde{X} = 29.5 + 5\left(\frac{25 - 23}{13}\right) = 29.5 + 5\left(\frac{2}{13}\right) = 29.5 + 0.77 = 30.27$$
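The grouped-median procedure can be sketched in Python as follows. The function name `grouped_median` and the tuple layout are assumptions for this illustration; the classes are listed from lowest to highest so the frequencies can be accumulated in order, and empty classes are skipped when locating the median class.

```python
# Classes listed from lowest to highest: (lower limit, upper limit, frequency).
CLASSES = [
    (10, 14, 4), (15, 19, 4), (20, 24, 5), (25, 29, 10),
    (30, 34, 13), (35, 39, 12), (40, 44, 0), (45, 49, 2),
]


def grouped_median(classes):
    """Median = LL + i * ((N/2 - CFb) / FMdn)."""
    n = sum(freq for _, _, freq in classes)
    half = n / 2
    cf_below = 0  # cumulative frequency below the current class (CFb)
    for low, high, freq in classes:
        if freq > 0 and cf_below + freq >= half:
            lowest_limit = low - 0.5   # LL of the median class
            interval = high - low + 1  # class interval (i)
            return lowest_limit + interval * (half - cf_below) / freq
        cf_below += freq
    raise ValueError("the distribution has no cases")


print(round(grouped_median(CLASSES), 2))  # 30.27
```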
MODE
1) Ungrouped Data: (N<30)
Get the most frequent score
Example 1: one mode or unimodal

27, 26, 25, 24, 24, 23
Mode is 24
Example 2: two modes or bimodal

27, 27, 26, 25, 24, 24, 23
Modes are 27 and 24
Example 3: three modes or trimodal

27, 27, 26, 25, 24, 24, 23, 23
Modes are 27, 24 and 23
When there are more than three modes, they are called polymodal or multimodal
When there is no mode, it is described as a rectangular distribution
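A short Python sketch of finding the mode or modes of ungrouped data is given below. The function name `modes` is hypothetical, and treating "every distinct score occurs equally often" as having no mode is an assumption based on the rectangular-distribution remark above.

```python
from collections import Counter


def modes(scores):
    """Return the most frequent score(s), highest first; an empty list means no mode."""
    if not scores:
        return []
    counts = Counter(scores)
    top = max(counts.values())
    # If every distinct score occurs equally often, there is no mode (assumption).
    if len(counts) > 1 and all(count == top for count in counts.values()):
        return []
    return sorted((score for score, count in counts.items() if count == top), reverse=True)


print(modes([27, 26, 25, 24, 24, 23]))          # [24]
print(modes([27, 27, 26, 25, 24, 24, 23]))      # [27, 24]
print(modes([27, 27, 26, 25, 24, 24, 23, 23]))  # [27, 24, 23]
```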
2) Grouped Data
a. Crude Mode - refers to the midpoint of the class limit with the highest frequency.
Procedure:
1. Find the class limit with the highest frequency.
2. Get the midpoint of that class limit.
3. The midpoint is the crude mode.
Example:
Class Limits    F
45-49           2
40-44           0
35-39           12
30-34           13 - highest frequency
25-29           10
20-24           5
15-19           4
10-14           4
                N=50
Mode = midpoint of 30-34 = 32
b. Refined Mode - refers to the mode obtained from an ordered arrangement or a class
frequency distribution
Procedure:
1. Get the mean and the median of the grouped data.
2. Multiply the median by three (3Mdn)
3. Multiply the mean by two (2Mn)
4. Find the difference
Formula:
$$\hat{X} = 3\,\text{Mdn} - 2\,\text{Mn}$$
Example:
Using the mean and median of the grouped data above (Mn = 29, Mdn = 30.27):
$$\hat{X} = 3(30.27) - 2(29) = 90.81 - 58 = 32.81$$
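Both grouped-data modes can be sketched in Python as shown here. The function names `crude_mode` and `refined_mode` are assumptions for this illustration; the mean (29) and median (30.27) are the values computed earlier for the same distribution.

```python
# Each class is (lower limit, upper limit, frequency).
CLASSES = [
    (45, 49, 2), (40, 44, 0), (35, 39, 12), (30, 34, 13),
    (25, 29, 10), (20, 24, 5), (15, 19, 4), (10, 14, 4),
]


def crude_mode(classes):
    """Midpoint of the class with the highest frequency."""
    low, high, _ = max(classes, key=lambda c: c[2])
    return (low + high) / 2


def refined_mode(mean, median):
    """Refined mode = 3 * median - 2 * mean."""
    return 3 * median - 2 * mean


print(crude_mode(CLASSES))                  # 32.0
print(round(refined_mode(29, 30.27), 2))    # 32.81
```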
HOW to INTERPRET the Measures of CENTRAL TENDENCY
The value that represents a set of data will be the basis in determining whether the group is
performing better or poorer than the other groups.
MEASURES OF VARIABILITY
RANGE (R)
1. Ungrouped Data - the difference between the highest and the lowest score
2. Grouped Data - the difference between the upper limit of the highest class limit and the lower
limit of the lowest class limit
STANDARD DEVIATION (SD)

Ungrouped Data
Procedure:
1. Find the mean ($\bar{X}$).
2. Subtract the mean from each score to get the deviation ($d = X - \bar{X}$).
3. Square the deviation ($d^2$).
4. Get the sum of the squared deviations ($\sum d^2$).
5. Divide the sum by the number of cases minus one ($\sum d^2 / (N - 1)$).
6. Get the square root of the quotient.
Formula:
$$SD = \sqrt{\frac{\sum d^2}{N - 1}}$$
Example:
X     d = X - X̄    d²
5     -4           16
7     -2           4
9     0            0
11    2            4
13    4            16
                   Σd² = 40
X̄ = 9
N = 5
$$SD = \sqrt{\frac{40}{4}} = 3.16$$
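The six steps for ungrouped data can be sketched in Python as follows. The function name `sd_ungrouped` is an assumption for this illustration; it divides by N - 1, matching the formula above.

```python
import math


def sd_ungrouped(scores):
    """Standard deviation: square root of (sum of squared deviations / (N - 1))."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are required")
    mean = sum(scores) / n
    sum_d2 = sum((x - mean) ** 2 for x in scores)
    return math.sqrt(sum_d2 / (n - 1))


print(round(sd_ungrouped([5, 7, 9, 11, 13]), 2))  # 3.16
```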
STANDARD DEVIATION (SD)
Grouped Data
CLASS DEVIATION METHOD
Procedure:
1. As in computing the mean, get the deviation (d) and the product of the frequency and deviation of each class (fd).
2. Multiply the product of the frequency and deviation (fd) by the deviation (d) to get (fd²).
3. Get the summation of fd².
4. Compute the standard deviation using the formula below.
Formula:
$$SD = i\sqrt{\frac{\sum fd^2}{N} - \frac{(\sum fd)^2}{N^2}}$$
where:
i = interval
N = number of cases
Σfd = sum of the products of frequency and deviation
Σfd² = sum of the products of frequency and squared deviation
Example:
Class Limits    f     d     fd     fd²
45-49           2     3     6      18
40-44           0     2     0      0
35-39           12    1     12     12
30-34           13    0     0      0
25-29           10    -1    -10    10
20-24           5     -2    -10    20
15-19           4     -3    -12    36
10-14           4     -4    -16    64
                N=50        Σfd=-30  Σfd²=160
$$SD = 5\sqrt{\frac{160}{50} - \frac{(-30)^2}{50^2}} = 5\sqrt{3.2 - 0.36} = 5\sqrt{2.84} = 5(1.685)$$
SD = 8.4
MIDPOINT METHOD
Procedure:
1. Square the midpoint (M²) and multiply it by the frequency. (Shortcut: multiply the midpoint (M) by the product of frequency and midpoint (fM).)
2. Write the product of (M) and (fM) and label the column (fM²).
3. Use the formula below to compute the standard deviation.
Formula:
$$SD = \sqrt{\frac{\sum fM^2}{N} - (\bar{X})^2}$$
Example:
Class Limits    f     M     fM     fM²
45-49           2     47    94     4418
40-44           0     42    0      0
35-39           12    37    444    16428
30-34           13    32    416    13312
25-29           10    27    270    7290
20-24           5     22    110    2420
15-19           4     17    68     1156
10-14           4     12    48     576
                N=50        ΣfM=1450  ΣfM²=45600
X̄ = 29
$$SD = \sqrt{\frac{45600}{50} - (29)^2} = \sqrt{912 - 841} = \sqrt{71}$$
SD = 8.4
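The two grouped-data methods can be cross-checked with the Python sketch below. The function names and the `origin_index` parameter are assumptions for this illustration; the classes are listed from highest to lowest, as in the tables, and both methods divide by N. With the example data, both should give about 8.4.

```python
import math

# Classes listed from highest to lowest: (lower limit, upper limit, frequency).
CLASSES = [
    (45, 49, 2), (40, 44, 0), (35, 39, 12), (30, 34, 13),
    (25, 29, 10), (20, 24, 5), (15, 19, 4), (10, 14, 4),
]


def sd_class_deviation(classes, origin_index):
    """SD = i * sqrt(sum(fd^2)/N - (sum(fd)/N)^2)."""
    low, high, _ = classes[origin_index]
    interval = high - low + 1
    n = sum(f for _, _, f in classes)
    sum_fd = sum(f * (origin_index - k) for k, (_, _, f) in enumerate(classes))
    sum_fd2 = sum(f * (origin_index - k) ** 2 for k, (_, _, f) in enumerate(classes))
    return interval * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)


def sd_midpoint(classes):
    """SD = sqrt(sum(fM^2)/N - mean^2)."""
    n = sum(f for _, _, f in classes)
    mean = sum(f * (low + high) / 2 for low, high, f in classes) / n
    sum_fm2 = sum(f * ((low + high) / 2) ** 2 for low, high, f in classes)
    return math.sqrt(sum_fm2 / n - mean ** 2)


print(round(sd_class_deviation(CLASSES, origin_index=3), 1))  # 8.4
print(round(sd_midpoint(CLASSES), 1))                         # 8.4
```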
HOW to INTERPRET the STANDARD DEVIATION
The result will help you determine if the group is homogeneous or not.
The result will also help you determine the number of students that fall below and above the average
performance.
Main points to remember:
Points above Mean + 1SD = range of above average
Mean - 1SD to Mean + 1SD = limits of average ability
Points below Mean - 1SD = range of below average
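The three ranges can be illustrated with a small Python sketch. The function name `classify` and the sample scores are hypothetical; the mean (29) and SD (8.4) are the values from the grouped-data example, so the limits of average ability are 20.6 and 37.4.

```python
def classify(score, mean, sd):
    """Place a score relative to the limits Mean - 1SD and Mean + 1SD."""
    if score > mean + sd:
        return "above average"
    if score < mean - sd:
        return "below average"
    return "average"


for score in (40, 29, 20):  # hypothetical scores
    print(score, classify(score, mean=29, sd=8.4))
```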
QUARTILE DEVIATION (QD)

Ungrouped Data
Procedure:
1. Arrange the scores in descending or ascending order.
2. Compute Q1 (i.e. 1/4 N); the result tells the rank of the Q1 score in the ordered arrangement, counting from the bottom.
3. Look for the score at that rank.
4. Compute Q3 (i.e. 3/4 N); the result tells the rank of the Q3 score.
5. Look for the Q3 score at that rank.
6. Compute QD.

Formula:
$$ QD = \frac{Q_3 - Q_1}{2} $$

Example:
Rank    Score
10      99
9       90
8       88    <- Q3 score lies between 87 & 88
7       87
6       85
5       80
4       79
3       78    <- Q1 score lies between 78 & 77
2       77
1       76
N = 10
Rank of Q1 = 1/4 (10) = 2.5th score
Rank of Q3 = 3/4 (10) = 7.5th score
Q1 = (77 + 78) / 2 = 77.5
Q3 = (88 + 87) / 2 = 87.5
QD = (Q3 - Q1) / 2 = (87.5 - 77.5) / 2 = 5
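The ungrouped procedure can be sketched in Python as follows. The function names are illustrative, and treating a fractional rank by linear interpolation between the two neighbouring scores is an assumption that generalizes the example, where rank 2.5 is taken as the midpoint of ranks 2 and 3.

```python
import math


def score_at_rank(ordered, rank):
    """Score at a 1-based rank counted from the bottom of an ascending list.

    Fractional ranks are interpolated between the two neighbouring scores
    (an assumption; the handout only shows the x.5 case).
    """
    lower = math.floor(rank)
    frac = rank - lower
    low_score = ordered[lower - 1]
    if frac == 0:
        return low_score
    return low_score + frac * (ordered[lower] - low_score)


def quartile_deviation(scores):
    """Return (Q1, Q3, QD) using ranks N/4 and 3N/4."""
    if len(scores) < 4:
        raise ValueError("need at least 4 scores")
    ordered = sorted(scores)
    n = len(ordered)
    q1 = score_at_rank(ordered, n / 4)
    q3 = score_at_rank(ordered, 3 * n / 4)
    return q1, q3, (q3 - q1) / 2


scores = [99, 90, 88, 87, 85, 80, 79, 78, 77, 76]
q1, q3, qd = quartile_deviation(scores)
print(q1, q3, qd)  # 77.5 87.5 5.0
```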
Grouped Data
Procedure:
1. Compute the value of the 1st Quartile.

Formula:
$$ Q_1 = LL + i \left( \frac{\frac{N}{4} - CF_b}{F_{Q1}} \right) $$
where:
Q1 = 1st Quartile
LL = lower limit of the Q1 class
N/4 = one-fourth of the total number of cases
CFb = cumulative frequency below the Q1 class
FQ1 = frequency of the Q1 class
i = interval

Example:
Class Limits    F     CF
45-49           2     50
40-44           0     48
35-39          12     48
30-34          13     36
25-29          10     23
20-24           5     13    <- Q1 class
15-19           4      8
10-14           4      4
N = 50
i = 5
Location of the Q1 class: N/4 = 50/4 = 12.5
$$ Q_1 = 19.5 + 5 \left( \frac{12.5 - 8}{5} \right) $$
Q1 = 24
2. Compute the value of the 3rd Quartile.

Formula:
$$ Q_3 = LL + i \left( \frac{\frac{3N}{4} - CF_b}{F_{Q3}} \right) $$
where:
Q3 = 3rd Quartile
LL = lower limit of the Q3 class
3N/4 = three-fourths of the total number of cases
CFb = cumulative frequency below the Q3 class
FQ3 = frequency of the Q3 class
i = interval

Example:
Class Limits    F     CF
45-49           2     50
40-44           0     48
35-39          12     48    <- Q3 class
30-34          13     36
25-29          10     23
20-24           5     13
15-19           4      8
10-14           4      4
N = 50
i = 5
Location of the Q3 class: 3N/4 = 150/4 = 37.5
$$ Q_3 = 34.5 + 5 \left( \frac{37.5 - 36}{12} \right) $$
Q3 = 35.125

3. Compute the quartile deviation (QD), which is half of the interquartile range (Q3 - Q1).

Formula:
$$ QD = \frac{Q_3 - Q_1}{2} $$
QD = (35.125 - 24) / 2 = 5.56
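The grouped-data quartile formulas can be sketched in Python as one function that locates the class containing a given fraction of the cases. The function name and the (lower limit, frequency) layout are assumptions made for illustration; the data are the example distribution listed from the lowest class up.

```python
def grouped_point(classes, fraction, interval):
    """Value below which `fraction` of the cases fall.

    classes: (lower_limit, frequency) pairs ordered from the lowest class up.
    Applies LL + i * ((fraction * N - CFb) / F) to the class containing
    the target case.
    """
    n = sum(freq for _, freq in classes)
    target = fraction * n
    cf_below = 0
    for lower_limit, freq in classes:
        if freq > 0 and cf_below + freq >= target:
            return lower_limit + interval * (target - cf_below) / freq
        cf_below += freq
    raise ValueError("fraction must be between 0 and 1")


# Lower limits are the exact limits used in the example (e.g. 19.5 for 20-24).
classes = [(9.5, 4), (14.5, 4), (19.5, 5), (24.5, 10),
           (29.5, 13), (34.5, 12), (39.5, 0), (44.5, 2)]

q1 = grouped_point(classes, 0.25, 5)
q3 = grouped_point(classes, 0.75, 5)
qd = (q3 - q1) / 2
print(q1, q3, qd)  # 24.0 35.125 5.5625
```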
HOW to INTERPRET the QUARTILE DEVIATION
The result will help you determine whether the group is homogeneous: the smaller the QD, the more closely the middle 50% of the scores cluster around the median.
The result will also help you determine the number of students that fall below and above the average
performance.
Main points to remember:

Points above Median + 1QD = range of above average
Median + 1QD to Median - 1QD = limits of average ability
Points below Median - 1QD = range of below average
MEASURES OF CORRELATION
PEARSON r
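$$ r = \frac{ \frac{\sum XY}{N} - \left( \frac{\sum X}{N} \right) \left( \frac{\sum Y}{N} \right) }{ \sqrt{ \frac{\sum X^2}{N} - \left( \frac{\sum X}{N} \right)^2 } \sqrt{ \frac{\sum Y^2}{N} - \left( \frac{\sum Y}{N} \right)^2 } } $$
Where:
X = scores in a test
Y = scores in a retest
N = number of examinees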
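A minimal Python sketch of the Pearson r formula, written term by term as above. The function name and the test-retest scores are hypothetical and chosen only to show the two extreme cases.

```python
import math


def pearson_r(x, y):
    """Pearson r from the mean of products minus the product of means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    numerator = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    spread_x = math.sqrt(sum(a * a for a in x) / n - mean_x ** 2)
    spread_y = math.sqrt(sum(b * b for b in y) / n - mean_y ** 2)
    return numerator / (spread_x * spread_y)


# Hypothetical scores: the retest exactly doubles each test score,
# so the correlation is perfectly positive; reversing gives perfectly negative.
test_scores = [1, 2, 3, 4, 5]
retest_scores = [2, 4, 6, 8, 10]
print(round(pearson_r(test_scores, retest_scores), 2))
print(round(pearson_r(test_scores, retest_scores[::-1]), 2))
```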
Spearman-Brown Formula

$$ \text{reliability of the whole test} = \frac{2 r_{oe}}{1 + r_{oe}} $$
Where:
r_oe = reliability coefficient using the split-half or odd-even procedure
Kuder-Richardson Formula 20
$$ KR_{20} = \frac{K}{K - 1} \left( 1 - \frac{\sum pq}{S^2} \right) $$
Where:
K = number of items of a test
p = proportion of the examinees who got the item right
q = proportion of the examinees who got the item wrong
S² = variance (standard deviation squared) of the test scores
Kuder-Richardson Formula 21
$$ KR_{21} = \frac{K}{K - 1} \left( 1 - \frac{K p q}{S^2} \right) $$
Where:
p = X̄ / K (X̄ = mean of the test scores)
q = 1 - p
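The three reliability formulas can be sketched in Python as below. The function names, the half-test correlation of 0.6, and the 50-item test with mean 25 and variance 100 are hypothetical values used only to exercise the formulas.

```python
def spearman_brown(r_oe):
    """Whole-test reliability from a split-half (odd-even) correlation."""
    return 2 * r_oe / (1 + r_oe)


def kr20(item_p, variance):
    """KR-20 from per-item proportions correct and total-score variance."""
    k = len(item_p)
    sum_pq = sum(p * (1 - p) for p in item_p)
    return k / (k - 1) * (1 - sum_pq / variance)


def kr21(k, mean, variance):
    """KR-21 from the number of items, mean score and score variance."""
    p = mean / k
    q = 1 - p
    return k / (k - 1) * (1 - k * p * q / variance)


print(spearman_brown(0.6))                   # 0.75 up to floating-point rounding
print(round(kr21(50, 25, 100), 3))
print(round(kr20([0.5] * 50, 100), 3))       # same as KR-21 when every item has p = 0.5
```

KR-21 is the special case of KR-20 in which all items are assumed to be equally difficult, which is why the last two lines agree.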
INTERPRETATION OF THE Pearson r

Correlation value
 1    ----------- Perfect Positive Correlation
                  high positive correlation
 0.5  ----------- Positive Correlation
                  low positive correlation
 0    ----------- Zero Correlation
                  low negative correlation
-0.5  ----------- Negative Correlation
                  high negative correlation
-1    ----------- Perfect Negative Correlation

for Validity: computed r should be at least 0.75 to be significant
for Reliability: computed r should be at least 0.85 to be significant
STANDARD SCORES
Indicate the pupil's relative position by showing how far his or her raw score is above or below average
Express the pupil's performance in terms of standard units from the mean
Represented by the normal probability curve or what is commonly called the normal curve
Used to have a common unit to compare raw scores from different tests
PERCENTILE
tells the percentage of examinees that lies below one's score
Example:
P85 = 70 (This means the person who scored 70 performed better than 85% of the
examinees)
Formula:
$$ P_{85} = LL + i \left( \frac{0.85N - CF_b}{F_{P85}} \right) $$
Z-SCORES
tells the number of standard deviations equivalent to a given raw score
Formula:
$$ Z = \frac{X - \bar{X}}{SD} $$
Where:
X = individual's raw score
X̄ = mean of the normative group
SD = standard deviation of the normative group
Example:
Mean of a group in a test: X̄ = 26
SD = 2

Joseph's Score: X = 27
$$ Z = \frac{X - \bar{X}}{SD} = \frac{27 - 26}{2} = \frac{1}{2} $$
Z = 0.5

John's Score: X = 25
$$ Z = \frac{X - \bar{X}}{SD} = \frac{25 - 26}{2} = \frac{-1}{2} $$
Z = -0.5
T-SCORES
refers to any set of normally distributed standard scores that has a mean of 50 and
a standard deviation of 10
computed after converting raw scores to z-scores, to get rid of negative values
Formula: T-score = 50 + 10(Z)
Example:
Joseph's T-score = 50 + 10(0.5)
= 50 + 5
= 55
John's T-score = 50 + 10(-0.5)
= 50 - 5
= 45
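The z-score and T-score conversions can be sketched in Python using the example's mean of 26 and SD of 2. The function names are illustrative.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd


def t_score(z):
    """Rescale a z-score to a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z


for name, raw in (("Joseph", 27), ("John", 25)):
    z = z_score(raw, mean=26, sd=2)
    print(name, z, t_score(z))
# Joseph 0.5 55.0
# John -0.5 45.0
```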
ASSIGNING GRADES / MARKS / RATINGS
Marking or Grading is the process of assigning value to a performance
Marks / Grades / Rating SYMBOLS:
Could be in:
percent such as 70%, 88% or 92%
letters such as A, B, C, D or F
numbers such as 1.0, 1.5, 2.75, 5
descriptive expressions such as Outstanding (O), Very Satisfactory (VS), Satisfactory (S),
Moderately Satisfactory (MS), Needs Improvement (NI)
Could represent:
how a student is performing in relation to other students (norm-referenced grading)
the extent to which a student has mastered a particular body of knowledge (criterion-
referenced grading)
how a student is performing in relation to a teacher's judgment of his or her potential
Could be for:
Certification that gives assurance that a student has mastered a specific content or
achieved a certain level of accomplishment
Selection that provides basis in identifying or grouping students for certain educational paths
or programs
Direction that provides information for diagnosis and planning
Motivation that emphasizes specific material or skills to be learned and helping students to
understand and improve their performance
Could be based on:

examination results or test data
observations of student works
group evaluation activities
class discussions and recitations
homework
notebooks and note taking
reports, themes and research papers
discussions and debates
portfolios
projects
attitudes, etc.
Could be assigned by using:

Criterion-Referenced Grading or grading based on fixed or absolute standards, where a
grade is assigned based on how well a student has met the criteria or well-defined objectives of
a course that were spelled out in advance. It is then up to the student to earn the grade he or
she wants to receive regardless of how other students in the class have performed. This is
done by transmuting test scores into marks or ratings.
Norm-Referenced Grading or grading based on relative standards, where a student's
grade reflects his or her level of achievement relative to the performance of other students in
the class. In this system, the grade is assigned based on the average of test scores. The
rating scales that are used in assigning grades are:
Point or Percentage Grading System, whereby the teacher identifies points or percentages
for various tests and class activities depending on their importance. The total of these points
will be the basis for the grade assigned to the student.
Contract Grading System, where each student agrees to work for a particular grade
according to agreed-upon standards.
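As an illustration of a point or percentage grading system, the sketch below combines component scores using weights that reflect their importance. The component names, weights, and scores are hypothetical and not taken from any prescribed grading policy.

```python
def weighted_grade(scores, weights):
    """Combine percent scores using weights that must sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(scores[name] * weight for name, weight in weights.items())


# Hypothetical weights set by the teacher and one student's percent scores.
weights = {"quizzes": 0.3, "projects": 0.3, "exam": 0.4}
scores = {"quizzes": 80, "projects": 90, "exam": 85}

grade = weighted_grade(scores, weights)
print(round(grade, 1))  # 24 + 27 + 34, approximately 85
```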
GUIDELINES IN GRADING STUDENTS
1. Explain your grading system to the students early in the course and remind them of the grading
policies regularly.
2. Base grades on a predetermined and reasonable set of standards.
3. Base your grades on as much objective evidence as possible.
4. Base grades on the student's attitude as well as achievement, especially at the elementary and
high school level.
5. Base grades on the student's relative standing compared to classmates.
6. Base grades on a variety of sources.
7. As a rule, do not change grades, once computed.
8. Become familiar with the grading policy of your school and with your colleagues' standards.
9. When failing a student, closely follow school procedures.
10. Record grades on report cards and cumulative records.
11. Guard against bias in grading.
12. Keep pupils informed of their standing in the class.
