EDA 1130: EDUCATIONAL ASSESSMENT
Chapter 3, 4 & 5 (Video B)
Chapter 3: Meaning of Test Scores
CHAPTER 3 OVERVIEW
What is a Test?
What is a Test Score?
Raw scores
Types of Scores
Scaled scores
Norm-Referenced & Criterion-Referenced Score Interpretation
Norm-Referenced Interpretations
Criterion-Referenced Interpretations
| Norm-Referenced Test | Criterion-Referenced Test |
Aim | Compare a student's performance with other students; select students for certification | Compare a student's performance against some criteria (e.g. learning outcomes); determine the extent to which the student has acquired the knowledge or skill; improve teaching & learning |
Types of Questions | Questions from simple to difficult | Questions of nearly similar difficulty relating to the criteria |
Reporting of results | Grades are assigned | No grades are assigned (only whether the skill or knowledge has been achieved or not) |
Content coverage | Wide content coverage | Specific aspects of the content |
Examples | UPSR, PMR, SPM national examinations, end of semester / year exams | Class tests, exercises and assignments |
Qualitative Description of Scores
Chapter 4 & 5: Reliability and Validity for Teachers
Chapter 4 & 5 Overview
Introduction
In any measurement or testing, we get a score, e.g. 60%.
Does this score tell the true ability of a student?
Observed Score = True Score + Error
(what we see/measure) (Real ability)
It is impossible to develop a test without error.
What matters is that the error is small and consistent,
and that the test really measures what it is meant to measure (the true score).
What is Reliability?
Observed Score = True Score + Error
(what we see/measure) (Real ability)
A good test/instrument must have both reliability and validity.
The Reliability Coefficient
Reliability Coefficient (R) = True Score Variance / Observed Score Variance
R = 1 means a perfect test with no error
(the gap between observed score variance and true score variance is due to error)
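The variance ratio above can be illustrated with a small sketch. The true scores and errors below are invented for illustration only; in a real classroom the true score cannot be observed directly, so R has to be estimated by other methods. The function name `reliability_coefficient` is hypothetical.

```python
from statistics import pvariance

# Hypothetical data: in practice true scores are never known directly.
true_scores = [50, 60, 70, 80, 90]
errors = [2, -2, 0, -2, 2]  # measurement error for each student

# Observed Score = True Score + Error
observed_scores = [t + e for t, e in zip(true_scores, errors)]


def reliability_coefficient(true_scores, observed_scores):
    """Return True Score Variance / Observed Score Variance."""
    return pvariance(true_scores) / pvariance(observed_scores)


r = reliability_coefficient(true_scores, observed_scores)
print(f"R = {r:.3f}")
```

Because the errors here are small relative to the spread of true scores, R comes out close to 1. Larger errors would inflate the observed score variance and push R down.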
Interpretation of Reliability Coefficients
Reliability (R) | Interpretation |
0.90 and above | Excellent reliability (comparable to the best standardised tests like SAT) |
0.80 – 0.90 | Very good for a classroom test |
0.70 – 0.80 | Good for a classroom test but there are probably a few items which could be improved |
0.60 – 0.70 | Somewhat low. There are probably some items which could be removed or improved |
0.50 – 0.60 | The test needs to be revised |
0.50 and below | Questionable reliability and the test should be replaced or needs major revision |
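The interpretation bands can be expressed as a simple lookup. This is a minimal sketch: the function name `interpret_reliability` is hypothetical, and because the bands in the table share their end points, the code assumes a boundary value (e.g. exactly 0.80) falls in the higher band.

```python
def interpret_reliability(r):
    """Map a reliability coefficient to the interpretation bands in the table.

    Assumption: a value on a shared boundary is placed in the higher band.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("reliability coefficient must be between 0 and 1")
    if r >= 0.90:
        return "Excellent reliability"
    if r >= 0.80:
        return "Very good for a classroom test"
    if r >= 0.70:
        return "Good for a classroom test; a few items could be improved"
    if r >= 0.60:
        return "Somewhat low; some items could be removed or improved"
    if r >= 0.50:
        return "The test needs to be revised"
    return "Questionable reliability; replace or revise the test"


print(interpret_reliability(0.85))
```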
Methods to Estimate the Reliability of a Test
To the same group of students (test administered twice):
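When the same test is given twice to the same group, one common way to estimate reliability is to correlate the two sets of scores. The sketch below uses the Pearson correlation as that estimate, which is an assumption here since the slides do not name a statistic; the scores and the function name `pearson_r` are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five students on two sittings of the same test
first_sitting = [55, 60, 65, 70, 80]
second_sitting = [58, 59, 68, 72, 78]

test_retest_r = pearson_r(first_sitting, second_sitting)
print(f"Test-retest estimate: {test_retest_r:.2f}")
```

A value near 1 suggests that students keep roughly the same rank order across the two sittings, i.e. the scores are consistent over time.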
Inter-rater and Intra-rater Reliability
Used for observation and oral presentation methods of evaluation
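Agreement between two raters can be checked in a few lines. The sketch below computes simple percent agreement and Cohen's kappa, which corrects for agreement expected by chance. Kappa is not named in the slides; it is offered here as one common choice. The ratings and function names are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of cases where both raters gave the same rating."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: sum over categories of the product of marginal proportions
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in set(rater_a) | set(rater_b)
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of ten oral presentations by two teachers
rater_a = ["pass", "pass", "fail", "pass", "fail",
           "pass", "pass", "fail", "pass", "fail"]
rater_b = ["pass", "pass", "fail", "fail", "fail",
           "pass", "pass", "fail", "pass", "pass"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"Agreement: {agreement:.2f}, kappa: {kappa:.2f}")
```

The same functions could be used for intra-rater reliability by passing one rater's first and second marking of the same work.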
7.5 Validity
WHAT IS VALIDITY?
E.g.
Types of Validity
Types | Description |
Construct V. | actual purpose like math achievement, map skills, reading comprehension |
Content V. | coverage of appropriate and necessary content for the purpose |
Criterion-Related V. | relating the scores obtained to the scores of some other criterion or other related test |
a) Predictive V. | high predictive - TOEFL, SAT |
b) Concurrent V. | Correlate with same skill - MUET & oral test |
Reliability & Validity
Reliability: Practical Strategies for Teachers
Examination of test content
Examination of test fairness
Examination of practical features
Limitation of the questionnaire
Validity: Practical Strategies for Teachers
Factors Affecting Reliability and Validity
The Standard Error of Measurement (SEM)
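The slide gives no formula for the SEM. A commonly used one is SEM = SD × √(1 − R), where SD is the standard deviation of the observed scores and R is the reliability coefficient. The sketch below applies it under that assumption; the function names and numbers are hypothetical.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """SEM = SD * sqrt(1 - R), using the commonly cited formula."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


def score_band(observed_score, sem):
    """Range of one SEM either side of the observed score."""
    return (observed_score - sem, observed_score + sem)


# Hypothetical test: SD of 10 marks and reliability of 0.91
sem = standard_error_of_measurement(score_sd=10.0, reliability=0.91)
low, high = score_band(60.0, sem)
print(f"SEM = {sem:.1f}; a score of 60 is best read as about {low:.0f} to {high:.0f}")
```

The lower the reliability, the wider the band around an observed score, which is why a single mark such as 60% should not be treated as an exact measure of ability.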
Individual Characteristics
External Characteristics
Sources of Measurement Error
Threats to Validity
Assessment Bias
Problem of Bias in Educational Assessment