1 of 28

EDA 1130 EDUCATIONAL ASSESSMENT: Chapters 3, 4 & 5 (Video B)

Dr Kim Teng Siang

kskim2007@gmail.com

0124661131

2 of 28

Chapter 3: Meaning of Test Scores

3 of 28

CHAPTER 3 OVERVIEW

  • What is a Test?
  • What is a test score?
  • Norm-referenced scores, criterion-referenced scores, or both
  • Qualitative description of scores

4 of 28

What is a Test?

  • Tests are tools that measure the quality and quantity of an individual's performance.
  • A test is a systematic procedure for measuring an individual's behavior.
  • It is a formal and systematic way of gathering information about individuals.

5 of 28

What is a Test Score?

  • A test score is a piece of information, usually a number, that conveys the performance of an examinee on a test.
  • One formal definition is that it is "a summary of the evidence contained in an examinee's responses to the items of a test that are related to the construct or constructs being measured."

6 of 28

Types of Scores

Raw scores

  • A raw score is a score without any adjustment or transformation, such as the simple number of questions answered correctly.

Scaled scores

  • A scaled score is the result of some transformation(s) applied to the raw score (a minimal sketch of one common scaling follows).
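To make the raw-to-scaled idea concrete, here is a minimal Python sketch of one standard linear scaling, the z-score/T-score transformation (T = 50 + 10z). The raw scores are invented, and real testing programmes define their own, often nonlinear, scales.

```python
# Minimal sketch: one standard way to turn raw scores into scaled scores.
# The linear z-score / T-score transformation below is only an example;
# operational testing programmes use their own (often nonlinear) scales.
from statistics import mean, pstdev

raw_scores = [12, 15, 18, 20, 25]   # invented number-correct scores

m = mean(raw_scores)
s = pstdev(raw_scores)              # population standard deviation

for raw in raw_scores:
    z = (raw - m) / s               # z-score: mean 0, SD 1
    t = 50 + 10 * z                 # T-score: mean 50, SD 10
    print(f"raw={raw:2d}  z={z:+.2f}  T={t:.1f}")
```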

7 of 28

Norm-Referenced & Criterion-Referenced Score Interpretation

Norm-Referenced Interpretations

  • Means that we are referencing how your score compares to other people.
  • Norm-referenced tests make comparisons between individuals.

Criterion-Referenced Interpretations

  • Means that we are referencing how your score compares to a criterion, such as a cut score or a body of knowledge.
  • Criterion-referenced tests measure a test taker's performance against a specific set of standards or criteria (a small sketch contrasting the two follows).
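A small Python sketch of the contrast, applied to the same raw scores; the names, scores, and the cut score of 65 are all invented for illustration.

```python
# The same raw scores read two ways: relative to the group (norm-referenced)
# and relative to a fixed standard (criterion-referenced).
scores = {"Aina": 45, "Ben": 60, "Chong": 72, "Devi": 88}
cut_score = 65  # criterion: minimum score counted as mastery

# Norm-referenced view: where does each student stand in the group?
all_scores = sorted(scores.values())
for name, s in scores.items():
    below = sum(1 for x in all_scores if x < s)
    print(f"{name}: scores higher than {100 * below / len(all_scores):.0f}% of the group")

# Criterion-referenced view: has each student met the standard?
for name, s in scores.items():
    status = "mastered" if s >= cut_score else "not yet mastered"
    print(f"{name}: {status} (cut score {cut_score})")
```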

8 of 28

Norm-Referenced Test vs Criterion-Referenced Test

Aim

  • Norm-referenced: compare a student's performance with that of other students; select students for certification.
  • Criterion-referenced: compare a student's performance against some criteria (e.g. learning outcomes); determine the extent to which the student has acquired the knowledge or skill; improve teaching and learning.

Types of questions

  • Norm-referenced: questions ranging from simple to difficult.
  • Criterion-referenced: questions of nearly similar difficulty relating to the criteria.

Reporting of results

  • Norm-referenced: grades are assigned.
  • Criterion-referenced: no grades are assigned (only whether the skill or knowledge is achieved or not).

Content coverage

  • Norm-referenced: wide content coverage.
  • Criterion-referenced: specific aspects of the content.

Examples

  • Norm-referenced: UPSR, PMR, SPM national examinations; end-of-semester / end-of-year exams.
  • Criterion-referenced: class tests, exercises and assignments.

9 of 28

Qualitative Description of Scores

10 of 28

Chapters 4 & 5: Reliability and Validity for Teachers

11 of 28

Chapters 4 & 5 Overview

  • What is Reliability?
  • The Reliability Coefficient
  • Methods to Estimate Reliability
  • Inter- and Intra-rater Reliability
  • Types of Validity
  • Practical Strategies for Teachers
  • Factors Affecting Reliability and Validity

12 of 28

Introduction

In any measurement or testing, we obtain a score, e.g. 60%.

Does this score reflect the true ability of the student?

Observed Score = True Score + Error

(what we see/measure) (real ability) (measurement error)

It is impossible to develop a test without error; what matters is that the error is small and consistent, and that the test really measures what it is intended to measure (the true score).

  • Reliability is the consistency of the measurement.

13 of 28

What is Reliability?

Observed Score = True Score + Error

(what we see/measure) (real ability) (measurement error)

A good test or instrument must have both reliability and validity.

  • Reliability is the consistency of the measurement: the error is stable and consistent.
  • Validity refers to the accuracy with which the instrument or test reflects true ability (it measures what it is intended to measure).

14 of 28

The Reliability Coefficient

  • The reliability coefficient (R) indicates how reliable an instrument or test is,
  • or how consistent the scoring or testing is.

  • It is the variance of the true score divided by the variance of the observed score (R ranges from 0 to 1).

Reliability Coefficient, R = True Score Variance / Observed Score Variance

R = 1 means a perfect test with no error (error shows up as variance; a worked example follows).
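In LaTeX, the classical test theory behind this slide, with a numeric example; the variances 80 and 20 are assumed purely for illustration.

```latex
% The observed score decomposes into a true score plus independent error,
% so the variances add:
X = T + E, \qquad \operatorname{Var}(X) = \operatorname{Var}(T) + \operatorname{Var}(E)

% Reliability is the share of observed-score variance that is true-score
% variance. With assumed values Var(T) = 80 and Var(E) = 20:
R = \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)} = \frac{80}{80 + 20} = 0.80
```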

15 of 28

Interpretation of Reliability Coefficients

Reliability (R) and its interpretation:

  • 0.90 and above: Excellent reliability (comparable to the best standardised tests, such as the SAT).
  • 0.80 – 0.90: Very good for a classroom test.
  • 0.70 – 0.80: Good for a classroom test, but there are probably a few items which could be improved.
  • 0.60 – 0.70: Somewhat low; there are probably some items which could be removed or improved.
  • 0.50 – 0.60: The test needs to be revised.
  • 0.50 and below: Questionable reliability; the test should be replaced or needs major revision.

16 of 28

Methods to Estimate the Reliability of a Test

To the same group of students (test given two times):

  • Test-Retest: repeat the same test after some time.
  • Parallel or Equivalent Forms: two equivalent tests (forms) with different items measuring the same level of knowledge, skills or attitude.

  • Internal Consistency (test given one time only):
    • Split-Half: the test is split into two equal parts for analysis.
    • Cronbach's Alpha: how individual questions correlate with the total test (commonly used, normally computed with software; see the sketch below).
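As a rough sketch only (the scores and item data are invented, and in practice these estimates come from statistics software), here is Python computing a test-retest correlation and Cronbach's alpha by hand, the latter using the standard formula alpha = k/(k-1) * (1 - sum of item variances / total-score variance):

```python
# Two common reliability estimates, computed by hand on invented data.
from statistics import mean, pvariance

# --- Test-retest: correlate the same students' scores on two occasions ---
first  = [55, 62, 70, 48, 81, 66]
second = [58, 60, 73, 50, 79, 70]

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pvariance(x) ** 0.5 * pvariance(y) ** 0.5)

print(f"test-retest r = {pearson(first, second):.2f}")

# --- Cronbach's alpha: internal consistency from one administration ---
# rows = students, columns = item scores (1 = correct, 0 = wrong)
items = [
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
k = len(items[0])                       # number of items
item_vars = [pvariance([row[j] for row in items]) for j in range(k)]
total_var = pvariance([sum(row) for row in items])

# alpha = k/(k-1) * (1 - sum of item variances / total-score variance)
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```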

17 of 28

Inter-rater and Intra-rater Reliability

For observation and oral-presentation methods of evaluation:

  • Inter-rater Reliability: the consistency of grading by two or more raters (see the sketch below).

  • Intra-rater Reliability: the consistency of grading by a single rater.
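A minimal Python sketch of inter-rater reliability, with two raters' grades for the same ten presentations invented for the example; it reports simple percent agreement plus Cohen's kappa, one common chance-corrected index.

```python
# Agreement between two raters grading the same presentations (A-E scale).
rater1 = ["A", "B", "B", "C", "A", "D", "B", "C", "C", "B"]
rater2 = ["A", "B", "C", "C", "A", "D", "B", "B", "C", "B"]

n = len(rater1)
observed = sum(a == b for a, b in zip(rater1, rater2)) / n  # raw agreement

# Cohen's kappa corrects for the agreement expected by chance alone.
categories = set(rater1) | set(rater2)
expected = sum((rater1.count(c) / n) * (rater2.count(c) / n)
               for c in categories)
kappa = (observed - expected) / (1 - expected)

print(f"percent agreement = {observed:.0%}")
print(f"Cohen's kappa     = {kappa:.2f}")
```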

18 of 28

What is Validity?

  • The extent to which a test measures what it was designed to measure.

E.g. you want to measure the Maths ability of Year 1 students:

  • Can you give an English test?
  • Or give a Form 5 Maths test?
  • Or include items asking them to spell a word or find the meaning of a word unrelated to Maths?

19 of 28

Types of Validity

  • Construct Validity: the test reflects its actual purpose, such as maths achievement, map skills, or reading comprehension.
  • Content Validity: the test covers the appropriate and necessary content for its purpose.
  • Criterion-Related Validity: the scores obtained are related to the scores on some other criterion or related test.
    a) Predictive Validity: scores predict later performance, e.g. TOEFL, SAT.
    b) Concurrent Validity: scores correlate with a current measure of the same skill, e.g. MUET and an oral test.

20 of 28

Reliability & Validity

21 of 28

Reliability: Practical Strategies for Teachers

  • Ensure students are familiar with the assessment format.
  • Provide a detailed pattern of the question paper to students before every examination.
  • Revisit the concepts to be covered in the assessment in each subject.
  • Use assessments that contain many questions (longer tests give more consistent scores).
  • Provide a consistent testing environment for the students.

22 of 28

Validity: Practical Strategies for Teachers

  • Examination of test content
  • Examination of test fairness
  • Examination of practical features
  • Limitations of the questionnaire

23 of 28

Factors Affecting Reliability and Validity

  • Length of the Test
  • Selection of Topics
  • Choice of Testing Techniques
  • Method of Test Administration
  • Method of Marking

24 of 28

The Standard Error of Measurement (SEM)

  • Provides an index of the reliability of an individual's score.
  • It is the standard deviation of the theoretical distribution of errors.
  • The more reliable a test, the smaller the SEM (a worked example follows).
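A worked example using the standard relation between the SEM, the test's standard deviation s, and its reliability R; the values s = 10 and R = 0.91 are assumed for illustration.

```latex
SEM = s\sqrt{1 - R}

% With assumed values s = 10 and R = 0.91:
SEM = 10\sqrt{1 - 0.91} = 10 \times 0.30 = 3

% Interpretation: roughly 68\% of the time the true score lies within one
% SEM of the observed score, e.g. 70 \pm 3 for an observed score of 70.
```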

25 of 28

Sources of Measurement Error

Individual Characteristics

  • Anxiety
  • Motivation
  • Health
  • Fatigue

External Characteristics

  • Environmental factors
  • Scoring errors
  • Biases
  • Sample size

26 of 28

Threats to Validity

27 of 28

Assessment Bias

  • It is present whenever one or more items on a test offend or unfairly penalize students because of those students' personal characteristics such as race, gender, socioeconomic status, or religion.

28 of 28

Problem of Bias in Educational Assessment

  • What is Bias?
  • Past and present concerns
  • Cultural bias & minorities
  • Bias in test contents and internal features of tests
  • Bias in predictive & external factors
  • Culture-free tests, cultural loading and cultural bias