1 of 28

EDA 1130 EDUCATIONAL ASSESSMENT: Chapters 3, 4 & 5 (Video B)

Dr Kim Teng Siang

kskim2007@gmail.com

0124661131

2 of 28

Chapter 3: Meaning of Test Scores

3 of 28

CHAPTER 3 OVERVIEW

  • What is a Test?
  • What is a test score?
  • Norm-referenced scores, criterion-referenced scores, or both
  • Qualitative description of scores

4 of 28

What is a Test?

  • Tests are tools that measure the quality and quantity of an individual's performance.
  • A test is a systematic procedure for measuring an individual's behavior.
  • It is a formal and systematic way of gathering information about individuals.

5 of 28

What is a Test Score?

  • A test score is a piece of information, usually a number, that conveys the performance of an examinee on a test.
  • One formal definition is that it is "a summary of the evidence contained in an examinee's responses to the items of a test that are related to the construct or constructs being measured."

6 of 28

Types of Scores

Raw scores

  • A raw score is a score without any adjustment or transformation, such as the simple number of questions answered correctly.

Scaled scores

  • A scaled score is the result of some transformation(s) applied to the raw score (a minimal sketch of one common scaling follows).
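To make the raw-to-scaled idea concrete, here is a minimal Python sketch of one standard linear scaling, the z-score/T-score transformation (T = 50 + 10z). The raw scores are invented, and real testing programmes define their own, often nonlinear, scales.

```python
# Minimal sketch: one standard way to turn raw scores into scaled scores.
# The linear z-score / T-score transformation below is only an example;
# operational testing programmes use their own (often nonlinear) scales.
from statistics import mean, pstdev

raw_scores = [12, 15, 18, 20, 25]   # invented number-correct scores

m = mean(raw_scores)
s = pstdev(raw_scores)              # population standard deviation

for raw in raw_scores:
    z = (raw - m) / s               # z-score: mean 0, SD 1
    t = 50 + 10 * z                 # T-score: mean 50, SD 10
    print(f"raw={raw:2d}  z={z:+.2f}  T={t:.1f}")
```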

7 of 28

Norm-Referenced & Criterion-Referenced Score Interpretation

Norm-Referenced Interpretations

  • Means that we are referencing how your score compares to other people.
  • Norm-referenced tests make comparisons between individuals.

Criterion-Referenced Interpretations

  • Means that we are referencing how your score compares to a criterion, such as a cut score or a body of knowledge.
  • Criterion-referenced tests measure a test taker's performance against a specific set of standards or criteria (a small sketch contrasting the two follows).
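A small Python sketch of the contrast, applied to the same raw scores; the names, scores, and the cut score of 65 are all invented for illustration.

```python
# The same raw scores read two ways: relative to the group (norm-referenced)
# and relative to a fixed standard (criterion-referenced).
scores = {"Aina": 45, "Ben": 60, "Chong": 72, "Devi": 88}
cut_score = 65  # criterion: minimum score counted as mastery

# Norm-referenced view: where does each student stand in the group?
all_scores = sorted(scores.values())
for name, s in scores.items():
    below = sum(1 for x in all_scores if x < s)
    print(f"{name}: scores higher than {100 * below / len(all_scores):.0f}% of the group")

# Criterion-referenced view: has each student met the standard?
for name, s in scores.items():
    status = "mastered" if s >= cut_score else "not yet mastered"
    print(f"{name}: {status} (cut score {cut_score})")
```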

8 of 28

Norm-Referenced Test vs Criterion-Referenced Test

Aim

  • Norm-referenced: compare a student's performance with that of other students; select students for certification.
  • Criterion-referenced: compare a student's performance against some criteria (e.g. learning outcomes); determine the extent to which the student has acquired the knowledge or skill; improve teaching and learning.

Types of questions

  • Norm-referenced: questions ranging from simple to difficult.
  • Criterion-referenced: questions of nearly similar difficulty relating to the criteria.

Reporting of results

  • Norm-referenced: grades are assigned.
  • Criterion-referenced: no grades are assigned (only whether the skill or knowledge is achieved or not).

Content coverage

  • Norm-referenced: wide content coverage.
  • Criterion-referenced: specific aspects of the content.

Examples

  • Norm-referenced: UPSR, PMR, SPM national examinations; end-of-semester / end-of-year exams.
  • Criterion-referenced: class tests, exercises and assignments.

9 of 28

Qualitative Description of Scores

10 of 28

Chapters 4 & 5: Reliability and Validity for Teachers

11 of 28

Chapters 4 & 5 Overview

  • What is Reliability?
  • The Reliability Coefficient
  • Methods to Estimate Reliability
  • Inter- and Intra-rater Reliability
  • Types of Validity
  • Practical Strategies for Teachers
  • Factors Affecting Reliability and Validity

12 of 28

Introduction

In any measurement or testing, we obtain a score, e.g. 60%.

Does this score reflect the true ability of the student?

Observed Score = True Score + Error

(what we see/measure) (real ability) (measurement error)

It is impossible to develop a test without error; what matters is that the error is small and consistent, and that the test really measures what it is intended to measure (the true score).

  • Reliability is the consistency of the measurement.

13 of 28

What is Reliability?

Observed Score = True Score + Error

(what we see/measure) (real ability) (measurement error)

A good test or instrument must have both reliability and validity.

  • Reliability is the consistency of the measurement: the error is stable and consistent.
  • Validity refers to the accuracy with which the instrument or test reflects true ability (it measures what it is intended to measure).

14 of 28

The Reliability Coefficient

  • The reliability coefficient (R) indicates how reliable an instrument or test is,
  • or how consistent the scoring or testing is.

  • It is the variance of the true score divided by the variance of the observed score (R ranges from 0 to 1).

Reliability Coefficient, R = True Score Variance / Observed Score Variance

R = 1 means a perfect test with no error (error shows up as variance; a worked example follows).
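In LaTeX, the classical test theory behind this slide, with a numeric example; the variances 80 and 20 are assumed purely for illustration.

```latex
% The observed score decomposes into a true score plus independent error,
% so the variances add:
X = T + E, \qquad \operatorname{Var}(X) = \operatorname{Var}(T) + \operatorname{Var}(E)

% Reliability is the share of observed-score variance that is true-score
% variance. With assumed values Var(T) = 80 and Var(E) = 20:
R = \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)} = \frac{80}{80 + 20} = 0.80
```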

15 of 28

Interpretation of Reliability Coefficients

Reliability (R) and its interpretation:

  • 0.90 and above: Excellent reliability (comparable to the best standardised tests, such as the SAT).
  • 0.80 – 0.90: Very good for a classroom test.
  • 0.70 – 0.80: Good for a classroom test, but there are probably a few items which could be improved.
  • 0.60 – 0.70: Somewhat low; there are probably some items which could be removed or improved.
  • 0.50 – 0.60: The test needs to be revised.
  • 0.50 and below: Questionable reliability; the test should be replaced or needs major revision.

16 of 28

Methods to Estimate the Reliability of a Test

To the same group of students (test given two times):

  • Test-Retest: repeat the same test after some time.
  • Parallel or Equivalent Forms: two equivalent tests (forms) with different items measuring the same level of knowledge, skills or attitude.

  • Internal Consistency (test given one time only):
    • Split-Half: the test is split into two equal parts for analysis.
    • Cronbach's Alpha: how individual questions correlate with the total test (commonly used, normally computed with software; see the sketch below).
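As a rough sketch only (the scores and item data are invented, and in practice these estimates come from statistics software), here is Python computing a test-retest correlation and Cronbach's alpha by hand, the latter using the standard formula alpha = k/(k-1) * (1 - sum of item variances / total-score variance):

```python
# Two common reliability estimates, computed by hand on invented data.
from statistics import mean, pvariance

# --- Test-retest: correlate the same students' scores on two occasions ---
first  = [55, 62, 70, 48, 81, 66]
second = [58, 60, 73, 50, 79, 70]

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pvariance(x) ** 0.5 * pvariance(y) ** 0.5)

print(f"test-retest r = {pearson(first, second):.2f}")

# --- Cronbach's alpha: internal consistency from one administration ---
# rows = students, columns = item scores (1 = correct, 0 = wrong)
items = [
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
k = len(items[0])                       # number of items
item_vars = [pvariance([row[j] for row in items]) for j in range(k)]
total_var = pvariance([sum(row) for row in items])

# alpha = k/(k-1) * (1 - sum of item variances / total-score variance)
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```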

17 of 28

Inter-rater and Intra-rater Reliability

For observation and oral-presentation methods of evaluation:

  • Inter-rater Reliability: the consistency of grading by two or more raters (see the sketch below).

  • Intra-rater Reliability: the consistency of grading by a single rater.
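A minimal Python sketch of inter-rater reliability, with two raters' grades for the same ten presentations invented for the example; it reports simple percent agreement plus Cohen's kappa, one common chance-corrected index.

```python
# Agreement between two raters grading the same presentations (A-E scale).
rater1 = ["A", "B", "B", "C", "A", "D", "B", "C", "C", "B"]
rater2 = ["A", "B", "C", "C", "A", "D", "B", "B", "C", "B"]

n = len(rater1)
observed = sum(a == b for a, b in zip(rater1, rater2)) / n  # raw agreement

# Cohen's kappa corrects for the agreement expected by chance alone.
categories = set(rater1) | set(rater2)
expected = sum((rater1.count(c) / n) * (rater2.count(c) / n)
               for c in categories)
kappa = (observed - expected) / (1 - expected)

print(f"percent agreement = {observed:.0%}")
print(f"Cohen's kappa     = {kappa:.2f}")
```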

18 of 28

What is Validity?

  • The extent to which a test measures what it was designed to measure.

E.g. you want to measure the Maths ability of Year 1 students:

  • Can you give an English test?
  • Or give a Form 5 Maths test?
  • Or include items asking them to spell a word or find the meaning of a word unrelated to Maths?

19 of 28

Types of Validity

  • Construct Validity: the test reflects its actual purpose, such as maths achievement, map skills, or reading comprehension.
  • Content Validity: the test covers the appropriate and necessary content for its purpose.
  • Criterion-Related Validity: the scores obtained are related to the scores on some other criterion or related test.
    a) Predictive Validity: scores predict later performance, e.g. TOEFL, SAT.
    b) Concurrent Validity: scores correlate with a current measure of the same skill, e.g. MUET and an oral test.

20 of 28

Reliability & Validity

21 of 28

Reliability: Practical Strategies for Teachers

  • Ensure students are familiar with the assessment format.
  • Provide a detailed pattern of the question paper to students before every examination.
  • Revisit the concepts to be covered in the assessment in each subject.
  • Use assessments that contain many questions (longer tests give more consistent scores).
  • Provide a consistent testing environment for the students.

22 of 28

Validity: Practical Strategies for Teachers

  • Examination of test content
  • Examination of test fairness
  • Examination of practical features
  • Limitations of the questionnaire

23 of 28

Factors Affecting Reliability and Validity

  • Length of the Test
  • Selection of Topics
  • Choice of Testing Techniques
  • Method of Test Administration
  • Method of Marking

24 of 28

The Standard Error of Measurement (SEM)

  • Provides an index of the reliability of an individual's score.
  • It is the standard deviation of the theoretical distribution of errors.
  • The more reliable a test, the smaller the SEM (a worked example follows).
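A worked example using the standard relation between the SEM, the test's standard deviation s, and its reliability R; the values s = 10 and R = 0.91 are assumed for illustration.

```latex
SEM = s\sqrt{1 - R}

% With assumed values s = 10 and R = 0.91:
SEM = 10\sqrt{1 - 0.91} = 10 \times 0.30 = 3

% Interpretation: roughly 68\% of the time the true score lies within one
% SEM of the observed score, e.g. 70 \pm 3 for an observed score of 70.
```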

25 of 28

Sources of Measurement Error

Individual Characteristics

  • Anxiety
  • Motivation
  • Health
  • Fatigue

External Characteristics

  • Environmental factors
  • Scoring errors
  • Biases
  • Sample size

26 of 28

Threats to Validity

27 of 28

Assessment Bias

  • It is present whenever one or more items on a test offend or unfairly penalize students because of those students' personal characteristics such as race, gender, socioeconomic status, or religion.

28 of 28

Problem of Bias in Educational Assessment

  • What is Bias?
  • Past and present concerns
  • Cultural bias & minorities
  • Bias in test contents and internal features of tests
  • Bias in predictive & external factors
  • Culture-free tests, cultural loading and cultural bias