1 of 26

Using Rubrics in High School EFL Learners' Essays

Prepared by

Tan Liang Ye

2 of 26

Content

1. Literature Review

2. Context and what I did

3. Reflections

4. How to create a rubric

3 of 26

Same batch of students for 2 years,

starting from when they were 2nd graders.

Genre: Argumentative essays

4 of 26

Literature Review

5 of 26

Observed behavior

(Essay)

Rater effects

True

achievement

estimate

Halo effect

Severity effect

Central tendency effect

Inter-rater reliability

(see Eckes 2008)

Cannot be observed

Rating

Instrument

Brunswik's Lens Model Framework

Rater-mediated assessment

Automated Writing Evaluation (AWE) tools (Stevenson & Phakiti 2014)

Validity of instrument (Dawson 2015; Susser 2010; Yamanishi et al. 2019)

Rubric as a feedback tool (Allen & Tanner 2006; Wang 2017; Becker 2016)

Psychometrics

→ Best-fit statistical modeling

(e.g. Many-facet Rasch measurement) (Eckes 2008; Wind & Walker 2019)

Lexical features

(Vögelin et al. 2019; Lim 2019)

→ their impact on raters

Judgement

(Grade)

6 of 26

History of Rubrics

Early rubric in L1 writing

Designed for administrators to increase inter-rater reliability (Brooks 2012)

“Process approach” in the US in the 1970s

Became a feedback tool in L1 writing (Brooks 2012)

Recently entered the L2 classrooms

see Ene & Kosobucki (2016); Becker (2016)

7 of 26

Arguments for & against rubrics

Cons

One-size-fits-all (Brooks 2012)

Creative work may not fit well and may be penalized as a result

Limits learning and rater judgement if poorly designed

(Jeong 2015; Wolf & Stevens 2007)

Differences in interpretation by raters (see Eckes 2008)

Pros

Clear learning goals

(Wolf & Stevens 2007)

Fairer for students

(Wolf & Stevens 2007)

Makes expectations clear

Helps students improve their metacognition (Andrade 2018)

Thinking about thinking

Evidence of improved inter-rater reliability (Jeong 2015; Barkaoui 2010)

The cons seem to be mostly related to design issues

8 of 26

Usefulness of the rubric is limited by its design

(Brookhart 2018; Brooks 2012; Wolf & Stevens 2007)

9 of 26

2 Basic Types of Rubric

Holistic (summative)
Provides a number or grade that summarizes essay quality

Analytic (formative)
Provides info on level of mastery (columns) of different elements (rows)
Evidence seems to indicate that this is preferable

10 of 26

What I did in the first year…

11 of 26

Introduction

Conclusion

Body

5-sentence essay

Logic

12 of 26

What I did in the second year…

13 of 26

Introduction

Conclusion

Body

How to

compare & contrast

14 of 26

Introduction

Conclusion

Body

How to write a rebuttal through debates

15 of 26

Introduction

Conclusion

Body

How to write a conclusion through teaching summary skills

5 Types of evidence

16 of 26

600-word full essay by the end of the year

Introduction

Conclusion

Body

17 of 26

Reflections

Things I wish I had done:

Class discussion about assessment criteria before handing them out (Wolf & Stevens 2007)
Hand out the rubrics before students start writing

Might be possible if the same course is conducted a second time

18 of 26

Reflections

Things that could be an obstacle:

Comprehensibility of rubrics

Mixed-level classes: Lower-level students may need Japanese for complex rubrics

Inter-rater differences

Differences in interpretation (even after testing the rubric together)
Differences in priority: Accuracy vs Fluency
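
One rough way to see how far two raters diverge after a joint test-run is to compare their grades on the same essays. The slides do not say how agreement was checked, and the literature cited earlier uses Many-facet Rasch measurement; the sketch below swaps in two much simpler measures, percent agreement and Cohen's kappa. The grade lists and the names `rater_jte` and `rater_alt` are made up for illustration.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of essays on which both raters gave the same grade."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters must grade the same non-empty set of essays")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability that both raters pick the same grade
    # if each simply followed their own overall grade distribution.
    p_expected = sum(count_a[g] * count_b[g] for g in set(a) | set(b)) / (n * n)
    if p_expected == 1:
        return 1.0  # both raters used one identical grade throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical grades from two raters on the same six essays.
rater_jte = ["A", "B", "B", "C", "D", "B"]
rater_alt = ["A", "B", "C", "C", "D", "C"]

print(percent_agreement(rater_jte, rater_alt))
print(cohens_kappa(rater_jte, rater_alt))
```

A low kappa on one rubric row (for example, accuracy) but not on others would suggest that the descriptors for that row are the ones to reword with co-teachers.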

19 of 26

Guiding principles

Specific, observable, measurable

Comprehensible to students

20 of 26

4 Steps to Creating a Rubric

Step 1

Think about your course objective

List 3 to 6 key elements you check in essays

Step 2

Use an even number of columns

Avoids the “central tendency” effect (no middle column to default to)

21 of 26

4 Steps to Creating a Rubric

Step 3

Start from the middle

What is the student in the 50th percentile likely to produce?
What quality would you consider to be the “passing grade”?

Step 4

Test the rubric

Pick a couple of essays around the 50th percentile for a test-run
Test out the rubric with co-teachers → make criteria as clear as possible

22 of 26

Special thanks

Mr. Kazuhiro Iguchi

Ms. Ritsuko Rita

Mr. Kazunori Yamagishi

Abstract submitted to JALT (2019):

Short Summary

The efficacy of using rubrics to assess student writing has been much debated. Criticisms include a lack of inter-rater reliability, incomprehensibility to EFL students, and their cookie-cutter approach to essay evaluation. However, rubrics can be handy for providing student feedback while allowing teachers to grade large numbers of essays. This workshop covers general guidelines on how to design a rubric, and how to use rubrics as “shifting goalposts” that grow as students’ writing ability develops.

Abstract

This workshop is the result of two years of trial-and-error experimentation with using rubrics to evaluate student writing in the EFL writing classroom. Analysis of the use of rubrics in L2 classes is still in its infancy (Brooks, 2012). There has also been some debate on the efficacy of using rubrics to assess student writing. Criticisms cited in the literature include (1) a lack of inter-rater reliability, (2) incomprehensibility to EFL students, and (3) their cookie-cutter approach to essay evaluation. These problems are compounded by differences in interpretation and expectations between Japanese Teachers of English (JTE) and L1-speaking Assistant Language Teachers (ALT). Despite these problems, well-designed rubrics can be a powerful and handy tool for providing student feedback, apart from the obvious benefit of allowing teachers to grade large numbers of essays quickly. In this workshop, I shall touch on (1) general guidelines on how to design a rubric, and (2) how to create “live” rubrics that can function as “shifting goalposts” that grow in complexity as students’ writing ability develops over time.

23 of 26

24 of 26

Summary:

Using Brunswik’s (1956) Lens Model Framework on rater-mediated assessment, the workshop presents a step-by-step, hands-on introduction to how EFL teachers can design scoring rubrics that function as a feedback/learning tool as well as a means of grading assignments. Rubrics are broadly divided into holistic and analytic types, of which the latter provide a detailed and transparent breakdown of learning goals. The guidelines for constructing an analytic rubric are: (1) define course/lesson objectives and list around 3-5 key elements; (2) start from the middle by considering what the average student is capable of; and (3) create an even number of grading categories (e.g., A, B, C, D) to avoid the central tendency effect. Other considerations include (1) inter-rater reliability (if more than one teacher will be using the rubric); (2) validity (i.e., whether the elements are assessed accurately); and (3) how the rubric will be used, such as for peer correction, providing feedback, and/or creating a scoring benchmark. As the course progresses in difficulty, the rubric can evolve as a form of “shifting goalpost”, where teacher expectations are clearly elucidated over the course of the writing programme, so as to help students achieve self-regulation while developing their writing skills.

25 of 26

References

Allen, D. & Tanner, K. (2006). Rubrics: Tools for making learning goals and evaluation criteria explicit for both teachers and learners. CBE Life Sciences Education, 5(3), 197–203. https://doi.org/10.1187/cbe.06-06-0168

Andrade, H. L. (2018). Feedback in the context of self-assessment. In A. A. Lipnevich & J. K. Smith (Eds.), The Cambridge handbook of instructional feedback (pp. 376–408). Cambridge University Press. https://doi.org/10.1017/9781316832134.019

Barkaoui, K. (2010). Variability in ESL essay rating processes: The role of the rating scale and rater experience. Language Assessment Quarterly, 7(1), 54-74. https://doi.org/10.1080/15434300903464418

Becker, A. (2016). Student-generated scoring rubrics: Examining their formative value for improving ESL students’ writing performance. Assessing Writing, 29, 15-24. https://doi.org/10.1016/j.asw.2016.05.002

Brookhart, S. M. (2018). Appropriate Criteria: Key to Effective Rubrics. Frontiers in Education, 3(22), 1-12. https://doi.org/10.3389/feduc.2018.00022

Brooks, G. (2012). Assessment and Academic Writing: A Look at the Use of Rubrics in the Second Language Writing Classroom. Kwansei Gakuin University Humanities Review, 17, 227-240.

Dawson, P. (2015). Assessment rubrics: towards clearer and more replicable design, research and practice. Assessment & Evaluation in Higher Education, 42(3), 347-360. https://doi.org/10.1080/02602938.2015.1111294

Eckes, T. (2008). Rater types in writing performance assessments: A classification approach to rater variability. Language Testing, 25(2), 155–185. https://doi.org/10.1177/0265532207086780

Ene, E. & Kosobucki, V. (2016). Rubrics and corrective feedback in ESL writing: A longitudinal case study of an L2 writer. Assessing Writing, 30, 3-20. https://doi.org/10.1016/j.asw.2016.06.003

26 of 26

References

Jeong, H. (2015). What is your teacher rubric? Extracting teachers’ assessment constructs. Practical Assessment, Research, and Evaluation, 20(6), 1-13. https://doi.org/10.7275/m3sa-p692

Wolf, K. & Stevens, E. (2007). The Role of Rubrics in Advancing and Assessing Student Learning. The Journal of Effective Teaching, 7(1), 3-14.

Lim, J. (2019). An investigation of the text features of discrepantly-scored ESL essays: A mixed methods study. Assessing Writing, 39, 1-13. https://doi.org/10.1016/j.asw.2018.10.003

Stevenson, M. & Phakiti, A. (2014). The effects of computer-generated feedback on the quality of writing. Assessing Writing, 19, 51-65. https://doi.org/10.1016/j.asw.2013.11.007

Susser, B. (2010). Problems in assessing EFL writing on high-stakes tests: A guide to the research. 同志社女子大学　総合文化研究所紀要, 27, 44-62.

Vögelin, C., Jansen, T., Keller, S. D., Machts, N., & Möller, J. (2019). The influence of lexical features on teacher judgements of ESL argumentative essays. Assessing Writing, 39, 50-63. https://doi.org/10.1016/j.asw.2018.12.003

Wang, W. (2017). Using rubrics in student self-assessment: student perceptions in the English as a foreign language writing context. Assessment & Evaluation in Higher Education, 42(8), 1280-1292. https://doi.org/10.1080/02602938.2016.1261993

Wind, S. & Walker, A. A. (2019). Exploring the correspondence between traditional score resolution methods and person fit indices in rater-mediated writing assessments. Assessing Writing, 39, 25-38. https://doi.org/10.1016/j.asw.2018.12.002

Yamanishi, H., Ono, M., & Hijikata, Y. (2019). Developing a scoring rubric for L2 summary writing: A hybrid approach combining analytic and holistic assessment. Language Testing in Asia, 9(13), 1–22. https://doi.org/10.1186/s40468-019-0087-6