1 of 23

A FREE HUMAN-AI INTEGRATED TEXT-READABILITY TOOL

Making the Best of Teacher Intuition and Corpus Research

Ryan Spring

Tohoku University

2 of 23

QUESTION!

How hard is this sentence to read / understand?

Bow before General Zod!!!

3 of 23

ANSWER:

It depends:

  • What is your CEFR level?
  • Do you know Superman?
  • Are you Japanese?

Bow before General Zod!!!

4 of 23

MY POINT:

  • Vocabulary knowledge is somewhat generalizable, but:
  • There are learner differences

How do we handle this??

5 of 23

I DON’T TRUST HUMAN RATERS

  • I know too many human raters

6 of 23

I DON’T TRUST PURE AI SOLUTIONS

  • I know too many AI solutions

7 of 23

MAYBE I JUST HAVE TRUST ISSUES … OR

Over-reliance on JUST humans

Over-reliance on JUST AI

Human-AI integration

8 of 23

INTRODUCTION AND PREVIOUS STUDIES

  • Overly difficult / overly easy text can be detrimental to learning (e.g., Nation, 2014)

  • Some studies suggest instructional texts should have at least 95% familiar vocabulary (Hu & Nation, 2000; Laufer & Ravenhorst-Kalovski, 2010)

  • Known / unknown vocabulary can also affect listening (Sakurai, 2018)

  • Teachers should know which vocabulary to teach before lessons (e.g., Leis, 2021; Pinchbeck, 2019)

9 of 23

VOCABULARY LISTS

  • New General Service List provides good clues (Browne, 2014)
  • Vocabulary and word level checkers can help teachers know general levels (e.g., Someya, 2006)
  • However, many generalized lists are based on FREQUENCY, so:
  • are not always appropriate for L1 <Japanese> learners (McLean, 2018; Mizumoto et al., 2021)

10 of 23

L1 SPECIFIC VOCAB LISTS:

Some great lists with L1 Japanese Learners in mind:

  • CEFR-J (Tono, 2013; 2019)
  • SEWK-J (Pinchbeck, in press)
  • JACET8000 (JACET, 2003)

11 of 23

VOCABULARY PROFILER TOOLS:

  • New Word Level Checker (Mizumoto et al., 2021)
  • VocabProfiler
  • Word Level Checker (Someya, 2006)

Very helpful for getting a basic guess of most words in a text!

But:

→ Pure AI

→ I like Human-AI integration

12 of 23

WHY HUMAN-AI INTEGRATION?

  • Often makes the work easier for a human
  • Often more accurate than either human or AI alone (Spring, 2023)

E.g.

  • Did you already cover vocabulary in a previous lesson? (Dictogloss series)
  • Is there a translation equivalent or is it inherently L1? (business, ninja)
  • Is there a new fad/trend that makes the students know the word?

13 of 23

FADS CHANGE TOO QUICKLY FOR VOCAB LISTS!

14 of 23

PUT THE TEACHER BACK IN THE EQUATION

https://springsenglish.online/textCheck/textChecker.html

Metrics Provided

Number of Words

Mean Length of Sentence

Number of Different Words / CTTR

Flesch-Kincaid (large-grained)

%Coverage by NGSL

%Coverage by various CEFR-J Lists
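The surface metrics listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind textChecker: the tokenizer, the sentence splitter, and the vowel-group syllable counter are all simplifications chosen for this example, and the function names are hypothetical. CTTR is computed as types / sqrt(2 × tokens), and the Flesch-Kincaid grade level uses the standard formula 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59.

```python
import math
import re


def tokenize(text):
    # Assumption: a word is a run of letters, optionally with internal apostrophes.
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def split_sentences(text):
    # Assumption: a sentence ends at ., ! or ? (a run of them counts once).
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def count_syllables(word):
    # Crude heuristic: one syllable per group of vowels, minimum of one.
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def surface_metrics(text):
    words = tokenize(text)
    sentences = split_sentences(text)
    types = set(words)
    syllables = sum(count_syllables(w) for w in words)
    mean_len = len(words) / len(sentences)
    return {
        "words": len(words),
        "mean_sentence_length": mean_len,
        "different_words": len(types),
        # Corrected type-token ratio: types / sqrt(2 * tokens)
        "cttr": len(types) / math.sqrt(2 * len(words)),
        # Flesch-Kincaid grade level
        "fk_grade": 0.39 * mean_len + 11.8 * syllables / len(words) - 15.59,
    }


metrics = surface_metrics("Bow before General Zod!!!")
print(metrics)
```

Because the syllable count is heuristic, the Flesch-Kincaid value from a sketch like this should be read as a rough, large-grained estimate only.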

Extremely primitive checker but:

15 of 23

IMPORTANTLY!

  • Word Designation is CHANGEABLE!!
  • You can decide what your learners know / don’t know and recalculate

Bow before General Zod!!!

NWL Checker: 100% Covered at B1 level (which words?)

textChecker: 75% Covered at B1 level:

“Zod” is unknown

“bow” and “General” at B1

-> Can be adjusted!
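The idea of changeable word designations can be sketched as a coverage calculation with a teacher override. This is a minimal illustration, not textChecker's implementation: the mini word list, the `coverage` function, and its `known_words` parameter are hypothetical. The B1 levels for "bow" and "general" follow the slide; the A1 level for "before" is an assumption made for this example.

```python
import re

LEVELS = ["A1", "A2", "B1", "B2"]

# Hypothetical mini word list. "bow" and "general" at B1 follow the slide;
# the level for "before" is an assumption for illustration.
WORD_LEVELS = {"before": "A1", "bow": "B1", "general": "B1"}


def coverage(text, level, word_levels, known_words=frozenset()):
    """Percent of tokens at or below `level`, counting teacher-marked words as known."""
    tokens = re.findall(r"[a-z]+", text.lower())
    cutoff = LEVELS.index(level)
    covered = 0
    for token in tokens:
        if token in known_words:
            # Teacher override: learners are judged to know this word.
            covered += 1
        elif token in word_levels and LEVELS.index(word_levels[token]) <= cutoff:
            covered += 1
    return 100 * covered / len(tokens)


sentence = "Bow before General Zod!!!"
default = coverage(sentence, "B1", WORD_LEVELS)
adjusted = coverage(sentence, "B1", WORD_LEVELS, known_words={"zod"})
print(default, adjusted)
```

With this toy list, the default B1 coverage is 75% because "Zod" is not on any list; once the teacher marks "Zod" as known, the recalculated coverage is 100%.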

16 of 23

EXAMPLE IN ACTION (1)

The earliest evidence of human occupation in North Carolina dates back 10,000 years, found at the Hardaway Site. North Carolina was inhabited by Carolina Algonquian, Iroquoian, and Siouan speaking tribes of Native Americans prior to the arrival of Europeans. King Charles II granted eight lord proprietors a colony they named Carolina after the king and which was established in 1670 with the first permanent settlement at Charles Town (Charleston).

17 of 23

EXAMPLE IN ACTION (2)

The earliest evidence of human occupation in North Carolina dates back 10,000 years, found at the Hardaway Site. North Carolina was inhabited by Carolina Algonquian, Iroquoian, and Siouan speaking tribes of Native Americans prior to the arrival of Europeans. King Charles II granted eight lord proprietors a colony they named Carolina after the king and which was established in 1670 with the first permanent settlement at Charles Town (Charleston).

               NWL Checker    textChecker (start)    textChecker (with teacher)
A1 Coverage    72.46%         50.72%                 73.91%
A2 Coverage    84%            65.22%                 88.41%
B1 Coverage    92.75%         71.01%                 94.20%

I took students to my home; they actually visited many of these places and tribes, so they know them.

18 of 23

TEACHERS MAKE THEIR OWN MATERIALS

In Japan, there are several types of snakes that live in different parts of the country. One common snake is called the Japanese rat snake, which is often found in forests and fields. These snakes are non-venomous and help control the population of rodents like mice and rats, which is good for farmers. 

Another snake you might see in Japan is the Japanese pit viper. It's venomous and can be dangerous, but it's not very aggressive and usually only bites when it feels threatened. These snakes are usually found in mountainous areas and forests.

 In Japanese culture, snakes are often seen as symbols of good luck and protection. You might see images of snakes on traditional clothing or in artwork.

If you're hiking or exploring nature in Japan, it's important to be aware of snakes and know how to stay safe. You can do this by wearing sturdy shoes and long pants, as well as being careful where you step and not putting your hands in places where you can't see.

If you encounter a snake, it's best to give it space and not try to touch it. Most snakes won't bother you if you leave them alone. If you do get bitten by a snake, it's important to seek medical help right away, especially if you think it might be venomous.

Overall, snakes are an important part of the ecosystem in Japan and play a role in keeping the balance of nature. By understanding and respecting these creatures, we can coexist with them safely.

This text has about 90-92% B1 coverage for most students, but for Tohoku Uni. students it is 93% at A2 & 95% at B1 (it contains many vocabulary words and word parts they have studied)

19 of 23

THIS IS IMPORTANT WHEN

  • Making tests
  • Making quizzes
  • Making reading texts
  • Making listening scripts
  • Etc.

20 of 23

WE SHOULD CONSIDER VOCAB. LEVEL

  • When we make high-stakes tests, we check and argue:
  • Shouldn’t we give our materials the same treatment?
  • Keep the teacher as part of the equation: Human-AI

  • Don’t have to use my tool, but let’s check!

21 of 23

WHERE CAN I GET THESE TOOLS?

Mizumoto et al. (2021):

→ Provides checks on several lists, makes AI assumptions based on LLM

https://nwlc.pythonanywhere.com/

textChecker

Uses “dumb” NLP, but with enhanced human integration

https://springsenglish.online/textCheck/textChecker.html

22 of 23

FUTURE WORK:

  • Perhaps integration with Mizumoto et al.
  • Including other clues of text complexity (cohesion, lexical density, etc.)

23 of 23

REFERENCES

Browne, C. (2014). A New General Service List: The better mousetrap we’ve been looking for? Vocabulary Learning and Instruction, 3(2), 1–10.

Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, 32(3), 221–233.

Hu, M., & Nation, P. (2000). Unknown vocabulary density and reading comprehension. Reading in a Foreign Language, 13(1), 403–430.

JACET Kihongo Kaitei Iinkai (JACET, Committee for Revision of the JACET Wordlist). (2003). JACET list of 8000 basic words. JACET.

Laufer, B., & Ravenhorst-Kalovski, G. C. (2010). Lexical threshold revisited: Lexical text coverage, learners’ vocabulary size and reading comprehension. Reading in a Foreign Language, 22(1), 15–30.

McLean, S. (2018). Evidence for the adoption of the flemma as an appropriate word counting unit. Applied Linguistics, 39(6), 823–845. https://doi.org/10.1093/applin/amw050

Mizumoto, A., Pinchbeck, G. G., & McLean, S. (2021). Comparisons of word lists on new word level checker. Vocabulary Learning and Instruction, 10(2), 30–41. https://doi.org/10.7820/vli.v10.2.mizumoto

Nation, P. (2014). How much input do you need to learn the most frequent 9,000 words? Reading in a Foreign Language, 26(2), 1–16.

Pinchbeck, G. G. (2019). Validating the construct of readability in EFL contexts: A proposal for criteria. Vocabulary Learning and Instruction, 8(1), 8–16.

Spring, R. (2023). A human-AI integrated rating scheme for improving second language writing: The case of Japanese learners of English for general academic purposes. Reports Vol. 15 of LET Methodology Special Interest Group (pp. 22–43).

Tono, Y. (ed.) (2013). The CEFR-J Handbook. Taishukan.