A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | AA | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Index name | Formula Name | Formula | Variable derived from | References | ||||||||||||||||||||||
2 | Variable (i) | Coefficient | |||||||||||||||||||||||||
3 | Flesch-Reading-Ease | Flesch Reading Ease Formula | Constant | 206.835 | N/A | Flesch, R. (1948). A new readability yardstick. Journal of applied psychology, 32(3), 221. | |||||||||||||||||||||
4 | Average number of words per sentence | - (i x 1.015) | spaCy | ||||||||||||||||||||||||
5 | Average number of syllables per word | - (i x 84.6) | custom Python code | ||||||||||||||||||||||||
6 | Flesch-Kincaid-Readability | Flesch Kincaid Grade Level Formula | Constant | -15.59 | N/A | Kincaid, J. P., Fishburne Jr, R. P., Rogers, R. L., & Chissom, B. S. (1975). Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel. | |||||||||||||||||||||
7 | Average number of words per sentence | i x 0.39 | spaCy | ||||||||||||||||||||||||
8 | Average number of syllables per word | i x 11.8 | custom Python code | ||||||||||||||||||||||||
9 | Automated-Reading-Index | Automated Readability Index | Constant | -21.43 | N/A | Kincaid, J. P., Fishburne Jr, R. P., Rogers, R. L., & Chissom, B. S. (1975). Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel. | |||||||||||||||||||||
10 | Average number of words per sentence | i x 0.5 | spaCy | ||||||||||||||||||||||||
11 | Average number of characters per word | i x 4.71 | custom Python code | ||||||||||||||||||||||||
12 | SMOG Readability Formula | SMOG Grading | Constant | 3 | N/A | McLaughlin, G. H. (1969). SMOG grading - a new readability formula. Journal of reading, 12(8), 639-646. | |||||||||||||||||||||
13 | Square root of the number of polysyllabic words per 30 sentences | i x 1 | spaCy | ||||||||||||||||||||||||
14 | New Dale-Chall Readability Formula | New Dale-Chall Readability Formula | Average number of words per sentence | i x 0.0496 | spaCy | Chall, J. S., & Dale, E. (1995). Readability revisited: The new Dale-Chall readability formula. Brookline Books. | |||||||||||||||||||||
15 | Percentage of difficult words | i x 0.1579 | ReaderBench | ||||||||||||||||||||||||
16 | CAREC | Crowdsourced algorithm of reading comprehension | Constant | 1.811 | N/A | Crossley, S. A., Skalicky, S., & Dascalu, M. (2019). Moving beyond classic readability formulas: new methods and new models. Journal of Research in Reading, 42(3-4), 541-561. | |||||||||||||||||||||
17 | Average age of acquisition (Kuperman) for all content words | i x 0.022 | TAALES (Kuperman_AoA_CW) | Kuperman, V., Stadthagen-Gonzalez, H., & Brysbaert, M. (2012). Age-of-acquisition ratings for 30,000 English words. Behavior research methods, 44(4), 978-990. | |||||||||||||||||||||||
18 | Average bigram range score (COCA) for all words | i x 0.746 | TAALES (COCA_Academic_Bigram_Range) | Davies, M. (2009). The Corpus of Contemporary American English (COCA): 400+ million words, 1990-present (2008). Available online at http://www.americancorpus.org. | |||||||||||||||||||||||
19 | Average trigram proportion score (BNC-written) for all words | - (i x 0.742) | TAALES (BNC_Written_Trigram_Proportion) | BNC Consortium. (2007). The British national corpus, version 3 (BNC XML Edition). Distributed by Oxford University Computing Services on behalf of the BNC Consortium, 5(65), 6. | |||||||||||||||||||||||
20 | Average imageability score (MRC) for all content words | - (i x 0.001) | TAALES (MRC_Imageability_CW) | Coltheart, M. (1981). The MRC psycholinguistic database. The Quarterly Journal of Experimental Psychology Section A, 33(4), 497-505. | |||||||||||||||||||||||
21 | Average frequency score (Brown) for all words | i x 0.0000625 | TAALES (Brown_Freq_AW) | Brown, G. D. (1984). A frequency count of 190,000 words in the London-Lund Corpus of English Conversation. Behavior research methods, instruments, & computers, 16(6), 502-532. | |||||||||||||||||||||||
22 | Average type token ratio of lemma trigrams for all trigrams | - (i x 0.699) | TAACO (trigram_lemma_ttr) | Crossley, S. A., Kyle, K., & Dascalu, M. (in press). The Tool for the Automatic Analysis of Cohesion 2.0: Integrating Semantic Similarity and Text Overlap. Behavior Research Methods. | |||||||||||||||||||||||
23 | Proportion of lemma types that occur in the next paragraph for all paragraphs | - (i x 0.111) | TAACO (adjacent_overlap_all_para) | ||||||||||||||||||||||||
24 | Number of temporal connectives divided by number of words in text | - (i x 2.067) | TAACO (all_temporal) | ||||||||||||||||||||||||
25 | Proportion of noun lemma types that occur in the next paragraph for all paragraphs | i x 0.035 | TAACO (adjacent_overlap_noun_sent_div_seg) | ||||||||||||||||||||||||
26 | Number of content word lemma types | i x 0.002 | TAACO (nlemma_content_types) | ||||||||||||||||||||||||
27 | Positive adjective scores derived from 4 different corpora | - (i x 0.08) | SEANCE (positive_adjectives_component) | Crossley, S. A., Kyle, K., & McNamara, D. S. (2017). Sentiment analysis and social cognition engine (SEANCE): An automatic tool for sentiment, social cognition, and social order analysis. Behavior Research Methods 49(3), pp. 803-821. doi:10.3758/s13428-016-0743-z. | |||||||||||||||||||||||
28 | Average standard deviation of word length for all words | i x 0.047 | ReaderBench (RB.WdLettStdDev) | Dascalu, M., Dessus, P., Trausan-Matu, Ş., Bianco, M., & Nardy, A. (2013, July). ReaderBench, an environment for analyzing text complexity and reading strategies. In International Conference on Artificial Intelligence in Education (pp. 379-388). Springer, Berlin, Heidelberg. | |||||||||||||||||||||||
29 | Average character entropy for all characters | - (i x 0.395) | ReaderBench (RB.CharEnt) | ||||||||||||||||||||||||
30 | CAREC_M | Crowdsourced algorithm of reading comprehension modified | Constant | 1.811 | N/A | Crossley, S. A., Skalicky, S., & Dascalu, M. (2019). Moving beyond classic readability formulas: new methods and new models. Journal of Research in Reading, 42(3-4), 541-561. | |||||||||||||||||||||
31 | Average age of acquisition (Kuperman) for all content words | i x 0.022 | TAALES (Kuperman_AoA_CW) | Kuperman, V., Stadthagen-Gonzalez, H., & Brysbaert, M. (2012). Age-of-acquisition ratings for 30,000 English words. Behavior research methods, 44(4), 978-990. | |||||||||||||||||||||||
32 | Average bigram range score (COCA) for all words | i x 0.746 | TAALES (COCA_Academic_Bigram_Range) | Davies, M. (2009). The Corpus of Contemporary American English (COCA): 400+ million words, 1990-present (2008). Available online at http://www.americancorpus.org. | |||||||||||||||||||||||
33 | Average trigram proportion score (BNC-written) for all words | - (i x 0.742) | TAALES (BNC_Written_Trigram_Proportion) | BNC Consortium. (2007). The British national corpus, version 3 (BNC XML Edition). Distributed by Oxford University Computing Services on behalf of the BNC Consortium, 5(65), 6. | |||||||||||||||||||||||
34 | Average imageability score (MRC) for all content words | - (i x 0.001) | TAALES (MRC_Imageability_CW) | Coltheart, M. (1981). The MRC psycholinguistic database. The Quarterly Journal of Experimental Psychology Section A, 33(4), 497-505. | |||||||||||||||||||||||
35 | Average frequency score (Brown) for all words | i x 0.0000625 | TAALES (Brown_Freq_AW) | Brown, G. D. (1984). A frequency count of 190,000 words in the London-Lund Corpus of English Conversation. Behavior research methods, instruments, & computers, 16(6), 502-532. | |||||||||||||||||||||||
36 | Average type token ratio of lemma trigrams for all trigrams | - (i x 0.699) | TAACO (trigram_lemma_ttr) | Crossley, S. A., Kyle, K., & Dascalu, M. (in press). The Tool for the Automatic Analysis of Cohesion 2.0: Integrating Semantic Similarity and Text Overlap. Behavior Research Methods. | |||||||||||||||||||||||
37 | Proportion of lemma types that occur in the next paragraph for all paragraphs | - (i x 0.111) | TAACO (adjacent_overlap_all_para) | ||||||||||||||||||||||||
38 | Number of temporal connectives divided by number of words in text | - (i x 2.067) | TAACO (all_temporal) | ||||||||||||||||||||||||
39 | Proportion of noun lemma types that occur in the next paragraph for all paragraphs | i x 0.035 | TAACO (adjacent_overlap_noun_sent_div_seg) | ||||||||||||||||||||||||
40 | Number of content word lemma types divided by number of content words | i x 0.2 | TAACO (nlemma_content_types) | ||||||||||||||||||||||||
41 | Positive adjective scores derived from 4 different corpora | - (i x 0.08) | SEANCE (positive_adjectives_component) | Crossley, S. A., Kyle, K., & McNamara, D. S. (2017). Sentiment analysis and social cognition engine (SEANCE): An automatic tool for sentiment, social cognition, and social order analysis. Behavior Research Methods 49(3), pp. 803-821. doi:10.3758/s13428-016-0743-z. | |||||||||||||||||||||||
42 | Average standard deviation of word length for all words | i x 0.047 | ReaderBench (RB.WdLettStdDev) | Dascalu, M., Dessus, P., Trausan-Matu, Ş., Bianco, M., & Nardy, A. (2013, July). ReaderBench, an environment for analyzing text complexity and reading strategies. In International Conference on Artificial Intelligence in Education (pp. 379-388). Springer, Berlin, Heidelberg. | |||||||||||||||||||||||
43 | Average character entropy for all characters | - (i x 0.395) | ReaderBench (RB.CharEnt) | ||||||||||||||||||||||||
44 | CARES | Crowdsourced algorithm of reading speed | Constant | -0.862 | N/A | Crossley, S. A., Skalicky, S., & Dascalu, M. (2019). Moving beyond classic readability formulas: new methods and new models. Journal of Research in Reading, 42(3-4), 541-561. | |||||||||||||||||||||
45 | Average word naming response time for all words | i x 0.003 | TAALES (WN_Mean_RT) | Balota, D. A., Yap, M. J., Hutchison, K. A., Cortese, M. J., Kessler, B., Loftis, B., ... & Treiman, R. (2007). The English lexicon project. Behavior research methods, 39(3), 445-459. | |||||||||||||||||||||||
46 | Average concreteness score (MRC) for all words | - (i x 0.001) | TAALES (MRC_Concreteness_AW) | Coltheart, M. (1981). The MRC psycholinguistic database. The Quarterly Journal of Experimental Psychology Section A, 33(4), 497-505. | |||||||||||||||||||||||
47 | Average semantic distinctiveness scores for all words | - (i x 0.461) | TAALES (Sem_D) | Hoffman, P., Ralph, M. A. L., & Rogers, T. T. (2013). Semantic diversity: A measure of semantic ambiguity based on variability in the contextual usage of words. Behavior research methods, 45(3), 718-730. | |||||||||||||||||||||||
48 | Number of content word lemmas | i x 0.004 | TAACO (nlemma_content_words) | Crossley, S. A., Kyle, K., & Dascalu, M. (in press). The Tool for the Automatic Analysis of Cohesion 2.0: Integrating Semantic Similarity and Text Overlap. Behavior Research Methods. | |||||||||||||||||||||||
49 | Number of function words | i x 0.002 | TAACO (nfunction_words) | ||||||||||||||||||||||||
50 | Complex nominals per T-unit | i x 0.011 | TAASSC (CN_T) | Lu, X. (2010). Automatic analysis of syntactic complexity in second language writing. International Journal of Corpus Linguistics, 15(4):474-496. | |||||||||||||||||||||||
51 | Number of dependents per direct object | i x 0.023 | TAASSC (av_dobj_deps) | Kyle, K. (2016). Measuring syntactic development in L2 writing: Fine grained indices of syntactic complexity and usage-based indices of syntactic sophistication (Doctoral Dissertation). | |||||||||||||||||||||||
52 | Average number of sentences per paragraph | - (i x 0.015) | ReaderBench (RB.BlStDevSen) | Dascalu, M., Dessus, P., Trausan-Matu, Ş., Bianco, M., & Nardy, A. (2013, July). ReaderBench, an environment for analyzing text complexity and reading strategies. In International Conference on Artificial Intelligence in Education (pp. 379-388). Springer, Berlin, Heidelberg. | |||||||||||||||||||||||
53 | Average number of characters per word | i x 0.062 | ReaderBench (RB.WdLettStdDev) | ||||||||||||||||||||||||
54 | CML2RI | Coh-Metrix L2 Readability Index (approximated) | Constant | -43.142 | N/A | Crossley, S. A., & McNamara, D. S. (2008). Assessing L2 reading texts at the intermediate level: An approximate replication of Crossley, Louwerse, McCarthy & McNamara (2007). Language Teaching, 41(3), 409-429. | |||||||||||||||||||||
55 | Number of sentences in text | i x 0.642 | spaCy | ||||||||||||||||||||||||
56 | Average frequency score (SUBTLEXus) for all content words logged | i x 12.671 | TAALES (SUBTLEXus_Freq_CW_Log) | Brysbaert, M., & New, B. (2009). SUBTLEXus: American word frequencies. http://subtlexus.lexique.org. | |||||||||||||||||||||||
57 | Proportion of noun and pronoun lemma types that occur in the next two sentences for all sentences | i x 29.619 | TAACO (adjacent_overlap_2_argument_sent) | Crossley, S. A., Kyle, K., & Dascalu, M. (in press). The Tool for the Automatic Analysis of Cohesion 2.0: Integrating Semantic Similarity and Text Overlap. Behavior Research Methods. | |||||||||||||||||||||||
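
Each index in the table is a linear combination: a constant plus the sum of each variable (i) multiplied by its signed coefficient. The sketch below illustrates this for the three classic formulas whose constants and coefficients are listed above. The dictionary keys, the `FORMULAS` table, and the `score` function are hypothetical names chosen for this illustration and do not come from spaCy or any other tool named in the table; the feature values are assumed to have been computed beforehand.

```python
from typing import Dict, Mapping, Tuple

# (constant, {feature name: signed coefficient}), copied from the table rows.
# Feature names are illustrative assumptions, not identifiers from any library.
FORMULAS: Dict[str, Tuple[float, Dict[str, float]]] = {
    "flesch_reading_ease": (
        206.835,
        {"words_per_sentence": -1.015, "syllables_per_word": -84.6},
    ),
    "flesch_kincaid_grade": (
        -15.59,
        {"words_per_sentence": 0.39, "syllables_per_word": 11.8},
    ),
    "automated_readability_index": (
        -21.43,
        {"words_per_sentence": 0.5, "characters_per_word": 4.71},
    ),
}


def score(name: str, features: Mapping[str, float]) -> float:
    """Return constant + sum(coefficient * feature value) for one index."""
    constant, weights = FORMULAS[name]
    return constant + sum(w * features[key] for key, w in weights.items())


if __name__ == "__main__":
    # Made-up feature values for a short text, used only to exercise the code.
    example = {
        "words_per_sentence": 10.0,
        "syllables_per_word": 1.5,
        "characters_per_word": 5.0,
    }
    for index_name in FORMULAS:
        print(index_name, score(index_name, example))
```

The same structure would extend to the other rows (for example the CAREC or CARES variables) by adding one entry per index, provided the corresponding feature values are available from the tools listed in the "Variable derived from" column.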