NLU Lab: Paper Reading (14 Feb 2024)
Today:
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
(McCoy et al., 2019; arXiv:1902.01007)
What can you learn from title + abstract?
Kinds of NLP Papers
Theory: “Prove transformers cannot learn to multiply arbitrary sequences in S5”
Analysis: “Models rely on shallow heuristics to solve NLI tasks”
Modeling: “Behold: the Transformer”
Empirical tests need benchmarks!
“Benchmark” = a fixed dataset + an agreed metric, so competing models are tested on common ground (sketch below)
Good:
“Modifying model X by doing Y improves performance on Z”
Bad:
“Sentiment classification models work better on English than on Icelandic”
Why is the good example good?
Why is the bad example bad?
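To make “common ground” concrete, here is a minimal Python sketch of what a benchmark amounts to: a frozen labeled test set plus an agreed metric, so any model exposing the same prediction interface can be compared directly. All names (EXAMPLES, accuracy, the baseline) are illustrative, not from the paper; the two pairs are HANS-style items in the spirit of McCoy et al.'s examples.

```python
# Minimal sketch of a benchmark: frozen labeled examples + an agreed metric.
# All names are illustrative; the pairs are HANS-style items in the spirit
# of McCoy et al.'s examples.
from typing import Callable, List, Tuple

EXAMPLES: List[Tuple[str, str, str]] = [
    # (premise, hypothesis, gold label)
    ("The doctor was paid by the actor.", "The doctor paid the actor.", "non-entailment"),
    ("The judges heard the actors resigned.", "The judges heard the actors.", "non-entailment"),
]

def words(s: str) -> set:
    """Crude tokenizer: lowercase, drop periods, split on whitespace."""
    return set(s.lower().replace(".", "").split())

def accuracy(predict: Callable[[str, str], str]) -> float:
    """Fraction of items where the model's label matches the gold label."""
    hits = sum(predict(p, h) == gold for p, h, gold in EXAMPLES)
    return hits / len(EXAMPLES)

def lexical_overlap_baseline(premise: str, hypothesis: str) -> str:
    """The shallow heuristic the paper diagnoses: if every hypothesis word
    also appears in the premise, guess entailment."""
    return "entailment" if words(hypothesis) <= words(premise) else "non-entailment"

# Any two models exposing the same (premise, hypothesis) -> label interface
# are now directly comparable on shared data with a shared metric:
print(accuracy(lexical_overlap_baseline))  # 0.0: the heuristic fails both items
```

The design point: the benchmark fixes both the data and the metric, and only the model varies, which is what makes claims of the “X improves performance on Z” form checkable.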
Sections & their purposes
From McCoy et al. (2019):
Where do they state their hypothesis?
Why are Sections 3 (dataset construction) and 4 (experimental setup) different sections?
What’s going on with Section 7 (augmenting the training data)?
What’s missing?
What is their hypothesis?
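One reading of their hypothesis, hedged here as our summary of the paper: models trained on MNLI adopt three fallible syntactic heuristics (lexical overlap, subsequence, constituent) rather than learning genuine inference. The sketch below enumerates the three, each with a HANS-style pair where the heuristic wrongly predicts entailment; the strings follow the paper's illustrative examples.

```python
# The three heuristics McCoy et al. diagnose, each paired with a HANS-style
# example where the heuristic's prediction ("entailment") is wrong.
# Strings follow the paper's illustrative examples.
HEURISTICS = {
    "lexical overlap": {
        "assumes": "a premise entails any hypothesis built from its words",
        "premise": "The doctor was paid by the actor.",
        "hypothesis": "The doctor paid the actor.",
        "gold": "non-entailment",
    },
    "subsequence": {
        "assumes": "a premise entails any contiguous subsequence of itself",
        "premise": "The doctor near the actor danced.",
        "hypothesis": "The actor danced.",
        "gold": "non-entailment",
    },
    "constituent": {
        "assumes": "a premise entails any complete subtree (constituent) of itself",
        "premise": "If the artist slept, the actor ran.",
        "hypothesis": "The artist slept.",
        "gold": "non-entailment",
    },
}

for name, case in HEURISTICS.items():
    print(f"{name}: heuristic says entailment; gold is {case['gold']}")
    print(f"  {case['premise']!r} -/-> {case['hypothesis']!r}")
```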
Quantitative & qualitative explanations
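The paper supports its hypothesis in both modes: quantitatively, with accuracies broken down per heuristic and per gold label (near-zero scores on the non-entailment cases), and qualitatively, by reading individual errors. Below is a minimal sketch of the two views; the records are hypothetical placeholders, not the paper's actual model outputs.

```python
# Two complementary ways to explain an evaluation result.
# Records are hypothetical placeholders, not the paper's actual outputs.
from collections import defaultdict

RESULTS = [
    # (heuristic, gold label, predicted label, premise, hypothesis)
    ("lexical overlap", "entailment", "entailment",
     "The actor paid the doctor.", "The actor paid the doctor."),
    ("lexical overlap", "non-entailment", "entailment",
     "The doctor was paid by the actor.", "The doctor paid the actor."),
    ("subsequence", "non-entailment", "entailment",
     "The doctor near the actor danced.", "The actor danced."),
]

# Quantitative: aggregate accuracy per (heuristic, gold label) cell, the
# kind of breakdown behind the paper's headline result.
cells = defaultdict(lambda: [0, 0])  # (heuristic, gold) -> [correct, total]
for heuristic, gold, pred, _, _ in RESULTS:
    cell = cells[(heuristic, gold)]
    cell[0] += int(pred == gold)
    cell[1] += 1
for (heuristic, gold), (correct, total) in sorted(cells.items()):
    print(f"{heuristic:16s} {gold:15s} acc = {correct}/{total}")

# Qualitative: read the individual errors to see *why* the model fails.
for heuristic, gold, pred, premise, hypothesis in RESULTS:
    if pred != gold:
        print(f"[{heuristic}] predicted {pred}, gold {gold}:")
        print(f"  P: {premise}\n  H: {hypothesis}")
```

The quantitative view tells you where the model fails; the qualitative view is what licenses an explanation of the failure in terms of heuristics rather than noise.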
Meta Questions