Row | Topic | Notes | Link | Sponsor | Date | Votes | CC | MW | ME | YC | TK | SC | AL | MC | Dr. CAS | BD | MH | MB | TC | NJ | JY | DP | WC | Dr. EJ | PHL | AL | SA | LJ | DLK | TM | NER
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2 | Nominated | So sad, but at least I get to sit next to Cory | ||||||||||||||||||||||||||||||||||||||||||
3 | Liu et al 2023 | Lost in the Middle: How Language Models Use Long Contexts | "We analyze the performance of language models on two tasks that require identifying relevant information in their input contexts: multi-document question answering and key-value retrieval. We find that performance can degrade significantly when changing the position of relevant information, indicating that current language models do not robustly make use of information in long input contexts" | https://arxiv.org/pdf/2307.03172.pdf | Sara | 2 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||||
4 | Hu et al 2024 | Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale | https://arxiv.org/pdf/2403.08293.pdf | Christian | 0 | |||||||||||||||||||||||||||||||||||||||
5 | Papadimitriou and Jurafsky 2023 | Injecting structural hints: Using language models to study inductive biases in language learning | They pretrain transformers to develop different inductive biases (e.g. recursive structure, Zipfian distributions) and test the effects on downstream perplexity | https://aclanthology.org/2023.findings-emnlp.563.pdf | Christian | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||||
6 | Madusanka et al. 2023 | Not all quantifiers are equal: Probing transformer-based language models’ understanding of generalised quantifiers | Uses a new evaluation based on model checking in natural language | https://aclanthology.org/2023.emnlp-main.536.pdf | Christian | 0 | ||||||||||||||||||||||||||||||||||||||
7 | Timkey and Linzen 2023 | A Language Model with Limited Memory Capacity Captures Interference in Human Sentence Processing | The first author is a nice guy; this work extends looking for memory-based effects in Transformers a la Ryu and Lewis (2021) and Oh and Schuler (2022) | https://arxiv.org/pdf/2310.16142.pdf | Christian | 0 | 1 | |||||||||||||||||||||||||||||||||||||
8 | Portelance et al. 2023 | Predicting Age of Acquisition for Children's Early Vocabulary in Five Languages Using Language Model Surprisal | Tests whether predictability in context (i.e. surprisal) helps with children's word learning, above and beyond frequency and concreteness | https://onlinelibrary.wiley.com/doi/full/10.1111/cogs.13334?campaign=woletoc | Christian | 1 | 0 | 1 | ||||||||||||||||||||||||||||||||||||
9 | Evanson et al. 2023 | Language acquisition: do children and language models follow similar learning stages? | Uses probing tasks from BIG-Bench etc to evaluate how syntactic/semantic abilities emerge over the course of training GPT-2 | https://arxiv.org/pdf/2306.03586.pdf | Christian | 0 | 0 | 0 | 0 | |||||||||||||||||||||||||||||||||||
10 | McCoy et al. arXiv 2023 | Embers of Autoregression: Understanding Large Language Models Through the Problem They Are Trained to Solve | Title to be contrasted with "Sparks of AGI"; basically shows differential performance of GPT-* as a function of frequency | https://arxiv.org/pdf/2309.13638.pdf | Byung-Doh | 1 | 0 | 0 | 1 |
11 | Kauf et al. bioRxiv 2023 | Lexical semantic content, not syntactic structure, is the main contributor to ANN-brain similarity of fMRI responses in the language network | Might explain why BiW hasn't been observed on fMRI | https://www.biorxiv.org/content/10.1101/2023.05.05.539646v1.full | Byung-Doh | 1 | 0 | 1 |
12 | Hosseini et al. Neurobiology of Language 2024 | Artificial Neural Network Language Models Predict Human Brain Responses to Language Even After a Developmentally Realistic Amount of Training | https://direct.mit.edu/nol/article/doi/10.1162/nol_a_00137/119156/Artificial-Neural-Network-Language-Models-Predict | Byung-Doh | 0 | 0 | 0 | |||||||||||||||||||||||||||||||||||||
13 | Schaeffer et al. NeurIPS 2023 | Are Emergent Abilities of Large Language Models a Mirage? | https://openreview.net/pdf?id=ITw9edRDlD | Byung-Doh | 0 | |||||||||||||||||||||||||||||||||||||||
14 | von Oswald et al. ICML 2023 | Transformers Learn In-Context by Gradient Descent | Seems highly related to paper above (I might recommend this one instead of Dai et al. 2023) | https://arxiv.org/pdf/2212.07677.pdf | Byung-Doh | 2 | 1 | 1 | ||||||||||||||||||||||||||||||||||||
15 | Jelassi et al. arXiv 2024 | Repeat After Me: Transformers are Better than State Space Models at Copying | Cool paper name; almost makes me think e.g. Mamba surprisal is worth comparing against e.g. GPT-2 surprisal | https://arxiv.org/abs/2402.01032 | Byung-Doh | 1 | 0 | 1 | 0 |
16 | Ezquerro et al. EACL 2024 | From Partial to Strictly Incremental Constituent Parsing | Pulling parses out of incremental LMs; the partial parses seemed extremely similar to those from left-corner parsers, but the first author didn't seem to know what left-corner parsers were when we asked | https://aclanthology.org/2024.eacl-short.21.pdf | Byung-Doh | 2 | 1 | 1 |
17 | Sakana AI arXiv 2024 | Evolutionary Optimization of Model Merging Recipes | Method for merging LLMs (didn't even know merging models was a thing) | https://arxiv.org/pdf/2403.13187.pdf | Byung-Doh | 0 | 0 | |||||||||||||||||||||||||||||||||||||
18 | Mahabadi et al. EACL 2024 | TESS: Text-to-Text Self-Conditioned Simplex Diffusion | Interested in how diffusion models might be applied to language modeling | https://aclanthology.org/2024.eacl-long.144.pdf | Byung-Doh | 2 | 0 | 1 | 1 | |||||||||||||||||||||||||||||||||||
19 | Isono Cognition 2024 | Category Locality Theory: A unified account of locality effects in sentence comprehension | Apparently better DLT with CCG (on Natural Stories) | https://www.sciencedirect.com/science/article/pii/S0010027724000520 | Byung-Doh | 0 | ||||||||||||||||||||||||||||||||||||||
20 | Google arXiv 2024 | RecurrentGemma: Moving Past Transformers for Efficient Open Language Models | LM based on Google's new Griffin architecture (https://arxiv.org/abs/2402.19427). Are animals the new muppets now? Actually, the Griffin paper might be a better read | https://arxiv.org/abs/2404.07839 | Byung-Doh | 0 |
21 | Pasquiou et al. (2023) | Information-restricted neural language models reveal different brain regions' sensitivity to semantics, syntax and context | correlate large language model encodings to human reading times and fMRI data, respectively, and find that smaller models provide an equal (or better) fit to the human data | https://arxiv.org/abs/2302.14389 | William | 0 | 1 |
22 | Eisape et al. (2022) | Probing for incremental parse states in autoregressive language models | probe LLM representations to arrive at incremental unlabeled dependency analyses | https://arxiv.org/abs/2211.09748 | William | 0 |
23 | Hoover et al 2022 | The Plausibility of Sampling as an Algorithmic Theory of Sentence Processing | Some thoughts about this paper: 1. It has a good review of work in surprisal theory and the functional relationship between surprisal and reading times. 2. The main claims are that 1) the relationship between surprisal and RT is superlinear, and therefore 2) sampling algorithms are promising because their time complexity scales exponentially as a function of surprisal; no concrete implementation of 2) is provided, though. 3. The first claim is supported by a GAM analysis showing that the fitted curves are superlinear, especially for the larger PLMs. The authors seem to assume that larger PLMs are "better" and therefore provide stronger evidence about the relationship between surprisal and reading times; Oh and Schuler show that this assumption is incorrect. 4. Using GAMs instead of LMER is unlikely to change the conclusions of Oh and Schuler, since linearity vs. superlinearity makes the most divergent predictions at high-surprisal points, whereas the larger-gets-worse behavior of PLM surprisal is driven primarily by low-surprisal points. | https://files.ca-1.osf.io/v1/resources/qjnpv/providers/osfstorage/6351ab810ecb420e5e2eb105?format=pdf&action=download&direct&version=1 | Mike | 1 | 1 |
24 | Bhagavatula et al ACL 2023 | I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation | Use of self-imitation very similar to self-training for NLG | https://aclanthology.org/2023.acl-long.535/ | Mike | 3 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||||
25 | McCoy et al 2023 | How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty in Text Generation Using RAVEN🐦⬛ | "we introduce RAVEN, a suite of analyses for assessing the novelty of generated text, focusing on sequential structure (n-grams) and syntactic structure. We apply these analyses to four neural language models trained on English (an LSTM, a Transformer, Transformer-XL, and GPT-2)." | https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00567/116616 | Yi-Chien | 2 | 1 | 1 | ||||||||||||||||||||||||||||||||||||
26 | Dziri et al 2024 | Faith and fate: Limits of transformers on compositionality | "We formulate compositional tasks as computation graphs to systematically quantify the level of complexity, and break down reasoning steps into intermediate sub-procedures. Our empirical findings suggest that transformer LLMs solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching, without necessarily developing systematic problem-solving skills." | https://proceedings.neurips.cc/paper_files/paper/2023/file/deb3c28192f979302c157cb653c15e90-Paper-Conference.pdf | Yi-Chien | 2 | 1 | 1 | ||||||||||||||||||||||||||||||||||||
27 | Munkhdalai et al 2024 | Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention | This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. | https://arxiv.org/pdf/2404.07143.pdf | Yi-Chien | 0 | ||||||||||||||||||||||||||||||||||||||
28 | Total: | | 21 | 2 | 3 | 4 | 3 | 1 | 3 | 3 | 2 | | 3 | 1 |
29 | History | |
30 | Dai et al. ACL Findings 2023 | Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers | https://aclanthology.org/2023.findings-acl.247.pdf | Sara (Byung-Doh) | 4-18 | 0 | 0 | |||||||||||||||||||||||||||||||||||||
31 | Bietti et al 2023 | Birth of a Transformer: A Memory Viewpoint | analysis of how simple transformers do cued association | https://arxiv.org/pdf/2306.00802.pdf#page5 | Byung-Doh (William) | 4-11 (in Oxley 102) | 3 | 1 | 1 | 1 | 1 |
32 | Murty et al. 2023 | Pushdown Layers: Encoding Recursive Structure in Transformer Language Models | New kind of transformer self-attention layer that helps with syntactic generalization | https://aclanthology.org/2023.emnlp-main.195/ | Christian | 4/4 | 4 | 1 | 1 | 1 | 0 | 1 | ||||||||||||||||||||||||||||||||
33 | Li et al ACL 2023 | Contrastive Decoding: Open-ended Text Generation as Optimization | Clever decoding (from clever folks) using the difference between a smart and dumb model | https://aclanthology.org/2023.acl-long.687/ | Mike (Yi-Chien) | 3/28 | 4 | 1 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||
34 | Chen et al. arXiv 2023 | Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs | Apparently syntactic heads appear at around 1000 training steps | https://arxiv.org/abs/2309.07311 | Byung-Doh | 3/7 | 5 | 1 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||
35 | Yamaki et al. ACL 2023 | Holographic CCG Parsing | Uses holographic embeddings, which allow for compositional operations in a continuous vector space | https://aclanthology.org/2023.acl-long.15.pdf | Christian | 2/29 | 3 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
36 | Patel and Pavlick 2022 | Mapping Language Models to Grounded Conceptual Spaces | Predecessor to the Pavlick 2023 paper we read last semester that Mike shared with us (model learns relationships that aren't directly tied to the space) | https://openreview.net/pdf?id=gJcEM8sxHK | Sara | 2/22 | 5 | 1 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||
37 | Gu and Dao arXiv 2023 | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | New architecture | https://arxiv.org/pdf/2312.00752.pdf | Byung-Doh | 2/15 | 4 | 1 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||
38 | Mahowald et al. 2023 | Dissociating language and thought in large language models: a cognitive perspective | Reviews linguistic vs functional competence in humans; argues LLMs need more non-linguistic cognitive capacities. | https://arxiv.org/abs/2301.06627 | Yi-Chien (Mike) | 2/8 | 6 | 1 | 1 | 1 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||
39 | Futrell (2023) | An information-theoretic account of availability effects in language production | Model of language production (incremental selection at the word level) based in information theory, cognitive science, and neuroscience; the objective maximizes "communicative value" subject to an information-theoretic constraint | https://escholarship.org/uc/item/23q9k7pc | Sara | 2/1 | 3 | 0 | 1 | 1 | 0 | 1 |
40 | Piñango 2023 | Solving the elusiveness of word meanings: two arguments for a continuous meaning space for language | Model explains words that have multiple interdependent meanings ("smoke") or a large family of meanings ("have") | https://www.frontiersin.org/articles/10.3389/frai.2023.1025293/full | Christian | 1/25 | 5 | 1 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||
41 | Deepmind people arXiv 2023 | Reinforced Self-Training (ReST) for Language Modeling | Apparently more efficient version of RLHF | https://arxiv.org/pdf/2308.08998.pdf | Byung-Doh | 2024-01-18 | 5 | 1 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||
42 | Webb et al. Nature 2023 | Emergent analogical reasoning in large language models | Alex Petrov knows the authors; convinces him that LLMs are not just stochastic parrots | https://www.nature.com/articles/s41562-023-01659-w | Yi-Chien (Mike) | 2023-11-30 | 4 | 1 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||
43 | Lake and Baroni Nature 2023 | Human-like systematic generalization through a meta-learning neural network | https://www.nature.com/articles/s41586-023-06668-3 | Byung-Doh | 2023-11-16 | 0 | ||||||||||||||||||||||||||||||||||||||
44 | Pavlick 2023 | Semantic structure in deep learning | one of several recent papers looking at whether real-world semantic structure can be learned just from language data (earlier paper: https://www.annualreviews.org/doi/abs/10.1146/annurev-linguistics-031120-122924) | https://royalsocietypublishing.org/doi/pdf/10.1098/rsta.2022.0041 | Mike | 2023-11-02 | 5 | 1 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||
45 | Wang et al. 2023 | Finding Structure in One Child's Linguistic Experience | https://onlinelibrary.wiley.com/doi/full/10.1111/cogs.13305 | Christian | 10/19 | 4 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
46 | Bailly et al. ACL 2023 | Syntax and Geometry of Information | "We study syntactic generalization from the perspective of the capacity to disentangle semantic and structural information" | https://aclanthology.org/2023.acl-long.590.pdf | Byung-Doh | 9/21 | 3 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
47 | Li & Lu ACL 2023 | Contextual Distortion Reveals Constituency: Masked Language Models are Implicit Parsers | Tree reconstruction from MLMs | https://aclanthology.org/2023.acl-long.285.pdf | Christian | 9/14 | 3 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
48 | UniLM people arXiv 2023 | Retentive Network: A Successor to Transformer for Large Language Models | New architecture! | https://arxiv.org/pdf/2307.08621.pdf | Byung-Doh | 9/7 | 2 | 1 | 1 | |||||||||||||||||||||||||||||||||||
49 | Piantadosi & Hill arXiv 2022 | Meaning without reference in large language models | https://arxiv.org/pdf/2208.02957.pdf | Mike/Christian | 8/31 | 3 | 1 | 1 | 1 |
50 | Hahn et al. 2022 | A resource-rational model of human processing of recursive linguistic structure | https://www.pnas.org/doi/10.1073/pnas.2122602119 | Christian (from Byung-Doh) | 4/20 | 0 | ||||||||||||||||||||||||||||||||||||||
51 | Piantadosi LingBuzz 2023 | Modern language models refute Chomsky's approach to language | Cited during Casillas talk; there's also a reply to this https://lingbuzz.net/lingbuzz/007190 | https://ling.auf.net/lingbuzz/007180 | Byung-Doh | 4/13 | 0 |
52 | Yedetore et al 2023 | How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech | Transformers and LSTMs trained on CHILDES don't pick up hierarchical structure | https://arxiv.org/abs/2301.11462 | Christian | 3/30 | 0 | |||||||||||||||||||||||||||||||||||||
53 | Meister and Cotterell 2021 | Language Model Evaluation Beyond Perplexity | | https://aclanthology.org/2021.acl-long.414.pdf | Christian | 1 | 1 | 0 |
54 | Sinclair et al. TACL 2022 | Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations | Priming language models (TACL) | https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00504/113019/Structural-Persistence-in-Language-Models-Priming | Byung-Doh | 2/2 | 2 | 1 | 1 | |||||||||||||||||||||||||||||||||||
55 | Yang et al 2022 | Unsupervised Discontinuous Constituency Parsing with Mildly Context-Sensitive Grammars | Unsupervised parsing that can handle extraposition, wh-movement, etc | https://arxiv.org/pdf/2212.09140.pdf | Christian | 1/26 | 2 | 1 | 1 | |||||||||||||||||||||||||||||||||||
56 | Warstadt and Bowman 2022 | What Artificial Neural Networks Can Tell Us About Human Language Acquisition | https://arxiv.org/pdf/2208.07998.pdf | Christian | 11/17 | 5 | 1 | 1 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||
57 | Prange et al. NAACL 2022 | Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling | Conditioning on syntax/semantic subgraphs improves GPT-2 perplexity, probably makes surprisal less humanlike though | https://aclanthology.org/2022.naacl-main.325.pdf | Byung-Doh | 12/1 | 5 | 1 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||
58 | Li and Liang, 2021 | Prefix-Tuning: Optimizing Continuous Prompts for Generation | alternative to fine-tuning | https://aclanthology.org/2021.acl-long.353.pdf | Ash | 11/3 | 3 | 1 | 0 | 1 | 1 | |||||||||||||||||||||||||||||||||
59 | Niu and Penn 2020 | Grammaticality and Language Modelling | point biserial correlation for comparing NN output to human judgments, and some other improvements / tests | https://aclanthology.org/2020.eval4nlp-1.11/ | Willy | 10/27 | 3 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
60 | Dettmers et al. Neurips 2022 | LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | Personally interested to learn more about "emergent outliers" rather than the quantization technique | https://arxiv.org/pdf/2208.07339.pdf | Byung-Doh | 3 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||||
61 | Tran et al 22 | Plex: Towards Reliability Using Pretrained Large Model Extensions | Google paper looking at reliability of LLMs including few-shot uncertainty; blog post: https://ai.googleblog.com/2022/07/towards-reliability-in-deep-learning.html | https://arxiv.org/pdf/2207.07411.pdf | Willy (Mike) | 3 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||||
62 | Srivastava et al 2022 | Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models | BIG-bench (set of 204 LM evaluation tasks) | https://arxiv.org/abs/2206.04615 | Christian | 3 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||||
63 | Goldstein et al. NatNeurosci 2022 | Shared computational principles for language processing in humans and deep language models | GPT-2 embeddings X ECoG | https://www.nature.com/articles/s41593-022-01026-4 | Byung-Doh | 3 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||||
64 | Caucheteux et al 2021 | Decomposing lexical and compositional syntax and semantics with deep language models | https://arxiv.org/pdf/2103.01620.pdf | Christian | 3 | 0 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||||
65 | Schuster and Linzen 2022 | When a sentence does not introduce a discourse referent, transformer-based models still sometimes refer to it | https://arxiv.org/pdf/2205.03472.pdf | Willy | 3 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||||
66 | Jiang et al 2021 | How can we know when Language Models know? On the calibration of Language Models for Question Answering | Looking at probability estimates of T5, BART, GPT2 on QA task | https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00407/107277/How-Can-We-Know-When-Language-Models-Know-On-the | Willy | 4/21 | 3 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
67 | Ryu and Lewis 2021 | Accounting for Agreement Phenomena in Sentence Comprehension with Transformer Language Models: Effects of Similarity-based Interference on Surprisal and Attention | Also appeared in CMCL 2021 (https://aclanthology.org/2021.cmcl-1.6/) | https://arxiv.org/abs/2104.12874 | Christian | 4/7 | 4 | 1 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||
68 | Xu et al. ACL 2021 | Syntax-Enhanced Pre-trained Model | https://aclanthology.org/2021.acl-long.420.pdf | Byung-Doh | 3/31 | 2 | 1 | 1 | ||||||||||||||||||||||||||||||||||||
69 | Davis & van Schijndel 2020 | Discourse structure interacts with reference but not syntax in neural language models | https://arxiv.org/pdf/2010.04887.pdf | Willy | 3/10 | 4 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
70 | Stengel-Eskin et al 2021 | Joint Universal Syntactic and Semantic Parsing | Compares several model architectures for joint syntactic and semantic parsing on rich annotations from Universal Decompositional Semantics dataset | https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00396/106796/Joint-Universal-Syntactic-and-Semantic-Parsing | Christian | 3/3 | 4 | 1 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||
71 | Mao et al 2021 | Grammar-Based Grounded Lexicon Learning | A method for learning lexical entries from grounded data like images and texts. Entries include syntactic types and "neuro-symbolic" semantic programs that combine lambda calculus expressions with neural network embeddings | https://proceedings.neurips.cc/paper/2021/file/4158f6d19559955bae372bb00f6204e4-Paper.pdf | Byung-Doh | 2/24 | 3 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
72 | Elazar et al 2021 | Measuring and Improving Consistency in Pretrained Language Models | small paraphrase adversarial dataset with BERT based models | https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00410/107384/Measuring-and-Improving-Consistency-in-Pretrained | Willy | 2/17 | 4 | 1 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||
73 | Yang & Piantadosi 2022 | One model for the learning of language | https://www.pnas.org/content/119/5/e2021865119 | Christian | 2/10 | 4 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
74 | Anthropic people 2021 | A Mathematical Framework for Transformer Circuits | GPT-2-ology (0-layer, 1-layer models) | https://transformer-circuits.pub/2021/framework/index.html | Byung-Doh | 2/3 | 0 | |||||||||||||||||||||||||||||||||||||
75 | Belinkov and Glass 2019 | Analysis methods in neural language processing: a survey | interested in exploring ways to test ling data with neural models, plus focuses on some perspectives not mentioned in other similar papers | https://doi.org/10.1162/tacl_a_00254 | Willy | 1/27 | 1 | 0 | ||||||||||||||||||||||||||||||||||||
76 | Guest and Martin 2021 | On logical inference over brains, behaviour, and artificial neural networks | Questions how much we can infer about the mind and brain from the behavior of neural network models ("if NN reproduces the pattern seen in brain activity, the brain must work like the NN") | https://psyarxiv.com/tbmcg/ | Christian | 1/20 | 0 | 0 | 0 | |||||||||||||||||||||||||||||||||||
77 | Li et al. ACL 2021 | How is BERT surprised? Layerwise detection of linguistic anomalies | https://aclanthology.org/2021.acl-long.325.pdf | Byung-Doh | 1 | 1 | 0 | |||||||||||||||||||||||||||||||||||||
78 | Stanojević, Steedman 2021 | Formal Basis of a Language Universal | https://direct.mit.edu/coli/article/47/1/9/97333/Formal-Basis-of-a-Language-Universal | Nanjiang | 3 | 1 | 1 | 1 |
79 | Stanojević et al 2021 | Modeling incremental language comprehension in the brain with Combinatory Categorial Grammar | https://aclanthology.org/2021.cmcl-1.3.pdf | Christian | 4 | 1 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||||
80 | Sanh et al 2021 | Multitask Prompted Training Enables Zero-Shot Task Generalization | to be discussed on 10/28 by popular demand | https://arxiv.org/pdf/2110.08207.pdf | Willy (from Mike) | |||||||||||||||||||||||||||||||||||||||
81 | Kuribayashi et al. ACL 2021 | Lower Perplexity is Not Always Human-Like | https://aclanthology.org/2021.acl-long.405.pdf | Byung-Doh | 0 | |||||||||||||||||||||||||||||||||||||||
82 | White and Cotterell 2021 | Examining the Inductive Bias of Neural Language Models with Artificial Languages | Nanjiang | 3 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||||||
83 | Aghajanyan et al 2021 | Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning | https://aclanthology.org/2021.acl-long.568/ | Christian | 5 | 1 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
84 | Linzen and Baroni 2021 | Syntactic Structure from Deep Learning | 9/23 | https://www.annualreviews.org/doi/abs/10.1146/annurev-linguistics-032020-051035?cookieSet=1 | Willy (Christian) | 3 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||||
85 | Shen et al. ACL 2021 | StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling | 9/16 | https://aclanthology.org/2021.acl-long.559.pdf | Byung-Doh | 4 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
86 | Press et al 2021 | Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation | 9/9 | https://arxiv.org/abs/2108.12409 | Christian (Mike) | 4 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
87 | Lewis and Bastiaansen 2015 | A predictive coding framework for rapid neural dynamics during sentence-level language comprehension | https://www.sciencedirect.com/science/article/abs/pii/S0010945215000714 | Evan | 2 | 1 | 1 |
88 | Beres 2017 | Time is of the Essence: A Review of Electroencephalography (EEG) and Event-Related Brain Potentials (ERPs) in Language Research | overview of ERPs in linguistic research | https://core.ac.uk/download/pdf/206525297.pdf | Willy | 4 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
89 | Li et al. AACL 2020 | Heads-up! Unsupervised Constituency Parsing via Self-Attention Heads | https://www.aclweb.org/anthology/2020.aacl-main.43.pdf | Byung-Doh | 3 | 1 | 1 | 1 | 0 | |||||||||||||||||||||||||||||||||||
90 | Brothers & Kuperberg 2021 | Word predictability effects are linear, not logarithmic: Implications for probabilistic models of sentence comprehension | https://www.sciencedirect.com/science/article/pii/S0749596X20300887?casa_token=eP1ih9VvgCYAAAAA:b8PPt-3KkCybH56c6jOFyVqnVrC1xI4j1BGiKLtexmPziQaJ0HPxPkSZx7kSus1OJ37u_iHwbA | Cory | 3 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||||
91 | 3/4 - CUNY day: https://www.cuny2021.io | |||||||||||||||||||||||||||||||||||||||||||
92 | Wilcox et al 20 | On the Predictive Power of Neural Language Models for Human Real-Time Comprehension Behavior | https://arxiv.org/pdf/2006.01912.pdf | Christian | 3 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||||
93 | Caplan et al 2020 | Miller's Monkey Updated: Communicative Efficiency and the Statistics of Words in Natural Language | https://ling.auf.net/lingbuzz/004660/current.pdf?_s=6GvkvSUSdQZc_66K | Cory | 1 | 0 | 0 | 0 | 1 | |||||||||||||||||||||||||||||||||||
94 | Steinert-Threlkeld and Szymanik 2020 | Ease of Learning Explains Semantic Universals | https://semanticsarchive.net/Archive/zM5ZGIxM/EaseLearning.pdf | Nanjiang | 3 | 1 | 0 | 1 | 0 | 1 | ||||||||||||||||||||||||||||||||||
95 | Meister et al. EMNLP 2020 | If Beam Search is the Answer, What was the Question? | https://www.aclweb.org/anthology/2020.emnlp-main.170.pdf | Byung-Doh | 2 | 1 | 1 | 0 | ||||||||||||||||||||||||||||||||||||
96 | Lopopolo et al 20 | Distinguishing syntactic operations in the brain: Dependency and phrase-structure parsing | https://www.mitpressjournals.org/doi/abs/10.1162/nol_a_00029 | Willy (from Cory) | 2 | 0 | 1 | 1 | ||||||||||||||||||||||||||||||||||||
97 | Venhuizen et al 19 | Expectation-based Comprehension: Modeling the Interaction of World Knowledge and Linguistic Experience | https://www.tandfonline.com/doi/pdf/10.1080/0163853X.2018.1448677 | Cory | 4 | 0 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
98 | Li et al 2019 | Specializing Word Embeddings (for Parsing) by Information Bottleneck | https://www.aclweb.org/anthology/D19-1276.pdf | Nanjiang | 3 | 1 | 0 | 1 | 1 | 0 | ||||||||||||||||||||||||||||||||||
99 | Kodner & Gupta ACL 2020 | Overestimation of Syntactic Representation in Neural Language Models | https://www.aclweb.org/anthology/2020.acl-main.160.pdf | Byung-Doh | 5 | 1 | 1 | 1 | 1 | 1 | ||||||||||||||||||||||||||||||||||
100 | Kuperberg and Jaeger 2016 | What do we mean by prediction in language comprehension? | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4850025/pdf/nihms-754635.pdf | Evan | 2 | 1 | 1 |