Tracing Emergent Abilities of Language Models
Feedback is highly appreciated!
Feel free to email me at chenghao dot uchicago dot edu.
Agenda
What are “Emergent Abilities”?
1. Language Generation (following a prompt)
Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., & Neubig, G. (2023). Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 55(9), 1-35.
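As a minimal illustration of prompt following, here is a sketch using the Hugging Face transformers text-generation pipeline; the model name ("gpt2") and the prompt are illustrative assumptions, not anything the cited survey prescribes.

```python
# Minimal sketch of prompt-following generation with Hugging Face transformers.
# The model choice ("gpt2") and the prompt are illustrative assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "Q: What is the capital of France?\nA:"
out = generator(prompt, max_new_tokens=10)
print(out[0]["generated_text"])  # the model simply continues the prompt
```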
What are “Emergent Abilities”?
2. In-Context Learning (few-shot, one-shot, zero-shot)
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. NeurIPS 2020
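The three regimes differ only in how many demonstrations appear in the prompt; no weights are updated. A sketch, with the translation demonstrations adapted from Brown et al. (2020):

```python
# Zero-/one-/few-shot prompts differ only in the number of in-context
# demonstrations; the model conditions on them without gradient updates.
zero_shot = "Translate English to French:\ncheese =>"

one_shot = ("Translate English to French:\n"
            "sea otter => loutre de mer\n"
            "cheese =>")

few_shot = ("Translate English to French:\n"
            "sea otter => loutre de mer\n"
            "peppermint => menthe poivrée\n"
            "plush giraffe => girafe en peluche\n"
            "cheese =>")
```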
What are “Emergent Abilities”?
3. World Knowledge
Weir, N., Poliak, A., & Van Durme, B. (2020). Probing neural language models for human tacit assumptions. CogSci 2020
Sap, M., Shwartz, V., Bosselut, A., Choi, Y., & Roth, D. (2020). Commonsense reasoning for natural language processing. ACL 2020 Tutorial Abstracts.
ChatGPT failures: https://github.com/giuven95/chatgpt-failures
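A sketch of how such probing can be done with a masked LM, in the spirit of Weir et al. (2020); the model choice and the probe sentence are illustrative assumptions:

```python
# Probe an LM for tacit world knowledge via cloze completion.
# Model choice ("bert-base-uncased") and probe sentence are assumptions.
from transformers import pipeline

probe = pipeline("fill-mask", model="bert-base-uncased")
for pred in probe("Everyone knows that a bear has [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))
# High-probability completions such as "fur" or "claws" would indicate the
# model has absorbed common tacit assumptions from pretraining text.
```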
Tracing Emergent Abilities: History of the GPT-3 Family

GPT-3 Series
2020: GPT-3 initial (davinci) – large-scale LM pretraining
2021: Codex initial (code-davinci-001, code-cushman-001) – code training on top of GPT-3
2021: InstructGPT initial, presumably (instruct-davinci-beta, text-davinci-001) – instruction tuning on top of GPT-3

GPT-3.5 Series
2022: code-davinci-002 – LM + code training, then instruction tuning
2022: text-davinci-002 – supervised instruction tuning (FeedME) on top of code-davinci-002
2022: text-davinci-003 – RLHF
2022: ChatGPT – RLHF
Code Training – Codex and InCoder
Chen, M., Tworek, J., Jun, H., Yuan, Q., Pinto, H. P. D. O., Kaplan, J., ... & Zaremba, W. (2021). Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374.
Fried, D., Aghajanyan, A., Lin, J., Wang, S., Wallace, E., Shi, F., ... & Lewis, M. (2022). InCoder: A generative model for code infilling and synthesis. arXiv preprint arXiv:2204.05999.
[Figure: a Codex-style training/evaluation example – a function signature, a docstring (optionally with input/output examples), and a reference implementation.]
Masked-LM style (infilling) pretraining is also used, notably by InCoder.
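For concreteness, here is a HumanEval-style problem in this format (HumanEval/0, lightly paraphrased); at evaluation time the model sees only the signature and docstring and must generate the body:

```python
# HumanEval-style example: signature + docstring (with examples) serve as the
# prompt; the body below is a reference implementation, hidden at test time.
def has_close_elements(numbers, threshold):
    """Check if any two numbers in the list are closer than threshold.
    >>> has_close_elements([1.0, 2.0, 3.0], 0.5)
    False
    >>> has_close_elements([1.0, 2.8, 3.0, 4.0, 5.0, 2.0], 0.3)
    True
    """
    for i, a in enumerate(numbers):
        for b in numbers[i + 1:]:
            if abs(a - b) < threshold:
                return True
    return False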
How is Code Training helpful?
(Chain-of-Thought, the reliability of CoT explanations, and “let’s think step by step” – all from NeurIPS’22)
Wei, J., Wang, X., Schuurmans, D., Bosma, M., Chi, E., Le, Q., & Zhou, D. (2022). Chain of thought prompting elicits reasoning in large language models. NeurIPS 2022
Ye, X., & Durrett, G. (2022). The unreliability of explanations in few-shot prompting for textual reasoning. NeurIPS 2022.
Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large Language Models are Zero-Shot Reasoners. NeurIPS 2022
Khot, T., Trivedi, H., Finlayson, M., Fu, Y., Richardson, K., Clark, P., & Sabharwal, A. (2022). Decomposed prompting: A modular approach for solving complex tasks. ICLR 2023
Liang, P., Bommasani, R., Lee, T., Tsipras, D., Soylu, D., Yasunaga, M., ... & Koreeda, Y. (2022). Holistic evaluation of language models. arXiv preprint arXiv:2211.09110.
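A sketch of the two prompting styles: few-shot CoT (Wei et al.) puts a worked rationale in the demonstration, while zero-shot CoT (Kojima et al.) only appends a trigger phrase. The first problem is the tennis-ball example from Wei et al. (2022):

```python
# Few-shot chain-of-thought: the demonstration includes intermediate steps.
few_shot_cot = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 "
    "tennis balls. 5 + 6 = 11. The answer is 11.\n"
    "Q: The cafeteria had 23 apples. If they used 20 and bought 6 more, "
    "how many apples do they have?\n"
    "A:"
)

# Zero-shot chain-of-thought: no demonstration, just the trigger phrase.
zero_shot_cot = (
    "Q: The cafeteria had 23 apples. If they used 20 and bought 6 more, "
    "how many apples do they have?\n"
    "A: Let's think step by step."
)
```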
(Supervised) (Multitask) Instruction Tuning – T0, FLAN, and Davinci-001
Unify different NLP tasks with prompt templates, then perform supervised tuning (a sketch follows the references below).
Sanh, V., Webson, A., Raffel, C., Bach, S., Sutawika, L., Alyafeai, Z., ... & Rush, A. M. (2021). Multitask Prompted Training Enables Zero-Shot Task Generalization. ICLR 2022
Wei, J., Bosma, M., Zhao, V. Y., Guu, K., Yu, A. W., Lester, B., ... & Le, Q. V. (2021). Finetuned Language Models Are Zero-Shot Learners. ICLR 2022
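A sketch of this unification; the templates below are simplified paraphrases of the promptsource/FLAN style, not the exact released templates:

```python
# Cast heterogeneous supervised tasks into plain text-to-text pairs.
# Templates are simplified paraphrases (assumptions), not the released ones.
def nli_template(premise, hypothesis, label):
    prompt = (f'{premise}\nQuestion: does this imply "{hypothesis}"? '
              "Yes, no, or maybe?")
    target = ["Yes", "Maybe", "No"][label]  # entailment / neutral / contradiction
    return prompt, target

def sentiment_template(review, label):
    prompt = f"Review: {review}\nIs this review positive or negative?"
    target = ["negative", "positive"][label]
    return prompt, target

# After templating, one supervised seq2seq objective covers the whole
# multitask mixture, and unseen tasks can be posed as new templates.
```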
How is Instruction Tuning helpful?
Compared with large-scale LM pretraining data, instruction tuning uses very little data. So it is very likely that instruction tuning does not inject new abilities – it merely unlocks abilities the LM already has.
In many zero-shot in-context learning scenarios, the task format closely resembles that of supervised instruction tuning, so zero-shot generalization comes almost as a free lunch from instruction tuning.
Reinforcement Learning from Human Feedback (RLHF) – GPT-3.5
Stiennon, N., Ouyang, L., Wu, J., Ziegler, D., Lowe, R., Voss, C., ... & Christiano, P. F. (2020). Learning to summarize with human feedback. NeurIPS 2020
Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., Mishkin, P., ... & Lowe, R. (2022). Training language models to follow instructions with human feedback. NeurIPS 2022
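The first stage of RLHF trains a reward model on human preference pairs. A minimal PyTorch sketch of the pairwise loss from these papers; `reward_model` is an assumed module mapping a (prompt, response) batch to scalar scores:

```python
# Pairwise reward-model loss: score the human-preferred response above the
# rejected one. reward_model is an assumed scalar-output module.
import torch.nn.functional as F

def reward_model_loss(reward_model, prompt, chosen, rejected):
    r_chosen = reward_model(prompt, chosen)      # shape: (batch,)
    r_rejected = reward_model(prompt, rejected)  # shape: (batch,)
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# The trained reward model then scores policy samples during PPO fine-tuning,
# usually combined with a KL penalty toward the pretrained LM.
```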
How is RLHF helpful?
Achieves better alignment:
Better match with human preferences
Better truthfulness and factuality
Reduced toxicity
Learns to reject improper questions
Learns to reject out-of-scope questions
But sacrifices some in-context learning ability (the “alignment tax”)
Failure Mode of GPT-3 models, Future Direction
Overwriting the model’s beliefs on the fly
Failure Mode of GPT-3 models, Future Direction
It seems LLMs lack systematicity in in-context learning – you can give demonstrations with arbitrary (even incorrect) labels, and the model still performs the task correctly.
This is further evidence that it is hard to overwrite the model’s beliefs about the world.
Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?. EMNLP 2022
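A sketch of the Min et al. setup, with a made-up sentiment task: the demonstration labels are deliberately flipped, yet models often still predict the correct answer for the query:

```python
# In-context demonstrations with deliberately wrong labels (Min et al. 2022).
# Models often perform nearly as well, suggesting demonstrations mainly teach
# the input distribution and label format, not the input-label mapping.
random_label_prompt = (
    "Review: A masterpiece, I loved every minute.\nSentiment: negative\n"  # flipped
    "Review: Dull, predictable, and far too long.\nSentiment: positive\n"  # flipped
    "Review: The acting was superb.\nSentiment:"
)
```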
Failure Mode of GPT-3 models, Future Direction
Logic Consistency
Arithmetic Reasoning
… Maybe it is all about reasoning… :-)
Failure Mode of GPT-3 models, Future Direction
Retrieval from the Internet
Use API/External Tools
WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing. https://openai.com/blog/webgpt/
Nakano, R., Hilton, J., Balaji, S., Wu, J., Ouyang, L., Kim, C., ... & Schulman, J. (2021). WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332.
Schick, T., Dwivedi-Yu, J., Dessì, R., Raileanu, R., Lomeli, M., ... & Scialom, T. (2023). Toolformer: Language models can teach themselves to use tools. arXiv preprint arXiv:2302.04761.
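A toy sketch of Toolformer-style tool use: the LM emits an inline API call, an external tool executes it, and the result replaces the call in the text. The [Calculator(...)] syntax follows the paper’s example; the executor itself is an illustrative assumption:

```python
# Toy executor for inline [Calculator(expr)] calls emitted by the model.
import re

def execute_tool_calls(text):
    def run(match):
        expr = match.group(1)
        # Illustrative calculator only; eval is restricted but still a toy.
        return str(eval(expr, {"__builtins__": {}}))
    return re.sub(r"\[Calculator\((.*?)\)\]", run, text)

generated = "Out of 1400 participants, 400 (or [Calculator(400/1400)]) passed."
print(execute_tool_calls(generated))
# -> Out of 1400 participants, 400 (or 0.2857142857142857) passed.
```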