Tracing Emergent Abilities of Language Models
Feedback is highly appreciated!
Feel free to email me at chenghao dot uchicago dot edu.
Agenda
What are “Emergent Abilities”?
1. Language Generation (following a prompt)
Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., & Neubig, G. (2023). Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 55(9), 1-35.
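As a minimal illustration of prompt following, here is a sketch using the Hugging Face transformers text-generation pipeline; the model name ("gpt2") and the prompt are illustrative assumptions, not anything the cited survey prescribes.

```python
# Minimal sketch of prompt-following generation with Hugging Face transformers.
# The model choice ("gpt2") and the prompt are illustrative assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "Q: What is the capital of France?\nA:"
out = generator(prompt, max_new_tokens=10)
print(out[0]["generated_text"])  # the model simply continues the prompt
```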
What are “Emergent Abilities”?
2. In-Context Learning (few-shot, one-shot, zero-shot)
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. NeurIPS 2020
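The three regimes differ only in how many demonstrations appear in the prompt; no weights are updated. A sketch, with the translation demonstrations adapted from Brown et al. (2020):

```python
# Zero-/one-/few-shot prompts differ only in the number of in-context
# demonstrations; the model conditions on them without gradient updates.
zero_shot = "Translate English to French:\ncheese =>"

one_shot = ("Translate English to French:\n"
            "sea otter => loutre de mer\n"
            "cheese =>")

few_shot = ("Translate English to French:\n"
            "sea otter => loutre de mer\n"
            "peppermint => menthe poivrée\n"
            "plush giraffe => girafe en peluche\n"
            "cheese =>")
```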
What are “Emergent Abilities”?
3. World Knowledge
Weir, N., Poliak, A., & Van Durme, B. (2020). Probing neural language models for human tacit assumptions. CogSci 2020
Sap, M., Shwartz, V., Bosselut, A., Choi, Y., & Roth, D. (2020). Commonsense reasoning for natural language processing. ACL 2020 Tutorial Abstracts.
ChatGPT failures: https://github.com/giuven95/chatgpt-failures
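A sketch of how such probing can be done with a masked LM, in the spirit of Weir et al. (2020); the model choice and the probe sentence are illustrative assumptions:

```python
# Probe an LM for tacit world knowledge via cloze completion.
# Model choice ("bert-base-uncased") and probe sentence are assumptions.
from transformers import pipeline

probe = pipeline("fill-mask", model="bert-base-uncased")
for pred in probe("Everyone knows that a bear has [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))
# High-probability completions such as "fur" or "claws" would indicate the
# model has absorbed common tacit assumptions from pretraining text.
```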
Tracing Emergent Abilities: History of the GPT-3 Family

GPT-3 Series
2020: GPT-3 initial (davinci) – large-scale LM pretraining
2021: Codex initial (code-davinci-001, code-cushman-001) – code training on top of GPT-3
2021: InstructGPT initial, presumably (instruct-davinci-beta, text-davinci-001) – instruction tuning on top of GPT-3

GPT-3.5 Series
2022: code-davinci-002 – LM + code training, then instruction tuning
2022: text-davinci-002 – supervised instruction tuning (FeedME) on top of code-davinci-002
2022: text-davinci-003 – RLHF
2022: ChatGPT – RLHF
Code Training – Codex and InCoder
Chen, M., Tworek, J., Jun, H., Yuan, Q., Pinto, H. P. D. O., Kaplan, J., ... & Zaremba, W. (2021). Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374.
Fried, D., Aghajanyan, A., Lin, J., Wang, S., Wallace, E., Shi, F., ... & Lewis, M. (2022). InCoder: A generative model for code infilling and synthesis. arXiv preprint arXiv:2204.05999.
[Figure: a Codex-style training/evaluation example – a function signature, a docstring (optionally with input/output examples), and a reference implementation.]
Masked-LM style (infilling) pretraining is also used, notably by InCoder.
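For concreteness, here is a HumanEval-style problem in this format (HumanEval/0, lightly paraphrased); at evaluation time the model sees only the signature and docstring and must generate the body:

```python
# HumanEval-style example: signature + docstring (with examples) serve as the
# prompt; the body below is a reference implementation, hidden at test time.
def has_close_elements(numbers, threshold):
    """Check if any two numbers in the list are closer than threshold.
    >>> has_close_elements([1.0, 2.0, 3.0], 0.5)
    False
    >>> has_close_elements([1.0, 2.8, 3.0, 4.0, 5.0, 2.0], 0.3)
    True
    """
    for i, a in enumerate(numbers):
        for b in numbers[i + 1:]:
            if abs(a - b) < threshold:
                return True
    return False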
How is Code Training helpful?
(Chain-of-Thought, the reliability of CoT explanations, and “let’s think step by step” – all from NeurIPS’22)
Wei, J., Wang, X., Schuurmans, D., Bosma, M., Chi, E., Le, Q., & Zhou, D. (2022). Chain of thought prompting elicits reasoning in large language models. NeurIPS 2022
Ye, X., & Durrett, G. (2022). The unreliability of explanations in few-shot prompting for textual reasoning. NeurIPS 2022.
Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large Language Models are Zero-Shot Reasoners. NeurIPS 2022
Khot, T., Trivedi, H., Finlayson, M., Fu, Y., Richardson, K., Clark, P., & Sabharwal, A. (2022). Decomposed prompting: A modular approach for solving complex tasks. ICLR 2023
Liang, P., Bommasani, R., Lee, T., Tsipras, D., Soylu, D., Yasunaga, M., ... & Koreeda, Y. (2022). Holistic evaluation of language models. arXiv preprint arXiv:2211.09110.
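A sketch of the two prompting styles: few-shot CoT (Wei et al.) puts a worked rationale in the demonstration, while zero-shot CoT (Kojima et al.) only appends a trigger phrase. The first problem is the tennis-ball example from Wei et al. (2022):

```python
# Few-shot chain-of-thought: the demonstration includes intermediate steps.
few_shot_cot = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 "
    "tennis balls. 5 + 6 = 11. The answer is 11.\n"
    "Q: The cafeteria had 23 apples. If they used 20 and bought 6 more, "
    "how many apples do they have?\n"
    "A:"
)

# Zero-shot chain-of-thought: no demonstration, just the trigger phrase.
zero_shot_cot = (
    "Q: The cafeteria had 23 apples. If they used 20 and bought 6 more, "
    "how many apples do they have?\n"
    "A: Let's think step by step."
)
```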
(Supervised) (Multitask) Instruction Tuning – T0, FLAN, and Davinci-001
Unify different NLP tasks with prompt templates, then perform supervised tuning (a sketch follows the references below).
Sanh, V., Webson, A., Raffel, C., Bach, S., Sutawika, L., Alyafeai, Z., ... & Rush, A. M. (2021). Multitask Prompted Training Enables Zero-Shot Task Generalization. ICLR 2022
Wei, J., Bosma, M., Zhao, V. Y., Guu, K., Yu, A. W., Lester, B., ... & Le, Q. V. (2021). Finetuned Language Models Are Zero-Shot Learners. ICLR 2022
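A sketch of this unification; the templates below are simplified paraphrases of the promptsource/FLAN style, not the exact released templates:

```python
# Cast heterogeneous supervised tasks into plain text-to-text pairs.
# Templates are simplified paraphrases (assumptions), not the released ones.
def nli_template(premise, hypothesis, label):
    prompt = (f'{premise}\nQuestion: does this imply "{hypothesis}"? '
              "Yes, no, or maybe?")
    target = ["Yes", "Maybe", "No"][label]  # entailment / neutral / contradiction
    return prompt, target

def sentiment_template(review, label):
    prompt = f"Review: {review}\nIs this review positive or negative?"
    target = ["negative", "positive"][label]
    return prompt, target

# After templating, one supervised seq2seq objective covers the whole
# multitask mixture, and unseen tasks can be posed as new templates.
```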
How is Instruction Tuning helpful?
Compared with large-scale LM pretraining data, instruction tuning uses very little data. So it is very likely that instruction tuning does not inject new abilities – it merely unlocks abilities the LM already has.
In many zero-shot in-context learning scenarios, the task format closely resembles that of supervised instruction tuning, so zero-shot generalization comes almost as a free lunch from instruction tuning.
Reinforcement Learning from Human Feedback (RLHF) – GPT-3.5
Stiennon, N., Ouyang, L., Wu, J., Ziegler, D., Lowe, R., Voss, C., ... & Christiano, P. F. (2020). Learning to summarize with human feedback. NeurIPS 2020
Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., Mishkin, P., ... & Lowe, R. (2022). Training language models to follow instructions with human feedback. NeurIPS 2022
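The first stage of RLHF trains a reward model on human preference pairs. A minimal PyTorch sketch of the pairwise loss from these papers; `reward_model` is an assumed module mapping a (prompt, response) batch to scalar scores:

```python
# Pairwise reward-model loss: score the human-preferred response above the
# rejected one. reward_model is an assumed scalar-output module.
import torch.nn.functional as F

def reward_model_loss(reward_model, prompt, chosen, rejected):
    r_chosen = reward_model(prompt, chosen)      # shape: (batch,)
    r_rejected = reward_model(prompt, rejected)  # shape: (batch,)
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# The trained reward model then scores policy samples during PPO fine-tuning,
# usually combined with a KL penalty toward the pretrained LM.
```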
How is RLHF helpful?
Achieves better alignment:
Better match with human preferences
Better truthfulness and factuality
Reduced toxicity
Learns to reject improper questions
Learns to reject out-of-scope questions
But sacrifices some in-context learning ability (the “alignment tax”)
Failure Mode of GPT-3 models, Future Direction
Overwriting the model’s beliefs on the fly
Failure Mode of GPT-3 models, Future Direction
It seems LLMs lack systematicity in in-context learning – you can give demonstrations with arbitrary (even incorrect) labels, and the model still performs the task correctly.
This is further evidence that it is hard to overwrite the model’s beliefs about the world.
Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?. EMNLP 2022
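A sketch of the Min et al. setup, with a made-up sentiment task: the demonstration labels are deliberately flipped, yet models often still predict the correct answer for the query:

```python
# In-context demonstrations with deliberately wrong labels (Min et al. 2022).
# Models often perform nearly as well, suggesting demonstrations mainly teach
# the input distribution and label format, not the input-label mapping.
random_label_prompt = (
    "Review: A masterpiece, I loved every minute.\nSentiment: negative\n"  # flipped
    "Review: Dull, predictable, and far too long.\nSentiment: positive\n"  # flipped
    "Review: The acting was superb.\nSentiment:"
)
```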
Failure Mode of GPT-3 models, Future Direction
Logic Consistency
Arithmetic Reasoning
… Maybe it is all about reasoning… :-)
Failure Mode of GPT-3 models, Future Direction
Retrieval from the Internet
Use API/External Tools
WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing. https://openai.com/blog/webgpt/
Nakano, R., Hilton, J., Balaji, S., Wu, J., Ouyang, L., Kim, C., ... & Schulman, J. (2021). WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332.
Schick, T., Dwivedi-Yu, J., Dessì, R., Raileanu, R., Lomeli, M., ... & Scialom, T. (2023). Toolformer: Language models can teach themselves to use tools. arXiv preprint arXiv:2302.04761.
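A toy sketch of Toolformer-style tool use: the LM emits an inline API call, an external tool executes it, and the result replaces the call in the text. The [Calculator(...)] syntax follows the paper’s example; the executor itself is an illustrative assumption:

```python
# Toy executor for inline [Calculator(expr)] calls emitted by the model.
import re

def execute_tool_calls(text):
    def run(match):
        expr = match.group(1)
        # Illustrative calculator only; eval is restricted but still a toy.
        return str(eval(expr, {"__builtins__": {}}))
    return re.sub(r"\[Calculator\((.*?)\)\]", run, text)

generated = "Out of 1400 participants, 400 (or [Calculator(400/1400)]) passed."
print(execute_tool_calls(generated))
# -> Out of 1400 participants, 400 (or 0.2857142857142857) passed.
```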