[Syllabus] PhD Seminar: Scaling Laws, the Bitter Lesson, and AI Research after GPT-3 (F'21, DS-GA 3001)

PhD Seminar: Scaling Laws, the Bitter Lesson, and AI Research after GPT-3

Listed as DS-GA 3001: Special Topics in Data Science

Fall 2021

Sam Bowman

60 Fifth Ave C10
Weds 11–1:45

Material progress in many areas of applied machine learning has recently been driven more by the effects of scale than by the innovations of specialist AI researchers. In NLP, this is showcased well by the transition from GPT-2 to GPT-3: Even though the two models are nearly identical, a simple increase in parameter count and training time led to dramatic quantitative and qualitative improvements in language and reasoning ability. Should we expect these dynamics to hold for the foreseeable future? If so, what kinds of AI-related research are still likely to produce worthwhile and lasting successes? This seminar aims to be a forum for students looking to better understand these issues and to discuss how they should influence one’s choice of research agenda.

This seminar will focus on topics in the academic literature surrounding scaling laws—attempts to forecast future progress driven by model or dataset scale—and the bitter lesson—the observation that attempts to encode expert knowledge into scalable machine learning models tend to fail. We will use GPT-2 and GPT-3 as a recurring case study, but we will discuss relevant topics from other areas of NLP and other areas of applied machine learning.

The seminar will be centered around student-driven discussions of clusters of related papers. In addition, each student will be asked to write a survey paper, alone or in a small team, synthesizing research on a topic related to the course.

Prerequisites

Enrollment is limited to PhD students who have substantial experience and a publication record working on an applied ML topic. This includes NLP, computer vision, robotics, or general empirical machine learning research. MS or undergraduate students with equivalent experience may be admitted with permission of the instructor if capacity allows.

If you meet the prerequisites but are unable to formally enroll, you are welcome to participate as an auditor. However, auditing in the normal sense is not allowed: Anyone who attends meetings after the drop deadline must be committed to staying with the seminar for the full term and submitting a paper.

Requirements and Grading

Each week, a small group of students will be assigned to briefly present the readings (with no more than fifteen minutes of planned material per reading) and lead discussion. The readings will generally include one required reading and several additional papers (or demos or blog posts) expanding on a related theme.

The only written work requirement will be a survey paper. This paper should be no more than ten pages (in any common ML/AI paper format), and should synthesize work on some narrow topic related to the course. The paper may include original empirical work, but this should not be the primary focus. Papers may be submitted individually or in teams of up to three. Multi-person teams should include a brief collaboration statement.

Formula:

Unexcused late submissions are subject to a 10 percentage point grade deduction.

Logistics

Office hours by appointment.

Schedule

Draft sketch of topics and key readings by week. Full agenda to be shared privately. Each session will include a 15-minute snack/coffee break.

  1. Introductions, brainstorming, initial presenter assignments, GPT-* successes, and limitations
     1. Required: GPT-3
  2. What should we take away from this: Initial reactions
     1. Required: The Bitter Lesson (very short essay)
  3. Forecasting further progress: Initial attempts
     1. Required: Scaling Laws for Neural Language Models
  4. Research directions: How do social impact concerns change with model size/capacity?
     1. Required: On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜
  5. Research directions: Long-term AI alignment/safety w/ predictive models: Major concerns
     1. Risks from Learned Optimization in Advanced Machine Learning Systems
  6. Research directions: Long-term AI alignment/safety w/ predictive models: Approaches
     1. Required: Alignment of Language Agents
  7. Research directions: Evaluation, data collection, and analysis
     1. Required: What Will it Take to Fix Benchmarking in Natural Language Understanding?
     2. Set papers and presenters for the second half of the term
  8. Research directions: ML research as cognitive science research
     1. Required: Syntactic Structure from Deep Learning
  9. Research directions: Grounded learning and interactive learning
     1. Required: Experience Grounds Language
  10. Topics from recent work (to be selected by the group)
     1. Required: TBD
     2. Paper proposals due at the start of class.
  11. Topics from recent work (to be selected by the group)
     1. Required: TBD
  12. Topics from recent work (to be selected by the group)
     1. Required: TBD
  13. Topics from recent work (to be selected by the group)
     1. Required: TBD
  14. Topics from recent work (to be selected by the group)
     1. Required: TBD
  15. Finals week: Project presentations.
     1. Papers due at the start of the assigned exam time.

Applicable University Policies

Academic Integrity

Work you submit should be your own. Please consult the GSAS academic integrity policy for more information: GSAS Statement on Academic Integrity

Penalties for violations of academic integrity may include failure of the course, suspension from the University, or even expulsion.

Religious Observance

As a nonsectarian, inclusive institution, NYU permits members of any religious group to absent themselves from classes without penalty when required for compliance with their religious obligations. The policy and principles to be followed by students and faculty may be found here: University Calendar Policy on Religious Holidays - NYC

Disability Disclosure Statement

Academic accommodations are available to any student with a chronic, psychological, visual, mobility, or learning disability, or who is deaf or hard of hearing. Students should register with the Moses Center for Students with Disabilities at 212-998-4980.

NYU's Henry and Lucy Moses Center for Students with Disabilities

726 Broadway, 2nd Floor

New York, NY 10003-6675

Telephone: 212-998-4980 (Voice/TTY)

Fax: 212-995-4114

Web site: http://www.nyu.edu/csd