CS 6120
Lecture 1: Introduction to Natural Language Processing
Natural Language Processing
“Two eggs on a plate”
2,400,000 bytes (compressed JPEG)
50,000 bytes (compressed MP3)
18 bytes (uncompressed text)
Adoption Rates of ChatGPT in the United States
Usage Stats for ChatGPT
Natural Language Processing
Section 0: A brief introduction to the course
Section 1: Administrative and logistics
Section 2: A lab to get you started
Section 3: Some historical perspectives
What is NLP? Why study language and automate it?
Computer programs that analyze, understand and generate (in)formal human language
What can NLP do for you?
primer.ai
What can you do for NLP?
primer.ai
watsonx.ai
Verticals that use NLP extensively
E-Commerce
Healthcare
Education
Majors and fields of study
Computational linguistics - a case study on marketability
Computational linguistics (CL) is what powers anything in a machine or device that has to do with language—speaking, writing, reading, and listening. It is often linked with natural language processing (NLP), which is a subset of CL.
On the topic of chatbots
Why is this so hard?
Natural Language Processing
Section 0: A brief introduction to the course
Section 1: Administrative and logistics
Section 2: A lab to get you started
Section 3: Some historical perspectives
Welcome!
Survey of Courses Taken
Course Title | NEU Course Number |
Machine Learning | CS 6140 |
Deep Learning | CS 7150 |
(Advanced) Algorithms | CS 5800 / CS 7800 |
Bella Chen: NLP Teaching Assistant
Joy (Hsin-Yu) Guo: NLP Teaching Assistant
Raman: NLP Teaching Assistant
About this course
If ML is an advanced class, this is a very advanced class
Where can you find our material
Our class website (CS6120) is at:
You can find our syllabus, reading, homeworks, project templates, etc. there.
Course Format
You will do well in this class if:
Some suggestions for excelling in the class and beyond:
Class Artifacts
Objectives of the course
By the end of this course, you will build competencies in your:
knowledge-base
implementation fluency
industry/academic skill
By the end of this course, you will have …
… built your foundational skills
You’ll have
By the end of this course, you will have …
… improved your fluency and practical knowledge
You’ll be able to code with velocity by
By the end of this course, you will have …
… built your accomplishments and resume
Your track record will include either:
(We expect that roughly 90% of you will choose the product project over the academic contribution)
You will do this by:
knowledge-base
implementation fluency
industry/academic skill
On Practice-Oriented Problems
Structured towards industry practice of using Natural Language Processing
On Practice-Oriented Problems
Structured towards industry practice of using Natural Language Processing
Lectures build towards modern Large Language Modeling (LLM)
Class Artifacts
Literature and Reading
Open source, data proliferation, and compute ⇒ this field moves faster than textbooks can keep up
What are the required “keynote papers”?
Use your resources! There are so many of them
Keynote Reading Roles
How will we learn and discuss keynote papers?
Fill out your name as a facilitator
Fill out your name as a scribe
Expectations for everyone in the classroom
What to Read and How
Required keynote reading for the next week
Videos for keynote paper
Blogs for keynote paper
Topic in the next week related to keynote paper
Current list of keynote papers
Week | Paper | Notes | Summary | Lead 1 |
6 | Name & Link | Name & Link | Name & Link | |
7 | Name & Link | Name & Link | Name & Link | |
8 | Name & Link | Name & Link | Name & Link | |
9 | Name & Link | Name & Link | Name & Link | |
9 | Name & Link | Name & Link | Name & Link | |
11 | Name & Link | Name & Link | Name & Link | |
12 | Name & Link | Name & Link | Name & Link | |
12 | Name & Link | Name & Link | Name & Link | |
13 | Name & Link | Name & Link | Name & Link | |
13 | Name & Link | Name & Link | Name & Link | |
14 | Name & Link | Name & Link | Name & Link | |
Other things you can do to better understand the paper
The scribe role
The scribe document is due a week after the conversation
Purpose: helps absent students, prepares everyone, and creates a record of the discussion
What’s included in the scribe notes?
Class Artifacts
Two options for your project
Fully Packaged NLP Product
Academic Contribution
Product Delivery to be Scaled
Academic Paper to be Submitted
Grade Breakdown - Traditional Project
Option 1: Business App
Grade Breakdown - Paper Project
Option 2: Paper
Class Project - Route 1
Option 1: Business App
Example projects with front ends
Option 1: Business App
Class Project: Route 2
Option 2: Paper
Course Project(s)
Option 1: Business App
Course Project(s)
Option 2: Paper
Conferences
Option 2: Paper
Class Artifacts
Homeworks and Labs
Natural Language Processing
Section 0: A brief introduction to the course
Section 1: Administrative and logistics
Section 2: A lab to get you started
Section 3: Some historical perspectives
Some Available Tools
LaTeX
Google Colab
Google Cloud Platform
LaTeX and Overleaf
Benefits of using overleaf.com
Joint editing session
\begin{equation}
X \in \mathbb{R}^{10}
\end{equation}
Google Colab
Course Start Date: 9/1/2024
Students’ email domain(s): @northeastern.edu, @ccs.neu.edu
Students can request coupons from the URL and redeem them until: 6/1/2025
Coupons Valid Through: 1/1/2025
Number of Coupons: 30
Face Value of Coupon(s): USD 50.00
Check how many credits you have
Natural Language Processing
Section 0: A brief introduction to the course
Section 1: Administrative and logistics
Section 2: A lab to get you started
Section 3: Some historical perspectives
Where are we today?
We're currently in one of the longest periods of sustained interest in AI in history because:
Many still doubt whether AI can pass the Turing Test, that is, whether we can build systems that convincingly imitate human intelligence and behavior.
It's still an open question how far the technology can go…and how far you can push it.
How we got here?
Section 3: Some History and Where it Pertains to You
A Coarse Timeline
1916 | Saussure's Course in General Linguistics is published |
1950 | Turing publishes Computing Machinery and Intelligence |
1954 | The Georgetown-IBM experiment |
1966 | ELIZA is the first chatbot |
1974 | The First AI Winter begins |
1980 | Expert systems (rule-based if/then computing) revive AI research |
1987 | The Second AI Winter |
1990 | Statistical methods take the community by storm; SVMs are developed and become popular; VC theory is established |
2001 | The first neural language model is built |
2012 | ImageNet makes deep learning the de facto AI poster child |
2013 | Mikolov writes word2vec and uses skipgrams |
2014 | Sutskever et al. publish sequence-to-sequence models, popularizing RNNs |
2015 | Attention modeling is introduced |
2017 | Google researchers introduce the Transformer architecture |
2019 | OpenAI restructures as a capped-profit enterprise |
2022 | OpenAI releases ChatGPT |
2024 | LLMs proliferate throughout the world |
Almost Ending Before It Started (1916)
The Turing Test (1950)
The Imitation Game, a.k.a., the Turing Test
The Georgetown / IBM Experiments (1954)
“Within three or five years, machine translation will be a solved problem”
Purpose: attract governmental and public interest and funding by showing the possibilities of machine translation
The First Chatbot: ELIZA (1966)
Rogerian Psychotherapy as ELIZA's Model
The AI Winters
AI Winters: A Chill in AI Enthusiasm

An AI winter refers to a period of reduced funding and interest in artificial intelligence (AI) research.
Causes of AI Winter
The Three Booms
The Advent of the First AI Winter (1974-1980)
Lighthill Report
Revival (1980s)
The Second AI Winter
1984: John McCarthy criticizes expert systems for their lack of common sense and their inability to understand their own limitations.
1987: Apple and IBM produce general-purpose computers that solve more real-world problems, at a fraction of the cost of the expensive specialized AI systems.
John McCarthy
Late 1980s: DARPA and the Strategic Computing Initiative cut AI funding, not trusting the technology's ability to deliver results.
DARPA Director Schwarz: “… very limited success in particular areas, followed immediately by failure to reach the broader goal at which these initial successes seem at first to hint …”
By 1991, Japan's Fifth Generation Computer project had run for 10 years and spent $400 million, but had not met even one of its original expectations.
Intelligent Agents and Statistical Methods (1990s)
Business Applications of NLP (2011)
SRI International (now unaffiliated with Stanford)
Section 3: Some History and Where it Pertains to You
Modern NLP Approaches (2000+)
The First Neural “Language” Model (2001)
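The core idea can be sketched in a few lines: embed the previous n words, concatenate, and score the whole vocabulary. A minimal, untrained forward pass in the spirit of Bengio's model (all sizes and weights below are illustrative, not from the paper):

```python
import numpy as np

# Hedged sketch of a Bengio-style feed-forward neural language model.
# The vocabulary size and layer sizes are toy values for illustration.
rng = np.random.default_rng(0)

V, d, n, h = 10, 4, 3, 8            # vocab size, embedding dim, context length, hidden units
C = rng.normal(size=(V, d))          # shared word-embedding matrix
H = rng.normal(size=(n * d, h))      # hidden-layer weights
U = rng.normal(size=(h, V))          # output-layer weights

def nnlm_probs(context_ids):
    """P(next word | previous n words): embed, concatenate, tanh, softmax."""
    x = C[context_ids].reshape(-1)   # concatenated context embeddings, shape (n*d,)
    hidden = np.tanh(x @ H)
    logits = hidden @ U
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

p = nnlm_probs([1, 5, 2])            # a distribution over all V words
```

Training would fit C, H, and U by maximizing the likelihood of observed next words; the key novelty was learning the embedding table C jointly with the predictor.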
Multi-task Learning (2008)
Sharing
Word Embeddings (2013)
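Skip-gram training data is just (center, context) word pairs drawn from a sliding window. A minimal sketch (the window size and sentence are illustrative; real word2vec also subsamples frequent words and uses negative sampling):

```python
# Hedged sketch of word2vec-style skip-gram pair extraction.
def skipgram_pairs(tokens, window=2):
    """Yield (center, context) pairs within +/- window of each position."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs("the cat sat on the mat".split(), window=1)
# includes e.g. ("cat", "the") and ("cat", "sat")
```

The model then learns embeddings so that a center word's vector predicts its context words; words appearing in similar contexts end up with similar vectors.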
Sequence to Sequence Modeling (2014)
Regains its footing for:
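The encoder-decoder idea reduces to: run an RNN over the source sequence and keep only its final hidden state as a fixed-size summary that conditions the decoder. A minimal, untrained encoder sketch (vanilla RNN, random weights, illustrative sizes):

```python
import numpy as np

# Hedged sketch of the 2014 encoder half of a sequence-to-sequence model.
# Weights are random (untrained); embedding and hidden sizes are toy values.
rng = np.random.default_rng(0)
d, h = 4, 6                              # embedding dim, hidden size
Wx = rng.normal(size=(d, h))             # input-to-hidden weights
Wh = rng.normal(size=(h, h))             # hidden-to-hidden weights

def rnn_encode(embeddings):
    """Run a vanilla RNN; return the final hidden state (the 'thought vector')."""
    state = np.zeros(h)
    for x in embeddings:
        state = np.tanh(x @ Wx + state @ Wh)
    return state

src = rng.normal(size=(5, d))            # 5 source-token embeddings
context = rnn_encode(src)                # fixed-size summary, shape (h,)
```

A decoder RNN would start from this context and emit target tokens one at a time; squeezing long inputs through one fixed vector is exactly the bottleneck attention was later invented to relieve.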
Attention Modeling
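Attention lets the decoder look back at all encoder states instead of a single fixed vector: each query takes a softmax-weighted average of the values. A minimal numpy sketch of scaled dot-product attention (shapes are illustrative):

```python
import numpy as np

# Hedged sketch of scaled dot-product attention:
# weights = softmax(Q K^T / sqrt(d_k)), output = weights V.
def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))   # 2 queries
K = rng.normal(size=(3, 4))   # 3 keys
V = rng.normal(size=(3, 4))   # 3 values
out, w = attention(Q, K, V)   # out: (2, 4); w: (2, 3)
```

The scaling by sqrt(d_k) keeps the dot products from saturating the softmax as dimensionality grows.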
Wholesale Investments in Neural Language Processing (2015)
A major drawback of statistical methods is that they require elaborate feature engineering. Since about 2015, statistical approaches have largely been replaced by neural networks, which use semantic networks and word embeddings to capture the semantic properties of words.
The Transformer (2017)
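Because the Transformer drops recurrence entirely, it injects word order through sinusoidal positional encodings added to the token embeddings. A short sketch of that scheme (sequence length and model dimension below are illustrative):

```python
import numpy as np

# Hedged sketch of sinusoidal positional encodings:
# even dimensions use sin, odd dimensions use cos, with geometrically
# spaced wavelengths across the model dimension.
def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]        # (seq_len, 1)
    i = np.arange(d_model)[None, :]          # (1, d_model)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

pe = positional_encoding(50, 16)             # shape (50, 16), values in [-1, 1]
```

Each position gets a unique, bounded fingerprint, and relative offsets correspond to fixed linear transformations, which is what lets attention layers use order without recurrence.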
Transformer Modeling Engagements (2017+)
Pre-Trained Language Models
OpenAI releases ChatGPT (2022)
Anthropic
OpenAI Evaluation Metrics
Claude’s Purported Performance Metrics
Claude’s Purported Performance Metrics
OpenAI o1 (evaluation comparisons between the 4o and o1 series)
Large Language Models
Large Language Models
LLMs proliferate throughout the world
Beyond ChatGPT, Gemini, Claude, and Llama in the United States:
QWEN - Alibaba’s Line of LLMs
Out of 81 large-scale AI models, 43 were developed by organizations based in the United States.
Around a quarter of these were from China
Progress of LLMs above 10²³ FLOPs
Section 3: Some History and Where it Pertains to You
Recent Presentations to Etsy