Generative AI and LLMs
Dr. Savannah Thais
ENGI E4800 Lecture 6
Is ChatGPT Trustworthy?
Discussion on Red Teaming Exercises
What are your biggest concerns around generative AI?
What do you think generative AI can safely be used for?
How Do LLMs Work?
Attention
Is All You Need
Attention Is All You Need: Vaswani et al.
(Except for Humans)
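As a rough illustration of the attention mechanism named in the Vaswani et al. title, here is a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, in plain Python. The function names and the list-of-lists representation are assumptions made for this example; production implementations use batched tensor libraries and add multiple heads, masking, and learned projections.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors.

    queries, keys, values are lists of equal-length float lists
    (a hypothetical representation chosen for readability).
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Blend the value vectors using the attention weights.
        blended = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(blended)
    return outputs
```

When every key is identical, the weights are uniform and each output is simply the mean of the value vectors, which makes for a quick sanity check.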
What Can LLMs Do?
Maybe a lot…
How should we evaluate an LLM's capabilities?
ChatGPT Website and GPT-4 Research Page
3 Challenges with LLM Evaluations
Prompt Sensitivity: Are you measuring something intrinsic about the model, or is it an artifact of your prompt?
Construct Validity: Model behavior is not a construct that exists independently of users and prompting.
Contamination: What exists in the training data? Is the model demonstrating behavior or memorization?
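The contamination question can be probed crudely by checking how many word n-grams of a benchmark item also appear verbatim in a reference corpus. The sketch below is an illustration only, not a method from the lecture: the function names, whitespace tokenization, and the default n of 8 are all assumptions, and in practice the training corpus of a commercial model is usually not available to inspect.

```python
def ngrams(text, n):
    """Return the set of lowercase word n-grams in text (whitespace tokens)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_fraction(benchmark_item, corpus_text, n=8):
    """Fraction of the item's n-grams that also occur in the corpus.

    A value near 1.0 suggests the item may have been seen verbatim;
    a low value does not rule out paraphrased contamination.
    """
    item_grams = ngrams(benchmark_item, n)
    if not item_grams:
        return 0.0  # item is shorter than n words, nothing to compare
    corpus_grams = ngrams(corpus_text, n)
    return len(item_grams & corpus_grams) / len(item_grams)


# Hypothetical usage with toy strings:
score = overlap_fraction(
    "the quick brown fox jumps",
    "a quick brown fox appears",
    n=3,
)
```

A high overlap is evidence of possible memorization rather than proof, since short or formulaic questions can overlap with unrelated text by chance.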
Performance Degradation?
How is ChatGPT’s Behavior Changing Over Time?: Chen et al.
Performance Degradation?
Is GPT-4 Getting Worse Over Time?: Narayanan & Kapoor
Liberal Bias?
Does ChatGPT have a liberal bias?: Narayanan & Kapoor
Exam Performance
GPT-4 Technical Report: OpenAI
Exam Performance
GPT-4 and professional benchmarks: the wrong answer to the wrong question: Narayanan & Kapoor
Real World Exams
Case Study: AI Theory of Mind and Creativity
Discussion
Training Data
Societal Context