CS50 Educator Workshop
2025
MALAN
Schedule
hellos, world
David J. Malan Rongxin Liu Julianna Zhao
Teaching CS50 with AI
Leveraging Generative Artificial Intelligence
in Computer Science Education
ChatGPT et al. are too helpful
Pedagogical Guardrails
Not Reasonable
Using AI-based software (such as ChatGPT, GitHub Copilot, Bing Chat, et al.) that suggests or completes answers to questions or lines of code.
Reasonable
rubber duck debugging
rubberducking
Provide students with virtual office hours 24/7
Approximate a 1:1 teacher-to-student ratio
Thank You
Visual Studio Code for CS50
Explain highlighted lines of code
Advise students on how to improve their code's style
Advise students on how to improve their code's design
Answer (most of the) questions asked online by students
CS50.ai
System Prompt
You are a friendly and supportive teaching assistant for CS50. You are also a rubber duck. Answer student questions only about CS50 and the field of computer science; do not answer questions about unrelated topics… Do not provide full answers to problem sets, as this would violate academic honesty…
Answer this question:
User Prompt
April Fools
You are a friendly and supportive teaching assistant for CS50. You are also a rubber duck in Rick Astley's band… Importantly, you should always cheer up the student at the end by incorporating "Never Gonna Give You Up" in your response.
Answer this question:
Results
317K users
21K prompts/day, 15M total so far
Usage Frequency
Helpfulness
Results
Without AI, students asked 0.89 questions each of TFs.
With AI, students asked 0.28 questions each of TFs.
Results
Without AI, students attended 51% of available office hours.
With AI, students attended 30% of available office hours.
felt like having a personal tutor… i love how AI bots will answer questions without ego and without judgment, generally entertaining even the stupidest of questions without treating them like they're stupid. it has an, as one could expect, inhuman level of patience.
The AI tools gave me enough hints to try on my own and also helped me decipher errors and possible errors I might encounter.
I also appreciated that CS50 implemented its own version of AI, because I think just directly using something like chatGPT would have definitely detracted from learning
Grades
RONGXIN
Implementing a Chatbot
OpenAI APIs
The Chat API powers conversational models that can engage in dialogue, answering questions, providing explanations, and generating content in a conversational format.
The Embeddings API generates numerical representations (vectors) of text, making it possible to measure the semantic similarity between pieces of text. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.
The Assistants API allows one to build AI assistants within their own applications. An Assistant has instructions and can leverage models, tools, and knowledge to respond to user queries.
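To make the Embeddings API description concrete, here is a minimal sketch of comparing vectors by distance. It assumes cosine distance as the metric (one common choice; the slides do not say which metric is used), and the three short vectors are made-up toy values, not real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Smaller distance suggests higher relatedness."""
    return 1.0 - cosine_similarity(a, b)


# Toy 3-dimensional vectors (hypothetical; real embeddings have many more dimensions).
query = [0.1, 0.9, 0.2]
related = [0.2, 0.8, 0.1]
unrelated = [0.9, -0.1, 0.3]

# The related vector should sit closer to the query than the unrelated one.
print(cosine_distance(query, related) < cosine_distance(query, unrelated))  # prints True
```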
Large Language Models (LLMs)
Text Generation
Chatbot + Context
System: Provides behavior guidelines for the assistant.
User: Inputs or queries from the user to the assistant.
Assistant: Responses generated by the LLM based on user inputs.
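As a rough illustration of the three roles, the sketch below builds the list of messages that starts a conversation. The helper name `build_messages` and the sample question are hypothetical; the assistant role only appears once the model has replied.

```python
def build_messages(system_prompt, question):
    """Return a two-message list: system guidelines, then the user's question."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


# The question here is a made-up example, not one from the slides.
messages = build_messages(
    "You are a friendly and supportive teaching assistant for CS50.",
    "What is a pointer?",
)
for message in messages:
    print(message["role"], "->", message["content"])
```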
System message
You are a friendly and supportive teaching assistant for CS50. You are also a rubber duck. Answer student questions only about CS50 and the field of computer science; do not answer questions about unrelated topics… Do not provide full answers to problem sets, as this would violate academic honesty…
Answer this question:
[Diagram: the User sends a Prompt to the Assistant (GPT-4o, LLaMA, etc.), which returns a Response.]
Prompt Engineering
Chat Completions API
Can you help me with my tideman problem set?
I'd be happy to help with the CS50 Tideman problem! Could you please specify which aspect …
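from openai import OpenAI

# OpenAI() reads the API key from the OPENAI_API_KEY environment variable.
OpenAI().chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a friendly… a rubber duck."},
        {"role": "user", "content": "Can you help me with my filter pset?"},
    ],
    model="gpt-4o",
)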
Response: Of course! I'd be happy to help. Could you please specify which part of the filter pset is giving you trouble? There are several parts to it, like the grayscale, sepia, reflect, or blur filters.
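The call above returns a response object rather than plain text. A minimal sketch of pulling out the reply follows; `ask_duck` and `FakeClient` are hypothetical names, and the fake client only imitates the shape of the SDK's response (`choices[0].message.content`) so the example runs without an API key.

```python
from types import SimpleNamespace


def ask_duck(client, question, model="gpt-4o"):
    """Send one question and return the text of the first reply."""
    response = client.chat.completions.create(
        messages=[
            {"role": "system", "content": "You are a friendly… a rubber duck."},
            {"role": "user", "content": question},
        ],
        model=model,
    )
    return response.choices[0].message.content


class FakeClient:
    """Hypothetical stand-in that mimics the shape of the SDK's response."""

    def __init__(self, reply):
        message = SimpleNamespace(content=reply)
        completion = SimpleNamespace(choices=[SimpleNamespace(message=message)])
        create = lambda **kwargs: completion
        self.chat = SimpleNamespace(completions=SimpleNamespace(create=create))


reply = ask_duck(
    FakeClient("Of course! I'd be happy to help."),
    "Can you help me with my filter pset?",
)
print(reply)
```

With the real SDK installed and `OPENAI_API_KEY` set, the same function could presumably be called as `ask_duck(OpenAI(), "...")` after `from openai import OpenAI`.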
Hands-on Practice
export OPENAI_API_KEY=sk-proj-...
Conversation
OpenAI().chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a friendly… a rubber duck."},
        {"role": "user", "content": "Can you help me with my filter pset?"},
    ],
    model="gpt-4o",
)
Response: Of course! I'd be happy to help. Could you please specify which part of the filter pset is giving you trouble? There are several parts to it, like the grayscale, sepia, reflect, or blur filters.
OpenAI().chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a friendly… a rubber duck."},
        {"role": "user", "content": "Can you help me with my filter pset?"},
        {"role": "assistant", "content": "Of course! I’d be happy … "},
    ],
    model="gpt-4o",
)
OpenAI().chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a friendly… a rubber duck."},
        {"role": "user", "content": "Can you help me with my filter pset?"},
        {"role": "assistant", "content": "Of course! I’d be happy … "},
        {"role": "user", "content": "<User Prompt>"},
    ],
    model="gpt-4o",
)
OpenAI().chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a friendly… a rubber duck."},
        {"role": "user", "content": "Can you help me with my filter pset?"},
        {"role": "assistant", "content": "Of course! I’d be happy … "},
        {"role": "user", "content": "<User Prompt>"},
        {"role": "assistant", "content": "<Assistant Response>"},
        {"role": "user", "content": "<User Prompt>"},
        {"role": "assistant", "content": "<Assistant Response>"},
        {"role": "user", "content": "<User Prompt>"},
        # …
    ],
    model="gpt-4o",
)
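The growing message list above can be sketched as a small class that resends the full history on every turn. `Conversation` and `fake_complete` are hypothetical names; the fake function stands in for the model so the example runs offline, and the second user prompt is made up.

```python
class Conversation:
    """Keep the running list of messages and resend all of it each turn."""

    def __init__(self, system_prompt, complete):
        # `complete` is any callable that maps a list of messages to reply text.
        self.messages = [{"role": "system", "content": system_prompt}]
        self.complete = complete

    def ask(self, prompt):
        self.messages.append({"role": "user", "content": prompt})
        reply = self.complete(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_complete(messages):
    # Hypothetical stand-in for the model: reports how many messages it was sent.
    return f"(reply after seeing {len(messages)} messages)"


chat = Conversation("You are a friendly… a rubber duck.", fake_complete)
chat.ask("Can you help me with my filter pset?")
chat.ask("I'm stuck on blur.")
print([m["role"] for m in chat.messages])
```

With the real SDK, `complete` could presumably wrap `OpenAI().chat.completions.create(messages=messages, model="gpt-4o")` and return `choices[0].message.content`.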
Hands-on Practice
Hallucinations
Grounding
Retrieval-Augmented Generation (RAG)
What is flask?
RAG
Updated Prompt
"What is flask?"
"embedding":
[-0.0168844070, -0.0094333650, -0.0136059495, -0.017577527,
-0.0011228547, -0.0064980015, -0.0234829110, 0.0065499856,
-0.0023427461, -0.0181181620, 0.0070386350, 0.013203939,
…
-0.0078253270, -0.0289447000, -0.0306913610]
The embedding of plain text "What is flask?"
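A vector like the one above might be requested roughly as follows. This is a sketch under assumptions: the slides do not name the embedding model, so `text-embedding-3-small` is a guess, and `FakeEmbeddingsClient` is a hypothetical stand-in that imitates the shape of the SDK's response (`data[0].embedding`) so the example runs without an API key.

```python
from types import SimpleNamespace


def embed(client, text, model="text-embedding-3-small"):
    """Return the embedding vector for `text` as a list of floats."""
    response = client.embeddings.create(input=text, model=model)
    return response.data[0].embedding


class FakeEmbeddingsClient:
    """Hypothetical stand-in that always returns the vector it was given."""

    def __init__(self, vector):
        result = SimpleNamespace(data=[SimpleNamespace(embedding=vector)])
        self.embeddings = SimpleNamespace(create=lambda **kwargs: result)


# Only the first three values shown on the slide; a real vector is much longer.
client = FakeEmbeddingsClient([-0.0168844070, -0.0094333650, -0.0136059495])
vector = embed(client, "What is flask?")
print(len(vector))
```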
Lecture Video
An excerpt of lecture captions (SRT format)
00:05:33,300 --> 00:05:36,450
means it's relatively small versus alternatives that are out there.
00:05:36,450 --> 00:05:37,800
And it's called Flask.
00:05:37,800 --> 00:05:40,350
So Flask is really a third-party library--
00:05:40,350 --> 00:05:42,420
and it's popular in the Python world-- that's
00:05:42,420 --> 00:05:46,260
just going to make it easier to implement web applications using
Lecture Captions Segment
means it's relatively small versus alternatives that are out there. And it's called Flask. So Flask is really a third-party library-- and it's popular in the Python world-- that's just going to make it easier to implement web applications using
"embedding":
[-0.0020580715, 0.01005940200, 0.00657967060,
-0.0138025950, 0.01654669000, 0.01074371600,
-0.0135357130, -0.02156954800, -0.00049869320,
-0.0200230010, -0.00152516280, 0.00514261300,
-0.0255248790, -0.00060818327, -0.01628665300,
…
0.0020050374, -0.00763693400, -0.02419731200,
-0.0411956500]
Lecture Captions Embedding
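One way caption cues could be merged into a segment like the one above is to drop the timestamps and join the remaining text. This is only a sketch: `srt_to_text` is a hypothetical helper, and the slides do not say how segment boundaries are actually chosen.

```python
import re

# Matches SRT timing lines such as "00:05:36,450 --> 00:05:37,800".
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")


def srt_to_text(srt):
    """Drop timestamps, cue numbers, and blank lines; join the caption text."""
    parts = []
    for line in srt.splitlines():
        line = line.strip()
        if not line or line.isdigit() or TIMESTAMP.match(line):
            continue
        parts.append(line)
    return " ".join(parts)


excerpt = """\
00:05:36,450 --> 00:05:37,800
And it's called Flask.
00:05:37,800 --> 00:05:40,350
So Flask is really a third-party library--
"""
print(srt_to_text(excerpt))
```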
"What is flask?"
"embedding": [
-0.0168844070,
-0.0094333650,
-0.0136059495,
-0.0175775270,
-0.0011228547,
-0.0064980015,
-0.0234829110,
0.0065499856,
-0.0023427461,
-0.0181181620,
0.0070386350,
0.0132039390,
-0.0274752840,
0.0254236480,
-0.0053300940,
…
-0.0078253270,
-0.0289447000,
-0.0306913610
]
means it's relatively small versus alternatives that are out there. And it's called Flask. So Flask is really a third-party library-- and it's popular in the Python world-- that's just going to make it easier to implement web applications using
Search
Retrieved Document
"What is flask?
Here is some useful information:
```
means it's relatively small versus alternatives that are out there. And it's called Flask. So Flask is really a third-party library-- and it's popular in the Python world-- that's just going to make it easier to implement web applications using
```"
LLM (GPT-4o)
Vector Database
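The retrieval flow above can be sketched end to end with an in-memory list standing in for the vector database. Everything here is illustrative: the two-dimensional vectors are made up, the second document is invented filler, and cosine similarity is an assumed metric. The prompt format follows the "Here is some useful information:" example on the slide.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve(query_vector, documents):
    """Return the text of the (text, vector) pair most similar to the query."""
    best = max(documents, key=lambda doc: cosine_similarity(query_vector, doc[1]))
    return best[0]


def build_prompt(question, context):
    """Append the retrieved text to the question, fenced as on the slide."""
    fence = "`" * 3
    return f"{question}\nHere is some useful information:\n{fence}\n{context}\n{fence}"


# Hypothetical toy index; a real one would hold embeddings of caption segments.
documents = [
    ("And it's called Flask. So Flask is really a third-party library", [0.1, 0.9]),
    ("(made-up unrelated caption segment)", [0.9, 0.1]),
]
query_vector = [0.2, 0.8]  # stands in for the embedding of "What is flask?"

prompt = build_prompt("What is flask?", retrieve(query_vector, documents))
print(prompt)
```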
JULIANNA
Challenges
Instruction Dilution
Model | Messages | Code Blocks Generated | Messages with Code Blocks (%) | Conversations with Code Blocks (%)
gpt-4 | 6,487,201 | 1,326,273 | 20% | 44%
gpt-4o | 3,203,702 | 817,739 | 25% | 56%
Frequency of Code Block Generation in Student-Duck Interactions: Analysis of 10M messages reveals 22% of responses and 48% of conversations include code blocks, with higher rates observed after switching to GPT-4o.
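The message-level percentages can be checked from the counts in the table. This sketch assumes "Code Blocks Generated" counts messages that contain a code block; the conversation-level figures cannot be recomputed because conversation counts are not given.

```python
# (messages, messages with code blocks) per model, copied from the table.
counts = {
    "gpt-4": (6_487_201, 1_326_273),
    "gpt-4o": (3_203_702, 817_739),
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


for model, (messages, with_code) in counts.items():
    print(f"{model}: {percent(with_code, messages):.1f}%")

total_messages = sum(messages for messages, _ in counts.values())
total_with_code = sum(with_code for _, with_code in counts.values())
overall = percent(total_with_code, total_messages)
print(f"overall: {overall:.1f}%")
```

The results land close to the slide's rounded 20%, 25%, and 22%, and the two message counts sum to about 9.7M, in line with the "10M messages" in the caption.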
Misalignment with Pedagogical Goals
Lack of Systematic Evaluation of AI Performance
Proposed Solutions
Aligning with pedagogical goals:
Evaluating performance:
Results
Created a focused dataset of 50 student queries from past AI interactions
Distribution of queries:
Tested two versions initially:
The distribution of choices made in the TF-graded evaluation reveals a split between preference for V0 and V1, suggesting areas for improvement in both. Teaching fellows with at least two semesters of experience showed more preference for V1.
Four variants compared:
Key differences:
The win rates of the multi-turn evaluation show that teaching fellows preferred conversations generated by the models with few-shot prompting and fine-tuning over the original version (V0) 60% of the time.
The estimated Elo score of V0 was the lowest, with its 95% confidence interval showing no overlap with the confidence intervals of V2 and V3, ranking it last.
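For readers unfamiliar with Elo, here is a minimal sketch of how pairwise preferences turn into ratings. It uses the standard online Elo update with an assumed K-factor of 32 and a starting rating of 1000; the slides do not say how the scores or confidence intervals were actually estimated, and the preferences below are hypothetical, not data from the evaluation.

```python
def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the Elo model."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def update(ratings, winner, loser, k=32):
    """Shift ratings toward the observed outcome; total rating is conserved."""
    expected = expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * (1 - expected)
    ratings[loser] -= k * (1 - expected)


ratings = {"V0": 1000.0, "V1": 1000.0}

# Hypothetical (winner, loser) judgments; not results from the workshop.
preferences = [("V1", "V0"), ("V1", "V0"), ("V0", "V1")]
for winner, loser in preferences:
    update(ratings, winner, loser)

print(ratings)
```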
AI Model-Graded Evaluation
RONGXIN
OpenAI GPTs
Resources
OpenAI Cookbooks
Papers
Lectures
Talks
Q&A