1 of 70

DSTL Lab Showcase

June 6, 2025

2 of 70

Welcome!

3 of 70

Prof. Samuel Lau

Asst. Teaching Prof. in HDSI

Pandas Tutor

Data science curriculum

LLMs in CS + DS Education

4 of 70

About the Lab

Sam supervises 3 teams

ContentGen (PL: Ylesia)

Ayush, Jiaen, Gabriel

Class[Buzz] (PL: Owen)

Michelle, Jack, Chris, Parna

WISE (PL: Andrew)

Sathvika, Achintya

5 of 70

A typical year in the lab

Fall: Recruit + teams form

Winter: Full-speed ahead!

Spring: Present work, write papers

Summer: Write papers, brainstorm new project ideas

SIGCSE 2025 in Pittsburgh

6 of 70

A typical week in the lab

All-hands meeting / lab social

Project team meeting

PLs meet with Sam

7 of 70

Hear from our lab members!

8 of 70

ContentGen

Ylesia Wu, Ayush Shah, Gabriel Cha, Jiaen Yu

9 of 70

Demo clip 2

Entering API key + ~2 examples

10 of 70

Motivation

  • Why question generation?
  • Why JupyterLab?
  • Good question vs. Bad question

11 of 70

Project Process Overview

  • Jupyter extension development
    • Prototyping
    • UI/UX
    • Prompt
    • PyPI
  • User Interviews
  • Prompt testing

12 of 70

Code base overview

User opens a notebook

Process Notebook Structure

Frontend

Backend

Send Notebook Content

User selects a notebook cell + enters a message

Save to frontend

Send relevant notebook details

1st API call for question + answer generation

2nd API call for JSON format correction

Update chat interface + insert new question cell

Send LLM response
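
The two API calls in this flow can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: `generate_question`, the prompt wording, and the injected `call_llm` function are all hypothetical, and the real backend may structure the calls differently.

```python
import json
from typing import Callable


def generate_question(
    notebook_details: str,
    user_message: str,
    call_llm: Callable[[str], str],  # hypothetical: takes a prompt, returns text
) -> dict:
    """Two-step flow: generate a question + answer, then correct the JSON."""
    # 1st call: question + answer generation from the notebook details.
    draft = call_llm(
        "Write one practice question and its answer.\n"
        f"Notebook details:\n{notebook_details}\n"
        f"Instructor message:\n{user_message}"
    )
    # 2nd call: format correction only, so the result can be parsed reliably.
    fixed = call_llm(
        "Rewrite the following as valid JSON with the keys 'question' and "
        f"'answer'. Return only the JSON.\n{draft}"
    )
    return json.loads(fixed)
```

Splitting generation from formatting lets each prompt stay short and focused; the parsed result is what would be sent back to the frontend to update the chat and insert the new cell.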

13 of 70

Development Process

  • Ayush:
    • One thing interesting: Learning more about TypeScript
    • One thing challenging: Prompt Engineering
    • One takeaway: LLMs can’t do everything
  • Ylesia:
    • One thing interesting: Publishing a PyPI package
    • One thing challenging: Debugging frontend and backend
    • One takeaway: Just start

14 of 70

User Interviews

  • Users
    • 6 instructors in total
    • DSC 10, DSC 80, and COGS 18
    • Professor + TA
  • Background
    • Jupyter Notebook
    • Using GenAI for generating questions

15 of 70

Instructors’ reaction to the tool

Usefulness

  • Questions are generally helpful and relevant.
    • The tool works better for simple topics than for complex or multi-step topics
  • Good for students outside of lecture
  • Perceived as a useful tool for saving prep time and enriching student practice.

Ease of Use

  • Found the UI generally intuitive and easy to navigate
  • Most common troubles:
    • Features needed clarification (follow-ups, formatting, etc.)
    • Takes a few tries to get used to

16 of 70

Instructors’ reaction to the tool

Usefulness

  • Questions are generally helpful and relevant.
    • The tool works better for simple topics than for complex or multi-step topics.
  • Good for students outside of lecture
    • Discussions, review sessions, office hours.
  • Perceived as a useful tool for saving prep time and enriching student practice.
    • Especially for undergraduate-level classes.

Ease of Use

  • Found the UI generally intuitive and easy to navigate
  • Most common troubles:
    • Features needed clarification (follow-ups, formatting, etc.)
    • Takes a few tries to get used to

17 of 70

After Interview - Next Steps

  • Finding a way to credit ContentGen in the notebooks that use it
  • More customization (question types, formats)
  • Adding more ease-of-use features
    • For example, answer code that can be run directly in the cell
  • Creating a short guide to the extension for new users

18 of 70

The Eval

  • Test cases
    • DSC 10 + DSC 80 lecture notebooks
  • Metrics
    • Correctness
    • Context
    • Coherence
  • Manually evaluated 3 versions of the prompt
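
One way to tabulate manual ratings like these is sketched below. The record layout, the 0/1 scoring scale, and the names `summarize`, `prompt_a`, and `prompt_b` are assumptions made for illustration; the ratings in the usage example are placeholders, not the lab's eval data.

```python
from statistics import mean

METRICS = ("correctness", "context", "coherence")


def summarize(ratings: list[dict]) -> dict:
    """Average each metric per prompt version from manual ratings."""
    by_version: dict[str, dict[str, list[float]]] = {}
    for rating in ratings:
        scores = by_version.setdefault(
            rating["version"], {metric: [] for metric in METRICS}
        )
        for metric in METRICS:
            scores[metric].append(rating[metric])
    return {
        version: {metric: mean(values) for metric, values in scores.items()}
        for version, scores in by_version.items()
    }


# Placeholder ratings (1 = acceptable, 0 = not), invented for this example.
example = [
    {"version": "prompt_a", "correctness": 1, "context": 0, "coherence": 1},
    {"version": "prompt_a", "correctness": 0, "context": 0, "coherence": 1},
    {"version": "prompt_b", "correctness": 1, "context": 1, "coherence": 1},
]
print(summarize(example))
```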

19 of 70

Prompts

v0.1.0:

Naive one-paragraph prompt

v0.1.1:

More comprehensive prompt

- Context

- Instructions

- Response Format

- Notebook Details
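
A sectioned prompt like v0.1.1 could be assembled along these lines. Only the four section names come from the slide; the function name `build_prompt`, the heading markers, and the example text are hypothetical, and the actual ContentGen prompt wording is not shown here.

```python
def build_prompt(
    context: str,
    instructions: list[str],
    response_format: str,
    notebook_details: str,
) -> str:
    """Join the four sections into one prompt, in a fixed order."""
    sections = [
        ("Context", context),
        ("Instructions", "\n".join(f"- {item}" for item in instructions)),
        ("Response Format", response_format),
        ("Notebook Details", notebook_details),
    ]
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections)


# Example text below is invented for illustration.
prompt = build_prompt(
    context="You help instructors write practice questions.",
    instructions=["Base the question on the selected cell.", "Include an answer."],
    response_format="JSON with the keys 'question' and 'answer'.",
    notebook_details="Selected cell: df.groupby('team').mean()",
)
print(prompt)
```

Keeping the sections separate makes it easy to change one part, such as the notebook details, while holding the rest constant between prompt versions.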

20 of 70

Prompts

v0.1.4: Notebook Structure

v0.1.1:

21 of 70

Test Cases

22 of 70

Eval Data

23 of 70

Eval Data

baseline

detailed instructions

detailed w/ notebook structure

24 of 70

Reflection + Next Steps

  • What worked: providing the important parts of the notebook and making use of multiple LLM calls
  • What did not work: telling the LLM not to do something
  • Next steps: user logging to collect more data

25 of 70

Thank You!

Any questions?

Try it out yourself!

pip install contentgen

26 of 70

Class[Buzz]

Chris, Jack, Michelle, Owen, Parna

27 of 70

What is Class[Buzz]

28 of 70

Poll Types

29 of 70

Building on Class[Buzz]

30 of 70

Original Site Demo

31 of 70

Updated Site Demo

32 of 70

UI Updates

33 of 70

AI Summaries

34 of 70

Additional Features

Exporting data to CSV

Added filesystem (+restructuring)

Deleting users

Editing & deleting polls
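
As an illustration of the CSV export feature, here is a minimal sketch. Python is used only for the example; the slides do not describe the Class[Buzz] implementation at this level, and the column names `poll_id`, `student`, and `answer` are hypothetical.

```python
import csv
import io


def responses_to_csv(responses: list[dict]) -> str:
    """Serialize poll responses to CSV text with a header row."""
    fieldnames = ["poll_id", "student", "answer"]  # hypothetical columns
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(responses)
    return buffer.getvalue()


# Invented rows; the csv module quotes answers that contain commas.
rows = [
    {"poll_id": 1, "student": "s001", "answer": "B"},
    {"poll_id": 1, "student": "s002", "answer": "A, but not sure"},
]
print(responses_to_csv(rows))
```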

35 of 70

Tech Stack

Frontend

Backend

Database

36 of 70

Code Base Overview

37 of 70

Life as a Class[Buzz] dev

38 of 70

Michelle’s Experience


39 of 70

Jack’s Experience

The 8 weeks: const result = await groupResponses(sortedVotes);

  • Adding features.
  • Histogram generation.
  • Iterative prompting and debugging.
    • Wiping database?
    • Leaking personal information?

40 of 70

Owen’s Experience

41 of 70

Hopes and Dreams

Conversation Trees

42 of 70

43 of 70

WISE

Watchful Intelligent Science Expert

WISE Team: Achintya, Sathvika, Andrew

44 of 70

Table of contents

01 Introduction: Problem Statement

02 Road Map: Development Process

03 Final Product: Demo

04 Research: Research Structure & Findings

45 of 70

01

Introduction

46 of 70

Imagine…

47 of 70

You find yourself asking…

What’s the best way to handle missing values in this context?

Why did this visualization break?

How do I interpret this pattern I’m seeing?

48 of 70

LLMs?

49 of 70

WISE

Introducing…

50 of 70

Demo Video

51 of 70

AI Assistant

Domain Aware Support

Smarter EDA Suggestions

Real Time & Convenient

Relevant Insights within Jupyter Environment

Task-Specific Guidance

52 of 70

02

Roadmap

53 of 70

Journey

Spring

-Learning Tools/Tech

-Building an Extension in Jupyter Environment

-Full Development & Iterative Design Process

-Features (DataFrame, Plots, Domain Knowledge)

-AI Capabilities

-Designing Research Study

-Conducting Study

-Analyzing Results

-Fine Tuning/Refining Extension

Fall

Winter

54 of 70

03

Extension

55 of 70

Architecture

You (unpaid intern)

plot(), head()

adjQ4_TotSpend_USD_vFinal_FINAL_USETHIS_notNullCleaned_bkp_v3__real.csv??

Frontend event tracker

POST request

Backend

NotebookState.JSON

(Peter) Parser

  • Code Cells
  • Output
  • Markdown Cells
  • Plots, Dataframes
  • Keywords
  • Analysis

Lexicon Luther

Regex, tokenization

LLM

Dana Frames

Dr. Graphenstein

Graphenstein’s Monster

Refines
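
The parser stage can be sketched as below, assuming the notebook state arrives as standard Jupyter notebook JSON (a `cells` list whose entries have `cell_type`, `source`, and, for code cells, `outputs`). The function name `parse_notebook`, the shape of the returned dictionary, and the simple regex keyword pass are assumptions for illustration, not the WISE implementation.

```python
import re


def parse_notebook(nb: dict) -> dict:
    """Split a notebook into code, markdown, output types, and keywords."""
    state = {"code": [], "markdown": [], "outputs": [], "keywords": []}
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):  # notebook JSON may store source as lines
            source = "".join(source)
        if cell.get("cell_type") == "code":
            state["code"].append(source)
            for output in cell.get("outputs", []):
                state["outputs"].append(output.get("output_type", "unknown"))
        elif cell.get("cell_type") == "markdown":
            state["markdown"].append(source)
    # Keyword pass: regex tokenization over markdown, keeping words of 4+ letters.
    words = re.findall(r"[A-Za-z_]{4,}", " ".join(state["markdown"]).lower())
    state["keywords"] = sorted(set(words))
    return state


# Invented example notebook.
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Spending by quarter"]},
        {
            "cell_type": "code",
            "source": "df.head()",
            "outputs": [{"output_type": "execute_result"}],
        },
    ]
}
print(parse_notebook(notebook))
```

A structured summary like this is the kind of input that could then be handed to the LLM stage instead of the raw notebook file.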

56 of 70

Technical Challenges

Problem

Solution

Relevant AI Suggestions

How to understand the user’s notebook and analytical goals?

How to make the analysis relevant and useful?

Notebook Understanding

57 of 70

Final Product

Turns Ambiguity into Action

Empowers Data Scientists of All Levels

Saves Time

58 of 70

04

Research

What helps a novice data scientist working with a dataset from an unfamiliar domain push past the domain-knowledge barrier?

59 of 70

Designing Our Protocol

Participants

-7 PhD Students from different Labs/Domains

-7 Novice Data Scientists (3rd Year DSC Major Students)

Procedure

-Domain Specific EDA Task Set up by PhD Student

-Task Introduction

-Task Execution

-Post Task Interview

60 of 70

Conducting the Study

-Researchers on mute observing

-Novice and Expert interacting during the task

-Separate post-task interviews about experiences

1-1.5 Hour Zoom Session

61 of 70

Some Recordings

62 of 70

Some Example Datasets

Telemetry Dataset

(Data Smith Lab)

Privacy and Security Domain

PI: HaoJian

DNA Dataset

(The Amariuta Lab)

Bioinformatics

PI: Tiffany Amariuta

63 of 70

Post Observation Analysis

Data Collection

-Screen Recordings capture activity and verbal interaction

-Audio Transcriptions storing dialogue

-Observer Notes (Observations of Key Moments)

-Post-Task Interview Transcripts

Data Analysis

-Qualitative Coding

>Interaction Patterns

-Quantitative Analysis

>Number of occurrences

-Iterative Process
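
The quantitative step, counting how often each qualitative code occurs, can be sketched as follows. The `(session, code)` pair layout, the function name `count_codes`, and the code labels in the example are hypothetical placeholders, not the study's codebook or results.

```python
from collections import Counter


def count_codes(coded_segments: list[tuple[str, str]]) -> Counter:
    """Count occurrences of each qualitative code across all sessions."""
    return Counter(code for _session, code in coded_segments)


# Placeholder (session, code) pairs invented for this example.
segments = [
    ("session_1", "asks_expert_for_definition"),
    ("session_1", "checks_column_meaning"),
    ("session_2", "asks_expert_for_definition"),
]
for code, occurrences in count_codes(segments).most_common():
    print(code, occurrences)
```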

64 of 70

Thematic Analysis

Qualitative Data Analysis

  • Systematically identify and analyze recurring patterns of meaning

  • A way to understand and interpret the significant concepts in textual information

Braun, V., & Clarke, V. (2022). Toward good practice in thematic analysis: Avoiding common problems and be(com)ing a knowing researcher. International Journal of Transgender Health, 24(1), 1–6. https://doi.org/10.1080/26895269.2022.2129597

65 of 70

Post Interview Analysis

66 of 70

Evaluation & Next Steps

01 Rigorous Testing: Assess suggestions across domains and datasets

02 Finishing Paper: Hoping to submit to ACM CHI

03 Improving Extension: Implementing the final findings from the paper

04 Releasing Extension: Publishing the extension as a package on PyPI

67 of 70

Thank You!

CREDITS: This presentation template was created by Slidesgo, and includes icons, infographics & images by Freepik

68 of 70

Fall Recruitment

69 of 70

Fall 2025 Timeline

Week 1: Info session

Week 2: Applications due, interviews scheduled

Week 3: Interviews, members chosen

Week 4: Teams start working!

What we ask:

  1. Why you are interested in research on teaching and learning.
  2. Past programming projects and what you learned.

Sam’s tip for standing out: Be as personal and specific as possible!

70 of 70

Thanks for coming!