1 of 59

Seamless Learning Workshop Presentations

DATA/EECS Seamless Learning Su 24 Workshop - June 2024 @ UC Berkeley
Michael Ball and Lisa Yan
Workshop Overview / Schedule

Friday 6/14/24

2 of 59

Instructions

  • Plan for a 10-minute slot (presentation + demo) and put your slides in this slide deck
    • Project Overview
    • Demo
    • Lessons Learned
    • (see template slides in next section)
  • This slide deck will be shared on the seamless-learning website
  • You’ll present from your own laptop so that you can live demo your tool/project—make sure to join the Zoom and share screen
  • For specifics on other deliverables (e.g., documentation, open-sourcing), see the Workshop Overview doc: Deliverables. These can and should be completed after the presentations.

3 of 59

cel

flexible exam logistics - Andrew Liu

4 of 59

Project Overview

  • Exams are complicated.
    • Left Handed
    • Specific Buildings
    • DSP Accommodations
    • Online Requests
    • Sending Emails
  • cel aims to handle this by automating these logistics and integrating unobtrusively into current workflows
  • Philosophy:
    • Getting exams right is important — cel is designed to assist, not do
    • Offers suggestions, but asks for confirmation
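The "assist, not do" philosophy can be illustrated with a minimal sketch. This is not cel's actual code: the `Student` and `Room` types, their fields, the need labels, and `suggest_rooms` are hypothetical names chosen for illustration. The function only proposes a room for each student; applying the result is left to a staff member who confirms it.

```python
from dataclasses import dataclass, field


@dataclass
class Student:
    name: str
    needs: set = field(default_factory=set)  # hypothetical labels, e.g. {"left_handed"}


@dataclass
class Room:
    name: str
    capacity: int
    features: set = field(default_factory=set)


def suggest_rooms(students, rooms):
    """Propose a room for each student; return (suggestions, unplaced).

    Nothing is committed here: the caller is expected to show the
    suggestions to a human and apply them only after confirmation.
    """
    remaining = {room.name: room.capacity for room in rooms}
    suggestions, unplaced = {}, []
    # Place the most constrained students first so they are not crowded out.
    for student in sorted(students, key=lambda s: -len(s.needs)):
        for room in rooms:
            if remaining[room.name] > 0 and student.needs <= room.features:
                suggestions[student.name] = room.name
                remaining[room.name] -= 1
                break
        else:
            unplaced.append(student.name)
    return suggestions, unplaced


students = [
    Student("A"),
    Student("B", {"left_handed"}),
    Student("C", {"reduced_distraction"}),
]
rooms = [
    Room("Main Hall", capacity=2, features={"left_handed"}),
    Room("Small Room", capacity=1, features={"reduced_distraction"}),
]
suggestions, unplaced = suggest_rooms(students, rooms)
```

Students who cannot be placed are returned separately rather than forced into a room, so a human can decide what to do with them.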

5 of 59

Demo

6 of 59

Lessons Learned, Next Steps

  • Thinking about design is important for maintainability
    • Coming back to this project after a bit of downtime
    • Documentation / Logical organization
  • Workflows
    • What one class does might not match what other classes do. How do we ensure flexibility?
  • Next Steps:
    • Onboarding other people to adopt / work on codebase now that documentation exists
    • Ask other classes if they want to demo
    • Cleaning up a bit more (some code was rushed towards the end of the last semester)
    • Getting through the to-do list (GitHub issues)

7 of 59

CS61C Internal Tools

Anto Kam

  • Monorepo
  • Gradar

8 of 59

Project Overview

  • Monorepo
    • Consolidate the 61C lab/project repos from many repos into a single repo
    • Will explain why this was desperately needed later
  • Gradar
    • Understand course infrastructure by implementing a small change
    • This is something that will continue throughout the semester
    • I have not been able to get to much of this

10 of 59

Monorepo

  • Pre-su24
    • lab-dev
      • Contains the ‘solutions’ for lab assignments
    • lab-starter-dev
      • Contains the starter code for lab assignments
    • lab-autograder
      • Consists of the code needed for the Gradescope autograder to run
    • <semester>-lab-starter
      • Is the base repo that student repos are created from
    • proj1-dev
    • proj1-starter-dev
    • proj1-autograder
    • <semester>-proj1-starter
    • proj2-dev
    • proj2-starter-dev
    • proj2a-autograder
    • proj2b-autograder

11 of 59

Monorepo

  • Pre-su24
    • (Walkthrough + Look)

12 of 59

Monorepo

  • Pre-su24
    • Long story short, there are a lot of repos that need to be cloned on your computer, and it’s hard to check at a glance whether you’ve made changes to everything
      • There have been situations in the past where I edited lab-dev and lab-starter-dev, but not lab-autograder, and had to do a hotfix during the week

13 of 59

Autograder

  • Aside:
    • For labs, the autograder reads the assignment title and selects the correct autograder to run; each project has its own individual .zip file. If you make a change to lab-autograder, proj1-autograder, etc., the autograder picks it up without needing to re-upload an autograder .zip file to Gradescope
      • I’m personally a really big fan of this: it made creating new lab assignments on Gradescope quite easy, even though I was a new staff member on 61C last semester.
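A minimal sketch of title-based selection, not the actual 61C autograder. It assumes the title is read from the `assignment.title` field of Gradescope's `/autograder/submission_metadata.json`; the title pattern and the `autograders/lab03`-style directory layout are hypothetical.

```python
import json
import re
from pathlib import Path

# Gradescope writes submission metadata here inside the autograder container.
METADATA_PATH = Path("/autograder/submission_metadata.json")


def select_autograder(title, root=Path("autograders")):
    """Map an assignment title such as 'Lab 3: RISC-V' to a directory.

    The naming scheme ('Lab 3' -> autograders/lab03) is an assumption
    made for this sketch.
    """
    match = re.match(r"\s*(lab|proj(?:ect)?)\s*(\d+)", title, re.IGNORECASE)
    if match is None:
        raise ValueError(f"Cannot determine assignment from title: {title!r}")
    kind = "lab" if match.group(1).lower() == "lab" else "proj"
    return root / f"{kind}{int(match.group(2)):02d}"


def load_title(path=METADATA_PATH):
    """Read the assignment title from the submission metadata file."""
    with open(path) as f:
        return json.load(f)["assignment"]["title"]
```

Raising on an unrecognized title, instead of falling back to a default, makes a misnamed Gradescope assignment fail loudly on the first submission.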

14 of 59

Monorepo

  • Goals
    • Avoid having to clone 90000+ repositories
    • Only use one autograder .zip file for every assignment
      • Projects use the same .zip file too
      • (This reduces the overhead of setting up projects by quite a lot)

15 of 59

Monorepo

  • Currently
    • Only 1 repo (+ 1 for each student-facing repo):
      • assignments
      • semester-assignment-starter
    • Uses 1 autograder zip on Gradescope (and selects the correct assignment based on the name of the assignment)
    • Also has a script that takes an assignment and moves it (locally) to the correct semester-assignment-starter repo
      • This is deliberately not automatic, to avoid accidentally putting solutions on the student-facing repo, which has happened before
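A sketch of what such a local, confirm-before-copy step could look like. This is an illustration and not the script from the monorepo: the function names, the `STAFF_ONLY_MARKERS` convention, and the directory arguments are all hypothetical.

```python
import shutil
from pathlib import Path

# Hypothetical convention: any path containing one of these is staff-only.
STAFF_ONLY_MARKERS = ("solution", "autograder")


def files_to_release(starter_dir):
    """Return the files under starter_dir, refusing if any look staff-only."""
    starter_dir = Path(starter_dir)
    files = sorted(p for p in starter_dir.rglob("*") if p.is_file())
    leaked = [
        p for p in files
        if any(marker in p.relative_to(starter_dir).as_posix().lower()
               for marker in STAFF_ONLY_MARKERS)
    ]
    if leaked:
        raise RuntimeError(f"Refusing to release staff-only files: {leaked}")
    return files


def release(starter_dir, student_repo, confirm=input):
    """Copy starter files into a local clone of the student-facing repo.

    The copy is local and manual on purpose: a human reviews the list,
    confirms, and then commits and pushes by hand.
    """
    starter_dir, student_repo = Path(starter_dir), Path(student_repo)
    files = files_to_release(starter_dir)
    for p in files:
        print("would copy:", p.relative_to(starter_dir))
    if confirm("Copy these files? [y/N] ").strip().lower() != "y":
        return []
    copied = []
    for p in files:
        target = student_repo / p.relative_to(starter_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(p, target)
        copied.append(target)
    return copied
```

Nothing is pushed by the script itself, so a mistake stays on the local machine until someone reviews and commits it.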

16 of 59

Monorepo Demo

  • Agenda
    • Gradescope
    • Repo structure

17 of 59

Gradar Current Progress

  • I’m roughly familiar with its structure, but haven’t implemented anything major yet
  • The first thing I’m going to implement is adding timezones to course instances. This will touch a bit of the backend and some of the frontend, which should lead well into future contributions and into figuring out roughly how Gradar works
  • Goal for the summer semester (I’ll be on 61C this summer, so I’ll have hours dedicated to infrastructure work)

18 of 59

Lessons Learned

  • Gradescope Autograder
    • The Gradescope autograder is very finicky: there’s a lot to keep track of, but once you wrap your head around the documentation, it’s pretty clear what it’s trying to do
  • Incremental
    • Start things small and work your way through tasks; you’ll see more and more of the bigger picture with each small change you make
  • Test run
    • Make a small change before you fully commit, just to see how it feels; it’s a lot easier to make a small change than a big one

19 of 59

Next Steps

  • Write documentation for the new monorepo (this can be done gradually as the semester goes on)
  • Clean up the repo more so it’s easier for people to edit without needing to look all over the repo
    • This hasn’t been done yet mainly because the priority was getting the infrastructure ready for the first week of instruction
  • Gradar

20 of 59

CI & Berkeley Class Site Template

Rebecca Dang

Objectives:

  • Implement continuous integration (CI) checks
  • Customize just-the-class Jekyll site template for EECS/CS/DS classes

21 of 59

Why continuous integration?

Course staff hours are limited

  • Not enough time to train staff on good code practices
  • Manually enforcing best practices is not scalable
  • Even if staff are trained, people forget (no shame!)
  • Ideally staff should be focused on teaching and improving course content, rather than formatting details

22 of 59

Why continuous integration?

Make course websites accessible

  • Staff typically aren’t familiar with web accessibility standards
  • Websites on berkeley.edu domain have a legal obligation to be accessible (and it’s just the right thing to do)

23 of 59

The Solution

  • “Continuous integration (CI) is the practice of integrating source code changes frequently and ensuring that the integrated codebase is in a workable state” (Wikipedia)
  • “GitHub Actions is a continuous integration and continuous delivery (CI/CD) platform that allows you to automate your build, test, and deployment pipeline.” (GitHub Docs)
  • On every push and/or pull request, run code quality and accessibility checks!
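To make the idea concrete, here is a small stand-in for the kind of check a CI job could run on every push. It is not Axe and not the project's actual workflow: it covers a single accessibility rule (images need alt text) using only the Python standard library, and the `_site` default and all function names are assumptions for this sketch.

```python
from html.parser import HTMLParser
from pathlib import Path


class MissingAltFinder(HTMLParser):
    """Collect the parser position of every <img> without an alt attribute."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            self.problems.append(self.getpos())


def check_html(text):
    """Return a list of (line, column) positions of offending <img> tags."""
    finder = MissingAltFinder()
    finder.feed(text)
    return finder.problems


def main(site_dir="_site"):
    """Check every built page and report all failures, not just the first."""
    failures = 0
    for page in sorted(Path(site_dir).rglob("*.html")):
        for line, col in check_html(page.read_text(encoding="utf-8")):
            print(f"{page}:{line}:{col}: <img> is missing alt text")
            failures += 1
    return 1 if failures else 0  # intended for use as a process exit code
```

A CI step could pass the return value of `main()` to `sys.exit`, so that any missing alt text produces a nonzero exit code and fails the job.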

24 of 59

Why have a Berkeley class site template?

Don’t reinvent the wheel

  • Course staffs across EECS/CS/DS are somewhat siloed
  • Every class has its own website and course infrastructure, but a lot of things are common
  • Goal: Create a class site template that can be forked and customized, and comes with accessibility, CI, and a modular system out of the box
  • Use Jekyll, a modern static site generator that is easily deployed with GitHub Pages
    • Reduce content development time with automatic rebuilds on source file change

25 of 59

The Solution

  • berkeley-eecs GitHub organization currently has a fork of just-the-class (Jekyll theme), but doesn’t have any customization for Berkeley classes
  • I forked it to add common functionality
    • Added GitHub Actions workflows
    • Updated Staff page with Berkeley roles
    • Created modular templates for Lab, HW, and Project pages
    • Updated repository documentation

26 of 59

Modularity

30 of 59

Demo

31 of 59

Lessons Learned

  • Things that work locally may not work in GitHub Actions without extra configuration
  • Finding the right tool is half the battle and may change over time
    • Axe, pa11y, or others?
    • Jekyll pages or collections? Which kind of Jekyll plugin to use?
    • Do we want to use GitHub Pages at all?
  • Tradeoff between convenience and customization
  • Make things as DRY as possible to be more maintainable → Jekyll variables, front matter, templates
  • CI checks can tell you what’s wrong but not necessarily how to fix it
    • Ideally use autoformatters or automated PR suggestions (e.g., reviewdog) wherever possible to save course staff hours

32 of 59

Next Steps

  • Create rspec accessibility checks dynamically (if possible) so we can check the status of every page (instead of stopping after the first failure)
  • Customize course website template even more
    • Use CS 161’s calendar format instead of default just-the-class calendar
    • Support solution and assignment pages, and generate starter code and the Gradescope autograder. Custom Jekyll plugins could do this, but then we can’t use the github-pages gem (and therefore can’t use just-the-docs)
  • Documentation!
  • More GitHub issues

33 of 59

Conclusion

  • Improved code quality for every pull request and allowed PR reviewers to focus more on content, less on formatting:
    • Web accessibility using Axe
    • Python linting
    • Markdown linting
    • … and more GitHub Action workflows can be added as the course needs!
  • Configured rspec tests so accessibility checks can be easily extended
  • Saved course staff hours by creating a modular template system for a typical EECS/CS/DS course website

34 of 59

Links & Resources

35 of 59

PR to Website: Automation of Assignment Generation and Releases

Jonathan Ferrari

  • Limitation on Staff Hours
  • Many processes that Data 8 uses can be automated
    • One such process is assignment generation and release
  • Can sometimes take 1-2 hours per assignment (3 assignments per week)
    • Up to 90 hours per semester
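The 90-hour figure is consistent with the upper end of the estimate. The slide does not state the semester length, so the 15 weeks below is an assumption used only to show how the numbers could fit together.

```python
hours_per_assignment = 2   # upper end of the 1-2 hour range above
assignments_per_week = 3
weeks_per_semester = 15    # assumption: not stated on the slide

upper_bound_hours = (
    hours_per_assignment * assignments_per_week * weeks_per_semester
)
```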

36 of 59

Project Overview

  • Goal: Create a PR when a new assignment needs to be released. The PR triggers the autograder build and then the addition of the assignment to the public repo, which triggers the website update, which in turn triggers an Ed thread to be posted
  • The project mainly revolves around 3 specific tools:
    • GitHub Actions:
      • Allows for seamless interaction between trigger events in different repositories (CI)
      • In this context, used as the main way to trigger actions
    • EdApi
      • Used to read, edit, and post Ed threads
      • For my project, used to post a thread to the Ed course when triggered by a GitHub Action in the website repository
    • Reverse-engineered Gradescope API
      • Used to automate some Gradescope processes
      • I am using it to automatically upload the autograder zip to build the assignment
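One way for a workflow in one repository to trigger a workflow in another is GitHub's `repository_dispatch` event, sent through the REST API. The slides do not say which mechanism this project uses, so treat the following as an illustrative sketch; the organization, repository, event type, and payload are made up, and the request is built but never sent.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, event_type, payload, token):
    """Build (but do not send) a repository_dispatch request."""
    url = f"https://api.github.com/repos/{owner}/{repo}/dispatches"
    body = json.dumps({"event_type": event_type, "client_payload": payload})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical names: the organization, repository, and event type are made up.
request = build_dispatch_request(
    owner="example-org",
    repo="course-website",
    event_type="assignment-released",
    payload={"assignment": "hw03"},
    token="<personal access token>",
)
# Passing the request to urllib.request.urlopen would send it.
```

The receiving repository would need a workflow that listens for `repository_dispatch`, and the token needs permission to write to that repository, which is where PAT and organization-level restrictions tend to surface.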

37 of 59

Demo

38 of 59

Lessons Learned, Next Steps

  • Challenges:
    • Had issues with authentication and the restricted access of the GitHub PAT
      • Was able to contact Eric Van Dusen to change org-level permissions
    • Was initially unable to target specific text on the website, as different assignments live in different files depending on the week
      • Was able to use Jekyll variables and yml magic to target all links
        • Though this did require a restructuring of the website which isn’t yet fully implemented
    • The Ed API posed some challenges because of the custom XML required to post and examine threads
      • Should be working by tonight
    • The Gradescope API proved challenging: the current source code kept erroring or didn’t produce the desired result
  • Next Steps:
    • Work with Shane and Balaji to implement automatic testing of the notebooks as part of this automation

39 of 59

Otter Grader - Documentation and Feature Improvements

Lance Mathias

  • Otter Grader is very powerful: it has lots of features, some undocumented
  • Takes up to 3 weeks to onboard new staff members
  • Even experienced staff have trouble debugging or figuring out what Otter can and can’t do
  • Would like some nice-to-have features for theory-oriented classes

40 of 59

Project Overview

  • Add missing documentation
    • Built-in plugins
    • Creating custom plugins
    • Add example notebook for plugin creation
  • Add additional plugin hook to support custom behavior when preparing student submissions
    • Use case: run code analysis tools on notebooks (MOSS, etc.)
  • Determine a way to create cascading test cases that build off each other
    • Use case: runtime test that builds off of correctness test
    • Currently:
      • “Follow-up” tests re-run “upstream” test cases multiple times
      • Repeatability issues
      • Time-out issues for long-running functions (e.g. algorithms for NP-complete problems)

41 of 59

Demo - Cascading Questions (Grading From Log)

def test_q1_2(return_5, return_6):
    # Verify that the previous part is correct
    from otter.check.logs import Log
    log = Log.from_file(".OTTER_LOG")
    assert log.get_results("q1.1").passed_all, "The previous part must be correct!"

    # Verify return_6 gives the right answer
    assert return_6() == 6

test_q1_2(return_5, return_6) # IGNORE

42 of 59

Lessons Learned, Next Steps

  • Building general-purpose software requires compromise
    • Don’t try to be everything to everyone
      • Not everything needs to be a built-in feature
  • Documentation is important
    • Some features couldn’t be discovered just by reading the docs
  • Talk to those with more expertise
    • Good way to know what the software can’t do
  • Next steps:
    • [For CS70/CS170]: Create course-specific training/tutorials
    • Future work: integrate Otter with Gradescope leaderboards(?)

43 of 59

Improved OH Queue

Naveen Nathan

44 of 59

The Way OH Currently Works

  • Students create tickets
  • The tickets are displayed to TAs
  • TAs assist students sequentially

45 of 59

Problem: There are a lot of tickets

46 of 59

Grouping Tickets would …

  • Enable TAs to assist students who have related questions as a group, saving TA bandwidth
  • Allow students working on similar questions to collaborate
  • Be especially useful in remote OH, where students don’t have the benefit of being in the same room as other students and TAs

47 of 59

But don’t we already have the ability to group students? We do, but …

  • Students must group themselves.
  • This is only possible for public conceptual tickets
  • Very few tickets are public and conceptual, so, realistically, students would rarely form groups

48 of 59

Solution: Let TAs group tickets programmatically and assist the grouped students together

  • Classify tickets based on the descriptions that students write
  • Allow TAs to choose a set of keywords that may show up in ticket descriptions that can be used to classify OH tickets
  • Allow TAs to filter tickets by keywords of their choosing
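The keyword approach above can be sketched in a few lines. This is an illustration rather than the OH queue's actual implementation: `classify`, `group_tickets`, the `(id, description)` ticket shape, the naive plural handling, and the "Other" bucket are all assumptions.

```python
import re
from collections import defaultdict


def classify(description, keywords):
    """Return the keywords that appear as whole words in the description."""
    words = set(re.findall(r"[a-z0-9']+", description.lower()))
    matched = []
    for keyword in keywords:
        stem = keyword.lower()
        singular = stem[:-1] if stem.endswith("s") else stem
        # Accept a naive singular/plural variant: "Stacks" matches "stack".
        if words & {stem, singular, singular + "s"}:
            matched.append(keyword)
    return matched


def group_tickets(tickets, keywords):
    """Group ticket ids by matched keyword; unmatched tickets go to 'Other'."""
    groups = defaultdict(list)
    for ticket_id, description in tickets:
        for keyword in classify(description, keywords) or ["Other"]:
            groups[keyword].append(ticket_id)
    return dict(groups)


tickets = [
    (1, "My stack overflows when I push"),
    (2, "How do I reverse a linked list?"),
    (3, "Queue vs. stack for BFS"),
    (4, "Autograder timed out"),
]
groups = group_tickets(tickets, ["Stacks", "Queues", "Lists"])
# groups == {"Stacks": [1, 3], "Lists": [2], "Queues": [3], "Other": [4]}
```

A ticket can land in more than one group (ticket 3 above), which matches filtering by keyword rather than partitioning the queue.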

49 of 59

Improved OH in action: example scenario

  • TAs are prepping for an OH
  • TAs anticipate that questions will involve some of the following topics
    • Stacks
    • Queues
    • Lists
  • TAs set these as keywords the OH queue should sort tickets by

50 of 59

The first round of tickets has come in…

51 of 59

A dropdown allows TAs to filter tickets that pertain to each topic.

52 of 59

Demo

53 of 59

My most valuable takeaways/accomplishments of this week

  • Meeting all of you (peers and guest presenters)
  • Learning about course infrastructure–and many related topics–from guest presenters, peers, Lisa, and Michael
  • Gaining more SWE experience, which will enable me to work on similar (and not-so-similar!) projects in the future
  • Familiarizing myself with the environment and codebase, which will enable me to work on this specific project more in the future
  • Making the improvements that I presented

54 of 59

This is just the beginning! In the future we can…

  • Cluster related OH tickets together using a PCA of the characteristics of the tickets.
  • Features could indicate the presence/absence of a given word.
    • These words could be chosen every week by TAs
    • Alternatively, they could be found automatically by a program that identifies common or important words in OH tickets.
  • The assignment a ticket is associated with should also be a feature
    • Its weight needs to be balanced against the indicator features for words
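A sketch of the feature construction described above, covering only the step that turns tickets into numeric rows. The PCA and clustering themselves are not shown and would take these rows as input. The function name, the `(description, assignment)` ticket shape, and the `assignment_weight` parameter are hypothetical.

```python
def build_features(tickets, vocabulary, assignments, assignment_weight=1.0):
    """Turn tickets into rows: word indicators + weighted one-hot assignment.

    assignment_weight scales the assignment columns relative to the 0/1
    word indicators, which is one simple way to balance the two.
    """
    rows = []
    for description, assignment in tickets:
        words = set(description.lower().split())
        word_part = [1.0 if w in words else 0.0 for w in vocabulary]
        assignment_part = [
            assignment_weight if a == assignment else 0.0 for a in assignments
        ]
        rows.append(word_part + assignment_part)
    return rows


tickets = [("stack push fails", "hw1"), ("queue is empty", "hw2")]
features = build_features(
    tickets, ["stack", "queue"], ["hw1", "hw2"], assignment_weight=0.5
)
# features == [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
```

Lowering `assignment_weight` makes two tickets from the same assignment look less alike unless they also share words.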

55 of 59

Thanks to …

  • Lisa and Michael for organizing this workshop
  • Everyone who presented over the course of this workshop
  • All of my peers in this workshop
  • Anyone who may be watching the recording

56 of 59

TEMPLATE - XX###

  • Include

57 of 59

Project Overview

ALSO—Fill out

58 of 59

Demo

59 of 59

Lessons Learned, Next Steps
