1 of 19

JupyterHub in the Classroom

Sang-Yun Oh

Data Science Instructional Computing Working Group

2 of 19

Who to contact

Ernesto Espinosa: cluster administration

  • Create/manage JupyterHub cluster
  • Monitor credit usage and archive persistent storage for 1 year

Patrick Windmiller: instructor liaison

  • Liaise between LSIT and the instructor
  • Customize computing environment for each course

Sang-Yun Oh: faculty advocate

  • Teaching with JupyterHub (tools for submission, grading)

3 of 19

What is JupyterHub?

  • JupyterHub is a multi-user Jupyter notebook server
  • JupyterHub authenticates users and instantiates notebook sessions
  • All users get the same environment: the “single-user notebook” (SUN)
  • JupyterHub needs a Docker image for the SUN (modify the Dockerfile to customize it)
  • The SUN can be any image from: https://github.com/jupyter/docker-stacks
  • Custom R images are created from: https://github.com/rocker-org/rocker
  • The JupyterHub cluster runs on GCP, and SUN images are hosted on Docker Hub
  • Persistent storage

4 of 19

Requesting JupyterHub Setup

5 of 19

1-2-3!

  1. Request Google Cloud Credit (more on this later)
  2. Contact Patrick to discuss your needs
     E.g. Python vs. RStudio, package needs, etc.
  3. Contact Patrick and Ernesto for support
     Contact Sang with questions about grading tools, etc.

6 of 19

Requesting Cloud Credit

  • https://edu.google.com/programs/credits/teaching (example on next slide)
  • Redeemable right away, expires 1 year after course start date
  • Google provides coupon codes totaling $10 x number of students + $50 x number of teaching staff (up to $5000)
  • Additional requests are allowed
  • For JupyterHub, a single lump-sum credit is needed
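
The credit formula above can be worked through for a given class size. This is a minimal sketch, assuming the "up to $5000" note is a cap on the total. The function name `estimate_credit` is made up for illustration and is not part of any Google tool.

```python
def estimate_credit(num_students, num_staff, cap=5000):
    """Estimate the coupon total in dollars: $10 per student + $50 per staff.

    Assumption: the "up to $5000" limit caps the total.
    """
    total = 10 * num_students + 50 * num_staff
    return min(total, cap)


# Class size from the sample request: 85 students, 7 TAs and staff
print(estimate_credit(85, 7))  # 10*85 + 50*7 = 1200
```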

7 of 19

  • First Name: Sang-Yun
  • Last Name: Oh
  • Email Address: syoh@ucsb.edu
  • Title: Prof.
  • Name of School: UNIVERSITY OF CALIFORNIA SANTA BARBARA
  • Country: US
  • Name of Department: Statistics and Applied Probability
  • Link to Faculty Directory: https://www.pstat.ucsb.edu/people
  • Expected Start Date: 2019-09-25
  • Name of course/project: PSTAT 134
  • Course abstract or project description: Overview and use of data science tools in R and Python for data retrieval, analysis, visualization, reproducible research and automated report generation. Case studies will illustrate practical use of these tools.
  • Student email domain: ucsb.edu
  • Expected Number of Students: 85
  • Expected Number of TAs and Staff: 7
  • Additional Comments: We create a JupyterHub cluster for this class. We need *one lump sum credit* to set up and run this JupyterHub cluster. During a lecture, students will be informed what Google Cloud Platform is and how it powers their JupyterHub.

8 of 19

Informing Students about GCP

Background: Google increasingly wants students to get direct exposure to GCP. Feel free to use the following slides in your lecture.

9 of 19

JupyterHub URL: https://[CLASS].lsit.ucsb.edu

[Diagram: Internet, Amy, Sam, Amy’s Kernel, Sam’s Kernel, JupyterHub URL]

10 of 19

JupyterHub URL: https://[CLASS].lsit.ucsb.edu

[Diagram: Internet, Amy, Sam, Amy’s Kernel, Sam’s Kernel]

11 of 19

JupyterHub URL: https://[CLASS].lsit.ucsb.edu

[Image Source: https://rb.gy/cknf59]

12 of 19

Sample GCP Credit Request Response

Clipped email response from Google containing the credit code (forward to Ernesto):

13 of 19

Tools for Teaching with JupyterHub

14 of 19

JupyterHub Admin Interface

15 of 19

Moving Material to/from Jupyter Notebook

Graphical User Interface (GUI)

Command Line Interface (CLI)

  • `wget` or `curl` from a direct download link, e.g. a Box direct link
  • `zip -r directory.zip directory/`, then Download the zip file
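
Both CLI steps can also be run from a notebook cell, which may help if `wget`, `curl`, or `zip` is missing from the course image. This is a minimal sketch using only Python's standard library. The helper names `fetch` and `zip_directory` are hypothetical, and the URL in the usage comment is a placeholder, not a real course link.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch(url, dest):
    """Download url to dest, similar to `wget -O dest url`; return the path."""
    dest = Path(dest)
    # Stream the response to disk so large files are not held in memory
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


def zip_directory(directory):
    """Create <directory>.zip next to the directory and return its path."""
    directory = Path(directory).resolve()
    # root_dir/base_dir keep the top-level folder name inside the archive,
    # as `zip -r directory.zip directory/` does
    archive = shutil.make_archive(
        base_name=str(directory),
        format="zip",
        root_dir=str(directory.parent),
        base_dir=directory.name,
    )
    return Path(archive)


# Example usage (placeholder URL and directory name):
# fetch("https://example.com/path/to/materials.zip", "materials.zip")
# zip_directory("directory")  # writes directory.zip, ready to Download
```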

16 of 19

Moving Material to/from RStudio

Graphical User Interface (GUI)

  • Upload/Export buttons
  • Uploading a zip file expands it into a directory
  • Exporting a directory creates a zip file

Command Line Interface (CLI)

  • `wget` or `curl` from a direct download link, e.g. a Box direct link
  • `zip -r directory.zip directory/`, then Download the zip file

17 of 19

Questions from Students

  • Q: How do I delete a directory?
    A: Empty the directory first, then delete it
  • Q: How do I download a directory?
    A: Either zip it (CLI) and Download the zip file, or select all files and download them
  • Q: I can’t save my file! What might be the cause?
    A: Your storage allocation may be full, or the Jupyter notebook file may be too large
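
When a file will not save, it can help to check how full the storage is and which files are the largest. This is a minimal sketch, assuming the persistent storage is mounted at the user's home directory (an assumption about the setup, not something stated in these slides). `largest_files` and `storage_report` are hypothetical helpers.

```python
import shutil
from pathlib import Path


def largest_files(root, n=5):
    """Return the n largest files under root as (size_in_bytes, path) pairs."""
    files = [(p.stat().st_size, p) for p in Path(root).rglob("*") if p.is_file()]
    return sorted(files, key=lambda item: item[0], reverse=True)[:n]


def storage_report(root, n=5):
    """Summarize disk usage for root's filesystem and list its largest files."""
    # Note: disk_usage reports the whole filesystem that holds root,
    # which may differ from a per-user allocation
    usage = shutil.disk_usage(root)
    lines = [f"Used {usage.used / 1e9:.2f} GB of {usage.total / 1e9:.2f} GB"]
    for size, path in largest_files(root, n):
        lines.append(f"{size / 1e6:8.1f} MB  {path}")
    return "\n".join(lines)


# Example usage (home directory assumed to hold the persistent storage):
# print(storage_report(Path.home()))
```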

18 of 19

Inconveniences

  • Access control is done manually
    Default is to allow any UCSB ID
  • There is no collaboration functionality
    But the instructor can assume any user's identity via the Admin panel
  • No easy way for students to submit homework to GS
    Manually download from JupyterHub, then upload to GS
  • No easy way to monitor load (need GCP login)
    Potential for abuse of (limited) resources

19 of 19

Future Plans

  • Possibly using AWS (year long credit)
  • Tutorial on creating homework assignments: jassign
  • Integration with GauchoSpace? e.g. access control
  • Wi-Fi reliability testing in large classrooms
    (a tip: can use Zoom screen sharing if classroom projector is dim)
  • Infrastructure developments: e.g. help-ticket system