1 of 47

Open Infrastructure in the Cloud with JupyterHub

@choldgraf

Chris Holdgraf, UC Berkeley and Project Jupyter

2 of 47

bit.ly/jupyterhub-sdss-2019

3 of 47

you???

4 of 47

5 of 47

A bit about me then...

Cognitive Neuroscience

Open Source

6 of 47

A bit about me now...

Research and Open Source

Education and Open Source

Jupyter @ Berkeley

7 of 47

a community of people and an ecosystem of open tools and standards for interactive computing

8 of 47

create things that are language-agnostic and modular. Empower people to use other open tools.

9 of 47

For example: the Jupyter Notebook

10 of 47

The Jupyter Notebook is a stack of modular, open tools

You

Your awesome report

server

.ipynb

package ecosystem

Notebook document specification

Jupyter server protocol

Interactive Kernels

Notebook

interfaces
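As a rough illustration of the "notebook document specification" layer in this stack, the sketch below builds a minimal notebook document as plain JSON using only the Python standard library. The helper name `make_notebook` and the cell contents are hypothetical; the field names follow version 4 of the notebook format, and real projects would more likely use the `nbformat` package for this.

```python
import json


def make_notebook(markdown_text, code_text):
    """Build a minimal nbformat-4 notebook as a plain dictionary.

    Hypothetical helper for illustration only.
    """
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            # A markdown cell only needs its type, metadata, and source.
            {"cell_type": "markdown", "metadata": {}, "source": markdown_text},
            # A code cell also records its execution count and outputs.
            {
                "cell_type": "code",
                "metadata": {},
                "source": code_text,
                "execution_count": None,
                "outputs": [],
            },
        ],
    }


notebook = make_notebook("# Your awesome report", "print('hello')")

# An .ipynb file is this dictionary serialized as JSON text.
notebook_json = json.dumps(notebook, indent=1)
```

Because the document is plain JSON, any interface or tool that understands the specification can read and write it, which is what keeps the layers of the stack swappable.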

11 of 47

How does the ☁️ fit into this?

12 of 47

(some) data science should be taught to everyone

(no, really)

13 of 47

Here’s what this means at Berkeley...

14 of 47

How can Jupyter connect people with computation?

15 of 47

Build infrastructure tools that are workflow- and platform-agnostic. Give people control over resources, freedom to deploy what and where they wish.

(in the ☁️)

16 of 47

What is JupyterHub?

Host pre-configured data science environments on shared infrastructure

jupyter.org/hub

17 of 47

My fancy machine in the cloud

myhub.org

18 of 47

myhub.org

19 of 47

myhub.org

environments

20 of 47

myhub.org

interfaces

environments

21 of 47

AUTHENTICATION

myhub.org

interfaces

environments

22 of 47

JupyterHub distributions

A pre-configured JupyterHub setup with sensible defaults and lots of documentation, fit for many use-cases

The Littlest JupyterHub - tljh.jupyter.org

JupyterHub on Kubernetes - z2jh.jupyter.org

☁️

💻

23 of 47

Scalable in both users and in resources

Uses Docker for environment management

Agnostic to the provider and hardware configuration

Zero to JupyterHub for Kubernetes

z2jh.jupyter.org

24 of 47

25 of 47

The Littlest JupyterHub

Deploy JupyterHub on a single virtual machine

Faster, lightweight setup and administration

More easily created and destroyed

tljh.jupyter.org

26 of 47

27 of 47

JupyterHub in the wild

28 of 47

education and training 🎓

29 of 47

datahub.berkeley.edu

30 of 47

inferentialthinking.com

31 of 47

nbgitpuller - one-click interactive content

jupyterhub.github.io/nbgitpuller
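A minimal sketch of how a one-click nbgitpuller link can be assembled, assuming the hub has nbgitpuller installed and serves it at the documented `git-pull` endpoint. The function name `nbgitpuller_link` and the repository URL are hypothetical placeholders; `myhub.org` is the example hub used earlier in these slides.

```python
from urllib.parse import urlencode


def nbgitpuller_link(hub_url, repo_url, branch=None, urlpath=None):
    """Build a link that pulls a git repository into a user's hub session.

    Hypothetical helper for illustration only.
    """
    query = {"repo": repo_url}
    if branch:
        query["branch"] = branch
    if urlpath:
        # Where to send the user after the pull, relative to their server.
        query["urlpath"] = urlpath
    # user-redirect sends each logged-in user to their own server.
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{urlencode(query)}"


link = nbgitpuller_link(
    "https://myhub.org",
    "https://github.com/example-org/example-course",  # placeholder repository
    branch="main",
    urlpath="tree/example-course/",
)
print(link)
```

Sharing the resulting link is what makes the content "one-click": a student who follows it logs in, gets the repository pulled into their own environment, and lands in the requested folder.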

32 of 47

Chris Is Trying A Live Demo

Hopefully he doesn’t embarrass himself too badly.

33 of 47

Data 8 is...

jupyter book

github

gofer grader

jupyter notebook

nbgitpuller

scipy stack

34 of 47

35 of 47

36 of 47

large-scale science 🌎☁️

37 of 47

pangeo.io

38 of 47

The Pangeo pattern

  • A JupyterHub + BinderHub managing an open stack of tools for geospatial analysis
  • Utilize and improve pre-existing tools, rather than build new ones. Push improvements upstream.
  • Add value with customization and configuration
  • Provide access to high-performance hardware

pangeo.io

39 of 47

pangeo.io

40 of 47

Pangeo is...

zarr

jupyter widgets

dask

jupyter lab

xarray

scipy stack

big data

41 of 47

Open and Interactive collaboration 🤝

42 of 47

openhumans.org

43 of 47

44 of 47

Chris Is Trying A Live Demo

Hopefully he doesn’t embarrass himself too badly.

45 of 47

OpenHumans is...

open data

R stack

exploratory

jupyter notebook

community

scipy stack

46 of 47

In summary

  • JupyterHub connects users with interactive environments on shared infrastructure using open tools
  • JupyterHub distributions are opinionated deployments of a JupyterHub for a specific scale or purpose
    • The Littlest JupyterHub - Deploy on a single VM 💻
    • Zero to JupyterHub for Kubernetes - Deploy in the ☁️
  • JupyterHub has been used for
    • Large-scale education (data8.org)
    • High-performance 🌎 analysis in the ☁️ (pangeo.io)
    • Collaborative community analytics 🤝 (openhumans.org)

47 of 47

Get involved with Jupyter

@choldgraf

  • All of these projects are open source, run by open communities
  • Jupyter is a place where *anybody* can participate
  • If you’d like to get involved:
    • jupyterhub-team-compass.readthedocs.io
    • discourse.jupyter.org