Defragmentation Training school
Jupyter for interactive cloud computing
Guillaume Witz, PhD
Microscopy Imaging Center, Data Science Lab
University of Bern
“Classic” software vs. notebooks
Interactive computing with
Advantages of notebooks
Data science tool in academia and private companies
Using Jupyter for documentation
Notebooks can be turned into online (interactive) documentation. Example: https://guiwitz.github.io/microfilm/
Exploiting cloud resources
For you, the user, “where it runs” doesn’t affect the interface
The notebook is displayed in the browser.
The browser sends computations to a kernel running on a server.
The kernel sends the results back to the browser.
The server can run locally (laptop) or remotely (cluster, Google, etc.).
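Because the interface looks the same wherever the kernel runs, it can help to ask the kernel itself where it is. The following is a minimal sketch using only the Python standard library; the function name describe_kernel_host is illustrative and not part of Jupyter.

```python
import os
import platform
import socket
import sys


def describe_kernel_host():
    """Collect basic facts about the machine the kernel is running on."""
    return {
        "hostname": socket.gethostname(),
        "operating_system": platform.platform(),
        "python_executable": sys.executable,
        "cpu_count": os.cpu_count(),
    }


# Run this in a notebook cell: the values describe the server hosting the
# kernel, which is not necessarily the computer showing the browser window.
kernel_host = describe_kernel_host()
for key, value in kernel_host.items():
    print(f"{key}: {value}")
```

Running the same cell on a laptop and in a cloud session should give different host names and CPU counts, while the notebook interface stays the same.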
Why run Jupyter in the cloud?
For BioImage Analysis in the cloud, you need:
1. A running instance of Jupyter accessible via the web
2. The necessary software and packages to run the analysis
3. The necessary hardware to run the analysis
4. The data on which to apply the analysis
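Once a Jupyter session is running (point 1), points 2 to 4 can be checked from inside a notebook. This is a rough sketch under stated assumptions: the package names and the data folder below are placeholders chosen for illustration, not requirements from this course.

```python
import importlib.util
import os
from pathlib import Path

# Hypothetical values: replace with the packages and data folder you need.
REQUIRED_PACKAGES = ["numpy", "skimage"]
DATA_DIR = Path("data")


def check_environment(packages, data_dir):
    """Report missing packages, available CPUs and whether the data folder exists."""
    missing = [name for name in packages if importlib.util.find_spec(name) is None]
    return {
        "missing_packages": missing,          # point 2: software
        "cpu_count": os.cpu_count(),          # point 3: hardware (CPUs only)
        "data_found": Path(data_dir).is_dir(),  # point 4: data
    }


report = check_environment(REQUIRED_PACKAGES, DATA_DIR)
print(report)
```

The check only covers top-level import names and CPU count; GPU or memory checks would need additional tools.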
The many flavours of “cloud notebooks”
Google Colab
Pros
Cons
ZeroCostDL4Mic is a great project to exploit Colab for microscopy image processing
Run Jupyter on for-pay cloud service
An “empty” machine on which you install Jupyter and other software, and which you then use through a remote connection
Pros
Cons
Some packaged solutions like SageMaker on Amazon simplify some of the difficulties.
Cloud computing works with containers
[Figure build-up with version labels 0.16, 0.19 and 0.18]
A Docker container is an isolated place where software can run without affecting the rest of the computer or cluster. It:
For-pay Jupyter services: Paperspace
Pros
Cons
Example 1:
https://github.com/guiwitz/microfilm
Pros
Cons
Example 2:
A platform for reproducible and collaborative data analysis developed by the Swiss Data Science Center
Hosts code and data
Keep your changes using git!
Add any software using apt-get
Edit the requirements.txt and/or environment.yml file
We need scikit-image in our project, so we add it to the environment.yml file. We can edit it directly in GitLab and commit!
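The same kind of edit can be scripted. The sketch below works on the text of a requirements.txt file (chosen here because it is plain text, one package per line) and adds a package only if it is not already listed. The function name add_requirement and the example file content are illustrative assumptions, not part of the platform.

```python
def add_requirement(requirements_text, package):
    """Return requirements text with `package` appended if it is not listed yet."""
    lines = requirements_text.splitlines()

    def bare_name(line):
        # Drop comments and version pins so "numpy==1.23" matches "numpy".
        line = line.split("#", 1)[0].strip()
        for separator in ("==", ">=", "<=", "~=", "!=", ">", "<"):
            line = line.split(separator, 1)[0]
        return line.strip().lower()

    if package.lower() not in {bare_name(line) for line in lines}:
        lines.append(package)
    return "\n".join(lines) + "\n"


# Hypothetical starting content of requirements.txt
original = "numpy\nmatplotlib\n"
updated = add_requirement(original, "scikit-image")
print(updated)
```

Calling the function a second time leaves the text unchanged, so the edit is safe to repeat.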
Committing automatically triggers the creation of an updated Docker container with scikit-image installed.
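After starting a session from the rebuilt container, a quick check in a notebook cell can confirm that the package is really there. This is a minimal sketch using the standard library; the helper name installed_version is illustrative.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("scikit-image")
if version is None:
    print("scikit-image is not installed in this environment")
else:
    print(f"scikit-image {version} is installed")
```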
Pros
Cons
Example 3:
Bioinformatics tools available to run on a computing cluster via a web-based platform. Run by the Freiburg Galaxy Team.
Keep data and notebooks
Interactive Environments
On https://usegalaxy.eu/ you get:
Data can be uploaded to a history
Galaxy and Jupyter “live” in separate places. Data and notebooks can be pushed and pulled between the two worlds.
To import data from the History, find its ID and use get(ID) in Jupyter. The data are then available in the imports folder, e.g. /imports/18.
Save notebooks or data in the current History using put(filename) in Jupyter.
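The pull and push steps can be combined into one round trip. The sketch below takes the get and put functions as arguments, so it assumes they are already defined in the Galaxy Jupyter session as described above. The function summarize_dataset, the file name summary.txt and the one-line summary it writes are illustrative only.

```python
from pathlib import Path


def summarize_dataset(dataset_id, output_name, get, put, imports_dir="/imports"):
    """Pull a dataset from the History, write a small summary, push it back."""
    # Pull: after this call the data should be at <imports_dir>/<dataset_id>.
    get(dataset_id)
    source = Path(imports_dir) / str(dataset_id)

    # Stand-in for a real analysis: record the file size in bytes.
    size = source.stat().st_size
    Path(output_name).write_text(f"{source.name}: {size} bytes\n")

    # Push: save the result file in the current History.
    put(output_name)
    return source


# In a Galaxy Jupyter session (dataset ID 18 as in the example above):
# summarize_dataset(18, "summary.txt", get, put)
```

Passing get and put explicitly keeps the function usable outside Galaxy, for example with stand-in functions while testing on a laptop.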
Run notebooks:
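# Create and activate a dedicated conda environment
conda create -n defrag python=3.9
conda activate defrag
# Install the kernel machinery and the Galaxy helpers
pip install ipykernel
pip install galaxy-ie-helpers
# Register the environment as a Jupyter kernel named "defrag"
ipython kernel install --user --name="defrag"
# Install the analysis packages
pip install aicsimageio
pip install microfilm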
Now we can create a new notebook using that kernel!
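To confirm that a notebook really uses the new kernel, one option is to look at the Python installation it runs from. This sketch assumes the usual conda layout, where an environment lives in a folder named after it (for example .../envs/defrag); the helper name is illustrative.

```python
import sys
from pathlib import Path


def active_environment_name(prefix=sys.prefix):
    """Return the last folder name of the Python prefix, e.g. 'defrag'."""
    return Path(prefix).name


# Run in a notebook cell started with the "defrag" kernel.
print("Python executable:", sys.executable)
print("Environment folder:", active_environment_name())
```

If the second line does not end in defrag, the notebook is probably still attached to the default kernel.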
Pros
Cons
Thanks for your attention!
Exercises
Running Jupyter in Galaxy
Use MyBinder to turn a repo into a Jupyter session
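As a starting point for this exercise, a MyBinder launch link for a GitHub repository can be built from the user name, repository name and a branch, tag or commit. The sketch below assumes the mybinder.org URL scheme for GitHub repositories; the helper binder_url is illustrative, and whether a given repository builds depends on the environment files it contains.

```python
from urllib.parse import quote


def binder_url(user, repo, ref="HEAD"):
    """Build a mybinder.org launch URL for a GitHub repository."""
    return (
        "https://mybinder.org/v2/gh/"
        f"{quote(user, safe='')}/{quote(repo, safe='')}/{quote(ref, safe='')}"
    )


# Repository used as Example 1 in this course.
print(binder_url("guiwitz", "microfilm"))
```

Opening the printed link in a browser asks MyBinder to build the repository and start a Jupyter session from it.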