1 of 144

I Don't Like Notebooks

Joel Grus (@joelgrus) #JupyterCon 2018

2 of 144

3 of 144

how we

ended up

here

@joelgrus #jupytercon

4 of 144

the idea was received pretty warmly

5 of 144

about me

6 of 144

research engineer @ Ai2

7 of 144

know a few things about python

8 of 144

know a few things about data science

9 of 144

know a few things about software engineering

10 of 144

livecoder

11 of 144

blogger

...

12 of 144

podcast co-host

13 of 144

what I assume about

you

14 of 144

a couple of caveats

15 of 144

16 of 144

17 of 144

things I do like about notebooks

18 of 144

19 of 144

20 of 144

things I don't like about notebooks

21 of 144

notebooks have tons and tons of hidden state that's easy to screw up and difficult to reason about

22 of 144

simple example

23 of 144

if you look at the numbers, it's clear those cells weren't even executed in order!

24 of 144

if you look at the numbers, it's clear that something else was run between the first two lines
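
A minimal sketch of this kind of hidden state, not the example from the slide: three hypothetical "cells" are run with exec against one shared namespace, much as a kernel keeps one namespace for the whole notebook. The cell contents and the run helper are assumptions made up for illustration.

```python
# Each "cell" is a snippet of code; a notebook kernel runs all of them
# against a single shared namespace.
cells = {
    "a": "x = 1",
    "b": "x = x + 1",
    "c": "result = x * 10",
}

def run(order):
    """Run the named cells in the given order and return `result`."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace["result"]

top_to_bottom = run(["a", "b", "c"])      # each cell once, in order
cell_b_twice = run(["a", "b", "b", "c"])  # cell b re-run before c
c_before_b = run(["a", "c", "b"])         # c executed before b

print(top_to_bottom, cell_b_twice, c_before_b)  # 20 30 10
```

The same three cells give three different answers depending only on the order and number of runs, and nothing on the page except the execution counts records which history you got.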

25 of 144

26 of 144

27 of 144

28 of 144

29 of 144

30 of 144

31 of 144

one more

...

32 of 144

oops!

33 of 144

notebooks are difficult for beginners

34 of 144

Start off by creating a notebook.

35 of 144

(hidden state complications are not obvious)

36 of 144

this is most people's experience of how code works

the ability to run code snippets in arbitrary order is weird and unintuitive

37 of 144

beginner tutorials are cavalier about (or silent on) hidden state

38 of 144

39 of 144

40 of 144

41 of 144

as notebook experts, the problem here is obvious to you

(also this is a really simple example)

for beginners, with dozens of cells and more complex code, this is utterly confusing

(I know this because they come to me with their problems)

42 of 144

in fact, my original angry tweet was prompted by the [large number]th occasion of me helping a newcomer to python who used a notebook because they heard that was *the thing to do*, and who found "python" bewildering 100% on account of messed-up hidden state in their notebook, caused by normal beginner sloppiness and out-of-order execution

43 of 144

Lots of beginners use notebooks; clearly it can't be that difficult.

It's not that it's insurmountably difficult, it's that the out-of-order execution makes learning Python more confusing than it needs to be.

44 of 144

notebooks encourage bad habits

45 of 144

46 of 144

apparently this was a slight misrepresentation of the "tip", although I didn't learn that until after I started a Twitter war over it

47 of 144

Rigorous software engineering isn't that important, I'm just experimenting!

You mean you're just

doing science?

48 of 144

I just want to see if my model works before I put it into production.

Don't you need to write correct code to make sure it works?

49 of 144

anti-pattern: the need for speed

50 of 144

51 of 144

52 of 144

"Here's some code I wrote, but I don't want my client or decision maker to see"

53 of 144

54 of 144

notebooks discourage good habits

55 of 144

modularity

56 of 144

testability
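
One hedged illustration of what modularity and testability buy you: logic that lives in plain functions (imagine them in a hypothetical stats.py module rather than in a cell) can be imported anywhere and checked with an ordinary test. The names mean, de_mean, and test_de_mean are made up for this sketch.

```python
from typing import List

def mean(xs: List[float]) -> float:
    """Average of a non-empty list."""
    return sum(xs) / len(xs)

def de_mean(xs: List[float]) -> List[float]:
    """Shift the values so that their mean is zero."""
    x_bar = mean(xs)
    return [x - x_bar for x in xs]

def test_de_mean() -> None:
    # A test runner such as pytest would collect this function by its name;
    # here it is simply called directly.
    assert de_mean([1.0, 2.0, 3.0]) == [-1.0, 0.0, 1.0]
    assert mean(de_mean([2.0, 4.0, 9.0])) == 0.0

test_de_mean()
```

Nothing here depends on which cells ran earlier, which is what makes the test meaningful.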

57 of 144

58 of 144

59 of 144

60 of 144

notebooks are way less helpful than my text editor

61 of 144

some things are easier demonstrated

62 of 144

Just run the cell first. And then just press TAB.

63 of 144

Just replace the parentheses with a question mark. And then just execute the cell.

This also "uses up" a computation, so either I have to leave my ignorance there or have "out of order" cells

64 of 144

You may not want to use a filename as your variable name, but some people might!

65 of 144

...

66 of 144

...

67 of 144

I found an extension called "hinterland" that does some of these, but...

  • questionable autocomplete behavior
  • still can only autocomplete things that have been executed
  • which means still can't handle type hints or with blocks
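
To make the last bullet concrete, here is a small sketch with hypothetical names. Completion that inspects live objects has nothing to inspect for `lines` or `handle` until the code has actually run; a tool that reads the annotation and the with statement can work out their types from the source alone.

```python
import io
from typing import List

def total_length(lines: List[str]) -> int:
    # While you are typing this body, `lines` has no runtime value.
    # Only the annotation says that each element is a str.
    return sum(len(line.strip()) for line in lines)

with io.StringIO("alpha\nbeta\n") as handle:
    # `handle` is bound only while the with block executes, so there is
    # no live object to query while the block is still being written.
    count = total_length(handle.readlines())

print(count)  # 9
```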

enter

68 of 144

...

69 of 144

70 of 144

those are the tools that help me be productive and write good code

71 of 144

You just need to use the next-generation JupyterLab.

72 of 144

notebooks encourage bad processes

73 of 144

74 of 144

75 of 144

76 of 144

notebooks HINDER

reproducible + extensible science

77 of 144

78 of 144

notebooks are a recipe for poorly factored code

79 of 144

idea:

train a neural network to generate music

this is the true story of an idea that my co-workers and I were kicking around at lunch a couple of months ago; there was more to the idea, but this was the core of it

80 of 144

"standing on the shoulders of giants, etc., etc."

81 of 144

😬

was hoping for model.py or models.py or something

82 of 144

nope

pytorch 0.3? 0.4? 0.4.1?

I suppose I could guess based on the commit date

83 of 144

an aside

84 of 144

model definition

model instantiation (with hard-coded parameters)

more parameters

hard-coded paths

more parameters

training loop

85 of 144

  • This code is pretty much impossible for me to use
  • First, I have to look at the notebooks, manually install all the dependencies, and hope I got the right version
  • Then either:
    • I can run it the exact same way on the exact same data (at the exact same paths!), or
    • I can copy the notebook and try to figure out how to modify it (forcing me to work in a notebook), or
    • I can do a lot of work to copy and paste the right parts into a module that I can import into my own code

You know, there is a tool called nbconvert

86 of 144

(yes, I am aware of nbconvert)

$ jupyter nbconvert music_generation_training_nottingham.ipynb --to python

  • still mixes "library" code and "execution" code

  • still have to go through and perform surgery

  • still no requirements.txt / unit tests

  • still few comments :(
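
One common way to keep "library" code and "execution" code apart, sketched here with hypothetical names (scale, main) rather than anything from the converted notebook: put the reusable pieces in functions and keep the part that actually runs behind a main guard, so that importing the file has no side effects.

```python
from typing import List

def scale(values: List[float], factor: float) -> List[float]:
    """Library code: no paths, no printing, safe to import."""
    return [v * factor for v in values]

def main() -> None:
    """Execution code: the part a notebook would run top to bottom."""
    print(scale([1.0, 2.0, 3.0], factor=2.0))

if __name__ == "__main__":
    # Runs only when the file is executed as a script, not when imported.
    main()
```

A converted notebook does not come out in this shape on its own, which is where the "surgery" comes in.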

87 of 144

to be perfectly fair

  • this is someone's fun project
  • kudos to them for sharing what they did
  • likely they thought they were being helpful by providing notebooks
  • what they did was make it semi-possible (but not easy) for me to repeat their exact same code (but only if I have my data at the exact same paths and the right versions of all their dependencies, ¯\_(ツ)_/¯)
  • while making it very hard for me to build new things on top of their code
  • again, lots of non-notebook code has these same issues; it's just that notebooks implicitly encourage a workflow that produces them

88 of 144

but imagine...

89 of 144

but imagine...

  • a requirements.txt (possibly even a Docker image) indicating exactly what dependencies I need
  • clean, parameterizable load_data.py and model.py with good documentation and unit tests (but I repeat myself)
  • an easy way to programmatically specify model parameters, data paths, and so on
  • a run_experiment.py script that replicates exactly the original result, and that can be tweaked in obvious ways to run novel variants
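
A rough sketch of what the last two bullets could look like, using only the standard library. Everything here is hypothetical: the ExperimentConfig fields, the flag names, and the placeholder run_experiment body are assumptions for illustration, not the music project's actual code.

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ExperimentConfig:
    """Every knob in one place instead of scattered across cells."""
    data_path: str
    hidden_size: int = 128
    num_epochs: int = 10

def parse_config(argv: Optional[List[str]] = None) -> ExperimentConfig:
    """Build a config from command-line style arguments."""
    parser = argparse.ArgumentParser(description="Run one experiment.")
    parser.add_argument("--data-path", required=True)
    parser.add_argument("--hidden-size", type=int, default=128)
    parser.add_argument("--num-epochs", type=int, default=10)
    args = parser.parse_args(argv)
    return ExperimentConfig(args.data_path, args.hidden_size, args.num_epochs)

def run_experiment(config: ExperimentConfig) -> str:
    # Placeholder: a real script would load the data and train a model here.
    return f"training on {config.data_path} for {config.num_epochs} epochs"

# Passing the arguments explicitly keeps the sketch runnable anywhere;
# the path is a made-up example.
config = parse_config(["--data-path", "data/train.txt", "--num-epochs", "3"])
summary = run_experiment(config)
print(summary)  # training on data/train.txt for 3 epochs
```

In an actual script, calling parse_config() with no arguments would read the real command line, so a novel variant is a different set of flags rather than an edited copy of the code.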

90 of 144

91 of 144

notebooks make it hard to collaborate across media

92 of 144

the scene: pytorch slack, #beginner channel

93 of 144

"It never hurts to help!"

94 of 144

I imagine it took you a few tries to produce that; a notebook would have made it easier

95 of 144

ok, I guess I could do that IN a notebook

96 of 144

but how to copy and paste into slack?

97 of 144

select all -> copy -> paste

98 of 144

select cells -> edit -> copy cells -> paste

doesn't seem to do anything outside of the notebook

that is, when you go to Slack, there's nothing in the clipboard

99 of 144

select cells -> copy -> paste

100 of 144

basically, I have no idea how to copy and paste code + outputs from a notebook into slack

as far as I can tell, in JupyterLab even the "ugly" versions don't work!

101 of 144

this may seem frivolous, but I spend a lot of time debugging people's python issues via slack

102 of 144

same thing for github issues / code reviews

103 of 144

one other nuisance that I discovered when working on this example!

104 of 144

kernel -> restart and run all

error was expected

(and is part of what I want to demonstrate)

but halts execution
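
One workaround, sketched under the assumption that the error is an ordinary Python exception: catch it inside the cell and display it, so that a run-all pass gets past that cell. The failing call below is a stand-in, not the example from the talk.

```python
def parse_port(text: str) -> int:
    """Hypothetical helper that fails on bad input."""
    return int(text)

try:
    parse_port("not a number")
except ValueError as error:
    # Record and show the error instead of letting it stop the run.
    caught = f"{type(error).__name__}: {error}"
    print(caught)

after_error = "later cells still run"
```

The cost is that the cell no longer shows a real traceback, which may be the very thing you wanted to demonstrate.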

105 of 144

106 of 144

107 of 144

notebooks make it easy to teach poorly

108 of 144

109 of 144

110 of 144

111 of 144

112 of 144

113 of 144

notebooks make it hard for me to teach well

114 of 144

this is from a real book

written by me

Data Science from Scratch

available wherever books are sold

2nd edition being written as we speak

OK, not exactly as we speak.

115 of 144

You should make a notebook version of your book.

Sure, I'll give it a try!

116 of 144

this is from a real book

written by me

Data Science from Scratch

available wherever books are sold

2nd edition being written as we speak

OK, not exactly as we speak.

117 of 144

Well, your book shouldn't split the class into multiple pieces like that

118 of 144

...

You'd rather I dump several pages of code at the reader all at once, then refer back to it bit by bit?

119 of 144

It's not just me!

120 of 144

121 of 144

122 of 144

this was also suggested

123 of 144

every one of these requires presenting unnatural code to my readers, solely to appease the notebook gods

124 of 144

simpler examples too

this is not a real variable, which means this snippet isn't executable

this example would not be improved by actually defining that variable

125 of 144

what I do instead

126 of 144

markdown tutorials

and

docs

127 of 144

vscode + ipython console

128 of 144

demo

(time permitting)

129 of 144

how you could win me over

130 of 144

131 of 144

132 of 144

IDE-style autocomplete

133 of 144

real-time typechecking and linting

134 of 144

a better story around dependency management

135 of 144

first-class support for refactoring code out of notebooks into modules

136 of 144

the reality is

  • you're not going to provide me all these things
  • I'm not going to switch to notebooks
  • but hopefully I've challenged you to think about
    • how your tools limit you
    • how to write better software
    • how to teach better
    • how to practice better science
    • how different workflows could make your life easier
    • some ways that you could make notebooks better

137 of 144

in conclusion

138 of 144

please come to my next talk:

"I don't like that slideshow program where sometimes the next slide is RIGHT and sometimes the next slide is DOWN"

139 of 144

thanks for coming!

140 of 144

appendix (rejected memes)

141 of 144

142 of 144

143 of 144

144 of 144

UNTITLED25.IPYNB

UNTITLED24.IPYNB