1 of 254

Writing Code for NLP Research

EMNLP 2018

{joelg,mattg,markn}@allenai.org

2 of 254

Who we are

Matt Gardner (@nlpmattg)

Matt is a research scientist on AllenNLP. He was the original architect of AllenNLP, and he co-hosts the NLP Highlights podcast.

Mark Neumann (@markneumannnn)

Mark is a research engineer on AllenNLP. He helped build AllenNLP and its precursor DeepQA with Matt, and has implemented many of the models in the demos.

Joel Grus (@joelgrus)

Joel is a research engineer on AllenNLP, although you may know him better from "I Don't Like Notebooks" or from "Fizz Buzz in Tensorflow" or from his book Data Science from Scratch.

3 of 254

Outline

  • How to write code when prototyping
  • Developing good processes

BREAK

  • How to write reusable code for NLP
  • Case Study: A Part-of-Speech Tagger
  • Sharing Your Research

4 of 254

What we expect you know already

5 of 254

What we expect you know already

modern (neural) NLP

6 of 254

What we expect you know already

Python

7 of 254

What we expect you know already

the difference between good science and bad science

8 of 254

What you'll learn today

9 of 254

What you'll learn today

how to write code in a way that facilitates good science and reproducible experiments

10 of 254

What you'll learn today

how to write code in a way that makes your life easier

11 of 254

The Elephant in the Room: AllenNLP

  • This is not a tutorial about AllenNLP
  • But (obviously, seeing as we wrote it) AllenNLP represents our experiences and opinions about how best to write research code
  • Accordingly, we'll use it in most of our examples
  • And we hope you'll come out of this tutorial wanting to give it a try
  • But our goal is that you find the tutorial useful even if you never use AllenNLP

AllenNLP

12 of 254

Two modes of writing research code

13 of 254

1: prototyping

2: writing components

14 of 254

Prototyping New Models

15 of 254

Main goals during prototyping

  • Write code quickly
  • Run experiments, keep track of what you tried
  • Analyze model behavior - did it do what you wanted?

16 of 254

Main goals during prototyping

  • Write code quickly
  • Run experiments, keep track of what you tried
  • Analyze model behavior - did it do what you wanted?

17 of 254

Writing code quickly - Use a framework!

18 of 254

Writing code quickly - Use a framework!

  • Training loop?

19 of 254

Writing code quickly - Use a framework!

  • Training loop?

model = LSTMTagger(EMBEDDING_DIM, HIDDEN_DIM,
                   len(word_to_ix), len(tag_to_ix))
loss_function = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

validation_losses = []
patience = 10

for epoch in range(1000):
    training_loss = 0.0
    validation_loss = 0.0
    for dataset, training in [(training_data, True),
                              (validation_data, False)]:
        correct = total = 0
        torch.set_grad_enabled(training)
        t = tqdm.tqdm(dataset)
        for i, (sentence, tags) in enumerate(t):
            model.zero_grad()
            model.hidden = model.init_hidden()
            sentence_in = prepare_sequence(sentence, word_to_ix)
            targets = prepare_sequence(tags, tag_to_ix)
            tag_scores = model(sentence_in)
            loss = loss_function(tag_scores, targets)
            predictions = tag_scores.max(-1)[1]
            correct += (predictions == targets).sum().item()
            total += len(targets)
            accuracy = correct / total
            if training:
                loss.backward()
                training_loss += loss.item()
                t.set_postfix(training_loss=training_loss/(i + 1),
                              accuracy=accuracy)
                optimizer.step()
            else:
                validation_loss += loss.item()
                t.set_postfix(validation_loss=validation_loss/(i + 1),
                              accuracy=accuracy)
    validation_losses.append(validation_loss)
    if (patience and
            len(validation_losses) >= patience and
            validation_losses[-patience] ==
            min(validation_losses[-patience:])):
        print("patience reached, stopping early")
        break

20 of 254

Writing code quickly - Use a framework!

  • Tensorboard logging?
  • Model checkpointing?
  • Complex data processing, with smart batching?
  • Computing span representations?
  • Bi-directional attention matrices?

  • Easily thousands of lines of code!

21 of 254

Writing code quickly - Use a framework!

  • Don’t start from scratch! Use someone else’s components.

22 of 254

Writing code quickly - Use a framework!

  • But...

23 of 254

Writing code quickly - Use a framework!

  • But...
  • Make sure you can bypass the abstractions when you need to

24 of 254

Writing code quickly - Get a good starting place

25 of 254

Writing code quickly - Get a good starting place

  • First step: get a baseline running

  • This is good research practice, too

26 of 254

Writing code quickly - Get a good starting place

  • Could be someone else’s code... as long as you can read it

27 of 254

Writing code quickly - Get a good starting place

  • Could be someone else’s code... as long as you can read it

28 of 254

Writing code quickly - Get a good starting place

  • Even better if this code already modularizes what you want to change

Add ELMo / BERT here

29 of 254

Writing code quickly - Get a good starting place

  • Re-implementing a SOTA baseline is incredibly helpful for understanding what’s going on, and where some decisions might have been made better

30 of 254

Writing code quickly - Copy first, refactor later

  • CS degree:

31 of 254

Writing code quickly - Copy first, refactor later

  • CS degree:

32 of 254

Writing code quickly - Copy first, refactor later

  • CS degree:

We’re prototyping! Just go fast and find something that works, then go back and refactor (if you made something useful)

33 of 254

Writing code quickly - Copy first, refactor later

  • Really bad idea: using inheritance to share code for related models
  • Instead: just copy the code, figure out how to share later, if it makes sense

34 of 254

Writing code quickly - Do use good code style

  • CS degree:

35 of 254

Writing code quickly - Do use good code style

  • CS degree:

36 of 254

Writing code quickly - Do use good code style

37 of 254

Writing code quickly - Do use good code style

38 of 254

Writing code quickly - Do use good code style

39 of 254

Writing code quickly - Do use good code style

Meaningful names

40 of 254

Writing code quickly - Do use good code style

Shape comments on tensors
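For example (a made-up snippet; the names and shapes are just for illustration):

# (batch_size, sequence_length, embedding_dim)
embedded_text = self.text_field_embedder(tokens)
# (batch_size, sequence_length, 2 * hidden_dim)
encoded_text = self.encoder(embedded_text, mask)
# (batch_size, 2 * hidden_dim)
pooled = encoded_text.max(dim=1)[0]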

41 of 254

Writing code quickly - Do use good code style

Comments describing non-obvious logic

42 of 254

Writing code quickly - Do use good code style

Write code for people, not machines

43 of 254

Writing code quickly - Minimal testing (but not no testing)

  • CS degree:

44 of 254

Writing code quickly - Minimal testing (but not no testing)

  • CS degree:

45 of 254

Writing code quickly - Minimal testing (but not no testing)

  • A test that checks experimental behavior is a waste of time

46 of 254

Writing code quickly - Minimal testing (but not no testing)

  • But, some parts of your code aren’t experimental

47 of 254

Writing code quickly - Minimal testing (but not no testing)

  • And even experimental parts can have useful tests

48 of 254

Writing code quickly - Minimal testing (but not no testing)

  • And even experimental parts can have useful tests

Make sure data processing works consistently, that tensor operations run, and that gradients are non-zero
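For example, a minimal "gradients are non-zero" check might look like this (a sketch assuming a model that returns a dict with a "loss" entry, as AllenNLP models do):

def test_gradients_flow(self):
    output = self.model(**self.training_tensors)
    output["loss"].backward()
    # every trainable parameter should have received some gradient
    for name, parameter in self.model.named_parameters():
        if parameter.requires_grad:
            assert parameter.grad is not None, name
            assert parameter.grad.abs().sum().item() > 0.0, name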

49 of 254

Writing code quickly - Minimal testing (but not no testing)

  • And even experimental parts can have useful tests

Run on small test fixtures, so debugging cycle is seconds, not minutes

50 of 254

Writing code quickly - How much to hard-code?

  • Which one should I do?

51 of 254

Writing code quickly - How much to hard-code?

  • Which one should I do?

I’m just prototyping! Why shouldn’t I just hard-code an embedding layer?

52 of 254

Writing code quickly - How much to hard-code?

  • Which one should I do?

Why so abstract?

53 of 254

Writing code quickly - How much to hard-code?

  • Which one should I do?

For the parts you aren't focusing on, start simple; later you can add ELMo, etc., without rewriting your code.
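A minimal sketch of what "abstract, not hard-coded" means here (class and argument names are invented for illustration):

class MyTagger(torch.nn.Module):
    def __init__(self,
                 embedder: torch.nn.Module,   # nn.Embedding today; ELMo / BERT later
                 encoder: torch.nn.Module) -> None:
        super().__init__()
        self.embedder = embedder
        self.encoder = encoder

Swapping the embedding is now a change at the construction site (or in a config file), not a rewrite of the model.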

54 of 254

Writing code quickly - How much to hard-code?

  • Which one should I do?

This also makes controlled experiments easier (both for you and for people who come after you).

55 of 254

Writing code quickly - How much to hard-code?

  • Which one should I do?

And it helps you think more clearly about the pieces of your model.

56 of 254

Main goals during prototyping

  • Write code quickly
  • Run experiments, keep track of what you tried
  • Analyze model behavior - did it do what you wanted?

57 of 254

Running experiments - Keep track of what you ran

  • You run a lot of stuff when you're prototyping; it can be hard to keep track of what happened when, and with what code

58 of 254

Running experiments - Keep track of what you ran

59 of 254

Running experiments - Keep track of what you ran

This is important!

60 of 254

Running experiments - Keep track of what you ran

  • Currently in invite-only alpha; public beta coming soon
  • https://github.com/allenai/beaker
  • https://beaker-pub.allenai.org

61 of 254

Running experiments - Keep track of what you ran

62 of 254

Running experiments - Keep track of what you ran

63 of 254

Running experiments - Keep track of what you ran

64 of 254

Running experiments - Controlled experiments

  • Which one gives more understanding?

65 of 254

Running experiments - Controlled experiments

  • Which one gives more understanding?

Important for putting your work in context

66 of 254

Running experiments - Controlled experiments

  • Which one gives more understanding?

But… too many moving parts, hard to know what caused the difference

67 of 254

Running experiments - Controlled experiments

  • Which one gives more understanding?

Very controlled experiments, varying one thing: we can make causal claims

68 of 254

Running experiments - Controlled experiments

  • Which one gives more understanding?

How do you set up your code for this?

69 of 254

Running experiments - Controlled experiments

70 of 254

Running experiments - Controlled experiments

Possible ablations

71 of 254

Running experiments - Controlled experiments

GloVe vs. character CNN vs. ELMo vs. BERT

72 of 254

Running experiments - Controlled experiments

LSTM vs. Transformer vs. GatedCNN vs. QRNN

73 of 254

Running experiments - Controlled experiments

  • Not good: modifying code to run different variants; hard to keep track of what you ran
  • Better: configuration files, or separate scripts, or something similar (see the sketch below)
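Even something as simple as a tiny argparse wrapper (a hypothetical sketch, not AllenNLP code) beats editing the model file between runs, because the exact variant you ran survives in your shell history and logs:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--embedding", choices=["glove", "char_cnn", "elmo"], default="glove")
parser.add_argument("--encoder", choices=["lstm", "transformer"], default="lstm")
args = parser.parse_args()
# build the model from args, and record args alongside every result file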

74 of 254

Main goals during prototyping

  • Write code quickly
  • Run experiments, keep track of what you tried
  • Analyze model behavior - did it do what you wanted?

75 of 254

Analyze results - Tensorboard

  • Crucial tool for understanding model behavior during training
  • There is no better visualizer. If you don’t use this, start now.

76 of 254

Analyze results - Tensorboard

  • Crucial tool for understanding model behavior during training
  • There is no better visualizer. If you don’t use this, start now.

A good training loop will give you this for free, for any model.

77 of 254

Analyze results - Tensorboard

  • Metrics
    • Loss
    • Accuracy etc.
  • Gradients
    • Mean values
    • Std values
    • Actual update values
  • Parameters
    • Mean values
    • Std values
  • Activations
    • Log problematic activations
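A hand-rolled sketch of logging a few of the quantities above (using the tensorboardX package; a good training loop, like AllenNLP's, does the equivalent for you automatically):

from tensorboardX import SummaryWriter

writer = SummaryWriter("logs/run1")
writer.add_scalar("loss/train", loss.item(), global_step)
for name, param in model.named_parameters():
    writer.add_histogram("parameters/" + name, param, global_step)
    if param.grad is not None:
        writer.add_histogram("gradients/" + name, param.grad, global_step)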

78 of 254

Analyze results - Tensorboard

Tensorboard will find optimisation bugs for you for free.

Here, the gradient for the embedding is 2 orders of magnitude different from the rest of the gradients.

79 of 254

Analyze results - Tensorboard

Tensorboard will find optimisation bugs for you for free.

Here, the gradient for the embedding is 2 orders of magnitude different from the rest of the gradients.

Can anyone guess why?

80 of 254

Analyze results - Tensorboard

Tensorboard will find optimisation bugs for you for free.

Here, the gradient for the embedding is 2 orders of magnitude different from the rest of the gradients.

Embeddings have sparse gradients (only some embeddings are updated), but the momentum coefficients for Adam are calculated for the whole embedding every time.

Solution:

from allennlp.training.optimizers import DenseSparseAdam

(uses sparse accumulators for gradient moments)
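If you're training from a config, you'd select it in the trainer section; we believe the registered name is "dense_sparse_adam" (treat this snippet as a sketch):

"trainer": {
    "optimizer": {
        "type": "dense_sparse_adam"
    }
}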

81 of 254

Analyze results - Look at your data!

  • Good:

82 of 254

Analyze results - Look at your data!

  • Better:

83 of 254

Analyze results - Look at your data!

  • Better:

84 of 254

Analyze results - Look at your data!

  • Best:

85 of 254

Analyze results - Look at your data!

  • Best:

How do you design your code for this?

86 of 254

Analyze results - Look at your data!

  • Best:

How do you design your code for this?

We'll say more later, but the key points are (sketched below):

  • Separate data processing that also works on JSON
  • Model needs to run without labels / computing loss
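Concretely, making labels optional in forward is what lets the same model both train and serve predictions (a sketch in the style of the models shown later; _compute_logits and _loss are hypothetical helpers):

def forward(self,
            tokens: Dict[str, torch.Tensor],
            labels: torch.Tensor = None) -> Dict[str, torch.Tensor]:
    logits = self._compute_logits(tokens)
    output = {"logits": logits}
    if labels is not None:
        # loss is only computed (and returned) when labels exist
        output["loss"] = self._loss(logits, labels)
    return output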

87 of 254

Key point during prototyping:

The components that you use matter. A lot.

88 of 254

We’ll give specific thoughts on designing components after the break

89 of 254

Developing Good Processes

90 of 254

Source Control

91 of 254

We Hope You're Already Using Source Control!

makes it easy to safely experiment with code changes

    • if things go wrong, just revert!

92 of 254

We Hope You're Already Using Source Control!

  • makes it easy to collaborate

93 of 254

We Hope You're Already Using Source Control!

  • makes it easy to revisit older versions of your code

94 of 254

We Hope You're Already Using Source Control!

  • makes it easy to implement code reviews

95 of 254

That's right, code reviews!

96 of 254

About Code Reviews

  • code reviewers find mistakes

97 of 254

About Code Reviews

  • code reviewers point out improvements

98 of 254

About Code Reviews

  • code reviewers force you to make your code readable

99 of 254

About Code Reviews

and clear, readable code allows your code reviews to be discussions of your modeling decisions

100 of 254

About Code Reviews

  • code reviewers can be your scapegoat when it turns out your results are wrong because of a bug

101 of 254

Continuous Integration

(+ Build Automation)

102 of 254

Continuous Integration (+ Build Automation)

Continuous Integration

always be merging (into a branch)

Build Automation

always be running your tests (+ other checks)

(this means you have to write tests)

103 of 254

Example: Typical AllenNLP PR

104 of 254

105 of 254

if you're not building a library that lots of other people rely on, you probably don't need all these steps

106 of 254

but you do need some of them

107 of 254

Testing Your Code

108 of 254

What do we mean by "test your code"?

109 of 254

Write Unit Tests

a unit test is an automated check that a small part of your code works correctly

110 of 254

What should I test?

111 of 254

If You're Prototyping, Test the Basics

112 of 254

Prototyping? Test the Basics

def test_read_from_file(self):
    conll_reader = Conll2003DatasetReader()
    instances = conll_reader.read('data/conll2003.txt')
    instances = ensure_list(instances)

    expected_labels = ['I-ORG', 'O', 'I-PER', 'O', 'O', 'I-LOC', 'O']

    fields = instances[0].fields
    tokens = [t.text for t in fields['tokens'].tokens]
    assert tokens == ['U.N.', 'official', 'Ekeus', 'heads', 'for', 'Baghdad', '.']
    assert fields["tags"].labels == expected_labels

    fields = instances[1].fields
    tokens = [t.text for t in fields['tokens'].tokens]
    assert tokens == ['AI2', 'engineer', 'Joel', 'lives', 'in', 'Seattle', '.']
    assert fields["tags"].labels == expected_labels

113 of 254

Prototyping? Test the Basics

def test_forward_pass_runs_correctly(self):
    output_dict = self.model(**self.training_tensors)
    tags = output_dict['tags']
    assert len(tags) == 2
    assert len(tags[0]) == 7
    assert len(tags[1]) == 7
    for example_tags in tags:
        for tag_id in example_tags:
            tag = idx_to_token[tag_id]
            assert tag in {'O', 'I-ORG', 'I-PER', 'I-LOC'}

114 of 254

If You're Writing Reusable Components, Test Everything

115 of 254

Test Everything

test your model can train, save, and load

116 of 254

Test Everything

test that it's computing / backpropagating gradients

117 of 254

Test Everything

but how?

118 of 254

Use Test Fixtures

create tiny datasets that look like the real thing

The###DET dog###NN ate###V the###DET apple###NN

Everybody###NN read###V that###DET book###NN

119 of 254

Use Test Fixtures

use them to create tiny pretrained models

It’s ok if the weights are essentially random. We’re not testing that the model is any good.
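One way to produce such a fixture (a sketch; the paths are placeholders): train on the tiny fixture dataset with a tiny config, and check the resulting archive into your repo:

params = Params.from_file("tests/fixtures/tagger/experiment.json")
train_model(params, "tests/fixtures/tagger/serialization")
# the resulting model.tar.gz is a few KB; its weights being garbage is fine,
# because we're testing the plumbing, not the model quality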

120 of 254

Use Test Fixtures

  • write unit tests that use them to run your data pipelines and models
    • detect logic errors
    • detect malformed outputs
    • detect incorrect outputs

121 of 254

Use your knowledge to write clever tests

def test_attention_is_normalised_correctly(self):
    input_dim = 7
    sequence_tensor = torch.randn([2, 5, input_dim])
    extractor = SelfAttentiveSpanExtractor(input_dim=input_dim)
    # [not shown on the slide] inclusive span indices for each batch element,
    # e.g. the span (1, 3) for the first sequence:
    indices = torch.LongTensor([[[1, 3]], [[2, 4]]])
    # In order to test the attention, we'll make the weight which
    # computes the logits zero, so the attention distribution is
    # uniform over the sentence. This lets us check that the
    # computed spans are just the averages of their representations.
    extractor._global_attention._module.weight.data.fill_(0.0)
    extractor._global_attention._module.bias.data.fill_(0.0)
    span_representations = extractor(sequence_tensor, indices)
    spans = span_representations[0]
    mean_embeddings = sequence_tensor[0, 1:4, :].mean(0)
    numpy.testing.assert_array_almost_equal(spans[0].data.numpy(),
                                            mean_embeddings.data.numpy())

Attention is hard to test because it relies on parameters

122 of 254

Use your knowledge to write clever tests

def test_attention_is_normalised_correctly(self):
    input_dim = 7
    sequence_tensor = torch.randn([2, 5, input_dim])
    extractor = SelfAttentiveSpanExtractor(input_dim=input_dim)
    # [not shown on the slide] inclusive span indices for each batch element,
    # e.g. the span (1, 3) for the first sequence:
    indices = torch.LongTensor([[[1, 3]], [[2, 4]]])
    # In order to test the attention, we'll make the weight which
    # computes the logits zero, so the attention distribution is
    # uniform over the sentence. This lets us check that the
    # computed spans are just the averages of their representations.
    extractor._global_attention._module.weight.data.fill_(0.0)
    extractor._global_attention._module.bias.data.fill_(0.0)
    span_representations = extractor(sequence_tensor, indices)
    spans = span_representations[0]
    mean_embeddings = sequence_tensor[0, 1:4, :].mean(0)
    numpy.testing.assert_array_almost_equal(spans[0].data.numpy(),
                                            mean_embeddings.data.numpy())

Idea: Make the parameters deterministic so you can test everything else

123 of 254

Pre-Break Summary

  • Two Modes of Writing Research Code
    • Difference between prototyping and building components
    • When should you transition?
    • Good ways to analyse results
  • Developing Good Processes
    • How to write good tests
    • How to know what to test
    • Why you should do code reviews

124 of 254

BREAK

please fill out our survey:

https://tinyurl.com/emnlp-tutorial-survey

will tweet out link to slides after talk

@ai2_allennlp

125 of 254

Reusable Components

126 of 254

What are the right abstractions for NLP?

127 of 254

The Right Abstractions

  • AllenNLP now has more than 20 models in it
    • some simple
    • some complex
  • Some abstractions have consistently proven useful
  • (Some haven't)

128 of 254

Things That We Use A Lot

  • training a model
  • mapping words (or characters, or labels) to indexes
  • summarizing a sequence of tensors with a single tensor

129 of 254

Things That Require a Fair Amount of Code

  • training a model
  • (some ways of) summarizing a sequence of tensors with a single tensor
  • some neural network modules

130 of 254

Things That Have Many Variations

  • turning a word (or a character, or a label) into a tensor
  • summarizing a sequence of tensors with a single tensor
  • transforming a sequence of tensors into a sequence of tensors

131 of 254

Things that reflect our higher-level thinking

  • we'll have some inputs:
    • text, almost certainly
    • tags/labels, often
    • spans, sometimes
  • we need some ways of embedding them as tensors
    • one hot encoding
    • low-dimensional embeddings
  • we need some ways of dealing with sequences of tensors
    • sequence in -> sequence out (e.g. all outputs of an LSTM)
    • sequence in -> tensor out (e.g. last output of an LSTM)

132 of 254

Along the way, we need to worry about some things that make NLP tricky

133 of 254

Inputs are text, but neural models want tensors

134 of 254

Inputs are sequences of things

and order matters

135 of 254

Inputs can vary in length

Some sentences are short.

Whereas other sentences are so long that by the time you finish reading them you've already forgotten what they started off talking about and you have to go back and read them a second time in order to remember the parts at the beginning.

136 of 254

Reusable Components in AllenNLP

137 of 254

AllenNLP is built on PyTorch

138 of 254

AllenNLP is built on PyTorch

and is inspired by the question "what higher-level components would help NLP researchers do their research better + more easily?"

139 of 254

AllenNLP is built on PyTorch

under the covers, every piece of a model is a torch.nn.Module and every number is part of a torch.Tensor

140 of 254

AllenNLP is built on PyTorch

but we want you to be able to reason at a higher level most of the time

141 of 254

hence the higher level concepts

142 of 254

the Model

class Model(torch.nn.Module, Registrable):
    def __init__(self,
                 vocab: Vocabulary,
                 regularizer: RegularizerApplicator = None) -> None: ...

    def forward(self, *inputs) -> Dict[str, torch.Tensor]: ...

    def get_metrics(self, reset: bool = False) -> Dict[str, float]: ...

    @classmethod
    def load(cls,
             config: Params,
             serialization_dir: str,
             weights_file: str = None,
             cuda_device: int = -1) -> 'Model': ...

143 of 254

Model.forward

def forward(self, *inputs) -> Dict[str, torch.Tensor]: ...

  • returns a dict [!]
  • by convention, "loss" tensor is what the training loop will optimize
  • but as a dict entry, "loss" is completely optional
    • which is good, since at inference / prediction time you don't have one
  • can also return predictions, model internals, or any other outputs you'd want in an output dataset or a demo

144 of 254

every NLP project needs a Vocabulary

class Vocabulary(Registrable):
    def __init__(self,
                 counter: Dict[str, Dict[str, int]] = None,
                 min_count: Dict[str, int] = None,
                 max_vocab_size: Union[int, Dict[str, int]] = None,
                 non_padded_namespaces: Iterable[str] = DEFAULT_NON_PADDED_NAMESPACES,
                 pretrained_files: Optional[Dict[str, str]] = None,
                 only_include_pretrained_words: bool = False,
                 tokens_to_add: Dict[str, List[str]] = None,
                 min_pretrained_embeddings: Dict[str, int] = None) -> None: ...

    @classmethod
    def from_instances(cls, instances: Iterable['Instance'], ...) -> 'Vocabulary': ...

    def add_token_to_namespace(self, token: str, namespace: str = 'tokens') -> int: ...

    def get_token_index(self, token: str, namespace: str = 'tokens') -> int: ...

    def get_token_from_index(self, index: int, namespace: str = 'tokens') -> str:
        return self._index_to_token[namespace][index]

    def get_vocab_size(self, namespace: str = 'tokens') -> int:
        return len(self._token_to_index[namespace])

145 of 254

a Vocabulary is built from Instances

class Instance(Mapping[str, Field]):
    def __init__(self, fields: MutableMapping[str, Field]) -> None: ...

    def add_field(self, field_name: str, field: Field, vocab: Vocabulary = None) -> None: ...

    def count_vocab_items(self, counter: Dict[str, Dict[str, int]]): ...

    def index_fields(self, vocab: Vocabulary) -> None: ...

    def get_padding_lengths(self) -> Dict[str, Dict[str, int]]: ...

    def as_tensor_dict(self,
                       padding_lengths: Dict[str, Dict[str, int]] = None) -> Dict[str, DataArray]: ...

146 of 254

an Instance is a collection of Fields

a Field contains a data element and knows how to turn it into a tensor

class Field(Generic[DataArray]):
    def count_vocab_items(self, counter: Dict[str, Dict[str, int]]): ...

    def index(self, vocab: Vocabulary): ...

    def get_padding_lengths(self) -> Dict[str, int]: ...

    def as_tensor(self, padding_lengths: Dict[str, int]) -> DataArray: ...

    def empty_field(self) -> 'Field': ...

    def batch_tensors(self, tensor_list: List[DataArray]) -> DataArray: ...

147 of 254

Many kinds of Fields

  • TextField: represents a sentence, or a paragraph, or a question, or ...
  • LabelField: represents a single label (e.g. "entailment" or "sentiment")
  • SequenceLabelField: represents the labels for a sequence (e.g. part-of-speech tags)
  • SpanField: represents a span (start, end)
  • IndexField: represents a single integer index
  • ListField[T]: for repeated fields
  • MetadataField: represents anything (but not tensorizable)

148 of 254

Example: an Instance for SNLI

def text_to_instance(self,
                     premise: str,
                     hypothesis: str,
                     label: str = None) -> Instance:
    fields: Dict[str, Field] = {}
    premise_tokens = self._tokenizer.tokenize(premise)
    hypothesis_tokens = self._tokenizer.tokenize(hypothesis)
    fields['premise'] = TextField(premise_tokens, self._token_indexers)
    fields['hypothesis'] = TextField(hypothesis_tokens, self._token_indexers)
    if label:
        fields['label'] = LabelField(label)
    metadata = {"premise_tokens": [x.text for x in premise_tokens],
                "hypothesis_tokens": [x.text for x in hypothesis_tokens]}
    fields["metadata"] = MetadataField(metadata)
    return Instance(fields)

149 of 254

Example: an Instance for SQuAD

def make_reading_comprehension_instance(question_tokens: List[Token],
                                        passage_tokens: List[Token],
                                        token_indexers: Dict[str, TokenIndexer],
                                        token_spans: List[Tuple[int, int]] = None) -> Instance:
    fields: Dict[str, Field] = {}
    # named so the IndexFields below can refer to it
    passage_field = TextField(passage_tokens, token_indexers)
    fields['passage'] = passage_field
    fields['question'] = TextField(question_tokens, token_indexers)
    if token_spans:
        # There may be multiple answer annotations, so we pick the one that occurs the most.
        candidate_answers: Counter = Counter()
        for span_start, span_end in token_spans:
            candidate_answers[(span_start, span_end)] += 1
        span_start, span_end = candidate_answers.most_common(1)[0][0]
        fields['span_start'] = IndexField(span_start, passage_field)
        fields['span_end'] = IndexField(span_end, passage_field)
    return Instance(fields)

150 of 254

What's a TokenIndexer?

  • how to represent text in our model is one of the fundamental decisions in doing NLP
  • many ways, but pretty much always want to turn text into indices
  • many choices
    • sequence of unique token_ids (or id for OOV) from a vocabulary
    • sequence of sequence of character_ids
    • sequence of ids representing byte-pairs / word pieces
    • sequence of pos_tag_ids
  • might want to use several
  • this is (deliberately) independent of the choice about how to embed these as tensors

151 of 254

And don't forget DatasetReader

  • "given a path [usually but not necessarily to a file], produce Instances"
  • decouples your modeling code from your data-on-disk format
  • two pieces:
    • text_to_instance: creates an instance from named inputs ("passage", "question", "label", etc..)
    • read: parses data from a file and (typically) hands it to text_to_instance
  • new dataset -> create a new DatasetReader (not too much code), but keep the model as-is
  • same dataset, new model -> just re-use the DatasetReader
  • default is to read all instances into memory, but base class handles laziness if you want it

152 of 254

Library also handles batching, via DataIterator

  • BasicIterator just shuffles (optionally) and produces fixed-size batches
  • BucketIterator groups together instances with similar "length" to minimize padding (typical usage sketched below)
  • (Correctly padding and sorting instances that contain a variety of fields is slightly tricky; a lot of the API here is designed around getting this right)
  • Maybe someday we'll have a working AdaptiveIterator that creates variable-size batches sized to fit GPU memory
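A typical usage sketch (variable names are placeholders; `instances` is whatever your DatasetReader produced):

iterator = BucketIterator(batch_size=32,
                          sorting_keys=[("sentence", "num_tokens")])
iterator.index_with(vocab)  # the iterator needs the vocabulary to turn tokens into ids
for batch in iterator(instances, num_epochs=1):
    ...  # each batch is a dict of correctly padded tensors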

153 of 254

Tokenizer

  • Single abstraction for both word-level and character-level tokenization
  • Possibly this wasn't the right decision!
  • Pros:
    • easy to switch between words-as-tokens and characters-as-tokens in the same model
  • Cons:
    • non-standard names + extra complexity
    • doesn’t seem to get used this way at all

154 of 254

back to the Model

155 of 254

Model is a subclass of torch.nn.Module

  • so if you give it members that are torch.nn.Parameters or are themselves torch.nn.Modules, all the optimization will just work*
  • for reasons we'll see in a bit, we'll also inject any model component that we might want to configure
  • and AllenNLP provides NLP / deep-learning abstractions that allow us not to reinvent the wheel

*usually on the first try it won't "just work", but usually that's your fault not PyTorch's

156 of 254

TokenEmbedder

  • turns ids (the outputs of your TokenIndexers) into tensors
  • many options:
    • learned word embeddings
    • pretrained word embeddings
    • contextual embeddings (e.g. ELMo)
    • character embeddings + Seq2VecEncoder

157 of 254

Seq2VecEncoder

in:  (batch_size, sequence_length, embedding_dim)
out: (batch_size, embedding_dim)

  • bag of words
  • (last output of) LSTM
  • CNN + pooling

158 of 254

Seq2SeqEncoder

in:  (batch_size, sequence_length, embedding_dim)
out: (batch_size, sequence_length, embedding_dim)

  • LSTM (and friends)
  • self-attention
  • do-nothing

159 of 254

Wait, Two Different Abstractions for RNNs?

  • Conceptually, RNN-for-Seq2Seq is different from RNN-for-Seq2Vec
  • In particular, the class of possible replacements for the former is different from the class of replacements for the latter
  • That is, "RNN" is not the right abstraction for NLP!
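To make that concrete, here's a sketch using the wrappers AllenNLP provides around the same underlying torch.nn.LSTM (shapes in the comments):

lstm = torch.nn.LSTM(input_size=50, hidden_size=100, batch_first=True)

# Seq2SeqEncoder: (batch, seq_len, 50) -> (batch, seq_len, 100)
seq2seq_encoder = PytorchSeq2SeqWrapper(lstm)

# Seq2VecEncoder: (batch, seq_len, 50) -> (batch, 100)
seq2vec_encoder = PytorchSeq2VecWrapper(lstm)

A self-attention layer is a drop-in replacement for the first, but not for the second, which is exactly why they are separate abstractions.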

160 of 254

Attention

in:  (batch_size, sequence_length, embedding_dim), (batch_size, embedding_dim)
out: (batch_size, sequence_length)

  • dot product (xᵀy)
  • bilinear (xᵀWy)
  • linear ([x; y; x∗y; …]ᵀw)

161 of 254

MatrixAttention

in:  (batch_size, sequence_length1, embedding_dim), (batch_size, sequence_length2, embedding_dim)
out: (batch_size, sequence_length1, sequence_length2)

  • dot product (xᵀy)
  • bilinear (xᵀWy)
  • linear ([x; y; x∗y; …]ᵀw)

162 of 254

Attention and MatrixAttention

  • These look similar - you could imagine sharing the similarity computation code
  • We did this at first - code sharing, yay!
  • But it was very memory inefficient - code sharing isn’t always a good idea
  • You could also imagine having a single Attention abstraction that also works for attention matrices
  • But then you have a muddied and confusing input/output spec
  • So, again, more duplicated (or at least very similar) code, but in this case that’s probably the right decision, especially for efficiency

163 of 254

SpanExtractor

  • Many modern NLP models use representations of spans of text
    • Used by the Constituency Parser and the Co-reference model in AllenNLP
    • We generalised this after needing it again to implement the Constituency Parser.
  • Lots of ways to represent a span:
    • Difference of endpoints
    • Concatenation of endpoints (etc)
    • Attention over intermediate words

in:  Sequence of Text: (batch_size, sequence_length, embedding_dim)
     Span Indices: (batch_size, num_spans, 2)
out: Embedded Spans: (batch_size, num_spans, embedding_dim)

164 of 254

This seems like a lot of abstractions!

  • But in most cases it's pretty simple:
    • create a DatasetReader that generates the Instances you want
      • (if you're using a standard dataset, likely one already exists)
    • create a Model that turns Instances into predictions and a loss
      • use off-the-shelf components => can often write little code
    • create a JSON config and use the AllenNLP training code
    • (and also often a Predictor, coming up next)
  • We'll go through a detailed example at the end of the tutorial
  • And you can write as much PyTorch as you want when the built-in components don't do what you need

165 of 254

Abstractions just to make your life nicer

166 of 254

Declarative syntax

"model": {

"type": "crf_tagger",

"label_encoding": "BIOUL",

"constrain_crf_decoding": true,

"calculate_span_f1": true,

"dropout": 0.5,

"include_start_end_transitions": false,

"text_field_embedder": {

"token_embedders": {

"tokens": {

"type": "embedding",

"embedding_dim": 50,

"pretrained_file": "glove.6B.50d.txt.gz",

"trainable": true

},

"token_characters": {

"type": "character_encoding",

"embedding": {

"embedding_dim": 16

},

"encoder": {

"type": "cnn",

"embedding_dim": 16,

"num_filters": 128,

"ngram_filter_sizes": [3],

"conv_layer_activation": "relu"

}

}

},

},

"encoder": {

"type": "lstm",

"input_size": 50 + 128,

"hidden_size": 200,

"num_layers": 2,

"dropout": 0.5,

"bidirectional": true

},

},

most AllenNLP objects can be instantiated from Jsonnet blobs

167 of 254

Declarative syntax

  • allows us to specify an entire experiment using JSON
  • allows us to change architectures without changing code

"encoder": {

"type": "lstm",

"input_size": 50 + 128,

"hidden_size": 200,

"num_layers": 2,

"dropout": 0.5,

"bidirectional": true

},

"encoder": {

"type": "gru",

"input_size": 50 + 128,

"hidden_size": 200,

"num_layers": 1,

"dropout": 0.5,

"bidirectional": true

},

"encoder": {

"type": "pass_through",

"input_dim": 50 + 128

},

168 of 254

Declarative syntax

How does it work?

  • Registrable
    • retrieve a class by its name
  • FromParams
    • instantiate a class instance from JSON

169 of 254

Registrable

class Model(torch.nn.Module, Registrable): ...

@Model.register("bidaf")
class BidirectionalAttentionFlow(Model): ...

@Model.register("decomposable_attention")
class DecomposableAttention(Model): ...

@Model.register("simple_tagger")
class SimpleTagger(Model): ...

# Model.by_name("bidaf") returns the class itself
model = Model.by_name("bidaf")(param1,
                               param2,
                               ...)

  • so now, given a model "type" (specified in the JSON config), we can programmatically retrieve the class
  • remaining problem: how do we programmatically call the constructor?

170 of 254

Model config, again

"model": {

"type": "crf_tagger",

"label_encoding": "BIOUL",

"constrain_crf_decoding": true,

"calculate_span_f1": true,

"dropout": 0.5,

"include_start_end_transitions": false,

"text_field_embedder": {

"token_embedders": {

"tokens": {

"type": "embedding",

"embedding_dim": 50,

"pretrained_file": "glove.6B.50d.txt.gz",

"trainable": true

},

"token_characters": {

"type": "character_encoding",

"embedding": {

"embedding_dim": 16

},

"encoder": {

"type": "cnn",

"embedding_dim": 16,

"num_filters": 128,

"ngram_filter_sizes": [3],

"conv_layer_activation": "relu"

}

}

},

},

"encoder": {

"type": "lstm",

"input_size": 50 + 128,

"hidden_size": 200,

"num_layers": 2,

"dropout": 0.5,

"bidirectional": true

},

},

171 of 254

from_params, originally

@Model.register("crf_tagger")

class CrfTagger(Model):

def __init__(

self,

vocab: Vocabulary,

text_field_embedder: TextFieldEmbedder,

encoder: Seq2SeqEncoder,

label_namespace: str = "labels",

constraint_type: str = None,

include_start_end_transitions: bool = True,

dropout: float = None,

initializer: InitializerApplicator = None,

regularizer: Optional[RegularizerApplicator] = None

) -> None:

...

@classmethod

def from_params(cls,

vocab: Vocabulary,

params: Params) -> 'CrfTagger':

embedder_params = params.pop("text_field_embedder")

text_field_embedder = TextFieldEmbedder.from_params(vocab,

embedder_params)

encoder = Seq2SeqEncoder.from_params(params.pop("encoder"))

label_namespace = params.pop("label_namespace", "labels")

constraint_type = params.pop("constraint_type", None)

dropout = params.pop("dropout", None)

include_start_end_transitions = \

params.pop("include_start_end_transitions", True)

initializer_params = params.pop('initializer', [])

initializer = InitializerApplicator.from_params(initializer_params)

regularizer_params = params.pop('regularizer', [])

regularizer = RegularizerApplicator.from_params(regularizer_params)

params.assert_empty(cls.__name__)

return cls(vocab=vocab,

text_field_embedder=text_field_embedder,

encoder=encoder,

label_namespace=label_namespace,

constraint_type=constraint_type,

dropout=dropout,

include_start_end_transitions=include_start_end_transitions,

initializer=initializer,

  • have to write all the parameters twice
  • better make sure you use the same default values in both places!
  • tedious + error-prone
  • the way from_params works should (in most cases) be obvious from the constructor

172 of 254

from_params, now

class FromParams:
    @classmethod
    def from_params(cls: Type[T], params: Params, **extras) -> T:
        # import here to avoid circular imports
        from allennlp.common.registrable import Registrable

        if params is None:
            return None

        registered_subclasses = Registrable._registry.get(cls)
        if registered_subclasses is not None:
            as_registrable = cast(Type[Registrable], cls)
            default_to_first_choice = as_registrable.default_implementation is not None
            choice = params.pop_choice("type",
                                       choices=as_registrable.list_available(),
                                       default_to_first_choice=default_to_first_choice)
            subclass = registered_subclasses[choice]
            if not takes_arg(subclass.from_params, 'extras'):
                extras = {k: v for k, v in extras.items() if takes_arg(subclass.from_params, k)}
            return subclass.from_params(params=params, **extras)
        else:
            if cls.__init__ == object.__init__:
                kwargs: Dict[str, Any] = {}
            else:
                kwargs = create_kwargs(cls, params, **extras)
            return cls(**kwargs)  # type: ignore

173 of 254

from_params, now

def create_kwargs(cls: Type[T], params: Params, **extras) -> Dict[str, Any]:
    """
    Given some class, a `Params` object, and potentially other keyword arguments,
    create a dict of keyword args suitable for passing to the class's constructor.

    The function does this by finding the class's constructor, matching the constructor
    arguments to entries in the `params` object, and instantiating values for the parameters
    using the type annotation and possibly a from_params method.

    Any values that are provided in the `extras` will just be used as is.
    For instance, you might provide an existing `Vocabulary` this way.
    """
    ...

174 of 254

Trainer

class Trainer(Registrable):
    def __init__(self,
                 model: Model,
                 optimizer: torch.optim.Optimizer,
                 iterator: DataIterator,
                 train_dataset: Iterable[Instance],
                 validation_dataset: Optional[Iterable[Instance]] = None,
                 patience: Optional[int] = None,
                 validation_metric: str = "-loss",
                 validation_iterator: DataIterator = None,
                 shuffle: bool = True,
                 num_epochs: int = 20,
                 serialization_dir: Optional[str] = None,
                 num_serialized_models_to_keep: int = 20,
                 keep_serialized_model_every_num_seconds: int = None,
                 model_save_interval: float = None,
                 cuda_device: Union[int, List] = -1,
                 grad_norm: Optional[float] = None,
                 grad_clipping: Optional[float] = None,
                 learning_rate_scheduler: LearningRateScheduler = None,
                 summary_interval: int = 100,
                 histogram_interval: int = None,
                 should_log_parameter_statistics: bool = True,
                 should_log_learning_rate: bool = False) -> None: ...

  • configurable training loop with tons of options
    • your favorite PyTorch optimizer
    • early stopping
    • many logging options
    • many serialization options
    • learning rate schedulers
  • (almost all of them optional)
  • as always, configuration happens in your JSON experiment config

175 of 254

Model archives

  • training loop produces a model.tar.gz
    • config.json + vocabulary + trained model weights
  • can be used with command line tools to evaluate on test datasets or to make predictions (example commands below)
  • can be used to power an interactive demo
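For example, with the 0.x command-line interface (a sketch; the data file names are placeholders):

allennlp evaluate model.tar.gz dev.txt
allennlp predict model.tar.gz inputs.jsonl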

176 of 254

Making Predictions

177 of 254

Predictor

  • models are tensor-in, tensor-out
  • for creating a web demo, want JSON-in, JSON-out
  • same for making predictions interactively
  • Predictor is just a simple JSON wrapper for your model

@Predictor.register('sentence-tagger')
class SentenceTaggerPredictor(Predictor):
    def __init__(self,
                 model: Model,
                 dataset_reader: DatasetReader) -> None:
        super().__init__(model, dataset_reader)
        self._tokenizer = SpacyWordSplitter(language='en_core_web_sm',
                                            pos_tags=True)

    def predict(self, sentence: str) -> JsonDict:
        return self.predict_json({"sentence": sentence})

    @overrides
    def _json_to_instance(self, json_dict: JsonDict) -> Instance:
        sentence = json_dict["sentence"]
        tokens = self._tokenizer.split_words(sentence)
        return self._dataset_reader.text_to_instance(tokens)

this is (partly) why we split out text_to_instance as its own function in the dataset reader

and it's enabled by all of our models taking optional labels and returning an optional loss, along with model internals and other interesting outputs

178 of 254

Serving a demo

With this setup, serving a demo is easy.

    • DatasetReader gives us text_to_instance
    • Labels are optional in the model and dataset reader
    • Model returns an arbitrary dict, so can get and visualize model internals
    • Predictor wraps it all in JSON
    • Archive lets us load a pre-trained model in a server
    • Even better: pre-built UI components (using React) to visualize standard pieces of a model, like attentions, or span labels

179 of 254

We don't have it all figured out!

still figuring out some abstractions that we may not have correct

  • regularization and initialization
  • models with pretrained components
  • more complex training loops
    • e.g. multi-task learning
  • Caching preprocessed data
  • Expanding vocabulary / embeddings at test time
  • Discoverability of config options

you can do all these things, but almost certainly not in the most optimal / generalizable way

180 of 254

Case study

181 of 254

"an LSTM for part-of-speech tagging"

182 of 254

The Problem

Given a training dataset that looks like

The###DET dog###NN ate###V the###DET apple###NN

Everybody###NN read###V that###DET book###NN

learn to predict part-of-speech tags

183 of 254

With a Few Enhancements to Make Things More Realistic

  • read data from files
  • check performance on a separate validation dataset
  • use tqdm to track training progress
  • implement early stopping based on validation loss
  • track accuracy as we're training

184 of 254

Start With a Simple Baseline Model

  • compute a vector embedding for each word
  • feed the sequence of embeddings into an LSTM
  • feed the hidden states into a feed-forward layer to produce a sequence of logits

The dog ate the apple
    | embedding
    v
word vectors: v_The, v_dog, v_ate, v_the, v_apple
    | LSTM
    v
encodings: w_The, w_dog, w_ate, w_the, w_apple
    | Linear
    v
tag logits: L_The, L_dog, L_ate, L_the, L_apple

185 of 254

v0: numpy

aka "this is why we use libraries"

186 of 254

v0: numpy (aka "this is why we use libraries")

class LSTM:
    def __init__(self, input_size: int, hidden_size: int) -> None:
        self.params = {
            # forget gate
            "w_f": np.random.randn(input_size, hidden_size),
            "b_f": np.random.randn(hidden_size),
            "u_f": np.random.randn(hidden_size, hidden_size),
            # external input gate
            "w_g": np.random.randn(input_size, hidden_size),
            "b_g": np.random.randn(hidden_size),
            "u_g": np.random.randn(hidden_size, hidden_size),
            # output gate
            "w_q": np.random.randn(input_size, hidden_size),
            "b_q": np.random.randn(hidden_size),
            "u_q": np.random.randn(hidden_size, hidden_size),
            # usual params
            "w": np.random.randn(input_size, hidden_size),
            "b": np.random.randn(hidden_size),
            "u": np.random.randn(hidden_size, hidden_size),
        }
        self.grads = {name: None for name in self.params}

187 of 254

v1: PyTorch

188 of 254

v1: PyTorch - Load Data

def load_data(file_path: str) -> List[Tuple[str, str]]:
    """
    One sentence per line, formatted like

        The###DET dog###NN ate###V the###DET apple###NN

    Returns a list of pairs (tokenized_sentence, tags)
    """
    data = []
    with open(file_path) as f:
        for line in f:
            pairs = line.strip().split()
            sentence, tags = zip(*(pair.split("###") for pair in pairs))
            data.append((sentence, tags))
    return data

seems reasonable

189 of 254

v1: PyTorch - Define Model

class LSTMTagger(nn.Module):
    def __init__(self, embedding_dim: int, hidden_dim: int,
                 vocab_size: int, tagset_size: int) -> None:
        super().__init__()
        self.hidden_dim = hidden_dim
        self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        # The LSTM takes word embeddings as inputs,
        # and outputs hidden states with dimensionality hidden_dim.
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        # The linear layer that maps from hidden state space to tag space
        self.hidden2tag = nn.Linear(hidden_dim, tagset_size)
        self.hidden = self.init_hidden()

    def forward(self, sentence: torch.Tensor) -> torch.Tensor:
        embeds = self.word_embeddings(sentence)
        lstm_out, self.hidden = self.lstm(embeds.view(len(sentence), 1, -1), self.hidden)
        tag_space = self.hidden2tag(lstm_out.view(len(sentence), -1))
        tag_scores = F.log_softmax(tag_space, dim=1)
        return tag_scores

much nicer than writing our own LSTM!

190 of 254

v1: PyTorch - Train Model

model = LSTMTagger(EMBEDDING_DIM, HIDDEN_DIM,
                   len(word_to_ix), len(tag_to_ix))
loss_function = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

validation_losses = []
patience = 10

for epoch in range(1000):
    training_loss = 0.0
    validation_loss = 0.0
    for dataset, training in [(training_data, True),
                              (validation_data, False)]:
        correct = total = 0
        torch.set_grad_enabled(training)
        t = tqdm.tqdm(dataset)
        for i, (sentence, tags) in enumerate(t):
            model.zero_grad()
            model.hidden = model.init_hidden()
            sentence_in = prepare_sequence(sentence, word_to_ix)
            targets = prepare_sequence(tags, tag_to_ix)
            tag_scores = model(sentence_in)
            loss = loss_function(tag_scores, targets)
            predictions = tag_scores.max(-1)[1]
            correct += (predictions == targets).sum().item()
            total += len(targets)
            accuracy = correct / total
            if training:
                loss.backward()
                training_loss += loss.item()
                t.set_postfix(training_loss=training_loss/(i + 1),
                              accuracy=accuracy)
                optimizer.step()
            else:
                validation_loss += loss.item()
                t.set_postfix(validation_loss=validation_loss/(i + 1),
                              accuracy=accuracy)
    validation_losses.append(validation_loss)
    if (patience and
            len(validation_losses) >= patience and
            validation_losses[-patience] ==
            min(validation_losses[-patience:])):
        print("patience reached, stopping early")
        break

this part is maybe less than ideal

191 of 254

v2: AllenNLP

(but without config files)

192 of 254

v2: AllenNLP - Dataset Reader

class PosDatasetReader(DatasetReader):
    def __init__(self, token_indexers: Dict[str, TokenIndexer] = None) -> None:
        super().__init__(lazy=False)
        self.token_indexers = token_indexers or {"tokens": SingleIdTokenIndexer()}

    def text_to_instance(self, tokens: List[Token], tags: List[str] = None) -> Instance:
        sentence_field = TextField(tokens, self.token_indexers)
        fields = {"sentence": sentence_field}
        if tags:
            label_field = SequenceLabelField(labels=tags, sequence_field=sentence_field)
            fields["labels"] = label_field
        return Instance(fields)

    def _read(self, file_path: str) -> Iterator[Instance]:
        with open(file_path) as f:
            for line in f:
                pairs = line.strip().split()
                sentence, tags = zip(*(pair.split("###") for pair in pairs))
                yield self.text_to_instance([Token(word) for word in sentence], tags)

193 of 254

v2: AllenNLP - Model

class LstmTagger(Model):
    def __init__(self,
                 word_embeddings: TextFieldEmbedder,
                 encoder: Seq2SeqEncoder,
                 vocab: Vocabulary) -> None:
        super().__init__(vocab)
        self.word_embeddings = word_embeddings
        self.encoder = encoder
        self.hidden2tag = torch.nn.Linear(in_features=encoder.get_output_dim(),
                                          out_features=vocab.get_vocab_size('labels'))
        self.accuracy = CategoricalAccuracy()

    def forward(self,
                sentence: Dict[str, torch.Tensor],
                labels: torch.Tensor = None) -> Dict[str, torch.Tensor]:
        mask = get_text_field_mask(sentence)
        embeddings = self.word_embeddings(sentence)
        encoder_out = self.encoder(embeddings, mask)
        tag_logits = self.hidden2tag(encoder_out)
        output = {"tag_logits": tag_logits}
        if labels is not None:
            self.accuracy(tag_logits, labels, mask)
            output["loss"] = sequence_cross_entropy_with_logits(tag_logits, labels, mask)
        return output

    def get_metrics(self, reset: bool = False) -> Dict[str, float]:
        return {"accuracy": self.accuracy.get_metric(reset)}

194 of 254

v2: AllenNLP - Training

reader = PosDatasetReader()
train_dataset = reader.read(cached_path(
    'https://raw.githubusercontent.com/allenai/allennlp/master/tutorials/tagger/training.txt'))
validation_dataset = reader.read(cached_path(
    'https://raw.githubusercontent.com/allenai/allennlp/master/tutorials/tagger/validation.txt'))

vocab = Vocabulary.from_instances(train_dataset + validation_dataset)

EMBEDDING_DIM = 6
HIDDEN_DIM = 6

token_embedding = Embedding(num_embeddings=vocab.get_vocab_size('tokens'),
                            embedding_dim=EMBEDDING_DIM)
word_embeddings = BasicTextFieldEmbedder({"tokens": token_embedding})
lstm = PytorchSeq2SeqWrapper(torch.nn.LSTM(EMBEDDING_DIM, HIDDEN_DIM, batch_first=True))
model = LstmTagger(word_embeddings, lstm, vocab)

optimizer = optim.SGD(model.parameters(), lr=0.1)
iterator = BucketIterator(batch_size=2, sorting_keys=[("sentence", "num_tokens")])
iterator.index_with(vocab)

trainer = Trainer(model=model, optimizer=optimizer, iterator=iterator,
                  train_dataset=train_dataset, validation_dataset=validation_dataset,
                  patience=10, num_epochs=1000)
trainer.train()

this is where the config-driven approach would make our lives a lot easier

195 of 254

v3: AllenNLP + config

196 of 254

v3: AllenNLP - config

local embedding_dim = 6;
local hidden_dim = 6;
local num_epochs = 1000;
local patience = 10;
local batch_size = 2;
local learning_rate = 0.1;

{
    "train_data_path": "...",
    "validation_data_path": "...",
    "dataset_reader": { "type": "pos-tutorial" },
    "model": {
        "type": "lstm-tagger",
        "word_embeddings": {
            "token_embedders": {
                "tokens": {
                    "type": "embedding",
                    "embedding_dim": embedding_dim
                }
            }
        },
        "encoder": {
            "type": "lstm",
            "input_size": embedding_dim,
            "hidden_size": hidden_dim
        }
    },
    "iterator": {
        "type": "bucket",
        "batch_size": batch_size,
        "sorting_keys": [["sentence", "num_tokens"]]
    },
    "trainer": {
        "num_epochs": num_epochs,
        "optimizer": {
            "type": "sgd",
            "lr": learning_rate
        },
        "patience": patience
    }
}

params = Params.from_file('...')
serialization_dir = tempfile.mkdtemp()
model = train_model(params, serialization_dir)

197 of 254

Augmenting the Tagger with Character-Level Features

198 of 254

v1: PyTorch

class LSTMTagger(nn.Module):
    def __init__(self, embedding_dim: int, hidden_dim: int,
                 vocab_size: int, tagset_size: int) -> None:
        super().__init__()
        self.hidden_dim = hidden_dim
        self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        # The LSTM takes word embeddings as inputs,
        # and outputs hidden states with dimensionality hidden_dim.
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        # Linear layer that maps from hidden state space to tag space
        self.hidden2tag = nn.Linear(hidden_dim, tagset_size)
        self.hidden = self.init_hidden()

    def forward(self, sentence: torch.Tensor) -> torch.Tensor:
        embeds = self.word_embeddings(sentence)
        lstm_out, self.hidden = self.lstm(embeds.view(len(sentence), 1, -1), self.hidden)
        tag_space = self.hidden2tag(lstm_out.view(len(sentence), -1))
        tag_scores = F.log_softmax(tag_space, dim=1)
        return tag_scores

  • add char_embedding_dim
  • add a char_embedding layer (= embedding + LSTM?)
  • change the LSTM input dim
  • compute char embeddings
  • concatenate inputs

we really have to change our model code and how it works

199 of 254

v1: PyTorch

class LSTMTagger(nn.Module):
    def __init__(self, embedding_dim: int, hidden_dim: int,
                 vocab_size: int, tagset_size: int) -> None:
        super().__init__()
        self.hidden_dim = hidden_dim
        self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        # The LSTM takes word embeddings as inputs,
        # and outputs hidden states with dimensionality hidden_dim.
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        # Linear layer that maps from hidden state space to tag space
        self.hidden2tag = nn.Linear(hidden_dim, tagset_size)
        self.hidden = self.init_hidden()

    def forward(self, sentence: torch.Tensor) -> torch.Tensor:
        embeds = self.word_embeddings(sentence)
        lstm_out, self.hidden = self.lstm(embeds.view(len(sentence), 1, -1), self.hidden)
        tag_space = self.hidden2tag(lstm_out.view(len(sentence), -1))
        tag_scores = F.log_softmax(tag_space, dim=1)
        return tag_scores

I'm not really that thrilled to do this exercise

200 of 254

v2: AllenNLP

reader = PosDatasetReader()
# ...
EMBEDDING_DIM = 6
HIDDEN_DIM = 6
# ...
token_embedding = Embedding(
    num_embeddings=vocab.get_vocab_size('tokens'),
    embedding_dim=EMBEDDING_DIM)
word_embeddings = BasicTextFieldEmbedder(
    {"tokens": token_embedding})
# ...

# ... becomes ...

reader = PosDatasetReader(token_indexers={
    "tokens": SingleIdTokenIndexer(),
    "token_characters": TokenCharactersIndexer()})
# ...
WORD_EMBEDDING_DIM = 5
CHAR_EMBEDDING_DIM = 3
EMBEDDING_DIM = WORD_EMBEDDING_DIM + CHAR_EMBEDDING_DIM
HIDDEN_DIM = 6
# ...
token_embedding = Embedding(
    num_embeddings=vocab.get_vocab_size('tokens'),
    embedding_dim=WORD_EMBEDDING_DIM)
char_embedding = TokenCharactersEncoder(
    embedding=Embedding(
        num_embeddings=vocab.get_vocab_size('token_characters'),
        embedding_dim=CHAR_EMBEDDING_DIM),
    encoder=PytorchSeq2VecWrapper(
        torch.nn.LSTM(CHAR_EMBEDDING_DIM, CHAR_EMBEDDING_DIM,
                      batch_first=True)))
word_embeddings = BasicTextFieldEmbedder({
    "tokens": token_embedding,
    "token_characters": char_embedding})
# ...

  • add a second token indexer
  • add an extra parameter
  • add a character embedder
  • use the character embedder
  • no changes to the model itself!

201 of 254

v3: AllenNLP - config

local embedding_dim = 6;
local hidden_dim = 6;
local num_epochs = 1000;
local patience = 10;
local batch_size = 2;
local learning_rate = 0.1;

{
    "train_data_path": "...",
    "validation_data_path": "...",
    "dataset_reader": { "type": "pos-tutorial" },
    "model": {
        "type": "lstm-tagger",
        "word_embeddings": {
            "token_embedders": {
                "tokens": {
                    "type": "embedding",
                    "embedding_dim": embedding_dim
                }
            }
        },
        "encoder": {
            "type": "lstm",
            "input_size": embedding_dim,
            "hidden_size": hidden_dim
        }
    },

we can accomplish this with just a couple of minimal config changes

202 of 254

v3: AllenNLP - config

Before:

local embedding_dim = 6;
local hidden_dim = 6;
local num_epochs = 1000;
local patience = 10;
local batch_size = 2;
local learning_rate = 0.1;

After:

local word_embedding_dim = 5;
local char_embedding_dim = 3;
local embedding_dim = word_embedding_dim + char_embedding_dim;
local hidden_dim = 6;
local num_epochs = 1000;
local patience = 10;
local batch_size = 2;
local learning_rate = 0.1;

add a couple of new Jsonnet variables

203 of 254

v3: AllenNLP - config

"dataset_reader": { "type": "pos-tutorial" }

"dataset_reader": {

"type": "pos-tutorial",

"token_indexers": {

"tokens": { "type": "single_id" },

"token_characters": { "type": "characters" }

}

}

add a second token indexer

204 of 254

v3: AllenNLP - config

"model": {

"type": "lstm-tagger",

"word_embeddings": {

"token_embedders": {

"tokens": {

"type": "embedding",

"embedding_dim": embedding_dim

}

}

},

"encoder": {

"type": "lstm",

"input_size": embedding_dim,

"hidden_size": hidden_dim

}

}

"model": {

"type": "lstm-tagger",

"word_embeddings": {

"token_embedders": {

"tokens": {

"type": "embedding",

"embedding_dim": word_embedding_dim

},

"token_characters": {

"type": "character_encoding",

"embedding": {

"embedding_dim": char_embedding_dim,

},

"encoder": {

"type": "lstm",

"input_size": char_embedding_dim,

"hidden_size": char_embedding_dim

}

}

},

},

"encoder": {

"type": "lstm",

"input_size": embedding_dim,

"hidden_size": hidden_dim

}

}

add a corresponding token embedder

205 of 254

For a one-time change this is maybe not such a big win.

206 of 254

But being able to experiment with lots of architectures without having to change any code (and with a reproducible JSON description of each experiment) is a huge boon to research! (we think)

207 of 254

Sharing Your Research

How to make it easy to release your code

208 of 254

209 of 254

In the least amount of time possible:

Simplify your workflow for installation and data

Make your code run anywhere*

Isolated environments for your project

210 of 254

Docker

211 of 254

Objective: You don’t feel like this about Docker

212 of 254

What does Docker Do?

  • Creates a virtual machine that will always run the same anywhere (in theory)

  • Allows you to package up a virtual machine and some code and send it to someone, knowing the same thing will run

  • This includes the operating system, your code's dependencies, the code itself, etc.

  • Lets you specify, as a series of steps, how to create this virtual machine, and does clever caching when you change it

213 of 254

214 of 254

215 of 254

3 Ideas: Dockerfiles, Images and Containers

216 of 254

Step 1: Write a Dockerfile

Here is a finished Dockerfile.

How does this work?
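The finished Dockerfile shown on the slide isn't reproduced in this text, but it assembles exactly the commands explained next; a plausible reconstruction (project paths and the script name are placeholders):

# start from a base image that already has Python installed
FROM python:3.6.3-jessie

# an environment variable available inside the container
ENV LANG=C.UTF-8

# copy the dependency list and install it first, so this layer
# stays cached even when your code changes
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt

# copy the code itself (changes most often, so it goes last)
COPY my_research/ my_research/

# what gets run when you `docker run` the built image
CMD ["python", "my_research/run.py"]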

217 of 254

Step 1: Write a Dockerfile

COMMAND <command>

218 of 254

Step 1: Write a Dockerfile

COMMAND <command>

Dockerfile commands are capitalised. Some important ones are:

FROM, RUN, ENV, COPY and CMD

219 of 254

Step 1: Write a Dockerfile

FROM python:3.6.3-jessie

FROM includes another Dockerfile in your one. Here we start from a base Python Dockerfile.

220 of 254

Step 1: Write a Dockerfile

RUN pip install -r requirements.txt

RUN … runs a command. To use a command, it must be installed in a previous step!

221 of 254

Step 1: Write a Dockerfile

ENV LANG=C.UTF-8

ENV sets an environment variable which can be used inside the container.

222 of 254

Step 1: Write a Dockerfile

COPY my_research/ my_research/

COPY copies code from your current folder into the Docker image.

223 of 254

Step 1: Write a Dockerfile

COPY my_research/ my_research/

Do yourself a favour. Don’t change the names of things during this step.

224 of 254

Step 1: Write a Dockerfile

CMD ["/bin/bash"]

CMD ["python", "my/script.py"]

CMD is what gets run when you run a built image.

225 of 254

Step 1: Write a Dockerfile

Here is a finished Dockerfile.

226 of 254

Step 2: Build your Dockerfile into an Image

docker build --tag <name> .

227 of 254

Step 2: Build your Dockerfile into an Image

docker build --tag <name> .

This is what you want the image to be called, e.g. markn/my-paper-code.

228 of 254

Step 2: Build your Dockerfile into an Image

docker build --tag <name> .

You can see what images you have built already by running

docker images

229 of 254

Step 2: Build your Dockerfile into an Image

docker build --tag <name> .

This describes where docker should look for a Dockerfile. It can also be a URL.

230 of 254

Step 2: Build your Dockerfile into an Image

docker build --tag <name> .

If you've already built a line of your Dockerfile before, Docker will remember and not build it again (so long as the steps before it haven't changed).

231 of 254

Step 2: Build your Dockerfile into an Image

docker build --tag <name> .

TIP: Put things that change more frequently (like your code) lower down in your Dockerfile.

232 of 254

Step 3: Run your Image as a Container

docker run <image-name>

233 of 254

Step 3: Run your Image as a Container

docker run -it <image-name>

-i: interactive

-t: tty (with a terminal)

234 of 254

Step 3: Run your Image as a Container

docker run -it <image-name> /bin/bash

Passing a command after the image name gives you a command prompt inside any Docker container, regardless of the CMD in the Dockerfile.

235 of 254

Optional Step 4: DockerHub

docker push <image-name>

DockerHub is to Docker as GitHub is to Git

Docker automatically looks at DockerHub to find images to run

236 of 254

Pros of Docker

  • Good for running CI - ALL your code dependencies are pinned, even system-level stuff.

  • Good for debugging people's problems with your code - just ask: "Can you reproduce the bug in a Docker container?"

  • Great for deploying demos where you just need a model to run as a service.

237 of 254

Cons of Docker

  • Docker is designed for production systems - it is very hard to debug inside a minimal Docker container

  • Images take up a lot of disk space if you have large dependencies (e.g. the JVM makes up about half of the AllenNLP Docker image)

  • Just because your code is exactly reproducible doesn't mean that it's any good

238 of 254

Releasing your data

239 of 254

Use a simple file cache

There are currently 27 CoreNLP Jar files you could download from the CoreNLP website

240 of 254

Use a simple file cache

embedding_file = cached_path("embedding_url")
datasets = cached_path("dataset_url")

241 of 254

Use a simple file cache

But now I have to write a file cache ....

242 of 254

Use a simple file cache

Copy this file into your project

from file_cache import cached_path

embeddings = cached_path(url)

243 of 254

Isolated (Python) environments

244 of 254

Python environments

Stable environments for Python can be tricky

This makes releasing code very annoying

245 of 254

Python environments

Docker is ideal, but not great for developing locally. For this, you should either use virtualenvs or anaconda.

Here we will talk about anaconda, because it’s what we use.

246 of 254

Python environments

Anaconda is a very stable distribution of Python (amongst other things). Installing it is easy:

https://www.anaconda.com/

247 of 254

Python environments

One annoying install step: adding the install location to the front of your PATH environment variable.

export PATH="/path/to/anaconda/bin:$PATH"

248 of 254

Python environments

Now, your default python should be an Anaconda one (you did install Python 3.6+, didn't you?).

249 of 254

Virtual environments

Every time you start a new project, make a new virtual environment which has only its dependencies in.

conda create -n your-env-name python=3.6

250 of 254

Virtual environments

Before you work on your project, run this command. This prepends the location of this particular copy of Python to your PATH.

source activate your-env-name

pip install -r requirements.txt

etc

251 of 254

Virtual environments

When you’re done, or you want to work on a different project, run:

source deactivate

252 of 254

In Conclusion

253 of 254

In Conclusion

  • Prototype fast (but still safely)
  • Write production code safely (but still fast)
  • Good processes => good science
  • Use the right abstractions
  • Check out AllenNLP

254 of 254

Thanks for Coming!

Questions?

please fill out our survey:

https://tinyurl.com/emnlp-tutorial-survey

will tweet out link to slides after talk

@ai2_allennlp