1 of 71

learning for sequences

motivation, fundamentals and application

For SmartGateML

2018

2 of 71

language and its structure

applying machine learning to sequences

goo.gl/ryD4cv

deeplanguageclass.github.io/lab/fairseq-transliteration/

seq2seq lab:

translation + transliteration

3 of 71

The more interesting the message,

the less autocorrect helps you.

4 of 71

What is language?

Are natural languages like programming languages?

5 of 71

Language input/output can be text, images or audio.

It has structure, but its structure differs from that of programming languages.

6 of 71

What is the structure of language?

Sequence of sequences?

7 of 71

layers of language

text/image/audio

tokens

morphology

syntax

semantics

dialogue

8 of 71

structures of language

token layer: Zipfian

syntactic layer: tree-like

discourse layer: unstructured
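
A quick way to see the Zipfian shape of the token layer: count token frequencies and compare frequency to rank. A minimal sketch in Python; the toy string is only there to make it runnable, any real corpus shows the curve more clearly:

    from collections import Counter

    # Any sizeable text will do; a toy string here just to make this runnable.
    text = ("the car crashed because the road was wet and the driver was tired "
            "and the car was old").split()
    counts = Counter(text)

    # Zipf: the rank-r token's frequency is roughly proportional to 1/r,
    # so freq * rank stays roughly constant down the list (on real corpora).
    for rank, (token, freq) in enumerate(counts.most_common(5), start=1):
        print(rank, token, freq, freq * rank)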

9 of 71

sequences

fundamentals

What are properties of natural language text input and output?

10 of 71

26 x 2 chars

Enough to express almost everything the Western world has thought since the Romans.

11 of 71

95 ^ 180

(95 printable ASCII characters; strings of length 180)

Most of the ~9.8e+355 permutations are invalid.
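
The arithmetic behind the slide, checked in two lines of Python:

    import math

    # 95 printable ASCII chars, strings of length 180:
    # log10(95^180) = 180 * log10(95) ~= 355.99, i.e. ~9.8e+355 permutations.
    print(180 * math.log10(95))   # ~355.99
    print(95 ** 180)              # the exact value, a 356-digit integer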

12 of 71

sparsity

www.tensorflow.org/tutorials/representation/word2vec
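
The sparsity point in miniature: a one-hot vector is almost all zeros and treats every pair of words as equally distant, while a learned embedding (as in the word2vec tutorial linked above) is short and dense. A minimal sketch; the vocabulary and dimensions are made up for illustration:

    import numpy as np

    vocab = ["the", "pope", "baby", "steps", "syria"]  # toy vocabulary
    V, D = len(vocab), 3                               # vocab size, embedding dim

    # One-hot: V-dimensional, exactly one nonzero, all words equidistant.
    one_hot = np.eye(V)[vocab.index("pope")]

    # Embedding: a trainable V x D table; similar words can end up close.
    embeddings = np.random.randn(V, D)                 # randomly initialised here
    dense = embeddings[vocab.index("pope")]

    print(one_hot)   # [0. 1. 0. 0. 0.]
    print(dense)     # 3 floats, learned in a real model (e.g. word2vec)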

13 of 71

“The Pope’s Baby Steps on Syria”

14 of 71

“Boy paralyzed after tumor fights back to gain black belt”

15 of 71

ambiguity

web.stanford.edu/class/cs224n/lectures/lecture1.pdf

16 of 71

Владимир эту водку не п... (Russian, literally: “Vladimir this vodka not p...”)

17 of 71

More people have been to Russia than I have.


19 of 71

strict

http://languagelog.ldc.upenn.edu/nll/?p=39477

20 of 71

“Nice truck attack video”

21 of 71

“No forgetto!”

22 of 71

evolution

23 of 71

Tehre was a sutdy by Mcirosoft yares ago...

24 of 71

Ca n yo u rea d th is?

25 of 71

Ca n yo u rea d th is?

M@ch1nes C@n’t.

26 of 71

ვატ აბაუტ ზის... (English “what about this...” written in Georgian script)

27 of 71

ვატ აბაუტ ზის...

ор зис... (English “or this...” in Cyrillic)

28 of 71

ვატ აბაუტ ზის...

ор зис...

բոտ նոտ զիս (English “but not this” in Armenian script)

29 of 71

noisy + dirty

www.mrc-cbu.cam.ac.uk/people/matt.davis/cmabridge/

docs.microsoft.com/en-us/typography/develop/word-recognition

30 of 71

wreck a nice beach

31 of 71

wreck a nice beach

recognize speech

32 of 71

arbitrary

en.wikipedia.org/wiki/Double_articulation

33 of 71

The car crashed because it...


35 of 71

The car crashed because it had old tyres.

36 of 71

The car crashed because it had old tyres.

it == car

37 of 71

The car crashed because it was rainy.

38 of 71

The car crashed because it was rainy.

it == ?


40 of 71

hard

en.wikipedia.org/wiki/Winograd_Schema_Challenge

41 of 71

Humans are good at this.

42 of 71

Humans are good at this.

Even baby humans are good at this.

43 of 71

representations

for text sequences

How can we represent text sequences numerically?
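
Whatever the model, the usual first step is mapping tokens to integer ids via a vocabulary. A minimal sketch:

    # Build a vocabulary and encode a sentence as an id sequence.
    sentence = "the car crashed because it was rainy".split()
    vocab = {token: i for i, token in enumerate(sorted(set(sentence)), start=1)}
    vocab["<unk>"] = 0  # reserve an id for unseen tokens

    ids = [vocab.get(token, vocab["<unk>"]) for token in sentence]
    print(ids)  # [6, 2, 3, 1, 4, 7, 5]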

44 of 71

word representations

What are the properties?

What are the limitations?

45 of 71

word representations

sub-word: character-level n-grams

supra-word: word-level n-grams
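
Both directions are the same sliding window, just at different granularity. A minimal sketch of character-level and word-level n-grams:

    def ngrams(seq, n):
        """All contiguous windows of length n over a sequence."""
        return [seq[i:i + n] for i in range(len(seq) - n + 1)]

    print(ngrams("crashed", 3))                   # sub-word: ['cra', 'ras', 'ash', ...]
    print(ngrams("the car crashed".split(), 2))   # supra-word: [['the', 'car'], ['car', 'crashed']]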

46 of 71

applications

text input/output

Which tasks can we frame as sequence tasks?

47 of 71

sequence input

unsupervised word representations, text classification...

48 of 71

sequence output

article generation (e.g. financial news)

49 of 71

sequence input + output

translation, transliteration, spelling + grammar correction, dialogue systems (e.g. Q&A, "chat bots"), summarisation, style transfer

50 of 71

mixed medium

image captioning

(e.g. CLEVR, @picdescbot)

51 of 71

beyond language

DNA sequences

52 of 71

seq2seq

53 of 71

INPUT : OUTPUT

input: NUMBERS | AUDIO | TEXT | TEXT+

output: NUMBERS → regression | LABEL → classification | TEXT → sequence

54 of 71

For which tasks can we use seq2seq?

55 of 71

seq2seq tasks + applications

56 of 71

How do we take a sequence as input?

57 of 71

How do we generate a sequence?

58 of 71

Generating a sequence

ENCODER + DECODER
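
A minimal encoder-decoder sketch in PyTorch, assuming GRUs for both halves; real systems (fairseq, or google/seq2seq on the next slide) add batching, beam search, attention and much more:

    import torch
    import torch.nn as nn

    class Seq2Seq(nn.Module):
        def __init__(self, src_vocab, tgt_vocab, dim=64):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, dim)
            self.tgt_emb = nn.Embedding(tgt_vocab, dim)
            self.encoder = nn.GRU(dim, dim, batch_first=True)
            self.decoder = nn.GRU(dim, dim, batch_first=True)
            self.out = nn.Linear(dim, tgt_vocab)

        def forward(self, src, tgt):
            # The encoder compresses the source into a final hidden state...
            _, state = self.encoder(self.src_emb(src))
            # ...which initialises the decoder that emits the target.
            hidden, _ = self.decoder(self.tgt_emb(tgt), state)
            return self.out(hidden)  # per-step scores over the target vocabulary

    model = Seq2Seq(src_vocab=100, tgt_vocab=120)
    src = torch.randint(0, 100, (1, 7))   # a fake source sentence of 7 token ids
    tgt = torch.randint(0, 120, (1, 5))   # a fake (shifted) target of 5 token ids
    print(model(src, tgt).shape)          # torch.Size([1, 5, 120])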

59 of 71

github.com/google/seq2seq

60 of 71

Other ways to generate a sequence?

61 of 71

What if tokens are not 1:1?

62 of 71

How do we represent each token's context?

63 of 71

Token context

ATTENTION
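
Attention in miniature: each decoder step scores all encoder states, softmaxes the scores, and takes a weighted average. A minimal dot-product attention sketch in numpy (keys double as values here, which a real model would separate):

    import numpy as np

    def attention(query, keys):
        """Dot-product attention: one query vector over a set of key vectors."""
        scores = keys @ query                            # one score per source token
        weights = np.exp(scores) / np.exp(scores).sum()  # softmax to probabilities
        return weights @ keys, weights                   # context = weighted average

    encoder_states = np.random.randn(7, 4)  # 7 source tokens, 4-dim states
    decoder_state = np.random.randn(4)      # current decoder state
    context, weights = attention(decoder_state, encoder_states)
    print(weights.round(2))  # how much each source token contributes
    print(context.shape)     # (4,)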


65 of 71

What if a token depends on what comes after it?

66 of 71

Token context

BI-DIRECTIONAL
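
If a token depends on what comes after it, run the encoder in both directions and concatenate the two states per position. A PyTorch sketch with made-up sizes:

    import torch
    import torch.nn as nn

    # bidirectional=True runs a left-to-right and a right-to-left GRU and
    # concatenates their hidden states, so each position sees both its
    # past and its future context.
    rnn = nn.GRU(input_size=8, hidden_size=16, batch_first=True, bidirectional=True)
    tokens = torch.randn(1, 7, 8)  # 7 token vectors of size 8
    states, _ = rnn(tokens)
    print(states.shape)            # torch.Size([1, 7, 32]) = 2 x 16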


68 of 71

seq2seq lab

The tasks of translation and transliteration contain essentially all the challenges of natural language understanding (NLU) -- ambiguity, agreement, evolution, mixed language... -- which can require full AI.

69 of 71

approach

1. Find a dataset, or generate one from a monolingual corpus (WMT, Wikipedia)

2. Find a seq2seq implementation that is easy to use and production-strength (Fairseq from Facebook AI Research)

3. Convert from word-level to char-level (see the sketch after this list)

4. Choose parameters, train, test and iterate
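
For step 3, a minimal sketch of one common way to turn word-level text into char-level input for transliteration-style preprocessing; the underscore word-boundary marker is this sketch's own convention, not something fairseq mandates:

    def to_char_level(line):
        # Each character becomes its own token; '_' marks a word boundary.
        return " _ ".join(" ".join(word) for word in line.split())

    print(to_char_level("bare volodya"))
    # b a r e _ v o l o d y a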

70 of 71

deeplanguageclass.github.io/lab/fairseq-transliteration/

71 of 71

goo.gl/ryD4cv

Questions?

MLEVN

Yerevan machine learning community

mlevn.org