1 of 61

Run Deep Learning Models in the Browser with JavaScript and ConvNetJS

Christoph Körner

Slides available via http://bit.ly/jazoon2017-dl

2 of 61

About me

Big Data Tech Lead at T-Mobile Austria
Visual Computing at Vienna University of Technology
Creator of/Contributor to CaffeJS
Author of Data Visualizations with D3 and AngularJS
Author of Learning Responsive Data Visualization
Organizer of Vienna Kaggle Meetup

3 of 61

Agenda

Why JavaScript might be a thing for Deep Learning
Intro to Neural Nets and Deep Learning
Deep Learning and JavaScript
More Demos

4 of 61

Y THO

Source: twitter.com/lazygamereviews

5 of 61

Why JavaScript might be a thing for DL

Runs in the Browser (and hence on almost any device)
Runs on the Server
Ridiculously easy Debugging
Interactive Visualization (of activations, gradients, etc.)

6 of 61

Besides Education

Access to a LOT of data on the client side

Models built on Cursor movement, typing speed, etc.

Massive parallelization

Run backward passes on all your clients and collect gradients

Privacy-sensitive applications

Run forward passes on the client only

7 of 61

Runs in the Browser

Source: chaosmail.github.io/caffejs

8 of 61

...and hence almost anywhere

9 of 61

Ridiculously easy Debugging

Source: chaosmail.github.io/caffejs

10 of 61

Interactive Visualization

Source: ConvNetJS - MNIST

11 of 61

Intro to Neural Nets

(Supervised Classification)

Pedestrian or cyclist?

12 of 61

Dataset

[Figure: dataset plotted on axes x₀ and x₁]

13 of 61

Annotated Dataset

[Figure: samples annotated as Pedestrian or Cyclist, on axes x₀ and x₁]

14 of 61

Random Separation Line

[Figure: Pedestrian and Cyclist samples on axes x₀ and x₁, with a random separation line]

15 of 61

Line Equation

y = w * x + b

16 of 61

Line Equation

y = w * x + b

y: distance from the line
w: weights
x: input data
b: bias
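
As a rough sketch of the equation above for two input features: the score y is the weighted sum of the inputs plus the bias. Its sign tells which side of the line a point lies on, and it equals the true geometric distance only when w has unit length. The weights, bias, and points below are made-up values for illustration.

```javascript
// Linear score y = w · x + b for an input with two features.
// Weights, bias, and sample points are illustrative, not taken from the slides.
function linearScore(w, x, b) {
  let y = b;
  for (let i = 0; i < w.length; i++) {
    y += w[i] * x[i];
  }
  return y;
}

const w = [1, -1]; // one weight per input feature
const b = 0.5;     // bias

console.log(linearScore(w, [2, 1], b)); // 1.5  (positive side of the line)
console.log(linearScore(w, [0, 3], b)); // -2.5 (negative side of the line)
```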

17 of 61

Line Equation

[Diagram: input x, weight w, bias b, output y]

18 of 61

Line Equation

[Diagram: inputs x₀ and x₁, weight w, bias b, output y]

19 of 61

Line Equation (simplified)

[Diagram: inputs x₀ and x₁, weight w, output y]

20 of 61

Let’s classify the samples

[Figure: Pedestrian and Cyclist samples on axes x₀ and x₁]

21 of 61

Activation function

[Figure: Pedestrian and Cyclist samples on axes x₀ and x₁, with outputs +1 and -1]

22 of 61

Activation function

[Diagram: inputs x₀ and x₁, weight w, output y]

Step function: maps the output to -1 or +1
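
A minimal sketch of a neuron with a step activation, assuming a score of exactly 0 maps to +1 (the slides do not say how ties are handled). The weights, bias, and inputs are illustrative values.

```javascript
// Step activation: map the linear score to one of two class labels.
// Assumption: a score of exactly 0 is mapped to +1.
function step(score) {
  return score >= 0 ? 1 : -1;
}

// A single neuron: weighted sum of two inputs plus bias, followed by the step function.
function neuron(w, b, x) {
  const score = w[0] * x[0] + w[1] * x[1] + b;
  return step(score);
}

console.log(neuron([1, -1], 0.5, [2, 1])); // 1
console.log(neuron([1, -1], 0.5, [0, 3])); // -1
```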

23 of 61

Neuron

[Diagram: neuron with inputs x₀ and x₁ and output y]

24 of 61

How good are the predicted classes?

[Figure: Pedestrian and Cyclist samples on axes x₀ and x₁, with predicted classes +1 and -1]

25 of 61

Error/Loss function

[Figure: Pedestrian and Cyclist samples on axes x₀ and x₁, with predicted classes +1 and -1 and a "Wrong side!" annotation]

26 of 61

Error/Loss function

[Diagram: inputs x₀ and x₁, weight w, output y; "Compute Error" compares y with y_true]
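
One simple way to compute an error from y and y_true is the fraction of misclassified samples, sketched below. This is an illustrative choice rather than the loss used in the talk. In practice a differentiable loss is preferred so that it can drive the optimization. The label arrays are made up.

```javascript
// Fraction of samples whose predicted label differs from the ground truth.
function errorRate(yPred, yTrue) {
  let wrong = 0;
  for (let i = 0; i < yPred.length; i++) {
    if (yPred[i] !== yTrue[i]) wrong++;
  }
  return wrong / yPred.length;
}

const yPred = [1, 1, -1, -1]; // predicted classes (illustrative)
const yTrue = [1, -1, -1, -1]; // ground-truth classes (illustrative)
console.log(errorRate(yPred, yTrue)); // 0.25
```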

27 of 61

Optimization (e.g. Gradient Descent)

[Diagram: inputs x₀ and x₁, weight w, output y, ground truth y_true; "Update weights" step]
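
The weight update can be sketched with the perceptron learning rule. It is used here in place of true gradient descent because the step function has no useful gradient. The toy data, learning rate, and epoch count are assumptions chosen for illustration.

```javascript
// Predict a class label (+1 or -1) for a two-feature sample.
function predict(w, b, x) {
  return w[0] * x[0] + w[1] * x[1] + b >= 0 ? 1 : -1;
}

// Perceptron rule: nudge the line toward every misclassified sample.
function train(samples, labels, learningRate, epochs) {
  const w = [0, 0];
  let b = 0;
  for (let epoch = 0; epoch < epochs; epoch++) {
    for (let i = 0; i < samples.length; i++) {
      const x = samples[i];
      const error = labels[i] - predict(w, b, x); // 0 if correct, +2 or -2 if wrong
      w[0] += learningRate * error * x[0];
      w[1] += learningRate * error * x[1];
      b += learningRate * error;
    }
  }
  return { w, b };
}

// Toy, linearly separable data (made up for illustration).
const samples = [[2, 1], [3, 2], [0, 3], [1, 4]];
const labels = [1, 1, -1, -1];

const model = train(samples, labels, 0.1, 20);
const predictions = samples.map((x) => predict(model.w, model.b, x));
console.log(predictions); // matches `labels` once training has converged
```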

28 of 61

Optimization (e.g. Gradient Descent)

[Figure: Pedestrian and Cyclist samples on axes x₀ and x₁]

31 of 61

Optimization (e.g. Gradient Descent)

[Figure: Pedestrian and Cyclist samples on axes x₀ and x₁, with a "Wrong side!" annotation]

32 of 61

Multiple Neurons

[Figure: Pedestrian and Cyclist samples on axes x₀ and x₁, with two sets of +1 and -1 labels]

33 of 61

Multiple Neurons

[Diagram: inputs x₀ and x₁, outputs y₀ and y₁]

34 of 61

Multiple Neurons

[Figure: Pedestrian and Cyclist samples on axes x₀ and x₁, with labels +1 and -1]

35 of 61

Multiple Layers

[Diagram: inputs x₀ and x₁, output y]
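
Stacking neurons into layers can be sketched as a forward pass through a tiny network with two inputs, two hidden neurons, and one output neuron. All weights are hand-picked, untrained values for illustration. They make the output +1 only when both inputs are at least 1, a region that a single line cannot carve out.

```javascript
// Step activation shared by all neurons in this sketch.
const step = (s) => (s >= 0 ? 1 : -1);

// One layer: each row of `weights` holds the weights of one neuron.
function layer(weights, biases, inputs) {
  return weights.map((row, j) => {
    let sum = biases[j];
    for (let i = 0; i < inputs.length; i++) sum += row[i] * inputs[i];
    return step(sum);
  });
}

// 2 inputs -> 2 hidden neurons -> 1 output neuron (illustrative weights).
function forward(x) {
  const hidden = layer([[1, 0], [0, 1]], [-1, -1], x); // lines x0 = 1 and x1 = 1
  const [y] = layer([[1, 1]], [-1], hidden);           // +1 only if both hidden outputs are +1
  return y;
}

console.log(forward([2, 2])); // 1
console.log(forward([2, 0])); // -1
console.log(forward([0, 0])); // -1
```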

37 of 61

Summary (Neural Nets)

Neurons stacked in Layers

Line equation (Hyperplane)
Activation function
Outputs class (score) in the end

Optimization

Compute error (with ground truth)
Update weights
Iterate

Training
Backward Pass
Back-Propagation

Inference
Forward Pass
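
The split between the forward pass (inference) and the backward pass (training) can be sketched for a tiny network with two inputs, two tanh hidden units, and one linear output. The tanh activation, squared error loss, initial weights, learning rate, and training sample are all assumptions made for this example.

```javascript
// Illustrative initial parameters for a 2 -> 2 -> 1 network.
const params = {
  W1: [[0.5, -0.3], [0.2, 0.8]], // hidden weights, one row per hidden unit
  b1: [0, 0],
  W2: [0.7, -0.4],               // output weights
  b2: 0,
};

// Forward pass (inference): compute hidden activations and the output.
function forward(p, x) {
  const h = p.W1.map((row, j) => Math.tanh(row[0] * x[0] + row[1] * x[1] + p.b1[j]));
  const y = p.W2[0] * h[0] + p.W2[1] * h[1] + p.b2;
  return { h, y };
}

// Backward pass (training): propagate the error back and update every weight.
function trainStep(p, x, yTrue, learningRate) {
  const { h, y } = forward(p, x);
  const dy = y - yTrue; // derivative of 0.5 * (y - yTrue)^2 with respect to y
  for (let j = 0; j < 2; j++) {
    // Chain rule through the output weight and the tanh of hidden unit j.
    const dh = dy * p.W2[j] * (1 - h[j] * h[j]);
    p.W2[j] -= learningRate * dy * h[j];
    p.W1[j][0] -= learningRate * dh * x[0];
    p.W1[j][1] -= learningRate * dh * x[1];
    p.b1[j] -= learningRate * dh;
  }
  p.b2 -= learningRate * dy;
  return 0.5 * dy * dy; // loss measured before the update
}

const losses = [];
for (let i = 0; i < 50; i++) {
  losses.push(trainStep(params, [1, 2], 1, 0.1));
}
console.log(losses[0], losses[losses.length - 1]); // the loss should shrink
```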

38 of 61

Summary (Neural Nets)

[Diagram: inputs x₀ and x₁, output y, annotated with "Inference" and "Back-Propagation"]

39 of 61

Demo: TensorFlow Playground

Source: playground.tensorflow.org

40 of 61

Ingredients for Deep Learning

More (efficient) Layers

Convolutions
Pooling
Batchnorm, Concat, Skip, etc.

Deep and wide models
A lot of data

41 of 61

Fully Connected Layer

[Diagram: inputs x₀, x₁, x₂, x₃ and output y]

The input is usually very large, e.g. 3x227x227

Hence a large number of params, e.g. 3x227x227 * 2
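
The arithmetic behind this estimate, as a small sketch. It assumes the factor 2 is the number of output neurons and ignores bias terms.

```javascript
// Parameter count of a fully connected layer: one weight per (input, output) pair.
// Biases are ignored, matching the 3x227x227 * 2 estimate.
function fullyConnectedParams(numInputs, numOutputs) {
  return numInputs * numOutputs;
}

const numInputs = 3 * 227 * 227; // 3 color channels, 227x227 pixels
console.log(numInputs);                          // 154587
console.log(fullyConnectedParams(numInputs, 2)); // 309174
```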

42 of 61

Convolution (weight sharing)

[Diagram: inputs x₀, x₁, x₂, x₃ and output y]
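
Weight sharing can be sketched with a one-dimensional convolution: the same small kernel is applied at every input position, so the number of weights does not grow with the input size. The sketch assumes stride 1 and no padding, and does not flip the kernel, which is the usual convention in deep learning frameworks. The input and kernel values are illustrative.

```javascript
// 1D convolution (stride 1, no padding): slide one shared kernel over the input.
function conv1d(input, kernel) {
  const output = [];
  for (let i = 0; i + kernel.length <= input.length; i++) {
    let sum = 0;
    for (let k = 0; k < kernel.length; k++) {
      sum += kernel[k] * input[i + k];
    }
    output.push(sum);
  }
  return output;
}

const input = [1, 2, 3, 4]; // x0..x3
const kernel = [1, -1];     // only two weights, reused at every position
console.log(conv1d(input, kernel)); // [ -1, -1, -1 ]
```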

45 of 61

Demo: MNIST Classification (Training)

Source: ConvNetJS - MNIST

46 of 61

Demo: Deep Learning Models

AlexNet (2012)

60M params ~ 240MB
15.4% ImageNet Top-5 Error

GoogLeNet (2014)

7M params ~ 30MB
6.7% ImageNet Top-5 Error

SqueezeNet (2016)

1M params ~ 5MB
17.5% ImageNet Top-5 Error

Source: chaosmail.github.io/caffejs
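
The model sizes above are roughly the parameter count times 4 bytes. The sketch below assumes weights stored as 32-bit floats and 1MB = 10^6 bytes, and uses the rounded parameter counts from this slide.

```javascript
// Approximate size of a model whose parameters are stored as 32-bit floats.
function sizeInMB(numParams, bytesPerParam = 4) {
  return (numParams * bytesPerParam) / 1e6;
}

console.log(sizeInMB(60e6)); // 240 (AlexNet)
console.log(sizeInMB(7e6));  // 28  (GoogLeNet, listed as ~30MB)
console.log(sizeInMB(1e6));  // 4   (SqueezeNet, listed as ~5MB)
```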

47 of 61

Demo: DeepDream

Source: Google DeepDream, chaosmail.github.io/caffejs

49 of 61

Many pretrained models available

For Tasks:

Image Classification
Image Segmentation
Object Localization and Detection
Word embedding (word2vec)
… many more

For Frameworks:

Caffe: github.com/BVLC/caffe/wiki/Model-Zoo
Keras/TensorFlow: keras.io/applications/
Torch: github.com/torch/torch7/wiki/ModelZoo

50 of 61

Summary (Deep Learning)

Use mostly convolutions (weight sharing)
Deep networks for learning complex features
Many pretrained models are available

Caffe models: BAIR license for unrestricted use

51 of 61

JavaScript Ingredients for Deep Learning

TypedArrays and ArrayBuffer
GPU support via WebGL
Multithreading via WebWorkers
Rich Media APIs

WebRTC (Web Real-Time Communications)
WebAudio & WebMidi
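
A minimal sketch of the first ingredient: weights held in a Float32Array on top of a single ArrayBuffer, with row views that share memory instead of copying. The 2x3 layout and the values are illustrative.

```javascript
// Six 32-bit float weights stored contiguously in one buffer.
const numWeights = 6;
const buffer = new ArrayBuffer(numWeights * Float32Array.BYTES_PER_ELEMENT);
const weights = new Float32Array(buffer);
weights.set([1, 2, 3, 4, 5, 6]);

// Views share the buffer: a 2x3 weight matrix is split into rows without copying.
const row0 = new Float32Array(buffer, 0, 3);
const row1 = new Float32Array(buffer, 3 * Float32Array.BYTES_PER_ELEMENT, 3);

row1[0] = 40; // writes through to the shared buffer

console.log(buffer.byteLength); // 24
console.log(weights[3]);        // 40
console.log(row0[2]);           // 3
```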

52 of 61

Challenges in JavaScript (in the Browser)

Network load
Memory consumption
No CUDA, requires Textures in WebGL
Small models are preferred

53 of 61

JavaScript Frameworks for Deep Learning

Name          Released   Inference  Backprop  GPU  Compatible with
ConvNetJS     Dec. 2013  yes        yes       -    ConvNetJS
CaffeJS       May 2016   yes        yes       -    Caffe
Keras.js      Aug. 2016  yes        -         yes  Keras
Deeplearn.js  Jul. 2017  yes        yes       yes  Keras/TensorFlow

54 of 61

It’s time for some more demos

55 of 61

Teachable Machine

Authors: Støj, Use All Five, Creative Lab and PAIR teams at Google; Source: teachablemachine.withgoogle.com

56 of 61

Fast Neural Style Transfer

Authors: Reiichiro Nakano; Source: reiinakano.github.io/fast-style-transfer-deeplearnjs

57 of 61

Sentiment Classification

Authors: Leon Chen (MD.ai); Source: transcranial.github.io/keras-js/#/imdb-bidirectional-lstm

58 of 61

RNN for Melody Creation

Authors: Magenta (a Google Brain project); Source: deeplearnjs.org/demos/performance_rnn

59 of 61

Take-home Messages

You can already run many pretrained models, and train models, in the browser for image, audio, text, etc.
Keep an eye on the model size

Avoid Fully Connected layers
Embed wide modules (such as Fire or Inception)

Use GPU acceleration if possible

60 of 61

One more thing...

It’s an exciting time for JavaScript developers interested in AI and DL!

61 of 61

Thanks!

Slides are available:

bit.ly/jazoon2017-dl