Basic components

[Figures: basic operation templates: Softmax, Convolve, Sharpen]
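These template operations can be sketched minimally in NumPy. This is an illustrative sketch only; the 3x3 sharpening kernel below is a common choice assumed here, not one specified by the slides.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability, then normalize to sum to 1.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def convolve2d(image, kernel):
    # Valid-mode 2D convolution (cross-correlation form, as used in
    # deep learning), stride 1, no padding.
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A common 3x3 sharpening kernel (assumed; not given on the slides).
sharpen_kernel = np.array([[0, -1, 0],
                           [-1, 5, -1],
                           [0, -1, 0]])
```

Sharpening is then just `convolve2d(image, sharpen_kernel)`; since the kernel sums to 1, flat regions pass through unchanged while edges are amplified.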

Architectures

[Figures: sequence templates: subjects S2 and S3; Frame1-Frame7 grouped into Levels 1-4, each with Sample1 and Sample2]
[Figure: CNN layer template: convolutional units C within a CNN layer]

[Figure: ConvNet configuration: Input (32x32x3) → [Conv3-32] x4 → Maxpool (2x2) → [Conv3-64] x2 → Maxpool (2x2) → Conv3-128 x1 → Maxpool (2x2) → FC-512 → Feature Vector → Output]
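Assuming the Conv3 layers use 'same' padding and each Maxpool is 2x2 with stride 2 (the usual reading of this configuration, not stated explicitly on the slide), the spatial sizes can be walked through with a small script:

```python
# Spatial-size walk through the ConvNet configuration above,
# assuming 'same'-padded 3x3 convolutions and 2x2/stride-2 pooling.
def conv_same(size):
    return size            # 'same' padding: spatial size unchanged

def maxpool2(size):
    return size // 2       # 2x2 pooling, stride 2: size halves

size = 32                  # Input is 32x32x3
for _ in range(4):         # [Conv3-32] x4
    size = conv_same(size)
size = maxpool2(size)      # -> 16x16
for _ in range(2):         # [Conv3-64] x2
    size = conv_same(size)
size = maxpool2(size)      # -> 8x8
size = conv_same(size)     # Conv3-128
size = maxpool2(size)      # -> 4x4
flat = size * size * 128   # flattened features entering FC-512
```

Under these assumptions the FC-512 layer receives a 4x4x128 = 2048-dimensional flattened feature map.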

[Figures: CNN-LSTM architecture templates: stacked L (LSTM) and C (convolution) units feeding AM (attention module), FC (fully connected), and SM (softmax) layers]

[Figure panels: A) LSTM; B) Mixed LSTM / 1D Conv; C) Mixed BiLSTM / 1D Conv; D) Mixed Att-BiLSTM / 1D Conv]

[Figures: Frame1-Frame10 sequences labeled into Levels 1-4 (Sample1, Sample2); subjects S7 and S8]

[Figures: further CNN-LSTM template variants: CNN layers feeding an LSTM layer and an attention layer, ending in FC and softmax]

[Figures: four-layer fully connected network: Input Layer X = A[0]; Hidden Layers A[1], A[2], A[3] with units a[l]_1 ... a[l]_n; Output Layer a[4] = Ŷ]

[Figures: CONV operation: NxNx3 input a[l-1] convolved with two filters (biases b1, b2), each followed by ReLU, producing two MxM maps stacked into an MxMx2 output a[l]]

[Figure: striding in CONV: stride S=1 vs. S=2]

[Figure: Inception module: NxNx192 input through parallel same-padding branches: 1x1 conv → NxNx64, 3x3 conv → NxNx128, 5x5 conv → NxNx32, MaxPool (s=1) → NxNx192, concatenated along channels]
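The channel arithmetic of the module can be checked with stand-in arrays; the zeros below stand in for real branch outputs, and the spatial size N is arbitrary:

```python
import numpy as np

# Each branch preserves the NxN spatial size ('same' padding, pool
# stride 1), so the outputs concatenate along the channel axis.
N = 8                                  # arbitrary spatial size
branch_channels = {"1x1": 64, "3x3": 128, "5x5": 32, "maxpool": 192}

# Stand-ins for the real branch outputs: same spatial size, new channels.
branches = [np.zeros((N, N, c)) for c in branch_channels.values()]
out = np.concatenate(branches, axis=-1)
```

Concatenating 64 + 128 + 32 + 192 channels yields an NxNx416 output.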

[Figure: Transformer. Encoder: Input Embedding + Positional Encoding → Multi-Head Attention → Add & Norm → Feed Forward → Add & Norm. Decoder: Output Embedding (outputs shifted right) + Positional Encoding → Masked Multi-Head Attention → Add & Norm → Multi-Head Attention over encoder output → Add & Norm → Feed Forward → Add & Norm → Linear → Softmax]
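The Multi-Head Attention blocks in the figure are built from scaled dot-product attention. A single-head NumPy sketch with toy shapes (not the original model's dimensions):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax over the key positions.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy example: 3 query positions attending over 4 key/value positions.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
```

Each row of `w` is a probability distribution over the four key positions, and `out` has one attended vector per query.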

[Figure: tokenization: "I love coding and writing" → I | love | coding | and | writing]
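The split shown is plain whitespace tokenization; a minimal sketch:

```python
# Whitespace tokenization as depicted: the sentence splits on spaces
# into five tokens.
def tokenize(text):
    return text.split()

tokens = tokenize("I love coding and writing")
```

Real tokenizers (subword, BPE, etc.) are more involved, but this is the operation the figure illustrates.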

ML Concepts

[Figures: how a neural network works (inspired by Coursera): housing features (Size, #bed, ZIP, Wealth → Family?, Walk?, School) predicting PRICE ŷ; logistic regression separating Ŷ = 0 from Ŷ = 1; basic neuron model]

[Figures: linear regression (Size vs. $) and the ReLU(x) activation]

[Figure: unrolling feature vectors: an NxN R-G-B image flattened into a single vector of pixel values (256, 225, 56, ..., 214, 210, 211)]

[Figure: why does deep learning work? Performance vs. amount of data for a Large NN, Med NN, Small NN, and traditional methods (SVM, LR, etc.)]

[Figure: one-hidden-layer neural network: X = A[0] → hidden units a[1]_1 ... a[1]_4 (A[1]) → output a[2] (A[2]) = Ŷ]

[Figure: neural network templates: inputs x[1], x[2], x[3] feeding hidden units a[1]_1, a[1]_2 and output a[2]]

[Figure: Train-Valid-Test split vs. model fitting on (x1, x2): underfitting, good fit, overfitting]

[Figures: DropOut applied to a network with inputs x[1]-x[3] and output a[L]; normalization mapping (x1, x2) onto radius r = 1]

[Figures: cost surface J(w1, w2) before vs. after normalization; early stopping: Train and Dev error vs. iterations]
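The normalization behind the before/after surfaces is per-feature standardization: zero-center each input feature and divide by its standard deviation so the cost surface becomes more symmetric and easier to descend. A minimal NumPy sketch:

```python
import numpy as np

def normalize(X):
    # Standardize each column (feature) of X to zero mean, unit variance.
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma

# Features on very different scales, as in the 'before' panel.
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0]])
Xn = normalize(X)
```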

[Figures: deep neural network with inputs x1, x2 and weights w[1], w[2], ..., w[L-2], w[L-1], w[L]; understanding precision & recall from TP, FP, TN, FN]
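Precision and recall follow directly from the quadrant counts in the figure: precision = TP / (TP + FP), recall = TP / (TP + FN). The counts below are hypothetical, since the figure gives none:

```python
def precision_recall(tp, fp, fn):
    # Precision: of everything predicted positive, how much was right.
    precision = tp / (tp + fp)
    # Recall: of everything actually positive, how much was found.
    recall = tp / (tp + fn)
    return precision, recall

# Hypothetical counts for illustration only.
p, r = precision_recall(tp=8, fp=2, fn=4)
```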

[Figures: Batch vs. Mini-batch Gradient Descent; Batch Gradient Descent vs. SGD trajectories over (w1, w2)]

[Figure: softmax prediction with two outputs: inputs x[1], x[2], x[3] → probabilities p[1], p[2]]

Abstract backgrounds

dair.ai

Gradient Backgrounds


ML and Health

[Figure: pain intensity assessment pipeline: EEG time series → ICA → spectral topography maps (Delta, Alpha, Beta bands) as EEG images → per-time-slice Conv stacks for spatial feature learning → LSTM+AM for temporal feature aggregation → pain intensity assessment]

[Figure: pain location assessment: the same pipeline (ICA → spectral topography maps in Theta, Alpha, Beta bands → spatial Conv stacks → LSTM+AM temporal feature aggregation) applied to pain location]

[Figure: ICA activations (U = WX) and their corresponding scalp maps]

Test Subject        Pain Intensity    Pain Location
S1                  0.8209            0.7243
S2                  0.8882            0.8028
S3                  0.9569            0.9397
S4                  0.9625            0.9951
S5                  0.9322            0.9286
S6                  0.9563            1.0000
S7                  1.0000            0.9948
S8                  0.9707            0.9088
S9                  0.9809            0.8844
S10                 0.9226            0.8956
S11                 0.9015            0.8816
S12                 0.8920            0.7081
S13                 0.8094            0.7591
Average Accuracy    0.9226            0.8787


[Figure: signal segmentation over time, from immerse hand to withdraw hand, with segments labeled Level 1-4: No Pain, Low Pain, Moderate Pain, High Pain]

[Figure: image generation: AEP → FFT → PSD in Theta (4~8Hz), Alpha (8~13Hz), and Beta (13~30Hz) bands → bicubic interpolation → spectral topography map]

[Figure: ConvNet configuration: Input (32x32x3) → [Conv3-32] x4 → Maxpool (2x2) → [Conv3-64] x2 → Maxpool (2x2) → Conv3-128 → Maxpool (2x2) → FC-512 → Feature Vector → Output]

[Figures: ConvNet configuration drawn as layer stacks: Stack1-Stack4 of Conv3-32, Conv3-64, and Conv3-128 layers, each stack ending in Max-Pool, followed by FC-512 and a Softmax output]

[Figure: (a) signal segmentation over time into Levels 1-5; (b) labels No Pain, Low Pain, Medium Pain, High Pain, Unbearable Pain]

Miscellaneous

[Figure: U-Net. Encoder: 3 → 16 → 32 → 64 → 128 → 256 channels via 3x3 convolutions and 2x2 max pooling. Decoder: 2x2 up-sampling with skip connections concatenating encoder features (128+256 → 128, 64+128 → 64, 32+64 → 32, 16+32 → 16), finishing with a 1x1 convolution to 1 channel]

[Figure: the same ConvNet configuration annotated by layer (Layer1-Layer4), ending in an FC-512 feature vector and output]

[Figure: Inception module: previous layer fanned into parallel branches (1x1 convolutions; 1x1 → 3x3 convolutions; 1x1 → 5x5 convolutions; 3x3 max pooling → 1x1 convolutions) merged by filter concatenation]

[Figure: 1D Inception-style network: Input → 1x11 convolutions → Inception modules → 1x7 convolutions → FC layers → Output]

[Figure: 1D Inception module: previous layer fanned into parallel branches of 1x3 convolutions (1 padding), a 1x5 convolution (2 padding), and a 1x7 convolution (3 padding), joined by filter concatenation]

[Figure: GoogLeNet-style network: Input → Conv and Max-Pool stem → stacked Inception modules separated by Max-Pool → Avg-Pool → FC → Softmax, with two auxiliary classifiers (Avg-Pool → Conv → FC → FC → Softmax) branching from intermediate Inception outputs]

[Figure: factorized Inception variants: (a) the 5x5 convolution replaced by two stacked 3x3 convolutions; (b) 3x3 convolutions factorized into 1x3 and 3x1 convolutions; both merge by filter concatenation]

[Figures: plain stacked layers computing y = F(x) vs. a residual block computing y = F(x) + x via an identity skip connection; recurrent templates R1-R3]
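The residual block can be sketched directly: F is the stacked layers and the identity skip carries x around them. Here F is a toy two-layer transform with arbitrary stand-in weights:

```python
import numpy as np

def F(x, W1, W2):
    # The 'stacked layers' branch of the block.
    h = np.maximum(0, W1 @ x)   # layer 1 + ReLU
    return W2 @ h               # layer 2 (pre-addition)

def residual_block(x, W1, W2):
    # y = F(x) + x: the identity skip connection.
    return F(x, W1, W2) + x

rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1 = rng.normal(size=(4, 4))
W2 = rng.normal(size=(4, 4))
y = residual_block(x, W1, W2)
```

The skip connection means the block only has to learn the residual F(x) = y - x, which is what makes very deep stacks trainable.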

[Figure: DenseNet: Input → Conv → Dense Block 1 → transition layer (Conv, Avg-Pool) → Dense Block 2 → transition layer → Dense Block 3 → Avg-Pool → FC → Softmax]

[Figure: NAS-style cells (a) and (b): identity, 3x3/5x5/7x7 convolutions, and 3x3 average/max pooling combined through add nodes, mapping hidden states h_{i-1}, h_i to h_{i+1} via filter concatenation]

[Figure: image representation as an X-Y grid of pixel values, and pooling performed with a 2x2 kernel and a stride of 2: input [[1,1,2,4],[5,6,7,8],[3,2,1,0],[1,2,3,4]] → [[6,8],[3,4]], e.g. Max(1,1,5,6) = 6]
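The pooling example can be reproduced exactly; a minimal NumPy sketch of 2x2 max pooling with stride 2:

```python
import numpy as np

def maxpool2x2(x):
    # Reduce each non-overlapping 2x2 window to its maximum,
    # e.g. Max(1, 1, 5, 6) = 6 for the top-left window.
    H, W = x.shape
    out = np.zeros((H // 2, W // 2))
    for i in range(0, H, 2):
        for j in range(0, W, 2):
            out[i // 2, j // 2] = x[i:i + 2, j:j + 2].max()
    return out

x = np.array([[1, 1, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]])
pooled = maxpool2x2(x)   # [[6, 8], [3, 4]]
```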

ML System Design / Infrastructure


[Figure: system design: EEG helmet → pain level and pain location → APP → treatment intensity and treatment plan → pain treatment device; the app keeps records of the user's treatment duration, treatment plan, and treatment feedback]