Mobile Vision Learning: TensorFlow Lite, Model Compression, and Efficient Convolution
Jaewook Kang, Ph.D.
jwkang10@gmail.com
June 12th, 2018
© 2018
MoT Labs
All Rights Reserved
TensorFlow for Everyone!
J. Kang Ph.D.
Introduction
Jaewook Kang et al., "Bayesian Hypothesis Test using Nonparametric Belief Propagation for Noisy Sparse Recovery," IEEE Trans. on Signal Processing, Feb. 2015
Jaewook Kang et al., "Fast Signal Separation of 2D Sparse Mixture via Approximate Message-Passing," IEEE Signal Processing Letters, Nov. 2015
Jaewook Kang (강재욱)
1. What It Means to Do Machine Learning on Mobile
- Why on-device ML?
- What needs to be solved
Mobile Machine Learning
- From presentation material by 신범준 (Hyperconnect) -
Mobile Machine Learning in a Mobile App

[Figure: mobile ML frameworks combined with the NN API, with release years 2015-2018]
2. TensorFlow Lite Preview
- About TensorFlow Lite
- Android Neural Network API
- Model conversion to tflite
About TensorFlow Lite
- An API that translates tflite models so each platform's kernels can run them
- A per-platform op set for optimizing tflite models
- Optimized allocation of on-device HW compute resources
Run on device!
Android Neural Network API
This slide is powered by J. Lee
2) The model is handed over to the NNAPI classes through the C++ tflite Kernel Interpreter.
3) Using the C++ NNAPI op set, a low-level version of the tflite model is built internally.
4) The low-level tflite model is executed through NNAPI.
NNAPI
Hardware acceleration via Android NN API
The low-level model generated from the Android NN API op set is composed of the following NNAPI ops.
Model creation: building and compiling an NNAPI model into lower-level code
Run inference
Wait for completion
From TF Model to Android App Build
Model Conversion to tflite
Get a model
→ Export the inference graph (.pb)
→ Freeze the exported graph with the checkpoint (.pb + .ckpt → frozen .pb)
→ Convert the frozen inference graph to TFLITE (.tflite)

This slide is powered by S. Yang
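The freeze-and-convert steps above can be sketched with the TensorFlow 1.x-era command-line tools; all file names and node names below are placeholders for your own model.

```shell
# Sketch of the freeze + convert pipeline (TensorFlow 1.x-era tools).
# File names and node names are placeholders, not real model artifacts.
freeze_graph \
  --input_graph=inference_graph.pb \
  --input_checkpoint=model.ckpt \
  --output_node_names=output \
  --output_graph=frozen_graph.pb

tflite_convert \
  --graph_def_file=frozen_graph.pb \
  --input_arrays=input \
  --output_arrays=output \
  --output_file=model.tflite
```
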
TensorFlow Lite Remarks
2. Making CNN Models Lightweight, Part 1
- Model compression
How to compress the model!
Model Compression
Network Pruning with the remaining weights
1. Train all weights in the network.
2. Threshold the weights: make the net sparse!
3. Fine-tune the remaining weights.
4. Is performance satisfactory? If not, tune the threshold and repeat; if yes, release!
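The thresholding step above can be sketched in a few lines of plain Python (an illustrative toy, not an actual TensorFlow pruning implementation):

```python
# Minimal sketch of magnitude-based pruning (illustrative, pure Python).
def prune(weights, threshold):
    """Zero out weights whose magnitude falls below the threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]

def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)

weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.03, 0.2, -0.02]
pruned = prune(weights, threshold=0.1)
print(pruned)            # small-magnitude weights become 0.0
print(sparsity(pruned))  # 0.5
```

In a real pipeline the fine-tuning step retrains only the surviving nonzero weights, and the threshold is tuned until accuracy is acceptable.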
Drawbacks:
- The pipeline is long (it takes a long time).
- A separate implementation is required, and performance depends heavily on it.
- Performance is not verified.
Iterative pruning: the threshold → fine-tune loop is applied repeatedly.
[Figure: pruning results. Model: VGG-16]
Network Pruning: Remarks
- The pipeline is long (it takes a long time).
- A separate implementation is required, and performance depends heavily on it.
- Most papers only show good results on over-parameterized models.
Weight Sharing
- … can be reduced.
- A separate implementation using … is required.

[Figure: original weights → clustered weights + weight mapping table]
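A minimal sketch of the clustering idea, assuming a 1D weight list and a hand-picked codebook (a real implementation would learn the centers, e.g. with k-means):

```python
# Illustrative weight sharing: each weight is replaced by the index of
# its nearest cluster center; only the small codebook plus the index
# table needs to be stored.
def share_weights(weights, centers):
    """Map each weight to the index of its nearest cluster center."""
    return [min(range(len(centers)), key=lambda i: abs(w - centers[i]))
            for w in weights]

def reconstruct(indices, centers):
    """Recover approximate weights from the index table + codebook."""
    return [centers[i] for i in indices]

centers = [-0.5, 0.0, 0.5]                 # shared codebook (assumed)
weights = [0.45, -0.48, 0.02, 0.51, -0.1]
idx = share_weights(weights, centers)
print(idx)                        # [2, 0, 1, 2, 1]
print(reconstruct(idx, centers))  # [0.5, -0.5, 0.0, 0.5, 0.0]
```

With 256 centers, each index fits in one byte, so 32-bit weights compress roughly 4x before any further entropy coding.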
Weight Quantization
[Figure: quantized inference. A float32 input is quantized to uint8, ops such as ReLU run on uint8, and the result is dequantized back to float32.]
Two steps:
- Scale down: uint8 representation
- Cast down: uint32 representation
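The scale-down step can be illustrated with a toy affine quantizer; the scale and zero-point values here are made up for the example:

```python
# Sketch of affine (asymmetric) uint8 quantization:
#   q = round(x / scale) + zero_point, clamped to [0, 255].
def quantize(x, scale, zero_point):
    q = round(x / scale) + zero_point
    return max(0, min(255, q))            # clamp to the uint8 range

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

scale, zero_point = 0.05, 128             # example parameters (assumed)
x = 1.0
q = quantize(x, scale, zero_point)
print(q)                                  # 148
print(dequantize(q, scale, zero_point))   # 1.0 (up to quantization error)
```

Products of two uint8 values are accumulated in a wider integer type before being cast back down, which is why the hardware's integer units matter.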
[Benchmarks: MobileNet v1, tflite
- Google Pixel 2 (Snapdragon 835 LITTLE cores)
- Google Pixel 1 (Snapdragon 821 LITTLE cores)]
Ultimately, integer-only arithmetic delivers speedups only when the hardware supports it.
Model Compression Remarks
3. Making CNN Models Lightweight, Part 2
- Efficient convolution layers
Let's reduce the computation and parameter count of convolution!
Small but Strong Convolutional Layers!
Cross Channel Pooling
[Figure: 1x1 convolution as cross-channel pooling. An output map of 28 x 28 x 1 is computed as Y from input X and weights W. Figure from Stanford CS231n material.]
X: 3x3xL input features; W: a single 1x1xL conv filter (L = 3, M = 1), where L is the number of input channels.
At each spatial position, the 1x1xL filter computes a weighted sum across the L input channels: Z1 = w11·x1 + w12·x2 + w13·x3, giving logit features Z: 3x3x1. With two 1x1xL filters (M = 2), the logit features become Z: 3x3x2; in general, M filters give Z: 3x3xM. Applying a ReLU activation to Z yields the output features Y: 3x3xM.
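A toy sketch of this cross-channel weighted sum, using plain Python lists (illustrative only; shapes and weights are made up):

```python
# 1x1 convolution as cross-channel pooling: each output pixel is a
# weighted sum over the L input channels at that position.
def conv_1x1(x, w):
    """x: H x W x L input feature map; w: one 1x1xL filter (L weights)."""
    H, W_, L = len(x), len(x[0]), len(x[0][0])
    return [[sum(w[l] * x[i][j][l] for l in range(L))
             for j in range(W_)] for i in range(H)]

# A 2 x 2 input with L = 3 channels (every position holds [1, 2, 3]).
x = [[[1.0, 2.0, 3.0] for _ in range(2)] for _ in range(2)]
w = [0.5, 0.25, 0.25]          # example 1x1x3 filter weights
z = conv_1x1(x, w)
print(z)                        # [[1.75, 1.75], [1.75, 1.75]]
```

Using M such filters stacks M of these maps, producing the 3x3xM logit features described above.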
Combination of Parallel Conv Paths

[Figure: the visual processing hierarchy. Retina → Area V1 → V2 → V3 → V4: edges → object parts → entire objects.]
Dimensionality reduction → capturing correlation from local clusters → dimensionality reduction
Residual Learning
Does depth matter for deep learning?
As the number of layers increases: an overfitting problem? The validation-training error gap grows!
[Analogy: two routes (A and B) from a starting point to a destination. First conclusion: the roads get congested. Better conclusion: no, build a highway!]
It is easier to optimize the residual mapping than the original mapping.
[Figure: a residual block, with a residual path and a shortcut path.]
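The shortcut idea can be sketched in a few lines; the residual mapping `f` below is a made-up stand-in for the block's conv layers:

```python
# Minimal sketch of a residual block: the layers learn F(x), and the
# shortcut path adds the input back, so the block outputs F(x) + x.
def relu(v):
    return [max(0.0, a) for a in v]

def residual_block(x, f):
    """f: the residual mapping (stand-in for conv layers); x: input vector."""
    return [fi + xi for fi, xi in zip(f(x), x)]   # shortcut addition

# Toy residual mapping: scale by 0.1, then ReLU (assumed for illustration).
f = lambda x: relu([0.1 * v for v in x])
y = residual_block([1.0, -2.0], f)
print(y)   # [1.1, -2.0]: F(x) = [0.1, 0.0] plus the shortcut x
```

If the optimal mapping is close to the identity, the layers only need to push F(x) toward zero, which is easier than learning the identity from scratch.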
Depthwise Separable Conv
Recent advances do not necessarily make networks more efficient with respect to size and speed!
High cross-channel correlation!

Very(?) low cross-channel correlation!
[Figure: conv filters under low vs. high cross-channel correlation.]
Depthwise conv + pointwise conv:
- Dwise filter size: K x K x 1 (x M), K = 3
- Pwise filter size: 1 x 1 x L (x M), L = 3
- Depthwise: a K x K x 1 2D filter is applied to each N x N x 1 input channel.
- Pointwise: 2D convolution with 1x1xL (x M) filters produces the N x N x M output (M < L).
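The efficiency gain is easy to check by counting parameters (a back-of-the-envelope sketch; the K, L, M values are example choices):

```python
# Parameter counts: standard conv vs. depthwise separable conv,
# for a K x K kernel, L input channels, M output channels.
def standard_conv_params(K, L, M):
    return K * K * L * M

def depthwise_separable_params(K, L, M):
    return K * K * L + L * M      # depthwise part + pointwise part

K, L, M = 3, 32, 64                       # example layer shape
std = standard_conv_params(K, L, M)        # 18432
sep = depthwise_separable_params(K, L, M)  # 2336
print(std, sep, round(std / sep, 1))       # about 7.9x fewer parameters
```

The multiply-accumulate count shrinks by the same factor, roughly K² for large M, which is where the mobile speedup comes from.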
Depthwise Convolution
Linear Bottleneck

[Recap: a single 1x1xL conv filter (L = 3, M = 1) computes Z1 as a weighted sum (w11, w12, w13) over the L input channels of X: 3x3xL.]
[Figure: an Lx1x1 local patch vector produces two different logit scalars via two filters, 1x1xL conv1 and 1x1xL conv2.]
A set of M 1x1xL conv filters forms a 1x1xLxM filter matrix W. At each position, the Lx1x1 patch vector X is mapped to the Mx1 output logit Z = WX (before activation); the rows of W run along the output-channel direction and the columns along the input-channel direction.
where "dim" denotes the dimension of the activation space spanned by W. (Note: the activation space is the feature space after the linear transformation.)
With a set of four 1x1xL conv filters (a 1x1xLxM filter matrix W, M = 4), the features after the depthwise conv are mapped to the Mx1 output logit Z; applying ReLU then gives the Mx1 output Y.
Dimension of the X manifold <= dimension of the activation space (WX)
- Mark Sandler et al., "MobileNetV2: Inverted Residuals and Linear Bottlenecks," CoRR, 2018.
Depthwise separable block:
X (NxNxL) → Dwise conv 3x3x1 → BN → ReLU6 → feature maps (NxNxL) → Pwise conv 1x1xLxM (M < L) → BN → ReLU6 → Y (NxNxM)
(Dwise conv: spatial feature extraction; Pwise conv: channel pooling.)
Linear bottleneck block:
X (NxNxL) → Dwise conv 3x3x1 → BN → ReLU6 → feature maps (NxNxL) → Pwise conv 1x1xLxL → BN → ReLU6 → feature maps (NxNxL) → Linear bottleneck 1x1xLxM (L > M) → BN → Y (NxNxM)
(Dwise conv: spatial feature extraction; linear bottleneck: channel pooling with no activation.)
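A toy sketch of why the last projection is kept linear: once the features are squeezed into few channels, a ReLU6 after the projection can wipe out the information the bottleneck just computed (all values below are made up):

```python
# Linear-bottleneck sketch: the final 1x1 projection to M < L channels
# is left linear; applying ReLU6 to its output can destroy information.
def relu6(v):
    return [min(6.0, max(0.0, a)) for a in v]

def project(x, W):
    """1x1 conv as a matrix product: W is M x L, x is an L-vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in W]

x = [1.0, -2.0, 3.0]                       # L = 3 channels at one position
W = [[0.5, 0.5, 0.0], [0.0, 1.0, -1.0]]    # project to M = 2 channels
z = project(x, W)
print(z)          # [-0.5, -5.0] -> the linear bottleneck keeps these
print(relu6(z))   # [0.0, 0.0]   -> ReLU6 would zero both channels out
```

Inside the block, ReLU6 is safe because it acts on the wider (higher-dimensional) representations, where clipping loses little information.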
Linear Bottleneck in MobileNet v2
Efficient Convolution Remarks
Introducing the Modulabs MoT Lab
Google Deep Learning Jeju Camp 2018
Seoyoen Yang (SNU)
Taekmin Kim (SNU)
Jaewook Kang (Modulabs)
MoT Contributors
Jaewook Kang (Modulabs)
Joon ho Lee (Neurophet)
Yonggeun Lee
Jay Lee (KakaoPay)
SungJin Lee (DU)
Seoyoen Yang (SNU)
Taekmin Kim (SNU)
Jihwan Lee (SNU)
Doyoung Kwak (PU)
Yunbum Beak (신호시스템)
Joongwon Jwang (위메프)
Jeongah Shin (Hanyang Univ.)
The "Save Everyone from Turtle Neck" Project (모두의 거북목을 지켜줘)
The End
Mobile Vision Learning 2018
- All rights reserved @ Jaewook Kang 2018