Residual-INR: Communication Efficient On-Device Learning Using Implicit Neural Representation
Hanqiu Chen1, Xuebin Yao2, Pradeep Subedi2 and Cong (Callie) Hao1
1Georgia Institute of Technology
2Samsung Semiconductor, Inc.
Background: On-device Learning @ Edge
[Figure: example edge applications for on-device learning: wildlife monitoring, agriculture, underwater]
Why on-device learning?
Challenge: Slow Communication
Edge devices need to share and synchronize newly collected data. But…
Expensive and slow communication using JPEG!
[Figure: serverless edge computing: edge devices exchange JPEG images over slow wireless links (~2 MB/s)]
Opportunity: Using Implicit Neural Representation
Implicit neural representation (INR) can be used to compress images and videos to reduce communication: a small neural network is overfit to one image so that its weights become the compressed representation (a minimal sketch follows the references below)
Similarly, INRs can be applied to videos by using the frame time index as the input (NeRV @ NeurIPS'21)
Rapid-INR @ ICCAD’23
1. Chen, Hanqiu, Hang Yang, Stephen Fitzmeyer, and Cong Hao. "Rapid-INR: Storage Efficient CPU-free DNN Training Using Implicit Neural Representation." In 2023 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2023.
2. Chen, Hao, Bo He, Hanyu Wang, Yixuan Ren, Ser Nam Lim, and Abhinav Shrivastava. "NeRV: Neural Representations for Videos." Advances in Neural Information Processing Systems 34 (2021).
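To make the idea concrete, below is a minimal sketch of an image INR (illustrative only, not the exact Rapid-INR architecture; layer sizes are assumptions): a small MLP is overfit to one image so that its weights become the compressed representation, and decoding is just a forward pass over the pixel coordinates. NeRV follows the same principle for videos but conditions on the frame time index instead of pixel coordinates.

```python
# Minimal image-INR sketch (illustrative only; sizes and training schedule are assumptions).
import torch
import torch.nn as nn

class ImageINR(nn.Module):
    """MLP that maps normalized pixel coordinates (x, y) to RGB values."""
    def __init__(self, hidden=64, depth=3):
        super().__init__()
        dims = [2] + [hidden] * depth + [3]
        layers = []
        for i in range(len(dims) - 1):
            layers.append(nn.Linear(dims[i], dims[i + 1]))
            if i < len(dims) - 2:
                layers.append(nn.ReLU())
        self.net = nn.Sequential(*layers)

    def forward(self, coords):                  # coords: (N, 2) in [0, 1]
        return torch.sigmoid(self.net(coords))  # (N, 3) RGB in [0, 1]

def encode(image, steps=2000, lr=1e-3):
    """Overfit an INR to one image (float tensor of shape (H, W, 3) in [0, 1]);
    the trained weights are the compressed image."""
    h, w, _ = image.shape
    ys, xs = torch.meshgrid(torch.linspace(0, 1, h),
                            torch.linspace(0, 1, w), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
    target = image.reshape(-1, 3)
    model = ImageINR()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(coords), target)
        loss.backward()
        opt.step()
    return model  # storing/transmitting these weights replaces the JPEG
```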
Limitation of Rapid-INR: Low Object Quality
[Figure: Rapid-INR reconstruction compared with the raw JPEG: background PSNR 33.89, object PSNR 21.61]
Solution: Residual-INR
Use two INRs to encode the background and the object separately
[Figure: a background INR (24 KB) plus an object residual INR (6 KB) needs less weight storage than a large Rapid-INR encoding the full image at high quality (45 KB), yet delivers higher object quality: background PSNR 34.61 and object PSNR 35.73 for Residual-INR versus background PSNR 34.61 and object PSNR 30.21 for the large Rapid-INR]
Solution: Residual-INR
Step 1: encode the whole image
The background INR and the object residual INR are encoded in two separate steps
Use existing INR encoding methods, but choose a smaller INR size
Solution: Residual-INR
Step 2: encode the object
The object residual INR is an "add-on" to the background INR: it encodes only what the background INR misses in the object region (see the sketch below)
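A hedged sketch of this step is shown below; function and variable names are illustrative, not from the paper. The background INR is frozen, the tiny object residual INR is fit only to the difference between the raw object pixels and the background INR's prediction inside the object region, and decoding adds the two outputs back together.

```python
# Hedged sketch of residual encoding for the object region (names are illustrative).
import torch
import torch.nn as nn

def encode_object_residual(background_inr, object_coords, object_rgb,
                           hidden=16, steps=1000, lr=1e-3):
    """Fit a tiny INR to the residual the background INR leaves on the object."""
    background_inr.eval()
    with torch.no_grad():
        residual = object_rgb - background_inr(object_coords)  # what the B-INR missed
    object_inr = nn.Sequential(                                 # tiny "add-on" MLP
        nn.Linear(2, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, 3),
    )
    opt = torch.optim.Adam(object_inr.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(object_inr(object_coords), residual)
        loss.backward()
        opt.step()
    return object_inr

def decode(background_inr, object_inr, coords, object_mask):
    """Decode the full image: background everywhere, plus the residual on the object."""
    rgb = background_inr(coords)                     # (N, 3)
    rgb = rgb + object_mask.unsqueeze(-1) * object_inr(coords)  # mask: (N,) in {0, 1}
    return rgb.clamp(0, 1)
```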
Rationale: Why Residual Encoding?
[Figure: direct encoding vs. residual encoding (ours)]
Direct encoding cannot utilize the information already learned by the background INR
Rationale: Why Residual Encoding?
[Figure: the raw object RGB has high entropy; the residual RGB has low entropy]
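As a quick illustration of this entropy argument (the paper's exact measurement may differ), one can compare the Shannon entropy of the raw object crop with that of the residual left after subtracting the background INR's reconstruction; a lower-entropy signal is easier to fit with a small INR. The input arrays below are assumed to be provided by the user.

```python
# Illustrative entropy check: raw object crop vs. its residual against the background INR.
import numpy as np

def shannon_entropy(img_u8: np.ndarray) -> float:
    """Entropy (bits/pixel) of an 8-bit image, estimated from its intensity histogram."""
    hist, _ = np.histogram(img_u8, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def residual_u8(object_rgb: np.ndarray, background_pred: np.ndarray) -> np.ndarray:
    """Shift the signed residual into [0, 255] so both images use the same scale."""
    res = object_rgb.astype(np.int16) - background_pred.astype(np.int16)
    return ((res + 255) // 2).astype(np.uint8)

# Typical outcome (illustrative): entropy(raw object) > entropy(residual).
# print(shannon_entropy(object_rgb), shannon_entropy(residual_u8(object_rgb, background_pred)))
```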
Challenge of Residual-INR: Imbalanced Decoding
Background and object INRs of different sizes are used to encode images and objects of different sizes, which makes their decoding times imbalanced
Solution in Residual-INR: INR Grouping
[Figure: decoding without grouping is slow; decoding with INR grouping is fast]
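The slides do not spell out the grouping algorithm, so the sketch below is only one plausible reading, stated as an assumption: INRs that share the same architecture are packed into a group so the whole group can be decoded in one batched forward pass instead of many small sequential passes, which balances and speeds up decoding.

```python
# Hedged sketch: batched decoding of a group of same-architecture INRs.
# Weight tensors for G images are stacked so each layer becomes one batched matmul.
import torch

def batched_decode(weights, biases, coords):
    """
    weights: per-layer tensors, layer i shaped (G, d_in_i, d_out_i)
    biases:  per-layer tensors, layer i shaped (G, 1, d_out_i)
    coords:  (G, N, 2) normalized pixel coordinates for the G images
    returns: (G, N, 3) decoded RGB
    """
    x = coords
    for i, (w, b) in enumerate(zip(weights, biases)):
        x = torch.bmm(x, w) + b          # one matmul for the whole group
        if i < len(weights) - 1:
            x = torch.relu(x)
    return torch.sigmoid(x)
```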
Advantage of Residual-INR: Simple Software
Residual-INR does not require complex software stack support
Advantage of Residual-INR in Fog Computing
[Figure: fog computing with hybrid JPEG and INR communication: INR encoding runs on the fog node and INR decoding runs on the edge; since an INR is much smaller than a JPEG (INR << JPEG), communication is reduced]
We develop a mathematical model of communication cost to explore training strategies (training at the edge versus at the fog node); a hedged sketch follows
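The exact model is in the paper; the sketch below only illustrates the kind of trade-off it captures. The data-flow assumptions and all constants except the ~2 MB/s link, the ~30 KB Residual-INR size, and the 98.8 MB YOLOv8m weights (which come from earlier slides) are made up for illustration.

```python
# Hedged sketch of a communication model (the paper's actual formulation may differ).
# It compares bytes moved per fine-tuning round under two assumed strategies:
#   edge training: the fog node sends the shared images down as INRs;
#   fog training:  the edge uploads its new JPEGs and downloads the updated model.

def bytes_edge_training(n_shared, inr_bytes):
    return n_shared * inr_bytes                      # INR downlink only

def bytes_fog_training(n_new, jpeg_bytes, model_bytes):
    return n_new * jpeg_bytes + model_bytes          # JPEG uplink + model downlink

if __name__ == "__main__":
    bandwidth = 2e6          # ~2 MB/s wireless link (earlier slide)
    inr = 30e3               # ~30 KB per image with Residual-INR (24 KB + 6 KB)
    jpeg = 150e3             # assumed average JPEG size
    yolo = 98.8e6            # YOLOv8 medium weights
    for n in (100, 1000, 10000):
        edge, fog = bytes_edge_training(n, inr), bytes_fog_training(n, jpeg, yolo)
        better = "edge" if edge < fog else "fog node"
        print(f"n={n}: edge {edge/bandwidth:.0f}s, fog {fog/bandwidth:.0f}s -> train at {better}")
```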
Experiments
Task: single-object detection fine-tuning
Detection backbone: YOLOv8 medium (98.8 MB)
Datasets: DAC-SDC, UAV123, OTB100 (multiple video sequences stored in JPEG format)
Hardware: an Intel Xeon Gold 6226R CPU and an NVIDIA A6000 GPU simulate the edge computing device for image decoding and training latency measurements
Case study: apply Residual-INR to Rapid-INR (images) and NeRV (videos); a minimal fine-tuning sketch follows
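For reference, a minimal fine-tuning sketch with the public Ultralytics YOLOv8 API is shown below; the dataset YAML path and hyperparameters are placeholders, not the paper's exact configuration.

```python
# Hedged sketch of the detection fine-tuning setup (placeholder config values).
from ultralytics import YOLO

model = YOLO("yolov8m.pt")        # YOLOv8 medium backbone (~98.8 MB of weights)
model.train(
    data="dac_sdc.yaml",          # hypothetical dataset config for DAC-SDC
    epochs=10,                    # placeholder schedule
    imgsz=640,
    device=0,                     # single GPU, e.g. the A6000 used for simulation
)
metrics = model.val()             # reports mAP50-95, the accuracy metric on later slides
```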
Improved Object Reconstruction Quality
Legend: BS = baseline size; BE = background INR size encode; DE = B-INR + object INR directly encoding raw RGB; RE = B-INR + object INR encoding residual RGB; QRE = quantize the B-INR to 8 bits and the object INR to 16 bits; low-quality JPEG uses quality = 0.5
[Figure: results for Rapid-INR and NeRV, reporting object reconstruction quality and average image size for each encoding variant]
Reduced Average Image Size
Baseline INR: quantized to 16 bits
Legend: same as on the previous slide (BS, BE, DE, RE, QRE, low-quality JPEG with quality = 0.5)
[Figure: average image size for Rapid-INR and NeRV under each encoding variant; the average image size is reduced]
Training Accuracy Loss
[Figure: for JPEG, Rapid-INR, NeRV, Res-Rapid-INR (ours), and Res-NeRV (ours), solid lines show the amount of data transferred if training at the fog node and dashed lines show accuracy (mAP50-95); the plot marks the regions where it is better to train at the edge versus at the fog node]
Reduced Data Transmission
[Figure: the same comparison as the previous slide, highlighting the amount of data transmission saved by Res-Rapid-INR and Res-NeRV]
Decoding and Communication Latency
Summary of Different Compression Techniques
| | JPEG | Residual-INR |
| --- | --- | --- |
| Object quality | | |
| Detection accuracy | | |
| Storage | | |
| Communication | | |
| Decoding speed | | |

Rating scale: Good / Fair / Bad
Summary & Thanks