1 of 23

Residual-INR: Communication Efficient On-Device Learning Using Implicit Neural Representation

Hanqiu Chen1, Xuebin Yao2, Pradeep Subedi2 and Cong (Callie) Hao1

1Georgia Institute of Technology

2Samsung Semiconductor, Inc.

2 of 23

Background: On-device Learning @ Edge


Example applications: wildlife monitoring, agriculture, underwater.

Why on-device learning?

  • The environment changes, so the model needs to update

  • Devices are deployed in the wild and are hard to bring back

Hanqiu Chen | Sharc-lab @ Georgia Institute of Technology

3 of 23

Challenge: Slow Communication


Edge devices need to share and synchronize newly collected data. But communication using JPEG is expensive and slow: in serverless edge computing, every device exchanges JPEG images over a slow wireless link (~2 MB/s).


4 of 23

Opportunity: Using Implicit Neural Representation


Implicit neural representation (INR) can be used to compress images/videos to reduce communication.

Similarly, it can be applied to videos using the frame time index as input (NeRV @ NeurIPS'21)

Rapid-INR @ ICCAD’23
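For intuition, an INR is a small network that maps a pixel coordinate to its RGB value, so the image is "stored" as the network weights. A minimal sketch (the layer sizes and sine activation are illustrative assumptions, not the exact Rapid-INR architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 64)), np.zeros(64)   # coords -> hidden
W2, b2 = rng.normal(size=(64, 3)), np.zeros(3)    # hidden -> RGB

def inr_decode(h, w):
    """Decode an h x w RGB image from the INR weights alone."""
    ys, xs = np.mgrid[0:h, 0:w]
    # Normalize pixel coordinates to [-1, 1].
    coords = np.stack([xs.ravel() / (w - 1), ys.ravel() / (h - 1)], axis=1) * 2 - 1
    hidden = np.sin(coords @ W1 + b1)              # SIREN-style sine activation
    rgb = 1 / (1 + np.exp(-(hidden @ W2 + b2)))    # sigmoid keeps output in [0, 1]
    return rgb.reshape(h, w, 3)
```

Training fits the weights so that `inr_decode` reproduces a target image; transmitting the weights then replaces transmitting pixels.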

1. Chen, Hanqiu, Hang Yang, Stephen Fitzmeyer, and Cong Hao. "Rapid-INR: Storage Efficient CPU-Free DNN Training Using Implicit Neural Representation." In 2023 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2023.

2. Chen, Hao, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, and Abhinav Shrivastava. "NeRV: Neural Representations for Videos." Advances in Neural Information Processing Systems 34 (2021).


5 of 23

Limitation of Rapid-INR: Low Object Quality


  • Background PSNR (33.89) is higher than object PSNR (21.61)
  • But objects are what we care about!
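The object/background quality split can be measured with a masked PSNR. A small sketch (the mask here is a hypothetical object mask; images are assumed to be in [0, 1]):

```python
import numpy as np

def masked_psnr(ref, recon, mask):
    """PSNR (dB) computed only over pixels selected by a boolean mask."""
    mse = np.mean(((ref - recon)[mask]) ** 2)
    return 10 * np.log10(1.0 / mse)

# Toy example: reconstruction error is larger inside the object region.
ref = np.zeros((8, 8, 3))
recon = ref.copy()
obj = np.zeros((8, 8), dtype=bool)
obj[2:6, 2:6] = True
recon[obj] += 0.10    # larger error on the object
recon[~obj] += 0.02   # smaller error on the background
```

Here the object region scores a lower PSNR than the background, the exact failure mode the slide points out.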


6 of 23

Solution: Residual-INR


Use two INRs to separately encode the background and the object

Background INR (24 KB) + object residual INR (6 KB): reduced weight storage, smaller than the large Rapid-INR needed for a high-quality full image (45 KB).

Residual-INR reaches background PSNR 34.61 and object PSNR 35.73, versus object PSNR 30.21 at background PSNR 34.61 for Rapid-INR: higher object quality.

7 of 23

Solution: Residual-INR


Step 1: encode the whole image

The background INR and object residual INR are encoded in two separate steps.

Use existing INR encoding methods, but choose a smaller INR size.


8 of 23

Solution: Residual-INR


Step 2: encode the object

The object residual INR is an "add-on" on top of the background INR: it encodes the residual between the raw object and the background INR's reconstruction.
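Concretely, the residual target the object INR fits is the difference between the raw object crop and the background INR's reconstruction of the same region, and decoding adds it back. A sketch (function names are ours, not the paper's API):

```python
import numpy as np

def residual_target(raw_obj, bg_recon_obj):
    """What the small object residual INR is trained to reproduce."""
    return raw_obj - bg_recon_obj          # values in [-1, 1]

def decode_object(bg_recon_obj, residual):
    """Final object reconstruction = background INR output + residual."""
    return np.clip(bg_recon_obj + residual, 0.0, 1.0)

# Toy data: an imperfect background reconstruction of a raw object crop.
rng = np.random.default_rng(1)
raw = rng.uniform(0.0, 1.0, size=(4, 4, 3))
bg = np.clip(raw + rng.normal(0.0, 0.05, raw.shape), 0.0, 1.0)
```

With a perfectly fitted residual, the object is recovered exactly; in practice the residual INR only approximates it, but the approximation error is all that remains.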


9 of 23

Rationale: Why Residual Encoding?

9

Comparison of direct encoding and residual encoding (ours):

Direct encoding cannot utilize the information already learned by the background INR.


10 of 23

Rationale: Why Residual Encoding?


Raw object RGB is a high-entropy signal; the residual left after subtracting the background INR's reconstruction has low entropy, which makes it easier to encode with a small INR.
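The entropy argument can be sanity-checked numerically: a spread-out raw signal needs more bits per symbol than a residual concentrated near zero. A toy sketch with synthetic distributions (not data from the paper):

```python
import numpy as np

def entropy_bits(values, bins=256):
    """Shannon entropy (bits/symbol) of a histogram of pixel values."""
    hist, _ = np.histogram(values, bins=bins, range=(-1.0, 1.0))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
raw = rng.uniform(0.0, 1.0, size=10_000)         # raw object RGB: spread out
residual = rng.normal(0.0, 0.02, size=10_000)    # residual: concentrated near 0
```

On this synthetic example the raw signal's histogram entropy is several bits higher than the residual's, which is exactly why the residual fits into a much smaller INR.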


11 of 23

Challenge of Residual-INR: Imbalanced Decoding


Background and object INRs of different sizes are used to encode images and objects of different sizes, which makes per-INR decoding workloads uneven.


12 of 23

Solution in Residual-INR: INR Grouping

12


13 of 23

Solution in Residual-INR: INR Grouping


Without INR grouping: slow decoding. With grouping: fast, balanced decoding.
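One way to realize such grouping is a greedy longest-processing-time heuristic: sort INRs by decoding cost and always assign the next one to the currently lightest group. This is an illustrative sketch of the idea, not necessarily the paper's exact scheduler:

```python
def group_inrs(decode_costs, n_groups):
    """Partition INR indices into n_groups with roughly equal total decode cost."""
    groups = [[] for _ in range(n_groups)]
    loads = [0.0] * n_groups
    # Heaviest INRs first, each placed on the currently lightest group.
    for idx, cost in sorted(enumerate(decode_costs), key=lambda t: -t[1]):
        g = loads.index(min(loads))
        groups[g].append(idx)
        loads[g] += cost
    return groups, loads
```

For example, costs [8, 7, 6, 5, 4] split into two groups land at loads 17 and 13, so neither decoding stream sits idle while the other straggles.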


14 of 23

Advantage of Residual-INR: Simple Software


Does not need complex software stack support


15 of 23

Advantage of Residual-INR in Fog Computing


Fog computing: reduced communication with hybrid JPEG and INR.

INR encoding runs on the fog node; INR decoding runs on the edge devices. The INR payload is much smaller than JPEG, so communication is reduced.

We develop a mathematical model for communication modeling and training strategy exploration
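A toy instance of such a model: transfer time is payload size over bandwidth, so the JPEG-vs-INR choice scales it by the compression ratio. The per-image sizes below are hypothetical placeholders; only the ~2 MB/s wireless bandwidth comes from the earlier slide:

```python
def transfer_time_s(num_images, bytes_per_image, bandwidth_bytes_per_s):
    """Seconds spent transmitting a batch of encoded images."""
    return num_images * bytes_per_image / bandwidth_bytes_per_s

BANDWIDTH = 2_000_000                                 # ~2 MB/s wireless link
jpeg_t = transfer_time_s(1000, 150_000, BANDWIDTH)    # hypothetical 150 KB JPEGs
inr_t = transfer_time_s(1000, 25_000, BANDWIDTH)      # hypothetical 25 KB INR payloads
```

Sending INR weights instead of JPEG cuts transfer time in proportion to payload size (here 75 s vs. 12.5 s); the full model additionally weighs encoding/decoding placement and where training runs.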


16 of 23

Experiments


Task: single-object detection fine-tuning

Detection backbone: YOLOv8 medium (98.8 MB)

Datasets: DAC-SDC, UAV123, OTB100 (multiple video sequences stored in JPEG format)

Hardware: Intel 6226R CPU and NVIDIA A6000 GPU, used to simulate the edge computing device for image decoding and training latency measurements

Case study: apply Residual-INR to Rapid-INR (images) and NeRV (video)


17 of 23

Improved Object Reconstruction Quality


  • Compared with encoding at the background INR size:
    • Using a small object INR for object encoding brings a 9.24–18.71 dB PSNR improvement

Legend — BS: baseline size; BE: background-INR-size encoding; DE: B-INR + object INR directly encoding raw RGB; RE: B-INR + object INR encoding residual RGB; QRE: B-INR quantized to 8 bits and object INR to 16 bits; low-quality JPEG (quality = 0.5).

(Figure: object PSNR and average image size for Rapid-INR and NeRV.)


18 of 23

Reduced Average Image Size


  • With image sizes ranging from 8.3% to 18.4% of the original JPEG, Residual-INR achieves a PSNR over 38 dB, closely approximating the quality of the raw RGB

Baseline INR: quantized to 16 bits. (Same legend as the previous slide; figure: average image size for Rapid-INR and NeRV, showing the reduction.)


19 of 23

Training Accuracy Loss


  • With the same number of images transferred for training:
    • Residual-INR-compressed images cause only a marginal accuracy reduction compared with JPEG

(Figure: amount of data transferred if training at the fog node, for JPEG, Rapid-INR, NeRV, Res-Rapid-INR (ours), and Res-NeRV (ours). Solid lines: amount of data transferred; dashed lines: accuracy (mAP50-95). Regions mark where it is better to train at the edge versus at the fog node.)


20 of 23

Reduced Data Transmission


  • With the same number of images transferred for training:
    • Residual-INR requires only 12%–19% of the communication of JPEG

(Figure: same setup as the previous slide, highlighting the amount of data transmission saved.)


21 of 23

Decoding and Communication Latency


  • Compared with PyTorch single-thread CPU decoding:
    • Res-Rapid-INR speedup: up to 2.9×
    • Res-NeRV speedup: up to 2.25×
  • Compared with DALI GPU-accelerated decoding:
    • Res-Rapid-INR speedup: up to 1.77×
    • Res-NeRV speedup: up to 1.38×


22 of 23

Different Compression Techniques Summary


(Table: JPEG versus Residual-INR, each rated Good / Fair / Bad on object quality, detection accuracy, storage, communication, and decoding speed.)


23 of 23

Summary & Thanks


  • Residual-INR is a region-importance-aware INR encoding framework that focuses on object quality and reduces the communication cost of edge on-device learning.

    • System – efficient fog online learning with hybrid JPEG-INR communication

    • Algorithm – versatile region-importance-aware INR compression for reduced encoding redundancy

    • Hardware – CPU-free INR decoding with workload balancing

    • Mathematical modeling – an analytical model for finding the optimal communication strategy

    • Performance – storage efficient, reduced communication, and fast CPU-free decoding
