1 of 22

Review of HW&SW Deep-Learning projects in RTC-Seville

Alejandro Linares Barranco

2 of 22

Neuromorphic Processor Project (NPP)

2015-2018 phase1, 2018-2020 phase2

3 of 22

NPP Project Goals*

  1. To develop the theory and practice of training reduced-precision ANNs (analog deep neural networks) and accurate SNNs (spiking neural networks) with lower computational costs
  2. To develop ANN and SNN hardware architectures aimed at SoC integration
  3. To develop live demonstrations of visual and auditory processing by the ANN and SNN processors running on FPGAs

In NPP, ANNs and SNNs have been combined to implement convolutional and recurrent neural network theory and hardware that use sparse, change-driven computing.

* From kickoff meeting May 2015

4 of 22

Y Bengio, M Courbariaux

J Seo

R Manohar

T Delbruck, SC Liu, G Indiveri

F Corradi

ES Shim, SJ Suh, JH Lee

B Barranco, A Linares

NPP Team

5 of 22

MNIST on Stratix-V DE5 platform with Altera OpenCL SDK

Implementation and Performance study

Alejandro (USE) & Jae-sun Seo (ASU)

6 of 22

MNIST ConvNet architecture under evaluation. Similar to LeNet, proposed by LeCun in 1998.

28x28 -> 20@24x24 -> 20@12x12 -> 50@8x8 -> 50@4x4 -> 500 -> 10
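The feature-map sizes above can be reproduced with a short shape calculation. This is a minimal sketch, assuming 5x5 convolutions with stride 1 and no padding, and 2x2 pooling with stride 2. The slide does not state the kernel sizes; these values are an assumption chosen because they match the listed sizes. The names `conv_out`, `LAYERS` and `lenet_shapes` are illustrative only.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output width/height of a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, output maps, kernel, stride). Kernel sizes and strides are assumptions.
LAYERS = [
    ("conv1", 20, 5, 1),
    ("pool1", 20, 2, 2),
    ("conv2", 50, 5, 1),
    ("pool2", 50, 2, 2),
]


def lenet_shapes(input_size=28):
    """Return (name, maps, height, width) for each conv/pool stage."""
    size = input_size
    shapes = []
    for name, maps, kernel, stride in LAYERS:
        size = conv_out(size, kernel, stride)
        shapes.append((name, maps, size, size))
    return shapes


if __name__ == "__main__":
    for name, maps, h, w in lenet_shapes():
        print(f"{name}: {maps}@{h}x{w}")
    flat = 50 * 4 * 4  # inputs to the first fully connected layer
    print(f"fc1: {flat} -> 500, fc2: 500 -> 10")
```

Under these assumptions the flattened 50@4x4 output gives 800 inputs to the 500-unit layer.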

7 of 22

MNIST ConvNet architecture from Caffe http://caffe.berkeleyvision.org/

Caffe exports OpenCL for GP-GPUs. Easy to adapt to FPGAs.

8 of 22

Developed with AOCL in 2015

Today, the OpenVINO Toolkit works with Caffe & TensorFlow

9 of 22

10 of 22

Demo with MNIST and Altera DE5 platform

  • ~34ms / frame (~11ms on FPGA)
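As a quick reading of these figures, the arithmetic below converts the latencies into frame rates. It is a sketch that assumes the ~11 ms on the FPGA is part of the ~34 ms per frame, which the slide suggests but does not state explicitly; `frames_per_second` is an illustrative helper.

```python
def frames_per_second(latency_ms):
    """Convert a per-frame latency in milliseconds to a frame rate."""
    return 1000.0 / latency_ms


total_ms = 34.0  # approximate end-to-end latency per frame (from the slide)
fpga_ms = 11.0   # approximate time spent on the FPGA (from the slide)

# Assumption: the FPGA time is included in the end-to-end time.
host_ms = total_ms - fpga_ms

if __name__ == "__main__":
    print(f"end-to-end rate: {frames_per_second(total_ms):.1f} frames/s")
    print(f"FPGA-only bound: {frames_per_second(fpga_ms):.1f} frames/s")
    print(f"non-FPGA time:   {host_ms:.0f} ms ({host_ms / total_ms:.0%} of the frame time)")
```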

11 of 22

NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps

Roshambo demonstration on Xilinx PSoC

Alejandro Linares-Barranco

Antonio Rios-Navarro

IEEE TNNLS early access: https://doi.org/10.1109/TNNLS.2018.2852335

12 of 22

RoShamBo training images

13 of 22

RoShamBo CNN architecture

Input: 240x180 DVS “frames”
64x64 DVS 2D rectified histogram of 2k events (0.1 Hz to 200 Hz rate)
Conv 5x5 -> 16x60x60
MaxPool 2x2 -> 16x30x30
Conv 3x3 -> 32x28x28
MaxPool 2x2 -> 32x14x14
Conv 3x3 -> 64x12x12
MaxPool 2x2 -> 64x6x6
Conv 3x3 -> 128x4x4
MaxPool 2x2 -> 128x2x2
Conv 1x1 + MaxPool 2x2 -> 128x1x1
Outputs: Paper, Scissors, Rock, Background

Total 18MOp (~9M MAC)
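The DVS input stage can be sketched as follows: a fixed number of events is accumulated into a 64x64 count map. This is only an illustration, not the project's code. It assumes that "rectified" means polarity is ignored (ON and OFF events both count +1) and that the 240x180 sensor is mapped to 64x64 by simple integer scaling; the slide states neither. The function name `events_to_histogram` and the event tuple layout are hypothetical.

```python
def events_to_histogram(events, sensor_w=240, sensor_h=180, out_size=64, n_events=2000):
    """Accumulate up to n_events DVS events into an out_size x out_size count map.

    events: iterable of (x, y, polarity) tuples. Polarity is ignored, so ON and
    OFF events both add +1 (an assumed reading of "rectified").
    """
    hist = [[0] * out_size for _ in range(out_size)]
    used = 0
    for x, y, _polarity in events:
        if used == n_events:
            break
        col = x * out_size // sensor_w  # assumed subsampling: integer scaling
        row = y * out_size // sensor_h
        hist[row][col] += 1
        used += 1
    return hist


if __name__ == "__main__":
    # Synthetic events along a diagonal, alternating polarity.
    demo = [(i % 240, i % 180, 1 if i % 2 else -1) for i in range(2500)]
    frame = events_to_histogram(demo)
    print(sum(map(sum, frame)))  # number of events accumulated in the frame
```

Because each frame holds a fixed event count, the frame rate follows the scene activity, which is consistent with the variable 0.1 Hz to 200 Hz rate quoted on the slide.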

Conventional 4-layer LeNet with ReLU/MaxPool and 1 FC layer before output.
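The quoted total of about 9M MACs can be checked from the layer sizes on this slide. The sketch below assumes a single-channel 64x64 input, stride-1 convolutions without padding, a 2x2 max pool after every convolution, and a final fully connected layer from 128 features to the 4 output classes. The slide does not spell out all of these details, so treat them as assumptions; `CONV_LAYERS` and `roshambo_macs` are illustrative names.

```python
# (name, kernel, input maps, output maps), read from the slide's layer sizes.
CONV_LAYERS = [
    ("conv1", 5, 1, 16),
    ("conv2", 3, 16, 32),
    ("conv3", 3, 32, 64),
    ("conv4", 3, 64, 128),
    ("conv5", 1, 128, 128),
]


def roshambo_macs(input_size=64, n_classes=4):
    """Return (name, output maps, output size, MACs) per layer."""
    size = input_size
    report = []
    for name, k, c_in, c_out in CONV_LAYERS:
        size = size - k + 1  # valid convolution, stride 1 (assumed)
        macs = k * k * c_in * c_out * size * size
        report.append((name, c_out, size, macs))
        size //= 2           # 2x2 max pooling after each convolution
    fc_macs = CONV_LAYERS[-1][3] * size * size * n_classes
    report.append(("fc", n_classes, 1, fc_macs))
    return report


if __name__ == "__main__":
    total = 0
    for name, maps, size, macs in roshambo_macs():
        total += macs
        print(f"{name}: {maps}x{size}x{size}, {macs} MACs")
    print(f"total: {total / 1e6:.2f} M MACs, {2 * total / 1e6:.1f} MOp")
```

Counting each MAC as two operations (one multiply, one add) gives the 18 MOp figure.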

14 of 22

CNN Hw accelerator: NullHop main features

1. Compressed layers are stored using sparsity map

2. Pixels are loaded only once per 128 output feature maps

3. Pixel & channel ordering scales to arbitrary image size

4. Zero pixel MACs are completely skipped

5. Kernels for layer are loaded to MAC SRAM banks

6. Controllers cluster MAC units for 16-128 output maps/pass

7. Output maps are computed in parallel (up to 128)

8. 2x2 max pooling “on the fly” cuts external DRAM writes by 4X

9. Compressed layer is written out ready for being streamed back
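The sparsity-map storage and zero skipping listed among these features can be illustrated in software. This is a simplified sketch, not the NullHop data format: it assumes one bit per pixel marking non-zero activations, followed by a list of the non-zero values only. The function names are hypothetical.

```python
def compress(pixels):
    """Split activations into a 1-bit-per-pixel sparsity map and the non-zero values."""
    sparsity_map = [1 if p != 0 else 0 for p in pixels]
    nonzero = [p for p in pixels if p != 0]
    return sparsity_map, nonzero


def decompress(sparsity_map, nonzero):
    """Rebuild the dense activation list from the sparsity map and values."""
    values = iter(nonzero)
    return [next(values) if bit else 0 for bit in sparsity_map]


def sparse_dot(sparsity_map, nonzero, weights):
    """Dot product that performs a MAC only where the sparsity map holds a 1."""
    values = iter(nonzero)
    acc, macs = 0, 0
    for bit, w in zip(sparsity_map, weights):
        if bit:
            acc += next(values) * w
            macs += 1
    return acc, macs


if __name__ == "__main__":
    pixels = [0, 3, 0, 0, 5, 0, 2, 0]  # e.g. a ReLU output with many zeros
    weights = [1, 2, 3, 4, 5, 6, 7, 8]
    smap, vals = compress(pixels)
    acc, macs = sparse_dot(smap, vals, weights)
    print(smap, vals)
    print(f"result {acc} using {macs} MACs instead of {len(pixels)}")
```

The saving grows with the fraction of zero activations, which is why ReLU outputs suit this representation.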

15 of 22

FPGA infrastructure for NullHop testing and demonstration

Xilinx Zynq-7100 PSoC on an Avnet MMP module. Motherboard developed by RTC.

16 of 22

Demo latencies of Roshambo CNN on NullHop. NIPS-2016

17 of 22

18 of 22

Deep-Learning applied to diagnostic assistance on Prostate Cancer

Healthy tissue (TS)

Gleason Grade 3 (G3)

Gleason Grade 4 (G4)

Gleason Grade 5 (G5)

WSI: Whole Slide Tissue Image

19 of 22

CNN-based methodology

Dataset collected from real tissue from Valme Hospital:

WSI -> RoI extractions (TS, G3, G4 and G5) -> Resize -> Artificial image processing for dataset increment -> Dataset

Artificial image processing for dataset increment:

  I. Rotations
  II. Brightness changes
  III. Focus changes

Dataset:

  • TRAIN: X processed images + labels file
  • TEST: Y processed images (from sources different from those of X) + labels file

Always in process!
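The three augmentation types named on this slide (rotations, brightness changes, focus changes) can be sketched on a small grayscale image stored as a list of rows. This is an illustration under assumptions, not the project's pipeline: rotations are limited to multiples of 90 degrees, the brightness factors are arbitrary, and a 3x3 mean filter stands in for a focus change. All function names are hypothetical.

```python
def rotate90(img):
    """Rotate a 2D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def change_brightness(img, factor):
    """Scale pixel values by factor, clamped to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def box_blur(img):
    """3x3 mean filter with edge clamping, a crude stand-in for a focus change."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [
                img[min(h - 1, max(0, r + dr))][min(w - 1, max(0, c + dc))]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            row.append(round(sum(window) / 9))
        out.append(row)
    return out


def augment(img):
    """Return the original image plus rotated, brightness-shifted and blurred variants."""
    variants = [img]
    rotated = img
    for _ in range(3):
        rotated = rotate90(rotated)
        variants.append(rotated)
    variants.append(change_brightness(img, 1.2))  # assumed factors
    variants.append(change_brightness(img, 0.8))
    variants.append(box_blur(img))
    return variants


if __name__ == "__main__":
    roi = [[10, 200], [60, 120]]
    print(len(augment(roi)), "images from one RoI")
```

Every variant keeps the label of its source RoI, so the labels file grows alongside the image set.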

20 of 22

INPUT -> CNN (LeNet, AlexNet, ResNet…) -> OUTPUT

21 of 22

Preliminary results

Architecture   Image size   # Images   Pre-processing?   Artificial dataset increment?   Accuracy
LeNet          100x100      17,022     No                No                              ~68%
VGG19          100x100      17,022     No                No                              ~74%

22 of 22

Other on-going projects:

  • Apply FPGA CNN accelerators to mobile robot navigation
  • Breast cancer: modified published CNNs (ResNet-50 and others)
  • Brain glioma: SPP-Net (Spatial Pyramid Pooling) with Fuzzy C-means pre-processing
  • Polyp detection in colonoscopy: Faster R-CNN, MICCAI’18 challenge dataset
  • Glaucoma: U-Net, public dataset