1 of 22

Atsumi Niwa

Event-based Vision Sensor and On-chip Processing Development

CVPR 2023 Workshop on Event-based Vision

To spark imaginations and enrich society through the power of technology.

2 of 22

Outline

Introduction

HW development

Event processing development

Sony Semiconductor Solutions Corporation


3 of 22

Network of Operations on Event-based Vision Development

Hardware & Software R&D/Production

Sony Semiconductor Solutions

Kanagawa, Japan

Sensor Fabrication

Sony Semiconductor Manufacturing

Kumamoto, Japan

Hardware & Software R&D

Sony Advanced Visual Sensing

Zurich, Switzerland

Business Development

Ongoing with customers


4 of 22

Sony’s EVS Development History

Sony delivered its first high-quality EVS product in 2022.

Sony is also continuously expanding the market.

EVS Gen.1

EVS Gen.2

EVS Gen.3


5 of 22

【EVS Gen.1】 for Industry and IoT Market[1]

Feature

  • Sony’s first EVS with BI-Stacked Process
  • Production available : IMX636/637/646/647
  • Asynchronous readout architecture

Specification (production)

  • Resolution : 1280 x 720
  • Pixel size : 4.86μm□
  • Max event rate : 1.066Geps
  • Time resolution : 1μs
  • Latency : <1ms (5lux)
  • Power Consumption : 166mW (1.8kHz light w/o ESP)
  • Data IF : MIPI, SLVS w/ EVT2.1, 3.0
  • ESP : Spatio-Temporal Contrast Filter, Event Rate Controller

Arbiter

T. Finateu et al., ISSCC, 2020


6 of 22

【EVS Gen.2】 for Wearable Device[2]

Feature

  • Synchronous event readout mode
  • AFE sharing technique 
  • Low noise intensity readout mode
  • High sensitivity at IR light

Specification

  • Resolution : 640 x 640
  • Pixel size : 2.97μm□
  • Max event rate : 1.7kfps
  • Latency : 500μs (2.2μW/cm2)
  • Power Consumption : 73mW(1.4Geps)
  • Intensity FWC : 10600e-
  • Intensity Random Noise : 2.6e-
  • Data IF : standard MIPI-CSI2
  • Processing : Anti-flicker, Auto threshold control

Row driver

A. Niwa et al., ISSCC, 2023


7 of 22

【EVS Gen.3】 for Smart Phone[3]

Feature

  • RGB Hybrid EVS
  • Adaptive event frame rate control
  • Adaptive event data compression
  • RGB + W(EVS) CF

Specification

  • Resolution : 7680 x 4336 (EVS : 1920 x 1084)
  • Pixel size : 1.22μm□
  • Max event rate : 4kfps
  • Latency : 3.2ms (1lux)
  • Power Consumption : 525mW(Hybrid mode)
  • Intensity FWC : 7700e-
  • Intensity Random Noise : 1.6e-
  • Data IF : MIPI-CSI2(virtual channel for RGB/event)
  • Processing : Event Drop Filter

K. Kodama et al., ISSCC, 2023


8 of 22

Target application

Gen. 1 Async. EVS

Gen. 2 Small EVS

Gen. 3 Hybrid EVS

Deblur

Monitoring

Spark / Liquid

3D Scan

AR/VR

VFI/Super Slow

3D Scan

Industry

Mobile

Imaging

Eye Tracking

Gesture

SLAM

(VLC basis / Visual basis)


9 of 22

Outline

Introduction

HW development

Event processing development


10 of 22

Motivation for signal processing development

The purpose is to improve contrast reliability

[Figure: event output w/ an appropriate threshold (high event reliability) and w/o an appropriate threshold (low event reliability). Cases shown: flicker and random noise, w/ a low threshold and w/ too high a threshold; annotations: "still flickering", "still noisy", "ATC (ISSCC2023)".]


11 of 22

Summary of processing

Suppression of “random noise” and “flicker” has been developed

Target       | Function           | Algorithm                                    | Symbol
Random noise | Denoise            | 2D area based isolated event filter[4]       | A-1
             |                    | 2D TAP based isolated event filter[4]        | A-2
             |                    | 3D area based isolated event filter[4]       | A-3
Flicker      | Anti Flicker       | Filtering based Anti-Flicker                 | B-1
             |                    | Threshold based Anti-Flicker (ISSCC2023)[2]  | B-2
Both         | Object Enhancement | Edge preservation[5]                         | C-1
             |                    | Pseudo-template matching                     | C-2

Equivalent suppression of noise and flicker


12 of 22

Denoise processing

Random noise is distinguished by comparing the event count with a noise threshold

2D area-based

2D TAP-based

3D area-based

[Figure: the EVS views a test scene with no contrast. Axes: event count (event/pix/s) and contrast threshold; the noise threshold is taken from the test scene.]

2D area-based:
1. Define fixed areas.
2. Count the number of events in each area.
3. Is the number of events higher than the threshold?
  YES : not filtering
  NO : filtering
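The three steps above can be sketched as follows. This is a minimal illustration, not the on-chip implementation: the function name `area_filter_2d`, the area size of 4 pixels, and the representation of events as (x, y) pairs collected over one counting window are all assumptions made for the example.

```python
import numpy as np

def area_filter_2d(events, height, width, area=4, thr=2):
    """Keep events whose fixed area holds more than `thr` events.

    events: (N, 2) integer (x, y) coordinates from one counting window.
    `area` (pixels per side) and `thr` are illustrative values.
    """
    events = np.asarray(events, dtype=int).reshape(-1, 2)
    # Step 1: map every event to its fixed, non-overlapping area.
    ax = events[:, 0] // area
    ay = events[:, 1] // area
    # Step 2: count the events that fall in each area.
    n_ay = -(-height // area)  # ceiling division
    n_ax = -(-width // area)
    counts = np.zeros((n_ay, n_ax), dtype=int)
    np.add.at(counts, (ay, ax), 1)
    # Step 3: an event passes only if its area count is higher than thr.
    keep = counts[ay, ax] > thr
    return events[keep]
```

With thr = 2, an area needs at least three events in the window for its events to pass; an isolated event is filtered.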

2D TAP-based:
1. Define a 2D TAP around each event.
2. Count the number of events in each TAP.
3. Is the number of events higher than the threshold?
  YES : not filtering
  NO : filtering
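A comparable sketch for the TAP-based variant is shown below. Unlike the fixed-area version, the window is centered on each event. The name `tap_filter_2d`, the 3x3 TAP size, and the choice to include the event itself in the count are assumptions for illustration only.

```python
import numpy as np

def tap_filter_2d(events, height, width, tap=3, thr=2):
    """Keep events whose surrounding tap x tap window holds more than `thr` events.

    events: (N, 2) integer (x, y) coordinates from one counting window.
    `tap` must be odd; the count includes the event itself (an assumption).
    """
    events = np.asarray(events, dtype=int).reshape(-1, 2)
    r = tap // 2
    # Event-count map, zero-padded by r so TAPs at the border stay in bounds.
    count_map = np.zeros((height + 2 * r, width + 2 * r), dtype=int)
    np.add.at(count_map, (events[:, 1] + r, events[:, 0] + r), 1)
    keep = np.zeros(len(events), dtype=bool)
    for i, (x, y) in enumerate(events):
        # Steps 1 and 2: TAP centered on the event, then count its events.
        n = count_map[y:y + tap, x:x + tap].sum()
        # Step 3: pass the event only if the count is higher than thr.
        keep[i] = n > thr
    return events[keep]
```

Because the TAP follows each event, a cluster that straddles the border of two fixed areas is still counted as one group.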

[Figure: examples of the 2D area-based and 2D TAP-based filters for the case of threshold = 2. Labels: actual contrast (use-case), noise, signal, not filtering, filtering.]

3D area-based: processing is almost the same as “2D area-based”, but an event counting period T is added.

Each area keeps a current event rate (ER), updated with an IIR filter.

[Figure: case of threshold = 2 with IIR blend ratio = 50%. ER values shown along time t: 1.0, 1.5, 3.75, 3.0, 2.5, 1.75.]
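One way to picture the per-area event rate is the sketch below. The exact IIR form used on the slide is not stated, so the update `ER = (1 - blend) * ER + blend * count` is an assumption, as are the class name `AreaFilter3D` and the area size. The example is not meant to reproduce the ER values in the figure.

```python
import numpy as np

class AreaFilter3D:
    """Fixed-area filter with a running event rate (ER) per area."""

    def __init__(self, height, width, area=4, thr=2.0, blend=0.5):
        self.area, self.thr, self.blend = area, thr, blend
        # One ER value per fixed area, starting from zero.
        self.er = np.zeros((-(-height // area), -(-width // area)))

    def step(self, events):
        """Process the events of one counting period T and return the kept ones."""
        events = np.asarray(events, dtype=int).reshape(-1, 2)
        ax = events[:, 0] // self.area
        ay = events[:, 1] // self.area
        counts = np.zeros_like(self.er)
        np.add.at(counts, (ay, ax), 1)
        # Assumed IIR update: blend the new count into the running ER.
        self.er = (1.0 - self.blend) * self.er + self.blend * counts
        # An event passes only if the ER of its area is higher than thr.
        return events[self.er[ay, ax] > self.thr]
```

With a 50% blend ratio, an area needs sustained activity over several periods before its ER exceeds the threshold, and the ER decays once the activity stops.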


13 of 22

Anti-Flicker processing

Flickering!

EVS

periodical trend

filtering based processing

threshold based processing(ISSCC2023)

Filtering based processing:
1. Define fixed areas.
2. Track the event count trend of each area.
3. Is the event rate periodic?
  YES : filtering
  NO : not filtering
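The area-wise decision can be sketched as below. The slides do not say how periodicity is detected, so the FFT peak test, the `peak_ratio` value, and the function names `is_periodic` and `anti_flicker_filter` are assumptions chosen to illustrate the flow.

```python
import numpy as np

def is_periodic(counts, peak_ratio=0.5):
    """True when one non-DC frequency bin dominates the event count trend."""
    counts = np.asarray(counts, dtype=float)
    power = np.abs(np.fft.rfft(counts - counts.mean())) ** 2
    total = power[1:].sum()
    if total == 0.0:
        return False  # flat trend: nothing to detect
    return bool(power[1:].max() / total >= peak_ratio)

def anti_flicker_filter(count_history, events, area=4, peak_ratio=0.5):
    """Drop events that fall in areas whose count trend looks periodic.

    count_history: (T, n_ay, n_ax) per-area event counts of the last T windows.
    events: (N, 2) integer (x, y) coordinates of the current window.
    """
    hist = np.asarray(count_history, dtype=float)
    events = np.asarray(events, dtype=int).reshape(-1, 2)
    flicker = np.zeros(hist.shape[1:], dtype=bool)
    for iy in range(hist.shape[1]):
        for ix in range(hist.shape[2]):
            flicker[iy, ix] = is_periodic(hist[:, iy, ix], peak_ratio)
    ay = events[:, 1] // area
    ax = events[:, 0] // area
    return events[~flicker[ay, ax]]
```

An area whose count alternates every window is flagged as flickering, while an area with a single burst of events is left untouched.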

[Block diagram: PIX/AFE, event counter, frequency analysis, threshold decision, updated threshold]

Threshold based processing (ISSCC2023):
1. Track the event count trend of a specific area.
2. Track the event count trend of the area.
3. Is the event rate periodic?
  YES : a higher threshold should be set
  NO : the threshold will not be updated
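The threshold decision can be illustrated with the following sketch. The frequency analysis method, the step size, and the upper limit are not given on the slide; the FFT peak test and the parameters `step`, `max_threshold`, and `peak_ratio` are hypothetical.

```python
import numpy as np

def update_threshold(counts, threshold, step=1, max_threshold=15, peak_ratio=0.5):
    """Return the contrast threshold to use after one decision cycle.

    counts: event counts of the tracked area over recent windows.
    The threshold is raised only when the count trend looks periodic.
    """
    counts = np.asarray(counts, dtype=float)
    # Frequency analysis (assumed): does one non-DC bin dominate the trend?
    power = np.abs(np.fft.rfft(counts - counts.mean())) ** 2
    total = power[1:].sum()
    periodic = total > 0.0 and power[1:].max() / total >= peak_ratio
    if periodic:
        # YES: a higher threshold should be set (clamped to the assumed limit).
        return min(threshold + step, max_threshold)
    # NO: the threshold is not updated.
    return threshold
```

Calling this once per decision cycle raises the threshold step by step while the periodic trend persists, which corresponds to global suppression in the analog domain rather than digital removal of events.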

[Figures: event count versus time t]

Flicker is detected from the periodic trend of the event count, and is then either removed digitally area by area or suppressed globally in the analog domain


14 of 22

Object Enhancement(OE) processing

edge preservation

not filtering

filtering

1. Generate an N x N pixel tap for each event.
2. Calculate the gradient of each tap.
3. Are there any directions in which the gradient is small?
  YES : filtering
  NO : not filtering

True events are distinguished by gradient calculation or pattern matching

pseudo-template matching

T frame(input)

T-1 frame

not filtering

filtering

1. Generate an N x N pixel tap for each event.
2. Count the number of events in the tap of the current frame.
3. Measure the similarity between the current frame and the previous frame.
  not similar : filtering
  similar : not filtering
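A sketch of the pseudo-template matching steps follows. The slide does not define the similarity measure, so comparing the tap event counts of the T frame and the T-1 frame with a min/max ratio is an assumption, as are the names `tap_counts`, `pseudo_template_filter`, and the `min_sim` value.

```python
import numpy as np

def tap_counts(frame, n):
    """Event count inside the n x n tap centered on every pixel (zero-padded)."""
    r = n // 2
    padded = np.pad(frame, r)
    out = np.zeros(frame.shape, dtype=int)
    for dy in range(n):
        for dx in range(n):
            out += padded[dy:dy + frame.shape[0], dx:dx + frame.shape[1]]
    return out

def pseudo_template_filter(cur, prev, n=5, min_sim=0.5):
    """Keep events of the T frame whose tap resembles the T-1 frame.

    cur, prev: 2D integer event-count frames of equal shape; n must be odd.
    """
    cur = np.asarray(cur, dtype=int)
    prev = np.asarray(prev, dtype=int)
    c_cur = tap_counts(cur, n)    # step 2: events in the tap, current frame
    c_prev = tap_counts(prev, n)  # same tap position in the previous frame
    hi = np.maximum(c_cur, c_prev)
    lo = np.minimum(c_cur, c_prev)
    # Step 3 (assumed measure): ratio of the smaller to the larger count.
    sim = np.divide(lo, hi, out=np.zeros(cur.shape), where=hi > 0)
    return np.where(sim >= min_sim, cur, 0)
```

A count-based measure tolerates small shifts of the pattern inside the tap, but it degrades when the displacement between frames is larger than the tap.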

similar

not similar


15 of 22

Evaluation setup

Event Camera

Object

(rotating)

light source

Static for denoise

Flickering for anti-flicker

Noise area

Performance is quantified by two ratios: noise suppression and object preservation

object area

(Regarding events as signal)

noise area

(Regarding events as noise)

object

event output

Noise suppression ratio (lower is better): w/o processing = 1.0, compared with w/ processing.

It is calculated from the noise area events. It indicates how much noise is suppressed by the processing.

Object preservation ratio (higher is better): w/o processing = 1.0, compared with w/ processing.

It is calculated from the object area events. It indicates how much signal is preserved after the processing.
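The two metrics can be computed as in the sketch below. It assumes each ratio is the event count with processing divided by the event count without processing in the respective area, which matches the normalization of "w/o processing" to 1.0; the function name `evaluation_ratios` and the mask-based area definition are illustrative.

```python
import numpy as np

def evaluation_ratios(before, after, object_mask):
    """Return (noise suppression ratio, object preservation ratio).

    before, after: 2D event-count maps without and with processing.
    object_mask: boolean map, True inside the object area; the rest of the
    frame is treated as the noise area.
    """
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    object_mask = np.asarray(object_mask, dtype=bool)
    noise_mask = ~object_mask
    # Events in the noise area are regarded as noise: lower is better.
    noise_suppression = after[noise_mask].sum() / before[noise_mask].sum()
    # Events in the object area are regarded as signal: higher is better.
    object_preservation = after[object_mask].sum() / before[object_mask].sum()
    return noise_suppression, object_preservation
```

An ideal filter would give a noise suppression ratio near 0 and an object preservation ratio near 1.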


16 of 22

Denoise processing result

[Figure: noise suppression ratio and object preservation ratio. Panels: w/o processing, 2D TAP based (A-2), edge preservation (C-1), pseudo template matching (C-2).]

A-1, A-2, and A-3 perform well because a noise threshold optimized through testing is applied.

The object preservation ratio of C-2 degrades because matching fails: the movement distance differs between the center and the periphery of the rotating object.


17 of 22

Anti-flicker processing result(for local flickering)

[Figure: noise suppression ratio and object preservation ratio. Panels: w/o processing, filter based (B-1), threshold based (B-2), edge preservation (C-1), pseudo template matching (C-2).]

B-1 performs best and C-2 second.

There is no large difference in object preservation.

B-2, which counts events globally, has difficulty detecting and suppressing local flicker.

C-1 preserves the edges of the blinking region and therefore shows poor noise suppression.


18 of 22

Anti-flicker processing result(for global flickering)

C-1 performs best.

The anti-flicker methods B-1 and B-2 remove the object together with the blinking.

C-2 also removes much of the object because pattern matching is difficult under blinking.

[Figure: noise suppression ratio and object preservation ratio. Panels: w/o processing, filter based (B-1), threshold based (B-2), edge preservation (C-1), pseudo template matching (C-2).]


19 of 22

Challenge of signal processing for each application

Optimizing the processing to maximize the characteristics of each application is the main theme of future development

Category       | Applications                | Noise suppression:          | Flicker suppression:
               |                             | Denoise | Object Enhance.   | Anti-Flicker | Object Enhance.
Industry & IoT | Liquid                      |         |                   |              |
               | Spark                       |         |                   |              |
               | 3D                          |         |                   |              |
Mixed Reality  | Hand tracking               |         |                   |              |
               | Eye tracking                |         |                   |              |
               | Visible light communication |         |                   |              |
               | Visual based SLAM           |         |                   |              |
Mobile Imaging | Deblur                      |         |                   |              |
               | Low power movie             |         |                   |              |
               | Super Slow                  |         |                   |              |


20 of 22

References

[1] T. Finateu et al., ISSCC, 2020, A 1280×720 Back-Illuminated Stacked Temporal Contrast Event-Based Vision Sensor with 4.86μm Pixels, 1.066GEPS Readout, Programmable Event-Rate Controller and Compressive Data-Formatting Pipeline

[2] A. Niwa et al., ISSCC, 2023, A 2.97μm-Pitch Event-Based Vision Sensor with Shared Pixel Front-End Circuitry and Low-Noise Intensity Readout Mode

[3] K. Kodama et al., ISSCC, 2023, 1.22µm 35.6M-pixel RGB Hybrid Event-Based Vision Sensor with 4.88µm-Pitch Event Pixels and up to 10K Event Frame Rate by Adaptive Control on Event Sparsity

[4] Y. Feng et al., 2020, https://www.mdpi.com/2076-3417/10/6/2024

[5] R. Jain et al., 1995, Machine Vision


21 of 22

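SONY is a registered trademark or trademark of Sony Group Corporation.

The product names and service names of Sony products are registered trademarks or trademarks of Sony Group Corporation or its affiliates. Other product and company names are the trade names, registered trademarks, or trademarks of their respective owners.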

22 of 22

Appendix. How to calculate noise rate and event rate

[Figure: event output with the noise area and the object area marked]

Noise : calculated from the noise area.
  NC: average event count of the noise area [count/pix]
  T: measurement period [frame]

Signal : calculated from the object area.
  1. Rotate the output of each frame back to the initial phase position.
  2. Expand the rotated frames in the polar coordinate system and sum all frames to obtain the peak count map.
  PC: event count of the peak pixel [count/pix]
  T: measurement period [frame]
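The procedure can be sketched as below. The equations themselves are not present in the text, so the rates NC / T and PC / T are assumptions inferred from the units of NC, PC, and T. The function names, the bin counts, and the use of per-frame (x, y) event lists are also illustrative.

```python
import numpy as np

def peak_count_map(frames, center, omega, r_max, n_radius=16, n_angle=36):
    """Sum all frames in polar coordinates after undoing the rotation.

    frames: list with one (N_k, 2) array of (x, y) event positions per frame.
    omega: angular velocity of the object [radian / frame].
    """
    cx, cy = center
    acc = np.zeros((n_radius, n_angle), dtype=int)
    for k, ev in enumerate(frames):
        ev = np.asarray(ev, dtype=float).reshape(-1, 2)
        dx, dy = ev[:, 0] - cx, ev[:, 1] - cy
        # Step 1: rotate frame k back to the phase of the initial frame.
        angle = (np.arctan2(dy, dx) - omega * k) % (2 * np.pi)
        radius = np.hypot(dx, dy)
        # Step 2: expand on the (radius, angle) grid and accumulate.
        ai = np.minimum((angle / (2 * np.pi) * n_angle).astype(int), n_angle - 1)
        ri = np.minimum((radius / r_max * n_radius).astype(int), n_radius - 1)
        np.add.at(acc, (ri, ai), 1)
    return acc

def noise_and_event_rate(nc, peak_map, n_frames):
    """Assumed definitions: noise rate = NC / T, event rate = PC / T."""
    pc = peak_map.max()  # event count of the peak pixel
    return nc / n_frames, pc / n_frames
```

If the object edge produces one event per frame at the same position relative to the object, all of those events land in a single polar bin after rotation, so PC equals T and the event rate is 1 count/pix/frame.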

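[Figure: frame0 (initial), frame1, and frame2 are rotated back to give frame0’, frame1’, and frame2’. Each rotated frame is expanded in polar coordinates (angle, radius) to give frame0’’, frame1’’, and so on. Summing them gives the count versus angle, whose peak is PC.]

ω : angular velocity [radian / frame]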
