1 of 18

Modeling Low-Level Biological Vision with Deep Neural Networks

Qiang Li

https://qianglisinoeusa.github.io/

Faculty of Biological Sciences

Image Processing Laboratory

University of Valencia, Spain

October 13, 2022

2 of 18

Understanding Biological Vision Functions

Computational Neuroscience

Deep Learning

Information Theory: Total correlation

Image Statistics

Psychophysics

Computational NeuroImaging: Functional connectivity

Computer Vision: Attention/Saliency prediction

3 of 18

Outline

Background

The similarities and differences between deep neural networks and the brain

Human contrast sensitivity functions

Contrast sensitivity functions in autoencoders

4 of 18

Background

Similarities

Differences

Yamins and DiCarlo, 2016

5 of 18

Khaligh-Razavi & Kriegeskorte (2014)

The similarities between the deep feedforward AlexNet and the brain (one example)

The similarities between recurrent neural networks and the brain

Deep feedforward

Recurrent (dynamic)

Liao and Poggio, 2016

6 of 18

The differences between deep neural networks and the brain (some examples)

7 of 18

The Human Contrast Sensitivity Functions

Spatio-Chromatic CSFs

Spatio-Temporal CSFs

Psychophysical Experiments

Campbell-Robson Chart
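
The Campbell-Robson chart can be sketched in a few lines: spatial frequency grows logarithmically from left to right while contrast falls logarithmically from bottom to top, so the height at which the bars stop being visible traces the observer's CSF. The function name, image size, and frequency and contrast ranges below are illustrative assumptions, not values taken from the talk.

```python
import numpy as np

def campbell_robson(width=512, height=256, f_min=1.0, f_max=64.0,
                    c_min=0.005, c_max=1.0):
    """Return a (height, width) luminance image with values in [0, 1].

    Frequencies are in cycles per image width (hypothetical units).
    """
    x = np.linspace(0.0, 1.0, width)    # horizontal position, left to right
    y = np.linspace(0.0, 1.0, height)   # vertical position, top (0) to bottom (1)

    # Logarithmic frequency sweep: f(x) = f_min * r**x with r = f_max / f_min.
    # The phase is 2*pi times the integral of f(x) from 0 to x.
    r = f_max / f_min
    phase = 2.0 * np.pi * f_min * (r ** x - 1.0) / np.log(r)

    # Logarithmic contrast ramp: c_min at the top row, c_max at the bottom row.
    contrast = c_min * (c_max / c_min) ** y

    # Sinusoidal modulation around a mean luminance of 0.5.
    return 0.5 + 0.5 * contrast[:, None] * np.sin(phase)[None, :]

chart = campbell_robson()
```

Displaying `chart` with any image viewer (for example `matplotlib.pyplot.imshow(chart, cmap="gray")`) should show the familiar inverted-U visibility boundary, provided the display is roughly linear in luminance.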

8 of 18

Contrast sensitivity functions in autoencoders

31 papers

9 of 18

Mathematical models

Graphical models

Retina (20 MB/second) - LGN (1 MB/second) - V1 (40 bits/second)

Information Bottleneck

10 of 18

Functional goals

Architectures

11 of 18

Training with pairwise images

Training with pairwise videos
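
Pairwise training means the network sees a degraded input and is asked to reproduce the clean target. The conclusions of this talk name retinal noise and optical blur removal as tasks of this kind, so a minimal sketch of building one image pair might look as follows. The Gaussian blur, the additive Gaussian noise, and every parameter value are assumptions made for illustration; they are not the degradation model used in the work.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Blur a 2-D image with a Gaussian of std `sigma` pixels (circular boundary)."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies, cycles per pixel
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies, cycles per pixel
    # Fourier transform of a unit-area Gaussian: exp(-2 * pi^2 * sigma^2 * f^2).
    mtf = np.exp(-2.0 * np.pi ** 2 * sigma ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * mtf))

def make_pair(clean, sigma_blur=1.5, sigma_noise=0.05, rng=None):
    """Return (degraded, clean): the network input and its training target."""
    rng = np.random.default_rng() if rng is None else rng
    blurred = gaussian_blur(clean, sigma_blur)                 # stand-in for optical blur
    noise = rng.normal(0.0, sigma_noise, size=clean.shape)     # stand-in for retinal noise
    return blurred + noise, clean

rng = np.random.default_rng(0)
clean_image = rng.random((32, 32))    # placeholder for a natural image patch
degraded_image, target_image = make_pair(clean_image, rng=rng)
```

For videos, the same idea would apply frame by frame or to a (time, height, width) array, with the degradation extended along the temporal axis if temporal blur is of interest.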

12 of 18

Spatio-Chromatic Sinusoids

Spatio-Temporal Sinusoids

Achromatic videos

Chromatic videos
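
One way to probe a model with sinusoids like these is to feed it low-contrast gratings of increasing spatial frequency and record how much of the grating's amplitude survives at the output. The sketch below does this for an achromatic, purely spatial grating. The `standin_model` is a plain Gaussian low-pass filter used only so the example runs; it is not a trained autoencoder, and the gain measure is one assumed definition of sensitivity among several possible ones.

```python
import numpy as np

def grating(size, cycles, contrast, mean=0.5):
    """Vertical sinusoidal grating with the given Michelson contrast."""
    x = np.arange(size) / size
    row = mean + mean * contrast * np.sin(2.0 * np.pi * cycles * x)
    return np.tile(row, (size, 1))

def amplitude_at(img, cycles):
    """Amplitude of the horizontal component at `cycles` per image width."""
    size = img.shape[1]
    spectrum = np.fft.rfft(img.mean(axis=0))
    return 2.0 * np.abs(spectrum[cycles]) / size

def response_gain(model, size, cycles, contrast=0.05):
    """Output amplitude divided by input amplitude at the probe frequency."""
    stimulus = grating(size, cycles, contrast)
    return amplitude_at(model(stimulus), cycles) / amplitude_at(stimulus, cycles)

def standin_model(img, sigma=2.0):
    """Hypothetical placeholder for a network: a Gaussian low-pass filter."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    mtf = np.exp(-2.0 * np.pi ** 2 * sigma ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * mtf))

frequencies = [1, 2, 4, 8, 16]   # cycles per image width
gains = [response_gain(standin_model, 64, f) for f in frequencies]
```

Replacing `standin_model` with a trained network's forward pass, and repeating the sweep over temporal frequency and chromatic direction, would give curves that can be compared against human CSFs. A low-pass stand-in like this one cannot show the band-pass shape of the human achromatic CSF.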

15 of 18

Spatiotemporal chromatic CSFs of a high-resolution movie

Spatiotemporal chromatic CSFs of a low-resolution movie

16 of 18

As a first contribution, we show that autoencoders, a very popular type of convolutional neural network (CNN), may develop human-like CSFs in the spatiotemporal and chromatic dimensions when trained to perform some basic low-level vision tasks (like retinal noise and optical blur removal), but not others (like chromatic adaptation or pure reconstruction after simple bottlenecks).

As a second contribution, we provide experimental evidence that, for some functional goals (at a low abstraction level), deeper CNNs that are better at reaching the quantitative goal are actually worse at replicating human-like phenomena (such as the CSFs). This low-level result (for the explored networks) is not necessarily in contradiction with other works that report advantages of deeper nets in modeling higher-level vision goals. However, in line with a growing body of literature, our results suggest another word of caution about CNNs in vision science, because the use of simplified units or unrealistic architectures in goal optimization may limit the modeling and understanding of human vision.

Conclusions

17 of 18

Take-home Messages

Modeling biological vision with deep neural networks and spike trains.

Yamins and DiCarlo, 2021

18 of 18

Thanks to all my co-authors and to the funding bodies that supported this research

Thank you for your time and attention

Q&A