Modeling Low-Level Biological Vision with Deep Neural Networks
Qiang Li
https://qianglisinoeusa.github.io/
Faculty of Biological Sciences
Image Processing Laboratory
University of Valencia, Spain
October 13, 2022
Understanding Biological Vision Functions
Computational Neuroscience
Deep Learning
Information Theory: Total correlation
Image Statistics
Psychophysics
Computational NeuroImaging: Functional connectivity
Computer Vision: Attention/Saliency prediction
Outline
Similarities
Differences
Yamins and DiCarlo, 2016
Khaligh-Razavi and Kriegeskorte, 2014
Similarities between the deep feedforward AlexNet and the brain (one example)
Similarities between recurrent neural networks and the brain
Deep feedforward
Recurrent (dynamic)
Liao and Poggio, 2016
Differences between deep neural networks and the brain (some examples)
The Human Contrast Sensitivity Functions
Spatio-Chromatic CSFs
Spatio-Temporal CSFs
Psychophysical Experiments
Campbell-Robson Chart
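The Campbell-Robson chart can be illustrated with a short sketch: a sinusoidal grating whose spatial frequency grows along the horizontal axis while its contrast changes along the vertical axis. The function name `campbell_robson`, the frequency range, and the contrast range below are illustrative assumptions, not values taken from the slides.

```python
import math

def campbell_robson(width=256, height=128, f_min=1.0, f_max=64.0,
                    c_min=0.001, c_max=1.0):
    """Return a height x width nested list of luminance values in [0, 1].

    Assumed parameterization (hypothetical, for illustration only):
    frequency sweeps exponentially from f_min to f_max cycles per image
    width, and contrast varies log-linearly from c_min to c_max.
    """
    ratio = f_max / f_min
    log_ratio = math.log(ratio)
    rows = []
    for r in range(height):
        # Contrast is c_min on the top row and c_max on the bottom row.
        t = r / (height - 1)
        contrast = c_min * (c_max / c_min) ** t
        row = []
        for col in range(width):
            x = col / (width - 1)  # normalized horizontal position in [0, 1]
            # Phase is the integral of the exponential frequency sweep.
            phase = 2 * math.pi * f_min * (ratio ** x - 1) / log_ratio
            row.append(0.5 * (1 + contrast * math.sin(phase)))
        rows.append(row)
    return rows

chart = campbell_robson()
```

When the resulting array is displayed as a grayscale image, the boundary between visible and invisible stripes roughly traces the observer's contrast sensitivity as a function of spatial frequency.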
2. Contrast sensitivity functions in autoencoders
31 papers
Mathematical models
Graphical models
Retina (20 MB/second) - LGN (1 MB/second) - V1 (40 bits/second)
Information Bottleneck
Functional goals
Architectures
Training with pairwise images
Training with pairwise videos
Spatio-Chromatic Sinusoids
Spatio-Temporal Sinusoids
Achromatic videos
Chromatic videos
Results
Spatiotemporal chromatic CSFs of a high-resolution movie
Spatiotemporal chromatic CSFs of a low-resolution movie
As a first contribution, we show that a very popular type of convolutional neural network (CNN), the autoencoder, may develop human-like CSFs in the spatiotemporal and chromatic dimensions when trained to perform some basic low-level vision tasks (such as removing retinal noise and optical blur), but not others (such as chromatic adaptation or pure reconstruction after simple bottlenecks).
As a second contribution, we provide experimental evidence that, for some functional goals (at a low abstraction level), deeper CNNs that are better at reaching the quantitative goal are actually worse at replicating human-like phenomena (such as the CSFs). This low-level result (for the explored networks) does not necessarily contradict other works that report advantages of deeper nets in modeling higher-level vision goals. However, in line with a growing body of literature, our results suggest another word of caution about CNNs in vision science: the use of simplified units or unrealistic architectures in goal optimization may limit the modeling and understanding of human vision.
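One way to read "the CSF of a network" in the contributions above is as a frequency-dependent gain: how much of a sinusoid's amplitude survives the system at each frequency. The sketch below illustrates that idea in one dimension under loud assumptions: `box_blur` is a hypothetical stand-in for a trained autoencoder, and `grating`, `amplitude`, and `gain_curve` are illustrative helper names, not the procedure used in this work.

```python
import math

def grating(freq, n=256, contrast=0.5, mean=0.5):
    """1D sinusoidal grating with `freq` cycles over n samples."""
    return [mean * (1 + contrast * math.sin(2 * math.pi * freq * i / n))
            for i in range(n)]

def box_blur(signal, radius=2):
    """Hypothetical stand-in system: a circular moving average."""
    n = len(signal)
    width = 2 * radius + 1
    return [sum(signal[(i + k) % n] for k in range(-radius, radius + 1)) / width
            for i in range(n)]

def amplitude(signal, freq):
    """Amplitude of the sinusoidal component at an integer frequency (via DFT)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * i / n) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / n) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

def gain_curve(system, freqs, n=256):
    """Ratio of output to input amplitude for each test frequency."""
    gains = {}
    for f in freqs:
        x = grating(f, n)
        y = system(x)
        gains[f] = amplitude(y, f) / amplitude(x, f)
    return gains

gains = gain_curve(box_blur, [1, 2, 4, 8, 16, 32])
for f, g in gains.items():
    print(f, round(g, 3))
```

The box blur yields a purely low-pass curve, so it is only a placeholder; replacing `system` with a trained model's forward pass (and extending the gratings to space, time, and color) would give the kind of curve that can be compared against human CSFs.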
Conclusions
Take-home Messages
Yamins and DiCarlo, 2021
Thanks to all my co-authors and to the funding bodies that supported this research
Thank you for your time and attention
Q&A