1 of 15

Fully Complex-valued Deep Learning model for Visual Perception

Aniruddh Sikdar*1, Sumanth Udupa*2, Suresh Sundaram2

1Robert Bosch Centre for Cyber-Physical Systems, Indian Institute of Science, Bengaluru, India. 2Department of Aerospace Engineering, Indian Institute of Science, Bengaluru, India.



2 of 15

Introduction

  • Complex-valued deep learning is used for:
    • speech processing,
    • robotics,
    • signal processing and image processing.

The complex plane.

Complex-valued deep learning models have:

    • richer representation ability,
    • easier optimization,
    • efficient multi-task learning,
    • superior MRI and SAR reconstructions.


3 of 15

Background

    • Complex-valued (CV) models can be used for image classification tasks.
      • State-of-the-art CV models typically train in the following manner:

This training scheme

  • projects features from the complex-valued to the real-valued domain within the network, and
  • optimizes the embeddings in the real-valued domain.


4 of 15

Motivation

    • However, this learning process is not optimal:
      • Projecting to the real-valued domain causes a loss of phase information.
      • The rich representation ability of complex-valued fully connected layers is not used [1,2].
      • Operating in the whole complex domain may improve overall generalization ability.

[1] Sundaram Suresh, Narasimhan Sundararajan, and Ramasamy Savitha, Fully Complex-valued Multi-Layer Perceptron Networks, pp. 31–47, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013.

[2] Tohru Nitta, “Solving the xor problem and the detection of symmetry using a single complex-valued neuron,” Neural Networks, vol. 16, no. 8, pp. 1101–1105, 2003.

[3] Tohru Nitta, “Orthogonality of decision boundaries in complex-valued neural networks,” Neural computation, vol. 16, no. 1, pp. 73–97, 2004.

Contributions

    • A novel, fully complex-valued learning scheme is proposed, using orthogonal decision boundary theory [3], to train a Fully Complex-valued Convolutional Neural Network (FC-CNN).
      • This training lets FC-CNN operate entirely in the complex domain.
      • FC-CNN has the same number of parameters as its real-valued counterpart.
    • A new regularized complex-valued loss function and a two-step training strategy are proposed.
      • These address overfitting, leading to faster learning and better convergence.


5 of 15

Fully Complex-valued Deep Learning for image classification.

  • Prerequisites of complex-valued deep learning:
    • Complex-valued convolution: convolving the complex kernel W = A + iB with the input patch h = X + iY gives

      W ∗ h = (A ∗ X − B ∗ Y) + i(B ∗ X + A ∗ Y)
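The complex convolution can be expanded into four real-valued operations; a minimal sketch, assuming a single kernel-sized patch so the convolution reduces to an elementwise product and sum:

```python
import numpy as np

def complex_conv_patch(W, h):
    """Complex convolution of kernel W = A + iB with a patch h = X + iY,
    expanded into four real-valued operations:
    W * h = (A*X - B*Y) + i(B*X + A*Y)."""
    A, B = W.real, W.imag
    X, Y = h.real, h.imag
    real_part = np.sum(A * X) - np.sum(B * Y)
    imag_part = np.sum(B * X) + np.sum(A * Y)
    return real_part + 1j * imag_part
```

This lets a complex-valued layer be implemented on top of ordinary real-valued convolution kernels.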

  • Complex-valued activation function:
    • Mainly, the CReLU and complex cardioid activation functions are used.

[4] Patrick M Virtue, Complex-valued deep learning with applications to magnetic resonance image synthesis, University of California, Berkeley, 2019.

Cardioid activation function [4].

Cardioid activation function in complex plane.
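Both activations can be written in a few lines; a sketch, with the cardioid form following [4] (the output keeps the phase of z and scales its magnitude by 0.5(1 + cos ∠z)):

```python
import numpy as np

def crelu(z):
    # CReLU: ReLU applied separately to the real and imaginary parts.
    return np.maximum(z.real, 0) + 1j * np.maximum(z.imag, 0)

def cardioid(z):
    # Complex cardioid [4]: scales z by 0.5 * (1 + cos(angle(z))).
    # Positive-real inputs pass through unchanged, negative-real inputs
    # are zeroed, and the phase of nonzero outputs is preserved.
    return 0.5 * (1 + np.cos(np.angle(z))) * z
```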


6 of 15

Fully Complex-valued Convolutional Neural Network (FC-CNN)

  • FC-CNN is a shallow, fully complex-valued network based on the Cifar-Net model.
    • The architecture is retained for a fairer comparison with shallow complex-valued models.

FC-CNN model architecture.

  • Complex-valued one-hot encoding of labels:
        • For the label ct of the t-th sample, the one-hot encoding yt = {yt1, yt2, ..., ytk, ..., ytn} is given as follows:
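A minimal sketch of one common convention for complex-valued targets; the `hot`/`cold` values of 1 + 1i and −1 − 1i are an assumption for illustration, not necessarily the paper's exact encoding:

```python
import numpy as np

def complex_one_hot(c_t, n_classes, hot=1 + 1j, cold=-1 - 1j):
    # Hypothetical complex-valued one-hot encoding: the k-th entry is
    # `hot` when k equals the label c_t and `cold` otherwise.
    y_t = np.full(n_classes, cold, dtype=np.complex64)
    y_t[c_t] = hot
    return y_t
```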


7 of 15

Fully Complex-valued Convolutional Neural Network (FC-CNN)

  • Regularized complex-valued hinge loss:
    • The complex-valued hinge loss e [1] is given by,

where yt is the complex-valued one-hot encoding and ŷt is the complex-valued prediction.

  • For a regularizing effect, two threshold terms are defined:

[1] Sundaram Suresh, Narasimhan Sundararajan, and Ramasamy Savitha, Fully Complex-valued Multi-Layer Perceptron Networks, pp. 31–47, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013.

Error threshold

Maximum threshold


8 of 15

Fully Complex-valued Convolutional Neural Network (FC-CNN)

  • The following condition is used to update e:

  • For misclassified samples, the complex-valued hinge loss e remains the same.

  • The real-valued loss function E is defined as the product of the complex-valued hinge loss e and its complex conjugate, and is given by
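Since multiplying a complex error by its own conjugate yields |e|², the resulting loss is real-valued and non-negative; a minimal sketch:

```python
import numpy as np

def real_loss(e):
    # E = sum over outputs of e * conj(e): the product of a complex
    # error with its conjugate is |e|^2, so E is real and non-negative,
    # making it usable as a scalar training objective.
    return np.sum(e * np.conj(e)).real
```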


9 of 15

Proposed Fully Complex-valued learning.

  • A novel, fully complex-valued learning scheme using orthogonal decision boundary theory trains the network in the complex domain and preserves phase information.

[3] Tohru Nitta, “Orthogonality of decision boundaries in complex-valued neural networks,” Neural Computation, vol. 16, no. 1, pp. 73–97, 2004.

Block diagram of fully complex-valued learning with two-step training strategy.

Orthogonal decision boundary theory [3].

Real and imaginary decision boundaries of fully complex-valued deep learning models.


10 of 15

Experimental results.

  • To validate FC-CNN and compare it with other models, the following datasets are used:
    • CIFAR-10/100.
    • SVHN.

  • Models are compared on:
    • RGB data.
    • LAB encoding [5]:
      • L*a*b encoding is used to convert RGB images into two-channel complex-valued encodings.

    • Sliding encoding [5]:
      • Converts RGB images into two complex-valued channels.

[5] Utkarsh Singhal, Yifei Xing, and Stella X Yu, “Codomain symmetry for complex-valued deep learning,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 681– 690.

  1. DCN (ICLR ’18).
  2. Sur-Real (TNNLS ’20).
  3. CDS-I/E (CVPR ’22).


11 of 15

Experimental results.

  • RQ1: Is retaining phase information and operating in the complete complex plane necessary?

  • Findings:
    • FC-CNN outperforms real-valued CNN on all datasets and encodings.
    • FC-CNN achieves performance comparable to the SOTA on CIFAR-10 and SVHN.
    • For CIFAR-100, it achieves SOTA performance with 25% fewer parameters on both real and complex-valued encodings.

Comparison of test accuracy.

These results empirically justify the intuition that fully complex-valued learning is much better than other complex-valued training schemes.

  1. DCN (ICLR ’18).
  2. Sur-Real (TNNLS ’20).
  3. CDS-I/E (CVPR ’22).


12 of 15

Experimental results.

  • RQ2: Does extending a real-valued CNN to the complex domain improve performance?

  • Findings:
    • Extending the real-valued models to the complex domain with the same number of parameters increases performance by 1.25%.
    • The finetuning step helps train the complex-valued fully connected layers better, giving a further increase of 2.65%.
    • FC-CNN has the fewest FLOPs compared to DCN and the other complex-valued models.

Comparison of test accuracy and FLOPs.

These results empirically justify the intuition that operating in the complex domain, even for real-valued images, improves performance.


13 of 15

Experimental results.

  • Performance analysis of FC-CNN:

Performance curves of test accuracy of real-valued CNN, CV-models, and FC-CNN on CIFAR-100 (RGB) dataset.

  • FC-CNN converges within 15k iterations, while all the other models take considerably more training time.
  • It generalizes better on larger datasets than real-valued and CV models.


14 of 15

Conclusions

  • Fully Complex-valued Deep Learning model for Visual Perception:
    • Proposed a novel, fully complex-valued learning scheme for FC-CNN using orthogonal decision boundary theory to train the network in the complex domain.
      • Empirically justified the intuitions for shallow complex-valued models:
        • training completely in the complex domain in an end-to-end manner, and
        • operating in all quadrants of the complex plane in feature space.
      • This learning scheme exploits the representation power of complex-valued fully connected layers.
      • FC-CNN improves performance by 2.3–8.5% over its real-valued counterpart.

    • Proposed a regularized complex-valued hinge loss and a two-step training strategy.
      • Improves performance by 2.65% on large datasets like CIFAR-100.

    • This training scheme helps the model generalize better on larger datasets than real-valued and complex-valued training schemes.


15 of 15

Thank you!
