1 of 15

Fully Complex-valued Deep Learning model for Visual Perception

Aniruddh Sikdar*1, Sumanth Udupa*2, Suresh Sundaram2

1Robert Bosch Centre for Cyber-Physical Systems, Indian Institute of Science, Bengaluru, India. 2Department of Aerospace Engineering, Indian Institute of Science, Bengaluru, India.



2 of 15

Introduction

  • Complex-valued deep learning is used for:
    • speech processing,
    • robotics,
    • signal processing and image processing.

The complex plane.

Complex-valued deep learning models have:

    • richer representation ability,
    • easier optimization,
    • efficient multi-task learning,
    • superior MRI and SAR reconstructions.


3 of 15

Background

    • Complex-valued (CV) models can be used for image classification tasks.
      • State-of-the-art CV models typically train in the following manner:

This training scheme

  • projects features from the complex-valued to the real-valued domain within the network, and
  • optimizes the embeddings in the real-valued domain.


4 of 15

Motivation

    • However, this learning process is not optimal:
      • Projecting to the real-valued domain causes a loss of phase information.
      • The rich representation ability of complex-valued fully connected layers is not used [1,2].
      • Operating in the whole complex domain may improve overall generalization ability.

[1] Sundaram Suresh, Narasimhan Sundararajan, and Ramasamy Savitha, Fully Complex-valued Multi-Layer Perceptron Networks, pp. 31–47, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013.

[2] Tohru Nitta, “Solving the xor problem and the detection of symmetry using a single complex-valued neuron,” Neural Networks, vol. 16, no. 8, pp. 1101–1105, 2003.

[3] Tohru Nitta, “Orthogonality of decision boundaries in complex-valued neural networks,” Neural computation, vol. 16, no. 1, pp. 73–97, 2004.

Contributions

    • A novel, fully complex-valued learning scheme is proposed, using orthogonal decision boundary theory [3], to train a Fully Complex-valued Convolutional Neural Network (FC-CNN).
      • This training lets FC-CNN operate entirely in the complex domain.
      • FC-CNN has the same number of parameters as its real-valued counterpart.
    • A new regularized complex-valued loss function and a two-step training strategy are proposed.
      • These address overfitting, leading to faster learning and better convergence.


5 of 15

Fully Complex-valued Deep Learning for image classification.

  • Prerequisites of complex-valued deep learning:
    • Complex-valued convolution: convolving the complex kernel W = A + iB with the input patch h = X + iY gives

      W ∗ h = (A ∗ X − B ∗ Y) + i(B ∗ X + A ∗ Y)
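The complex convolution can be expanded into four real-valued operations; a minimal sketch, assuming a single kernel-sized patch so the convolution reduces to an elementwise product and sum:

```python
import numpy as np

def complex_conv_patch(W, h):
    """Complex convolution of kernel W = A + iB with a patch h = X + iY,
    expanded into four real-valued operations:
    W * h = (A*X - B*Y) + i(B*X + A*Y)."""
    A, B = W.real, W.imag
    X, Y = h.real, h.imag
    real_part = np.sum(A * X) - np.sum(B * Y)
    imag_part = np.sum(B * X) + np.sum(A * Y)
    return real_part + 1j * imag_part
```

This lets a complex-valued layer be implemented on top of ordinary real-valued convolution kernels.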

  • Complex-valued activation function:
    • Mainly, the CReLU and complex cardioid activation functions are used.

[4] Patrick M Virtue, Complex-valued deep learning with applications to magnetic resonance image synthesis, University of California, Berkeley, 2019.

Cardioid activation function [4].

Cardioid activation function in complex plane.
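Both activations can be written in a few lines; a sketch, with the cardioid form following [4] (the output keeps the phase of z and scales its magnitude by 0.5(1 + cos ∠z)):

```python
import numpy as np

def crelu(z):
    # CReLU: ReLU applied separately to the real and imaginary parts.
    return np.maximum(z.real, 0) + 1j * np.maximum(z.imag, 0)

def cardioid(z):
    # Complex cardioid [4]: scales z by 0.5 * (1 + cos(angle(z))).
    # Positive-real inputs pass through unchanged, negative-real inputs
    # are zeroed, and the phase of nonzero outputs is preserved.
    return 0.5 * (1 + np.cos(np.angle(z))) * z
```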


6 of 15

Fully Complex-valued Convolutional Neural Network (FC-CNN)

  • FC-CNN is a shallow, fully complex-valued network based on the Cifar-Net model.
    • The architecture is retained for a fairer comparison with shallow complex-valued models.

FC-CNN model architecture.

  • Complex-valued one-hot encoding of labels:
        • For the label ct of the t-th sample, the one-hot encoding yt = {yt1, yt2, ..., ytk, ..., ytn} is given as follows:
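A minimal sketch of one common convention for complex-valued targets; the `hot`/`cold` values of 1 + 1i and −1 − 1i are an assumption for illustration, not necessarily the paper's exact encoding:

```python
import numpy as np

def complex_one_hot(c_t, n_classes, hot=1 + 1j, cold=-1 - 1j):
    # Hypothetical complex-valued one-hot encoding: the k-th entry is
    # `hot` when k equals the label c_t and `cold` otherwise.
    y_t = np.full(n_classes, cold, dtype=np.complex64)
    y_t[c_t] = hot
    return y_t
```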


7 of 15

Fully Complex-valued Convolutional Neural Network (FC-CNN)

  • Regularized complex-valued hinge loss:
    • The complex-valued hinge loss e [1] is given by,

where yt is the complex-valued one-hot encoding and ŷt is the complex-valued prediction.

  • For a regularizing effect, two threshold terms are defined:

[1] Sundaram Suresh, Narasimhan Sundararajan, and Ramasamy Savitha, Fully Complex-valued Multi-Layer Perceptron Networks, pp. 31–47, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013.

Error threshold

Maximum threshold


8 of 15

Fully Complex-valued Convolutional Neural Network (FC-CNN)

  • The following condition is used to update e:

  • For misclassified samples, the complex-valued hinge loss e remains the same.

  • The real-valued loss function E is defined as the product of the complex-valued hinge loss e and its complex conjugate, and is given by
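Since multiplying a complex error by its own conjugate yields |e|², the resulting loss is real-valued and non-negative; a minimal sketch:

```python
import numpy as np

def real_loss(e):
    # E = sum over outputs of e * conj(e): the product of a complex
    # error with its conjugate is |e|^2, so E is real and non-negative,
    # making it usable as a scalar training objective.
    return np.sum(e * np.conj(e)).real
```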


9 of 15

Proposed Fully Complex-valued learning.

  • A novel, fully complex-valued learning scheme using orthogonal decision boundary theory trains the network in the complex domain and preserves phase information.

[3] Tohru Nitta, “Orthogonality of decision boundaries in complex-valued neural networks,” Neural Computation, vol. 16, no. 1, pp. 73–97, 2004.

Block diagram of fully complex-valued learning with two-step training strategy.

Orthogonal decision boundary theory [3].

Real and imaginary decision boundaries of fully complex-valued deep learning models.


10 of 15

Experimental results.

  • To validate FC-CNN and compare it with other models, the following datasets are used:
    • CIFAR-10/100.
    • SVHN.

  • Models are compared on:
    • RGB data.
    • LAB encoding [5]:
      • L*a*b encoding is used to convert RGB images into two-channel complex-valued encodings.

    • Sliding encoding [5]:
      • Converts RGB images into two complex-valued channels.

[5] Utkarsh Singhal, Yifei Xing, and Stella X Yu, “Codomain symmetry for complex-valued deep learning,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 681– 690.

  1. DCN (ICLR ’18).
  2. Sur-Real (TNNLS ’20).
  3. CDS-I/E (CVPR ’22).


11 of 15

Experimental results.

  • RQ1: Is retaining phase information and operating in the complete complex plane necessary?

  • Findings:
    • FC-CNN outperforms real-valued CNN on all datasets and encodings.
    • FC-CNN achieves performance comparable to the SOTA on CIFAR-10 and SVHN.
    • For CIFAR-100, it achieves SOTA performance with 25% fewer parameters on both real and complex-valued encodings.

Comparison of test accuracy.

These results empirically justify the intuition that fully complex-valued learning is much better than other complex-valued training schemes.

  1. DCN (ICLR ’18).
  2. Sur-Real (TNNLS ’20).
  3. CDS-I/E (CVPR ’22).


12 of 15

Experimental results.

  • RQ2: Does extending a real-valued CNN to the complex domain improve performance?

  • Findings:
    • Extending the real-valued models to the complex domain with the same number of parameters increases performance by 1.25%.
    • The finetuning step helps train the complex-valued fully connected layers better, giving a further increase of 2.65%.
    • FC-CNN has the fewest FLOPs compared to DCN and the other complex-valued models.

Comparison of test accuracy and FLOPs.

These results empirically justify the intuition that operating in the complex domain, even for real-valued images, improves performance.


13 of 15

Experimental results.

  • Performance analysis of FC-CNN:

Performance curves of test accuracy of real-valued CNN, CV-models, and FC-CNN on CIFAR-100 (RGB) dataset.

  • FC-CNN converges within 15k iterations, while all the other models take considerably more training time.
  • It generalizes better on larger datasets than real-valued and CV models.


14 of 15

Conclusions

  • Fully Complex-valued Deep Learning model for Visual Perception:
    • Proposed a novel, fully complex-valued learning scheme for FC-CNN using orthogonal decision boundary theory to train the network in the complex domain.
      • Empirically justified the intuitions for shallow complex-valued models:
        • training completely in the complex domain in an end-to-end manner, and
        • operating in all quadrants of the complex plane in feature space.
      • This learning scheme exploits the representation power of complex-valued fully connected layers.
      • FC-CNN improves performance by 2.3–8.5% over its real-valued counterpart.

    • Proposed a regularized complex-valued hinge loss and a two-step training strategy.
      • Improves performance by 2.65% on large datasets like CIFAR-100.

    • This training scheme helps the model generalize better on larger datasets than real-valued and complex-valued training schemes.


15 of 15

Thank you!
