1 of 24

Scene Flow and Stereo Disparity Evaluation on KITTI Benchmarks: A Comprehensive Survey
[Yuvraj Singh and Yingzi Yang]

Problem: State-of-the-art scene flow models have a higher percentage of disparity outliers than stereo models on the KITTI benchmarks, despite having access to temporal information.

KITTI Scene Flow Leaderboard [March 2022]

Scene flow models perform worse than stereo models on disparity evaluation.

KITTI Stereo Leaderboard [March 2022]

2 of 24

Plans and algorithms

Extending our depth/disparity estimation survey
Proposing changes to scene flow models based on the best stereo disparity models

Baseline Algorithms

FlyingThings3D

Driving

Monkaa

Training Datasets

Fine-Tuning, Testing Dataset

KITTI 2015

Stereo Disparity/Depth: PSMNet

This gives us a general pipeline to work on

Scene Flow (FlowNet + DispNet): Mayer et al.

FlowNet (Dosovitskiy et al.)

3 of 24

Experiments and analysis

Metrics

percentage of disparity estimation outliers (KITTI 2015)
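As a rough illustration of this metric: the KITTI 2015 benchmark counts a pixel as an outlier when its disparity error exceeds both 3 px and 5% of the ground-truth disparity. The sketch below assumes NumPy arrays and assumes that a ground-truth value of 0 marks pixels without annotation; the function name is hypothetical and this is not the official development kit code.

```python
import numpy as np

def d1_outlier_percentage(disp_est, disp_gt, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels whose disparity error is > 3 px AND > 5% of ground truth."""
    disp_est = np.asarray(disp_est, dtype=np.float64)
    disp_gt = np.asarray(disp_gt, dtype=np.float64)
    valid = disp_gt > 0  # assumption: 0 means "no ground truth at this pixel"
    err = np.abs(disp_est - disp_gt)
    outlier = valid & (err > abs_thresh) & (err > rel_thresh * disp_gt)
    # Guard against an empty mask so the division is always defined.
    return 100.0 * outlier.sum() / max(int(valid.sum()), 1)
```

Requiring both thresholds keeps large disparities (close objects) from being penalized for errors that are small in relative terms.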

Expectation

Improvement in disparity metrics due to temporal information

Some preliminary ideas

Use the best pre-trained stereo disparity estimator to provide the D dimension of the RGB-D input to RAFT-3D
Use context encoder for both images
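The first idea above needs a depth channel derived from a stereo disparity map. A minimal sketch, assuming a rectified stereo pair with known focal length (in pixels) and baseline (in meters), uses the standard relation Z = f * B / d. The function name and the calibration values in the usage note are hypothetical, and this is not part of the RAFT-3D code base.

```python
import numpy as np

def disparity_to_rgbd(rgb, disparity, focal_px, baseline_m, min_disp=1e-6):
    """Build an H x W x 4 RGB-D array, with depth computed as Z = f * B / d."""
    rgb = np.asarray(rgb, dtype=np.float64)
    disparity = np.asarray(disparity, dtype=np.float64)
    valid = disparity > min_disp          # zero disparity has no finite depth
    depth = np.zeros_like(disparity)      # invalid pixels keep depth 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    rgbd = np.concatenate([rgb, depth[..., None]], axis=-1)
    return rgbd, valid
```

For example, with an assumed focal length of 700 px and a baseline of 0.5 m, a disparity of 10 px maps to a depth of 35 m.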

Alternative baseline approach

(RAFT-3D by Teed and Deng):

Using recurrent GRU-based operations with a stereo pipeline.

(from KITTI leaderboard)

Context encoder

Stereo RGB-D input

4 of 24

Generalization of NeRF Techniques
Reza Averly and Deepak Warrier

Motivation:

Comprehensive survey study on NeRF techniques (NeRF, pixelNeRF, MetaNLR++, NeRF-W, etc.)

Challenges:

- Computational Power

- No universal dataset

(Mildenhall et al., 2020)

5 of 24

Plans and Algorithms

NeRF

pixelNeRF

MetaNLR++

Plan: Look at a few implementations of NeRF and test them on a common set of datasets.

See how well the algorithms generalize in standard and few-shot settings across varying types of data

6 of 24

Experiments and Analysis

PSNR (rows: algorithm, columns: dataset)
Algorithm	ShapeNet	NeRF Dataset	MVS Dataset	...
NeRF
pixelNeRF
MetaNLR++
...

SSIM (rows: algorithm, columns: dataset)
Algorithm	ShapeNet	NeRF Dataset	MVS Dataset	...
NeRF
pixelNeRF
MetaNLR++
...
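For reference, the PSNR entries in tables like these could be computed per rendered view as sketched below. This assumes images stored as floating-point arrays in [0, 1]; the function name is hypothetical. SSIM is more involved (it compares local means, variances, and covariances over sliding windows) and is usually taken from an existing library rather than reimplemented.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Higher is better; a uniform error of 0.1 on a [0, 1] image gives 20 dB.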

Expectation: pixelNeRF will generalize the best, since its design is built around single-image reconstruction

Investigate: What can we do to NeRF or other models to achieve similar generalization performance?

Quantitative + Qualitative Analysis !!


7 of 24

On connecting co-segmentation and weakly supervised segmentation

Weakly supervised semantic segmentation

Inexact supervision with image-level labels

Co-segmentation

Image credit: Puzzle-CAM: Improved Localization via Matching Partial and Full Features

Image credit: Learning with Free Object Segments for Long-Tailed Instance Segmentation

Group Members: Jike Zhong, Wenjin Fu, Tianle Chen

8 of 24

Recent Studies

Recent studies in weakly supervised learning use unlabeled or partially labeled data for semantic segmentation tasks [1, 2, 3].

However, most of these studies perform semantic segmentation using class activation maps (CAMs) computed per image, without considering the correlation between images.
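To make the CAM step concrete, here is a minimal sketch of the original class activation map formulation, in which the map for class c is the sum of the last convolutional feature maps weighted by that class's classifier weights. The array shapes, the function name, and the ReLU plus max-normalization post-processing are assumptions for illustration; this is not the Puzzle-CAM implementation.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """CAM for one image.

    features:   (K, H, W) feature maps from the last conv layer
    fc_weights: (num_classes, K) weights of the linear classifier
                applied after global average pooling
    """
    features = np.asarray(features, dtype=np.float64)
    fc_weights = np.asarray(fc_weights, dtype=np.float64)
    # Weighted sum over the K channels -> (H, W)
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    cam = np.maximum(cam, 0.0)  # assumption: keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

Each map is computed from a single image, which is why correlation between images plays no role in this step.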

[1] Jo, S., & Yu, I.-J. (2021). Puzzle-CAM: Improved localization via matching partial and full features. 2021 IEEE International Conference on Image Processing (ICIP), 639-643. https://doi.org/10.1109/ICIP42928.2021.9506058

[2] Sun, W., Zhang, J., & Barnes, N. (2022). Inferring the class conditional response map for weakly supervised semantic segmentation. 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). https://doi.org/10.1109/wacv51458.2022.00271

[3] Lee, S., Lee, M., Lee, J., & Shim, H. (2021). Railroad is not a train: Saliency as pseudo-pixel supervision for weakly supervised semantic segmentation. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr46437.2021.00545

Correlation?

9 of 24

Plan & Dataset

We hope to apply the idea of co-segmentation to weakly supervised semantic segmentation (WSSS) and improve the segmentation model by leveraging the correlation between images.

The dataset we plan to work on is PASCAL VOC 2012:

http://host.robots.ox.ac.uk/pascal/VOC/voc2012/

Baseline:

Puzzle-CAM: Improved Localization via Matching Partial and Full Features

Image credit: Puzzle-CAM: Improved Localization via Matching Partial and Full Features

10 of 24

Domain Adaptation for 3D Detection in Autonomous Vehicle Perception: An Empirical Study

[Divyanshu Tak, Mengdi Fan and Chaeun Hong]

Lidar Data Difference

Problem for 3D Object Detection

Waymo Lidar dataset

Cepton Lidar data

How can we improve 3D object detection performance when training data and testing data come from different lidar sensors?

Detection performance is good when training and test data come from the same modality.

11 of 24

Plans and algorithms

Dataset

Popular Algorithms

Setting

KITTI Dataset

Waymo Dataset

Custom Dataset (Cepton)

Training data with multiple point cloud densities/resolutions.

Train

Model robust to input data resolution.

Eval

Test on different Lidar data.
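One simple way to obtain training data at several point cloud densities is to subsample existing scans. The sketch below assumes points stored as an (N, C) NumPy array and uses uniform random subsampling, which is a strong simplification: real sensors differ in scan pattern and field of view, not only in density. The function name and the keep ratios are hypothetical.

```python
import numpy as np

def subsample_point_cloud(points, keep_ratio, rng=None):
    """Randomly keep a fraction of the points to mimic a sparser lidar."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    rng = np.random.default_rng() if rng is None else rng
    n_keep = max(1, int(round(len(points) * keep_ratio)))
    idx = rng.choice(len(points), size=n_keep, replace=False)
    return points[np.sort(idx)]  # sort to preserve the original point order
```

Applying this with several ratios (for example 1.0, 0.5, 0.25) to each training scan would expose a detector to a range of densities during training.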

SECOND

PointPillars

PointRCNN

12 of 24

Experiments and Analysis

Compare SOTAs and analyze the sensor mismatch problem

Evaluation of existing 3D detection methods

Analysis and Expectations

Methods	KITTI	Waymo	Custom
PointRCNN
PointPillars
SECOND
...

Think about methods to make a model invariant to different lidar scan patterns

Velodyne Lidar

Cepton Lidar

Resolution Invariance?

Leverage data from different lidar sensors (different datasets), along with self-supervised learning algorithms

13 of 24

A Study of Transformers in CLVISION 2022 Challenge
[Cheng-Hao Tu and Xinyu Zhou]

The Continual Learning Problem

Transformers Reduce Forgetting?

Computer Vision Tasks

Image Classification

Object Detection

Forgetting

14 of 24

Plans and algorithms

Dataset

Questions

Methods

Do Transformers reduce forgetting in continual learning paradigms for both classification and object detection?
What makes Transformers robust against forgetting?

EgoObjects by Meta in CLVISION 2022

house/office environment
6619 images in the currently released demo dataset

Swin-Transformer

ViT

ResNet-50(-FPN)

Detection:

RepPoints

Faster R-CNN

Classification:

linear classifier

Fine-tuning

EWC

LwF

Memory-replay

Backbone

Output head

CL method
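As an illustration of one of the CL methods listed on this slide, EWC adds a quadratic penalty that discourages parameters important to earlier tasks from moving: (lambda / 2) * sum_i F_i * (theta_i - theta*_i)^2, where F is a diagonal Fisher information estimate and theta* are the parameters after the previous task. The sketch below assumes parameters stored as dictionaries of NumPy arrays; the function and argument names are hypothetical.

```python
import numpy as np

def ewc_penalty(params, old_params, fisher, lam=1.0):
    """EWC regularizer: 0.5 * lam * sum_i F_i * (theta_i - theta_old_i)^2."""
    total = 0.0
    for name, theta in params.items():
        diff = np.asarray(theta, dtype=np.float64) - np.asarray(old_params[name], dtype=np.float64)
        total += float(np.sum(np.asarray(fisher[name], dtype=np.float64) * diff ** 2))
    return 0.5 * lam * total
```

During training on a new stage, this value would be added to the task loss; fine-tuning corresponds to lam = 0.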

15 of 24

Experiments and analysis

Results on Transformer + CL methods

Methods	stage 1	stage 2	stage 3
Swin-Transformer + Fine-tuning
ViT + LwF
ViT + Memory replay

Analysis and Expectation

Comparison of weight changes across components

x2 {Classification, Detection}

Uq

Uk

Uv

MLP

Swin Transformers

Conv

ResNet50

Different rates of change?
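One possible way to quantify how much each component changes between stages is the relative L2 distance between its weights before and after training. The sketch below assumes weights exported as dictionaries of NumPy arrays keyed by component name; the function name and the choice of the relative L2 norm are assumptions, not a measure fixed by the proposal.

```python
import numpy as np

def relative_weight_change(before, after):
    """Per-component relative change: ||after - before|| / ||before||."""
    changes = {}
    for name, w0 in before.items():
        w0 = np.asarray(w0, dtype=np.float64)
        w1 = np.asarray(after[name], dtype=np.float64)
        denom = np.linalg.norm(w0)
        # A zero-norm component has no meaningful relative change.
        changes[name] = float(np.linalg.norm(w1 - w0) / denom) if denom > 0 else float("nan")
    return changes
```

Comparing these values for the attention projections, the MLP blocks, and the convolution layers would show whether some components drift faster than others.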

Comparison of gains from CL methods under Transformers and CNNs

Swin Transformers

ResNet50

LwF

EWC

LwF

Different gains?

16 of 24

Lane Detection
Shree Sai Charan Nannapaneni

As part of this project, I will identify the correct lane marking type and color on each side of the car.

Motivation:

Lane detection is one of the most important tasks in autonomous driving and carries 20 points in the SAE AutoDrive Challenge II.

Challenges:

There are not many papers that cover lane detection.
I will most likely have to tune the parameters of an existing model and train it on my dataset.

17 of 24

Plans and Algorithms

Try existing models first. If the results are not satisfactory, fine-tune the network weights on my dataset.

CondLaneNet based on ResNet-101 has given good results on the CULane dataset. BezierLaneNet based on ResNet-34 has also given good results on the LLAMAS (Labelled Lane Markers) dataset.

18 of 24

Experiments and Analysis

I will be using a custom dataset of images captured at M-City (test site)

Metrics – F1 score and Speed
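For reference, the F1 score combines precision and recall from counts of true positives, false positives, and false negatives. A minimal sketch follows; how a predicted lane is matched to a ground-truth lane (and so counted as a true positive) is benchmark-specific and is assumed to be decided elsewhere. The function name is hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 = 2 * precision * recall / (precision + recall)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0  # no correct detections at all
    return 2 * precision * recall / (precision + recall)
```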

Experiment – After lane detection works, I want to add noise to the images and see how well the lanes are still detected
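The noise experiment could start from something like the sketch below, which adds zero-mean Gaussian noise to an 8-bit image. The noise model, the sigma values, and the function name are assumptions; other corruptions (blur, rain, low light) may be closer to real driving conditions.

```python
import numpy as np

def add_gaussian_noise(image, sigma, rng=None):
    """Add zero-mean Gaussian noise (std = sigma, in intensity levels) to a uint8 image."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)
    # Round and clip back to the valid 8-bit range.
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)
```

Sweeping sigma over a few values and plotting F1 against it would show how quickly detection degrades.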

19 of 24

Evaluation of Self-Supervised Learning Algorithms for Medical Applications
[Parth Kharwar]

Topic

Motivation

Large scale application

Societal impact

Problems

Sufficient evaluation

Computational limitations

Testing performance of standard algorithms on different tasks

20 of 24

Plans and algorithms

Datasets

OASIS Brains

CheXpert

CT Medical Images

Approach

Baseline using ImageNet

Recursive testing

21 of 24

Experiments and analysis

Evaluation

Metrics

Expectation

Benchmark: Self-supervised on ImageNet, Supervised on ImageNet, Supervised on Medical Data

Similar performance compared to the self-supervised baseline
Lower performance compared to the supervised baselines

Compare accuracy when training on unlabelled versus labelled data

22 of 24

Topic: End-to-end motion planning for autonomous driving systems

Presenters: Yi Mao, Zhihao Zhang

23 of 24

perception

A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, "CARLA: An Open Urban Driving Simulator," arXiv:1711.03938, 2017.

Learn from expert

Learn from the environment

planning

24 of 24

KITTI http://www.cvlibs.net/datasets/kitti/

Cityscapes semantic https://www.cityscapes-dataset.com/ with benchmark

Waymo Open Dataset: perception (3D lidar and 2D camera labels); motion (coordinate frames)

Argoverse: motion forecasting, 3D tracking, stereo depth estimation, mapping

Photo Tourism http://phototour.cs.washington.edu/

Datasets

Real World

Simulator

CARLA

Goal: combine multiple evaluation metrics into a new one