1 of 27

CSE 5539: Depth Estimation

2 of 27

LiDAR-based 3D perception

[Source: Graham Murdoch/Popular Science]

LiDAR:

  • Light Detection and Ranging sensor
  • Produces accurate 3D point clouds of the environment

3 of 27

What is the problem?

  • LiDAR is expensive: a 16-beam unit costs ~$8K and a 64-beam unit ~$75K (for comparison, an entire car: ~$20K)
  • Over-reliance on LiDAR is risky (not robust to sensor failure)
  • Can we use affordable optical cameras instead?

4 of 27

LiDAR vs. camera-based depth

5 of 27

Camera-based depth estimation

6 of 27

Pseudo-LiDAR representation

7 of 27

Stereo Depth Estimation

Another Look

8 of 27

Pseudo-LiDAR framework

  • Pseudo-LiDAR representation: gluing camera-based depth estimation to LiDAR-based 3D object detection

  • State-of-the-art camera-based depth estimators and LiDAR-based object detectors can be seamlessly incorporated!

Yan Wang, Wei-Lun Chao, Divyansh Garg, Bharath Hariharan, Mark Campbell, and Kilian Q. Weinberger, "Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving," CVPR, 2019.

Depth estimation models

  • DORN (CVPR18)
  • MC-CNN (JMLR 16)
  • GC-Net (ICCV 17)
  • PSMNet (CVPR18)
  • …

3D object detection models

  • PIXOR (CVPR 18)
  • F-PointNet (CVPR 18)
  • AVOD (IROS 18)
  • PointRCNN (CVPR 19)
  • …

9 of 27

Trivial? Not really!

  • Our explanation: the camera-based depth representation is poor, not the depth values themselves
  • LiDAR-based: 3D point clouds
  • Camera-based: 2D depth maps

[Source: Mask R-CNN, ICCV 2017]

[Source: VoxelNet, CVPR 2018]

10 of 27

Experimental results (AP:BEV / AP:3D)

[4] X. Chen, K. Kundu, Z. Zhang, H. Ma, S. Fidler, and R. Urtasun. Monocular 3d object detection for autonomous driving. In CVPR, 2016.

[5] X. Chen, K. Kundu, Y. Zhu, A. G. Berneshawi, H. Ma, S. Fidler, and R. Urtasun. 3d object proposals for accurate object class detection. In NIPS, 2015.

[16] J. Ku, M. Mozifian, J. Lee, A. Harakeh, and S. Waslander. Joint 3d proposal generation and object detection from view aggregation. In IROS, 2018.

[23] C. R. Qi, W. Liu, C. Wu, H. Su, and L. J. Guibas. Frustum pointnets for 3d object detection from rgb-d data. In CVPR, 2018.

[30] B. Xu and Z. Chen. Multi-level fusion based 3d object detection from monocular images. In CVPR, 2018.

~300% improvement

11 of 27

[Figure: qualitative comparison of the LiDAR, pseudo-LiDAR, and depth-map representations]

12 of 27

Experimental results (AP:BEV / AP:3D)


13 of 27

Improvement upon pseudo-LiDAR

  • Improved camera-based depth estimation
  • End-to-end training
  • Multi-sensor fusion

14 of 27

Stereo depth estimation

Given a rectified stereo pair (left image Il, right image Ir), estimate the disparity map D; depth then follows as

Z(u, v) = fU · b / D(u, v)

where fU is the horizontal focal length and b is the stereo baseline.
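Stereo depth recovers depth from disparity as Z = f·b/D. A minimal sketch of this conversion (the focal length and baseline values are illustrative, roughly KITTI-like assumptions):

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Convert a disparity map (pixels) to a depth map (meters) via Z = f*b/D."""
    return focal_px * baseline_m / np.maximum(disparity, eps)

# Illustrative values: f ~ 721 px, baseline ~ 0.54 m
disparity = np.array([[38.9, 9.7]])
depth = disparity_to_depth(disparity, focal_px=721.0, baseline_m=0.54)
```

Note the inverse relation: the smaller the disparity, the larger (and farther) the depth.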

15 of 27

Stereo depth estimation

[Figure: stereo pair (Left, Right) → disparity map → depth map]

16 of 27

Stereo depth estimation

[Figure: for each left-image pixel, matching costs against candidate right-image pixels form a probability distribution over disparity]

17 of 27

Stereo depth estimation

[Figure: a neural network ingests the left and right images and outputs, per pixel, a probability distribution over disparity; the predicted disparity is the expectation over this distribution]
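The expectation over the per-pixel disparity distribution can be sketched as a soft argmin over a cost volume (the (D, H, W) cost-volume layout here is an assumption for illustration):

```python
import numpy as np

def soft_argmin_disparity(cost_volume):
    """cost_volume: (D, H, W) matching costs over D candidate disparities.
    Returns the per-pixel expected disparity sum_d d * softmax(-cost)[d]."""
    neg = -cost_volume
    neg = neg - neg.max(axis=0, keepdims=True)  # shift for numerical stability
    prob = np.exp(neg) / np.exp(neg).sum(axis=0, keepdims=True)
    d = np.arange(cost_volume.shape[0]).reshape(-1, 1, 1)
    return (prob * d).sum(axis=0)
```

With a sharply peaked cost volume (lowest cost at disparity 3), the expectation lands near 3; unlike a hard argmin, this operation is differentiable, which is what lets the network be trained end to end.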

18 of 27

Stereo depth estimation

Recap: the network predicts the disparity map D from (Il, Ir); depth is then recovered as Z = fU · b / D.

19 of 27

Improved stereo depth estimation

  • Since depth Z = fU · b / D is inversely proportional to disparity D, a fixed disparity error translates into a depth error that grows with depth
  • Idea: optimize the depth error directly, rather than the disparity error
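A quick numerical check of why the depth error should be optimized directly (illustrative, KITTI-like intrinsics assumed): since dZ/dD = -f·b/D², the same 1-pixel disparity error yields a depth error that grows roughly quadratically with depth.

```python
f, b = 721.0, 0.54                     # illustrative stereo intrinsics

def depth(d):                          # Z = f*b/D
    return f * b / d

for Z in (5.0, 20.0, 50.0):
    d = f * b / Z                      # true disparity at depth Z
    err = abs(depth(d - 1.0) - Z)      # depth error from a 1-px disparity error
    print(f"Z={Z:>5.1f} m  ->  depth error ~{err:.2f} m")
```

At 5 m the 1-px error barely matters; at 50 m it shifts the estimate by several meters, which is exactly where 3D detection needs accuracy.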

20 of 27

Improved stereo depth estimation

[Figure: the Stereo depth network (SDN) maps (Il, Ir) directly to depth Z, optimizing the depth error]

21 of 27

Separated training

[Diagram: Left image + Right image → depth estimation (trained with a depth loss) → depth map → non-differentiable conversion → point cloud/voxel → 3D object detection (trained with an object detection loss) → detection results]
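The depth-map-to-point-cloud conversion in the middle of this pipeline is plain pinhole backprojection through the camera intrinsics (the intrinsic values in the usage line are illustrative assumptions):

```python
import numpy as np

def depth_to_pseudo_lidar(depth, fu, fv, cu, cv):
    """Backproject a depth map (H, W) into an (H*W, 3) point cloud.
    Pinhole model: x = (u - cu) * z / fu, y = (v - cv) * z / fv."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]          # pixel coordinates
    z = depth
    x = (u - cu) * z / fu
    y = (v - cv) * z / fv
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Illustrative intrinsics on a tiny constant-depth map
pts = depth_to_pseudo_lidar(np.full((2, 3), 10.0), fu=721.0, fv=721.0, cu=1.0, cv=0.5)
```

The quantization of these points into voxels (or their grouping for a point-based detector) is the step that is not differentiable, which is what blocks end-to-end training in the separated setup.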

22 of 27

End-to-end training

[Diagram: Left image + Right image → depth estimation (depth loss) → depth map → differentiable change of representation → point cloud/voxel → 3D object detection (object detection loss) → detection results; the detection loss now back-propagates through the representation change into the depth network]
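One way to make the change of representation differentiable (a sketch only; the RBF form below is an assumption, not the paper's exact module) is soft quantization: each point spreads a soft weight over nearby bin centers, so bin occupancies are differentiable in the point coordinates. Shown here in 1-D for clarity; real voxelization is 3-D.

```python
import numpy as np

def soft_voxel_occupancy(points_z, bin_centers, sigma=0.5):
    """Soft-assign 1-D point coordinates to bins with Gaussian weights.
    Unlike hard binning, the result is differentiable in points_z."""
    diff = points_z[:, None] - bin_centers[None, :]
    w = np.exp(-0.5 * (diff / sigma) ** 2)
    w /= w.sum(axis=1, keepdims=True)   # each point distributes unit mass
    return w.sum(axis=0)                # soft occupancy per bin

occ = soft_voxel_occupancy(np.array([1.0, 1.1, 3.0]), np.arange(0.0, 5.0, 1.0))
```

Because the occupancy varies smoothly as a point moves, the object-detection loss can push depth predictions through this step.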

23 of 27

Experimental results (AP: BEV; moderate)

Yurong You*, Yan Wang*, Wei-Lun Chao*, Divyansh Garg, Geoff Pleiss, Bharath Hariharan, Mark Campbell, and Kilian Q. Weinberger, "Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving," ICLR, 2020.

Method              IoU = 0.5   IoU = 0.7
Stereo depth map        54          20
PL                      77          56
PL++                    84          64
LiDAR                   90          88

24 of 27

Experimental results (AP: BEV; moderate)

Rui Qian*, Divyansh Garg*, Yan Wang, Yurong You, Serge Belongie, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger, and Wei-Lun Chao, "End-to-end Pseudo-LiDAR for Image-Based 3D Object Detection," CVPR, 2020.

Method              IoU = 0.5   IoU = 0.7
Stereo depth map        54          20
PL                      77          56
PL++                    84          64
E2E-PL                  85          66
LiDAR                   90          88

25 of 27

Multi-sensor fusion (depth completion/correction)

4-beam LiDAR

  • Cheap: ~$600
  • Very accurate
  • Extremely sparse

Graph-based Depth Correction (GDC)

26 of 27

Graph-based depth correction (GDC)
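A minimal sketch of the GDC idea (the uniform kNN weights and Jacobi averaging below are simplifying assumptions; the paper solves a least-squares system with weights fit to the stereo depths): the sparse but accurate LiDAR depths are held fixed as anchors, and the implied corrections diffuse to neighboring pseudo-LiDAR points over a kNN graph.

```python
import numpy as np

def gdc_sketch(xyz, stereo_z, anchor_idx, anchor_z, k=4, iters=50):
    """Propagate sparse accurate depths (anchors) to all points over a kNN
    graph built from the pseudo-LiDAR geometry. Sketch only: uniform kNN
    weights and Jacobi averaging stand in for the paper's learned weights."""
    n = len(stereo_z)
    d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :k]                # k nearest neighbors
    corr = np.zeros(n)
    corr[anchor_idx] = anchor_z - stereo_z[anchor_idx]  # known corrections
    free = np.setdiff1d(np.arange(n), anchor_idx)
    for _ in range(iters):                              # diffuse corrections
        corr[free] = corr[nbrs[free]].mean(axis=1)
    return stereo_z + corr
```

If the stereo depths carry a locally smooth bias, even two anchors are enough to pull every nearby point back toward the true surface, which is why a cheap 4-beam unit helps so much.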

27 of 27

Experimental results (AP: BEV; moderate)

Method              IoU = 0.5   IoU = 0.7
Stereo depth map        54          20
PL                      77          56
PL++                    84          64
E2E-PL                  85          66
PL++ (GDC)              88          77
LiDAR                   90          88