What You See is What You Get: Exploiting Visibility for 3D Object Detection
Peiyun Hu, Jason Ziglar, David Held, Deva Ramanan
CVPR 2020
Nicholas Vadivelu
2020/07/07
Motivation
Contributions
Ray Casting Overview (2D case)
# V is visibility: a multichannel 2D feature map
V[:] <- UNKNOWN
for each LiDAR point (a, b, c):
    x, y, z <- source
    while (x, y, z) != (a, b, c):
        V[x, y, z] <- FREE
        x, y, z <- next voxel on ray
    V[a, b, c] <- BLOCKED
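The pseudocode above can be sketched in runnable form for the 2D case. This is a minimal illustration, not the authors' implementation. It assumes a single-channel grid, LiDAR points already discretized to in-bounds integer cell indices, and Bresenham line stepping as a stand-in for "next voxel on ray". The names `UNKNOWN`, `FREE`, `BLOCKED`, `bresenham`, and `cast_rays` are hypothetical.

```python
import numpy as np

# Hypothetical label encoding for the three visibility states.
UNKNOWN, FREE, BLOCKED = 0, 1, 2


def bresenham(x0, y0, x1, y1):
    """Yield integer grid cells from (x0, y0) to (x1, y1), both inclusive."""
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    x, y = x0, y0
    while True:
        yield x, y
        if (x, y) == (x1, y1):
            return
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x += sx
        if e2 < dx:
            err += dx
            y += sy


def cast_rays(shape, source, points):
    """Build a 2D visibility grid by casting one ray per LiDAR point."""
    V = np.full(shape, UNKNOWN, dtype=np.uint8)
    for a, b in points:
        # Every cell strictly before the return is free space.
        for x, y in bresenham(source[0], source[1], a, b):
            if (x, y) != (a, b):
                V[x, y] = FREE
        # The cell containing the LiDAR return is occupied.
        V[a, b] = BLOCKED
    return V


if __name__ == "__main__":
    # Sensor at row 2, column 0; two returns on a 5 x 7 grid.
    print(cast_rays((5, 7), (2, 0), [(2, 4), (0, 3)]))
```

As in the pseudocode, a later ray can overwrite an earlier BLOCKED cell with FREE. The sketch keeps that behavior rather than guessing at how conflicts should be resolved.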
Object Augmentation
Temporal Aggregation
Approach: A Two-stream Network
Experiments
Ablation: Late vs Early Fusion
Ablation: Types of Object Augmentation
Ablation
Ablation: Object Augmentation
Ablation: Temporal Aggregation
Ablation: Visibility Stream
Related Work: Visibility
Thoughts
Thanks for Listening!