Dynamic Depth: Disentangling Object Motion and Occlusion for Unsupervised Multi-frame Monocular Depth
Monocular Depth Prediction
Unsupervised Monocular Depth Prediction
The re-projection loss is the key to unsupervised monocular depth prediction.
Pose Net
6-DOF Pose
Depth Net
Re-Projection Loss
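As a rough illustration of the re-projection loss named on this slide, the sketch below warps a source frame into the target view using a predicted depth map and a relative pose, then measures the photometric difference. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names, the nearest-neighbour sampling, and the plain L1 error are all simplifications chosen here for brevity (practical systems typically use differentiable bilinear sampling and a richer photometric term).

```python
import numpy as np

def reproject(src, depth, K, T):
    """Warp `src` into the target view.

    depth: (H, W) depth of the target frame (e.g. the Depth Net output).
    K:     3x3 camera intrinsics.
    T:     4x4 target-to-source camera transform (e.g. from the Pose Net).
    Uses nearest-neighbour sampling, an assumption made to keep this short.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(float)
    # Back-project target pixels to 3-D points in the target camera frame.
    pts = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    # Move the points into the source camera frame and project them.
    proj = K @ (T[:3, :3] @ pts + T[:3, 3:4])
    z = proj[2]
    safe_z = np.where(z > 1e-6, z, 1.0)
    us = np.rint(proj[0] / safe_z).astype(int)
    vs = np.rint(proj[1] / safe_z).astype(int)
    valid = (z > 1e-6) & (us >= 0) & (us < w) & (vs >= 0) & (vs < h)
    warped = np.zeros(h * w, dtype=float)
    warped[valid] = src[vs[valid], us[valid]]
    return warped.reshape(h, w), valid.reshape(h, w)

def reprojection_loss(target, src, depth, K, T):
    """Mean L1 photometric error over pixels that land inside the source image."""
    warped, valid = reproject(src, depth, K, T)
    return np.abs(target - warped)[valid].mean()
```

With a correct depth map and pose the warped source should reproduce the target on static surfaces, so the loss is small; a wrong depth shifts the sampled pixels and raises it. That gradient signal is what lets depth be trained without ground-truth labels.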
Multi-frame Monocular Depth Prediction
A cost volume has proven to be an effective way to leverage temporal frames to improve overall depth quality. It is also based on the re-projection geometry.
Pose Net
Depth Encoder
Re-Projection
Depth Decoder
Cost Volume
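To make the cost volume idea concrete, the sketch below warps the previous frame once per depth hypothesis and stores the per-pixel matching error for each hypothesis. It is a simplified NumPy illustration under assumptions made here: it matches raw intensities (multi-frame networks usually match encoder features), uses nearest-neighbour sampling, and the names `warp` and `build_cost_volume` are hypothetical.

```python
import numpy as np

def warp(src, depth, K, T):
    """Nearest-neighbour warp of `src` into the target view (see assumptions above)."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(float)
    pts = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    proj = K @ (T[:3, :3] @ pts + T[:3, 3:4])
    z = proj[2]
    safe_z = np.where(z > 1e-6, z, 1.0)
    us = np.rint(proj[0] / safe_z).astype(int)
    vs = np.rint(proj[1] / safe_z).astype(int)
    valid = (z > 1e-6) & (us >= 0) & (us < w) & (vs >= 0) & (vs < h)
    warped = np.zeros(h * w, dtype=float)
    warped[valid] = src[vs[valid], us[valid]]
    return warped.reshape(h, w), valid.reshape(h, w)

def build_cost_volume(target, src, K, T, depth_hypotheses):
    """Return a (D, H, W) volume of absolute matching errors, one slice per hypothesis."""
    slices = []
    for d in depth_hypotheses:
        warped, valid = warp(src, np.full(target.shape, float(d)), K, T)
        cost = np.abs(target - warped)
        cost[~valid] = np.inf  # no match available outside the source image
        slices.append(cost)
    return np.stack(slices, axis=0)
```

For a static pixel, the slice whose hypothesis is closest to the true depth should hold the lowest cost, which is the cue the depth decoder can exploit.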
Re-projection Geometry
Re-Projection
Supposed to match
Dynamic Obj Mismatch problem
Dynamic objects will cause the ‘Mismatch’ problem.
Re-Projection
Obj Motion
Mismatch!
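A small worked example may help show why object motion produces a mismatch. It assumes a deliberately simplified setup that is not taken from the slides: the camera translates sideways by `cam_shift` between the two frames, the object translates in the same direction by `obj_shift`, and depth is recovered from the observed pixel shift as if the scene were static. The function name is hypothetical.

```python
def triangulated_depth(true_depth, focal, cam_shift, obj_shift=0.0):
    """Depth a static-scene model would infer for a point that actually moved.

    The observed pixel shift depends on the relative motion between camera
    and object, while the static-scene model attributes all of it to the camera.
    """
    disparity = focal * (cam_shift - obj_shift) / true_depth
    if disparity <= 0:
        # No finite positive depth explains the observed shift.
        return float("inf")
    return focal * cam_shift / disparity

print(triangulated_depth(10.0, 500.0, 1.0))                 # static point
print(triangulated_depth(10.0, 500.0, 1.0, obj_shift=0.5))  # moving object
print(triangulated_depth(10.0, 500.0, 1.0, obj_shift=1.0))  # moves with the camera
```

Under these assumptions the static point is recovered at its true depth, the slowly moving object is placed too far away, and an object moving with the camera is pushed to infinity. The cost volume minimum therefore sits at the wrong depth for dynamic objects.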
Dynamic Obj Occlusion problem
Dynamic objects will also cause the ‘Occlusion’ problem.
Re-Projection
Obj Motion
Mismatch!
Occlusion!
Occluded!
Visible!
Motivation:
We Propose DynamicDepth:
Depth Prior Net
Pose Net
Depth Encoder
Occlusion-aware Cost Volume
Depth Decoder
Dynamic Object Motion Disentanglement (DOMD)
Dynamic Object Cycle Consistency Loss
Our Contribution:
DOMD Module
Re-project the dynamic object patch with ‘depth prior’ prediction
Depth Prior Prediction
DOMD Module
Replace the dynamic object patch with the re-projected image patch.
DOMD
DOMD Module
This replacement will alleviate the ‘Mismatch’ problem.
Re-Projection
Obj Motion
Match
Occlusion
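The two DOMD steps above (re-project the object patch with the depth prior, then replace the original patch) can be sketched roughly as follows. This is one possible reading of the slides, not the authors' code: it assumes the object patch of frame t is forward-projected into frame t-1, that object masks for both frames are available, and that pixels vacated by the original patch are zeroed and reported as a hole mask. All names are hypothetical.

```python
import numpy as np

def domd_disentangle(img_t, img_prev, mask_t, mask_prev, depth_prior, K, T_t_to_prev):
    """Replace the object patch in `img_prev` with the re-projected patch of `img_t`.

    mask_t, mask_prev: (H, W) boolean object masks in frames t and t-1.
    depth_prior:       (H, W) single-frame depth estimate for frame t,
                       assumed strictly positive on the object.
    Returns the edited previous frame and a mask of vacated (hole) pixels.
    """
    h, w = img_t.shape
    vs, us = np.nonzero(mask_t)
    pix = np.stack([us, vs, np.ones_like(us)]).astype(float)
    # Lift object pixels to 3-D with the depth prior, then project into frame t-1.
    pts = (np.linalg.inv(K) @ pix) * depth_prior[vs, us]
    proj = K @ (T_t_to_prev[:3, :3] @ pts + T_t_to_prev[:3, 3:4])
    u2 = np.rint(proj[0] / proj[2]).astype(int)
    v2 = np.rint(proj[1] / proj[2]).astype(int)
    ok = (u2 >= 0) & (u2 < w) & (v2 >= 0) & (v2 < h)

    out = img_prev.astype(float).copy()
    hole = mask_prev.copy()
    out[mask_prev] = 0.0                          # remove the original patch
    out[v2[ok], u2[ok]] = img_t[vs[ok], us[ok]]   # paste the re-projected patch
    hole[v2[ok], u2[ok]] = False                  # pasted pixels are no longer holes
    return out, hole
```

After this replacement the object sits where a static object would appear in frame t-1, so re-projection geometry matches again. The remaining hole corresponds to background that was hidden behind the object, which is the occlusion the next slides address.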
Occlusion-aware Cost Volume
…
Occlusion-aware Cost Volume
Occlusion Filling
…
Sharing Weights
…
Warp by All Depth Hypothesis
The occluded areas are filled with non-occluded cost values.
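The filling step can be illustrated with a short sketch. The slide only states that occluded areas receive non-occluded cost values; choosing the nearest non-occluded pixel in the same row is an assumption made here for simplicity, and the function name is hypothetical.

```python
import numpy as np

def fill_occluded_costs(cost_volume, occluded):
    """Copy cost vectors from visible pixels into occluded ones.

    cost_volume: (D, H, W) array, one slice per depth hypothesis.
    occluded:    (H, W) boolean mask of occluded pixels.
    Each occluded pixel takes the full cost vector of the nearest
    non-occluded pixel in its row (an assumed fill rule).
    """
    filled = cost_volume.copy()
    h, w = occluded.shape
    cols = np.arange(w)
    for r in range(h):
        visible = cols[~occluded[r]]
        if visible.size == 0:
            continue  # nothing to copy from in this row
        for c in cols[occluded[r]]:
            nearest = visible[np.argmin(np.abs(visible - c))]
            filled[:, r, c] = cost_volume[:, r, nearest]
    return filled
```

Because the whole cost vector is copied, the occluded pixel inherits a plausible cost distribution over the depth hypotheses instead of values computed from pixels that have no true correspondence.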
Occlusion-aware Re-projection Loss
Re-proj Error at t-1
Occlusion-aware Re-projection Loss
Source Frame
: From visible frame
: From occluded frame
Widely Used Per-pixel min Loss
Re-proj Error at t-1
Occlusion-aware Re-projection Loss
Source Frame
Our Occlusion-aware Loss
: From visible frame
: From occluded frame
Re-proj Error at t-1
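The contrast between the two losses can be sketched as below. The slides name only the re-projection error at t-1; the second source frame (t+1), the fallback to the per-pixel minimum in visible regions, and both function names are assumptions introduced for this illustration.

```python
import numpy as np

def per_pixel_min_loss(err_prev, err_next):
    """Widely used baseline: keep the smaller error per pixel, whichever frame it comes from."""
    return np.minimum(err_prev, err_next).mean()

def occlusion_aware_loss(err_prev, err_next, occluded_prev):
    """Never take the error from frame t-1 where that frame is known to be occluded."""
    chosen = np.where(occluded_prev, err_next, np.minimum(err_prev, err_next))
    return chosen.mean()

# Toy example: the first pixel is occluded in frame t-1 but happens to have a low error there.
err_prev = np.array([[0.1, 0.5]])
err_next = np.array([[0.4, 0.2]])
occluded_prev = np.array([[True, False]])
print(per_pixel_min_loss(err_prev, err_next))
print(occlusion_aware_loss(err_prev, err_next, occluded_prev))
```

In the toy example the per-pixel minimum picks the occluded frame for the first pixel because its error is incidentally small, while the occlusion-aware variant takes that pixel's error from the visible frame instead.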
Conclusion: Our method outperformed all compared methods on the Cityscapes and KITTI datasets.
Conclusion: Our method significantly outperformed all compared methods, especially in dynamic object areas.