Differentiable Raycasting for Self-supervised Occupancy Forecasting

ECCV 2022

Tarasha Khurana*1 Peiyun Hu*1 Achal Dave1 Jason Ziglar2 David Held1,2 Deva Ramanan1,2
1Robotics Institute, Carnegie Mellon University
2Argo AI
*Equal contribution

We propose emergent occupancy as a novel self-supervised representation for motion planning. Occupancy is independent of changes in sensor pose Δy. This is in contrast to ego-centric freespace, the representation used in prior work on self-supervised learning from LiDAR, which changes with both (a-b) sensor pose motion Δy and (b-c) scene motion Δs. We use differentiable raycasting to naturally decouple ego motion from scene motion, allowing us to learn to forecast occupancy by self-supervision from pose-aligned LiDAR sweeps.

Abstract

Motion planning for safe autonomous driving requires learning how the environment around an ego-vehicle evolves with time. Ego-centric perception of driveable regions in a scene changes not only with the motion of actors in the environment, but also with the movement of the ego-vehicle itself. Self-supervised representations proposed for large-scale planning, such as ego-centric freespace, confound these two motions, making the representation difficult to use for downstream motion planners. In this paper, we use geometric occupancy as a natural alternative to view-dependent representations such as freespace. Occupancy maps naturally disentangle the motion of the environment from the motion of the ego-vehicle. However, one cannot directly observe the full 3D occupancy of a scene (due to occlusion), making it difficult to use as a signal for learning. Our key insight is to use differentiable raycasting to "render" future occupancy predictions into future LiDAR sweep predictions, which can be compared with ground-truth sweeps for self-supervised learning. The use of differentiable raycasting allows occupancy to emerge as an internal representation within the forecasting network. In the absence of ground-truth occupancy, we quantitatively evaluate the forecasting of raycasted LiDAR sweeps and show improvements of up to 15 F1 points. For downstream motion planners, where emergent occupancy can be directly used to guide non-driveable regions, this representation reduces the number of collisions with objects by up to 17% in relative terms compared to freespace-centric motion planners.
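
To make the idea of "rendering" occupancy into a LiDAR return concrete, here is a minimal sketch of raycasting through per-cell occupancy probabilities along a single ray. It is an illustration under our own assumptions, not the released implementation: the function names `render_ray_depth` and `l1_loss`, the uniform cell spacing `step`, the expected-depth formulation, and the L1 comparison are all hypothetical choices made for this example, and the renderer and loss used in the paper may differ.

```python
def render_ray_depth(occupancy, step=1.0):
    """Expected stopping depth of one ray through a row of occupancy cells.

    occupancy: per-cell occupancy probabilities in [0, 1], ordered from
               the sensor outwards (hypothetical 1D discretization).
    step:      assumed uniform cell length along the ray.
    """
    transmittance = 1.0   # probability the ray has not been stopped so far
    expected_depth = 0.0
    for i, p in enumerate(occupancy):
        hit = transmittance * p                    # ray first stops in cell i
        expected_depth += hit * (i + 0.5) * step   # depth at the cell center
        transmittance *= 1.0 - p
    # A ray that passes through every cell is assigned the maximum range.
    expected_depth += transmittance * len(occupancy) * step
    return expected_depth


def l1_loss(predicted_depth, observed_depth):
    """Absolute error between a rendered depth and an observed LiDAR depth."""
    return abs(predicted_depth - observed_depth)


# A confident obstacle in the third cell stops the ray at that cell's center.
predicted = render_ray_depth([0.0, 0.0, 1.0, 0.0])
loss = l1_loss(predicted, observed_depth=2.5)
```

The rendered depth is a smooth function of each occupancy probability, so if the same arithmetic is written in an automatic-differentiation framework, the error against an observed sweep can be backpropagated to the predicted occupancy. This is the sense in which occupancy can be learned without ground-truth occupancy labels.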


Code

Our code has been released at https://github.com/tarashakhurana/emergent-occ-forecasting.

Acknowledgments

This work was supported by the CMU Argo AI Center for Autonomous Vehicle Research.

Webpage template stolen from the amazing Peiyun Hu.