Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving

hustvl/DiffusionDrive

DiffusionDrive

Truncated Diffusion Model for End-to-End Autonomous Driving

Bencheng Liao1,2, Shaoyu Chen2,3, Haoran Yin3, Bo Jiang2, Cheng Wang1,2, Sixu Yan2, Xinbang Zhang3, Xiangyu Li3, Ying Zhang3, Qian Zhang3, Xinggang Wang2 📧

1 Institute of Artificial Intelligence, HUST, 2 School of EIC, HUST, 3 Horizon Robotics

(📧) corresponding author, [email protected]

News

  • Jan. 18th, 2025: We released the initial version of the code and weights for nuScenes, along with documentation and training/evaluation scripts. Please run git checkout nusc to use it.
  • Dec. 16th, 2024: We released the initial version of the code and weights for NAVSIM, along with documentation and training/evaluation scripts.
  • Nov. 25th, 2024: We released our paper on arXiv. Code/Models are coming soon. Please stay tuned! ☕️

Introduction

Diffusion policies show promising multimodality and distributional expressivity in robotics, but they are not yet ready for real-time end-to-end autonomous driving in more dynamic, open-world traffic scenes. To bridge this gap, we propose DiffusionDrive, a novel truncated diffusion model for real-time end-to-end autonomous driving. Compared with the vanilla diffusion policy, it is much faster (10x fewer diffusion denoising steps), more accurate (3.5 higher PDMS on NAVSIM), and more diverse (64% higher mode diversity score). Without bells and whistles, DiffusionDrive achieves a record-breaking 88.1 PDMS on the NAVSIM benchmark with the same ResNet-34 backbone by learning directly from human demonstrations, while running in real time at 45 FPS.
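As a rough illustration of what the 45 FPS figure means in practice, the snippet below converts a frame rate into an average per-frame time budget. It is a minimal sketch, assuming frames are processed one at a time; the helper name `frame_budget_ms` is hypothetical and not part of the DiffusionDrive codebase.

```python
def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# 45 FPS is the speed reported in the introduction above.
budget = frame_budget_ms(45)
print(f"{budget:.1f} ms per frame")
```

At 45 FPS the whole planning pass has to fit in roughly 22 ms per frame on average, so every extra denoising step draws on that same budget.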

Figure: Truncated Diffusion Policy.

Figure: Pipeline of DiffusionDrive. DiffusionDrive is highly flexible to integrate with onboard sensor data and existing perception modules.

Qualitative Results on NAVSIM Navtest Split

  • Going straight with car-following and lane-changing behaviors.
  • Going straight with diverse lane-changing behavior, which interacts with traffic light and stops at the stop line.
  • Turning left with diverse lane-changing behavior, which interacts with surrounding agents.
  • Turning right with car-following and overtaking behaviors.

Video Demo on Real-world Application

final_github.mp4

Getting Started

Checkpoint

Results on NAVSIM

| Method | Model Size | Backbone | PDMS | Weight Download |
| --- | --- | --- | --- | --- |
| DiffusionDrive | 60M | ResNet-34 | 88.1 | Hugging Face |

Results on nuScenes

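| Method | Backbone | Weight | Log | L2 (m) 1s | L2 (m) 2s | L2 (m) 3s | L2 (m) Avg | Col. (%) 1s | Col. (%) 2s | Col. (%) 3s | Col. (%) Avg |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DiffusionDrive | ResNet-50 | HF | Github | 0.27 | 0.54 | 0.90 | 0.57 | 0.03 | 0.05 | 0.16 | 0.08 |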
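The Avg columns in the nuScenes results are consistent with a plain arithmetic mean over the 1s, 2s, and 3s horizons. The snippet below is a small sanity check under that assumption. The README does not state the averaging protocol, and the helper name `horizon_mean` is hypothetical.

```python
# Per-horizon nuScenes numbers copied from the results row above.
l2_m = {"1s": 0.27, "2s": 0.54, "3s": 0.90}     # L2 displacement error, meters
col_pct = {"1s": 0.03, "2s": 0.05, "3s": 0.16}  # collision rate, percent


def horizon_mean(values: dict) -> float:
    """Arithmetic mean over the listed horizons, rounded to two decimals.

    Assumption: Avg is a simple mean of the 1s/2s/3s entries.
    """
    return round(sum(values.values()) / len(values), 2)


avg_l2 = horizon_mean(l2_m)
avg_col = horizon_mean(col_pct)
print(avg_l2, avg_col)
```

Both results match the reported Avg entries (0.57 m and 0.08%), which supports the simple-mean reading of those columns.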

Contact

If you have any questions, please contact Bencheng Liao via email ([email protected]).

Acknowledgement

DiffusionDrive is greatly inspired by the following outstanding contributions to the open-source community: NAVSIM, Transfuser, Diffusion Policy, MapTR, VAD, SparseDrive.

Citation

If you find DiffusionDrive useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.

 @article{diffusiondrive,
  title={DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving},
  author={Bencheng Liao and Shaoyu Chen and Haoran Yin and Bo Jiang and Cheng Wang and Sixu Yan and Xinbang Zhang and Xiangyu Li and Ying Zhang and Qian Zhang and Xinggang Wang},
  journal={arXiv preprint arXiv:2411.15139},
  year={2024}
}