jxbbb / TOD3Cap
[ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
☆113 Updated 3 weeks ago
Alternatives and similar repositories for TOD3Cap:
Users who are interested in TOD3Cap are comparing it to the libraries listed below.
- ☆42 Updated 2 months ago
- ☆83 Updated 2 months ago
- Doe-1: Closed-Loop Autonomous Driving with Large World Model ☆88 Updated 2 months ago
- Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model ☆78 Updated 3 months ago
- Source code for NeurIPS paper "POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images" ☆104 Updated 2 months ago
- [NeurIPS 2024] DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model ☆65 Updated 3 months ago
- [NeurIPS 2024] A Unified Framework for 3D Scene Understanding ☆135 Updated 4 months ago
- [CVPR 2025] GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding ☆122 Updated this week
- BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence ☆34 Updated last week
- Project Page for GaussianFormer ☆25 Updated 9 months ago
- Official Code Release of Delphi ☆54 Updated 9 months ago
- Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving (AAAI-25) ☆34 Updated last month
- OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving ☆168 Updated 9 months ago
- [ICRA'2024] MonoOcc: Digging into Monocular Semantic Occupancy Prediction ☆97 Updated last year
- [ECCV 2024] Monocular Occupancy Prediction for Scalable Indoor Scenes ☆54 Updated 6 months ago
- Code release for our NeurIPS 2023 paper "Uni3DETR: Unified 3D Detection Transformer", our ECCV 2024 paper "OV-Uni3DETR: Towards Unified O… ☆96 Updated 7 months ago
- [ECCV 2024] Occupancy as Set of Points ☆88 Updated 8 months ago
- Official implementation of "LidarDM: Generative LiDAR Simulation in a Generated World" (ICRA 2025) ☆148 Updated 10 months ago
- ☆35 Updated 3 months ago
- Code for CVPR2025 paper: Generating Multimodal Driving Scenes via Next-Scene Prediction ☆33 Updated last week
- ☆90 Updated last year
- Code for "Open Vocabulary Monocular 3D Object Detection" ☆40 Updated last month
- ☆104 Updated 8 months ago
- [CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding ☆51 Updated 7 months ago
- Code&Data for Grounded 3D-LLM with Referent Tokens ☆108 Updated 2 months ago
- BEVGen ☆73 Updated last year
- [CVPR2024] Official Repository of Paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving" ☆220 Updated 7 months ago
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding ☆91 Updated 4 months ago
- [CVPR 2025] ReconDreamer ☆118 Updated 3 months ago
- ☆77 Updated last year