[ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
☆129 · Mar 1, 2025 · Updated last year
Alternatives and similar repositories for TOD3Cap
Users interested in TOD3Cap are comparing it to the repositories listed below.
- [CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding ☆62 · Aug 3, 2024 · Updated last year
- CoRL2024 | Hint-AD: Holistically Aligned Interpretability for End-to-End Autonomous Driving ☆72 · Oct 30, 2024 · Updated last year
- [ECCV 2024] Embodied Understanding of Driving Scenarios ☆208 · Jul 2, 2025 · Updated 8 months ago
- [ICRA'2024] MonoOcc: Digging into Monocular Semantic Occupancy Prediction ☆115 · Oct 23, 2023 · Updated 2 years ago
- [ECCV 2024] Official GitHub repository for the paper "LingoQA: Visual Question Answering for Autonomous Driving" ☆201 · Sep 26, 2024 · Updated last year
- ☆39 · Jun 8, 2024 · Updated last year
- Official PyTorch implementation of CODA-LM (https://arxiv.org/abs/2404.10595) ☆100 · Dec 5, 2024 · Updated last year
- ☆71 · Aug 12, 2024 · Updated last year
- [CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Langu… ☆311 · Jul 17, 2024 · Updated last year
- 😎 up-to-date & curated list of awesome 3D Visual Grounding papers, methods & resources. ☆261 · Jan 14, 2026 · Updated last month
- [CVPR 2024] A world model for autonomous driving. ☆412 · Dec 7, 2023 · Updated 2 years ago
- ☆45 · Apr 14, 2023 · Updated 2 years ago
- Code for NeurIPS 2024 "Dual-frame Fluid Motion Estimation with Test-time Optimization and Zero-divergence Loss" ☆12 · Oct 13, 2024 · Updated last year
- Fine-Grained Evaluation of Large Vision-Language Models in Autonomous Driving (ICCV 2025) ☆36 · May 29, 2025 · Updated 9 months ago
- [ICCV 23] A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection ☆12 · Apr 12, 2024 · Updated last year
- Code&Data for Grounded 3D-LLM with Referent Tokens ☆132 · Jan 5, 2025 · Updated last year
- Constraint Satisfaction Visual Grounding ☆15 · Aug 10, 2025 · Updated 6 months ago
- ☆42 · Jan 26, 2023 · Updated 3 years ago
- ☆576 · Feb 22, 2026 · Updated last week
- [ECCV 2024] Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving ☆99 · Jan 1, 2024 · Updated 2 years ago
- [ECCV 2024] 3D World Model for Autonomous Driving ☆524 · Apr 12, 2024 · Updated last year
- [AAAI 2024] The official implementation of the paper "3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Refer… ☆44 · Dec 20, 2023 · Updated 2 years ago
- OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving ☆195 · May 31, 2024 · Updated last year
- [CVPR 2024 Highlight] Visual Point Cloud Forecasting ☆346 · Jul 2, 2025 · Updated 8 months ago
- [COLING 2025] Idea23D: Collaborative LMM Agents Enable 3D Model Generation from Interleaved Multimodal Inputs ☆52 · Jan 22, 2025 · Updated last year
- This repository is an official implementation of ADAPT: Action-aware Driving Caption Transformer, accepted by ICRA 2023. ☆418 · Jun 11, 2024 · Updated last year
- Driving Everywhere with Large Language Model Policy Adaptation ☆17 · Jul 4, 2024 · Updated last year
- (ICLR2025) Enhancing End-to-End Autonomous Driving with Latent World Model ☆317 · Jun 29, 2025 · Updated 8 months ago
- 【IEEE T-IV】A systematic survey of multi-modal and multi-task visual understanding foundation models for driving scenarios ☆51 · May 26, 2024 · Updated last year
- [AAAI 2024] Mono3DVG: 3D Visual Grounding in Monocular Images, AAAI, 2024 ☆66 · Apr 9, 2024 · Updated last year
- [ECCV 2024] GenAD: Generative End-to-End Autonomous Driving ☆475 · May 27, 2025 · Updated 9 months ago
- [AAAI 2024] NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario. ☆227 · Nov 1, 2024 · Updated last year
- A curated list of awesome knowledge-driven autonomous driving (continually updated) ☆495 · Jun 7, 2024 · Updated last year
- [ECCV 2024] Asynchronous Large Language Model Enhanced Planner for Autonomous Driving ☆111 · May 28, 2025 · Updated 9 months ago
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆43 · Dec 9, 2024 · Updated last year
- ☆42 · Oct 19, 2022 · Updated 3 years ago
- Code for "Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers" (NeurIPS 2024) ☆206 · Oct 20, 2025 · Updated 4 months ago
- [ICCV 2025] A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World ☆373 · Oct 21, 2025 · Updated 4 months ago
- Official implementation of ECCV24 paper "SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding" ☆278 · Mar 19, 2025 · Updated 11 months ago