XiandaGuo / Drive-MLLM
☆31Updated last week
Alternatives and similar repositories for Drive-MLLM:
Users who are interested in Drive-MLLM are comparing it to the repositories listed below.
- ☆56Updated 6 months ago
- Official repository for paper "Can LVLMs Obtain a Driver’s License? A Benchmark Towards Reliable AGI for Autonomous Driving"☆27Updated 2 weeks ago
- Doe-1: Closed-Loop Autonomous Driving with Large World Model☆85Updated last month
- ☆54Updated 2 months ago
- Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives☆52Updated 2 weeks ago
- [CVPR 2024] LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs☆29Updated 11 months ago
- Official PyTorch implementation of CODA-LM (https://arxiv.org/abs/2404.10595)☆84Updated 3 months ago
- Official repository for the NuScenes-MQA. This paper is accepted by LLVA-AD Workshop at WACV 2024.☆25Updated last year
- Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving☆79Updated last year
- [AAAI2025] Language Prompt for Autonomous Driving☆132Updated 2 months ago
- 【IEEE T-IV】A systematic survey of multi-modal and multi-task visual understanding foundation models for driving scenarios☆49Updated 9 months ago
- [NeurIPS 2024] DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model☆61Updated 3 months ago
- [CVPR 2024] MAPLM: A Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding☆125Updated last year
- [ECCV 2024] Asynchronous Large Language Model Enhanced Planner for Autonomous Driving☆64Updated last week
- ☆69Updated 5 months ago
- [ECCV 2024] Official implementation for "RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception"☆27Updated 3 months ago
- [AAAI24] This is the implementation for the paper M-BEV: Masked BEV Perception for Robust Autonomous Driving☆38Updated 11 months ago
- CoMamba: Real-time Cooperative Perception Unlocked with State Space Models☆21Updated 5 months ago
- ☆12Updated 9 months ago
- [IROS 2023] DualCross: Cross-Modality Cross-Domain Adaptation for Monocular BEV Perception☆29Updated last year
- [ECCV2024] UniM2AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving☆53Updated 6 months ago
- [ECCV 2024] Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression☆40Updated 5 months ago
- (ICLR2025) Enhancing End-to-End Autonomous Driving with Latent World Model☆120Updated last week
- [Official] [IROS 2024] A goal-oriented planning to lift VLN performance for Closed-Loop Navigation: Simple, Yet Effective☆28Updated 11 months ago
- Official code for ECCV 2024 paper "Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection"☆14Updated 8 months ago
- Code release for the ECCV 2024 paper 'Fully Test-Time Adaptation for Monocular 3D Object Detection'☆42Updated 3 months ago
- [ICCV 2023] GeoMIM: towards better 3d knowledge transfer via masked image modeling for multi-view 3d understanding☆47Updated last year
- CoRL2024 | Hint-AD: Holistically Aligned Interpretability for End-to-End Autonomous Driving☆55Updated 4 months ago
- ☆28Updated 6 months ago