sled-group / DOROTHIE
Official Code for DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents (Findings of EMNLP 2022)
☆17 · Updated last year
Related projects
Alternatives and complementary repositories for DOROTHIE
- A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and A… ☆35 · Updated this week
- A Multi-Modal Large Language Model with Retrieval-augmented In-context Learning capacity designed for generalisable and explainable end-t… ☆75 · Updated last month
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World ☆122 · Updated 3 weeks ago
- Official Implementation of 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs ☆30 · Updated 5 months ago
- [Official] [IROS 2024] A goal-oriented planning to lift VLN performance for Closed-Loop Navigation: Simple, Yet Effective ☆26 · Updated 7 months ago
- Official implementation for CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding ☆42 · Updated last year
- Code for Stable Control Representations ☆17 · Updated 5 months ago
- Scaffold Prompting to promote LMMs ☆30 · Updated 6 months ago
- Can 3D Vision-Language Models Truly Understand Natural Language? ☆21 · Updated 7 months ago
- A paper list of world model ☆25 · Updated 6 months ago
- Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving ☆72 · Updated 10 months ago
- [ICLR 2023] SQA3D for embodied scene understanding and reasoning ☆117 · Updated last year
- Official repository of S-Agents: Self-organizing Agents in Open-ended Environment ☆17 · Updated 8 months ago
- [ECCV 2024] The official code for "Dolphins: Multimodal Language Model for Driving" ☆47 · Updated 4 months ago
- [NeurIPS 2024] DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model ☆29 · Updated last week
- Code & Data for Grounded 3D-LLM with Referent Tokens ☆89 · Updated last month
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks ☆59 · Updated last month
- [NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings" ☆110 · Updated 3 months ago
- [ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities ☆53 · Updated last month
- [ECCV 2022] Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining ☆80 · Updated last year
- Official PyTorch implementation of CODA-LM (https://arxiv.org/abs/2404.10595) ☆68 · Updated 2 weeks ago
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding ☆69 · Updated 3 weeks ago
- Official implementation of ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment" ☆190 · Updated last year
- Official implementation of WebVLN: Vision-and-Language Navigation on Websites ☆24 · Updated 10 months ago