B0B8K1ng / WMNavigation
☆17 Updated 2 weeks ago
Alternatives and similar repositories for WMNavigation:
Users who are interested in WMNavigation are comparing it to the repositories listed below.
- BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence ☆38 Updated last week
- Official repository of General Scene Adaptation for Vision-and-Language Navigation (ICLR'2025) ☆26 Updated 2 weeks ago
- [TMLR 2024] repository for VLN with foundation models ☆81 Updated last week
- ☆18 Updated this week
- [CVPR 2025] RoomTour3D - Geometry-aware, cheap and automatic data from web videos for embodied navigation ☆29 Updated 2 weeks ago
- [CVPR 2024] The code for paper 'Towards Learning a Generalist Model for Embodied Navigation' ☆35 Updated 11 months ago
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding ☆93 Updated 4 months ago
- [ECCV24] Navigation Instruction Generation with BEV Perception and Large Language Models ☆27 Updated 8 months ago
- Official implementation of Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation (CoRL'24). ☆60 Updated 3 weeks ago
- [CVPR'25] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding ☆93 Updated 3 weeks ago
- Code of 3DMIT: 3D MULTI-MODAL INSTRUCTION TUNING FOR SCENE UNDERSTANDING ☆29 Updated 8 months ago
- ☆22 Updated 2 months ago
- [CVPR 2025] UniGoal: Towards Universal Zero-shot Goal-oriented Navigation ☆36 Updated 2 weeks ago
- [CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding ☆52 Updated 8 months ago
- Official code for ECCV 2024 paper "Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection" ☆14 Updated 8 months ago
- Official implementation of SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts ☆17 Updated 3 months ago
- [ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes ☆113 Updated last month
- [ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities ☆67 Updated 5 months ago
- ☆49 Updated 6 months ago
- [AAAI-25 Oral] Official Implementation of "FLAME: Learning to Navigate with Multimodal LLM in Urban Environments" ☆44 Updated last month
- Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method (CVPR-25) ☆50 Updated last week
- Project Page for GaussianFormer ☆25 Updated 10 months ago
- ☆36 Updated last month
- ☆44 Updated 2 months ago
- [ECCV 2024] Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression ☆41 Updated 6 months ago
- Code of the paper "NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning" (TPAMI 2025) ☆57 Updated 2 weeks ago
- CoRL2024 | Hint-AD: Holistically Aligned Interpretability for End-to-End Autonomous Driving ☆55 Updated 5 months ago
- [IJCV 2024] ☆14 Updated 4 months ago
- Benchmark and model for step-by-step reasoning in autonomous driving. ☆38 Updated 2 weeks ago
- [CVPR 2025] CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos ☆66 Updated last week