InternRobotics / VLM-Grounder
[CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
☆110 Updated 2 months ago
Alternatives and similar repositories for VLM-Grounder
Users who are interested in VLM-Grounder are comparing it to the repositories listed below.
- [CVPR'25] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding ☆157 Updated 3 months ago
- Unifying 2D and 3D Vision-Language Understanding ☆98 Updated last week
- InternRobotics' open platform for building generalized navigation foundation models. ☆106 Updated this week
- ☆49 Updated 9 months ago
- Code&Data for Grounded 3D-LLM with Referent Tokens ☆126 Updated 6 months ago
- [ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation ☆162 Updated last month
- [CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding ☆55 Updated 11 months ago
- ☆108 Updated last week
- Official implementation of the paper: "StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling" ☆144 Updated 2 weeks ago
- [RAL 2024] OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding ☆28 Updated 5 months ago
- [CVPR 2025] Source codes for the paper "3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning" ☆157 Updated last month
- Code of 3DMIT: 3D MULTI-MODAL INSTRUCTION TUNING FOR SCENE UNDERSTANDING ☆30 Updated last year
- ☆68 Updated 6 months ago
- [ICLR 2025] Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention ☆23 Updated 5 months ago
- [NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding ☆95 Updated 5 months ago
- [CVPR 2024] Memory-based Adapters for Online 3D Scene Perception ☆120 Updated 4 months ago
- Official implementation of Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation (CoRL'24). ☆67 Updated 4 months ago
- ☆64 Updated 2 months ago
- Open-source implementations on real robots ☆34 Updated 8 months ago
- ImOV3D: Learning Open Vocabulary Point Clouds 3D Object Detection from Only 2D Images (NeurIPS2024) ☆82 Updated 7 months ago
- [NeurIPS 2024] Official code repository for MSR3D paper ☆60 Updated last month
- Official code for the CVPR 2025 paper "Navigation World Models". ☆327 Updated 3 weeks ago
- [CVPR 2025] Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation ☆153 Updated last month
- 3DGraphLLM is a model that uses a 3D scene graph and an LLM to perform 3D vision-language tasks. ☆67 Updated 3 months ago
- Official repository of General Scene Adaptation for Vision-and-Language Navigation (ICLR'2025) ☆47 Updated 3 months ago
- 4D Panoptic Scene Graph Generation (NeurIPS'23 Spotlight) ☆111 Updated 4 months ago
- Official implementation of NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments (ICCV'25). ☆22 Updated 3 weeks ago
- Official implementation of ECCV24 paper "SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding" ☆247 Updated 4 months ago
- ☆20 Updated 2 months ago
- [ICCV 2025] Detect Anything 3D in the Wild ☆161 Updated 3 weeks ago