raphael-sch / VELMALinks

VELMA agent for VLN in Street View

☆28

Alternatives and similar repositories for VELMA

Users that are interested in VELMA are comparing it to the libraries listed below

Sorting:

expectorlin / NavCoT
Code of the paper "NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning" (TPAMI 2025)
☆115Updated 5 months ago
wzcai99 / Pixel-Navigator
Official GitHub Repository for Paper "Bridging Zero-shot Object Navigation and Foundation Models through Pixel-Guided Navigation Skill", …
☆123Updated last year
GengzeZhou / NavGPT
[AAAI 2024] Official implementation of NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
☆289Updated 2 years ago
Ram81 / goat-bench
☆115Updated last year
zd11024 / NaviLLM
[CVPR 2024] The code for paper 'Towards Learning a Generalist Model for Embodied Navigation'
☆212Updated last year
xyz9911 / FLAME
[AAAI-25 Oral] Official Implementation of "FLAME: Learning to Navigate with Multimodal LLM in Urban Environments"
☆66Updated 3 weeks ago
cshizhe / VLN-DUET
Official implementation of Think Global, Act Local: Dual-scale GraphTransformer for Vision-and-Language Navigation (CVPR'22 Oral).
☆233Updated 2 years ago
chen-judge / MapGPT
[ACL 24] The official implementation of MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation.
☆116Updated 6 months ago
LaVi-Lab / NaviLLM
[CVPR 2024] The code for paper 'Towards Learning a Generalist Model for Embodied Navigation'
☆56Updated last year
GengzeZhou / NavGPT-2
[ECCV 2024] Official implementation of NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models
☆220Updated last year
LYX0501 / DiscussNav
☆37Updated last year
Rongtao-Xu / Awesome-LLM-EN
☆120Updated 2 years ago
jackyzengl / GRID
☆28Updated 5 months ago
gunagg / zson
ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings. NeurIPS 2022
☆95Updated 2 years ago
HCPLab-SYSU / EXPRESS-Bench
Embodied Question Answering (EQA) benchmark and method (ICCV 2025)
☆42Updated 3 months ago
zhangyuejoslin / VLN-Survey-with-Foundation-Models
[TMLR 2024] repository for VLN with foundation models
☆213Updated last month
LYX0501 / InstructNav
☆179Updated 8 months ago
Stanford-ILIAD / explore-eqa
Public release for "Explore until Confident: Efficient Exploration for Embodied Question Answering"
☆71Updated last year
JeremyLinky / YouTube-VLN
[ICCV'23] Learning Vision-and-Language Navigation from YouTube Videos
☆64Updated 11 months ago
expectorlin / CONSOLE
Code of the paper "Correctable Landmark Discovery via Large Models for Vision-Language Navigation" (TPAMI 2024)
☆16Updated last year
CrystalSixone / VLN-GOAT
Repository for Vision-and-Language Navigation via Causal Learning (Accepted by CVPR 2024)
☆92Updated 5 months ago
vdorbala / LGX
Code for LGX (Language Guided Exploration). We use LLMs to perform embodied robot navigation in a zero-shot manner.
☆66Updated 2 years ago
XinyuSun / FGPrompt
official implementation of NeurIPS 2023 paper "FGPrompt: Fine-grained Goal Prompting for Image-goal Navigation"
☆37Updated last year
XinyuSun / PSL-InstanceNav
official implementation for ECCV 2024 paper "Prioritized Semantic Learning for Zero-shot Instance Navigation"
☆42Updated 5 months ago
Feliciaxyao / ICML2024-FSTTA
Fast-Slow Test-time Adaptation for Online Vision-and-Language Navigation
☆28Updated last year
yhanCao / CogNav_ObjNav
the official implementation of CogNav [ICCV 2025]
☆50Updated 2 months ago
ybgdgh / L3MVN
Leveraging Large Language Models for Visual Target Navigation
☆141Updated 2 years ago
naokiyokoyama / ovon
Open Vocabulary Object Navigation
☆101Updated 6 months ago
B0B8K1ng / WMNavigation
[IROS'25 Oral] WMNav: Integrating Vision-Language Models into World Models for Object Goal Navigation
☆131Updated last month
bagh2178 / SG-Nav
[NeurIPS 2024] SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation
☆285Updated 2 months ago