[ICCV 25] Official repository of "Collaborative Instance Object Navigation: Leveraging Uncertainty-Awareness to Minimize Human-Agent Dialogues"
☆25Dec 6, 2025Updated 2 months ago
Alternatives and similar repositories for CoIN
Users who are interested in CoIN are comparing it to the repositories listed below
- Code for ICRA24 paper "Think, Act, and Ask: Open-World Interactive Personalized Robot Navigation" Paper: https://arxiv.org/abs/2310.07968 …☆31Jun 18, 2024Updated last year
- Starter code and instructions for participating in MultiON Challenge 2021.☆12Jun 12, 2024Updated last year
- ☆18Jun 10, 2025Updated 8 months ago
- DreamSmooth: Improving Model-Based RL with Reward Smoothing (ICLR 2024)☆12May 6, 2024Updated last year
- ☆14Aug 28, 2024Updated last year
- Code for MME-SID accepted to CIKM 2025 Full Research track.☆27Oct 29, 2025Updated 4 months ago
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs☆23Sep 21, 2025Updated 5 months ago
- quagga☆10Apr 7, 2020Updated 5 years ago
- Agentic Keyframe Search for Video Question Answering☆16Apr 7, 2025Updated 10 months ago
- Dataset for Image-Goal Navigation in Habitat☆11Feb 24, 2022Updated 4 years ago
- ☆16Oct 9, 2024Updated last year
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight)☆20Aug 1, 2025Updated 7 months ago
- Project for SNARE benchmark☆11Jun 5, 2024Updated last year
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision☆12Sep 17, 2023Updated 2 years ago
- Code for paper "W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering"☆15Oct 2, 2025Updated 5 months ago
- Code for our works: LCSA, C2SLR, and SRM☆20Nov 22, 2024Updated last year
- HARPER is a HRI dataset for 3D Human Pose Estimation and Forecasting from the Robot’s Perspective.☆13Sep 2, 2025Updated 6 months ago
- [ICCV 2025] Object-centric Video Question Answering with Visual Grounding and Referring☆24Aug 8, 2025Updated 6 months ago
- [EMNLP 2024 Industry track] MERLIN : Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank P…☆14Mar 4, 2025Updated last year
- My version of Flappy Bird written in 8086 Assembly ^-^☆13Nov 2, 2024Updated last year
- This repository contains all files and exercises done from chapter 1 to 6, including some exercises for other chapters☆16Jun 10, 2023Updated 2 years ago
- Repository for Offline Visual Representation Learning v1 and v2☆13Jan 24, 2023Updated 3 years ago
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering☆16Oct 31, 2024Updated last year
- ☆12Jan 10, 2025Updated last year
- ☆12Oct 5, 2020Updated 5 years ago
- SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability☆16May 8, 2025Updated 9 months ago
- [ICCV'23] UATVR: Uncertainty-Adaptive Text-Video Retrieval☆13Nov 5, 2023Updated 2 years ago
- ☆14Dec 16, 2021Updated 4 years ago
- ☆13Jan 3, 2024Updated 2 years ago
- Official GitHub repository for the paper "Adversarial Attacks on Robotic Vision Language Action Models"☆29May 28, 2025Updated 9 months ago
- RefTeacher is a strong baseline method for Semi-Supervised Referring Expression Comprehension.☆13May 26, 2023Updated 2 years ago
- [ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts☆14Jan 13, 2025Updated last year
- The code implementation for UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings (ICLR 2026).☆43Feb 25, 2026Updated last week
- [ICLR 2025] Data-Augmented Phrase-Level Alignment for Mitigating Object Hallucination☆19Jan 27, 2025Updated last year
- Code for Generalizable Articulated Object Reconstruction from Casually Captured RGBD Videos☆25Nov 17, 2025Updated 3 months ago
- Code for Multi-Aspect Cross-modal Quantization for Generative Recommendation. (AAAI 2026 Oral)☆30Dec 9, 2025Updated 2 months ago
- Beihang University, Principles of Computer Organization, course design project☆11Dec 14, 2021Updated 4 years ago
- A tiny PyTorch library for depth map manipulations.☆13Apr 11, 2024Updated last year
- [ICLR 2025] This repo is the official implementation of our paper "Learning Fine-Grained Representations through Textual Token Disentangl…☆22Jul 28, 2025Updated 7 months ago