google-research-datasets / RxR
Room-across-Room (RxR) is a large-scale, multilingual dataset for Vision-and-Language Navigation (VLN) in Matterport3D environments. It contains 126k navigation instructions in English, Hindi and Telugu, and 126k navigation following demonstrations. Both annotation types include dense spatiotemporal alignments between the text and the visual per…
☆124 Updated last year
Alternatives and similar repositories for RxR:
Users who are interested in RxR are comparing it to the repositories listed below.
- Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal tra… ☆88 Updated last year
- Code of the CVPR 2021 Oral paper: A Recurrent Vision-and-Language BERT for Navigation ☆159 Updated 2 years ago
- Vision and Language Agent Navigation ☆73 Updated 3 years ago
- PyTorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation" ☆73 Updated 6 months ago
- REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments ☆116 Updated last year
- 🔀 Visual Room Rearrangement ☆106 Updated last year
- A curated list of research papers in Vision-Language Navigation (VLN) ☆190 Updated 9 months ago
- Official codebase for EmbCLIP ☆117 Updated last year
- ☆43 Updated 2 years ago
- Code and data of the Fine-Grained R2R Dataset proposed in the EMNLP 2021 paper Sub-Instruction Aware Vision-and-Language Navigation ☆44 Updated 3 years ago
- Code for the paper "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web" (ECCV 2020) ☆52 Updated 2 years ago
- Code for reproducing the results of NeurIPS 2020 paper "MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation" ☆47 Updated 4 years ago
- Ideas and thoughts about the fascinating Vision-and-Language Navigation ☆174 Updated last year
- Vision-and-Language Navigation in Continuous Environments using Habitat ☆333 Updated last week
- ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings. NeurIPS 2022 ☆65 Updated last year
- Codebase for the Airbert paper ☆42 Updated last year
- Code and models of MOCA (Modular Object-Centric Approach) proposed in "Factorizing Perception and Policy for Interactive Instruction Foll… ☆37 Updated 6 months ago
- An open source framework for research in Embodied-AI from AI2. ☆323 Updated last week
- Official implementation of History Aware Multimodal Transformer for Vision-and-Language Navigation (NeurIPS'21). ☆107 Updated last year
- Cooperative Vision-and-Dialog Navigation ☆68 Updated 2 years ago
- The ProcTHOR-10K Houses Dataset ☆91 Updated 2 years ago
- Official repository of ICLR 2022 paper FILM: Following Instructions in Language with Modular Methods ☆118 Updated last year
- Code for the paper Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration ☆93 Updated 2 years ago
- TEACh is a dataset of human-human interactive dialogues to complete tasks in a simulated household environment. ☆135 Updated 8 months ago
- 🐍 A Python Package for Seamless Data Distribution in AI Workflows ☆21 Updated last year
- Official code for the ACL 2021 Findings paper "Yichi Zhang and Joyce Chai. Hierarchical Task Learning from Language Instructions with Uni… ☆24 Updated 3 years ago
- Large-scale pretraining for navigation tasks ☆89 Updated last year
- Utility functions for working with AI2-THOR. Try to do one thing once. ☆45 Updated 2 years ago
- A mini-framework for running AI2-THOR with Docker. ☆33 Updated 8 months ago
- ☆45 Updated 2 years ago