google-research-datasets / RxR
Room-across-Room (RxR) is a large-scale, multilingual dataset for Vision-and-Language Navigation (VLN) in Matterport3D environments. It contains 126k navigation instructions in English, Hindi and Telugu, and 126k navigation following demonstrations. Both annotation types include dense spatiotemporal alignments between the text and the visual per…
☆164 Updated 2 years ago
Alternatives and similar repositories for RxR
Users who are interested in RxR are comparing it to the repositories listed below
- Code of the CVPR 2021 Oral paper: A Recurrent Vision-and-Language BERT for Navigation ☆196 Updated 3 years ago
- REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments ☆141 Updated 2 years ago
- A curated list of research papers in Vision-Language Navigation (VLN) ☆229 Updated last year
- ☆45 Updated 3 years ago
- PyTorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation" ☆88 Updated last year
- Code and data of the Fine-Grained R2R Dataset proposed in the EMNLP 2021 paper Sub-Instruction Aware Vision-and-Language Navigation ☆51 Updated 4 years ago
- Habitat-Web is a web application to collect human demonstrations for embodied tasks on Amazon Mechanical Turk (AMT) using the Habitat sim… ☆59 Updated 3 years ago
- Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal tra… ☆92 Updated 2 years ago
- The ProcTHOR-10K Houses Dataset ☆113 Updated 2 years ago
- 🔀 Visual Room Rearrangement ☆123 Updated 2 years ago
- ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings. NeurIPS 2022 ☆94 Updated 2 years ago
- Code for the paper "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web" (ECCV 2020) ☆57 Updated 3 years ago
- Official implementation of History Aware Multimodal Transformer for Vision-and-Language Navigation (NeurIPS'21). ☆135 Updated 2 years ago
- Code for sim-to-real transfer of a pretrained Vision-and-Language Navigation (VLN) agent to a robot using ROS. ☆45 Updated 5 years ago
- Code for reproducing the results of NeurIPS 2020 paper "MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation" ☆55 Updated 4 years ago
- Teaching robots to respond to open-vocab queries with CLIP and NeRF-like neural fields ☆177 Updated last year
- Official repository of ICLR 2022 paper FILM: Following Instructions in Language with Modular Methods ☆128 Updated 2 years ago
- 🏘️ Scaling Embodied AI by Procedurally Generating Interactive 3D Houses ☆396 Updated 2 years ago
- [CVPR 2024] The code for paper 'Towards Learning a Generalist Model for Embodied Navigation' ☆211 Updated last year
- Official codebase for EmbCLIP ☆132 Updated 2 years ago
- Code repository for the Habitat Synthetic Scenes Dataset (HSSD) paper. ☆106 Updated last year
- [CVPR 2023] CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation ☆144 Updated 2 years ago
- Code for training embodied agents using imitation learning at scale in Habitat-Lab ☆44 Updated 7 months ago
- Ideas and thoughts about the fascinating Vision-and-Language Navigation ☆275 Updated 2 years ago
- Codebase for the Airbert paper ☆46 Updated 2 years ago
- Utility functions when working with Ai2-THOR. Try to do one thing once. ☆54 Updated 3 years ago
- ☆33 Updated 2 years ago
- [ICCV'23] Learning Vision-and-Language Navigation from YouTube Videos ☆64 Updated 10 months ago
- SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World ☆139 Updated last year
- Cooperative Vision-and-Dialog Navigation ☆71 Updated 2 years ago