google-research-datasets / RxR
Room-across-Room (RxR) is a large-scale, multilingual dataset for Vision-and-Language Navigation (VLN) in Matterport3D environments. It contains 126k navigation instructions in English, Hindi, and Telugu, and 126k navigation following demonstrations. Both annotation types include dense spatiotemporal alignments between the text and the visual per…
☆142 · Updated last year
Alternatives and similar repositories for RxR:
Users interested in RxR are comparing it to the repositories listed below:
- Code of the CVPR 2021 Oral paper: A Recurrent Vision-and-Language BERT for Navigation ☆173 · Updated 2 years ago
- PyTorch code for the ICRA 2021 paper "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation" ☆78 · Updated 9 months ago
- Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal tra… ☆90 · Updated last year
- 🔀 Visual Room Rearrangement ☆113 · Updated last year
- REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments ☆123 · Updated last year
- A curated list of research papers in Vision-Language Navigation (VLN) ☆204 · Updated last year
- Ideas and thoughts about the fascinating field of Vision-and-Language Navigation ☆216 · Updated last year
- Code and data of the Fine-Grained R2R Dataset proposed in the EMNLP 2021 paper "Sub-Instruction Aware Vision-and-Language Navigation" ☆45 · Updated 3 years ago
- Vision-and-Language Navigation in Continuous Environments using Habitat ☆416 · Updated 3 months ago
- Vision and Language Agent Navigation ☆76 · Updated 4 years ago
- ☆49 · Updated 3 years ago
- Official codebase for EmbCLIP ☆122 · Updated last year
- ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings (NeurIPS 2022) ☆73 · Updated 2 years ago
- Teaching robots to respond to open-vocab queries with CLIP and NeRF-like neural fields ☆165 · Updated last year
- ☆44 · Updated 2 years ago
- Official implementation of History Aware Multimodal Transformer for Vision-and-Language Navigation (NeurIPS 2021) ☆121 · Updated last year
- The ProcTHOR-10K Houses Dataset ☆100 · Updated 2 years ago
- Official repository of the ICLR 2022 paper "FILM: Following Instructions in Language with Modular Methods" ☆122 · Updated 2 years ago
- [CVPR 2023] CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation ☆127 · Updated last year
- Code for reproducing the results of the NeurIPS 2020 paper "MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation" ☆49 · Updated 4 years ago
- Code for training embodied agents using imitation learning at scale in Habitat-Lab ☆40 · Updated last week
- RoboTHOR Challenge ☆88 · Updated 3 years ago
- Codebase for the Airbert paper ☆45 · Updated 2 years ago
- Official implementation of Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation (CVPR 2022 Oral) ☆170 · Updated last year
- Cooperative Vision-and-Dialog Navigation ☆70 · Updated 2 years ago
- Code and data of the CVPR 2022 paper: Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language N… ☆117 · Updated last year
- Utility functions for working with AI2-THOR. Try to do one thing once. ☆45 · Updated 2 years ago
- 🏘️ Scaling Embodied AI by Procedurally Generating Interactive 3D Houses ☆332 · Updated 2 years ago
- PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning (CVPR 2022 Oral) ☆97 · Updated 2 years ago
- Code for the Habitat Challenge ☆325 · Updated 2 years ago