google-research-datasets / RxR
Room-across-Room (RxR) is a large-scale, multilingual dataset for Vision-and-Language Navigation (VLN) in Matterport3D environments. It contains 126k navigation instructions in English, Hindi and Telugu, and 126k navigation following demonstrations. Both annotation types include dense spatiotemporal alignments between the text and the visual per…
☆113 Updated last year
Related projects:
- A curated list of research papers in Vision-Language Navigation (VLN) ☆180 Updated 5 months ago
- Code for the CVPR 2021 oral paper: A Recurrent Vision-and-Language BERT for Navigation ☆149 Updated 2 years ago
- Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal tra… ☆83 Updated last year
- Vision and Language Agent Navigation ☆71 Updated 3 years ago
- Vision-and-Language Navigation in Continuous Environments using Habitat ☆261 Updated 9 months ago
- Ideas and thoughts about the fascinating Vision-and-Language Navigation ☆140 Updated last year
- Code and data of the Fine-Grained R2R Dataset proposed in the EMNLP 2021 paper Sub-Instruction Aware Vision-and-Language Navigation ☆42 Updated 2 years ago
- PyTorch code for the ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation" ☆66 Updated 2 months ago
- REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments ☆109 Updated last year
- ☆39 Updated 2 years ago
- Code for the paper "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web" (ECCV 2020) ☆52 Updated last year
- 🔀 Visual Room Rearrangement ☆104 Updated last year
- Codebase for the Airbert paper ☆41 Updated last year
- Code and models of MOCA (Modular Object-Centric Approach) proposed in "Factorizing Perception and Policy for Interactive Instruction Foll… ☆37 Updated 2 months ago
- Cooperative Vision-and-Dialog Navigation ☆66 Updated last year
- Official repository of the ICLR 2022 paper FILM: Following Instructions in Language with Modular Methods ☆113 Updated last year
- The ProcTHOR-10K Houses Dataset ☆76 Updated last year
- Large-scale pretraining for navigation tasks ☆85 Updated last year
- TEACh is a dataset of human-human interactive dialogues to complete tasks in a simulated household environment. ☆132 Updated 4 months ago
- 🐍 A Python Package for Seamless Data Distribution in AI Workflows ☆19 Updated 9 months ago
- ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings. NeurIPS 2022 ☆59 Updated last year
- Repository for DialFRED. ☆40 Updated last year
- Code release for Fried et al., Speaker-Follower Models for Vision-and-Language Navigation, NeurIPS 2018. ☆127 Updated last year
- Official codebase for EmbCLIP ☆111 Updated last year
- PyTorch code for the ICLR 2019 paper: Self-Monitoring Navigation Agent via Auxiliary Progress Estimation ☆118 Updated 11 months ago
- Official code for the ACL 2021 Findings paper "Yichi Zhang and Joyce Chai. Hierarchical Task Learning from Language Instructions with Uni… ☆24 Updated 3 years ago
- ☆59 Updated 2 years ago
- Official implementation of History Aware Multimodal Transformer for Vision-and-Language Navigation (NeurIPS'21). ☆97 Updated last year
- Feature resources of "Diagnosing the Environment Bias in Vision-and-Language Navigation" ☆17 Updated 4 years ago
- PyTorch code for the NAACL 2019 paper "Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout" ☆122 Updated 2 years ago