google-research-datasets / RxRLinks

Room-across-Room (RxR) is a large-scale, multilingual dataset for Vision-and-Language Navigation (VLN) in Matterport3D environments. It contains 126k navigation instructions in English, Hindi and Telugu, and 126k navigation following demonstrations. Both annotation types include dense spatiotemporal alignments between the text and the visual per…

☆162

Alternatives and similar repositories for RxR

Users that are interested in RxR are comparing it to the libraries listed below

Sorting:

YicongHong / Recurrent-VLN-BERT
Code of the CVPR 2021 Oral paper: A Recurrent Vision-and-Language BERT for Navigation
☆190Updated 3 years ago
YuankaiQi / REVERIE
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
☆138Updated 2 years ago
daqingliu / awesome-vln
A curated list of research papers in Vision-Language Navigation (VLN)
☆226Updated last year
GT-RIPL / robo-vln
Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
☆84Updated last year
YicongHong / Fine-Grained-R2R
Code and data of the Fine-Grained R2R Dataset proposed in the EMNLP 2021 paper Sub-Instruction Aware Vision-and-Language Navigation
☆51Updated 3 years ago
valtsblukis / hlsm
☆44Updated 3 years ago
arjunmajum / vln-bert
Code for the paper "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web" (ECCV 2020)
☆57Updated 3 years ago
alexpashevich / E.T.
Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal tra…
☆90Updated 2 years ago
cshizhe / VLN-HAMT
Official implementation of History Aware Multimodal Transformer for Vision-and-Language Navigation (NeurIPS'21).
☆131Updated 2 years ago
mmurray / cvdn
Cooperative Vision-and-Dialog Navigation
☆71Updated 2 years ago
airbert-vln / airbert
Codebase for the Airbert paper
☆46Updated 2 years ago
saimwani / multiON
Code for reproducing the results of NeurIPS 2020 paper "MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation”
☆54Updated 4 years ago
YicongHong / Thinking-VLN
Ideas and thoughts about the fascinating Vision-and-Language Navigation
☆262Updated 2 years ago
jialuli-luka / VLN-SIG
☆32Updated 2 years ago
vincentcartillier / Semantic-MapNet
☆81Updated 3 years ago
YicongHong / Discrete-Continuous-VLN
Code and Data of the CVPR 2022 paper: Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language N…
☆133Updated last year
facebookresearch / habitat-challenge
Code for the habitat challenge
☆340Updated 2 years ago
jialuli-luka / EnvEdit
Pytorch Code and Data for EnvEdit: Environment Editing for Vision-and-Language Navigation (CVPR 2022)
☆30Updated 3 years ago
gunagg / zson
ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings. NeurIPS 2022
☆93Updated 2 years ago
batra-mlp-lab / vln-sim2real
Code for sim-to-real transfer of a pretrained Vision-and-Language Navigation (VLN) agent to a robot using ROS.
☆44Updated 4 years ago
soyeonm / FILM
Official repository of ICLR 2022 paper FILM: Following Instructions in Language with Modular Methods
☆126Updated 2 years ago
jacobkrantz / VLN-CE
Vision-and-Language Navigation in Continuous Environments using Habitat
☆587Updated 9 months ago
allenai / procthor-10k
The ProcTHOR-10K Houses Dataset
☆112Updated 2 years ago
VegB / VLN-Transformer
Implementation of "Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation"
☆26Updated 4 years ago
Ram81 / habitat-web
Habitat-Web is a web application to collect human demonstrations for embodied tasks on Amazon Mechanical Turk (AMT) using the Habitat sim…
☆59Updated 3 years ago
HanqingWangAI / SSM-VLN
Code and Data for our CVPR 2021 paper "Structured Scene Memory for Vision-Language Navigation"
☆41Updated 4 years ago
allenai / ai2thor-rearrangement
🔀 Visual Room Rearrangement
☆122Updated 2 years ago
allenai / embodied-clip
Official codebase for EmbCLIP
☆132Updated 2 years ago
real-stanford / cow
[CVPR 2023] CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation
☆141Updated last year
notmahi / clip-fields
Teaching robots to respond to open-vocab queries with CLIP and NeRF-like neural fields
☆176Updated last year