Code and models of MOCA (Modular Object-Centric Approach) proposed in "Factorizing Perception and Policy for Interactive Instruction Following" (ICCV 2021). We address the task of long-horizon instruction following with a modular architecture that decouples a task into visual perception and action policy prediction.
☆40Jun 21, 2024Updated last year
Alternatives and similar repositories for moca
Users that are interested in moca are comparing it to the libraries listed below
- Official code for the ACL 2021 Findings paper "Yichi Zhang and Joyce Chai. Hierarchical Task Learning from Language Instructions with Uni…☆24Jun 28, 2021Updated 4 years ago
- 3D household task-based dataset created using customised AI2-THOR.☆14Apr 14, 2022Updated 3 years ago
- Official Implementation of CAPEAM (ICCV'23)☆16Nov 30, 2024Updated last year
- ALFRED - A Benchmark for Interpreting Grounded Instructions for Everyday Tasks☆491Feb 5, 2026Updated 3 weeks ago
- Prompter for Embodied Instruction Following☆18Nov 30, 2023Updated 2 years ago
- Official repository of ICLR 2022 paper FILM: Following Instructions in Language with Modular Methods☆127Apr 9, 2023Updated 2 years ago
- Code for EmBERT, a transformer model for embodied, language-guided visual task completion.☆60Apr 10, 2024Updated last year
- ☆17Mar 26, 2021Updated 4 years ago
- Implementation of "Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation"☆27Mar 4, 2021Updated 5 years ago
- A visual semantic planner for the ALFRED virtual agent challenge using the GPT-2 language model☆16Oct 1, 2020Updated 5 years ago
- TEACh is a dataset of human-human interactive dialogues to complete tasks in a simulated household environment.☆143May 6, 2024Updated last year
- Official Implementation of CL-ALFRED (ICLR'24)☆30Oct 24, 2024Updated last year
- Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding☆10Jan 5, 2026Updated 2 months ago
- ☆45Jun 24, 2022Updated 3 years ago
- A project designed to build and render a full Minecraft crafting tree.☆10Aug 10, 2021Updated 4 years ago
- Enhance robot task understanding ability through visual semantic graph☆10May 20, 2021Updated 4 years ago
- Code for EMNLP 2022 Paper DANLI: Deliberative Agent for Following Natural Language Instructions☆18May 1, 2025Updated 10 months ago
- Data and Code for StructuredRegex.☆15Nov 16, 2023Updated 2 years ago
- [CVPR2019] Synthesizing Environment-Aware Activities via Activity Sketches☆13Oct 3, 2023Updated 2 years ago
- An open source framework for research in Embodied-AI from AI2.☆378Aug 22, 2025Updated 6 months ago
- [ACM MM 2022] Target-Driven Structured Transformer Planner for Vision-Language Navigation☆17Nov 1, 2022Updated 3 years ago
- Implementation of the Playground environment from the paper Language as a Cognitive Tool to Imagine Goals in Curiosity-Driven Exploration.☆11Mar 5, 2021Updated 4 years ago
- ☆12Dec 22, 2021Updated 4 years ago
- Official Repository of NeurIPS2021 paper: PTR☆32Dec 17, 2021Updated 4 years ago
- Codes of CVPR2022 paper: Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction☆32Aug 23, 2022Updated 3 years ago
- ☆13Dec 6, 2018Updated 7 years ago
- Code of the CVPR 2022 paper "HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation"☆30Aug 21, 2023Updated 2 years ago
- NeurIPS 2022 Paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation"☆98May 8, 2025Updated 9 months ago
- Code for "Learning Affordance Landscapes for Interaction Exploration in 3D Environments" (NeurIPS 20)☆38Jul 6, 2023Updated 2 years ago
- https://robotmlcourse.github.io/SP20/index.html☆14Aug 27, 2020Updated 5 years ago
- Code and dataset for NAACL 2022 paper "CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination" Hyounghun Kim, Abhay Zala, Mohi…☆16Nov 26, 2022Updated 3 years ago
- [ICCP 2021] ConvNeRF, a novel scheme to generate opacity radiance fields with a convolutional neural renderer for fuzzy objects with high…☆14Sep 9, 2022Updated 3 years ago
- Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024)☆16Apr 23, 2024Updated last year
- ☆18May 14, 2022Updated 3 years ago
- ☆17Jan 19, 2026Updated last month
- Panoramic Graph Environment Annotation toolkit, for collecting audio and text annotations in panoramic graph environments such as Matterp…☆20Mar 5, 2021Updated 4 years ago
- a library for deep reinforcement learning, with applications for navigation☆16Feb 6, 2018Updated 8 years ago
- Official implementation of Layout-aware Dreamer for Embodied Referring Expression Grounding [AAAI 23].☆16Apr 13, 2023Updated 2 years ago
- ☆25Sep 12, 2019Updated 6 years ago