gistvision / moca

Code and models for MOCA (Modular Object-Centric Approach), proposed in "Factorizing Perception and Policy for Interactive Instruction Following" (ICCV 2021). We address the task of long-horizon instruction following with a modular architecture that decouples a task into visual perception and action policy prediction.

☆40

Alternatives and similar repositories for moca

Users that are interested in moca are comparing it to the libraries listed below

alexpashevich / E.T.
Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal tra…
☆93 Updated 2 years ago
snaredataset / snare
SNARE Dataset with MATCH and LaGOR models
☆24 Updated last year
594zyc / HiTUT
Official code for the ACL 2021 Findings paper "Yichi Zhang and Joyce Chai. Hierarchical Task Learning from Language Instructions with Uni…
☆25 Updated 4 years ago
facebookresearch / interaction-exploration
Code for "Learning Affordance Landscapes for Interaction Exploration in 3D Environments" (NeurIPS 20)
☆38 Updated 2 years ago
xavierpuigf / watch_and_help
Code for the paper Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration
☆102 Updated 3 years ago
SamsonYuBaiJian / actionet
3D household task-based dataset created using customised AI2-THOR.
☆15 Updated 3 years ago
allenai / ai2thor-rearrangement
🔀 Visual Room Rearrangement
☆124 Updated 2 years ago
allenai / embodied-clip
Official codebase for EmbCLIP
☆132 Updated 2 years ago
Buzz-Beater / EgoTaskQA
Code for NeurIPS 2022 Datasets and Benchmarks paper - EgoTaskQA: Understanding Human Tasks in Egocentric Videos.
☆36 Updated 2 years ago
VegB / VLN-Transformer
Implementation of "Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation"
☆26 Updated 4 years ago
Gabesarch / TIDEE
Code for TIDEE: Novel Room Reorganization using Visuo-Semantic Common Sense Priors
☆41 Updated 2 years ago
eric-ai-lab / VLMbench
NeurIPS 2022 Paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation"
☆96 Updated 7 months ago
wllmzhu / G-VUE
General-purpose Visual Understanding Evaluation
☆20 Updated last year
evelinehong / PTR
Official Repository of NeurIPS2021 paper: PTR
☆32 Updated 3 years ago
allenai / prior
🐍 A Python Package for Seamless Data Distribution in AI Workflows
☆26 Updated 2 years ago
MohitShridhar / ingress
Visual Grounding of Referring Expressions for Human-Robot Interaction
☆26 Updated 7 years ago
caiqi / Silver-Bullet-3D
This repository is the official implementation of *Silver-Bullet-3D* Solution for SAPIEN ManiSkill Challenge 2021
☆20 Updated 3 years ago
allenai / learning_from_interaction
Learning about objects and their properties by interacting with them
☆12 Updated 5 years ago
HanqingWangAI / SSM-VLN
Code and Data for our CVPR 2021 paper "Structured Scene Memory for Vision-Language Navigation"
☆42 Updated 4 years ago
mmurray / cvdn
Cooperative Vision-and-Dialog Navigation
☆71 Updated 3 years ago
HanqingWangAI / Active_VLN
The repository of ECCV 2020 paper `Active Visual Information Gathering for Vision-Language Navigation`
☆43 Updated 3 years ago
allenai / interactron
A Model for Embodied Adaptive Object Detection
☆46 Updated 3 years ago
valtsblukis / hlsm
☆45 Updated 3 years ago
YushengZhao / TD-STP
[ACM MM 2022] Target-Driven Structured Transformer Planner for Vision-Language Navigation
☆17 Updated 3 years ago
singhgautam / sysbinder
Official Code for Neural Systematic Binder
☆33 Updated 2 years ago
soyeonm / FILM
Official repository of ICLR 2022 paper FILM: Following Instructions in Language with Modular Methods
☆128 Updated 2 years ago
yixchen / YouRefIt_ERU
☆19 Updated 2 years ago
weituo12321 / PREVALENT
Large-scale pretraining for navigation tasks
☆93 Updated 2 years ago
haoliuhl / instructrl
Instruction Following Agents with Multimodal Transformers
☆53 Updated 3 years ago
GT-RIPL / robo-vln
PyTorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
☆88 Updated last year