facebookresearch / EgoTV
EgoTV: Egocentric Task Verification from Natural Language Task Descriptions
☆27 Updated last year
Alternatives and similar repositories for EgoTV
Users who are interested in EgoTV are comparing it to the repositories listed below.
- Code for NeurIPS 2022 Datasets and Benchmarks paper - EgoTaskQA: Understanding Human Tasks in Egocentric Videos. ☆32 Updated 2 years ago
- Visual Room Rearrangement ☆113 Updated last year
- Official codebase for EmbCLIP ☆125 Updated last year
- NeurIPS 2022 Paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation" ☆91 Updated last week
- Code for TIDEE: Novel Room Reorganization using Visuo-Semantic Common Sense Priors ☆37 Updated last year
- A Python Package for Seamless Data Distribution in AI Workflows ☆22 Updated last year
- [ICLR 2023] SQA3D for embodied scene understanding and reasoning ☆132 Updated last year
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World ☆128 Updated 6 months ago
- Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal tra… ☆90 Updated last year
- ☆48 Updated last year
- Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision" ☆27 Updated last year
- ☆46 Updated 5 months ago
- Prompter for Embodied Instruction Following ☆18 Updated last year
- Official Implementation of CAPEAM (ICCV'23) ☆13 Updated 5 months ago
- Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation" ☆78 Updated 10 months ago
- Dataset and baseline for Scenario Oriented Object Navigation (SOON) ☆18 Updated 3 years ago
- Affordance Grounding from Demonstration Video to Target Image (CVPR 2023) ☆44 Updated 9 months ago
- Official Implementation of ReALFRED (ECCV'24) ☆39 Updated 7 months ago
- Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding" ☆49 Updated 3 months ago
- A mini-framework for running AI2-Thor with Docker. ☆34 Updated last year
- ☆42 Updated last year
- Pytorch Code and Data for EnvEdit: Environment Editing for Vision-and-Language Navigation (CVPR 2022) ☆31 Updated 2 years ago
- Implementation (R2R part) for the paper "Iterative Vision-and-Language Navigation" ☆14 Updated last year
- ☆22 Updated 3 years ago
- Official implementation of Layout-aware Dreamer for Embodied Referring Expression Grounding (AAAI'23). ☆17 Updated 2 years ago
- [CVPR 2022] Joint hand motion and interaction hotspots prediction from egocentric videos ☆64 Updated last year
- Code and models of MOCA (Modular Object-Centric Approach) proposed in "Factorizing Perception and Policy for Interactive Instruction Foll… ☆37 Updated 10 months ago
- ☆44 Updated 2 years ago
- Implementation of our ICCV 2023 paper DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation ☆19 Updated last year
- Instruction Following Agents with Multimodal Transformers ☆52 Updated 2 years ago