facebookresearch / EgoTV

EgoTV: Egocentric Task Verification from Natural Language Task Descriptions

☆27

Alternatives and similar repositories for EgoTV

Users who are interested in EgoTV are comparing it to the repositories listed below.

Buzz-Beater / EgoTaskQA
Code for NeurIPS 2022 Datasets and Benchmarks paper - EgoTaskQA: Understanding Human Tasks in Egocentric Videos.
☆36 Updated 2 years ago
wllmzhu / G-VUE
General-purpose Visual Understanding Evaluation
☆20 Updated last year
joyhsu0504 / LEFT
☆46 Updated last year
allenai / embodied-clip
Official codebase for EmbCLIP
☆132 Updated 2 years ago
eric-ai-lab / VLMbench
NeurIPS 2022 Paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation"
☆96 Updated 7 months ago
Gabesarch / TIDEE
Code for TIDEE: Novel Room Reorganization using Visuo-Semantic Common Sense Priors
☆41 Updated 2 years ago
alexpashevich / E.T.
Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal tra…
☆93 Updated 2 years ago
allenai / interactron
A Model for Embodied Adaptive Object Detection
☆46 Updated 3 years ago
UMass-Embodied-AGI / MultiPLY
Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World
☆134 Updated last year
Sid2697 / HOI-Ref
Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"
☆29 Updated last year
allenai / ai2thor-rearrangement
🔀 Visual Room Rearrangement
☆124 Updated 2 years ago
gistvision / moca
Code and models of MOCA (Modular Object-Centric Approach) proposed in "Factorizing Perception and Policy for Interactive Instruction Foll…
☆40 Updated last year
showlab / afformer
Affordance Grounding from Demonstration Video to Target Image (CVPR 2023)
☆44 Updated last year
valtsblukis / hlsm
☆45 Updated 3 years ago
zehao-wang / LAD
Official implementation of Layout-aware Dreamer for Embodied Referring Expression Grounding [AAAI 23].
☆16 Updated 2 years ago
facebookresearch / VidOSC
Code and data release for the paper "Learning Object State Changes in Videos: An Open-World Perspective" (CVPR 2024)
☆35 Updated last year
ZhuFengdaaa / SOON
Dataset and baseline for Scenario Oriented Object Navigation (SOON)
☆20 Updated 4 years ago
stevenlsw / hoi-forecast
[CVPR 2022] Joint hand motion and interaction hotspots prediction from egocentric videos
☆71 Updated last year
EmbodiedGPT / EgoCOT_Dataset
☆54 Updated last year
xvjiarui / IMProv
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks
☆58 Updated last year
ChenYi99 / EgoPlan
[IJCV] EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning
☆74 Updated last year
pairlab / SlotFormer
Code release for ICLR 2023 paper: SlotFormer on object-centric dynamics models
☆117 Updated 2 years ago
fpv-iplab / EASG
Action Scene Graphs for Long-Form Understanding of Egocentric Videos (CVPR 2024)
☆44 Updated 8 months ago
SilongYong / SQA3D
[ICLR 2023] SQA3D for embodied scene understanding and reasoning
☆152 Updated 2 years ago
facebookresearch / ego4d-goalstep
Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities (NeurIPS 2023)
☆52 Updated last year
salesforce / paprika
Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding"
☆50 Updated 10 months ago
lhc1224 / Cross-View-AG
Official PyTorch Implementation of Learning Affordance Grounding from Exocentric Images, CVPR 2022
☆70 Updated last year
mees / hulc2
[ICRA2023] Grounding Language with Visual Affordances over Unstructured Data
☆45 Updated 2 years ago
Gabesarch / HELPER
☆32 Updated last year
facebookresearch / EgoT2
Code release for the paper "Egocentric Video Task Translation" (CVPR 2023 Highlight)
☆33 Updated 2 years ago