neu-vi / Diag-HOI
☆22Updated last year
Related projects ⓘ
Alternatives and complementary repositories for Diag-HOI
- Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model'☆31Updated last year
- Code for the paper "Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundatio…☆23Updated last year
- [CVPR 2022 (oral)] Bongard-HOI for benchmarking few-shot visual reasoning☆64Updated 2 years ago
- Code for ECCV2022 Paper "Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection"☆34Updated last year
- [ICLR 2022] RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning☆64Updated 2 years ago
- Code accompanying Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos (CVPR 2021)☆33Updated 3 years ago
- The 1st place solution of 2022 Ego4d Natural Language Queries.☆32Updated 2 years ago
- Action Scene Graphs for Long-Form Understanding of Egocentric Videos (CVPR 2024)☆30Updated last month
- Official implementation of the paper "Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model"☆49Updated last year
- Repository of our paper 'Refer-it-in-RGBD' in CVPR 2021☆39Updated 6 months ago
- CVPR 2024 "Instance Tracking in 3D Scenes from Egocentric Videos"☆17Updated 4 months ago
- [CVPR 2022] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning☆33Updated 2 years ago
- Code for NeurIPS 2022 paper "Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space"☆18Updated last year
- Code for paper "Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning"☆23Updated last year
- ☆20Updated last year
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment☆19Updated last month
- Video + CLIP Baseline for Ego4D Long Term Action Anticipation Challenge (CVPR 2022)☆13Updated 2 years ago
- Official Code for ECCV2022: Learning Semantic Correspondence with Sparse Annotations☆18Updated 2 years ago
- [NeurIPS 2022] Official implementation of the paper "Rethinking Resolution in the Context of Efficient Video Recognition".☆32Updated 2 years ago
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks☆59Updated last month
- ☆14Updated 2 months ago
- CVPR2021: Detecting Human-Object Interaction via Fabricated Compositional Learning☆16Updated 3 years ago
- [arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"☆118Updated 3 months ago
- [ICCV 2023] Label-Efficient Online Continual Object Detection in Streaming Video☆17Updated 10 months ago
- ☆58Updated last year
- [ECCV2024, Oral, Best Paper Finalist]This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation …☆34Updated 3 weeks ago
- ☆24Updated 3 years ago
- [ICCV 2023] MGMAE: Motion Guided Masking for Video Masked Autoencoding☆20Updated last year
- Official code for "Disentangling Visual Embeddings for Attributes and Objects" Published at CVPR 2022☆33Updated last year