Code release for the paper "Egocentric Video Task Translation" (CVPR 2023 Highlight)
☆34Jun 12, 2023Updated 2 years ago
Alternatives and similar repositories for EgoT2
Users who are interested in EgoT2 are comparing it to the libraries listed below
- EgoTV: Egocentric Task Verification from Natural Language Task Descriptions☆27Jan 9, 2024Updated 2 years ago
- Official implementation of "A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives", accepted at CVPR 2…☆24Jun 13, 2024Updated last year
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"☆12Feb 27, 2024Updated 2 years ago
- [CVPR 2023] Egocentric Audio-Visual Object Localization☆26Jan 6, 2024Updated 2 years ago
- Domain Adaptation and Adapters☆16Feb 28, 2023Updated 3 years ago
- In this codebase we establish a benchmark for egocentric user adaptation based on Ego4d. First, we start from a population model which ha…☆15Jan 16, 2025Updated last year
- Tracking Multiple Deformable Objects in Egocentric Videos (CVPR 2023)☆13Apr 10, 2023Updated 2 years ago
- [ECCV2024] The official implementation of "Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation".☆13Feb 24, 2025Updated last year
- Code and data release for the paper "Learning Object State Changes in Videos: An Open-World Perspective" (CVPR 2024)☆35Sep 9, 2024Updated last year
- ☆13Jul 22, 2025Updated 7 months ago
- A PyTorch Implementation of LaplaceNet: A Hybrid Energy-Neural Model for Deep Semi-Supervised Classification☆15Feb 8, 2022Updated 4 years ago
- Libraries and tools to support Transfer Learning☆20Apr 29, 2025Updated 10 months ago
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024☆18Oct 11, 2024Updated last year
- NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory. CVPR 2023.☆17Jan 26, 2024Updated 2 years ago
- Code for "Distributed, Egocentric Representations of Graphs for Detecting Critical Structures" (ICML 2019)☆20Aug 24, 2021Updated 4 years ago
- ☆22Mar 7, 2025Updated 11 months ago
- [ICCV 2023] With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning.☆19Jun 7, 2024Updated last year
- [NeurIPS 2022] Egocentric Video-Language Pretraining☆256May 9, 2024Updated last year
- Official PyTorch Implementation for Continual Learning and Private Unlearning☆18Jul 19, 2022Updated 3 years ago
- The champion solution for Ego4D Natural Language Queries Challenge in CVPR 2023☆18Jan 23, 2024Updated 2 years ago
- Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning☆20Dec 21, 2023Updated 2 years ago
- [ECCV 2022] Tackling Long-Tailed Category Distribution Under Domain Shifts☆25Nov 29, 2022Updated 3 years ago
- [CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings☆46Aug 14, 2023Updated 2 years ago
- ☆22Mar 20, 2024Updated last year
- Disentangled Pre-training for Human-Object Interaction Detection☆27Sep 17, 2025Updated 5 months ago
- FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024)☆26Feb 22, 2024Updated 2 years ago
- ☆27Aug 17, 2023Updated 2 years ago
- [AAAI 2025] Open-vocabulary Video Instance Segmentation Codebase built upon Detectron2, which is really easy to use.☆25Dec 30, 2024Updated last year
- [NeurIPS'24] Multi-Object 3D Grounding with Dynamic Modules and Language Informed Spatial Attention☆27Jun 15, 2025Updated 8 months ago
- Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"☆29Apr 16, 2024Updated last year
- Official repository for the paper "End-to-End Visual Editing with a Generatively Pre-Trained Artist", which is accepted at ECCV 2022. Her…☆29Dec 28, 2022Updated 3 years ago
- Chain-of-Thought Predictive Control☆57May 1, 2023Updated 2 years ago
- Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning☆41Aug 4, 2025Updated 7 months ago
- Code for Self-and-Collaborative Attention Network from "SCAN: Self-and-Collaborative Attention Network for Video Person Re-identification…☆26Jun 1, 2019Updated 6 years ago
- ☆26Jun 20, 2024Updated last year
- Graph learning framework for long-term video understanding☆72Jul 13, 2025Updated 7 months ago
- Official implementation of the paper DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction☆29May 26, 2024Updated last year
- Code for the paper "Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundatio…☆28Nov 8, 2023Updated 2 years ago
- ☆33May 15, 2024Updated last year