Wayne-Mai / EgoLoc
For Ego4D VQ3D Task
☆22 Jan 9, 2024 Updated 2 years ago
Alternatives and similar repositories for EgoLoc
Users who are interested in EgoLoc are comparing it to the libraries listed below
- An experiment with movie scenes and contrastive learning ☆11 Feb 1, 2025 Updated last year
- Human-centric environment representations from egocentric video ☆14 Feb 5, 2026 Updated last week
- [CHI24] AI-Assisted In-Context Writing on OHMD During Travels ☆11 Dec 19, 2024 Updated last year
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025] ☆21 Feb 27, 2025 Updated 11 months ago
- FleVRS: Towards Flexible Visual Relationship Segmentation, NeurIPS 2024 ☆22 Dec 9, 2024 Updated last year
- ☆19 Apr 14, 2023 Updated 2 years ago
- Official repo for EscapeCraft (a 3D environment for room escape) and benchmark MM-Escape. This work is accepted by ICCV 2025. ☆36 Jul 7, 2025 Updated 7 months ago
- [ICCV 2023] Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation ☆24 Aug 26, 2023 Updated 2 years ago
- Official implementation of "A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives", accepted at CVPR 2… ☆24 Jun 13, 2024 Updated last year
- Code release for "EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone" [ICCV, 2023] ☆102 Jul 2, 2024 Updated last year
- EgoTV Egocentric Task Verification from Natural Language Task Descriptions ☆27 Jan 9, 2024 Updated 2 years ago
- Code and data release for the paper "Learning Object State Changes in Videos: An Open-World Perspective" (CVPR 2024) ☆35 Sep 9, 2024 Updated last year
- 📚 A collection of resources and papers on Large Language Models in autonomous driving ☆27 Oct 30, 2023 Updated 2 years ago
- A Massive Multi-Discipline Lecture Understanding Benchmark ☆32 Nov 1, 2025 Updated 3 months ago
- A curated list of egocentric (first-person) vision and related area resources ☆306 Oct 14, 2024 Updated last year
- A curated list of resources about long-context in large-language models and video understanding. ☆31 Aug 8, 2023 Updated 2 years ago
- ☆41 Sep 9, 2025 Updated 5 months ago
- Egocentric Video Understanding Dataset (EVUD) ☆33 Jul 4, 2024 Updated last year
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models ☆37 Sep 19, 2023 Updated 2 years ago
- ☆37 Sep 16, 2024 Updated last year
- Official implementation of `Discovering Hidden Visual Concepts Beyond Linguistic Input in Infant Learning`, CVPR 2025 ☆13 Aug 1, 2025 Updated 6 months ago
- [NeurIPS 2024 D&B] VideoGUI: A Benchmark for GUI Automation from Instructional Videos ☆48 Jun 13, 2025 Updated 8 months ago
- Self-Supervised Learning with Multi-View Rendering for 3D Point Cloud Analysis (ACCV 2022) ☆10 Jul 22, 2024 Updated last year
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs ☆23 Sep 21, 2025 Updated 4 months ago
- An intelligent question-answering system built on langchain and chatglm6b, with support for custom corpora ☆10 Jun 25, 2023 Updated 2 years ago
- Official implementation of Recurrent Action Transformer with Memory, an offline RL agent with memory mechanisms. https://sites.google.com… ☆18 Nov 23, 2025 Updated 2 months ago
- ☆11 Dec 13, 2023 Updated 2 years ago
- ☆11 Nov 21, 2022 Updated 3 years ago
- ☆13 Jul 22, 2022 Updated 3 years ago
- ICCV'23 | Adverse Weather Removal with Codebook Priors ☆10 Aug 28, 2023 Updated 2 years ago
- Official code for "Weakly Supervised Two-Stage Training Scheme for Deep Video Fight Detection Model" ☆12 Oct 29, 2022 Updated 3 years ago
- This repo contains the code for the recipe of the winning entry to the Ego4d VQ2D challenge at CVPR 2022. ☆41 Mar 7, 2023 Updated 2 years ago
- Action Scene Graphs for Long-Form Understanding of Egocentric Videos (CVPR 2024) ☆45 Apr 9, 2025 Updated 10 months ago
- ☆11 Aug 29, 2022 Updated 3 years ago
- [NeurIPS 2025] Multipole Attention for Efficient Long Context Reasoning ☆20 Dec 5, 2025 Updated 2 months ago
- ☆12 Jan 10, 2025 Updated last year
- ☆13 Jun 9, 2020 Updated 5 years ago
- Implementation of "HumanReg: Self-supervised Non-rigid Registration of Sparse Human Point Cloud" (3DV 2024) ☆15 Oct 26, 2024 Updated last year
- The official code and data for paper "VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI" ☆16 Mar 25, 2025 Updated 10 months ago