Official implementation of EgoHOD at ICLR 2025; 14 EgoVis Challenge Winners in CVPR 2024
☆32Nov 25, 2025Updated 3 months ago
Alternatives and similar repositories for EgoHOD
Users who are interested in EgoHOD are comparing it to the repositories listed below
- We introduce DiffH2O, a diffusion-based framework to synthesize dexterous hand-object interactions. DiffH2O generates realistic hand-obje…☆31Nov 21, 2025Updated 3 months ago
- ☆29Nov 14, 2025Updated 3 months ago
- InternRobotics' open-source toolbox for vision-based embodied spatial intelligence.☆47Sep 18, 2025Updated 5 months ago
- Official implementation of "A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives", accepted at CVPR 2…☆24Jun 13, 2024Updated last year
- Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model☆82Nov 27, 2025Updated 3 months ago
- Code and Dataset for the CVPRW Paper "Where did I leave my keys? — Episodic-Memory-Based Question Answering on Egocentric Videos"☆29Aug 28, 2023Updated 2 years ago
- [CVPR 2024 Champions][ICLR 2025] Solutions for EgoVis Challenges in CVPR 2024☆133May 11, 2025Updated 9 months ago
- (ECCV 2024) Official repository of paper "EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding"☆32Apr 8, 2025Updated 10 months ago
- [AAAI26 oral] CronusVLA: Towards Efficient and Robust Manipulation via Multi-Frame Vision-Language-Action Modeling☆88Jan 11, 2026Updated last month
- ☆18Jan 8, 2026Updated last month
- Implementation for "StyleGAN-Canvas: Augmenting StyleGAN3 for Real-Time Human-AI Co-Creation"☆12May 24, 2023Updated 2 years ago
- HaWoR: World-Space Hand Motion Reconstruction from Egocentric Videos☆163Apr 5, 2025Updated 10 months ago
- Unofficial PyTorch implementation of MapNet: An Allocentric Spatial Memory for Mapping Environments☆12Jun 4, 2020Updated 5 years ago
- Minimal codes for "Task-Oriented Dexterous Hand Pose Synthesis Using Differentiable Grasp Wrench Boundary Estimator [IROS 2024]"☆15Feb 12, 2025Updated last year
- [CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'☆13Jun 16, 2024Updated last year
- ☆16Nov 11, 2025Updated 3 months ago
- [ICLR'25] Do Egocentric Video-Language Models Truly Understand Hand-Object Interactions?☆12Apr 11, 2025Updated 10 months ago
- Unofficial implementation for Sigmoid Loss for Language Image Pre-Training☆11Sep 26, 2023Updated 2 years ago
- ☆10May 24, 2024Updated last year
- [CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection☆137Jul 28, 2025Updated 7 months ago
- ☆15May 13, 2024Updated last year
- Responsible Robotic Manipulation☆16Aug 31, 2025Updated 6 months ago
- ☆17Dec 2, 2024Updated last year
- MaskPlanner is a deep learning model for the quick generation of multiple, long-horizon paths from free-form 3D objects represented as po…☆21Jun 20, 2025Updated 8 months ago
- [CVPR'2025] "DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation"☆20Jul 3, 2025Updated 7 months ago
- Operating system simulator based on JavaScript☆12Sep 18, 2020Updated 5 years ago
- A cage-based deformation for meshes in 2D.☆14Sep 8, 2018Updated 7 years ago
- This repository is the implementation of Gripper-agnostic Diffusion Policy for pick-and-place manipulation in SE(3) space☆19Feb 28, 2025Updated last year
- Implementation of MetaQNN (https://arxiv.org/abs/1611.02167, https://github.com/bowenbaker/metaqnn.git) with Additions and Modifications …☆11Aug 8, 2018Updated 7 years ago
- ☆16Jun 9, 2025Updated 8 months ago
- [WIP] Python port/rewrite of pbrt, the physically based renderer by Matt Pharr and Greg Humphreys☆13May 19, 2013Updated 12 years ago
- Reachy2 Unity package to mirror a real or fake robot's state☆18Jul 18, 2025Updated 7 months ago
- The official code and data for paper "VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI"☆16Mar 25, 2025Updated 11 months ago
- ☆24Jun 12, 2025Updated 8 months ago
- Code for the paper "Attention Meets Post-hoc Interpretability: A Mathematical Perspective", ICML 2024☆21Nov 10, 2025Updated 3 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆17Apr 2, 2025Updated 11 months ago
- [ACL 2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information☆15Oct 27, 2024Updated last year
- Code for DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue☆14Oct 12, 2021Updated 4 years ago
- Specialized encoders for robot manipulation. Sparsh-Skin An encoder tailored for magnetic tactile sensors to understand interactions from…☆26Aug 20, 2025Updated 6 months ago