InternRobotics / EgoHOD
Official implementation of EgoHOD at ICLR 2025; 14 EgoVis Challenge Winners in CVPR 2024
☆25 Updated last month
Alternatives and similar repositories for EgoHOD
Users who are interested in EgoHOD are comparing it to the repositories listed below
- ☆50 Updated 6 months ago
- A curated list of Egocentric Action Understanding resources ☆32 Updated 2 months ago
- Code implementation of the paper 'FIction: 4D Future Interaction Prediction from Video' ☆17 Updated 7 months ago
- [ICLR 2024] Seer: Language Instructed Video Prediction with Latent Diffusion Models ☆33 Updated last year
- [NeurIPS 2025] EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆121 Updated 3 months ago
- Official code release for "The Invisible EgoHand: 3D Hand Forecasting through EgoBody Pose Estimation" ☆28 Updated 2 months ago
- (ECCV 2024) Official repository of paper "EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding" ☆30 Updated 7 months ago
- ☆27 Updated 5 months ago
- Official code for MotionBench (CVPR 2025) ☆59 Updated 8 months ago
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation… ☆39 Updated 8 months ago
- Accepted by CVPR 2024 ☆39 Updated last year
- Affordance Grounding from Demonstration Video to Target Image (CVPR 2023) ☆44 Updated last year
- ☆21 Updated last year
- ☆97 Updated last week
- Code and data release for the paper "Learning Object State Changes in Videos: An Open-World Perspective" (CVPR 2024) ☆35 Updated last year
- Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model ☆76 Updated 9 months ago
- Bidirectional Mapping between Action Physical-Semantic Space ☆31 Updated 2 months ago
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization ☆24 Updated 6 months ago
- [ICML 2024] A Touch, Vision, and Language Dataset for Multimodal Alignment ☆85 Updated 5 months ago
- Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision" ☆29 Updated last year
- Official Implementation of paper "Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence" ☆135 Updated 3 months ago
- [ICLR 2025] This repo is the official implementation of "The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs". ☆13 Updated 9 months ago
- [NeurIPS 2024] Official code repository for MSR3D paper ☆68 Updated 3 months ago
- OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding ☆19 Updated 3 months ago
- [NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding ☆66 Updated last month
- Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs ☆51 Updated 3 months ago
- A list of works on video generation towards world model ☆172 Updated 3 weeks ago
- [NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning" ☆90 Updated this week
- VITRA: Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos ☆70 Updated last week
- HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction ☆40 Updated last month