Biscue5 / EgoScaler
[CVPR 2025 highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision
☆33 | Dec 2, 2025 | Updated 2 months ago
Alternatives and similar repositories for EgoScaler
Users that are interested in EgoScaler are comparing it to the libraries listed below
- Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted) ☆18 | Apr 11, 2025 | Updated 10 months ago
- [ICCV 2025] IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation ☆63 | Aug 4, 2025 | Updated 6 months ago
- CVPR2025 | TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation ☆33 | Jan 29, 2026 | Updated 2 weeks ago
- [CVPR 2025] Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation ☆176 | Jun 20, 2025 | Updated 7 months ago
- [ICLR 2025] Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction ☆43 | Aug 9, 2025 | Updated 6 months ago
- Documentation and software tools for the Novel Sensors for Autonomous Vehicle Perception (NSAVP) dataset ☆24 | Jan 26, 2024 | Updated 2 years ago
- [CVPR 2025] VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation ☆45 | Jun 20, 2025 | Updated 7 months ago
- Code for using the Grasp Affordance Reasoning dataset ☆10 | Sep 17, 2019 | Updated 6 years ago
- ROS wrapper of Nvidia Contact-graspnet model. ☆17 | Jul 3, 2023 | Updated 2 years ago
- [CVPR 2025] Code for "Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering". ☆20 | Jun 16, 2025 | Updated 7 months ago
- Code for Stable Control Representations ☆26 | Apr 5, 2025 | Updated 10 months ago
- VGGT 3D Vision Agent optimized for Apple Silicon with Metal Performance Shaders ☆86 | Nov 5, 2025 | Updated 3 months ago
- Public part of the Robotic Perception group (MIS lab, France) library about vision-based state estimation of robot and scene ☆23 | May 20, 2025 | Updated 8 months ago
- Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer ☆28 | Nov 4, 2025 | Updated 3 months ago
- Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning ☆41 | Dec 17, 2025 | Updated last month
- ☆11 | Jul 19, 2023 | Updated 2 years ago
- HD-EPIC Python script to download the entire dataset or parts of it ☆17 | Oct 7, 2025 | Updated 4 months ago
- [CVPR 2025] DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding ☆26 | Dec 18, 2025 | Updated last month
- [ICCV 2025 Spotlight] DexVLG: Dexterous Vision-Language-Grasp Model at Scale ☆48 | Jul 24, 2025 | Updated 6 months ago
- [ECCV'24] 3D Reconstruction of Objects in Hands without Real World 3D Supervision ☆17 | Feb 3, 2025 | Updated last year
- Code and data for UniEgoMotion (ICCV 2025) ☆43 | Nov 11, 2025 | Updated 3 months ago
- ☆37 | May 28, 2025 | Updated 8 months ago
- [SIGGRAPH Asia 2025] CHARM: Control-point-based 3D Anime Hairstyle Auto-Regressive Modeling ☆42 | Sep 26, 2025 | Updated 4 months ago
- Nano Banana Studio: AI-Powered Marketing Asset Creator with Real-Time Brand Enhancement ☆39 | Sep 10, 2025 | Updated 5 months ago
- [CoRL 2025] UniSkill: Imitating Human Videos via Cross-Embodiment Skill Representations ☆74 | Dec 18, 2025 | Updated last month
- ☆18 | Nov 4, 2024 | Updated last year
- ☆16 | Sep 24, 2024 | Updated last year
- ☆17 | May 7, 2025 | Updated 9 months ago
- (ICCV 25) MonoFusion ☆60 | Oct 27, 2025 | Updated 3 months ago
- [IROS 2025] Novel Diffusion Models for Multimodal 3D Hand Trajectory Prediction ☆23 | Dec 2, 2025 | Updated 2 months ago
- Extended implementation of RoboDexVLM (IROS 2025) ☆31 | Nov 13, 2025 | Updated 3 months ago
- Official implementation of "Latent Action Learning Requires Supervision in the Presence of Distractors", ICML 2025 ☆33 | Jul 8, 2025 | Updated 7 months ago
- [CVPR 2025] Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model ☆32 | Jun 26, 2025 | Updated 7 months ago
- Detic + SAM for open-vocabulary object detection and segmentation. ☆19 | Nov 10, 2025 | Updated 3 months ago
- [RSS 2025] GauSS-MI: Gaussian Splatting Shannon Mutual Information for Active 3D Reconstruction ☆82 | Oct 7, 2025 | Updated 4 months ago
- ☆41 | Aug 27, 2024 | Updated last year
- Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning ☆79 | May 17, 2025 | Updated 8 months ago
- [NeurIPS 2025] Streaming 3D Reconstruction with Explicit Spatial Pointer Memory ☆180 | Sep 26, 2025 | Updated 4 months ago
- CVPR 2025: VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction ☆73 | Aug 1, 2025 | Updated 6 months ago