hrlics / HoPE
[NeurIPS 2025] HoPE: Hybrid of Position Embedding for Long Context Vision-Language Models
☆21Updated 3 weeks ago
Alternatives and similar repositories for HoPE
Users that are interested in HoPE are comparing it to the libraries listed below
- CoRL 2025☆22Updated 3 months ago
- Official implementation of CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding.☆44Updated 2 months ago
- [ICCV 2025] Official repo of "EC-Flow: Enabling Versatile Robotic Manipulation from Action-Unlabeled Videos via Embodiment-Centric Flow"☆25Updated last month
- EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments☆22Updated 6 months ago
- DSPv2: Improved Dense Policy for Effective and Generalizable Whole-body Mobile Manipulation☆23Updated last week
- Code for "AffordanceLLM: Grounding Affordance from Vision Language Models"☆14Updated last year
- [CVPR'2025] "DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation"☆18Updated 4 months ago
- [CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning☆51Updated 7 months ago
- ☆47Updated 5 months ago
- [NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"☆98Updated 2 weeks ago
- Code for "CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects", NeurIPS 2025☆49Updated last month
- 3DAffordSplat: Efficient Affordance Reasoning with 3D Gaussians (ACM MM 25)☆60Updated 4 months ago
- 🔥 The first open-sourced diffusion vision-language-action model.☆57Updated this week
- HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction☆42Updated 2 months ago
- Codes of Paper "Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding"☆20Updated last year
- Official implementation of ICCV 2025 paper "EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds".☆40Updated 4 months ago
- [CVPR 2025 highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision☆30Updated 3 weeks ago
- [ICLR 2025] Dataset and Code for Paper "Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels"☆43Updated 4 months ago
- Evo-0: Vision-Language-Action Model with Implicit Spatial Understanding.☆45Updated this week
- [ICCV 2025] RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping☆28Updated this week
- Pi0-VLA Repository of "MotionTrans: Human VR Data Enable Motion-Level Learning for Robotic Manipulation Policies"☆22Updated last month
- [CVPR 2025 Highlight] Towards Autonomous Micromobility through Scalable Urban Simulation☆138Updated 2 weeks ago
- ☆25Updated 9 months ago
- Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models☆158Updated last month
- KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation☆21Updated 7 months ago
- ☆25Updated 5 months ago
- [AAAI 2026] D²PPO: Diffusion Policy Policy Optimization with Dispersive Loss.☆27Updated this week
- Galaxea's first diffusion policy release☆32Updated 3 months ago
- MAPLE infuses dexterous manipulation priors from egocentric videos into vision encoders, making their features well-suited for downstream…☆28Updated 7 months ago
- [IROS 2025] CRUISE: Cooperative Reconstruction and Editing in V2X Scenarios using Gaussian Splatting☆27Updated 4 months ago