Kami-code / HandsOnVLM-release

HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction

☆42

Alternatives and similar repositories for HandsOnVLM-release

Users who are interested in HandsOnVLM-release are comparing it to the repositories listed below.

Reagan1311 / LOCATE
LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding (CVPR 2023)
☆45 Updated 2 years ago
cvlab-columbia / dreamitate
Dreamitate: Real-World Visuomotor Policy Learning via Video Generation (CoRL 2024)
☆58 Updated 6 months ago
video-to-action / video-to-action-release
[ICLR 2025 Spotlight] Grounding Video Models to Actions through Goal Conditioned Exploration
☆58 Updated 7 months ago
TritiumR / Prompting-with-the-Future
Implementation of Prompting with the Future: Open-World Model Predictive Control with Interactive Digital Twins. [RSS 2025]
☆45 Updated 2 months ago
Hoyyyaard / 3DFlowAction
☆41 Updated 5 months ago
shivanshpatel35 / rigvid
☆43 Updated 5 months ago
michaelyuancb / general_flow
Repository for "General Flow as Foundation Affordance for Scalable Robot Learning"
☆68 Updated last year
MarionLepert / phantom
☆60 Updated 3 months ago
Reagan1311 / OOAL
One-Shot Open Affordance Learning with Foundation Models (CVPR 2024)
☆45 Updated last year
KimHanjung / UniSkill
[CoRL 2025] UniSkill: Imitating Human Videos via Cross-Embodiment Skill Representations
☆71 Updated 3 months ago
Nimolty / RoboKeyGen
☆19 Updated last year
ControlVLA / ControlVLA
Code repository for ControlVLA (CoRL 2025).
☆81 Updated last month
ZzZZCHS / RoboGround
Code & data for "RoboGround: Robotic Manipulation with Grounded Vision-Language Priors" (CVPR 2025)
☆32 Updated 6 months ago
rainbow979 / robodreamer
☆89 Updated last year
homangab / Track-2-Act
Code for the paper "Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Manipulation"
☆100 Updated last year
HeegerGao / FLIP
Code for FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks
☆79 Updated last year
HaoyiZhu / PointCloudMatters
[NeurIPS 2024 D&B] Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning
☆89 Updated last year
sihengz02 / RoLA
[CoRL 2025] Robot Learning from Any Images
☆34 Updated last month
TianxingChen / G3Flow
[CVPR 25] G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation
☆91 Updated 6 months ago
leolyliu / TACO-Instructions
Official repository of "TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding".
☆61 Updated 3 weeks ago
TEA-Lab / Robo-ABC
[ECCV 2024] 🎉 Official repository of "Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipu…
☆92 Updated last year
robomonkey-vla / RoboMonkey
☆25 Updated last month
Fsoft-AIC / Open-Vocabulary-Affordance-Detection-in-3D-Point-Clouds
[IROS 2023] Open-Vocabulary Affordance Detection in 3D Point Clouds
☆81 Updated last year
liy1shu / FlowBotHD
FlowBotHD: History-Aware Diffuser Handling Ambiguities in Articulated Objects Manipulation
☆14 Updated last year
horipse01 / 3d-foundation-policy
☆86 Updated 3 months ago
Selen-Suyue / MBA
[RA-L 2025] Motion Before Action: Diffusing Object Motion as Manipulation Condition
☆67 Updated last month
Tengbo-Yu / AnyBimanual
[ICCV 2025] AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation
☆93 Updated 5 months ago
vlc-robot / robot_sugar
Official implementation of "SUGAR: Pre-training 3D Visual Representations for Robotics" (CVPR'24).
☆45 Updated 6 months ago
hgaurav2k / hop
Hand-Object Interaction Pretraining from Videos
☆110 Updated 3 months ago
wudongming97 / AffordanceNet
[ICCV 2025] RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping
☆30 Updated last month