wudongming97 / AffordanceNet
[ICCV 2025] RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping
☆24Updated last month
Alternatives and similar repositories for AffordanceNet
Users who are interested in AffordanceNet are comparing it to the repositories listed below.
- ☆50Updated 11 months ago
- The official repo for paper "VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers" (ICCV 2025)☆79Updated last month
- [CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning☆46Updated 5 months ago
- Unifying 2D and 3D Vision-Language Understanding☆104Updated 2 months ago
- [ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities☆80Updated 11 months ago
- [NeurIPS 2024] Official code repository for MSR3D paper☆64Updated last month
- [ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation☆165Updated 3 months ago
- Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation"☆66Updated last month
- Official implementation of "SUGAR: Pre-training 3D Visual Representations for Robotics" (CVPR'24).☆42Updated 3 months ago
- [NeurIPS 2024 D&B] Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning☆86Updated 11 months ago
- Open-source implementations on real robots☆34Updated 9 months ago
- List of papers on video-centric robot learning☆21Updated 10 months ago
- ImOV3D: Learning Open Vocabulary Point Clouds 3D Object Detection from Only 2D Images (NeurIPS2024)☆83Updated this week
- [CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding☆55Updated last year
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning☆41Updated 9 months ago
- Official implementation of the paper "InSpire: Vision-Language-Action Models with Intrinsic Spatial Reasoning"☆45Updated 3 weeks ago
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding☆114Updated 4 months ago
- [arXiv 2025] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence☆52Updated last month
- Code & data for "RoboGround: Robotic Manipulation with Grounded Vision-Language Priors" (CVPR 2025)☆27Updated 3 months ago
- HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction☆35Updated last week
- One-Shot Open Affordance Learning with Foundation Models (CVPR 2024)☆43Updated last year
- [CVPR 2025] Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation☆159Updated 3 months ago
- Code of 3DMIT: 3D MULTI-MODAL INSTRUCTION TUNING FOR SCENE UNDERSTANDING☆30Updated last year
- Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"☆83Updated last month
- ☆56Updated 7 months ago
- (CVPR 2025) A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning☆17Updated 6 months ago
- ☆34Updated 2 months ago
- InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation☆45Updated last week
- LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding (CVPR 2023)☆41Updated 2 years ago
- code for affordance-r1☆30Updated 3 weeks ago