CVMI-Lab / SlotMIM

(CVPR 2025) A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning

☆20

Alternatives and similar repositories for SlotMIM

Users interested in SlotMIM are comparing it to the repositories listed below.

michaelyuancb / general_flow
Repository for "General Flow as Foundation Affordance for Scalable Robot Learning"
☆66 Updated 11 months ago
HeegerGao / FLIP
Code for FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks
☆79 Updated 11 months ago
video-to-action / video-to-action-release
[ICLR 2025 Spotlight] Grounding Video Models to Actions through Goal Conditioned Exploration
☆58 Updated 6 months ago
Dantong88 / LLARVA
☆60 Updated 11 months ago
kahnchana / LangToMo
[WIP] Code for LangToMo
☆20 Updated 4 months ago
homangab / Track-2-Act
Code for the paper "Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Manipulation"
☆97 Updated last year
jmwang0117 / Video4Robot
List of papers on video-centric robot learning
☆22 Updated last year
ZzZZCHS / RoboGround
Code & data for "RoboGround: Robotic Manipulation with Grounded Vision-Language Priors" (CVPR 2025)
☆28 Updated 5 months ago
vlc-robot / robot_sugar
Official implementation of "SUGAR: Pre-training 3D Visual Representations for Robotics" (CVPR'24).
☆44 Updated 5 months ago
Max-Fu / otter
[ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
☆110 Updated 7 months ago
vlc-robot / hiveformer
☆33 Updated last year
Hoyyyaard / 3DFlowAction
☆40 Updated 4 months ago
Koorye / Inspire
Official implementation of the paper "InSpire: Vision-Language-Action Models with Intrinsic Spatial Reasoning"
☆45 Updated last month
Kami-code / HandsOnVLM-release
HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction
☆41 Updated 2 months ago
Reagan1311 / OOAL
One-Shot Open Affordance Learning with Foundation Models (CVPR 2024)
☆45 Updated last year
sled-group / RACER
[ICRA 2025] RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning
☆38 Updated last year
pickxiguapi / Embodied-R1
Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation"
☆102 Updated 3 months ago
rainbow979 / robodreamer
☆86 Updated last year
xiaoxiao0406 / VQ-VLA
The official repo for the paper "VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers" (ICCV 2025)
☆94 Updated this week
HaoyiZhu / PointCloudMatters
[NeurIPS 2024 D&B] Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning
☆88 Updated last year
Tengbo-Yu / AnyBimanual
[ICCV 2025] AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation
☆91 Updated 4 months ago
TencentARC / Moto
[ICCV 2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos
☆147 Updated last month
MCG-NJU / Tra-MoE
[CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
☆51 Updated 7 months ago
BerkeleyAutomation / mirage
Mirage: a zero-shot cross-embodiment policy transfer method. Benchmarking code for cross-embodiment policy transfer.
☆29 Updated last year
TeleeMa / Sigma-Agent
This is the official repo for [CoRL 2024] Contrastive Imitation Learning for Language-guided Multi-Task Robotic Manipulation
☆31 Updated last year
bytedance / IRASim
☆125 Updated 4 months ago
liufanfanlff / RoboUniview
☆61 Updated 9 months ago
Reagan1311 / LOCATE
LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding (CVPR 2023)
☆44 Updated 2 years ago
liyi14 / HAMSTER_beta
☆47 Updated 7 months ago
bytedance / GR-MG
Official implementation of GR-MG
☆90 Updated 10 months ago