Koorye / PCD
[ICLR 2026] Official implementation of the paper "Policy Contrastive Decoding for Robotic Foundation Models"
☆21Updated last week
Alternatives and similar repositories for PCD
Users that are interested in PCD are comparing it to the libraries listed below
- [ICRA 2026] Official implementation of the paper "InSpire: Vision-Language-Action Models with Intrinsic Spatial Reasoning"☆47Updated last week
- The official repo for "Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation", CoRL 2024 (ORAL)☆19Updated 7 months ago
- [WIP] Code for LangToMo☆20Updated 7 months ago
- Code & data for "RoboGround: Robotic Manipulation with Grounded Vision-Language Priors" (CVPR 2025)☆38Updated 8 months ago
- [ICCV 2025] MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation☆49Updated 3 months ago
- Code for "AffordanceLLM: Grounding Affordance from Vision Language Models"☆14Updated last year
- [CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning☆55Updated 10 months ago
- Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation"☆121Updated 5 months ago
- Official code for "From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation"☆29Updated 6 months ago
- [NeurIPS 2025] VLA-Cache: Towards Efficient Vision-Language-Action Model via Adaptive Token Caching in Robotic Manipulation☆65Updated 4 months ago
- [WACV 2025 Oral] Transferring Foundation Models for Generalizable Robotic Manipulation☆26Updated 10 months ago
- Official repo for AGNOSTOS, a cross-task manipulation benchmark, and X-ICM, a cross-task in-context manipulation (VLA) method☆57Updated 2 months ago
- Official repo for From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models☆32Updated 3 months ago
- Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies☆55Updated 2 months ago
- VLA-RFT: Vision-Language-Action Models with Reinforcement Fine-Tuning☆124Updated 4 months ago
- Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning☆79Updated 8 months ago
- [NeurIPS 2025] VIKI‑R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning☆70Updated last month
- [ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction☆115Updated 9 months ago
- An official implementation of Touch100k: A Large-Scale Touch-Language-Vision Dataset for Touch-Centric Multimodal Representation☆32Updated last year
- 🦾 A Dual-System VLA with System2 Thinking☆132Updated 5 months ago
- [AAAI26 oral] CronusVLA: Towards Efficient and Robust Manipulation via Multi-Frame Vision-Language-Action Modeling☆87Updated 3 weeks ago
- [CVPR 2024] Binding Touch to Everything: Learning Unified Multimodal Tactile Representations☆81Updated 2 months ago
- HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction☆41Updated 4 months ago
- Official Release of "Mixture of Horizons in Action Chunking"☆40Updated 2 months ago
- [ICLR 2026] InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation☆94Updated last week
- ☆92Updated last year
- [ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos☆162Updated 4 months ago
- [NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning☆38Updated 3 months ago
- ☆47Updated 7 months ago
- [ICML 2025] Rethinking Latent Redundancy in Behavior Cloning: An Information Bottleneck Approach for Robot Manipulation☆46Updated 8 months ago