jetteezhou / PhysVLM
PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability
☆34 · Updated 9 months ago
Alternatives and similar repositories for PhysVLM
Users interested in PhysVLM are comparing it to the repositories listed below.
- InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation ☆90 · Updated 3 months ago
- [CoRL 2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model` ☆121 · Updated last year
- ✨✨ [NeurIPS 2025] Official implementation of BridgeVLA ☆163 · Updated 3 months ago
- ☆64 · Updated last year
- Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation" ☆116 · Updated 4 months ago
- ☆130 · Updated 3 months ago
- [NeurIPS 2025] VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning ☆65 · Updated 3 weeks ago
- ☆126 · Updated 4 months ago
- Official implementation of the paper "WMPO: World Model-based Policy Optimization for Vision-Language-Action Models" ☆96 · Updated last week
- VLA-RFT: Vision-Language-Action Models with Reinforcement Fine-Tuning ☆113 · Updated 3 months ago
- Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning ☆79 · Updated 7 months ago
- Official implementation of "Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation", accepted at NeurIPS 2025 ☆93 · Updated 3 weeks ago
- Official repository for SAM2Act ☆219 · Updated 4 months ago
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization ☆154 · Updated 9 months ago
- F1: A Vision Language Action Model Bridging Understanding and Generation to Actions ☆153 · Updated last week
- ICCV 2025 ☆145 · Updated last month
- Official repository of LIBERO-plus, a generalized benchmark for in-depth robustness analysis of vision-language-action models ☆167 · Updated 3 weeks ago
- MLA: A Multisensory Language-Action Model for Multimodal Understanding and Forecasting in Robotic Manipulation ☆56 · Updated 2 months ago
- ☆86 · Updated 3 months ago
- ☆79 · Updated 4 months ago
- ☆63 · Updated 10 months ago
- [NeurIPS 2025 Spotlight] SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation ☆218 · Updated 6 months ago
- [IROS 2024 Oral] ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models ☆98 · Updated last year
- The repo of the paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation` ☆148 · Updated last year
- Official implementation of "OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning" ☆205 · Updated 7 months ago
- [NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge ☆273 · Updated this week
- [NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation ☆130 · Updated 4 months ago
- ☆89 · Updated last year
- Code for the CoRL 2025 paper "LaDiWM: A Latent Diffusion-based World Model for Predictive Manipulation" ☆42 · Updated last month
- [CVPR 2025] G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation ☆93 · Updated 7 months ago