unira-zwj / PhysVLM
PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability
☆14 · Updated 2 months ago
Alternatives and similar repositories for PhysVLM
Users interested in PhysVLM are comparing it to the libraries listed below.
- ☆34 · Updated 4 months ago
- [CVPR 2025] G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation ☆63 · Updated last month
- [RAL 2024] OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding ☆27 · Updated 3 months ago
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆37 · Updated 5 months ago
- [CoRL 2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model` ☆111 · Updated 7 months ago
- Splat-MOVER: Multi-Stage, Open-Vocabulary Robotic Manipulation via Editable Gaussian Splatting ☆32 · Updated 7 months ago
- ☆14 · Updated 2 weeks ago
- ☆63 · Updated 4 months ago
- Code for FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks ☆62 · Updated 5 months ago
- ☆49 · Updated 7 months ago
- Code & data for "RoboGround: Robotic Manipulation with Grounded Vision-Language Priors" (CVPR 2025) ☆12 · Updated last week
- ☆83 · Updated last week
- [NeurIPS 2024 D&B] Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning ☆77 · Updated 7 months ago
- [NeurIPS 2024] Official code repository for MSR3D paper ☆54 · Updated 3 weeks ago
- Unifying 2D and 3D Vision-Language Understanding ☆82 · Updated last month
- This repo has moved to https://github.com/haosulab/ManiSkill ☆8 · Updated last year
- Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning ☆64 · Updated 2 weeks ago
- Official implementation of "Re3Sim: Generating High-Fidelity Simulation Data via 3D-Photorealistic Real-to-Sim for Robotic Manipulation" ☆90 · Updated 2 months ago
- ☆59 · Updated last month
- [RSS 2025] Novel Demonstration Generation with Gaussian Splatting Enables Robust One-Shot Manipulation ☆79 · Updated last month
- [CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning ☆33 · Updated last month
- ☆52 · Updated 2 months ago
- [CoRL 2024] View-Invariant Policy Learning via Zero-Shot Novel View Synthesis ☆20 · Updated 4 months ago
- Click to Grasp takes calibrated RGB-D images of a tabletop and user-defined part instances in diverse source images as input, and produce… ☆18 · Updated last year
- [IROS 2024] PreAfford: Universal Affordance-Based Pre-grasping for Diverse Objects and Scenes ☆11 · Updated 7 months ago
- ☆16 · Updated last year
- ☆72 · Updated 8 months ago
- ☆64 · Updated last month
- [CVPR 2024] Official Code for Dexterous Grasp Transformer ☆44 · Updated 5 months ago
- ☆18 · Updated 3 months ago