☆189 Aug 1, 2025 Updated 7 months ago
Alternatives and similar repositories for villa-x
Users that are interested in villa-x are comparing it to the libraries listed below
- ICCV2025 ☆161 Dec 10, 2025 Updated 2 months ago
- ☆53 Updated this week
- [ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos ☆164 Oct 1, 2025 Updated 5 months ago
- [NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning" ☆134 Nov 4, 2025 Updated 4 months ago
- Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization ☆349 Jan 27, 2026 Updated last month
- [NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge ☆297 Jan 6, 2026 Updated 2 months ago
- A unified robotic manipulation learning framework ☆21 Sep 4, 2025 Updated 6 months ago
- [RSS 2025] Gripper Keypose and Object Pointflow as Interfaces for Bimanual Robotic Manipulation ☆77 Jul 22, 2025 Updated 7 months ago
- Official PyTorch Implementation of Unified Video Action Model (RSS 2025) ☆342 Jul 23, 2025 Updated 7 months ago
- [ICLR 2025 Oral] Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation ☆280 Jul 8, 2025 Updated 8 months ago
- Official code for "QueST: Self-Supervised Skill Abstractions for Continuous Control" [NeurIPS 2024] ☆108 Nov 21, 2024 Updated last year
- [ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction ☆115 Apr 14, 2025 Updated 10 months ago
- Handeye calibration for FR3 & Realsense with Ros2. Using Ros2 Humble, easy_handeye2, ros2_aruco. ☆20 Jun 4, 2025 Updated 9 months ago
- [CoRL 2024 Outstanding Paper Award Finalist] Equivariant Diffusion Policy ☆127 Feb 13, 2025 Updated last year
- Official code of RDT 2 ☆732 Feb 7, 2026 Updated last month
- HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction ☆41 Sep 15, 2025 Updated 5 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆86 May 21, 2025 Updated 9 months ago
- [ICCV 2025] RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping ☆37 Nov 21, 2025 Updated 3 months ago
- A semi print-in-place hand for human-like manipulation, designed to be built by anyone. ☆17 Jan 5, 2026 Updated 2 months ago
- ☆14 Apr 14, 2025 Updated 10 months ago
- F1: A Vision Language Action Model Bridging Understanding and Generation to Actions ☆162 Jan 2, 2026 Updated 2 months ago
- OpenHelix: An Open-source Dual-System VLA Model for Robotic Manipulation ☆347 Aug 27, 2025 Updated 6 months ago
- [RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions ☆1,017 Nov 19, 2025 Updated 3 months ago
- Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Goo… ☆991 Dec 20, 2025 Updated 2 months ago
- [ICLR 2025] LAPA: Latent Action Pretraining from Videos ☆478 Jan 22, 2025 Updated last year
- Official repository of LIBERO-plus, a generalized benchmark for in-depth robustness analysis of vision-language-action models. ☆233 Jan 21, 2026 Updated last month
- Official Algorithm Codebase for the Paper "BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household A… ☆164 Aug 24, 2025 Updated 6 months ago
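- An open-source mathematical large language model project that aims to explore whether large models have mathematical creativity and what potential they have in frontier mathematical research. ☆17 May 16, 2025 Updated 9 months ago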
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models ☆17 Nov 4, 2025 Updated 4 months ago
- Prototyping mujoco simulation environments. ☆11 Feb 20, 2025 Updated last year
- Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model ☆12 Feb 11, 2025 Updated last year
- This is the official code repo for GLOVER and GLOVER++. ☆50 Aug 6, 2025 Updated 7 months ago
- MuMA-ToM: Multi-modal Multi-Agent Theory of Mind ☆38 Jan 23, 2025 Updated last year
- Repository to train and evaluate RoboAgent ☆360 Apr 2, 2024 Updated last year
- Official implementation of CharacterShot: Controllable and Consistent 4D Character Animation ☆49 Feb 27, 2026 Updated last week
- Code for "ACG: Action Coherence Guidance for Flow-based Vision-Language-Action Models" (ICRA 2026) ☆62 Feb 21, 2026 Updated 2 weeks ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World". ☆27 Aug 20, 2025 Updated 6 months ago
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning" ☆49 Jan 30, 2026 Updated last month
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems" ☆13 Dec 13, 2024 Updated last year