H-Freax / Awesome-Video-Robotic-Papers
This repository compiles a list of papers related to the application of video technology in the field of robotics! Star ⭐ the repo and follow me if you like what you see 🤩.
⭐170 · Updated 10 months ago
Alternatives and similar repositories for Awesome-Video-Robotic-Papers
Users interested in Awesome-Video-Robotic-Papers are comparing it to the repositories listed below.
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization · ⭐153 · Updated 8 months ago
- Official Repository for MolmoAct · ⭐276 · Updated 2 weeks ago
- Official implementation of "Data Scaling Laws in Imitation Learning for Robotic Manipulation" · ⭐197 · Updated last year
- Code for subgoal synthesis via image editing · ⭐144 · Updated 2 years ago
- Embodied Chain of Thought: a robotic policy that reasons to solve the task. · ⭐344 · Updated 8 months ago
- Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets · ⭐172 · Updated 2 months ago
- ⭐257 · Updated last year
- [ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction · ⭐112 · Updated 8 months ago
- [ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy · ⭐226 · Updated 9 months ago
- VLAC: A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning · ⭐249 · Updated 3 months ago
- OpenVLA: An open-source vision-language-action model for robotic manipulation. · ⭐316 · Updated 9 months ago
- Official repo of VLABench, a large-scale benchmark designed for fairly evaluating VLAs, Embodied Agents, and VLMs. · ⭐350 · Updated last month
- Official PyTorch Implementation of Unified Video Action Model (RSS 2025) · ⭐309 · Updated 5 months ago
- Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos · ⭐192 · Updated 3 months ago
- Official repository of LIBERO-plus, a generalized benchmark for in-depth robustness analysis of vision-language-action models. · ⭐151 · Updated 2 weeks ago
- [ICRA 2025] In-Context Imitation Learning via Next-Token Prediction · ⭐105 · Updated 9 months ago
- InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation · ⭐87 · Updated 2 months ago
- ⭐125 · Updated 4 months ago
- [ICLR 2025] LAPA: Latent Action Pretraining from Videos · ⭐427 · Updated 11 months ago
- Official repository of Learning to Act from Actionless Videos through Dense Correspondences. · ⭐243 · Updated last year
- Official implementation of "OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning" · ⭐205 · Updated 6 months ago
- A Vision-Language Model for Spatial Affordance Prediction in Robotics · ⭐207 · Updated 5 months ago
- Latest Advances on Vision-Language-Action Models. · ⭐122 · Updated 9 months ago
- [ICCV 2025] RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints · ⭐97 · Updated 3 months ago
- [CoRL 2024] ThinkGrasp: A Vision-Language System for Strategic Part Grasping in Clutter. https://arxiv.org/abs/2407.11298 · ⭐105 · Updated 5 months ago
- Reimplementation of GR-1, a generalized policy for robot manipulation. · ⭐146 · Updated last year
- VLA-0: Building State-of-the-Art VLAs with Zero Modification · ⭐408 · Updated 2 weeks ago
- A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks · ⭐168 · Updated last week
- The repo of the paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation` · ⭐147 · Updated last year
- Autoregressive Policy for Robot Learning (RA-L 2025) · ⭐144 · Updated 9 months ago