Darren-greenhand / LLaVA_OpenVLA
Converts the OpenVLA training data into a general multimodal instruction format for use with LLaVA-OneVision
☆23 · Updated 10 months ago
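The description above refers to converting OpenVLA training data into a general multimodal instruction format for LLaVA-OneVision. As a rough, hypothetical illustration of what such a conversion could look like (the field names, the `<image>` placeholder convention, and the action-token serialization are assumptions, not this repository's actual schema):

```python
# Hypothetical sketch: wrap one OpenVLA-style sample (image + language
# instruction + discretized action) as a LLaVA-style single-turn
# multimodal conversation record. Not the repo's actual schema.
import json


def openvla_sample_to_llava_record(sample_id, image_path, instruction, action_tokens):
    """Map one robot-manipulation sample to a LLaVA-OneVision-style instruction record."""
    return {
        "id": sample_id,
        "image": image_path,  # path to the observation frame
        "conversations": [
            {
                "from": "human",
                # LLaVA-style prompts place an <image> placeholder before the text.
                "value": f"<image>\nWhat action should the robot take to {instruction}?",
            },
            {
                "from": "gpt",
                # Actions serialized as a token string, as OpenVLA-style models expect.
                "value": " ".join(action_tokens),
            },
        ],
    }


if __name__ == "__main__":
    record = openvla_sample_to_llava_record(
        sample_id="bridge_000001",
        image_path="images/bridge_000001.jpg",
        instruction="pick up the red cup",
        action_tokens=["<act_31>", "<act_112>", "<act_77>", "<act_255>", "<act_9>", "<act_64>", "<act_128>"],
    )
    print(json.dumps(record, ensure_ascii=False, indent=2))
```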
Alternatives and similar repositories for LLaVA_OpenVLA
Users interested in LLaVA_OpenVLA are comparing it to the libraries listed below.
- ☆126 · Updated last year
- ☆59 · Updated 7 months ago
- [NeurIPS 2025] ⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning. ☆233 · Updated last month
- [CVPR 2025] RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete. Official Repository. ☆342 · Updated last month
- Train a LLaVA model with better Chinese support, and open-source the training code and data. ☆77 · Updated last year
- Notes on multimodality-related knowledge for large language model (LLM) algorithm/application engineers. ☆252 · Updated last year
- [NeurIPS 2025] Official code implementation of Perception R1: Pioneering Perception Policy with Reinforcement Learning ☆270 · Updated 4 months ago
- Reproduction of the multimodal embodied-intelligence model OpenVLA, with improved fine-tuning on the LIBERO dataset ☆286 · Updated 3 months ago
- VLA-Arena is an open-source benchmark for systematic evaluation of Vision-Language-Action (VLA) models. ☆58 · Updated last week
- MLLM @ Game ☆14 · Updated 6 months ago
- An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models. ☆148 · Updated last month
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models). ☆121 · Updated last year
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks ☆181 · Updated 2 months ago
- Grafts the SmolVLM2 vision head onto Qwen3-0.6B and fine-tunes the combined model. ☆445 · Updated 2 months ago
- An offline embodied-intelligence guide dog based on the InternLM2 large model. ☆108 · Updated last year
- The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models." ☆319 · Updated 2 months ago
- Open Platform for Embodied Agents ☆333 · Updated 10 months ago
- ☆80 · Updated 3 months ago
- ✨ First Open-Source R1-like Video-LLM [2025/02/18] ☆376 · Updated 9 months ago
- The repo of the paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation` ☆139 · Updated 11 months ago
- ☆344 · Updated last year
- Qwen2.5 0.5B GRPO ☆71 · Updated 9 months ago
- Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function ind… ☆103 · Updated last year
- [COLM 2025] Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources ☆285 · Updated 3 months ago
- ☆92 · Updated last month
- SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning ☆1,022 · Updated last month
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme ☆145 · Updated 7 months ago
- Official code of the paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution" ☆119 · Updated 9 months ago
- Personal Project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conv… ☆510 · Updated 8 months ago
- LLaVA-VLA: A Simple Yet Powerful Vision-Language-Action Model [Actively Maintained🔥] ☆173 · Updated last month