TinyLoopX / RLLaVA
RLLaVA is a user-friendly framework for multi-modal RL research, optimized for resource-constrained teams.
☆54 Updated last week
Alternatives and similar repositories for RLLaVA
Users who are interested in RLLaVA are comparing it to the libraries listed below.
- The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning ☆331 Updated 8 months ago
- (ICLR 2026) Unveiling Super Experts in Mixture-of-Experts Large Language Models ☆35 Updated 4 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme ☆147 Updated 10 months ago
- DELT: Data Efficacy for Language Model Training ☆43 Updated 2 weeks ago
- ☆125 Updated last year
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆109 Updated 8 months ago
- The SAIL-VL2 series model developed by the BytedanceDouyinContent Group ☆76 Updated 4 months ago
- Open-Pandora: On-the-fly Control Video Generation ☆35 Updated last year
- ☆59 Updated 6 months ago
- Adapt an LLM model to a Mixture-of-Experts model using Parameter Efficient finetuning (LoRA), injecting the LoRAs in the FFN. ☆84 Updated 3 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆91 Updated 11 months ago
- ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory [COLM2025] ☆200 Updated 6 months ago
- [SCIS] MULTI-Benchmark: Multimodal Understanding Leaderboard with Text and Images ☆44 Updated 2 months ago
- ☆110 Updated last year
- ☆74 Updated 8 months ago
- CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models (NeurIPS 2025) ☆172 Updated 3 months ago
- ICLR 2025 ☆30 Updated 8 months ago
- ☆143 Updated 2 months ago
- OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe ☆141 Updated last month
- [ICML 2025 Oral] Mixture of Lookup Experts ☆70 Updated 2 months ago
- [TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models ☆149 Updated 4 months ago
- ☆111 Updated 7 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆43 Updated last year
- LMM solved catastrophic forgetting, AAAI2025 ☆45 Updated 9 months ago
- [ICML2025] The official implementation of "C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Gene… ☆41 Updated 9 months ago
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models ☆153 Updated 7 months ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward ☆91 Updated 6 months ago
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model ☆64 Updated 3 months ago
- Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning. ☆164 Updated 4 months ago
- Extrapolating RLVR to General Domains without Verifiers ☆200 Updated 5 months ago