groundlight / r1_vlm
Build your own visual reasoning model
☆357Updated this week
Alternatives and similar repositories for r1_vlm:
Users who are interested in r1_vlm are comparing it to the libraries listed below
- procedural reasoning datasets☆573Updated this week
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"☆307Updated 5 months ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"☆436Updated 3 weeks ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…☆322Updated 4 months ago
- Exploring Applications of GRPO☆189Updated this week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse …☆171Updated this week
- Verifiers for LLM Reinforcement Learning☆881Updated last month
- System 2 Reasoning Link Collection☆828Updated last month
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.☆336Updated 2 weeks ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.☆172Updated 3 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective☆908Updated 3 weeks ago
- ☆150Updated 2 months ago
- ☆1,017Updated 4 months ago
- Pretraining code for a large-scale depth-recurrent language model☆755Updated 3 weeks ago
- Tina: Tiny Reasoning Models via LoRA☆164Updated last week
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents☆487Updated 3 weeks ago
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering☆692Updated 3 weeks ago
- Training Large Language Model to Reason in a Continuous Latent Space☆1,094Updated 3 months ago
- TTRL: Test-Time Reinforcement Learning☆407Updated last week
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆186Updated 3 weeks ago
- ☆199Updated 2 months ago
- Large Reasoning Models☆804Updated 5 months ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym☆448Updated last month
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"☆512Updated last month
- Friends of OLMo and their links.☆274Updated 4 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data☆281Updated this week
- ☆181Updated 2 months ago
- minimal GRPO implementation from scratch☆87Updated last month
- ☆524Updated 2 weeks ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.☆305Updated 6 months ago