axon-rl / gem
A Gym for Agentic LLMs
☆444 · Updated 2 weeks ago
Alternatives and similar repositories for gem
Users who are interested in gem are comparing it to the libraries listed below.
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆185 · Updated 8 months ago
- OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆627 · Updated last week
- [COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models ☆139 · Updated last month
- [ICLR 2026] Learning to Reason without External Rewards ☆389 · Updated 2 weeks ago
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks ☆261 · Updated 9 months ago
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆350 · Updated last week
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆361 · Updated this week
- Ideas for projects related to Tinker ☆164 · Updated 3 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆116 · Updated 6 months ago
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning ☆175 · Updated 4 months ago
- [NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example ☆405 · Updated 2 months ago
- A repo for open research on building large reasoning models ☆136 · Updated last week
- ☆352 · Updated 6 months ago
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents ☆236 · Updated 6 months ago
- ☆117 · Updated last year
- Repo of paper "Free Process Rewards without Process Labels" ☆168 · Updated 10 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆261 · Updated 8 months ago
- ☆224 · Updated 10 months ago
- ☆330 · Updated 8 months ago
- [Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments ☆177 · Updated 3 weeks ago
- ☆113 · Updated 7 months ago
- Async pipelined version of Verl ☆124 · Updated 10 months ago
- Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike stat… ☆427 · Updated 2 weeks ago
- Reinforcing General Reasoning without Verifiers ☆96 · Updated 7 months ago
- (ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training ☆285 · Updated last year
- A brief and partial summary of RLHF algorithms. ☆144 · Updated 11 months ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆344 · Updated 3 months ago
- Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality ☆317 · Updated last month
- Benchmarking Agentic LLM and VLM Reasoning On Games ☆228 · Updated 2 months ago
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference ☆418 · Updated 2 months ago