Unleashing the Power of Reinforcement Learning for Math and Code Reasoners
☆741 · Jun 6, 2025 · Updated 8 months ago
Alternatives and similar repositories for Skywork-OR1
Users who are interested in Skywork-OR1 are comparing it to the repositories listed below
- Official Repo for Open-Reasoner-Zero · ☆2,087 · Jun 2, 2025 · Updated 9 months ago
- ☆813 · Jun 9, 2025 · Updated 8 months ago
- A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning · ☆282 · Sep 25, 2025 · Updated 5 months ago
- Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible. · ☆3,586 · Updated this week
- Understanding R1-Zero-Like Training: A Critical Perspective · ☆1,219 · Aug 27, 2025 · Updated 6 months ago
- Simple RL training for reasoning · ☆3,830 · Dec 23, 2025 · Updated 2 months ago
- Democratizing Reinforcement Learning for LLMs · ☆5,167 · Updated this week
- A series of technical reports on Slow Thinking with LLM · ☆760 · Aug 13, 2025 · Updated 6 months ago
- ☆762 · Dec 23, 2025 · Updated 2 months ago
- An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL) · ☆9,037 · Feb 21, 2026 · Updated last week
- Scalable RL solution for advanced reasoning of language models · ☆1,809 · Mar 18, 2025 · Updated 11 months ago
- An Open-source RL System from ByteDance Seed and Tsinghua AIR · ☆1,739 · May 11, 2025 · Updated 9 months ago
- verl: Volcano Engine Reinforcement Learning for LLMs · ☆19,519 · Updated this week
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] · ☆221 · Nov 27, 2025 · Updated 3 months ago
- ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason with Tool Call for LLMs via Rei… · ☆1,328 · May 16, 2025 · Updated 9 months ago
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. · ☆2,522 · Updated this week
- ☆1,104 · Jan 10, 2026 · Updated last month
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning · ☆193 · Mar 20, 2025 · Updated 11 months ago
- ☆335 · May 24, 2025 · Updated 9 months ago
- slime is an LLM post-training framework for RL Scaling. · ☆4,381 · Updated this week
- Reproduce R1 Zero on Logic Puzzle · ☆2,439 · Mar 20, 2025 · Updated 11 months ago
- Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed. · ☆745 · Jun 6, 2025 · Updated 8 months ago
- Scaling Deep Research via Reinforcement Learning in Real-world Environments. · ☆705 · Oct 15, 2025 · Updated 4 months ago
- O1 Replication Journey · ☆1,999 · Jan 14, 2025 · Updated last year
- Scaling RL on advanced reasoning models · ☆665 · Oct 20, 2025 · Updated 4 months ago
- OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models · ☆1,833 · Jan 17, 2025 · Updated last year
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL · ☆4,085 · Nov 13, 2025 · Updated 3 months ago
- An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models · ☆2,881 · Updated this week
- Official Repository of "Learning to Reason under Off-Policy Guidance" · ☆418 · Oct 4, 2025 · Updated 4 months ago
- repo for paper https://arxiv.org/abs/2504.13837 · ☆329 · Dec 17, 2025 · Updated 2 months ago
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… · ☆533 · Updated this week
- (best/better) practices of megatron on veRL and tuning guide · ☆131 · Sep 26, 2025 · Updated 5 months ago
- [ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction · ☆568 · May 6, 2025 · Updated 9 months ago
- AllenAI's post-training codebase · ☆3,592 · Updated this week
- ☆331 · May 31, 2025 · Updated 9 months ago
- a-m-team's exploration in large language modeling · ☆194 · May 29, 2025 · Updated 9 months ago
- Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities · ☆1,164 · Jul 15, 2025 · Updated 7 months ago
- Large Reasoning Models · ☆807 · Dec 3, 2024 · Updated last year
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" · ☆184 · May 20, 2025 · Updated 9 months ago