[ICLR 2026] Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)
☆781Feb 4, 2026Updated last month
Alternatives and similar repositories for R-Zero
Users who are interested in R-Zero are comparing it to the libraries listed below.
- codes for Efficient Test-Time Scaling via Self-Calibration☆19Sep 13, 2025Updated 6 months ago
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"☆259Feb 4, 2026Updated last month
- [NeurIPS 2025] TTRL: Test-Time Reinforcement Learning☆1,034Mar 11, 2026Updated 2 weeks ago
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).☆48Oct 16, 2025Updated 5 months ago
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning☆179Sep 18, 2025Updated 6 months ago
- SSRL: Self-Search Reinforcement Learning☆206Aug 20, 2025Updated 7 months ago
- ☆12Apr 18, 2025Updated 11 months ago
- Socratic-Zero is a fully autonomous framework that generates high-quality training data for mathematical reasoning☆36Oct 26, 2025Updated 5 months ago
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models"☆27May 13, 2025Updated 10 months ago
- A version of verl to support diverse tool use☆923Mar 2, 2026Updated 3 weeks ago
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]☆224Nov 27, 2025Updated 4 months ago
- ☆21Dec 14, 2024Updated last year
- Agent0 Series: Self-Evolving Agents from Zero Data☆1,103Feb 17, 2026Updated last month
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆148Apr 9, 2025Updated 11 months ago
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL☆4,309Nov 13, 2025Updated 4 months ago
- Official Repository of Absolute Zero Reasoner☆1,829Aug 24, 2025Updated 7 months ago
- [NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning☆67Oct 31, 2025Updated 4 months ago
- Democratizing Reinforcement Learning for LLMs☆5,297Updated this week
- ☆64Jan 12, 2026Updated 2 months ago
- ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …☆45Aug 6, 2025Updated 7 months ago
- Self-Questioning Language Models☆57Jan 5, 2026Updated 2 months ago
- XmodelLM☆38Nov 19, 2024Updated last year
- [ICLR 2026] Learning to Reason without External Rewards☆403Jan 26, 2026Updated 2 months ago
- ☆1,403Sep 12, 2025Updated 6 months ago
- VisPlay: Self-Evolving Vision-Language Models☆51Feb 25, 2026Updated last month
- ☆31Sep 12, 2025Updated 6 months ago
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆265May 5, 2025Updated 10 months ago
- ☆47Feb 12, 2026Updated last month
- ☆38Oct 28, 2025Updated 5 months ago
- Towards a Unified View of Large Language Model Post-Training☆208Sep 8, 2025Updated 6 months ago
- Code for "Variational Reasoning for Language Models"☆58Sep 29, 2025Updated 6 months ago
- verl: Volcano Engine Reinforcement Learning for LLMs☆20,286Updated this week
- ☆25Feb 20, 2026Updated last month
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning☆62Oct 24, 2025Updated 5 months ago
- Model souping for LLMs☆72Nov 18, 2025Updated 4 months ago
- Official Repo for Open-Reasoner-Zero☆2,088Jun 2, 2025Updated 9 months ago
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL.☆552Sep 8, 2025Updated 6 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance"☆426Mar 20, 2026Updated last week
- [NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"☆683Mar 16, 2025Updated last year