[ICLR2026] codes for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)
☆805Feb 4, 2026Updated 3 months ago
Alternatives and similar repositories for R-Zero
Users that are interested in R-Zero are comparing it to the libraries listed below.
- codes for Efficient Test-Time Scaling via Self-Calibration☆20Sep 13, 2025Updated 8 months ago
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"☆259Feb 4, 2026Updated 3 months ago
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning☆190Mar 27, 2026Updated 2 months ago
- [NeurIPS 2025] TTRL: Test-Time Reinforcement Learning☆1,071Apr 15, 2026Updated last month
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).☆50Mar 31, 2026Updated last month
- SSRL: Self-Search Reinforcement Learning☆208Aug 20, 2025Updated 9 months ago
- ☆12Apr 18, 2025Updated last year
- Socratic-Zero is a fully autonomous framework that generates high-quality training data for mathematical reasoning☆36Oct 26, 2025Updated 7 months ago
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models"☆27May 13, 2025Updated last year
- A version of verl to support diverse tool use☆984Mar 2, 2026Updated 2 months ago
- Official Repository of Absolute Zero Reasoner☆1,860Aug 24, 2025Updated 9 months ago
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]☆226Nov 27, 2025Updated 6 months ago
- ☆21Dec 14, 2024Updated last year
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆149Apr 9, 2025Updated last year
- Agent0 Series: Self-Evolving Agents from Zero Data☆1,195Feb 17, 2026Updated 3 months ago
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL☆4,753Nov 13, 2025Updated 6 months ago
- Democratizing Reinforcement Learning for LLMs☆5,548May 20, 2026Updated last week
- [ICML 2026] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments☆213Apr 30, 2026Updated 3 weeks ago
- [NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning☆69Oct 31, 2025Updated 6 months ago
- ☆64Mar 30, 2026Updated last month
- ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …☆44Aug 6, 2025Updated 9 months ago
- Self-Questioning Language Models☆56Mar 30, 2026Updated last month
- XmodelLM☆38Nov 19, 2024Updated last year
- [ICLR 2026] Learning to Reason without External Rewards☆409Jan 26, 2026Updated 4 months ago
- ☆1,415Sep 12, 2025Updated 8 months ago
- [CVPR'26] VisPlay: Self-Evolving Vision-Language Models☆57Feb 25, 2026Updated 3 months ago
- ☆31Sep 12, 2025Updated 8 months ago
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆268May 5, 2025Updated last year
- verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework☆21,514Updated this week
- ☆41Oct 28, 2025Updated 7 months ago
- Code for "Variational Reasoning for Language Models"☆60Sep 29, 2025Updated 8 months ago
- ☆26Feb 20, 2026Updated 3 months ago
- Towards a Unified View of Large Language Model Post-Training☆211Sep 8, 2025Updated 8 months ago
- Model souping for LLMs☆73Nov 18, 2025Updated 6 months ago
- Official Repo for Open-Reasoner-Zero☆2,091Jun 2, 2025Updated 11 months ago
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL.☆571Sep 8, 2025Updated 8 months ago
- [NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"☆694Mar 16, 2025Updated last year
- Official Repository of "Learning to Reason under Off-Policy Guidance"☆451Mar 20, 2026Updated 2 months ago
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution☆80Dec 8, 2025Updated 5 months ago