Chengsong-Huang / R-Zero
Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)
☆757Feb 4, 2026Updated last week
Alternatives and similar repositories for R-Zero
Users who are interested in R-Zero are comparing it to the libraries listed below
- [NeurIPS 2025] TTRL: Test-Time Reinforcement Learning☆989Sep 26, 2025Updated 4 months ago
- Code for Efficient Test-Time Scaling via Self-Calibration☆19Sep 13, 2025Updated 5 months ago
- SSRL: Self-Search Reinforcement Learning☆206Aug 20, 2025Updated 5 months ago
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"☆256Feb 4, 2026Updated last week
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning☆175Sep 18, 2025Updated 4 months ago
- Official Repository of Absolute Zero Reasoner☆1,813Aug 24, 2025Updated 5 months ago
- XmodelLM☆38Nov 19, 2024Updated last year
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆147Apr 9, 2025Updated 10 months ago
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning☆62Oct 24, 2025Updated 3 months ago
- Democratizing Reinforcement Learning for LLMs☆5,106Updated this week
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL☆4,021Nov 13, 2025Updated 3 months ago
- ☆60Jan 12, 2026Updated last month
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).☆48Oct 16, 2025Updated 4 months ago
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]☆218Nov 27, 2025Updated 2 months ago
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"☆20Apr 9, 2025Updated 10 months ago
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆262May 5, 2025Updated 9 months ago
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models"☆27May 13, 2025Updated 9 months ago
- A version of verl to support diverse tool use☆868Jan 6, 2026Updated last month
- [NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"☆676Mar 16, 2025Updated 10 months ago
- [ICLR 2026] Geometric-Mean Policy Optimization☆100Jan 26, 2026Updated 3 weeks ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning"☆13Jun 22, 2025Updated 7 months ago
- ☆14Jan 24, 2025Updated last year
- [ICLR 2026] Learning to Reason without External Rewards☆391Jan 26, 2026Updated 2 weeks ago
- a survey on deep research☆47Sep 9, 2025Updated 5 months ago
- ☆145Sep 12, 2025Updated 5 months ago
- Official Repo for Open-Reasoner-Zero☆2,085Jun 2, 2025Updated 8 months ago
- ☆1,391Sep 12, 2025Updated 5 months ago
- ☆17Aug 1, 2025Updated 6 months ago
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution☆66Dec 8, 2025Updated 2 months ago
- ☆31Sep 12, 2025Updated 5 months ago
- [NeurIPS 2025] Efficient Reasoning Vision Language Models☆448Sep 18, 2025Updated 4 months ago
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL.☆536Sep 8, 2025Updated 5 months ago
- Official repository for K-EXAONE built by LG AI Research☆66Feb 6, 2026Updated last week
- 🔍 Search-o1: Agentic Search-Enhanced Large Reasoning Models [EMNLP 2025]☆1,170Nov 17, 2025Updated 2 months ago
- Deep Reasoning Translation (DRT) Project☆241Sep 1, 2025Updated 5 months ago
- verl: Volcano Engine Reinforcement Learning for LLMs☆19,132Updated this week
- The official implementation of the ECCV'24 paper MC-CoT: Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models w…☆26May 19, 2024Updated last year
- Pretraining and inference code for a large-scale depth-recurrent language model☆859Dec 29, 2025Updated last month
- Official Repository of "Learning to Reason under Off-Policy Guidance"☆416Oct 4, 2025Updated 4 months ago