Chengsong-Huang / R-Zero
Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)
☆618Updated last week
Alternatives and similar repositories for R-Zero
Users that are interested in R-Zero are comparing it to the libraries listed below
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling.☆340Updated 2 months ago
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen…☆342Updated last week
- A Scientific Multimodal Foundation Model☆567Updated 2 weeks ago
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"☆147Updated this week
- A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.☆665Updated last month
- Self-Adapting Language Models☆790Updated last month
- ☆797Updated this week
- OpenCUA: Open Foundations for Computer-Use Agents☆471Updated 2 weeks ago
- ☆166Updated last month
- ☆404Updated last week
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL.☆406Updated last week
- Scaling RL on advanced reasoning models☆585Updated last month
- Official Code of Memento: Fine-tuning LLM Agents without Fine-tuning LLMs☆1,446Updated this week
- 🐉 Loong: Synthesize Long CoTs at Scale through Verifiers.☆429Updated 2 weeks ago
- Atom of Thoughts for Markov LLM Test-Time Scaling☆586Updated 3 months ago
- Code for the paper: "Learning to Reason without External Rewards"☆354Updated 2 months ago
- ☆1,233Updated last week
- [Preprint 2025] Thinkless: LLM Learns When to Think☆225Updated 2 months ago
- Dream 7B, a large diffusion language model☆970Updated 3 weeks ago
- Tina: Tiny Reasoning Models via LoRA☆282Updated last month
- Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"☆160Updated 3 months ago
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation☆443Updated last month
- ☆214Updated 6 months ago
- MCP-Universe is a comprehensive framework designed for developing, testing, and benchmarking AI agents☆423Updated this week
- ☆619Updated 3 weeks ago
- [EMNLP 2025] Awesome RAG Reasoning Resources☆295Updated last month
- TTRL: Test-Time Reinforcement Learning☆806Updated last month
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models☆220Updated last week
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling☆458Updated last month
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"☆261Updated 4 months ago