Chengsong-Huang / R-Zero
Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)
☆530 · Updated this week
Alternatives and similar repositories for R-Zero
Users interested in R-Zero are comparing it to the repositories listed below.
- Training teachers with reinforcement learning to teach LLMs how to reason for test-time scaling. ☆335 · Updated 2 months ago
- A Scientific Multimodal Foundation Model ☆541 · Updated last week
- Self-Adapting Language Models ☆777 · Updated 3 weeks ago
- Official Code of Memento: Fine-tuning LLM Agents without Fine-tuning LLMs ☆357 · Updated this week
- OpenCUA: Open Foundations for Computer-Use Agents ☆398 · Updated last week
- ☆746 · Updated this week
- ☆341 · Updated this week
- Scaling RL on advanced reasoning models ☆574 · Updated 2 weeks ago
- Atom of Thoughts for Markov LLM Test-Time Scaling ☆585 · Updated 2 months ago
- A MemAgent framework that can be extrapolated to 3.5M tokens, along with a training framework for RL training of any agent workflow. ☆625 · Updated last month
- ☆403 · Updated this week
- 🐉 Loong: Synthesize Long CoTs at Scale through Verifiers. ☆316 · Updated last month
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL. ☆305 · Updated last week
- ☆213 · Updated 6 months ago
- Code for the paper "Learning to Reason without External Rewards" ☆349 · Updated last month
- Code and data for the Chain-of-Draft (CoD) paper ☆320 · Updated 5 months ago
- Tina: Tiny Reasoning Models via LoRA ☆278 · Updated 2 weeks ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only a textual task description as input ☆853 · Updated 2 months ago
- Build your own visual reasoning model ☆407 · Updated last week
- Official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example” ☆350 · Updated this week
- [Up-to-date] Awesome RAG Reasoning Resources ☆266 · Updated last month
- [Preprint 2025] Thinkless: LLM Learns When to Think ☆219 · Updated 2 months ago
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling ☆449 · Updated last month
- ☆174 · Updated 3 weeks ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally applicable memory systems for transformers. ☆319 · Updated 10 months ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆220 · Updated 2 months ago
- Official implementation of the paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space" ☆213 · Updated last week
- Dream 7B, a large diffusion language model ☆938 · Updated last week
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks ☆239 · Updated 3 months ago
- Implementation for OAgents: An Empirical Study of Building Effective Agents ☆242 · Updated this week