microsoft / rStar
☆569 Updated 2 months ago
Alternatives and similar repositories for rStar
Users that are interested in rStar are comparing it to the libraries listed below
- ☆773 Updated last month
- ☆938 Updated 4 months ago
- A series of technical reports on Slow Thinking with LLM ☆699 Updated 2 weeks ago
- Large Reasoning Models ☆804 Updated 6 months ago
- TTRL: Test-Time Reinforcement Learning ☆637 Updated 2 weeks ago
- LIMO: Less is More for Reasoning ☆960 Updated 2 months ago
- ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning ☆936 Updated last month
- Understanding R1-Zero-Like Training: A Critical Perspective ☆988 Updated 3 weeks ago
- ☆297 Updated 3 weeks ago
- ReasonFlux Series - Open-Sourced LLM Family for Reasoning, Coding, Reward Modeling and Data Selection ☆406 Updated last week
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" ☆545 Updated 3 months ago
- SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning ☆422 Updated this week
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) ☆639 Updated 5 months ago
- Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling". ☆263 Updated 4 months ago
- Recipes to scale inference-time compute of open models ☆1,095 Updated last month
- Automatic evals for LLMs ☆429 Updated 2 weeks ago
- Code for the paper: "Learning to Reason without External Rewards" ☆295 Updated this week
- Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models ☆451 Updated this week
- An Open-source RL System from ByteDance Seed and Tsinghua AIR ☆1,349 Updated last month
- ☆331 Updated 2 weeks ago
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning ☆357 Updated 9 months ago
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆561 Updated 3 weeks ago
- official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example” ☆290 Updated this week
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't" ☆236 Updated last month
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆220 Updated last month
- Scaling Deep Research via Reinforcement Learning in Real-world Environments. ☆450 Updated 2 months ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆379 Updated last week
- [ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction ☆534 Updated last month
- Official Repo for Open-Reasoner-Zero ☆1,967 Updated 2 weeks ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆337 Updated 6 months ago