PRIME-RL / TTRL
TTRL: Test-Time Reinforcement Learning
☆650 Updated 2 weeks ago
Alternatives and similar repositories for TTRL
Users who are interested in TTRL are comparing it to the repositories listed below.
- An Open-source RL System from ByteDance Seed and Tsinghua AIR ☆1,364 Updated last month
- Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models ☆464 Updated last week
- A series of technical reports on Slow Thinking with LLMs ☆699 Updated 2 weeks ago
- ☆570 Updated 2 months ago
- ☆782 Updated last month
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) ☆638 Updated 5 months ago
- Large Reasoning Models ☆804 Updated 6 months ago
- Awesome RL Reasoning Recipes ("Triple R") ☆706 Updated last week
- Understanding R1-Zero-Like Training: A Critical Perspective ☆991 Updated last month
- Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling". ☆264 Updated 4 months ago
- Official Repo for Open-Reasoner-Zero ☆1,969 Updated 3 weeks ago
- Explore the Multimodal “Aha Moment” on 2B Model ☆594 Updated 3 months ago
- Official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example” ☆304 Updated last week
- ☆300 Updated 3 weeks ago
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't" ☆238 Updated last month
- MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning ☆665 Updated last month
- Awesome RL-based LLM Reasoning ☆526 Updated last month
- LIMO: Less is More for Reasoning ☆963 Updated 2 months ago
- Code for the paper: "Learning to Reason without External Rewards" ☆306 Updated last week
- ☆939 Updated 5 months ago
- Dream 7B, a large diffusion language model ☆774 Updated last week
- A fork to add multimodal model training to open-r1 ☆1,309 Updated 4 months ago
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆566 Updated last month
- ReasonFlux Series - Open-Sourced LLM Family for Reasoning, Coding, Reward Modeling and Data Selection ☆409 Updated 2 weeks ago
- Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training ☆281 Updated last month
- Latest Advances on Long Chain-of-Thought Reasoning ☆390 Updated 3 weeks ago
- Official Repository of "Learning to Reason under Off-Policy Guidance" ☆240 Updated 3 weeks ago
- Scalable RL solution for advanced reasoning of language models ☆1,622 Updated 3 months ago
- Tina: Tiny Reasoning Models via LoRA ☆260 Updated 3 weeks ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆222 Updated last month