GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆167 Updated last month
Alternatives and similar repositories for OctoThinker
Users who are interested in OctoThinker are comparing it to the repositories listed below.
- General Reasoner: Advancing LLM Reasoning Across All Domains ☆163 Updated 2 months ago
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025] ☆170 Updated last month
- RL Scaling and Test-Time Scaling (ICML'25)