RUCAIBox / Slow_Thinking_with_LLMs

A series of technical reports on Slow Thinking with LLMs

☆739

Alternatives and similar repositories for Slow_Thinking_with_LLMs

Users who are interested in Slow_Thinking_with_LLMs compare it to the repositories listed below

lqtrung1998 / mwp_ReFT
☆548 Updated 9 months ago
THUDM / ReST-MCTS
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
☆671 Updated 9 months ago
0russwest0 / Agent-R1
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
☆840 Updated 3 months ago
huggingface / Math-Verify
☆971 Updated 3 months ago
RUCAIBox / R1-Searcher
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
☆649 Updated 2 months ago
TIGER-AI-Lab / verl-tool
A version of verl to support diverse tool use
☆607 Updated this week
Eclipsess / Awesome-Efficient-Reasoning-LLMs
[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
☆651 Updated this week
dvlab-research / Step-DPO
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
☆384 Updated 9 months ago
qiancheng0 / ToolRL
☆365 Updated this week
0russwest0 / Awesome-Agent-RL
☆415 Updated last week
wjn1996 / Awesome-LLM-Reasoning-Openai-o1-Survey
Related work and background techniques for OpenAI o1
☆222 Updated 9 months ago
princeton-nlp / LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
☆496 Updated last year
SimpleBerry / LLaMA-O1
Large Reasoning Models
☆805 Updated 10 months ago
zhentingqi / rStar
☆963 Updated 8 months ago
pengr / LLM-Synthetic-Data
A live reading list for LLM data synthesis (Updated to July, 2025).
☆387 Updated last month
thinkwee / AgentsMeetRL
An Awesome List of Agentic Models Trained with Reinforcement Learning
☆519 Updated last week
ElliottYan / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆348 Updated 2 weeks ago
GAIR-NLP / DeepResearcher
Scaling Deep Research via Reinforcement Learning in Real-world Environments.
☆630 Updated last week
eddycmu / demystify-long-cot
☆323 Updated 4 months ago
GAIR-NLP / ToRL
☆300 Updated 4 months ago
RUC-NLPIR / ARPO
✨ Agentic Reinforced Policy Optimization
☆654 Updated this week
sail-sg / oat-zero
A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.
☆247 Updated 6 months ago
Qihoo360 / Light-R1
☆748 Updated last month
modelscope / Trinity-RFT
Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…
☆369 Updated this week
princeton-nlp / SimPO
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
☆923 Updated 8 months ago
XiaoYee / Awesome_Efficient_LRM_Reasoning
😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond
☆308 Updated this week
Hongcheng-Gao / Awesome-Long2short-on-LRMs
Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains…
☆252 Updated 2 months ago
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆258 Updated 5 months ago
QwenLM / AutoIF
☆312 Updated last year
langfengQ / verl-agent
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in…
☆1,045 Updated last week