Yu-Fangxu / FoRLinks

[ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples

☆107

Alternatives and similar repositories for FoR

Users that are interested in FoR are comparing it to the libraries listed below

Sorting:

jwhj / OREO
☆116Updated 9 months ago
kyegomez / Lets-Verify-Step-by-Step
"Improving Mathematical Reasoning with Process Supervision" by OPENAI
☆111Updated this week
zitian-gao / SC-MCTS
Interpretable Contrastive Monte Carlo Tree Search Reasoning
☆48Updated 11 months ago
THUDM / T1
RL Scaling and Test-Time Scaling (ICML'25)
☆111Updated 9 months ago
WindyLee0822 / Process_Q_Model
official implementation of paper "Process Reward Model with Q-value Rankings"
☆64Updated 8 months ago
icip-cas / Verifier-Engineering
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
☆61Updated 10 months ago
HKUNLP / critic-rl
[ICML 2025] Teaching Language Models to Critique via Reinforcement Learning
☆114Updated 5 months ago
rxlqn / awesome-llm-self-reflection
augmented LLM with self reflection
☆132Updated last year
TIGER-AI-Lab / CritiqueFineTuning
Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025]
☆178Updated 3 months ago
THU-KEG / Agentic-Reward-Modeling
[ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
☆108Updated 4 months ago
Berkeley-NLP / Agent-Eval-Refine
Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]
☆146Updated 10 months ago
SalesforceAIResearch / LaTRO
☆122Updated 8 months ago
Gen-Verse / CURE
[NeurIPS 2025 Spotlight] ReasonFlux-Coder: Open-Source LLM Coders with Co-Evolving Reinforcement Learning
☆126Updated last month
ScalerLab / JudgeBench
☆102Updated 11 months ago
hkust-nlp / B-STaR
B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
☆85Updated 5 months ago
Yifan-Song793 / ETO
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
☆151Updated 11 months ago
architsharma97 / dpo-rlaif
☆100Updated last year
LAMDASZ-ML / Self-Backtracking
☆50Updated 8 months ago
OSU-NLP-Group / llm-planning-eval
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
☆54Updated last year
mukhal / ThinkPRM
Process Reward Models That Think
☆59Updated last week
GuanghaoYe / Emergence-of-Thinking
☆53Updated 8 months ago
Open-Source-O1 / o1_Reasoning_Patterns_Study
☆104Updated 10 months ago
YangLing0818 / SuperCorrect-llm
[ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
☆82Updated 7 months ago
thunlp / Optima
Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System"
☆65Updated 11 months ago
facebookresearch / sweet_rl
Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents onCollaborative Reasoning Tasks
☆246Updated 5 months ago
GAIR-NLP / AIME-Preview
☆73Updated 7 months ago
da03 / Internalize_CoT_Step_by_Step
☆195Updated 6 months ago
Parallel-Reasoning / APR
[COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models
☆132Updated 2 months ago
thu-coai / SPaR
☆46Updated 4 months ago
MLE-Dojo / MLE-Dojo
☆76Updated last month