YangLing0818 / SuperCorrect-llm

[ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction

☆76

Alternatives and similar repositories for SuperCorrect-llm

Users who are interested in SuperCorrect-llm are comparing it to the repositories listed below.

GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆159 Updated last week
sail-sg / CPO
[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.
☆125 Updated 4 months ago
THUDM / T1
RL Scaling and Test-Time Scaling (ICML'25)
☆109 Updated 6 months ago
HKUNLP / critic-rl
[ICML 2025] Teaching Language Models to Critique via Reinforcement Learning
☆105 Updated 2 months ago
TIGER-AI-Lab / CritiqueFineTuning
Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025]
☆169 Updated 3 weeks ago
GeniusHTX / TALE
☆126 Updated 2 months ago
zitian-gao / SC-MCTS
Interpretable Contrastive Monte Carlo Tree Search Reasoning
☆48 Updated 8 months ago
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains
☆156 Updated last month
Dereck0602 / Awesome_Test_Time_LLMs
☆117 Updated 4 months ago
NineAbyss / S2R
This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"
☆69 Updated 3 months ago
ReasoningTransfer / Transferability-of-LLM-Reasoning
☆80 Updated 2 weeks ago
GAIR-NLP / ReasonEval
[AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy
☆63 Updated 7 months ago
icip-cas / Verifier-Engineering
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
☆61 Updated 8 months ago
LightChen233 / reasoning-boundary
☆67 Updated last month
hkust-nlp / PreSelect
[ICML 2025] Predictive Data Selection: The Data That Predicts Is the Data That Teaches
☆53 Updated 5 months ago
RM-R1-UIUC / RM-R1
RM-R1: Unleashing the Reasoning Potential of Reward Models
☆118 Updated last month
chujiezheng / LLM-Extrapolation
Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"
☆75 Updated 2 months ago
efficientscaling / Z1
Repo for "Z1: Efficient Test-time Scaling with Code"
☆63 Updated 3 months ago
GAIR-NLP / weak-to-strong-reasoning
☆59 Updated 11 months ago
hkust-nlp / dart-math
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
☆110 Updated 7 months ago
CMU-AIRe / MRT
Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".
☆100 Updated 3 weeks ago
KbsdJames / Omni-MATH
The official repository of the Omni-MATH benchmark.
☆85 Updated 7 months ago
DAMO-NLP-SG / LongPO
[ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
☆38 Updated 5 months ago
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆234 Updated 2 months ago
PRIME-RL / ImplicitPRM
Repo of paper "Free Process Rewards without Process Labels"
☆160 Updated 4 months ago
mukhal / ThinkPRM
Process Reward Models That Think
☆47 Updated last month
StarDewXXX / O1-Pruner
Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
☆86 Updated 5 months ago
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆58 Updated 2 weeks ago
WindyLee0822 / Process_Q_Model
official implementation of paper "Process Reward Model with Q-value Rankings"
☆60 Updated 5 months ago
QwenLM / ProcessBench
Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"
☆166 Updated 2 months ago