StarDewXXX / AdaR1

The official repository of the NeurIPS'25 paper "Ada-R1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization"

☆19

Alternatives and similar repositories for AdaR1

Users who are interested in AdaR1 are comparing it to the repositories listed below

StarDewXXX / O1-Pruner
Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
☆93 Updated 8 months ago
Raibows / CREAM
Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.
☆26 Updated 8 months ago
MingyuJ666 / Rope_with_LLM
[ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen…
☆79 Updated 4 months ago
UCSB-NLP-Chang / ThinkPrune
☆44 Updated 3 weeks ago
HKUNLP / critic-rl
[ICML 2025] Teaching Language Models to Critique via Reinforcement Learning
☆114 Updated 5 months ago
zjunlp / LightThinker
[EMNLP 2025] LightThinker: Thinking Step-by-Step Compression
☆112 Updated 6 months ago
bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆85 Updated 2 weeks ago
THUDM / TreeRL
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search (ACL'25)
☆71 Updated 4 months ago
GeniusHTX / TALE
☆133 Updated last month
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆69 Updated 3 months ago
horseee / CoT-Valve
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
☆86 Updated 8 months ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆177 Updated 3 months ago
YangLing0818 / SuperCorrect-llm
[ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
☆83 Updated 7 months ago
KbsdJames / omni-math-rule
The rule-based evaluation subset and code implementation of Omni-MATH
☆23 Updated 10 months ago
rhyang2021 / ARIA
Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".
☆23 Updated 2 months ago
bigai-nlco / LatentSeek
Official Repository of LatentSeek
☆66 Updated 4 months ago
hkust-nlp / GUIMid
☆21 Updated 5 months ago
TingchenFu / MathIF
Instruction-following benchmark for large reasoning models
☆45 Updated 2 months ago
test-time-interaction / TTI
☆63 Updated 4 months ago
hkust-nlp / Laser
Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping
☆55 Updated 5 months ago
sail-sg / ActivePRM
☆19 Updated 6 months ago
GATECH-EIC / ACT
[ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…
☆44 Updated last year
LightChen233 / reasoning-boundary
☆68 Updated 4 months ago
TEAM-ARM / arm
[NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model
☆56 Updated 2 weeks ago
NineAbyss / S2R
This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"
☆69 Updated 6 months ago
sail-sg / AnytimeReasoner
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
☆47 Updated 3 months ago
ssmisya / PRMBench
[ACL'25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models.
☆81 Updated 8 months ago
zzzhr97 / SpecBench
☆22 Updated 3 weeks ago
yyDing1 / ScaleQuest
[ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.
☆68 Updated 11 months ago
YujunZhou / EVOL-RL
Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).
☆39 Updated last week