sunblaze-ucb / IntuitorLinks

Code for the paper: "Learning to Reason without External Rewards"

☆380

Alternatives and similar repositories for Intuitor

Users that are interested in Intuitor are comparing it to the libraries listed below

Sorting:

facebookresearch / sweet_rl
Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents onCollaborative Reasoning Tasks
☆253Updated 6 months ago
vsubramaniam851 / multiagent-ft
☆226Updated 9 months ago
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆381Updated last week
knoveleng / open-rs
Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"
☆267Updated last month
eric-ai-lab / Soft-Thinking
Official implementation of the NeurIPS 2025 paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space"
☆278Updated 2 weeks ago
ruixin31 / Spurious_Rewards
☆344Updated 4 months ago
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆257Updated 6 months ago
aakaran / reasoning-with-sampling
☆335Updated 3 weeks ago
TIGER-AI-Lab / CritiqueFineTuning
Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025]
☆179Updated 4 months ago
eddycmu / demystify-long-cot
☆327Updated 6 months ago
kanishkg / cognitive-behaviors
☆216Updated 8 months ago
WooooDyy / AgentGym-RL
Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen…
☆502Updated 2 months ago
jwhj / OREO
☆117Updated 10 months ago
NVlabs / Tool-N1
☆210Updated 6 months ago
CMU-AIRe / MRT
Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".
☆116Updated 3 months ago
spiral-rl / spiral
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
☆161Updated 2 months ago
TsinghuaC3I / MARTI
A Framework for LLM-based Multi-Agent Reinforced Training and Inference
☆357Updated 2 weeks ago
zwhe99 / DeepMath
A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
☆277Updated 2 months ago
ByteDance-Seed / Agent-R
Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"
☆161Updated last month
shangshang-wang / Tina
Tina: Tiny Reasoning Models via LoRA
☆309Updated 2 months ago
MiniMax-AI / SynLogic
[NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
☆187Updated 4 months ago
GAIR-NLP / LIMR
☆213Updated 9 months ago
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆204Updated this week
multimodal-art-projection / LatentCoT-Horizon
📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.
☆290Updated 3 weeks ago
THUDM / T1
RL Scaling and Test-Time Scaling (ICML'25)
☆112Updated 10 months ago
LeapLabTHU / limit-of-RLVR
repo for paper https://arxiv.org/abs/2504.13837
☆271Updated 5 months ago
facebookresearch / meta-agents-research-environments
Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike stat…
☆377Updated 2 weeks ago
Gen-Verse / ReasonFlux
[NeurIPS 2025 Spotlight] ReasonFlux (long-CoT), ReasonFlux-PRM (process reward model) and ReasonFlux-Coder (code generation)
☆503Updated 2 months ago
WindyLee0822 / Process_Q_Model
official implementation of paper "Process Reward Model with Q-value Rankings"
☆65Updated 9 months ago
Zhiyuan-Zeng / RLVE
[Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
☆148Updated 2 weeks ago