SakanaAI / RLT

Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling.

☆346

Alternatives and similar repositories for RLT

Users who are interested in RLT are comparing it to the libraries listed below.

WooooDyy / AgentGym-RL
Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen…
☆453 Updated last month
vsubramaniam851 / multiagent-ft
☆218 Updated 8 months ago
shangshang-wang / Tina
Tina: Tiny Reasoning Models via LoRA
☆299 Updated last month
sunblaze-ucb / Intuitor
Code for the paper: "Learning to Reason without External Rewards"
☆366 Updated 3 months ago
menloresearch / visual-thinker
☆177 Updated 2 months ago
eqimp / hogwild_llm
Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache
☆127 Updated 2 months ago
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆172 Updated 9 months ago
knoveleng / open-rs
Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"
☆266 Updated last week
zhengkid / Parallel-R1
The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"
☆225 Updated last week
StigLidu / DualDistill
[EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning"
☆101 Updated last month
facebookresearch / cwm
Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.
☆682 Updated last month
Chengsong-Huang / R-Zero
Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)
☆650 Updated 3 weeks ago
microsoft / ArchScale
Simple & Scalable Pretraining for Neural Architecture Research
☆297 Updated 2 months ago
OPPO-PersonalAI / OAgents
Implementation for OAgents: An Empirical Study of Building Effective Agents
☆277 Updated last week
facebookresearch / meta-agents-research-environments
Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike stat…
☆321 Updated last week
SakanaAI / natural_niches
The code repository of the paper: Competition and Attraction Improve Model Fusion
☆161 Updated 2 months ago
THU-KEG / Agentic-Reward-Modeling
[ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
☆108 Updated 4 months ago
facebookresearch / sweet_rl
Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks"
☆246 Updated 5 months ago
s-sahoo / Eso-LMs
Esoteric Language Models
☆101 Updated 2 weeks ago
google-deepmind / latent-multi-hop-reasoning
[ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?
☆79 Updated 7 months ago
letta-ai / sleep-time-compute
Accompanying material for the sleep-time compute paper
☆117 Updated 5 months ago
ZihanWang314 / CoE
Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models
☆220 Updated last month
Nardien / agent-distillation
Official Code Repository for the paper "Distilling LLM Agent into Small Models with Retrieval and Code Tools"
☆162 Updated this week
ByteDance-Seed / Agent-R
Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"
☆161 Updated this week
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆365 Updated last week
brendanhogan / DeepSeekRL-Extended
Exploring Applications of GRPO
☆248 Updated 2 months ago
NVlabs / RLP
RLP: Reinforcement as a Pretraining Objective
☆192 Updated 2 weeks ago
TsinghuaC3I / SSRL
SSRL: Self-Search Reinforcement Learning
☆147 Updated 2 months ago
facebookresearch / memory
Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…
☆342 Updated 10 months ago
huggingface / gpt-oss-recipes
Collection of scripts and notebooks for OpenAI's latest GPT OSS models
☆463 Updated 2 months ago