PRIME-RL/RL-Compositionality

FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones

☆68

Alternatives and similar repositories for RL-Compositionality

Users who are interested in RL-Compositionality are comparing it to the repositories listed below.

hkust-nlp / model-task-align-rl
[ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".
☆18 · Feb 9, 2026 · Updated 5 months ago
Zhiyuan-Zeng / RLVE
[ICML 2026] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
☆223 · Apr 30, 2026 · Updated 2 months ago
sunblaze-ucb / omega
☆47 · Jun 24, 2025 · Updated last year
ruixin31 / Spurious_Rewards
☆361 · Jul 29, 2025 · Updated 11 months ago
Interplay-LM-Reasoning / Interplay-LM-Reasoning
[ICML 2026 Spotlight] On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
☆162 · Jun 8, 2026 · Updated last month
edenbiran / HoppingTooLate
Exploring the Limitations of Large Language Models on Multi-Hop Queries
☆33 · Mar 2, 2025 · Updated last year
PRIME-RL / Entropy-Mechanism-of-RL
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
☆443 · Jul 11, 2025 · Updated last year
TsinghuaC3I / Unify-Post-Training
Towards a Unified View of Large Language Model Post-Training
☆211 · Sep 8, 2025 · Updated 10 months ago
EleutherAI / attribute
☆16 · Nov 14, 2025 · Updated 8 months ago
SciYu / HiPhO
The first high school physics Olympiad benchmark for evaluating (M)LLMs with step-level grading and human-level comparison.
☆26 · Dec 19, 2025 · Updated 7 months ago
anadim / smallest-addition-transformer-claude-code
6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes.
☆22 · Feb 19, 2026 · Updated 5 months ago
MARIO-Math-Reasoning / MARIO
☆28 · May 8, 2024 · Updated 2 years ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189 · Jul 23, 2025 · Updated 11 months ago
fjzzq2002 / WeightWatch
Official Repository of Paper "Watch the Weights: Unsupervised monitoring and control of fine-tuned LLMs"
☆15 · Sep 25, 2025 · Updated 9 months ago
zjunlp / ReCode
[AAAI 2026] ReCode: Reinforced Code Knowledge Editing for API Updates
☆25 · Jul 1, 2025 · Updated last year
science-of-finetuning / sparsity-artifacts-crosscoders
Code for the "Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning" paper.
☆17 · Jul 6, 2026 · Updated 2 weeks ago
ShivamDuggal4 / karl
Single-pass Adaptive Image Tokenization for Minimum Program Search | What's the Kolmogorov Complexity of an Image?
☆43 · Jul 26, 2025 · Updated 11 months ago
MoonshotAI / CombiBench
☆52 · Jun 15, 2026 · Updated last month
facebookresearch / PhysicsLM4
Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality
☆356 · May 20, 2026 · Updated 2 months ago
PRIME-RL / P1
P1: Mastering Physics Olympiads with Reinforcement Learning
☆89 · Dec 29, 2025 · Updated 6 months ago
PRIME-RL / PRIME
Scalable RL solution for advanced reasoning of language models
☆1,865 · Mar 18, 2025 · Updated last year
fiveai / understanding_safety_finetuning
Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024)
☆12 · Oct 31, 2024 · Updated last year
Simplified-Reasoning / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆459 · Mar 20, 2026 · Updated 4 months ago
ars22 / e3
☆20 · Sep 16, 2025 · Updated 10 months ago
HazyResearch / aioli
Aioli: A unified optimization framework for language model data mixing
☆33 · Jan 17, 2025 · Updated last year
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆227 · Nov 27, 2025 · Updated 7 months ago
Zanette-Labs / speed-rl
☆18 · Feb 2, 2026 · Updated 5 months ago
zjunlp / predict-before-execute
Can We Predict Before Executing Machine Learning Agents?
☆19 · Jul 7, 2026 · Updated 2 weeks ago
brendanhogan / completion_tree_view
☆15 · Apr 26, 2025 · Updated last year
RUCAIBox / Passk_Training
The official repository of the paper "Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models"
☆113 · Aug 15, 2025 · Updated 11 months ago
yangzhch6 / DARS
The official implementation of "Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration" (ICML 2026)
☆24 · Feb 4, 2026 · Updated 5 months ago
sileod / reasoning-core
Procedural data generators for verifiable reasoning, synthetic pretraining, post-training, evaluation, and RL.
☆43 · Updated this week
roozbeh-mohit / IMO-Steps
☆31 · Jul 16, 2025 · Updated last year
TsinghuaC3I / SSRL
SSRL: Self-Search Reinforcement Learning
☆210 · Aug 20, 2025 · Updated 11 months ago
GavinZhengOI / LiveCodeBench-Pro
☆176 · Dec 13, 2025 · Updated 7 months ago
PRIME-RL / TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
☆1,100 · Apr 15, 2026 · Updated 3 months ago
wanghanbinpanda / Large-Language-Models-for-Code
Large Language Models (LLMs) of Code
☆20 · Apr 23, 2023 · Updated 3 years ago
Xuekai-Zhu / FlowRL
☆180 · Nov 24, 2025 · Updated 7 months ago
NEUIR / COAST
Official repository for the paper "COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis".
☆18 · Feb 19, 2025 · Updated last year