yongchao98/R1-Code-Interpreter

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/yongchao98/R1-Code-Interpreter)

yongchao98 / R1-Code-Interpreter

R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning

☆44

Alternatives and similar repositories for R1-Code-Interpreter

Users that are interested in R1-Code-Interpreter are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

weiyifan1023 / AutoTIR
View on GitHub
Code and Data for Paper "AutoTIR: Autonomous Tools Integrated Reasoning via Reinforcement Learning"
☆54Sep 4, 2025Updated 10 months ago
ganler / code-r1
View on GitHub
Reproducing R1 for Code with Reliable Rewards
☆313May 5, 2025Updated last year
MasterVito / SwS
View on GitHub
Official Repo for SwS: A Weakness-driven Problem Synthesis Framework in RL for LLM Reasoning
☆42Nov 11, 2025Updated 8 months ago
BaohaoLiao / SAGE
View on GitHub
Self-Hinting Language Models Enhance Reinforcement Learning
☆26Mar 28, 2026Updated 3 months ago
psunlpgroup / FoVer
View on GitHub
This repository includes code and materials for the paper "Efficient PRM Training Data Synthesis via Formal Verification" (ACL 2026 Findi…
☆18Apr 7, 2026Updated 3 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
LiaoMengqi / E3-RL4LLMs
View on GitHub
[ EMNLP 2025 Main ] Enhancing Efficiency and Exploration in Reinforcement Learning for LLMs
☆17Nov 7, 2025Updated 8 months ago
ltzheng / SimpleTIR
View on GitHub
[ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆401Mar 30, 2026Updated 3 months ago
hkust-nlp / model-task-align-rl
View on GitHub
[ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".
☆18Feb 9, 2026Updated 5 months ago
HKUNLP / critic-rl
View on GitHub
[ICML 2025] Teaching Language Models to Critique via Reinforcement Learning
☆127May 6, 2025Updated last year
RUCAIBox / CIR
View on GitHub
☆16Nov 11, 2025Updated 8 months ago
Scientific-Computing-Lab / MPI-rigen
View on GitHub
MPI Code Generation through Domain-Specific Language Models
☆16Nov 19, 2024Updated last year
RUCKBReasoning / CodeRM
View on GitHub
Official code implementation for the ACL 2025 paper: 'Dynamic Scaling of Unit Tests for Code Reward Modeling'
☆27May 16, 2025Updated last year
QingFei1 / R-Search
View on GitHub
[ACL 2026] R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning
☆35Jan 4, 2026Updated 6 months ago
OoDBag / VisTA
View on GitHub
VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection
☆27May 31, 2025Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
Farseer-Scaling-Law / Farseer
View on GitHub
☆21Jun 12, 2025Updated last year
qhjqhj00 / MetaAgent
View on GitHub
MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning
☆47Sep 3, 2025Updated 10 months ago
jzhou316 / Post-DeepSeek-R1_LLM-RL
View on GitHub
Learning and research after DeepSeek-R1, around test-time computing, resurgence of RL, and new LLM learning/application paradigms.
☆23Apr 23, 2026Updated 3 months ago
ExplainableML / CLEVR-X
View on GitHub
CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations
☆30Oct 27, 2023Updated 2 years ago
IBM / ColPret
View on GitHub
Efficient Scaling laws and collaborative pretraining.
☆23Updated this week
HowieHwong / ProbeLLM
View on GitHub
ProbeLLM: Automating Principled Diagnosis of LLM Failures
☆17Feb 11, 2026Updated 5 months ago
NVlabs / Tool-N1
View on GitHub
☆230Jun 2, 2025Updated last year
junkangwu / Dr_DPO
View on GitHub
[ICLR 2025] Official code of "Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization"
☆19Jun 1, 2024Updated 2 years ago
Y-Agent / STRIDE
View on GitHub
A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision Making
☆21Sep 29, 2024Updated last year
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
howard-yen / SLIM
View on GitHub
☆27Jun 22, 2026Updated last month
X-LANCE / text2sql-multiturn-GPT
View on GitHub
[NAACL 2024] CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions
☆13May 7, 2024Updated 2 years ago
KnowledgeXLab / O2-Searcher
View on GitHub
[TMLR 2026] A Searching-based Agent Model for Open-Domain Open-Ended Question Answering
☆39Jun 20, 2025Updated last year
cambridgeltl / multi3woz
View on GitHub
The official repository for Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapte…
☆17Jan 15, 2024Updated 2 years ago
Batorskq / PRL-Prompts-from-Reinforcement-Learning
View on GitHub
☆15Jun 5, 2025Updated last year
uservan / ThinkPO
View on GitHub
☆17Aug 1, 2025Updated 11 months ago
cxcscmu / General-AgentBench
View on GitHub
Benchmark Test-Time Scaling of General LLM Agents
☆20Apr 14, 2026Updated 3 months ago
YujunZhou / EVOL-RL
View on GitHub
Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).
☆51Mar 31, 2026Updated 3 months ago
wutaiqiang / awesome-GNN2MLP-distillation
View on GitHub
Learning MLPs to replace GNN
☆10Jun 3, 2023Updated 3 years ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
katiekang1998 / reasoning_generalization
View on GitHub
☆33Jan 7, 2025Updated last year
ReTool-RL / ReTool
View on GitHub
☆384Aug 12, 2025Updated 11 months ago
uclaml / COPS
View on GitHub
The official implementation of Cross-Task Experience Sharing (COPS)
☆29Oct 23, 2024Updated last year
wlzhang2020 / LLMTreeRec
View on GitHub
The implement of LLMTreeRec
☆14Dec 9, 2024Updated last year
SihengLi99 / RePO
View on GitHub
RePO: Replay-Enhanced Policy Optimization
☆24Jun 12, 2025Updated last year
Zillwang / StepSearch
View on GitHub
EMNLP MAIN 2025 StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization
☆75Sep 13, 2025Updated 10 months ago
aeroplanepaper / GRPO-LEAD
View on GitHub
☆40Nov 18, 2025Updated 8 months ago