applese233/ICRL

In-Context Reinforcement Learning for Tool Use in Large Language Models

☆48

Alternatives and similar repositories for ICRL

Users who are interested in ICRL are comparing it to the libraries listed below.

pUmpKin-Co / ComplementaryRL
Co-evolving policy actors and experience extractors for efficient experience-driven agent RL
☆51 · May 12, 2026 · Updated 2 months ago
limenlp / ExeVRM
Official implementation for the paper "Video-Based Reward Modeling for Computer-Use Agents"
☆16 · Mar 14, 2026 · Updated 4 months ago
HarmanDotpy / pairwise-self-verification
[ICML 2026] Code for V1: Unifying Generation and Self-Verification for Parallel Reasoners.
☆39 · Mar 5, 2026 · Updated 4 months ago
MasterVito / DAC-RL
Official Repo for DAC-RL: Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability
☆16 · Feb 26, 2026 · Updated 4 months ago
ventr1c / memma
The official repository of "MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution".
☆19 · Mar 20, 2026 · Updated 4 months ago
JanTempus / tokenisation_lp
☆15 · May 20, 2026 · Updated 2 months ago
KangsanKim07 / MemoryTransferLearning
Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents
☆31 · Apr 16, 2026 · Updated 3 months ago
GregxmHu / OccuBench
OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models
☆21 · Apr 14, 2026 · Updated 3 months ago
ShareLab-SII / CaTok
[CVPR-26] Official repository of "CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization"
☆19 · Mar 9, 2026 · Updated 4 months ago
GreatX3 / ProAct
ProAct is a framework designed to enable Large Language Model (LLM) agents to perform accurate, multi-turn lookahead reasoning in interac…
☆18 · Feb 11, 2026 · Updated 5 months ago
1229095296 / ResRL
This repository includes code for our paper: ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning…
☆15 · May 2, 2026 · Updated 2 months ago
hainuo-wang / WiT
Official project page and code repository for WiT, a pixel space diffusion
☆17 · May 31, 2026 · Updated last month
LuckyyySTA / GOLF
☆18 · Mar 16, 2026 · Updated 4 months ago
sii-research / GAE
Official code of Geometric Autoencoder for Diffusion Models.
☆20 · Mar 12, 2026 · Updated 4 months ago
ZJU-REAL / InftyThink-Plus
[ICML 2026] InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning
☆33 · May 25, 2026 · Updated last month
xzxxntxdy / PEPO
Official repo for "Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought"
☆26 · Mar 29, 2026 · Updated 3 months ago
SKYLENAGE-AI / DeepVision-103K
Codebase for DeepVision-103K
☆22 · Feb 21, 2026 · Updated 4 months ago
ZhilinGuo / matryoshka-gaussian-splatting
Official Implementation of MGS: Matryoshka Gaussian Splatting
☆36 · Jun 11, 2026 · Updated last month
princeton-pli / DySCO
DySCO: Dynamic Attention-Scaling Decoding for Long-Context LMs
☆17 · May 30, 2026 · Updated last month
McGill-NLP / llm2vec-gen
Code for `LLM2VEC-GEN: Generative Embeddings from Large Language Models`
☆73 · Apr 5, 2026 · Updated 3 months ago
interactivebench / InteractiveBench
Official Project Page for Interactive Benchmarks
☆31 · May 12, 2026 · Updated 2 months ago
liushulinle / MarsRL
MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism
☆18 · Nov 18, 2025 · Updated 8 months ago
lilakk / how2everything
Official code for "How2Everything: Mining the Web for How-To Procedures to Evaluate and Improve LLMs"
☆24 · Feb 10, 2026 · Updated 5 months ago
Yui010206 / Adaptive-Visual-Imagination-Control
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning
☆18 · Jun 2, 2026 · Updated last month
phymhan / S2D2
☆16 · Jun 17, 2026 · Updated last month
GMLR-Penn / Multiplex-Thinking
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge
☆131 · May 24, 2026 · Updated last month
Trae1ounG / Pretrain_Space_RLVR
[arxiv: 2604.14142] From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space
☆17 · Apr 16, 2026 · Updated 3 months ago
ewang26 / HorizonMath
A benchmark to measure AI progress on unsolved research problems in mathematics.
☆28 · May 6, 2026 · Updated 2 months ago
LINs-lab / IOMM
[CVPR 2026] IOMM: Fast Pre-training of Unified Multimodal Models without Text-Image Pairs
☆26 · Apr 11, 2026 · Updated 3 months ago
openverse-ai / MEMO
MEMO: Memory-Augmented Model Context Optimization for Robust Multi-Turn Multi-Agent LLM Games
☆28 · May 10, 2026 · Updated 2 months ago
jmhb0 / PaperSearchQA
[EACL 2026] PaperSearchQA. Data generation pipeline for QA over scientific papers, suitable for RL training search agents
☆33 · Feb 4, 2026 · Updated 5 months ago
Sta8is / Re2Pix
[ECCV 2026] Official Implementation of Representations Before Pixels: Semantics-Guided Hierarchical Video Prediction
☆18 · Apr 26, 2026 · Updated 2 months ago
roychowdhuryresearch / gsw-memory
Code corresponding to Generative Semantic Workspaces - Long term Structured Memory for Large Language Models - AAAI 26 (Oral), ICML 26
☆22 · Jun 2, 2026 · Updated last month
MYMY-young / DelimScaling
[ICLR 2026] Official implementation of "Enhancing Multi-Image Understanding Through Delimiter Token Scaling"
☆15 · Jul 10, 2026 · Updated last week
yijunshens / StateFactory
Official implementation of "Reward Prediction with Factorized World States"
☆20 · Mar 11, 2026 · Updated 4 months ago
OpenRewardAI / openreward-cookbook
Training and evaluating with OpenReward
☆33 · Apr 28, 2026 · Updated 2 months ago
Utaotao / ProFit
☆35 · Jan 20, 2026 · Updated 6 months ago
ExplainableML / finer
[CVPR 2026 Oral] FINER: MLLMs Hallucinate under Fine-grained Negative Queries
☆17 · Jul 6, 2026 · Updated 2 weeks ago
yurakuratov / gradmem
GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent [ICML 2026]
☆38 · Jul 8, 2026 · Updated last week