Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework
★289 · Jan 17, 2026 · Updated 4 months ago
Alternatives and similar repositories for AgentRL
Users that are interested in AgentRL are comparing it to the libraries listed below.
- ★67 · May 7, 2026 · Updated 2 weeks ago
- TrustJudge is accepted to ICLR 2026! ★46 · Sep 27, 2025 · Updated 7 months ago
- Spectral Sphere Optimizer ★116 · Mar 23, 2026 · Updated 2 months ago
- Search Self-Play: Pushing the Frontier of Agent Capability without Supervision ★100 · Mar 4, 2026 · Updated 2 months ago
- Official code for "SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization" ★296 · Updated this week
- PyTorch implementation for the paper "The Case for Cleaner Biosignals: High-fidelity Neural Compressor Enables Transfer from Cleaner iEEG… ★20 · Sep 18, 2025 · Updated 8 months ago
- [NeurIPS 24] Can LLMs Solve Molecule Puzzles? A Multimodal Benchmark for Molecular Structure Elucidation ★19 · Jan 2, 2026 · Updated 4 months ago
- NaturalCodeBench (Findings of ACL 2024) ★70 · Oct 14, 2024 · Updated last year
- TOKEN-IMPORTANCE GUIDED DIRECT PREFERENCE OPTIMIZATION ★36 · Jan 26, 2026 · Updated 3 months ago
- Introduction and scripts for ACL-2020 paper "On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation" ★21 · Jun 23, 2020 · Updated 5 years ago
- ASID-Caption: Attribute-Structured and Quality-Verified Audiovisual Instruction Dataset and Training Pipeline for Fine-Grained Video Unde… ★64 · Mar 3, 2026 · Updated 2 months ago
- ★19 · Mar 10, 2025 · Updated last year
- Codes for Difflare: Removing Image Flare with Latent Diffusion Models ★11 · Dec 24, 2024 · Updated last year
- slime is an LLM post-training framework for RL Scaling. ★5,710 · May 18, 2026 · Updated last week
- ★30 · Oct 8, 2025 · Updated 7 months ago
- CBU5201 Deception Dataset ★20 · Dec 10, 2024 · Updated last year
- [NeurIPS 2025] The implementation of paper "On Reasoning Strength Planning in Large Reasoning Models" ★32 · Jul 6, 2025 · Updated 10 months ago
- ★152 · Apr 8, 2026 · Updated last month
- (ICLR 2025) AgentRefine: Enhancing Agent Generalization through Refinement Tuning ★19 · Nov 22, 2025 · Updated 6 months ago
- PeRL: Parameter-Efficient Reinforcement Learning ★80 · Updated this week
- A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Langu… ★88 · Dec 12, 2025 · Updated 5 months ago
- verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in… ★1,909 · Feb 27, 2026 · Updated 2 months ago
- ★20 · Nov 4, 2023 · Updated 2 years ago
- verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework ★21,514 · Updated this week
- [ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ★387 · Mar 30, 2026 · Updated last month
- Internal utility libraries for Pkl ★16 · May 14, 2026 · Updated last week
- [EMNLP 2025] Official codebase for Rearank: Reasoning Re-ranking Agent ★36 · Aug 20, 2025 · Updated 9 months ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward ★95 · Aug 8, 2025 · Updated 9 months ago
- [SIGIR 2025] Benchmarking Recommendation, Classification, and Tracing Based on Hugging Face Knowledge Graph ★16 · Jun 6, 2025 · Updated 11 months ago
- [ICLR 2024] Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain ★10 · Nov 24, 2025 · Updated 6 months ago
- ★15 · Feb 22, 2025 · Updated last year
- The web front end for Overleaf, a web-based collaborative LaTeX editor. Regular pull updates from upstream. DO NOT Fork, just cherry-pick… ★16 · Mar 9, 2026 · Updated 2 months ago
- This repo is for source code of NeurIPS 2021 paper "Be Confident! Towards Trustworthy Graph Neural Networks via Confidence Calibration". ★22 · Jan 4, 2022 · Updated 4 years ago
- The official implementation for IEEE-ICASSP 2024 paper "Flare-Free Vision: Empowering Uformer with Depth Insights" ★17 · Aug 27, 2024 · Updated last year
- ★40 · Mar 26, 2026 · Updated last month
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen… ★749 · Feb 15, 2026 · Updated 3 months ago
- PyTorch implementation of FAIR's paper "End-to-End Memory Network", NIPS 2015 ★12 · Oct 19, 2017 · Updated 8 years ago
- ★44 · Mar 31, 2026 · Updated last month
- (best/better) practices of megatron on veRL and tuning guide ★134 · May 12, 2026 · Updated last week