chenchen0103/ACEBench

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/chenchen0103/ACEBench)

chenchen0103 / ACEBench

☆188

Alternatives and similar repositories for ACEBench

Users that are interested in ACEBench are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

sierra-research / tau-bench
View on GitHub
Code and Data for Tau-Bench
☆1,344Mar 18, 2026Updated 4 months ago
sierra-research / tau2-bench
View on GitHub
τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
☆1,657Updated this week
THUNLP-MT / StableToolBench
View on GitHub
A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.
☆237Apr 15, 2025Updated last year
OPPO-PersonalAI / TaskCraft
View on GitHub
[ICLR 2026] A library for generating difficulty-scalable, multi-tool, and verifiable agentic tasks with execution trajectories.
☆195Jul 6, 2025Updated last year
NVlabs / Tool-N1
View on GitHub
☆230Jun 2, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
meituan-longcat / vitabench
View on GitHub
[ICLR 2026] VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications
☆159Feb 22, 2026Updated 5 months ago
qiancheng0 / ToolRL
View on GitHub
☆514Oct 16, 2025Updated 9 months ago
apple / ToolSandbox
View on GitHub
☆269Nov 7, 2025Updated 8 months ago
langfengQ / verl-agent
View on GitHub
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in…
☆2,151Jun 9, 2026Updated last month
DeepExperience / LoopTool
View on GitHub
☆68Dec 10, 2025Updated 7 months ago
hkust-nlp / Toolathlon
View on GitHub
[ICLR 2026] The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
☆438Updated this week
yuyq18 / StepTool
View on GitHub
☆36May 24, 2025Updated last year
zai-org / ComplexFuncBench
View on GitHub
Complex Function Calling Benchmark.
☆180Jan 20, 2025Updated last year
meituan / vitabench
View on GitHub
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications
☆23Oct 17, 2025Updated 9 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
swt-user / DMPO
View on GitHub
☆54Oct 10, 2024Updated last year
microsoft / Simia-Agent-Training
View on GitHub
Official Implementation of "Simulating Environments with Reasoning Models for Agent Training"
☆65Feb 18, 2026Updated 5 months ago
AgentR1 / Agent-R1
View on GitHub
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
☆1,570Updated this week
0russwest0 / Awesome-Agent-RL
View on GitHub
☆511Oct 11, 2025Updated 9 months ago
Agent-RL / ReCall
View on GitHub
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason with Tool Call for LLMs via Rei…
☆1,412May 16, 2025Updated last year
duyichao / NPDA-KNN-ST
View on GitHub
Official implementation of EMNLP'2022 paper "Non-Parametric Domain Adaptation for End-to-End Speech Translation"
☆11Oct 26, 2022Updated 3 years ago
RUC-NLPIR / ARPO
View on GitHub
[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)
☆1,092Jul 13, 2026Updated last week
TheAgentArk / Toucan
View on GitHub
Official repo of Toucan: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments
☆259Dec 16, 2025Updated 7 months ago
TIGER-AI-Lab / verl-tool
View on GitHub
A version of verl to support diverse tool use [TMLR 2026]
☆1,024Jul 15, 2026Updated last week
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
quchangle1 / LLM-Tool-Survey
View on GitHub
This is the repository for the Tool Learning survey.
☆485Aug 9, 2025Updated 11 months ago
mll-lab-nu / RAGEN
View on GitHub
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
☆2,756Apr 14, 2026Updated 3 months ago
RUC-NLPIR / Tool-Star
View on GitHub
🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning
☆403Apr 3, 2026Updated 3 months ago
alibaba / ROLL
View on GitHub
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
☆3,323Updated this week
PeterGriffinJin / Search-R1
View on GitHub
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
☆5,150Nov 13, 2025Updated 8 months ago
verl-project / verl
View on GitHub
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,649Updated this week
rllm-org / rllm
View on GitHub
Democratizing Reinforcement Learning for LLMs
☆5,727Updated this week
fairyshine / Seal-Tools
View on GitHub
The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmar…
☆57Nov 5, 2024Updated last year
ltzheng / SimpleTIR
View on GitHub
[ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆401Mar 30, 2026Updated 3 months ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
CLUEbenchmark / Math24o
View on GitHub
Math24o: 高中奥林匹克数学竞赛测评集 High School Olympiad Mathematics Chinese Benchmark
☆14Mar 27, 2025Updated last year
bespokelabsai / verifiers
View on GitHub
Verifiers for LLM Reinforcement Learning
☆81Jul 17, 2026Updated last week
bytedance / FTRL
View on GitHub
[ACL 2026] Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments
☆52Jul 10, 2026Updated 2 weeks ago
Tencent-Hunyuan / C3-Benchmark
View on GitHub
C^3-Bench: The Things Real Disturbing LLM based Agent in Multi-Tasking
☆38Mar 1, 2026Updated 4 months ago
OpenRLHF / OpenRLHF
View on GitHub
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy…
☆9,845Jul 14, 2026Updated last week
GAIR-NLP / DeepResearcher
View on GitHub
Scaling Deep Research via Reinforcement Learning in Real-world Environments.
☆783May 10, 2026Updated 2 months ago
Simple-Efficient / RL-Factory
View on GitHub
Train your Agent model via our easy and efficient framework
☆1,773Dec 5, 2025Updated 7 months ago