zjunlp / WorFBench

[ICLR 2025] Benchmarking Agentic Workflow Generation

☆143

Alternatives and similar repositories for WorFBench

Users who are interested in WorFBench are comparing it to the repositories listed below.

open-compass / GTA
[NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents
☆133 Updated 10 months ago
MIT-MI / MEM1
☆239 Updated last month
KANABOON1 / MemGen
MemGen: Weaving Generative Latent Memory for Self-Evolving Agents
☆298 Updated this week
RUC-NLPIR / Tool-Star
🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning
☆314 Updated last month
ADaM-BJTU / OpenRFT
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
☆155 Updated last year
ADaM-BJTU / AutoCoA
AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso…
☆130 Updated 10 months ago
RyanLiu112 / GenPRM
[AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".
☆95 Updated 3 months ago
bingreeky / GMemory
☆193 Updated 3 months ago
Yifan-Song793 / ETO
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
☆159 Updated last year
GAIR-NLP / ToRL
☆333 Updated 8 months ago
SALT-NLP / DyLAN
Official Implementation of Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization
☆192 Updated last year
bytarnish / AGILE
☆163 Updated last year
Open-Source-O1 / o1_Reasoning_Patterns_Study
☆104 Updated last year
jwliao-ai / MARFT
☆76 Updated 3 months ago
LightChen233 / reasoning-boundary
☆70 Updated 7 months ago
qiancheng0 / ToolRL
☆432 Updated 3 months ago
facebookresearch / sweet_rl
Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks
☆261 Updated 9 months ago
ReTool-RL / ReTool
☆275 Updated 5 months ago
JIA-Lab-research / ARPO
Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
☆148 Updated 8 months ago
nuster1128 / MemEngine
A Comprehensive Library for Memory of LLM-based Agents.
☆100 Updated 8 months ago
sail-sg / CPO
[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.
☆134 Updated 10 months ago
ryantzr1 / OpenAlita
Open Source Implementation of Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evo…
☆98 Updated 6 months ago
OPPO-PersonalAI / OAgents
Implementation for OAgents: An Empirical Study of Building Effective Agents
☆306 Updated 3 months ago
RUCAIBox / SimpleDeepSearcher
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis
☆118 Updated 8 months ago
Gen-Verse / CURE
[NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning
☆149 Updated 4 months ago
LeapLabTHU / ExpeL
☆200 Updated last year
zjunlp / WKM
[NeurIPS 2024] Agent Planning with World Knowledge Model
☆162 Updated last year
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆261 Updated 8 months ago
Tim-Siu / reft-exp
A research repo for experiments about Reinforcement Finetuning
☆54 Updated 10 months ago
ByteDance-Seed / Agent-R
Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"
☆169 Updated 3 months ago