zjunlp/WorfBench

zjunlp / WorfBench

[ICLR 2025] Benchmarking Agentic Workflow Generation

☆155

Alternatives and similar repositories for WorfBench

Users that are interested in WorfBench are comparing it to the libraries listed below.

FoundationAgents / AFlow
🔥🔥🔥 ICLR 2025 Oral. Automating Agentic Workflow Generation.
☆554 · Dec 25, 2025 · Updated 7 months ago
OpenBMB / WorkflowLLM
An open platform for enhancing the capability of LLMs in workflow orchestration.
☆195 · Mar 11, 2025 · Updated last year
zjunlp / WKM
[NeurIPS 2024] Agent Planning with World Knowledge Model
☆167 · Dec 17, 2024 · Updated last year
zjunlp / WorldMind
Aligning Agentic World Models via Knowledgeable Experience Learning
☆37 · May 15, 2026 · Updated 2 months ago
Yifan-Song793 / ETO
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
☆168 · Oct 30, 2024 · Updated last year
zjunlp / LookAheadTuning
[WSDM 2026] LookAhead Tuning: Safer Language Models via Partial Answer Previews
☆17 · Dec 14, 2025 · Updated 7 months ago
SalesforceAIResearch / swecomm
☆28 · Jun 2, 2026 · Updated last month
PKU-Baichuan-MLSystemLab / BUTTON
☆28 · Feb 18, 2025 · Updated last year
hkust-nlp / AgentBoard
An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral]
☆427 · May 20, 2024 · Updated 2 years ago
gautierdag / plancraft
Plancraft is a Minecraft environment and agent suite to test planning capabilities in LLMs
☆30 · Nov 7, 2025 · Updated 8 months ago
THUDM / AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
☆3,601 · Feb 8, 2026 · Updated 5 months ago
OSU-NLP-Group / TravelPlanner
[ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"
☆531 · May 24, 2026 · Updated 2 months ago
ulab-uiuc / ToMAP
Official code repository for the paper "ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind"
☆25 · Sep 25, 2025 · Updated 10 months ago
EcthelionLiu / TodoEvolve
[ICML'26] TodoEvolve
☆20 · May 18, 2026 · Updated 2 months ago
WooooDyy / AgentGym
Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhi…
☆817 · May 30, 2026 · Updated last month
zjunlp / CaKE
[EMNLP 2025] Circuit-Aware Editing Enables Generalizable Knowledge Learners
☆19 · Nov 17, 2025 · Updated 8 months ago
zjunlp / SciAtlas
A Large-Scale Knowledge Graph for Automated Scientific Research
☆136 · Jul 16, 2026 · Updated last week
WeiminXiong / MPO
MPO: Boosting LLM Agents with Meta Plan Optimization (EMNLP 2025 Findings)
☆81 · Aug 20, 2025 · Updated 11 months ago
zorazrw / agent-workflow-memory
AWM: Agent Workflow Memory
☆447 · Dec 22, 2025 · Updated 7 months ago
open-compass / GTA
[NeurIPS 2024 D&B] GTA: A Benchmark for General Tool Agents & [arXiv 2026] GTA-2
☆147 · Apr 20, 2026 · Updated 3 months ago
tongshoujie / MATCH-TUNING
MATCH-TUNING
☆15 · Aug 6, 2022 · Updated 3 years ago
zjunlp / LLMAgentPapers
Must-read Papers on LLM Agents.
☆3,089 · Jul 5, 2026 · Updated 2 weeks ago
alfworld / alfworld
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
☆810 · Feb 8, 2026 · Updated 5 months ago
quchangle1 / LLM-Tool-Survey
This is the repository for the Tool Learning survey.
☆485 · Aug 9, 2025 · Updated 11 months ago
wenzhe-li / Self-MoA
☆17 · Feb 4, 2025 · Updated last year
thu-coai / SPaR
☆47 · Jun 11, 2025 · Updated last year
microsoft / Simia-Agent-Training
Official Implementation of "Simulating Environments with Reasoning Models for Agent Training"
☆65 · Feb 18, 2026 · Updated 5 months ago
zjunlp / LabVLA
LabVLA: Grounding Vision–Language–Action Models in Scientific Laboratories
☆91 · Jul 4, 2026 · Updated 2 weeks ago
niuzaisheng / ScreenExplorer
ScreenExplorer: Training a Vision-Language Model for Diverse Exploration in Open GUI World
☆26 · Jun 17, 2025 · Updated last year
shulin16 / MMInA
[ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents
☆54 · Feb 27, 2025 · Updated last year
Wangpeiyi9979 / ACA
EMNLP2022: Learning Robust Representations for Continual Relation Extraction via Adversarial Class Augmentation
☆15 · Oct 19, 2022 · Updated 3 years ago
Agent-RL / ReCall
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason with Tool Call for LLMs via Rei…
☆1,412 · May 16, 2025 · Updated last year
hrwise-nlp / AppBench
This is for EMNLP 2024 Paper: AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction
☆16 · Nov 4, 2024 · Updated last year
bingreeky / MemGen
MemGen: Weaving Generative Latent Memory for Self-Evolving Agents
☆406 · Jun 10, 2026 · Updated last month
DEITSP / DEITSP
Code for SIGKDD2025 paper: An Efficient Diffusion-based Non-Autoregressive Solver for Traveling Salesman Problem
☆15 · Jan 28, 2025 · Updated last year
Sachin-A / TraceWeaver
TraceWeaver is a research prototype for transparently tracing requests through a microservice without application instrumentation.
☆23 · Sep 2, 2024 · Updated last year
zjunlp / AutoAct
[ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning
☆238 · Jan 13, 2025 · Updated last year
ai-nikolai / StateAct
[REALM25 @ ACL25] - "StateAct" Official Paper Repo (SOTA LLM Agent)
☆18 · Feb 27, 2026 · Updated 4 months ago
DongsuLeeTech / AD4RL
ICRA 2024
☆18 · Mar 13, 2024 · Updated 2 years ago