OpenHands / agent-sdk

A clean, modular SDK for building AI agents with OpenHands V1.

☆67

Alternatives and similar repositories for agent-sdk

Users who are interested in agent-sdk are comparing it to the libraries listed below.

InternLM / SWE-Fixer
☆121 Updated 5 months ago
xlang-ai / computer-agent-arena
Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon!
☆50 Updated 6 months ago
xlang-ai / Spider2-V
[NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
☆132 Updated last year
colonylabs / ScribeAgent
Code for ScribeAgent paper
☆62 Updated 7 months ago
All-Hands-AI / openhands-aci
Agent computer interface for AI software engineer.
☆110 Updated last month
aymeric-roucher / GAIA
Beating the GAIA benchmark with Transformers Agents. 🚀
☆138 Updated 8 months ago
aorwall / moatless-tree-search
☆120 Updated 4 months ago
SWE-bench / SWE-smith
[NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents
☆432 Updated this week
wang-research-lab / agentinstruct
Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"
☆116 Updated this week
convergence-ai / webgames
Challenges for general-purpose web-browsing AI agents
☆64 Updated 4 months ago
zai-org / ComplexFuncBench
Complex Function Calling Benchmark.
☆139 Updated 9 months ago
THUDM / DeepDive
DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL
☆185 Updated 3 weeks ago
microsoft / lost_in_conversation
Code that accompanies the public release of the paper Lost in Conversation (https://arxiv.org/abs/2505.06120)
☆175 Updated 4 months ago
Nardien / agent-distillation
Official Code Repository for the paper "Distilling LLM Agent into Small Models with Retrieval and Code Tools"
☆162 Updated this week
Ag2S1 / Sibyl-System
☆125 Updated last year
THU-KEG / Agentic-Reward-Modeling
[ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
☆108 Updated 4 months ago
GAIR-NLP / LIMI
LIMI: Less is More for Agency
☆141 Updated last week
SALT-NLP / collaborative-gym
Framework and toolkits for building and evaluating collaborative agents that can work together with humans.
☆103 Updated this week
allenai / IFBench
☆83 Updated this week
AlexCuadron / ThinkingAgent
Systematic evaluation framework that automatically rates overthinking behavior in large language models.
☆93 Updated 5 months ago
oriyor / assistantbench
Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"
☆63 Updated 10 months ago
yueqis / API-Based-Agent
☆58 Updated 3 months ago
zou-group / sirius
SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning
☆70 Updated 3 months ago
SalesforceAIResearch / MAS-Zero
Designing Multi-Agent Systems with Zero Supervision
☆99 Updated 3 months ago
OSU-NLP-Group / Middleware
Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024)
☆37 Updated 9 months ago
THUDM / SWE-Dev
[ACL25' Findings] SWE-Dev is an SWE agent with a scalable test case construction pipeline.
☆55 Updated 3 months ago
SWE-bench / sb-cli
Run SWE-bench evaluations remotely
☆41 Updated 2 months ago
StigLidu / DualDistill
[EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning"
☆101 Updated last month
aymeric-roucher / agent_reasoning_benchmark
🔧 Compare how Agent systems perform on several benchmarks. 📊🚀
☆102 Updated 2 months ago
MinorJerry / OpenWebVoyager
☆84 Updated 11 months ago