OpenHands / agent-sdk
A clean, modular SDK for building AI agents with OpenHands V1.
☆67 Updated this week
Alternatives and similar repositories for agent-sdk
Users who are interested in agent-sdk are comparing it to the libraries listed below.
- ☆121 Updated 5 months ago
- Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon! ☆50 Updated 6 months ago
- [NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? ☆132 Updated last year
- Code for ScribeAgent paper ☆62 Updated 7 months ago
- Agent computer interface for AI software engineer. ☆110 Updated last month
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆138 Updated 8 months ago
- ☆120 Updated 4 months ago
- [NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents ☆432 Updated this week
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆116 Updated this week
- Challenges for general-purpose web-browsing AI agents ☆64 Updated 4 months ago
- Complex Function Calling Benchmark. ☆139 Updated 9 months ago
- DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL ☆185 Updated 3 weeks ago
- Code that accompanies the public release of the paper Lost in Conversation (https://arxiv.org/abs/2505.06120) ☆175 Updated 4 months ago
- Official Code Repository for the paper "Distilling LLM Agent into Small Models with Retrieval and Code Tools" ☆162 Updated this week
- ☆125 Updated last year
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆108 Updated 4 months ago
- LIMI: Less is More for Agency ☆141 Updated last week
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans. ☆103 Updated this week
- ☆83 Updated this week
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆93 Updated 5 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆63 Updated 10 months ago
- ☆58 Updated 3 months ago
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning ☆70 Updated 3 months ago
- Designing Multi-Agent Systems with Zero Supervision ☆99 Updated 3 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆37 Updated 9 months ago
- [ACL25' Findings] SWE-Dev is an SWE agent with a scalable test case construction pipeline. ☆55 Updated 3 months ago
- Run SWE-bench evaluations remotely ☆41 Updated 2 months ago
- [EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning" ☆101 Updated last month
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀 ☆102 Updated 2 months ago
- ☆84 Updated 11 months ago