All-Hands-AI / agent-sdk
A clean, modular SDK for building AI agents with OpenHands V1.
☆22 Updated this week
Alternatives and similar repositories for agent-sdk
Users who are interested in agent-sdk are comparing it to the libraries listed below.
- ☆117 Updated 4 months ago
- Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon! ☆50 Updated 5 months ago
- 🚀 SWE-bench Goes Live! ☆119 Updated this week
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw ☆62 Updated 11 months ago
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents ☆159 Updated 2 months ago
- ☆73 Updated 6 months ago
- Scaling Computer-Use Grounding via UI Decomposition and Synthesis ☆109 Updated 3 months ago
- Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge ☆78 Updated 2 months ago
- ☆89 Updated 4 months ago
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆84 Updated this week
- Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory ☆74 Updated 3 months ago
- DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL ☆95 Updated last week
- Code for the paper 🌳 Tree Search for Language Model Agents ☆215 Updated last year
- Efficient Agent Training for Computer Use ☆131 Updated 2 weeks ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆173 Updated 2 months ago
- [ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant ☆40 Updated 9 months ago
- ☆103 Updated 9 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆84 Updated 4 months ago
- ☆73 Updated 3 weeks ago
- ☆73 Updated 2 months ago
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] ☆171 Updated 3 months ago
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans. ☆98 Updated 5 months ago
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025] ☆171 Updated 2 months ago
- We aim to provide the best references to search, select, and synthesize high-quality and large-quantity data for post-training your LLMs. ☆59 Updated 11 months ago
- [NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? ☆130 Updated last year
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆143 Updated 10 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆37 Updated 8 months ago
- Complex Function Calling Benchmark. ☆135 Updated 8 months ago
- The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents" ☆64 Updated last week
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆93 Updated 4 months ago