All-Hands-AI / openhands-resolver
A system that tries to resolve all issues on a GitHub repo with OpenHands.
☆103 Updated 4 months ago
Alternatives and similar repositories for openhands-resolver:
Users who are interested in openhands-resolver are comparing it to the libraries listed below:
- Agent computer interface for AI software engineer. ☆52 Updated this week
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆129 Updated last week
- Harness used to benchmark aider against SWE Bench benchmarks ☆67 Updated 9 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆74 Updated last week
- Enhancing AI Software Engineering with Repository-level Code Graph ☆149 Updated 2 months ago
- An agent benchmark with tasks in a simulated software company. ☆273 Updated last week
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀 ☆92 Updated 5 months ago
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 8 months ago
- ☆73 Updated 2 months ago
- ☆50 Updated 4 months ago
- Finetune Llama-3-8b on the MathInstruct dataset ☆108 Updated 5 months ago
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆77 Updated last month
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated 2 months ago
- ☆106 Updated last week
- ☆87 Updated 8 months ago
- Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task. ☆156 Updated this week
- LLM reads a paper and produces a working prototype ☆51 Updated 2 weeks ago
- Zep: Long-Term Memory for AI Assistants (Python Client) ☆97 Updated last week
- ☆53 Updated last week
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆165 Updated 3 weeks ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆66 Updated 8 months ago
- ☆39 Updated 8 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆84 Updated 2 weeks ago
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. ☆107 Updated 6 months ago
- ☆36 Updated 2 months ago
- A set of utilities for running few-shot prompting experiments on large-language models ☆118 Updated last year
- Open Agent Computer Interface ☆59 Updated 4 months ago
- AWM: Agent Workflow Memory ☆252 Updated last month
- Scaling inference-time compute for LLM-as-a-judge, automated evaluations, guardrails, and reinforcement learning. ☆189 Updated last week
- r2e: turn any github repository into a programming agent environment ☆105 Updated 3 weeks ago