OpenDevin / OD-SWE-bench
Enhanced fork of SWE-bench, tailored for OpenDevin's ecosystem.
☆23 · Updated 9 months ago
Alternatives and similar repositories for OD-SWE-bench:
Users who are interested in OD-SWE-bench often compare it to the repositories listed below.
- Agent computer interface for AI software engineer. ☆46 · Updated this week
- Harness used to benchmark aider against SWE Bench benchmarks ☆66 · Updated 8 months ago
- Aider's refactoring benchmark exercises based on popular python repos ☆61 · Updated 5 months ago
- Contains the model patches and the eval logs from the passing swe-bench-lite run. ☆10 · Updated 8 months ago
- Data preparation code for CrystalCoder 7B LLM ☆44 · Updated 10 months ago
- ☆50 · Updated 3 months ago
- LangChain + LiteLLM that works ☆39 · Updated 2 weeks ago
- RepoQA: Evaluating Long-Context Code Understanding ☆105 · Updated 4 months ago
- ☆35 · Updated 2 months ago
- LLM finetuning ☆42 · Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆25 · Updated 4 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 · Updated 9 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆52 · Updated 3 months ago
- ☆28 · Updated 10 months ago
- ☆78 · Updated 3 weeks ago
- Easiest way to build custom agents, in a no-code notion style editor, using simple macros. ☆24 · Updated 4 months ago
- ☆20 · Updated last year
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you… ☆37 · Updated 9 months ago
- LLMs as Collaboratively Edited Knowledge Bases ☆44 · Updated last year
- ☆70 · Updated last month
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆51 · Updated last month
- Official code for the paper "ADaPT: As-Needed Decomposition and Planning with Language Models" ☆74 · Updated last year
- Nexusflow function call, tool use, and agent benchmarks. ☆19 · Updated 3 months ago
- 👩🤝🤖 A curated list of datasets for large language models (LLMs), RLHF and related resources (continually updated) ☆23 · Updated last year
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆29 · Updated this week
- Hub for Open Source AGiXT Extensions, Chains, Prompts, and Agents. ☆17 · Updated last year
- 🧠 Mindstorm in Natural Language-based Societies of Mind ☆54 · Updated this week
- LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities. ☆98 · Updated this week