aorwall/SWE-bench-docker

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/aorwall/SWE-bench-docker)

aorwall / SWE-bench-docker

☆106

Alternatives and similar repositories for SWE-bench-docker

Users that are interested in SWE-bench-docker are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Aider-AI / aider-swe-bench
View on GitHub
Harness used to benchmark aider against SWE Bench benchmarks
☆87Jun 27, 2024Updated 2 years ago
SWE-bench / experiments
View on GitHub
Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.
☆274Mar 29, 2026Updated 3 months ago
aorwall / moatless-tools
View on GitHub
☆641Sep 1, 2025Updated 10 months ago
NL2Code / CodeR
View on GitHub
☆158Aug 27, 2024Updated last year
OpenDevin / OD-SWE-bench
View on GitHub
Enhanced fork of SWE-bench, tailored for OpenDevin's ecosystem.
☆30May 26, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
SWE-Gym / SWE-Bench-Fork
View on GitHub
☆13Mar 5, 2025Updated last year
mariushobbhahn / SWEBench-verified-mini
View on GitHub
☆38Jan 8, 2025Updated last year
OpenAutoCoder / Agentless
View on GitHub
Agentless🐱: an agentless approach to automatically solve software development problems
☆2,085Dec 22, 2024Updated last year
OpenCSGs / csghub-sdk
View on GitHub
The CSGHub SDK is a powerful Python client specifically designed to interact seamlessly with the CSGHub server. This toolkit is engineere…
☆21Jul 17, 2026Updated last week
aorwall / moatless-tree-search
View on GitHub
☆141Jun 6, 2025Updated last year
yingweima2022 / CodeLLM
View on GitHub
☆12Jan 31, 2024Updated 2 years ago
SWE-bench / SWE-bench
View on GitHub
SWE-bench: Can Language Models Resolve Real-world Github Issues?
☆5,483Apr 1, 2026Updated 3 months ago
SalesforceAIResearch / swecomm
View on GitHub
☆28Jun 2, 2026Updated last month
SWE-bench / sb-cli
View on GitHub
Run SWE-bench evaluations remotely
☆78Aug 14, 2025Updated 11 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
InternLM / SWE-Fixer
View on GitHub
☆139May 8, 2025Updated last year
codefuse-ai / RepoFuse
View on GitHub
☆66Jan 16, 2025Updated last year
AutoCodeRoverSG / auto-code-rover
View on GitHub
A project structure aware autonomous software engineer aiming for autonomous program improvement. Resolved 37.3% tasks (pass@1) in SWE-be…
☆3,096Apr 24, 2025Updated last year
Qurrent-AI / RES-Q
View on GitHub
RES-Q: Evaluating the Code-Editing Capability of Large Language Model Systems at the Repository Scale
☆28Jun 24, 2026Updated last month
ozyyshr / RepoGraph
View on GitHub
Enhancing AI Software Engineering with Repository-level Code Graph
☆289Apr 1, 2025Updated last year
SalesforceAIResearch / SweRank
View on GitHub
☆23Jun 2, 2026Updated last month
aorwall / moatless-testbeds
View on GitHub
Moatless Testbeds allows you to create isolated testbed environments in a Kubernetes cluster where you can apply code changes through git…
☆14Apr 9, 2025Updated last year
FSoft-AI4Code / RepoHyper
View on GitHub
[FORGE 2025] Graph-based method for end-to-end code completion with context awareness on repository
☆74Sep 3, 2024Updated last year
commit-0 / commit0
View on GitHub
Commit0: Library Generation from Scratch
☆189Feb 24, 2026Updated 5 months ago
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
All-Hands-AI / SWE-bench
View on GitHub
Enhanced fork of SWE-bench, tailored for OpenDevin's ecosystem.
☆33Mar 20, 2025Updated last year
seketeam / EvoCodeBench
View on GitHub
An Evolving Code Generation Benchmark Aligned with Real-world Code Repositories
☆71Aug 15, 2024Updated last year
AlignmentResearch / learned-planner
View on GitHub
Interpreting Learned Search and Planning: Reverse-engineering recurrent convolutional networks (DRC) that play Sokoban
☆21Jun 29, 2025Updated last year
multi-swe-bench / MopenHands
View on GitHub
☆17Apr 9, 2025Updated last year
R2E-Gym / R2E-Gym
View on GitHub
[COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents
☆310Jul 13, 2025Updated last year
d223302 / Over-Reasoning-of-LLMs
View on GitHub
Data and code for EACL'24 paper: Over-Reasoning and Redundant Calculation of Large Language Models
☆11Jan 23, 2024Updated 2 years ago
amazon-science / cceval
View on GitHub
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023)
☆181Aug 15, 2025Updated 11 months ago
Aider-AI / refactor-benchmark
View on GitHub
Aider's refactoring benchmark exercises based on popular python repos
☆87Oct 10, 2024Updated last year
noahshinn / erica
View on GitHub
Erica will always know your context because she can see everything on your screen.
☆15May 21, 2023Updated 3 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
FanZT6 / FairMT-bench
View on GitHub
☆14Mar 7, 2025Updated last year
open-compass / DevEval
View on GitHub
A Comprehensive Benchmark for Software Development.
☆138May 30, 2024Updated 2 years ago
defog-ai / defog-desktop
View on GitHub
A desktop compatible version of the Defog app
☆14Aug 20, 2024Updated last year
grantjenks / py-tree-sitter-languages
View on GitHub
Binary Python wheels for all tree sitter languages.
☆277Feb 17, 2025Updated last year
SWE-Gym / SWE-Gym
View on GitHub
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]
☆711Jul 29, 2025Updated 11 months ago
msv-lab / f1x
View on GitHub
Efficient patch generation engine for C/C++ programs
☆19Dec 23, 2022Updated 3 years ago
multi-swe-bench / multi-swe-bench
View on GitHub
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
☆354Dec 18, 2025Updated 7 months ago