THUDM/SWE-Dev

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/THUDM/SWE-Dev)

THUDM / SWE-Dev

[ACL25' Findings] SWE-Dev is an SWE agent with a scalable test case construction pipeline.

☆59

Alternatives and similar repositories for SWE-Dev

Users that are interested in SWE-Dev are comparing it to the libraries listed below

Sorting:

JetBrains-Research / EnvBench
View on GitHub
[DL4C @ ICLR 2025] A Benchmark for Automated Environment Setup
☆34Nov 9, 2025Updated 3 months ago
microsoft / FEA-Bench
View on GitHub
[ACL25] FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation
☆44Jan 28, 2026Updated last month
BitteProtocol / make-agent
View on GitHub
Make Agent CLI is a powerful command-line tool designed to streamline the management and deployment of AI agents across multiple chains. …
☆15Sep 3, 2025Updated 5 months ago
Junjie-Ye / RoTBench
View on GitHub
[EMNLP 2024] RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning
☆15May 13, 2025Updated 9 months ago
THUDM / T1
View on GitHub
RL Scaling and Test-Time Scaling (ICML'25)
☆114Jan 23, 2025Updated last year
xiye17 / torchASN
View on GitHub
A pytorch implementation of Abstract Syntax Networks
☆13Jun 27, 2025Updated 8 months ago
PKU-Baichuan-MLSystemLab / CFBench
View on GitHub
CFBench: A Comprehensive Constraints-Following Benchmark for LLMs
☆47Aug 26, 2024Updated last year
sail-sg / odc
View on GitHub
On demand communication
☆32Feb 12, 2026Updated 2 weeks ago
tongyx361 / symeval
View on GitHub
Evaluation utilities based on SymPy.
☆21Dec 12, 2024Updated last year
abdelfattah-lab / nitro
View on GitHub
Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs
☆27Dec 17, 2024Updated last year
multi-swe-bench / multi-swe-bench
View on GitHub
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
☆323Dec 18, 2025Updated 2 months ago
Watchfulio / dataset-generator
View on GitHub
A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of …
☆23Oct 1, 2024Updated last year
CUHK-Shenzhen-SE / UTBoost
View on GitHub
[ACL'25] UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench
☆35Aug 12, 2025Updated 6 months ago
pku-liang / OriGen
View on GitHub
OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection(ICCAD 2024)
☆29Oct 20, 2024Updated last year
AstraBert / resume-matcher
View on GitHub
Match your resume with a job, effortlessly
☆27Apr 23, 2025Updated 10 months ago
zhenyuhe00 / SWE-Swiss
View on GitHub
SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution
☆104Sep 24, 2025Updated 5 months ago
ganler / code-r1
View on GitHub
Reproducing R1 for Code with Reliable Rewards
☆290May 5, 2025Updated 9 months ago
rlite-project / RLite
View on GitHub
A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm…
☆101Aug 25, 2025Updated 6 months ago
DSL-Lab / aops
View on GitHub
☆39Feb 7, 2025Updated last year
fanyin3639 / Rethinking-instruction-effectiveness
View on GitHub
The codebase for our ACL2023 paper: Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learni…
☆30Jul 16, 2023Updated 2 years ago
Samuel-IG16 / alx-higher_level_programming
View on GitHub
Learning the basic fundamentals of high level programming with Python and JavaScript
☆11Mar 10, 2023Updated 2 years ago
princeton-pli / LongProc
View on GitHub
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
☆33Updated this week
VMnK-Run / MARVEL
View on GitHub
[ASE2024] Mutual Learning-Based Framework for Enhancing Robustness of Code Models via Adversarial Training
☆11Sep 13, 2024Updated last year
r2e-project / r2e
View on GitHub
[ICML '24] R2E: Turn any GitHub Repository into a Programming Agent Environment
☆140Apr 20, 2025Updated 10 months ago
laude-institute / terminal-bench-2
View on GitHub
☆99Jan 26, 2026Updated last month
GraphPKU / Case_or_Rule
View on GitHub
exploring whether LLMs perform case-based or rule-based reasoning
☆30Mar 2, 2024Updated last year
ack-sec / toyberry
View on GitHub
Toy implementation of Strawberry
☆33Sep 24, 2024Updated last year
bigcode-project / bigcodebench
View on GitHub
[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI
☆483Jan 3, 2026Updated last month
ZebinYang / exnn
View on GitHub
Enhanced Explainable Neural Network
☆10Dec 25, 2021Updated 4 years ago
manueldeprada / Pretraining-T5-PyTorch-Lightning
View on GitHub
Collection of scripts to pretrain T5 in unsupervised text, using PyTorch Lightning. CORD-19 pretraining provided as example.
☆32Apr 26, 2021Updated 4 years ago
fbarberis / tiktok-tts
View on GitHub
Text to audio with Tik-Tok Voices
☆13Apr 6, 2023Updated 2 years ago
gianluigilopardo / smace
View on GitHub
Code for the paper "SMACE: A New Method for the Interpretability of Composite Decision Systems", ECML 2022
☆15Apr 17, 2023Updated 2 years ago
devburaq / hashlips_art_engine
View on GitHub
HashLips Art Engine is a tool used to create multiple different instances of artworks based on provided layers.
☆11Nov 21, 2021Updated 4 years ago
KempnerInstitute / llm_uncertainty
View on GitHub
Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"
☆11Apr 15, 2024Updated last year
princeton-pli / MeCo
View on GitHub
Code for ICML 25 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"
☆50Jun 30, 2025Updated 8 months ago
fangyuan-ksgk / CoT-Reasoning-without-Prompting
View on GitHub
Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting
☆35Mar 19, 2024Updated last year
OpenAutoCoder / Agentless
View on GitHub
Agentless🐱: an agentless approach to automatically solve software development problems
☆2,010Dec 22, 2024Updated last year
facebookresearch / cruxeval
View on GitHub
CRUXEval: Code Reasoning, Understanding, and Execution Evaluation
☆166Oct 11, 2024Updated last year
stanford-oval / chainlite
View on GitHub
LangChain + LiteLLM that works
☆50Sep 1, 2025Updated 6 months ago