[ACL'25 Findings] SWE-Dev is an SWE agent with a scalable test case construction pipeline.
☆59 Jul 21, 2025 Updated 7 months ago
Alternatives and similar repositories for SWE-Dev
Users who are interested in SWE-Dev are comparing it to the libraries listed below.
- [DL4C @ ICLR 2025] A Benchmark for Automated Environment Setup ☆34 Nov 9, 2025 Updated 3 months ago
- [ACL25] FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation ☆44 Jan 28, 2026 Updated last month
- Make Agent CLI is a powerful command-line tool designed to streamline the management and deployment of AI agents across multiple chains. … ☆15 Sep 3, 2025 Updated 5 months ago
- [EMNLP 2024] RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning ☆15 May 13, 2025 Updated 9 months ago
- RL Scaling and Test-Time Scaling (ICML'25) ☆114 Jan 23, 2025 Updated last year
- A pytorch implementation of Abstract Syntax Networks ☆13 Jun 27, 2025 Updated 8 months ago
- CFBench: A Comprehensive Constraints-Following Benchmark for LLMs ☆47 Aug 26, 2024 Updated last year
- On demand communication ☆32 Feb 12, 2026 Updated 2 weeks ago
- Evaluation utilities based on SymPy. ☆21 Dec 12, 2024 Updated last year
- Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs ☆27 Dec 17, 2024 Updated last year
- Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving ☆323 Dec 18, 2025 Updated 2 months ago
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆23 Oct 1, 2024 Updated last year
- [ACL'25] UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench ☆35 Aug 12, 2025 Updated 6 months ago
- OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection (ICCAD 2024) ☆29 Oct 20, 2024 Updated last year
- Match your resume with a job, effortlessly ☆27 Apr 23, 2025 Updated 10 months ago
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆104 Sep 24, 2025 Updated 5 months ago
- Reproducing R1 for Code with Reliable Rewards ☆290 May 5, 2025 Updated 9 months ago
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm… ☆101 Aug 25, 2025 Updated 6 months ago
- ☆39 Feb 7, 2025 Updated last year
- The codebase for our ACL2023 paper: Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learni… ☆30 Jul 16, 2023 Updated 2 years ago
- Learning the basic fundamentals of high level programming with Python and JavaScript ☆11 Mar 10, 2023 Updated 2 years ago
- LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation ☆33 Updated this week
- [ASE2024] Mutual Learning-Based Framework for Enhancing Robustness of Code Models via Adversarial Training ☆11 Sep 13, 2024 Updated last year
- [ICML '24] R2E: Turn any GitHub Repository into a Programming Agent Environment ☆140 Apr 20, 2025 Updated 10 months ago
- ☆99 Jan 26, 2026 Updated last month
- exploring whether LLMs perform case-based or rule-based reasoning ☆30 Mar 2, 2024 Updated last year
- Toy implementation of Strawberry ☆33 Sep 24, 2024 Updated last year
- [ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI ☆483 Jan 3, 2026 Updated last month
- Enhanced Explainable Neural Network ☆10 Dec 25, 2021 Updated 4 years ago
- Collection of scripts to pretrain T5 in unsupervised text, using PyTorch Lightning. CORD-19 pretraining provided as example. ☆32 Apr 26, 2021 Updated 4 years ago
- Text to audio with Tik-Tok Voices ☆13 Apr 6, 2023 Updated 2 years ago
- Code for the paper "SMACE: A New Method for the Interpretability of Composite Decision Systems", ECML 2022 ☆15 Apr 17, 2023 Updated 2 years ago
- HashLips Art Engine is a tool used to create multiple different instances of artworks based on provided layers. ☆11 Nov 21, 2021 Updated 4 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆11 Apr 15, 2024 Updated last year
- Code for ICML 25 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)" ☆50 Jun 30, 2025 Updated 8 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆35 Mar 19, 2024 Updated last year
- Agentless🐱: an agentless approach to automatically solve software development problems ☆2,010 Dec 22, 2024 Updated last year
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆166 Oct 11, 2024 Updated last year
- LangChain + LiteLLM that works ☆50 Sep 1, 2025 Updated 6 months ago