[ACL'25 Findings] SWE-Dev is an SWE agent with a scalable test case construction pipeline.
☆60 · Jul 21, 2025 · Updated 10 months ago
Alternatives and similar repositories for SWE-Dev
Users that are interested in SWE-Dev are comparing it to the libraries listed below.
- [DL4C @ ICLR 2025] A Benchmark for Automated Environment Setup ☆35 · Nov 9, 2025 · Updated 6 months ago
- [ACL25] FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation ☆52 · Jan 28, 2026 · Updated 3 months ago
- RL Scaling and Test-Time Scaling (ICML'25) ☆116 · Jan 23, 2025 · Updated last year
- A PyTorch implementation of Abstract Syntax Networks ☆12 · Jun 27, 2025 · Updated 10 months ago
- Evaluation utilities based on SymPy. ☆22 · Dec 12, 2024 · Updated last year
- On demand communication ☆34 · Apr 16, 2026 · Updated last month
- Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving ☆333 · Dec 18, 2025 · Updated 5 months ago
- A data structure for scalable combinatorial syntax ☆18 · Apr 29, 2026 · Updated 3 weeks ago
- CFBench: A Comprehensive Constraints-Following Benchmark for LLMs ☆52 · Aug 26, 2024 · Updated last year
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆105 · Sep 24, 2025 · Updated 7 months ago
- OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection (ICCAD 2024) ☆32 · Oct 20, 2024 · Updated last year
- CVE-Factory ☆106 · Mar 27, 2026 · Updated last month
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset. ☆18 · Apr 22, 2025 · Updated last year
- Submission to ICLR ☆47 · Jun 19, 2023 · Updated 2 years ago
- [ICML '24] R2E: Turn any GitHub Repository into a Programming Agent Environment ☆148 · Apr 20, 2025 · Updated last year
- exploring whether LLMs perform case-based or rule-based reasoning ☆31 · Mar 2, 2024 · Updated 2 years ago
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆23 · Oct 1, 2024 · Updated last year
- ☆13 · Mar 5, 2025 · Updated last year
- [EMNLP 2024] RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning ☆15 · May 13, 2025 · Updated last year
- Structured, temporal memory for AI agents. ☆79 · Updated this week
- Environments, tools, and benchmarks for general computer agents ☆15 · Dec 3, 2024 · Updated last year
- ☆19 · Updated this week
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm… ☆104 · Aug 25, 2025 · Updated 8 months ago
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆170 · Oct 11, 2024 · Updated last year
- The benchmark proposed in paper: GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability ☆25 · Aug 12, 2025 · Updated 9 months ago
- ☆41 · Jun 19, 2024 · Updated last year
- ScreenExplorer: Training a Vision-Language Model for Diverse Exploration in Open GUI World ☆25 · Jun 17, 2025 · Updated 11 months ago
- ☆346 · May 24, 2025 · Updated 11 months ago
- Universal LLM security auditor with automated jailbreak testing, DSPy optimization, and OWASP 2025-aligned attack patterns ☆21 · Oct 23, 2025 · Updated 6 months ago
- This is the Placeholder for Llama. Starting with Llama 3 ☆11 · May 20, 2024 · Updated 2 years ago
- the implementation of Embedding API Dependency Graph for Neural Code Generation ☆12 · Jun 6, 2021 · Updated 4 years ago
- ☆16 · Dec 25, 2022 · Updated 3 years ago
- An LLM leaderboard for stateful agents ☆21 · Oct 16, 2025 · Updated 7 months ago
- Automated Capability Discovery via Foundation Model Self-Exploration ☆68 · Apr 16, 2026 · Updated last month
- Toy implementation of Strawberry ☆33 · Sep 24, 2024 · Updated last year
- ☆11 · Oct 11, 2023 · Updated 2 years ago
- Python powered music controlling webpage with websockets and bottle py (works with spotify, vlc, audacious, and others) ☆11 · Jun 9, 2017 · Updated 8 years ago
- [ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI ☆500 · Jan 3, 2026 · Updated 4 months ago
- ☆110 · Jul 15, 2025 · Updated 10 months ago