[NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents
☆672 Jun 8, 2026 Updated last week
Alternatives and similar repositories for SWE-smith
Users that are interested in SWE-smith are comparing it to the libraries listed below.
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents ☆290 Jul 13, 2025 Updated 11 months ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ☆690 Jul 29, 2025 Updated 10 months ago
- [NeurIPS 2025 D&B] 🚀 SWE-bench Goes Live! ☆199 Updated this week
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆528 Jun 8, 2026 Updated last week
- Run SWE-bench evaluations remotely ☆70 Aug 14, 2025 Updated 10 months ago
- Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving ☆339 Dec 18, 2025 Updated 5 months ago
- [NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" ☆699 Mar 16, 2025 Updated last year
- ☆45 Mar 6, 2026 Updated 3 months ago
- ☆139 May 8, 2025 Updated last year
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆104 Sep 24, 2025 Updated 8 months ago
- SWE-bench: Can Language Models Resolve Real-world Github Issues? ☆5,175 Apr 1, 2026 Updated 2 months ago
- ☆13 Mar 5, 2025 Updated last year
- A benchmark for LLMs on complicated tasks in the terminal ☆2,342 Jan 22, 2026 Updated 4 months ago
- Agentless🐱: an agentless approach to automatically solve software development problems ☆2,068 Dec 22, 2024 Updated last year
- ☆641 Sep 1, 2025 Updated 9 months ago
- [ICML '24] R2E: Turn any GitHub Repository into a Programming Agent Environment ☆148 Apr 20, 2025 Updated last year
- [ACL25] FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation ☆53 Jan 28, 2026 Updated 4 months ago
- SkyRL: A Modular Full-stack RL Library for LLMs ☆1,993 Updated this week
- ☆46 May 3, 2026 Updated last month
- ☆28 Jun 2, 2026 Updated last week
- [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation ☆83 Apr 28, 2026 Updated last month
- Democratizing Reinforcement Learning for LLMs ☆5,608 Updated this week
- ☆137 Jun 6, 2025 Updated last year
- [NeurIPS'24] SelfCodeAlign: Self-Alignment for Code Generation ☆322 Feb 24, 2025 Updated last year
- Reproducing R1 for Code with Reliable Rewards ☆310 May 5, 2025 Updated last year
- Benchmarking Goal-Oriented Software Engineering ☆165 May 5, 2026 Updated last month
- ✨ RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems - ICLR 2024 ☆208 Aug 16, 2024 Updated last year
- Measuring agents' ability to get work done on a computer ☆231 Updated this week
- Enhancing AI Software Engineering with Repository-level Code Graph ☆278 Apr 1, 2025 Updated last year
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation ☆81 May 2, 2025 Updated last year
- SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersec… ☆19,496 Updated this week
- AutoLibra: Metric Induction for Agents from Open-Ended Human Feedback ☆19 Apr 23, 2026 Updated last month
- This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software E… ☆1,439 Jul 18, 2025 Updated 10 months ago
- [ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI ☆507 Jan 3, 2026 Updated 5 months ago
- ☆33 Jan 8, 2025 Updated last year
- [NeurIPS '25] GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents ☆85 Apr 27, 2026 Updated last month
- ☆78 Feb 28, 2026 Updated 3 months ago
- Commit0: Library Generation from Scratch ☆192 Feb 24, 2026 Updated 3 months ago
- [ESEC/FSE'23] Hue: A User-Adaptive Parser for Hybrid Logs ☆10 Aug 24, 2023 Updated 2 years ago