A benchmarking tool for evaluating AI coding assistants on real-world software engineering tasks from the SWE-Bench dataset.
☆62 · Jan 22, 2026 · Updated 3 months ago
Alternatives and similar repositories for refact-bench
Users that are interested in refact-bench are comparing it to the libraries listed below.
- [ICSE 2025] The Seeds of the FUTURE Sprout from History: Fuzzing for Unveiling Vulnerabilities in Prospective Deep-Learning Libraries (AC… ☆20 · Dec 22, 2025 · Updated 4 months ago
- Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment" ☆17 · Feb 26, 2026 · Updated 2 months ago
- This is the tool released in the ASE'23 paper "Generative Type Inference for Python". ☆28 · Sep 12, 2023 · Updated 2 years ago
- ☆13 · May 19, 2024 · Updated last year
- A fork of HumanEval-Java from the paper "Impact of Code Language Models on Automated Program Repair" ☆14 · Dec 11, 2024 · Updated last year
- Semi-automated modelling and Model-Based Testing for CosmWasm contracts ☆17 · Jun 28, 2024 · Updated last year
- Reproducing BugsInPy: Benchmarking Bugs in Python Projects ☆14 · Sep 4, 2023 · Updated 2 years ago
- Official Implementation of ACL2023: Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span … ☆14 · Aug 25, 2023 · Updated 2 years ago
- BiasFinder | IEEE TSE | Metamorphic Test Generation to Uncover Bias for Sentiment Analysis Systems ☆11 · Jan 18, 2022 · Updated 4 years ago
- This is the implementation for IEEE S&P 2022 paper "Model Orthogonalization: Class Distance Hardening in Neural Networks for Better Secur… ☆11 · Aug 24, 2022 · Updated 3 years ago
- ☆20 · Jun 9, 2023 · Updated 2 years ago
- ☆10 · Oct 20, 2023 · Updated 2 years ago
- SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution ☆27 · Nov 11, 2025 · Updated 5 months ago
- A Repository of Real, Recent Java Bugs ☆22 · Jan 6, 2026 · Updated 4 months ago
- ☆16 · Feb 28, 2024 · Updated 2 years ago
- A Natural Language Generation System ☆14 · Apr 21, 2026 · Updated 2 weeks ago
- Nyx: Detecting Exploitable Front-Running Vulnerabilities in Smart Contracts ☆23 · May 11, 2024 · Updated last year
- Efficient APR with LLMs http://arxiv.org/pdf/2402.06598 ☆16 · May 28, 2024 · Updated last year
- Color palette and swatches for macOS's color picker. ☆20 · Jun 9, 2020 · Updated 5 years ago
- The TacTok automated Coq proof script synthesis tool ☆17 · Jan 9, 2024 · Updated 2 years ago
- 🌟 SwarmAgent: A framework for simulating social group dynamics using multi-agent collaboration, aiding insights into collective behavior… ☆13 · Dec 5, 2023 · Updated 2 years ago
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆103 · Sep 24, 2025 · Updated 7 months ago
- ☆27 · Apr 7, 2026 · Updated last month
- Reflection library for Coq ☆12 · Sep 26, 2019 · Updated 6 years ago
- [USENIX Security 2025] SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks ☆20 · Sep 18, 2025 · Updated 7 months ago
- Siren: Byzantine-robust Federated Learning via Proactive Alarming (SoCC '21) ☆11 · Mar 28, 2024 · Updated 2 years ago
- ☆35 · Mar 6, 2026 · Updated 2 months ago
- A naive interpreter for IR of NJU compiler principle lab3, to accelerate interpretation, the ir will be compiled to machine-friendly bina… ☆16 · Jun 17, 2020 · Updated 5 years ago
- SnapDocs - A Modern, Open-Source Document Workspace ☆25 · Sep 7, 2025 · Updated 7 months ago
- Improving Machine Translation Systems via Isotopic Replacement ☆12 · Apr 14, 2023 · Updated 3 years ago
- ☆16 · Jul 11, 2023 · Updated 2 years ago
- ☆29 · Mar 18, 2024 · Updated 2 years ago
- Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings ☆20 · Sep 1, 2025 · Updated 8 months ago
- ☆16 · May 17, 2025 · Updated 11 months ago
- Automated enforcement of net-negative LOC, complexity constraints, and quality standards for Claude code ☆48 · Jul 29, 2025 · Updated 9 months ago
- AI Personas ☆45 · Updated this week
- OpenCopilot flows editor ☆12 · Oct 31, 2023 · Updated 2 years ago
- [ICML 2025 Poster] Official PyTorch Implementation of "Habitizing Diffusion Planning for Efficient and Effective Decision Making" ☆36 · May 26, 2025 · Updated 11 months ago
- simple ansible playbook to take clean ubuntu 18.04 to CUDA 10, PyTorch 1.0, fastai, miniconda heaven ☆12 · Dec 16, 2018 · Updated 7 years ago