SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents
☆83 · Apr 28, 2026 · Updated this week
Alternatives and similar repositories for SWE-PolyBench
Users who are interested in SWE-PolyBench are comparing it to the libraries listed below.
- Official implementation for the paper, StackEval: Benchmarking LLMs in Coding Assistance, https://arxiv.org/abs/2412.05288 ☆20 · Oct 30, 2024 · Updated last year
- Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving ☆333 · Dec 18, 2025 · Updated 4 months ago
- ☆11 · Sep 7, 2023 · Updated 2 years ago
- Agentless Lite: RAG-based SWE-Bench software engineering scaffold ☆45 · Apr 15, 2025 · Updated last year
- ML-Dev-Bench is a benchmark for evaluating AI agents against various ML development tasks. ☆42 · Mar 10, 2026 · Updated last month
- ☆38 · May 15, 2025 · Updated 11 months ago
- ☆28 · Aug 13, 2025 · Updated 8 months ago
- 《CMake Practice》 ☆15 · Oct 28, 2019 · Updated 6 years ago
- Run SWE-bench evaluations remotely ☆62 · Aug 14, 2025 · Updated 8 months ago
- Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task. ☆260 · Mar 29, 2026 · Updated last month
- Redux bindings for HTML5 audio elements ☆12 · May 29, 2017 · Updated 8 years ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ☆671 · Jul 29, 2025 · Updated 9 months ago
- CLARA: Confidence of Labels and Raters ☆10 · Jun 3, 2023 · Updated 2 years ago
- A collection of scripts and tools for analyzing SWE agents. ☆16 · May 7, 2025 · Updated 11 months ago
- The official repository of the Omni-MATH benchmark. ☆93 · Dec 22, 2024 · Updated last year
- Universal skill installer for AI coding agents ☆34 · Apr 6, 2026 · Updated 3 weeks ago
- This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software E… ☆1,440 · Jul 18, 2025 · Updated 9 months ago
- SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner ☆39 · Jun 29, 2025 · Updated 10 months ago
- ☆15 · Aug 7, 2014 · Updated 11 years ago
- A High Performance Library for Time-Series Featurization. ☆25 · Aug 29, 2023 · Updated 2 years ago
- ☆11 · Mar 15, 2024 · Updated 2 years ago
- A System for Debloating C/C++ Programs ☆32 · Jul 16, 2021 · Updated 4 years ago
- Mass Android app vulnerability analysis toolkit ☆13 · Dec 6, 2016 · Updated 9 years ago
- Code for our paper "Learning to Generate Unit Tests for Automated Debugging" ☆18 · Mar 7, 2025 · Updated last year
- [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation ☆76 · Mar 23, 2026 · Updated last month
- Code for "Beyond Individual Input for Deep Anomaly Detection on Tabular Data" ☆15 · Nov 17, 2025 · Updated 5 months ago
- ☆12 · Oct 10, 2022 · Updated 3 years ago
- Open-source coding assistant for Visual Studio Code. Connect to LLMs from OpenAI or Google. ☆18 · Aug 14, 2023 · Updated 2 years ago
- KubeCon-CloudNativeCon-OpenSourceSummit-AI_dev-China-2024's slides (slides from the 2024 CNCF conference in Hong Kong, China). ☆12 · Aug 31, 2024 · Updated last year
- Pinecone text client library ☆68 · Aug 11, 2025 · Updated 8 months ago
- Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER" @COLING-2022 ☆11 · Aug 20, 2022 · Updated 3 years ago
- Simple pub/sub architecture with AWS Copilot ☆10 · Feb 20, 2026 · Updated 2 months ago
- simple, self-contained, relocatable emacs ☆18 · Mar 16, 2010 · Updated 16 years ago
- Minimum DevSecOps with Monitoring Options on Amazon EKS ☆13 · Mar 27, 2026 · Updated last month
- Interpretable Deep Clustering for Tabular Data (ICML 2024) ☆18 · Aug 26, 2025 · Updated 8 months ago
- leveldb backed mail repl. ☆10 · May 5, 2015 · Updated 10 years ago
- Post processing library used to analyze memory snapshots ☆29 · Apr 8, 2026 · Updated 3 weeks ago
- ☆11 · Nov 22, 2021 · Updated 4 years ago
- Multilingual Code Co-Evolution Using Large Language Models ☆14 · Dec 8, 2024 · Updated last year