SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents
☆84May 13, 2026Updated 3 weeks ago
Alternatives and similar repositories for SWE-PolyBench
Users who are interested in SWE-PolyBench are comparing it to the libraries listed below.
- Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving☆337Dec 18, 2025Updated 5 months ago
- Building an Intelligent AWS Cloud Engineer Agent with Strands Agents SDK☆27Dec 16, 2025Updated 5 months ago
- Curated resources related to Strands Agents.☆50May 20, 2026Updated 2 weeks ago
- Cloud serverless implementations of AWS Strands Agents.☆17Jun 20, 2025Updated 11 months ago
- Agentless Lite: RAG-based SWE-Bench software engineering scaffold☆48Apr 15, 2025Updated last year
- ☆40May 15, 2025Updated last year
- Viewer for text datasets in formats like HuggingFace, JSONL, etc.☆15Feb 25, 2025Updated last year
- Run SWE-bench evaluations remotely☆69Aug 14, 2025Updated 9 months ago
- Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.☆265Mar 29, 2026Updated 2 months ago
- [NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents☆668Jun 1, 2026Updated last week
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]☆686Jul 29, 2025Updated 10 months ago
- CLARA: Confidence of Labels and Raters☆10Jun 3, 2023Updated 3 years ago
- ☆11May 6, 2019Updated 7 years ago
- Script that makes it easy to login over ssh to your local machine☆16Dec 13, 2017Updated 8 years ago
- The official repository of the Omni-MATH benchmark.☆93Dec 22, 2024Updated last year
- LLM benchmarks☆13Feb 22, 2024Updated 2 years ago
- transparent HTTP cache proxy with Redis — deduplicate API calls, save costs☆12Feb 21, 2026Updated 3 months ago
- This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software E…☆1,439Jul 18, 2025Updated 10 months ago
- Digital Ocean CLI application course content☆12Mar 17, 2018Updated 8 years ago
- Monash Scalable Time Series Evaluation Repository☆20Aug 28, 2025Updated 9 months ago
- Easy to use continuous 3D touch gesture recognizer.☆13Dec 29, 2016Updated 9 years ago
- Kafka sink connector for Amazon EventBridge to send events (records) from Kafka topic(s) to the specified EventBridge event bus☆78Jun 2, 2026Updated last week
- Emscripten WASM build of libutp☆16Oct 24, 2019Updated 6 years ago
- [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation☆81Apr 28, 2026Updated last month
- ☆17May 19, 2026Updated 2 weeks ago
- MCP server for AI coding agents. Instead of reading files one by one, your agent gets dependency graphs, git intent, blast radius, and ch…☆57May 31, 2026Updated last week
- Minimalist AI agent that fixes itself when things break.☆40May 11, 2026Updated 3 weeks ago
- ☆12Oct 10, 2022Updated 3 years ago
- CDK construct to deploy an Ethereum node running on Amazon Managed Blockchain☆15Updated this week
- Tampermonkey on steroids that works in Electron☆11Feb 21, 2017Updated 9 years ago
- KubeCon-CloudNativeCon-OpenSourceSummit-AI_dev-China-2024's slides. / Slides from the 2024 CNCF conference in China (Hong Kong).☆12Aug 31, 2024Updated last year
- The very simple ETS wrapper simplifying cross-process ETS handling (like `Agent`, but `:ets`).☆13Jun 7, 2019Updated 7 years ago
- Cool nightmarejs browser automation examples.☆14Apr 11, 2022Updated 4 years ago
- Post processing library used to analyze memory snapshots☆32May 29, 2026Updated last week
- ☆11Nov 22, 2021Updated 4 years ago
- ☆13Aug 12, 2022Updated 3 years ago
- A specification for OpenInference, a semantic mapping of ML inferences☆47Apr 17, 2024Updated 2 years ago
- Harness used to benchmark aider against SWE Bench benchmarks☆84Jun 27, 2024Updated last year
- ☆14May 20, 2022Updated 4 years ago