[DL4C @ ICLR 2025] A Benchmark for Automated Environment Setup
☆35Nov 9, 2025Updated 5 months ago
Alternatives and similar repositories for EnvBench
Users that are interested in EnvBench are comparing it to the libraries listed below.
- ⚙️ A tool for collecting executable code datasets with GitHub Actions ⚙️☆23Apr 28, 2026Updated last week
- From "A comprehensive evaluation of SZZ Variants through a developer-informed oracle" (pdf open-access at https://doi.org/10.1016/j.jss.2…☆18Aug 25, 2023Updated 2 years ago
- Java bindings for tree-sitter☆60Feb 19, 2026Updated 2 months ago
- ☆18Jan 17, 2022Updated 4 years ago
- [ACL25' Findings] SWE-Dev is an SWE agent with a scalable test case construction pipeline.☆59Jul 21, 2025Updated 9 months ago
- A website that records and publishes whether or not you are at the University of Tsukuba☆25Jul 2, 2025Updated 10 months ago
- This repo contains all the code for the SEScore implementation☆15Mar 3, 2025Updated last year
- [ESEC/FSE'23] Hue: A User-Adaptive Parser for Hybrid Logs☆10Aug 24, 2023Updated 2 years ago
- Train Ticket - A Benchmark Microservice System☆15Updated this week
- ☆11Jan 19, 2025Updated last year
- ☆25Oct 2, 2024Updated last year
- ☆15Jan 7, 2023Updated 3 years ago
- [ACL 2024] On the Multi-turn Instruction Following for Conversational Web Agents☆17Oct 12, 2024Updated last year
- Evaluation of source authorship attribution tool☆23Jun 5, 2021Updated 4 years ago
- ☆17Mar 3, 2025Updated last year
- Code for "[COLM'25] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing"☆24Mar 18, 2025Updated last year
- [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation☆76Apr 28, 2026Updated last week
- Leaderboard of Frontier Models for Program Repair https://repairbench.github.io/☆11Oct 26, 2025Updated 6 months ago
- Sharable Grakn knowledge graphs☆14Dec 28, 2022Updated 3 years ago
- IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents (NeurIPS 2024)☆17Jul 14, 2025Updated 9 months ago
- ☆14May 20, 2022Updated 3 years ago
- JooFlux is a Java agent for dynamic aspect-oriented middlewares.☆28Apr 7, 2015Updated 11 years ago
- Introduction to dataflow analysis using Julia☆14Oct 26, 2020Updated 5 years ago
- (TIP 23) Boosting Night-time Scene Parsing with Learnable Frequency☆14Mar 6, 2026Updated last month
- A lightweight tool for detecting bugs on Graph Database Management Systems☆15Jan 9, 2024Updated 2 years ago
- [ICLR 2025] "GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation", Tao Feng, Yihang Sun, Jiaxuan You☆18Mar 18, 2025Updated last year
- [EMNLP 2024] RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning☆15May 13, 2025Updated 11 months ago
- ☆38Jan 24, 2022Updated 4 years ago
- A feature-rich online text-cleaning tool for formatting text copied from PDF, PPT, CAJ, and similar files, removing extra spaces and line breaks☆19Jan 23, 2023Updated 3 years ago
- A framework for the large scale analysis of programming language usage.☆30Jun 27, 2023Updated 2 years ago
- ☆11Mar 27, 2021Updated 5 years ago
- A Helix plugin to display pressed keys on screen☆32Sep 20, 2025Updated 7 months ago
- ☆39Oct 28, 2025Updated 6 months ago
- (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct"☆22May 15, 2025Updated 11 months ago
- Highly-customizable dotfiles manager☆14Feb 19, 2023Updated 3 years ago
- An ANTLR4 grammar for ECMAScript 5.1☆16Jul 13, 2017Updated 8 years ago
- 💻 Terminal-Agent with Human-in-the-Loop Learning☆39Jan 16, 2026Updated 3 months ago
- Flamegraph (Iciclegraph) swing component☆23Updated this week
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision☆19Apr 1, 2025Updated last year