itbench-hub / ITBench-Scenarios
Code repository for scenarios and environment setup as part of ITBench
☆13 Updated this week
Alternatives and similar repositories for ITBench-Scenarios
Users who are interested in ITBench-Scenarios are comparing it to the repositories listed below.
- A holistic framework to enable the design, development, and evaluation of autonomous AIOps agents. ☆12 Updated 6 months ago
- The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models". ☆84 Updated last year
- LILAC: Log Parsing using LLMs with Adaptive Parsing Cache [FSE'24] ☆60 Updated last year
- ☆11 Updated 10 months ago
- [ICSE'25] Aligning the Objective of LLM-based Program Repair ☆22 Updated 8 months ago
- [ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures? ☆216 Updated 2 weeks ago
- This repo is for our submission for ICSE 2025. ☆20 Updated last year
- Pip compatible CodeBLEU metric implementation available for linux/macos/win ☆125 Updated 8 months ago
- [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation ☆62 Updated 2 months ago
- Reinforcement Learning for Repository-Level Code Completion ☆42 Updated last year
- Benchmark ClassEval for class-level code generation. ☆145 Updated last year
- AutoLog: A Log Sequence Synthesis Framework for Anomaly Detection [ASE'23] ☆40 Updated last year
- [NeurIPS'25] Official Implementation of RISE (Reinforcing Reasoning with Self-Verification) ☆30 Updated 3 months ago
- Cloud incidents/failures related work. ☆19 Updated 10 months ago
- DeepCrime - Mutation Testing Tool for Deep Learning Systems ☆15 Updated 2 years ago
- EvoEval: Evolving Coding Benchmarks via LLM ☆80 Updated last year
- [ICSE 2023] Differentiable interpretation and failure-inducing input generation for neural network numerical bugs. ☆13 Updated last year
- Enhancing AI Software Engineering with Repository-level Code Graph ☆228 Updated 7 months ago
- An Evolving Code Generation Benchmark Aligned with Real-world Code Repositories ☆66 Updated last year
- Large Language Models for Software Engineering ☆255 Updated 4 months ago
- A toolkit for hybrid log parsing ☆18 Updated 2 years ago
- Dataflow-guided retrieval augmentation for repository-level code completion, ACL 2024 (main) ☆29 Updated 8 months ago
- Repoformer: Selective Retrieval for Repository-Level Code Completion (ICML 2024) ☆61 Updated 5 months ago
- A collection of practical code generation tasks and tests in open source projects. Complementary to HumanEval by OpenAI. ☆155 Updated 11 months ago
- A Large-scale Evaluation for Log Parsing Techniques: How Far are We? [ISSTA'24] ☆126 Updated last month
- A Code Efficiency Benchmark for Code Generation ☆11 Updated 6 months ago
- ☆15 Updated 7 months ago
- CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context ☆18 Updated 3 months ago
- TDD-Bench-Verified is a new benchmark for generating test cases for test-driven development (TDD) ☆25 Updated 2 months ago
- Log Parsing: How Far Can ChatGPT Go? (ASE 2023 - NIER Track) ☆21 Updated last year