autoiac-project / iac-eval
[NeurIPS 24] IaC-Eval: A Code Generation Benchmark for Cloud Infrastructure-as-Code programs
☆31 · Updated 11 months ago
Alternatives and similar repositories for iac-eval
Users interested in iac-eval are comparing it to the repositories listed below.
- Zodiac: Unearthing Semantic Checks for Cloud Infrastructure-as-Code Programs, SOSP 2024 ☆14 · Updated last year
- A holistic framework to enable the design, development, and evaluation of autonomous AIOps agents. ☆12 · Updated 6 months ago
- [ICML '24] R2E: Turn any GitHub Repository into a Programming Agent Environment ☆135 · Updated 7 months ago
- Serverless LLM Serving for Everyone. ☆611 · Updated this week
- A Framework for Automated Validation of Deep Learning Training Tasks ☆53 · Updated 2 months ago
- Course information for CS598: Topics in LLM Agents (Spring '25), taught by Prof. Jiaxuan You (jiaxuan@illinois.edu). ☆41 · Updated 7 months ago
- [NAACL 2025] Benchmark for Repository-Level Code Generation, focused on executability, correctness from test cases, and usage of contexts fr… ☆36 · Updated 8 months ago
- ☁️ Benchmarking LLMs for Cloud Config Generation ☆37 · Updated last year
- A datacenter simulator covering power, cooling, servers, and other components ☆16 · Updated 9 months ago
- ☆101 · Updated last year
- ☆297 · Updated 4 months ago
- RouterArena: An open framework for evaluating LLM routers with standardized datasets, metrics, an automated framework, and a live leaderb… ☆39 · Updated last week
- ☆31 · Updated 2 months ago
- Predict the performance of LLM inference services ☆20 · Updated 2 months ago
- [NeurIPS 2025] A simple extension to vLLM that speeds up reasoning models without training. ☆207 · Updated 6 months ago
- PyTorch implementation of the paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline". ☆94 · Updated 2 years ago
- ☆34 · Updated 9 months ago
- Official repository of the ICCV 2025 paper "CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning". ☆16 · Updated 3 weeks ago
- Must-read papers on improving efficiency for LLM serving clusters ☆31 · Updated 6 months ago
- ☆197 · Updated 4 months ago
- Reproducing R1 for Code with Reliable Rewards ☆272 · Updated 6 months ago
- Code repository for scenarios and environment setup as part of ITBench ☆13 · Updated this week
- A Comprehensive Benchmark for Software Development. ☆119 · Updated last year
- Modular and structured prompt caching for low-latency LLM inference ☆103 · Updated last year
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents ☆197 · Updated 4 months ago
- [OSDI '24] Serving LLM-based Applications Efficiently with Semantic Variable ☆196 · Updated last year
- [ICLR 2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation ☆241 · Updated 11 months ago
- Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving ☆282 · Updated 2 weeks ago
- [NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank ☆64 · Updated last year
- The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models". ☆84 · Updated last year