xlab-uiuc / AIOpsLab
A holistic framework to enable the design, development, and evaluation of autonomous AIOps agents.
☆12 Updated this week
Alternatives and similar repositories for AIOpsLab
Users who are interested in AIOpsLab are comparing it to the repositories listed below.
- Code repository for SRE agent as part of ITBench ☆10 Updated last week
- Cloud incidents/failures related work. ☆17 Updated 4 months ago
- ☆12 Updated 2 years ago
- Zodiac: Unearthing Semantic Checks for Cloud Infrastructure-as-Code Programs, SOSP 2024 ☆12 Updated 5 months ago
- Code repository for ITBench ☆35 Updated last week
- This is the repo for remote direct memory introspection. ☆21 Updated last year
- A Framework for Automated Validation of Deep Learning Training Tasks ☆15 Updated last week
- [ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures? ☆55 Updated 2 weeks ago
- ☆19 Updated 9 months ago
- ☆30 Updated 11 months ago
- ☆11 Updated 6 months ago
- Predict the performance of LLM inference services ☆18 Updated last week
- Graph based Incident Extraction and Diagnosis in Large-Scale Online Systems (ASE'22) ☆9 Updated 4 months ago
- ☆11 Updated 4 months ago
- A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems ☆167 Updated 7 months ago
- [ASPLOS'25] Towards End-to-End Optimization of LLM-based Applications with Ayo ☆21 Updated last week
- Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an … ☆33 Updated 11 months ago
- Expressive, Easy to Build, and High-Performance Application Networks ☆17 Updated last week
- Serverless LLM Serving for Everyone. ☆465 Updated 3 weeks ago
- A Reading List of System Configuration Management ☆56 Updated 8 months ago
- ☆21 Updated 11 months ago
- This repository manifests set which is made to build a prototype system of TraceZip, made by 4 pieces. ☆10 Updated 2 months ago
- ☆37 Updated 6 months ago
- A series of work towards achieving ACV. ☆17 Updated last month
- This repository includes the VM request traces of the top 20 tenants in 4 months of 2021 in Huawei Cloud. ☆23 Updated this week
- Legolas: A Fault Injection Framework for Efficient Exposure of Partial Failures in Distributed Systems ☆12 Updated last year
- ☆21 Updated last year
- Must-read papers on improving efficiency for LLM serving clusters ☆29 Updated last week
- AutoLog: A Log Sequence Synthesis Framework for Anomaly Detection [ASE'23] ☆38 Updated last year
- Code for ASE'21 paper "AID: Efficient Prediction of Aggregated Intensity of Dependency in Large-scale Cloud Systems" ☆15 Updated 3 years ago