ntt-dkiku / chaos-eater
An LLM-based system that fully automates Chaos Engineering (ASE 2025, NIER track)
☆22Updated this week
Alternatives and similar repositories for chaos-eater
Users who are interested in chaos-eater are comparing it to the libraries listed below.
- [IWQoS 2025] eACGM: An eBPF-based Automated Comprehensive Governance and Monitoring framework for AI/ML systems.☆20Updated 5 months ago
- Awesome resources for failure diagnosis research.☆52Updated 6 months ago
- Observability Volume Management☆41Updated 10 months ago
- An eBPF kernel Observable Agent To Spy Performance Issue On OS.☆13Updated 2 months ago
- ☆10Updated last year
- TraceWeaver is a research prototype for transparently tracing requests through a microservice without application instrumentation.☆23Updated last year
- ☆22Updated 2 years ago
- Awesome-papers is a collection of awesome papers about cloud computing including resource management, serverless, microservice, observer…☆126Updated last year
- ☆15Updated 3 years ago
- Real-Time Intrusion Detection and Prevention with Neural Network in Kernel using eBPF☆21Updated last year
- Blueprint Microservices Compiler: Flexible and Configurable Open-Source Microservice Benchmark Applications☆33Updated this week
- Push-Button End-to-End Testing of Kubernetes Operators and Controllers☆130Updated last week
- Train Ticket Auto Query Python Scripts☆29Updated 3 years ago
- Kubernetes cluster simulator for evaluating schedulers.☆127Updated 5 years ago
- [ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?☆242Updated last week
- Cloud incidents/failures related work.☆20Updated last year
- This repository contains experimental tools we developed to forecast a cluster's resource (CPU or memory) usage.☆44Updated 4 years ago
- ☆15Updated 2 years ago
- A K8s ClusterIP HTTP monitoring library based on eBPF☆19Updated 4 years ago
- A curated list of awesome academic research and industrial materials about Artificial Intelligence for IT Operations (AIOps).☆298Updated 11 months ago
- Code repository for SRE agent as part of ITBench☆19Updated 4 months ago
- HydraGen: A Microservice Benchmark Generator☆20Updated 4 months ago
- [WIP] Simple scheduler and scenario system for learning Kubernetes Scheduler☆53Updated 3 years ago
- DeepPerf is an end-to-end deep learning based solution that can train a software performance prediction model from a limited number of sa…☆15Updated 4 years ago
- Automatic Reliability Testing for Kubernetes Controllers and Operators☆342Updated last year
- 🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads.☆34Updated last month
- Simulator for the datacenter, including power, cooling, server and other components☆17Updated 11 months ago
- ☁️ Benchmarking LLMs for Cloud Config Generation | LLM benchmarking for cloud scenarios☆39Updated last year
- [PoC] A socket-based tracing system for discovering network service dependencies. (renamed from transtracer)☆56Updated this week
- Prometheus service discovery using an HTTP API and file_sd_config.☆24Updated 4 years ago