IBM/autopilot

A tool to detect infrastructure issues on cloud native AI systems

☆54

Alternatives and similar repositories for autopilot

Users who are interested in autopilot are comparing it to the repositories listed below.

fmperf-project / fmperf
Cloud Native Benchmarking of Foundation Models
☆46 • Jul 31, 2025 • Updated 11 months ago
dessertlab / Fault-Injection-Dataset
Failure dataset accompanying the paper "How Bad Can a Bug Get? An Empirical Analysis of Software Failures in the OpenStack Cloud Computi…
☆10 • Jun 12, 2020 • Updated 6 years ago
uiuc-hpc / Recorder
Comprehensive Parallel I/O Tracing and Analysis
☆52 • Apr 16, 2025 • Updated last year
IntelligentDDS / LogReducer
☆15 • Jan 7, 2023 • Updated 3 years ago
project-codeflare / multi-cluster-app-dispatcher
Holistic job manager on Kubernetes
☆117 • Feb 20, 2024 • Updated 2 years ago
IBM / knative-quarkus-bench
Knative benchmark suite for Quarkus
☆11 • Feb 5, 2026 • Updated 5 months ago
IBM / solsa
Solution Service Architecture
☆26 • Jun 5, 2024 • Updated 2 years ago
hpc-io / dxt-explorer
DXT Explorer is an interactive web-based log analysis tool for Darshan DXT logs.
☆18 • Feb 19, 2026 • Updated 5 months ago
IntelligentDDS / NN-eBPF
Real-Time Intrusion Detection and Prevention with Neural Network in Kernel using eBPF
☆25 • Apr 9, 2024 • Updated 2 years ago
leptonai / gpud
GPUd automates monitoring, diagnostics, and issue identification for GPUs
☆485 • Updated this week
olcf / hpc-system-test-wg
Hosted by the HPC System Test Working Group collaboration
☆17 • Jun 26, 2026 • Updated 3 weeks ago
IBM / LLM-performance-prediction
Predict the performance of LLM inference services
☆23 • Sep 18, 2025 • Updated 10 months ago
llm-d / llm-d-benchmark
llm-d benchmark scripts and tooling
☆62 • Updated this week
hariharan-devarajan / dlio_benchmark
A repository for an I/O benchmark that represents scientific deep learning workloads.
☆24 • Dec 6, 2022 • Updated 3 years ago
hpc-io / drishti-io
Drishti provides I/O insights to help you improve your application's I/O performance.
☆26 • Mar 3, 2026 • Updated 4 months ago
openshift / secondary-scheduler-operator
Red Hat Certified optional operator for secondary schedulers
☆21 • Updated this week
berkmancenter / adf
Augmented Dickey-Fuller implementation in Go
☆12 • Mar 15, 2019 • Updated 7 years ago
gardener-attic / vpa-exporter
[DEPRECATED] Prometheus exporter for VPA recommendations
☆12 • Aug 22, 2023 • Updated 2 years ago
MolSSI-Education / S2I2
Code and other materials for the S2I2 Software Summer School
☆12 • Mar 11, 2017 • Updated 9 years ago
MetaX-MACA / FlashMLA
Fast and efficient attention method exploration and implementation.
☆27 • Jul 6, 2026 • Updated 2 weeks ago
NVIDIA / NVSentinel
NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated compu…
☆349 • Updated this week
uwplse / stng
Compiler for Fortran stencils using verified lifting
☆20 • Apr 5, 2022 • Updated 4 years ago
fujitsu / pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
☆11 • Jun 2, 2024 • Updated 2 years ago
gridsim / gridsim
Gridsim simulator
☆12 • May 12, 2017 • Updated 9 years ago
kubernetes-sigs / kjob
KJob: Tool for CLI-loving ML researchers
☆44 • Jun 1, 2026 • Updated last month
gashcrumb / dynamic-plugins-getting-started
A standalone set of Backstage plugins intended to be converted to dynamic plugins to run in Red Hat Developer Hub
☆16 • Updated this week
apache / openwhisk-composer
Apache OpenWhisk Composer provides a high-level programming model in JavaScript for composing serverless functions
☆68 • Sep 24, 2024 • Updated last year
nabla-containers / nabla-containers.github.io
Nabla Containers blog
☆12 • May 26, 2021 • Updated 5 years ago
nl2logql / LogQLLM
☆10 • Dec 10, 2024 • Updated last year
Azure / Moneo
Distributed AI/HPC Monitoring Framework
☆29 • Apr 11, 2025 • Updated last year
OStars / KeyEE
Official repository for the paper "KeyEE: Enhancing Low-resource Generative Event Extraction with Auxiliary Keyword Sub-Prompt"
☆10 • Jun 5, 2024 • Updated 2 years ago
LMCache / LMBenchmark
Systematic and comprehensive benchmarks for LLM systems.
☆62 • Jan 28, 2026 • Updated 5 months ago
oracle-devrel / oci-generative-ai
oci-generative-ai
☆14 • Jan 21, 2025 • Updated last year
yosuke-furukawa / must-call
example.on('end', mustCall(() => {})); Check that the callback function is called.
☆11 • Nov 20, 2022 • Updated 3 years ago
oracle-quickstart / oci-arch-adw-oac
Deploy Autonomous Data Warehouse and Oracle Analytics Cloud
☆10 • Oct 13, 2021 • Updated 4 years ago
lanl / libquo
Dynamic execution environments for coupled, thread-heterogeneous MPI+X applications
☆23 • Mar 3, 2025 • Updated last year
flux-framework / flux-k8s
Project to manage Flux tasks needed to standardize Kubernetes HPC scheduling interfaces
☆30 • Jan 9, 2026 • Updated 6 months ago
MoinDalvs / Time_Series_Forecasting_From_Scratch
☆12 • Aug 27, 2022 • Updated 3 years ago
cloud-native-toolkit / site-developer-guide
This repository will host the Developer Guide for the IBM Garage Cloud Native Toolkit
☆31 • Apr 24, 2023 • Updated 3 years ago