nebius/soperator

Run Slurm in Kubernetes

☆358 · Updated this week

Alternatives and similar repositories for soperator

Users who are interested in soperator are comparing it to the repositories listed below.

nebius / nebius-solutions-library
☆79 · Updated this week
SlinkyProject / slurm-operator
Run Slurm on Kubernetes. A Slinky project.
☆237 · Updated this week
converged-computing / slurm-operator
Testing if I can implement slurm in an operator
☆15 · Nov 3, 2024 · Updated last year
NVIDIA / KAI-Scheduler
KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale
☆1,144 · Updated this week
kubeflow / mpi-operator
Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.)
☆515 · Feb 17, 2026 · Updated last week
rackslab / Slurm-web
Open source web interface for Slurm HPC & AI clusters
☆545 · Feb 13, 2026 · Updated 2 weeks ago
NVIDIA / topograph
A toolkit for discovering cluster network topology.
☆99 · Feb 19, 2026 · Updated last week
interlink-hq / interLink
InterLink aims to provide an abstraction for the execution of a Kubernetes pod on any remote resource capable of managing a Container exe…
☆103 · Jan 28, 2026 · Updated last month
NVIDIA / knavigator
knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes.
☆74 · Jul 18, 2025 · Updated 7 months ago
kubernetes-sigs / jobset
JobSet: a k8s native API for distributed ML training and HPC workloads
☆314 · Updated this week
NVIDIA / k8s-operator-libs
A collection of useful Go libraries to ease the development of NVIDIA Operators for GPU/NIC management.
☆29 · Feb 15, 2026 · Updated 2 weeks ago
NVIDIA / mig-parted
MIG Partition Editor for NVIDIA GPUs
☆241 · Updated this week
SlinkyProject / slurm-client
OpenAPI Golang client library for Slurm REST API. A Slinky project.
☆21 · Feb 20, 2026 · Updated last week
SlinkyProject / slurm-bridge
Run Slurm as a Kubernetes scheduler. A Slinky project.
☆66 · Updated this week
Mellanox / network-operator
NVIDIA Network Operator
☆325 · Updated this week
kubernetes-sigs / dranet
DRANET is a Kubernetes Network Driver that uses Dynamic Resource Allocation (DRA) to deliver high-performance networking for demanding ap…
☆58 · Updated this week
kubernetes-sigs / kueue
Kubernetes-native Job Queueing
☆2,329 · Updated this week
stackhpc / ansible-slurm-appliance
A Slurm-based HPC workload management environment, driven by Ansible.
☆67 · Updated this week
nebuly-ai / nos
Module to Automatically maximize the utilization of GPU resources in a Kubernetes cluster through real-time dynamic partitioning and elas…
☆684 · Apr 21, 2024 · Updated last year
NVIDIA / pyxis
Container plugin for Slurm Workload Manager
☆416 · Feb 18, 2026 · Updated last week
SlinkyProject / slurm-exporter
Prometheus collector and exporter for Slurm cluster metrics. A Slinky project.
☆16 · Nov 7, 2025 · Updated 3 months ago
NVIDIA / k8s-dra-driver-gpu
NVIDIA DRA Driver for GPUs
☆574 · Updated this week
vultr / slik
Slurm in Kubernetes
☆43 · Nov 20, 2025 · Updated 3 months ago
stackabletech / agent
Stackable Agent - a kubelet written in Rust which uses systemd as its backend
☆15 · Dec 21, 2021 · Updated 4 years ago
cea-hpc / auks
Kerberos credential support for batch environments
☆16 · Jul 24, 2024 · Updated last year
NVIDIA / cloud-native-stack
Run cloud native workloads on NVIDIA GPUs
☆225 · Jan 22, 2026 · Updated last month
NVIDIA / kubevirt-gpu-device-plugin
NVIDIA k8s device plugin for Kubevirt
☆278 · Updated this week
google / dranet
DRANET is a Kubernetes Network Driver that uses Dynamic Resource Allocation (DRA) to deliver high-performance networking for demanding ap…
☆161 · Dec 9, 2025 · Updated 2 months ago
stanford-rc / slurm-spank-lua
Slurm Lua SPANK plugin
☆17 · Jan 30, 2025 · Updated last year
hhu-bsinfo / ib-scanner
A terminal based monitoring tool for InfiniBand networks using Detector (https://github.com/hhu-bsinfo/detector)
☆15 · Aug 7, 2019 · Updated 6 years ago
NVIDIA / gpu-operator
NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
☆2,549 · Updated this week
dell / omnia
An open-source toolkit for deploying and managing high performance clusters for HPC, AI, and data analytics workloads.
☆290 · Updated this week
NVIDIA / deepops
Tools for building GPU clusters
☆1,421 · Updated this week
coreweave / tensorizer
Module, Model, and Tensor Serialization/Deserialization
☆289 · Feb 6, 2026 · Updated 3 weeks ago
PKUHPC / scow-slurm-adapter
☆17 · Jul 25, 2025 · Updated 7 months ago
Mellanox / ib-kubernetes
☆74 · Updated this week
NVIDIA / NVSentinel
NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated compu…
☆191 · Updated this week
GoogleCloudPlatform / slurm-gcp
☆61 · Updated this week
containerd / nri
Node Resource Interface
☆364 · Feb 18, 2026 · Updated last week