SlinkyProject / slurm-operator
This project provides a framework that runs Slurm in Kubernetes.
☆77 · Updated this week
Alternatives and similar repositories for slurm-operator:
Users who are interested in slurm-operator are comparing it to the repositories listed below.
- Slurm in Kubernetes ☆41 · Updated 4 months ago
- A Slurm cluster for Kubernetes ☆55 · Updated 8 months ago
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆92 · Updated this week
- InterLink aims to provide an abstraction for the execution of a Kubernetes pod on any remote resource capable of managing a Container exe… ☆64 · Updated this week
- ☆24 · Updated 3 weeks ago
- A Lustre container storage interface that allows Kubernetes to mount/unmount provisioned Lustre filesystems into containers. ☆33 · Updated last week
- Enabling Kubernetes to make pod placement decisions with platform intelligence. ☆174 · Updated 2 months ago
- ☆248 · Updated 2 weeks ago
- Run Slurm in Kubernetes ☆208 · Updated this week
- ☆62 · Updated last week
- InstaSlice Operator facilitates slicing of accelerators using stable APIs ☆33 · Updated this week
- K8s device plugin for GPU sharing ☆100 · Updated last year
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆65 · Updated last week
- Example DRA driver that developers can fork and modify to get them started writing their own. ☆69 · Updated last month
- KJob: Tool for CLI-loving ML researchers ☆27 · Updated last week
- ☆107 · Updated 3 weeks ago
- ☆85 · Updated 7 months ago
- JobSet: a k8s native API for distributed ML training and HPC workloads ☆219 · Updated this week
- A toolkit for discovering cluster network topology. ☆45 · Updated this week
- This repo includes everything you need to know about deploying GPU nodes on OCI ☆26 · Updated this week
- Deploy a Flux MiniCluster to Kubernetes with the operator ☆32 · Updated last month
- Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes ☆348 · Updated this week
- The kernel module management operator builds, signs and loads kernel modules in Kubernetes clusters. ☆100 · Updated this week
- Holistic job manager on Kubernetes ☆115 · Updated last year
- GenAI inference performance benchmarking tool ☆39 · Updated 3 weeks ago
- MIG Partition Editor for NVIDIA GPUs ☆196 · Updated last week
- ☆143 · Updated last week
- NVIDIA Network Operator ☆246 · Updated this week
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆93 · Updated last week
- A collection of community maintained NRI plugins ☆79 · Updated last week