SlinkyProject / slurm-bridge
Run Slurm as a Kubernetes scheduler. A Slinky project.
☆48 Updated this week
Alternatives and similar repositories for slurm-bridge
Users who are interested in slurm-bridge are comparing it to the repositories listed below.
- Run Slurm on Kubernetes. A Slinky project. ☆182 Updated this week
- A Slurm cluster for Kubernetes ☆65 Updated last year
- Singularity implementation of k8s operator for interacting with SLURM. ☆117 Updated 4 years ago
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆110 Updated 3 weeks ago
- OCI-compatible engine to deploy Linux containers on HPC environments. ☆139 Updated last year
- The Singularity implementation of the Kubernetes Container Runtime Interface ☆114 Updated 4 years ago
- MIG Partition Editor for NVIDIA GPUs ☆224 Updated this week
- Run Slurm in Kubernetes ☆311 Updated this week
- ☆267 Updated 3 weeks ago
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆131 Updated this week
- Holistic job manager on Kubernetes ☆116 Updated last year
- Deploy a Flux MiniCluster to Kubernetes with the operator ☆36 Updated this week
- Container plugin for Slurm Workload Manager ☆392 Updated last month
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers. ☆138 Updated 2 weeks ago
- Slurm in Kubernetes ☆43 Updated last month
- Cluster Toolkit is an open-source software offered by Google Cloud which makes it easy for customers to deploy AI/ML and HPC environments… ☆295 Updated this week
- A Lustre container storage interface that allows Kubernetes to mount/unmount provisioned Lustre filesystems into containers. ☆38 Updated 2 weeks ago
- Run cloud native workloads on NVIDIA GPUs ☆204 Updated last month
- A tool to detect infrastructure issues on cloud native AI systems ☆49 Updated last month
- NVIDIA NCCL Tests for Distributed Training ☆121 Updated last week
- Carbon Limiting Auto Tuning for Kubernetes ☆37 Updated last year
- NVIDIA Network Operator ☆289 Updated this week
- An open-source toolkit for deploying and managing high performance clusters for HPC, AI, and data analytics workloads. ☆280 Updated this week
- A toolkit for discovering cluster network topology. ☆76 Updated last week
- Prometheus collector and exporter for Slurm cluster metrics. A Slinky project. ☆14 Updated this week
- llm-d benchmark scripts and tooling ☆31 Updated this week
- KJob: Tool for CLI-loving ML researchers ☆39 Updated last week
- OpenAPI Golang client library for Slurm REST API. A Slinky project. ☆15 Updated last week
- GPU plugin to the node feature discovery for Kubernetes ☆306 Updated last year
- Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.) ☆497 Updated this week