SlinkyProject / slurm-bridge
Run Slurm as a Kubernetes scheduler. A Slinky project.
☆44 Updated this week
Alternatives and similar repositories for slurm-bridge
Users who are interested in slurm-bridge are comparing it to the repositories listed below.
- Run Slurm on Kubernetes. A Slinky project. ☆175 Updated this week
- A Slurm cluster for Kubernetes ☆65 Updated last year
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆110 Updated 2 weeks ago
- ☆264 Updated last week
- A Lustre container storage interface that allows Kubernetes to mount/unmount provisioned Lustre filesystems into containers. ☆38 Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆131 Updated last week
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers. ☆138 Updated this week
- Deploy a Flux MiniCluster to Kubernetes with the operator ☆35 Updated this week
- MIG Partition Editor for NVIDIA GPUs ☆218 Updated this week
- NVIDIA Network Operator ☆284 Updated this week
- KJob: Tool for CLI-loving ML researchers ☆39 Updated last week
- JobSet: a k8s native API for distributed ML training and HPC workloads ☆266 Updated last week
- Holistic job manager on Kubernetes ☆116 Updated last year
- GenAI inference performance benchmarking tool ☆105 Updated this week
- Slurm in Kubernetes ☆43 Updated last week
- ☆26 Updated last month
- Run cloud native workloads on NVIDIA GPUs ☆201 Updated 2 weeks ago
- A toolkit for discovering cluster network topology. ☆72 Updated last week
- Helm charts for llm-d ☆50 Updated 3 months ago
- OCI-compatible engine to deploy Linux containers on HPC environments. ☆138 Updated 11 months ago
- ☆38 Updated last week
- Container plugin for Slurm Workload Manager ☆386 Updated 2 weeks ago
- Singularity implementation of k8s operator for interacting with SLURM. ☆117 Updated 4 years ago
- ☆65 Updated last week
- CUDA checkpoint and restore utility ☆376 Updated last month
- NVIDIA NCCL Tests for Distributed Training ☆114 Updated last week
- InterLink aims to provide an abstraction for the execution of a Kubernetes pod on any remote resource capable of managing a Container exe… ☆90 Updated last week
- ☆174 Updated this week
- A tool to detect infrastructure issues on cloud native AI systems ☆48 Updated last month
- Run Slurm in Kubernetes ☆296 Updated this week