everpeace / kube-openmpi
Open MPI jobs on Kubernetes
☆115 Updated 6 years ago
Alternatives and similar repositories for kube-openmpi:
Users who are interested in kube-openmpi are comparing it to the libraries listed below
- Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.) ☆469 Updated 2 weeks ago
- Singularity implementation of k8s operator for interacting with SLURM. ☆117 Updated 4 years ago
- Project to manage Flux tasks needed to standardize kubernetes HPC scheduling interfaces ☆24 Updated 3 months ago
- The Singularity implementation of the Kubernetes Container Runtime Interface ☆114 Updated 4 years ago
- GPU plugin to the node feature discovery for Kubernetes ☆299 Updated 10 months ago
- OCI-compatible engine to deploy Linux containers on HPC environments. ☆135 Updated 5 months ago
- MPI Cluster Automation Solution using Docker, based on Alpine Linux with MPICH (see IEEE paper) ☆134 Updated 11 months ago
- A Slurm cluster for Kubernetes ☆56 Updated 8 months ago
- Slurm in Docker - Exploring Slurm using CentOS 7 based Docker images ☆127 Updated 5 years ago
- Fork of NVIDIA device plugin for Kubernetes with support for shared GPUs by declaring GPUs multiple times ☆88 Updated 2 years ago
- Kubernetes Rdma SRIOV device plugin ☆110 Updated 4 years ago
- A scalable OpenMPI runtime container for Docker ☆92 Updated 3 years ago
- HPC Container Maker ☆472 Updated this week
- RDMA device plugin for Kubernetes ☆210 Updated last year
- MIG Partition Editor for NVIDIA GPUs ☆192 Updated last week
- A Slurm cluster using docker-compose ☆358 Updated 6 months ago
- NVIDIA Network Operator ☆245 Updated this week
- NVIDIA NCCL Tests for Distributed Training ☆85 Updated 2 weeks ago
- Now hosted on GitLab. ☆313 Updated 5 months ago
- MPI Microbenchmarks ☆37 Updated 8 years ago
- core services for the Flux resource management framework ☆177 Updated this week
- A Kubernetes operator for mxnet jobs ☆53 Updated 3 years ago
- Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster ☆317 Updated last week
- Common APIs and libraries shared by other Kubeflow operator repositories. ☆52 Updated last year
- Container plugin for Slurm Workload Manager ☆329 Updated 4 months ago
- SCR caches checkpoint data in storage on the compute nodes of a Linux cluster to provide a fast, scalable checkpoint / restart capability… ☆102 Updated 2 weeks ago
- MPI Testing Tool ☆63 Updated 3 months ago
- Python bindings for UCX ☆126 Updated last week
- Deploy Dask using MPI4Py ☆53 Updated 3 weeks ago
- Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine ☆227 Updated this week