kubeflow / mpi-operatorLinks

Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.)

☆487

Alternatives and similar repositories for mpi-operator

Users that are interested in mpi-operator are comparing it to the libraries listed below

Sorting:

NVIDIA / gpu-feature-discovery
GPU plugin to the node feature discovery for Kubernetes
☆302Updated last year
kubedl-io / kubedl
Run your deep learning workloads on Kubernetes more easily and efficiently.
☆526Updated last year
volcano-sh / devices
Device plugins for Volcano, e.g. GPU
☆126Updated 4 months ago
kubeflow / common
Common APIs and libraries shared by other Kubeflow operator repositories.
☆52Updated 2 years ago
Mellanox / k8s-rdma-shared-dev-plugin
☆283Updated last week
elastic-ai / elastic-gpu-scheduler
elastic-gpu-scheduler is a Kubernetes scheduler extender for GPU resources scheduling.
☆142Updated 2 years ago
NVIDIA / k8s-dra-driver-gpu
NVIDIA DRA Driver for GPUs
☆402Updated this week
Deepomatic / shared-gpu-nvidia-k8s-device-plugin
Fork of NVIDIA device plugin for Kubernetes with support for shared GPUs by declaring GPUs multiple times
☆88Updated 3 years ago
AliyunContainerService / et-operator
Kubernetes Operator for AI and Bigdata Elastic Training
☆87Updated 6 months ago
AliyunContainerService / gpushare-device-plugin
GPU Sharing Device Plugin for Kubernetes Cluster
☆487Updated 2 years ago
kubeflow / arena
A CLI for Kubeflow.
☆786Updated last week
NVIDIA / mig-parted
MIG Partition Editor for NVIDIA GPUs
☆207Updated this week
pokerfaceSad / GPUMounter
A kubernetes plugin which enables dynamically add or remove GPU resources for a running Pod
☆127Updated 3 years ago
NTHU-LSALAB / KubeShare
Share GPU between Pods in Kubernetes
☆211Updated 2 years ago
awslabs / aws-virtual-gpu-device-plugin
AWS virtual gpu device plugin provides capability to use smaller virtual gpus for your machine learning inference workloads
☆205Updated last year
skai-x / elastic-jupyter-operator
Cloud-native way to provide elastic Jupyter Notebooks on Kubernetes. Run remote kernels, natively.
☆201Updated 3 years ago
kubernetes-retired / kube-batch
A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC
☆1,091Updated 2 years ago
kube-queue / kube-queue
☆119Updated 2 years ago
kubernetes-sigs / lws
LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
☆529Updated this week
tkestack / gpu-admission
☆132Updated 4 years ago
NVIDIA / go-gpuallocator
Go Abstraction for Allocating NVIDIA GPUs with Custom Policies
☆116Updated last month
kubedl-io / morphling
Automatic tuning for ML model deployment on Kubernetes
☆80Updated 9 months ago
kubernetes-sigs / jobset
JobSet: a k8s native API for distributed ML training and HPC workloads
☆246Updated this week
Project-HAMi / HAMi-core
HAMi-core compiles libvgpu.so, which ensures hard limit on GPU in container
☆195Updated this week
project-codeflare / multi-cluster-app-dispatcher
Holistic job manager on Kubernetes
☆117Updated last year
Mellanox / network-operator
NVIDIA Network Operator
☆268Updated last week
tkestack / vcuda-controller
☆533Updated last year
kserve / modelmesh-serving
Controller for ModelMesh
☆239Updated last month
kleveross / ormb
Docker for Your ML/DL Models Based on OCI Artifacts
☆471Updated last year
hustcat / k8s-rdma-device-plugin
RDMA device plugin for Kubernetes
☆217Updated last year