kubedl-io / morphling

Automatic tuning for ML model deployment on Kubernetes

☆81

Alternatives and similar repositories for morphling

Users who are interested in morphling are comparing it to the repositories listed below.

AliyunContainerService / et-operator
Kubernetes Operator for AI and Bigdata Elastic Training
☆89 Updated 10 months ago
alibaba / GPU-scheduler-for-deep-learning
GPU-scheduler-for-deep-learning
☆210 Updated 5 years ago
microsoft / hivedscheduler
Kubernetes Scheduler for Deep Learning
☆262 Updated 3 years ago
pokerfaceSad / GPUMounter
A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod
☆129 Updated 3 years ago
kubedl-io / kubedl
Run your deep learning workloads on Kubernetes more easily and efficiently.
☆532 Updated last year
kleveross / ftlib
Fault tolerance for DL frameworks
☆70 Updated 2 years ago
NTHU-LSALAB / KubeShare
Share GPU between Pods in Kubernetes
☆214 Updated 2 years ago
volcano-sh / devices
Device plugins for Volcano, e.g. GPU
☆130 Updated 7 months ago
elastic-ai / elastic-gpu-scheduler
elastic-gpu-scheduler is a Kubernetes scheduler extender for GPU resources scheduling.
☆145 Updated 2 years ago
heyfey / vodascheduler
GPU scheduler for elastic/distributed deep learning workloads in Kubernetes cluster (IC2E'23)
☆34 Updated 2 years ago
kubeflow / common
Common APIs and libraries shared by other Kubeflow operator repositories.
☆53 Updated 2 years ago
NTHU-LSALAB / Gemini
An efficient GPU resource sharing system with fine-grained control for Linux platforms.
☆85 Updated last year
tkestack / gpu-admission
☆132 Updated 4 years ago
tkestack / vcuda-controller
☆538 Updated last year
PaddleFlow / paddle-operator
Elastic Deep Learning Training based on Kubernetes by Leveraging EDL and Volcano
☆32 Updated 2 years ago
king-jingxiang / pod-gpushare-metrics-exporter
Forked from
☆11 Updated 4 years ago
sgl-project / rbg
A workload for deploying LLM inference services on Kubernetes
☆99 Updated this week
Mellanox / k8s-rdma-shared-dev-plugin
☆305 Updated this week
Project-HAMi / HAMi-core
HAMi-core compiles libvgpu.so, which ensures hard limit on GPU in container
☆247 Updated last month
hustcat / k8s-rdma-device-plugin
RDMA device plugin for Kubernetes
☆222 Updated last year
Mr-Linus / SCV
SCV is a distributed cluster GPU sniffer.
☆21 Updated 2 years ago
sakjain92 / Fractional-GPUs
Splits a single Nvidia GPU into multiple partitions with complete compute and memory isolation (with respect to performance) between the partitions
☆160 Updated 6 years ago
Mellanox / k8s-rdma-sriov-dev-plugin
Kubernetes Rdma SRIOV device plugin
☆111 Updated 4 years ago
petuum / adaptdl
Resource-adaptive cluster scheduler for deep learning training.
☆448 Updated 2 years ago
NVIDIA / go-gpuallocator
Go Abstraction for Allocating NVIDIA GPUs with Custom Policies
☆117 Updated last month
SymbioticLab / Salus
Fine-grained GPU sharing primitives
☆146 Updated 3 months ago
skai-x / elastic-jupyter-operator
Cloud-native way to provide elastic Jupyter Notebooks on Kubernetes. Run remote kernels, natively.
☆204 Updated 3 years ago
kubeflow / mpi-operator
Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.)
☆497 Updated last week
kube-queue / kube-queue
☆122 Updated 3 years ago
kzhang28 / Optimus
An Efficient Dynamic Resource Scheduler for Deep Learning Clusters
☆42 Updated 8 years ago