4paradigm / k8s-vgpu-scheduler
The OpenAIOS vGPU device plugin for Kubernetes originated in the OpenAIOS project. It virtualizes GPU device memory so that applications can access a memory space larger than the GPU's physical capacity, and it is designed to make extended device memory easy to use for AI workloads.
☆562Updated 11 months ago
Alternatives and similar repositories for k8s-vgpu-scheduler:
Users interested in k8s-vgpu-scheduler are comparing it to the repositories listed below.
- ☆527Updated 11 months ago
- ☆875Updated last year
- GPU Sharing Device Plugin for Kubernetes Cluster☆480Updated 2 years ago
- Heterogeneous AI Computing Virtualization Middleware☆1,554Updated this week
- Using CRDs to manage GPU resources in Kubernetes.☆199Updated 2 years ago
- HAMi-core compiles libvgpu.so, which enforces hard limits on GPU usage in containers☆162Updated last week
- ☆132Updated 4 years ago
- GPU Sharing Scheduler for Kubernetes Cluster☆1,463Updated last year
- A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod☆125Updated 3 years ago
- Run your deep learning workloads on Kubernetes more easily and efficiently.☆521Updated last year
- Device plugins for Volcano, e.g. GPU☆119Updated last month
- ☆51Updated last month
- elastic-gpu-scheduler is a Kubernetes scheduler extender for GPU resources scheduling.☆140Updated 2 years ago
- One-click Kubeflow installation files for users in mainland China☆347Updated 2 years ago
- ☆274Updated last year
- ☆250Updated this week
- Kubeflow helm chart☆144Updated last year
- Device plugin for Volcano vGPU that supports hard resource isolation☆73Updated last week
- NVIDIA k8s device plugin for Kubevirt☆252Updated 3 weeks ago
- Large language model fine-tuning capabilities based on cloud native and distributed computing.☆92Updated last year
- Kubernetes Operator for AI and Bigdata Elastic Training☆85Updated 3 months ago
- Share GPU between Pods in Kubernetes☆209Updated 2 years ago
- Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes☆352Updated last week
- Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.)☆475Updated 3 weeks ago
- GPU plugin to the node feature discovery for Kubernetes☆300Updated 11 months ago
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication☆421Updated last week
- A CLI for Kubeflow.☆60Updated last year
- ☆35Updated 4 years ago
- Unified resource orchestration, unified scheduling, unified traffic management and unified telemetry for distributed cloud☆249Updated 6 months ago
- cloud-native local storage management system for stateful workload, low-latency with simplicity☆487Updated 4 months ago