cnvrg / metagpu
K8s device plugin for GPU sharing
☆96 Updated last year
Related projects
Alternatives and complementary repositories for metagpu
- JobSet: a k8s native API for distributed ML training and HPC workloads ☆150 Updated this week
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆140 Updated this week
- Example DRA driver that developers can fork and modify to get them started writing their own. ☆53 Updated 2 weeks ago
- CAPK is a provider for Cluster API (CAPI) that allows users to deploy fake, Kubemark-backed machines to their clusters. ☆64 Updated 3 weeks ago
- Kubernetes Image Puller is used for caching images on a cluster. It creates a DaemonSet downloading and running the relevant container im… ☆209 Updated last week
- Kubernetes-in-Kubernetes Made Simple ☆83 Updated last year
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆57 Updated this week
- Smart Kubernetes Scheduling ☆69 Updated this week
- Sidecar container that watches Kubernetes PersistentVolumeClaims objects and triggers controller side expansion operation against a CSI e… ☆126 Updated last week
- ☆54 Updated 2 weeks ago
- CAAPH uses Helm charts to manage the installation and lifecycle of Cluster API add-ons. ☆125 Updated this week
- ☆48 Updated 8 months ago
- LLM Instance gateway implementation. ☆76 Updated this week
- K8s Node Health Check Operator ☆96 Updated last week
- Operator for Multi-Cluster Monitoring with Thanos. ☆125 Updated this week
- Manage admission policies in your Kubernetes cluster with ease ☆195 Updated this week
- Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes ☆270 Updated this week
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆52 Updated 2 weeks ago
- Automatic repair for unhealthy Kubernetes nodes ☆45 Updated last week
- elastic-gpu-scheduler is a Kubernetes scheduler extender for GPU resources scheduling. ☆135 Updated 2 years ago
- Metal³ integration with https://github.com/kubernetes-sigs/cluster-api ☆211 Updated this week
- The kernel module management operator builds, signs and loads kernel modules in Kubernetes clusters. ☆91 Updated this week
- This repo contains sidecar controller and agent for volume health monitoring. ☆66 Updated this week
- The Sail Operator is able to install and manage the lifecycle of the Istio control plane in a Kubernetes or OpenShift cluster. ☆40 Updated this week
- ☆52 Updated last week
- A collection of community maintained NRI plugins ☆63 Updated this week
- A Topology-Aware Custom Scheduler For Kubernetes ☆62 Updated last year
- InstaSlice facilitates the use of Dynamic Resource Allocation (DRA) on Kubernetes clusters for GPU sharing ☆24 Updated 3 months ago
- Kubernetes in Kubernetes ☆183 Updated last week
- ☆88 Updated this week