coreweave / kubernetes-cloud
Getting Started with the CoreWeave Kubernetes GPU Cloud
☆68 Updated last week
Related projects
Alternatives and complementary repositories for kubernetes-cloud
- Module, Model, and Tensor Serialization/Deserialization ☆188 Updated last month
- Kubernetes Operator, Ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆78 Updated this week
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers. ☆74 Updated this week
- ☆25 Updated this week
- GPU plugin to the node feature discovery for Kubernetes ☆293 Updated 5 months ago
- ⚡Instant Stable Diffusion on k8s (Kubernetes) with Helm ☆90 Updated last year
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆57 Updated this week
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. ☆54 Updated 7 months ago
- ☆64 Updated this week
- GPU Environment Management for Visual Studio Code ☆35 Updated last year
- Infrastructure as code for GPU accelerated managed Kubernetes clusters. ☆47 Updated 6 months ago
- A top-like tool for monitoring GPUs in a cluster ☆81 Updated 9 months ago
- NVIDIA NCCL Tests for Distributed Training ☆70 Updated 2 weeks ago
- MIG Partition Editor for NVIDIA GPUs ☆174 Updated this week
- The NVIDIA Driver Manager is a Kubernetes component that assists in seamless upgrades of the NVIDIA driver on each node of the cluster. ☆33 Updated 3 weeks ago
- JobSet: a k8s native API for distributed ML training and HPC workloads ☆150 Updated this week
- Distributed Model Serving Framework ☆154 Updated last month
- ☆38 Updated 2 months ago
- Run cloud native workloads on NVIDIA GPUs ☆134 Updated this week
- Google TPU optimizations for transformers models ☆75 Updated this week
- Pipeline is an open source Python SDK for building AI/ML workflows ☆130 Updated 2 months ago
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆140 Updated this week
- Collection of tools and examples for managing accelerated workloads in Kubernetes Engine ☆215 Updated this week
- ☆120 Updated this week
- Running Stable Diffusion with Metaflow ☆33 Updated 9 months ago
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work! ☆29 Updated this week
- NVIDIA k8s device plugin for Kubevirt ☆232 Updated last month
- Fork of NVIDIA device plugin for Kubernetes with support for shared GPUs by declaring GPUs multiple times ☆88 Updated 2 years ago
- Karras et al. (2022) diffusion models for PyTorch ☆19 Updated 5 months ago
- Holistic job manager on Kubernetes ☆108 Updated 9 months ago