coreweave / kubernetes-cloud
Getting Started with the CoreWeave Kubernetes GPU Cloud
☆71Updated 2 months ago
Alternatives and similar repositories for kubernetes-cloud
Users that are interested in kubernetes-cloud are comparing it to the libraries listed below
Sorting:
- ☆31Updated this week
- Module, Model, and Tensor Serialization/Deserialization☆227Updated this week
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes.☆96Updated this week
- Infrastructure as code for GPU accelerated managed Kubernetes clusters.☆54Updated 2 weeks ago
- ☆207Updated last week
- Pipeline is an open source python SDK for building AI/ML workflows☆133Updated 7 months ago
- ⚡Instant Stable Diffusion on k8s(Kubernetes) with Helm☆92Updated last year
- NVIDIA NCCL Tests for Distributed Training☆90Updated last week
- GPU plugin to the node feature discovery for Kubernetes☆300Updated 11 months ago
- A top-like tool for monitoring GPUs in a cluster☆86Updated last year
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers.☆111Updated last week
- Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine☆230Updated last week
- Run Slurm on Kubernetes. A Slinky project.☆88Updated 2 weeks ago
- ClearML Fractional GPU - Run multiple containers on the same GPU with driver level memory limitation ✨ and compute time-slicing☆78Updated 9 months ago
- Horizontal Pod Autoscaling for Kubernetes using Nvidia GPU Metrics☆32Updated 4 years ago
- NVIDIA device plugin for Kubernetes☆48Updated last year
- Kuberentes LangChain Agent - Interact with Kubernetes Clusters using LLMs☆26Updated 2 years ago
- Cortex-compatible model server for Python and TensorFlow☆17Updated 2 years ago
- A Slurm cluster for Kubernetes☆57Updated 9 months ago
- ☆92Updated last year
- ☆12Updated last year
- Holistic job manager on Kubernetes☆115Updated last year
- markdown docs☆88Updated last week
- JobSet: a k8s native API for distributed ML training and HPC workloads☆226Updated last week
- Argoflow has been superseded by deployKF☆137Updated last year
- ☆24Updated last week
- Deploy your HPC Cluster on AWS in 20min. with just 1-Click.☆55Updated last month
- ☆248Updated last week
- Container plugin for Slurm Workload Manager☆340Updated 6 months ago
- This repo includes everything you need to know about deploying GPU nodes on OCI☆29Updated last week