This guide should help fellow researchers and hobbyists easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster.
☆816 · Oct 3, 2022 · Updated 3 years ago
Alternatives and similar repositories for Kubernetes-GPU-Guide
Users that are interested in Kubernetes-GPU-Guide are comparing it to the libraries listed below.
- Machine Learning Toolkit for Kubernetes · ☆15,552 · Jan 5, 2026 · Updated 3 months ago
- A GPU / device extension framework for Kubernetes · ☆364 · Jun 27, 2023 · Updated 2 years ago
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle · ☆3,702 · Updated this week
- An on-premises, bare-metal solution for deploying GPU-powered applications in containers · ☆259 · Jun 2, 2016 · Updated 9 years ago
- Build and run Docker containers leveraging NVIDIA GPUs · ☆17,526 · Dec 6, 2023 · Updated 2 years ago
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. · ☆14,686 · Dec 1, 2025 · Updated 4 months ago
- Integration of TensorFlow with other open-source frameworks · ☆1,374 · Sep 25, 2024 · Updated last year
- A batch-optimized scaling manager for Kubernetes · ☆872 · Jun 7, 2019 · Updated 6 years ago
- Distributed AI Model Training and LLM Fine-Tuning on Kubernetes · ☆2,074 · Updated this week
- 👩‍🔬 Train and Serve TensorFlow Models at Scale with Kubernetes and Kubeflow on Azure · ☆289 · Nov 13, 2020 · Updated 5 years ago
- How to setup a production-grade Kubernetes GPU cluster on Paperspace in 10 minutes for $10 · ☆38 · Oct 23, 2020 · Updated 5 years ago
- An all-in-one Docker image for deep learning. Contains all the popular DL frameworks (TensorFlow, Theano, Torch, Caffe, etc.) · ☆3,864 · Aug 21, 2019 · Updated 6 years ago
- A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC · ☆1,092 · May 22, 2023 · Updated 2 years ago
- Setup and customize deep learning environment in seconds. · ☆6,278 · Mar 25, 2026 · Updated 2 weeks ago
- Docker, Kubernetes and Gravity Trainings by Gravitational · ☆2,027 · Jun 28, 2023 · Updated 2 years ago
- ☆61 · Oct 16, 2017 · Updated 8 years ago
- NVIDIA device plugin for Kubernetes · ☆3,716 · Apr 2, 2026 · Updated last week
- GPU Sharing Scheduler for Kubernetes Cluster · ☆1,532 · Dec 29, 2023 · Updated 2 years ago
- Easy benchmarking of all publicly accessible implementations of convnets · ☆2,689 · Jun 9, 2017 · Updated 8 years ago
- Data-Centric Pipelines and Data Versioning · ☆6,293 · Feb 3, 2025 · Updated last year
- The hypervisor-based container runtime for Kubernetes. · ☆676 · Dec 15, 2020 · Updated 5 years ago
- Tools for building GPU clusters · ☆1,430 · Feb 23, 2026 · Updated last month
- Deep learning with dynamic computation graphs in TensorFlow · ☆1,823 · Jun 26, 2021 · Updated 4 years ago
- Kubernetes + TensorFlow Workshop · ☆20 · Apr 14, 2017 · Updated 8 years ago
- Kubernetes cluster federation tutorial · ☆453 · Sep 16, 2017 · Updated 8 years ago
- Tutorials and implementations for "Self-normalizing networks" · ☆1,588 · Dec 9, 2025 · Updated 4 months ago
- Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine · ☆251 · Updated this week
- Simplified building of NVIDIA drivers for CoreOS Linux · ☆61 · Jun 22, 2019 · Updated 6 years ago
- TensorFlow tutorials and best practices. · ☆8,601 · Oct 22, 2020 · Updated 5 years ago
- Distributed Deep learning with Keras & Spark · ☆1,578 · May 1, 2023 · Updated 2 years ago
- ☆1,652 · Sep 11, 2018 · Updated 7 years ago
- Visualizations for machine learning datasets · ☆7,361 · May 24, 2023 · Updated 2 years ago
- Bare metal, self-hosted, self-healing/provisioning, mesh network kubernetes cluster · ☆452 · Mar 12, 2020 · Updated 6 years ago
- Caffe2 is a lightweight, modular, and scalable deep learning framework. · ☆8,390 · Feb 7, 2023 · Updated 3 years ago
- Papers about deep learning ordered by task, date. Current state-of-the-art papers are labelled. · ☆3,186 · Dec 21, 2019 · Updated 6 years ago
- The most cited deep learning papers · ☆26,105 · Jan 18, 2024 · Updated 2 years ago
- Tutorial on how to deploy Deep Learning models on GPU enabled Kubernetes cluster · ☆77 · Feb 1, 2019 · Updated 7 years ago
- TensorFlow-based neural network library · ☆9,910 · Feb 10, 2026 · Updated last month
- TensorFlow - A curated list of dedicated resources http://tensorflow.org · ☆17,712 · Feb 8, 2026 · Updated 2 months ago