This guide should help fellow researchers and hobbyists easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster.
☆814 · Oct 3, 2022 · Updated 3 years ago
Alternatives and similar repositories for Kubernetes-GPU-Guide
Users that are interested in Kubernetes-GPU-Guide are comparing it to the libraries listed below.
- Machine Learning Toolkit for Kubernetes ☆15,700 · May 24, 2026 · Updated 2 weeks ago
- A GPU / device extension framework for Kubernetes ☆364 · Jun 27, 2023 · Updated 2 years ago
- Open Source AI Infra & Engineering Control Plane ☆3,708 · May 29, 2026 · Updated last week
- An on-premises, bare-metal solution for deploying GPU-powered applications in containers ☆259 · Jun 2, 2016 · Updated 10 years ago
- Build and run Docker containers leveraging NVIDIA GPUs ☆17,550 · Dec 6, 2023 · Updated 2 years ago
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. ☆14,693 · Dec 1, 2025 · Updated 6 months ago
- Integration of TensorFlow with other open-source frameworks ☆1,377 · Sep 25, 2024 · Updated last year
- A batch-optimized scaling manager for Kubernetes ☆872 · Jun 7, 2019 · Updated 7 years ago
- Distributed AI Model Training and LLM Fine-Tuning on Kubernetes ☆2,110 · Updated this week
- 👩‍🔬 Train and Serve TensorFlow Models at Scale with Kubernetes and Kubeflow on Azure ☆289 · Nov 13, 2020 · Updated 5 years ago
- How to setup a production-grade Kubernetes GPU cluster on Paperspace in 10 minutes for $10 ☆38 · Oct 23, 2020 · Updated 5 years ago
- An all-in-one Docker image for deep learning. Contains all the popular DL frameworks (TensorFlow, Theano, Torch, Caffe, etc.) ☆3,856 · Aug 21, 2019 · Updated 6 years ago
- A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC ☆1,093 · May 22, 2023 · Updated 3 years ago
- Setup and customize deep learning environment in seconds. ☆6,275 · Mar 25, 2026 · Updated 2 months ago
- Docker, Kubernetes and Gravity Trainings by Gravitational ☆2,030 · Jun 28, 2023 · Updated 2 years ago
- ☆61 · Oct 16, 2017 · Updated 8 years ago
- NVIDIA device plugin for Kubernetes ☆3,782 · Updated this week
- GPU Sharing Scheduler for Kubernetes Cluster ☆1,533 · Dec 29, 2023 · Updated 2 years ago
- Easy benchmarking of all publicly accessible implementations of convnets ☆2,690 · Jun 9, 2017 · Updated 8 years ago
- Data-Centric Pipelines and Data Versioning ☆6,293 · Feb 3, 2025 · Updated last year
- The hypervisor-based container runtime for Kubernetes. ☆675 · Dec 15, 2020 · Updated 5 years ago
- Tools for building GPU clusters ☆1,442 · Updated this week
- Deep learning with dynamic computation graphs in TensorFlow ☆1,819 · Jun 26, 2021 · Updated 4 years ago
- Note about run distributed GPU enabled tensorflow program on kubernetes ☆19 · Sep 5, 2017 · Updated 8 years ago
- Kubernetes cluster federation tutorial ☆453 · Sep 16, 2017 · Updated 8 years ago
- Tutorials and implementations for "Self-normalizing networks" ☆1,588 · May 12, 2026 · Updated 3 weeks ago
- Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine ☆254 · Jun 2, 2026 · Updated last week
- TensorFlow tutorials and best practices. ☆8,589 · Oct 22, 2020 · Updated 5 years ago
- Distributed Deep learning with Keras & Spark ☆1,580 · May 1, 2023 · Updated 3 years ago
- ☆1,649 · Sep 11, 2018 · Updated 7 years ago
- Visualizations for machine learning datasets ☆7,342 · May 24, 2023 · Updated 3 years ago
- Bare metal, self-hosted, self-healing/provisioning, mesh network kubernetes cluster ☆452 · Mar 12, 2020 · Updated 6 years ago
- Caffe2 is a lightweight, modular, and scalable deep learning framework. ☆8,377 · Feb 7, 2023 · Updated 3 years ago
- Papers about deep learning ordered by task, date. Current state-of-the-art papers are labelled. ☆3,184 · Dec 21, 2019 · Updated 6 years ago
- The most cited deep learning papers ☆26,153 · Jan 18, 2024 · Updated 2 years ago
- Tutorial on how to deploy Deep Learning models on GPU enabled Kubernetes cluster ☆76 · Feb 1, 2019 · Updated 7 years ago
- TensorFlow-based neural network library ☆9,920 · May 6, 2026 · Updated last month
- TensorFlow - A curated list of dedicated resources http://tensorflow.org ☆17,539 · Feb 8, 2026 · Updated 4 months ago
- Learning to Learn in TensorFlow ☆4,069 · Jun 29, 2021 · Updated 4 years ago