This guide should help fellow researchers and hobbyists easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster.
☆816 · Oct 3, 2022 · Updated 3 years ago
Alternatives and similar repositories for Kubernetes-GPU-Guide
Users who are interested in Kubernetes-GPU-Guide are comparing it to the libraries listed below.
- Machine Learning Toolkit for Kubernetes · ☆15,519 · Jan 5, 2026 · Updated 2 months ago
- A GPU / device extension framework for Kubernetes · ☆364 · Jun 27, 2023 · Updated 2 years ago
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle · ☆3,697 · Mar 9, 2026 · Updated last week
- An on-premises, bare-metal solution for deploying GPU-powered applications in containers · ☆259 · Jun 2, 2016 · Updated 9 years ago
- Build and run Docker containers leveraging NVIDIA GPUs · ☆17,511 · Dec 6, 2023 · Updated 2 years ago
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. · ☆14,679 · Dec 1, 2025 · Updated 3 months ago
- Integration of TensorFlow with other open-source frameworks · ☆1,374 · Sep 25, 2024 · Updated last year
- A batch-optimized scaling manager for Kubernetes · ☆871 · Jun 7, 2019 · Updated 6 years ago
- Distributed AI Model Training and LLM Fine-Tuning on Kubernetes · ☆2,056 · Updated this week
- 👩‍🔬 Train and Serve TensorFlow Models at Scale with Kubernetes and Kubeflow on Azure · ☆289 · Nov 13, 2020 · Updated 5 years ago
- An all-in-one Docker image for deep learning. Contains all the popular DL frameworks (TensorFlow, Theano, Torch, Caffe, etc.) · ☆3,863 · Aug 21, 2019 · Updated 6 years ago
- A batch scheduler for Kubernetes for high-performance workloads, e.g. AI/ML, BigData, HPC · ☆1,092 · May 22, 2023 · Updated 2 years ago
- Setup and customize deep learning environment in seconds. · ☆6,280 · Jan 29, 2023 · Updated 3 years ago
- Docker, Kubernetes and Gravity Trainings by Gravitational · ☆2,028 · Jun 28, 2023 · Updated 2 years ago
- ☆61 · Oct 16, 2017 · Updated 8 years ago
- NVIDIA device plugin for Kubernetes · ☆3,699 · Updated this week
- GPU Sharing Scheduler for Kubernetes Cluster · ☆1,530 · Dec 29, 2023 · Updated 2 years ago
- Easy benchmarking of all publicly accessible implementations of convnets · ☆2,688 · Jun 9, 2017 · Updated 8 years ago
- Data-Centric Pipelines and Data Versioning · ☆6,290 · Feb 3, 2025 · Updated last year
- The hypervisor-based container runtime for Kubernetes. · ☆676 · Dec 15, 2020 · Updated 5 years ago
- Tools for building GPU clusters · ☆1,424 · Feb 23, 2026 · Updated 3 weeks ago
- Deep learning with dynamic computation graphs in TensorFlow · ☆1,823 · Jun 26, 2021 · Updated 4 years ago
- Notes on running a distributed, GPU-enabled TensorFlow program on Kubernetes · ☆19 · Sep 5, 2017 · Updated 8 years ago
- Kubernetes + TensorFlow Workshop · ☆20 · Apr 14, 2017 · Updated 8 years ago
- Kubernetes cluster federation tutorial · ☆453 · Sep 16, 2017 · Updated 8 years ago
- Tutorials and implementations for "Self-normalizing networks" · ☆1,589 · Dec 9, 2025 · Updated 3 months ago
- Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine · ☆250 · Updated this week
- Simplified building of NVIDIA drivers for CoreOS Linux · ☆61 · Jun 22, 2019 · Updated 6 years ago
- TensorFlow tutorials and best practices. · ☆8,609 · Oct 22, 2020 · Updated 5 years ago
- Distributed Deep learning with Keras & Spark · ☆1,578 · May 1, 2023 · Updated 2 years ago
- ☆1,653 · Sep 11, 2018 · Updated 7 years ago
- Visualizations for machine learning datasets · ☆7,366 · May 24, 2023 · Updated 2 years ago
- Bare metal, self-hosted, self-healing/provisioning, mesh network kubernetes cluster · ☆452 · Mar 12, 2020 · Updated 6 years ago
- Caffe2 is a lightweight, modular, and scalable deep learning framework. · ☆8,396 · Feb 7, 2023 · Updated 3 years ago
- Papers about deep learning ordered by task, date. Current state-of-the-art papers are labelled. · ☆3,189 · Dec 21, 2019 · Updated 6 years ago
- The most cited deep learning papers · ☆26,097 · Jan 18, 2024 · Updated 2 years ago
- Tutorial on how to deploy Deep Learning models on GPU enabled Kubernetes cluster · ☆77 · Feb 1, 2019 · Updated 7 years ago
- TensorFlow-based neural network library · ☆9,906 · Feb 10, 2026 · Updated last month
- TensorFlow - A curated list of dedicated resources http://tensorflow.org · ☆17,737 · Feb 8, 2026 · Updated last month