This guide should help fellow researchers and hobbyists easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster.
☆816 · Oct 3, 2022 · Updated 3 years ago
Alternatives and similar repositories for Kubernetes-GPU-Guide
Users that are interested in Kubernetes-GPU-Guide are comparing it to the libraries listed below.
- Machine Learning Toolkit for Kubernetes ☆15,600 · Jan 5, 2026 · Updated 3 months ago
- A GPU / device extension framework for Kubernetes ☆364 · Jun 27, 2023 · Updated 2 years ago
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle ☆3,701 · Updated this week
- An on-premises, bare-metal solution for deploying GPU-powered applications in containers ☆259 · Jun 2, 2016 · Updated 9 years ago
- Build and run Docker containers leveraging NVIDIA GPUs ☆17,540 · Dec 6, 2023 · Updated 2 years ago
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. ☆14,692 · Dec 1, 2025 · Updated 4 months ago
- Integration of TensorFlow with other open-source frameworks ☆1,376 · Sep 25, 2024 · Updated last year
- A batch-optimized scaling manager for Kubernetes ☆872 · Jun 7, 2019 · Updated 6 years ago
- Distributed AI Model Training and LLM Fine-Tuning on Kubernetes ☆2,094 · Updated this week
- 👩‍🔬 Train and Serve TensorFlow Models at Scale with Kubernetes and Kubeflow on Azure ☆289 · Nov 13, 2020 · Updated 5 years ago
- An all-in-one Docker image for deep learning. Contains all the popular DL frameworks (TensorFlow, Theano, Torch, Caffe, etc.) ☆3,861 · Aug 21, 2019 · Updated 6 years ago
- A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC ☆1,094 · May 22, 2023 · Updated 2 years ago
- Setup and customize deep learning environment in seconds. ☆6,277 · Mar 25, 2026 · Updated last month
- Docker, Kubernetes and Gravity Trainings by Gravitational ☆2,028 · Jun 28, 2023 · Updated 2 years ago
- ☆61 · Oct 16, 2017 · Updated 8 years ago
- NVIDIA device plugin for Kubernetes ☆3,729 · Updated this week
- GPU Sharing Scheduler for Kubernetes Cluster ☆1,531 · Dec 29, 2023 · Updated 2 years ago
- Easy benchmarking of all publicly accessible implementations of convnets ☆2,688 · Jun 9, 2017 · Updated 8 years ago
- Data-Centric Pipelines and Data Versioning ☆6,297 · Feb 3, 2025 · Updated last year
- The hypervisor-based container runtime for Kubernetes. ☆676 · Dec 15, 2020 · Updated 5 years ago
- Tools for building GPU clusters ☆1,430 · Feb 23, 2026 · Updated 2 months ago
- Deep learning with dynamic computation graphs in TensorFlow ☆1,817 · Jun 26, 2021 · Updated 4 years ago
- Notes on running a distributed, GPU-enabled TensorFlow program on Kubernetes ☆19 · Sep 5, 2017 · Updated 8 years ago
- Kubernetes + TensorFlow Workshop ☆20 · Apr 14, 2017 · Updated 9 years ago
- Kubernetes cluster federation tutorial ☆453 · Sep 16, 2017 · Updated 8 years ago
- Tutorials and implementations for "Self-normalizing networks" ☆1,588 · Dec 9, 2025 · Updated 4 months ago
- Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine ☆251 · Apr 22, 2026 · Updated last week
- Simplified building of NVIDIA drivers for CoreOS Linux ☆61 · Jun 22, 2019 · Updated 6 years ago
- TensorFlow tutorials and best practices. ☆8,597 · Oct 22, 2020 · Updated 5 years ago
- Distributed Deep learning with Keras & Spark ☆1,579 · May 1, 2023 · Updated 2 years ago
- ☆1,651 · Sep 11, 2018 · Updated 7 years ago
- Visualizations for machine learning datasets ☆7,356 · May 24, 2023 · Updated 2 years ago
- Bare metal, self-hosted, self-healing/provisioning, mesh network kubernetes cluster ☆452 · Mar 12, 2020 · Updated 6 years ago
- Caffe2 is a lightweight, modular, and scalable deep learning framework. ☆8,387 · Feb 7, 2023 · Updated 3 years ago
- Papers about deep learning ordered by task, date. Current state-of-the-art papers are labelled. ☆3,187 · Dec 21, 2019 · Updated 6 years ago
- The most cited deep learning papers ☆26,121 · Jan 18, 2024 · Updated 2 years ago
- Tutorial on how to deploy Deep Learning models on GPU enabled Kubernetes cluster ☆76 · Feb 1, 2019 · Updated 7 years ago
- TensorFlow-based neural network library ☆9,918 · Feb 10, 2026 · Updated 2 months ago
- TensorFlow - A curated list of dedicated resources http://tensorflow.org ☆17,573 · Feb 8, 2026 · Updated 2 months ago