This guide should help fellow researchers and hobbyists easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster.
☆816Oct 3, 2022Updated 3 years ago
Alternatives and similar repositories for Kubernetes-GPU-Guide
Users who are interested in Kubernetes-GPU-Guide are comparing it to the libraries listed below.
- Machine Learning Toolkit for Kubernetes☆15,639May 7, 2026Updated last week
- A GPU / device extension framework for Kubernetes☆364Jun 27, 2023Updated 2 years ago
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle☆3,706Apr 26, 2026Updated 3 weeks ago
- An on-premises, bare-metal solution for deploying GPU-powered applications in containers☆259Jun 2, 2016Updated 9 years ago
- Build and run Docker containers leveraging NVIDIA GPUs☆17,544Dec 6, 2023Updated 2 years ago
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.☆14,690Dec 1, 2025Updated 5 months ago
- Integration of TensorFlow with other open-source frameworks☆1,376Sep 25, 2024Updated last year
- A batch-optimized scaling manager for Kubernetes☆872Jun 7, 2019Updated 6 years ago
- Distributed AI Model Training and LLM Fine-Tuning on Kubernetes☆2,100Updated this week
- 👩🔬 Train and Serve TensorFlow Models at Scale with Kubernetes and Kubeflow on Azure☆289Nov 13, 2020Updated 5 years ago
- How to setup a production-grade Kubernetes GPU cluster on Paperspace in 10 minutes for $10☆38Oct 23, 2020Updated 5 years ago
- An all-in-one Docker image for deep learning. Contains all the popular DL frameworks (TensorFlow, Theano, Torch, Caffe, etc.)☆3,858Aug 21, 2019Updated 6 years ago
- A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC☆1,094May 22, 2023Updated 2 years ago
- Setup and customize deep learning environment in seconds.☆6,277Mar 25, 2026Updated last month
- Docker, Kubernetes and Gravity Trainings by Gravitational☆2,029Jun 28, 2023Updated 2 years ago
- ☆61Oct 16, 2017Updated 8 years ago
- NVIDIA device plugin for Kubernetes☆3,755Updated this week
- GPU Sharing Scheduler for Kubernetes Cluster☆1,531Dec 29, 2023Updated 2 years ago
- Easy benchmarking of all publicly accessible implementations of convnets☆2,688Jun 9, 2017Updated 8 years ago
- Data-Centric Pipelines and Data Versioning☆6,294Feb 3, 2025Updated last year
- The hypervisor-based container runtime for Kubernetes.☆675Dec 15, 2020Updated 5 years ago
- Tools for building GPU clusters☆1,435Apr 27, 2026Updated 3 weeks ago
- Deep learning with dynamic computation graphs in TensorFlow☆1,818Jun 26, 2021Updated 4 years ago
- Notes on running a distributed, GPU-enabled TensorFlow program on Kubernetes☆19Sep 5, 2017Updated 8 years ago
- Kubernetes + TensorFlow Workshop☆20Apr 14, 2017Updated 9 years ago
- Kubernetes cluster federation tutorial☆453Sep 16, 2017Updated 8 years ago
- Tutorials and implementations for "Self-normalizing networks"☆1,587May 12, 2026Updated last week
- Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine☆251May 12, 2026Updated last week
- TensorFlow tutorials and best practices.☆8,591Oct 22, 2020Updated 5 years ago
- Distributed Deep learning with Keras & Spark☆1,579May 1, 2023Updated 3 years ago
- ☆1,650Sep 11, 2018Updated 7 years ago
- Visualizations for machine learning datasets☆7,352May 24, 2023Updated 2 years ago
- Bare metal, self-hosted, self-healing/provisioning, mesh network kubernetes cluster☆452Mar 12, 2020Updated 6 years ago
- Caffe2 is a lightweight, modular, and scalable deep learning framework.☆8,382Feb 7, 2023Updated 3 years ago
- Papers about deep learning ordered by task, date. Current state-of-the-art papers are labelled.☆3,187Dec 21, 2019Updated 6 years ago
- The most cited deep learning papers☆26,133Jan 18, 2024Updated 2 years ago
- Tutorial on how to deploy Deep Learning models on GPU enabled Kubernetes cluster☆76Feb 1, 2019Updated 7 years ago
- TensorFlow-based neural network library☆9,917May 6, 2026Updated last week
- TensorFlow - A curated list of dedicated resources http://tensorflow.org☆17,531Feb 8, 2026Updated 3 months ago