Langhalsdino / Kubernetes-GPU-Guide
This guide should help fellow researchers and hobbyists easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster.
☆817Updated 2 years ago
Alternatives and similar repositories for Kubernetes-GPU-Guide:
Users who are interested in Kubernetes-GPU-Guide are comparing it to the repositories listed below:
- A batch-optimized scaling manager for Kubernetes☆868Updated 5 years ago
- Distributed TensorFlow basics and examples of training algorithms☆643Updated 6 years ago
- Automated Machine Learning on Kubernetes☆1,550Updated this week
- A GPU / device extension framework for Kubernetes☆363Updated last year
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle☆3,615Updated 3 weeks ago
- PyTorch on Kubernetes☆309Updated 3 years ago
- Distributed ML Training and Fine-Tuning on Kubernetes☆1,729Updated this week
- Compilation of Dockerfiles with automated builds enabled on the Docker Registry☆503Updated 5 years ago
- 👩‍🔬 Train and Serve TensorFlow Models at Scale with Kubernetes and Kubeflow on Azure☆291Updated 4 years ago
- A CLI for Kubeflow.☆760Updated last week
- A TensorBoard plugin for visualizing arbitrary tensors in a video as your network trains.☆461Updated 6 years ago
- Simple wrapper for docker-compose to use GPU enabled docker under nvidia-docker☆223Updated 6 years ago
- Tools for building GPU clusters☆1,317Updated this week
- Machine Learning Workflow, from Andrew Ng's lecture at Deep Learning Summer School 2016☆411Updated 8 years ago
- PyTorch elastic training☆730Updated 2 years ago
- Python SDK for building, training, and deploying ML models☆336Updated 2 years ago
- Start Tensorboard in Jupyter Notebook☆459Updated 2 years ago
- Annotated notes and summaries of the TensorFlow white paper, along with SVG figures and links to documentation☆435Updated 6 years ago
- A CLI-supported framework that streamlines writing and deployment of Kubernetes configurations to multiple clusters.☆1,160Updated 6 years ago
- ☆372Updated 7 years ago
- ☆1,659Updated 6 years ago
- Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine☆227Updated this week
- A batch scheduler for Kubernetes for high-performance workloads, e.g. AI/ML, BigData, HPC☆1,086Updated last year
- Fabric for Deep Learning (FfDL, pronounced fiddle) is a Deep Learning Platform offering TensorFlow, Caffe, PyTorch etc. as a Service on K…☆692Updated last year
- Open Source ML Model Versioning, Metadata, and Experiment Management☆1,719Updated 8 months ago
- Fork of NVIDIA device plugin for Kubernetes with support for shared GPUs by declaring GPUs multiple times☆88Updated 2 years ago
- Integration of TensorFlow with other open-source frameworks☆1,370Updated 5 months ago
- Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.☆622Updated 6 years ago
- Deep Learning Dockerfiles☆157Updated 3 years ago
- NVIDIA device plugin for Kubernetes☆3,087Updated this week