Langhalsdino / Kubernetes-GPU-Guide
This guide should help fellow researchers and hobbyists easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster.
☆817 Updated 3 years ago
Alternatives and similar repositories for Kubernetes-GPU-Guide
Users who are interested in Kubernetes-GPU-Guide are comparing it to the libraries listed below.
- A batch-optimized scaling manager for Kubernetes ☆873 Updated 6 years ago
- A GPU / device extension framework for Kubernetes ☆365 Updated 2 years ago
- Distributed TensorFlow basics and examples of training algorithms ☆643 Updated 7 years ago
- PyTorch on Kubernetes ☆309 Updated 4 years ago
- Compilation of Dockerfiles with automated builds enabled on the Docker Registry ☆503 Updated 6 years ago
- Simple wrapper for docker-compose to use GPU enabled docker under nvidia-docker ☆225 Updated 7 years ago
- Fabric for Deep Learning (FfDL, pronounced fiddle) is a Deep Learning Platform offering TensorFlow, Caffe, PyTorch etc. as a Service on K… ☆692 Updated 2 weeks ago
- Annotated notes and summaries of the TensorFlow white paper, along with SVG figures and links to documentation ☆433 Updated 7 years ago
- A TensorBoard plugin for visualizing arbitrary tensors in a video as your network trains. ☆463 Updated 7 years ago
- Distributed AI Model Training and LLM Fine-Tuning on Kubernetes ☆2,028 Updated this week
- Automated Machine Learning on Kubernetes ☆1,658 Updated this week
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle ☆3,695 Updated this week
- Start Tensorboard in Jupyter Notebook ☆461 Updated 3 years ago
- An on-premises, bare-metal solution for deploying GPU-powered applications in containers ☆259 Updated 9 years ago
- A CNN visualizer ☆1,002 Updated 7 years ago
- Integration of TensorFlow with other open-source frameworks ☆1,374 Updated last year
- A benchmark framework for Tensorflow ☆1,146 Updated 2 years ago
- Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark. ☆623 Updated 7 years ago
- A multi-user, distributed computing environment for running DL model training experiments on Intel® Xeon® Scalable processor-based system… ☆393 Updated last year
- A REST API for Caffe using Docker and Go ☆421 Updated 7 years ago
- How to setup a production-grade Kubernetes GPU cluster on Paperspace in 10 minutes for $10 ☆38 Updated 5 years ago
- Input pipeline framework ☆988 Updated 6 months ago
- Studio: Simplify and expedite model building process ☆380 Updated last year
- Machine Learning Model Deployment Made Simple ☆716 Updated 7 years ago
- A Tutorial for Serving Tensorflow Models using Kubernetes ☆88 Updated 3 months ago
- A domain specific language to express machine learning workloads. ☆1,765 Updated 2 years ago
- A CLI for Kubeflow. ☆810 Updated 2 weeks ago
- A low-latency prediction-serving system ☆1,422 Updated 4 years ago
- PyTorch elastic training ☆728 Updated 3 years ago
- DAWNBench: An End-to-End Deep Learning Benchmark and Competition ☆263 Updated 5 years ago