cake-lab / transient-deep-learning
Repo for transient training paper at ICAC 2019.
☆11Updated 3 years ago
Alternatives and similar repositories for transient-deep-learning
Users who are interested in transient-deep-learning compare it to the repositories listed below
- My Master's thesis on Distributed Deep Learning (parallelizing gradient descent) and other concepts I explored during my research.☆26Updated 8 years ago
- Simulated large clusters for Kubernetes scheduler validation.☆15Updated 3 years ago
- ☆13Updated 7 years ago
- Studying GPU Multi-tenancy☆11Updated 7 years ago
- ☆19Updated 8 years ago
- Deadline-based hyperparameter tuning on RayTune.☆32Updated 6 years ago
- HTTP Load Generator for variable load intensities. Supports request scripting and power consumption measurements.☆17Updated 6 years ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions☆21Updated 3 years ago
- Machine Learning Inference Graph Spec☆21Updated 6 years ago
- Validation Generation for Kubeflow CRD on Kubernetes☆11Updated 5 years ago
- Deep Learning Benchmarking Suite☆130Updated 3 years ago
- Experiments API for Experiment Tracking on Kubernetes☆27Updated 3 years ago
- Exploiting Cloud Services for Cost-Effective, SLO-Aware Machine Learning Inference Serving☆37Updated 6 years ago
- This is the course taught by Prof. John Shen and Prof. Onur Mutlu from CMU☆11Updated 9 years ago
- Awesome Resources on Distributed Computing and Distributed Systems☆18Updated 6 years ago
- Scoreboard for ONNX Backend Compatibility☆29Updated last month
- Serverless for all computation☆42Updated 2 years ago
- GPU scheduler for elastic/distributed deep learning workloads in Kubernetes cluster (IC2E'23)☆34Updated 2 years ago
- Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore.☆296Updated last year
- Using deep learning and CNN model☆11Updated 6 years ago
- Run multiple independent anomaly detection (object flaws and motor defects) workloads on a single system via multiple virtual machines us…☆23Updated 3 years ago
- GraphPipe for go☆65Updated 6 years ago
- Splits a single Nvidia GPU into multiple partitions with complete compute and memory isolation (with respect to performance) between the partitions☆165Updated 6 years ago
- Tools for ML/MXNet on Kubernetes.☆44Updated 7 years ago
- Find the original image of the converted image with elastic search☆22Updated 8 years ago
- Intelligent platform for AI workloads☆37Updated 3 years ago
- Static analysis framework for analyzing programs written in TVM's Relay IR.☆29Updated 6 years ago
- A Tool for Automatic Parallelization of Deep Learning Training in Distributed Multi-GPU Environments.☆132Updated 3 years ago
- An analytical performance modeling tool for deep neural networks.☆92Updated 5 years ago
- An Efficient Dynamic Resource Scheduler for Deep Learning Clusters☆41Updated 8 years ago