awslabs / dynamic-training-with-apache-mxnet-on-aws
Dynamic training with Apache MXNet reduces the cost and time of training deep neural networks by leveraging AWS cloud elasticity and scale. It does so by dynamically resizing the training cluster during training, with minimal impact on model accuracy.
☆56 · Nov 25, 2022 · Updated 3 years ago
Alternatives and similar repositories for dynamic-training-with-apache-mxnet-on-aws
Users who are interested in dynamic-training-with-apache-mxnet-on-aws are comparing it to the libraries listed below.
- A Kubernetes operator for mxnet jobs ☆52 · Dec 1, 2021 · Updated 4 years ago
- BytePS examples (Vision, NLP, GAN, etc) ☆19 · Nov 24, 2022 · Updated 3 years ago
- Simulated large clusters for Kubernetes scheduler validation. ☆15 · Jan 3, 2023 · Updated 3 years ago
- Custom Scheduler to deploy ML models to TRTIS for GPU Sharing ☆11 · Apr 1, 2020 · Updated 5 years ago
- A fast & easy way to train ML models in your cloud, directly from your laptop. ☆14 · Mar 28, 2022 · Updated 3 years ago
- A Caffe version of official PyTorch ResNeSt ☆27 · Jul 3, 2020 · Updated 5 years ago
- Logging MXNet data for visualization in TensorBoard. ☆324 · Nov 30, 2021 · Updated 4 years ago
- The scheduler of Volcano, built based on kubernetes-sigs/kube-batch ☆14 · Jul 7, 2019 · Updated 6 years ago
- GPU analyzer for Kubernetes GPU clusters ☆17 · Apr 11, 2020 · Updated 5 years ago
- NVIDIA device plugin for Kubernetes ☆15 · Sep 9, 2019 · Updated 6 years ago
- A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra… ☆18 · Jan 5, 2023 · Updated 3 years ago
- An Efficient Dynamic Resource Scheduler for Deep Learning Clusters ☆41 · Oct 28, 2017 · Updated 8 years ago
- Lightning Fast: Faiss CPU + Onnx Quantized Multilingual Embedding Model ☆23 · Sep 13, 2024 · Updated last year
- Batch-scheduler based on K8s scheduling framework, related features have contributed to scheduler-plugins (Deprecated). ☆25 · Aug 6, 2020 · Updated 5 years ago
- Examples for using Amazon SageMaker components in Kubeflow Pipelines ☆22 · Jun 2, 2020 · Updated 5 years ago
- Static analysis framework for analyzing programs written in TVM's Relay IR. ☆29 · Oct 31, 2019 · Updated 6 years ago
- Content for cloud computing workshop ☆15 · Apr 20, 2018 · Updated 7 years ago
- ☆12 · Jan 7, 2023 · Updated 3 years ago
- The schedule of the seminar ☆25 · Dec 28, 2021 · Updated 4 years ago
- A collection of common util libraries for Go ☆25 · Oct 25, 2020 · Updated 5 years ago
- Supporting code, Dockerfile, and Jupyter notebook for an end to end tutorial on Amazon SageMaker and EMR. ☆28 · Jan 14, 2026 · Updated last month
- Elastic Deep Learning for deep learning framework on Kubernetes ☆175 · Jul 5, 2023 · Updated 2 years ago
- Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling ☆30 · Feb 25, 2021 · Updated 4 years ago
- Deadline-based hyperparameter tuning on RayTune. ☆32 · Jan 16, 2020 · Updated 6 years ago
- Design pattern for orchestrating an incremental data ingestion pipeline using AWS Step Functions from an on premise location into an Amaz… ☆28 · Jul 24, 2019 · Updated 6 years ago
- ☆118 · Oct 18, 2023 · Updated 2 years ago
- ☆34 · May 2, 2022 · Updated 3 years ago
- Some microbenchmarks and design docs before commencement ☆12 · Feb 1, 2021 · Updated 5 years ago
- A simple pytorch implementation of CenterNet (CenterNet: Keypoint Triplets for Object Detection) ☆33 · Mar 23, 2020 · Updated 5 years ago
- Fork of Tensorpack to make breaking performance improvements to the Mask RCNN example. Training is approximately 2x faster than the origi… ☆39 · Feb 12, 2025 · Updated last year
- torch code to decode (and almost encode) latents from art-DCGAN's Portrait GAN ☆38 · Nov 6, 2018 · Updated 7 years ago
- A solution describing data-processing design pattern for streaming data through Kinesis and Spark Streaming at real-time. ☆39 · Jun 11, 2024 · Updated last year
- MXNet Gluon Synchronized Batch Normalization Preview ☆77 · Jul 16, 2018 · Updated 7 years ago
- ☆11 · Dec 20, 2023 · Updated 2 years ago
- Codespace with Airflow and the Astro CLI ☆11 · May 23, 2023 · Updated 2 years ago
- Digit classification with Convolutional Neural Networks using Keras ☆20 · May 12, 2018 · Updated 7 years ago
- rabitq rust implementation ☆10 · Feb 4, 2026 · Updated last week
- ☆11 · Oct 31, 2019 · Updated 6 years ago
- Fastened CROWN: Tightened Neural Network Robustness Certificates ☆10 · Feb 10, 2020 · Updated 6 years ago