Dynamic training with Apache MXNet reduces the cost and time of training deep neural networks by leveraging AWS cloud elasticity and scale. It does so by dynamically resizing the training cluster while training is in progress, with minimal impact on model accuracy.
☆56 · Nov 25, 2022 · Updated 3 years ago
Alternatives and similar repositories for dynamic-training-with-apache-mxnet-on-aws
Users who are interested in dynamic-training-with-apache-mxnet-on-aws are comparing it to the libraries listed below.
- A Kubernetes operator for mxnet jobs ☆52 · Dec 1, 2021 · Updated 4 years ago
- Logging MXNet data for visualization in TensorBoard. ☆324 · Nov 30, 2021 · Updated 4 years ago
- BytePS examples (Vision, NLP, GAN, etc) ☆19 · Nov 24, 2022 · Updated 3 years ago
- ☆118 · Oct 18, 2023 · Updated 2 years ago
- Studying GPU Multi-tenancy ☆11 · Jan 11, 2019 · Updated 7 years ago
- Simulated large clusters for Kubernetes scheduler validation. ☆15 · Jan 3, 2023 · Updated 3 years ago
- A fast & easy way to train ML models in your cloud, directly from your laptop. ☆14 · Mar 28, 2022 · Updated 4 years ago
- ☆10 · Jul 29, 2020 · Updated 5 years ago
- Elastic Deep Learning for deep learning framework on Kubernetes ☆176 · Jul 5, 2023 · Updated 2 years ago
- amazon-sagemaker-cdk-examples uses AWS CDK to simplify common architectures in machine learning operations using SageMaker and other AWS s… ☆69 · Mar 28, 2024 · Updated 2 years ago
- Intel(R) Machine Learning Scaling Library is a library providing an efficient implementation of communication patterns used in deep learn… ☆108 · Jan 7, 2023 · Updated 3 years ago
- KDD18 Tutorial: Deep Learning and Natural Language Processing with Apache MXNet (Incubating) Gluon ☆172 · Jan 15, 2019 · Updated 7 years ago
- Amazon SageMaker MLOps deployment pipeline for A/B Testing of machine learning models. ☆45 · Jun 7, 2021 · Updated 5 years ago
- Repository for AWS DBS Reference Architectures - Enterprise Data Warehousing ☆35 · Jan 17, 2019 · Updated 7 years ago
- ☆10 · Apr 22, 2021 · Updated 5 years ago
- https://beta.mxnet.io/ ☆13 · Jul 25, 2019 · Updated 6 years ago
- Examples of building probabilistic models with MXNet linear algebra operators ☆23 · Oct 24, 2017 · Updated 8 years ago
- Amazon Elastic Inference tools and utilities. ☆17 · Apr 8, 2020 · Updated 6 years ago
- SDN project 2019 on Mininet ☆13 · Aug 3, 2023 · Updated 2 years ago
- The objective of Cloud Builders' Day repository is to provide do-it-yourself lab guides for several AWS services including but not limite… ☆11 · Aug 20, 2020 · Updated 5 years ago
- the hadoop plugin for chdfs ☆15 · Feb 27, 2026 · Updated 4 months ago
- Benchmarks for NumPy compatible frameworks. ☆16 · Jan 6, 2026 · Updated 5 months ago
- MXNet Gluon Synchronized Batch Normalization Preview ☆77 · Jul 16, 2018 · Updated 7 years ago
- A PyTorch implementation of paper "Visualizing and Understanding Recurrent Networks" ☆10 · Mar 16, 2018 · Updated 8 years ago
- GPU analyzer for Kubernetes GPU clusters ☆16 · Apr 11, 2020 · Updated 6 years ago
- ☆11 · Dec 20, 2023 · Updated 2 years ago
- Running shell commands becomes easy in Julia! ☆12 · Jul 21, 2020 · Updated 5 years ago
- MXNet implementation of Graph Convolutional Neural Networks ☆20 · Oct 8, 2018 · Updated 7 years ago
- This sample code demonstrates how to build an Amazon SageMaker environment for HPO using Optuna (an open source hyperparameter tuning fra… ☆11 · May 21, 2024 · Updated 2 years ago
- ☆14 · May 30, 2019 · Updated 7 years ago
- Old Reinforcement Learning research from university ☆10 · Jan 4, 2017 · Updated 9 years ago
- pre-loadable library tracking all memory allocations of a program. Simplified version of log-malloc2 ☆12 · Nov 29, 2021 · Updated 4 years ago
- The scheduler of Volcano, built based on kubernetes-sigs/kube-batch ☆14 · Jul 7, 2019 · Updated 6 years ago
- Deep learning benchmark utility and optimization tips on EKS. ☆47 · Aug 13, 2019 · Updated 6 years ago
- Code for the ICASSP-2021 paper: Don't shoot butterfly with rifles: Multi-channel Continuous Speech Separation with Early Exit Transformer ☆12 · Sep 2, 2021 · Updated 4 years ago
- ☆91 · Jul 21, 2022 · Updated 3 years ago
- SageMaker custom deployments made easy ☆63 · Mar 31, 2025 · Updated last year
- ☆14 · Apr 16, 2021 · Updated 5 years ago
- 2nd place solution of ECCV 2020 workshop VIPriors Image Classification Challenge, https://arxiv.org/abs/2008.00261 ☆13 · Aug 22, 2021 · Updated 4 years ago