Simple Distributed Deep Learning on TensorFlow
☆134 · Feb 5, 2026 · Updated last month
Alternatives and similar repositories for autodist
Users interested in autodist are comparing it to the libraries listed below.
- Hyperparameter tuning via uncertainty modeling ☆49 · May 3, 2024 · Updated last year
- Resource-adaptive cluster scheduler for deep learning training. ☆453 · Mar 5, 2023 · Updated 3 years ago
- An extensible framework for building visualization and annotation tools to enable better interaction with NLP and Artificial Intelligence… ☆49 · Feb 4, 2023 · Updated 3 years ago
- Cavs: An Efficient Runtime System for Dynamic Neural Networks ☆15 · Sep 18, 2020 · Updated 5 years ago
- Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/ ☆250 · Feb 5, 2024 · Updated 2 years ago
- Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore. ☆295 · Feb 23, 2024 · Updated 2 years ago
- Release doc/tutorial/wheels for poseidon-tf ☆10 · Jan 18, 2018 · Updated 8 years ago
- ☆42 · Sep 8, 2023 · Updated 2 years ago
- Group-meeting collection of the HKUST System NetworkING (SING) research group. ☆27 · Oct 3, 2019 · Updated 6 years ago
- Analyze network performance in distributed training ☆20 · Oct 20, 2020 · Updated 5 years ago
- Fine-grained GPU sharing primitives ☆147 · Jul 28, 2025 · Updated 7 months ago
- Switches for HIRE: Resource Scheduling for Data Center In-Network Computing ☆13 · Jan 18, 2021 · Updated 5 years ago
- A tensor-aware point-to-point communication primitive for machine learning ☆284 · Dec 17, 2025 · Updated 3 months ago
- GPU-scheduler-for-deep-learning ☆209 · Nov 5, 2020 · Updated 5 years ago
- A Tool for Automatic Parallelization of Deep Learning Training in Distributed Multi-GPU Environments. ☆131 · Feb 21, 2022 · Updated 4 years ago
- ☆44 · Sep 6, 2021 · Updated 4 years ago
- SMT-LIB benchmarks for shape computations from deep learning models in PyTorch ☆18 · Dec 21, 2022 · Updated 3 years ago
- Training and serving large-scale neural networks with auto parallelization. ☆3,187 · Dec 9, 2023 · Updated 2 years ago
- ☆13 · Mar 27, 2019 · Updated 6 years ago
- ☆38 · Jun 27, 2025 · Updated 8 months ago
- Implementation of Parameter Server using PyTorch communication lib ☆42 · Apr 7, 2019 · Updated 6 years ago
- Some microbenchmarks and design docs before commencement ☆11 · Feb 1, 2021 · Updated 5 years ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections ☆125 · Jun 23, 2022 · Updated 3 years ago
- Slides from 2021-12-15 talk, "TVM Developer Bootcamp – Writing Hardware Backends" ☆11 · Jan 20, 2022 · Updated 4 years ago
- ddl-benchmarks: Benchmarks for Distributed Deep Learning ☆36 · May 29, 2020 · Updated 5 years ago
- ☆24 · Nov 24, 2018 · Updated 7 years ago
- GRACE - GRAdient ComprEssion for distributed deep learning ☆139 · Jul 23, 2024 · Updated last year
- PMLS-Caffe: Distributed Deep Learning Framework for Parallel ML System ☆193 · May 10, 2018 · Updated 7 years ago
- ☆392 · Nov 4, 2022 · Updated 3 years ago
- Distributed DRL by Ray and TensorFlow Tutorial. ☆10 · Dec 26, 2019 · Updated 6 years ago
- High Performance Grouped GEMM in PyTorch ☆30 · May 10, 2022 · Updated 3 years ago
- Kubernetes Scheduler for Deep Learning ☆264 · May 22, 2022 · Updated 3 years ago
- RDMA Optimization on MXNet ☆14 · Nov 12, 2017 · Updated 8 years ago
- Multi-GPU/distributed training script in TensorFlow 1.x. ☆17 · Nov 6, 2019 · Updated 6 years ago
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation ☆27 · Nov 7, 2019 · Updated 6 years ago
- A fully adaptive, zero-tuning parameter manager that enables efficient distributed machine learning training ☆21 · Feb 23, 2023 · Updated 3 years ago
- Deep exponential family models in MXNet/Gluon. Layers o' latents 💤 ☆17 · Oct 16, 2017 · Updated 8 years ago
- A high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters ☆160 · Apr 20, 2024 · Updated last year
- Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion ☆32 · May 15, 2024 · Updated last year