Simple Distributed Deep Learning on TensorFlow
☆135Feb 5, 2026Updated 3 months ago
Alternatives and similar repositories for autodist
Users that are interested in autodist are comparing it to the libraries listed below.
- Hyperparameter tuning via uncertainty modeling☆51May 3, 2024Updated 2 years ago
- Resource-adaptive cluster scheduler for deep learning training.☆459Mar 5, 2023Updated 3 years ago
- Cavs: An Efficient Runtime System for Dynamic Neural Networks☆15Sep 18, 2020Updated 5 years ago
- Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/☆252Feb 5, 2024Updated 2 years ago
- Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore.☆295Feb 23, 2024Updated 2 years ago
- Code for Generalized Zero-Shot Text Classification for ICD Coding (IJCAI 2020)☆17Jul 27, 2020Updated 5 years ago
- ☆42Sep 8, 2023Updated 2 years ago
- This is the Group-Meeting collections of HKUST System NetworkING (SING) Research Group.☆27Oct 3, 2019Updated 6 years ago
- Analyze network performance in distributed training☆20Oct 20, 2020Updated 5 years ago
- Release doc/tutorial/wheels for poseidon-tf☆10Jan 18, 2018Updated 8 years ago
- Fine-grained GPU sharing primitives☆148Jul 28, 2025Updated 9 months ago
- Switches for HIRE: Resource Scheduling for Data Center In-Network Computing☆13Jan 18, 2021Updated 5 years ago
- A tensor-aware point-to-point communication primitive for machine learning☆286Dec 17, 2025Updated 5 months ago
- GPU-scheduler-for-deep-learning☆209Nov 5, 2020Updated 5 years ago
- A Tool for Automatic Parallelization of Deep Learning Training in Distributed Multi-GPU Environments.☆130Feb 21, 2022Updated 4 years ago
- SMT-LIB benchmarks for shape computations from deep learning models in PyTorch☆18Dec 21, 2022Updated 3 years ago
- Training and serving large-scale neural networks with auto parallelization.☆3,187Dec 9, 2023Updated 2 years ago
- ☆38Jun 27, 2025Updated 10 months ago
- Implementation of Parameter Server using PyTorch communication lib☆41Apr 7, 2019Updated 7 years ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections☆126Jun 23, 2022Updated 3 years ago
- Some microbenchmarks and design docs before commencement☆11Feb 1, 2021Updated 5 years ago
- Slides from 2021-12-15 talk, "TVM Developer Bootcamp – Writing Hardware Backends"☆11Jan 20, 2022Updated 4 years ago
- ddl-benchmarks: Benchmarks for Distributed Deep Learning☆36May 29, 2020Updated 5 years ago
- ☆24Nov 24, 2018Updated 7 years ago
- GRACE - GRAdient ComprEssion for distributed deep learning☆140Jul 23, 2024Updated last year
- ☆393Nov 4, 2022Updated 3 years ago
- High Performance Grouped GEMM in PyTorch☆30May 10, 2022Updated 4 years ago
- Kubernetes Scheduler for Deep Learning☆263May 22, 2022Updated 4 years ago
- RDMA Optimization on MXNet☆14Nov 12, 2017Updated 8 years ago
- Multi-gpu/distributed training script in Tensorflow 1.x.☆17Nov 6, 2019Updated 6 years ago
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation☆26Nov 7, 2019Updated 6 years ago
- A fully adaptive, zero-tuning parameter manager that enables efficient distributed machine learning training☆21Feb 23, 2023Updated 3 years ago
- Deep exponential family models in MXNet/Gluon. Layers o' latents 💤☆17Oct 16, 2017Updated 8 years ago
- A high-performance framework for training wide-and-deep recommender systems on heterogeneous cluster☆162Apr 20, 2024Updated 2 years ago
- Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion☆32May 15, 2024Updated 2 years ago
- ☆220Aug 17, 2023Updated 2 years ago
- A high performance and generic framework for distributed DNN training☆3,720Oct 3, 2023Updated 2 years ago
- Code for "BayesAdapter: Being Bayesian, Inexpensively and Robustly, via Bayesian Fine-tuning"☆32Jul 25, 2024Updated last year
- Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training☆1,881Updated this week