A Sparse-tensor Communication Framework for Distributed Deep Learning
☆13Nov 1, 2021Updated 4 years ago
Alternatives and similar repositories for DeepReduce
Users who are interested in DeepReduce are comparing it to the libraries listed below.
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c…☆27Dec 10, 2022Updated 3 years ago
- ☆10Jun 4, 2021Updated 4 years ago
- ☆11Oct 25, 2023Updated 2 years ago
- We present a set of all-reduce compatible gradient compression algorithms which significantly reduce the communication overhead while mai…☆10Nov 14, 2021Updated 4 years ago
- Code for reproducing experiments performed for Accordion☆13Jun 11, 2021Updated 4 years ago
- ☆10Nov 25, 2023Updated 2 years ago
- Reducing P4 Language’s Voluminosity using Higher-Level Constructs☆15Oct 15, 2022Updated 3 years ago
- [ICDCS 2023] DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining☆12Dec 4, 2023Updated 2 years ago
- Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies (EuroSys '2…☆15Sep 21, 2023Updated 2 years ago
- ☆68Mar 14, 2023Updated 2 years ago
- A parallel programming model for online applications with complex synchronization requirements.☆16Jun 8, 2022Updated 3 years ago
- Poise source code repo☆12Aug 12, 2020Updated 5 years ago
- Recycling Model Updates in Federated Learning: Are Gradient Subspaces Low-Rank?☆15Mar 24, 2022Updated 3 years ago
- ☆16Apr 22, 2025Updated 10 months ago
- Elixir: Train a Large Language Model on a Small GPU Cluster☆15Jun 8, 2023Updated 2 years ago
- THC: Accelerating Distributed Deep Learning Using Tensor Homomorphic Compression☆20Jul 30, 2024Updated last year
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo☆17Mar 13, 2023Updated 2 years ago
- ☆26Dec 22, 2024Updated last year
- Planter is a modular framework for realising in one-click in-network machine learning algorithms.☆25Jun 13, 2024Updated last year
- ☆19Jan 9, 2025Updated last year
- ☆64Jun 25, 2024Updated last year
- GRACE - GRAdient ComprEssion for distributed deep learning☆139Jul 23, 2024Updated last year
- Personal Digest of NAS (Under Construction 🛠)☆25Nov 24, 2020Updated 5 years ago
- Laplacian Change Point Detection for Dynamic Graphs (KDD 2020)☆29Jul 13, 2023Updated 2 years ago
- A compressed adaptive optimizer for training large-scale deep learning models using PyTorch☆25Nov 26, 2019Updated 6 years ago
- Really Elastic Ray Engine☆29Aug 8, 2024Updated last year
- Flowrest: in-switch flow-level classification with random forests☆35Feb 2, 2026Updated last month
- Artifacts of EuroSys'24 paper "Exploring Performance and Cost Optimization with ASIC-Based CXL Memory"☆31Feb 21, 2024Updated 2 years ago
- A Learnable LSH Framework for Efficient NN Training☆34Jul 22, 2021Updated 4 years ago
- Scaling Up Memory Disaggregated Applications with SMART☆34Apr 23, 2024Updated last year
- ☆37Jan 14, 2025Updated last year
- released code for the paper: ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding☆31Nov 24, 2020Updated 5 years ago
- Code for the paper "FlowLens: Enabling Efficient Flow Classification for ML-based Network Security Applications" [NDSS '21]☆38Jan 16, 2021Updated 5 years ago
- Ancestral Gumbel-Top-k Sampling☆25Apr 11, 2020Updated 5 years ago
- Prefix-Aware Attention for LLM Decoding☆29Jan 23, 2026Updated last month
- μP4: A framework for programming dataplane of network devices☆34Aug 4, 2020Updated 5 years ago
- Machine learning on serverless platform☆10Jul 2, 2022Updated 3 years ago
- Mu: Microsecond Consensus for Microsecond Applications☆42Oct 12, 2020Updated 5 years ago
- ☆33Updated this week