SGD with compressed gradients and error-feedback: https://arxiv.org/abs/1901.09847
☆32Jul 25, 2024Updated last year
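For readers new to the idea, the error-feedback update described in the linked paper can be sketched as follows. This is a minimal illustration in pure Python, not the repository's code: the names `scaled_sign`, `ef_sgd`, and `quadratic_grad` are hypothetical, and the toy quadratic objective is an assumption chosen only to make the example runnable. Each step compresses the error-corrected step `p = lr * g + e`, applies the compressed update, and stores what the compressor dropped in `e` for the next step.

```python
def scaled_sign(p):
    """Scaled sign compressor: (||p||_1 / d) * sign(p), one sign bit per entry plus a scale."""
    scale = sum(abs(v) for v in p) / len(p)
    return [scale * ((v > 0) - (v < 0)) for v in p]


def ef_sgd(grad, x0, lr, steps):
    """Run error-feedback SGD for `steps` iterations; returns (iterate, residual error)."""
    x = list(x0)
    e = [0.0] * len(x)  # residual memory: what compression has dropped so far
    for _ in range(steps):
        g = grad(x)
        p = [lr * gi + ei for gi, ei in zip(g, e)]  # error-corrected step
        delta = scaled_sign(p)                      # compressed update that is applied
        x = [xi - di for xi, di in zip(x, delta)]
        e = [pi - di for pi, di in zip(p, delta)]   # keep the compression error
    return x, e


def quadratic_grad(x):
    """Gradient of the toy objective 0.5 * (x0^2 + 4 * x1^2) (an assumption for this demo)."""
    return [1.0 * x[0], 4.0 * x[1]]


x_final, e_final = ef_sgd(quadratic_grad, [1.0, -2.0], lr=0.1, steps=50)
```

A useful property of this scheme is that `x - e` follows the uncompressed update: after every step, `x - e` equals the previous `x - e` minus `lr * g`, so no gradient information is permanently discarded by the compressor.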
Alternatives and similar repositories for error-feedback-SGD
Users who are interested in error-feedback-SGD are comparing it to the libraries listed below
- QSGD-TF☆21May 15, 2019Updated 6 years ago
- Sparsified SGD with Memory: https://arxiv.org/abs/1809.07599☆58Oct 25, 2018Updated 7 years ago
- Code for the signSGD paper☆93Jan 12, 2021Updated 5 years ago
- gTop-k S-SGD: A Communication-Efficient Distributed Synchronous SGD Algorithm for Deep Learning☆37Aug 19, 2019Updated 6 years ago
- Simple Hierarchical Count Sketch in Python☆21Jun 3, 2021Updated 4 years ago
- PyTorch for benchmarking communication-efficient distributed SGD optimization algorithms☆78Aug 30, 2021Updated 4 years ago
- Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727☆149Oct 29, 2024Updated last year
- Decentralized SGD and Consensus with Communication Compression: https://arxiv.org/abs/1907.09356☆75Sep 10, 2020Updated 5 years ago
- YALL1: Your ALgorithms for L1☆13Jan 28, 2018Updated 8 years ago
- Code for reproducing experiments performed for Accordion☆13Jun 11, 2021Updated 4 years ago
- Understanding Top-k Sparsification in Distributed Deep Learning☆24Nov 15, 2019Updated 6 years ago
- Stochastic Gradient Push for Distributed Deep Learning☆171Apr 5, 2023Updated 2 years ago
- Q-RR, DIANA-RR, Q-NASTYA, NASTYA-DIANA, QSGD, DIANA, FedCOM and FedPAQ on logistic loss with L2 regularization☆11Nov 1, 2022Updated 3 years ago
- [ICLR 2018] Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training☆226Jul 10, 2024Updated last year
- A compressed adaptive optimizer for training large-scale deep learning models using PyTorch☆25Nov 26, 2019Updated 6 years ago
- ☆12Mar 1, 2024Updated 2 years ago
- An implementation of the research paper "DEEP GRADIENT COMPRESSION: REDUCING THE COMMUNICATION BANDWIDTH FOR DISTRIBUTED TRAINING". Deep g…☆18Aug 14, 2019Updated 6 years ago
- The code for the paper "QuAFL: Federated Averaging Can Be Both Asynchronous and Communication-Efficient"☆17Mar 26, 2023Updated 2 years ago
- Sketched SGD☆28Jul 4, 2020Updated 5 years ago
- Machine Learning Course From Scratch☆13Jul 24, 2024Updated last year
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c…☆27Dec 10, 2022Updated 3 years ago
- Communication-efficient decentralized SGD (PyTorch)☆24Mar 17, 2020Updated 6 years ago
- Code related to "Beyond spectral gap: The role of the topology in decentralized learning".☆14Jun 7, 2022Updated 3 years ago
- Presentations of the advanced topics in optimization☆11Oct 30, 2019Updated 6 years ago
- Code for "Adaptive Gradient Quantization for Data-Parallel SGD", published in NeurIPS 2020.☆30Jan 14, 2021Updated 5 years ago
- Partial implementation of paper "DEEP GRADIENT COMPRESSION: REDUCING THE COMMUNICATION BANDWIDTH FOR DISTRIBUTED TRAINING"☆32Nov 20, 2020Updated 5 years ago
- Artifacts of VLDB'22 paper "COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression"☆10Aug 2, 2022Updated 3 years ago
- ☆10Jun 4, 2021Updated 4 years ago
- Code for paper "Learning a Code: Machine Learning for Approximate Non-Linear Coded-Computation"☆11Dec 21, 2020Updated 5 years ago
- ☆10May 4, 2018Updated 7 years ago
- Implementation of Compressed SGD with Compressed Gradients in PyTorch☆13Jul 25, 2024Updated last year
- SC 2021, "LogECMem: Coupling Erasure-Coded In-Memory Key-Value Stores with Parity Logging"☆12Jul 12, 2021Updated 4 years ago
- An attempt to replicate the paper "Multi-shot Pedestrian Re-identification via Sequential Decision Making (CVPR2018)"☆10Nov 16, 2019Updated 6 years ago
- Certifying Some Distributional Robustness with Principled Adversarial Training (https://arxiv.org/abs/1710.10571)☆45May 1, 2018Updated 7 years ago
- ☆10Sep 3, 2017Updated 8 years ago
- Implementation of learning rate finder in TensorFlow☆12Mar 4, 2019Updated 7 years ago
- This is an implementation of ResNet-34 in TensorFlow2.0 using the Imperative API (subclassing tensorflow.keras.Model)☆12Dec 11, 2020Updated 5 years ago
- Implementation of the SuRP algorithm by the authors of the AISTATS 2022 paper "An Information-Theoretic Justification for Model Pruning".…☆14May 4, 2022Updated 3 years ago
- RP-GAN: Stable GAN Training with Random Projections☆22Jun 27, 2018Updated 7 years ago