yandex-research / moshpit-sgd
"Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices", official implementation
☆28Updated last year
Related projects:
- Code associated with the paper **Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees**.☆24Updated last year
- PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021☆54Updated 3 years ago
- Code release to reproduce ASHA experiments from "Random Search and Reproducibility for NAS."☆20Updated 4 years ago
- Deadline-based hyperparameter tuning on RayTune.☆31Updated 4 years ago
- Implementation for ACProp (Momentum centering and asynchronous update for adaptive gradient methods, NeurIPS 2021)☆15Updated 2 years ago
- Factorized Neural Layers☆27Updated last year
- A "gym" style toolkit for building lightweight NAS systems.☆13Updated 2 years ago
- A Sparse-tensor Communication Framework for Distributed Deep Learning☆13Updated 2 years ago
- NeurIPS 2021 - Few-shot learning competition☆26Updated 2 years ago
- DL Dataloader Benchmarks☆18Updated last week
- "Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts" (NeurIPS 2020), original PyTorch implemen…☆53Updated 3 years ago
- An adaptive training algorithm for residual network☆14Updated 4 years ago
- Libraries for efficient and scalable group-structured dataset pipelines.☆22Updated 5 months ago
- An Efficient and General Framework for Layerwise-Adaptive Gradient Compression☆10Updated 10 months ago
- Some microbenchmarks and design docs before commencement☆12Updated 3 years ago
- Code for the paper "Secure Distributed Training at Scale" (ICML 2022)☆14Updated 2 years ago
- [ICDCS 2023] DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining☆12Updated 9 months ago
- Code for the CVPR 2021 paper: Understanding Failures of Deep Networks via Robust Feature Extraction☆35Updated 2 years ago
- Implementation of Kronecker Attention in Pytorch☆17Updated 4 years ago
- Hyperparameter tuning via uncertainty modeling☆46Updated 4 months ago
- [ICLR 2022] "Sparsity Winning Twice: Better Robust Generalization from More Efficient Training" by Tianlong Chen*, Zhenyu Zhang*, Pengjun…☆37Updated 2 years ago
- PyTorch implementation of HashedNets☆35Updated last year
- Code for "The Expressive Power of Low-Rank Adaptation".☆17Updated 5 months ago
- This code reproduces the results of the paper, "Measuring Data Leakage in Machine-Learning Models with Fisher Information"☆48Updated 3 years ago