yandex-research / moshpit-sgd

"Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices", official implementation

☆29

Alternatives and similar repositories for moshpit-sgd

Users who are interested in moshpit-sgd are comparing it to the libraries listed below.

HazyResearch / mongoose
A Learnable LSH Framework for Efficient NN Training
☆32 Updated 4 years ago
microsoft / fnl_paper
Factorized Neural Layers
☆29 Updated 2 years ago
mryab / learning-at-home
"Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts" (NeurIPS 2020), original PyTorch implemen…
☆57 Updated 4 years ago
juntang-zhuang / ACProp-Optimizer
Implementation for ACProp (Momentum centering and asynchronous update for adaptive gradient methods, NeurIPS 2021)
☆16 Updated 3 years ago
learning-at-home / lean_transformer
Memory-efficient transformer. Work in progress.
☆19 Updated 2 years ago
automl / zero-shot-automl-with-pretrained-models
Official repository for the paper "Zero-Shot AutoML with Pretrained Models"
☆47 Updated last year
DS3Lab / AC-SGD
Code associated with the paper **Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees**.
☆28 Updated 2 years ago
JeanKaddour / LAWA
Latest Weight Averaging (NeurIPS HITY 2022)
☆31 Updated 2 years ago
princeton-nlp / DataMUX
[NeurIPS 2022] DataMUX: Data Multiplexing for Neural Networks
☆60 Updated 2 years ago
RobertCsordas / linear_layer_as_attention
The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …
☆16 Updated last month
aoiang / LaMOO
☆29 Updated 2 years ago
Distributed-AI / PipeTransformer
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021
☆56 Updated 4 years ago
kshitij12345 / torchnnprofiler
Context Manager to profile the forward and backward times of PyTorch's nn.Module
☆83 Updated last year
lucidrains / deep-linear-network
A simple implementation of a deep linear Pytorch module
☆21 Updated 4 years ago
usyd-fsalab / NeuralNetworkRandomness
☆14 Updated 3 years ago
IntelLabs / DyNAS-T
Dynamic Neural Architecture Search Toolkit
☆30 Updated 8 months ago
jxbz / nero
👑 Pytorch code for the Nero optimiser.
☆20 Updated 2 years ago
petuum / tuun
Hyperparameter tuning via uncertainty modeling
☆47 Updated last year
ChristophReich1996 / HyperMixer
PyTorch reimplementation of the paper "HyperMixer: An MLP-based Green AI Alternative to Transformers" [arXiv 2022].
☆17 Updated 3 years ago
titu1994 / simple_diffusion
Simple notebooks to learn diffusion models on toy datasets
☆17 Updated 2 years ago
tanyuqian / redco
NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
☆66 Updated 7 months ago
facebookresearch / Data_Acquisition_for_ML_Benchmark
DAM Data Acquisition for ML Benchmark, as part of the DataPerf benchmark suite, https://dataperf.org/
☆24 Updated 2 years ago
YuhanLiu11 / AutoFreeze
☆22 Updated 4 years ago
lxuechen / ml-swissknife
An ML research codebase built with friends :)
☆24 Updated 11 months ago
mit-han-lab / neurips-micronet
[JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion
☆40 Updated 4 years ago
hwang595 / Cuttlefish
The implementation for MLSys 2023 paper: "Cuttlefish: Low-rank Model Training without All The Tuning"
☆45 Updated 2 years ago
lucidrains / kronecker-attention-pytorch
Implementation of Kronecker Attention in Pytorch
☆19 Updated 4 years ago
jbr-ai-labs / bbo-challenge-jetbrains-research
Code for Solving Black-Box Optimization Challenge via Learning Search Space Partition for Local Bayesian Optimization.
☆21 Updated 3 years ago
jiaweizzhao / ZerO-initialization
☆74 Updated 2 years ago
MadryLab / modeldiff
ModelDiff: A Framework for Comparing Learning Algorithms
☆59 Updated last year