yandex-research / btardLinks

Code for the paper "Secure Distributed Training at Scale" (ICML 2022)

☆16

Alternatives and similar repositories for btard

Users that are interested in btard are comparing it to the libraries listed below

Sorting:

mryab / learning-at-home
"Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts" (NeurIPS 2020), original PyTorch implemen…
☆56Updated 4 years ago
skolai / fewbit
Compression schema for gradients of activations in backward pass
☆44Updated last year
yandex-research / moshpit-sgd
"Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices", official implementation
☆29Updated 4 months ago
learning-at-home / lean_transformer
Memory-efficient transformer. Work in progress.
☆19Updated 2 years ago
FusionBrainLab / LLM-Microscope
☆70Updated 9 months ago
epfml / REQ
☆17Updated 11 months ago
DS3Lab / CocktailSGD
☆27Updated last year
tml-epfl / understanding-sam
Towards Understanding Sharpness-Aware Minimization [ICML 2022]
☆35Updated 2 years ago
tml-epfl / sgd-sparse-features
SGD with large step sizes learns sparse features [ICML 2023]
☆32Updated 2 years ago
yandex-research / DeDLOC
Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021)
☆116Updated 3 years ago
rollovd / LookSAM
This is unofficial repository for Towards Efficient and Scalable Sharpness-Aware Minimization.
☆36Updated last year
jfainberg / hashed_nets
PyTorch implementation of HashedNets
☆36Updated 2 years ago
JonasGeiping / fullbatchtraining
Training vision models with full-batch gradient descent and regularization
☆37Updated 2 years ago
zyushun / hessian-spectrum
Code for the paper: Why Transformers Need Adam: A Hessian Perspective
☆59Updated 2 months ago
davda54 / ada-hessian
Easy-to-use AdaHessian optimizer (PyTorch)
☆79Updated 4 years ago
xingchenwan / MLRG_DeepCurvature
☆15Updated 5 years ago
KellerJordan / REPAIR
Code release for REPAIR: REnormalizing Permuted Activations for Interpolation Repair
☆47Updated last year
GuillaumeLeclerc / fastargs
Python library for argument and configuration management
☆54Updated 2 years ago
JeanKaddour / NoTrainNoGain
Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023)
☆80Updated last year
epfml / powergossip
Code for "Practical Low-Rank Communication Compression in Decentralized Deep Learning"
☆16Updated 4 years ago
aks2203 / deep-thinking
A centralized place for deep thinking code and experiments
☆84Updated last year
VITA-Group / SMC-Bench
[ICLR 2023] "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!" Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen…
☆28Updated last year
mccrearyd / rigl-torch
Lightweight torch implementation of rigl, a sparse-to-sparse optimizer.
☆56Updated 3 years ago
hwang595 / Cuttlefish
The implementation for MLSys 2023 paper: "Cuttlefish: Low-rank Model Training without All The Tuning"
☆45Updated 2 years ago
JeanKaddour / LAWA
Latest Weight Averaging (NeurIPS HITY 2022)
☆30Updated last year
edwardjhu / TP4
Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522)
☆62Updated 4 years ago
wtong98 / mlp-icl
☆10Updated 8 months ago
goodevening13 / aquakv
☆14Updated this week
sidak / otfusion
Model Fusion via Optimal Transport, NeurIPS 2020
☆145Updated 2 years ago
IST-DASLab / WoodFisher
Code accompanying the NeurIPS 2020 paper: WoodFisher (Singh & Alistarh, 2020)
☆52Updated 4 years ago