yandex-research / btard
Code for the paper "Secure Distributed Training at Scale" (ICML 2022)
☆14 · Updated 2 years ago
Alternatives and similar repositories for btard:
Users interested in btard are comparing it to the libraries listed below.
- "Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts" (NeurIPS 2020), original PyTorch implementation ☆53 · Updated 4 years ago
- Memory-efficient transformer. Work in progress. ☆19 · Updated 2 years ago
- Compression schema for gradients of activations in backward pass ☆44 · Updated last year
- "Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices", official implementation ☆28 · Updated last year
- The implementation for the MLSys 2023 paper "Cuttlefish: Low-rank Model Training without All The Tuning" ☆43 · Updated last year
- Latest Weight Averaging (NeurIPS HITY 2022) ☆28 · Updated last year
- Code release for REPAIR: REnormalizing Permuted Activations for Interpolation Repair ☆46 · Updated 11 months ago
- Code for "Practical Low-Rank Communication Compression in Decentralized Deep Learning" ☆15 · Updated 4 years ago
- ☆25 · Updated last year
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆58 · Updated 3 months ago
- Spartan is an algorithm for training sparse neural network models. This repository accompanies the paper "Spartan Differentiable Sparsity… ☆24 · Updated 2 years ago
- A fusion of a linear layer and a cross-entropy loss, written for PyTorch in Triton ☆61 · Updated 5 months ago
- ☆92 · Updated 2 years ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆79 · Updated last year
- Python library for argument and configuration management ☆53 · Updated last year
- Libraries for efficient and scalable group-structured dataset pipelines ☆23 · Updated last month
- ☆16 · Updated 7 months ago
- Code for the paper "Why Transformers Need Adam: A Hessian Perspective" ☆47 · Updated 8 months ago
- ☆34 · Updated last month
- Fast training of unitary deep network layers from low-rank updates ☆28 · Updated 2 years ago
- Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021) ☆116 · Updated 3 years ago
- Code accompanying the paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522) ☆58 · Updated 3 years ago
- Towards Understanding Sharpness-Aware Minimization [ICML 2022] ☆35 · Updated 2 years ago
- ☆48 · Updated 11 months ago
- nanoGPT-like codebase for LLM training ☆83 · Updated this week
- A library for unit scaling in PyTorch ☆118 · Updated last month
- An ML research codebase built with friends :) ☆22 · Updated 4 months ago
- A centralized place for deep thinking code and experiments ☆78 · Updated last year
- ☆23 · Updated 2 months ago