NUS-HPC-AI-Lab / pytorch-lamb
PyTorch implementation of LAMB for ImageNet/ResNet-50 training
☆13 · Updated 4 years ago
Alternatives and similar repositories for pytorch-lamb
Users who are interested in pytorch-lamb are comparing it to the libraries listed below.
- Accuracy 77%. Large batch deep learning optimizer LARS for ImageNet with PyTorch and ResNet, using Horovod for distribution. Optional acc… ☆38 · Updated 4 years ago
- ☆209 · Updated 2 years ago
- Code for the paper: Why Transformers Need Adam: A Hessian Perspective ☆59 · Updated 4 months ago
- ☆35 · Updated 3 years ago
- Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727 ☆147 · Updated 8 months ago
- ☆30 · Updated last year
- PyTorch repository for ICLR 2022 paper (GSAM) which improves generalization (e.g. +3.8% top-1 accuracy on ImageNet with ViT-B/32) ☆143 · Updated 2 years ago
- Code for Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot ☆42 · Updated 4 years ago
- ☆42 · Updated 2 years ago
- [ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao Wang, Rameswar Panda, Lucas Torroba Hennige… ☆92 · Updated last year
- Code for "Picking Winning Tickets Before Training by Preserving Gradient Flow" https://openreview.net/pdf?id=SkgsACVKPH ☆105 · Updated 5 years ago
- Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021). ☆59 · Updated 3 years ago
- Distributed K-FAC preconditioner for PyTorch ☆87 · Updated last week
- Git Re-Basin: Merging Models modulo Permutation Symmetries in PyTorch ☆76 · Updated 2 years ago
- Implementation of Effective Sparsification of Neural Networks with Global Sparsity Constraint ☆31 · Updated 3 years ago
- Train ImageNet *fast* in 500 lines of code with FFCV ☆144 · Updated last year
- Repository of the paper "Accelerating Transformer Inference for Translation via Parallel Decoding" ☆118 · Updated last year
- [NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers ☆190 · Updated 2 years ago
- Lightweight torch implementation of rigl, a sparse-to-sparse optimizer. ☆57 · Updated 3 years ago
- [ICML 2024] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark". ☆105 · Updated last week
- ☆226 · Updated 11 months ago
- This repository contains the implementation of the paper "MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models". ☆20 · Updated last month
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models ☆35 · Updated last year
- [NeurIPS 2022] "Back Razor: Memory-Efficient Transfer Learning by Self-Sparsified Backpropagation", Ziyu Jiang*, Xuxi Chen*, Xueqin Huan… ☆19 · Updated 2 years ago
- This PyTorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022). ☆46 · Updated 2 years ago
- [NeurIPS 2021] code for "Taxonomizing local versus global structure in neural network loss landscapes" https://arxiv.org/abs/2107.11228 ☆19 · Updated 3 years ago
- [ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training ☆215 · Updated last month
- Code release for Deep Incubation (https://arxiv.org/abs/2212.04129) ☆90 · Updated 2 years ago
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training ☆200 · Updated 2 years ago
- ☆13 · Updated 2 years ago