NUS-HPC-AI-Lab / pytorch-lamb
PyTorch implementation of LAMB for ImageNet/ResNet-50 training
☆13 Updated 4 years ago
Alternatives and similar repositories for pytorch-lamb
Users who are interested in pytorch-lamb are comparing it to the libraries listed below.
- Accuracy 77%. Large batch deep learning optimizer LARS for ImageNet with PyTorch and ResNet, using Horovod for distribution. Optional acc… ☆38 Updated 4 years ago
- ☆218 Updated 2 years ago
- ☆35 Updated 4 years ago
- Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021). ☆59 Updated 3 years ago
- Code for the paper: Why Transformers Need Adam: A Hessian Perspective ☆64 Updated 7 months ago
- ☆193 Updated 4 years ago
- Code accompanying the NeurIPS 2020 paper: WoodFisher (Singh & Alistarh, 2020) ☆53 Updated 4 years ago
- Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727 ☆148 Updated 11 months ago
- Simple CIFAR10 ResNet example with JAX. ☆23 Updated 4 years ago
- Code for Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot ☆41 Updated 4 years ago
- Code for "Picking Winning Tickets Before Training by Preserving Gradient Flow" https://openreview.net/pdf?id=SkgsACVKPH ☆105 Updated 5 years ago
- Train ImageNet *fast* in 500 lines of code with FFCV ☆149 Updated last year
- Training vision models with full-batch gradient descent and regularization ☆38 Updated 2 years ago
- Towards Understanding Sharpness-Aware Minimization [ICML 2022] ☆35 Updated 3 years ago
- [IJCAI'22 Survey] Recent Advances on Neural Network Pruning at Initialization. ☆59 Updated 2 years ago
- ☆42 Updated 2 years ago
- PyTorch repository for ICLR 2022 paper (GSAM) which improves generalization (e.g. +3.8% top-1 accuracy on ImageNet with ViT-B/32) ☆144 Updated 3 years ago
- The implementation for MLSys 2023 paper: "Cuttlefish: Low-rank Model Training without All The Tuning" ☆43 Updated 2 years ago
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training ☆199 Updated 2 years ago
- TPAMI 2021: NATS-Bench: Benchmarking NAS Algorithms for Architecture Topology and Size ☆183 Updated 3 years ago
- Soft Threshold Weight Reparameterization for Learnable Sparsity ☆90 Updated 2 years ago
- Lightweight torch implementation of rigl, a sparse-to-sparse optimizer. ☆58 Updated 3 years ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 Updated 5 months ago
- [ICLR 2023] "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!" Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen… ☆28 Updated 2 years ago
- A general and accurate MACs / FLOPs profiler for PyTorch models ☆630 Updated 2 months ago
- ☆32 Updated 3 years ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆80 Updated 2 years ago
- ☆19 Updated 3 years ago
- This pytorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022). ☆46 Updated 3 years ago
- ☆227 Updated last year