NUS-HPC-AI-Lab / pytorch-lamb
PyTorch implementation of LAMB for ImageNet/ResNet-50 training
☆13 Updated 3 years ago
Alternatives and similar repositories for pytorch-lamb:
Users who are interested in pytorch-lamb are comparing it to the libraries listed below.
- Accuracy 77%. Large batch deep learning optimizer LARS for ImageNet with PyTorch and ResNet, using Horovod for distribution. Optional acc… ☆38 Updated 3 years ago
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models ☆31 Updated 11 months ago
- Parameter Efficient Transfer Learning with Diff Pruning ☆73 Updated 3 years ago
- Code release for Deep Incubation (https://arxiv.org/abs/2212.04129) ☆91 Updated last year
- ☆49 Updated last year
- Code for the paper: Why Transformers Need Adam: A Hessian Perspective ☆48 Updated 9 months ago
- [ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models ☆18 Updated 8 months ago
- [ICML 2024] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark". ☆83 Updated 7 months ago
- Repository of the paper "Accelerating Transformer Inference for Translation via Parallel Decoding" ☆114 Updated 10 months ago
- ☆200 Updated last year
- Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021). ☆58 Updated 3 years ago
- MLPruning, PyTorch, NLP, BERT, Structured Pruning ☆21 Updated 3 years ago
- [ICLR 2023] "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!" Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen… ☆27 Updated last year
- The implementation for MLSys 2023 paper: "Cuttlefish: Low-rank Model Training without All The Tuning" ☆43 Updated last year
- This package implements THOR: Transformer with Stochastic Experts. ☆61 Updated 3 years ago
- Efficient 2:4 sparse training algorithms and implementations ☆46 Updated last month
- Code associated with the paper **Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees**. ☆27 Updated last year
- ☆41 Updated 2 years ago
- This is the official implementation of the ICML 2023 paper - Can Forward Gradient Match Backpropagation? ☆12 Updated last year
- PyTorch repository for ICLR 2022 paper (GSAM) which improves generalization (e.g. +3.8% top-1 accuracy on ImageNet with ViT-B/32) ☆139 Updated 2 years ago
- Training vision models with full-batch gradient descent and regularization ☆37 Updated last year
- Code for ICML 2021 submission ☆35 Updated 3 years ago
- ☆35 Updated 3 years ago
- Block Sparse movement pruning ☆78 Updated 4 years ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆79 Updated last year
- A fusion of a linear layer and a cross entropy loss, written for pytorch in triton. ☆61 Updated 5 months ago
- ☆40 Updated 3 years ago
- ☆30 Updated last year
- Official Pytorch Implementation of Our Paper Accepted at ICLR 2024-- Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM… ☆38 Updated 9 months ago
- Sharpness-Aware Minimization Leads to Low-Rank Features [NeurIPS 2023] ☆25 Updated last year