Low-bit optimizers for PyTorch
★138 · Oct 9, 2023 · Updated 2 years ago
Alternatives and similar repositories for low-bit-optimizers
Users interested in low-bit-optimizers are comparing it to the libraries listed below.
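Most of the repositories below revolve around one idea: storing optimizer state (e.g., Adam's momentum buffers) in a low-bit format with a scale factor, and dequantizing on use. The sketch below illustrates that idea generically in NumPy; it is not the API of low-bit-optimizers or any listed repo, and the function names are hypothetical.

```python
# Illustrative sketch of the core low-bit optimizer-state trick:
# symmetric 8-bit linear quantization with a per-tensor scale.
# Not the actual API of any repo listed here; names are hypothetical.
import numpy as np

def quantize_8bit(x):
    """Quantize a float32 array to int8 plus a per-tensor scale."""
    scale = float(np.abs(x).max()) / 127.0
    if scale == 0.0:
        return np.zeros_like(x, dtype=np.int8), 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_8bit(q, scale):
    """Recover an approximate float32 array from int8 values and scale."""
    return q.astype(np.float32) * scale

# An optimizer would keep `q` and `scale` between steps (1 byte/element
# instead of 4), dequantizing only while updating parameters.
state = np.array([0.5, -1.0, 0.25, 0.0], dtype=np.float32)
q, s = quantize_8bit(state)
recovered = dequantize_8bit(q, s)
```

Real implementations typically quantize per block rather than per tensor, and use non-uniform (e.g., dynamic exponent) quantization maps to keep small momentum values from collapsing to zero.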
- PyTorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" on DeiT model pre-training ★37 · Jun 20, 2025 · Updated 9 months ago
- HALO: Hadamard-Assisted Low-Precision Optimization and Training method for finetuning LLMs. The official implementation of https://arx… ★29 · Feb 17, 2025 · Updated last year
- ★157 · Jun 22, 2023 · Updated 2 years ago
- ★63 · Jul 21, 2024 · Updated last year
- A Tight-fisted Optimizer ★50 · Mar 7, 2023 · Updated 3 years ago
- [NeurIPS 2022] "Back Razor: Memory-Efficient Transfer Learning by Self-Sparsified Backpropagation", Ziyu Jiang*, Xuxi Chen*, Xueqin Huan… ★19 · Mar 14, 2023 · Updated 3 years ago
- Official implementation of the ICML 2024 paper "ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking" ★47 · Jul 12, 2024 · Updated last year
- Code for "Scaling Laws of RoPE-based Extrapolation" ★73 · Oct 16, 2023 · Updated 2 years ago
- ★14 · Aug 1, 2025 · Updated 7 months ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models" ★281 · Nov 3, 2023 · Updated 2 years ago
- ★235 · Jun 11, 2024 · Updated last year
- Implementation of the N:M sparsity recipe presented in the paper "Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers" ★11 · Feb 5, 2024 · Updated 2 years ago
- Revisiting Efficient Training Algorithms for Transformer-based Language Models (NeurIPS 2023) ★81 · Aug 30, 2023 · Updated 2 years ago
- Official PyTorch implementation of CD-MOE ★12 · Updated this week
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection ★1,684 · Oct 28, 2024 · Updated last year
- Microsoft Automatic Mixed Precision Library ★635 · Dec 1, 2025 · Updated 3 months ago
- [ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training ★262 · Aug 9, 2025 · Updated 7 months ago
- ★121 · Updated this week
- Demonstration that finetuning a RoPE model on longer sequences than it was pre-trained on adapts the model's context limit ★63 · Jun 21, 2023 · Updated 2 years ago
- LOMO: LOw-Memory Optimization ★989 · Jul 2, 2024 · Updated last year
- Linear Attention Sequence Parallelism (LASP) ★88 · Jun 4, 2024 · Updated last year
- The official implementation of the EMNLP 2023 paper LLM-FP4 ★222 · Dec 15, 2023 · Updated 2 years ago
- See https://github.com/cuda-mode/triton-index/ instead! ★11 · May 8, 2024 · Updated last year
- Collaborative Training of Large Language Models in an Efficient Way ★419 · Aug 28, 2024 · Updated last year
- Deep neural network framework for multiple GPUs ★34 · Jun 20, 2015 · Updated 10 years ago
- Decensoring Hentai ★14 · Sep 19, 2022 · Updated 3 years ago
- ★10 · Apr 24, 2023 · Updated 2 years ago
- Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024) ★58 · Nov 8, 2024 · Updated last year
- FLOPS counter for all your GPU benchmarking needs ★13 · Aug 8, 2024 · Updated last year
- ★33 · Dec 17, 2025 · Updated 3 months ago
- AudioSR-Upsampling (any -> 48kHz) ★42 · Feb 13, 2024 · Updated 2 years ago
- Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices" ★21 · Jun 7, 2025 · Updated 9 months ago
- [ICLR 2025] OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… ★88 · Apr 8, 2025 · Updated 11 months ago
- [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization ★713 · Aug 13, 2024 · Updated last year
- Code for the NeurIPS 2024 paper QuaRot: end-to-end 4-bit inference of large language models ★492 · Nov 26, 2024 · Updated last year
- ★53 · Jul 18, 2024 · Updated last year
- Code for Adam-mini: Use Fewer Learning Rates To Gain More (https://arxiv.org/abs/2406.16793) ★453 · May 13, 2025 · Updated 10 months ago
- Ring attention implementation with flash attention ★996 · Sep 10, 2025 · Updated 6 months ago
- Code for the blog post "Can Better Cold-Start Strategies Improve RL Training for LLMs?" ★20 · Mar 9, 2025 · Updated last year