Hao840 / Awesome-Low-Precision-Training
A collection of research papers on low-precision training methods
☆19 Updated last month
Alternatives and similar repositories for Awesome-Low-Precision-Training
Users who are interested in Awesome-Low-Precision-Training are comparing it to the repositories listed below
- Official implementation of the EMNLP 2023 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… ☆46 Updated last year
- ACL 2023 ☆39 Updated 2 years ago
- Official PyTorch implementation of "Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models" ☆38 Updated this week
- The official implementation of the ICML 2023 paper OFQ-ViT ☆32 Updated last year
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆38 Updated last year
- [ICLR 2022] "Unified Vision Transformer Compression" by Shixing Yu*, Tianlong Chen*, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Li… ☆53 Updated last year
- [TMLR] Official PyTorch implementation of paper "Efficient Quantization-aware Training with Adaptive Coreset Selection" ☆33 Updated 10 months ago
- Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692) ☆47 Updated last month
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization ☆37 Updated 9 months ago
- IntLLaMA: A fast and light quantization solution for LLaMA ☆18 Updated last year
- [NeurIPS 2024] Search for Efficient LLMs ☆14 Updated 5 months ago
- Structured Binary Neural Networks for Image Recognition ☆18 Updated 3 years ago
- [Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin… ☆40 Updated 2 years ago
- ☆22 Updated 3 months ago
- [CVPR 2025] Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers ☆51 Updated 10 months ago
- PyTorch implementation of our paper accepted by ICML 2023 -- "Bi-directional Masks for Efficient N:M Sparse Training" ☆12 Updated 2 years ago
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar… ☆56 Updated last year
- PyTorch implementation of our paper accepted by NeurIPS 2022 -- Learning Best Combination for Efficient N:M Sparsity ☆17 Updated 2 years ago
- Code for ICML 2021 submission ☆34 Updated 4 years ago
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning ☆20 Updated 2 years ago
- ☆12 Updated 2 years ago
- This is the PyTorch implementation for the paper: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation, which is… ☆25 Updated 3 years ago
- ☆39 Updated 7 months ago
- Implementation of PGONAS for CVPR22W and RD-NAS for ICASSP23 ☆22 Updated 2 years ago
- Activation-aware Singular Value Decomposition for Compressing Large Language Models ☆72 Updated 8 months ago
- ☆10 Updated 3 years ago
- ☆43 Updated last year
- High Performance FP8 GEMM Kernels for SM89 and later GPUs. ☆14 Updated 5 months ago
- BESA is a differentiable weight pruning technique for large language models. ☆17 Updated last year
- [TMLR] Official PyTorch implementation of paper "Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precisio… ☆45 Updated 9 months ago