wimh966 / outlier_suppression

The official PyTorch implementation of the NeurIPS2022 (spotlight) paper, Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models

☆49

Alternatives and similar repositories for outlier_suppression

Users who are interested in outlier_suppression are comparing it to the repositories listed below.


ModelTC / Outlier_Suppression_Plus
Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti…
☆50 · Oct 21, 2023 · Updated 2 years ago
zhexinli / Q-ViT-DeiT
DeiT implementation for Q-ViT
☆25 · Apr 21, 2025 · Updated 9 months ago
hustvl / PD-Quant
[CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric
☆60 · Mar 23, 2023 · Updated 2 years ago
yhhhli / BRECQ
PyTorch implementation of BRECQ, ICLR 2021
☆289 · Aug 1, 2021 · Updated 4 years ago
wimh966 / QDrop
The official PyTorch implementation of the ICLR2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan…
☆127 · Sep 23, 2025 · Updated 4 months ago
Qualcomm-AI-research / FP8-quantization
☆169 · Mar 9, 2023 · Updated 2 years ago
nbasyl / OFQ
The official implementation of the ICML 2023 paper OFQ-ViT
☆38 · Oct 3, 2023 · Updated 2 years ago
ThisisBillhe / torch_quantizer
torch_quantizer is an out-of-the-box quantization tool for PyTorch models on the CUDA backend, specially optimized for diffusion models.
☆23 · Mar 29, 2024 · Updated last year
enyac-group / evol-q
Quantization in the Jagged Loss Landscape of Vision Transformers
☆13 · Oct 22, 2023 · Updated 2 years ago
GATECH-EIC / torchshiftadd
An open-source PyTorch library for developing energy-efficient multiplication-less models and applications.
☆14 · Feb 3, 2025 · Updated last year
csyhhu / MetaQuant
Code for the NeurIPS 2019 paper "MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization"
☆54 · May 8, 2020 · Updated 5 years ago
xvyaward / owq
Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model…
☆69 · Mar 7, 2024 · Updated last year
zyxxmu / LBC
PyTorch implementation of our paper accepted by NeurIPS 2022 -- Learning Best Combination for Efficient N:M Sparsity
☆22 · Jan 13, 2023 · Updated 3 years ago
megvii-research / FQ-ViT
[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
☆360 · Apr 11, 2023 · Updated 2 years ago
Guangxuan-Xiao / torch-int
This repository contains integer operators on GPUs for PyTorch.
☆237 · Sep 29, 2023 · Updated 2 years ago
OpenBitSys / BitDistiller
[ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs.
☆134 · May 16, 2024 · Updated last year
Cornell-RelaxML / QuIP
Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"
☆396 · Feb 24, 2024 · Updated last year
WeixiangXu / STTN
☆17 · Oct 25, 2022 · Updated 3 years ago
hahnyuan / RPTQ4LLM
Reorder-based post-training quantization for large language model
☆198 · May 17, 2023 · Updated 2 years ago
zysxmu / FDDA
PyTorch implementation of our paper accepted by ECCV 2022 -- Fine-grained Data Distribution Alignment for Post-Training Quantization
☆15 · Sep 13, 2022 · Updated 3 years ago
Hsu1023 / DuQuant
[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.
☆180 · Oct 3, 2024 · Updated last year
liuzechun / Nonuniform-to-Uniform-Quantization
Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022.
☆138 · Apr 28, 2022 · Updated 3 years ago
clevercool / ANT-Quantization
☆113 · Nov 17, 2023 · Updated 2 years ago
kssteven418 / I-BERT
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
☆265 · Jan 29, 2023 · Updated 3 years ago
bytedance / MRECG
☆36 · Mar 29, 2023 · Updated 2 years ago
megvii-research / IntLLaMA
IntLLaMA: A fast and light quantization solution for LLaMA
☆18 · Jul 21, 2023 · Updated 2 years ago
hahnyuan / PTQ4ViT
Post-Training Quantization for Vision transformers.
☆238 · Jul 19, 2022 · Updated 3 years ago
zhutmost / lsq-net
Unofficial implementation of LSQ-Net, a neural network quantization framework
☆310 · May 8, 2024 · Updated last year
GATECH-EIC / ViTCoD
[HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
☆127 · Jun 27, 2023 · Updated 2 years ago
ziplab / SAQ
This is the official PyTorch implementation for "Sharpness-aware Quantization for Deep Neural Networks".
☆44 · Nov 25, 2021 · Updated 4 years ago
zysxmu / IntraQ
PyTorch implementation of our paper accepted by CVPR 2022 -- IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Sh…
☆36 · Mar 2, 2022 · Updated 3 years ago
spcl / QuaRot
Code for the NeurIPS 2024 paper: QuaRot, end-to-end 4-bit inference of large language models.
☆482 · Nov 26, 2024 · Updated last year
ynahshan / nn-quantization-pytorch
☆57 · Dec 8, 2020 · Updated 5 years ago
mit-han-lab / smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
☆1,607 · Jul 12, 2024 · Updated last year
nbasyl / LLM-FP4
The official implementation of the EMNLP 2023 paper LLM-FP4
☆220 · Dec 15, 2023 · Updated 2 years ago
PannenetsF / TQT
PyTorch implementation of TQT.
☆21 · Dec 17, 2021 · Updated 4 years ago
zhangsichengsjtu / AFPQ
AFPQ code implementation
☆23 · Nov 6, 2023 · Updated 2 years ago
csyhhu / L-DNQ
Code for the AAAI 2019 paper: Deep Neural Network Quantization via Layer-Wise Optimization using Limited Training Data
☆41 · Jan 22, 2019 · Updated 7 years ago
zyxxmu / Bi-Mask
PyTorch implementation of our paper accepted by ICML 2023 -- "Bi-directional Masks for Efficient N:M Sparse Training"
☆12 · Jun 7, 2023 · Updated 2 years ago
