LinkAnonymous / BESA

☆11

Alternatives and similar repositories for BESA

Users who are interested in BESA compare it to the libraries listed below.

CASIA-LMC-Lab / FLAP
[AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models
☆64 Updated last year
Hsu1023 / DuQuant
[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.
☆177 Updated last year
luuyin / OWL
Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"
☆74 Updated 5 months ago
biomedical-cybernetics / Relative-importance-and-activation-pruning
☆54 Updated last year
Intelligent-Computing-Lab-Panda / GPTAQ
Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692)
☆79 Updated 5 months ago
ModelTC / Outlier_Suppression_Plus
Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti…
☆50 Updated 2 years ago
ChenMnZ / PrefixQuant
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
☆169 Updated last month
xvyaward / owq
Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model…
☆68 Updated last year
Intelligent-Computing-Lab-Panda / TesseraQ
☆24 Updated last year
wimh966 / outlier_suppression
The official PyTorch implementation of the NeurIPS2022 (spotlight) paper, Outlier Suppression: Pushing the Limit of Low-bit Transformer L…
☆49 Updated 3 years ago
thu-nics / qllm-eval
Code Repository of Evaluating Quantized Large Language Models
☆137 Updated last year
nbasyl / OFQ
The official implementation of the ICML 2023 paper OFQ-ViT
☆35 Updated 2 years ago
kssteven418 / SqueezeLLM-gradients
☆21 Updated last year
WoosukKwon / retraining-free-pruning
[NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers
☆192 Updated 2 years ago
ruikangliu / IntactKV
[ACL 2024] Official PyTorch implementation of "IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact"
☆48 Updated last year
ROIM1998 / APT
[ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
☆46 Updated last year
IST-DASLab / OBC
Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".
☆129 Updated 2 years ago
HuangOwen / RoLoRA
[EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
☆38 Updated last year
yxli2123 / LoSparse
☆62 Updated 2 years ago
htqin / BiBERT
This project is the official implementation of our accepted ICLR 2022 paper BiBERT: Accurate Fully Binarized BERT.
☆89 Updated 2 years ago
zhangsichengsjtu / AFPQ
AFPQ code implementation
☆24 Updated 2 years ago
parsa-epfl / quantization-sparsity-interplay
This repo contains the code for studying the interplay between quantization and sparsity methods
☆25 Updated 10 months ago
hahnyuan / ASVD4LLM
Activation-aware Singular Value Decomposition for Compressing Large Language Models
☆82 Updated last year
BrotherHappy / OSTQuant
[ICLR2025]: OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt…
☆87 Updated 8 months ago
BaiTheBest / SparseLLM
Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)
☆66 Updated 9 months ago
htqin / IR-QLoRA
[ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti…
☆67 Updated last year
zyxxmu / DSnoT
Official Pytorch Implementation of Our Paper Accepted at ICLR 2024-- Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM…
☆50 Updated last year
ruikangliu / FlatQuant
[ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"
☆202 Updated last month
liyunqianggyn / Awesome-LLMs-Pruning
Awesome LLM pruning papers all-in-one repository with integrating all useful resources and insights.
☆142 Updated 4 months ago
AboveParadise / LLMCBench
☆26 Updated last year