Efficient-ML / Awesome-Efficient-LLM-Diffusion
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and we are continuously improving the project. PRs adding works (papers, repositories) that the repo has missed are welcome.
☆166 · Updated 2 months ago
Alternatives and similar repositories for Awesome-Efficient-LLM-Diffusion:
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ☆138 · Updated 3 months ago
- Awesome list for LLM pruning. ☆195 · Updated last month
- Code repository of "Evaluating Quantized Large Language Models". ☆114 · Updated 4 months ago
- [ICLR 2025] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation. ☆41 · Updated this week
- [CVPR 2024 Highlight] Official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Mo…". ☆60 · Updated 5 months ago
- Implementation of "Post-training Quantization on Diffusion Models" (CVPR 2023). ☆126 · Updated last year
- PyTorch implementation of PTQ4DiT (https://arxiv.org/abs/2405.16005). ☆19 · Updated 2 months ago
- Official PyTorch implementation of the ICLR 2022 paper "QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan…". ☆115 · Updated last year
- [ICML 2023] Official implementation of "BiBench: Benchmarking and Analyzing Network Binar…". ☆54 · Updated 10 months ago
- This repository contains integer operators on GPUs for PyTorch. ☆190 · Updated last year
- Awesome list for LLM quantization. ☆160 · Updated last month
- All-in-one repository of awesome LLM pruning papers, integrating useful resources and insights. ☆61 · Updated last month
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ☆109 · Updated last year
- [ICLR 2024 Spotlight] Official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di…". ☆55 · Updated 7 months ago
- Post-Training Quantization for Vision Transformers. ☆201 · Updated 2 years ago
- Awesome papers and resources in deep neural network pruning, with source code. ☆142 · Updated 5 months ago
- QuEST: Efficient Finetuning for Low-bit Diffusion Models. ☆36 · Updated last week
- Official implementation of "PTQD: Accurate Post-Training Quantization for Diffusion Models". ☆93 · Updated 10 months ago
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric. ☆52 · Updated last year
- List of papers related to neural network quantization in recent AI conferences and journals. ☆514 · Updated last month
- ☆210 · Updated 5 months ago
- An algorithm for static activation quantization of LLMs. ☆112 · Updated 2 weeks ago
- [ICML 2024 Oral] Official implementation of "Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti…". ☆60 · Updated 9 months ago
- List of papers related to Vision Transformer quantization and hardware acceleration in recent AI conferences and journals. ☆72 · Updated 7 months ago
- Official implementation of the NeurIPS 2022 paper "Q-ViT". ☆86 · Updated last year
- Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization". ☆95 · Updated last week
- [NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers. ☆181 · Updated last year
- Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models". ☆266 · Updated 4 months ago
- [ICLR 2022] Official implementation of "BiBERT: Accurate Fully Binarized BERT". ☆85 · Updated last year
- [IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer. ☆320 · Updated last year