Efficient-ML / Awesome-Efficient-AIGC
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and is continuously being improved. PRs adding works (papers, repositories) missing from the repo are welcome.
⭐203 · Updated 10 months ago
Alternatives and similar repositories for Awesome-Efficient-AIGC
Users that are interested in Awesome-Efficient-AIGC are comparing it to the libraries listed below
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ⭐177 · Updated last year
- Code Repository of Evaluating Quantized Large Language Models ⭐137 · Updated last year
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023) ⭐140 · Updated 2 years ago
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ⭐129 · Updated 2 years ago
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation ⭐142 · Updated 9 months ago
- The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models" ⭐71 · Updated 9 months ago
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization ⭐170 · Updated last month
- The official PyTorch implementation of the ICLR2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan… ⭐126 · Updated 3 months ago
- PyTorch implementation of PTQ4DiT https://arxiv.org/abs/2405.16005 ⭐44 · Updated last year
- [CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for… ⭐108 · Updated 3 months ago
- [ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization" ⭐203 · Updated last month
- Awesome list for LLM quantization ⭐375 · Updated 2 months ago
- [CVPR 2025] Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers ⭐73 · Updated last year
- Awesome list for LLM pruning. ⭐278 · Updated 2 months ago
- Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692) ⭐79 · Updated 5 months ago
- [ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models ⭐28 · Updated 4 months ago
- Awesome Pruning – Curated Resources for Neural Network Pruning. ⭐172 · Updated last year
- Efficient Mixture of Experts for LLM Paper List ⭐153 · Updated 3 months ago
- ⭐26 · Updated last year
- This repository contains integer operators on GPUs for PyTorch. ⭐223 · Updated 2 years ago
- [ICLR2025]: OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… ⭐87 · Updated 8 months ago
- Post-Training Quantization for Vision transformers. ⭐236 · Updated 3 years ago
- The official implementation of the ICML 2023 paper OFQ-ViT ⭐35 · Updated 2 years ago
- Curated list of methods that focus on improving the efficiency of diffusion models ⭐44 · Updated last year
- [ICLR 2025] Mixture Compressor for Mixture-of-Experts LLMs Gains More ⭐65 · Updated 10 months ago
- [ICCV 2025] QuEST: Efficient Finetuning for Low-bit Diffusion Models ⭐55 · Updated 6 months ago
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric ⭐60 · Updated 2 years ago
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… ⭐50 · Updated 2 years ago
- A collection of research papers on low-precision training methods ⭐55 · Updated 7 months ago
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di… ⭐66 · Updated last year