Efficient-ML / Awesome-Efficient-AIGC
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and we are continuously improving the project. Pull requests adding works (papers, repositories) missing from the repo are welcome.
★186 · Updated 5 months ago
Alternatives and similar repositories for Awesome-Efficient-AIGC
Users interested in Awesome-Efficient-AIGC are comparing it to the libraries listed below.
- [NeurIPS 2024 Oral 🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ★165 · Updated 10 months ago
- Code Repository of Evaluating Quantized Large Language Models ★129 · Updated 11 months ago
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023) ★139 · Updated 2 years ago
- [CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for… ★103 · Updated 3 weeks ago
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation ★109 · Updated 4 months ago
- Awesome list for LLM pruning. ★246 · Updated 7 months ago
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization ★145 · Updated 2 months ago
- Awesome list for LLM quantization ★260 · Updated last month
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ★125 · Updated 2 years ago
- Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692) ★56 · Updated last week
- [ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization" ★151 · Updated 2 weeks ago
- A sparse attention kernel supporting mixed sparse patterns ★262 · Updated 5 months ago
- The official PyTorch implementation of the ICLR2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan… ★123 · Updated 2 years ago
- PyTorch implementation of PTQ4DiT (https://arxiv.org/abs/2405.16005) ★32 · Updated 9 months ago
- Awesome Papers and Resources in Deep Neural Network Pruning with Source Code. ★161 · Updated 11 months ago
- Efficient Mixture of Experts for LLM Paper List ★87 · Updated 7 months ago
- This repository contains integer operators on GPUs for PyTorch. ★211 · Updated last year
- ★23 · Updated 8 months ago
- Awesome LLM pruning papers: an all-in-one repository integrating useful resources and insights. ★111 · Updated last week
- The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models" ★50 · Updated 4 months ago
- [CVPR 2025] Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers ★54 · Updated 11 months ago
- Curated list of methods that focus on improving the efficiency of diffusion models ★45 · Updated last year
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models ★99 · Updated last year
- [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models. ★353 · Updated last year
- This is a collection of our research on efficient AI, covering hardware-aware NAS and model compression. ★83 · Updated 9 months ago
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… ★46 · Updated last year
- PyTorch code for our paper "ARB-LLM: Alternating Refined Binarizations for Large Language Models" ★25 · Updated 4 months ago
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar… ★56 · Updated last year
- [ICCV 2025] QuEST: Efficient Finetuning for Low-bit Diffusion Models ★52 · Updated last month
- [ICLR2025] OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… ★72 · Updated 4 months ago
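Many of the repositories above build on post-training quantization. For orientation, here is a minimal sketch of the common round-to-nearest (RTN) baseline that such methods improve upon, assuming symmetric per-tensor scaling; this is a generic illustration, not the method of any listed repo, and `quantize_rtn`/`dequantize` are hypothetical helper names:

```python
import numpy as np

def quantize_rtn(w: np.ndarray, n_bits: int = 4):
    """Symmetric per-tensor round-to-nearest (RTN) weight quantization.

    Generic PTQ baseline (illustrative only): pick one scale for the
    whole tensor so the largest-magnitude weight maps to the top of
    the signed integer grid, then round every weight to that grid.
    """
    qmax = 2 ** (n_bits - 1) - 1          # e.g. 7 for signed 4-bit
    scale = np.abs(w).max() / qmax        # largest |weight| -> qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map integer codes back to approximate float weights."""
    return q.astype(np.float32) * scale

# Tiny usage example: 4-bit quantization of a toy weight vector.
w = np.array([0.07, -0.35, 0.21, 0.7], dtype=np.float32)
q, s = quantize_rtn(w, n_bits=4)
w_hat = dequantize(q, s)  # approximate reconstruction of w
```

Methods in the list above (e.g. outlier-aware transforms or learned rotations) mainly attack the weakness visible here: a single outlier weight inflates `scale` and wastes grid resolution on all the other weights.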