Efficient-ML / Awesome-Efficient-AIGC
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and is continuously being improved. PRs for works (papers, repositories) missed by the repo are welcome.
⭐182 · Updated 3 months ago
Alternatives and similar repositories for Awesome-Efficient-AIGC
Users interested in Awesome-Efficient-AIGC are comparing it to the repositories listed below.
- [NeurIPS 2024 Oral 🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ⭐161 · Updated 8 months ago
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation ⭐98 · Updated 2 months ago
- Code Repository of Evaluating Quantized Large Language Models ⭐123 · Updated 8 months ago
- [CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Mo… ⭐63 · Updated 10 months ago
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023) ⭐138 · Updated 2 years ago
- The official PyTorch implementation of the ICLR 2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan… ⭐122 · Updated last year
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar… ⭐56 · Updated last year
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization ⭐136 · Updated 2 weeks ago
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ⭐120 · Updated last year
- This repository contains integer operators on GPUs for PyTorch. ⭐205 · Updated last year
- Awesome list for LLM pruning. ⭐230 · Updated 5 months ago
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models ⭐98 · Updated last year
- Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization" ⭐133 · Updated 2 weeks ago
- QuEST: Efficient Finetuning for Low-bit Diffusion Models ⭐45 · Updated 4 months ago
- Awesome list for LLM quantization ⭐223 · Updated 5 months ago
- List of papers related to Vision Transformer quantization and hardware acceleration in recent AI conferences and journals. ⭐90 · Updated last year
- PyTorch implementation of PTQ4DiT (https://arxiv.org/abs/2405.16005) ⭐30 · Updated 6 months ago
- Post-Training Quantization for Vision Transformers. ⭐218 · Updated 2 years ago
- Awesome all-in-one repository of LLM pruning papers, integrating useful resources and insights. ⭐88 · Updated 5 months ago
- Awesome Papers and Resources in Deep Neural Network Pruning, with Source Code. ⭐158 · Updated 9 months ago
- Official implementation of the EMNLP 2023 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… ⭐46 · Updated last year
- [CVPR 2025] Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers ⭐50 · Updated 9 months ago
- [ICLR 2025] OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… ⭐55 · Updated last month
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di… ⭐60 · Updated last year
- [NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers ⭐190 · Updated 2 years ago
- [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models. ⭐347 · Updated last year
- Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692) ⭐46 · Updated this week
- PyTorch implementation of BRECQ, ICLR 2021 ⭐273 · Updated 3 years ago
- D^2-MoE: Delta Decompression for MoE-based LLMs Compression ⭐48 · Updated 2 months ago
- This repository contains the training code of ParetoQ, introduced in our work "ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization" ⭐64 · Updated last week
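Many of the repositories above revolve around low-bit post-training quantization (e.g., the W4A4/W4A8 schemes mentioned for LLM weight-activation quantization). As background, here is a minimal, generic sketch of symmetric per-tensor round-to-nearest quantization — the baseline these projects improve upon. This is an illustrative example, not code from any repository listed here; the function names are hypothetical.

```python
import numpy as np

def quantize_symmetric(x: np.ndarray, bits: int = 4):
    """Symmetric per-tensor quantization: map max |x| onto the signed integer range."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 7 for 4-bit
    scale = np.max(np.abs(x)) / qmax    # one scale for the whole tensor
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Reconstruct an approximation of the original values
    return q.astype(np.float32) * scale

x = np.array([0.1, -0.7, 0.35, 0.02], dtype=np.float32)
q, s = quantize_symmetric(x, bits=4)
x_hat = dequantize(q, s)
# Rounding error is bounded by half a quantization step (scale / 2)
```

Methods such as DuQuant, QDrop, or FlatQuant refine this baseline by reshaping outlier distributions or learning transformations before rounding, which matters most at the extreme bit-widths (4 bits and below) targeted above.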