Efficient-ML / Awesome-Efficient-AIGC
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and is continuously being improved. PRs for works (papers, repositories) missing from the repo are welcome.
⭐187 · Updated 6 months ago
Alternatives and similar repositories for Awesome-Efficient-AIGC
Users who are interested in Awesome-Efficient-AIGC are comparing it to the libraries listed below.
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ⭐165 · Updated 10 months ago
- Code Repository of Evaluating Quantized Large Language Models ⭐130 · Updated 11 months ago
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023) ⭐139 · Updated 2 years ago
- Awesome Papers and Resources in Deep Neural Network Pruning with Source Code. ⭐166 · Updated last year
- The official PyTorch implementation of the ICLR 2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan… ⭐124 · Updated 2 years ago
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation ⭐115 · Updated 5 months ago
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ⭐126 · Updated 2 years ago
- [CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for… ⭐103 · Updated last month
- PyTorch implementation of PTQ4DiT (https://arxiv.org/abs/2405.16005) ⭐32 · Updated 9 months ago
- Awesome LLM pruning papers: an all-in-one repository integrating useful resources and insights. ⭐116 · Updated 3 weeks ago
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar… ⭐56 · Updated last year
- ⭐24 · Updated 8 months ago
- Awesome list for LLM pruning. ⭐255 · Updated this week
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization ⭐148 · Updated 3 months ago
- Awesome list for LLM quantization ⭐282 · Updated this week
- [ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization" ⭐156 · Updated last month
- The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models" ⭐55 · Updated 5 months ago
- List of papers related to Vision Transformers quantization and hardware acceleration in recent AI conferences and journals. ⭐94 · Updated last year
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric ⭐58 · Updated 2 years ago
- Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692) ⭐58 · Updated last month
- Improved the performance of 8-bit PTQ4DM, especially on FID. ⭐12 · Updated 2 years ago
- [IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer ⭐347 · Updated 2 years ago
- [CVPR 2025] Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers ⭐56 · Updated 11 months ago
- List of papers related to neural network quantization in recent AI conferences and journals. ⭐702 · Updated 5 months ago
- This repository contains integer operators on GPUs for PyTorch. ⭐213 · Updated last year
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di… ⭐64 · Updated last year
- [ICCV 2025] QuEST: Efficient Finetuning for Low-bit Diffusion Models ⭐52 · Updated 2 months ago
- Curated list of methods that focus on improving the efficiency of diffusion models ⭐46 · Updated last year
- D^2-MoE: Delta Decompression for MoE-based LLMs Compression ⭐64 · Updated 5 months ago
- This repo contains the code for studying the interplay between quantization and sparsity methods ⭐22 · Updated 6 months ago