Efficient-ML / Awesome-Efficient-AIGC
A curated list of papers, documentation, and code about efficient AIGC. This repo aims to collect resources for efficient AIGC research, covering both language and vision, and is continuously being improved. Pull requests adding works (papers, repositories) missing from the repo are welcome.
★180 · Updated 3 months ago
Alternatives and similar repositories for Awesome-Efficient-AIGC
Users interested in Awesome-Efficient-AIGC are comparing it to the libraries listed below.
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ★159 · Updated 7 months ago
- Code Repository of Evaluating Quantized Large Language Models. ★123 · Updated 8 months ago
- Awesome list for LLM pruning. ★224 · Updated 5 months ago
- Awesome list for LLM quantization. ★213 · Updated 4 months ago
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023). ★138 · Updated 2 years ago
- [CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Mo…" ★63 · Updated 9 months ago
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation. ★92 · Updated last month
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar… ★56 · Updated last year
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization. ★133 · Updated 3 months ago
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ★119 · Updated last year
- The official PyTorch implementation of the ICLR 2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan… ★121 · Updated last year
- Awesome papers and resources on deep neural network pruning, with source code. ★159 · Updated 8 months ago
- All-in-one repository of awesome LLM pruning papers, integrating useful resources and insights. ★86 · Updated 5 months ago
- Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization". ★127 · Updated last week
- PyTorch implementation of PTQ4DiT (https://arxiv.org/abs/2405.16005). ★30 · Updated 6 months ago
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models. ★99 · Updated last year
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di…" ★60 · Updated 11 months ago
- A sparse attention kernel supporting mixed sparse patterns. ★205 · Updated 3 months ago
- This repository contains integer operators on GPUs for PyTorch. ★204 · Updated last year
- [ICLR 2025] COAT: Compressing Optimizer States and Activations for Memory-Efficient FP8 Training. ★190 · Updated 3 weeks ago
- Post-Training Quantization for Vision Transformers. ★216 · Updated 2 years ago
- [ICLR 2025] OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… ★48 · Updated last month
- QuEST: Efficient Finetuning for Low-bit Diffusion Models. ★44 · Updated 3 months ago
- Code implementation of GPTQv2 (https://arxiv.org/abs/2504.02692). ★36 · Updated 3 weeks ago
- Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models". ★284 · Updated 2 months ago
- The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models". ★38 · Updated last month
- [ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti… ★65 · Updated last year
- ★21 · Updated 5 months ago
- PyTorch implementation of BRECQ (ICLR 2021). ★272 · Updated 3 years ago
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric. ★55 · Updated 2 years ago