Efficient-ML / Awesome-Efficient-AIGC
A curated list of papers, documentation, and code about efficient AIGC, covering both language and vision. This repo aims to provide resources for efficient AIGC research and is continuously improved. Pull requests adding works (papers, repositories) missing from the list are welcome.
☆175 · Updated last month
Alternatives and similar repositories for Awesome-Efficient-AIGC:
Users interested in Awesome-Efficient-AIGC are comparing it to the libraries listed below.
- [NeurIPS 2024 Oral 🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ☆147 · Updated 5 months ago
- Code repository of "Evaluating Quantized Large Language Models". ☆119 · Updated 6 months ago
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023). ☆132 · Updated last year
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation. ☆67 · Updated this week
- Awesome list for LLM pruning. ☆212 · Updated 3 months ago
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization. ☆121 · Updated last month
- [CVPR 2024 Highlight] The official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Mo…". ☆61 · Updated 7 months ago
- An all-in-one repository of awesome LLM pruning papers, integrating useful resources and insights. ☆76 · Updated 3 months ago
- Awesome list for LLM quantization. ☆186 · Updated 2 months ago
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models. ☆96 · Updated last year
- PyTorch implementation of PTQ4DiT (https://arxiv.org/abs/2405.16005). ☆26 · Updated 4 months ago
- QuEST: Efficient Finetuning for Low-bit Diffusion Models. ☆41 · Updated 2 months ago
- [ICML 2023] The official implementation of the paper "BiBench: Benchmarking and Analyzing Network Binar…". ☆55 · Updated last year
- The official PyTorch implementation of the ICLR 2022 paper "QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan…". ☆115 · Updated last year
- Official PyTorch implementation of FlatQuant: Flatness Matters for LLM Quantization. ☆110 · Updated 2 months ago
- [ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training. ☆164 · Updated last month
- List of papers related to neural network quantization in recent AI conferences and journals. ☆562 · Updated 3 months ago
- [ICML 2024 Oral] The official implementation of "Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti…". ☆63 · Updated 11 months ago
- [ICLR 2024 Spotlight] The official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di…". ☆59 · Updated 9 months ago
- Curated list of methods that focus on improving the efficiency of diffusion models. ☆39 · Updated 8 months ago
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ☆113 · Updated last year
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs. ☆99 · Updated 2 months ago
- This repository contains integer operators on GPUs for PyTorch. ☆196 · Updated last year
- Awesome papers and resources in deep neural network pruning, with source code. ☆148 · Updated 6 months ago
- A repository dedicated to evaluating the performance of quantized LLaMA3 using various quantization methods. ☆179 · Updated 2 months ago
- The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models". ☆35 · Updated this week
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric. ☆52 · Updated 2 years ago
- Collection of awesome generation acceleration resources. ☆177 · Updated last week
- Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models". ☆278 · Updated 2 weeks ago
- The official PyTorch implementation of the NeurIPS 2022 (spotlight) paper "Outlier Suppression: Pushing the Limit of Low-bit Transformer L…". ☆48 · Updated 2 years ago
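Many of the repositories above build on the same post-training quantization primitive: mapping floating-point weights to low-bit integers with a per-channel scale. As context for readers new to the area, here is a minimal sketch of symmetric per-channel weight quantization in NumPy. It is illustrative only; function names are our own and it is not taken from any specific repo listed here.

```python
import numpy as np

def quantize_per_channel(w: np.ndarray, n_bits: int = 4):
    """Symmetric per-output-channel quantization of a weight matrix.

    w: (out_features, in_features) float weights.
    Returns integer codes and per-channel scales such that
    w ~= codes * scales[:, None].
    """
    qmax = 2 ** (n_bits - 1) - 1                 # e.g. 7 for signed 4-bit
    scales = np.abs(w).max(axis=1) / qmax        # one scale per output channel
    scales = np.where(scales == 0, 1.0, scales)  # guard against all-zero rows
    codes = np.clip(np.round(w / scales[:, None]), -qmax - 1, qmax).astype(np.int8)
    return codes, scales

def dequantize(codes: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Reconstruct approximate float weights from codes and scales."""
    return codes.astype(np.float32) * scales[:, None]

# Toy example: quantize a random weight matrix to 4 bits and measure error.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16)).astype(np.float32)
codes, scales = quantize_per_channel(w, n_bits=4)
w_hat = dequantize(codes, scales)
max_err = np.abs(w - w_hat).max()  # bounded by 0.5 * scale per channel
```

The listed works differ mainly in how they go beyond this baseline: handling activation outliers (DuQuant, Outlier Suppression), calibrating rounding (QDrop, PD-Quant), or fine-tuning under quantization (EfficientDM, QuEST, LLM-QAT).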