Efficient-ML / Awesome-Efficient-AIGC
A list of papers, docs, and code about efficient AIGC. This repo aims to provide resources for efficient AIGC research, covering both language and vision, and is continuously being improved. PRs adding works (papers, repositories) missed by the repo are welcome.
☆177 · Updated 2 months ago
Alternatives and similar repositories for Awesome-Efficient-AIGC:
Users interested in Awesome-Efficient-AIGC are comparing it to the repositories listed below
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ☆157 · Updated 6 months ago
- Code Repository of Evaluating Quantized Large Language Models ☆121 · Updated 7 months ago
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation ☆77 · Updated last month
- [CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Mo… ☆62 · Updated 8 months ago
- Awesome list for LLM quantization ☆201 · Updated 4 months ago
- Awesome list for LLM pruning. ☆222 · Updated 4 months ago
- The official PyTorch implementation of the ICLR2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan… ☆119 · Updated last year
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models ☆99 · Updated last year
- PyTorch implementation of PTQ4DiT https://arxiv.org/abs/2405.16005 ☆27 · Updated 5 months ago
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization ☆127 · Updated 2 months ago
- Awesome Papers and Resources in Deep Neural Network Pruning with Source Code. ☆156 · Updated 7 months ago
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ☆118 · Updated last year
- An all-in-one repository of awesome LLM pruning papers, integrating useful resources and insights. ☆84 · Updated 4 months ago
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023) ☆136 · Updated 2 years ago
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar… ☆55 · Updated last year
- [ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training ☆184 · Updated last week
- Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization" ☆120 · Updated 3 weeks ago
- This repository contains integer operators on GPUs for PyTorch. ☆202 · Updated last year
- QuEST: Efficient Finetuning for Low-bit Diffusion Models ☆41 · Updated 3 months ago
- D^2-MoE: Delta Decompression for MoE-based LLMs Compression ☆41 · Updated last month
- [CVPR 2025] Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers ☆47 · Updated 7 months ago
- A sparse attention kernel supporting mixed sparse patterns ☆197 · Updated 2 months ago
- [ICLR2025]: OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… ☆46 · Updated 2 weeks ago
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di… ☆59 · Updated 10 months ago
- A curated list of Awesome Diffusion Inference Papers with codes: Sampling, Caching, Multi-GPUs, etc. ☆210 · Updated last month
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆104 · Updated last week
- Collection of awesome generation acceleration resources. ☆215 · Updated this week
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… ☆47 · Updated last year
- PyTorch code for our paper "ARB-LLM: Alternating Refined Binarizations for Large Language Models" ☆24 · Updated last month
- List of papers related to neural network quantization in recent AI conferences and journals. ☆597 · Updated 3 weeks ago
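Several repositories above concern weight-activation quantization (e.g. W4A4/W4A8) in static and dynamic variants. As a rough illustration only, and not the implementation used by any repo listed here, a minimal symmetric per-tensor quantizer might look like this (the function names `quantize_symmetric` and `dequantize` are hypothetical):

```python
import numpy as np

def quantize_symmetric(x, bits=4, scale=None):
    # Symmetric uniform quantization to signed integers of the given width.
    qmax = 2 ** (bits - 1) - 1
    if scale is None:
        # Dynamic quantization: derive the scale from this tensor at run time.
        scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale

def dequantize(q, scale):
    # Map the integer codes back to approximate floating-point values.
    return q.astype(np.float32) * scale

# W4: 4-bit weights; in a static scheme the scale would be fixed offline
# from calibration data, here we just compute it from the tensor itself.
weights = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
w_q, w_scale = quantize_symmetric(weights, bits=4)

# A4: 4-bit activations, quantized dynamically per input tensor.
acts = np.array([2.0, -0.7, 1.1, 0.2], dtype=np.float32)
a_q, a_scale = quantize_symmetric(acts, bits=4)

recovered = dequantize(w_q, w_scale)  # approximate reconstruction of weights
```

The static/dynamic distinction is only in where `scale` comes from: precomputed on calibration data (static) versus recomputed per tensor at inference time (dynamic); the papers in the list differ mainly in how they pick and transform these scales to handle outliers.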