Efficient-ML / Awesome-Efficient-AIGC
A list of papers, docs, and code about efficient AIGC. This repo aims to collect resources for efficient AIGC research, covering both language and vision, and we are continuously improving the project. PRs adding works (papers, repositories) that the repo has missed are welcome.
★184 · Updated 4 months ago
Alternatives and similar repositories for Awesome-Efficient-AIGC
Users interested in Awesome-Efficient-AIGC are comparing it to the libraries listed below.
- Code Repository of Evaluating Quantized Large Language Models (★124, updated 9 months ago)
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. (★161, updated 8 months ago)
- [CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for…" (★65, updated last week)
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization (★137, updated last month)
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation (★102, updated 3 months ago)
- [ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization" (★138, updated last month)
- The official PyTorch implementation of the ICLR 2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan… (★122, updated last year)
- Awesome list for LLM pruning. (★232, updated 6 months ago)
- D^2-MoE: Delta Decompression for MoE-based LLMs Compression (★48, updated 3 months ago)
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". (★122, updated last year)
- An all-in-one repository of awesome LLM pruning papers, integrating useful resources and insights. (★93, updated 6 months ago)
- Awesome papers and resources on deep neural network pruning, with source code. (★159, updated 9 months ago)
- [ICLR 2025] OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… (★61, updated 2 months ago)
- Awesome list for LLM quantization (★238, updated 2 weeks ago)
- This repository contains integer operators on GPUs for PyTorch. (★205, updated last year)
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… (★46, updated last year)
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023) (★139, updated 2 years ago)
- PyTorch implementation of PTQ4DiT https://arxiv.org/abs/2405.16005 (★30, updated 7 months ago)
- List of papers related to Vision Transformers quantization and hardware acceleration in recent AI conferences and journals. (★91, updated last year)
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar… (★56, updated last year)
- Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692) (★47, updated 3 weeks ago)
- [ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training (★210, updated last week)
- A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Caching, Quantization, Parallelism, etc. (★283, updated this week)
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models (★99, updated last year)
- (★23, updated 6 months ago)
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs (★108, updated 2 months ago)
- List of papers related to neural network quantization in recent AI conferences and journals. (★653, updated 2 months ago)
- Efficient Mixture of Experts for LLM Paper List (★77, updated 6 months ago)
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model…" (★63, updated last year)
- [ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs. (★115, updated last year)
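
Most of the quantization repositories above build on the same core primitive: uniform round-to-nearest quantization with a per-channel scale. The sketch below is illustrative only and is not taken from any listed repo; the function names and the 4-bit default are our own assumptions.

```python
import numpy as np

def quantize_symmetric(w: np.ndarray, n_bits: int = 4):
    """Per-channel symmetric round-to-nearest quantization (a common PTQ baseline).

    Illustrative sketch; real repos above add calibration, outlier handling, etc.
    """
    qmax = 2 ** (n_bits - 1) - 1                 # e.g. 7 for signed 4-bit
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)     # guard all-zero channels
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover an approximation of the original weights."""
    return q.astype(np.float32) * scale

# Quantize a random weight matrix; round-to-nearest bounds the
# per-element reconstruction error by half a scale step.
w = np.random.randn(8, 16).astype(np.float32)
q, s = quantize_symmetric(w, n_bits=4)
w_hat = dequantize(q, s)
```

Weight-activation schemes such as W4A4/W4A8 (see the entries above) apply the same idea to activations as well, either with scales fixed at calibration time (static) or computed per batch at inference (dynamic).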