Efficient-ML / Awesome-Efficient-LLM-Diffusion
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and the project is continuously being improved. PRs adding works (papers, repositories) that the repo has missed are welcome.
☆152 · Updated last week
Related projects
Alternatives and complementary repositories for Awesome-Efficient-LLM-Diffusion
- Code Repository of Evaluating Quantized Large Language Models ☆103 · Updated 2 months ago
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023) ☆124 · Updated last year
- ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation ☆31 · Updated 2 months ago
- [CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Mo… ☆54 · Updated 3 months ago
- Awesome list for LLM quantization ☆123 · Updated last month
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ☆101 · Updated last month
- An algorithm for static activation quantization of LLMs ☆67 · Updated this week
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ☆102 · Updated last year
- QuEST: Efficient Finetuning for Low-bit Diffusion Models ☆33 · Updated 3 months ago
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar… ☆54 · Updated 8 months ago
- Awesome list for LLM pruning. ☆159 · Updated last month
- The official PyTorch implementation of the ICLR2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan… ☆112 · Updated last year
- Post-Training Quantization for Vision transformers. ☆188 · Updated 2 years ago
- Awesome LLM pruning papers all-in-one repository with integrating all useful resources and insights. ☆36 · Updated last week
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di… ☆50 · Updated 5 months ago
- The official implementation of the NeurIPS 2022 paper Q-ViT. ☆82 · Updated last year
- This repository contains integer operators on GPUs for PyTorch. ☆181 · Updated last year
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models ☆87 · Updated 8 months ago
- QAQ: Quality Adaptive Quantization for LLM KV Cache ☆42 · Updated 7 months ago
- [ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs. ☆81 · Updated 5 months ago
- This project is the official implementation of our accepted ICLR 2022 paper BiBERT: Accurate Fully Binarized BERT. ☆84 · Updated last year
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model… ☆53 · Updated 8 months ago
- [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a V… ☆315 · Updated this week
- The official implementation of the ICML 2023 paper OFQ-ViT ☆27 · Updated last year
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… ☆42 · Updated last year
- [ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti… ☆58 · Updated 6 months ago
- [IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer ☆308 · Updated last year
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric ☆49 · Updated last year
- List of papers related to Vision Transformers quantization and hardware acceleration in recent AI conferences and journals. ☆53 · Updated 5 months ago
- Official Pytorch Implementation of Our Paper Accepted at ICLR 2024-- Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM… ☆36 · Updated 7 months ago