adreamwu / PTQ4DiT
PyTorch implementation of PTQ4DiT https://arxiv.org/abs/2405.16005
⭐15, updated last month
Alternatives and similar repositories for PTQ4DiT:
Users interested in PTQ4DiT are comparing it to the repositories listed below.
- QuEST: Efficient Finetuning for Low-bit Diffusion Models (⭐35, updated this week)
- ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation (⭐37, updated 3 months ago)
- [NeurIPS 2024 Oral 🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. (⭐119, updated 2 months ago)
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023) (⭐125, updated last year)
- A list of papers, docs, codes about efficient AIGC. This repo is aimed to provide the info for efficient AIGC research, including languag… (⭐160, updated last month)
- [CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Mo… (⭐57, updated 4 months ago)
- ⭐15, updated 2 weeks ago
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models (⭐89, updated 9 months ago)
- Code Repository of Evaluating Quantized Large Language Models (⭐105, updated 3 months ago)
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric (⭐52, updated last year)
- [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models. (⭐336, updated 8 months ago)
- The official PyTorch implementation of the ICLR 2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan… (⭐114, updated last year)
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di… (⭐50, updated 6 months ago)
- List of papers related to Vision Transformers quantization and hardware acceleration in recent AI conferences and journals. (⭐60, updated 6 months ago)
- An all-in-one repository of awesome LLM pruning papers, integrating useful resources and insights. (⭐50, updated last week)
- ⭐112, updated 2 months ago
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model… (⭐53, updated 9 months ago)
- Model Compression Toolbox for Large Language Models and Diffusion Models (⭐269, updated last month)
- Awesome list for LLM pruning. (⭐180, updated this week)
- SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models (⭐26, updated 4 months ago)
- A curated list of awesome diffusion inference papers with code, covering sampling, caching, multi-GPU inference, etc. (⭐111, updated this week)
- Collection of awesome generation acceleration resources. (⭐63, updated this week)
- An algorithm for static activation quantization of LLMs (⭐95, updated last month)
- This repository contains integer operators on GPUs for PyTorch. (⭐187, updated last year)
- ⭐13, updated 2 weeks ago
- ⭐17, updated 9 months ago
- SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models (⭐13, updated 2 months ago)
- Official PyTorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity" (⭐53, updated 5 months ago)
- The official implementation of the NeurIPS 2022 paper Q-ViT. (⭐85, updated last year)
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs (⭐85, updated this week)