ThisisBillhe/torch_quantizer

torch_quantizer is an out-of-the-box quantization tool for PyTorch models on the CUDA backend, specially optimized for diffusion models.

☆25

Alternatives and similar repositories for torch_quantizer

Users that are interested in torch_quantizer are comparing it to the libraries listed below.

ThisisBillhe / ZipCache
[NeurIPS 2024] The official implementation of ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification
☆33 · Mar 30, 2025 · Updated last year
ThisisBillhe / EfficientDM
[ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di…
☆73 · Jun 4, 2024 · Updated 2 years ago
ThisisBillhe / ZipAR
[ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality…
☆51 · Mar 25, 2025 · Updated last year
ThisisBillhe / BiViT
The official implementation of BiViT: Extremely Compressed Binary Vision Transformers
☆16 · Jun 18, 2023 · Updated 3 years ago
ModelTC / QLLM
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…
☆39 · Mar 11, 2024 · Updated 2 years ago
HubHop / vit-attention-benchmark
Benchmarking Attention Mechanism in Vision Transformers.
☆20 · Oct 10, 2022 · Updated 3 years ago
ilur98 / DGQ
Official Code For Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM
☆14 · Dec 27, 2023 · Updated 2 years ago
ziplab / HVT
[ICCV 2021] Official implementation of "Scalable Vision Transformers with Hierarchical Pooling"
☆32 · Dec 30, 2021 · Updated 4 years ago
ziplab / QLLM
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…
☆31 · Mar 12, 2024 · Updated 2 years ago
ziplab / QTool
Collections of model quantization algorithms. Any issues, please contact Peng Chen (blueardour@gmail.com)
☆73 · Oct 7, 2021 · Updated 4 years ago
RongKaiWeskerMA / INSTA
The implementation of Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning
☆13 · Apr 14, 2024 · Updated 2 years ago
ziplab / MPVSS
☆33 · Feb 29, 2024 · Updated 2 years ago
ziplab / PTQD
The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models
☆103 · Mar 12, 2024 · Updated 2 years ago
yhhhli / BRECQ
PyTorch implementation of BRECQ, ICLR 2021
☆300 · Aug 1, 2021 · Updated 4 years ago
xiezheng-cs / DTQ
PyTorch implementation of "Deep Transferring Quantization" (ECCV 2020)
☆18 · Jun 22, 2022 · Updated 4 years ago
ziplab / FASeg
[CVPR 2023] This is the official PyTorch implementation for "Dynamic Focus-aware Positional Queries for Semantic Segmentation".
☆61 · Mar 4, 2023 · Updated 3 years ago
IST-DASLab / QUIK
Repository for the QUIK project, enabling the use of 4bit kernels for generative inference - EMNLP 2024
☆185 · Apr 16, 2024 · Updated 2 years ago
QianyiWu / Awesome-Object-Compositional-INR
A collection of object-compositional modeling by implicit neural representation.
☆59 · Aug 12, 2023 · Updated 2 years ago
weijiawu / ParaDiffusion
[IJCV 2025] Paragraph-to-Image Generation with Information-Enriched Diffusion Model
☆107 · Mar 24, 2025 · Updated last year
ModelTC / Outlier_Suppression_Plus
Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti…
☆52 · Oct 21, 2023 · Updated 2 years ago
jingjing0419 / SAQ-SAM
[AAAI 2026] Implementation of SAQ-SAM: Semantically-Aligned Quantization for Segment Anything Model
☆17 · Nov 27, 2025 · Updated 7 months ago
PannenetsF / TQT
PyTorch implementation of TQT.
☆22 · Dec 17, 2021 · Updated 4 years ago
ziplab / EcoFormer
[NeurIPS 2022 Spotlight] This is the official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity"
☆74 · Nov 15, 2022 · Updated 3 years ago
wimh966 / outlier_suppression
The official PyTorch implementation of the NeurIPS 2022 (spotlight) paper, Outlier Suppression: Pushing the Limit of Low-bit Transformer L…
☆49 · Oct 5, 2022 · Updated 3 years ago
zyxxmu / Bi-Mask
PyTorch implementation of our paper accepted by ICML 2023 -- "Bi-directional Masks for Efficient N:M Sparse Training"
☆13 · Jun 7, 2023 · Updated 3 years ago
aim-uofa / model-quantization
Collections of model quantization algorithms. Any issues, please contact Peng Chen (blueardour@gmail.com)
☆45 · Aug 19, 2021 · Updated 4 years ago
BienLuky / EDA-DM
[TIP 2026] The official implementation of "EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models"
☆21 · Jul 8, 2025 · Updated last year
csyhhu / MetaQuant
Code for the accepted paper "MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization" in NeurIPS 2019
☆54 · May 8, 2020 · Updated 6 years ago
ziplab / SPT
[ICCV 2023 oral] This is the official repository for our paper: "Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning".
☆76 · Sep 24, 2023 · Updated 2 years ago
ThisisBillhe / tiny-stable-diffusion
Tiny optimized Stable-diffusion that can run on GPUs with just 1GB of VRAM. (Beta)
☆182 · Jul 20, 2023 · Updated 3 years ago
facebookresearch / Ternary_Binary_Transformer
ACL 2023
☆39 · Jun 6, 2023 · Updated 3 years ago
alibaba-damo-academy / K-Forcing
Official implementation for "K-Forcing: Joint Next-K-Token Decoding via Push-Forward Language Modeling"
☆16 · Jun 14, 2026 · Updated last month
bohanzhuang / Group-Net-semantic-segmentation
Structured Binary Neural Networks for Image Recognition
☆16 · Oct 12, 2022 · Updated 3 years ago
HSG-AIML / NeurIPS_2022-Generative_Hyper_Representations
Code Repository for the NeurIPS 2022 paper: "Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights".
☆18 · Jul 10, 2024 · Updated 2 years ago
aim-uofa / OIR
[ICLR 2024] Official PyTorch/Diffusers implementation of "Object-aware Inversion and Reassembly for Image Editing"
☆87 · Aug 23, 2024 · Updated last year
jakc4103 / scale-adjusted-training
PyTorch implementation of Towards Efficient Training for Neural Network Quantization
☆16 · Jan 16, 2020 · Updated 6 years ago
ziplab / SPViT
[TPAMI 2024] This is the official repository for our paper: "Pruning Self-attentions into Convolutional Layers in Single Path".
☆116 · Dec 30, 2023 · Updated 2 years ago
NVlabs / T-Stitch
[ICLR 2025] Official PyTorch implementation of paper "T-Stitch: Accelerating Sampling in Pre-trained Diffusion Models with Trajectory Stit…
☆107 · Feb 26, 2024 · Updated 2 years ago
apryor6 / pipcudemo
An example project showing how to build a pip-installable Python package that invokes custom CUDA/C++ code
☆14 · Jul 12, 2017 · Updated 9 years ago