tigert1998 / qat
Manually implemented quantization-aware training
☆21 Updated last year
Related projects:
- PyTorch Quantization Aware Training Example ☆119 Updated 4 months ago
- [ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization ☆96 Updated 2 years ago
- ☆113 Updated last year
- Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming ☆94 Updated 3 years ago
- CUDA Templates for Linear Algebra Subroutines ☆90 Updated 4 months ago
- An Out-of-box PyTorch Scaffold for Neural Network Quantization-Aware-Training (QAT) Research. Website: https://github.com/zhutmost/neuralz… ☆26 Updated last year
- A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer ☆82 Updated 6 months ago
- Neural Network Quantization & Low-Bit Fixed Point Training For Hardware-Friendly Algorithm Design ☆157 Updated 3 years ago
- ☆212 Updated last year
- Code for our paper at ECCV 2020: Post-Training Piecewise Linear Quantization for Deep Neural Networks ☆66 Updated 2 years ago
- Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming ☆31 Updated last year
- ☆186 Updated 2 years ago
- FakeQuantize with Learned Step Size (LSQ+) as Observer in PyTorch ☆32 Updated 2 years ago
- BitSplit Post-training Quantization ☆46 Updated 2 years ago
- An 8-bit automated quantization conversion tool for PyTorch (Post-training quantization based on KL divergence) ☆33 Updated 4 years ago
- Inference of quantization aware trained networks using TensorRT ☆77 Updated last year
- PyTorch implementation of "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference" ☆53 Updated 5 years ago
- Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022. ☆112 Updated 2 years ago
- play gemm with tvm ☆81 Updated last year
- This repository contains the PyTorch scripts to train mixed-precision networks for microcontroller deployment, based on the memory contr… ☆47 Updated 4 months ago
- Post-training sparsity-aware quantization ☆32 Updated last year
- TQT's pytorch implementation. ☆20 Updated 2 years ago
- Swin Transformer C++ Implementation ☆53 Updated 3 years ago
- Offline Quantization Tools for Deploy. ☆109 Updated 8 months ago
- A Winograd Minimal Filter Implementation in CUDA ☆20 Updated 3 years ago
- ☆92 Updated 3 years ago
- ☆18 Updated 5 months ago
- PyTorch implementation of Data Free Quantization Through Weight Equalization and Bias Correction. ☆256 Updated 11 months ago
- ☆67 Updated 2 years ago
- ☆52 Updated this week