mit-han-lab / hardware-aware-transformers
[ACL'20] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
☆325 · Updated 2 months ago
Related projects:
- Low Precision Arithmetic Simulation in PyTorch ☆258 · Updated 3 months ago
- [ICML'21 Oral] I-BERT: Integer-only BERT Quantization ☆220 · Updated last year
- Simple Training and Deployment of Fast End-to-End Binary Networks ☆159 · Updated 2 years ago
- Implements quantized distillation. Code for our paper "Model compression via distillation and quantization" ☆329 · Updated last month
- Graph Transforms to Quantize and Retrain Deep Neural Nets in TensorFlow ☆169 · Updated 4 years ago
- [CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework ☆274 · Updated 9 months ago
- Block-sparse primitives for PyTorch ☆147 · Updated 3 years ago
- PyTorch implementation for the APoT quantization (ICLR 2020) ☆265 · Updated last year
- Prune a model while finetuning or training. ☆392 · Updated 2 years ago
- [ICLR 2020] Lite Transformer with Long-Short Range Attention ☆596 · Updated 2 months ago
- Papers for deep neural network compression and acceleration ☆394 · Updated 3 years ago
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training ☆196 · Updated last year
- Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM. ☆410 · Updated last year
- papers about model compression ☆164 · Updated last year
- A general and accurate MACs / FLOPs profiler for PyTorch models ☆556 · Updated 4 months ago
- Slicing a PyTorch Tensor Into Parallel Shards ☆295 · Updated 3 years ago
- [CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision ☆365 · Updated 3 years ago
- ☆186 · Updated 2 years ago
- Quantization of Convolutional Neural networks. ☆237 · Updated last month
- PyTorch library to facilitate development and standardized evaluation of neural network pruning methods. ☆420 · Updated last year
- [CVPR 2020] APQ: Joint Search for Network Architecture, Pruning and Quantization Policy ☆156 · Updated 4 years ago
- Fast Block Sparse Matrices for Pytorch ☆546 · Updated 3 years ago
- ☆212 · Updated 5 years ago
- A library for researching neural networks compression and acceleration methods. ☆135 · Updated 2 weeks ago
- PyTorch implementation of Data Free Quantization Through Weight Equalization and Bias Correction. ☆256 · Updated 11 months ago
- ☆212 · Updated last year
- A Pytorch implementation of Neural Network Compression (pruning, deep compression, channel pruning) ☆154 · Updated 2 months ago
- PyTorch layer-by-layer model profiler ☆608 · Updated 3 years ago
- Awesome machine learning model compression research papers, quantization, tools, and learning material. ☆471 · Updated last month
- aw_nas: A Modularized and Extensible NAS Framework ☆246 · Updated 11 months ago