mit-han-lab / hardware-aware-transformers
[ACL'20] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
☆330 Updated 4 months ago
Related projects
Alternatives and complementary repositories for hardware-aware-transformers
- Low Precision Arithmetic Simulation in PyTorch ☆265 Updated 6 months ago
- [ICML'21 Oral] I-BERT: Integer-only BERT Quantization ☆229 Updated last year
- Simple Training and Deployment of Fast End-to-End Binary Networks ☆159 Updated 2 years ago
- [CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision ☆370 Updated 3 years ago
- Graph Transforms to Quantize and Retrain Deep Neural Nets in TensorFlow ☆168 Updated 4 years ago
- Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM. ☆413 Updated last year
- ☆195 Updated 3 years ago
- Implements quantized distillation. Code for our paper "Model compression via distillation and quantization" ☆330 Updated 3 months ago
- [CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework ☆274 Updated 11 months ago
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training ☆201 Updated last year
- [ICLR 2020] Lite Transformer with Long-Short Range Attention ☆599 Updated 4 months ago
- PyTorch implementation for the APoT quantization (ICLR 2020) ☆268 Updated 2 years ago
- Block-sparse primitives for PyTorch ☆148 Updated 3 years ago
- [CVPR 2020] APQ: Joint Search for Network Architecture, Pruning and Quantization Policy ☆156 Updated 4 years ago
- Papers about model compression ☆165 Updated last year
- Papers for deep neural network compression and acceleration ☆396 Updated 3 years ago
- Quantization of Convolutional Neural networks. ☆239 Updated 3 months ago
- LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks ☆239 Updated 2 years ago
- ☆214 Updated 2 years ago
- PyTorch implementation of Data Free Quantization Through Weight Equalization and Bias Correction. ☆258 Updated last year
- PyTorch layer-by-layer model profiler ☆607 Updated 3 years ago
- [ICLR 2021] HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark ☆105 Updated last year
- Awesome machine learning model compression research papers, quantization, tools, and learning material. ☆491 Updated 2 months ago
- Reference implementations of popular Binarized Neural Networks ☆104 Updated 3 weeks ago
- A general and accurate MACs / FLOPs profiler for PyTorch models ☆571 Updated 6 months ago
- Prune a model while finetuning or training. ☆394 Updated 2 years ago
- Research and development for optimizing transformers ☆125 Updated 3 years ago
- Slicing a PyTorch Tensor Into Parallel Shards ☆296 Updated 3 years ago
- Code for "AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling" ☆103 Updated 3 years ago
- [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion ☆40 Updated 3 years ago