Andrew-Tierno / QuantizedTransformerLinks

Implementation of a Quantized Transformer Model

☆19

Alternatives and similar repositories for QuantizedTransformer

Users that are interested in QuantizedTransformer are comparing it to the libraries listed below

Sorting:

houlu369 / Loss-aware-weight-quantization
Implementation of ICLR 2018 paper "Loss-aware Weight Quantization of Deep Networks"
☆26Updated 5 years ago
houlu369 / Normalized-Quantized-LSTM
Implementation of NeurIPS 2019 paper "Normalization Helps Training of Quantized LSTM"
☆31Updated last year
csyhhu / L-DNQ
Codes for AAAI2019 paper: Deep Neural Network Quantization via Layer-Wise Optimization using Limited Training Data
☆41Updated 6 years ago
kssteven418 / LTP
[KDD'22] Learned Token Pruning for Transformers
☆98Updated 2 years ago
huggingface / block_movement_pruning
Block Sparse movement pruning
☆81Updated 4 years ago
moranshkolnik / RobustQuantization
source code of the paper: Robust Quantization: One Model to Rule Them All
☆40Updated 2 years ago
elliothe / Ternarized_Neural_Network
Optimizing Deep Convolutional Neural Network with Ternarized Weights and High Accuracy
☆16Updated 6 years ago
haolibai / APS-channel-search
Revisiting Parameter Sharing for Automatic Neural Channel Number Search, NeurIPS 2020
☆21Updated 4 years ago
Zhengyu-Li / Deep-Network-Compression-based-on-Student-Teacher-Network-
Deep Neural Network Compression based on Student-Teacher Network
☆14Updated 2 years ago
CAS-CLab / Training-Tricks-for-Binarized-Neural-Networks
The collection of training tricks of binarized neural networks.
☆72Updated 4 years ago
kcyu2014 / multi-model-forgetting
ICML2019 Accepted Paper. Overcoming Multi-Model Forgetting
☆14Updated 6 years ago
peiswang / BitSplit
BitSplit Post-trining Quantization
☆50Updated 3 years ago
csyhhu / Co-Prune
Codes for accepted paper "Cooperative Pruning in Cross-Domain Deep Neural Network Compression" in IJCAI 2019.
☆12Updated 5 years ago
lswzjuer / pytorch-quantity
An 8bit automated quantization conversion tool for the pytorch (Post-training quantization based on KL divergence)
☆33Updated 5 years ago
GATECH-EIC / E2Train
[NeurIPS 2019] E2-Train: Training State-of-the-art CNNs with Over 80% Less Energy
☆21Updated 5 years ago
MAC-AutoML / YOCO-BERT
The official implementation of You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Natu…
☆48Updated 4 years ago
Jangho-Kim / PSG-pytorch
Position-based Scaled Gradient for Model Quantization and Pruning Code (NeurIPS 2020)
☆26Updated 4 years ago
biswajitsc / sparse-embed
Code for paper 'Minimizing FLOPs to Learn Efficient Sparse Representations' published at ICLR 2020
☆20Updated 5 years ago
IBM / PoWER-BERT
Method to improve inference time for BERT. This is an implementation of the paper titled "PoWER-BERT: Accelerating BERT Inference via Pro…
☆61Updated 3 months ago
robeld / ERNIE
Open Source Neural Machine Translation in PyTorch
☆17Updated 6 years ago
sseung0703 / Zero-shot_Knowledge_Distillation
Zero-Shot Knowledge Distillation in Deep Networks in ICML2019
☆49Updated 6 years ago
ziplab / QLLM
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…
☆29Updated last year
yuchaoli / CC
Towards Compact CNNs via Collaborative Compression
☆11Updated 3 years ago
yuchaoli / PST
Source code for IJCAI 2022 Long paper: Parameter-Efficient Sparsity for Large Language Models Fine-Tuning.
☆15Updated 3 years ago
tajanthan / pmf
Proximal Mean-field for Neural Network Quantization
☆22Updated 5 years ago
yashbhalgat / QualcommAI-MicroNet-submission-MixNet
3rd place solution for NeurIPS 2019 MicroNet challenge
☆35Updated 5 years ago
polarizationpruning / PolarizationPruning
Implementation of Neuron-level Structured Pruning using Polarization Regularizer
☆82Updated 2 years ago
submission2019 / AnalyticalScaleForIntegerQuantization
Example for applying Gaussian and Laplace clipping on activations of CNN.
☆34Updated 6 years ago
htqin / BiBERT
This project is the official implementation of our accepted ICLR 2022 paper BiBERT: Accurate Fully Binarized BERT.
☆88Updated 2 years ago
guoyongcs / CNAS
Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search
☆16Updated last year