clovaai / group-transformer

Official code for Group-Transformer (Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model, COLING-2020).

☆25

Alternatives and similar repositories for group-transformer

Users who are interested in group-transformer are comparing it to the libraries listed below.

cloneofsimo / realformer-pytorch
Implementation of RealFormer using pytorch
☆100 Updated 4 years ago
hoya012 / automatic-mixed-precision-tutorials-pytorch
Automatic Mixed Precision Tutorials using pytorch. Based on PyTorch 1.6 Official Features, implement classification codebase using custo…
☆89 Updated 4 years ago
leaderj1001 / Synthesizer-Rethinking-Self-Attention-Transformer-Models
Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch
☆70 Updated 5 years ago
Kirill-Kravtsov / drophead-pytorch
An implementation of drophead regularization for pytorch transformers
☆19 Updated 3 years ago
sseung0703 / Zero-shot_Knowledge_Distillation
Zero-Shot Knowledge Distillation in Deep Networks (ICML 2019)
☆49 Updated 6 years ago
ykmoon0814 / scene-text-detection-recognition-papers
☆46 Updated 4 years ago
naver-ai / PfLayer
Learning Features with Parameter-Free Layers, ICLR 2022
☆84 Updated 2 years ago
clovaai / length-adaptive-transformer
Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021)
☆101 Updated 4 years ago
clovaai / embedding-expansion
Official MXNet implementation of "Embedding Expansion: Augmentation in Embedding Space for Deep Metric Learning" (CVPR 2020)
☆79 Updated 2 years ago
lucidrains / long-short-transformer
Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch
☆119 Updated 4 years ago
MAC-AutoML / YOCO-BERT
The official implementation of You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Natu…
☆48 Updated 4 years ago
SunQpark / pytorch-template
Simple base template for PyTorch deep learning projects. Features a clean implementation of DDP training and Hydra config.
☆61 Updated 8 months ago
bloodwass / mixout
Implementation of Mixout with PyTorch
☆75 Updated 2 years ago
10-zin / Synthesizer
A PyTorch implementation of the paper - "Synthesizer: Rethinking Self-Attention in Transformer Models"
☆73 Updated 2 years ago
sseung0703 / Knowledge_distillation_via_TF2.0
Code for recent knowledge distillation algorithms and benchmark results, written with the TF2.0 low-level API
☆111 Updated 3 years ago
jaketae / g-mlp
PyTorch implementation of Pay Attention to MLPs
☆40 Updated 4 years ago
dreamgonfly / transformer-pytorch
A PyTorch implementation of Transformer in "Attention is All You Need"
☆106 Updated 4 years ago
vrvlive / knowlege-distillation
PyTorch, PyTorch Lightning framework for trying knowledge distillation in image classification problems
☆32 Updated last year
ankandrew / online-label-smoothing-pt
Implementation of Online Label Smoothing in PyTorch
☆94 Updated 2 years ago
thunlp / TR-BERT
Source code for NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference"
☆47 Updated 3 years ago
IBM / PoWER-BERT
Method to improve inference time for BERT. This is an implementation of the paper titled "PoWER-BERT: Accelerating BERT Inference via Pro…
☆61 Updated 3 months ago
affjljoo3581 / Differentiable-RandAugment
Optimize RandAugment with differentiable operations
☆25 Updated 4 years ago
pkuzengqi / Skyformer
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021)
☆62 Updated 3 years ago
Junya-Chen / FlatCLR
FlatNCE: A Novel Contrastive Representation Learning Objective
☆90 Updated 3 years ago
herobd / layoutlmv2
running LayoutLMv2
☆11 Updated 3 years ago
sIncerass / powernorm
[ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845
☆120 Updated 4 years ago
UKPLab / MMT-Retrieval
☆131 Updated 2 years ago
littleredxh / HardNegative
☆52 Updated 4 years ago
huggingface / block_movement_pruning
Block Sparse movement pruning
☆81 Updated 4 years ago
davidsvy / cosformer-pytorch
Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention".
☆44 Updated 3 years ago