vudaoanhtuan / Transformer

Transformer, Evolved Transformer Model

☆10

Alternatives and similar repositories for Transformer:

Users who are interested in Transformer are comparing it to the repositories listed below.

HA-Transformer / MAT
The implementation of multi-branch attentive Transformer (MAT).
☆33 Updated 4 years ago
cheneydon / efficient-bert
This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron …
☆32 Updated last year
ictnlp / RSI-NAT
Source code for "Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation"
☆18 Updated 5 years ago
xiangli13 / circle-loss
TensorFlow 2.0 implementation of circle loss
☆32 Updated 5 years ago
clovaai / group-transformer
Official code for Group-Transformer (Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model, COLING…
☆25 Updated 4 years ago
lucidrains / distilled-retriever-pytorch
Implementation of the retriever distillation procedure as outlined in the paper "Distilling Knowledge from Reader to Retriever"
☆32 Updated 4 years ago
intersun / CoDIR
Code for EMNLP 2020 paper CoDIR
☆41 Updated 2 years ago
lancopku / simNet
Code for "simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions" (EMNLP 2018)
☆36 Updated 6 years ago
MAC-AutoML / YOCO-BERT
The official implementation of You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Natu…
☆48 Updated 3 years ago
wangjiongw / Knowledge-Distillation-PyTorch
Knowledge Distillation Algorithms implemented with PyTorch
☆17 Updated 5 years ago
lancopku / AdaNorm
Code for "Understanding and Improving Layer Normalization"
☆46 Updated 5 years ago
sseung0703 / Zero-shot_Knowledge_Distillation
Zero-Shot Knowledge Distillation in Deep Networks (ICML 2019)
☆49 Updated 5 years ago
shengyuzhang / VideoTitling
Comprehensive Information Integration Modeling Framework for Video Titling
☆11 Updated 4 years ago
cloneofsimo / realformer-pytorch
Implementation of RealFormer using PyTorch
☆100 Updated 4 years ago
leaderj1001 / Synthesizer-Rethinking-Self-Attention-Transformer-Models
Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using PyTorch
☆70 Updated 4 years ago
vanewu / CRF
Linear-chain conditional random fields implemented using NumPy and MXNet/Gluon, with batch training supported, not limited to train…
☆23 Updated 6 years ago
ramakanth-pasunuru / CAS-MAS
Code for paper "Continual and Multi-Task Architecture Search (ACL 2019)"
☆41 Updated 5 years ago
littleredxh / HardNegative
☆51 Updated 4 years ago
Andrew-Tierno / QuantizedTransformer
Implementation of a Quantized Transformer Model
☆18 Updated 6 years ago
sid7954 / beam-joint-attention
☆13 Updated 6 years ago
VITA-Group / UMEC
[ICLR 2021] "UMEC: Unified Model and Embedding Compression for Efficient Recommendation Systems" by Jiayi Shen, Haotao Wang*, Shupeng Gui…
☆39 Updated 3 years ago
ZJULearning / TreeAttention
A Better Way to Attend: Attention with Trees for Video Question Answering
☆25 Updated 6 years ago
wasiahmad / transferable_sent2vec
Official code for our work, "Robust, Transferable Sentence Representations for Text Classification" [arXiv 2018].
☆21 Updated 6 years ago
black4321 / InterBERT
The official implementation of InterBERT
☆11 Updated 2 years ago
erenup / pytorch-transformers
👾 A library of state-of-the-art pretrained models for Natural Language Processing (NLP)
☆9 Updated 5 years ago
zhaocq-nlp / Attention-Visualization
Visualization for simple attention and Google's multi-head attention.
☆67 Updated 7 years ago
konstantinkobs / SimLoss
A PyTorch implementation of our proposed loss function from the paper "SimLoss: Class Similarities in Cross Entropy"
☆25 Updated 3 years ago
10-zin / Synthesizer
A PyTorch implementation of the paper - "Synthesizer: Rethinking Self-Attention in Transformer Models"
☆73 Updated 2 years ago
xu-song / k-plug
K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (Findings of EMNLP …
☆31 Updated 2 years ago
kcyu2014 / multi-model-forgetting
ICML 2019 accepted paper: "Overcoming Multi-Model Forgetting"
☆13 Updated 5 years ago