ictnlp / awesome-transformer
A collection of Transformer guides, implementations, and variants.
☆102 · Updated 4 years ago
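For orientation, here is a minimal PyTorch sketch of scaled dot-product attention, the core operation of the Transformer from "Attention Is All You Need" that most of the repositories below build on. This is an illustrative sketch of our own (the function name and tensor shapes are assumptions), not code taken from any listed project.

```python
# Minimal sketch of scaled dot-product attention (illustrative only,
# not taken from any repository listed here).
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, heads, len, d_k); mask: True marks attendable positions."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # (B, H, L, L)
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))     # block masked slots
    weights = F.softmax(scores, dim=-1)                       # attention distribution
    return weights @ v, weights

# Smoke test with random tensors.
q = k = v = torch.randn(2, 4, 8, 16)
out, attn = scaled_dot_product_attention(q, k, v)
assert out.shape == (2, 4, 8, 16) and attn.shape == (2, 4, 8, 8)
```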
Related projects
Alternatives and complementary repositories for awesome-transformer
- Worth-reading papers and related resources on attention mechanisms, Transformers, and pretrained language models (PLMs) such as BERT. ☆132 · Updated 3 years ago
- This project attempts to maintain SOTA performance in machine translation. ☆108 · Updated 4 years ago
- DisCo Transformer for Non-autoregressive MT ☆78 · Updated 2 years ago
- Code for the paper "Are Sixteen Heads Really Better than One?" ☆168 · Updated 4 years ago
- Some good (maybe) papers about NMT (Neural Machine Translation). ☆84 · Updated 4 years ago
- Understanding the Difficulty of Training Transformers ☆326 · Updated 2 years ago
- Code release for our arXiv paper "Revisiting Few-sample BERT Fine-tuning" (https://arxiv.org/abs/2006.05987). ☆184 · Updated last year
- ICLR 2019: Multilingual Neural Machine Translation with Knowledge Distillation ☆70 · Updated 4 years ago
- Source code for the ACL 2019 paper "Bridging the Gap between Training and Inference for Neural Machine Translation" ☆41 · Updated 4 years ago
- ☆94 · Updated 3 years ago
- Implementation of "Glancing Transformer for Non-Autoregressive Neural Machine Translation" ☆136 · Updated last year
- Code for the RecAdam paper "Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting". ☆115 · Updated 4 years ago
- Code for the ICML 2020 paper "Improving Transformer Optimization Through Better Initialization" ☆89 · Updated 3 years ago
- PyTorch implementation of "Non-Autoregressive Neural Machine Translation" ☆269 · Updated 2 years ago
- Code for the NeurIPS 2020 paper "Incorporating BERT into Parallel Sequence Decoding with Adapters" ☆32 · Updated 2 years ago
- A simple module that consistently outperforms self-attention and the Transformer on the main NMT datasets, with SOTA performance. ☆87 · Updated last year
- A PyTorch implementation of the Transformer from "Attention Is All You Need" ☆103 · Updated 3 years ago
- ☆95 · Updated 2 years ago
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ☆92 · Updated 3 years ago
- Source code for "Efficient Training of BERT by Progressively Stacking" ☆111 · Updated 5 years ago
- ☆120 · Updated 5 years ago
- The implementation of "Learning Deep Transformer Models for Machine Translation" ☆114 · Updated 3 months ago
- ☆13 · Updated 5 years ago
- Source code of the paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning" ☆125 · Updated 3 years ago
- Cascaded Text Generation with Markov Transformers ☆128 · Updated last year
- ☆83 · Updated 4 years ago
- Transformers without Tears: Improving the Normalization of Self-Attention (pre-norm; see the sketch after this list) ☆130 · Updated 5 months ago
- [ACL 2020] Highway Transformer: A Gated Transformer. ☆32 · Updated 2 years ago
- Code for the AAAI 2021 paper "A Theoretical Analysis of the Repetition Problem in Text Generation". ☆51 · Updated 2 years ago
- Tracking the progress in non-autoregressive generation (translation, transcription, etc.) ☆305 · Updated last year
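Several entries above ("Understanding the Difficulty of Training Transformers", "Improving Transformer Optimization Through Better Initialization", "Transformers without Tears", "Learning Deep Transformer Models for Machine Translation") center on where layer normalization sits relative to the residual connection. As a hedged illustration of the pre-norm ordering those works study (our own sketch with assumed dimensions, not code from any listed repository):

```python
# Pre-norm Transformer encoder block: LayerNorm is applied before each
# sublayer and the residual is added afterwards. Post-norm (the original
# Transformer) instead normalizes after the residual add; pre-norm is
# commonly reported to train more stably at depth.
import torch
import torch.nn as nn

class PreNormEncoderBlock(nn.Module):
    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                nn.Linear(d_ff, d_model))
        self.drop = nn.Dropout(dropout)

    def forward(self, x, attn_mask=None):
        h = self.norm1(x)                          # normalize first...
        h, _ = self.attn(h, h, h, attn_mask=attn_mask)
        x = x + self.drop(h)                       # ...then the residual add
        x = x + self.drop(self.ff(self.norm2(x)))  # same pattern for the FFN
        return x

x = torch.randn(2, 10, 512)
assert PreNormEncoderBlock()(x).shape == x.shape
```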