vudaoanhtuan / Transformer
Transformer, Evolved Transformer Model
☆10 Updated 5 years ago
Alternatives and similar repositories for Transformer:
Users who are interested in Transformer are comparing it to the libraries listed below:
- The implementation of multi-branch attentive Transformer (MAT). ☆33 Updated 4 years ago
- This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron … ☆32 Updated last year
- Source code for "Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation" ☆18 Updated 5 years ago
- tf2.0 implementation of circle loss ☆32 Updated 5 years ago
- Official code for Group-Transformer (Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model, COLING… ☆25 Updated 4 years ago
- Implementation of the retriever distillation procedure as outlined in the paper "Distilling Knowledge from Reader to Retriever" ☆32 Updated 4 years ago
- Code for EMNLP 2020 paper CoDIR ☆41 Updated 2 years ago
- Code for "simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions" (EMNLP 2018) ☆36 Updated 6 years ago
- The official implementation of You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Natu… ☆48 Updated 3 years ago
- Knowledge Distillation Algorithms implemented with PyTorch ☆17 Updated 5 years ago
- Code for "Understanding and Improving Layer Normalization" ☆46 Updated 5 years ago
- Zero-Shot Knowledge Distillation in Deep Networks in ICML2019 ☆49 Updated 5 years ago
- Comprehensive Information Integration Modeling Framework for Video Titling ☆11 Updated 4 years ago
- Implementation of RealFormer using pytorch ☆100 Updated 4 years ago
- Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch ☆70 Updated 4 years ago
- Linear chain conditional random fields are implemented using Numpy and Mxnet/Gluon, and batch training is supported, not limited to train… ☆23 Updated 6 years ago
- Code for paper "Continual and Multi-Task Architecture Search (ACL 2019)" ☆41 Updated 5 years ago
- ☆51 Updated 4 years ago
- Implementation of a Quantized Transformer Model ☆18 Updated 6 years ago
- ☆13 Updated 6 years ago
- [ICLR 2021] "UMEC: Unified Model and Embedding Compression for Efficient Recommendation Systems" by Jiayi Shen, Haotao Wang*, Shupeng Gui… ☆39 Updated 3 years ago
- A Better Way to Attend: Attention with Trees for Video Question Answering ☆25 Updated 6 years ago
- Official code of our work, Robust, Transferable Sentence Representations for Text Classification [Arxiv 2018]. ☆21 Updated 6 years ago
- The official implementation of InterBERT ☆11 Updated 2 years ago
- 👾 A library of state-of-the-art pretrained models for Natural Language Processing (NLP) ☆9 Updated 5 years ago
- Visualization for simple attention and Google's multi-head attention. ☆67 Updated 7 years ago
- A PyTorch implementation of our proposed loss function from the paper "SimLoss: Class Similarities in Cross Entropy" ☆25 Updated 3 years ago
- A PyTorch implementation of the paper - "Synthesizer: Rethinking Self-Attention in Transformer Models" ☆73 Updated 2 years ago
- K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (Findings of EMNLP … ☆31 Updated 2 years ago
- ICML2019 Accepted Paper. Overcoming Multi-Model Forgetting ☆13 Updated 5 years ago