juntang-zhuang / fairseq-adabelief
☆9 · Updated 3 years ago
Related projects:
- Official PyTorch implementation of Time-aware Large Kernel (TaLK) Convolutions (ICML 2020) ☆29 · Updated 3 years ago
- Code for EMNLP 2020 paper CoDIR ☆41 · Updated last year
- Source code repo for paper "TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation" ☆10 · Updated last year
- (ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models. ☆21 · Updated 2 years ago
- Some improvements on Adam ☆28 · Updated 3 years ago
- Implementation of the retriever distillation procedure as outlined in the paper "Distilling Knowledge from Reader to Retriever" ☆32 · Updated 3 years ago
- ☆19 · Updated this week
- Code for Category-aware Generative Adversarial Networks (AAAI 2020) ☆18 · Updated 4 years ago
- ☆13 · Updated 4 years ago
- An adaptive training algorithm for residual networks ☆14 · Updated 4 years ago
- The implementation of the paper "Efficient Attention Network: Accelerate Attention by Searching Where to Plug". ☆20 · Updated last year
- Official PyTorch implementation for the paper "SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients" ☆17 · Updated 2 years ago
- [AAAI'21] Modeling Deep Learning Based Privacy Attacks on Physical Mail ☆12 · Updated 3 years ago
- Code and dataset for "Transfer Learning Between Related Tasks Using Expected Label Proportions" ☆16 · Updated 4 years ago
- The implementation of multi-branch attentive Transformer (MAT). ☆33 · Updated 4 years ago
- Code for our paper: "Regularity Normalization: Neuroscience-Inspired Unsupervised Attention across Neural Network Layers". ☆21 · Updated 2 years ago
- Curriculum Learning related papers and materials ☆53 · Updated 3 years ago
- ☆20 · Updated 4 years ago
- Interpolation between Residual and Non-Residual Networks, ICML 2020. https://arxiv.org/abs/2006.05749 ☆26 · Updated 4 years ago
- ☆16 · Updated 3 years ago
- ☆10 · Updated 2 years ago
- Code for paper "Continual and Multi-Task Architecture Search (ACL 2019)" ☆41 · Updated 5 years ago
- ☆22 · Updated 3 years ago
- Implementation of Mogrifier LSTM in PyTorch ☆35 · Updated 4 years ago
- Reversible Recurrent Neural Network PyTorch Implementation ☆21 · Updated 6 years ago
- Code for "Understanding and Improving Layer Normalization" ☆44 · Updated 4 years ago
- Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in PyTorch ☆45 · Updated 3 years ago
- Code for the paper "Query-Key Normalization for Transformers" ☆33 · Updated 3 years ago
- A PyTorch implementation of the paper "Synthesizer: Rethinking Self-Attention in Transformer Models" ☆70 · Updated last year
- ICLR2020 Downloader & Search Tool ☆18 · Updated 4 years ago