LibertFan / MAN
Mask Attention Networks: Rethinking and Strengthen Transformer (NAACL 2021)
☆15 Updated 3 years ago
Related projects:
- Code for the paper "Gaussian Transformer: A Lightweight Approach for Natural Language Inference" ☆27 Updated 4 years ago
- PyTorch implementation of Pay Attention to MLPs ☆39 Updated 3 years ago
- Official code for the paper "Self-Distillation for Few-Shot Image Captioning" ☆13 Updated 3 years ago
- How Does Selective Mechanism Improve Self-attention Networks? ☆27 Updated 3 years ago
- Code for Explicit Sparse Transformer ☆57 Updated last year
- Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention ☆17 Updated 3 years ago
- Custom PyTorch implementation of MoCo v3 ☆43 Updated 3 years ago
- Sparse Attention with Linear Units ☆17 Updated 3 years ago
- Text style transfer benchmark ☆54 Updated 3 years ago
- Texar (tf-backend) implementation of "GTAE: Graph-Transformer Based Auto Encoder for Text Style Transfer" ☆45 Updated last month
- Code implementation for the paper "On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals". ☆16 Updated 2 years ago
- Mixture of Attention Heads ☆36 Updated last year
- Code for "Understanding and Improving Layer Normalization" ☆44 Updated 4 years ago
- Code for EMNLP 2022 paper “Distilled Dual-Encoder Model for Vision-Language Understanding” ☆29 Updated last year
- Official implementation for paper "Relational Surrogate Loss Learning", ICLR 2022 ☆37 Updated last year
- Unpaired Image Captioning ☆35 Updated 3 years ago
- Code for our NeurIPS 2020 paper "Auto Learning Attention", coming soon ☆21 Updated 3 years ago
- (ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models. ☆21 Updated 2 years ago
- Contrastive Learning for Image Captioning ☆51 Updated 6 years ago
- Source code of Multi-modal Circulant Fusion (MCF) for Temporal Activity Localization ☆22 Updated 5 years ago
- Starter code for the VMT task and challenge ☆50 Updated 4 years ago
- PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021) ☆17 Updated last year
- Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using PyTorch ☆70 Updated 4 years ago
- Examples of drawing illustration plots for papers using the seaborn package ☆13 Updated 4 years ago
- Official PyTorch implementation for ECCV'20 paper: Deep Image Clustering with Category-Style Representation ☆17 Updated 4 years ago