xyltt / Linear-Transformer
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
☆22 · Updated 4 years ago
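The repo above implements the linear attention of "Transformers are RNNs": replacing the softmax with a positive kernel feature map lets `(QKᵀ)V` be reassociated as `Q(KᵀV)`, dropping the cost from quadratic to linear in sequence length. A minimal NumPy sketch (not taken from this repo; shapes and names are illustrative), using the `elu(x) + 1` feature map from the paper:

```python
import numpy as np

def elu_feature_map(x):
    # phi(x) = elu(x) + 1: strictly positive features, as in the paper
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V, eps=1e-6):
    """O(n) attention: compute Q (K^T V) instead of (Q K^T) V."""
    Qp = elu_feature_map(Q)            # (n, d)
    Kp = elu_feature_map(K)            # (n, d)
    KV = Kp.T @ V                      # (d, d_v), independent of n
    Z = Qp @ Kp.sum(axis=0)            # (n,) normalizer
    return (Qp @ KV) / (Z[:, None] + eps)

rng = np.random.default_rng(0)
Q, K, V = rng.standard_normal((3, 8, 4))
out = linear_attention(Q, K, V)        # shape (8, 4)
```

Because the attention weights are positive and normalized by `Z`, each output row is a weighted average of the rows of `V`, just as with softmax attention; only the weighting kernel differs.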
Alternatives and similar repositories for Linear-Transformer:
Users interested in Linear-Transformer are comparing it to the repositories listed below:
- Code for Explicit Sparse Transformer ☆60 · Updated last year
- ☆33 · Updated 4 years ago
- Implementation of AAAI 2022 paper: Go Wider Instead of Deeper ☆32 · Updated 2 years ago
- Code for the AAAI 2022 publication "Well-classified Examples are Underestimated in Classification with Deep Neural Networks" ☆51 · Updated 2 years ago
- [AAAI 2022] Official PyTorch implementation of "Less is More: Pay Less Attention in Vision Transformers" ☆96 · Updated 2 years ago
- Recent Advances in MLP-based Models (MLP is all you need!) ☆115 · Updated 2 years ago
- PyTorch implementation of Pay Attention to MLPs ☆40 · Updated 3 years ago
- Mixture of Attention Heads ☆44 · Updated 2 years ago
- Mask Attention Networks: Rethinking and Strengthen Transformer (NAACL 2021) ☆14 · Updated 3 years ago
- [ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wen… ☆79 · Updated last year
- ☆27 · Updated 2 years ago
- For the paper "Gaussian Transformer: A Lightweight Approach for Natural Language Inference" ☆28 · Updated 5 years ago
- [NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang… ☆89 · Updated last year
- Custom PyTorch implementation of MoCo v3 ☆45 · Updated 4 years ago
- [ICLR 2022] Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention ☆190 · Updated 2 years ago
- A repository for DenseSSMs ☆87 · Updated last year
- Learning to Encode Position for Transformer with Continuous Dynamical Model ☆59 · Updated 4 years ago
- Official implementation for the paper "Relational Surrogate Loss Learning", ICLR 2022 ☆37 · Updated 2 years ago
- Implementation of Hire-MLP: Vision MLP via Hierarchical Rearrangement and An Image Patch is a Wave: Phase-Aware Vision MLP ☆37 · Updated 2 years ago
- PyTorch implementation of Performer from the paper "Rethinking Attention with Performers" ☆25 · Updated 4 years ago
- [EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper The Devil in Linear Transformer ☆60 · Updated last year
- The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha… ☆67 · Updated 3 years ago
- Open-source code for research published on arXiv: https://arxiv.org/abs/2106.02689 ☆51 · Updated 3 years ago
- Official implementation of "CAT: Cross Attention in Vision Transformer" ☆157 · Updated 2 years ago
- Dynamic Early Exit for Image Captioning ☆17 · Updated 2 years ago
- FlatNCE: A Novel Contrastive Representation Learning Objective ☆90 · Updated 3 years ago
- BM-NAS: Bilevel Multimodal Neural Architecture Search (AAAI 2022 Oral) ☆17 · Updated 2 years ago
- CVPR 2022, BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning, https://arxiv.org/abs/2203.01522 ☆250 · Updated last year
- Implementation for Context-Gated Convolution ☆59 · Updated 3 years ago
- Reproducing the Linear Multihead Attention introduced in the Linformer paper (Linformer: Self-Attention with Linear Complexity) ☆76 · Updated 4 years ago
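Several entries in the list above (Linformer, Performer, cosFormer) take a different route to linear complexity than the headline repo: Linformer keeps the softmax but projects the keys and values along the sequence axis, so the attention matrix is n×k rather than n×n. A minimal NumPy sketch of that idea (not code from the listed repo; `E` and `F` stand in for its learned projection matrices):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def linformer_attention(Q, K, V, E, F):
    # E, F: (k, n) low-rank projections along the sequence axis,
    # shrinking the attention matrix from (n, n) to (n, k)
    d = Q.shape[-1]
    Kp = E @ K                           # (k, d) projected keys
    Vp = F @ V                           # (k, d_v) projected values
    A = softmax(Q @ Kp.T / np.sqrt(d))   # (n, k) attention weights
    return A @ Vp                        # (n, d_v)

rng = np.random.default_rng(0)
n, d, k = 16, 4, 4
Q, K, V = rng.standard_normal((3, n, d))
E, F = rng.standard_normal((2, k, n))
out = linformer_attention(Q, K, V, E, F)  # shape (16, 4)
```

With fixed k, both time and memory scale linearly in n; the trade-off is that the projections must capture the sequence's structure, whereas the kernel-based approach above makes no low-rank assumption on the inputs.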