xyltt / Linear-Transformer
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
☆18 · Updated 4 years ago
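The paper this repo implements replaces softmax attention with a kernel feature map φ(x) = elu(x) + 1, so the attention output can be computed as φ(Q)(φ(K)ᵀV) in O(N·d²) rather than O(N²·d). A minimal NumPy sketch of the non-causal case, assuming nothing about this repo's actual API (function names here are illustrative):

```python
import numpy as np

def elu_feature_map(x):
    # phi(x) = elu(x) + 1: the positive feature map used in the paper
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V, eps=1e-6):
    """Non-causal linear attention.

    Q, K: (N, d_k); V: (N, d_v). Cost is O(N * d_k * d_v)
    because K^T V is computed once and reused for every query.
    """
    Qp = elu_feature_map(Q)              # (N, d_k)
    Kp = elu_feature_map(K)              # (N, d_k)
    KV = Kp.T @ V                        # (d_k, d_v), shared across queries
    Z = Qp @ Kp.sum(axis=0)              # (N,) per-query normaliser
    return (Qp @ KV) / (Z[:, None] + eps)
```

Because the per-query weights φ(qᵢ)·φ(kⱼ)/Zᵢ are positive and sum to one, each output row is a convex combination of value rows, just as in softmax attention.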
Alternatives and similar repositories for Linear-Transformer:
Users interested in Linear-Transformer are comparing it to the repositories listed below.
- Code for Explicit Sparse Transformer ☆58 · Updated last year
- Mixture of Attention Heads ☆41 · Updated 2 years ago
- Mask Attention Networks: Rethinking and Strengthen Transformer (NAACL 2021) ☆14 · Updated 3 years ago
- [ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wen… ☆78 · Updated last year
- BM-NAS: Bilevel Multimodal Neural Architecture Search (AAAI 2022 Oral) ☆17 · Updated 2 years ago
- ☆32 · Updated 3 years ago
- [AAAI 2022] Official PyTorch implementation of "Less is More: Pay Less Attention in Vision Transformers" ☆94 · Updated 2 years ago
- [NeurIPS 2022 Spotlight] Official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity" ☆70 · Updated 2 years ago
- [NeurIPS 2021] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang… ☆90 · Updated last year
- Code for the paper "Gaussian Transformer: A Lightweight Approach for Natural Language Inference" ☆29 · Updated 4 years ago
- Recent Advances in MLP-based Models (MLP is all you need!) ☆113 · Updated 2 years ago
- Accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha…) ☆64 · Updated 3 years ago
- Reproduction of the Linear Multihead Attention introduced in "Linformer: Self-Attention with Linear Complexity" ☆72 · Updated 4 years ago
- [ICLR 2022] Official implementation of cosformer-attention from "cosFormer: Rethinking Softmax in Attention" ☆185 · Updated 2 years ago
- Implementation of "Hire-MLP: Vision MLP via Hierarchical Rearrangement" and "An Image Patch is a Wave: Phase-Aware Vision MLP" ☆34 · Updated 2 years ago
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration (CVPR 2021) ☆63 · Updated 3 years ago
- [ICCV 2023] Official PyTorch code for "Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution" ☆26 · Updated 10 months ago
- FlatNCE: A Novel Contrastive Representation Learning Objective ☆89 · Updated 3 years ago
- PyTorch implementation of "Pay Attention to MLPs" ☆40 · Updated 3 years ago
- Learning to Encode Position for Transformer with Continuous Dynamical Model ☆59 · Updated 4 years ago
- Implementation of the AAAI 2022 paper "Go Wider Instead of Deeper" ☆32 · Updated 2 years ago
- Implementation of Context-Gated Convolution ☆59 · Updated 3 years ago
- [EMNLP 2022] Official implementation of TransNormer from "The Devil in Linear Transformer" ☆59 · Updated last year
- Learning with Noisy Labels, Label Noise (ICML 2021) ☆43 · Updated last year
- Unofficial implementation of "BOAT: Bilateral Local Attention Vision Transformer" ☆54 · Updated 2 years ago
- An implementation of the efficient attention module ☆300 · Updated 4 years ago
- [EVA ICLR 2023; LARA ICML 2022] Efficient attention mechanisms via control variates, random features, and importance sampling ☆80 · Updated last year
- Official code for dynamic convolution decomposition ☆131 · Updated 3 years ago
- [ICLR 2022 Spotlight] Official code for "On the Connection between Local Attention and Dynamic Depth-wise Convolution" ☆184 · Updated 2 years ago
- ☆27 · Updated 2 years ago
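Several of the repositories above (the Linformer reproduction, the efficient attention module) reach linear complexity a different way than the kernel trick: they compress the sequence axis of the keys and values with learned projections before running ordinary softmax attention. A minimal NumPy sketch of that idea, assuming random projections in place of the learned matrices described in the Linformer paper (names here are illustrative, not taken from any of these repos):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def linformer_attention(Q, K, V, E, F):
    """Linformer-style attention.

    Q, K, V: (N, d); E, F: (k, N) projections along the sequence
    axis (learned in the paper, random here). The score matrix is
    (N, k) instead of (N, N), so cost is O(N * k * d).
    """
    Kp = E @ K                                  # (k, d) compressed keys
    Vp = F @ V                                  # (k, d) compressed values
    scores = Q @ Kp.T / np.sqrt(Q.shape[-1])    # (N, k)
    return softmax(scores) @ Vp                 # (N, d)
```

Since the softmax weights over the k compressed slots sum to one, each output row stays inside the convex hull of the compressed value rows.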