xyltt / Linear-Transformer
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
☆17 · Updated 3 years ago
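The paper this repo implements replaces softmax attention with a kernel feature map, so attention can be computed in O(N) rather than O(N²) in sequence length. Below is a minimal NumPy sketch of that idea using the paper's elu(x)+1 feature map; it is an illustration of the technique, not code taken from this repository.

```python
import numpy as np

def elu_plus_one(x):
    # Feature map phi(x) = elu(x) + 1 from the paper,
    # which keeps attention weights positive.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    # With a feature map phi, attention becomes
    #   phi(Q) (phi(K)^T V) / (phi(Q) sum_i phi(K_i)),
    # computable in O(N) since phi(K)^T V is a fixed (d, d_v) matrix.
    Qf, Kf = elu_plus_one(Q), elu_plus_one(K)  # (N, d)
    KV = Kf.T @ V                              # (d, d_v)
    Z = Qf @ Kf.sum(axis=0)                    # (N,) normalizer
    return (Qf @ KV) / Z[:, None]

rng = np.random.default_rng(0)
N, d = 8, 4
Q, K, V = rng.normal(size=(3, N, d))
out = linear_attention(Q, K, V)
print(out.shape)  # (8, 4)
```

This non-causal version matches the explicit O(N²) computation `(phi(Q) phi(K)^T) V` with row normalization; the autoregressive case in the paper additionally maintains the `KV` and normalizer terms as running (prefix) sums, which is what makes the Transformer behave like an RNN at inference time.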
Related projects:
- Code for Explicit Sparse Transformer ☆57 · Updated last year
- [ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wen… ☆76 · Updated 8 months ago
- The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha… ☆58 · Updated 3 years ago
- PyTorch implementation of "Pay Attention to MLPs" ☆39 · Updated 3 years ago
- Mask Attention Networks: Rethinking and Strengthen Transformer (NAACL 2021) ☆15 · Updated 3 years ago
- Mixture of Attention Heads ☆36 · Updated last year
- Recent Advances in MLP-based Models (MLP is all you need!) ☆112 · Updated last year
- Code for the AAAI 2022 publication "Well-classified Examples are Underestimated in Classification with Deep Neural Networks" ☆40 · Updated 2 years ago
- FlatNCE: A Novel Contrastive Representation Learning Objective ☆83 · Updated 2 years ago
- Custom PyTorch implementation of MoCo v3 ☆43 · Updated 3 years ago
- For the paper "Gaussian Transformer: A Lightweight Approach for Natural Language Inference" ☆27 · Updated 4 years ago
- ☆26 · Updated 2 years ago
- [NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang… ☆90 · Updated 9 months ago
- [NeurIPS'22] What Makes a "Good" Data Augmentation in Knowledge Distillation -- A Statistical Perspective ☆35 · Updated last year
- [ICLR 2022] "Unified Vision Transformer Compression" by Shixing Yu*, Tianlong Chen*, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Li… ☆45 · Updated 9 months ago
- ☆32 · Updated 3 years ago
- Reproducing the Linear Multihead Attention introduced in the Linformer paper ("Linformer: Self-Attention with Linear Complexity") ☆69 · Updated 4 years ago
- [AAAI 2022] Official PyTorch implementation of "Less is More: Pay Less Attention in Vision Transformers" ☆90 · Updated 2 years ago
- Paper notes tracking improvements to the Transformer architecture ☆17 · Updated last year
- Implementation of the AAAI 2022 paper "Go Wider Instead of Deeper" ☆32 · Updated last year
- [NeurIPS 2022 Spotlight] Official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity" ☆66 · Updated last year
- Code for the paper "On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals" ☆16 · Updated 2 years ago
- (CVPR 2022) Automated Progressive Learning for Efficient Training of Vision Transformers ☆24 · Updated 2 years ago
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration (CVPR 2021) ☆63 · Updated 3 years ago
- Official implementation of TransNormer from the EMNLP 2022 paper "The Devil in Linear Transformer" ☆53 · Updated last year
- Learning to Encode Position for Transformer with Continuous Dynamical Model ☆59 · Updated 4 years ago
- Code for the EMNLP 2022 paper "Distilled Dual-Encoder Model for Vision-Language Understanding" ☆29 · Updated last year
- Codes for sharing ☆36 · Updated 3 years ago
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts" ☆38 · Updated last year
- Improving Contrastive Learning by Visualizing Feature Transformation (ICCV 2021 Oral) ☆89 · Updated 2 years ago