fawazsammani / awesome-mlp-mixer
Transformers without attention, built entirely on MLPs.
☆93, updated 11 months ago
Alternatives and similar repositories for awesome-mlp-mixer:
Users interested in awesome-mlp-mixer compare it to the libraries listed below.
- [NeurIPS 2022 Spotlight] Official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity" (☆70, updated 2 years ago)
- Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.ne… (☆115, updated 2 years ago)
- [ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wen… (☆80, updated last year)
- Recent Advances in MLP-based Models (MLP is all you need!) (☆114, updated 2 years ago)
- A simple, minimal implementation of Reversible Vision Transformers (☆122, updated last year)
- TF/Keras code for DiffStride, a pooling layer with learnable strides (☆125, updated 3 years ago)
- A compilation of network architectures for vision and other tasks that avoid the self-attention mechanism (☆77, updated 2 years ago)
- Learnable Fourier Features for Multi-Dimensional Spatial Positional Encoding (☆45, updated 5 months ago)
- Implementation of "ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks", ICML 2021 (☆141, updated 3 years ago)
- ☆50, updated last year
- Open-source release of the research work published on arXiv: https://arxiv.org/abs/2106.02689 (☆51, updated 3 years ago)
- PyTorch implementations of KMeans, Soft-KMeans, and Constrained-KMeans that run on GPU and work on (mini-)batches of data (☆63, updated 2 years ago)
- (NeurIPS 2023) PyTorch implementation of "Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation" (☆18, updated 5 months ago)
- ResMLP: Feedforward networks for image classification with data-efficient training (☆42, updated 3 years ago)
- ☆25, updated 3 years ago
- Implementation of fused cosine-similarity attention in the same style as Flash Attention (☆211, updated 2 years ago)
- Official PyTorch/GPU implementation of SupMAE (☆77, updated 2 years ago)
- Official code release of the paper "RGB no more: Minimally Decoded JPEG Vision Transformers" (☆56, updated last year)
- [CVPR 2023] Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference (☆29, updated last year)
- ☆65, updated 4 months ago
- Differentiable Top-k Classification Learning (☆80, updated 2 years ago)
- ☆191, updated 2 years ago
- A practical PyTorch implementation of GradNorm, gradient normalization for adaptive loss balancing (☆86, updated last year)
- A simple cross-attention that updates both the source and target in one step (☆164, updated 10 months ago)
- More dimensions = more fun (☆21, updated 7 months ago)
- Unofficial implementation of "MLP-Mixer: An all-MLP Architecture for Vision" (☆217, updated 3 years ago)
- [CVPR 2022, Oral] Official JAX implementation of Learned Queries for Efficient Local Attention (☆116, updated 2 years ago)
- PyTorch implementation of "Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion Model" (☆48, updated 2 years ago)
- Visualizing representations with a diffusion-based conditional generative model (☆90, updated last year)
- [NeurIPS'23] DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions (☆61, updated 10 months ago)
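The repositories above center on the MLP-Mixer idea: replacing self-attention with two plain MLPs, one mixing information across patches (token mixing) and one across feature channels (channel mixing), each wrapped in a layer norm and a skip connection. Below is a minimal NumPy sketch of a single Mixer block to illustrate the data flow; it is not taken from any of the listed repositories, and all names, shapes, and the random weights are illustrative assumptions.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize each row over its last axis (per-token feature vector).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def gelu(x):
    # tanh approximation of the GELU activation.
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def mlp(x, w1, w2):
    # Two-layer feed-forward network: expand, activate, project back.
    return gelu(x @ w1) @ w2

def mixer_block(x, tok_w1, tok_w2, ch_w1, ch_w2):
    # x: (patches, channels)
    # Token mixing: transpose so the MLP acts across the patch axis.
    x = x + mlp(layer_norm(x).T, tok_w1, tok_w2).T
    # Channel mixing: MLP acts across the channel axis.
    x = x + mlp(layer_norm(x), ch_w1, ch_w2)
    return x

# Toy example with hypothetical sizes: 16 patches, 32 channels, hidden width 64.
rng = np.random.default_rng(0)
patches, channels, hidden = 16, 32, 64
x = rng.normal(size=(patches, channels))
out = mixer_block(
    x,
    rng.normal(size=(patches, hidden)) * 0.02,   # token-mixing weights
    rng.normal(size=(hidden, patches)) * 0.02,
    rng.normal(size=(channels, hidden)) * 0.02,  # channel-mixing weights
    rng.normal(size=(hidden, channels)) * 0.02,
)
print(out.shape)  # (16, 32): shape is preserved, so blocks can be stacked
```

Because the block preserves the (patches, channels) shape, a full model simply stacks several such blocks and finishes with global average pooling and a linear classifier, exactly the structure the MLP-Mixer-style repositories above implement.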