yingyichen-cyy / PrimalAttention
(NeurIPS 2023) PyTorch implementation of "Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation"
☆18 · Updated 5 months ago
Alternatives and similar repositories for PrimalAttention:
Users who are interested in PrimalAttention are comparing it to the libraries listed below:
- [ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wen… ☆80 · Updated last year
- ☆53 · Updated last month
- PyTorch implementation of "Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion Model" ☆48 · Updated 2 years ago
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts" ☆52 · Updated last year
- [NeurIPS 2022 Spotlight] This is the official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity" ☆71 · Updated 2 years ago
- ☆27 · Updated 2 years ago
- ResMLP: Feedforward networks for image classification with data-efficient training ☆42 · Updated 3 years ago
- My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing o… ☆43 · Updated 3 months ago
- Official code for the ICLR 2024 paper "Non-negative Contrastive Learning" ☆40 · Updated 11 months ago
- ☆32 · Updated 2 years ago
- [CVPR'23] Hard Patches Mining for Masked Image Modeling ☆90 · Updated last year
- State Space Models ☆66 · Updated 10 months ago
- Open-source release of the research work published on arXiv: https://arxiv.org/abs/2106.02689 ☆51 · Updated 3 years ago
- Official PyTorch implementation of the NeurIPS 2022 paper TokenMixup ☆48 · Updated 2 years ago
- [ICLR 2025] Official code release for "Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation" ☆40 · Updated 2 weeks ago
- Transformers w/o Attention, based fully on MLPs ☆93 · Updated 11 months ago
- [CVPR 2023] Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference ☆29 · Updated last year
- ☆65 · Updated 4 months ago
- A repository for DenseSSMs ☆87 · Updated 11 months ago
- [NeurIPS'23] DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions ☆61 · Updated 10 months ago
- Official implementation of the paper "Knowledge Diffusion for Distillation", NeurIPS 2023 ☆81 · Updated last year
- More dimensions = More fun ☆21 · Updated 7 months ago
- ☆47 · Updated 11 months ago
- Code for Explicit Sparse Transformer ☆60 · Updated last year
- Variance Covariance Regularization ☆14 · Updated last year
- ☆84 · Updated last year
- ☆42 · Updated 2 years ago
- Learnable Fourier Features for Multi-Dimensional Spatial Positional Encoding ☆45 · Updated 5 months ago
- Official implementation of Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data ☆58 · Updated 8 months ago
- [CVPR 2023] This repository includes the official implementation of our paper "Masked Autoencoders Enable Efficient Knowledge Distillers" ☆104 · Updated last year