erksch / fnet-pytorch
Unofficial PyTorch implementation of Google's FNet: Mixing Tokens with Fourier Transforms. With checkpoints.
☆67 Updated 2 years ago
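For context, FNet's key idea is to replace the Transformer's self-attention sublayer with an unparameterized 2D Fourier transform that mixes information across tokens. Below is a minimal numpy sketch of that mixing operation (the function name `fourier_mixing` and the toy shapes are illustrative, not taken from this repo's API):

```python
import numpy as np

def fourier_mixing(x):
    """FNet's token-mixing sublayer: apply an FFT along the hidden
    dimension and another along the sequence dimension, then keep
    only the real part. There are no learned parameters."""
    # x: (seq_len, hidden_dim); fft2 transforms the last two axes,
    # which is equivalent to composing the two 1D FFTs.
    return np.fft.fft2(x).real

# Toy input: 4 tokens, each with an 8-dimensional embedding.
x = np.random.randn(4, 8)
mixed = fourier_mixing(x)
assert mixed.shape == x.shape  # shape is preserved, like attention
```

In the full model this sublayer sits where attention would, followed by the usual residual connection, layer norm, and feed-forward block.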
Related projects
Alternatives and complementary repositories for fnet-pytorch
- Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch ☆116 Updated 3 years ago
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021) ☆59 Updated 2 years ago
- Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms ☆251 Updated 3 years ago
- Official PyTorch implementation of Long-Short Transformer (NeurIPS 2021) ☆222 Updated 2 years ago
- Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning ☆155 Updated 9 months ago
- TF/Keras code for DiffStride, a pooling layer with learnable strides ☆124 Updated 2 years ago
- Implementation of Fast Transformer in Pytorch ☆171 Updated 3 years ago
- Implementation of Nyström self-attention, from the Nyströmformer paper ☆122 Updated 10 months ago
- ☆164 Updated last year
- Sequence Modeling with Structured State Spaces ☆60 Updated 2 years ago
- PyTorch implementation of FNet: Mixing Tokens with Fourier Transforms ☆25 Updated 3 years ago
- Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers ☆100 Updated 3 years ago
- Relative Positional Encoding for Transformers with Linear Complexity ☆61 Updated 2 years ago
- Implementation of Linformer for Pytorch ☆257 Updated 10 months ago
- An implementation of local windowed attention for language modeling ☆384 Updated 2 months ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 Updated 2 years ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch ☆95 Updated last year
- [ICLR 2022] Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention ☆179 Updated last year
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆207 Updated last year
- Code for the paper PermuteFormer ☆42 Updated 3 years ago
- Another attempt at a long-context / efficient transformer by me ☆37 Updated 2 years ago
- MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space ☆40 Updated 3 years ago
- Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.ne… ☆115 Updated last year
- Sequence modeling with Mega ☆298 Updated last year
- Implementation of Uniformer, a simple attention and 3D convolutional net that achieved SOTA in a number of video classification tasks, de… ☆97 Updated 2 years ago
- PyTorch implementation of Pay Attention to MLPs ☆39 Updated 3 years ago
- Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI ☆84 Updated 2 years ago
- Unofficial PyTorch implementation of Fastformer, based on the paper "Fastformer: Additive Attention Can Be All You Need" ☆134 Updated 3 years ago
- ☆32 Updated 3 years ago
- Recent Advances in MLP-based Models (MLP is all you need!) ☆112 Updated last year