erksch / fnet-pytorch
Unofficial PyTorch implementation of Google's FNet: Mixing Tokens with Fourier Transforms. With checkpoints.
☆74 · Updated 2 years ago
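FNet's core idea, replacing self-attention with an unparameterized 2D Fourier transform over the hidden and sequence dimensions and keeping only the real part, fits in a few lines of PyTorch. The sketch below is illustrative and not taken from this repository; the module and parameter names (`FNetMixingBlock`, `d_ff`) are assumptions, and only the mixing step `torch.fft.fft(...).real` follows the FNet paper directly.

```python
import torch
import torch.nn as nn


class FNetMixingBlock(nn.Module):
    """Minimal sketch of an FNet encoder block (illustrative, not this repo's code).

    Token mixing is a 2D Fourier transform: FFT over the hidden dimension,
    then over the sequence dimension, keeping only the real part. No attention.
    """

    def __init__(self, d_model: int, d_ff: int, dropout: float = 0.1):
        super().__init__()
        self.mixing_norm = nn.LayerNorm(d_model)
        self.ff_norm = nn.LayerNorm(d_model)
        self.feed_forward = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(d_ff, d_model),
            nn.Dropout(dropout),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        # Fourier mixing: FFT along the hidden dim (-1), then the sequence
        # dim (-2); the real part is kept, as in the FNet paper.
        mixed = torch.fft.fft(torch.fft.fft(x, dim=-1), dim=-2).real
        x = self.mixing_norm(x + mixed)
        x = self.ff_norm(x + self.feed_forward(x))
        return x


# Usage: drop-in replacement for a Transformer encoder layer.
block = FNetMixingBlock(d_model=256, d_ff=1024)
out = block(torch.randn(2, 128, 256))  # -> (2, 128, 256)
```

Because the mixing step has no learned parameters, the block's only trainable weights live in the layer norms and the feed-forward sublayer, which is what makes FNet faster than attention-based encoders of the same width.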
Alternatives and similar repositories for fnet-pytorch
Users interested in fnet-pytorch are comparing it to the libraries listed below.
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021) ☆61 · Updated 3 years ago
- ☆163 · Updated 2 years ago
- Implementation of Memformer, a Memory-augmented Transformer, in Pytorch ☆118 · Updated 4 years ago
- TF/Keras code for DiffStride, a pooling layer with learnable strides. ☆124 · Updated 3 years ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch ☆100 · Updated 2 years ago
- PyTorch implementation of FNet: Mixing Tokens with Fourier transforms ☆27 · Updated 4 years ago
- Implementation of Nyström Self-attention, from the paper Nyströmformer ☆135 · Updated 3 months ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆73 · Updated last year
- Implementations of various linear RNN layers using pytorch and triton ☆53 · Updated last year
- Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch ☆119 · Updated 3 years ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆214 · Updated 2 years ago
- Sequence Modeling with Structured State Spaces ☆64 · Updated 2 years ago
- Axial Positional Embedding for Pytorch ☆81 · Updated 4 months ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Updated 3 years ago
- Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021). ☆225 · Updated 3 years ago
- ☆53 · Updated 8 months ago
- Relative Positional Encoding for Transformers with Linear Complexity ☆64 · Updated 3 years ago
- Learnable Fourier Features for Multi-Dimensional Spatial Positional Encoding ☆51 · Updated 8 months ago
- Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention". ☆44 · Updated 3 years ago
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆124 · Updated last year
- Sequence modeling with Mega. ☆296 · Updated 2 years ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation) ☆34 · Updated last year
- Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers. ☆105 · Updated 4 years ago
- Transformers w/o Attention, based fully on MLPs ☆93 · Updated last year
- User-friendly implementation of the Mixture-of-Sparse-Attention (MoSA). MoSA selects distinct tokens for each head with expert choice rou… ☆21 · Updated last month
- ☆33 · Updated 4 years ago
- Unofficial PyTorch Implementation for pNLP-Mixer: an Efficient all-MLP Architecture for Language (https://arxiv.org/abs/2202.04350) ☆63 · Updated 3 years ago
- Implementation of Fast Transformer in Pytorch ☆175 · Updated 3 years ago
- Official Pytorch Implementation for "Continual Transformers: Redundancy-Free Attention for Online Inference" [ICLR 2023] ☆28 · Updated last year
- Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.ne… ☆116 · Updated 2 years ago