lucidrains / performer-pytorch
An implementation of Performer, a linear attention-based transformer, in Pytorch
☆1,110 Updated 2 years ago
Alternatives and similar repositories for performer-pytorch:
Users who are interested in performer-pytorch are comparing it to the libraries listed below.
- Pytorch library for fast transformer implementations ☆1,665 Updated last year
- Transformer based on a variant of attention that has linear complexity with respect to sequence length ☆724 Updated 8 months ago
- Reformer, the efficient Transformer, in Pytorch ☆2,140 Updated last year
- Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch ☆1,115 Updated last year
- My take on a practical implementation of Linformer for Pytorch. ☆409 Updated 2 years ago
- Long Range Arena for Benchmarking Efficient Transformers ☆739 Updated last year
- Fully featured implementation of Routing Transformer ☆288 Updated 3 years ago
- Implementation of Linformer for Pytorch ☆262 Updated last year
- An implementation of local windowed attention for language modeling ☆403 Updated this week
- Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention ☆256 Updated 3 years ago
- DeLighT: Very Deep and Light-Weight Transformers ☆468 Updated 4 years ago
- An All-MLP solution for Vision, from Google AI ☆1,007 Updated 4 months ago
- Longformer: The Long-Document Transformer ☆2,072 Updated last year
- Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch ☆609 Updated last month
- Flexible components pairing 🤗 Transformers with Pytorch Lightning ☆613 Updated 2 years ago
- PyTorch implementation of some attentions for Deep Learning Researchers. ☆520 Updated 2 years ago
- Transformers for Longer Sequences ☆585 Updated 2 years ago
- Usable Implementation of "Bootstrap Your Own Latent" self-supervised learning, from Deepmind, in Pytorch ☆1,787 Updated 6 months ago
- Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch ☆426 Updated 3 years ago
- ☆367 Updated last year
- PyTorch Extension Library of Optimized Autograd Sparse Matrix Operations ☆1,029 Updated last week
- [ICLR 2020] Lite Transformer with Long-Short Range Attention ☆602 Updated 6 months ago
- PyTorch implementation of Contrastive Learning methods ☆1,952 Updated last year
- Fast, differentiable sorting and ranking in PyTorch ☆786 Updated last year
- Structured state space sequence models ☆2,524 Updated 6 months ago
- Standalone TFRecord reader/writer with PyTorch data loaders ☆872 Updated 4 months ago
- Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers" ☆1,539 Updated 4 years ago
- Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms ☆256 Updated 3 years ago
- Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models ☆769 Updated 6 months ago
- Pytorch Lightning code guideline for conferences ☆1,245 Updated last year