idiap / fast-transformers
Pytorch library for fast transformer implementations
☆1,677 Updated last year
Alternatives and similar repositories for fast-transformers:
Users who are interested in fast-transformers compare it to the libraries listed below.
- An implementation of Performer, a linear attention-based transformer, in Pytorch ☆1,115 Updated 3 years ago
- Reformer, the efficient Transformer, in Pytorch ☆2,152 Updated last year
- Transformer based on a variant of attention that is linear complexity in respect to sequence length ☆738 Updated 9 months ago
- Long Range Arena for Benchmarking Efficient Transformers ☆745 Updated last year
- Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch ☆1,120 Updated last year
- An All-MLP solution for Vision, from Google AI ☆1,013 Updated 5 months ago
- Longformer: The Long-Document Transformer ☆2,082 Updated 2 years ago
- My take on a practical implementation of Linformer for Pytorch. ☆411 Updated 2 years ago
- Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers" ☆1,550 Updated 4 years ago
- A fast MoE impl for PyTorch ☆1,627 Updated last week
- PyTorch extensions for high performance and large scale training. ☆3,260 Updated last month
- Fast Block Sparse Matrices for Pytorch ☆546 Updated 4 years ago
- DeLighT: Very Deep and Light-Weight Transformers ☆467 Updated 4 years ago
- Fully featured implementation of Routing Transformer ☆288 Updated 3 years ago
- Profiling and inspecting memory in pytorch ☆1,042 Updated 6 months ago
- PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538 ☆1,043 Updated 10 months ago
- Fast, differentiable sorting and ranking in PyTorch ☆792 Updated last year
- Rotary Transformer ☆895 Updated 2 years ago
- Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch ☆635 Updated 2 months ago
- Flexible components pairing 🤗 Transformers with Pytorch Lightning ☆612 Updated 2 years ago
- Usable Implementation of "Bootstrap Your Own Latent" self-supervised learning, from Deepmind, in Pytorch ☆1,795 Updated 7 months ago
- higher is a pytorch library allowing users to obtain higher order gradients over losses spanning training loops rather than individual tr… ☆1,607 Updated 2 years ago
- Implementation of Linformer for Pytorch ☆266 Updated last year
- An implementation of local windowed attention for language modeling ☆420 Updated last month
- list of efficient attention modules ☆995 Updated 3 years ago
- torch-optimizer -- collection of optimizers for Pytorch ☆3,082 Updated 10 months ago
- Implementation of ConvMixer for "Patches Are All You Need? 🤷" ☆1,064 Updated 2 years ago
- maximal update parametrization (µP) ☆1,451 Updated 7 months ago
- Library for 8-bit optimizers and quantization routines. ☆717 Updated 2 years ago
- 🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in Pytorch ☆2,094 Updated 2 months ago