idiap / fast-transformers
PyTorch library for fast transformer implementations
☆1,665 · Updated last year
Alternatives and similar repositories for fast-transformers:
Users interested in fast-transformers are comparing it to the libraries listed below.
- An implementation of Performer, a linear attention-based transformer, in PyTorch ☆1,110 · Updated 2 years ago
- Transformer based on a variant of attention with linear complexity with respect to sequence length ☆724 · Updated 8 months ago
- Reformer, the efficient Transformer, in PyTorch ☆2,140 · Updated last year
- Implementation of Perceiver, General Perception with Iterative Attention, in PyTorch ☆1,115 · Updated last year
- Long Range Arena for Benchmarking Efficient Transformers ☆739 · Updated last year
- Structured state space sequence models ☆2,524 · Updated 6 months ago
- My take on a practical implementation of Linformer for PyTorch ☆409 · Updated 2 years ago
- torch-optimizer -- collection of optimizers for PyTorch ☆3,067 · Updated 9 months ago
- An All-MLP solution for Vision, from Google AI ☆1,007 · Updated 4 months ago
- PyTorch Extension Library of Optimized Scatter Operations ☆1,590 · Updated last week
- Longformer: The Long-Document Transformer ☆2,072 · Updated last year
- A concise but complete full-attention transformer with a set of promising experimental features from various papers ☆4,985 · Updated last week
- DeLighT: Very Deep and Light-Weight Transformers ☆468 · Updated 4 years ago
- higher is a PyTorch library allowing users to obtain higher-order gradients over losses spanning training loops rather than individual tr… ☆1,601 · Updated 2 years ago
- Profiling and inspecting memory in PyTorch ☆1,038 · Updated 5 months ago
- Collection of common code shared among different research projects in the FAIR computer vision team ☆2,057 · Updated last month
- Usable implementation of "Bootstrap Your Own Latent" self-supervised learning, from DeepMind, in PyTorch ☆1,787 · Updated 6 months ago
- A coding-free framework built on PyTorch for reproducible deep learning studies. PyTorch Ecosystem. 🏆25 knowledge distillation methods p… ☆1,423 · Updated this week
- Fast, differentiable sorting and ranking in PyTorch ☆786 · Updated last year
- Implementation of Rotary Embeddings, from the RoFormer paper, in PyTorch ☆609 · Updated last month
- An implementation of local windowed attention for language modeling ☆403 · Updated this week
- Standalone TFRecord reader/writer with PyTorch data loaders ☆872 · Updated 4 months ago
- Flexible components pairing 🤗 Transformers with PyTorch Lightning ☆613 · Updated 2 years ago
- PyTorch extensions for high performance and large-scale training ☆3,232 · Updated this week
- Machine learning metrics for distributed, scalable PyTorch applications ☆2,177 · Updated this week
- [ICLR 2020] Lite Transformer with Long-Short Range Attention ☆602 · Updated 6 months ago
- Fully featured implementation of Routing Transformer ☆288 · Updated 3 years ago
- PyTorch Lightning code guideline for conferences ☆1,245 · Updated last year
- Toolbox of models, callbacks, and datasets for AI/ML researchers ☆1,704 · Updated last week
- Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers" ☆1,539 · Updated 4 years ago
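Several of the libraries above (fast-transformers, the Performer port, the linear-complexity transformer) are built around linear attention, which replaces the softmax with a feature map so attention factorizes in O(n) rather than O(n²) in sequence length. The following is a minimal numpy sketch of that factorization, using the ELU(x)+1 feature map from the "Transformers are RNNs" paper behind fast-transformers; the function names and the quadratic reference used for checking are illustrative, not the library's actual API.

```python
import numpy as np

def elu_feature_map(x):
    # ELU(x) + 1: strictly positive features, continuous at 0
    # (the feature map used in the linear-attention formulation).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(Q, K, V, eps=1e-6):
    """O(n) attention: phi(Q) @ (phi(K)^T V), normalized per query."""
    Qp, Kp = elu_feature_map(Q), elu_feature_map(K)
    KV = Kp.T @ V                    # (d_k, d_v) summary of all keys/values
    Z = Qp @ Kp.sum(axis=0) + eps    # per-query normalizer, shape (n,)
    return (Qp @ KV) / Z[:, None]

def quadratic_reference(Q, K, V, eps=1e-6):
    """Equivalent O(n^2) form, for checking the factorization."""
    A = elu_feature_map(Q) @ elu_feature_map(K).T   # (n, n) similarities
    A = A / (A.sum(axis=1, keepdims=True) + eps)    # row-normalize
    return A @ V

# Both forms compute the same output; only the cost differs.
rng = np.random.default_rng(0)
Q = rng.standard_normal((8, 4))
K = rng.standard_normal((8, 4))
V = rng.standard_normal((8, 3))
out = linear_attention(Q, K, V)
```

The key point is associativity: (φ(Q)φ(K)ᵀ)V is rearranged to φ(Q)(φ(K)ᵀV), so the n×n attention matrix is never materialized. The real libraries add causal masking and batching on top of this idea.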