lucidrains / linformer
Implementation of Linformer for Pytorch
☆305 · Updated Jan 5, 2024
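For context, the idea behind Linformer is to cut self-attention from O(n²) to O(nk) in sequence length n by projecting keys and values down to a fixed length k with learned matrices. A minimal single-head NumPy sketch of that idea follows; all names and shapes here are illustrative assumptions, not this repository's API.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def linformer_attention(Q, K, V, E, F):
    # Q, K, V: (n, d). E, F: (k, n) learned projections along the sequence axis.
    # Projecting K and V down to length k makes the score matrix (n, k)
    # instead of (n, n), so the cost is O(n*k*d) rather than O(n^2*d).
    d = Q.shape[-1]
    K_proj = E @ K                        # (k, d)
    V_proj = F @ V                        # (k, d)
    scores = Q @ K_proj.T / np.sqrt(d)    # (n, k)
    return softmax(scores) @ V_proj       # (n, d)

rng = np.random.default_rng(0)
n, d, k = 128, 16, 32
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
E, F = (rng.standard_normal((k, n)) / np.sqrt(n) for _ in range(2))
out = linformer_attention(Q, K, V, E, F)
print(out.shape)  # (128, 16)
```

The output keeps the full sequence length; only the key/value side is compressed, which is why k can stay constant as n grows.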
Alternatives and similar repositories for linformer
Users interested in linformer are comparing it to the libraries listed below.
- My take on a practical implementation of Linformer for Pytorch. (☆422 · Updated Jul 27, 2022)
- Transformer based on a variant of attention that is linear in complexity with respect to sequence length (☆827 · Updated May 5, 2024)
- Fully featured implementation of Routing Transformer (☆300 · Updated Nov 6, 2021)
- An implementation of Performer, a linear attention-based transformer, in Pytorch (☆1,172 · Updated Feb 2, 2022)
- Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention (☆270 · Updated Aug 10, 2021)
- Reformer, the efficient Transformer, in Pytorch (☆2,193 · Updated Jun 21, 2023)
- A simple implementation of a deep linear Pytorch module (☆21 · Updated Oct 16, 2020)
- An implementation of local windowed attention for language modeling (☆496 · Updated Jul 16, 2025)
- Implementation of Fast Transformer in Pytorch (☆176 · Updated Aug 26, 2021)
- GPT, but made only out of MLPs (☆89 · Updated May 25, 2021)
- Pytorch library for fast transformer implementations (☆1,761 · Updated Mar 23, 2023)
- Reproducing the Linear Multihead Attention introduced in the Linformer paper (Linformer: Self-Attention with Linear Complexity) (☆75 · Updated Jun 23, 2020)
- Implementation of some personal helper functions for Einops, my favorite tensor manipulation library ❤️ (☆57 · Updated Jan 5, 2023)
- Pytorch implementation of Compressive Transformers, from Deepmind (☆163 · Updated Oct 4, 2021)
- Implementation of Token Shift GPT - An autoregressive model that relies solely on shifting the sequence space for mixing (☆49 · Updated Jan 27, 2022)
- Standalone Product Key Memory module in Pytorch - for augmenting Transformer models (☆87 · Updated Nov 1, 2025)
- A concise but complete full-attention transformer with a set of promising experimental features from various papers (☆5,800 · Updated Feb 7, 2026)
- Longformer: The Long-Document Transformer (☆2,186 · Updated Feb 8, 2023)
- Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch (☆76 · Updated Dec 4, 2022)
- Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch (☆1,194 · Updated Aug 22, 2023)
- Code for the paper PermuteFormer (☆41 · Updated Oct 10, 2021)
- A GPT, made only of MLPs, in Jax (☆59 · Updated Jun 23, 2021)
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT (☆224 · Updated Aug 20, 2024)
- List of efficient attention modules (☆1,022 · Updated Aug 23, 2021)
- Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch (☆310 · Updated Dec 27, 2021)
- Long Range Arena for Benchmarking Efficient Transformers (☆777 · Updated Dec 16, 2023)
- Implementation of Memformer, a Memory-augmented Transformer, in Pytorch (☆126 · Updated Nov 13, 2020)
- Implementation of a Light Recurrent Unit in Pytorch (☆49 · Updated Oct 6, 2024)
- PyTorch Framework for Developing Memory Efficient Deep Invertible Networks (☆256 · Updated May 10, 2023)
- A PyTorch implementation of Dilated RNN (☆11 · Updated Dec 31, 2017)
- Implementation of Axial attention - attending to multi-dimensional data efficiently (☆397 · Updated Aug 26, 2021)
- Axial Positional Embedding for Pytorch (☆84 · Updated Feb 25, 2025)
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 (☆49 · Updated Apr 6, 2022)
- [TPAMI2022 & NeurIPS2020] Official implementation of Self-Adaptive Training (☆130 · Updated Oct 17, 2021)
- A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch (☆126 · Updated Aug 25, 2025)
- Implementation of SE3-Transformers for Equivariant Self-Attention, in Pytorch. This specific repository is geared towards integration wit… (☆325 · Updated Aug 28, 2025)
- Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time" (☆372 · Updated Sep 26, 2023)
- (no description) (☆24 · Updated Nov 22, 2022)
- Fast Discounted Cumulative Sums in PyTorch (☆97 · Updated Aug 28, 2021)