fla-org / flash-linear-attention
🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
⭐2,753 · Updated this week
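The library ships its linear-attention variants as standard PyTorch modules, so a layer can be dropped into an existing model. Below is a minimal usage sketch, assuming the package exposes layer classes such as `GatedLinearAttention` under `fla.layers` with `hidden_size`/`num_heads` constructor arguments (exact class names, signatures, and return values may differ between releases; check the repo's README).

```python
import torch
from fla.layers import GatedLinearAttention  # assumed import path and class name

batch_size, seq_len, hidden_size = 2, 2048, 1024
device, dtype = "cuda", torch.bfloat16

# Build one linear-attention layer and a random input batch on the GPU.
layer = GatedLinearAttention(hidden_size=hidden_size, num_heads=4).to(device=device, dtype=dtype)
x = torch.randn(batch_size, seq_len, hidden_size, device=device, dtype=dtype)

# The forward pass is assumed to return a tuple whose first element is the output states.
y, *_ = layer(x)
print(y.shape)  # torch.Size([2, 2048, 1024])
```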
Alternatives and similar repositories for flash-linear-attention
Users interested in flash-linear-attention are comparing it to the libraries listed below.
- Muon: An optimizer for hidden layers in neural networks ⭐897 · Updated last week
- Helpful tools and examples for working with flex-attention ⭐831 · Updated last week
- Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States ⭐1,212 · Updated 11 months ago
- Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper ⭐653 · Updated last week
- Puzzles for learning Triton ⭐1,708 · Updated 7 months ago
- A PyTorch native platform for training generative AI models ⭐3,933 · Updated this week
- Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels ⭐1,292 · Updated this week
- 🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"