CUDA and Triton implementations of Flash Attention with SoftmaxN.
☆73 · May 26, 2024 · Updated 2 years ago
Alternatives and similar repositories for Flash-Attention-Softmax-N
Users that are interested in Flash-Attention-Softmax-N are comparing it to the libraries listed below.
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" ☆18 · Mar 15, 2024 · Updated 2 years ago
- Benchmark tests supporting the TiledCUDA library. ☆19 · Nov 19, 2024 · Updated last year
- Cuda extensions for PyTorch ☆12 · Dec 2, 2025 · Updated 5 months ago
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- Transformers components but in Triton ☆34 · May 9, 2025 · Updated last year
- Official Repository for "Modeling Hierarchical Structures with Continuous Recursive Neural Networks" (ICML 2021) ☆12 · Aug 18, 2021 · Updated 4 years ago
- Official Repository for Efficient Linear-Time Attention Transformers.