lyj20071013 / Triton-FlashAttention
This repository contains multiple implementations of Flash Attention written as optimized Triton kernels, showing progressive performance improvements through hardware-aware optimizations. The implementations range from basic block-wise processing to advanced techniques such as FP8 quantization and prefetching.
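The block-wise processing mentioned above is the core Flash Attention idea: keys and values are streamed in blocks while a running ("online") softmax keeps the result exact without materializing the full attention matrix. The sketch below illustrates that technique in plain NumPy under stated assumptions; it is not the repository's Triton code, and the function name `blockwise_attention` is invented for illustration.

```python
import numpy as np

def blockwise_attention(q, k, v, block=4):
    """Exact attention computed block-by-block over K/V with an
    online softmax -- an illustrative NumPy sketch of the idea
    behind Flash Attention (hypothetical helper, not the repo's
    Triton kernels)."""
    d = q.shape[-1]
    scale = 1.0 / np.sqrt(d)
    n = k.shape[0]
    m = np.full(q.shape[0], -np.inf)   # running row-wise max of scores
    l = np.zeros(q.shape[0])           # running softmax denominator
    acc = np.zeros_like(q)             # running weighted sum of V rows
    for start in range(0, n, block):
        kb = k[start:start + block]
        vb = v[start:start + block]
        s = (q @ kb.T) * scale                       # scores for this block
        m_new = np.maximum(m, s.max(axis=1))         # updated row max
        p = np.exp(s - m_new[:, None])               # block probabilities (unnormalized)
        correction = np.exp(m - m_new)               # rescale previous partial sums
        l = l * correction + p.sum(axis=1)
        acc = acc * correction[:, None] + p @ vb
        m = m_new
    return acc / l[:, None]
```

Because the running max and denominator are corrected at every step, the output matches ordinary softmax attention exactly; the Triton versions apply the same recurrence per GPU tile, with quantization and prefetching layered on top.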
11 stars · Updated Mar 26, 2026

Alternatives and similar repositories for Triton-FlashAttention

Users that are interested in Triton-FlashAttention are comparing it to the libraries listed below.

