lyj20071013 / Triton-FlashAttention

This repository contains multiple implementations of Flash Attention optimized with Triton kernels, showcasing progressive performance improvements through hardware-aware optimizations. The implementations range from basic block-wise processing to advanced techniques such as FP8 quantization and prefetching.
11 stars · Updated Jan 19, 2026 (3 weeks ago)
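The repository's actual kernels are not shown on this page. As a point of reference, below is a minimal illustrative sketch of the block-wise Flash Attention pattern the description refers to (tiled K/V processing with an online softmax), written against the public Triton API. All names here (`flash_attn_kernel`, `BLOCK_M`, `BLOCK_N`, `BLOCK_D`) are hypothetical, and the sketch assumes a single attention head with contiguous `(seq_len, head_dim)` tensors; it does not cover the repo's FP8 or prefetching variants.

```python
# Illustrative sketch only: a block-wise Flash Attention forward pass in
# Triton, assuming single-head, contiguous (seq_len, head_dim) inputs.
# This is NOT the repository's code; all names are hypothetical.
import triton
import triton.language as tl


@triton.jit
def flash_attn_kernel(
    Q, K, V, O,
    seq_len, head_dim,
    sm_scale,
    BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr, BLOCK_D: tl.constexpr,
):
    # Each program instance handles one tile of BLOCK_M query rows.
    start_m = tl.program_id(0) * BLOCK_M
    offs_m = start_m + tl.arange(0, BLOCK_M)
    offs_d = tl.arange(0, BLOCK_D)

    # Load the query tile once; it stays resident across the K/V loop.
    q = tl.load(Q + offs_m[:, None] * head_dim + offs_d[None, :],
                mask=offs_m[:, None] < seq_len, other=0.0)

    # Running softmax statistics: the core Flash Attention trick that lets
    # the softmax be computed one K/V tile at a time without materializing
    # the full attention matrix.
    m_i = tl.full((BLOCK_M,), float("-inf"), dtype=tl.float32)
    l_i = tl.zeros((BLOCK_M,), dtype=tl.float32)
    acc = tl.zeros((BLOCK_M, BLOCK_D), dtype=tl.float32)

    for start_n in range(0, seq_len, BLOCK_N):
        offs_n = start_n + tl.arange(0, BLOCK_N)
        k = tl.load(K + offs_n[:, None] * head_dim + offs_d[None, :],
                    mask=offs_n[:, None] < seq_len, other=0.0)
        v = tl.load(V + offs_n[:, None] * head_dim + offs_d[None, :],
                    mask=offs_n[:, None] < seq_len, other=0.0)

        # Attention scores for this tile; out-of-range keys get -inf so
        # they contribute nothing to the softmax.
        qk = tl.dot(q, tl.trans(k)) * sm_scale
        qk = tl.where(offs_n[None, :] < seq_len, qk, float("-inf"))

        # Online softmax update: rescale previous partial sums whenever the
        # running row maximum changes.
        m_new = tl.maximum(m_i, tl.max(qk, axis=1))
        alpha = tl.exp(m_i - m_new)
        p = tl.exp(qk - m_new[:, None])
        l_i = l_i * alpha + tl.sum(p, axis=1)
        acc = acc * alpha[:, None] + tl.dot(p.to(v.dtype), v)
        m_i = m_new

    # Final normalization and masked write-back of the output tile.
    acc = acc / l_i[:, None]
    tl.store(O + offs_m[:, None] * head_dim + offs_d[None, :], acc,
             mask=offs_m[:, None] < seq_len)
```

A typical launch would use a 1-D grid over query tiles, e.g. `flash_attn_kernel[(triton.cdiv(seq_len, BLOCK_M),)](q, k, v, o, seq_len, head_dim, head_dim ** -0.5, BLOCK_M=64, BLOCK_N=64, BLOCK_D=64)`; the repository's "progressive" variants presumably layer optimizations such as FP8 quantization and prefetching on top of this basic tiled structure.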

Alternatives and similar repositories for Triton-FlashAttention

Users interested in Triton-FlashAttention are comparing it to the libraries listed below.
