xlite-dev / ffpa-attn-mma

📚FFPA(Split-D): Yet another Faster Flash Attention with O(1) GPU SRAM complexity for large headdim, 1.8x~3x↑🎉 faster than SDPA EA.
164 · Updated last week

Alternatives and similar repositories for ffpa-attn-mma:

Users who are interested in ffpa-attn-mma are comparing it to the libraries listed below.