softmax1 / Flash-Attention-Softmax-N

CUDA and Triton implementations of Flash Attention with SoftmaxN.
70 · Updated last year
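For readers unfamiliar with the term, the sketch below illustrates what "SoftmaxN" is commonly taken to mean: a softmax with an extra constant n added to the denominator, so that n = 0 recovers the ordinary softmax. This reading is an assumption based on the project name, not something stated on this page. The function name `softmax_n` and its signature are hypothetical and are not the repository's API. It is a plain-Python reference, not a fused CUDA or Triton kernel.

```python
import math

def softmax_n(scores, n=1.0):
    """Reference sketch: softmax_n(x)_i = exp(x_i) / (n + sum_j exp(x_j)).

    Hypothetical helper for illustration only; not the repository's API.
    """
    # Shift the scores for numerical stability. When n > 0 the shift is
    # clamped at 0 so that the rescaled constant n * exp(-m) cannot overflow.
    m = max(scores) if n == 0 else max(max(scores), 0.0)
    exps = [math.exp(s - m) for s in scores]
    # The constant n must be scaled by the same factor exp(-m) as the terms.
    denom = n * math.exp(-m) + sum(exps)
    return [e / denom for e in exps]

# With n = 1 the weights no longer have to sum to 1.
weights = softmax_n([0.0, 0.0], n=1.0)
```

With n > 0 the outputs sum to less than 1, and they approach 0 when every score is very negative, which lets an attention head assign almost no weight to any position.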

Alternatives and similar repositories for Flash-Attention-Softmax-N

Users who are interested in Flash-Attention-Softmax-N are comparing it to the libraries listed below.
