softmax1 / Flash-Attention-Softmax-N

CUDA and Triton implementations of Flash Attention with SoftmaxN.
66 · Updated 5 months ago
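
For readers unfamiliar with the term, here is a minimal pure-Python sketch of what SoftmaxN is generally taken to compute. It assumes the definition softmax_n(x)_i = exp(x_i) / (n + sum_j exp(x_j)), where n = 0 recovers the standard softmax and n = 1 gives softmax1. The function name `softmax_n` and its signature are hypothetical and chosen for illustration. This is not the repository's CUDA or Triton code, and it does not reproduce the tiled Flash Attention algorithm.

```python
import math


def softmax_n(scores, n=1.0):
    """Illustrative softmax with an extra constant n in the denominator.

    Assumed definition: exp(x_i) / (n + sum_j exp(x_j)).
    """
    # Shift by m = max(max(scores), 0) so every exponent is <= 0,
    # which avoids overflow in exp() for large scores.
    m = max(max(scores), 0.0)
    exps = [math.exp(s - m) for s in scores]
    # The constant n must be rescaled by the same shift: n * exp(-m).
    denom = n * math.exp(-m) + sum(exps)
    return [e / denom for e in exps]


# With n > 0 the outputs sum to less than 1, so a row of attention
# weights can put little weight on every key when all scores are low.
weights = softmax_n([0.0, 0.0], n=1.0)
```

Under this assumed definition, the only change relative to the standard softmax is the extra term in the denominator, so a fused attention kernel would need to account for it in its running normalizer.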

Related projects

Alternatives and complementary repositories for Flash-Attention-Softmax-N