tspeterkim / flash-attention-minimal

Flash Attention in ~100 lines of CUDA (forward pass only)
609 stars · Updated 7 months ago

Related projects

Alternatives and complementary repositories for flash-attention-minimal