erfanzar/jax-flash-attn2

Flash Attention Implementation with Multiple Backend Support and Sharding. This module provides a flexible implementation of Flash Attention with support for different backends (GPU, TPU, CPU) and platforms (Triton, Pallas, JAX).
18 · Updated this week

Related projects

Alternatives and complementary repositories for jax-flash-attn2