sunsetcoder / flash-attention-windows

Pre-built Flash Attention 2 wheels for Windows. A drop-in replacement for standard PyTorch attention, offering up to a 10x speedup and 20x memory reduction. Compatible with Python 3.10 and CUDA 11.7+. No build setup required - just pip install the wheel and accelerate your transformer models. Supports modern NVIDIA GPUs (RTX 30/40 series, A100, H100).
35 stars · Dec 1, 2024 · Updated last year
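The description outlines the intended workflow: install a pre-built wheel, then call the fused attention kernel in place of standard PyTorch attention. Below is a minimal sketch of that workflow, assuming the standard flash-attn Python API (`flash_attn_func`); the wheel filename is illustrative, since the repo's actual release asset names are not listed here.

```python
# Install a pre-built wheel (illustrative filename -- pick the asset
# matching your Python/CUDA version from the repo's releases page):
#   pip install flash_attn-2.x.x+cu118-cp310-cp310-win_amd64.whl

import torch
from flash_attn import flash_attn_func

# FlashAttention expects fp16/bf16 tensors on a CUDA device,
# shaped (batch, seqlen, nheads, headdim).
batch, seqlen, nheads, headdim = 2, 1024, 16, 64
q = torch.randn(batch, seqlen, nheads, headdim,
                device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Fused attention kernel; causal=True applies GPT-style masking.
out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # torch.Size([2, 1024, 16, 64])
```

The output has the same shape as `q`, so the call can replace a standard scaled-dot-product attention step without reshaping the surrounding model code.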

Alternatives and similar repositories for flash-attention-windows

Users interested in flash-attention-windows are comparing it to the libraries listed below.
