nshepperd / flash_attn_jax
JAX bindings for Flash Attention v2
☆103 · Feb 5, 2026 · Updated last week
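flash_attn_jax binds the fused Flash Attention v2 CUDA kernels for use from JAX. As background for the comparisons below, here is a minimal NumPy sketch of the operation those kernels compute — plain scaled dot-product attention with an optional causal mask. This illustrates only the math, not the repository's API; the function name and shapes are illustrative.

```python
# Reference (non-fused) scaled dot-product attention in NumPy.
# Flash Attention computes the same result in a tiled, memory-efficient
# fused kernel; this sketch is the naive O(L^2)-memory version.
import numpy as np

def reference_attention(q, k, v, causal=False):
    """softmax(q @ k^T / sqrt(d)) @ v, optionally with a causal mask."""
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)  # [..., L, L]
    if causal:
        L = scores.shape[-1]
        # mask out keys that lie in the future of each query position
        mask = np.triu(np.ones((L, L), dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    # numerically stable softmax over the key axis
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(2, 4, 8)) for _ in range(3))
out = reference_attention(q, k, v, causal=True)
print(out.shape)  # (2, 4, 8)
```

With the causal mask, position 0 can only attend to key 0, so the first output row equals `v[:, 0, :]` — a quick sanity check for any attention implementation.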
Alternatives and similar repositories for flash_attn_jax
Users interested in flash_attn_jax also compare it to the repositories listed below.
- ☆29 · Jul 9, 2024 · Updated last year
- Parallel Associative Scan for Language Models · ☆18 · Jan 8, 2024 · Updated 2 years ago
- AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning (published in TMLR) · ☆23 · Oct 15, 2024 · Updated last year
- Supporting PyTorch FSDP for optimizers · ☆84 · Dec 8, 2024 · Updated last year
- A simple, easy-to-understand library for diffusion models using Flax and JAX. Includes detailed notebooks on DDPM, DDIM, and EDM with sim… · ☆41 · May 6, 2025 · Updated 9 months ago
- An implementation of the Llama architecture, to instruct and delight · ☆21 · May 31, 2025 · Updated 8 months ago
- ☆12 · Jan 4, 2024 · Updated 2 years ago
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" · ☆18 · Mar 15, 2024 · Updated last year
- JAX/Flax implementation of the Hyena Hierarchy · ☆34 · Apr 27, 2023 · Updated 2 years ago
- ☆23 · Jun 18, 2024 · Updated last year
- ☆344 · Feb 6, 2026 · Updated last week
- Parallelizing non-linear sequential models over the sequence length · ☆56 · Jun 23, 2025 · Updated 7 months ago
- PyTorch half-precision GEMM library with fused optional bias + optional ReLU/GELU · ☆78 · Dec 3, 2024 · Updated last year
- ☆35 · Nov 22, 2024 · Updated last year
- A simple library for scaling up JAX programs · ☆145 · Nov 4, 2025 · Updated 3 months ago
- jax-triton contains integrations between JAX and OpenAI Triton · ☆439 · Updated this week
- JMP is a Mixed Precision library for JAX · ☆211 · Jan 30, 2025 · Updated last year
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX · ☆93 · Jan 25, 2024 · Updated 2 years ago
- Source-to-Source Debuggable Derivatives in Pure Python · ☆15 · Jan 23, 2024 · Updated 2 years ago
- An implementation of Flash Attention using CuTe · ☆100 · Dec 17, 2024 · Updated last year
- FlexAttention with FlashAttention-3 support · ☆27 · Oct 5, 2024 · Updated last year
- ☆51 · Jan 28, 2024 · Updated 2 years ago
- ☆124 · May 28, 2024 · Updated last year
- ☆18 · Aug 24, 2024 · Updated last year
- ☆41 · Oct 15, 2025 · Updated 3 months ago
- ☆20 · May 30, 2024 · Updated last year
- Ring attention implementation with flash attention · ☆980 · Sep 10, 2025 · Updated 5 months ago
- A FlashAttention implementation for JAX with support for efficient document mask computation and context parallelism · ☆158 · Nov 11, 2025 · Updated 3 months ago
- Minimal (400 LOC) implementation of maximal (multi-node, FSDP) GPT training · ☆132 · Apr 17, 2024 · Updated last year
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling · ☆40 · Dec 2, 2023 · Updated 2 years ago
- ☆20 · Oct 11, 2023 · Updated 2 years ago
- Official repository for Efficient Linear-Time Attention Transformers · ☆18 · Jun 2, 2024 · Updated last year
- ☆19 · Dec 4, 2025 · Updated 2 months ago
- Stick-breaking attention · ☆62 · Jul 1, 2025 · Updated 7 months ago
- A JAX nn library · ☆22 · Sep 9, 2025 · Updated 5 months ago
- If it quacks like a tensor... · ☆59 · Nov 13, 2024 · Updated last year
- Einsum-like high-level array sharding API for JAX · ☆34 · Jul 16, 2024 · Updated last year
- JAX Synergistic Memory Inspector · ☆184 · Jul 16, 2024 · Updated last year
- ☆27 · Jul 28, 2025 · Updated 6 months ago