erfanzar / jax-flash-attn2
A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/Pallas/JAX).
☆34 Mar 4, 2025 Updated 11 months ago
Alternatives and similar repositories for jax-flash-attn2
Users interested in jax-flash-attn2 are comparing it to the libraries listed below.
- (EasyDel Former) is a utility library designed to simplify and enhance the development in JAX ☆29 Feb 2, 2026 Updated 2 weeks ago
- Xerxes, a highly advanced Persian AI assistant developed by InstinctAI, a cutting-edge AI startup. primary function is to assist users wi… ☆11 Apr 27, 2024 Updated last year
- Accelerate, Optimize performance with streamlined training and serving options with JAX. ☆337 Updated this week
- Agents for intelligence and coordination ☆20 Jan 4, 2026 Updated last month
- OST Collection: An AI-powered suite of models that predict the next word matches with remarkable accuracy (Text Generative Models). OST C… ☆16 Nov 16, 2023 Updated 2 years ago
- Latent Large Language Models ☆19 Aug 24, 2024 Updated last year
- If it quacks like a tensor... ☆59 Nov 13, 2024 Updated last year
- Minimal but scalable implementation of large language models in JAX ☆35 Nov 28, 2025 Updated 2 months ago
- Parallel Associative Scan for Language Models ☆18 Jan 8, 2024 Updated 2 years ago
- A FlashAttention implementation for JAX with support for efficient document mask computation and context parallelism. ☆158 Nov 11, 2025 Updated 3 months ago
- Pytorch/XLA SPMD Test code in Google TPU ☆23 Apr 3, 2024 Updated last year
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024) ☆24 Jun 6, 2024 Updated last year
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 May 25, 2024 Updated last year
- Inference code for LLaMA models in JAX ☆120 May 21, 2024 Updated last year
- ☆29 Feb 27, 2024 Updated last year
- Jax like function transformation engine but micro, microjax ☆34 Oct 25, 2024 Updated last year
- Transformers components but in Triton ☆34 May 9, 2025 Updated 9 months ago
- imagetokenizer is a python package, helps you encoder visuals and generate visuals token ids from codebook, supports both image and video… ☆40 Jun 22, 2024 Updated last year
- ☆44 Nov 1, 2025 Updated 3 months ago
- Pseudopotential converter from upf to psp8 ☆11 Jan 25, 2023 Updated 3 years ago
- LM engine is a library for pretraining/finetuning LLMs ☆115 Updated this week
- Berkeley CS285 2019 homework solution ☆31 Mar 24, 2023 Updated 2 years ago
- Framework to reduce autotune overhead to zero for well known deployments. ☆96 Sep 19, 2025 Updated 4 months ago
- Computer Science, Data Science and ML Fundamentals ☆11 May 30, 2025 Updated 8 months ago
- ☆14 May 14, 2019 Updated 6 years ago
- ☆20 May 24, 2025 Updated 8 months ago
- Statistical discontinuous constituent parsing ☆11 Feb 15, 2018 Updated 8 years ago
- YouTube Assistant ☆12 May 15, 2023 Updated 2 years ago
- ☆42 Sep 20, 2022 Updated 3 years ago
- ☆40 Jan 5, 2024 Updated 2 years ago
- seqax = sequence modeling + JAX ☆170 Jul 23, 2025 Updated 6 months ago
- Make triton easier ☆50 Jun 12, 2024 Updated last year
- Benchmarking field-level cosmological inference from galaxy surveys. ☆12 Jul 17, 2025 Updated 7 months ago
- ☆10 Mar 28, 2022 Updated 3 years ago
- PyTorch implementation for PaLM: A Hybrid Parser and Language Model. ☆10 Jan 7, 2020 Updated 6 years ago
- A Jax wrapper for cudaKDTree ☆11 Sep 26, 2025 Updated 4 months ago
- Quantized Attention on GPU ☆44 Nov 22, 2024 Updated last year
- See https://github.com/cuda-mode/triton-index/ instead! ☆11 May 8, 2024 Updated last year
- JAX implementation of GPTQ quantization algorithm ☆10 Jul 19, 2023 Updated 2 years ago