CLAIRE-Labo / flash_attention
A basic pure pytorch implementation of flash attention
☆15 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for flash_attention
- ☆72 Updated 4 months ago
- ☆35 Updated 7 months ago
- ☆50 Updated 2 weeks ago
- ☆53 Updated 9 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆112 Updated 6 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆84 Updated 2 weeks ago
- Using FlexAttention to compute attention with different masking patterns ☆40 Updated last month
- ☆76 Updated 6 months ago
- Simple and efficient pytorch-native transformer training and inference (batched) ☆61 Updated 7 months ago
- Minimal but scalable implementation of large language models in JAX ☆25 Updated last week
- Triton Implementation of HyperAttention Algorithm ☆46 Updated 11 months ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆24 Updated 6 months ago
- Language models scale reliably with over-training and on downstream tasks ☆94 Updated 7 months ago
- Collection of autoregressive model implementations ☆66 Updated last week
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 Updated 6 months ago
- ☆61 Updated 2 months ago
- ☆18 Updated last month
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆36 Updated 11 months ago
- ☆50 Updated last month
- GoldFinch and other hybrid transformer components ☆39 Updated 3 months ago
- ☆50 Updated 5 months ago
- ☆24 Updated 8 months ago
- LL3M: Large Language and Multi-Modal Model in Jax ☆64 Updated 6 months ago
- ☆31 Updated 2 months ago
- ☆27 Updated 7 months ago
- Experiment of using Tangent to autodiff triton ☆72 Updated 9 months ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆29 Updated 3 weeks ago
- σ-GPT: A New Approach to Autoregressive Models ☆59 Updated 2 months ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆89 Updated 2 months ago