Stability-AI / flash-attention

Fast and memory-efficient exact attention

☆10

Alternatives and similar repositories for flash-attention:

Users who are interested in flash-attention are comparing it to the libraries listed below:

lucidrains / mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts
☆117 Updated 5 months ago
openai / consistency_models_cifar10
Consistency models trained on CIFAR-10, in JAX.
☆144 Updated last year
xjdr-alt / mla_blog_translation
☆13 Updated 9 months ago
LAION-AI / Open-GIA
O-GIA is an umbrella for research, infrastructure and projects ecosystem that should provide open source, reproducible datasets, models, …
☆90 Updated 2 years ago
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆91 Updated 3 weeks ago
ggerganov / bert.cpp
GGML implementation of BERT model with Python bindings and quantization.
☆24 Updated last year
SkalskiP / SoM
Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️
☆85 Updated last year
geohot / tinydreamer
An implementation of delta-iris in tinygrad
☆72 Updated 7 months ago
mlfoundations / open-diffusion
Simple large-scale training of stable diffusion with multi-node support.
☆129 Updated last year
huggingface / zapier
Hugging Face's Zapier Integration 🤗⚡️
☆48 Updated last year
NousResearch / StripedHyenaTrainer
☆60 Updated last year
Birch-san / llama-play
Command-line script for inferencing from models such as LLaMA, in a chat scenario, with LoRA adaptations
☆33 Updated last year
lucidrains / llama-qrlhf
Implementation of the Llama architecture with RLHF + Q-learning
☆163 Updated last month
huggingface / optimum-furiosa
Accelerated inference of 🤗 models using FuriosaAI NPU chips.
☆26 Updated 9 months ago
geohot / dumbrl
Can RL solve simple problems?
☆54 Updated last year
PrimeIntellect-ai / smart-contracts
Solidity contracts for the decentralized Prime Network protocol
☆17 Updated this week
kyegomez / HRTX
Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2
☆16 Updated 4 months ago
lucidrains / taylor-series-linear-attention
Explorations into the recently proposed Taylor Series Linear Attention
☆95 Updated 7 months ago
ggerganov / vit.cpp
Inference Vision Transformer (ViT) in plain C/C++ with ggml
☆30 Updated last year
hyhieu / easy_pybind
☆32 Updated 9 months ago
lucidrains / maskbit-pytorch
Implementation of the proposed MaskBit from Bytedance AI
☆75 Updated 4 months ago
GPT-Alternatives / gpt_alternatives
☆75 Updated last year
lucidrains / titok-pytorch
Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"
☆170 Updated 9 months ago
huggingface / leaderboards
☆16 Updated 2 weeks ago
idiap / sigma-gpt
σ-GPT: A New Approach to Autoregressive Models
☆62 Updated 7 months ago
leloykun / modded-nanogpt
NanoGPT (124M) quality in 2.67B tokens
☆28 Updated last month
vvvm23 / TchAIkovsky
Using JAX to generate piano music as MIDI
☆39 Updated last year
SLAM-group / newhope
☆22 Updated last year
facebookresearch / macta
MACTA: A Multi-agent Reinforcement Learning Approach for Cache Timing Attacks and Detection
☆46 Updated last year
ayaka14732 / llama-2-jax
JAX implementation of the Llama 2 model
☆216 Updated last year