Stability-AI / flash-attention
Fast and memory-efficient exact attention
☆10 Updated last year
Alternatives and similar repositories for flash-attention:
Users who are interested in flash-attention are comparing it to the libraries listed below.
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆117 Updated 5 months ago
- Consistency models trained on CIFAR-10, in JAX. ☆144 Updated last year
- ☆13 Updated 9 months ago
- O-GIA is an umbrella for research, infrastructure and projects ecosystem that should provide open source, reproducible datasets, models, … ☆90 Updated 2 years ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆91 Updated 3 weeks ago
- GGML implementation of BERT model with Python bindings and quantization. ☆24 Updated last year
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️ ☆85 Updated last year
- An implementation of delta-iris in tinygrad ☆72 Updated 7 months ago
- Simple large-scale training of stable diffusion with multi-node support. ☆129 Updated last year
- Hugging Face's Zapier Integration 🤗⚡️ ☆48 Updated last year
- ☆60 Updated last year
- Command-line script for inferencing from models such as LLaMA, in a chat scenario, with LoRA adaptations ☆33 Updated last year
- Implementation of the Llama architecture with RLHF + Q-learning ☆163 Updated last month
- Accelerated inference of 🤗 models using FuriosaAI NPU chips. ☆26 Updated 9 months ago
- Can RL solve simple problems? ☆54 Updated last year
- Solidity contracts for the decentralized Prime Network protocol ☆17 Updated this week
- Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2 ☆16 Updated 4 months ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆95 Updated 7 months ago
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆30 Updated last year
- ☆32 Updated 9 months ago
- Implementation of the proposed MaskBit from Bytedance AI ☆75 Updated 4 months ago
- ☆75 Updated last year
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation" ☆170 Updated 9 months ago
- ☆16 Updated 2 weeks ago
- σ-GPT: A New Approach to Autoregressive Models ☆62 Updated 7 months ago
- NanoGPT (124M) quality in 2.67B tokens ☆28 Updated last month
- Using JAX to generate piano music as MIDI ☆39 Updated last year
- ☆22 Updated last year
- MACTA: A Multi-agent Reinforcement Learning Approach for Cache Timing Attacks and Detection ☆46 Updated last year
- JAX implementation of the Llama 2 model ☆216 Updated last year