Fast Multi-dimensional Sparse Attention
☆736 · Apr 14, 2026 · Updated this week
Alternatives and similar repositories for NATTEN
Users interested in NATTEN are comparing it to the libraries listed below.
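For context, NATTEN implements neighborhood attention: each query attends only to a local window of keys, generalized to 2D and 3D feature maps. The sketch below is a plain-PyTorch illustration of the 1D case using a dense mask; it is not NATTEN's API, and unlike NATTEN's kernels it truncates the window at sequence boundaries instead of shifting it.

```python
# Conceptual sketch of 1D neighborhood (sliding-window) attention in plain
# PyTorch. NOT NATTEN's API: NATTEN fuses this pattern into custom kernels,
# while this version materializes a dense mask for clarity.
import torch
import torch.nn.functional as F

def neighborhood_attention_1d(q, k, v, kernel_size=7):
    # q, k, v: (batch, heads, seq_len, head_dim); kernel_size should be odd.
    seq_len = q.shape[-2]
    radius = kernel_size // 2
    idx = torch.arange(seq_len, device=q.device)
    # Query i may only attend to keys j with |i - j| <= radius.
    mask = (idx[:, None] - idx[None, :]).abs() <= radius
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 4, 128, 32)
out = neighborhood_attention_1d(q, k, v, kernel_size=7)
print(out.shape)  # torch.Size([1, 4, 128, 32])
```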
- Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022 ☆1,176 · May 15, 2024 · Updated last year
- [ICML2025] SpargeAttention: a training-free sparse attention method that accelerates inference for any model. ☆976 · Feb 25, 2026 · Updated last month
- EDM2 and Autoguidance -- Official PyTorch implementation ☆837 · Dec 9, 2024 · Updated last year
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- [WIP] Better (FP8) attention for Hopper ☆33 · Feb 24, 2025 · Updated last year
- [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a 2-5x speedup compared to FlashAttention, without losing end-to-end… ☆3,296 · Jan 17, 2026 · Updated 3 months ago
- A flexible and efficient image generation framework that sets a new SOTA on FFHQ-256 with FID 2.05 (2022) ☆102 · Jun 26, 2025 · Updated 9 months ago
- Helpful tools and examples for working with flex-attention (see the sliding-window mask sketch after this list) ☆1,174 · Apr 13, 2026 · Updated last week
- [ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think ☆1,601 · Mar 16, 2025 · Updated last year
- [NeurIPS 2025] Official PyTorch implementation of the paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up". ☆215 · Sep 27, 2025 · Updated 6 months ago
- Tile primitives for speedy kernels ☆3,312 · Apr 8, 2026 · Updated last week
- Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers" ☆1,149 · Dec 22, 2025 · Updated 3 months ago
- Blazingly fast neighborhood attention ☆14 · Nov 28, 2023 · Updated 2 years ago
- Karras et al. (2022) diffusion models for PyTorch ☆2,583 · Feb 12, 2026 · Updated 2 months ago
- Efficient vision foundation models for high-resolution generation and perception. ☆3,283 · Sep 5, 2025 · Updated 7 months ago
- A unified inference and post-training framework for accelerated video generation. ☆3,396 · Updated this week
- VideoSys: An easy and efficient system for video generation ☆2,021 · Aug 27, 2025 · Updated 7 months ago
- ☆81 · Dec 27, 2024 · Updated last year
- Lumina-T2X is a unified framework for Text to Any Modality Generation ☆2,252 · Feb 16, 2025 · Updated last year
- ☆24 · Jun 18, 2024 · Updated last year
- ☆192 · Jan 14, 2025 · Updated last year
- Minimal implementation of scalable rectified flow transformers, based on SD3's approach ☆635 · Jul 1, 2024 · Updated last year
- Hackable and optimized Transformers building blocks, supporting a composable construction. ☆10,417 · Mar 30, 2026 · Updated 2 weeks ago
- PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis ☆3,289 · Oct 31, 2024 · Updated last year
- Ring attention implementation with flash attention ☆1,006 · Sep 10, 2025 · Updated 7 months ago
- ☆261 · Jul 11, 2024 · Updated last year
- ☆236 · Oct 11, 2024 · Updated last year
- Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model ☆1,305 · Jun 8, 2025 · Updated 10 months ago
- Official PyTorch Implementation of "Scalable Diffusion Models with Transformers" ☆8,503 · May 31, 2024 · Updated last year
- Fast and memory-efficient exact attention ☆23,344 · Updated this week
- 🚀 Efficient implementations for emerging model architectures ☆4,878 · Updated this week
- ☆44 · Oct 26, 2024 · Updated last year
- USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long-Context Transformer Model Training and Inference ☆664 · Jan 15, 2026 · Updated 3 months ago
- xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism ☆2,597 · Apr 9, 2026 · Updated last week
- [ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention ☆655 · Mar 6, 2026 · Updated last month
- A suite of image and video neural tokenizers ☆1,718 · Feb 11, 2025 · Updated last year
- ☆175 · Jan 8, 2026 · Updated 3 months ago
- [ECCV 2024] Official PyTorch implementation of RoPE-ViT, "Rotary Position Embedding for Vision Transformer" ☆457 · Oct 29, 2025 · Updated 5 months ago
- A Quirky Assortment of CuTe Kernels ☆924 · Apr 13, 2026 · Updated last week
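The flex-attention tooling listed above expresses this same local-window sparsity through PyTorch's FlexAttention. Below is a minimal sketch, assuming PyTorch 2.5+ (where torch.nn.attention.flex_attention is available) and a CUDA device; the names RADIUS and sliding_window are illustrative and not part of any listed repository.

```python
# Hedged sketch: a 1D sliding-window (neighborhood-style) pattern expressed
# with PyTorch's FlexAttention API. Assumes PyTorch 2.5+ and a CUDA device.
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

RADIUS = 3  # each query sees a window of 2 * RADIUS + 1 keys

def sliding_window(b, h, q_idx, kv_idx):
    # Allow attention only when the key index lies within RADIUS of the query index.
    return (q_idx - kv_idx).abs() <= RADIUS

device = "cuda"
q = k = v = torch.randn(1, 4, 128, 32, device=device)
block_mask = create_block_mask(sliding_window, B=None, H=None,
                               Q_LEN=128, KV_LEN=128, device=device)
out = flex_attention(q, k, v, block_mask=block_mask)
print(out.shape)  # torch.Size([1, 4, 128, 32])
```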