Fast Multi-dimensional Sparse Attention
☆730 · Mar 25, 2026 · Updated this week
Alternatives and similar repositories for NATTEN
Users interested in NATTEN are comparing it to the libraries listed below.
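Most of the libraries below center on sparse or windowed attention. As a point of reference, here is a minimal NumPy sketch of the core idea behind neighborhood attention: each query attends only to a fixed-size window of keys centered on its own position, with windows clamped at the sequence borders. This is an illustration of the concept only, not NATTEN's fused-kernel implementation, and the function name and shapes are chosen for the example.

```python
import numpy as np

def neighborhood_attention_1d(q, k, v, kernel_size=3):
    """Toy 1-D neighborhood attention (illustrative, not NATTEN's kernels).

    Each of the n queries attends to exactly `kernel_size` keys centered on
    its position; windows near the borders are shifted inward (clamped) so
    every query still sees a full window.
    q, k, v: arrays of shape (n, d).
    """
    n, d = q.shape
    r = kernel_size // 2
    out = np.empty_like(v)
    for i in range(n):
        # Clamp the window start so the window stays inside [0, n).
        start = min(max(i - r, 0), n - kernel_size)
        window = slice(start, start + kernel_size)
        scores = q[i] @ k[window].T / np.sqrt(d)   # (kernel_size,)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                   # softmax over the window
        out[i] = weights @ v[window]
    return out
```

With `kernel_size` equal to the sequence length, the window covers every key and this reduces to ordinary dense softmax attention; the fast implementations listed below exist precisely to avoid this O(n · kernel_size) Python loop on real workloads.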
- Neighborhood Attention Transformer, arxiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arxiv 2022 ☆1,176 · May 15, 2024 · Updated last year
- [ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference. ☆969 · Feb 25, 2026 · Updated last month
- EDM2 and Autoguidance -- Official PyTorch implementation ☆826 · Dec 9, 2024 · Updated last year
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- [WIP] Better (FP8) attention for Hopper ☆32 · Feb 24, 2025 · Updated last year
- New flexible and efficient image generation framework that sets new SOTA on FFHQ-256 with FID 2.05, 2022 ☆102 · Jun 26, 2025 · Updated 9 months ago
- Helpful tools and examples for working with flex-attention ☆1,161 · Feb 8, 2026 · Updated last month
- [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end… ☆3,249 · Jan 17, 2026 · Updated 2 months ago
- [ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think ☆1,585 · Mar 16, 2025 · Updated last year
- Tile primitives for speedy kernels ☆3,244 · Mar 17, 2026 · Updated last week
- [NeurIPS 2025] Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up". ☆214 · Sep 27, 2025 · Updated 6 months ago
- Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers" ☆1,129 · Dec 22, 2025 · Updated 3 months ago
- Blazingly fast neighborhood attention ☆14 · Nov 28, 2023 · Updated 2 years ago
- Karras et al. (2022) diffusion models for PyTorch ☆2,576 · Feb 12, 2026 · Updated last month
- A unified inference and post-training framework for accelerated video generation. ☆3,321 · Updated this week
- Efficient vision foundation models for high-resolution generation and perception. ☆3,270 · Sep 5, 2025 · Updated 6 months ago
- VideoSys: An easy and efficient system for video generation ☆2,020 · Aug 27, 2025 · Updated 7 months ago
- ☆80 · Dec 27, 2024 · Updated last year
- Lumina-T2X is a unified framework for Text to Any Modality Generation ☆2,254 · Feb 16, 2025 · Updated last year
- ☆23 · Jun 18, 2024 · Updated last year
- ☆191 · Jan 14, 2025 · Updated last year
- Minimal implementation of scalable rectified flow transformers, based on SD3's approach ☆636 · Jul 1, 2024 · Updated last year
- Hackable and optimized Transformers building blocks, supporting a composable construction. ☆10,388 · Mar 18, 2026 · Updated last week
- PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis ☆3,284 · Oct 31, 2024 · Updated last year
- ☆261 · Jul 11, 2024 · Updated last year
- Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model ☆1,297 · Jun 8, 2025 · Updated 9 months ago
- ☆235 · Oct 11, 2024 · Updated last year
- Fast and memory-efficient exact attention ☆22,938 · Mar 23, 2026 · Updated last week
- Official PyTorch Implementation of "Scalable Diffusion Models with Transformers" ☆8,450 · May 31, 2024 · Updated last year
- 🚀 Efficient implementations of state-of-the-art linear attention models ☆4,692 · Updated this week
- Ring attention implementation with flash attention ☆998 · Sep 10, 2025 · Updated 6 months ago
- USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference ☆653 · Jan 15, 2026 · Updated 2 months ago
- ☆44 · Oct 26, 2024 · Updated last year
- xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism ☆2,577 · Updated this week
- [ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention ☆648 · Mar 6, 2026 · Updated 3 weeks ago
- A Quirky Assortment of CuTe Kernels ☆863 · Mar 22, 2026 · Updated last week
- A suite of image and video neural tokenizers ☆1,717 · Feb 11, 2025 · Updated last year
- ☆175 · Jan 8, 2026 · Updated 2 months ago
- [ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer" ☆450 · Oct 29, 2025 · Updated 5 months ago