shawnricecake/draft-attention

Code for Draft Attention

☆103

Alternatives and similar repositories for draft-attention

Users who are interested in draft-attention are comparing it to the libraries listed below.

shawnricecake / fast-car
[ICLR 2026] FastCar
☆16 • May 22, 2025 • Updated last year
Peyton-Chen / Sparse-vDiT
The official implementation of "Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers" (arXiv …
☆52 • Jun 6, 2025 • Updated last year
svg-project / Sparse-VideoGen
[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention
☆695 • Jul 4, 2026 • Updated 3 weeks ago
JIA-Lab-research / Jenga
[NeurIPS 2025] Training-Free Efficient Video Generation via Dynamic Token Carving
☆287 • Aug 4, 2025 • Updated 11 months ago
NonvolatileMemory / flash_tree_attn
☆20 • Dec 24, 2024 • Updated last year
hao-ai-lab / Awesome-Video-Attention
A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and cach…
☆61 • Oct 27, 2025 • Updated 8 months ago
OliverRensu / GRAT
This repository includes the official implementation of our paper "Grouping First, Attending Smartly: Training-Free Acceleration for Diff…
☆56 • May 21, 2025 • Updated last year
BienLuky / Rectified-SpaAttn
The official implementation of "Rectified SpaAttn: Revisiting Attention Sparsity for Efficient Video Generation"
☆22 • Feb 8, 2026 • Updated 5 months ago
mit-han-lab / radial-attention
[NeurIPS 2025] Radial Attention: O(nlogn) Sparse Attention with Energy Decay for Long Video Generation
☆604 • Nov 11, 2025 • Updated 8 months ago
thu-ml / SpargeAttn
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
☆1,019 • Feb 25, 2026 • Updated 5 months ago
thu-nics / DiTFastAttn
☆192 • Jan 14, 2025 • Updated last year
ziplab / BLADE
[ICLR 2026] This is the official PyTorch implementation of "BLADE: Block-Sparse Attention Meets Step Distillation for Efficient Video Gen…
☆49 • Oct 9, 2025 • Updated 9 months ago
KlingAIResearch / VMoBA
Official implementation of paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models"
☆64 • Jul 1, 2025 • Updated last year
H-EmbodVis / EasyCache
Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching
☆291 • May 12, 2026 • Updated 2 months ago
SandAI-org / MagiAttention
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
☆888 • Updated this week
jingjing0419 / SAQ-SAM
[AAAI 2026] Implementation of SAQ-SAM: Semantically-Aligned Quantization for Segment Anything Model
☆17 • Nov 27, 2025 • Updated 7 months ago
mit-han-lab / Block-Sparse-Attention
A sparse attention kernel supporting mix sparse patterns
☆539 • Jan 18, 2026 • Updated 6 months ago
ziplab / Pyramid-Sparse-Attention
Official PyTorch implementation of [PSA: Pyramid Sparse Attention for Efficient Video Understanding and Generation](https://arxiv.org/abs…
☆25 • Jan 25, 2026 • Updated 6 months ago
Bluear7878 / H2-Cache-A-Hierarchical-Dual-Stage-Cache
☆22 • Nov 3, 2025 • Updated 8 months ago
mit-han-lab / x-attention
[ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring
☆280 • Jul 6, 2025 • Updated last year
thu-ml / SLA
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
☆324 • Feb 24, 2026 • Updated 5 months ago
Vicky0522 / TokensGen
[ICCV 2025] TokensGen: Harnessing Condensed Tokens for Long Video Generation
☆57 • Dec 10, 2025 • Updated 7 months ago
NVlabs / rcm
rCM & Causal-rCM: Leading and Unified Algorithms/Infrastructures for Bidirectional/Autoregressive Video Diffusion Distillation at Scale
☆772 • Jun 25, 2026 • Updated last month
ali-vilab / TTS-VAR
Test-time Scaling for VAR models
☆33 • Sep 19, 2025 • Updated 10 months ago
Infrasys-AI / aiinfra-docs
☆21 • Nov 6, 2025 • Updated 8 months ago
Zehong-Ma / MagCache
The official code for NeurIPS 2025 "MagCache: Fast Video Generation with Magnitude-Aware Cache"
☆276 • Nov 17, 2025 • Updated 8 months ago
Shenyi-Z / TaylorSeer
[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
☆407 • Mar 2, 2026 • Updated 4 months ago
chengzeyi / ParaAttention
https://wavespeed.ai/ Context parallel attention that accelerates DiT model inference with dynamic caching
☆427 • Jul 5, 2025 • Updated last year
huggingface / flux-fast
Making Flux go brrr on GPUs.
☆171 • Jan 5, 2026 • Updated 6 months ago
G-U-N / consolver
[CVPR 2026 (Highlight)] Unofficial Implementation of "Image Diffusion Preview with Consistency Solver"
☆30 • Jan 24, 2026 • Updated 6 months ago
ICTMCG / SDTM
Official repository for "Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration", which has been …
☆17 • Sep 29, 2025 • Updated 9 months ago
dc-ai-projects / DC-VideoGen
DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder
☆192 • Oct 5, 2025 • Updated 9 months ago
Tencent-Hunyuan / DisCa
DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching
☆24 • Apr 15, 2026 • Updated 3 months ago
sandyresearch / chipmunk
🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E ⚡ ColumnSparseAttn 9.3× vs FlashAttn‑3 💨 ColumnSparseGEMM 2.5× …
☆111 • Sep 8, 2025 • Updated 10 months ago
xlite-dev / Awesome-DiT-Inference
📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉
☆579 • Jun 13, 2026 • Updated last month
YujiaHu1109 / IEAP
[NeurIPS 2025] IEAP: Image Editing As Programs with Diffusion Models
☆118 • Sep 27, 2025 • Updated 9 months ago
choi403 / ALG
Improving Motion in Image-to-Video Models via Adaptive Low-Pass Guidance (CVPR 2026 Highlight)
☆59 • Feb 23, 2026 • Updated 5 months ago
ali-vilab / TeaCache
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
☆1,357 • Jun 8, 2025 • Updated last year
Bujiazi / DiCache
[ICLR 2026] Official implementation of DiCache: Let Diffusion Model Determine Its Own Cache
☆61 • Jan 26, 2026 • Updated 6 months ago