hao-ai-lab/Awesome-Video-Attention

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/hao-ai-lab/Awesome-Video-Attention)

hao-ai-lab / Awesome-Video-Attention

A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and caching, etc.

☆61

Alternatives and similar repositories for Awesome-Video-Attention

Users that are interested in Awesome-Video-Attention are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

hao-ai-lab / flash-attention-fp4
View on GitHub
NVFP4 Flash-Attention 4 on BlackWell
☆28Updated this week
KlingAIResearch / VMoBA
View on GitHub
Official implementation of paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models"
☆64Jul 1, 2025Updated last year
svg-project / Sparse-VideoGen
View on GitHub
[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention
☆692Jul 4, 2026Updated 2 weeks ago
hao-ai-lab / DistCA
View on GitHub
Efficient Long-context Language Model Training by Core Attention Disaggregation
☆106Apr 7, 2026Updated 3 months ago
hao-ai-lab / LookaheadReasoning
View on GitHub
[NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning
☆69Oct 31, 2025Updated 8 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
Danielohayon / Block-Sparse-Flash-Attention
View on GitHub
☆34Dec 10, 2025Updated 7 months ago
xlite-dev / flux-faster
View on GitHub
A forked version of flux-fast that makes flux-fast even faster with cache-dit, 3.3x speedup on NVIDIA L20.
☆24Jul 18, 2025Updated last year
sandyresearch / chipmunk
View on GitHub
🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E ⚡ ColumnSparseAttn 9.3× vs FlashAttn‑3 💨 ColumnSparseGEMM 2.5× …
☆111Sep 8, 2025Updated 10 months ago
chenyu-jiang / dcp
View on GitHub
Code repository for the SOSP'25 paper DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism.
☆21Nov 28, 2025Updated 7 months ago
shawnricecake / draft-attention
View on GitHub
Code for Draft Attention
☆103May 22, 2025Updated last year
xlite-dev / Awesome-DiT-Inference
View on GitHub
📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉
☆578Jun 13, 2026Updated last month
thu-ml / SLA
View on GitHub
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
☆322Feb 24, 2026Updated 4 months ago
huggingface / flux-fast
View on GitHub
Making Flux go brrr on GPUs.
☆171Jan 5, 2026Updated 6 months ago
arielshaulov / TokenTrim
View on GitHub
Official implementation of the paper "TOKENTRIM: INFERENCE-TIME TOKEN PRUNING FOR AUTOREGRESSIVE LONG VIDEO GENERATION"
☆15Feb 8, 2026Updated 5 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
lmgame-org / GRL
View on GitHub
Multi-Turn RL Training System with AgentTrainer for Language Model Game Reinforcement Learning
☆65Dec 18, 2025Updated 7 months ago
mit-han-lab / radial-attention
View on GitHub
[NeurIPS 2025] Radial Attention: O(nlogn) Sparse Attention with Energy Decay for Long Video Generation
☆603Nov 11, 2025Updated 8 months ago
gen-ai-team / Wan2.1-NABLA
View on GitHub
Wan: Open and Advanced Large-Scale Video Generative Models
☆31Jul 28, 2025Updated 11 months ago
attention-survey / Efficient_Attention_Survey
View on GitHub
A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention
☆304Dec 1, 2025Updated 7 months ago
oahzxl / Awesome-Efficient-Video-Generation
View on GitHub
A curated list of recent efficient video generation methods.
☆72Oct 7, 2025Updated 9 months ago
Peyton-Chen / Sparse-vDiT
View on GitHub
The official implementation of "Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers" (arXiv …
☆52Jun 6, 2025Updated last year
huggingface / lora-fast
View on GitHub
Minimal repository to demonstrate fast LoRA inference with Flux family of models.
☆32Jul 23, 2025Updated 11 months ago
thu-ml / SpargeAttn
View on GitHub
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
☆1,017Feb 25, 2026Updated 4 months ago
SandAI-org / MagiAttention
View on GitHub
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
☆883Updated this week
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
HanGuo97 / hilt
View on GitHub
☆40Dec 14, 2025Updated 7 months ago
hao-ai-lab / FastVideo
View on GitHub
A unified inference and post-training framework for accelerated video generation.
☆3,862Updated this week
hao-ai-lab / research-agent
View on GitHub
☆17Feb 25, 2026Updated 4 months ago
NVlabs / rcm
View on GitHub
rCM & Causal-rCM: Leading and Unified Algorithms/Infrastructures for Bidirectional/Autoregressive Video Diffusion Distillation at Scale
☆766Jun 25, 2026Updated 3 weeks ago
thu-nics / MixDQ
View on GitHub
[ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
☆50Nov 27, 2024Updated last year
Tammytcl / Awesome-Diffusion-Acceleration-Cache
View on GitHub
A curated list of research papers, resources, and advancements on Diffusion Cache and related efficient diffusion model acceleration tech…
☆86Nov 4, 2025Updated 8 months ago
chengzeyi / ParaAttention
View on GitHub
https://wavespeed.ai/ Context parallel attention that accelerates DiT model inference with dynamic caching
☆427Jul 5, 2025Updated last year
radixark / miles_diffusion
View on GitHub
[Experimental] Miles-diffusion is an post-training framework for large-scale diffusion model training and production workloads, forked fr…
☆21Updated this week
hao-ai-lab / MuxServe
View on GitHub
☆90Oct 17, 2025Updated 9 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
Zehong-Ma / MagCache
View on GitHub
The official code for NeurIPS 2025 "MagCache: Fast Video Generation with Magnitude-Aware Cache"
☆275Nov 17, 2025Updated 8 months ago
zju-jiyicheng / Forcing-KV
View on GitHub
Implementation for paper "Forcing-KV: Hybrid KV Cache Compression for Efficient Autoregressive Video Diffusion Models".
☆118May 17, 2026Updated 2 months ago
Mutual-Luo / Awesome-Sparse-Attention-Video-Diffusion
View on GitHub
Awesome Video Diffusion Transformers Sparse Attention Papers
☆41May 30, 2026Updated last month
hao-ai-lab / d3LLM
View on GitHub
[ICML 2026] d3LLM: Ultra-Fast Diffusion LLM 🚀
☆147May 1, 2026Updated 2 months ago
ziplab / Pyramid-Sparse-Attention
View on GitHub
Official PyTorch implementation of [PSA: Pyramid Sparse Attention for Efficient Video Understanding and Generation](https://arxiv.org/abs…
☆25Jan 25, 2026Updated 5 months ago
togethercomputer / ParallelKernelBench
View on GitHub
☆41Jul 1, 2026Updated 2 weeks ago
Bujiazi / DiCache
View on GitHub
[ICLR 2026] Official implementation of DiCache: Let Diffusion Model Determine Its Own Cache
☆61Jan 26, 2026Updated 5 months ago