Tammytcl / Awesome-Diffusion-Acceleration-Cache

A curated list of research papers, resources, and advancements on Diffusion Cache and related efficient diffusion model acceleration techniques.

☆51

Alternatives and similar repositories for Awesome-Diffusion-Acceleration-Cache

Users who are interested in Awesome-Diffusion-Acceleration-Cache are comparing it to the repositories listed below.

StargazerX0 / ScaleKV
[NeurIPS 2025] ScaleKV: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
☆51 Updated 3 weeks ago
Shenyi-Z / DuCa
(ToCa-v2) A new version of ToCa, with faster speed and better acceleration!
☆39 Updated 8 months ago
Xingyu-Zheng / BiDM
(NeurIPS 2024) BiDM: Pushing the Limit of Quantization for Diffusion Models
☆22 Updated last year
prathebaselva / FORA
FORA introduces a simple yet effective caching mechanism in the Diffusion Transformer architecture for faster inference sampling.
☆51 Updated last year
ThisisBillhe / ZipAR
[ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality…
☆53 Updated 8 months ago
horseee / learning-to-cache
[NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching
☆116 Updated last year
horseee / dKV-Cache
[NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models
☆119 Updated 6 months ago
VainF / TinyFusion
[CVPR 2025 Highlight] TinyFusion: Diffusion Transformers Learned Shallow
☆146 Updated 7 months ago
Bujiazi / DiCache
Official implementation of DiCache: Let Diffusion Model Determine Its Own Cache
☆52 Updated last month
Shenyi-Z / ToCa
[ICLR2025] Accelerating Diffusion Transformers with Token-wise Feature Caching
☆195 Updated 8 months ago
KlingTeam / VMoBA
Official implementation of paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models"
☆55 Updated 4 months ago
thu-nics / R2R
[NeurIPS'25] The official code implementation for paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Tok…
☆59 Updated 2 weeks ago
INV-WZQ / SparseD
[Arxiv 2025] SparseD: Sparse Attention for Diffusion Language Models
☆49 Updated last month
LINs-lab / GMem
[Preprint] GMem: A Modular Approach for Ultra-Efficient Generative Models
☆40 Updated 8 months ago
MarkXCloud / CSpD
The official repo of continuous speculative decoding
☆30 Updated 8 months ago
NUS-HPC-AI-Lab / Dynamic-Diffusion-Transformer
☆89 Updated 8 months ago
Peyton-Chen / Sparse-vDiT
The official implementation of "Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers" (arXiv …
☆49 Updated 5 months ago
ModelTC / TFMQ-DM
[CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for…
☆109 Updated last month
thu-nics / DiTFastAttn
☆187 Updated 10 months ago
OpenSparseLLMs / Skip-DiT
✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints
☆76 Updated 4 months ago
mit-han-lab / lpd
Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation
☆80 Updated 4 months ago
M-E-AGI-Lab / Muddit
Official Implementation of Muddit [Meissonic II]: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model.
☆95 Updated 3 weeks ago
thu-ml / SLA
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
☆140 Updated 2 weeks ago
hp-l33 / ARPG
Autoregressive Image Generation with Randomized Parallel Decoding
☆81 Updated last month
shawnricecake / draft-attention
Code for Draft Attention
☆93 Updated 6 months ago
mit-han-lab / VisCompare
A WebUI for Side-by-Side Comparison of Media (Images/Videos) Across Multiple Folders
☆24 Updated 9 months ago
AdaCache-DiT / AdaCache
Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers"
☆160 Updated last year
NoakLiu / FastCache-xDiT
FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation [Efficient ML Model]
☆45 Updated 2 months ago
csguoh / IntLoRA
[ICML2025] LoRA fine-tune directly on the quantized models.
☆36 Updated last year
Shenyi-Z / Cache4Diffusion
Aiming to integrate most existing feature caching-based diffusion acceleration schemes into a unified framework.
☆81 Updated last month