horseee / dKV-Cache

[NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models

☆120

Alternatives and similar repositories for dKV-Cache

Users who are interested in dKV-Cache are comparing it to the repositories listed below.

maomaocun / dLLM-cache
Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache…
☆186 Updated 3 weeks ago
z-lab / sparselora
[ICML 2025] SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity
☆64 Updated 5 months ago
thu-nics / R2R
[NeurIPS'25] The official code implementation for paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Tok…
☆65 Updated last week
OpenSparseLLMs / Linearization
☆62 Updated 5 months ago
mit-han-lab / x-attention
[ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring
☆255 Updated 5 months ago
Aaronhuang-778 / Mixture-Compressor-MoE
[ICLR 2025] Mixture Compressor for Mixture-of-Experts LLMs Gains More
☆63 Updated 9 months ago
zhijie-group / Discrete-Diffusion-Forcing
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
☆210 Updated 2 months ago
OpenMOSS / DiRL
☆90 Updated 2 weeks ago
ThisisBillhe / ZipAR
[ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality…
☆53 Updated 8 months ago
horseee / learning-to-cache
[NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching
☆116 Updated last year
czg1225 / dParallel
dParallel: Learnable Parallel Decoding for dLLMs
☆44 Updated last month
OpenSparseLLMs / Skip-DiT
✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints
☆78 Updated 5 months ago
mit-han-lab / lpd
Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation
☆80 Updated 4 months ago
StargazerX0 / ScaleKV
[NeurIPS 2025] ScaleKV: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
☆51 Updated last month
Gen-Verse / dLLM-RL
TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models
☆339 Updated 3 weeks ago
ML-GSAI / Diffusion-LLM-Papers
A Collection of Papers on Diffusion Language Models
☆148 Updated 2 months ago
Infini-AI-Lab / Multiverse
☆104 Updated 2 months ago
yu-rp / Dimple
Dimple, the first Discrete Diffusion Multimodal Large Language Model
☆112 Updated 5 months ago
thu-ml / ReMoE
[ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.
☆99 Updated 11 months ago
czg1225 / VeriThinker
[NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient
☆63 Updated 2 months ago
OpenSparseLLMs / MoM
☆112 Updated 2 months ago
mit-han-lab / VisCompare
A WebUI for Side-by-Side Comparison of Media (Images/Videos) Across Multiple Folders
☆24 Updated 9 months ago
LiangrunFlora / Slow-Fast-Sampling
Official PyTorch implementation of the paper "Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Princ…
☆35 Updated 4 months ago
NUS-HPC-AI-Lab / DD-Ranking
Data distillation benchmark
☆71 Updated 5 months ago
ThisisBillhe / EfficientDM
[ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di…
☆66 Updated last year
inclusionAI / dFactory
Easy and Efficient dLLM Fine-Tuning
☆131 Updated 2 weeks ago
yczhou001 / Awesome-Diffusion-LLM
paper list, tutorial, and nano code snippet for Diffusion Large Language Models.
☆136 Updated 5 months ago
tilde-research / nsa-impl
An efficient implementation of the NSA (Native Sparse Attention) kernel
☆126 Updated 5 months ago
NEUIR / PC-Sampler
☆32 Updated 2 months ago
sustcsonglin / linear-attention-and-beyond-slides
☆100 Updated 9 months ago