thu-ml/SLA

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention

☆324

Alternatives and similar repositories for SLA

Users who are interested in SLA are comparing it to the repositories listed below.

thu-ml / SpargeAttn
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
☆1,020 · Feb 25, 2026 · Updated 5 months ago
thu-ml / TurboDiffusion
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
☆3,582 · Jul 16, 2026 · Updated last week
svg-project / Sparse-VideoGen
[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention
☆694 · Jul 4, 2026 · Updated 3 weeks ago
thu-ml / SageAttention
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-t…
☆3,502 · Jan 17, 2026 · Updated 6 months ago
attention-survey / Efficient_Attention_Survey
A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention
☆304 · Dec 1, 2025 · Updated 7 months ago
NVlabs / rcm
rCM & Causal-rCM: Leading and Unified Algorithms/Infrastructures for Bidirectional/Autoregressive Video Diffusion Distillation at Scale
☆772 · Jun 25, 2026 · Updated last month
mit-han-lab / radial-attention
[NeurIPS 2025] Radial Attention: O(nlogn) Sparse Attention with Energy Decay for Long Video Generation
☆604 · Nov 11, 2025 · Updated 8 months ago
mit-han-lab / flash-moba
☆251 · Nov 19, 2025 · Updated 8 months ago
jt-zhang / Sparse_Attention_API
☆66 · Oct 25, 2025 · Updated 8 months ago
mit-han-lab / fouroversix
Code for the papers: “Four Over Six: More Accurate NVFP4 Quantization with Adaptive Block Scaling” and “Adaptive Block-Scaled Data Types”
☆199 · Apr 21, 2026 · Updated 3 months ago
ziplab / BLADE
[ICLR 2026] This is the official PyTorch implementation of "BLADE: Block-Sparse Attention Meets Step Distillation for Efficient Video Gen…
☆49 · Oct 9, 2025 · Updated 9 months ago
hao-ai-lab / FastVideo
A unified inference and post-training framework for accelerated video generation.
☆3,879 · Updated this week
hao-ai-lab / Awesome-Video-Attention
A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and cach…
☆61 · Oct 27, 2025 · Updated 8 months ago
tsinghua-ideal / Twilight
[NeurIPS'25 Spotlight] Adaptive Attention Sparsity with Hierarchical Top-p Pruning
☆105 · Jul 8, 2026 · Updated 2 weeks ago
mit-han-lab / Block-Sparse-Attention
A sparse attention kernel supporting mix sparse patterns
☆539 · Jan 18, 2026 · Updated 6 months ago
SandAI-org / MagiAttention
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
☆888 · Updated this week
Tencent-Hunyuan / flex-block-attn
flex-block-attn: an efficient block sparse attention computation library
☆130 · Dec 26, 2025 · Updated 6 months ago
thu-ml / TetraJet-MXFP4Training
Pytorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" on DeiT Model Pre-training
☆40 · May 4, 2026 · Updated 2 months ago
ziplab / Pyramid-Sparse-Attention
Official PyTorch implementation of [PSA: Pyramid Sparse Attention for Efficient Video Understanding and Generation](https://arxiv.org/abs…
☆25 · Jan 25, 2026 · Updated 6 months ago
chengtao-lv / LightForcing
[ICML 2026] Official repository for the paper "Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention"
☆42 · May 24, 2026 · Updated 2 months ago
jt-zhang / CardinalityEstimationTestbed
CardinalityEstimationTestbed
☆49 · Sep 6, 2024 · Updated last year
KlingAIResearch / VMoBA
Official implementation of paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models"
☆64 · Jul 1, 2025 · Updated last year
SingleZombie / LLSA
[CVPR 2026 Highlight] Official implementation of Log-linear Sparse Attention (LLSA).
☆91 · May 1, 2026 · Updated 2 months ago
Dao-AILab / sonic-moe
Accelerating MoE with IO and Tile-aware Optimizations
☆732 · Jul 4, 2026 · Updated 3 weeks ago
sspec-project / SparseSpec
Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding
☆115 · Dec 2, 2025 · Updated 7 months ago
thu-ml / Causal-Forcing
[ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactiv…
☆879 · Updated this week
JIA-Lab-research / Jenga
[NeurIPS 2025] Training-Free Efficient Video Generation via Dynamic Token Carving
☆287 · Aug 4, 2025 · Updated 11 months ago
mit-han-lab / x-attention
[ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring
☆280 · Jul 6, 2025 · Updated last year
TencentARC / RollingForcing
[ICLR 2026] Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
☆444 · Oct 31, 2025 · Updated 8 months ago
GoatWu / Self-Forcing-Plus
Unofficial extension implementation of Self-Forcing to support I2V && 14B training.
☆380 · Sep 29, 2025 · Updated 9 months ago
fla-org / flash-linear-attention
🚀 Efficient implementations for emerging model architectures
☆5,409 · Updated this week
Infini-AI-Lab / MonarchRT
☆140 · Feb 17, 2026 · Updated 5 months ago
xinghaow99 / pbs-attn
[ICML 2026] Sparser Block-Sparse Attention via Token Permutation
☆31 · May 22, 2026 · Updated 2 months ago
JaydenLyh / Reward-Forcing
[CVPR 2026 Highlight] Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
☆352 · Dec 15, 2025 · Updated 7 months ago
mit-han-lab / fastrl
[ASPLOS'26] Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter
☆174 · Feb 27, 2026 · Updated 4 months ago
shawnricecake / draft-attention
Code for Draft Attention
☆103 · May 22, 2025 · Updated last year
tianweiy / CausVid
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
☆1,408 · Aug 7, 2025 · Updated 11 months ago
chenyu-jiang / dcp
Code repository for the SOSP'25 paper DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism.
☆21 · Nov 28, 2025 · Updated 7 months ago
yuezhouhu / residual-context-diffusion
[ICML 2026] Residual Context Diffusion (RCD): Repurposing discarded signals as structured priors for high-performance reasoning in dLLMs.
☆58 · Jun 28, 2026 · Updated 3 weeks ago