xlite-dev / flux-faster

A forked version of flux-fast that makes flux-fast even faster with cache-dit, 3.3x speedup on NVIDIA L20.

☆24

Alternatives and similar repositories for flux-faster

Users who are interested in flux-faster are comparing it to the libraries listed below.


chengzeyi / piflux
(WIP) Parallel inference for black-forest-labs' FLUX model.
☆18 Updated last year
xdit-project / DistVAE
A parallelism VAE avoids OOM for high resolution image generation
☆82 Updated 3 months ago
shawnricecake / draft-attention
Code for Draft Attention
☆93 Updated 5 months ago
RiseAI-Sys / DAX
High performance inference engine for diffusion models
☆95 Updated 2 months ago
huggingface / flux-fast
Making Flux go brrr on GPUs.
☆154 Updated 4 months ago
thu-nics / DiTFastAttn
☆186 Updated 10 months ago
Bujiazi / DiCache
Official implementation of DiCache: Let Diffusion Model Determine Its Own Cache
☆52 Updated last month
thu-nics / MixDQ
[ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
☆46 Updated 11 months ago
WaveSpeedAI / QuantumAttention
[WIP] Better (FP8) attention for Hopper
☆32 Updated 8 months ago
TMElyralab / lyraDiff
An out-of-the-box inference acceleration engine for Diffusion and DiT models
☆58 Updated 8 months ago
Roblox / SmoothCache
Implementation of SmoothCache, a project aimed at speeding-up Diffusion Transformer (DiT) based GenAI models with error-guided caching.
☆45 Updated 4 months ago
vipshop / cache-dit
A Unified and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for 🤗DiTs.
☆560 Updated this week
TZW1998 / ParaTAA-Diffusion
This is the official repo for the paper "Accelerating Parallel Sampling of Diffusion Models" Tang et al. ICML 2024 https://openreview.net…
☆16 Updated last year
czg1225 / AsyncDiff
[NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
☆207 Updated last month
NoakLiu / FastCache-xDiT
FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation [Efficient ML Model]
☆45 Updated 2 months ago
AdaCache-DiT / AdaCache
Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers"
☆160 Updated last year
mit-han-lab / patch_conv
Patch convolution to avoid large GPU memory usage of Conv2D
☆93 Updated 9 months ago
Vchitect / LiteGen
A light-weight and high-efficient training framework for accelerating diffusion tasks.
☆50 Updated last year
chengzeyi / ParaAttention
https://wavespeed.ai/ Context parallel attention that accelerates DiT model inference with dynamic caching
☆388 Updated 4 months ago
timudk / flux_triton
Writing FLUX in Triton
☆41 Updated last year
ziplab / PTQD
The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models
☆101 Updated last year
mit-han-lab / VisCompare
A WebUI for Side-by-Side Comparison of Media (Images/Videos) Across Multiple Folders
☆24 Updated 9 months ago
thu-nics / ViDiT-Q
[ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
☆132 Updated 7 months ago
Vchitect / FasterCache
[ICLR 2025] FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
☆251 Updated 10 months ago
MingXiangL / Teacache-xDiT
Combining Teacache with xDiT to Accelerate Visual Generation Models
☆32 Updated 7 months ago
KONAKONA666 / q8_kernels
☆77 Updated 10 months ago
xdit-project / DiTCacheAnalysis
An auxiliary project analysis of the characteristics of KV in DiT Attention.
☆32 Updated 11 months ago
sandyresearch / chipmunk
🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E ⚡ ColumnSparseAttn 9.3× vs FlashAttn‑3 💨 ColumnSparseGEMM 2.5× …
☆90 Updated 2 months ago
horseee / learning-to-cache
[NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching
☆116 Updated last year
TencentARC / FluxKits
☆108 Updated 11 months ago