aredden/flux-fp8-api

Flux diffusion model implementation that uses quantized fp8 matmuls, while the remaining layers use faster half-precision accumulation; this is ~2x faster on consumer devices.

☆287

Alternatives and similar repositories for flux-fp8-api

Users who are interested in flux-fp8-api are comparing it to the libraries listed below.

replicate / cog-flux
Cog inference for flux models
☆371 · Jul 31, 2025 · Updated 11 months ago
sayakpaul / diffusers-torchao
End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training).
☆399 · Jan 8, 2026 · Updated 6 months ago
aredden / torch-cublas-hgemm
PyTorch half precision gemm lib w/ fused optional bias + optional relu/gelu
☆78 · Dec 3, 2024 · Updated last year
bghira / SimpleTuner
A general fine-tuning kit geared toward image/video/audio diffusion models.
☆2,885 · Updated this week
gojasper / style-rank
A unified benchmarking framework for generative styling models in PyTorch
☆14 · Oct 27, 2024 · Updated last year
chengzeyi / ParaAttention
https://wavespeed.ai/ Context parallel attention that accelerates DiT model inference with dynamic caching
☆427 · Jul 5, 2025 · Updated last year
cloneofsimo / repa-rf
☆32 · Nov 4, 2024 · Updated last year
siliconflow / onediff
OneDiff: An out-of-the-box acceleration library for diffusion models.
☆1,964 · Dec 4, 2025 · Updated 7 months ago
thecharlieblake / lovely-llama
An implementation of the Llama architecture, to instruct and delight
☆21 · May 31, 2025 · Updated last year
KONAKONA666 / q8_kernels
☆82 · Dec 27, 2024 · Updated last year
nunchaku-ai / nunchaku
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
☆3,915 · Mar 7, 2026 · Updated 4 months ago
discus0434 / comfyui-flux-accelerator
Accelerates Flux.1 image generation, just by using this node.
☆141 · Dec 19, 2024 · Updated last year
gau-nernst / kokoro
https://hf.co/hexgrad/Kokoro-82M
☆14 · Jan 14, 2026 · Updated 6 months ago
chengzeyi / stable-fast
https://wavespeed.ai/ Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
☆1,304 · Mar 27, 2025 · Updated last year
nunchaku-ai / deepcompressor
Model Compression Toolbox for Large Language Models and Diffusion Models
☆795 · Aug 14, 2025 · Updated 11 months ago
czg1225 / AsyncDiff
[NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
☆215 · Sep 27, 2025 · Updated 9 months ago
TencentARC / FluxKits
☆109 · Nov 27, 2024 · Updated last year
SwayStar123 / microdiffusion
☆49 · Feb 23, 2025 · Updated last year
XLabs-AI / x-flux
☆2,232 · Nov 8, 2024 · Updated last year
PixArt-alpha / PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
☆1,933 · Oct 31, 2024 · Updated last year
timudk / flux_triton
Writing FLUX in Triton
☆42 · Sep 22, 2024 · Updated last year
LituRout / RF-Inversion
Rectified Flow Inversion (RF-Inversion) - ICLR 2025
☆478 · Mar 19, 2025 · Updated last year
Huage001 / CLEAR
[NeurIPS 2025] Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up".
☆219 · Sep 27, 2025 · Updated 9 months ago
yardenfren1996 / B-LoRA
Implicit Style-Content Separation using B-LoRA
☆402 · Nov 14, 2024 · Updated last year
lllyasviel / LayerDiffuse_DiffusersCLI
LayerDiffuse in pure diffusers without any GUI
☆422 · Jun 16, 2024 · Updated 2 years ago
rohitgandikota / sliders
Concept Sliders for Precise Control of Diffusion Models
☆1,136 · Apr 13, 2026 · Updated 3 months ago
tdrussell / diffusion-pipe
A pipeline parallel training script for diffusion models.
☆1,997 · Jun 29, 2026 · Updated 3 weeks ago
Alpha-VLLM / Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
☆2,247 · Feb 16, 2025 · Updated last year
latentCall145 / channels-last-groupnorm
A CUDA kernel for NHWC GroupNorm for PyTorch
☆23 · Nov 15, 2024 · Updated last year
ai-forever / Kandinsky-4
Text and image to video generation: Kandinsky 4.0 (2024)
☆150 · Dec 17, 2024 · Updated last year
cloneofsimo / minDinoV2
☆24 · Oct 15, 2024 · Updated last year
mit-han-lab / distrifuser
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
☆727 · Dec 2, 2024 · Updated last year
logtd / ComfyUI-Fluxtapoz
Nodes for image juxtaposition for Flux in ComfyUI
☆1,397 · Jan 9, 2025 · Updated last year
itsmag11 / Omegance
Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis (ICCV, 2025)
☆52 · Jan 14, 2026 · Updated 6 months ago
google / RB-Modulation
Official code for "RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control"
☆404 · Mar 19, 2025 · Updated last year
xdit-project / xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
☆2,662 · Jul 14, 2026 · Updated last week
huggingface / optimum-quanto
A pytorch quantization backend for optimum
☆1,048 · Updated this week
sayakpaul / tt-scale-flux
Inference-time scaling of diffusion-based image and video generation models.
☆174 · Dec 17, 2025 · Updated 7 months ago
Yuanshi9815 / OminiControl
[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer
☆1,926 · Jul 2, 2026 · Updated 3 weeks ago