Flux diffusion model implementation that uses quantized fp8 matmuls, with the remaining layers using faster half-precision accumulation; it is ~2x faster on consumer devices.
☆285 · Oct 12, 2024 · Updated last year
Alternatives and similar repositories for flux-fp8-api
Users that are interested in flux-fp8-api are comparing it to the libraries listed below.
- Cog inference for flux models ☆370 · Jul 31, 2025 · Updated 9 months ago
- End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training). ☆396 · Jan 8, 2026 · Updated 4 months ago
- PyTorch half precision gemm lib w/ fused optional bias + optional relu/gelu ☆78 · Dec 3, 2024 · Updated last year
- A general fine-tuning kit geared toward image/video/audio diffusion models. ☆2,833 · Updated this week
- A unified benchmarking framework for generative styling models in PyTorch ☆14 · Oct 27, 2024 · Updated last year
- ☆33 · Nov 4, 2024 · Updated last year
- https://wavespeed.ai/ Context parallel attention that accelerates DiT model inference with dynamic caching ☆427 · Jul 5, 2025 · Updated 10 months ago
- OneDiff: An out-of-the-box acceleration library for diffusion models. ☆1,966 · Dec 4, 2025 · Updated 5 months ago
- [ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models ☆3,853 · Mar 7, 2026 · Updated 2 months ago
- ☆81 · Dec 27, 2024 · Updated last year
- An implementation of the Llama architecture, to instruct and delight ☆21 · May 31, 2025 · Updated 11 months ago
- Accelerates Flux.1 image generation, just by using this node. ☆140 · Dec 19, 2024 · Updated last year
- [NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising ☆213 · Sep 27, 2025 · Updated 7 months ago
- https://hf.co/hexgrad/Kokoro-82M ☆14 · Jan 14, 2026 · Updated 4 months ago
- https://wavespeed.ai/ Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs. ☆1,304 · Mar 27, 2025 · Updated last year
- ☆110 · Nov 27, 2024 · Updated last year
- ☆49 · Feb 23, 2025 · Updated last year
- [NeurIPS 2025] Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up". ☆215 · Sep 27, 2025 · Updated 7 months ago
- ☆2,237 · Nov 8, 2024 · Updated last year
- Writing FLUX in Triton ☆42 · Sep 22, 2024 · Updated last year
- PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation ☆1,922 · Oct 31, 2024 · Updated last year
- Rectified Flow Inversion (RF-Inversion) - ICLR 2025 ☆474 · Mar 19, 2025 · Updated last year
- Implicit Style-Content Separation using B-LoRA ☆400 · Nov 14, 2024 · Updated last year
- The ultimate training toolkit for finetuning diffusion models ☆10,614 · May 19, 2026 · Updated last week
- LayerDiffuse in pure diffusers without any GUI ☆422 · Jun 16, 2024 · Updated last year
- Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis (ICCV, 2025) ☆52 · Jan 14, 2026 · Updated 4 months ago
- Official repository for VQDM: Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization paper ☆34 · Sep 17, 2024 · Updated last year
- Concept Sliders for Precise Control of Diffusion Models ☆1,134 · Apr 13, 2026 · Updated last month
- Text and image to video generation: Kandinsky 4.0 (2024) ☆149 · Dec 17, 2024 · Updated last year
- Text-Guided Generation of Full-Body Image with Preserved Reference Face for Customized Animation ☆24 · Jun 24, 2024 · Updated last year
- Model Compression Toolbox for Large Language Models and Diffusion Models ☆783 · Aug 14, 2025 · Updated 9 months ago
- Lumina-T2X is a unified framework for Text to Any Modality Generation ☆2,252 · Feb 16, 2025 · Updated last year
- [CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models ☆727 · Dec 2, 2024 · Updated last year
- Official code for "RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control" ☆404 · Mar 19, 2025 · Updated last year
- Nodes for image juxtaposition for Flux in ComfyUI ☆1,399 · Jan 9, 2025 · Updated last year
- A CUDA kernel for NHWC GroupNorm for PyTorch ☆23 · Nov 15, 2024 · Updated last year
- PyTorch code for our paper "Progressive Binarization with Semi-Structured Pruning for LLMs" ☆13 · Mar 11, 2026 · Updated 2 months ago
- A pytorch quantization backend for optimum ☆1,040 · Apr 2, 2026 · Updated last month
- Official codes of VEnhancer: Generative Space-Time Enhancement for Video Generation ☆576 · Sep 16, 2024 · Updated last year