arnavdantuluri / StableTriton
The first open-source Triton inference engine for Stable Diffusion, specifically for SDXL
☆12Updated 2 years ago
Alternatives and similar repositories for StableTriton
Users who are interested in StableTriton are comparing it to the libraries listed below.
- [ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization☆47Updated last year
- A parallel VAE that avoids OOM for high-resolution image generation☆84Updated 4 months ago
- https://wavespeed.ai/ Context parallel attention that accelerates DiT model inference with dynamic caching☆396Updated 5 months ago
- Model Compression Toolbox for Large Language Models and Diffusion Models☆706Updated 3 months ago
- [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.☆365Updated last year
- TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.☆20Updated last year
- High performance inference engine for diffusion models☆96Updated 3 months ago
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models☆102Updated last year
- [CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models☆713Updated last year
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation☆139Updated 8 months ago
- Faster generation with text-to-image diffusion models.☆231Updated 5 months ago
- stable diffusion, controlnet, tensorrt, accelerate☆58Updated 2 years ago
- ☆187Updated 10 months ago
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023)☆140Updated 2 years ago
- 🤗A PyTorch-native Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs: Z-Image, FLUX2, Qwen-Image, etc.☆676Updated this week
- A toolkit for developers to simplify the transformation of nn.Module instances. It now corresponds to torch.fx.☆13Updated 2 years ago
- Patch convolution to avoid large GPU memory usage of Conv2D☆93Updated 10 months ago
- Combining TeaCache with xDiT to Accelerate Visual Generation Models☆32Updated 7 months ago
- Real-time inference for Stable Diffusion - 0.88s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention. Join our Discord community…☆560Updated 2 years ago
- An auxiliary project analyzing the characteristics of KV in DiT Attention.☆32Updated last year
- 📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉☆454Updated last week
- ☆166Updated 2 years ago
- A CUDA kernel for NHWC GroupNorm for PyTorch☆21Updated last year
- 🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E ⚡ ColumnSparseAttn 9.3× vs FlashAttn‑3 💨 ColumnSparseGEMM 2.5× …☆92Updated 3 months ago
- Using TVM to deploy Transformer models on CPU and GPU☆11Updated 4 years ago
- QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.☆148Updated 3 months ago
- High Performance Int8 GEMM Kernels for SM80 and later GPUs.☆18Updated 8 months ago
- ☆27Updated 2 years ago
- ☆59Updated 4 months ago
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti…☆49Updated 2 years ago