arnavdantuluri / StableTriton
The first open-source Triton inference engine for Stable Diffusion, specifically for SDXL
☆12Updated last year
Alternatives and similar repositories for StableTriton:
Users who are interested in StableTriton are comparing it to the libraries listed below:
- [ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization☆32Updated 2 months ago
- Context parallel attention that accelerates DiT model inference with dynamic caching☆189Updated this week
- A parallel VAE that avoids OOM for high-resolution image generation☆53Updated last month
- Faster generation with text-to-image diffusion models.☆210Updated 4 months ago
- Model Compression Toolbox for Large Language Models and Diffusion Models☆330Updated this week
- stable diffusion, controlnet, tensorrt, accelerate☆55Updated last year
- TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.☆17Updated 11 months ago
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation☆55Updated last week
- [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.☆343Updated 11 months ago
- Patch convolution to avoid large GPU memory usage of Conv2D☆85Updated 3 weeks ago
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models☆94Updated 11 months ago
- ☆81Updated 5 months ago
- Official implementation of the ICLR 2024 paper AffineQuant☆24Updated 10 months ago
- ☆26Updated last year
- ☆14Updated 10 months ago
- An auxiliary project that analyzes the characteristics of KV in DiT attention.☆25Updated 2 months ago
- Diffusers training with mmengine☆99Updated last year
- ☆48Updated 11 months ago
- ☆144Updated last month
- A toolkit that helps developers simplify the transformation of nn.Module instances. It currently corresponds to torch.fx.☆13Updated last year
- ☆34Updated last year
- [NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising☆179Updated 4 months ago
- Flux diffusion model implementation using quantized fp8 matmul & remaining layers use faster half precision accumulate, which is ~2x fast…☆248Updated 4 months ago
- 📖A curated list of Awesome Diffusion Inference Papers with codes: Sampling, Caching, Multi-GPUs, etc. 🎉🎉☆191Updated last month
- ☆140Updated 10 months ago
- [ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models☆679Updated this week
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.☆89Updated this week
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023)☆129Updated last year
- ☆157Updated last year