arnavdantuluri / StableTriton
The first open-source Triton inference engine for Stable Diffusion, specifically for SDXL
☆12 · Updated 2 years ago
Alternatives and similar repositories for StableTriton
Users interested in StableTriton are comparing it to the libraries listed below.
- [ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization ☆48 · Updated last year
- [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models. ☆369 · Updated last year
- A parallel VAE that avoids OOM in high-resolution image generation ☆84 · Updated 4 months ago
- https://wavespeed.ai/ Context-parallel attention that accelerates DiT model inference with dynamic caching ☆409 · Updated 5 months ago
- Model Compression Toolbox for Large Language Models and Diffusion Models ☆722 · Updated 4 months ago
- TensorRT is a C++ library for high-performance inference on NVIDIA GPUs and deep learning accelerators. ☆20 · Updated last year
- High-performance inference engine for diffusion models ☆102 · Updated 3 months ago
- Getting Started with Triton: A Tutorial for Python Beginners ☆27 · Updated 2 months ago
- Faster generation with text-to-image diffusion models. ☆231 · Updated 6 months ago
- 🤗 A PyTorch-native and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs. ☆840 · Updated this week
- A toolkit for developers that simplifies transforming nn.Module instances; it now corresponds to PyTorch's torch.fx. ☆13 · Updated 2 years ago
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models ☆102 · Updated last year
- High-Performance Int8 GEMM Kernels for SM80 and later GPUs. ☆18 · Updated 9 months ago
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023) ☆140 · Updated 2 years ago
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation ☆142 · Updated 9 months ago
- Combining TeaCache with xDiT to Accelerate Visual Generation Models ☆32 · Updated 8 months ago
- [CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models ☆715 · Updated last year
- Real-time inference for Stable Diffusion with 0.88 s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention. Join our Discord community… ☆560 · Updated 2 years ago
- Stable Diffusion, ControlNet, TensorRT, accelerate ☆58 · Updated 2 years ago
- ☆188 · Updated 11 months ago
- [CVPR 2024] DeepCache: Accelerating Diffusion Models for Free ☆950 · Updated last year
- ☆104 · Updated last year
- NART (NART is not A RunTime), a deep learning inference framework. ☆37 · Updated 2 years ago
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… ☆50 · Updated 2 years ago
- A Compressed Stable Diffusion for Efficient Text-to-Image Generation [ECCV'24] ☆306 · Updated last year
- An out-of-the-box inference acceleration engine for Diffusion and DiT models ☆60 · Updated 9 months ago
- Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692) ☆79 · Updated 5 months ago
- https://wavespeed.ai/ Best inference-performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs. ☆1,295 · Updated 9 months ago
- Patch convolution to avoid the large GPU memory usage of Conv2D ☆93 · Updated 11 months ago
- ☆168 · Updated 2 years ago