kamalkraj / stable-diffusion-tritonserver
Deploy a Stable Diffusion model with ONNX/TensorRT + Triton Inference Server
☆123 · Updated last year
Alternatives and similar repositories for stable-diffusion-tritonserver
Users interested in stable-diffusion-tritonserver are comparing it to the libraries listed below.
- The Triton backend for TensorRT. ☆75 · Updated this week
- ☆54 · Updated 2 years ago
- stable diffusion, controlnet, tensorrt, accelerate ☆56 · Updated 2 years ago
- ONNX-Powered Inference for State-of-the-Art Face Upscalers ☆98 · Updated 9 months ago
- ☆31 · Updated 2 years ago
- Faster generation with text-to-image diffusion models. ☆213 · Updated 7 months ago
- Quantized stable-diffusion, cutting memory by 75%; testing in Streamlit, deploying in a container ☆54 · Updated last week
- TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators. ☆19 · Updated last year
- NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that del… ☆26 · Updated last year
- Common source, scripts and utilities shared across all Triton repositories. ☆71 · Updated 2 weeks ago
- Inference speed-up for stable-diffusion (ldm) with TensorRT. ☆35 · Updated last year
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. ☆200 · Updated 3 weeks ago
- A Toolkit to Help Optimize Onnx Model ☆145 · Updated this week
- The Triton backend for the ONNX Runtime. ☆144 · Updated last week
- ☆255 · Updated last week
- ☆99 · Updated last year
- Python bindings for ggml ☆140 · Updated 8 months ago
- Writing FLUX in Triton ☆33 · Updated 7 months ago
- Unofficial implementation. Stable diffusion model trained by AI Feedback-Based Self-Training Direct Preference Optimization. ☆64 · Updated last year
- End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training). ☆349 · Updated 2 months ago
- Common source, scripts and utilities for creating Triton backends. ☆321 · Updated last week
- Flux diffusion model implementation using quantized fp8 matmul & remaining layers use faster half precision accumulate, which is ~2x fast… ☆263 · Updated 7 months ago
- Real-time inference for Stable Diffusion - 0.88s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention. Join our Discord communty… ☆558 · Updated last year
- ☆68 · Updated 4 months ago
- Simple large-scale training of stable diffusion with multi-node support. ☆132 · Updated 2 years ago
- SSD-1B, an open-source text-to-image model, outperforming previous versions by being 50% smaller and 60% faster than SDXL. ☆175 · Updated last year
- Recaption large (Web)Datasets with vllm and save the artifacts. ☆52 · Updated 5 months ago
- https://wavespeed.ai/ Context parallel attention that accelerates DiT model inference with dynamic caching ☆271 · Updated last week
- ☆84 · Updated 2 years ago
- Generate long weighted prompt embeddings for Stable Diffusion ☆116 · Updated 3 weeks ago