A high-throughput and memory-efficient inference and serving engine for LLMs (Windows build & kernels)
☆353Mar 12, 2026Updated 2 weeks ago
Alternatives and similar repositories for vllm-windows
Users that are interested in vllm-windows are comparing it to the libraries listed below.
- Deepspeed windows information☆44Mar 9, 2024Updated 2 years ago
- Advanced CLI diffusion inference/training suite based on Musubi Tuner☆40Mar 20, 2026Updated last week
- This is a pre-built wheel of Triton 3.3.0 for Windows with Nvidia only + Proton☆40May 18, 2025Updated 10 months ago
- ☆18Dec 2, 2024Updated last year
- ☆18Nov 28, 2025Updated 4 months ago
- ☆42Mar 21, 2026Updated last week
- [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-t…☆52Jan 19, 2026Updated 2 months ago
- Fork of the Triton language and compiler for Windows support and easy installation☆1,888Feb 18, 2026Updated last month
- ComfyUI version of Gemma3☆10Sep 6, 2025Updated 6 months ago
- dgenerate is a scriptable command line tool (and library) for generating images and animation sequences using stable diffusion and relate…☆42Oct 15, 2025Updated 5 months ago
- ☆24Nov 23, 2025Updated 4 months ago
- Fork of ACE-Step v1.0 for LoRA training with < 10 GB VRAM☆66Feb 3, 2026Updated last month
- Fine-tuning code for CLIP models☆271Jan 28, 2026Updated 2 months ago
- An Extension for Automatic1111 Webui that makes the interface easier to use on mobile (portrait)☆16Apr 16, 2024Updated last year
- FramePack with existing video input.☆29May 15, 2025Updated 10 months ago
- gguf (GPT-Generated Unified Format) connector☆53Mar 20, 2026Updated last week
- ☆32Jul 20, 2024Updated last year
- ☆63Nov 10, 2025Updated 4 months ago
- My Smart Queue Management Scripts for OpenWRT☆19Oct 14, 2024Updated last year
- Adds a button to download sample images in one click for CivitAI☆45May 17, 2023Updated 2 years ago
- A comprehensive codebase for training and finetuning Image <> Latent models.☆50Mar 1, 2025Updated last year
- realtime conversational dynamics☆19Mar 19, 2025Updated last year
- Sparse Autoencoders (SAE) vs CLIP fine-tuning fun.☆18Dec 19, 2024Updated last year
- Official code for SA-Solver: Stochastic Adams Solver for Fast Sampling of Diffusion Models (NeurIPS 2023)☆13Mar 4, 2024Updated 2 years ago
- Pre-compiled Python whl for Flash-attention, SageAttention, NATTEN, xFormer etc☆559Feb 26, 2026Updated last month
- ☆23Jan 1, 2026Updated 2 months ago
- Extension for Forge-based UIs (Forge, reForge, etc) and ComfyUI to replace CFG with Negative Rejection Steering☆16Feb 14, 2026Updated last month
- Croco.Cpp is a fork of KoboldCPP inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compati…☆164Mar 12, 2026Updated 2 weeks ago
- Simple Qwen3-VL gguf model loader for Comfy-UI.☆48Mar 16, 2026Updated last week
- stable-diffusion-webui-images-browser☆14Jan 12, 2023Updated 3 years ago
- Flux Pro via Replicate API☆23Dec 26, 2024Updated last year
- Code for my collection of predictors/classifiers/etc☆14Jul 18, 2024Updated last year
- GUI-focused roop☆15Mar 6, 2025Updated last year
- Sage attention for Turing.☆59Dec 29, 2025Updated 2 months ago
- HDM model loader for ComfyUI☆41Dec 14, 2025Updated 3 months ago
- ICDE 2025 Paper, Grounding Natural Language to SQL Translation with Data-Based Self-Explanations☆17May 24, 2025Updated 10 months ago
- ☆22Aug 21, 2025Updated 7 months ago
- ☆13May 22, 2024Updated last year
- Controlnet module for Wan2.1☆30Aug 4, 2025Updated 7 months ago