A high-throughput and memory-efficient inference and serving engine for LLMs (Windows build & kernels)
☆514Jun 7, 2026Updated this week
Alternatives and similar repositories for vllm-windows
Users that are interested in vllm-windows are comparing it to the libraries listed below.
- ☆21Nov 28, 2025Updated 6 months ago
- Deepspeed windows information☆44Mar 9, 2024Updated 2 years ago
- One-click Qwen3.6-27B inference on Windows. 158 tok/s on RTX 5090, 72 tok/s on RTX 3090. Native, no WSL, no Docker, no telemetry.☆201May 14, 2026Updated 3 weeks ago
- Advanced CLI diffusion inference/training suite based on Musubi Tuner☆40Apr 15, 2026Updated last month
- Fork of the Triton language and compiler for Windows support and easy installation☆1,933Feb 18, 2026Updated 3 months ago
- ☆18Dec 2, 2024Updated last year
- A repo to quantize diffusion models directly in ComfyUI☆116Jun 14, 2025Updated 11 months ago
- A video clipper for Hunyuan video training.☆87Jul 28, 2025Updated 10 months ago
- [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-t…☆68Jan 19, 2026Updated 4 months ago
- a neural network trainer for weebs☆14Jun 1, 2026Updated last week
- ComfyUI version of Gemma3☆12Sep 6, 2025Updated 9 months ago
- ☆71Updated this week
- Fork of ACE-Step v1.0 for LoRA training with < 10 GB VRAM☆68Feb 3, 2026Updated 4 months ago
- dgenerate is a scriptable command line tool (and library) for generating images and animation sequences using stable diffusion and relate…☆43Oct 15, 2025Updated 7 months ago
- An Extension for Automatic1111 Webui that makes the interface easier to use on mobile (portrait)☆16Apr 16, 2024Updated 2 years ago
- Fine-tuning code for CLIP models☆275Updated this week
- gguf (GPT-Generated Unified Format) connector☆57Jun 4, 2026Updated last week
- FramePack with existing video input.☆29May 15, 2025Updated last year
- llvm powered deobfuscation of a vm-based protection☆57Feb 25, 2026Updated 3 months ago
- ☆32Jul 20, 2024Updated last year
- ☆67Nov 10, 2025Updated 7 months ago
- [DEPRECATED] Attempts to convert a Flux lora to a Chroma lora☆21Nov 9, 2025Updated 7 months ago
- A comprehensive codebase for training and finetuning Image <> Latent models.☆50Mar 1, 2025Updated last year
- Sparse Autoencoders (SAE) vs CLIP fine-tuning fun.☆18Dec 19, 2024Updated last year
- Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without lossi…☆143Mar 24, 2026Updated 2 months ago
- Multi-process GFPGAN that improves runtime efficiency and resource utilization, with speedups of several times depending on the device☆13Oct 31, 2023Updated 2 years ago
- ☆24Jan 1, 2026Updated 5 months ago
- ☆27Updated this week
- Extension for Forge-based UIs (Forge, reForge, etc) and ComfyUI to replace CFG with Negative Rejection Steering☆16May 16, 2026Updated 3 weeks ago
- Pre-compiled Python whl for Flash-attention, SageAttention, NATTEN, xFormer etc☆672Apr 1, 2026Updated 2 months ago
- Croco.Cpp is a fork of KoboldCPP inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compati…☆172Updated this week
- Performs the entire AI cover generation process with UI☆30Aug 4, 2025Updated 10 months ago
- Extension/Script for Stable Diffusion UI by AUTOMATIC1111 https://github.com/AUTOMATIC1111/stable-diffusion-webui☆19Feb 10, 2023Updated 3 years ago
- A Multipurpose toolkit for managing, editing and creating models.☆12Aug 13, 2024Updated last year
- Money tracking - Android App for planning, tracking your spending, monitoring your credit and budget - UNIBO 2016/2017☆12Sep 14, 2017Updated 8 years ago
- ☆15Jun 6, 2025Updated last year
- Code for my collection of predictors/classifiers/etc☆14Jul 18, 2024Updated last year
- An open-source tool for efficiently parsing x64dbg trace files (.trace32 & .trace64).☆46Jan 20, 2026Updated 4 months ago
- HDM model loader for ComfyUI☆42Dec 14, 2025Updated 5 months ago