SystemPanic / vllm-windows
A high-throughput and memory-efficient inference and serving engine for LLMs (Windows build & kernels)
☆285 Updated last month
Alternatives and similar repositories for vllm-windows
Users who are interested in vllm-windows are comparing it to the repositories listed below
- Service for testing out the new Qwen2.5 omni model ☆62 Updated 8 months ago
- ☆135 Updated 10 months ago
- ☆128 Updated last year
- Deepspeed windows information ☆44 Updated last year
- ☆46 Updated 11 months ago
- Quantized text-audio foundation model from Boson AI ☆43 Updated 5 months ago
- automatically quant GGUF models ☆220 Updated 3 weeks ago
- llama.cpp fork with additional SOTA quants and improved performance ☆44 Updated last week
- Make abliterated models with transformers, easy and fast ☆112 Updated last month
- This is a pre-built wheel of Triton 3.3.0 for Windows with Nvidia only + Proton ☆40 Updated 8 months ago
- SoTA open-source TTS ☆147 Updated last month
- Free ComfyUI Workflows ☆43 Updated 3 weeks ago
- gguf (GPT-Generated Unified Format) connector ☆49 Updated last week
- ☆51 Updated 11 months ago
- Memory Management for the GPU Poor, run the latest open source frontier models on consumer Nvidia GPUs ☆170 Updated this week
- Croco.Cpp is a fork of KoboldCPP inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compati… ☆153 Updated this week
- PyQt6 1st try ☆293 Updated last year
- A collection of compiled wheels for deepspeed built for python 3.10 and 3.11 with support for cuda 11.8 and 12.1 for Windows ☆86 Updated last year
- ComfyUI node for highly expressive speech and realistic zero-shot voice cloning ☆371 Updated last month
- stable-diffusion.cpp bindings for python ☆94 Updated last month
- Attempts to bypass AI Image detection by employing various methods such as: Noise Injection, FFT Smoothing, FFT Matching, Pixel Perturbat… ☆191 Updated 4 months ago
- ACE-Step: A Step Towards Music Generation Foundation Model ☆47 Updated 8 months ago
- Lightweight Gradio based WebUI for orpheusTTS - WSL / Linux [CUDA] ☆105 Updated 2 months ago
- Privacy-first agentic framework with powerful reasoning & task automation capabilities. Natively distributed and fully ISO 27XXX complian… ☆68 Updated 9 months ago
- This extension enhances the capabilities of textgen-webui by integrating advanced vision models, allowing users to have contextualized co… ☆57 Updated last year
- ☆44 Updated 11 months ago
- Fast and memory-efficient exact attention - Windows wheels ☆36 Updated 8 months ago
- Docker compose to run vLLM on Windows ☆113 Updated 2 years ago
- Enable true multi gpu capability in Comfy UI using XDiT XFuser and FSDP managed by Ray ☆262 Updated this week
- Python bindings for llama.cpp ☆144 Updated this week