SystemPanic / vllm-windows
A high-throughput and memory-efficient inference and serving engine for LLMs (Windows build & kernels)
☆175 Updated last week
Alternatives and similar repositories for vllm-windows
Users that are interested in vllm-windows are comparing it to the libraries listed below
- Service for testing out the new Qwen2.5 omni model ☆60 Updated 5 months ago
- ☆42 Updated 8 months ago
- A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" that provides an OpenAI-compatibl… ☆15 Updated 8 months ago
- ☆122 Updated 11 months ago
- Docker compose to run vLLM on Windows ☆103 Updated last year
- ☆124 Updated 6 months ago
- automatically quant GGUF models ☆204 Updated last week
- Deepspeed windows information ☆43 Updated last year
- ☆51 Updated 11 months ago
- SoTA open-source TTS ☆99 Updated 2 weeks ago
- A Windows tool to query various LLM AIs. Supports branched conversations, history and summaries among others. ☆33 Updated 3 weeks ago
- ☆42 Updated 8 months ago
- Game Companion AI is an advanced application designed to enhance the gaming experience by providing real-time analysis and interpretation… ☆53 Updated last year
- llama.cpp fork with additional SOTA quants and improved performance ☆31 Updated this week
- Free ComfyUI Workflows ☆35 Updated last month
- Quantized text-audio foundation model from Boson AI ☆36 Updated last month
- Phi4 Multimodal Instruct - OpenAI endpoint and Docker Image for self-hosting ☆40 Updated 7 months ago
- OminiControl for the GPU Poor ☆38 Updated 8 months ago
- Lightweight Gradio based WebUI for orpheusTTS - WSL / Linux [CUDA] ☆103 Updated 6 months ago
- Development repository for the Triton language and compiler ☆34 Updated 11 months ago
- Polyglot is a fast, elegant, and free translation tool using AI. ☆63 Updated last year
- Make abliterated models with transformers, easy and fast ☆89 Updated 5 months ago
- ☆51 Updated 7 months ago
- (Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) on python (Gradio interface). Translated on … ☆103 Updated 2 weeks ago
- Run Orpheus 3B Locally with Gradio UI, Standalone App ☆22 Updated 6 months ago
- Running Microsoft's BitNet via Electron, React & Astro ☆44 Updated last week
- ☆23 Updated last year
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆81 Updated 11 months ago
- stable-diffusion.cpp bindings for python ☆66 Updated this week
- 1 min voice data can also be used to train a good TTS model! (few shot voice cloning) ☆29 Updated 4 months ago