SystemPanic / vllm-windows
A high-throughput and memory-efficient inference and serving engine for LLMs (Windows build & kernels)
☆247 Updated 2 weeks ago
Alternatives and similar repositories for vllm-windows
Users who are interested in vllm-windows are comparing it to the libraries listed below:
- Service for testing out the new Qwen2.5 omni model ☆62 Updated 7 months ago
- Quantized text-audio foundation model from Boson AI ☆41 Updated 3 months ago
- ☆126 Updated last year
- automatically quant GGUF models ☆219 Updated last month
- ☆43 Updated 10 months ago
- Croco.Cpp is a fork of KoboldCPP inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compati… ☆153 Updated this week
- llama.cpp fork with additional SOTA quants and improved performance ☆40 Updated this week
- ☆133 Updated 9 months ago
- Deepspeed windows information ☆44 Updated last year
- A collection of compiled wheels for deepspeed built for python 3.10 and 3.11 with support for cuda 11.8 and 12.1 for Windows ☆81 Updated last year
- A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" that provides an OpenAI-compatibl… ☆15 Updated 10 months ago
- Lightweight Gradio based WebUI for orpheusTTS - WSL / Linux [CUDA] ☆105 Updated 3 weeks ago
- ☆51 Updated last year
- Development repository for the Triton language and compiler ☆34 Updated last year
- Fast and memory-efficient exact attention - Windows wheels ☆36 Updated 7 months ago
- SoTA open-source TTS ☆125 Updated last month
- ACE-Step: A Step Towards Music Generation Foundation Model ☆45 Updated 6 months ago
- This is a pre-built wheel of Triton 3.3.0 for Windows with Nvidia only + Proton ☆39 Updated 6 months ago
- Writing Extension for Text Generation WebUI ☆64 Updated 4 months ago
- Free ComfyUI Workflows ☆40 Updated 2 weeks ago
- Memory Management for the GPU Poor, run the latest open source frontier models on consumer Nvidia GPUs ☆160 Updated 2 weeks ago
- This extension enhances the capabilities of textgen-webui by integrating advanced vision models, allowing users to have contextualized co… ☆57 Updated last year
- Wan2.1, quantized and optimized so it fits on your 3090/4090 ☆34 Updated 9 months ago
- PyQt6 1st try ☆291 Updated 11 months ago
- ☆44 Updated 10 months ago
- ☆227 Updated 7 months ago
- ComfyUI node for highly expressive speech and realistic zero-shot voice cloning ☆320 Updated last month
- OminiControl for the GPU Poor ☆39 Updated 10 months ago
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends. ☆178 Updated last week
- (Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) on Python (Gradio interface). Translated on … ☆107 Updated 2 weeks ago