SystemPanic / vllm-windows
A high-throughput and memory-efficient inference and serving engine for LLMs (Windows build & kernels)
☆132 Updated 3 weeks ago
Alternatives and similar repositories for vllm-windows
Users who are interested in vllm-windows are comparing it to the repositories listed below
- ☆40 Updated 6 months ago
- Service for testing out the new Qwen2.5 omni model ☆55 Updated 3 months ago
- ☆51 Updated 9 months ago
- OminiControl for the GPU Poor ☆38 Updated 6 months ago
- ☆120 Updated 9 months ago
- ☆120 Updated 5 months ago
- Development repository for the Triton language and compiler ☆34 Updated 10 months ago
- SoTA open-source TTS ☆50 Updated 3 weeks ago
- automatically quant GGUF models ☆195 Updated last week
- ACE-Step: A Step Towards Music Generation Foundation Model ☆42 Updated 3 months ago
- The heart of The Pulsar App, fast, secure and shared inference with modern UI ☆56 Updated 8 months ago
- A Windows tool to query various LLM AIs. Supports branched conversations, history and summaries among others. ☆33 Updated last week
- Quantized text-audio foundation model from Boson AI ☆22 Updated last week
- Wan2.1, quantized and optimized so it fits on your 3090/4090 ☆34 Updated 5 months ago
- Cosmos1GP for the GPU Poor by DeepBeepMeep ☆74 Updated 6 months ago
- Make abliterated models with transformers, easy and fast ☆83 Updated 4 months ago
- Memory Management for the GPU Poor, run the latest open source frontier models on consumer Nvidia GPUs ☆144 Updated 2 weeks ago
- stable-diffusion.cpp bindings for python ☆58 Updated last month
- Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without lossi… ☆49 Updated 2 weeks ago
- Privacy-first agentic framework with powerful reasoning & task automation capabilities. Natively distributed and fully ISO 27XXX complian… ☆66 Updated 4 months ago
- Deepspeed windows information ☆42 Updated last year
- Run Ollama LLM models in Google Colab for free ☆37 Updated 9 months ago
- win32 native frontend for llama-cli ☆12 Updated 9 months ago
- 1 min voice data can also be used to train a good TTS model! (few shot voice cloning) ☆29 Updated 2 months ago
- A random walk voice style cloning application for Kokoro text to speech ☆117 Updated 2 months ago
- ☆40 Updated 6 months ago
- Game Companion AI is an advanced application designed to enhance the gaming experience by providing real-time analysis and interpretation… ☆52 Updated 10 months ago
- Run Stable diffusion 3 on low VRAM systems ☆28 Updated last year
- A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" that provides an OpenAI-compatibl… ☆15 Updated 6 months ago
- ☆205 Updated 3 months ago