SystemPanic / vllm-windows
A high-throughput and memory-efficient inference and serving engine for LLMs (Windows build & kernels)
☆27Updated last week
Alternatives and similar repositories for vllm-windows:
Users that are interested in vllm-windows are comparing it to the libraries listed below
- Interact with an AI game engine that keeps building its rules and world as you play, adapting to your gameplay.☆42Updated 10 months ago
- Loader extension for tabbyAPI in SillyTavern☆25Updated 8 months ago
- A simple framework for using a local Koboldcpp LLM to help with story-writing☆21Updated last year
- An extension to use Kokoro TTS in text generation webui☆18Updated last month
- Development repository for the Triton language and compiler☆33Updated 6 months ago
- Run Orpheus 3B Locally with Gradio UI, Standalone App☆20Updated 3 weeks ago
- Bridging wrapper for llama-cpp-python within ComfyUI☆55Updated 9 months ago
- ☆104Updated last month
- A Lightweight Gradio Web interface for Text-to-Audio Generation utilising SAO1.0☆51Updated 10 months ago
- An unsupervised model merging algorithm for Transformers-based language models.☆105Updated 11 months ago
- gguf node for comfyui☆44Updated this week
- Make abliterated models with transformers, easy and fast☆67Updated last week
- Fast and memory-efficient exact attention - Windows wheels☆38Updated 2 weeks ago
- ☆22Updated last year
- Attend - to what matters.☆14Updated 2 months ago
- ☆23Updated 6 months ago
- Generates control vectors for use with llama.cpp in GGUF format.☆22Updated last month
- A CLI app for experimenting with Kokoro voice creation and mixing, using the available voices to interpolate new ones☆23Updated 2 months ago
- An API for VoiceCraft.☆25Updated 9 months ago
- Croco.Cpp is a 3rd party testground for KoboldCPP, a simple one-file way to run various GGML/GGUF models with KoboldAI's UI. (for Croco.C…☆104Updated this week
- Cosmos1GP for the GPU Poor by DeepBeepMeep☆63Updated 2 months ago
- 8-bit CUDA functions for PyTorch☆25Updated last year
- Service for testing out the new Qwen2.5 omni model☆35Updated 3 weeks ago
- The successful integration of Qwen2-VL-Instruct into the ComfyUI platform has enabled a smooth operation, supporting (but not limited to)…☆93Updated 3 weeks ago
- Create text chunks which end at natural stopping points without using a tokenizer☆26Updated last month
- LCM test nodes for comfyui☆62Updated last year
- Testbed for the fastest SD pipelines☆35Updated last year
- AI Media processing using ComfyUI☆127Updated this week
- Fast and memory-efficient exact attention - Windows wheels☆33Updated last year
- ☆35Updated 3 weeks ago