Docker compose to run vLLM on Windows
☆121Jan 1, 2024Updated 2 years ago
Alternatives and similar repositories for vllm-windows
Users that are interested in vllm-windows are comparing it to the libraries listed below.
- Real-time voice conversation system with Sesame CSM, featuring web-based audio visualization and GPU acceleration. Educational implementa…☆17Mar 18, 2025Updated last year
- Yet another frontend for LLM, written using .NET and WinUI 3☆11Sep 14, 2025Updated 9 months ago
- A full-stack document management and AI chat application that enables users to upload, manage, and chat with their documents using AI. Bu…☆16Aug 10, 2025Updated 10 months ago
- Documentation and helper scripts for Gigabyte Aero 15x v8 workarounds☆18Oct 30, 2018Updated 7 years ago
- Modified Beam Search with periodical restart☆12Sep 12, 2024Updated last year
- A fast batching API to serve LLM models☆189Apr 26, 2024Updated 2 years ago
- Offline tool that processes YouTube videos using WhisperX for automatic transcription and speaker diarization, detects logical fallacies,…☆29Aug 14, 2024Updated last year
- ☆31Mar 26, 2025Updated last year
- Categorize credit card transactions using a local large language model similar to GPT3☆15Dec 29, 2023Updated 2 years ago
- an auto-sleeping and -waking framework around llama.cpp☆13Feb 8, 2025Updated last year
- XTTSv2 Extension for oobabooga text-generation-webui☆34Jul 17, 2024Updated last year
- Playing with CSM☆22Mar 14, 2025Updated last year
- The task aims at extracting required fields in receipts captured by mobile devices☆34Nov 4, 2022Updated 3 years ago
- This is an LLM interface that you can use to analyze and get insight into diary entries or other documents completely offline.☆16Dec 31, 2023Updated 2 years ago
- Llama.cpp runner/swapper and proxy that emulates LMStudio / Ollama backends☆59Aug 21, 2025Updated 10 months ago
- ☆23Jun 1, 2024Updated 2 years ago
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G…☆26Mar 28, 2025Updated last year
- ☆31Apr 22, 2024Updated 2 years ago
- Offline LLM chatbot with personalized memory — works on CPU with multi-session memory support.☆22Jan 10, 2026Updated 5 months ago
- A static file containing a list of popular RSS feeds.☆13Aug 25, 2016Updated 9 years ago
- RAG AI Agent with Realtime Source Validation (Human in the Loop) - Built with CopilotKit + Pydantic AI☆63Dec 21, 2025Updated 6 months ago
- A manual named-entity/part-of-speech tagger for spaCy; you can use it to create your own training datasets.☆12Jun 16, 2018Updated 8 years ago
- This file maps a given list of company names to their proper website and also maps a given list of websites to the company name.☆15Nov 16, 2018Updated 7 years ago
- ☆16Dec 16, 2024Updated last year
- ☆12Sep 22, 2024Updated last year
- 🐱💻A key-stroke logging application for windows, also capable of capturing mouse window clicks and send event logs to a remote server☆14Jul 2, 2021Updated 5 years ago
- ☆12Aug 31, 2023Updated 2 years ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite.☆34Mar 2, 2024Updated 2 years ago
- Mic-controlled mouse clicks☆17Oct 6, 2025Updated 8 months ago
- TLS & API keys for your LLM APIs☆20Dec 17, 2025Updated 6 months ago
- Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing.☆25Sep 1, 2025Updated 10 months ago
- ☆11May 2, 2022Updated 4 years ago
- ☆12Nov 12, 2020Updated 5 years ago
- Generate Structured JSON with probs from Language Models☆17Mar 23, 2025Updated last year
- An extension that lets the AI take the wheel, allowing it to use the mouse and keyboard, recognize UI elements, and prompt itself :3...no…☆128Oct 22, 2024Updated last year
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning☆30May 18, 2025Updated last year
- Create RP training data from a VN, using GPT-4☆19Nov 2, 2023Updated 2 years ago
- Visualize Action Recognition Models☆11Apr 21, 2017Updated 9 years ago
- 3x Faster Inference; Unofficial implementation of EAGLE Speculative Decoding☆84Jul 3, 2025Updated 11 months ago