Docker compose to run vLLM on Windows
☆116 · Jan 1, 2024 · Updated 2 years ago
Alternatives and similar repositories for vllm-windows
Users that are interested in vllm-windows are comparing it to the libraries listed below
- A full-stack document management and AI chat application that enables users to upload, manage, and chat with their documents using AI. Bu… ☆17 · Aug 10, 2025 · Updated 6 months ago
- Multilingual extension of the SesameAILabs Conversational Speech Generation Model ☆29 · Mar 26, 2025 · Updated 11 months ago
- Yet another frontend for LLM, written using .NET and WinUI 3 ☆10 · Sep 14, 2025 · Updated 5 months ago
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- an auto-sleeping and -waking framework around llama.cpp ☆12 · Feb 8, 2025 · Updated last year
- ESPNet TTS with Streamlit GUI ☆14 · Apr 30, 2023 · Updated 2 years ago
- Simple node proxy for llama-server that enables MCP use ☆17 · May 10, 2025 · Updated 9 months ago
- Llama.cpp runner/swapper and proxy that emulates LMStudio / Ollama backends ☆52 · Aug 21, 2025 · Updated 6 months ago
- XTTSv2 Extension for oobabooga text-generation-webui ☆34 · Jul 17, 2024 · Updated last year
- Offline LLM chatbot with personalized memory — works on CPU with multi-session memory support. ☆22 · Jan 10, 2026 · Updated last month
- Categorize credit card transactions using a local large language model similar to GPT3 ☆15 · Dec 29, 2023 · Updated 2 years ago
- GoalChain for goal-orientated LLM conversation flows ☆71 · Dec 2, 2024 · Updated last year
- This is an LLM interface that you can use to analyze and get insight into diary entries or other documents completely offline. ☆16 · Dec 31, 2023 · Updated 2 years ago
- ☆29 · Apr 22, 2024 · Updated last year
- TLS & API keys for your LLM APIs ☆20 · Dec 17, 2025 · Updated 2 months ago
- ☆17 · Dec 16, 2024 · Updated last year
- Playing with CSM ☆22 · Mar 14, 2025 · Updated 11 months ago
- Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing. ☆23 · Sep 1, 2025 · Updated 6 months ago
- A local-first LLM development studio. Build, test, and customize inference workflows with your own models — no cloud, totally local. ☆17 · May 21, 2025 · Updated 9 months ago
- Automated LLM novelist ☆46 · Apr 11, 2024 · Updated last year
- private-machine is an AI companion system with emotion, needs and goals simulation. Very silly, not based on real science. ☆29 · Feb 26, 2026 · Updated last week
- The smallest Docker image with FPC (FreePascal compiler) (100MB) ☆18 · Apr 6, 2025 · Updated 11 months ago
- LLM FX: A LLM Server Desktop Client free for everyone! ☆36 · Updated this week
- ☆130 · Nov 9, 2024 · Updated last year
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning ☆30 · May 18, 2025 · Updated 9 months ago
- ☆24 · Jun 1, 2024 · Updated last year
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G… ☆26 · Mar 28, 2025 · Updated 11 months ago
- An open source, Gradio-based chatbot app that combines the best of retrieval augmented generation and prompt engineering into an intellig… ☆58 · Aug 1, 2024 · Updated last year
- Code example of how to call your OpenAI assistant via API (Python). ☆24 · Dec 23, 2023 · Updated 2 years ago
- The full stack Next.js starter project for vibe coding. ☆26 · May 27, 2025 · Updated 9 months ago
- B-Llama3o a llama3 with Vision Audio and Audio understanding as well as text and Audio and Animation Data output. ☆26 · Jun 3, 2024 · Updated last year
- simple terminal-based AI coding agent. This is for learning purposes more than a final working app. ☆27 · Mar 6, 2025 · Updated last year
- Autonomous, agentic, creative story writing system that incorporates stored embeddings and Knowledge Graphs. ☆95 · Feb 16, 2026 · Updated 2 weeks ago
- The hearth of The Pulsar App, fast, secure and shared inference with modern UI ☆59 · Dec 1, 2024 · Updated last year
- Llama cute voice assistant ☆27 · Sep 10, 2023 · Updated 2 years ago
- ☆31 · Mar 26, 2025 · Updated 11 months ago
- Offline tool that processes YouTube videos using WhisperX for automatic transcription and speaker diarization, detects logical fallacies,… ☆29 · Aug 14, 2024 · Updated last year
- A Windows tool to query various LLM AIs. Supports branched conversations, history and summaries among others. ☆35 · Feb 11, 2026 · Updated 3 weeks ago
- PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation ☆32 · Nov 16, 2024 · Updated last year