pepijndevos / llama_multiserver
A proxy that hosts multiple single-model runners such as llama.cpp and vLLM
☆12 · Updated 2 months ago
Alternatives and similar repositories for llama_multiserver:
Users who are interested in llama_multiserver are comparing it to the repositories listed below
- V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory! ☆34 · Updated last week
- A Windows tool to query various LLM AIs. Supports branched conversations, history, and summaries, among others. ☆29 · Updated this week
- ☆44 · Updated last month
- Easily view and modify JSON datasets for large language models ☆71 · Updated last week
- ☆24 · Updated last month
- My version of an LLM Websearch Agent using a local SearXNG server because SearXNG is great. ☆23 · Updated last week
- Chat WebUI is an easy-to-use user interface for interacting with AI, and it comes with multiple useful built-in tools. ☆20 · Updated this week
- Large-Language-Model to Machine Interface project. ☆17 · Updated last year
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you? ☆22 · Updated 8 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆26 · Updated last month
- Local LLM inference & management server with built-in OpenAI API ☆31 · Updated 10 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆53 · Updated last week
- Writing Extension for Text Generation WebUI ☆49 · Updated last month
- Something similar to Apple Intelligence? ☆59 · Updated 8 months ago
- Build HTML artefacts with Ollama ☆11 · Updated 2 months ago
- An F/OSS solution combining AI with Wikipedia knowledge via a RAG pipeline ☆31 · Updated last month
- Real-time TTS reading of large text files in your favourite voice, plus translation via LLM (Python script) ☆53 · Updated 4 months ago
- SoftWhisper simplifies audio and video transcription using the powerful Whisper model. Easily select custom models, languages, and tasks,… ☆48 · Updated 4 months ago
- ☆16 · Updated 2 months ago
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs ☆69 · Updated 5 months ago
- LlamaCards is a web application that provides a dynamic interface for interacting with LLM models in real time. This app allows users to … ☆37 · Updated 6 months ago
- ☆22 · Updated 6 months ago
- Run Ollama & GGUF easily with a single command ☆49 · Updated 9 months ago
- Fast state-of-the-art speech models and a runtime that runs anywhere 💥 ☆54 · Updated 3 weeks ago
- Who needs o1 anyways. Add CoT to any OpenAI compatible endpoint. ☆41 · Updated 5 months ago
- ☆44 · Updated 3 months ago
- A simple experiment on letting two local LLMs have a conversation about anything! ☆104 · Updated 8 months ago