gpustack / llama-box
LLM inference server implementation based on llama.cpp.
☆25 Updated this week
Related projects
Alternatives and complementary repositories for llama-box
- HTTP proxy for on-demand model loading with llama.cpp (or other OpenAI compatible backends) ☆38 Updated this week
- Local LLM inference & management server with built-in OpenAI API ☆31 Updated 6 months ago
- LLM inference in C/C++ ☆11 Updated 2 months ago
- A quick and optimized solution to manage llama based gguf quantized models, download gguf files, retrieve message formatting, add more mo… ☆12 Updated 9 months ago
- ☆18 Updated 2 weeks ago
- LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities. ☆88 Updated last month
- Large Model Proxy is designed to make it easy to run multiple resource-heavy Large Models (LM) on the same machine with limited amount of… ☆46 Updated last month
- ☆24 Updated this week
- Demo of an "always-on" AI assistant. ☆23 Updated 8 months ago
- Something similar to Apple Intelligence? ☆57 Updated 4 months ago
- Automatically quantize GGUF models ☆137 Updated this week
- Mycomind Daemon: A mycelium-inspired, advanced Mixture-of-Memory-RAG-Agents (MoMRA) cognitive assistant that combines multiple AI models … ☆30 Updated 4 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours ☆54 Updated 2 months ago
- Easily view and modify JSON datasets for large language models ☆62 Updated last month
- ☆31 Updated 10 months ago
- Port of Suno AI's Bark in C/C++ for fast inference ☆54 Updated 6 months ago
- After my server ui improvements were successfully merged, consider this repo a playground for experimenting, tinkering and hacking around… ☆56 Updated 2 months ago
- Course Project for COMP4471 on RWKV ☆16 Updated 9 months ago
- ☆25 Updated last month
- llama.cpp fork with additional SOTA quants and improved performance ☆89 Updated this week
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆28 Updated 3 months ago
- ☆17 Updated 3 weeks ago
- ☆53 Updated 5 months ago
- Golang web client for Ollama, fast and easy to use. ☆23 Updated 3 weeks ago
- An OpenAI API compatible LLM inference server based on ExLlamaV2. ☆22 Updated 9 months ago
- Terminal Voice Assistant is a powerful and flexible tool designed to help users interact with their terminal using natural language comma… ☆16 Updated 5 months ago
- Mixture-of-Ollamas ☆26 Updated 3 months ago
- Polyglot is a fast, elegant, and free translation tool using AI. ☆51 Updated 2 months ago
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you? ☆20 Updated 4 months ago
- CLI tool to quantize gguf, gptq, awq, hqq and exl2 models ☆62 Updated last month