yazon / flexllama
🚀 FlexLLama - Lightweight self-hosted tool for running multiple llama.cpp server instances with OpenAI v1 API compatibility and multi-GPU support
☆46Updated last month
Alternatives and similar repositories for flexllama
Users who are interested in flexllama are comparing it to the libraries listed below.
- ☆178Updated 5 months ago
- ☆83Updated 10 months ago
- Python chat with local Ollama models, Anthropic, and OpenAI☆24Updated 9 months ago
- Personal voice assistant, with voice interruption and Twilio support☆18Updated 10 months ago
- ☆203Updated 4 months ago
- Chat WebUI is an easy-to-use user interface for interacting with AI, and it comes with multiple useful built-in tools such as web search …☆47Updated 4 months ago
- Autonomous, agentic, creative story writing system that incorporates stored embeddings and Knowledge Graphs.☆91Updated this week
- ☆19Updated 6 months ago
- A sleek web interface for Ollama, making local LLM management and usage simple. WebOllama provides an intuitive UI to manage Ollama model…☆60Updated 3 months ago
- Explore the unknown, build the future, own your data.☆225Updated this week
- ☆27Updated 7 months ago
- A real-time shared memory layer for multi-agent LLM systems.☆50Updated 6 months ago
- ACE-Step: A Step Towards Music Generation Foundation Model☆46Updated 7 months ago
- A persistent local memory for AI, LLMs, or Copilot in VS Code.☆183Updated 2 months ago
- Open source tool for transcription and subtitling, alternative to happyscribe.☆32Updated 11 months ago
- Crow is a Desktop AI Assistant☆32Updated last year
- Multi-agent autonomous research system using LangGraph and LangChain. Generates citation-backed reports with credibility scoring and web …☆120Updated last week
- Super simple python connectors for llama.cpp, including vision models (Gemma 3, Qwen2-VL). Compile llama.cpp and run!☆29Updated last month
- Give your local LLM a real memory with a lightweight, fully local memory system. 100% offline and under your control.☆65Updated 3 months ago
- ☆58Updated 10 months ago
- 🗣️ Real‑time, low‑latency voice, vision, and conversational‑memory AI assistant built on LiveKit and local LLMs ✨☆100Updated 6 months ago
- CaSIL is an advanced natural language processing system that implements a sophisticated four-layer semantic analysis architecture. It pro…☆67Updated last year
- ☆29Updated 8 months ago
- Run Orpheus 3B Locally With LM Studio☆32Updated 9 months ago
- Llama.cpp runner/swapper and proxy that emulates LMStudio / Ollama backends☆49Updated 4 months ago
- A fully autonomous agent that accesses the browser and performs tasks.☆17Updated 8 months ago
- Real-time TTS reading of large text files in your favourite voice, plus translation via LLM (Python script)☆52Updated last year
- The most feature-complete local AI workstation. Multi-GPU inference, integrated Stable Diffusion + ADetailer, voice cloning, research-gra…☆41Updated this week
- Cognito: Supercharge your Chrome browser with AI. Guide, query, and control everything using natural language.☆56Updated last month
- A 100% private web application that converts speech to speech☆81Updated 7 months ago