FlexLLama - Lightweight self-hosted tool for running multiple llama.cpp server instances with OpenAI v1 API compatibility and multi-GPU support
☆55 · Mar 5, 2026 · Updated last month
Alternatives and similar repositories for flexllama
Users that are interested in flexllama are comparing it to the libraries listed below.
- The most feature-complete local AI workstation. Multi-GPU inference, integrated Stable Diffusion + ADetailer, voice cloning, research-gra… ☆59 · Feb 24, 2026 · Updated last month
- A Python-based chat application utilizing a Local LLM to generate complex thought chains for various use cases such as product developmen… ☆20 · Feb 18, 2026 · Updated last month
- OpenAPI specifications => MCP (Model Context Protocol) tools ☆19 · Dec 9, 2024 · Updated last year
- ACE-Step: A Step Towards Music Generation Foundation Model ☆51 · May 20, 2025 · Updated 10 months ago
- [ACL 2025] How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training ☆47 · Jul 18, 2025 · Updated 8 months ago
- Measuring Thinking Efficiency in Reasoning Models - Research Repository ☆39 · Dec 2, 2025 · Updated 4 months ago
- Chat WebUI is an easy-to-use user interface for interacting with AI, and it comes with multiple useful built-in tools such as web search … ☆52 · Feb 10, 2026 · Updated 2 months ago
- ☆66 · Jun 24, 2025 · Updated 9 months ago
- A fully autonomous agent that accesses the browser and performs tasks. ☆18 · Apr 25, 2025 · Updated 11 months ago
- Visually select, search, and copy your code into your clipboard for LLM context. ☆26 · May 18, 2025 · Updated 10 months ago
- Personal voice assistant, with voice interruption and Twilio support ☆18 · Feb 24, 2025 · Updated last year
- ☆17 · Dec 16, 2024 · Updated last year
- ☆24 · Jan 22, 2025 · Updated last year
- A forward proxy to turn network traffic into personal memory for AI agents ☆37 · Mar 30, 2026 · Updated last week
- ☆21 · Jul 25, 2025 · Updated 8 months ago
- A lightweight LLaMA.cpp HTTP server Docker image based on Alpine Linux. ☆33 · Oct 3, 2025 · Updated 6 months ago
- FlexAudioPrint is a Python-based app for transcribing audio to text using OpenAI's Whisper model. It offers a Gradio web interface and a … ☆10 · Jan 29, 2026 · Updated 2 months ago
- Automates the creation of full-text (sound and text) ebooks in epub/epub3/daisy format, the webserver/client creates smil files to sync a… ☆10 · Nov 12, 2021 · Updated 4 years ago
- Cleanai (https://github.com/willmil11/cleanai) except I'm making it in C now. Fast and clean from the start this time :) ☆17 · Mar 6, 2026 · Updated last month
- Anonymize sensitive data in a controlled, pseudo-random way ☆14 · Dec 18, 2015 · Updated 10 years ago
- Llama.cpp runner/swapper and proxy that emulates LMStudio / Ollama backends ☆55 · Aug 21, 2025 · Updated 7 months ago
- ☆12 · May 30, 2025 · Updated 10 months ago
- Crashbench is an LLM benchmark to measure bug-finding and reporting capabilities of LLMs ☆14 · Mar 8, 2026 · Updated last month
- Enable tool/function calling for any LLM, in OpenAI and Ollama API formats, adding universal function calling to models without native su… ☆75 · Dec 9, 2025 · Updated 4 months ago
- ☆12 · Apr 21, 2025 · Updated 11 months ago
- A Windows tool to query various LLM AIs. Supports branched conversations, history and summaries among others. ☆34 · Feb 11, 2026 · Updated last month
- Simple node proxy for llama-server that enables MCP use ☆19 · May 10, 2025 · Updated 11 months ago
- General Tool-calling API Proxy ☆60 · Mar 26, 2026 · Updated 2 weeks ago
- MCTS-inspired parallel beam search for conversation optimization. Explore multiple dialogue strategies simultaneously, stress-test a… ☆35 · Jan 18, 2026 · Updated 2 months ago
- AutoTile tileset generator for Unity ☆10 · Jul 5, 2019 · Updated 6 years ago
- A reverse proxy manager written in Go, to convert exposed ports into token-based auth protected ports ☆20 · Apr 14, 2025 · Updated 11 months ago
- Offline LLM chatbot with personalized memory; works on CPU with multi-session memory support. ☆22 · Jan 10, 2026 · Updated 3 months ago
- Orchestrator Kit for Agentic Reasoning - OrKa is a modular AI orchestration system that transforms Large Language Models (LLMs) into comp… ☆93 · Mar 25, 2026 · Updated 2 weeks ago
- Qt and QML based Close Combat-like game. ☆16 · Aug 3, 2013 · Updated 12 years ago
- Code for Towards Data Science article on prompt-loss-weight ☆11 · Jun 4, 2025 · Updated 10 months ago
- A Python script to auto-detect and auto-crop a person in an image ☆16 · Mar 7, 2026 · Updated last month
- Moondream MCP Server in Python ☆44 · Jul 2, 2025 · Updated 9 months ago
- Simple HTML template library for C++ ☆15 · Feb 3, 2021 · Updated 5 years ago
- An enhanced useState hook which keeps track of the state's history, allowing you to undo and redo states. ☆15 · Apr 20, 2022 · Updated 3 years ago