mostlygeek / llama-swap
Reliable model swapping for any local OpenAI-compatible server (llama.cpp, vLLM, etc.)
☆1,862 · Updated last week
Alternatives and similar repositories for llama-swap
Users who are interested in llama-swap are comparing it to the repositories listed below.
- llama.cpp fork with additional SOTA quants and improved performance ☆1,315 · Updated this week
- The official API server for Exllama. OAI compatible, lightweight, and fast. ☆1,083 · Updated last week
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆558 · Updated last week
- Manifold is a platform for enabling workflow automation using AI assistants. ☆464 · Updated last week
- WilmerAI is one of the oldest LLM semantic routers. It uses multi-layer prompt routing and complex workflows to allow you to not only cre… ☆786 · Updated last month
- VS Code extension for LLM-assisted code/text completion ☆1,043 · Updated last week
- ☆1,193 · Updated this week
- Effortlessly run LLM backends, APIs, frontends, and services with one command. ☆2,136 · Updated 2 weeks ago
- LLM Frontend in a single html file ☆659 · Updated this week
- Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama - but purpose-built and deeply optimized for the AMD NPUs. ☆427 · Updated last week
- Large-scale LLM inference engine ☆1,583 · Updated this week
- Lemonade helps users run local LLMs with the highest performance by configuring state-of-the-art inference engines for their NPUs and GPU… ☆1,590 · Updated last week
- An OpenAI API compatible text to speech server using Coqui AI's xtts_v2 and/or piper tts as the backend. ☆830 · Updated 9 months ago
- Go manage your Ollama models ☆1,546 · Updated last month
- A multi-platform desktop application to evaluate and compare LLM models, written in Rust and React. ☆869 · Updated 6 months ago
- ☆226 · Updated 6 months ago
- The AI toolkit for the AI developer ☆1,052 · Updated this week
- Docs for GGUF quantization (unofficial) ☆308 · Updated 3 months ago
- ☆484 · Updated last week
- Inference engine for Intel devices. Serve LLMs, VLMs, Whisper, Kokoro-TTS, Embedding and Rerank models over OpenAI endpoints. ☆238 · Updated last week
- Open-source LLM load balancer and serving platform for self-hosting LLMs at scale 🏓🦙 ☆1,360 · Updated 3 weeks ago
- A daemon that automatically manages the performance states of NVIDIA GPUs. ☆97 · Updated last week
- Web UI for ExLlamaV2 ☆511 · Updated 9 months ago
- Big & Small LLMs working together ☆1,194 · Updated this week
- RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for in… ☆2,296 · Updated this week
- The Fastest Way to Fine-Tune LLMs Locally ☆325 · Updated 7 months ago
- LocalAGI is a powerful, self-hostable AI Agent platform designed for maximum privacy and flexibility. A complete drop-in replacement for … ☆1,325 · Updated this week
- Easy to use interface for the Whisper model optimized for all GPUs! ☆383 · Updated 3 months ago
- Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible ☆341 · Updated 8 months ago
- OpenAPI Tool Servers ☆744 · Updated last month