mostlygeek / llama-swap
Model swapping for llama.cpp (or any local OpenAI compatible server)
☆961 · Updated this week
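Since llama-swap sits in front of local backends as an OpenAI-compatible proxy and decides which server to launch from the `model` field of the request, a minimal client sketch looks like the following. The base URL `http://localhost:8080` and the model name `qwen2.5` are assumptions for illustration; use whatever host, port, and model entries your own llama-swap configuration defines.

```python
# Minimal sketch of calling an OpenAI-compatible endpoint behind llama-swap.
# Assumptions: the proxy listens on http://localhost:8080 and a model entry
# named "qwen2.5" exists in its config; adjust both to your setup.
import json
import urllib.request

payload = {
    "model": "qwen2.5",  # llama-swap uses this name to pick which backend to run
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",  # standard OpenAI-style chat endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```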
Alternatives and similar repositories for llama-swap
Users interested in llama-swap are comparing it to the repositories listed below.
- llama.cpp fork with additional SOTA quants and improved performance ☆584 · Updated this week
- The official API server for Exllama. OAI compatible, lightweight, and fast. ☆987 · Updated this week
- What If Language Models Expertly Routed All Inference? WilmerAI allows prompts to be routed to specialized workflows based on the domain … ☆705 · Updated this week
- Manifold is a platform for enabling workflow automation using AI assistants. ☆435 · Updated 2 weeks ago
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆408 · Updated this week
- Large-scale LLM inference engine ☆1,453 · Updated this week
- VS Code extension for LLM-assisted code/text completion ☆807 · Updated this week
- Lightweight Inference server for OpenVINO ☆186 · Updated last week
- A multi-platform desktop application to evaluate and compare LLM models, written in Rust and React. ☆757 · Updated 2 months ago
- Easy to use interface for the Whisper model optimized for all GPUs! ☆221 · Updated this week
- LLM Frontend in a single html file ☆498 · Updated 5 months ago
- Web UI for ExLlamaV2 ☆501 · Updated 4 months ago
- Stateful load balancer custom-tailored for llama.cpp 🏓🦙 ☆779 · Updated this week
- Code execution utilities for Open WebUI & Ollama ☆285 · Updated 7 months ago
- A local AI companion that uses a collection of free, open source AI models in order to create two virtual companions that will follow you… ☆217 · Updated last week
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM … ☆573 · Updated 4 months ago
- Big & Small LLMs working together ☆972 · Updated this week
- The Fastest Way to Fine-Tune LLMs Locally ☆305 · Updated 3 months ago
- An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal. ☆255 · Updated 3 months ago
- Optimizing inference proxy for LLMs ☆2,550 · Updated this week
- Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible ☆316 · Updated 3 months ago
- InferX is an Inference Function-as-a-Service platform ☆109 · Updated this week
- Local LLM Powered Recursive Search & Smart Knowledge Explorer ☆243 · Updated 4 months ago
- OpenAPI Tool Servers ☆472 · Updated 2 weeks ago
- MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. I… ☆418 · Updated last week
- The AI toolkit for the AI developer ☆724 · Updated last week
- An AI memory layer with short- and long-term storage, semantic clustering, and optional memory decay for context-aware applications. ☆636 · Updated 5 months ago