mostlygeek / llama-swap
Model swapping for llama.cpp (or any local OpenAI compatible server)
☆1,333 · Updated last week
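llama-swap sits in front of llama.cpp (or another local OpenAI-compatible server) as a proxy: the `model` field of an incoming request selects which backend to run, and the proxy starts or swaps server processes on demand. As a rough illustration only (the field names and the `${PORT}` macro reflect my recollection of the project's README and should be verified against the repo; paths and model names are placeholders), a minimal config might look like:

```yaml
# Hypothetical llama-swap config sketch: maps a requested model name
# to the command that serves it. Paths and names are illustrative.
models:
  "qwen2.5-7b":
    cmd: |
      /path/to/llama-server
      --port ${PORT}
      -m /path/to/qwen2.5-7b-instruct-q4_k_m.gguf
```

A client would then pick the backend simply by naming it in an OpenAI-style request, e.g. `{"model": "qwen2.5-7b", ...}` sent to the proxy's `/v1/chat/completions` endpoint; llama-swap stops whatever was running and launches the command mapped to that name.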
Alternatives and similar repositories for llama-swap
Users interested in llama-swap are comparing it to the libraries listed below.
- llama.cpp fork with additional SOTA quants and improved performance ☆1,050 · Updated last week
- The official API server for Exllama. OAI compatible, lightweight, and fast. ☆1,031 · Updated this week
- What If Language Models Expertly Routed All Inference? WilmerAI allows prompts to be routed to specialized workflows based on the domain … ☆759 · Updated this week
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆473 · Updated last week
- Manifold is a platform for enabling workflow automation using AI assistants. ☆457 · Updated 2 weeks ago
- Large-scale LLM inference engine ☆1,524 · Updated this week
- LLM frontend in a single HTML file ☆629 · Updated 7 months ago
- ☆221 · Updated 3 months ago
- Effortlessly run LLM backends, APIs, frontends, and services with one command. ☆1,991 · Updated 3 weeks ago
- VS Code extension for LLM-assisted code/text completion ☆917 · Updated this week
- ☆1,037 · Updated this week
- The Fastest Way to Fine-Tune LLMs Locally ☆316 · Updated 5 months ago
- The AI toolkit for the AI developer ☆863 · Updated this week
- Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible ☆334 · Updated 5 months ago
- Web UI for ExLlamaV2 ☆506 · Updated 6 months ago
- A multi-platform desktop application to evaluate and compare LLM models, written in Rust and React. ☆812 · Updated 4 months ago
- Lightweight inference server for OpenVINO ☆198 · Updated this week
- An OpenAI API compatible text-to-speech server using Coqui AI's xtts_v2 and/or piper tts as the backend. ☆805 · Updated 6 months ago
- A daemon that automatically manages the performance states of NVIDIA GPUs. ☆92 · Updated 2 months ago
- Easy-to-use interface for the Whisper model, optimized for all GPUs! ☆274 · Updated 3 weeks ago
- AMD (Radeon GPU) ROCm-based setup for popular AI tools on Ubuntu 24.04.1 ☆209 · Updated 6 months ago
- Go manage your Ollama models ☆1,370 · Updated 2 weeks ago
- ☆163 · Updated last week
- Local LLM Powered Recursive Search & Smart Knowledge Explorer ☆251 · Updated 6 months ago
- High-performance Text-to-Speech server with OpenAI-compatible API, 8 voices, emotion tags, and modern web UI. Optimized for RTX GPUs. ☆512 · Updated last month
- InferX is an Inference Function as a Service platform ☆128 · Updated this week
- AlwaysReddy is an LLM voice assistant that is always just a hotkey away. ☆749 · Updated 5 months ago
- Open-source LLMOps platform for hosting and scaling AI in your own infrastructure 🏓🦙 ☆1,098 · Updated this week
- Run Orpheus 3B Locally With LM Studio ☆453 · Updated 5 months ago
- An AI memory layer with short- and long-term storage, semantic clustering, and optional memory decay for context-aware applications. ☆656 · Updated 7 months ago