mostlygeek / llama-swap
Model swapping for llama.cpp (or any local OpenAI API compatible server)
☆1,499 Updated this week
Alternatives and similar repositories for llama-swap
Users who are interested in llama-swap are comparing it to the libraries listed below:
- The official API server for Exllama. OAI compatible, lightweight, and fast. ☆1,047 Updated 2 weeks ago
- llama.cpp fork with additional SOTA quants and improved performance ☆1,165 Updated this week
- Manifold is a platform for enabling workflow automation using AI assistants. ☆458 Updated last month
- What If Language Models Expertly Routed All Inference? WilmerAI allows prompts to be routed to specialized workflows based on the domain … ☆768 Updated last week
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆491 Updated last week
- LLM Frontend in a single html file ☆643 Updated 7 months ago
- Large-scale LLM inference engine ☆1,543 Updated this week
- ☆1,115 Updated this week
- Effortlessly run LLM backends, APIs, frontends, and services with one command. ☆2,037 Updated 2 weeks ago
- An OpenAI API compatible text to speech server using Coqui AI's xtts_v2 and/or piper tts as the backend. ☆808 Updated 7 months ago
- ☆223 Updated 4 months ago
- VS Code extension for LLM-assisted code/text completion ☆948 Updated last week
- Go manage your Ollama models ☆1,417 Updated 3 weeks ago
- A daemon that automatically manages the performance states of NVIDIA GPUs. ☆96 Updated 2 weeks ago
- Big & Small LLMs working together ☆1,142 Updated this week
- Open-source LLMOps platform for hosting and scaling AI in your own infrastructure 🏓🦙 ☆1,292 Updated this week
- Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible ☆337 Updated 6 months ago
- Web UI for ExLlamaV2 ☆513 Updated 7 months ago
- A multi-platform desktop application to evaluate and compare LLM models, written in Rust and React. ☆831 Updated 4 months ago
- Create Custom LLMs ☆1,730 Updated 3 weeks ago
- tl/dw (Too Long, Didn't Watch): Your Personal Research Multi-Tool - a naive attempt at 'A Young Lady's Illustrated Primer' (Open Source N… ☆1,044 Updated this week
- A proxy server for multiple ollama instances with Key security ☆489 Updated last week
- Docs for GGUF quantization (unofficial) ☆257 Updated last month
- Code execution utilities for Open WebUI & Ollama ☆296 Updated 10 months ago
- Easy to use interface for the Whisper model optimized for all GPUs! ☆294 Updated last month
- The Fastest Way to Fine-Tune LLMs Locally ☆317 Updated 5 months ago
- InferX is an Inference Function as a Service Platform ☆132 Updated this week
- Lemonade helps users run local LLMs with the highest performance by configuring state-of-the-art inference engines for their NPUs and GPU… ☆1,233 Updated this week
- AlwaysReddy is an LLM voice assistant that is always just a hotkey away. ☆754 Updated 6 months ago
- ☆201 Updated 3 weeks ago