Run multiple resource-heavy Large Models (LM) on the same machine with a limited amount of VRAM or other resources by exposing them on different ports and loading/unloading them on demand
☆88 · Feb 7, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for large-model-proxy
Users who are interested in large-model-proxy are comparing it to the repositories listed below
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- A proxy that hosts multiple single-model runners such as LLama.cpp and vLLM ☆12 · May 30, 2025 · Updated 8 months ago
- ☆17 · Dec 16, 2024 · Updated last year
- ☆20 · Sep 28, 2024 · Updated last year
- an auto-sleeping and -waking framework around llama.cpp ☆12 · Feb 8, 2025 · Updated last year
- run ollama & gguf easily with a single command ☆52 · May 15, 2024 · Updated last year
- This GUI aims to simplify the process of converting GGUF files to llamafile format by providing an intuitive and convenient way for users… ☆14 · Jan 2, 2026 · Updated last month
- A library and CLI utilities for managing performance states of NVIDIA GPUs. ☆33 · Oct 6, 2024 · Updated last year
- Synthify: Seamlessly generate ai datasets with a no-code UI | https://synthify.toolstack.run ☆48 · Feb 9, 2025 · Updated last year
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆47 · Sep 26, 2024 · Updated last year
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI … ☆56 · Feb 10, 2025 · Updated last year
- Large-Language-Model to Machine Interface project. ☆19 · Dec 5, 2023 · Updated 2 years ago
- TLS & API keys for your LLM APIs ☆20 · Dec 17, 2025 · Updated 2 months ago
- ☆210 · Jan 5, 2026 · Updated last month
- A frontend for creative writing with LLMs ☆148 · Jul 15, 2024 · Updated last year
- Reliable model swapping for any local OpenAI/Anthropic compatible server - llama.cpp, vllm, etc ☆2,445 · Updated this week
- Simple LLM inference server ☆20 · Jun 13, 2024 · Updated last year
- ☆22 · Aug 9, 2024 · Updated last year
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning ☆30 · May 18, 2025 · Updated 9 months ago
- ☆20 · Aug 12, 2024 · Updated last year
- GPU Power and Performance Manager ☆68 · Oct 13, 2024 · Updated last year
- Autonomous, agentic, creative story writing system that incorporates stored embeddings and Knowledge Graphs. ☆95 · Feb 16, 2026 · Updated last week
- The one who calls upon functions - Function-Calling Language Model ☆36 · Oct 2, 2023 · Updated 2 years ago
- This project is a reverse-engineered version of Figma's tone changer. It uses Groq's Llama-3-8b for high-speed inference and to adjust th… ☆90 · Jul 26, 2024 · Updated last year
- Tool for loading and testing native shaders translated from crosstl ☆13 · Dec 15, 2024 · Updated last year
- 🎮 Material You TUI for monitoring NVIDIA GPUs ☆58 · Jan 16, 2026 · Updated last month
- Prometheus exporter for Linux based GDDR6/GDDR6X VRAM and GPU Core Hot spot temperature reader for NVIDIA RTX 3000/4000 series GPUs. ☆24 · Oct 2, 2024 · Updated last year
- An MCP server implementation providing a standardized interface for LLMs to interact with the Atla API. ☆17 · Jul 21, 2025 · Updated 7 months ago
- ☆15 · Apr 9, 2025 · Updated 10 months ago
- REBUS: A Robust Evaluation Benchmark of Understanding Symbols ☆13 · Aug 13, 2024 · Updated last year
- ☆12 · May 30, 2025 · Updated 8 months ago
- Evolutionary Search for expert-level performance on any task with environmental feedback ☆14 · Oct 12, 2025 · Updated 4 months ago
- "a towel is about the most massively useful thing an interstellar AI hitchhiker can have" ☆48 · Oct 9, 2024 · Updated last year
- Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claud… ☆30 · Mar 20, 2025 · Updated 11 months ago
- A simple no-install web UI for Ollama and OAI-Compatible APIs! ☆31 · Jan 30, 2025 · Updated last year
- Efficient visual programming for AI language models ☆360 · May 13, 2025 · Updated 9 months ago
- a metaprogramming language that compiles from types ☆10 · Jun 26, 2024 · Updated last year
- code for training and using chess embeddings models ☆13 · Jun 9, 2024 · Updated last year
- Open source static analysis toolkit for LLM agent plans ☆13 · Aug 9, 2025 · Updated 6 months ago