Smart OpenAI‑compatible proxy for llama.cpp: manages slots, saves/restores KV cache to disk, routes requests by prefix similarity, and protects hot slots from being overwritten. Accelerates long prompts (30–60k tokens) via instant reuse or fast on‑demand restore; supports SSE streaming and non‑stream JSON over /v1/chat/completions.
☆40Nov 14, 2025Updated 6 months ago
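The description above mentions routing requests by prefix similarity and protecting hot slots. A minimal sketch of how such a routing decision could look follows. It is not proxycache's actual code: the `Slot` class, `pick_slot`, the `min_ratio` threshold, and the recency-based definition of "hot" (`hot_seconds`) are all hypothetical names and assumptions made for illustration, and saving or restoring the KV cache to disk is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """Hypothetical record of what one llama.cpp slot currently caches."""
    slot_id: int
    tokens: list = field(default_factory=list)  # prompt tokens held in the slot
    last_used: float = 0.0                      # timestamp of the last request


def common_prefix_len(a, b):
    """Number of leading tokens that two token sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pick_slot(slots, prompt, now, min_ratio=0.5, hot_seconds=30.0):
    """Return (slot, reused_tokens) for an incoming prompt.

    Assumed policy: reuse the slot with the longest shared prefix when it
    covers at least `min_ratio` of the prompt; otherwise overwrite the least
    recently used slot, preferring slots that are not "hot" (used within
    the last `hot_seconds`).
    """
    best = max(slots, key=lambda s: common_prefix_len(s.tokens, prompt))
    shared = common_prefix_len(best.tokens, prompt)
    if prompt and shared / len(prompt) >= min_ratio:
        return best, shared
    cold = [s for s in slots if now - s.last_used > hot_seconds]
    victim = min(cold or slots, key=lambda s: s.last_used)
    return victim, 0


slots = [
    Slot(0, [1, 2, 3, 4], last_used=100.0),  # recently used, treated as hot
    Slot(1, [9, 9, 9], last_used=10.0),      # idle, safe to overwrite
]
hit_slot, hit_reused = pick_slot(slots, [1, 2, 3, 7], now=105.0)
miss_slot, miss_reused = pick_slot(slots, [5, 6, 7], now=105.0)
```

In this sketch the first prompt shares three of its four tokens with slot 0 and is routed there, while the second prompt matches nothing and lands on the idle slot 1, leaving the hot slot untouched. A real proxy would presumably also persist the evicted slot's cache before overwriting it, as the description suggests.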
Alternatives and similar repositories for proxycache
Users that are interested in proxycache are comparing it to the libraries listed below.
- Proxy for OpenAI☆16Sep 2, 2025Updated 8 months ago
- NPX/Docker package that creates an Ollama API server and forwards requests to Gemini/OpenAI/Deepseek/Kimi K2. Main purpose is to use the free tier …☆32Oct 5, 2025Updated 7 months ago
- Copilot Chat extension for VS Code☆17Jul 17, 2025Updated 10 months ago
- world's stupidest moe llm in 103M parameters☆20Jul 18, 2025Updated 10 months ago
- Skills for creating high quality skills and agents☆148Apr 5, 2026Updated last month
- triton for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60☆46Dec 8, 2025Updated 5 months ago
- Interactive terminals for AI agents, built for what you can't --yes away. SSH+MFA, GRUB/U-Boot, debconf installers, SOL/serial consoles, …☆79Mar 18, 2026Updated 2 months ago
- ☆17Jun 16, 2024Updated last year
- ☆30Nov 5, 2024Updated last year
- Helper scripts to be run on Frappe sites☆16May 8, 2026Updated 2 weeks ago
- MCP tools for Rust Context Engineering (rustdocs, rust analyzer)☆17Feb 8, 2026Updated 3 months ago
- Push Notification Relay Server for Frappe Apps☆12Dec 24, 2025Updated 5 months ago
- Proxies ToRadio and FromRadio packets between a single Meshtastic device and multiple websocket clients.☆13Apr 20, 2024Updated 2 years ago
- TaskFlowAI is a lightweight and flexible framework designed for creating AI-driven task pipelines and multi-agent workflows. It provides …☆17Nov 18, 2024Updated last year
- CLI secret management☆16May 15, 2026Updated last week
- Awesome AI Benchmarks☆32Jan 16, 2026Updated 4 months ago
- IceCash. A Linux cash register. Cashier workstation for Linux with a web interface. Includes a driver for the Shtrih-M FRK.☆25Jun 9, 2015Updated 10 years ago
- A tiny PID 1 for containers, written in x86-64 NASM and ARM64 GAS.☆19Feb 23, 2026Updated 3 months ago
- Scripts and tools for optimizing quantizations in llama.cpp with GGUF imatrices.☆19Jan 10, 2025Updated last year
- ☆23Mar 26, 2026Updated 2 months ago
- "Bubble Universe" display hack☆15Oct 17, 2023Updated 2 years ago
- A list of WebGL/WebGPU/WebXR libraries and frameworks.☆21Jan 3, 2024Updated 2 years ago
- Transform YouTube videos into a compounding knowledge base with transcripts, vision analysis, and agentic search. Works as an MCP server …☆109Apr 13, 2026Updated last month
- “There is no such thing as a moral or an immoral book. Books are well written, or badly written.” I want to find all the well written con…☆20Nov 6, 2024Updated last year
- Jailer is an eBPF-based process jailing system that provides mandatory access control (MAC) for Linux. It tracks processes using BPF task…☆52Mar 16, 2026Updated 2 months ago
- Proxy server for connecting Alice to Dialogflow☆13Mar 23, 2021Updated 5 years ago
- ☆11Nov 10, 2024Updated last year
- ☆16Dec 16, 2024Updated last year
- AdaLLM is an NVFP4-first inference runtime for Ada Lovelace (RTX 4090) with FP8 KV cache and custom decode kernels. This repo targets NVF…☆120Feb 15, 2026Updated 3 months ago
- Connect Channel Messenger, Zalo, Viber, Skype, Telegram☆13Nov 5, 2018Updated 7 years ago
- Blinka makes her debut on the big screen! With this library you can use CircuitPython displayio code on PC and Raspberry Pi to output to …☆13Feb 21, 2026Updated 3 months ago
- Working with LLM in C#☆16Apr 1, 2026Updated last month
- An OpenSource implementation of remixicon for React Native☆41Nov 11, 2025Updated 6 months ago
- Downloads books from the amazon web reader☆31Oct 15, 2025Updated 7 months ago
- Publish updates to several social networks simultaneously☆16Apr 12, 2015Updated 11 years ago
- KodiPlay☆12Apr 4, 2023Updated 3 years ago
- Measuring Thinking Efficiency in Reasoning Models - Research Repository☆39Dec 2, 2025Updated 5 months ago
- Russian-speaking GLaDOS anti-assistant☆14Jun 23, 2024Updated last year
- Build custom REST APIs on top of Frappe☆22Nov 25, 2021Updated 4 years ago