airnsk / proxycache

Smart OpenAI‑compatible proxy for llama.cpp: it manages slots, saves and restores the KV cache to and from disk, routes requests by prefix similarity, and protects hot slots from being overwritten. It accelerates long prompts (30k–60k tokens) through instant reuse or fast on‑demand restore, and supports both SSE streaming and non‑streaming JSON responses over /v1/chat/completions.
31 · Nov 14, 2025 · Updated 3 months ago
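
The description mentions routing by prefix similarity and protecting hot slots. The sketch below illustrates that general idea only; it is not taken from proxycache. The names `Slot`, `pick_slot`, `min_ratio`, and `hot_seconds` are hypothetical, similarity is measured on characters rather than tokens, and the disk save/restore path is left out.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Hypothetical record of what one llama.cpp slot currently holds."""
    slot_id: int
    cached_text: str = ""    # prompt text whose KV cache is in this slot
    last_used: float = 0.0   # timestamp (seconds) of the last request


def common_prefix_len(a: str, b: str) -> int:
    """Length of the shared leading run of characters in a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pick_slot(slots, prompt, now, min_ratio=0.5, hot_seconds=60.0):
    """Return (slot, action), where action is 'reuse' or 'overwrite'.

    Assumed policy: reuse the slot sharing the longest prefix with the
    prompt if that prefix covers at least min_ratio of the prompt;
    otherwise overwrite the least recently used slot, preferring slots
    that have been idle for more than hot_seconds.
    """
    best = max(slots, key=lambda s: common_prefix_len(s.cached_text, prompt))
    ratio = common_prefix_len(best.cached_text, prompt) / max(len(prompt), 1)
    if ratio >= min_ratio:
        return best, "reuse"
    cold = [s for s in slots if now - s.last_used > hot_seconds]
    candidates = cold or slots  # fall back to all slots if every slot is hot
    victim = min(candidates, key=lambda s: s.last_used)
    return victim, "overwrite"
```

With two slots, a prompt that extends the text cached in a recently used slot is routed back to that slot, while an unrelated prompt overwrites the idle slot instead of the hot one. A real proxy would compare token sequences and could restore a saved cache from disk before falling back to an overwrite.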
