airnsk / proxycache
Smart OpenAI‑compatible proxy for llama.cpp: manages slots, saves/restores KV cache to disk, routes requests by prefix similarity, and protects hot slots from being overwritten. Accelerates long prompts (30–60k tokens) via instant reuse or fast on‑demand restore; supports SSE streaming and non‑stream JSON over /v1/chat/completions.
35 · Nov 14, 2025 · Updated 4 months ago
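
As a rough illustration of what "routes requests by prefix similarity" and "protects hot slots" in the description above could mean, here is a minimal Python sketch. It is not taken from proxycache itself: the `Slot` class, the `hot` flag, the `pick_slot` function, and the `min_ratio` threshold are all hypothetical names chosen for this example, and prompts are modeled as plain lists of token ids.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """Hypothetical model of one server slot and the tokens its KV cache holds."""
    slot_id: int
    tokens: list[int] = field(default_factory=list)
    hot: bool = False  # assumption: a hot slot must not be overwritten


def common_prefix_len(a, b):
    """Count how many leading tokens two sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pick_slot(slots, prompt, min_ratio=0.5):
    """Return (slot, reused_tokens) for a new prompt.

    Prefer the slot whose cached tokens share the longest prefix with the
    prompt, provided the shared part covers at least min_ratio of the prompt.
    Otherwise fall back to the first slot that is not hot, reusing nothing.
    Returns (None, 0) when every slot is hot and none matches well enough.
    """
    best = max(slots, key=lambda s: common_prefix_len(s.tokens, prompt))
    shared = common_prefix_len(best.tokens, prompt)
    if prompt and shared / len(prompt) >= min_ratio:
        return best, shared
    for s in slots:
        if not s.hot:
            return s, 0
    return None, 0


slots = [Slot(0, [1, 2, 3, 4], hot=True), Slot(1, [9, 9]), Slot(2, [1, 2, 7])]
match, reused = pick_slot(slots, [1, 2, 3, 4, 5])
fallback, fallback_reused = pick_slot(slots, [5, 6, 7])
```

In this sketch the first request reuses the hot slot because it shares most of its prefix, while the unrelated second request is sent to a slot that is not hot, so the hot slot's cache is left alone. A real proxy would also have to handle disk save and restore, which is not modeled here.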

Alternatives and similar repositories for proxycache

Users who are interested in proxycache are comparing it to the libraries listed below.
