PacifAIst / Quansloth
Based on the implementation of Google's TurboQuant (ICLR 2026), Quansloth brings KV cache compression to local LLM inference. Quansloth is a fully private, air-gapped AI server that runs massive-context models natively on consumer hardware.
55 · Apr 6, 2026 · Updated this week

Alternatives and similar repositories for Quansloth

Users interested in Quansloth are comparing it to the libraries listed below.
