tonbistudio / turboquant-pytorch
From-scratch PyTorch implementation of Google's TurboQuant (ICLR 2026) for LLM KV cache compression. 5x compression at 3-bit with 99.5% attention fidelity.
436 · Mar 25, 2026 · Updated this week

Alternatives and similar repositories for turboquant-pytorch

Users who are interested in turboquant-pytorch are comparing it to the libraries listed below.
