scrya-com / rotorquant
KV cache compression via block-diagonal rotation. Beats TurboQuant: better PPL (6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, 44x fewer params. Drop-in llama.cpp integration.
653 | Apr 3, 2026 | Updated 2 weeks ago

Alternatives and similar repositories for rotorquant

Users that are interested in rotorquant are comparing it to the libraries listed below.

