DD-DuDa / BitDecoding

A GPU-optimized system for efficient long-context LLM decoding with a low-bit KV cache.
34 · Updated 2 weeks ago

Alternatives and similar repositories for BitDecoding:

Users who are interested in BitDecoding are comparing it to the libraries listed below.