jllllll / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
64 · Updated last year

Alternatives and similar repositories for exllama

Users who are interested in exllama are comparing it to the libraries listed below.
