facebookresearch / chai

CHAI is a library for dynamic pruning of attention heads for efficient LLM inference.
12 · Updated 3 months ago

Alternatives and similar repositories for chai:

Users who are interested in chai are comparing it to the libraries listed below.