facebookresearch / chai

CHAI is a library for dynamic pruning of attention heads for efficient LLM inference.
13 · Updated 4 months ago

Alternatives and similar repositories for chai:

Users who are interested in chai are comparing it to the libraries listed below.