qiuzh20 / gated_attention

The official implementation for Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
17 · Updated this week

Alternatives and similar repositories for gated_attention

Users who are interested in gated_attention are comparing it to the libraries listed below.
