piotrpiekos / MoSA

User-friendly implementation of Mixture-of-Sparse-Attention (MoSA). MoSA selects distinct tokens for each head via expert-choice routing, providing a content-based sparse attention mechanism.
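The core idea in the description — each head picks its own tokens via expert-choice routing — can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the repository's actual API: the router here is a single linear projection per head, and the function name `expert_choice_select` is hypothetical.

```python
import numpy as np

def expert_choice_select(x, w_router, k):
    """Illustrative expert-choice routing: each head scores every
    token and keeps its own top-k (heads choose tokens, so every
    head gets exactly k tokens regardless of how tokens overlap).

    x:        (seq_len, d_model) token representations
    w_router: (d_model, n_heads) hypothetical per-head router weights
    k:        number of tokens each head attends over
    Returns:  (n_heads, k) indices of the tokens chosen per head
    """
    scores = x @ w_router                    # (seq_len, n_heads)
    # Negate so argsort yields highest-scoring tokens first.
    idx = np.argsort(-scores, axis=0)[:k].T  # (n_heads, k)
    return idx

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))   # 16 tokens, d_model = 8
w = rng.standard_normal((8, 4))    # 4 heads
idx = expert_choice_select(x, w, k=5)
print(idx.shape)  # (4, 5): 5 token indices per head
```

Because each head selects a different top-k subset, attention is then computed only over those k tokens per head, which is what makes the mechanism both sparse and content-based.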

Alternatives and similar repositories for MoSA

Users interested in MoSA are comparing it to the libraries listed below.
