piotrpiekos / MoSALinks

User-friendly implementation of Mixture-of-Sparse-Attention (MoSA). MoSA selects distinct tokens for each head via expert-choice routing, providing a content-based sparse attention mechanism.
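The idea can be sketched as follows. This is a hypothetical illustration of expert-choice token routing for sparse attention, not the repository's actual API: the names (`W_route`, `head_attention`), shapes, and the identity Q/K/V projections are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model, n_heads, k = 16, 32, 4, 4

X = rng.standard_normal((n_tokens, d_model))       # token representations
W_route = rng.standard_normal((n_heads, d_model))  # hypothetical per-head router

# Expert choice: each head scores every token and keeps its own top-k,
# so different heads attend over different, content-selected token subsets.
scores = X @ W_route.T                              # (n_tokens, n_heads)
topk = np.argsort(-scores, axis=0)[:k]              # (k, n_heads) token indices

def head_attention(x_sel, d_head):
    """Dense softmax attention over only the k tokens this head selected.

    Identity Q/K/V projections are used to keep the sketch short."""
    att = x_sel @ x_sel.T / np.sqrt(d_head)
    att = np.exp(att - att.max(axis=-1, keepdims=True))
    att /= att.sum(axis=-1, keepdims=True)
    return att @ x_sel

# Each head runs attention on its own k selected tokens instead of all n.
outputs = [head_attention(X[topk[:, h]], d_model) for h in range(n_heads)]
print([o.shape for o in outputs])  # → [(4, 32), (4, 32), (4, 32), (4, 32)]
```

Because each head attends over only `k` of the `n` tokens, the per-head attention cost drops from O(n²) to O(k²), while the routing scores keep the selection content-dependent.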
22 stars · Updated 2 months ago

Alternatives and similar repositories for MoSA

Users interested in MoSA are comparing it to the libraries listed below.
