piotrpiekos / MoSA

User-friendly implementation of Mixture-of-Sparse-Attention (MoSA). MoSA selects distinct tokens for each head via expert-choice routing, providing a content-based sparse attention mechanism.
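The description above can be illustrated with a minimal sketch of expert-choice token selection, where each head acts as an "expert" that picks its own top-k tokens by router score. This is an assumption-laden illustration, not MoSA's actual API: the function name, the per-head linear router `w_router`, and all shapes are hypothetical.

```python
import numpy as np

def expert_choice_select(x, w_router, k):
    """Each head ("expert") picks its own top-k tokens by router score.

    This is a hypothetical sketch, not the MoSA implementation.

    x:        (T, d) token representations
    w_router: (H, d) one router vector per head (illustrative)
    k:        number of tokens kept per head
    returns:  (H, k) indices of the tokens each head attends over
    """
    scores = x @ w_router.T                   # (T, H) router logits
    # Sort tokens per head by descending score and keep the top-k indices,
    # so each head selects its own (possibly different) subset of tokens.
    idx = np.argsort(-scores, axis=0)[:k].T   # (H, k)
    return idx

rng = np.random.default_rng(0)
T, d, H, k = 16, 8, 4, 4
x = rng.normal(size=(T, d))
w = rng.normal(size=(H, d))
idx = expert_choice_select(x, w, k)
print(idx.shape)
```

Attention would then be computed per head only over its selected k tokens instead of all T, which is where the sparsity savings come from.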
23 stars · Updated 3 months ago

Alternatives and similar repositories for MoSA

Users interested in MoSA are comparing it to the libraries listed below.
