piotrpiekos / MoSA
User-friendly implementation of Mixture-of-Sparse-Attention (MoSA). MoSA selects distinct tokens for each head with expert-choice routing, providing a content-based sparse attention mechanism.
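To illustrate the idea described above, here is a minimal, hypothetical sketch (not the repository's actual code) of expert-choice token selection for a single attention head: a learned router scores every token, the head keeps only its top-k tokens, and attention is computed within that selected subset. The function and parameter names (`expert_choice_sparse_attention`, `wr`) are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def expert_choice_sparse_attention(x, Wq, Wk, Wv, wr, k):
    """One head: a router scores all tokens (expert choice), and the head
    attends only over its own top-k selected tokens.

    x:  (seq_len, d_model) token representations
    Wq, Wk, Wv: (d_model, d_head) projection matrices
    wr: (d_model,) router weight vector for this head (an assumption)
    k:  number of tokens this head selects
    """
    scores = x @ wr                       # (seq_len,) router logits
    sel = np.argsort(scores)[-k:]         # indices of the k highest-scoring tokens
    xs = x[sel]                           # (k, d_model) selected tokens only
    q, key, v = xs @ Wq, xs @ Wk, xs @ Wv
    attn = softmax(q @ key.T / np.sqrt(key.shape[-1]))  # (k, k) attention
    return attn @ v, sel                  # head output and selected indices
```

Because each head's router scores content, different heads can select different token subsets, which is what makes the resulting sparsity content-based rather than positional.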
28 stars · May 3, 2025 · Updated 10 months ago

Alternatives and similar repositories for MoSA

Users interested in MoSA are comparing it to the libraries listed below.
