piotrpiekos / MoSA (view on GitHub)
User-friendly implementation of Mixture-of-Sparse-Attention (MoSA). MoSA selects distinct tokens for each head via expert-choice routing, providing a content-based sparse attention mechanism.
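A minimal sketch of the idea behind the description above: with expert-choice routing, each head acts as the "expert" and picks its own top-k tokens by a learned score, then attends only within that subset. All names (`w_score`, shapes, `k`) are illustrative assumptions, not MoSA's actual API.

```python
import torch

torch.manual_seed(0)
seq, d, heads, k = 16, 8, 2, 4

x = torch.randn(seq, d)            # token representations (illustrative)
w_score = torch.randn(heads, d)    # hypothetical per-head router weights

# Expert-choice routing: each head scores every token and keeps its own
# top-k, so different heads can attend to different token subsets.
scores = x @ w_score.T                   # (seq, heads)
idx = scores.topk(k, dim=0).indices.T    # (heads, k) token indices per head

# Each head then runs dense attention only over its selected tokens,
# giving cost O(k^2) per head instead of O(seq^2).
outputs = []
for h in range(heads):
    sel = x[idx[h]]                                        # (k, d)
    attn = torch.softmax(sel @ sel.T / d ** 0.5, dim=-1)   # (k, k)
    outputs.append(attn @ sel)                             # (k, d)
```

Because each head routes independently, the selected index sets generally differ across heads, which is the "distinct tokens for each head" property the description refers to.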
★ 28 · May 3, 2025 · Updated 10 months ago

Alternatives and similar repositories for MoSA

Users interested in MoSA are comparing it to the libraries listed below.
