thu-nics / MoA

The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression".
