SkyworkAI / MoH

MoH: Multi-Head Attention as Mixture-of-Head Attention
143 · Updated 2 weeks ago

Related projects

Alternatives and complementary repositories for MoH