SkyworkAI / MoH

MoH: Multi-Head Attention as Mixture-of-Head Attention
157 · Updated 3 weeks ago

Related projects

Alternatives and complementary repositories for MoH