Spico197 / MoE-SFT

🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts
☆38 · Updated 5 months ago

Alternatives and similar repositories for MoE-SFT:

Users who are interested in MoE-SFT are comparing it to the libraries listed below.