yaof20 / DenseMixer

Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient
35 · Updated this week

Alternatives and similar repositories for DenseMixer

Users who are interested in DenseMixer are comparing it to the libraries listed below.
