okoge-kaz / moe-recipes

Ongoing research on training Mixture-of-Experts models.
18 · Updated this week

Related projects: