VITA-Group / Random-MoE-as-Dropout

[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang
48 · Updated last year

Alternatives and similar repositories for Random-MoE-as-Dropout:

Users interested in Random-MoE-as-Dropout are comparing it to the libraries listed below.