agwaBom / towards_moe
Implementation of "Towards Understanding Mixture of Experts in Deep Learning", NeurIPS 2022
☆10 Updated 2 years ago
Alternatives and similar repositories for towards_moe
Users who are interested in towards_moe are comparing it to the repositories listed below
- This repository contains the code of the distribution shift framework presented in A Fine-Grained Analysis on Distribution Shift (Wiles e… ☆83 Updated 2 months ago
- LISA for ICML 2022 ☆51 Updated 2 years ago
- ☆38 Updated 4 years ago
- Code for "Surgical Fine-Tuning Improves Adaptation to Distribution Shifts" published at ICLR 2023 ☆29 Updated 2 years ago
- Online Hyperparameter Optimization ☆11 Updated 4 years ago
- ☆25 Updated last year
- ☆38 Updated 9 months ago
- Code for "Just Train Twice: Improving Group Robustness without Training Group Information" ☆72 Updated last year
- Official repository for Fourier model that can generate periodic signals ☆10 Updated 3 years ago
- ☆157 Updated 4 years ago
- ☆108 Updated 2 years ago
- ☆34 Updated 3 months ago
- Weighted Training for Cross-Task Learning ☆15 Updated 2 years ago
- Deep Learning & Information Bottleneck ☆61 Updated 2 years ago
- Energy-Based Models for Continual Learning Official Repository (PyTorch) ☆42 Updated 2 years ago
- ☆46 Updated 2 years ago
- Learning from Failure: Training Debiased Classifier from Biased Classifier (NeurIPS 2020) ☆91 Updated 4 years ago
- Code to implement the AND-mask and geometric mean to do gradient based optimization, from the paper "Learning explanations that are hard … ☆40 Updated 4 years ago
- ☆31 Updated last year
- [NeurIPS 2021] A Geometric Analysis of Neural Collapse with Unconstrained Features ☆58 Updated 3 years ago
- "Understanding Dataset Difficulty with V-Usable Information" (ICML 2022, outstanding paper) ☆87 Updated last year
- ☆73 Updated 3 years ago
- Code used in "Understanding Dimensional Collapse in Contrastive Self-supervised Learning" paper. ☆79 Updated 2 years ago
- Code for paper "Can contrastive learning avoid shortcut solutions?" NeurIPS 2021. ☆47 Updated 3 years ago
- Source code for paper "Contrastive Out-of-Distribution Detection for Pretrained Transformers", EMNLP 2021 ☆40 Updated 3 years ago
- Crawl & visualize ICLR papers and reviews ☆110 Updated 2 years ago
- Use this package to compute intrinsic dimensionality of your task given a fixed neural network in PYTORCH! ☆36 Updated 2 years ago
- DiWA: Diverse Weight Averaging for Out-of-Distribution Generalization ☆31 Updated 2 years ago
- Towards Understanding Sharpness-Aware Minimization [ICML 2022] ☆35 Updated 3 years ago
- ☆58 Updated 2 years ago