agwaBom / towards_moe
Implementation of "Towards Understanding Mixture of Experts in Deep Learning", NeurIPS 2022
☆10 Updated 2 years ago
Alternatives and similar repositories for towards_moe:
Users who are interested in towards_moe are comparing it to the repositories listed below.
- LISA for ICML 2022 ☆47 Updated last year
- Discover and Cure: Concept-aware Mitigation of Spurious Correlation (ICML 2023) ☆41 Updated 11 months ago
- This is the repository for "Model Merging by Uncertainty-Based Gradient Matching", ICLR 2024. ☆27 Updated 10 months ago
- ☆28 Updated 8 months ago
- Code for "Surgical Fine-Tuning Improves Adaptation to Distribution Shifts" published at ICLR 2023 ☆29 Updated last year
- Active Learning Helps Pretrained Models Learn the Intended Task (https://arxiv.org/abs/2204.08491) by Alex Tamkin, Dat Nguyen, Salil Desh… ☆11 Updated 2 years ago
- ☆44 Updated 2 years ago
- Code for the ICLR 2022 paper "Attention-based interpretability with Concept Transformers" ☆40 Updated last year
- ☆16 Updated 11 months ago
- ☆25 Updated 11 months ago
- ☆66 Updated 3 years ago
- Official implementation for NeurIPS'23 paper "Geodesic Multi-Modal Mixup for Robust Fine-Tuning" ☆32 Updated 6 months ago
- Official implementation of ORCA proposed in the paper "Cross-Modal Fine-Tuning: Align then Refine" ☆71 Updated last year
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models ☆35 Updated last year
- Code for "Just Train Twice: Improving Group Robustness without Training Group Information" ☆71 Updated 10 months ago
- Code for the ICLR 2021 Paper "In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness" ☆12 Updated 3 years ago
- A modern look at the relationship between sharpness and generalization [ICML 2023] ☆43 Updated last year
- ☆38 Updated 4 months ago
- ☆26 Updated last year
- ☆28 Updated last year
- Implementation of Beyond Neural Scaling beating power laws for deep models and prototype-based models ☆33 Updated this week
- ☆38 Updated 3 years ago
- Code for Environment Inference for Invariant Learning (ICML 2021 Paper) ☆50 Updated 3 years ago
- ☆18 Updated 8 months ago
- Repo for the paper: "Agree to Disagree: Diversity through Disagreement for Better Transferability" ☆35 Updated 2 years ago
- DiWA: Diverse Weight Averaging for Out-of-Distribution Generalization ☆29 Updated 2 years ago
- On the Effectiveness of Parameter-Efficient Fine-Tuning ☆38 Updated last year
- ☆29 Updated last year
- ☆60 Updated 3 years ago
- Benchmark for Natural Temporal Distribution Shift (NeurIPS 2022) ☆65 Updated 2 years ago