Towards Understanding the Mixture-of-Experts Layer in Deep Learning
☆35Dec 12, 2023Updated 2 years ago
Alternatives and similar repositories for MoE
Users who are interested in MoE are comparing it to the repositories listed below
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection☆22Mar 7, 2024Updated 2 years ago
- Neural theorem proving tutorial, version II☆40Apr 26, 2024Updated last year
- [AAAI 2025] Trusted Unified Feature-Neighborhood Dynamics for Multi-View Classification☆19Apr 17, 2025Updated 10 months ago
- Repository for Unit 2 of the course INFO274: Simulation, Instituto de Informática, UACh☆10Dec 18, 2022Updated 3 years ago
- University of Chinese Academy of Sciences Taiji Laboratory, 2025 "Undergraduate Innovation Practice Training Program"☆14Apr 1, 2025Updated 11 months ago
- Single-Source Domain Generalization for Bearing Fault Diagnosis Using Feature-Augmented Adaptive Neuro-Fuzzy Inference System☆11Apr 13, 2024Updated last year
- Python Version of Andrew Welter's Hatebase Wrapper☆10Feb 20, 2022Updated 4 years ago
- ☆10Nov 29, 2022Updated 3 years ago
- Code and Dataset for <Quantitative Analysis of Melodic Similarity in Music Copyright Infringement Cases, ISMIR 2024>☆14Nov 12, 2024Updated last year
- Code for the experiments in the ACL 2020 paper "Estimating predictive uncertainty for rumour verification models"☆11May 15, 2020Updated 5 years ago
- Algorithm to detect bursts in the EEG of preterm infants (Python version of an existing Matlab program)☆11Feb 17, 2020Updated 6 years ago
- [IEEE TKDE 2024] Code of "Robust and Consistent Anchor Graph Learning for Multi-view Clustering"☆13Feb 28, 2024Updated 2 years ago
- The code of MetaViewer: Towards A Unified Multi-View Representation (CVPR 2023).☆10Nov 20, 2023Updated 2 years ago
- Pytorch implementation of Detective☆12Jul 11, 2024Updated last year
- A Novel Approach for Effective Multi-View Clustering with Information-Theoretic Perspective is a paper accepted by NeurIPS 2023☆10May 15, 2024Updated last year
- A guide to structured generation using constrained decoding☆14Jun 9, 2024Updated last year
- Machine learning project using federated learning for text generation☆11May 5, 2024Updated last year
- AI-powered CLI tool for automated electrochemical impedance spectroscopy (EIS) data analysis☆13Jan 21, 2025Updated last year
- Engineering Blog article prototypes☆17Oct 12, 2025Updated 4 months ago
- ☆13Jan 31, 2024Updated 2 years ago
- [ICASSP'2025] "M³Rec: Selective State Space Models with Mixture-of-Modality Experts for Multi-Modal Sequential Recommendation"☆11Jul 9, 2025Updated 8 months ago
- ☆11Feb 25, 2025Updated last year
- [ICLR'25] "Understanding Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing" by Peihao Wang, Ruisi Cai, Yue…☆17Mar 21, 2025Updated 11 months ago
- Diffusion-based Missing-view Generation With the Application on Incomplete Multi-view Clustering☆10May 26, 2024Updated last year
- a fast implementation of BM25☆10Sep 15, 2022Updated 3 years ago
- Bias Tests for Voice Technologies (bt4vt)☆11Jun 16, 2024Updated last year
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.☆13Mar 30, 2024Updated last year
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation☆19Nov 28, 2022Updated 3 years ago
- This tool derives both DVA and ICA curves from vehicle-level pOCV curves.☆12Aug 7, 2024Updated last year
- Implementation of "Towards Understanding Mixture of Experts in Deep Learning", NeurIPS 2022☆10Jan 6, 2023Updated 3 years ago
- Example code accompanying the sternberg concept cell data release for Kyzar et al. (2024)☆12Jan 22, 2024Updated 2 years ago
- Delving into the Continuous Domain Adaptation (ACM MM22)☆12Jul 10, 2022Updated 3 years ago
- Official implementation of the paper "From Optimization to Generalization: Fair Federated Learning against Quality Shift via Inter-Client…☆10Mar 13, 2025Updated 11 months ago
- This is a dataset (paired cloud and cloud-free Sentinel-2A image patches)☆11Jul 14, 2025Updated 7 months ago
- lanmt ebm☆12Jun 19, 2020Updated 5 years ago
- Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data.☆10Feb 27, 2024Updated 2 years ago
- A collection of Roblox Executors.☆12Dec 7, 2024Updated last year
- Deep Double Incomplete Multi-view Multi-label Learning with Incomplete Labels and Missing Views☆13Apr 7, 2024Updated last year
- Applies ROME and MEMIT on Mamba-S4 models☆14Apr 5, 2024Updated last year