LibMoE: A LIBRARY FOR COMPREHENSIVE BENCHMARKING MIXTURE OF EXPERTS IN LARGE LANGUAGE MODELS
★48 · Apr 23, 2026 · Updated last week
Alternatives and similar repositories for LibMoE
Users interested in LibMoE are comparing it to the libraries listed below.
- [ICLR 2025] CodeMMLU Evaluator: a framework for evaluating language models on the CodeMMLU multiple-choice benchmark. ★29 · Apr 21, 2025 · Updated last year
- ★10 · Mar 14, 2021 · Updated 5 years ago
- ★22 · Jul 30, 2023 · Updated 2 years ago
- Official release of the NeurIPS 2024 paper "Slot State Space Models". ★11 · Mar 22, 2025 · Updated last year
- Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model. ★13 · Feb 11, 2025 · Updated last year
- [NAACL 2024] Z-GMOT: Zero-shot Generic Multiple Object Tracking. ★13 · May 3, 2024 · Updated 2 years ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts". ★19 · Mar 10, 2025 · Updated last year
- ★11 · Jul 25, 2021 · Updated 4 years ago
- [NeurIPS 2025] ExGra-Med: Medical Multi-Modal LLM with Extended Context Alignment. ★41 · Apr 7, 2026 · Updated 3 weeks ago
- [ICRA 2024] Language-Conditioned Affordance-Pose Detection in 3D Point Clouds. ★52 · Jan 10, 2025 · Updated last year
- ★15 · Jan 24, 2025 · Updated last year
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models. ★61 · Feb 7, 2025 · Updated last year
- The code for "MoPE: Mixture of Prefix Experts for Zero-Shot Dialogue State Tracking". ★19 · Jan 25, 2025 · Updated last year
- The code of 《M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis》. ★14 · Mar 31, 2025 · Updated last year
- MoE-Visualizer: a tool designed to visualize the selection of experts in Mixture-of-Experts (MoE) models. ★16 · Apr 8, 2025 · Updated last year
- ★12 · Apr 17, 2023 · Updated 3 years ago
- [ACL 2026 Main] Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis. ★38 · Apr 24, 2026 · Updated last week
- ★17 · Mar 20, 2025 · Updated last year
- Implementation of Online Hedge Backpropagation. ★52 · Jun 20, 2018 · Updated 7 years ago
- Scaling Laws for Mixture of Experts Models. ★15 · Feb 25, 2025 · Updated last year
- Open-source materials for the paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ★30 · Nov 12, 2024 · Updated last year
- [ACL 2023 Findings] Emergent Modularity in Pre-trained Transformers. ★26 · Jun 7, 2023 · Updated 2 years ago
- A web app for both text-based and visual question answering. ★13 · Nov 13, 2023 · Updated 2 years ago
- Mamba R1: a novel architecture that combines the efficiency of Mamba's state space models with the scalability of Mixture of Ex… ★25 · Oct 13, 2025 · Updated 6 months ago
- Implementation of MomentumSMoE. ★19 · Apr 19, 2025 · Updated last year
- Python implementation of the supervised graph prediction method proposed in http://arxiv.org/abs/2202.03813, using PyTorch and POT… ★15 · Feb 25, 2022 · Updated 4 years ago
- English-Vietnamese machine translation using a Transformer (PyTorch). ★12 · Jun 30, 2023 · Updated 2 years ago
- [KernelGYM & Dr. Kernel] A distributed GPU environment and a collection of RL training methods to support RL for kernel generation. ★160 · Mar 29, 2026 · Updated last month
- A PyTorch implementation of the ICLR 2019 paper "Invariant and Equivariant Graph Networks" by Haggai Maron, Heli Ben-Hamu, Nadav Shamir a… ★17 · Mar 7, 2022 · Updated 4 years ago
- My personal blog about AI, ML, and DL. ★11 · Aug 23, 2023 · Updated 2 years ago
- lanox vim theme. ★15 · Mar 22, 2016 · Updated 10 years ago
- Mixture-of-Experts Multimodal Variational Autoencoder. ★15 · Jul 3, 2025 · Updated 9 months ago
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion. ★14 · Mar 17, 2025 · Updated last year
- [ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization. ★24 · Oct 5, 2025 · Updated 6 months ago
- MGPATH. ★14 · Oct 15, 2025 · Updated 6 months ago
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards". ★63 · Jan 5, 2026 · Updated 3 months ago
- ★16 · Jan 30, 2022 · Updated 4 years ago
- [ACL 2024] A novel reranking method to select the best solutions for code generation. ★16 · Jun 9, 2024 · Updated last year
- Trustworthy Knowledge Graph Completion Based on Multi-sourced Noisy Data (WWW 2022). ★14 · Apr 6, 2022 · Updated 4 years ago