Fsoft-AIC / LibMoE
LibMoE: A Library for Comprehensive Benchmarking Mixture of Experts in Large Language Models
☆37 Updated 2 weeks ago
Alternatives and similar repositories for LibMoE
Users interested in LibMoE are comparing it to the libraries listed below.
- Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation ☆35 Updated last year
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆42 Updated 6 months ago
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions" ☆36 Updated 11 months ago
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla… ☆58 Updated 7 months ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆67 Updated 2 months ago
- Code for "NOLA: Compressing LoRA using Linear Combination of Random Basis" ☆54 Updated 8 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆73 Updated 6 months ago
- [ICLR 2025] 🚀 CodeMMLU Evaluator: A framework for evaluating LLMs on the CodeMMLU MCQ benchmark. ☆23 Updated 3 weeks ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆58 Updated 2 months ago
- ☆29 Updated last year
- Code for Heima ☆42 Updated 3 weeks ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆51 Updated 3 months ago
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆63 Updated 2 months ago
- ☆27 Updated last year
- Residual Prompt Tuning: a method for faster and better prompt tuning. ☆54 Updated 2 years ago
- ☆97 Updated 2 months ago
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆31 Updated last year
- [ICLR 2025 Spotlight] When Attention Sink Emerges in Language Models: An Empirical View ☆72 Updated 6 months ago
- [ICLR 2024] AdaMerging: Adaptive Model Merging for Multi-Task Learning ☆78 Updated 6 months ago
- [ICML 2024] One Prompt is Not Enough: Automated Construction of a Mixture-of-Expert Prompts - TurningPoint AI ☆23 Updated 7 months ago
- ☆177 Updated last year
- ☆41 Updated 6 months ago
- Codes for Merging Large Language Models ☆29 Updated 9 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆76 Updated 8 months ago
- Code for paper "Merging Multi-Task Models via Weight-Ensembling Mixture of Experts" ☆24 Updated 11 months ago
- [EMNLP 2023] The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts" ☆38 Updated last year
- ☆41 Updated 9 months ago
- [ACL 2024] RecGPT: Generative Pre-training for Text-based Recommendation ☆32 Updated 7 months ago
- ☆17 Updated 4 months ago
- ☆10 Updated 3 weeks ago