Fsoft-AIC / LibMoE
LibMoE: A LIBRARY FOR COMPREHENSIVE BENCHMARKING MIXTURE OF EXPERTS IN LARGE LANGUAGE MODELS
☆37 · Updated 2 months ago
Alternatives and similar repositories for LibMoE:
Users interested in LibMoE are comparing it to the libraries listed below:
- Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation ☆35 · Updated last year
- [ICLR 2025] CodeMMLU Evaluator: A framework for evaluating language models on the CodeMMLU MCQ benchmark ☆22 · Updated 4 months ago
- Implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆89 · Updated this week
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆49 · Updated 2 months ago
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla… ☆57 · Updated 6 months ago
- Code for the paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models" ☆42 · Updated 5 months ago
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆61 · Updated 6 months ago
- The official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆37 · Updated 6 months ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis" ☆53 · Updated 7 months ago
- ☆173 · Updated last year
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation ☆54 · Updated 4 months ago
- An Efficient LLM Fine-Tuning Factory Optimized for MoE PEFT ☆89 · Updated last month
- RecGPT: Generative Pre-training for Text-based Recommendation (ACL 2024) ☆31 · Updated 6 months ago
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis ☆126 · Updated 2 months ago
- Official Repo for FoodieQA paper (EMNLP 2024) ☆16 · Updated 4 months ago
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions" ☆36 · Updated 10 months ago
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts ☆111 · Updated 7 months ago
- AdaMerging: Adaptive Model Merging for Multi-Task Learning (ICLR 2024) ☆77 · Updated 5 months ago
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs ☆24 · Updated 5 months ago
- [ICLR'25 Oral] MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models ☆31 · Updated 5 months ago
- Code for "Merging Text Transformers from Different Initializations" ☆20 · Updated 2 months ago
- ☆40 · Updated 5 months ago
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning ☆42 · Updated 8 months ago
- Pioneering in Vietnamese Multimodal Large Language Model ☆46 · Updated 2 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆75 · Updated 7 months ago
- Official Code Repository for the paper "Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks… ☆37 · Updated 4 months ago
- ☆27 · Updated last year
- ☆73 · Updated 3 weeks ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆56 · Updated last month
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge ☆65 · Updated 2 months ago