Fsoft-AIC / LibMoE
LibMoE: A LIBRARY FOR COMPREHENSIVE BENCHMARKING MIXTURE OF EXPERTS IN LARGE LANGUAGE MODELS
☆46 · Updated 3 weeks ago
Alternatives and similar repositories for LibMoE
Users that are interested in LibMoE are comparing it to the libraries listed below
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆76 · Updated 11 months ago
- Codes for Merging Large Language Models ☆35 · Updated last year
- [ICLR 2025] CAMEx: Curvature-Aware Merging of Experts ☆22 · Updated 11 months ago
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24] ☆61 · Updated last year
- ☆30 · Updated 2 years ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆51 · Updated last year
- ☆206 · Updated last year
- State-of-the-art Parameter-Efficient MoE Fine-tuning Method ☆203 · Updated last year
- ☆29 · Updated last year
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts. ☆141 · Updated last year
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆152 · Updated 6 months ago
- [EMNLP 2023, Main Conference] Sparse Low-rank Adaptation of Pre-trained Language Models ☆84 · Updated last year
- Awesome Low-Rank Adaptation ☆59 · Updated 5 months ago
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆100 · Updated last year
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis" ☆57 · Updated last year
- Official PyTorch implementation of DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral) ☆55 · Updated 7 months ago
- Model Merging with SVD to Tie the KnOTS [ICLR 2025] ☆85 · Updated 9 months ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆58 · Updated 11 months ago
- ☆152 · Updated last year
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆85 · Updated last year
- Official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆40 · Updated last year
- Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized?" ☆143 · Updated 9 months ago
- Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆88 · Updated 4 months ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning" ☆101 · Updated last year
- CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for task-aware parameter-efficient fine-tuning (NeurIPS 2024) ☆53 · Updated last year
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration ☆46 · Updated last year
- ☆28 · Updated last year
- ☆141 · Updated 10 months ago
- A Sober Look at Language Model Reasoning ☆92 · Updated 2 months ago
- An Efficient LLM Fine-Tuning Factory Optimized for MoE PEFT ☆132 · Updated 10 months ago