Fsoft-AIC / LibMoE
LibMoE: A LIBRARY FOR COMPREHENSIVE BENCHMARKING MIXTURE OF EXPERTS IN LARGE LANGUAGE MODELS
☆41 · Updated 3 months ago
Alternatives and similar repositories for LibMoE
Users interested in LibMoE are comparing it to the libraries listed below.
- Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation ☆35 · Updated last year
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆69 · Updated 6 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆42 · Updated last year
- ☆122 · Updated 6 months ago
- [ACL'25 Oral] What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆73 · Updated 2 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models ☆81 · Updated 10 months ago
- [ICLR 2025 Spotlight] When Attention Sink Emerges in Language Models: An Empirical View ☆123 · Updated 2 months ago
- Code for the paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models" ☆46 · Updated 11 months ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis" ☆56 · Updated last year
- ☆69 · Updated 10 months ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge ☆84 · Updated 7 months ago
- ☆190 · Updated last year
- AdaMerging: Adaptive Model Merging for Multi-Task Learning (ICLR 2024) ☆89 · Updated 10 months ago
- ☆101 · Updated 5 months ago
- ☆27 · Updated last year
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria ☆70 · Updated 11 months ago
- Code for Heima ☆53 · Updated 5 months ago
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with … ☆61 · Updated 11 months ago
- Awesome Low-Rank Adaptation ☆44 · Updated last month
- ☆166 · Updated 4 months ago
- GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆74 · Updated 3 months ago
- [ICCV 2025] Auto-interpretation pipeline and many other functionalities for multimodal SAE analysis ☆150 · Updated 2 months ago
- Survey: a collection of awesome papers and resources on the latest research in Mixture of Experts ☆134 · Updated last year
- Official PyTorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" (ICLR 2025) ☆85 · Updated 3 months ago
- A Sober Look at Language Model Reasoning ☆83 · Updated last week
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models ☆129 · Updated 2 months ago
- Code for "Merging Text Transformers from Different Initializations" ☆20 · Updated 7 months ago
- Official implementation of the paper "What Matters in Transformers? Not All Attention is Needed" ☆176 · Updated 5 months ago
- [NeurIPS 2025] Main source code of the SRPO framework ☆34 · Updated this week
- Code and benchmark for the paper "A Practitioner's Guide to Continual Multimodal Pretraining" (NeurIPS 2024) ☆58 · Updated 9 months ago