withinmiaov / A-Survey-on-Mixture-of-Experts
☆97 Updated last month
Related projects:
- ☆109 Updated last month
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666. ☆104 Updated this week
- State-of-the-art Parameter-Efficient MoE Fine-tuning Method ☆58 Updated 3 weeks ago
- ☆54 Updated 2 months ago
- The source code of the EMNLP 2023 main conference paper: Sparse Low-rank Adaptation of Pre-trained Language Models. ☆62 Updated 6 months ago
- ☆119 Updated last week
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆63 Updated 3 months ago
- Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized?" ☆49 Updated last month
- [ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models” ☆114 Updated 2 months ago
- A curated list of Model Merging methods. ☆70 Updated last month
- Survey on Data-centric Large Language Models ☆58 Updated 2 months ago
- ☆19 Updated last month
- This repository collects surveys, resources, and papers on Lifelong Learning for Large Language Models. (Updated regularly) ☆22 Updated 2 weeks ago
- A generalized framework for subspace tuning methods in parameter efficient fine-tuning. ☆59 Updated this week
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆81 Updated 3 weeks ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆47 Updated last month
- ☆139 Updated 2 months ago
- FusionBench: A Comprehensive Benchmark of Deep Model Fusion ☆42 Updated 2 weeks ago
- LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models ☆86 Updated 4 months ago
- LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment ☆190 Updated 4 months ago
- ☆20 Updated 3 months ago
- MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities. ☆53 Updated 2 weeks ago
- [SIGIR'24] The official implementation code of MOELoRA. ☆113 Updated last month
- ☆40 Updated 5 months ago
- ☆24 Updated 5 months ago
- Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation" ☆120 Updated 4 months ago
- ☆125 Updated this week
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆56 Updated this week
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria ☆49 Updated last month
- ICML'2024 | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI ☆84 Updated 2 months ago