nzjin / awesome_moe
A collection of MoE (Mixture of Experts) papers, code, tools, and more.
☆11 Updated 10 months ago
Alternatives and similar repositories for awesome_moe:
Users who are interested in awesome_moe are comparing it to the repositories listed below:
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379) ☆33 Updated 9 months ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning" ☆95 Updated 9 months ago
- ☆31 Updated last year
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆38 Updated 2 months ago
- [EMNLP 2023 Main] Sparse Low-rank Adaptation of Pre-trained Language Models ☆70 Updated 10 months ago
- The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models". ☆32 Updated last year
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆29 Updated last year
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆55 Updated 2 months ago
- Code for paper "Merging Multi-Task Models via Weight-Ensembling Mixture of Experts" ☆18 Updated 7 months ago
- [ACL 2023] Code for paper “Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation”(https://arxiv.org/abs/2305.… ☆38 Updated last year
- ☆26 Updated 9 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆39 Updated 2 months ago
- ☆27 Updated last year
- TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models ☆63 Updated 11 months ago
- One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning ☆38 Updated last year
- Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024) ☆32 Updated 6 months ago
- The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agen… ☆23 Updated 10 months ago
- The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning ☆38 Updated 5 months ago
- Codebase for ACL 2023 paper "Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models' Memori… ☆47 Updated last year
- The code and data for the paper JiuZhang3.0 ☆40 Updated 7 months ago
- ☆12 Updated 3 weeks ago
- Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML2023) ☆32 Updated last year
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆37 Updated 3 months ago
- Awesome Learn From Model Beyond Fine-Tuning: A Survey ☆54 Updated last month
- [ICML'2024] Can AI Assistants Know What They Don't Know? ☆76 Updated 11 months ago
- Code for ACL24 "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning" ☆15 Updated 8 months ago
- ☆20 Updated 6 months ago
- my commonly-used tools ☆48 Updated last week
- This repository collects awesome survey, resource, and paper for Lifelong Learning for Large Language Models. (Updated Regularly) ☆37 Updated 2 months ago
- [ACL 2024] This is the code repo for our ACL'24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module … ☆36 Updated 6 months ago