ZhenweiAn / Dynamic_MoE
Inference code for the paper "Harder Tasks Need More Experts: Dynamic Routing in MoE Models"
☆60 · Updated last year
Alternatives and similar repositories for Dynamic_MoE
Users who are interested in Dynamic_MoE are comparing it to the repositories listed below.
- ☆114 · Updated 2 months ago
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models ☆121 · Updated last month
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning" ☆126 · Updated 9 months ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆171 · Updated last month
- 🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training ☆86 · Updated 8 months ago
- ☆118 · Updated 4 months ago
- [SIGIR'24] The official implementation code of MOELoRA ☆175 · Updated last year
- ☆149 · Updated last year
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts" (EMNLP 2023) ☆38 · Updated last year
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆82 · Updated 5 months ago
- ☆141 · Updated 2 months ago
- ☆96 · Updated 3 months ago
- An Efficient LLM Fine-Tuning Factory Optimized for MoE PEFT ☆108 · Updated 5 months ago
- Repo for the EMNLP'24 paper "Dual-Space Knowledge Distillation for Large Language Models", a general white-box KD framework for both same… ☆56 · Updated 9 months ago
- Repository for the survey "What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models" ☆57 · Updated this week
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆157 · Updated last year
- State-of-the-art parameter-efficient MoE fine-tuning method ☆176 · Updated 11 months ago
- Model merging is a highly efficient approach for long-to-short reasoning ☆77 · Updated 2 months ago
- ☆112 · Updated last year
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆90 · Updated last month
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆49 · Updated 9 months ago
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration ☆52 · Updated 5 months ago
- Code for the ACL 2024 paper "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning" ☆22 · Updated 5 months ago
- ☆65 · Updated 8 months ago
- ☆25 · Updated 4 months ago
- ☆127 · Updated 2 months ago
- An implementation of the paper "Improve Mathematical Reasoning in Language Models by Automated Process Supervision" from Google De… ☆36 · Updated last month
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆62 · Updated 8 months ago
- Efficient Mixture of Experts for LLM Paper List ☆87 · Updated 7 months ago
- [ACL'25] We introduce ScaleQuest, a scalable, novel, and cost-effective data synthesis method to unleash the reasoning capability of LLMs ☆63 · Updated 9 months ago