withinmiaov / A-Survey-on-Mixture-of-Experts-in-LLMs
The official GitHub page for the survey paper "A Survey on Mixture of Experts in Large Language Models".
★337 · Updated last month
Alternatives and similar repositories for A-Survey-on-Mixture-of-Experts-in-LLMs:
Users who are interested in A-Survey-on-Mixture-of-Experts-in-LLMs are comparing it to the repositories listed below.
- Survey Paper List - Efficient LLM and Foundation Models ★248 · Updated 7 months ago
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗). ★393 · Updated 3 weeks ago
- A collection of AWESOME things about mixture-of-experts ★1,103 · Updated 4 months ago
- Paper list for Efficient Reasoning. ★412 · Updated last week
- [NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning ★195 · Updated 5 months ago
- Awesome list for LLM pruning. ★224 · Updated 4 months ago
- Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models ★350 · Updated last week
- Awesome list for LLM quantization ★210 · Updated 4 months ago
- A curated reading list of research in Mixture-of-Experts(MoE). ★619 · Updated 6 months ago
- ★194 · Updated 6 months ago
- Awesome LLM pruning papers all-in-one repository with integrating all useful resources and insights. ★85 · Updated 4 months ago
- LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment ★326 · Updated last year
- A Telegram bot to recommend arXiv papers ★270 · Updated 3 weeks ago
- ★144 · Updated 7 months ago
- AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (ICLR 2023). ★320 · Updated last year
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings) ★261 · Updated 2 weeks ago
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️ ★704 · Updated last week
- TransMLA: Multi-Head Latent Attention Is All You Need ★243 · Updated this week
- [Arxiv 2025] Efficient Reasoning Models: A Survey ★130 · Updated this week
- Awesome-LLM-KV-Cache: A curated list of Awesome LLM KV Cache Papers with Codes. ★288 · Updated 2 months ago
- Paper List of Inference/Test Time Scaling/Computing ★207 · Updated last week
- PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models(NeurIPS 2024 Spotlight) ★347 · Updated 3 months ago
- ★132 · Updated 9 months ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ★136 · Updated last month
- [NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models ★162 · Updated 4 months ago
- A generalized framework for subspace tuning methods in parameter efficient fine-tuning. ★139 · Updated 2 months ago
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666. ★371 · Updated last week
- Awesome-Low-Rank-Adaptation ★94 · Updated 6 months ago
- A curated list of Model Merging methods. ★92 · Updated 7 months ago
- Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ★158 · Updated 10 months ago