arpita8 / Awesome-Mixture-of-Experts-PapersLinks

Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts.

☆128

Alternatives and similar repositories for Awesome-Mixture-of-Experts-Papers

Users that are interested in Awesome-Mixture-of-Experts-Papers are comparing it to the libraries listed below

Sorting:

CASE-Lab-UMD / LLM-Drop
The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".
☆174Updated 4 months ago
ZihanWang314 / CoE
Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models
☆219Updated last month
kyegomez / Mixture-of-Depths
Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
☆103Updated this week
multimodal-art-projection / LatentCoT-Horizon
📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.
☆171Updated last week
jxiw / MambaInLlama
[NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models
☆225Updated 3 months ago
astramind-ai / Mixture-of-depths
Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
☆167Updated last year
Dereck0602 / Awesome_Test_Time_LLMs
☆117Updated 4 months ago
facebookresearch / RAM
A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM).
☆265Updated last week
NVlabs / MaskLLM
[NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models
☆174Updated 7 months ago
efficientscaling / Z1
Repo for "Z1: Efficient Test-time Scaling with Code"
☆63Updated 3 months ago
itsnamgyu / block-transformer
Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
☆160Updated 3 months ago
MingLiiii / Layer_Gradient
[ACL'25 Oral] What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
☆71Updated last month
ruixin31 / Spurious_Rewards
☆323Updated last week
ypwang61 / One-Shot-RLVR
official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example”
☆337Updated last week
wuhy68 / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks (EMNLP'24)
☆146Updated 10 months ago
yihedeng9 / rlhf-summary-notes
A brief and partial summary of RLHF algorithms.
☆131Updated 5 months ago
hao-ai-lab / Dynasor
Simple extension on vLLM to help you speed up reasoning model without training.
☆172Updated 2 months ago
fangyuan-ksgk / Tiny-GRPO
minimal GRPO implementation from scratch
☆94Updated 4 months ago
sunblaze-ucb / Intuitor
Code for the paper: "Learning to Reason without External Rewards"
☆344Updated 3 weeks ago
CASE-Lab-UMD / Unified-MoE-Compression
The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques (TMLR)".
☆72Updated 4 months ago
metame-ai / awesome-llm-plaza
awesome llm plaza: daily tracking all sorts of awesome topics of llm, e.g. llm for coding, robotics, reasoning, multimod etc.
☆206Updated last week
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆234Updated 2 months ago
lucidrains / coconut-pytorch
Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch
☆178Updated last month
GeniusHTX / TALE
☆126Updated 2 months ago
wolfecameron / nanoMoE
An extension of the nanoGPT repository for training small MOE models.
☆164Updated 4 months ago
Lucky-Lance / Expert_Sparsity
[ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
☆95Updated last year
ScalingIntelligence / large_language_monkeys
☆101Updated 10 months ago
haonan3 / AnchorContext
AnchorAttention: Improved attention for LLMs long-context training
☆212Updated 6 months ago
tianyi-lab / MoE-Embedding
Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"
☆76Updated 9 months ago
FairyFali / SLMs-Survey
Survey of Small Language Models from Penn State, ...
☆187Updated 2 weeks ago