[NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models
☆60 · Updated Feb 7, 2025
Alternatives and similar repositories for Look-into-MoEs
Users interested in Look-into-MoEs are comparing it to the libraries listed below.
- ☆91 (updated Aug 18, 2024)
- Code for the paper "Cottention: Linear Transformers With Cosine Attention" (☆20, updated Nov 15, 2025)
- Implementation for MomentumSMoE (☆19, updated Apr 19, 2025)
- Paper list for the paper "Authorship Attribution in the Era of Large Language Models: Problems, Methodologies, and Challenges (SIGKDD Exp…" (☆18, updated Dec 23, 2024)
- ☆16 (updated Jul 23, 2024)
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study (☆59, updated Nov 24, 2024)
- Open-source materials for the paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity" (☆30, updated Nov 12, 2024)
- ☆22 (updated Sep 2, 2025)
- 🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training (☆93, updated Dec 3, 2024)
- [NeurIPS 2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies, https://arxiv.org/abs/2407.13623 (☆89, updated Sep 26, 2024)
- MoE-Visualizer, a tool for visualizing which experts are selected in Mixture-of-Experts (MoE) models (☆16, updated Apr 8, 2025)
- LongAttn: Selecting Long-context Training Data via Token-level Attention (☆15, updated Jul 16, 2025)
- Improving the transparency of large language models' reasoning (☆14, updated Nov 25, 2025)
- Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training" (☆11, updated Oct 27, 2025)
- Data, code, and models for contextual noncompliance (☆25, updated Jul 18, 2024)
- Implementation for the paper "CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference" (☆35, updated Mar 6, 2025)
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding (☆13, updated Nov 19, 2024)
- ☆21 (updated Oct 22, 2025)
- [ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations (☆18, updated Oct 18, 2025)
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert…" (☆15, updated Feb 4, 2025)
- Code for the paper "Unraveling the Shift of Visual Information Flow in MLLMs: From Phased Interaction to Efficient Inference" (☆13, updated Jun 7, 2025)
- Codebase for character-centric story understanding (☆14, updated Jan 20, 2022)