inclusionAI / MoBE
Mixture-of-Basis-Experts for Compressing MoE-based LLMs
☆23 Updated 3 months ago
Alternatives and similar repositories for MoBE
Users who are interested in MoBE are comparing it to the libraries listed below.
- A comprehensive and efficient long-context model evaluation framework ☆27 Updated 2 weeks ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper ☆33 Updated last year
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution ☆49 Updated this week
- ☆16 Updated last year
- The code for "MoPE: Mixture of Prefix Experts for Zero-Shot Dialogue State Tracking" ☆18 Updated 10 months ago
- FuseAI Project ☆87 Updated 10 months ago
- Code for paper: Long cOntext aliGnment via efficient preference Optimization ☆23 Updated 2 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆31 Updated 4 months ago
- ☆19 Updated 11 months ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models ☆16 Updated last month
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆80 Updated 2 months ago
- ☆19 Updated 9 months ago
- The paper list of multilingual pre-trained models (Continual Updated). ☆23 Updated last year
- ☆21 Updated 3 weeks ago
- Control LLM ☆20 Updated 8 months ago
- Codebase for Instruction Following without Instruction Tuning ☆36 Updated last year
- ☆13 Updated 10 months ago
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation ☆16 Updated last year
- DCPO: Dynamic Adaptive Clipping for RL ☆44 Updated 2 months ago
- [NeurIPS 2025] Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning" ☆25 Updated last month
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 Updated 2 months ago
- ☆24 Updated last year
- ☆16 Updated last year
- Official Implementation of APB (ACL 2025 main Oral) ☆32 Updated 9 months ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆55 Updated 10 months ago
- ☆14 Updated 10 months ago
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025). ☆12 Updated 2 months ago
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing" ☆18 Updated 8 months ago
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization ☆12 Updated 10 months ago
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment ☆21 Updated last year