[ICML 2025 Oral] Mixture of Lookup Experts
☆72Dec 3, 2025Updated 3 months ago
Alternatives and similar repositories for MoLE
Users who are interested in MoLE are comparing it to the repositories listed below
- The code of "M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis"☆14Mar 31, 2025Updated 11 months ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 7 months ago
- MoE-Visualizer is a tool designed to visualize the selection of experts in Mixture-of-Experts (MoE) models.☆16Apr 8, 2025Updated 10 months ago
- LCA-on-the-line (ICML 2024 Oral)☆13Feb 13, 2025Updated last year
- ☆21Oct 22, 2025Updated 4 months ago
- [ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.☆105Dec 20, 2024Updated last year
- ☆21Dec 11, 2024Updated last year
- Implementation and checkpoints of Imagen, Google's text-to-image synthesis neural network, in Pytorch☆17Dec 22, 2022Updated 3 years ago
- ☆129Jun 6, 2025Updated 9 months ago
- Research work aimed at addressing the problem of modeling infinite-length context☆46Dec 18, 2025Updated 2 months ago
- The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Ch…☆28Jul 15, 2025Updated 7 months ago
- Here are some of the results of my experiments applying Deep Learning for object detection.☆19Sep 24, 2021Updated 4 years ago
- Official Implementation of FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acceleration☆29Nov 22, 2025Updated 3 months ago
- ☆32Nov 4, 2024Updated last year
- The official implementation of paper "Multi-Outputs Is All You Need For Deblur"☆22Aug 16, 2022Updated 3 years ago
- [CoLM 24] Official Repository of MambaByte: Token-free Selective State Space Model☆24Oct 12, 2024Updated last year
- [SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…☆62Nov 7, 2024Updated last year
- ☆64Jan 12, 2026Updated last month
- ☆23Dec 12, 2024Updated last year
- Source code of the paper "KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing"☆31Oct 24, 2024Updated last year
- Implementation of DropCov as described in DropCov: A Simple yet Effective Method for Improving Deep Architectures☆10Oct 15, 2022Updated 3 years ago
- Source code of the paper "An efficient implementation for solving the all pairs minimax path problem in an undirected dense graph."☆16Dec 3, 2025Updated 3 months ago
- Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continued pre-training to improve …☆37May 31, 2025Updated 9 months ago
- Description and applications of OpenAI's paper about DALL-E (2021) and implementation of other (CLIP-guided) zero-shot text-to-image gene…☆33Aug 11, 2022Updated 3 years ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆32May 29, 2024Updated last year
- Official implementation of "Mixture of Experts Meets Prompt-Based Continual Learning" (NeurIPS 2024)☆42Aug 1, 2025Updated 7 months ago
- The evaluation framework for training-free sparse attention in LLMs☆121Jan 27, 2026Updated last month
- ☆11May 3, 2022Updated 3 years ago
- CRAI is a multimodal large language model based on the Mixture of Experts (MoE) architecture, supporting text and image cross-modal tasks…☆16Apr 29, 2025Updated 10 months ago
- A Two-stage Network for Image Dehazing with Multi-scale Fusion and Adaptive Learning☆10Apr 23, 2024Updated last year
- [TIP2025] The implementation of "Uncertainty Guided Refinement for Fine-grained Salient Object Detection"☆16Apr 20, 2025Updated 10 months ago
- 🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training☆91Dec 3, 2024Updated last year
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆133Apr 12, 2025Updated 10 months ago
- LED: Light Enhanced Depth Estimation at Night☆13Dec 9, 2025Updated 2 months ago
- Recognition and localization of moving targets with OpenCV + Python (example: a yellow ball)☆10Sep 25, 2022Updated 3 years ago
- Data Programming for Text Detection in Documents using SPEAR☆12Mar 26, 2025Updated 11 months ago
- ☆11Aug 20, 2025Updated 6 months ago
- GUI design of an image-processing application for Fuzhou University's "Matlab Practice" course☆13Jul 20, 2021Updated 4 years ago
- dynamic planning, hybrid models, hierarchical active inference, tool use☆13Jun 13, 2025Updated 8 months ago