Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts)
☆31 · Aug 4, 2024 · Updated last year
Alternatives and similar repositories for RMoE
Users that are interested in RMoE are comparing it to the libraries listed below.
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing" ☆20 · Apr 9, 2025 · Updated last year
- [NeurIPS 2024] Efficiency for Free: Ideal Data Are Transportable Representations ☆19 · Jan 19, 2025 · Updated last year
- Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024] ☆39 · May 28, 2024 · Updated last year
- ☆10 · Mar 18, 2025 · Updated last year
- ☆29 · May 24, 2024 · Updated 2 years ago
- ☆20 · May 28, 2025 · Updated 11 months ago
- ☆15 · Jul 25, 2024 · Updated last year
- The source code for running LLMs on the AAAR-1.0 benchmark. ☆18 · Apr 5, 2025 · Updated last year
- ☆92 · Aug 18, 2024 · Updated last year
- UFT: Unifying Supervised and Reinforcement Fine-Tuning ☆30 · Jun 30, 2025 · Updated 10 months ago
- [NeurIPS 2025] Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM ☆24 · Feb 10, 2026 · Updated 3 months ago
- MMoE: Multimodal Mixture-of-Experts (EMNLP 2024) ☆16 · Nov 14, 2024 · Updated last year
- The official implementation of "Routing Experts: Learning to Route Dynamic Experts in Existing Multi-modal Large Language Models" ☆17 · Mar 24, 2025 · Updated last year
- MultimodalSDK provides tools to easily apply machine learning algorithms on well-known affective computing datasets such as CMU-MOSI, CMU… ☆15 · Jan 18, 2018 · Updated 8 years ago
- Code and data for the paper "Steering Conversational Large Language Models for Long Emotional Support Conversations" along with a UI to v… ☆15 · Apr 14, 2025 · Updated last year
- [ACM Multimedia 2021] Spatiotemporal Inconsistency Learning for DeepFake Video Detection ☆11 · Jul 13, 2023 · Updated 2 years ago
- Official code for the ICLR 2025 paper, "Ada-K Routing: Boosting the Efficiency of MoE-based LLMs" ☆12 · Mar 1, 2025 · Updated last year
- [ICCV 2025] Dynamic-VLM ☆28 · Dec 16, 2024 · Updated last year
- Semantic-Geometric-Physical-Driven Robot Manipulation Skill Transfer via Skill Library and Tactile Representation ☆16 · Mar 31, 2026 · Updated last month
- This is a PyTorch implementation of "Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection" accepted by ACM MM…