Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts)
☆30Aug 4, 2024Updated last year
Alternatives and similar repositories for RMoE
Users that are interested in RMoE are comparing it to the libraries listed below.
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"☆20Apr 9, 2025Updated last year
- [NeurIPS 2024] Efficiency for Free: Ideal Data Are Transportable Representations☆19Jan 19, 2025Updated last year
- Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024]☆39May 28, 2024Updated last year
- ✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM☆11Jun 16, 2025Updated 10 months ago
- The source code for running LLMs on the AAAR-1.0 benchmark.☆18Apr 5, 2025Updated last year
- UFT: Unifying Supervised and Reinforcement Fine-Tuning☆29Jun 30, 2025Updated 10 months ago
- [NeurIPS 2025] Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM☆24Feb 10, 2026Updated 2 months ago
- MMoE: Multimodal Mixture-of-Experts (EMNLP 2024)☆15Nov 14, 2024Updated last year
- The official implementation of "Routing Experts: Learning to Route Dynamic Experts in Existing Multi-modal Large Language Models"☆17Mar 24, 2025Updated last year
- Code and data for the paper "Steering Conversational Large Language Models for Long Emotional Support Conversations" along with a UI to v…☆15Apr 14, 2025Updated last year
- Magic ELF: Image Deraining Meets Association Learning and Transformer☆14Sep 13, 2023Updated 2 years ago
- [ICCV 2025] Dynamic-VLM☆28Dec 16, 2024Updated last year
- Implementation of BitNet-1.58 instruct tuning☆27Apr 14, 2024Updated 2 years ago
- Breast tumor segmentation and shape classification in mammograms using generative adversarial and convolutional neural network☆13Jul 30, 2021Updated 4 years ago
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data.☆21Mar 23, 2026Updated last month
- ImageNet training code of Res2Net☆15Nov 2, 2020Updated 5 years ago
- A suite of multimodal language models that are powerful and efficient☆19Jan 13, 2025Updated last year
- Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model☆13Feb 11, 2025Updated last year
- The official repo for "Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation", CoRL 2024 (ORAL)☆20Jun 25, 2025Updated 10 months ago
- This repository contains 2 tools: (1) a py3 lib for NLP & image-caption metrics; (2) code for a two-tailed t-test with paired samples. It wil…☆18Apr 4, 2021Updated 5 years ago
- Single Image Deraining via Recurrent Hierarchy Enhancement Network (ACM'MM2019)☆18Dec 28, 2019Updated 6 years ago
- [ACL 2026 Main] Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis☆38Apr 24, 2026Updated last week
- Attention-guided dense-upsampling networks for breast mass segmentation in whole mammograms☆12Oct 9, 2019Updated 6 years ago
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert…☆16Feb 4, 2025Updated last year
- Understanding deep networks and large models.☆28Jan 23, 2026Updated 3 months ago
- [ICLR 2026] Any-step Generation via N-th Order Recursive Consistent Velocity Field Estimation☆35Feb 4, 2026Updated 3 months ago
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens☆25Nov 6, 2023Updated 2 years ago
- Moving vehicle detection☆12Jul 28, 2018Updated 7 years ago
- [ACM MM 2025] Phys4DGen: Physics-Compliant 4D Generation with Multi-Material Composition Perception☆12Apr 18, 2026Updated 2 weeks ago
- [WACV2023] This is the official PyTorch implementation of our paper "[Rethinking Rotation in Self-Supervised Contrastive Learning: Adapt…☆12Feb 24, 2023Updated 3 years ago
- Pytorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24)☆63Apr 18, 2024Updated 2 years ago