longrongyang/STGC

Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model

☆13

Alternatives and similar repositories for STGC

Users who are interested in STGC are comparing it to the repositories listed below.

tianyi-lab / R2-T2
[ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"
☆19 · Mar 10, 2025 · Updated last year
wangyccn / CR-AI-V1.5
CRAI is a multimodal large language model based on the Mixture of Experts (MoE) architecture, supporting text and image cross-modal tasks…
☆16 · Apr 29, 2025 · Updated last year
EmmaSRH / ARVFM
Awesome autoregressive vision foundation models
☆26 · Dec 24, 2024 · Updated last year
T-Lab-CUHKSZ / G2RPO-A
[ACL 2026] G2RPO-A: Guided Group Relative Policy Optimization with Adaptive Guidance
☆16 · May 20, 2026 · Updated 2 months ago
Bigyehahaha / M4
The code of 《M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis》
☆14 · Mar 31, 2025 · Updated last year
LINs-lab / DynMoE
[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
☆161 · Jul 9, 2025 · Updated last year
yushuiwx / MH-MoE
☆20 · Nov 5, 2024 · Updated last year
ttw1018 / MoPE-DST
The code for "MoPE: Mixture of Prefix Experts for Zero-Shot Dialogue State Tracking"
☆19 · Jan 25, 2025 · Updated last year
LINs-lab / ReLA
[NeurIPS 2024] Efficiency for Free: Ideal Data Are Transportable Representations
☆19 · Jan 19, 2025 · Updated last year
wrmedford / moe-scaling
Scaling Laws for Mixture of Experts Models
☆15 · Feb 25, 2025 · Updated last year
zdebruine / MMVAE
Mixture-of-Experts Multimodal Variational Autoencoder
☆15 · Jul 3, 2025 · Updated last year
LINs-lab / awesome_papers
☆20 · May 28, 2025 · Updated last year
Leey21 / A-Data-Centric-Study
☆18 · Mar 2, 2026 · Updated 4 months ago
TencentARC / SGAT4PASS
[IJCAI 2023] Official implementation of the paper SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation
☆37 · Jun 20, 2023 · Updated 3 years ago
WuTao-CS / CustomCrafter
[AAAI 2025] CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities
☆51 · Jan 12, 2025 · Updated last year
he-h / ST-MoE-BERT
This repository contains the code for the paper "ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mo…
☆16 · Feb 20, 2025 · Updated last year
rxtan2 / Koala-video-llm
☆37 · Sep 16, 2024 · Updated last year
ChenZiHong-Gavin / MoE-Visualizer
MoE-Visualizer is a tool designed to visualize the selection of experts in Mixture-of-Experts (MoE) models.
☆16 · Apr 8, 2025 · Updated last year
Taishi-N324 / Drop-Upcycling
[ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
☆25 · Oct 5, 2025 · Updated 9 months ago
pandeydeep9 / EvidentialResearch2023
Analysis of evidential models
☆15 · Jun 22, 2023 · Updated 3 years ago
The-Swarm-Corporation / Mamba-R1
Mamba R1 represents a novel architecture that combines the efficiency of Mamba's state space models with the scalability of Mixture of Ex…
☆25 · Oct 13, 2025 · Updated 9 months ago
lartpang / RunIt
A simple program scheduler for your code on different devices.
☆12 · Mar 8, 2026 · Updated 4 months ago
EchoPluto / MagicID
☆35 · Mar 18, 2025 · Updated last year
LINs-lab / FedBR
[ICML 2023] FedBR: Improving Federated Learning on Heterogeneous Data via Local Learning Bias Reduction
☆29 · Mar 7, 2024 · Updated 2 years ago
JohanChane / ranger-quit_cd_wd
Ranger plugin.
☆13 · Jun 6, 2024 · Updated 2 years ago
LINs-lab / GMem
[Preprint] GMem: A Modular Approach for Ultra-Efficient Generative Models
☆43 · Mar 11, 2025 · Updated last year
ysngki / UMoE
☆24 · Oct 22, 2025 · Updated 9 months ago
qizhou000 / LiveEdit
[CVPR 2025] Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts
☆25 · Jun 22, 2025 · Updated last year
lucabarsellotti / awesome-open-vocabulary-semantic-segmentation
☆15 · May 7, 2024 · Updated 2 years ago
LINs-lab / RCGM
[ICLR 2026] Any-step Generation via N-th Order Recursive Consistent Velocity Field Estimation
☆40 · Feb 4, 2026 · Updated 5 months ago
ivattyue / Ada-K
Official code for the ICLR 2025 paper, "Ada-K Routing: Boosting the Efficiency of MoE-based LLMs"
☆12 · Mar 1, 2025 · Updated last year
schrau24 / FlowProcessing
Flow Processing Tool (Matlab) for 4D flow MRI data
☆15 · Jun 10, 2026 · Updated last month
hanningzhang / ER-PRM
☆20 · Dec 14, 2024 · Updated last year
TinyTigerPan / BCKD
Official Implementation of Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection
☆51 · Oct 7, 2023 · Updated 2 years ago
Fsoft-AIC / LibMoE
[TMLR 2026] LibMoE: A LIBRARY FOR COMPREHENSIVE BENCHMARKING MIXTURE OF EXPERTS IN LARGE LANGUAGE MODELS
☆52 · May 26, 2026 · Updated 2 months ago
cviaai / AF-PLUS
Official MICCAI-2022 submission repository
☆12 · Jun 30, 2022 · Updated 4 years ago
kamanphoebe / Look-into-MoEs
[NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models
☆60 · Feb 7, 2025 · Updated last year
kyegomez / MHMoE
Community Implementation of the paper: "Multi-Head Mixture-of-Experts" In PyTorch
☆31 · Updated this week
koayon / awesome-adaptive-computation
A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).
☆163 · Jan 1, 2025 · Updated last year