[NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
☆84Dec 27, 2025Updated 4 months ago
Alternatives and similar repositories for MoME
Users that are interested in MoME are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Official repo of M$^2$PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning☆28Mar 23, 2025Updated last year
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation☆72Oct 17, 2025Updated 7 months ago
- CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts☆163Jun 8, 2024Updated last year
- [NeurIPS 2024] Efficient Large Multi-modal Models via Visual Context Compression☆67Feb 19, 2025Updated last year
- A large scale inpainting & t2i anime image dataset☆16Oct 18, 2025Updated 7 months ago
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- Multimodal Instruction Tuning with Conditional Mixture of LoRA (ACL 2024)☆32Aug 9, 2024Updated last year
- ☆18Aug 7, 2024Updated last year
- CLIP-MoE: Mixture of Experts for CLIP☆58Oct 10, 2024Updated last year
- Official code repository of Shuffle-R1☆26Feb 23, 2026Updated 2 months ago
- 【NeurIPS 2024】Dense Connector for MLLMs☆183Oct 14, 2024Updated last year
- Progressive Language-guided Visual Learning for Multi-Task Visual Grounding☆13May 9, 2025Updated last year
- [CVPR 2025] Implementation of "Forensics-Bench: A Comprehensive Forgery Detection Benchmark Suite for Large Vision Language Models"☆40Apr 28, 2025Updated last year
- ☆17Dec 1, 2023Updated 2 years ago
- Official Implementation of VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Jo…☆23Jun 27, 2025Updated 10 months ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- AGIQA-1k-Database for AI Generated Content Image Quality Assessment☆29May 1, 2023Updated 3 years ago
- This is for ACL 2025 Findings Paper: From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalitiesModels☆98Mar 22, 2026Updated last month
- Rui Qian, Xin Yin, Chuanhang Deng, et al.: UGround: Towards Unified Visual Grounding with Unrolled Transformers (ICML 2026)☆24May 8, 2026Updated last week
- LVAS-Agent Code Base☆21Apr 15, 2025Updated last year
- ☆47Nov 8, 2024Updated last year
- [ACM MM 2025] DFBench: Benchmarking Deepfake Image Detection Capability of Large Multimodal Models☆25Aug 6, 2025Updated 9 months ago
- The implementation codes of paper: Multimodal Sentiment Analysis with Mutual Information-based Disentangled Representation Learning☆21May 8, 2025Updated last year
- Matryoshka Multimodal Models☆123Jan 22, 2025Updated last year
- [ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"☆46Oct 9, 2025Updated 7 months ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- [ICASSP2024] Code for paper "SDIF-DA: A Shallow-to-Deep Interaction Framework with Data Augmentation for Multi-modal Intent Detection"☆15Jul 6, 2024Updated last year
- A unified framework for controllable caption generation across images, videos, and audio. Supports multi-modal inputs and customizable ca…☆54Jul 24, 2025Updated 9 months ago
- [SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…☆63Nov 7, 2024Updated last year
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024)☆52Jul 16, 2024Updated last year
- (TIP 2024) Towards Robust Referring Image Segmentation☆37Mar 2, 2024Updated 2 years ago
- Data-Independent Operator: A Training-Free Artifact Representation Extractor for Generalizable Deepfake Detection☆18Mar 19, 2024Updated 2 years ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆52Oct 19, 2024Updated last year
- [NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation☆33Mar 16, 2024Updated 2 years ago
- [ICML‘25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an…☆13Apr 17, 2025Updated last year
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- ☆23May 8, 2025Updated last year
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.☆86Nov 10, 2024Updated last year
- Official Pytorch implementation of 'Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning'? (ICLR2024)☆13Mar 8, 2024Updated 2 years ago
- The released data for the paper entilted "FakeBench: Probing Explainable Fake Image Detection via Large Multimodal Models"☆54Jul 28, 2025Updated 9 months ago
- The offical implementation of 'FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant'☆49Nov 22, 2024Updated last year
- code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"☆63Aug 23, 2024Updated last year
- [NeurIPS 2024] Matryoshka Query Transformer for Large Vision-Language Models☆124Jul 1, 2024Updated last year