LINs-lab / DynMoE
[Preprint] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
☆59 · Updated 4 months ago
Alternatives and similar repositories for DynMoE:
Users interested in DynMoE are comparing it to the repositories listed below.
- Code release for VTW (AAAI 2025) ☆28 · Updated last month
- A Self-Training Framework for Vision-Language Reasoning ☆60 · Updated 2 months ago
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Infe… ☆88 · Updated 2 months ago
- ☆91 · Updated 6 months ago
- ☆121 · Updated 5 months ago
- [ATTRIB @ NeurIPS 2024 Oral] When Attention Sink Emerges in Language Models: An Empirical View ☆43 · Updated 3 months ago
- ✈️ Accelerating Vision Diffusion Transformers with Skip Branches. ☆59 · Updated last month
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆62 · Updated last month
- A paper list about Token Merge, Reduce, Resample, and Drop for MLLMs. ☆17 · Updated this week
- The official code of the paper "PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction". ☆50 · Updated last week
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆68 · Updated this week
- MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities. ☆78 · Updated 3 months ago
- SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration ☆31 · Updated last month
- A Survey on the Honesty of Large Language Models ☆51 · Updated last month
- 🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training ☆64 · Updated last month
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆30 · Updated 6 months ago
- Code for ACL 2024 "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning" ☆15 · Updated 8 months ago
- Official implementation of the paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal … ☆38 · Updated last month
- ☆57 · Updated 7 months ago
- LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models ☆109 · Updated 8 months ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context" ☆28 · Updated 6 months ago
- ☆36 · Updated 2 weeks ago
- MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation ☆32 · Updated last month
- M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆49 · Updated 3 weeks ago
- ☆32 · Updated this week
- ☆16 · Updated last month
- Code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization" ☆51 · Updated 4 months ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) ☆40 · Updated 2 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆55 · Updated 2 months ago