jie040109 / MLAE
The official PyTorch implementation of the paper "MLAE: Masked LoRA Experts for Visual Parameter-Efficient Fine-Tuning"
☆28Updated 7 months ago
Alternatives and similar repositories for MLAE
Users who are interested in MLAE are comparing it to the libraries listed below
- CLIP-MoE: Mixture of Experts for CLIP☆42Updated 9 months ago
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio…☆39Updated 3 months ago
- A training-free approach to accelerate ViTs and VLMs by pruning redundant tokens based on similarity☆29Updated last month
- GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)☆63Updated last month
- Recent Advances on MLLM's Reasoning Ability☆24Updated 3 months ago
- [NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models☆68Updated 2 months ago
- Adapting LLaMA Decoder to Vision Transformer☆28Updated last year
- ☆16Updated 8 months ago
- This repo contains the source code for VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks (NeurIPS 2024).☆39Updated 9 months ago
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning☆49Updated last year
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …☆56Updated 8 months ago
- ☆15Updated 8 months ago
- [ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.☆33Updated 6 months ago
- [ICLR 2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models☆37Updated 5 months ago
- Official PyTorch implementation of RACRO (https://www.arxiv.org/abs/2506.04559)☆15Updated 2 weeks ago
- ☆53Updated 2 months ago
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models☆32Updated 9 months ago
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding☆65Updated last month
- Official implementation of MC-LLaVA.☆32Updated last month
- Official implementation of "Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology"☆29Updated last week
- ☆24Updated last week
- ☆12Updated 9 months ago
- AdaMoLE: Adaptive Mixture of LoRA Experts☆33Updated 9 months ago
- Code release for VTW (AAAI 2025) Oral☆44Updated 6 months ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"☆80Updated last year
- 🚀 Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models☆24Updated last month
- (CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction☆115Updated 4 months ago
- [ICLR 2025] The official PyTorch implementation of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Cont…☆46Updated 7 months ago
- ☆89Updated 3 months ago
- [ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning"☆32Updated 3 weeks ago