OpenGVLab / Multitask-Model-Selector

[NIPS2023] Implementation of Foundation Model is Efficient Multimodal Multitask Model Selector

☆37

Alternatives and similar repositories for Multitask-Model-Selector

Users who are interested in Multitask-Model-Selector are comparing it to the repositories listed below.

OpenGVLab / MMIU
[ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
☆90 Updated last year
chenshuang-zhang / imagenet_d
[CVPR 2024 Highlight] ImageNet-D
☆46 Updated last year
wjpoom / SPEC
[CVPR 2024] The official implementation of paper "synthesize, diagnose, and optimize: towards fine-grained vision-language understanding"
☆49 Updated 5 months ago
iancovert / locality-alignment
☆53 Updated 10 months ago
AtsuMiyai / UPD
[ACL2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models
☆79 Updated 6 months ago
m1k2zoo / negbench
Evaluation and dataset construction code for the CVPR 2025 paper "Vision-Language Models Do Not Understand Negation"
☆42 Updated 7 months ago
tripletclip / TripletCLIP
[NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"
☆46 Updated last year
Adamdad / Repfusion
☆57 Updated 2 years ago
locuslab / llava-token-compression
☆45 Updated last year
ggjy / vision_weak_to_strong
☆38 Updated last year
kxfan2002 / SophiaVL-R1
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
☆87 Updated 4 months ago
si0wang / ViCrit
☆24 Updated 5 months ago
sjz5202 / LLaVA-Reward
Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
☆22 Updated 4 months ago
OpenSparseLLMs / CLIP-MoE
CLIP-MoE: Mixture of Experts for CLIP
☆49 Updated last year
RainBowLuoCS / DEEM
(ICLR 2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.
☆44 Updated 5 months ago
lixinustc / GraphAdapter
The efficient tuning method for VLMs
☆80 Updated last year
HKUST-LongGroup / DyME
Empowering Small VLMs to Think with Dynamic Memorization and Exploration
☆15 Updated 3 weeks ago
THU-MIG / VTC-CLS
Official repo for the paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs"
☆23 Updated 7 months ago
TencentARC / pi-Tuning
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
☆33 Updated 2 years ago
zycheiheihei / Transferable-Visual-Prompting
[CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt…
☆46 Updated 11 months ago
liuzhuang13 / bias
☆113 Updated last year
techmonsterwang / iLLaMA
Adapting LLaMA Decoder to Vision Transformer
☆30 Updated last year
waltonfuture / MM-UPT
[NeurIPS 2025] Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO
☆69 Updated last month
sterzhang / PVIT
Official Repository of Personalized Visual Instruct Tuning
☆33 Updated 9 months ago
inclusionAI / M2-Reasoning
M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning
☆46 Updated 4 months ago
OoDBag / VisTA
VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection
☆20 Updated 6 months ago
ML-GSAI / DPT
Official PyTorch implementation for "Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels"
☆96 Updated last year
yuecao0119 / MMInstruct
[SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…
☆60 Updated last year
FreedomIntelligence / TRIM
We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…
☆19 Updated last year
YU-deep / VisMem
☆43 Updated 3 weeks ago