ssyze / EVELinks

EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE

☆10

Alternatives and similar repositories for EVE

Users that are interested in EVE are comparing it to the libraries listed below

Sorting:

OpenGVLab / Multitask-Model-Selector
[NIPS2023]Implementation of Foundation Model is Efficient Multimodal Multitask Model Selector
☆37Updated last year
zycheiheihei / Transferable-Visual-Prompting
[CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt…
☆45Updated 10 months ago
HKUST-LongGroup / DyME
Empowering Small VLMs to Think with Dynamic Memorization and Exploration
☆15Updated last month
ImKeTT / ZeroGen
[NLPCC'23] ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles PyTorch Implementation
☆13Updated 2 years ago
OoDBag / VisTA
VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection
☆19Updated 5 months ago
THU-MIG / VTC-CLS
official repo for paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs"
☆23Updated 6 months ago
taolinzhang / BoostAdapter
[NeurIPS2024] An official pytorch implement of the paper: BoostAdapter: Improving Test-Time Adaptation via Regional Bootstrapping
☆17Updated 7 months ago
locuslab / llava-token-compression
☆44Updated last year
deeplearning-wisc / NSCL
Code for ICML 2023 paper "When and How Does Known Class Help Discover Unknown Ones? Provable Understandings Through Spectral Analysis"
☆13Updated 2 years ago
PKU-ML / adainf
Official code for ICLR 2024 paper "Do Generated Data Always Help Contrastive Learning?"
☆31Updated last year
csarron / PuMer
[ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
☆34Updated last year
OpenSparseLLMs / CLIP-MoE
CLIP-MoE: Mixture of Experts for CLIP
☆49Updated last year
sangminwoo / AvisC
[ACL 2025 Findings] Official pytorch implementation of "Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vis…
☆19Updated last year
waltonfuture / MM-UPT
[NeurIPS 2025] Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO
☆60Updated last week
nishadsinghi / CleanCLIP
Official PyTorch implementation of "CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning" @ ICCV 2023
☆38Updated 3 weeks ago
tripletclip / TripletCLIP
[NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"
☆44Updated 11 months ago
yuecao0119 / MMFuser
The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …
☆59Updated last year
shubhamprshr27 / NeglectedTailsVLM
This repository houses the code for the paper - "The Neglected of VLMs"
☆29Updated 6 months ago
lixinustc / GraphAdapter
The efficient tuning method for VLMs
☆80Updated last year
Yuqifan1117 / HalluciDoctor
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024)
☆49Updated last year
HenryHZY / VL-PET
[ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"
☆52Updated 2 years ago
G-JWLee / COINCIDE_code
☆20Updated last year
TencentARC / pi-Tuning
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
☆33Updated 2 years ago
chenshuang-zhang / imagenet_d
[CVPR 2024 Highlight] ImageNet-D
☆44Updated last year
RainBowLuoCS / DEEM
(ICLR 2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.
☆44Updated 4 months ago
YBZh / OpenOOD-VLM
ECCV24, NeurIPS24, Benchmarking Generalized Out-of-Distribution Detection with Vision-Language Models
☆28Updated 10 months ago
TencentARC / FLM
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
☆32Updated 2 years ago
iCGY96 / awesome_concept_learning_list
A curated list of papers & resources linked to concept learning
☆13Updated 2 years ago
adobe-research / llava-score
☆11Updated last year
wuxiyang1996 / AutoHallusion
AutoHallusion Codebase (EMNLP 2024)
☆21Updated 11 months ago