JUNJIE99 / VISTA_Evaluation_FineTuningLinks

Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original code and model can be accessed at FlagEmbedding.

☆43

Alternatives and similar repositories for VISTA_Evaluation_FineTuning

Users that are interested in VISTA_Evaluation_FineTuning are comparing it to the libraries listed below

Sorting:

DataArcTech / RagVL
Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …
☆86Updated last year
TIGER-AI-Lab / UniIR
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
☆167Updated last year
OpenGVLab / MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…
☆117Updated 11 months ago
LinWeizheDragon / FLMR
The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.
☆101Updated 5 months ago
mayubo2333 / MMLongBench-Doc
Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
☆105Updated last month
BAAI-DCAI / DataOptim
A collection of visual instruction tuning datasets.
☆76Updated last year
XMUDeepLIT / LLaVE
LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning
☆72Updated 5 months ago
friedrichor / UNITE
official code for "Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval"
☆38Updated 4 months ago
thunlp / Muffin
☆66Updated last year
zhourax / VEGA
☆37Updated last year
Liuziyu77 / MMDU
Official repository of MMDU dataset
☆97Updated last year
haon-chen / mmE5
☆56Updated 8 months ago
OpenMatch / UniVL-DR
[ICLR 2023] This is the code repo for our ICLR‘23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Spa…
☆53Updated last year
BUAADreamer / SPN4CIR
[ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives
☆39Updated 2 months ago
FreedomIntelligence / ALLaVA
Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model
☆276Updated last year
FuxiaoLiu / MMC
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
☆96Updated 10 months ago
HJYao00 / DenseConnector
【NeurIPS 2024】Dense Connector for MLLMs
☆180Updated last year
mynameischaos / Lion
Lion: Kindling Vision Intelligence within Large Language Models
☆51Updated last year
pengts / VW-LMM
☆25Updated last year
OpenGVLab / MMT-Bench
[ICML 2024] | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
☆115Updated last year
vlf-silkie / VLFeedback
☆100Updated last year
deepglint / UniME
[ACM MM 2025] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs"
☆95Updated last week
Code-kunkun / LamRA
[CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant
☆172Updated 4 months ago
zwq2018 / Multi-modal-Self-instruct
The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…
☆84Updated 9 months ago
X2FD / LVIS-INSTRUCT4V
☆133Updated last year
foundation-multimodal-models / CAPTURE
☆79Updated last year
palchenli / VL-Instruction-Tuning
☆91Updated last year
yhy-2000 / VideoDeepResearch
☆120Updated this week
jinbo0906 / Awesome-MLLM-Datasets
This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training …
☆59Updated 6 months ago
X-PLUG / mPLUG
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
☆96Updated 2 years ago