OFA-Sys / TouchStoneLinks

Touchstone: Evaluating Vision-Language Models by Language Models

☆83

Alternatives and similar repositories for TouchStone

Users that are interested in TouchStone are comparing it to the libraries listed below

Sorting:

mlfoundations / VisIT-Bench
☆50Updated 2 years ago
X2FD / LVIS-INSTRUCT4V
☆133Updated last year
thunlp / Muffin
☆66Updated last year
vlf-silkie / VLFeedback
☆100Updated last year
palchenli / VL-Instruction-Tuning
☆91Updated 2 years ago
HYPJUDY / Sparkles
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
☆44Updated last year
OpenGVLab / MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…
☆117Updated last year
FreedomIntelligence / MLLM-Bench
MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria
☆72Updated last year
BAAI-DCAI / Visual-Instruction-Tuning
SVIT: Scaling up Visual Instruction Tuning
☆164Updated last year
YujieLu10 / LLMScore
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
☆134Updated 2 years ago
icoz69 / StableLLAVA
Official repo for StableLLAVA
☆95Updated last year
zwq2018 / Multi-modal-Self-instruct
The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…
☆84Updated 10 months ago
xiangyu-mm / EasyGen
The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"
☆74Updated last year
yuecao0119 / MMInstruct
[SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…
☆59Updated last year
BAAI-DCAI / DataOptim
A collection of visual instruction tuning datasets.
☆76Updated last year
FudanNLPLAB / MouSi
☆75Updated last year
FreedomIntelligence / ALLaVA
Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model
☆277Updated last year
TencentARC / GVT
Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".
☆58Updated 2 years ago
SparksJoe / Prism
A Framework for Decoupling and Assessing the Capabilities of VLMs
☆43Updated last year
scenarios / WeMM
☆87Updated last year
MAmmoTH-VL / MAmmoTH-VL
(ACL 2025) MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
☆48Updated 5 months ago
Yangyi-Chen / SOLO
[TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling"
☆147Updated last year
TIGER-AI-Lab / Mantis
Official code for Paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024]
☆231Updated 8 months ago
Liuziyu77 / MMDU
Official repository of MMDU dataset
☆98Updated last year
pipilurj / G-LLaVA
Official github repo of G-LLaVA
☆148Updated 9 months ago
OpenGVLab / MMT-Bench
[ICML 2024] | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
☆115Updated last year
baaivision / CapsFusion
[CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale
☆212Updated last year
Victorwz / MLM_Filter
Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".
☆68Updated 7 months ago
yuweihao / MM-Vet
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)
☆317Updated 10 months ago
BAAI-DCAI / Dataset-Pruning
Dataset pruning for ImageNet and LAION-2B.
☆79Updated last year