TouchStone: Evaluating Vision-Language Models by Language Models
☆83Jan 18, 2024Updated 2 years ago
Alternatives and similar repositories for TouchStone
Users who are interested in TouchStone compare it to the repositories listed below
- OFA-Compress is a unified framework which provides OFA model finetuning, distillation and inference capabilities in Huggingface version, …☆29Sep 22, 2022Updated 3 years ago
- ☆21Oct 10, 2023Updated 2 years ago
- ☆14Nov 10, 2021Updated 4 years ago
- MaXM is a suite of test-only benchmarks for multilingual visual question answering in 7 languages: English (en), French (fr), Hindi (hi),…☆13Jan 16, 2024Updated 2 years ago
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models☆37Sep 19, 2023Updated 2 years ago
- ☆51Oct 29, 2023Updated 2 years ago
- ☆134Dec 22, 2023Updated 2 years ago
- OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models☆151Jan 7, 2023Updated 3 years ago
- (CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.☆361Jan 14, 2025Updated last year
- The official GitHub page for ''What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins…☆19Nov 10, 2023Updated 2 years ago
- see readme☆92May 9, 2022Updated 3 years ago
- ☆806Jul 8, 2024Updated last year
- A collection of visual instruction tuning datasets.☆77Mar 14, 2024Updated 2 years ago
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning☆12Aug 23, 2025Updated 6 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆270Sep 12, 2024Updated last year
- Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone☆130Oct 10, 2023Updated 2 years ago
- The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"☆253Aug 21, 2025Updated 7 months ago
- Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model☆281Jun 25, 2024Updated last year
- Pretrained Diffusion Models for Unified Human Motion Synthesis☆18Feb 28, 2023Updated 3 years ago
- MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.☆954Mar 19, 2025Updated last year
- paper: https://arxiv.org/abs/2307.02469 page: https://lynx-llm.github.io/☆270Aug 9, 2023Updated 2 years ago
- Natural Perturbation for Robust Question Answering☆12Apr 7, 2020Updated 5 years ago
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning☆296Mar 13, 2024Updated 2 years ago
- A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model To…☆1,065Oct 6, 2024Updated last year
- MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU☆360Dec 18, 2023Updated 2 years ago
- [NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training"☆319Jun 3, 2024Updated last year
- Dataset for the investigation of visual semiotics, and how specific visual features and design choices can elicit specific emotions, thou…☆10Dec 13, 2023Updated 2 years ago
- Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing imag…☆559Apr 21, 2024Updated last year
- Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence L…☆2,557Apr 24, 2024Updated last year
- ☆11Aug 4, 2024Updated last year
- Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"☆45Apr 3, 2025Updated 11 months ago
- ☆12Mar 12, 2023Updated 3 years ago
- Official repository of MMDU dataset☆104Sep 29, 2024Updated last year
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"☆53Sep 21, 2023Updated 2 years ago
- Reproduction of "RLCD Reinforcement Learning from Contrast Distillation for Language Model Alignment"☆69Aug 18, 2023Updated 2 years ago
- Code for the paper "CiT: Curation in Training for Effective Vision-Language Data".☆78Jan 18, 2023Updated 3 years ago
- ☆90Jul 4, 2024Updated last year
- EVE Series: Encoder-Free Vision-Language Models from BAAI☆368Jul 24, 2025Updated 7 months ago
- A subset of YFCC100M. Tools, checking scripts, and web-drive links for downloading the datasets (uncompressed).☆19Nov 13, 2024Updated last year