Touchstone: Evaluating Vision-Language Models by Language Models
☆83Jan 18, 2024Updated 2 years ago
Alternatives and similar repositories for TouchStone
Users that are interested in TouchStone are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- OFA-Compress is a unified framework which provides OFA model finetuning, distillation and inference capabilities in Huggingface version, …☆29Sep 22, 2022Updated 3 years ago
- ☆21Oct 10, 2023Updated 2 years ago
- ☆13Nov 10, 2021Updated 4 years ago
- MaXM is a suite of test-only benchmarks for multilingual visual question answering in 7 languages: English (en), French (fr), Hindi (hi),…☆13Jan 16, 2024Updated 2 years ago
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models☆37Sep 19, 2023Updated 2 years ago
- GPUs on demand by Runpod - Special Offer Available • AdRun AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
- ☆51Oct 29, 2023Updated 2 years ago
- ☆134Dec 22, 2023Updated 2 years ago
- OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models☆151Jan 7, 2023Updated 3 years ago
- (CVPR2024)A benchmark for evaluating Multimodal LLMs using multiple-choice questions.☆364Jan 14, 2025Updated last year
- The official GitHub page for ''What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins…☆19Nov 10, 2023Updated 2 years ago
- ☆813Jul 8, 2024Updated last year
- A collection of visual instruction tuning datasets.☆77Mar 14, 2024Updated 2 years ago
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning☆12Aug 23, 2025Updated 10 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆269Sep 12, 2024Updated last year
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone☆131Oct 10, 2023Updated 2 years ago
- Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model☆281Jun 25, 2024Updated 2 years ago
- The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models''☆263Aug 21, 2025Updated 10 months ago
- MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.☆954Mar 19, 2025Updated last year
- paper: https://arxiv.org/abs/2307.02469 page: https://lynx-llm.github.io/☆272Aug 9, 2023Updated 2 years ago
- Natural Perturbation for Robust Question Answering☆12Apr 7, 2020Updated 6 years ago
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning☆297Mar 13, 2024Updated 2 years ago
- A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model To…☆1,063Oct 6, 2024Updated last year
- MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU☆361Dec 18, 2023Updated 2 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- [NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training"☆321Jun 3, 2024Updated 2 years ago
- Dataset for the investigation of visual semiotics, and how specific visual features and design choices can elicit specific emotions, thou…☆10Dec 13, 2023Updated 2 years ago
- ☆11Aug 4, 2024Updated last year
- Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing imag…☆566Apr 21, 2024Updated 2 years ago
- Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence L…☆2,558Apr 24, 2024Updated 2 years ago
- ☆12Mar 12, 2023Updated 3 years ago
- Official repository of MMDU dataset☆107Sep 29, 2024Updated last year
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"☆53Sep 21, 2023Updated 2 years ago
- Code for the paper titled "CiT Curation in Training for Effective Vision-Language Data".☆78Jan 18, 2023Updated 3 years ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- ☆90Jul 4, 2024Updated last year
- EVE Series: Encoder-Free Vision-Language Models from BAAI☆372Jul 24, 2025Updated 11 months ago
- A subset of YFCC100M. Tools, checking scripts and links of web drive to download datasets(uncompressed).☆19Nov 13, 2024Updated last year
- Code, data, models for the Sherlock corpus☆62Nov 11, 2022Updated 3 years ago
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows☆20Nov 4, 2025Updated 7 months ago
- METER: A Multimodal End-to-end TransformER Framework☆377Nov 16, 2022Updated 3 years ago
- [CVPR2023] The code for 《Position-guided Text Prompt for Vision-Language Pre-training》☆150Jun 7, 2023Updated 3 years ago