Touchstone: Evaluating Vision-Language Models by Language Models
☆83 Jan 18, 2024 Updated 2 years ago
Alternatives and similar repositories for TouchStone
Users that are interested in TouchStone are comparing it to the libraries listed below.
- OFA-Compress is a unified framework which provides OFA model finetuning, distillation and inference capabilities in Huggingface version, … ☆29 Sep 22, 2022 Updated 3 years ago
- ☆21 Oct 10, 2023 Updated 2 years ago
- ☆13 Nov 10, 2021 Updated 4 years ago
- MaXM is a suite of test-only benchmarks for multilingual visual question answering in 7 languages: English (en), French (fr), Hindi (hi),… ☆13 Jan 16, 2024 Updated 2 years ago
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models ☆37 Sep 19, 2023 Updated 2 years ago
- ☆51 Oct 29, 2023 Updated 2 years ago
- ☆134 Dec 22, 2023 Updated 2 years ago
- OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models ☆151 Jan 7, 2023 Updated 3 years ago
- (CVPR2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions. ☆364 Jan 14, 2025 Updated last year
- The official GitHub page for ''What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins… ☆19 Nov 10, 2023 Updated 2 years ago
- ☆812 Jul 8, 2024 Updated last year
- A collection of visual instruction tuning datasets. ☆77 Mar 14, 2024 Updated 2 years ago
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning ☆12 Aug 23, 2025 Updated 9 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆269 Sep 12, 2024 Updated last year
- Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone ☆131 Oct 10, 2023 Updated 2 years ago
- Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model ☆281 Jun 25, 2024 Updated last year
- The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models'' ☆262 Aug 21, 2025 Updated 9 months ago
- Pretrained Diffusion Models for Unified Human Motion Synthesis ☆19 Feb 28, 2023 Updated 3 years ago
- MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text. ☆954 Mar 19, 2025 Updated last year
- paper: https://arxiv.org/abs/2307.02469 page: https://lynx-llm.github.io/ ☆272 Aug 9, 2023 Updated 2 years ago
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning ☆297 Mar 13, 2024 Updated 2 years ago
- MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU ☆360 Dec 18, 2023 Updated 2 years ago
- [NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training" ☆320 Jun 3, 2024 Updated 2 years ago
- Dataset for the investigation of visual semiotics, and how specific visual features and design choices can elicit specific emotions, thou… ☆10 Dec 13, 2023 Updated 2 years ago
- ☆11 Aug 4, 2024 Updated last year
- Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing imag… ☆565 Apr 21, 2024 Updated 2 years ago
- Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence L… ☆2,561 Apr 24, 2024 Updated 2 years ago
- Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models" ☆48 Apr 3, 2025 Updated last year
- ☆12 Mar 12, 2023 Updated 3 years ago
- Official repository of MMDU dataset ☆105 Sep 29, 2024 Updated last year
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control" ☆53 Sep 21, 2023 Updated 2 years ago
- Reproduction of "RLCD Reinforcement Learning from Contrast Distillation for Language Model Alignment" ☆69 Aug 18, 2023 Updated 2 years ago
- Code for the paper titled "CiT Curation in Training for Effective Vision-Language Data". ☆77 Jan 18, 2023 Updated 3 years ago
- ☆90 Jul 4, 2024 Updated last year
- EVE Series: Encoder-Free Vision-Language Models from BAAI ☆369 Jul 24, 2025 Updated 10 months ago
- A subset of YFCC100M. Tools, checking scripts and links of web drive to download datasets (uncompressed). ☆19 Nov 13, 2024 Updated last year
- Code, data, models for the Sherlock corpus ☆62 Nov 11, 2022 Updated 3 years ago
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows ☆20 Nov 4, 2025 Updated 7 months ago
- METER: A Multimodal End-to-end TransformER Framework ☆377 Nov 16, 2022 Updated 3 years ago