Touchstone: Evaluating Vision-Language Models by Language Models
☆83Jan 18, 2024Updated 2 years ago
Alternatives and similar repositories for TouchStone
Users that are interested in TouchStone are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆21Oct 10, 2023Updated 2 years ago
- OFA-Compress is a unified framework which provides OFA model finetuning, distillation and inference capabilities in Huggingface version, …☆29Sep 22, 2022Updated 3 years ago
- ☆13Nov 10, 2021Updated 4 years ago
- MaXM is a suite of test-only benchmarks for multilingual visual question answering in 7 languages: English (en), French (fr), Hindi (hi),…☆13Jan 16, 2024Updated 2 years ago
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models☆37Sep 19, 2023Updated 2 years ago
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- ☆51Oct 29, 2023Updated 2 years ago
- ☆134Dec 22, 2023Updated 2 years ago
- OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models☆151Jan 7, 2023Updated 3 years ago
- (CVPR2024)A benchmark for evaluating Multimodal LLMs using multiple-choice questions.☆363Jan 14, 2025Updated last year
- The official GitHub page for ''What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins…☆19Nov 10, 2023Updated 2 years ago
- see readme☆92May 9, 2022Updated 3 years ago
- ☆808Jul 8, 2024Updated last year
- A collection of visual instruction tuning datasets.☆77Mar 14, 2024Updated 2 years ago
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning☆12Aug 23, 2025Updated 7 months ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆270Sep 12, 2024Updated last year
- Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone☆130Oct 10, 2023Updated 2 years ago
- The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models''☆255Aug 21, 2025Updated 7 months ago
- Pretrained Diffusion Models for Unified Human Motion Synthesis☆18Feb 28, 2023Updated 3 years ago
- MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.☆954Mar 19, 2025Updated last year
- paper: https://arxiv.org/abs/2307.02469 page: https://lynx-llm.github.io/☆270Aug 9, 2023Updated 2 years ago
- Natural Perturbation for Robust Question Answering☆12Apr 7, 2020Updated 6 years ago
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning☆296Mar 13, 2024Updated 2 years ago
- A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model To…☆1,062Oct 6, 2024Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU☆360Dec 18, 2023Updated 2 years ago
- [NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training"☆319Jun 3, 2024Updated last year
- Dataset for the investigation of visual semiotics, and how specific visual features and design choices can elicit specific emotions, thou…☆10Dec 13, 2023Updated 2 years ago
- Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing imag…☆560Apr 21, 2024Updated last year
- ☆11Aug 4, 2024Updated last year
- Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence L…☆2,558Apr 24, 2024Updated last year
- Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"☆46Apr 3, 2025Updated last year
- Official repository of MMDU dataset☆106Sep 29, 2024Updated last year
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"☆53Sep 21, 2023Updated 2 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- Reproduction of "RLCD Reinforcement Learning from Contrast Distillation for Language Model Alignment☆69Aug 18, 2023Updated 2 years ago
- Code for the paper titled "CiT Curation in Training for Effective Vision-Language Data".☆77Jan 18, 2023Updated 3 years ago
- ☆90Jul 4, 2024Updated last year
- EVE Series: Encoder-Free Vision-Language Models from BAAI☆368Jul 24, 2025Updated 8 months ago
- A subset of YFCC100M. Tools, checking scripts and links of web drive to download datasets(uncompressed).☆19Nov 13, 2024Updated last year
- Code, data, models for the Sherlock corpus☆61Nov 11, 2022Updated 3 years ago
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows☆19Nov 4, 2025Updated 5 months ago