Alpha-Innovator / SimChart9K
A simulated dataset of 9,536 charts with associated data annotations in CSV format.
☆21 Updated 10 months ago
Alternatives and similar repositories for SimChart9K:
Users who are interested in SimChart9K are comparing it to the libraries listed below.
- A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo ☆32 Updated 5 months ago
- ☆87 Updated last year
- ☆94 Updated last year
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo… ☆67 Updated last month
- The official GitHub page for "What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins… ☆18 Updated last year
- The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity". Th… ☆42 Updated 2 months ago
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of… ☆109 Updated last month
- Official code for "What Makes for Good Visual Tokenizers for Large Language Models?". ☆56 Updated last year
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua… ☆50 Updated last month
- Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations ☆63 Updated 6 months ago
- Official repository of MMDU dataset ☆80 Updated 3 months ago
- ☆17 Updated 10 months ago
- MLLM-DataEngine: An Iterative Refinement Approach for MLLM ☆42 Updated 7 months ago
- ☆24 Updated 8 months ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”. ☆38 Updated 3 months ago
- A huge dataset for Document Visual Question Answering ☆15 Updated 5 months ago
- VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models ☆40 Updated 6 months ago
- A subset of YFCC100M. Tools, checking scripts, and web drive links for downloading the datasets (uncompressed). ☆18 Updated 2 months ago
- A collection of visual instruction tuning datasets. ☆76 Updated 10 months ago
- [ACL 2024] ChartAssistant is a chart-based vision-language model for universal chart comprehension and reasoning. ☆110 Updated 4 months ago
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning ☆45 Updated 8 months ago
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models ☆56 Updated 2 months ago
- Dataset pruning for ImageNet and LAION-2B. ☆69 Updated 6 months ago
- Touchstone: Evaluating Vision-Language Models by Language Models ☆80 Updated last year
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆62 Updated 7 months ago
- [NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning ☆89 Updated last week
- ☆19 Updated last year
- MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities. ☆78 Updated 3 months ago
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization ☆77 Updated 11 months ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆68 Updated this week