zengxingchen / ChartQA-MLLM
[IEEE VIS 2024] LLaVA-Chart: Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning
☆67 · Updated 3 months ago
Alternatives and similar repositories for ChartQA-MLLM:
Users interested in ChartQA-MLLM are comparing it to the repositories listed below.
- ☆86 · Updated 2 weeks ago
- [ICLR 2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding ☆75 · Updated 3 weeks ago
- Code & dataset for the paper "Distill Visual Chart Reasoning Ability from LLMs to MLLMs" ☆53 · Updated 5 months ago
- The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining" ☆152 · Updated last month
- The repo for the paper "Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining" ☆39 · Updated 4 months ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models ☆70 · Updated 5 months ago
- Evaluation framework for the paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆55 · Updated 6 months ago
- [CVPR 2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models ☆184 · Updated 3 weeks ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆73 · Updated last month
- [Preprint] A Generalizable and Purely Unsupervised Self-Training Framework ☆50 · Updated last week
- [ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision ☆60 · Updated 9 months ago
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration ☆30 · Updated 3 months ago
- Code for "Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models" ☆84 · Updated 9 months ago
- [ICLR 2025] Code for "MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks" ☆62 · Updated last week
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs ☆108 · Updated this week
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆42 · Updated 2 months ago
- ☆55 · Updated 2 months ago
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models ☆35 · Updated 6 months ago
- Resources for the paper "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆93 · Updated 6 months ago
- Code for the paper "Harnessing Webpage UIs for Text-Rich Visual Understanding" ☆50 · Updated 4 months ago
- The codebase for the EMNLP 2024 paper "Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…" ☆76 · Updated 3 months ago
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆35 · Updated 3 months ago
- MMR1: Advancing the Frontiers of Multimodal Reasoning ☆155 · Updated last month
- Official code implementation of "Slow Perception: Let's Perceive Geometric Figures Step-by-Step" ☆125 · Updated 2 months ago
- Code for "Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models" ☆203 · Updated 5 months ago
- [arXiv 2025] "CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought" ☆10 · Updated 3 weeks ago
- ☆54 · Updated last month
- [NeurIPS 2024] MATH-Vision dataset and code to measure multimodal mathematical reasoning capabilities ☆103 · Updated 2 weeks ago
- [TMLR] Public code repo for the paper "A Single Transformer for Scalable Vision-Language Modeling" ☆132 · Updated 5 months ago
- Official code of "Virgo: A Preliminary Exploration on Reproducing o1-like MLLM" ☆100 · Updated 2 months ago