zhiqic / ChartReader
[ICCV 2023] ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules
☆25Updated last year
Alternatives and similar repositories for ChartReader
Users who are interested in ChartReader are comparing it to the libraries listed below.
- Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models, EMNLP 2023☆46Updated last year
- ☆67Updated last year
- ☆45Updated last year
- ☆81Updated last year
- ☆32Updated last year
- Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model"☆91Updated last year
- The WordScape repository contains code for the WordScape pipeline to create datasets to train document understanding models.☆37Updated last year
- Dataset and scripts for HRDoc☆38Updated 2 years ago
- E5-V: Universal Embeddings with Multimodal Large Language Models☆272Updated 10 months ago
- Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.☆133Updated 2 weeks ago
- Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023☆28Updated 2 years ago
- ☆143Updated 2 years ago
- A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo☆34Updated last year
- Context-Aware Chart Element Detection☆48Updated last month
- Dataset introduced in PlotQA: Reasoning over Scientific Plots☆80Updated 2 years ago
- InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024)☆158Updated last year
- [ACL'25 Main] ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation☆64Updated 3 months ago
- SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)☆98Updated 7 months ago
- ☆25Updated last year
- My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"☆72Updated 3 weeks ago
- [ACL 2024] ChartAssistant is a chart-based vision-language model for universal chart comprehension and reasoning.☆130Updated last year
- Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"☆268Updated last year
- [NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning☆96Updated 9 months ago
- ☆33Updated 6 months ago
- ☆224Updated 6 months ago
- Fully automated end-to-end framework to extract data from bar plots and other figures in scientific research papers using modules such as…☆121Updated 4 years ago
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".☆67Updated 6 months ago
- Official implementation for Dessurt: Document end-to-end self-supervised understanding and recognition transformer☆62Updated 2 years ago
- Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision)☆125Updated last year
- ☆117Updated last year