A huge dataset for Document Visual Question Answering
☆20Jul 29, 2024Updated last year
Alternatives and similar repositories for docmatix
Users that are interested in docmatix are comparing it to the libraries listed below
Sorting:
- ☆12Jun 20, 2023Updated 2 years ago
- The proposed simulated dataset consisting of 9,536 charts and associated data annotations in CSV format.☆26Feb 22, 2024Updated 2 years ago
- ☆45Jul 18, 2022Updated 3 years ago
- MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension☆52Dec 3, 2024Updated last year
- Code for Paper: Harnessing Webpage Uis For Text Rich Visual Understanding☆53Dec 12, 2024Updated last year
- ☆51May 28, 2024Updated last year
- ☆24Dec 26, 2024Updated last year
- Official implementation for Dessurt: Document end-to-end self-supervised understanding and recognition transformer☆62Jan 11, 2023Updated 3 years ago
- ☆25May 13, 2024Updated last year
- Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners"☆30Apr 27, 2024Updated last year
- Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity"☆32May 28, 2025Updated 9 months ago
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)☆30Jul 18, 2023Updated 2 years ago
- A big_vision inspired repo that implements a generic Auto-Encoder class capable in representation learning and generative modeling.☆34Jun 26, 2024Updated last year
- [ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text☆413May 5, 2025Updated 9 months ago
- ☆85Aug 18, 2024Updated last year
- ☆41Sep 9, 2025Updated 5 months ago
- ☆32Apr 14, 2024Updated last year
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio…☆44Apr 18, 2025Updated 10 months ago
- DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models☆152Jan 13, 2025Updated last year
- A large-scale training and benchmarking framework for rPPG.☆10Nov 26, 2024Updated last year
- Implementing BERT + CRF with PyTorch for Chinese NER.☆10Mar 7, 2022Updated 3 years ago
- ☆40Aug 18, 2021Updated 4 years ago
- Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"☆269Jun 12, 2024Updated last year
- A simple script to add pdf-files to Zotero via CLI☆12May 17, 2020Updated 5 years ago
- Code for "Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning" (EMNLP 2022) and "Empowering Parameter-Efficient Transfer Learning…☆11Feb 6, 2023Updated 3 years ago
- ☆12May 20, 2025Updated 9 months ago
- Everything you need to reproduce "Better plain ViT baselines for ImageNet-1k" in PyTorch, and more☆12Feb 16, 2026Updated last week
- [ICML-2025] We introduce Lie group Relative position Encodings (LieRE) that goes beyond RoPE in supporting n-dimensional inputs.☆14Aug 8, 2025Updated 6 months ago
- ☆10Dec 21, 2020Updated 5 years ago
- A collection of papers tackling automatic fact-checking (particularly of AI-generated content)☆14Nov 3, 2023Updated 2 years ago
- Improving transparency of large language models' reasoning☆14Nov 25, 2025Updated 3 months ago
- Official codebase for our paper "Do Language Models Use Their Depth Efficiently?"☆29Jun 25, 2025Updated 8 months ago
- Hi, I'm Harmony the Hummingbird! Let's work together on whatever you care about.☆12May 3, 2024Updated last year
- The code of source-only training for our method☆11Mar 3, 2022Updated 3 years ago
- ☆18Mar 2, 2025Updated 11 months ago
- [ICML2023] Instant Soup Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models. Ajay Jaiswal, Shiwei Liu, Ti…☆11Nov 28, 2023Updated 2 years ago
- (Siggraph Asia 2023) Project Page of "HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image"☆10Dec 9, 2023Updated 2 years ago
- ☆10Oct 20, 2022Updated 3 years ago
- Data and code for EACL'24 paper: Over-Reasoning and Redundant Calculation of Large Language Models☆11Jan 23, 2024Updated 2 years ago