☆17 Jun 12, 2024 Updated last year
Alternatives and similar repositories for pdfvqa
Users that are interested in pdfvqa are comparing it to the libraries listed below
- ☆21 Apr 2, 2025 Updated 11 months ago
- ☆69 Jan 9, 2024 Updated 2 years ago
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua… ☆63 May 15, 2025 Updated 9 months ago
- TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning ☆23 Sep 17, 2024 Updated last year
- Contrast-guided Feature Adjustment Module for Visual Information Extraction ☆30 May 23, 2023 Updated 2 years ago
- Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details. ☆32 Feb 26, 2025 Updated last year
- Fruits-360 images resized to 100x100 pixels ☆16 Feb 22, 2026 Updated last week
- Baselines for all tasks from Long Code Arena benchmarks 🏟️ ☆39 Mar 30, 2025 Updated 11 months ago
- DroidAgent: Intent-Driven Mobile GUI Testing with Autonomous LLM Agents ☆58 Mar 12, 2024 Updated last year
- Ctrip hotel API SDK ☆19 Jun 29, 2015 Updated 10 years ago
- A template for a Djinni library that can be used in Java/Kotlin, ObjC/Swift and C# ☆11 Oct 6, 2022 Updated 3 years ago
- ☆12 Jan 11, 2026 Updated last month
- A simple Streamlit frontend for a pre-trained MobileNet CNN model + OpenCV for face mask detection in images. ☆10 Mar 25, 2023 Updated 2 years ago
- A Swedish Natural Language Understanding Benchmark ☆11 Dec 12, 2025 Updated 2 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆12 Jul 14, 2025 Updated 7 months ago
- Integrating neurosymbolic representations into LLMs for interpretability, steering, and running symbolic algorithms ☆14 Feb 2, 2026 Updated last month
- Generates training datasets for text detection ☆12 Jul 1, 2020 Updated 5 years ago
- [EMNLP 2025] The official implementation of "Zero-shot Multimodal Document Retrieval via Cross-Modal Question Generation" ☆15 Aug 26, 2025 Updated 6 months ago
- [CVPR2024] Learning from Synthetic Human Group Activities ☆14 Feb 24, 2025 Updated last year
- Continuous quality evaluation of ML algorithms via CI/CD and GitHub Actions. ☆16 Jan 15, 2020 Updated 6 years ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference … ☆14 Dec 12, 2024 Updated last year
- [NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning ☆95 Jan 7, 2025 Updated last year
- Spring boot application with spring security and jwt integration ☆10 Sep 18, 2020 Updated 5 years ago
- Code for the arxiv paper: Complex Claim Verification with Evidence Retrieved in the Wild ☆13 Nov 27, 2023 Updated 2 years ago
- Solution of the telegram ML competition 2023 ☆14 May 26, 2024 Updated last year
- Code for our project CROWN (Conversational Passage Ranking by Reasoning over Word Networks) ☆10 Jan 11, 2024 Updated 2 years ago
- Official implementation of OpenTab (ICLR2024) ☆13 Mar 27, 2024 Updated last year
- ☆10 Dec 3, 2021 Updated 4 years ago
- A Fast Image Converter that supports common image formats. It's using WebAssembly for all conversions so no image is sent to the server… ☆11 Jul 10, 2025 Updated 7 months ago
- resnet_cifar10_cifar100_imagenet ☆14 Oct 30, 2018 Updated 7 years ago
- ☆11 Sep 14, 2020 Updated 5 years ago
- Amlogic G12A Mali support for Mali Bifrost based SoCs, for Mainline Linux only ☆11 Jan 28, 2023 Updated 3 years ago
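- Custom web crawler tool (SQL Server edition): define scraping templates with regular expressions and store results in the database via custom data models ☆10 Sep 5, 2017 Updated 8 years ago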
- LLM inference in C/C++ ☆24 Updated this week
- Corpus to accompany: "Selective Vision is the Challenge for Visual Reasoning: A Benchmark for Visual Argument Understanding" ☆11 Apr 11, 2025 Updated 10 months ago
- Easily set up a production-ready standalone Clickhouse and monitor it ☆10 Jan 14, 2025 Updated last year
- ☆11 Oct 15, 2022 Updated 3 years ago
- Align, a general text alignment function ☆15 Dec 7, 2023 Updated 2 years ago
- Aggregation framework for annotating datasets in computer vision tasks (detection, segmentation, video captioning etc.) ☆11 Nov 6, 2024 Updated last year