[ICPRAI 2024] DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents
☆16 · Apr 4, 2024 · Updated last year
Alternatives and similar repositories for DocumentCLIP
Users who are interested in DocumentCLIP are comparing it to the repositories listed below
- [EACL'23] COVID-VTS: Fact Extraction and Verification on Short Video Platforms ☆11 · Sep 26, 2023 · Updated 2 years ago
- [EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning ☆104 · Jul 18, 2024 · Updated last year
- [Arxiv2022] Revitalize Region Feature for Democratizing Video-Language Pre-training ☆22 · Mar 19, 2022 · Updated 3 years ago
- [KDD 2023] Multi-Grained Multimodal Interaction Network for Entity Linking ☆28 · Sep 17, 2023 · Updated 2 years ago
- [NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning ☆95 · Jan 7, 2025 · Updated last year
- DocBankLoader is a dataset loader for DocBank, and can convert DocBank to the Object Detection models' format. ☆25 · Mar 17, 2021 · Updated 4 years ago
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning ☆296 · Mar 13, 2024 · Updated last year
- This is a data repository for the ACL 2020 paper: "Let Me Choose: From Verbal Context to Font Selection" ☆10 · May 5, 2020 · Updated 5 years ago
- Beyond Entities: A Large-Scale Multi-Modal Knowledge Graph with Triplet Fact Grounding ☆11 · May 23, 2024 · Updated last year
- Directed masked autoencoders ☆14 · Feb 20, 2026 · Updated last week
- LAGr: Label Aligned Graphs for Better Systematic Generalization in Semantic Parsing ☆10 · Jun 1, 2022 · Updated 3 years ago
- Official code for the paper: "Perception and Semantic Aware Regularization for Sequential Confidence Calibration (CVPR2023)" ☆10 · May 15, 2024 · Updated last year
- Interactive 3D Avatar Profile Viewer generated in Ready Player Me ☆10 · Aug 27, 2022 · Updated 3 years ago
- Extract data insights and visualisations with natural language ☆13 · May 27, 2024 · Updated last year
- The Conceptual Coverage Across Languages Benchmark for Text-to-Image Models ☆12 · Oct 28, 2024 · Updated last year
- RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering ☆10 · Nov 27, 2022 · Updated 3 years ago
- Automatize local data analysis with team of tool-using GPT agents ☆15 · Apr 1, 2024 · Updated last year
- Save books from Znanium for offline reading ☆19 · Aug 16, 2025 · Updated 6 months ago
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation ☆19 · Nov 28, 2022 · Updated 3 years ago
- Scaffolding for multi-user Elm applications via Gulp, Express, and SockJS. ☆10 · Apr 10, 2015 · Updated 10 years ago
- Open ChatGLM Eyes to See the World ☆13 · Mar 30, 2023 · Updated 2 years ago
- Paddle reimplementation of the paper "Recipes for building an open-domain chatbot" ☆11 · Nov 1, 2021 · Updated 4 years ago
- This repository is created on top of two repositories i.e., yolov7 face detection and yolov7 blurring object ☆15 · Jan 21, 2023 · Updated 3 years ago
- Convert pdf to pages of images ☆13 · Apr 18, 2020 · Updated 5 years ago
- A Python package for extracting confidence scores from LLM models outputs, particularly using log probabilities. ☆19 · Sep 15, 2024 · Updated last year
- This is a repository for the ACL 2020 paper: "Let Me Choose: From Verbal Context to Font Selection" ☆12 · Nov 21, 2022 · Updated 3 years ago
- Resources for our AAAI 2022 paper: "Unsupervised Editing for Counterfactual Stories". ☆11 · Oct 25, 2022 · Updated 3 years ago
- Code example for pretraining an LLM with vanilla PyTorch training loop ☆10 · Jun 6, 2024 · Updated last year