"Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023
☆16 · Nov 28, 2024 · Updated last year
Alternatives and similar repositories for TGDoc
Users that are interested in TGDoc are comparing it to the libraries listed below
- The proposed simulated dataset consisting of 9,536 charts and associated data annotations in CSV format. ☆26 · Feb 22, 2024 · Updated 2 years ago
- The largest VQA dataset for Vietnamese. Related to the text content in the image. ☆19 · Apr 9, 2025 · Updated 10 months ago
- Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta ☆16 · Nov 11, 2024 · Updated last year
- ☆22 · Jun 30, 2023 · Updated 2 years ago
- SOTA Document Image Enhancement - T2T-BinFormer: Effective Document Image Enhancement Using tokens-to-token Transformer Network ☆24 · Dec 9, 2023 · Updated 2 years ago
- [CVPR 2022 Challenge Rank 1st] The official code for V2L: Leveraging Vision and Vision-language Models into Large-scale Product Retrieval… ☆29 · Jul 30, 2022 · Updated 3 years ago
- Official PyTorch implementation for ACM MM22 "UDoc-GAN: Unpaired Document Illumination Correction with Background Light Prior" ☆25 · Aug 5, 2024 · Updated last year
- Dataset created for the Power Line Insulators Inspection Detections ☆10 · Jul 2, 2020 · Updated 5 years ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”. ☆44 · Sep 24, 2024 · Updated last year
- ☆11 · Oct 31, 2024 · Updated last year
- ☆18 · Feb 16, 2025 · Updated last year
- Math24o: High School Olympiad Mathematics Chinese Benchmark (an evaluation set for high-school olympiad mathematics competitions) ☆11 · Mar 27, 2025 · Updated 11 months ago
- Photorealism model using RealVisXL v4.0 ☆12 · Feb 20, 2024 · Updated 2 years ago
- Crawler based on a modified browser to detect online tracking. ☆11 · Jul 19, 2023 · Updated 2 years ago
- ☆11 · Aug 17, 2014 · Updated 11 years ago
- ☆22 · Dec 11, 2025 · Updated 2 months ago
- Delving into the Continuous Domain Adaptation (ACM MM22) ☆12 · Jul 10, 2022 · Updated 3 years ago
- Training, optimization and deployment of Object Detection model with dinov2 backbone for efficient inference on NVIDIA Jetson ☆13 · Jul 26, 2025 · Updated 7 months ago
- A simple model for classifying papers by academic venue (AI/ML/ACL), given a title and abstract. Bare-metal PyTorch port of https://gith… ☆12 · Mar 22, 2018 · Updated 7 years ago
- Calculation of the entropy of the batch of images (whole image or patches) ☆10 · Oct 15, 2021 · Updated 4 years ago
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 · Oct 11, 2024 · Updated last year
- [WACV 2023] Temporal Feature Enhancement Dilated Convolution Network for Weakly-supervised Temporal Action Localization ☆13 · Mar 9, 2024 · Updated last year
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation ☆19 · Nov 28, 2022 · Updated 3 years ago
- Long Context Research ☆26 · Jan 26, 2026 · Updated last month
- ☆21 · Jun 16, 2025 · Updated 8 months ago
- An offline evaluation framework for sequence-based recommender systems ☆13 · May 17, 2019 · Updated 6 years ago
- Graph Convolutional Module for Temporal Action Localization in Videos ☆10 · Jul 4, 2020 · Updated 5 years ago
- ☆10 · Oct 20, 2025 · Updated 4 months ago
- VisuRiddles: Fine-grained Perception is an important thing for Multimodal Large Models in Riddles Solving ☆18 · Oct 22, 2025 · Updated 4 months ago
- ☆14 · Nov 2, 2022 · Updated 3 years ago
- CBench, Benchmarking System for Question Answering Over Knowledge Graphs Systems. ☆12 · Sep 16, 2022 · Updated 3 years ago
- Chat app for django built with django-channels ☆10 · Dec 26, 2022 · Updated 3 years ago
- ☆14 · Sep 6, 2024 · Updated last year
- Change Detection towards Bitemporal Quality Difference via Hierarchical Correlation Distillation ☆10 · Apr 30, 2024 · Updated last year
- ☆13 · Apr 23, 2025 · Updated 10 months ago
- Official implementation of "Diffusion models meet image counter-forensics" ☆11 · Jan 22, 2024 · Updated 2 years ago
- ☆12 · Nov 22, 2022 · Updated 3 years ago
- A local search system implementation using Elasticsearch for Wikipedia data indexing and retrieval. ☆12 · May 17, 2025 · Updated 9 months ago
- Accepted at IJCAI-2022 ☆11 · Sep 3, 2022 · Updated 3 years ago