harrytea/TGDoc

harrytea / TGDoc

"Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023

☆16

Alternatives and similar repositories for TGDoc

Users interested in TGDoc are comparing it to the repositories listed below.

harrytea / UDoc-GAN
Official PyTorch implementation for ACM MM22 "UDoc-GAN: Unpaired Document Illumination Correction with Background Light Prior"
☆25 · Aug 5, 2024 · Updated last year
InternScience / SimChart9K
The proposed simulated dataset consisting of 9,536 charts and associated data annotations in CSV format.
☆26 · Feb 22, 2024 · Updated 2 years ago
Line-Kite / GraphLayoutLM
☆14 · Sep 6, 2024 · Updated last year
kyegomez / VisionLLaMA
Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta
☆15 · Nov 11, 2024 · Updated last year
pyxy-org / pyxy
HTML in Python
☆14 · Jul 19, 2024 · Updated 2 years ago
qhnhynmm / ViOCRVQA-Dataset
The largest VQA dataset for Vietnamese. Related to the text content in the image.
☆19 · Apr 9, 2025 · Updated last year
scaleapi / PRBench
Open source codebase for PRBench
☆18 · Jan 15, 2026 · Updated 6 months ago
EAI-MCC / Awesome-ObjectGoal-Navigation
Collections of object goal navigation papers in recent top-tier conferences.
☆14 · Sep 24, 2022 · Updated 3 years ago
aimagelab / FourBi
Binarizing Documents by Leveraging both Space and Frequency. (ICDAR 2024)
☆18 · May 15, 2025 · Updated last year
ant-research / M2-Miner
[ICLR 2026] M2-Miner: Multi-Agent Enhanced MCTS for Mobile GUI Agent Data Mining
☆55 · Apr 22, 2026 · Updated 2 months ago
YukunLi99 / AdaptSAM
☆22 · Jun 30, 2023 · Updated 3 years ago
floatingsun / transformer_layers_as_painters
transformer layers behavior as painters🧑‍🎨
☆15 · May 6, 2025 · Updated last year
Hypatiaalegra / LogicGame-Data
Dev and Test Data of LogicGame benchmark
☆19 · Mar 31, 2025 · Updated last year
mit1208 / Document-AI
☆19 · Feb 5, 2026 · Updated 5 months ago
RisabBiswas / T2T-BinFormer
SOTA Document Image Enhancement - T2T-BinFormer: Effective Document Image Enhancement Using tokens-to-token Transformer Network
☆24 · Dec 9, 2023 · Updated 2 years ago
deepopinion / anls_star_metric
Official implementation of the ANLS* metric
☆25 · Jul 13, 2026 · Updated last week
bzluan / TextCoT
[ACM TOMM] Official implementation of "TextCoT: Zoom-In for Enhanced Multimodal Text-Rich Image Understanding"
☆45 · Feb 27, 2026 · Updated 4 months ago
xmed-lab / SC-Cor
ECCV 2022: Learning Shadow Correspondence for Video Shadow Detection
☆14 · Jul 18, 2022 · Updated 4 years ago
harrytea / Awesome-Document-Understanding
Document Artificial Intelligence
☆201 · Sep 28, 2025 · Updated 9 months ago
FlagOpen / FlagAI
FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale model.
☆18 · Nov 20, 2024 · Updated last year
arijitray1993 / SAT
Spatial Aptitude Training for Multimodal Language Models
☆33 · Feb 8, 2026 · Updated 5 months ago
aim-uofa / ConvNova
☆13 · Apr 23, 2025 · Updated last year
TencentARC / TaCA
Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".
☆16 · Jun 20, 2023 · Updated 3 years ago
Taha0229 / self-reflective-RAG
Exploring SOTA Advanced RAG techniques: This project implements a self reflective RAG, seamlessly integrating multiple knowledge sources …
☆20 · Jul 8, 2024 · Updated 2 years ago
MelosY / CAM
☆27 · Feb 20, 2024 · Updated 2 years ago
dgcnz / edge
Training, optimization and deployment of Object Detection model with dinov2 backbone for efficient inference on NVIDIA Jetson
☆14 · Jul 26, 2025 · Updated 11 months ago
tobiastoft91 / VGGish_AudioClassifer_02456
Acoustic Scene Classification using transfer learning on VGGish pre-trained model
☆11 · Jan 3, 2018 · Updated 8 years ago
CuteBoiz / Ubuntu_Installation
Things to do after installing Ubuntu
☆12 · Sep 9, 2023 · Updated 2 years ago
ThreeRiversAINexus / sample-agents
☆21 · May 14, 2025 · Updated last year
WangWenhao0716 / V2L
[CVPR 2022 Challenge Rank 1st] The official code for V2L: Leveraging Vision and Vision-language Models into Large-scale Product Retrieval…
☆29 · Jul 30, 2022 · Updated 3 years ago
KRR-Oxford / HierarchyTransformers
Language Models as Hierarchy Encoders
☆43 · Jan 6, 2026 · Updated 6 months ago
Ikomia-dev / onnx-donut
Export Donut model to onnx and run it with onnxruntime
☆23 · Nov 21, 2023 · Updated 2 years ago
yh-hust / VisuRiddles
VisuRiddles: Fine-grained Perception is an important thing for Multimodal Large Models in Riddles Solving
☆20 · Jun 9, 2026 · Updated last month
Oksitaine / RealVisXL-v4.0
Photorealism model using RealVisXL v4.0
☆12 · Feb 20, 2024 · Updated 2 years ago
hilmansw / PDF-Summarizer
PDF Summarizer using Streamlit, LangChain, and OpenAI frameworks.
☆24 · Oct 18, 2023 · Updated 2 years ago
DS3Lab / WordScape
The WordScape repository contains code for the WordScape pipeline to create datasets to train document understanding models.
☆42 · Dec 7, 2023 · Updated 2 years ago
ruocwang / mixture-of-prompts
[ICML 2024] One Prompt is Not Enough: Automated Construction of a Mixture-of-Expert Prompts - TurningPoint AI
☆31 · Sep 25, 2024 · Updated last year
RUCAIBox / GPO
The official GitHub page for "Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with …
☆30 · Dec 12, 2024 · Updated last year
eufouria / toxic-text-classification
API for toxic text classification, utilized pre-trained Distilbert and trained on Kaggle datasets. It helps identify and handle toxic con…
☆14 · Apr 30, 2024 · Updated 2 years ago