ZZZHANG-jx/DocKylin

[AAAI 2025] DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming

☆36

Alternatives and similar repositories for DocKylin

Users interested in DocKylin are comparing it to the repositories listed below.

shi-yx / URaG
Official implementation of URaG: Unified Retrieval and Generation in Multimodal LLMs for Efficient Long Document Understanding (AAAI 2026…
☆43 · Feb 4, 2026 · Updated 5 months ago
lcy0604 / CTRNet-plus
The official implementation of CTRNet++.
☆15 · Dec 30, 2024 · Updated last year
ZZZHANG-jx / WMeter-Reader
[TIM 2025] Towards Accurate Readings of Water Meters by Eliminating Transition Error: New Dataset and Effective Solution
☆19 · Mar 5, 2025 · Updated last year
TenMilesLotus / DTSM
Code and data for the paper: DTSM: Toward Dense Table Structure Recognition with Text Query Encoder and Adjacent Feature Aggregator
☆13 · Apr 28, 2024 · Updated 2 years ago
SCUT-DLVCLab / OCR-Reasoning
[ICLR 2026] OCR-Reasoning Benchmark: Unveiling the True Capabilities of MLLMs in Complex Text-Rich Image Reasoning
☆77 · May 26, 2026 · Updated last month
ZZZHANG-jx / Marior
[ACM MM 2022] Marior: Margin Removal and Iterative Content Rectification for Document Dewarping in the Wild
☆26 · Aug 12, 2022 · Updated 3 years ago
whlscut / DocLayLLM
[CVPR 2025] DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding
☆30 · Dec 18, 2025 · Updated 6 months ago
ZZZHANG-jx / GCDRNet
[TAI 2023] Appearance Enhancement for Camera-captured Document Images in the Wild
☆58 · Aug 28, 2025 · Updated 10 months ago
HCIILAB / M5HisDoc
☆34 · Dec 18, 2025 · Updated 6 months ago
SCUT-DLVCLab / RFUND
[MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking f…
☆21 · Dec 4, 2024 · Updated last year
FelixHertlein / doc-matcher
Inference, training and evaluation code for our paper "DocMatcher: Document Image Dewarping via Structural and Textual Line Matching" (WA…
☆55 · Jul 1, 2025 · Updated last year
HCIILAB / LAST
Read Ten Lines at One Glance: Line-Aware Semi-Autoregressive Transformer for Multi-Line Handwritten Mathematical Expression Recognition
☆28 · Aug 29, 2023 · Updated 2 years ago
SCUT-DLVCLab / GPT-4V_OCR
Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision)
☆128 · Nov 13, 2023 · Updated 2 years ago
Dawars / DocMAE
Unofficial implementation of DocMAE (WIP): Document Image Rectification via Self-supervised Representation Learning
☆20 · Dec 20, 2023 · Updated 2 years ago
ZZZHANG-jx / Awesome-Image-based-Meter-Recognition-Reading
☆30 · Nov 8, 2024 · Updated last year
ZZZHANG-jx / DocAligner
[PR 2025] DocAligner: Automating the Annotation of Photographed Documents Through Real-virtual Alignment
☆109 · Aug 4, 2025 · Updated 11 months ago
xhli-git / DocSAM
☆32 · Apr 8, 2025 · Updated last year
mxin262 / Bridging-Text-Spotting
(CVPR 2024) Bridging the Gap Between End-to-End and Two-Step Text Spotting.
☆75 · Jun 11, 2024 · Updated 2 years ago
amazon-science / visfocus
☆24 · Apr 29, 2025 · Updated last year
SCUT-DLVCLab / TongGu-LLM
[EMNLP 2024] TongGu, a classical Chinese language model.
☆68 · Sep 28, 2024 · Updated last year
RylonW / DocNLC
Official code for DocNLC: A Document Image Enhancement Framework with Normalized and Latent Contrastive Representation for Multiple Degra…
☆44 · Mar 20, 2026 · Updated 3 months ago
adlnlp / mmvqa
☆19 · Sep 11, 2024 · Updated last year
SCUT-DLVCLab / Document-AI-Recommendations
Algorithms, papers, datasets, performance comparisons for Document AI.
☆209 · Mar 1, 2025 · Updated last year
IBM / KVP10k
Repository for the KVP10k dataset
☆23 · Sep 18, 2025 · Updated 9 months ago
soumitri2001 / SURDS-SSL-OSV
SURDS: Self-Supervised Attention-guided Reconstruction and Dual Triplet Loss for Writer Independent Offline Signature Verification, ICPR…
☆16 · Jul 22, 2022 · Updated 3 years ago
HCIILAB / SCUT-EnsText
☆69 · Apr 18, 2024 · Updated 2 years ago
LayTextLLM / LayTextLLM
☆103 · Dec 23, 2024 · Updated last year
Line-Kite / GraphLayoutLM
☆14 · Sep 6, 2024 · Updated last year
RichSu95 / Document_Binarization_Collection
This repository is a concise collection of well-known deep-learning-based document binarization models.
☆30 · Dec 24, 2022 · Updated 3 years ago
Yushu-Li / OWTTT
[ICCV 2023 Oral] Official repository for “On the Robustness of Open-World Test-Time Training: Self-Training with Dynamic Prototype Expans…
☆46 · Dec 18, 2024 · Updated last year
yh-hust / DocSeeker
[CVPR 2026 Highlight] DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding
☆17 · Jun 4, 2026 · Updated last month
yeungchenwa / HDR
[AAAI2025 Oral] Predicting the Original Appearance of Damaged Historical Documents
☆111 · Jun 28, 2026 · Updated 2 weeks ago
HCIILAB / M6Doc
☆163 · May 8, 2025 · Updated last year
mhashas / Document-Image-Unwarping-pytorch
PyTorch implementation and extension of "DocUnet: Document Image Unwarping via A Stacked U-Net"
☆117 · Jul 2, 2020 · Updated 6 years ago
mxin262 / ESTextSpotter
(ICCV 2023) ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer
☆78 · Apr 9, 2024 · Updated 2 years ago
shengfly / writer-identification
☆11 · Jun 3, 2025 · Updated last year
fh2019ustc / Awesome-Document-Image-Rectification
A comprehensive list of awesome document image rectification papers.
☆555 · Apr 15, 2026 · Updated 3 months ago
tanguymagne / UVDoc-Dataset
Code for the paper "UVDoc: Neural Grid-based Document Unwarping" - Dataset capture and creation
☆35 · May 27, 2024 · Updated 2 years ago
MAEHCM / ICL-D3IE
Code for the ICCV 2023 paper: “ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction”
☆54 · Aug 8, 2023 · Updated 2 years ago