AlibabaResearch/AdvancedLiterateMachinery

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

☆1,833

Alternatives and similar repositories for AdvancedLiterateMachinery

Users interested in AdvancedLiterateMachinery often compare it to the repositories listed below.

hikopensource / DAVAR-Lab-OCR
OCR toolbox from Davar-Lab
☆762 · Jun 29, 2026 · Updated 3 weeks ago
X-PLUG / mPLUG-DocOwl
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
☆2,408 · May 30, 2025 · Updated last year
cv-small-snails / Awesome-Table-Recognition
A curated list of resources dedicated to table recognition
☆404 · Dec 12, 2024 · Updated last year
JiaquanYe / TableMASTER-mmocr
2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.
☆470 · Jul 4, 2022 · Updated 4 years ago
SCUT-DLVCLab / Document-AI-Recommendations
Algorithms, papers, datasets, performance comparisons for Document AI.
☆209 · Mar 1, 2025 · Updated last year
Ucas-HaoranWei / Vary
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
☆1,889 · Dec 30, 2024 · Updated last year
Ucas-HaoranWei / GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
☆8,154 · Feb 10, 2025 · Updated last year
poloclub / unitable
UniTable: Towards a Unified Table Foundation Model
☆533 · Apr 21, 2026 · Updated 2 months ago
MaxKinny / TabRecSet
A large scale camera-taken table detection and recognition dataset.
☆150 · Apr 9, 2026 · Updated 3 months ago
microsoft / table-transformer
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o…
☆2,930 · Jun 24, 2024 · Updated 2 years ago
wenwenyu / TCM
Turning a CLIP Model into a Scene Text Detector (CVPR2023) | Turning a CLIP Model into a Scene Text Spotter (TPAMI)
☆202 · Jun 17, 2024 · Updated 2 years ago
Yuliang-Liu / MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
☆870 · Updated this week
Tan-Junwen / awesome-table-structure-recognition
A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating…
☆232 · Sep 9, 2024 · Updated last year
opendatalab / DocLayout-YOLO
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
☆2,232 · Apr 14, 2025 · Updated last year
HCIILAB / M6Doc
☆163 · May 8, 2025 · Updated last year
namtuanly / MTL-TabNet
MTL-TabNet: Multi-task Learning based Model for Image-based Table Recognition
☆103 · May 30, 2024 · Updated 2 years ago
Yuliang-Liu / Monkey
Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)
☆1,948 · Jun 2, 2026 · Updated last month
clovaai / donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
☆6,905 · Jul 11, 2024 · Updated 2 years ago
IBM / SynthTabNet
Dataset of PNG images from synthetically generated table layouts with annotations in JSONL files
☆154 · Sep 17, 2025 · Updated 10 months ago
WenmuZhou / TableGeneration
Generate table images by rendering them in a browser
☆238 · Apr 10, 2024 · Updated 2 years ago
large-ocr-model / large-ocr-model.github.io
☆189 · Feb 27, 2024 · Updated 2 years ago
doc-analysis / DocBank
DocBank: A Benchmark Dataset for Document Layout Analysis
☆652 · Aug 12, 2024 · Updated last year
RapidAI / TableStructureRec
Collects the best currently open-source table recognition models, improves pre- and post-processing, and converts the models to ONNX.
☆954 · Aug 3, 2025 · Updated 11 months ago
LayTextLLM / LayTextLLM
☆103 · Dec 23, 2024 · Updated last year
InternScience / StructEqTable-Deploy
A High-efficiency Open-source Toolkit for Table-to-Latex Task
☆276 · Dec 6, 2025 · Updated 7 months ago
HCIILAB / Scene-Text-Recognition-Recommendations
Papers, Datasets, Algorithms, SOTA for STR. Long-time Maintaining
☆353 · Nov 29, 2023 · Updated 2 years ago
wangwen-whu / WTW-Dataset
This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild" on table detection and table structure …
☆184 · Sep 15, 2021 · Updated 4 years ago
FreeOCR-AI / layoutreader
A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.
☆322 · Aug 15, 2025 · Updated 11 months ago
baudm / parseq
Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)
☆726 · May 29, 2024 · Updated 2 years ago
Mountchicken / Union14M
[ICCV 2023] Code base for Revisiting Scene Text Recognition: A Data Perspective
☆206 · Nov 1, 2023 · Updated 2 years ago
clovaai / synthtiger
Official Implementation of SynthTIGER (Synthetic Text Image Generator), ICDAR 2021
☆578 · Jun 14, 2024 · Updated 2 years ago
ZZR8066 / SEMv2
☆71 · Jun 26, 2024 · Updated 2 years ago
open-mmlab / mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
☆4,747 · Nov 27, 2024 · Updated last year
microsoft / unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆22,167 · Jan 23, 2026 · Updated 5 months ago
jpWang / LiLT
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understan…
☆366 · Oct 31, 2022 · Updated 3 years ago
doc-analysis / XFUND
XFUND: A Multilingual Form Understanding Benchmark
☆223 · Jul 15, 2022 · Updated 4 years ago
ibm-aur-nlp / PubTabNet
☆483 · Jul 8, 2025 · Updated last year
mxin262 / ESTextSpotter
(ICCV 2023) ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer
☆78 · Apr 9, 2024 · Updated 2 years ago
facebookresearch / nougat
Implementation of Nougat Neural Optical Understanding for Academic Documents
☆10,046 · Feb 21, 2025 · Updated last year