microsoft/table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.

☆2,929

Alternatives and similar repositories for table-transformer

Users interested in table-transformer often compare it to the repositories listed below.

poloclub / unitable
UniTable: Towards a Unified Table Foundation Model
☆533 · Apr 21, 2026 · Updated 2 months ago
cv-small-snails / Awesome-Table-Recognition
A curated list of resources dedicated to table recognition
☆404 · Dec 12, 2024 · Updated last year
JiaquanYe / TableMASTER-mmocr
2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.
☆470 · Jul 4, 2022 · Updated 4 years ago
deepdoctection / deepdoctection
A Repo For Document AI
☆3,191 · Jun 20, 2026 · Updated 3 weeks ago
DevashishPrasad / CascadeTabNet
This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table …
☆1,549 · Aug 27, 2021 · Updated 4 years ago
IBM / SynthTabNet
Dataset of PNG images from synthetically generated table layouts with annotations in JSONL files
☆154 · Sep 17, 2025 · Updated 10 months ago
ibm-aur-nlp / PubTabNet
☆483 · Jul 8, 2025 · Updated last year
AlibabaResearch / AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team…
☆1,832 · Mar 17, 2026 · Updated 4 months ago
Layout-Parser / layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
☆5,762 · Aug 15, 2024 · Updated last year
clovaai / donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
☆6,903 · Jul 11, 2024 · Updated 2 years ago
Tan-Junwen / awesome-table-structure-recognition
A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating…
☆232 · Sep 9, 2024 · Updated last year
phamquiluan / table-transformer
CVPR 2022: Table Structure Recognition
☆40 · Apr 19, 2022 · Updated 4 years ago
hikopensource / DAVAR-Lab-OCR
OCR toolbox from Davar-Lab
☆762 · Jun 29, 2026 · Updated 3 weeks ago
MaxKinny / TabRecSet
A large scale camera-taken table detection and recognition dataset.
☆150 · Apr 9, 2026 · Updated 3 months ago
Academic-Hammer / SciTSR
Table structure recognition dataset of the paper: Complicated Table Structure Recognition
☆383 · Jul 7, 2020 · Updated 6 years ago
doc-analysis / TableBank
TableBank: A Benchmark Dataset for Table Detection and Recognition
☆1,080 · Aug 12, 2024 · Updated last year
microsoft / unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆22,165 · Jan 23, 2026 · Updated 5 months ago
mindee / doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning. Ongo…
☆6,185 · Updated this week
facebookresearch / nougat
Implementation of Nougat Neural Optical Understanding for Academic Documents
☆10,047 · Feb 21, 2025 · Updated last year
jpWang / LiLT
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understan…
☆366 · Oct 31, 2022 · Updated 3 years ago
xavctn / img2table
img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing
☆882 · Jul 12, 2026 · Updated last week
wangwen-whu / WTW-Dataset
This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild" on table detection and table structure …
☆184 · Sep 15, 2021 · Updated 4 years ago
namtuanly / MTL-TabNet
MTL-TabNet: Multi-task Learning based Model for Image-based Table Recognition
☆103 · May 30, 2024 · Updated 2 years ago
jsvine / pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
☆10,565 · Jun 17, 2026 · Updated last month
Psarpei / Multi-Type-TD-TSR
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition
☆289 · Sep 5, 2022 · Updated 3 years ago
Unstructured-IO / unstructured
Convert documents to structured data effortlessly. Unstructured is open-source ETL solution for transforming complex documents into clean…
☆15,160 · Updated this week
doc-analysis / DocBank
DocBank: A Benchmark Dataset for Document Layout Analysis
☆652 · Aug 12, 2024 · Updated last year
WenmuZhou / TableGeneration
Generate table images by rendering them in a browser
☆238 · Apr 10, 2024 · Updated 2 years ago
datalab-to / surya
OCR, layout analysis, reading order, table recognition in 90+ languages
☆21,119 · Updated this week
tstanislawek / awesome-document-understanding
A curated list of resources for Document Understanding (DU) topic
☆1,525 · Jun 2, 2023 · Updated 3 years ago
shabie / docformer
Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task…
☆290 · Feb 13, 2023 · Updated 3 years ago
ibm-aur-nlp / PubLayNet
☆1,051 · Jul 9, 2025 · Updated last year
NielsRogge / Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
☆11,674 · Apr 20, 2026 · Updated 3 months ago
InternScience / StructEqTable-Deploy
A High-efficiency Open-source Toolkit for Table-to-Latex Task
☆276 · Dec 6, 2025 · Updated 7 months ago
sachinraja13 / TabStructNet
☆132 · Mar 24, 2023 · Updated 3 years ago
whn09 / table_structure_recognition
Table detection (TD) and table structure recognition (TSR) using Yolov5/Yolov8, and you can get the same (even better) result compared wi…
☆52 · Jul 3, 2024 · Updated 2 years ago
ZZR8066 / SEMv2
☆71 · Jun 26, 2024 · Updated 2 years ago
opendatalab / DocLayout-YOLO
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
☆2,230 · Apr 14, 2025 · Updated last year
conjuncts / gmft
Lightweight, performant, deep table extraction
☆539 · Jul 5, 2026 · Updated 2 weeks ago