conjuncts/gmft

Lightweight, performant, deep table extraction

☆539

Alternatives and similar repositories for gmft

Users who are interested in gmft are comparing it to the libraries listed below.

xavctn / img2table
img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing
☆885 · Jul 12, 2026 · Updated 2 weeks ago
microsoft / table-transformer
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o…
☆2,931 · Jun 24, 2024 · Updated 2 years ago
stephane-caron / matplotlive
Stream live plots to a matplotlib figure
☆79 · Jul 3, 2026 · Updated 3 weeks ago
poloclub / unitable
UniTable: Towards a Unified Table Foundation Model
☆534 · Apr 21, 2026 · Updated 3 months ago
VikParuchuri / tabled
Detect and extract tables to markdown and csv
☆748 · Jan 24, 2025 · Updated last year
Filimoa / open-parse
Improved file parsing for LLMs
☆3,163 · May 17, 2026 · Updated 2 months ago
opendatalab / PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
☆9,810 · Jan 3, 2025 · Updated last year
CosmosShadow / gptpdf
Using GPT to parse PDF
☆3,561 · Apr 17, 2025 · Updated last year
datalab-to / marker
Convert PDF to markdown + JSON quickly with high accuracy
☆37,970 · Jul 20, 2026 · Updated last week
datalab-to / surya
OCR, layout analysis, reading order, table recognition in 90+ languages
☆21,167 · Updated this week
Tan-Junwen / awesome-table-structure-recognition
A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating…
☆232 · Sep 9, 2024 · Updated last year
Ucas-HaoranWei / GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
☆8,208 · Feb 10, 2025 · Updated last year
Mustafa-Esoofally / podcast-engine-groq
☆167 · Oct 31, 2024 · Updated last year
facebookresearch / nougat
Implementation of Nougat Neural Optical Understanding for Academic Documents
☆10,053 · Feb 21, 2025 · Updated last year
360AILAB-NLP / 360LayoutAnalysis
360LayoutAnalysis, a series of Document Analysis Models and Datasets developed by 360 AI Research Institute
☆305 · Sep 10, 2024 · Updated last year
yoheinakajima / prettygraph
An experimental UI for text-to-knowledge-graph generation
☆779 · May 2, 2024 · Updated 2 years ago
yobix-ai / extractous
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
☆1,768 · Dec 21, 2024 · Updated last year
datalab-to / pdftext
Extract structured text from pdfs quickly
☆710 · Jul 8, 2026 · Updated 2 weeks ago
D-Star-AI / dsRAG
High-performance retrieval engine for unstructured data
☆1,589 · Nov 10, 2025 · Updated 8 months ago
Unstructured-IO / unstructured
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean…
☆15,210 · Updated this week
deepdoctection / deepdoctection
A Repo For Document AI
☆3,199 · Jun 20, 2026 · Updated last month
InternScience / StructEqTable-Deploy
A High-efficiency Open-source Toolkit for Table-to-Latex Task
☆276 · Dec 6, 2025 · Updated 7 months ago
run-llama / llama_cloud_services
Knowledge Agents and Management in the Cloud
☆4,257 · May 18, 2026 · Updated 2 months ago
camelot-dev / camelot
A Python library to extract tabular data from PDFs
☆3,792 · Updated this week
AIAnytime / On-device-real-time-RAG-App
On-device real-time RAG App built using Jina Reader, Mediapipe, Gemma 2b IT LLM.
☆15 · Apr 15, 2024 · Updated 2 years ago
nlmatics / llmsherpa
Developer APIs to Accelerate LLM Projects
☆1,746 · Oct 18, 2024 · Updated last year
EndoTheDev / Awesome-Ollama
An opinionated list of awesome Ollama web and desktop UIs, frameworks, libraries, software and resources.
☆478 · Jun 25, 2026 · Updated last month
sinaptik-ai / panda-etl
No-code ETL and data pipelines with AI and NLP
☆315 · Feb 20, 2025 · Updated last year
AlibabaResearch / AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team…
☆1,832 · Mar 17, 2026 · Updated 4 months ago
shibing624 / ChatPilot
ChatPilot: Chat Agent Web UI, a chat front end that supports Google search, chat with files and URLs (RAG), and a code interpreter; it reproduces Kimi Chat (drag in a file, send a URL).
☆600 · Jan 27, 2026 · Updated 6 months ago
Zyphra / transformers_zamba2
☆49 · Feb 5, 2025 · Updated last year
katanaml / sparrow
Structured data extraction, instruction calling and agentic workflows with ML, LLM and Vision LLM
☆5,188 · Jun 30, 2026 · Updated 3 weeks ago
opendatalab / DocLayout-YOLO
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
☆2,236 · Apr 14, 2025 · Updated last year
adithya-s-k / marker-api
Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.
☆977 · Oct 15, 2024 · Updated last year
opendatalab / magic-doc
☆549 · Jul 26, 2024 · Updated 2 years ago
Layout-Parser / layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
☆5,767 · Aug 15, 2024 · Updated last year
zzzgydi / webscraper
Scrape a webpage, convert it into Markdown, and enhance AI search applications.
☆258 · May 11, 2024 · Updated 2 years ago
huridocs / pdf-document-layout-analysis
A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic…
☆1,275 · Jul 13, 2026 · Updated 2 weeks ago
FreeOCR-AI / layoutreader
A faster LayoutReader model based on LayoutLMv3 that sorts OCR bboxes into reading order.
☆323 · Aug 15, 2025 · Updated 11 months ago