Lightweight, performant, deep table extraction
☆537Feb 22, 2026Updated 4 months ago
Alternatives and similar repositories for gmft
Users who are interested in gmft are comparing it to the libraries listed below.
- img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing☆876May 10, 2026Updated last month
- Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o…☆2,921Jun 24, 2024Updated 2 years ago
- Stream live plots to a matplotlib figure☆80Apr 18, 2025Updated last year
- UniTable: Towards a Unified Table Foundation Model☆532Apr 21, 2026Updated 2 months ago
- Detect and extract tables to markdown and csv☆749Jan 24, 2025Updated last year
- Improved file parsing for LLMs☆3,162May 17, 2026Updated last month
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆9,757Jan 3, 2025Updated last year
- Convert PDF to markdown + JSON quickly with high accuracy☆36,494Jun 23, 2026Updated last week
- Using GPT to parse PDF☆3,556Apr 17, 2025Updated last year
- OCR, layout analysis, reading order, table recognition in 90+ languages☆21,010Jun 13, 2026Updated 3 weeks ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆8,146Feb 10, 2025Updated last year
- A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating…☆232Sep 9, 2024Updated last year
- Implementation of Nougat Neural Optical Understanding for Academic Documents☆10,029Feb 21, 2025Updated last year
- ☆168Oct 31, 2024Updated last year
- 360LayoutAnalysis, a series of Document Analysis Models and Datasets developed by 360 AI Research Institute☆305Sep 10, 2024Updated last year
- An experimental UI for text-to-knowledge-graph generation☆780May 2, 2024Updated 2 years ago
- Extract structured text from pdfs quickly☆700Jun 10, 2026Updated 3 weeks ago
- A High-efficiency Open-source Toolkit for Table-to-Latex Task☆275Dec 6, 2025Updated 6 months ago
- High-performance retrieval engine for unstructured data☆1,588Nov 10, 2025Updated 7 months ago
- Convert documents to structured data effortlessly. Unstructured is open-source ETL solution for transforming complex documents into clean…☆15,066Jun 24, 2026Updated last week
- A Repo For Document AI☆3,185Jun 20, 2026Updated 2 weeks ago
- Knowledge Agents and Management in the Cloud☆4,252May 18, 2026Updated last month
- A Python library to extract tabular data from PDFs☆3,767Jun 24, 2026Updated last week
- Developer APIs to Accelerate LLM Projects☆1,747Oct 18, 2024Updated last year
- An opinionated list of awesome Ollama web and desktop UIs, frameworks, libraries, software and resources.☆474Jun 25, 2026Updated last week
- A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team…☆1,831Mar 17, 2026Updated 3 months ago
- No-code ETL and data pipelines with AI and NLP☆317Feb 20, 2025Updated last year
- ChatPilot: Chat Agent Web UI, a chat front end that supports Google search, chat over files and URLs (RAG), and a code interpreter; it reproduces Kimi Chat (drag in a file, send a URL).☆600Jan 27, 2026Updated 5 months ago
- Structured data extraction, instruction calling and agentic workflows with ML, LLM and Vision LLM☆5,170Jun 27, 2026Updated last week
- ☆550Jul 26, 2024Updated last year
- Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.☆973Oct 15, 2024Updated last year
- ☆50Feb 5, 2025Updated last year
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks☆7,618Dec 12, 2025Updated 6 months ago
- Scrape a webpage, convert it into Markdown, and enhance AI search applications.☆258May 11, 2024Updated 2 years ago
- A Unified Toolkit for Deep Learning Based Document Image Analysis☆5,750Aug 15, 2024Updated last year
- Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.☆1,759Dec 21, 2024Updated last year
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception☆2,206Apr 14, 2025Updated last year
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.☆321Aug 15, 2025Updated 10 months ago
- Toolkit for linearizing PDFs for LLM datasets/training☆18,650Mar 25, 2026Updated 3 months ago