Lightweight, performant, deep table extraction
☆535Feb 22, 2026Updated 3 months ago
Alternatives and similar repositories for gmft
Users that are interested in gmft are comparing it to the libraries listed below.
- img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing☆863May 10, 2026Updated 2 weeks ago
- Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o…☆2,905Jun 24, 2024Updated last year
- Stream live plots to a matplotlib figure☆81Apr 18, 2025Updated last year
- UniTable: Towards a Unified Table Foundation Model☆531Apr 21, 2026Updated last month
- Detect and extract tables to markdown and csv☆753Jan 24, 2025Updated last year
- Improved file parsing for LLMs☆3,158May 17, 2026Updated last week
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆9,669Jan 3, 2025Updated last year
- Convert PDF to markdown + JSON quickly with high accuracy☆35,381May 5, 2026Updated 2 weeks ago
- Using GPT to parse PDF☆3,555Apr 17, 2025Updated last year
- OCR, layout analysis, reading order, table recognition in 90+ languages☆19,756May 6, 2026Updated 2 weeks ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆8,127Feb 10, 2025Updated last year
- A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating…☆228Sep 9, 2024Updated last year
- Implementation of Nougat Neural Optical Understanding for Academic Documents☆9,974Feb 21, 2025Updated last year
- ☆169Oct 31, 2024Updated last year
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute☆306Sep 10, 2024Updated last year
- An experimental UI for text-to-knowledge-graph generation☆782May 2, 2024Updated 2 years ago
- Extract structured text from pdfs quickly☆686Jun 11, 2025Updated 11 months ago
- A High-efficiency Open-source Toolkit for Table-to-Latex Task☆277Dec 6, 2025Updated 5 months ago
- High-performance retrieval engine for unstructured data☆1,583Nov 10, 2025Updated 6 months ago
- Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean…☆14,749May 18, 2026Updated last week
- A Repo For Document AI☆3,169May 15, 2026Updated last week
- A Python library to extract tabular data from PDFs☆3,695Apr 15, 2026Updated last month
- Knowledge Agents and Management in the Cloud☆4,254May 18, 2026Updated last week
- Developer APIs to Accelerate LLM Projects☆1,751Oct 18, 2024Updated last year
- An opinionated list of awesome Ollama web and desktop UIs, frameworks, libraries, software and resources.☆465Jan 17, 2025Updated last year
- A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team…☆1,829Mar 17, 2026Updated 2 months ago
- No-code ETL and data pipelines with AI and NLP☆318Feb 20, 2025Updated last year
- ChatPilot: Chat Agent Web UI, a chat front end that supports Google search, chat over files and URLs (RAG), and a code interpreter; it reproduces Kimi Chat (drag in a file, send a URL).☆600Jan 27, 2026Updated 3 months ago
- Structured data extraction and instruction calling with ML, LLM and Vision LLM☆5,158Updated this week
- ☆551Jul 26, 2024Updated last year
- ☆52Feb 5, 2025Updated last year
- Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.☆970Oct 15, 2024Updated last year
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks☆6,819Dec 12, 2025Updated 5 months ago
- Scrape a webpage, convert it into Markdown, and enhance AI search applications.☆257May 11, 2024Updated 2 years ago
- A Unified Toolkit for Deep Learning Based Document Image Analysis☆5,735Aug 15, 2024Updated last year
- Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.☆1,750Dec 21, 2024Updated last year
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception☆2,166Apr 14, 2025Updated last year
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.☆320Aug 15, 2025Updated 9 months ago
- Toolkit for linearizing PDFs for LLM datasets/training☆17,336Mar 25, 2026Updated 2 months ago