chatdoc-com/OCRFlux

OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex layout handling, complicated table parsing and cross-page content merging.

☆2,523

Alternatives and similar repositories for OCRFlux

Users that are interested in OCRFlux are comparing it to the libraries listed below.

Yuliang-Liu / MonkeyOCR
A lightweight LMM-based Document Parsing Model
☆6,607 · Updated this week
studio-dots-ai / dots.ocr
Multilingual Document Layout Parsing in a Single Vision-Language Model
☆9,028 · Mar 24, 2026 · Updated 4 months ago
NanoNets / docext
An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)
☆2,032 · Mar 17, 2026 · Updated 4 months ago
allenai / olmocr
Toolkit for linearizing PDFs for LLM datasets/training
☆19,182 · Mar 25, 2026 · Updated 4 months ago
opendatalab / MinerU
Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
☆75,694 · Updated this week
bytedance / Dolphin
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
☆9,039 · Mar 25, 2026 · Updated 4 months ago
Ucas-HaoranWei / GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
☆8,158 · Feb 10, 2025 · Updated last year
panyanyany / Twocast
AI Podcast Generator for bilingual episodes, Multi Languages, Alternative to NotebookLM; AI podcast generator with realistic human dialogue, multilingual, multiple voices
☆1,248 · Jul 1, 2025 · Updated last year
opendatalab / OmniDocBench
[CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation
☆1,914 · Updated this week
alibaba / Logics-Parsing
☆1,394 · May 13, 2026 · Updated 2 months ago
aoguai / LiYing
LiYing is an automated photo processing program designed for automating the post-processing workflow of ID photos in general photo studio…
☆3,242 · Jun 28, 2026 · Updated 3 weeks ago
datalab-to / surya
OCR, layout analysis, reading order, table recognition in 90+ languages
☆21,149 · Updated this week
MarkPDFdown / markpdfdown
A high-quality PDF to Markdown tool based on large language model visual recognition.
☆1,930 · Jan 25, 2026 · Updated 6 months ago
RapidAI / RapidOCR
📄 Awesome OCR multiple programming languages toolkits based on ONNX Runtime, OpenVINO, MNN, PaddlePaddle, TensorRT and PyTorch.
☆7,257 · Updated this week
PaddlePaddle / PaddleOCR
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/…
☆86,238 · Updated this week
datalab-to / marker
Convert PDF to markdown + JSON quickly with high accuracy
☆37,843 · Updated this week
opendatalab / PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
☆9,806 · Jan 3, 2025 · Updated last year
getomni-ai / zerox
OCR & Document Extraction using vision models
☆12,258 · May 20, 2025 · Updated last year
opendatalab / DocLayout-YOLO
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
☆2,235 · Apr 14, 2025 · Updated last year
oomol-lab / pdf-craft
PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books.
☆6,024 · Jun 27, 2026 · Updated 3 weeks ago
Tencent-Hunyuan / HunyuanOCR
HunyuanOCR-1.5: Making Lightweight OCR VLMs Faster and Better
☆1,881 · Updated this week
sjzar / chatlog
Chat log tool, easily use your own chat data.
☆9,189 · Oct 20, 2025 · Updated 9 months ago
PDFMathTranslate / PDFMathTranslate
[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents that fully preserves layout, supporting services such as Google/DeepL/Ollama/OpenAI, …
☆35,783 · May 25, 2026 · Updated 2 months ago
CosmosShadow / gptpdf
Using GPT to parse PDF
☆3,561 · Apr 17, 2025 · Updated last year
google / langextract
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive vi…
☆37,814 · Updated this week
funstory-ai / BabelDOC
Yet Another Document Translator
☆8,996 · Jul 16, 2026 · Updated last week
infiniflow / ragflow
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to creat…
☆85,976 · Updated this week
index-tts / index-tts
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
☆22,147 · Jul 14, 2026 · Updated last week
refly-ai / refly
The first open-source agent skills builder. Define skills by vibe workflow, run on Claude Code, Cursor, Codex & more. Build Clawdbot 🦞· …
☆7,462 · Mar 25, 2026 · Updated 4 months ago
CherryHQ / cherry-studio
AI productivity studio with smart chat, autonomous agents, and 300+ assistants. Unified access to frontier LLMs
☆48,978 · Updated this week
Tencent / POINTS-Reader
☆197 · Dec 7, 2025 · Updated 7 months ago
docling-project / docling
Get your documents ready for gen AI
☆63,762 · Updated this week
hiroi-sora / Umi-OCR
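OCR software, free and offline. Open-source, free offline OCR software. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Built-in multilingual libraries.
☆46,223 · Nov 20, 2025 · Updated 8 months ago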
xxnuo / MTranServer
Offline translation model server with low resource consumption, fast speed, and private deployment capability.
☆4,629 · Mar 8, 2026 · Updated 4 months ago
landing-ai / agentic-doc
Legacy Python library for Agentic Document Extraction (ADE). Use the landingai-ade library for all new projects.
☆2,395 · Mar 24, 2026 · Updated 4 months ago
shcherbak-ai / contextgem
ContextGem: Effortless LLM extraction from documents
☆1,863 · Jun 6, 2026 · Updated last month
Alibaba-NLP / DeepResearch
Tongyi Deep Research, the Leading Open-source Deep Research Agent
☆19,719 · Feb 27, 2026 · Updated 4 months ago
krillinai / KrillinAI
AI video translation & dubbing tool for humans and AI Agents, powered by LLMs. Full pipeline: download, transcribe, translate, TTS dub, r…
☆10,546 · Updated this week
coze-dev / coze-studio
An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. C…
☆21,248 · Apr 20, 2026 · Updated 3 months ago