chatdoc-com / OCRFlux
OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex layout handling, complicated table parsing and cross-page content merging.
☆2,341Updated 2 months ago
Alternatives and similar repositories for OCRFlux
Users who are interested in OCRFlux are comparing it to the libraries listed below:
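- MultiAgentPPT is an intelligent presentation generation system built on an integrated A2A (Agent2Agent) + MCP (Model Context Protocol) + ADK (Agent Development Kit) architecture, supporting, through multi-agent collaboration and a streaming concurrency mechanism…☆1,376Updated last month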
- A high-quality PDF to Markdown tool based on large language model visual recognition.☆1,603Updated last month
- An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)☆1,785Updated 2 months ago
- E2M converts various file types (doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, m4a) into Markdown. It’s easy to install, with ded…☆1,234Updated last year
- ☆713Updated 2 weeks ago
- Transcribe and summarize video content using AI. Open-source, multi-platform, and supports multiple languages.☆1,419Updated this week
- PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books.☆3,301Updated last month
- Multilingual Document Layout Parsing in a Single Vision-Language Model☆5,432Updated 3 weeks ago
- PDF scientific paper translation with preserved formats. AI-based full-text bilingual translation of PDF documents that fully preserves the layout; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker☆1,547Updated this week
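- A free, open-source alternative to Wispr Flow | A next-generation Chinese desktop voice workflow that integrates local FunASR models and configurable large language models☆1,591Updated 3 weeks ago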
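- An LLM-based presentation generation platform that automatically converts document content into professional PPT presentations. The platform supports multiple AI models and offers a rich selection of templates and styles, so users can create high-quality presentations.☆1,317Updated last week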
- A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic…☆721Updated last week
- A Unicode-based text digital watermarking tool for embedding invisible copyright marks and metadata in text content.☆744Updated 3 months ago
- UltraRAG 2.0: Less Code, Lower Barrier, Faster Deployment! MCP-based low-code RAG framework, enabling researchers to build complex pipeli…☆1,768Updated this week
- Fogsight is an AI agent and animation engine powered by Large Language Models.☆1,280Updated 2 months ago
- LiYing is an automated photo processing program designed for automating the post-processing workflow of ID photos in general photo studio…☆2,977Updated 2 weeks ago
- AI podcast generator for bilingual episodes, multi-language, an alternative to NotebookLLM; generates lifelike conversational AI podcasts in multiple languages and multiple voices☆1,087Updated 4 months ago
- A quick vibe coded app for deepseek OCR☆1,242Updated last week
- PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides [EMNLP 2025]☆2,148Updated last week
- [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation☆1,111Updated this week
- An MCP (Model Context Protocol) server for PowerPoint manipulation using python-pptx. This server provides tools for creating, editing, an…☆1,157Updated 2 weeks ago
- AI Prompt Optimization Platform is a professional prompt engineering tool designed to help users optimize AI model prompts, enhancing the…☆573Updated 4 months ago
- This is a 12306 ticket search server based on the Model Context Protocol (MCP).☆630Updated 3 weeks ago
- A lightweight OCR system rebuilt from PaddleOCR and decoupled from the PaddlePaddle deep learning training framework, with very fast inference…☆1,501Updated 4 months ago
- LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.☆6,576Updated last week
- Youtu-GraphRAG boosts cost efficiency, inference accuracy, and cross-domain adaptability, pushing the boundaries of performance in comple…☆862Updated this week
- AingDesk is a simple and easy-to-use AI assistant that supports knowledge bases, model APIs, sharing, web search, and agents, and it is still growing fast…☆2,363Updated 3 months ago
- A context-aware AI assistant for your desktop. Ready to respond intelligently, seamlessly integrating multiple LLMs and MCP tools.☆1,427Updated last week
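- A document parsing tool based on DeepSeek-OCR. It processes PDF documents and images efficiently, provides powerful optical character recognition (OCR), and supports multilingual text recognition, table parsing, chart analysis, and other features.☆88Updated last week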
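- ⭐ A zero-barrier 3D desktop companion! Supports QQ integration, Bilibili live streaming, RAG, web access, long-term memory, Tavern character cards, claude code, browser control, Dify, Home Assistant, MCP, A2A, Comfyui, digital-human narration, and more! ⭐ A 3D desktop companion with…☆1,128Updated last week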