Yuliang-Liu / MonkeyOCR
A lightweight LMM-based Document Parsing Model
☆5,743Updated last week
Alternatives and similar repositories for MonkeyOCR
Users who are interested in MonkeyOCR are comparing it to the repositories listed below
- LAYRA—an enterprise-ready, out-of-the-box solution—unlocks next-generation intelligent systems powered by visual RAG and limitless visual…☆808Updated last week
- OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex lay…☆2,270Updated last month
- AI-Powered Python & Python-Powered AI (Python-Use)☆2,023Updated last week
- Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai☆4,588Updated this week
- "Your Fully-Automated Personal AI Assistant"☆1,118Updated 2 months ago
- "VideoRAG: Chat with Your Videos"☆1,121Updated last week
- A lightweight OCR system rebuilt from PaddleOCR and decoupled from the PaddlePaddle deep learning training framework, with very fast inference☆1,379Updated 3 months ago
- Mirix is a multi-agent personal assistant designed to track on-screen activities and answer user questions intelligently. By capturing re…☆1,422Updated this week
- Multilingual Document Layout Parsing in a Single Vision-Language Model☆4,466Updated 2 weeks ago
- Convert files (PDF, image, Word, PPT, Excel, notebooks, code snippets) to markdown using powerful multimodal LLM☆296Updated 4 months ago
- An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)☆1,729Updated 3 weeks ago
- LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.☆5,983Updated this week
- ScreenCoder — Turn any UI screenshot into clean, editable HTML/CSS with full control. Fast, accurate, and easy to customize.☆2,346Updated last month
- Build multimodal language agents for fast prototype and production☆2,552Updated 6 months ago
- An open-source, end-to-end, product-grade general-purpose agent☆6,745Updated 3 weeks ago
- UltraRAG 2.0: Less Code, Lower Barrier, Faster Deployment! MCP-based low-code RAG framework, enabling researchers to build complex pipeli…☆1,565Updated 2 weeks ago
- PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides [EMNLP 2025]☆2,015Updated this week
- "MiniRAG: Making RAG Simpler with Small and Open-Sourced Language Models"☆1,427Updated last month
- The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.☆5,811Updated 3 weeks ago
- Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)☆1,920Updated 2 months ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆7,857Updated 7 months ago
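- MultiAgentPPT is an intelligent presentation generation system that integrates the A2A (Agent2Agent) + MCP (Model Context Protocol) + ADK (Agent Development Kit) architecture, with support for multi-agent collaboration and streaming concurrency mechanisms☆1,322Updated this week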
- Ragflow-Plus is a derivative build of Ragflow that makes it simpler and more practical☆1,015Updated 2 weeks ago
- ☆2,460Updated last week
- ☆2,335Updated last month
- AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment.☆1,069Updated this week
- PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation☆2,037Updated last week
- BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI…☆9,636Updated last week
- Official repository of Youtu-GraphRAG: Vertically Unified Agents for Graph Retrieval-Augmented Complex Reasoning☆551Updated this week
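- 🚀 The best-performing mobile real-time conversational digital human on the web. Supports local deployment and multimodal interaction (voice, text, facial expressions), with response times under 1.5 seconds; suited to live streaming, teaching, customer service, finance, government, and other scenarios with very high privacy and real-time requirements. Works out of the box and is developer-friendly.☆7,461Updated last week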