Yuliang-Liu / MonkeyOCR
A lightweight LMM-based Document Parsing Model
☆6,406Updated 3 weeks ago
Alternatives and similar repositories for MonkeyOCR
Users who are interested in MonkeyOCR are comparing it to the libraries listed below
- LAYRA—an enterprise-ready, out-of-the-box solution—unlocks next-generation intelligent systems powered by visual RAG and limitless visual…☆892Updated 2 months ago
- OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex lay…☆2,416Updated 4 months ago
- AI-Powered Python & Python-Powered AI (Python-Use)☆3,174Updated this week
- An Autonomous Agentic Framework for Reflective PowerPoint Generation☆2,966Updated this week
- An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)☆1,824Updated 4 months ago
- UltraRAG v2: A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines☆2,386Updated this week
- "Your Fully-Automated Personal AI Assistant"☆1,317Updated 2 months ago
- A lightweight OCR system rebuilt from PaddleOCR and decoupled from the PaddlePaddle deep learning training framework, with very fast inference☆1,625Updated 2 months ago
- BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI…☆10,792Updated this week
- Multilingual Document Layout Parsing in a Single Vision-Language Model☆5,936Updated this week
- 🚀 The best real-time interactive AI avatar (digital human) with on-premise deployment and <1.5 s latency.☆7,704Updated 2 weeks ago
- ☆1,374Updated this week
- Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai☆4,811Updated this week
- [KDD'2026] "VideoRAG: Chat with Your Videos"☆1,591Updated this week
- [EMNLP-2024] Build multimodal language agents for fast prototype and production☆2,617Updated 9 months ago
- ☆810Updated 2 months ago
- Ragflow-Plus is a derivative build of Ragflow that makes it simpler and more practical☆1,157Updated 2 weeks ago
- Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)☆1,940Updated 2 months ago
- ScreenCoder — Turn any UI screenshot into clean, editable HTML/CSS with full control. Fast, accurate, and easy to customize.☆2,523Updated 2 months ago
- Convert files (PDF, image, Word, PPT, Excel, notebooks, code snippets) to markdown using powerful multimodal LLM☆315Updated 7 months ago
- 🚀 EvoAgentX: Building a Self-Evolving Ecosystem of AI Agents☆2,417Updated last week
- AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment.☆1,237Updated 3 weeks ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆8,039Updated 10 months ago
- 🚀 Truly open-source AI avatar (digital human) toolkit for offline video generation and digital human cloning.☆12,017Updated 2 months ago
- AingDesk is a simple and easy-to-use AI assistant that supports knowledge bases, model APIs, sharing, web search, and agents, and it is still growing quickly.☆2,439Updated 6 months ago
- Collects the best currently open-source table recognition models, improves pre-processing and post-processing, and converts the models to ONNX☆906Updated 4 months ago
- A quick vibe coded app for deepseek OCR☆1,536Updated last month
- [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation☆1,314Updated 2 weeks ago
- PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation☆2,343Updated 3 months ago
- ☆2,498Updated 4 months ago