Yuliang-Liu / MonkeyOCR
A lightweight LMM-based Document Parsing Model
☆5,424Updated this week
Alternatives and similar repositories for MonkeyOCR
Users who are interested in MonkeyOCR are comparing it to the libraries listed below.
- LAYRA—an enterprise-ready, out-of-the-box solution—unlocks next-generation intelligent systems powered by visual RAG and limitless visual…☆792Updated last week
- OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex lay…☆2,077Updated last week
- 🌐 WebAgent for Information Seeking built by Tongyi Lab: WebWalker & WebDancer & WebSailor & WebShaper https://arxiv.org/abs/2507.15061 h…☆5,836Updated this week
- A lightweight OCR system rebuilt on PaddleOCR and decoupled from the PaddlePaddle deep learning training framework, with very fast inference☆1,308Updated last month
- Multilingual Document Layout Parsing in a Single Vision-Language Model☆1,510Updated this week
- ☆1,918Updated last week
- "Your Fully-Automated Personal AI Assistant"☆1,079Updated last month
- PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides☆1,841Updated last week
- An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)☆1,601Updated last month
- Convert files (PDF, image, Word, PPT, Excel, notebooks, code snippets) to markdown using powerful multimodal LLM☆282Updated 3 months ago
- "Vimo: Chat with Your Videos"☆861Updated last week
- Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai☆4,338Updated this week
- AI-Powered Python & Python-Powered AI (Python-Use)☆1,641Updated this week
- Mirix is a multi-agent personal assistant designed to track on-screen activities and answer user questions intelligently. By capturing re…☆957Updated this week
- Build multimodal language agents for fast prototype and production☆2,543Updated 4 months ago
- BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI…☆9,328Updated this week
- Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)☆1,906Updated 3 weeks ago
- "RAG-Anything: All-in-One RAG System"☆2,385Updated this week
- PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation☆1,920Updated 3 months ago
- LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.☆248Updated this week
- Next-Generation Interactive Intelligent Programming Assistant☆863Updated 10 months ago
- 🚀 EvoAgentX: Building a Self-Evolving Ecosystem of AI Agents☆1,064Updated this week
- "MiniRAG: Making RAG Simpler with Small and Open-Sourced Language Models"☆1,296Updated 2 months ago
- AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment.☆899Updated this week
- Collects the best currently open-source table recognition models, improves pre-processing and post-processing, and converts the models to ONNX☆785Updated last week
- ☆10,917Updated last month
- Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework☆807Updated last month
- LLM based data scientist, AI native data application. AI-driven infinite thinking redefines BI.☆2,228Updated 3 months ago
- PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books.☆3,150Updated 3 weeks ago
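- cube studio: an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform covering the full algorithm pipeline, with big-data platform integration, online notebook development, drag-and-drop pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, vGPU virtualization for inference services, edge computing, a labeling platform, automated labeling, deepseek and other…☆1,761Updated last week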