opendatalab / MinerU
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
☆35,037Updated this week
Alternatives and similar repositories for MinerU
Users that are interested in MinerU are comparing it to the libraries listed below
- Convert PDF to markdown + JSON quickly with high accuracy☆25,826Updated this week
- FastGPT is a knowledge-based platform built on the LLMs, offers a comprehensive suite of out-of-the-box capabilities such as data process…☆24,718Updated this week
- Production-ready platform for agentic workflow development.☆103,066Updated this week
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆7,865Updated 5 months ago
- RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.☆54,932Updated this week
- 🍒 Cherry Studio is a desktop client that supports multiple LLM providers.☆28,436Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages☆17,641Updated this week
- 💬 MaxKB is an open-source AI assistant for enterprise. It seamlessly integrates RAG pipelines, supports robust workflows, and provides M…☆16,818Updated this week
- PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Doc…☆24,685Updated last week
- 🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.☆39,889Updated this week
- Toolkit for linearizing PDFs for LLM datasets/training☆12,848Updated this week
- OCR & Document Extraction using vision models☆11,315Updated 3 weeks ago
- A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations☆14,370Updated last week
- Memory for AI Agents; Announcing OpenMemory MCP - local and secure memory management.☆34,513Updated this week
- Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you ne…☆8,030Updated this week
- 🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 4 / Gemini / Ollama / DeepSe…☆62,449Updated this week
- 🚀🚀 🌏 Train a small 26M-parameter GPT completely from scratch in just 2h!☆22,047Updated last month
- A generative speech model for daily dialogue.☆36,799Updated 3 weeks ago
- Integrate the DeepSeek API into popular software☆32,851Updated last month
- Question and Answer based on Anything.☆13,266Updated 2 months ago
- A simple screen parsing tool for pure-vision-based GUI agents☆22,426Updated 2 months ago
- A GUI Agent application based on UI-TARS(Vision-Language Model) that allows you to control your computer using natural language.☆14,558Updated this week
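- LLM API management and distribution system supporting mainstream models such as OpenAI, Azure, Anthropic Claude, Google Gemini, DeepSeek, ByteDance Doubao, ChatGLM, ERNIE Bot, iFlytek Spark, Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan, with unified API adaptation; can be used for key …☆25,679Updated 3 months ago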
- No Code Web Data Extraction Platform • Turn Websites To APIs & Spreadsheets In Minutes☆13,043Updated this week
- No fortress, purely open ground. OpenManus is Coming.☆46,732Updated last week
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks☆6,580Updated last week
- Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications based on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain…☆35,274Updated 2 months ago
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.☆40,545Updated this week
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.☆9,512Updated 2 weeks ago
- Python tool for converting files and office documents to Markdown.☆59,041Updated 2 weeks ago