ocrmypdf / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆28,763Updated 2 weeks ago
Alternatives and similar repositories for OCRmyPDF
Users who are interested in OCRmyPDF compare it to the libraries listed below
- Convert PDF to markdown + JSON quickly with high accuracy☆24,928Updated this week
- #1 Locally hosted web application that allows you to perform various operations on PDF files☆58,368Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages☆17,346Updated this week
- Toolkit for linearizing PDFs for LLM datasets/training☆12,311Updated this week
- A browser extension for automating your browser by connecting blocks☆17,552Updated last month
- Integrate the DeepSeek API into popular software☆32,180Updated 2 weeks ago
- 🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.☆37,936Updated this week
- A simple screen parsing tool towards pure vision based GUI agent☆21,960Updated last month
- 🔥 Open Source No Code Web Data Extraction Platform • Turn Websites To APIs & Spreadsheets With No-Code Robots In Minutes 🔥☆12,537Updated this week
- The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on…☆32,446Updated this week
- Jan is an open source alternative to ChatGPT that runs 100% offline on your computer☆28,867Updated this week
- RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.☆51,806Updated this week
- A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.☆33,273Updated this week
- Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your p…☆43,286Updated this week
- Streamlit — A faster way to build and share data apps.☆39,245Updated this week
- 🌐 Make websites accessible for AI agents. Automate tasks online with ease.☆59,385Updated last week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.☆24,258Updated last week
- FastGPT is a knowledge-based platform built on the LLMs, offers a comprehensive suite of out-of-the-box capabilities such as data process…☆23,943Updated this week
- AI app store powered by 24/7 desktop history. open source | 100% local | dev friendly | 24/7 screen, mic recording☆14,547Updated this week
- 🤱🏻 Turn any webpage into a desktop app with Rust. 🤱🏻 Easily build lightweight multi-platform desktop apps with Rust.☆37,972Updated last month
- 🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and mor…☆23,820Updated last month
- PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved, supporting services such as Google/DeepL/Ollama/OpenAI, and providing CLI/GUI/MCP/Doc…☆22,731Updated this week
- Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.☆20,037Updated 3 weeks ago
- The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.☆43,905Updated this week
- Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provi…☆48,995Updated this week
- A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.☆34,342Updated this week
- SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither track…☆18,907Updated this week
- PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.☆7,122Updated this week
- Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, m…☆96,483Updated this week
- OCR & Document Extraction using vision models☆11,131Updated 2 weeks ago