ocrmypdf / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆31,340 Updated 2 weeks ago
Alternatives and similar repositories for OCRmyPDF
Users who are interested in OCRmyPDF are comparing it to the libraries listed below.
- Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and … ☆28,062 Updated last year
- OCR & Document Extraction using vision models ☆11,861 Updated 4 months ago
- An open-source cross-platform alternative to AirDrop ☆68,422 Updated this week
- An open-source remote desktop application designed for self-hosting, as an alternative to TeamViewer. ☆99,312 Updated this week
- A simple screen parsing tool towards pure vision based GUI agent ☆23,622 Updated 3 weeks ago
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows. ☆45,154 Updated last week
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆7,882 Updated 7 months ago
- Toolkit for linearizing PDFs for LLM datasets/training ☆14,208 Updated this week
- User-friendly Desktop Client App for AI Models/LLMs (GPT, Claude, Gemini, Ollama...) ☆36,838 Updated 3 weeks ago
- Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/… ☆56,676 Updated this week
- Use your locally running AI models to assist you in your web browsing ☆7,162 Updated this week
- Convert PDF to markdown + JSON quickly with high accuracy ☆29,013 Updated last week
- 🍒 Cherry Studio is a desktop client that supports multiple LLM providers. ☆33,809 Updated this week
- Video translation and dubbing tool powered by LLMs. The video translator offers 100 language translations and one-click full-process depl… ☆8,590 Updated 3 weeks ago
- #1 Locally hosted web application that allows you to perform various operations on PDF files ☆68,046 Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆18,656 Updated this week
- Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections. ☆2,756 Updated 7 months ago
- PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents. ☆8,151 Updated last week
- A modern, open-source, self-hosted knowledge management and note-taking platform designed for privacy-conscious users and organizations. ☆44,747 Updated this week
- PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved, supporting services such as Google/DeepL/Ollama/OpenAI, and providing CLI/GUI/MCP/Doc… ☆28,368 Updated last week
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks ☆6,698 Updated 3 months ago
- ⚡ Easiest no code web data extraction platform • Instantly turn any website into API or spreadsheet ⚡ ☆13,678 Updated this week
- Integrate the DeepSeek API into popular software ☆33,992 Updated last week
- Peer-to-peer file transfers in your browser ☆9,510 Updated last week
- A Comprehensive Toolkit for High-Quality PDF Content Extraction ☆8,717 Updated 9 months ago
- ✨ Light and Fast AI Assistant. Support: Web | iOS | MacOS | Android | Linux | Windows ☆85,997 Updated last week
- 🤱🏻 Turn any webpage into a desktop app with one command. 🤱🏻 Package a webpage into a lightweight desktop app with one click. ☆42,528 Updated last week
- Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team | Netflix-level subtitle cut… ☆15,035 Updated 4 months ago
- Convert PDF to HTML without losing text or format. ☆5,213 Updated 2 months ago
- 🌐 Make websites accessible for AI agents. Automate tasks online with ease. ☆70,812 Updated this week