ocrmypdf / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆31,746Updated last week
Alternatives and similar repositories for OCRmyPDF
Users that are interested in OCRmyPDF are comparing it to the libraries listed below
- Toolkit for linearizing PDFs for LLM datasets/training☆15,975Updated this week
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.☆48,785Updated this week
- best way to save what you love☆37,265Updated last month
- #1 Locally hosted web application that allows you to perform various operations on PDF files☆69,724Updated this week
- Tesseract Open Source OCR Engine (main repository)☆70,908Updated last month
- Python tool for converting files and office documents to Markdown.☆82,911Updated 3 weeks ago
- Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/…☆64,192Updated this week
- OCR & Document Extraction using vision models☆11,948Updated 5 months ago
- 🍒 Cherry Studio is a desktop client that supports multiple LLM providers.☆35,416Updated this week
- A browser extension for automating your browser by connecting blocks☆20,555Updated 3 weeks ago
- Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and …☆28,386Updated last year
- A self-hosted dashboard that puts all your feeds in one place☆29,737Updated 3 weeks ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆8,004Updated 9 months ago
- Get your documents ready for gen AI☆43,856Updated this week
- A Smart Ethernet Switch for Earth☆16,138Updated this week
- Yet Another Document Translator☆5,700Updated this week
- Elegant reading of real-time and hottest news☆13,974Updated 2 weeks ago
- Convert PDF to markdown + JSON quickly with high accuracy☆29,799Updated last week
- User-friendly AI Interface (Supports Ollama, OpenAI API, ...)☆115,188Updated this week
- Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team☆15,271Updated 6 months ago
- PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.☆8,439Updated last week
- A simple screen parsing tool towards pure vision based GUI agent☆23,832Updated 2 months ago
- Integrate the DeepSeek API into popular software☆34,428Updated last month
- A self-hostable bookmark-everything app (links, notes and images) with AI-based automatic tagging and full text search☆21,247Updated this week
- Generate audiobooks from e-books☆5,662Updated 8 months ago
- Generate audiobooks from e-books, voice cloning & 1107+ languages!☆15,451Updated this week
- 📄 Awesome OCR multiple programming languages toolkits based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.☆5,253Updated last week
- Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.☆156,090Updated this week
- Access your entire server infrastructure from your local desktop☆11,904Updated last week
- SOTA Open Source TTS☆24,057Updated last week