ocrmypdf / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆32,099 Updated this week
Alternatives and similar repositories for OCRmyPDF
Users who are interested in OCRmyPDF are comparing it to the libraries listed below.
- Convert PDF to markdown + JSON quickly with high accuracy ☆30,547 Updated this week
- Toolkit for linearizing PDFs for LLM datasets/training ☆16,322 Updated last week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,028 Updated 2 months ago
- A Comprehensive Toolkit for High-Quality PDF Content Extraction ☆9,047 Updated 11 months ago
- Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team | Netflix-level subtitle cutting… ☆15,480 Updated 7 months ago
- PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents. ☆8,727 Updated last week
- Python scraper based on AI ☆22,099 Updated this week
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows. ☆51,012 Updated this week
- Open source Python library for converting PDF to DOCX. ☆3,240 Updated 7 months ago
- An open-source cross-platform alternative to AirDrop ☆72,370 Updated last month
- An open-source, self-hosted note-taking service. Your thoughts, your data, your control — no tracking, no ads, no subscription fees. ☆47,563 Updated this week
- Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/… ☆66,742 Updated last week
- best way to save what you love ☆37,813 Updated last week
- Perplexica is an AI-powered answering engine. It is an Open source alternative to Perplexity AI ☆27,822 Updated this week
- OCR & Document Extraction using vision models ☆11,997 Updated 7 months ago
- Comfortably monitor your Internet traffic 🕵️‍♂️ ☆32,234 Updated this week
- Python tool for converting files and office documents to Markdown. ☆84,547 Updated 3 weeks ago
- 🆙 Upscayl - #1 Free and Open Source AI Image Upscaler for Linux, MacOS and Windows. ☆41,939 Updated last week
- Powerful AI Client ☆37,896 Updated last month
- Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and … ☆28,627 Updated 3 weeks ago
- #1 PDF Application on GitHub that lets you edit PDFs on any device anywhere ☆71,589 Updated this week
- A browser extension for automating your browser by connecting blocks ☆20,829 Updated 2 months ago
- Convert PDF to HTML without losing text or format. ☆5,352 Updated 5 months ago
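- OCR software, free and offline. Open source. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Built-in multilingual libraries. ☆40,858 Updated last month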
- There can be more than Notion and Miro. AFFiNE(pronounced [ə‘fain]) is a next-gen knowledge base that brings planning, sorting and creati… ☆61,121 Updated this week
- A Python library for reading and writing PDF, powered by QPDF ☆2,558 Updated last week
- An open-source remote desktop application designed for self-hosting, as an alternative to TeamViewer. ☆104,621 Updated last week
- 🤯 LobeHub - an open-source, modern design AI Agent Workspace. Supports multiple AI providers, Knowledge Base (file upload / RAG ), one c… ☆69,437 Updated this week
- 🤱🏻 Turn any webpage into a desktop app with one command. ☆44,231 Updated this week
- cat-catch (猫抓): Browser Resource Sniffing Extension ☆17,356 Updated last week