ocrmypdf / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆32,255 Updated 3 weeks ago
Alternatives and similar repositories for OCRmyPDF
Users who are interested in OCRmyPDF compare it to the libraries listed below.
- Toolkit for linearizing PDFs for LLM datasets/training ☆16,759 Updated this week
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows. ☆52,070 Updated this week
- Python tool for converting files and office documents to Markdown. ☆85,387 Updated last week
- Convert PDF to markdown + JSON quickly with high accuracy ☆30,905 Updated last week
- 🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and mor… ☆26,448 Updated 2 weeks ago
- A self-hosted dashboard that puts all your feeds in one place ☆31,246 Updated last month
- A community-supported supercharged document management system: scan, index and archive all your documents ☆35,716 Updated this week
- Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. ☆40,075 Updated this week
- #1 PDF Application on GitHub that lets you edit PDFs on any device anywhere ☆73,028 Updated this week
- 🔥 MaxKB is an open-source platform for building enterprise-grade agents. A powerful, easy-to-use open-source enterprise-grade agent platform. ☆19,839 Updated this week
- 🆙 Upscayl - #1 Free and Open Source AI Image Upscaler for Linux, MacOS and Windows. ☆42,462 Updated last week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,112 Updated 2 months ago
- AI Agent + Coding Agent + 300+ assistants: agentic AI desktop with autonomous coding, intelligent automation, and unified access to front… ☆37,809 Updated this week
- Production-ready platform for agentic workflow development. ☆126,441 Updated this week
- Universal markup converter ☆41,497 Updated this week
- PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents. ☆8,881 Updated this week
- Convert PDF to HTML without losing text or format. ☆5,372 Updated 6 months ago
- AI-Powered Photos App for the Decentralized Web 🌈💎✨ ☆39,086 Updated this week
- [EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved, supporting services such as Google/DeepL/Ollama/OpenAI,… ☆31,295 Updated last month
- An open-source RAG-based tool for chatting with your documents. ☆24,842 Updated 6 months ago
- Virtual whiteboard for sketching hand-drawn like diagrams ☆114,643 Updated this week
- OCR & Document Extraction using vision models ☆12,021 Updated 8 months ago
- The fastest knowledge base for growing teams. Beautiful, realtime collaborative, feature packed, and markdown compatible. ☆36,740 Updated this week
- A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang. ☆40,602 Updated this week
- A browser extension for automating your browser by connecting blocks ☆20,900 Updated 2 months ago
- Peer-to-peer file transfers in your browser ☆9,863 Updated last week
- 🧡 Folo is the AI Reader ☆36,631 Updated 2 weeks ago
- Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models. ☆159,631 Updated this week
- Open source real-time translation app for Android that runs locally ☆9,539 Updated 2 weeks ago
- Docmost is an open-source collaborative wiki and documentation software. It is an open-source alternative to Confluence and Notion. ☆18,560 Updated last week