ocrmypdf / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆32,099 Updated this week
Alternatives and similar repositories for OCRmyPDF
Users who are interested in OCRmyPDF often compare it to the repositories listed below.
- Toolkit for linearizing PDFs for LLM datasets/training ☆16,322 Updated last week
- Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and … ☆28,627 Updated 3 weeks ago
- A browser extension for automating your browser by connecting blocks ☆20,829 Updated 2 months ago
- #1 PDF Application on GitHub that lets you edit PDFs on any device anywhere ☆71,589 Updated this week
- OCR & Document Extraction using vision models ☆11,997 Updated 7 months ago
- Turn any website into clean data pipelines & structured APIs in minutes! ☆14,080 Updated last week
- Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper. ☆15,937 Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,028 Updated 2 months ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆8,041 Updated 10 months ago
- Netflix-level subtitle cutting, translation, alignment, and even dubbing: one-click, fully automated AI video subtitle team ☆15,480 Updated 7 months ago
- Open source Python library for converting PDF to DOCX. ☆3,240 Updated 7 months ago
- 🤱🏻 Turn any webpage into a desktop app with one command. ☆44,231 Updated this week
- Elegant reading of real-time and hottest news ☆15,532 Updated last week
- 🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and mor… ☆26,171 Updated last month
- Comfortably monitor your Internet traffic 🕵️‍♂️ ☆32,234 Updated this week
- Web Extension for saving a faithful copy of a complete web page in a single HTML file ☆19,839 Updated last week
- Yet Another Document Translator ☆6,240 Updated 2 weeks ago
- The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface. ☆98,284 Updated this week
- Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ in… ☆164,263 Updated last week
- Tesseract Open Source OCR Engine (main repository) ☆71,581 Updated this week
- Easy P2P file transfer powered by WebRTC - inspired by Apple AirDrop ☆10,687 Updated 10 months ago
- Industry leading face manipulation platform ☆26,180 Updated last week
- A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations ☆16,264 Updated this week
- An extremely fast Python package and project manager, written in Rust. ☆75,603 Updated this week
- A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang. ☆40,123 Updated this week
- SOTA Open Source TTS ☆24,402 Updated 3 weeks ago
- The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra ☆20,078 Updated 2 weeks ago
- screen sharing for developers https://screego.net/ ☆10,094 Updated last week
- Convert PDF to markdown + JSON quickly with high accuracy ☆30,547 Updated this week
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows. ☆51,012 Updated this week