ocrmypdf / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆31,925Updated last week
Alternatives and similar repositories for OCRmyPDF
Users who are interested in OCRmyPDF are comparing it to the libraries listed below.
- Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and …☆28,509Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages☆18,959Updated last month
- A browser extension for automating your browser by connecting blocks☆20,721Updated last month
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.☆49,845Updated last week
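- OCR software, free and offline. Open-source, free offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Built-in multi-language libraries.☆40,416Updated 2 weeks ago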
- AI app store powered by 24/7 desktop history. open source | 100% local | dev friendly | 24/7 screen, mic recording☆16,098Updated 3 months ago
- A privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Go.☆39,567Updated this week
- Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.☆15,786Updated this week
- Convert PDF to markdown + JSON quickly with high accuracy☆30,183Updated 2 weeks ago
- Turn any website into clean, contextualized data pipelines for your workflows☆13,950Updated last week
- A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity…☆13,727Updated 2 months ago
- Toolkit for linearizing PDFs for LLM datasets/training☆16,115Updated last week
- 🤱🏻 Turn any webpage into a desktop app with one command. Package a webpage into a lightweight desktop app with one click.☆43,683Updated 2 weeks ago
- 🧡 Folo is the AI Reader☆36,140Updated last week
- 📄 Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.☆5,353Updated last week
- Best and simplest tool for website change detection, web page monitoring, and website change alerts. Perfect for tracking content changes…☆29,138Updated last week
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆8,026Updated 9 months ago
- A privacy-first, open-source platform for knowledge management and collaboration. Download link: http://github.com/logseq/logseq/release…☆39,704Updated this week
- An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and And…☆24,336Updated last week
- A community-supported supercharged document management system: scan, index and archive all your documents☆34,626Updated this week
- Optimized implementation for color-icon-matrix barcodes☆5,682Updated this week
- #1 PDF Application on GitHub that lets you edit PDFs on any device anywhere☆70,724Updated this week
- ⬛️ CLI tool and library for saving complete web pages as a single HTML file☆14,540Updated 3 months ago
- Robust Speech Recognition via Large-Scale Weak Supervision☆91,464Updated 3 months ago
- An open-source remote desktop application designed for self-hosting, as an alternative to TeamViewer.☆103,405Updated last week
- Get your documents ready for gen AI☆45,950Updated this week
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆8,989Updated 11 months ago
- Convert PDF to HTML without losing text or format.☆5,327Updated 4 months ago
- 🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and mor…☆25,854Updated 3 weeks ago
- 🧡 Everything is RSSible☆40,240Updated this week