ocrmypdf / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆30,479Updated 3 weeks ago
Alternatives and similar repositories for OCRmyPDF
Users who are interested in OCRmyPDF are comparing it to the libraries listed below.
- OCR, layout analysis, reading order, table recognition in 90+ languages☆17,882Updated this week
- A browser extension for automating your browser by connecting blocks☆19,398Updated last week
- Toolkit for linearizing PDFs for LLM datasets/training☆13,346Updated this week
- A high-quality, one-stop open-source data extraction tool that converts PDF to Markdown and JSON.☆40,561Updated last week
- 🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.☆43,378Updated last week
- PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/MCP/Doc…☆25,940Updated last week
- OCR & Document Extraction using vision models☆11,603Updated 2 months ago
- Convert PDF to markdown + JSON quickly with high accuracy☆26,737Updated last week
- 🔥 Open-source no code web data extraction platform. Instantly turn any website into API or spreadsheet 🔥☆13,303Updated this week
- Python tool for converting files and office documents to Markdown.☆69,708Updated last month
- Comfortably monitor your Internet traffic 🕵️‍♂️☆29,378Updated this week
- SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither track…☆20,724Updated this week
- #1 Locally hosted web application that allows you to perform various operations on PDF files☆63,673Updated last week
- An open-source cross-platform alternative to AirDrop☆64,938Updated this week
- User-friendly AI Interface (Supports Ollama, OpenAI API, ...)☆103,856Updated this week
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆7,739Updated 5 months ago
- Perplexica is an AI-powered search engine. It is an Open source alternative to Perplexity AI☆23,247Updated this week
- User-friendly Desktop Client App for AI Models/LLMs (GPT, Claude, Gemini, Ollama...)☆35,928Updated last week
- SOTA Open Source TTS☆22,463Updated last week
- Get your documents ready for gen AI☆34,716Updated this week
- very good whiteboard SDK / infinite canvas SDK☆40,902Updated this week
- Elegant reading of real-time and hottest news☆12,128Updated last week
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆8,236Updated 6 months ago
- Yet Another Document Translator☆4,675Updated this week
- Awesome multilingual OCR and Document Parsing toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languag…☆52,029Updated this week
- A video translation and dubbing tool powered by LLMs, offering professional-grade translations and one-click full-process deployment. It…☆8,117Updated last week
- Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.☆147,730Updated this week
- Generation of diagrams like flowcharts or sequence diagrams from text in a similar manner as markdown☆81,549Updated this week
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN☆49,629Updated this week
- 📄 Awesome OCR multiple programming languages toolkits based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.☆4,672Updated this week