ocrmypdf / OCRmyPDF-EasyOCR
OCRmyPDF EasyOCR plugin
☆50 Updated 2 months ago
Related projects
Alternatives and complementary repositories for OCRmyPDF-EasyOCR
- Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents. ☆106 Updated this week
- Run OCR, extract information from documents and classify them. In addition, annotate documents and build custom NLP and computer vision m… ☆62 Updated this week
- A post-processing tool for scanned sheets of paper. ☆72 Updated 8 months ago
- Building ScanTailor and its dependencies ☆55 Updated last year
- Web-based editor for subtitles and transcripts ☆110 Updated 2 months ago
- Ergonomic line-by-line transcription of scanned text. ☆47 Updated 3 years ago
- User-contributed (non-Google) OCR models for Tesseract ☆22 Updated 2 weeks ago
- ScanTailor Universal - a fork based on Enhanced+Featured+Master versions of ST ☆184 Updated 2 months ago
- Logical structure analysis for visually structured documents ☆82 Updated 2 years ago
- 📑 Python package to reconstruct the original continuous text from PDFs with language models ☆33 Updated last year
- Benchmarking PDF libraries ☆222 Updated last year
- Get semantic HTML from PDFs, recover lost text, tables, data... in bulk. ☆28 Updated 10 months ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆369 Updated 2 months ago
- Record audio and save a transcription to your system's clipboard with ctranslate2 and faster-whisper. ☆66 Updated 3 weeks ago
- A curated list of resources around PDF files ☆107 Updated 3 months ago
- ScanTailor Advanced is the version that merges the features of the ScanTailor Featured and ScanTailor Enhanced versions, brings new ones … ☆206 Updated 2 months ago
- Provides OCR (Optical Character Recognition) services through web applications ☆237 Updated 9 months ago
- Apply different text recognition services to images of handwritten documents. ☆172 Updated last year
- Toolkit for training/converting LibreTranslate-compatible language models 🚂 ☆48 Updated 2 weeks ago
- Python library to extract tabular data from images and scanned PDFs ☆261 Updated 3 months ago
- Library used to deskew a scanned document ☆418 Updated last month
- Convert a PDF via OCR to a TXT file in UTF-8 encoding ☆141 Updated last year
- A deep learning toolkit specialized for handwritten document analysis ☆207 Updated 2 months ago
- Conversions between various OCR formats ☆71 Updated last year
- Tools to process books in a cloud-based pipeline system ☆50 Updated this week
- A Python library to extract tabular data from PDFs ☆46 Updated this week
- Nougat is Meta AI's revolutionary OCR model designed to transcribe scientific PDFs into an easy-to-use Markdown format. ☆21 Updated last year
- Make a searchable PDF via Google Cloud Vision OCR ☆14 Updated 4 years ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆181 Updated 3 weeks ago
- `pdfstructure` detects, splits and organizes the document's text content into its natural structure as envisioned by the author. ☆100 Updated 7 months ago