ocropus / ocropus4-eval
Tools for evaluating OCR performance relative to ground truth.
☆10 · Updated last year
Alternatives and similar repositories for ocropus4-eval:
Users who are interested in ocropus4-eval are comparing it to the libraries listed below.
- DFKI Layout Detection for OCR-D ☆47 · Updated 3 months ago
- ☆67 · Updated 11 months ago
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 · Updated last month
- Ergonomic line-by-line transcription of scanned text. ☆51 · Updated 4 years ago
- OCR-D post-correction module based on weighted finite-state transducers ☆11 · Updated last year
- Layout Analysis Dataset with Segmonto (LADaS) ☆19 · Updated 2 weeks ago
- ☆21 · Updated 3 weeks ago
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL) ☆54 · Updated last year
- ☆12 · Updated 9 months ago
- Corpus Build OCR platform ☆8 · Updated 2 years ago
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 · Updated 3 years ago
- Nougat is Meta AI's revolutionary OCR model designed to transcribe scientific PDFs into an easy-to-use Markdown format. ☆22 · Updated last year
- OCRopus model for Gothic print (Fraktur) ☆18 · Updated 5 years ago
- User contributed (non Google) OCR models for Tesseract ☆24 · Updated 3 months ago
- Keyword spaCy is a spaCy pipeline component for extracting keywords from text using cosine similarity. ☆11 · Updated last year
- 👩🤝🤖 A curated list of datasets for large language models (LLMs), RLHF and related resources (continually updated) ☆23 · Updated last year
- Scrollership through 20m pubmed abstracts. ☆26 · Updated last year
- Chrome Extension for exploring Hugging Face datasets 🔎 ☆49 · Updated 5 months ago
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆65 · Updated this week
- Libraries, Archives and Museums (LAM) ☆82 · Updated 2 years ago
- Small python package to measure OCR quality and other related metrics. ☆21 · Updated last year
- ☆10 · Updated 5 years ago
- Cog wrapper for collabora/WhisperSpeech ☆25 · Updated 11 months ago
- Jupyter Notebooks and an R Notebook for encoding Pokémon embeddings and creating data visualizations. ☆19 · Updated 7 months ago
- A cog model for the all-mpnet-base-v2 sentence-transformers embedding model. ☆11 · Updated last year
- examples and guides to using Nomic Atlas ☆27 · Updated 2 weeks ago
- Master repository which includes most other OCR-D repositories as submodules ☆72 · Updated last week
- Glyph Miner, a system for extracting glyphs from early typeset prints ☆34 · Updated 8 years ago
- An implementation of Tiling and Corruption (TACo) Augmentations for OCR/HTR ☆15 · Updated 3 years ago
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ☆12 · Updated 6 months ago