mindee / doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
☆5,569 Updated this week
Alternatives and similar repositories for doctr
Users who are interested in doctr are comparing it to the libraries listed below.
- A Repo For Document AI ☆2,990 Updated this week
- Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o… ☆2,766 Updated last year
- Open-source framework for analyzing the life cycle of digital skills using survival analysis and epidemiological modeling ☆13 Updated 5 months ago
- A curated list of resources for Document Understanding (DU) topic ☆1,469 Updated 2 years ago
- Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022 ☆6,606 Updated last year
- This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table … ☆1,549 Updated 4 years ago
- img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing ☆803 Updated 2 months ago
- ☆984 Updated last year
- OpenMMLab Text Detection, Recognition and Understanding Toolbox ☆4,662 Updated 11 months ago
- A Unified Toolkit for Deep Learning Based Document Image Analysis ☆5,550 Updated last year
- Links to awesome OCR projects ☆3,056 Updated last year
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception ☆1,729 Updated 6 months ago
- Mindee API Helper Library for Node.js ☆25 Updated this week
- OCR engine for all the languages ☆900 Updated last week
- Mindee API Helper Library for Python ☆42 Updated 2 weeks ago
- Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and … ☆28,232 Updated last year
- A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model. ☆1,467 Updated last month
- Official implementation of Character Region Awareness for Text Detection (CRAFT) ☆3,322 Updated last year
- Turn images of tables into CSV data. Detect tables from images and run OCR on the cells. ☆521 Updated 4 years ago
- A Python library to extract tabular data from PDFs ☆3,497 Updated this week
- NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extra… ☆2,753 Updated this week
- A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team… ☆1,789 Updated 6 months ago
- Structured data extraction and instruction calling with ML, LLM and Vision LLM ☆5,017 Updated last week
- ☆1,025 Updated 3 months ago
- A synthetic data generator for text recognition ☆3,582 Updated last year
- Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections. ☆2,762 Updated 8 months ago
- Transforms PDF, Documents and Images into Enriched Structured Data ☆6,012 Updated last year
- Library used to deskew a scanned document ☆489 Updated 3 weeks ago
- UniTable: Towards a Unified Table Foundation Model ☆510 Updated last year
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆7,952 Updated 8 months ago