Layout-Parser / layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
☆5,221 Updated 8 months ago
Alternatives and similar repositories for layout-parser:
Users who are interested in layout-parser compare it to the libraries listed below.
- docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning. ☆4,622 Updated last week
- ☆972 Updated 3 years ago
- A Repo For Document AI ☆2,810 Updated 3 weeks ago
- This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table … ☆1,525 Updated 3 years ago
- A curated list of resources for Document Understanding (DU) topic ☆1,403 Updated last year
- Document Layout Analysis resources repos for development with PdfPig. ☆612 Updated last year
- Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o… ☆2,595 Updated 10 months ago
- A machine learning software for extracting information from scholarly documents ☆4,002 Updated 2 weeks ago
- Text preprocessing, representation and visualization from zero to hero. ☆2,903 Updated last year
- Community maintained fork of pdfminer - we fathom PDF ☆6,431 Updated last week
- Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022 ☆6,217 Updated 9 months ago
- OpenMMLab Text Detection, Recognition and Understanding Toolbox ☆4,518 Updated 5 months ago
- A Python library to extract tabular data from PDFs ☆3,284 Updated this week
- Efficient few-shot learning with Sentence Transformers ☆2,477 Updated 3 weeks ago
- DocBank: A Benchmark Dataset for Document Layout Analysis ☆608 Updated 8 months ago
- Open source annotation tool for machine learning practitioners. ☆9,957 Updated 5 months ago
- A Python library for reading and writing PDF, powered by QPDF ☆2,332 Updated 2 weeks ago
- Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and … ☆26,534 Updated 7 months ago
- The scripts for training Detectron2-based Layout Models on popular layout analysis datasets ☆210 Updated last year
- extract text from any document. no muss. no fuss. ☆4,112 Updated 5 months ago
- ☆949 Updated 7 months ago
- Software that makes labeling PDFs easy. ☆415 Updated 11 months ago
- Leveraging BERT and c-TF-IDF to create easily interpretable topics. ☆6,728 Updated 3 weeks ago
- Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables. ☆7,661 Updated last month
- Top2Vec learns jointly embedded topic, document and word vectors. ☆3,032 Updated 5 months ago
- Turn images of tables into CSV data. Detect tables from images and run OCR on the cells. ☆520 Updated 4 years ago
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,486 Updated this week
- Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the… ☆2,030 Updated 8 months ago
- TableBank: A Benchmark Dataset for Table Detection and Recognition ☆1,048 Updated 8 months ago
- Minimal keyword extraction with BERT ☆3,839 Updated last month