dell-research-harvard / effocr
A model(ing framework) for sample efficient OCR
☆53Updated last year
Related projects
Alternatives and complementary repositories for effocr
- A Large Dataset of Historical Japanese Documents with Complex Layouts☆32Updated 2 years ago
- Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models"☆35Updated 11 months ago
- TeX compilation service that makes use of arXiv.org's AutoTeX library.☆27Updated 5 months ago
- Table detection (TD) and table structure recognition (TSR) using Yolov5/Yolov8, and you can get the same (even better) result compared w…☆41Updated 4 months ago
- Datasets and Evaluation Scripts for CompHRDoc☆25Updated 7 months ago
- [MM'2024] Official implementation of "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Ext…☆24Updated last month
- [ICDAR 2023] (Oral) An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation☆70Updated 2 months ago
- ReadingBank: A Benchmark Dataset for Reading Order Detection☆91Updated 2 months ago
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions☆41Updated 7 months ago
- Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset☆24Updated last year
- Object Detection Model for Scanned Documents☆83Updated last year
- A Bottom-Up Instance Segmentation Strategy for segmenting document instances using Transformers☆55Updated 2 months ago
- An unofficial PyTorch implementation of "Lin et al. ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Informat…☆52Updated 10 months ago
- OCR & Ground Truth Resources☆74Updated 2 years ago
- Layout Analysis Dataset with Segmonto (LADaS)☆18Updated 2 weeks ago
- Official repository accompanying the ICDAR 2023 paper☆10Updated last year
- An official implementation of paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"☆74Updated last year
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis☆274Updated last year
- Inference, training and evaluation code for our models from the paper "Inv3D: a high-resolution 3D invoice dataset for template-guided si…☆42Updated 9 months ago
- Code and data for the paper at http://arxiv.org/abs/2004.07317☆16Updated 4 years ago
- [ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)☆38Updated last year
- RoDLA: Benchmarking the Robustness of Document Layout Analysis Models☆28Updated 7 months ago
- Document Image Binarization☆73Updated last month
- Powerful handwritten text recognition. A simple-to-use, unofficial implementation of the paper "TrOCR: Transformer-based Optical Characte…☆181Updated 2 years ago
- DFKI Layout Detection for OCR-D☆47Updated 2 weeks ago
- A large scale camera-taken table detection and recognition dataset.☆113Updated last year
- Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.☆165Updated this week