mateuszwosinski / ocr-with-bert
Improving the quality of OCR with typo recognition and correction using a pretrained BERT model.
☆10 Updated 3 years ago
Alternatives and similar repositories for ocr-with-bert
Users who are interested in ocr-with-bert compare it to the libraries listed below.
- DL models that take a document image file as input, locate the position of paragraphs, lines, images, etc. with their labels and confiden… ☆26 Updated 4 years ago
- Segmenting a given document using recursive xy-cut algorithm. ☆12 Updated 6 years ago
- Detect textlines in document images ☆93 Updated last year
- Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models" ☆36 Updated last year
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 Updated last month
- DFKI Layout Detection for OCR-D ☆47 Updated last month
- GrammarTagger — A Neural Multilingual Grammar Profiler for Language Learning ☆27 Updated 4 years ago
- Finetune multiple pre-trained Transformer-based models to solve Vietnamese Fake News Detection problem (ReINTEL) in VLSP2020 shared task ☆18 Updated 4 years ago
- Handwritten Number Recognition using CNN and Character Segmentation ☆18 Updated 7 years ago
- Robust Cross-lingual Embeddings from Parallel Sentences ☆22 Updated 4 years ago
- The largest VQA dataset for Vietnamese. Related to the text content in the image. ☆16 Updated last month
- An implementation of Tiling and Corruption (TACo) Augmentations for OCR/HTR ☆15 Updated 3 years ago
- 'ocr-evaluation-tools' from http://ancientgreekocr.org/. Tools to test OCR accuracy. ☆22 Updated 7 years ago
- Evaluation of the Layoutlm model on the CORD dataset ☆32 Updated 3 years ago
- Neural Search System on Arxiv AI/ML Papers ☆54 Updated 3 years ago
- English Handwriting Recognition with CRNN and CTC Loss ☆22 Updated 6 years ago
- Arabic edition of ALBERT pretrained language models ☆16 Updated 4 years ago
- handwritten text recognition on IAM handwriting dataset ☆15 Updated 5 years ago
- This repository contains an implementation of the "Representation Learning for Information Extraction from Form-like Documents" paper. ☆25 Updated 4 years ago
- A U-Net-based deep learning model to remove line/box/spurious artifacts from text images. Unsupervised training. ☆59 Updated 5 years ago
- A set of methods for finding an appropriate number of topics in a text collection ☆16 Updated last month
- Library for converting from RGB / GrayScale image to base64 and back. ☆19 Updated 2 years ago
- Document processing using transformers ☆21 Updated 2 years ago
- OCR-D-compliant page segmentation ☆67 Updated 3 weeks ago
- Streamlit Named Entity Recognition (NER) annotation custom component ☆38 Updated 2 years ago