☆11 · Nov 14, 2021 · Updated 4 years ago
Alternatives and similar repositories for post-ocr-correction
Users that are interested in post-ocr-correction are comparing it to the libraries listed below.
- Python 3 library for processing historical English ☆68 · Aug 10, 2024 · Updated last year
- Minimal code to train ELMo models in recent versions of TensorFlow ☆14 · Apr 30, 2023 · Updated 2 years ago
- Post-processing OCR errors with seq2seq models ☆28 · Jul 30, 2020 · Updated 5 years ago
- The amazing 🐕 will normalize non-standard Finnish/Swedish and dialectalize standard Finnish! ☆30 · Aug 10, 2024 · Updated last year
- Data and scripts for the proper evaluation of cross-lingual embeddings in multiple languages ☆15 · Apr 11, 2020 · Updated 5 years ago
- bin files ☆13 · Jan 30, 2025 · Updated last year
- ☆141 · Mar 5, 2024 · Updated 2 years ago
- Romanian Word Embeddings. Here you can find pre-trained corpora of word embeddings. Current methods: CBOW, Skip-Gram, Fast-Text (from Gen… ☆13 · Oct 6, 2025 · Updated 5 months ago
- A part-of-speech tagger with support for domain adaptation and external resources. ☆24 · Oct 26, 2022 · Updated 3 years ago
- OCR post correction for old German corpus ☆20 · Aug 29, 2022 · Updated 3 years ago
- ☆16 · Sep 28, 2023 · Updated 2 years ago
- Newspaper Segmentation into images and text ☆12 · Jan 11, 2019 · Updated 7 years ago
- Tools for assessing Finnish poetry: rhymes, meter, hyphenation of Finnish and so on. ☆13 · Dec 13, 2023 · Updated 2 years ago
- ☆10 · Jul 21, 2017 · Updated 8 years ago
- An example CI/CD pipeline using GitHub Actions for doing continuous deployment of AWS Glue jobs built on PySpark and Jupyter Notebooks. ☆13 · Oct 15, 2020 · Updated 5 years ago
- Simple horizontal conveyor belt animated ticker. ☆12 · Nov 23, 2022 · Updated 3 years ago
- Linguistic Analysis Command-Line Tool ☆14 · Sep 23, 2019 · Updated 6 years ago
- ☆11 · Jan 20, 2020 · Updated 6 years ago
- Coursera - RNN Programming Assignment: In this project, we will construct a speech dataset and implement an algorithm for trigger word de… ☆10 · Aug 29, 2021 · Updated 4 years ago
- ParCourE - Parallel Corpus Explorer ☆12 · Dec 27, 2021 · Updated 4 years ago
- Distribution of word meanings in Wikipedia for English, Italian, French, German and Spanish. ☆10 · Jan 4, 2021 · Updated 5 years ago
- Personal blog site using Wagtail CMS ☆19 · Dec 27, 2022 · Updated 3 years ago
- Morphological analysis for Udmurt. ☆12 · Feb 17, 2026 · Updated last month
- Jupyter notebooks for the articles on Medium about translating a cook book ☆13 · Nov 18, 2019 · Updated 6 years ago
- A Python library for converting HTML files into PDF with Chrome's engine. ☆21 · Aug 10, 2024 · Updated last year
- Materials for the Python course for students of the continuing professional education program "Computational Linguistics" at HSE University (2020-2021) ☆11 · Feb 21, 2022 · Updated 4 years ago
- Part-of-speech tagging using BERT ☆10 · Nov 14, 2019 · Updated 6 years ago
- This project is a clone of the well-known developer community website Stack Overflow. ☆10 · Jul 4, 2021 · Updated 4 years ago
- Arabic Dialect Identification on AOC data. ☆24 · Mar 2, 2019 · Updated 7 years ago
- Umbrella repository that describes the collections contained in any given release of ELTeC ☆13 · Jan 26, 2022 · Updated 4 years ago
- Automatic Detection of Potentially Idiomatic Expressions ☆12 · Feb 19, 2021 · Updated 5 years ago
- Interlinear glossing with JS & CSS ☆20 · Aug 23, 2015 · Updated 10 years ago
- Using the function read.table() to break file into chunks to loop and process them. This allows processing files of any size beyond what … ☆10 · Aug 19, 2014 · Updated 11 years ago
- The NLG tool for Finnish ☆24 · Dec 13, 2023 · Updated 2 years ago
- Copy BibTeX on Google Scholar Search page with a single click ☆20 · Nov 5, 2023 · Updated 2 years ago
- Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models" ☆39 · Dec 2, 2023 · Updated 2 years ago
- Legacy version of CNN neural net toolkit (now called dynet) ☆19 · Oct 8, 2016 · Updated 9 years ago
- Presentations, tutorials and data for the OCR workshop at LMU ☆16 · Jun 2, 2017 · Updated 8 years ago
- German sentiment analysis ☆13 · Mar 8, 2017 · Updated 9 years ago