☆11Nov 14, 2021Updated 4 years ago
Alternatives and similar repositories for post-ocr-correction
Users that are interested in post-ocr-correction are comparing it to the libraries listed below
- Python 3 library for processing historical English☆68Aug 10, 2024Updated last year
- Data and scripts for the proper evaluation of cross-lingual embeddings in multiple languages☆15Apr 11, 2020Updated 5 years ago
- Minimal code to train ELMo models in recent versions of TensorFlow☆14Apr 30, 2023Updated 2 years ago
- OCR post correction for old German corpus☆19Aug 29, 2022Updated 3 years ago
- ☆141Mar 5, 2024Updated last year
- A part-of-speech tagger with support for domain adaptation and external resources.☆24Oct 26, 2022Updated 3 years ago
- ☆25Apr 18, 2020Updated 5 years ago
- The amazing 🐕 will normalize non-standard Finnish/Swedish and dialectalize standard Finnish!☆30Aug 10, 2024Updated last year
- Writing Observer and Learning Observer: A system for monitoring learning process data, with an initial focus on writing process data from…☆12Feb 24, 2026Updated last week
- Post-processing OCR errors with seq2seq models☆28Jul 30, 2020Updated 5 years ago
- German sentiment analysis☆13Mar 8, 2017Updated 8 years ago
- Template and steps to build your personal blog using Jekyll and Minimal Mistakes☆10Feb 24, 2020Updated 6 years ago
- Using the function read.table() to break a file into chunks, then loop over and process them. This allows processing files of any size beyond what …☆10Aug 19, 2014Updated 11 years ago
- Python course materials for students of the "Computational Linguistics" continuing professional education program at HSE University (2020-2021)☆11Feb 21, 2022Updated 4 years ago
- ☆10Jul 6, 2023Updated 2 years ago
- Python library☆12Nov 25, 2025Updated 3 months ago
- Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models"☆39Dec 2, 2023Updated 2 years ago
- Twitter Dataset and Finetuned Transformer Model for Turkish Sentiment Analysis☆14Jul 29, 2022Updated 3 years ago
- ☆12Jun 29, 2025Updated 8 months ago
- Newspaper Segmentation into images and text☆12Jan 11, 2019Updated 7 years ago
- A short demo of (r)Ollama☆11Oct 17, 2024Updated last year
- golang package to provide lightweight internal pub/sub for goroutines☆29Jan 23, 2014Updated 12 years ago
- Romanian Word Embeddings. Here you can find pre-trained corpora of word embeddings. Current methods: CBOW, Skip-Gram, Fast-Text (from Gen…☆13Oct 6, 2025Updated 4 months ago
- ☆13Apr 24, 2023Updated 2 years ago
- ☆11Mar 31, 2023Updated 2 years ago
- Vossian Antonomasia☆10Oct 17, 2025Updated 4 months ago
- Automatic Detection of Potentially Idiomatic Expressions☆12Feb 19, 2021Updated 5 years ago
- An example CI/CD pipeline using GitHub Actions for doing continuous deployment of AWS Glue jobs built on PySpark and Jupyter Notebooks.☆13Oct 15, 2020Updated 5 years ago
- Postman collections for Redfish requests against HPE servers☆13Apr 18, 2021Updated 4 years ago
- ☆10Apr 3, 2024Updated last year
- A high level pool for maintaining pools of *sql.DB databases (e.g: thousands of SQLite files)☆10Oct 29, 2016Updated 9 years ago
- convert PubLayNet data into METS/PAGE-XML☆10Mar 17, 2020Updated 5 years ago
- Coursera - RNN Programming Assignment: In this project, we will construct a speech dataset and implement an algorithm for trigger word de…☆10Aug 29, 2021Updated 4 years ago
- The source codes for D2AGE model. Distance-aware DAG Embedding for Proximity Search on Heterogeneous Graphs.☆12Feb 20, 2018Updated 8 years ago
- AI assistance☆12Jan 5, 2023Updated 3 years ago
- ☆12Oct 17, 2022Updated 3 years ago
- Code for "ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer"☆15Jul 17, 2024Updated last year
- Authentication and lookups for language (ISO 639-1) and region (ISO 3166-1 alpha-2) codes☆30Jan 23, 2015Updated 11 years ago
- Easily download base64 strings as image files in React.☆10Jul 18, 2023Updated 2 years ago