CopenhagenCityArchives / CorrectOCR
Machine Learning-assisted correction of OCR errors in historical corpora
☆10 Updated 8 months ago
Alternatives and similar repositories for CorrectOCR
Users that are interested in CorrectOCR are comparing it to the libraries listed below
- Neural Elastic Inference and Search ☆19 Updated 5 years ago
- This repository contains code and data download instructions for the workshop paper "Improving Hierarchical Product Classification using … ☆17 Updated 4 years ago
- A Python package to get useful information from documents using TopicRank Algorithm. ☆16 Updated 2 years ago
- A few end to end examples that use data-describe ☆16 Updated 2 years ago
- Simple and clean Python implementation of TextRank as per seminal paper by Rada Mihalcea and Paul Tarau. This implementation performs bot… ☆11 Updated 4 years ago
- Code examples for Google Natural Language API. ☆13 Updated 5 years ago
- Given a text, wrap it into phrases and send them to Yandex's search engine. If it yields a "did you mean:", substitute the original phras… ☆11 Updated 6 years ago
- ☆14 Updated 6 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 Updated 2 years ago
- Text classification automl ☆21 Updated 3 years ago
- A web app built with Streamlit that summarizes input text ☆13 Updated 4 years ago
- Loan Risk Prediction Neural Network and API ☆17 Updated 4 years ago
- Tool for the Automatic Assessment of Lexical Diversity ☆12 Updated 4 years ago
- TensorFlow materials ☆13 Updated 4 years ago
- This repository auto-configures an Apache Pinot and Superset cluster for analyzing IRA tweets from FiveThirtyEight. ☆11 Updated 4 years ago
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 Updated 2 months ago
- ☆13 Updated 2 years ago
- A tool designed to extract numerical data from scanned historical weather documents. ☆13 Updated 7 months ago
- Extracting narrative timelines (i.e. order and timing of events) from text ☆20 Updated 6 years ago
- Using Machine Learning to Create High-Res Fine Art ☆12 Updated last year
- NLG Best Practices for Data-Efficient Modeling How to Train Production-Ready Models with Little Data ☆10 Updated 3 years ago
- ☆34 Updated 5 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts. ☆38 Updated 6 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 Updated 2 years ago
- Code and data for Teddy https://arxiv.org/abs/2001.05171. ☆15 Updated 3 years ago
- This library builds a graph-representation of the content of PDFs. The graph is then clustered, resulting page segments are classified an… ☆23 Updated 4 years ago
- Corpus Build OCR platform ☆8 Updated 2 years ago
- Using Machine Learning to Create Funny Memes ☆25 Updated 2 years ago
- text-data pre-processing utility ☆13 Updated 3 years ago
- Uses Beautiful Soup to read Wiki pages, Gensim to summarize, NLTK to process, and extracts keywords based on entropy: everything in one b… ☆9 Updated 4 years ago