mawanda-jun / IntelligentOCR
An intelligent OCR tool that detects tables and plain text inside PDFs and produces a CSV file and a TXT file from them
☆14Updated 6 years ago
Alternatives and similar repositories for IntelligentOCR
Users that are interested in IntelligentOCR are comparing it to the libraries listed below
- Play the card game Baccarat☆14Updated last year
- This library builds a graph-representation of the content of PDFs. The graph is then clustered, resulting page segments are classified an…☆23Updated 4 years ago
- An efficient data structure for fast string similarity searches☆22Updated 4 years ago
- Machine Learning-assisted correction of OCR errors in historical corpora☆10Updated 8 months ago
- Document Image Classification☆11Updated 7 years ago
- Build a deep learning model for predicting the named entities from text.☆56Updated 6 years ago
- A DeepWalk implementation for ontologies using NetworkX and Gensim☆19Updated 8 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture☆39Updated last year
- Simple and clean Python implementation of TextRank as per seminal paper by Rada Mihalcea and Paul Tarau. This implementation performs bot…☆11Updated 4 years ago
- Text classification automl☆21Updated 3 years ago
- Desktop Version of Docuburst☆19Updated 8 years ago
- A workflow system for Natural Language Processing.☆21Updated 5 years ago
- Extracting narrative timelines (i.e. order and timing of events) from text☆20Updated 6 years ago
- Neural Elastic Inference and Search☆19Updated 5 years ago
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip…☆70Updated this week
- Indexing GDELT database into Elasticsearch, entire database including the -each 15 minutes- real time events☆13Updated 5 years ago
- Given a text, wrap it into phrases and send them to Yandex's search engine. If it yields a "did you mean:", substitute the original phras…☆11Updated 6 years ago
- PDF Table Extractor - repository to hold revisable version of code from https://www.cvast.tuwien.ac.at/projects/pdf2table by Burcu Yildiz☆38Updated last year
- This is a REST Server endpoint built using Flask and Python.☆24Updated 2 years ago
- Code examples for Google Natural Language API.☆13Updated 5 years ago
- Neural Solr = Solr 9 + Mighty Inference + Node☆17Updated 3 years ago
- Graphical techniques for text mining.☆19Updated 10 years ago
- A simple viewer and inspection tool for text boxes in PDF documents☆95Updated 3 years ago
- Classification and detection of polarizing events in the news☆17Updated 10 years ago
- This repository contains the DFKI Product Corpus, a dataset of 174 documents annotated for product and company named entities, and the re…☆12Updated 10 months ago
- Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼☆22Updated 5 months ago
- This repository contains code and data download instructions for the workshop paper "Improving Hierarchical Product Classification using …☆17Updated 4 years ago
- Layout analysis + OCR☆11Updated 3 years ago
- Text preprocessing tools in python.☆27Updated 7 years ago
- Tool for the Automatic Assessment of Lexical Diversity☆12Updated 4 years ago