mathigatti / img2txt
Easy formatted text extraction from images using Google Vision API
☆41 · Updated 3 years ago
Alternatives and similar repositories for img2txt
Users that are interested in img2txt are comparing it to the libraries listed below
- GUI for training spaCy models ☆54 · Updated 4 years ago
- Extract dates from text ☆64 · Updated 4 years ago
- Analyze XML extracted from PDFs (e.g. from TET or PDFMiner) ☆20 · Updated 7 years ago
- ☆14 · Updated 3 years ago
- A system for reading scanned documents and grouping them into high level topics ☆16 · Updated 4 years ago
- Python tools for Tesseract OCR training ☆25 · Updated 3 years ago
- ☆19 · Updated 3 years ago
- Corpus and a baseline neural network system for Named Entity Recognition in Hindi-English Code-Mixed social media text. ☆45 · Updated 4 years ago
- ☆32 · Updated 6 years ago
- A set of NLP tools created during my medium NLP Explanation series. ☆31 · Updated last year
- CorrectLy - Open Source Spelling & Grammar correction ☆40 · Updated 2 years ago
- A python library for extracting text from PDFs without losing the formatting of the PDF content. ☆77 · Updated 3 years ago
- Scripts for building a geo-located web corpus using Common Crawl data ☆11 · Updated 3 weeks ago
- Generate multiple choice fill-in-the-blank questions from any article. ☆13 · Updated 2 years ago
- Named entity recognition for the legal domain ☆42 · Updated 3 years ago
- Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼 ☆22 · Updated 3 months ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 · Updated 2 years ago
- Analysis of original Lovecraft novels vs. Lovecraft-inspired boardgame text. ☆28 · Updated 4 years ago
- NLP command-line assistant powered by OpenAI ☆21 · Updated last year
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models… ☆36 · Updated 3 years ago
- Keyword extraction with spaCy ☆31 · Updated 3 years ago
- SpacyV3 Text Categorizer Tutorial ☆17 · Updated 4 years ago
- python package for calculating famous measures in computational linguistics ☆14 · Updated 6 months ago
- A tool designed to extract numerical data from scanned historical weather documents. ☆13 · Updated 5 months ago
- DL models that take a document image file as input, locate the position of paragraphs, lines, images, etc. with their labels and confiden… ☆26 · Updated 4 years ago
- ☆34 · Updated 5 years ago
- Calculates the word error rate of two strings and writes the result to beautified HTML. ☆20 · Updated 5 years ago
- An ongoing series of notebooks aimed at helping fellow NLP enthusiasts think about applying new tools and techniques to practical tasks. ☆18 · Updated 4 years ago
- Document Search Engine project with TF-IDF and Google universal sentence encoder model ☆54 · Updated 2 years ago
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 · Updated 2 weeks ago