text2knowledge / docTR-Labeler
An OCR labeling tool made for docTR
☆17 Updated 3 weeks ago
Alternatives and similar repositories for docTR-Labeler
Users who are interested in docTR-Labeler are comparing it to the libraries listed below.
- A Python library to extract tabular data from PDFs ☆66 Updated 10 months ago
- A Python module for retrieving script types of writing systems including alphabets, abjads, abugidas, syllabaries, logographs, featurals … ☆15 Updated last year
- Confection: the sweetest config system for Python ☆193 Updated last month
- A Python package to simulate typographical errors. ☆38 Updated 2 years ago
- A Python implementation of Lunr.js 🌖 ☆204 Updated 11 months ago
- Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text) ☆255 Updated 7 months ago
- 🕊️ Radically lightweight command-line interfaces ☆108 Updated 4 months ago
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities. ☆21 Updated last year
- Efficient string matching with regular expressions ☆146 Updated this week
- Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters ☆158 Updated last month
- A visual labeling system implemented in Jupyter widgets. ☆154 Updated last year
- A fun party trick to run Python code from another venv into this one. ☆218 Updated 10 months ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆170 Updated 3 years ago
- 🔢 Work with static vector models ☆36 Updated 9 months ago
- An easy way to chunk spaCy docs. ☆22 Updated last year
- A spaCy wrapper for GliNER ☆129 Updated last year
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… ☆46 Updated last year
- A different, but useful, textcat approach. ☆18 Updated last year
- Small Python package to measure OCR quality and other related metrics. ☆26 Updated last year
- Crowd-sourced lists of URLs to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ … ☆69 Updated last month
- ☆83 Updated this week
- Generalist and Lightweight Model for Text Classification ☆169 Updated last week
- ☆68 Updated 3 years ago
- Train huggingface models on top of Prodigy annotations ☆21 Updated last year
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax. ☆59 Updated last year
- Embedding Vector Oriented Clustering ☆167 Updated last week
- Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city. ☆130 Updated last month
- ☆21 Updated 2 years ago
- A bit of extra usability for sqlite ☆219 Updated last week
- Next-generation Punkt sentence boundary detection with zero dependencies ☆27 Updated 2 months ago