MaxHalford / orc
Parsing structured information from OCR outputs
★20 · Updated 2 years ago
Alternatives and similar repositories for orc
Users who are interested in orc are comparing it to the libraries listed below.
- A Python library aimed at dissecting and augmenting NER training data. ★60 · Updated 2 years ago
- ★43 · Updated 2 years ago
- Vespa application making an index of the CORD-19 dataset. ★40 · Updated 6 months ago
- Source code and data for Like a Good Nearest Neighbor ★30 · Updated last year
- A framework for evaluating semantic search across custom datasets, metrics, and embedding backends. ★38 · Updated 8 months ago
- A PyTorch-based open-source framework that provides methods for improving the weakly annotated data and allows researchers to efficiently… ★108 · Updated last year
- ★68 · Updated 3 years ago
- Python package for deduplication/entity resolution using active learning ★83 · Updated last year
- Information extraction from English and German texts based on predicate logic ★141 · Updated 2 years ago
- Sentence transformers models for SpaCy ★108 · Updated 2 years ago
- Data Programming by Demonstration (DPBD) for Document Classification ★35 · Updated 4 years ago
- ★84 · Updated 2 years ago
- SpaCy wrapper for ConceptNet ★95 · Updated last month
- Explainable Zero-Shot Topic Extraction ★65 · Updated last year
- ★55 · Updated 2 years ago
- Few-shot Named Entity Recognition ★121 · Updated 3 years ago
- Template-based generation of DAG cards from Metaflow classes, inspired by Google cards for machine learning models. ★30 · Updated 4 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ★105 · Updated last year
- Accurate word segmentation for hashtags and text, powered by Transformers and Beam Search. A scalable alternative to heuristic splitters … ★76 · Updated 3 weeks ago
- Super Simple Similarities Service ★155 · Updated 9 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ★111 · Updated last year
- RaKUn 2.0 - A fast keyword detection algorithm ★70 · Updated 5 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ★156 · Updated last year
- Train huggingface models on top of Prodigy annotations ★21 · Updated last year
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ★61 · Updated 2 years ago
- Use Hugging Face text and token classification pipelines directly in spaCy ★63 · Updated last year
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax. ★59 · Updated last year
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ★21 · Updated last year
- Custom Natural Language Processing with big and small models ★66 · Updated 4 years ago
- Experimental form data extraction for journalism ★78 · Updated 5 years ago