MaxHalford / orc
Parsing structured information from OCR outputs
★19 · Updated last year
Alternatives and similar repositories for orc:
Users who are interested in orc are comparing it to the libraries listed below.
- A Python library aimed at dissecting and augmenting NER training data. ★58 · Updated last year
- Explainable Zero-Shot Topic Extraction ★62 · Updated 8 months ago
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ★30 · Updated 2 weeks ago
- ★43 · Updated 2 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ★106 · Updated last year
- spaCy match and replace, maintaining conjugation ★35 · Updated 2 years ago
- Data Programming by Demonstration (DPBD) for Document Classification ★35 · Updated 3 years ago
- Generalist and Lightweight Model for Text Classification ★123 · Updated 2 weeks ago
- Python package for deduplication/entity resolution using active learning ★78 · Updated 8 months ago
- A spaCy wrapper for GliNER ★112 · Updated 3 months ago
- Experimental form data extraction for journalism ★77 · Updated 4 years ago
- HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ★17 · Updated last year
- A VS Code extension for annotating data with Prodigy ★30 · Updated 3 years ago
- SpaCy wrapper for ConceptNet ★92 · Updated last year
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax. ★59 · Updated 11 months ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ★30 · Updated 8 months ago
- Logging utilities for spaCy ★12 · Updated last year
- ★54 · Updated last year
- Bag of, not words, but tricks! ★68 · Updated last year
- Small Python package to measure OCR quality and other related metrics. ★21 · Updated last year
- A PyTorch-based open-source framework that provides methods for improving the weakly annotated data and allows researchers to efficiently… ★108 · Updated 7 months ago
- An easy way to chunk spaCy docs. ★19 · Updated 8 months ago
- Template-based generation of DAG cards from Metaflow classes, inspired by Google cards for machine learning models. ★30 · Updated 3 years ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ★79 · Updated last year
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ★108 · Updated 11 months ago
- Information extraction from English and German texts based on predicate logic ★135 · Updated last year
- GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction ★70 · Updated 8 months ago
- Generate reports for spaCy models. ★29 · Updated 2 years ago
- Plug-and-play NLP pipelines without training. ★50 · Updated this week
- XAI based human-in-the-loop framework for automatic rule-learning. ★48 · Updated 9 months ago