maxent-ai / ocrpy
OCR, Archive, Index and Search: Implementation agnostic OCR framework.
★222 · Updated last year
Alternatives and similar repositories for ocrpy:
Users who are interested in ocrpy are comparing it to the libraries listed below.
- ETL processes for medical and scientific papers ★373 · Updated 3 weeks ago
- Labelling platform for text using weak supervision. ★260 · Updated 2 years ago
- Gain clues from clustering! ★311 · Updated 6 months ago
- Neural Search ★326 · Updated 7 months ago
- just a bunch of useful embeddings for scikit-learn pipelines ★480 · Updated last week
- Semantic search for headlines and story text ★359 · Updated last year
- spock is a framework that helps manage complex parameter configurations during research and development of Python applications ★129 · Updated last year
- Doubt your data, find bad labels. ★508 · Updated 6 months ago
- Highlight text in documents ★99 · Updated last month
- Super lightweight function registries for your library ★176 · Updated 7 months ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… ★244 · Updated last year
- Custom recipe and utilities for document processing ★198 · Updated 2 years ago
- Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te… ★315 · Updated last year
- Python package to generate image embeddings with CLIP without PyTorch/TensorFlow ★143 · Updated 2 years ago
- Model Agnostic Confidence Estimator (MACEST) - A Python library for calibrating Machine Learning models' confidence scores ★100 · Updated last year
- Information extraction from English and German texts based on predicate logic ★135 · Updated last year
- An open-source AutoML Library based on PyTorch ★306 · Updated 3 weeks ago
- Check if you have training samples in your test set ★64 · Updated 2 years ago
- Label data at scale. Fun and precision included. ★322 · Updated this week
- Smarter Manual Annotation for Resource-constrained collection of Training data ★224 · Updated last month
- Natural language Pandas queries and data generation powered by GPT-3 ★198 · Updated 9 months ago
- fastText + Bloom embeddings for compact, full-coverage vectors with spaCy ★301 · Updated last year
- Confection: the sweetest config system for Python ★182 · Updated 7 months ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ★106 · Updated 11 months ago
- Blazing fast framework for fine-tuning similarity learning models ★648 · Updated 3 weeks ago
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine ★242 · Updated last year
- Lightweight Experiment & Resource Monitoring ★185 · Updated last year
- Software that makes labeling PDFs easy. ★404 · Updated 8 months ago
- Weakly Supervised End-to-End Learning (NeurIPS 2021) ★157 · Updated last year
- SpikeX - SpaCy Pipes for Knowledge Extraction ★397 · Updated 3 years ago