OCR, Archive, Index and Search: Implementation-agnostic OCR framework.
☆224Nov 3, 2023Updated 2 years ago
Alternatives and similar repositories for ocrpy
Users that are interested in ocrpy are comparing it to the libraries listed below
- This is the code for our paper DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows☆15Jun 1, 2021Updated 4 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization.☆21Feb 7, 2023Updated 3 years ago
- Highly concurrent and fast content processing for Mighty Inference Server☆10Feb 6, 2023Updated 3 years ago
- eSNN - Learning similarity measure from data☆12Nov 28, 2019Updated 6 years ago
- Compare different encoding methods to see how well they perform on a classification task. Determine if a reddit comment is from /r/StarWa…☆13Mar 14, 2022Updated 3 years ago
- ☆31Dec 15, 2023Updated 2 years ago
- skweak: A software toolkit for weak supervision applied to NLP tasks☆926Sep 2, 2024Updated last year
- Zero and Few shot named entity & relationships recognition☆401Sep 17, 2025Updated 5 months ago
- A python package for benchmarking interpretability techniques on Transformers.☆215Sep 29, 2024Updated last year
- Doubt your data, find bad labels.☆517Jul 15, 2024Updated last year
- Basic Memory library for Haystack NLP agents☆22Dec 28, 2024Updated last year
- The most accurate natural language detection library for Python, suitable for short text and mixed-language text☆1,639Nov 21, 2025Updated 3 months ago
- An easy way to extract information from documents☆1,786May 3, 2023Updated 2 years ago
- Set-oriented Operations in Pandas☆24May 27, 2020Updated 5 years ago
- Python lib for remo - the app for annotations and images management in Computer Vision☆187Jan 4, 2021Updated 5 years ago
- UnionML: the easiest way to build and deploy machine learning microservices☆336Nov 6, 2023Updated 2 years ago
- Active Learning for Text Classification in Python☆639Feb 1, 2026Updated last month
- Extracts a latent knowledge graph from text and index/query it in elasticsearch or solr☆21Jan 28, 2022Updated 4 years ago
- Natural language Pandas queries and data generation powered by GPT-3☆200Apr 13, 2024Updated last year
- Explore the DALL·E 2 API in Python☆55Dec 14, 2022Updated 3 years ago
- 📊 Semantic search for headlines and story text☆359Sep 23, 2023Updated 2 years ago
- Topic Inference with Zeroshot models☆61Jun 12, 2023Updated 2 years ago
- ☆41Oct 9, 2024Updated last year
- Implementation, trained models and result data for the paper "Aspect-based Document Similarity for Research Papers" #COLING2020☆63Apr 30, 2024Updated last year
- The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact…☆1,470Dec 9, 2024Updated last year
- Confection: the sweetest config system for Python☆193Feb 9, 2026Updated 3 weeks ago
- A library containing general purpose Python utils.☆14Feb 22, 2023Updated 3 years ago
- Run greatexpectations.io on ANY SQL Engine using REST API. Supported by FastAPI, Pydantic and SQLAlchemy as best data quality tool☆14Dec 12, 2025Updated 2 months ago
- Convert ALTO XML to plain text + minimal metadata☆17Oct 17, 2024Updated last year
- Simple terminal interface for ChatGPT☆10Dec 6, 2022Updated 3 years ago
- FastAPI-like interface plugin for Flask☆43Dec 3, 2025Updated 3 months ago
- ☆10Nov 23, 2020Updated 5 years ago
- Satisfy all your REST API testing needs.☆10Oct 17, 2022Updated 3 years ago
- ☆12Updated this week
- PyNLP Lib is an open source Python NLP library that provides functionality for both web and local development☆50Oct 23, 2022Updated 3 years ago
- Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing☆789Jul 22, 2025Updated 7 months ago
- Collection of NLP model explanations and accompanying analysis tools☆144Jun 26, 2023Updated 2 years ago
- GLaRA: Graph-based Labeling Rule Augmentation for Weakly Supervised Named Entity Recognition☆31Jan 31, 2022Updated 4 years ago
- A PyTorch-based open-source framework that provides methods for improving the weakly annotated data and allows researchers to efficiently…☆108Sep 10, 2024Updated last year