ljvmiranda921 / prodigy-pdf-custom-recipe
Custom recipe and utilities for document processing
☆199 · Updated 2 years ago
Alternatives and similar repositories for prodigy-pdf-custom-recipe
Users who are interested in prodigy-pdf-custom-recipe are comparing it to the libraries listed below.
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… ☆245 · Updated last year
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… ☆214 · Updated 3 months ago
- Gain clues from clustering! ☆313 · Updated 10 months ago
- spaCy NER annotator using ipywidgets ☆121 · Updated last year
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 · Updated last year
- Streamline scikit-learn model comparison. ☆145 · Updated 2 years ago
- Quote extraction for modular journalism (JournalismAI collab 2021) ☆228 · Updated 3 years ago
- SpikeX - spaCy Pipes for Knowledge Extraction ☆398 · Updated 3 years ago
- STriP Net: Semantic Similarity of Scientific Papers (S3P) Network ☆85 · Updated 2 years ago
- NeatText: a simple NLP package for cleaning textual data and text preprocessing ☆72 · Updated last year
- Information extraction from English and German texts based on predicate logic ☆135 · Updated last year
- 💥 Explosion Assets ☆44 · Updated last year
- All the go-to functions you need to handle NLP use-cases, integrated in NLPretext ☆140 · Updated last month
- Just a bunch of useful embeddings for scikit-learn pipelines ☆497 · Updated last month
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆154 · Updated 11 months ago
- Few-shot Named Entity Recognition ☆123 · Updated 3 years ago
- ✨ Bootstrap annotation with zero- & few-shot learning via OpenAI GPT-3 ☆322 · Updated last year
- A Simple Bulk Labelling Tool ☆579 · Updated 4 months ago
- A Python library for extracting text from PDFs without losing the formatting of the PDF content. ☆77 · Updated 3 years ago
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine ☆242 · Updated last year
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆161 · Updated 2 years ago
- Explainable Zero-Shot Topic Extraction ☆62 · Updated 8 months ago
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆262 · Updated 6 months ago
- Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy. ☆118 · Updated last year
- A library to synthesize text datasets using Large Language Models (LLM) ☆152 · Updated 2 years ago
- A guide book on data science for busy and equally lazy Data Scientists 😄 ☆131 · Updated 3 weeks ago
- Label data using HuggingFace's transformers and automatically get a prediction service ☆189 · Updated last year
- Public runnable examples of using John Snow Labs' OCR for Apache Spark. ☆91 · Updated last week
- Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health. ☆91 · Updated 3 years ago
- A Python library for calculating a large variety of metrics from text ☆337 · Updated 5 months ago