code-kern-ai / embedders
With embedders, you can easily convert your texts into sentence- or token-level embeddings within a few lines of code. Use cases for this include similarity search between texts, information extraction such as named entity recognition, or basic text classification.
☆21 Updated last year
Alternatives and similar repositories for embedders:
Users who are interested in embedders often compare it to the libraries listed below.
- With sequence-learn, you can build models for named entity recognition as quickly as if you were building a sklearn classifier. ☆22 Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor ☆28 Updated 3 months ago
- ☆30 Updated 2 years ago
- Aim-spaCy integration ☆34 Updated last year
- A Python library aimed at dissecting and augmenting NER training data. ☆58 Updated last year
- ☆43 Updated 2 years ago
- Using short models to classify long texts ☆21 Updated 2 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 Updated 2 years ago
- spaCy match and replace, maintaining conjugation ☆35 Updated 2 years ago
- sequence tagging with spaCy and crfsuite ☆19 Updated 2 years ago
- Plug-and-play document processing pipelines. No training. Batteries included. ☆57 Updated last week
- CLI-based tool to automatically build ML models from training data into a servable Docker container ☆58 Updated 2 years ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆92 Updated last year
- No Teacher BART distillation experiment for NLI tasks ☆26 Updated 4 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 Updated last year
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models… ☆36 Updated 3 years ago
- ☆22 Updated 3 years ago
- MoodCat😼 classifies the mood of English sentences. ☆14 Updated 2 years ago
- Documentation effort for the BookCorpus dataset ☆34 Updated 3 years ago
- The CleanCoNLL dataset from our EMNLP 2023 paper where we corrected annotation errors and inconsistencies in CoNLL-03. ☆24 Updated 10 months ago
- spaCy entry points for Curated Transformers ☆29 Updated 7 months ago
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… ☆46 Updated last year
- ReconNER, Debug annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality of your data. ☆35 Updated 4 years ago
- This repository contains code and data download instructions for the workshop paper "Improving Hierarchical Product Classification using … ☆17 Updated 4 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆24 Updated 5 months ago
- Interpretable feature construction from taxonomies for text classification ☆18 Updated 3 years ago
- Explainable Zero-Shot Topic Extraction ☆62 Updated 8 months ago
- ☄️ Parallel and distributed training with spaCy and Ray ☆54 Updated last year
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ☆30 Updated 3 weeks ago