code-kern-ai / embedders
With embedders, you can easily convert your texts into sentence- or token-level embeddings within a few lines of code. Use cases for this include similarity search between texts, information extraction such as named entity recognition, or basic text classification.
☆21 · Updated last year
Alternatives and similar repositories for embedders:
Users interested in embedders are comparing it to the libraries listed below.
- With sequence-learn, you can build models for named entity recognition as quickly as if you were building a sklearn classifier. ☆22 · Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor ☆28 · Updated last month
- spaCy match and replace, maintaining conjugation ☆35 · Updated 2 years ago
- This repository contains code and data download instructions for the workshop paper "Improving Hierarchical Product Classification using … ☆17 · Updated 3 years ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆90 · Updated last year
- Generate reports for spaCy models. ☆29 · Updated 2 years ago
- A Python library aimed at dissecting and augmenting NER training data. ☆58 · Updated last year
- A small repository to test Captum Explainable AI with a trained Flair transformers-based text classifier. ☆26 · Updated 3 years ago
- Aim-spaCy integration ☆34 · Updated last year
- Preprocessing and analysis for training SNOMED-CT concept embeddings from CORD-19 corpus ☆14 · Updated last year
- Just another sentiment wrapper. ☆17 · Updated 3 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 · Updated 2 years ago
- MoodCat😼 classifies the mood of English sentences. ☆14 · Updated 2 years ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 · Updated 2 years ago
- TopicScan: Visualization and validation interface for NMF Topic Modeling ☆23 · Updated 4 years ago
- Tools for interactive visual exploration of semantic embeddings. ☆30 · Updated 6 months ago
- A PyPI package for easy text annotation in a Jupyter Notebook. ☆28 · Updated 3 years ago
- Library for fast text representation and classification. ☆28 · Updated last year
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 · Updated 11 months ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 · Updated 2 years ago
- OptimSeed - Seed Word Selection for Weakly-Supervised Text Classification [NAACL SRW 2021] ☆14 · Updated 3 years ago
- Using short models to classify long texts ☆21 · Updated last year
- A utility for labeling clusters of text data. ☆28 · Updated 3 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆24 · Updated 3 months ago
- Python package for deduplication/entity resolution using active learning ☆76 · Updated 6 months ago
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆31 · Updated 9 months ago