pmbaumgartner / clabel
A utility for labeling clusters of text data.
★28 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for clabel
- Python package for deduplication/entity resolution using active learning ★78 · Updated 2 months ago
- A VS Code extension for annotating data with Prodigy ★30 · Updated 2 years ago
- MoodCat classifies the mood of English sentences. ★14 · Updated 2 years ago
- A curated list of ML awesome frameworks & libraries for text data ★16 · Updated last year
- ★29 · Updated 2 years ago
- Generate reports for spaCy models. ★28 · Updated 2 years ago
- allennlp + streamlit demo ★22 · Updated 5 years ago
- Neural Solr = Solr 9 + Mighty Inference + Node ★16 · Updated 2 years ago
- spaCy match and replace, maintaining conjugation ★34 · Updated last year
- spaCy entry points for Curated Transformers ★25 · Updated last month
- Streamlit demo app to demonstrate the features of transformers interpret with multiple models. ★25 · Updated 3 years ago
- ALMa (Active Learning Manager) Keeps track of labeled and unlabeled data for active learning ★42 · Updated 4 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts. ★37 · Updated 5 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ★22 · Updated last year
- A Prodigy plugin for evaluating spaCy pipelines ★12 · Updated 7 months ago
- The NLP Bias Identification Toolkit ★35 · Updated last year
- Examples of vector DB indexing and query with various vector databases. ★12 · Updated last month
- Comparing Polars to Pandas and a small introduction ★43 · Updated 3 years ago
- Set-oriented Operations in Pandas ★24 · Updated 4 years ago
- Train floret vectors ★18 · Updated last year
- Easily clean text with spaCy! ★32 · Updated 8 months ago
- It's a cooler way to store simple linear models. ★28 · Updated 4 months ago
- Pyinfer is a model agnostic tool for ML developers and researchers to benchmark the inference statistics for machine learning models or f… ★24 · Updated 3 years ago
- ★70 · Updated last year
- Parallel and distributed training with spaCy and Ray ★54 · Updated last year
- Efficient BM25 with DuckDB ★29 · Updated last month
- Have UV deal with all your Jupyter deps. ★18 · Updated 2 months ago
- ★29 · Updated 11 months ago