fritshermans / pyminhash
MinHash implementation in Python
☆12 Updated last year
Alternatives and similar repositories for pyminhash
Users who are interested in pyminhash are comparing it to the libraries listed below.
- Pipeline components that support partial_fit. ☆46 Updated last year
- State-of-the-art question answering with HuggingFace and Streamlit ☆19 Updated 5 years ago
- Super Simple Similarities Service ☆154 Updated 6 months ago
- spaCy match and replace, maintaining conjugation ☆35 Updated 2 years ago
- Bag of, not words, but tricks! ☆68 Updated last year
- It's a cooler way to store simple linear models. ☆27 Updated last year
- ☆30 Updated 3 years ago
- TopicScan: Visualization and validation interface for NMF Topic Modeling ☆23 Updated 5 years ago
- ☆43 Updated 2 years ago
- A PyPI package for easy text annotation in a Jupyter Notebook. ☆28 Updated 4 years ago
- Python package for deduplication/entity resolution using active learning ☆81 Updated last year
- GLaRA: Graph-based Labeling Rule Augmentation for Weakly Supervised Named Entity Recognition ☆31 Updated 3 years ago
- Source code and data for Like a Good Nearest Neighbor ☆30 Updated 9 months ago
- Generate reports for spaCy models. ☆29 Updated 3 years ago
- Template-based generation of DAG cards from Metaflow classes, inspired by Google cards for machine learning models. ☆30 Updated 3 years ago
- ☄️ Parallel and distributed training with spaCy and Ray ☆56 Updated 2 years ago
- Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs). ☆75 Updated last year
- ☆55 Updated last year
- Fast fuzzy text search ☆11 Updated 2 years ago
- Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟 ☆53 Updated 3 years ago
- NeatText a simple NLP package for cleaning textual data and text preprocessing ☆72 Updated last year
- Just another sentiment wrapper. ☆18 Updated 3 years ago
- 🐍 Material for PyData Global 2021 Presentation: Effective Testing for Machine Learning Projects ☆82 Updated 3 years ago
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies ☆19 Updated 3 years ago
- Dutch abusive language data ☆11 Updated 2 years ago
- NLP tool to extract emotional phrase from tweets 🤩 ☆40 Updated 4 years ago
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ☆17 Updated last year
- MoodCat😼 classifies the mood of English sentences. ☆14 Updated 3 years ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆95 Updated 2 years ago
- Knowledge pills on Neural Search ☆26 Updated 2 years ago