jfilter / clean-text
🧹 Python package for text cleaning
☆964 · Updated last year
Alternatives and similar repositories for clean-text:
Users who are interested in clean-text are comparing it to the libraries listed below.
- Fuzzy string matching, grouping, and evaluation. ☆751 · Updated last month
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆923 · Updated 4 months ago
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆412 · Updated last month
- NLP, before and after spaCy ☆2,216 · Updated last year
- 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy ☆1,359 · Updated 2 weeks ago
- SpikeX - SpaCy Pipes for Knowledge Extraction ☆397 · Updated 3 years ago
- Toolkit to help understand "what lies" in word embeddings. Also benchmarking! ☆473 · Updated last year
- 1 line for thousands of state-of-the-art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems. ☆888 · Updated this week
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ☆730 · Updated 5 months ago
- Fuzzy matching and more functionality for spaCy. ☆255 · Updated 6 months ago
- A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ☆1,214 · Updated last year
- Fixes contractions such as `you're` to `you are` ☆313 · Updated 2 years ago
- A Neural Language Style Transfer framework to transfer natural language text smoothly between fine-grained language styles like formal/ca… ☆479 · Updated last year
- A Python library for calculating a large variety of metrics from text ☆324 · Updated last month
- Compute Sentence Embeddings Fast! ☆618 · Updated last year
- 👑 spaCy building blocks and visualizers for Streamlit apps ☆825 · Updated 6 months ago
- 🍳 Recipes for Prodigy, our fully scriptable annotation tool ☆486 · Updated 5 months ago
- Super Fast String Matching in Python ☆363 · Updated this week
- Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/ ☆726 · Updated last month
- A Python package implementing a new interpretable machine learning model for text classification (with visualization tools for Explainabl… ☆341 · Updated 2 weeks ago
- Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer… ☆383 · Updated last year
- A simple component to display annotated text in Streamlit apps. ☆532 · Updated last week
- NeuSpell: A Neural Spelling Correction Toolkit ☆684 · Updated last year
- Single-document unsupervised keyword extraction ☆1,671 · Updated last year
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ☆513 · Updated 3 months ago
- Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python ☆269 · Updated last year
- Top2Vec learns jointly embedded topic, document and word vectors. ☆2,982 · Updated 2 months ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… ☆244 · Updated last year
- 🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detector that works out of the box. ☆826 · Updated 5 months ago
- A spaCy pipeline and model for NLP on unstructured legal text. ☆645 · Updated 6 months ago