philipperemy / name-dataset
The Python library for names.
☆945 Updated 5 months ago
Alternatives and similar repositories for name-dataset
Users who are interested in name-dataset compare it to the libraries listed below.
- All languages stopwords collection ☆456 Updated last year
- Company Name Processor written in Python ☆341 Updated last year
- Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies. ☆857 Updated 2 years ago
- Extract embedded metadata from HTML markup ☆929 Updated 3 weeks ago
- Heuristic based boilerplate removal tool ☆797 Updated 7 months ago
- Article extraction benchmark: dataset and evaluation scripts ☆329 Updated this week
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆417 Updated 7 months ago
- Single-document unsupervised keyword extraction ☆1,788 Updated 3 weeks ago
- Process Common Crawl data with Python and Spark ☆440 Updated this week
- Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning ☆322 Updated 2 months ago
- Just the facts -- web page content extraction ☆1,273 Updated 2 months ago
- Machine-readable lists of lemma-token pairs in 23 languages. ☆342 Updated 3 years ago
- Ultimate Website Sitemap Parser ☆227 Updated 2 weeks ago
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ☆845 Updated 2 weeks ago
- Default English stopword lists from many different sources ☆308 Updated 2 years ago
- 🧹 Python package for text cleaning ☆991 Updated 2 years ago
- NLP, before and after spaCy ☆2,231 Updated 2 years ago
- 📗 Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more ☆387 Updated last year
- Compact Language Detector 2 ☆873 Updated 4 years ago
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ☆526 Updated 11 months ago
- 📛 Fuzzy Name Matching with Machine Learning ☆265 Updated last year
- Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK. ☆1,073 Updated 2 years ago
- A Python library for parsing unstructured Western names into name components. ☆609 Updated 4 months ago
- Fixes contractions such as `you're` to `you are` ☆317 Updated 2 years ago
- Fuzzy string matching, grouping, and evaluation. ☆782 Updated 2 months ago
- Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenizati… ☆671 Updated 3 months ago
- Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python ☆272 Updated 2 years ago
- A spaCy pipeline and model for NLP on unstructured legal text. ☆662 Updated last year
- Text databases of last names from various countries ☆281 Updated 2 years ago
- Python package to calculate readability statistics of a text object: paragraphs, sentences, articles. ☆1,318 Updated 3 weeks ago