philipperemy / name-dataset
The Python library for names.
☆920 · Updated 3 months ago
Alternatives and similar repositories for name-dataset
Users who are interested in name-dataset are comparing it to the libraries listed below.
- 🧹 Python package for text cleaning ☆979 · Updated 2 years ago
- Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/ ☆749 · Updated this week
- All languages stopwords collection ☆451 · Updated last year
- Heuristic based boilerplate removal tool ☆786 · Updated 4 months ago
- Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more ☆385 · Updated 10 months ago
- Company Name Processor written in Python ☆339 · Updated last year
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ☆834 · Updated 2 months ago
- Python package to calculate readability statistics of a text object - paragraphs, sentences, articles. ☆1,304 · Updated last month
- Article extraction benchmark: dataset and evaluation scripts ☆318 · Updated last year
- Contextual word checker for better suggestions (not actively maintained) ☆414 · Updated 5 months ago
- Process Common Crawl data with Python and Spark ☆437 · Updated last month
- Just the facts -- web page content extraction ☆1,270 · Updated last week
- Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies. ☆851 · Updated 2 years ago
- Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning ☆318 · Updated this week
- Default English stopword lists from many different sources ☆303 · Updated 2 years ago
- Super Fast String Matching in Python ☆370 · Updated 4 months ago
- Fixes contractions such as `you're` to `you are` ☆318 · Updated 2 years ago
- Fuzzy Name Matching with Machine Learning ☆264 · Updated last year
- Fuzzy string matching, grouping, and evaluation. ☆771 · Updated last week
- CLASSLA Fork of the Official Stanford NLP Python Library for Many Human Languages ☆42 · Updated 2 months ago
- Machine-readable lists of lemma-token pairs in 23 languages. ☆341 · Updated 3 years ago
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ☆524 · Updated 8 months ago
- Python port of Boilerpipe library ☆88 · Updated 11 months ago
- Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python ☆272 · Updated 2 years ago
- pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box. ☆863 · Updated 11 months ago
- Spelling corrector in Python ☆484 · Updated 2 weeks ago
- A Python utility for downloading Common Crawl data ☆242 · Updated 2 years ago
- Offline database of synonyms/thesaurus ☆198 · Updated last year
- ☆840 · Updated 2 years ago
- Single-document unsupervised keyword extraction ☆1,751 · Updated 2 weeks ago