philipperemy / name-dataset
The Python library for names.
★910 · Updated last month
Alternatives and similar repositories for name-dataset
Users who are interested in name-dataset are comparing it to the libraries listed below.
- Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/ ★744 · Updated 2 weeks ago
- Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more ★380 · Updated 8 months ago
- Heuristic based boilerplate removal tool ★780 · Updated 3 months ago
- 🧹 Python package for text cleaning ★980 · Updated 2 years ago
- Blazingly fast cleaning swear words (and their leetspeak) in strings ★220 · Updated last year
- Text databases of last names from various countries ★280 · Updated 2 years ago
- Gather modern English word frequencies from all enwiki articles. ★213 · Updated last year
- Article extraction benchmark: dataset and evaluation scripts ★316 · Updated last year
- Fuzzy string matching, grouping, and evaluation. ★764 · Updated last month
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ★827 · Updated last month
- The most accurate natural language detection library for Python, suitable for short text and mixed-language text ★1,376 · Updated last week
- Spelling corrector in Python ★482 · Updated 5 months ago
- Fuzzy Name Matching with Machine Learning ★264 · Updated 11 months ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ★1,010 · Updated last year
- ★830 · Updated 2 years ago
- Python port of Boilerpipe library ★88 · Updated 9 months ago
- Smarter Manual Annotation for Resource-constrained collection of Training data ★229 · Updated 6 months ago
- Process Common Crawl data with Python and Spark ★431 · Updated last week
- A modern, interlingual wordnet interface for Python ★247 · Updated last week
- Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies. ★849 · Updated 2 years ago
- A practical and feature-rich paraphrasing framework to augment human intents in text form to build robust NLU models for conversational e… ★896 · Updated last year
- A dataset of multinational first names and last names ★26 · Updated 2 years ago
- Super Fast String Matching in Python ★369 · Updated 2 months ago
- Clean personally identifiable information from dirty dirty text. ★408 · Updated last year
- Spellchecking library for Python ★609 · Updated 11 months ago
- A tool for converting PDF into hOCR with text, tables, and figures being recognized and preserved. ★448 · Updated last year
- 🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box. ★851 · Updated 9 months ago
- Fuzzy matching and more functionality for spaCy. ★256 · Updated 11 months ago
- Full text geoparsing as a Python library ★752 · Updated 3 years ago
- Catalog of abusive language data (PLoS 2020) ★312 · Updated 11 months ago