philipperemy / name-dataset
The Python library for names.
☆819, updated last month
Related projects:
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… (☆791, updated 2 weeks ago)
- Port of Google's language-detection library to Python. (☆1,709, updated 7 months ago)
- ✔️ Contextual word checker for better suggestions (☆405, updated 6 months ago)
- 📗 Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more (☆349, updated last week)
- 🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detector that works out of the box. (☆782, updated last month)
- Python package to calculate readability statistics of a text object: paragraphs, sentences, articles. (☆1,129, updated 3 months ago)
- Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/ (☆696, updated 6 months ago)
- 🧹 Python package for text cleaning (☆946, updated last year)
- Process Common Crawl data with Python and Spark (☆400, updated last week)
- All languages stopwords collection (☆420, updated 8 months ago)
- Heuristic-based boilerplate removal tool (☆717, updated 4 months ago)
- Simple PDF text extraction (☆859, updated 4 months ago)
- Spelling corrector in Python (☆449, updated 9 months ago)
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy (☆724, updated last month)
- Text databases of last names from various countries (☆271, updated last year)
- Fuzzy string matching, grouping, and evaluation. (☆736, updated 4 months ago)
- NLP, before and after spaCy (☆2,206, updated 11 months ago)
- A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html (☆807, updated last month)
- (☆159, updated 3 months ago)
- Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning (☆294, updated this week)
- Super Fast String Matching in Python (☆362, updated 4 months ago)
- Full text geoparsing as a Python library (☆742, updated 3 years ago)
- Single-document unsupervised keyword extraction (☆1,626, updated 8 months ago)
- Text-to-sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder. (☆225, updated last year)
- Extract embedded metadata from HTML markup (☆839, updated last month)
- Python bindings to libpostal for fast international address parsing/normalization (☆762, updated 2 months ago)
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? (☆505, updated last year)
- 🪼 A Python library for doing approximate and phonetic matching of strings. (☆2,040, updated 2 weeks ago)
- 🍳 Recipes for Prodigy, our fully scriptable annotation tool (☆477, updated last month)
- extract text from any document. no muss. no fuss. (☆3,865, updated this week)