philipperemy / name-dataset
The Python library for names.
☆861 Updated 3 months ago
Alternatives and similar repositories for name-dataset:
Users who are interested in name-dataset compare it to the libraries listed below.
- Fuzzy string matching, grouping, and evaluation. ☆751 Updated last month
- Heuristic-based boilerplate removal tool ☆744 Updated 8 months ago
- Company Name Processor written in Python ☆332 Updated 8 months ago
- Process Common Crawl data with Python and Spark ☆412 Updated last month
- Super Fast String Matching in Python ☆363 Updated this week
- The most accurate natural language detection library for Python, suitable for short text and mixed-language text ☆1,221 Updated 3 weeks ago
- A fast, robust Python library to check for offensive language in strings. ☆636 Updated 6 months ago
- Article extraction benchmark: dataset and evaluation scripts ☆300 Updated 9 months ago
- Single-document unsupervised keyword extraction ☆1,671 Updated last year
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆978 Updated 11 months ago
- ☆807 Updated last year
- LexNLP by LexPredict ☆706 Updated 8 months ago
- A Python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution. ☆4,210 Updated 2 months ago
- A Python-based HTML to text conversion library, command line client and Web service. ☆282 Updated 2 weeks ago
- Information extraction from English and German texts based on predicate logic ☆389 Updated 2 years ago
- Examples for using the dedupe library ☆409 Updated 5 months ago
- Textpipe: clean and extract metadata from text ☆301 Updated 3 years ago
- 📗 Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more ☆366 Updated 4 months ago
- A dataset of multinational first names and last names ☆26 Updated last year
- NLP, before and after spaCy ☆2,216 Updated last year
- Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies. ☆839 Updated last year
- A Python library for calculating a large variety of metrics from text ☆324 Updated last month
- Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/ ☆726 Updated last month
- All languages stopwords collection ☆427 Updated last year
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆149 Updated last year
- ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of diff… ☆88 Updated 3 years ago
- 📛 Fuzzy Name Matching with Machine Learning ☆262 Updated 7 months ago
- Port of Google's language-detection library to Python. ☆1,746 Updated last year
- Python package to accelerate the sparse matrix multiplication and top-n similarity selection ☆399 Updated 2 months ago
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆412 Updated last month