philipperemy / name-dataset
The Python library for names.
⭐ 916 · Updated 2 months ago
Alternatives and similar repositories for name-dataset
Users who are interested in name-dataset are comparing it to the libraries listed below.
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ⭐ 524 · Updated 8 months ago
- pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection library that works out of the box. ⭐ 856 · Updated 10 months ago
- Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale-Chall, SMOG, and more ⭐ 385 · Updated 9 months ago
- A practical and feature-rich paraphrasing framework to augment human intents in text form to build robust NLU models for conversational e… ⭐ 897 · Updated last year
- Contextual word checker for better suggestions (not actively maintained) ⭐ 414 · Updated 4 months ago
- All languages stopwords collection ⭐ 450 · Updated last year
- Fuzzy matching and more functionality for spaCy. ⭐ 256 · Updated 11 months ago
- Heuristic-based boilerplate removal tool ⭐ 785 · Updated 4 months ago
- The most accurate natural language detection library for Python, suitable for short text and mixed-language text ⭐ 1,404 · Updated 2 weeks ago
- A dataset of multinational first names and last names ⭐ 26 · Updated 2 years ago
- Single-document unsupervised keyword extraction ⭐ 1,742 · Updated 2 weeks ago
- Abydos NLP/IR library for Python ⭐ 186 · Updated 2 years ago
- Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python ⭐ 272 · Updated last year
- A Python library for calculating a large variety of metrics from text ⭐ 340 · Updated 6 months ago
- Fixes contractions such as `you're` to `you are` ⭐ 319 · Updated 2 years ago
- ⭐ 171 · Updated 3 months ago
- Fuzzy string matching, grouping, and evaluation. ⭐ 764 · Updated last month
- Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/ ⭐ 747 · Updated 2 weeks ago
- A spaCy pipeline and model for NLP on unstructured legal text. ⭐ 654 · Updated 11 months ago
- Article extraction benchmark: dataset and evaluation scripts ⭐ 317 · Updated last year
- A multithreaded Pushshift.io API wrapper for reddit.com comment and submission searches. ⭐ 221 · Updated 2 years ago
- Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way. ⭐ 1,071 · Updated this week
- Semantic search for headlines and story text ⭐ 360 · Updated last year
- Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning ⭐ 315 · Updated 4 months ago
- ⭐ 836 · Updated 2 years ago
- spaCy pipeline object for negating concepts in text ⭐ 281 · Updated last week
- A Python-based HTML to text conversion library, command-line client, and web service. ⭐ 311 · Updated 3 weeks ago
- Extract embedded metadata from HTML markup ⭐ 922 · Updated 3 months ago
- Catalog of abusive language data (PLoS 2020) ⭐ 314 · Updated last year
- spacy-wordnet creates annotations that easily allow the use of WordNet and WordNet Domains by using the NLTK WordNet interface ⭐ 259 · Updated 9 months ago