FinNLP / humannames
📦 A huge list (~200K) of human male/female first/last names.
☆49 · Updated last year
Alternatives and similar repositories for humannames
Users who are interested in humannames are comparing it to the libraries listed below.
- Distance/Similarity functions for Bag of Words, Strings, Vectors and more. ☆24 · Updated last year
- Machine-readable lists of lemma-token pairs in 23 languages. ☆340 · Updated 3 years ago
- English Part-of-speech (POS) tagger ☆67 · Updated 2 years ago
- The source of the phonetic transcriptions is Oxford Advanced Learner's Dictionary (3rd ed.), available from the Oxford Text Archive (http… ☆24 · Updated 8 years ago
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training. ☆13 · Updated last year
- Multi-class classifier ☆13 · Updated 2 years ago
- Generates free or fixed verse poetry from any text corpus using Ngram natural language generator (markov chains) + pos tagging + rhyme id… ☆26 · Updated 10 years ago
- Word lists from the web. ☆89 · Updated 9 years ago
- Sponsored content detection in YouTube videos ☆12 · Updated 2 years ago
- English lemmatizer ☆67 · Updated 2 years ago
- Script used to collect entry data from Urban Dictionary ☆33 · Updated 9 years ago
- generate rules from lists of words ☆16 · Updated 3 years ago
- Convert number words (eg. twenty one) to numeric digits (21) ☆176 · Updated last year
- Automatically exported from code.google.com/p/guess-language ☆53 · Updated last year
- English Lemma Database - Compiled by Referencing British National Corpus ☆31 · Updated 8 months ago
- List of easy American-English words: The New Dale-Chall (1995) ☆32 · Updated 2 years ago
- ☆14 · Updated 3 years ago
- varied english texts for modern NLP testing ☆75 · Updated 2 years ago
- ⚙️ [Processor] A better English POS tagger written in JavaScript ☆54 · Updated 8 years ago
- Language agnostic named entity recognizer ☆39 · Updated 2 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 · Updated 3 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆68 · Updated 3 months ago
- 📦 English word lemmatizer ☆15 · Updated 3 years ago
- English stopwords collection ☆162 · Updated 8 years ago
- Gather modern English word frequencies from all enwiki articles. ☆213 · Updated last year
- Naive Bayes Text Classifier ☆40 · Updated 3 months ago
- NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more. ☆130 · Updated last year
- A simple repository to remove 'irrelevant for search' words, support for 51 languages ☆27 · Updated 7 years ago
- A modern, interlingual wordnet interface for Python ☆247 · Updated this week
- A static file containing a list of popular RSS feeds. ☆12 · Updated 8 years ago