igorbrigadir / stopwords
Default English stopword lists from many different sources
☆293 Updated last year
Alternatives and similar repositories for stopwords:
Users who are interested in stopwords compare it to the libraries listed below:
- Twitter named entity extraction for WNUT 2016 http://noisy-text.github.io/2016/ner-shared-task.html ☆138 Updated 2 years ago
- Code and data for inducing domain-specific sentiment lexicons. ☆196 Updated 5 months ago
- Quickly extract multi-word phrases from a corpus ☆190 Updated 4 years ago
- ☆216 Updated 6 years ago
- Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenizati… ☆664 Updated 10 months ago
- Improving topic models LDA and DMM (one-topic-per-document model for short texts) with word embeddings (TACL 2015) ☆178 Updated 7 years ago
- Semi-supervised guided topic model with custom guidedLDA ☆501 Updated 4 years ago
- Named Entity Recognition based on dictionaries ☆242 Updated 5 years ago
- Data for Automatic Keyphrase Extraction Task ☆336 Updated 6 years ago
- Retrofitting Word Vectors to Semantic Lexicons ☆374 Updated 5 years ago
- ☆228 Updated 8 years ago
- Word Embeddings for Information Retrieval ☆226 Updated last year
- Guidelines. ☆96 Updated 5 months ago
- Collection of tools for building diachronic/historical word vectors ☆423 Updated last year
- Various Algorithms for Short Text Mining ☆466 Updated last week
- Short Text Topic Modeling, JAVA ☆155 Updated 4 years ago
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ☆728 Updated 5 months ago
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆253 Updated 4 months ago
- Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how? ☆512 Updated 2 months ago
- Python library for Natural Language Preprocessing (NLPre) ☆190 Updated last year
- Hierarchical, multi-label topic modelling with LDA ☆53 Updated 2 years ago
- EmbedRank: Unsupervised Keyphrase Extraction using Sentence Embeddings (official implementation) ☆432 Updated last year
- A Python module for English lemmatization and inflection. ☆265 Updated last year
- Computation of the semantic interpretability of topics produced by topic models. ☆180 Updated 7 years ago
- Semantic Orientation Calculator for Sentiment Analysis ☆52 Updated 2 years ago
- 💙 Emoji handling and meta data for spaCy with custom extension attributes ☆181 Updated last year
- Palmetto is a quality measuring tool for topics ☆217 Updated 11 months ago
- Python port of the Twokenize class of ark-tweet-nlp ☆141 Updated 6 years ago
- Named Entity Recognition data for Europeana Newspapers ☆171 Updated last year
- Datasets to train supervised classifiers for Named-Entity Recognition in different languages (Portuguese, German, Dutch, French, English) ☆339 Updated 2 years ago