Alir3z4 / stop-words
List of common stop words in various languages.
☆337 · Updated 2 years ago
Alternatives and similar repositories for stop-words:
Users interested in stop-words compare it to the libraries listed below:
- Default English stopword lists from many different sources ☆298 · Updated 2 years ago
- English stopwords collection ☆161 · Updated 8 years ago
- Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python ☆268 · Updated last year
- All languages stopwords collection ☆439 · Updated last year
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆255 · Updated 8 months ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆245 · Updated 2 years ago
- Python Implementations of Word Sense Disambiguation (WSD) Technologies. ☆747 · Updated 2 years ago
- Automatically exported from code.google.com/p/universal-pos-tags ☆129 · Updated 2 years ago
- 📂 Additional lookup tables and data resources for spaCy ☆105 · Updated 3 months ago
- Quickly extract multi-word phrases from a corpus ☆191 · Updated 4 years ago
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆75 · Updated 3 years ago
- FreeLing project source code ☆255 · Updated last year
- (Official repo for pypi package) Python bindings for the Hunspell spellchecker engine ☆186 · Updated 4 years ago
- Universal Dependencies online documentation ☆283 · Updated this week
- 💙 Emoji handling and meta data for spaCy with custom extension attributes ☆181 · Updated last year
- Language independent truecaser in Python. ☆160 · Updated 3 years ago
- A fully customisable language detection pipeline for spaCy ☆92 · Updated 6 years ago
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆375 · Updated 2 years ago
- Extract dates from text ☆64 · Updated 4 years ago
- Keyword extraction using TextRank algorithm after pre-processing the text with lemmatization, filtering unwanted parts-of-speech and othe… ☆114 · Updated 5 years ago
- 📗 Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more ☆378 · Updated 7 months ago
- LexRank algorithm for text summarization ☆230 · Updated last year
- Unannotated Spanish 3 Billion Words Corpora ☆101 · Updated 2 years ago
- A compound word splitter for Python ☆48 · Updated 3 years ago
- A multilingual, cross-domain temporal tagger developed at the Database Systems Research Group at Heidelberg University. ☆342 · Updated 2 years ago
- Named Entity Recognition data for Europeana Newspapers ☆171 · Updated 2 years ago
- Twitter named entity extraction for WNUT 2016 http://noisy-text.github.io/2016/ner-shared-task.html ☆138 · Updated 2 years ago
- A minimal, pure Python library to interface with CoNLL-U format files. ☆151 · Updated last year
- Various Algorithms for Short Text Mining ☆470 · Updated this week
- A modern, interlingual wordnet interface for Python ☆244 · Updated this week