Xangis / extra-stopwords
Extra stopword lists for use with NLTK.
☆28 · Updated 11 months ago
Related projects:
- A fully customisable language detection pipeline for spaCy ☆93 · Updated 5 years ago
- Guess gender from first name in Python 2 and 3 ☆129 · Updated 2 years ago
- Intelligently expand and create contractions in text leveraging grammar checking and Word Mover's Distance. ☆74 · Updated 2 years ago
- Hunspell extension for spaCy 2.0. ☆94 · Updated last month
- ☆159 · Updated 3 months ago
- Get list of common stop words in various languages in Python ☆155 · Updated 6 months ago
- Language detection extension for spaCy 2.0+ ☆111 · Updated 5 years ago
- Language detection using spaCy and fastText ☆53 · Updated 9 months ago
- Rosette API Client Library for Python ☆39 · Updated 2 months ago
- Python bindings to the Compact Language Detector ☆32 · Updated 4 years ago
- A Python implementation of the Metaphone and Double Metaphone algorithms ☆80 · Updated 6 months ago
- Use ML-Annotate to label data for machine learning purposes ☆104 · Updated 4 years ago
- Language Models for Zalando's flair library ☆62 · Updated 4 years ago
- Soundex Phonetic Code Algorithm Demo for Indian Languages. Supports all Indian languages and English. Provides intra-indic string compari… ☆54 · Updated 5 years ago
- Python package for performing deduplication using flexible text matching and cleaning in pandas DataFrames ☆25 · Updated 3 years ago
- Projects ☆21 · Updated 7 years ago
- Useful decorators every Data Scientist should know ☆28 · Updated last year
- Recipe for Spanish POS tagging using the CESS corpus with NLTK ☆17 · Updated 7 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆144 · Updated 8 months ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆170 · Updated 2 years ago
- ☆65 · Updated 2 years ago
- A compound word splitter for Python ☆48 · Updated 3 years ago
- Detect Language API Python Client ☆69 · Updated 2 years ago
- 💙 Emoji handling and meta data for spaCy with custom extension attributes ☆180 · Updated last year
- DataFrame integration with spaCy. ☆100 · Updated 3 years ago
- geonamescache - a Python library for quick access to a subset of GeoNames data. ☆100 · Updated last month
- ☆46 · Updated this week
- Running Prodigy for a team of annotators ☆53 · Updated 3 years ago
- Extract dates from text ☆64 · Updated 3 years ago
- 📝 Natural language processing (NLP) utils: word embeddings (Word2Vec, GloVe, FastText, ...) and preprocessing transformers, compatible wi… ☆60 · Updated last year