michmech / lemmatization-lists
Machine-readable lists of lemma-token pairs in 23 languages.
☆333 Updated 2 years ago
Related projects
Alternatives and complementary repositories for lemmatization-lists
- English Lemma Database - Compiled by Referencing British National Corpus ☆29 Updated 2 months ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆145 Updated this week
- A modern, interlingual wordnet interface for Python ☆221 Updated this week
- 📂 Additional lookup tables and data resources for spaCy ☆98 Updated last year
- All languages stopwords collection ☆423 Updated 10 months ago
- Universal Dependencies online documentation ☆273 Updated this week
- Text to sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆230 Updated 2 years ago
- A Python module for English lemmatization and inflection. ☆261 Updated last year
- 🎀 JavaScript API for spaCy with Python REST API ☆193 Updated last year
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆249 Updated 2 months ago
- Various utilities for processing the data. ☆207 Updated this week
- A cloud-based, open-source system for writing and publishing dictionaries. ☆86 Updated 10 months ago
- spaCy REST API, wrapped in a Docker container. ☆265 Updated last year
- FreeLing project source code ☆255 Updated last year
- UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files ☆364 Updated last week
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆112 Updated 6 months ago
- Lexical database for ~70k English words with morphological variables ☆38 Updated 2 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆135 Updated 3 months ago
- A compound word splitter for Python ☆48 Updated 3 years ago
- Lexical database of any language ☆175 Updated 2 years ago
- Automatically exported from code.google.com/p/universal-pos-tags ☆128 Updated 2 years ago
- A Python Wiktionary Parser ☆360 Updated 10 months ago
- Wiktionary parser tool for many language editions. ☆53 Updated 2 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆94 Updated this week
- Sentence aligner ☆108 Updated 3 years ago
- The Open English WordNet ☆476 Updated last week
- PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, an… ☆479 Updated last year
- Quickly extract multi-word phrases from a corpus ☆191 Updated 4 years ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆149 Updated last year
- A multilingual parallel corpus created from translations of the Bible. ☆176 Updated 2 months ago