Machine-readable lists of lemma-token pairs in 23 languages.
★361 · Jan 29, 2022 · Updated 4 years ago
Alternatives and similar repositories for lemmatization-lists
Users that are interested in lemmatization-lists are comparing it to the libraries listed below.
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ★189 · Jun 6, 2025 · Updated 9 months ago
- Additional lookup tables and data resources for spaCy ★113 · Jun 4, 2025 · Updated 9 months ago
- Repository for Frequency Word List Generator and processed files ★1,463 · Feb 7, 2022 · Updated 4 years ago
- A Python module for English lemmatization and inflection. ★275 · Sep 14, 2023 · Updated 2 years ago
- A town-name generator using an LSTM neural network ★14 · Mar 24, 2023 · Updated 2 years ago
- ★14 · Jan 2, 2026 · Updated 2 months ago
- (AAAI'20) The source code for the paper "Joint Parsing and Generation for Abstractive Summarization". ★24 · Apr 22, 2020 · Updated 5 years ago
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ★262 · Aug 21, 2025 · Updated 7 months ago
- Dice.com's relevancy feedback solr plugin created by Simon Hughes (Dice). Contains request handlers for doing MLT style recommendations, … ★23 · May 12, 2021 · Updated 4 years ago
- Anki History Visualization ★19 · Nov 28, 2024 · Updated last year
- Rust bindings for the spaCy library. ★24 · Dec 11, 2022 · Updated 3 years ago
- Curated list of Linguistic Resources for doing NLP & CL on Spanish ★349 · Jan 9, 2024 · Updated 2 years ago
- UD Greek ★22 · Dec 5, 2025 · Updated 3 months ago
- German lemmatization with IWNLP as extension for spaCy ★27 · Jul 28, 2023 · Updated 2 years ago
- WordNet behind a REST interface ★13 · Apr 9, 2025 · Updated 11 months ago
- Autojump for Total Commander !! ★13 · Nov 25, 2020 · Updated 5 years ago
- A service that reads MDX/MDD files and provides an HTTP interface ★260 · Jul 2, 2021 · Updated 4 years ago
- Wiktionary dump file parser and multilingual data extractor ★1,122 · Mar 16, 2026 · Updated last week
- Text-to-speech converter for Basque and Spanish. It includes linguistic processing and built voices for the aforementioned languages. Its… ★18 · Jan 15, 2026 · Updated 2 months ago
- Tools and Data for the CMU Pronouncing Dictionary ★16 · Dec 9, 2018 · Updated 7 years ago
- Generate an HTML, PDF, or JPG file for specific words from an MDX dictionary ★41 · Dec 11, 2023 · Updated 2 years ago
- spaCy-to-naf converter ★21 · Jun 10, 2025 · Updated 9 months ago
- Preliminary spaCy models for Latin ★14 · Oct 20, 2022 · Updated 3 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ★108 · Mar 9, 2026 · Updated 2 weeks ago
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format ★33 · Jul 5, 2019 · Updated 6 years ago
- Lemmatiser for Danish, Dutch, English, German, Polish, Romanian, Russian and tens of other languages, that uses affix rules (affix: prefi… ★36 · Jun 26, 2025 · Updated 8 months ago
- Small string compression using smaz compression algorithm. Fast, because it's in C. Supports Python 3+ ★13 · Oct 18, 2025 · Updated 5 months ago
- A library for fetching and reading Tatoeba's weekly exports ★24 · Feb 5, 2026 · Updated last month
- ★16 · Sep 13, 2016 · Updated 9 years ago
- Access to lexical databases ★151 · Feb 11, 2026 · Updated last month
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ★392 · Nov 7, 2023 · Updated 2 years ago
- Data and scripts for the proper evaluation of cross-lingual embeddings in multiple languages ★15 · Apr 11, 2020 · Updated 5 years ago
- wordpos for the web/browser ★43 · May 7, 2021 · Updated 4 years ago
- JavaScript API for spaCy with Python REST API ★201 · Sep 16, 2023 · Updated 2 years ago
- Code for morphological transformations ★29 · Jun 3, 2017 · Updated 8 years ago
- About 6,500 Irish lemmas ordered by corpus frequency, with noise removed. ★37 · May 11, 2018 · Updated 7 years ago
- Top-100k word frequencies from BNC/ANC/COCA in DSL format, for GoldenDict ★66 · Dec 17, 2016 · Updated 9 years ago
- This plugin provides a useful feature for multi-language ★14 · Jul 15, 2022 · Updated 3 years ago
- Access a database of word frequencies, in various natural languages. ★1,640 · Jan 4, 2025 · Updated last year