michmech / lemmatization-lists
Machine-readable lists of lemma-token pairs in 23 languages.
☆358Jan 29, 2022Updated 4 years ago
Alternatives and similar repositories for lemmatization-lists
Users who are interested in lemmatization-lists are comparing it to the libraries listed below.
- ☆38Mar 30, 2024Updated last year
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency☆185Jun 6, 2025Updated 8 months ago
- 📂 Additional lookup tables and data resources for spaCy☆113Jun 4, 2025Updated 8 months ago
- A python module for English lemmatization and inflection.☆273Sep 14, 2023Updated 2 years ago
- ☆46Mar 30, 2024Updated last year
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni…☆12Dec 15, 2023Updated 2 years ago
- Repository for Frequency Word List Generator and processed files☆1,442Feb 7, 2022Updated 4 years ago
- ☆16Sep 13, 2016Updated 9 years ago
- Morphological Dictionaries for German Language☆30Apr 6, 2018Updated 7 years ago
- A Corpus Data Retrieval Index using Lucene for Look-Ups☆19Updated this week
- Breaks a word into syllables using an LSTM-based neural network.☆20Aug 14, 2023Updated 2 years ago
- Dice.com's relevancy feedback solr plugin created by Simon Hughes (Dice). Contains request handlers for doing MLT style recommendations, …☆23May 12, 2021Updated 4 years ago
- Language Acquisition Research Tools☆43Nov 16, 2025Updated 2 months ago
- django-mdict is an mdict dictionary lookup tool implemented in Django.☆56Oct 21, 2024Updated last year
- Fast Python Vowpal Wabbit wrapper☆13Mar 31, 2021Updated 4 years ago
- Fast Network Scan for devices/services☆16Jul 23, 2015Updated 10 years ago
- Plugin for EPUB ebooks' edition☆13Dec 9, 2015Updated 10 years ago
- Calendar library for neovim☆16Jan 2, 2026Updated last month
- DELPH-IN Documentation☆29Feb 1, 2026Updated last week
- German lemmatization with IWNLP as extension for spaCy☆26Jul 28, 2023Updated 2 years ago
- ☆14Mar 30, 2023Updated 2 years ago
- BabelNet (and WordNet) sense embedding trained with Word2Vec and FastText☆10Sep 3, 2019Updated 6 years ago
- This repository contains the Potsdam Textbook Corpus (PoTeC) which is a natural reading eye-tracking corpus.☆14Dec 31, 2025Updated last month
- The XML schema and example XML files for DASH (ISO/IEC 23009-1)☆15Jan 28, 2026Updated 2 weeks ago
- Java implementation of LemmaGen project☆11Feb 15, 2022Updated 3 years ago
- SCTE-35 Inserter for MPEGTS. SuperKabuki is SCTE-35 Packet Injection for Ad Insertion, powered by threefive.☆12Sep 13, 2024Updated last year
- Trained taggers, tokenizers, etc. for the CLTK☆10Feb 15, 2022Updated 3 years ago
- This plugin provides a useful feature for multi-language☆14Jul 15, 2022Updated 3 years ago
- Interpretable feature construction from taxonomies for text classification☆18Apr 4, 2022Updated 3 years ago
- Asent is a python library for performing efficient and transparent sentiment analysis using spaCy.☆120Oct 20, 2025Updated 3 months ago
- Evaluate language models using multiple choice items☆13Updated this week
- Small string compression using smaz compression algorithm. Fast, because it's in C. Supports Python 3+☆13Oct 18, 2025Updated 3 months ago
- SAGA - Phonetic transcription software for all Spanish variants.☆13Nov 12, 2020Updated 5 years ago
- Generation of bilingual dictionaries from Wiktionary/dbnary data for the WikDict project☆58Aug 4, 2025Updated 6 months ago
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format☆33Jul 5, 2019Updated 6 years ago
- A GUI to help visually tweak Solr edismax☆19Apr 8, 2015Updated 10 years ago
- About 6,500 Irish lemmas ordered by corpus frequency, with noise removed.☆37May 11, 2018Updated 7 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings.☆58Jul 1, 2021Updated 4 years ago
- Code for morphological transformations☆29Jun 3, 2017Updated 8 years ago