skywind3000 / lemma.en
English Lemma Database - Compiled by Referencing British National Corpus
☆29 · Updated last month
Related projects
Alternatives and complementary repositories for lemma.en
- Offline bilingual dictionaries made using data from Wiktionary ☆52 · Updated 9 years ago
- Wiktionary in accessible JSON format ☆34 · Updated last year
- Machine-readable lists of lemma-token pairs in 23 languages. ☆332 · Updated 2 years ago
- Hand-written dictionaries from the FreeDict project ☆393 · Updated 3 weeks ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆94 · Updated this week
- The Open English WordNet ☆473 · Updated last week
- CLDR text segmentation for JavaScript ☆38 · Updated 6 months ago
- A list of vocabulary lists ☆21 · Updated 4 years ago
- Gather modern English word frequencies from all enwiki articles. ☆202 · Updated 8 months ago
- A modern, interlingual wordnet interface for Python ☆217 · Updated last week
- Generation of bilingual dictionaries from Wiktionary/dbnary data for the WikDict project ☆43 · Updated last week
- This packages up data for the Open Multilingual Wordnet ☆43 · Updated last week
- A Python Wiktionary Parser ☆359 · Updated 9 months ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆143 · Updated this week
- 🏆 • 5050 most frequent words in 109 languages ☆35 · Updated last year
- British English pronunciation dictionary ☆89 · Updated 7 years ago
- JavaScript Lemmatizer is a lemmatization library to retrieve a base form from an English inflected word. ☆65 · Updated 3 years ago
- Verb forms dictionary ☆60 · Updated 7 years ago
- Morphological Dictionaries for German Language ☆28 · Updated 6 years ago
- Tokenizes Chinese texts into words. ☆95 · Updated last year
- Java Wiktionary Library ☆57 · Updated last year
- Tools for professional translators running GNU/Linux ☆27 · Updated 2 years ago
- This repository contains code behind the visualization of the Wikimedia tool etytree at http://tools.wmflabs.org/etytree/ ☆50 · Updated 5 years ago
- A list of ~100,000 German nouns and their grammatical properties compiled from WiktionaryDE as CSV file. Plus a module to look up the dat… ☆145 · Updated 7 months ago
- Lexical database of any language ☆174 · Updated 2 years ago
- A tool to find grammar patterns in Chinese text ☆24 · Updated 4 years ago
- Sentence aligner ☆108 · Updated 3 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆229 · Updated 2 years ago
- An open etymology dataset created using Wiktionary data. Contains 3.8M entries, 1.8M terms, 2900 languages, and 31 unique relationship ty… ☆75 · Updated 5 months ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 · Updated 3 years ago