IlyaSemenov / wikipedia-word-frequency
Gather modern English word frequencies from all enwiki articles.
☆218 Updated last year
Alternatives and similar repositories for wikipedia-word-frequency
Users who are interested in wikipedia-word-frequency are comparing it to the libraries listed below.
- A Python Wiktionary Parser ☆361 Updated 4 months ago
- A modern, interlingual wordnet interface for Python ☆254 Updated last week
- The Open English WordNet ☆585 Updated 2 weeks ago
- A list of vocabulary lists ☆21 Updated 5 years ago
- Machine-readable lists of lemma-token pairs in 23 languages. ☆341 Updated 3 years ago
- Sentence aligner ☆115 Updated 4 years ago
- Open Language Profiles: English profile datasets from CEFR-J ☆134 Updated 5 years ago
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation ☆197 Updated 4 years ago
- A python module for English lemmatization and inflection. ☆268 Updated last year
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆103 Updated last month
- Morphological Dictionaries for German Language ☆29 Updated 7 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆248 Updated 2 years ago
- Verb forms dictionary ☆65 Updated 7 years ago
- Master repo for the UniMorph project, includes the UniMorph schema and annotated data files ☆30 Updated 5 years ago
- LingPy: Python library for quantitative tasks in historical linguistics ☆136 Updated 4 months ago
- Crawler for linguistic corpora ☆204 Updated last year
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) ☆30 Updated 2 weeks ago
- Machine-Translation-based sentence alignment tool for parallel text ☆310 Updated 4 years ago
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 Updated 2 years ago
- WordNet in JSON format. ☆91 Updated 4 years ago
- Universal Dependencies online documentation ☆287 Updated last week
- Improved Sentence Alignment in Linear Time and Space ☆175 Updated 2 years ago
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆47 Updated 2 years ago
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1) ☆24 Updated 3 years ago
- A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars ☆38 Updated 8 months ago
- Bitextor generates translation memories from multilingual websites ☆294 Updated 8 months ago
- A multilingual parallel corpus created from translations of the Bible. ☆182 Updated last month
- German Morphological Analyzer ☆47 Updated 3 years ago
- Python Finite-State Toolkit ☆56 Updated 3 weeks ago
- Translation Memory Open-source Purifier ☆34 Updated 2 years ago