nachocab / words-by-frequency
A repository of words in multiple languages sorted by their frequency
☆12 · Updated 2 years ago
Alternatives and similar repositories for words-by-frequency
Users who are interested in words-by-frequency are comparing it to the repositories listed below.
- A simple phonetic respelling for the English language ☆10 · Updated 2 months ago
- Extract data from German Wiktionary XML files. ☆26 · Updated 2 weeks ago
- A list of vocabulary lists ☆22 · Updated 5 years ago
- Unicode-only CJKV IDS data ☆13 · Updated last year
- The source of the phonetic transcriptions is Oxford Advanced Learner's Dictionary (3rd ed.), available from the Oxford Text Archive (http… ☆24 · Updated 8 years ago
- Tools for scraping, annotating, and parsing morphological information from Wiktionary ☆15 · Updated 6 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆108 · Updated 3 weeks ago
- Gather modern English word frequencies from all enwiki articles. ☆227 · Updated last year
- An LL parser for extracting information from Wiki text, particularly Wiktionary. ☆49 · Updated 2 years ago
- universal syllabification algorithms ☆45 · Updated 2 years ago
- Offline etymological dictionary based on Wiktionary data ☆22 · Updated 3 years ago
- Helsinki Finite-State Technology (library and application suite) ☆136 · Updated last month
- A component-based CJK character search engine ☆14 · Updated last year
- Hyphenation of English words ☆13 · Updated 8 years ago
- Chinese lexicon containing definitions, character origins, and statistics, built for Dong Chinese (https://www.dong-chinese.com) ☆56 · Updated last month
- A list of ~100,000 German nouns and their grammatical properties compiled from WiktionaryDE as CSV file. Plus a module to look up the dat… ☆161 · Updated 11 months ago
- Web front end for WikDict dictionaries ☆21 · Updated last month
- A library for fetching and reading Tatoeba's weekly exports ☆24 · Updated last week
- English Lemma Database - Compiled by Referencing British National Corpus ☆33 · Updated last year
- Linguistic Reconstruction with LingPy ☆15 · Updated last year
- A tool for transliterating Hebrew ☆48 · Updated last week
- The Open English WordNet ☆677 · Updated last week
- CLDF: Cross-Linguistic Data Formats - the specification ☆61 · Updated 4 months ago
- Etymological graphs based on Wiktionary dumps ☆23 · Updated 9 months ago
- Offline bilingual dictionaries made using data from Wiktionary ☆62 · Updated 10 years ago
- Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code ☆101 · Updated 2 years ago
- Trained taggers, tokenizers, etc. for the CLTK ☆10 · Updated 3 years ago
- 🏆 • 5050 most frequent words in 109 languages ☆48 · Updated 3 years ago
- Open source, updated Whitaker's Words Latin Dictionary and Morphology in Python ☆59 · Updated 8 years ago
- Verb forms dictionary ☆67 · Updated 8 years ago