hackerb9 / gwordlist
All the words from Google Books, sorted by frequency
☆109 Updated last year
Related projects
Alternatives and complementary repositories for gwordlist
- Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code ☆50 Updated last year
- The Open English WordNet ☆476 Updated last week
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆94 Updated this week
- The 134,000+ words and their pronunciations in the CMU pronouncing dictionary ☆67 Updated 3 years ago
- Gather modern English word frequencies from all enwiki articles. ☆204 Updated 8 months ago
- Collaborative data curation for Glottolog ☆152 Updated this week
- Sentence aligner ☆108 Updated 3 years ago
- Offline bilingual dictionaries made using data from Wiktionary ☆52 Updated 9 years ago
- The Unicode Cookbook for Linguists ☆53 Updated 4 years ago
- The World Atlas of Language Structures ☆55 Updated last month
- An open etymology dataset created using Wiktionary data. Contains 3.8M entries, 1.8M terms, 2900 languages, and 31 unique relationship ty… ☆79 Updated 6 months ago
- Generation of bilingual dictionaries from Wiktionary/dbnary data for the WikDict project ☆44 Updated 3 weeks ago
- Verb forms dictionary ☆60 Updated 7 years ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆62 Updated 2 months ago
- WordNet in JSON format. ☆91 Updated 4 years ago
- A list of vocabulary lists ☆21 Updated 4 years ago
- Interactive visualization of Wiktionary words and etymologies. ☆90 Updated this week
- Lexical database for ~70k English words with morphological variables ☆38 Updated 2 years ago
- Wiktionary parser tool for many language editions. ☆53 Updated 2 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆52 Updated 3 years ago
- Text to IPA converter in JavaScript ☆52 Updated 2 years ago
- The source of the phonetic transcriptions is Oxford Advanced Learner's Dictionary (3rd ed.), available from the Oxford Text Archive (http… ☆22 Updated 7 years ago
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 Updated last year
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) ☆27 Updated 3 years ago
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation ☆185 Updated 4 years ago
- An LL parser for extracting information from Wiki text, particularly Wiktionary. ☆48 Updated last year
- A cloud-based, open-source system for writing and publishing dictionaries. ☆86 Updated 10 months ago
- Bitextor generates translation memories from multilingual websites ☆291 Updated last week
- Creates interlinearized versions of books (EPUB, MOBI, etc), adding "subtitles" with translations under each word in the text. ☆22 Updated 4 years ago
- A modern, interlingual wordnet interface for Python ☆221 Updated last week