hackerb9 / gwordlist
All the words from Google Books, sorted by frequency
☆115 · Updated last year
Alternatives and similar repositories for gwordlist:
Users who are interested in gwordlist are comparing it to the repositories listed below.
- Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code ☆66 · Updated last year
- Offline bilingual dictionaries made using data from Wiktionary ☆54 · Updated 10 years ago
- British English pronunciation dictionary ☆95 · Updated 7 years ago
- Verb forms dictionary ☆65 · Updated 7 years ago
- Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code ☆34 · Updated 2 months ago
- Dictionary text which might be helpful for App developments ☆60 · Updated 4 years ago
- The 134,000+ words and their pronunciations in the CMU pronouncing dictionary ☆78 · Updated 3 years ago
- SCOWL (and friends). ☆419 · Updated 3 weeks ago
- Sources of Collatinus software - Latin lemmatizer, morphological analyzer and scansion ☆75 · Updated 2 weeks ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 · Updated 3 years ago
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 · Updated 2 years ago
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary ☆90 · Updated this week
- ☆97 · Updated 3 years ago
- Export UNIHAN's database to csv, json or yaml ☆57 · Updated this week
- A modern, interlingual wordnet interface for Python ☆244 · Updated this week
- A list of vocabulary lists ☆21 · Updated 4 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆99 · Updated this week
- CLDR text segmentation for JavaScript ☆38 · Updated last year
- Latin language dictionaries ☆36 · Updated 4 years ago
- Public repository for Coptic SCRIPTORIUM Corpora Releases ☆35 · Updated 3 weeks ago
- Collaborative data curation for Glottolog ☆160 · Updated last week
- WordNet in JSON format. ☆91 · Updated 4 years ago
- X-SAMPA to IPA converter ☆25 · Updated 4 years ago
- Coquery is a free corpus query tool for linguists, lexicographers, translators, and anybody who wishes to search and analyse a text corpu… ☆19 · Updated 2 years ago
- CMU US English Dictionary ☆677 · Updated 4 months ago
- Various utilities for processing the data. ☆209 · Updated this week
- Perseus Treebank Data ☆72 · Updated 10 months ago
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training. ☆13 · Updated last year
- Wiktionary parser tool for many language editions. ☆54 · Updated 2 years ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆63 · Updated 3 weeks ago